# ConViT: Vision Transformers with Soft Convolutional Inductive Biases

This repository contains PyTorch code for ConViT. It builds on code from the [Data-Efficient Vision Transformer](https://github.com/facebookresearch/deit) and from [timm](https://github.com/rwightman/pytorch-image-models).

For details see the [ConViT paper](https://arxiv.org/abs/2103.10697) by Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli and Levent Sagun.

If you use this code for a paper, please cite:

```
@article{d2021convit,
  title={ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases},
  author={d'Ascoli, St{\'e}phane and Touvron, Hugo and Leavitt, Matthew and Morcos, Ari and Biroli, Giulio and Sagun, Levent},
  journal={arXiv preprint arXiv:2103.10697},
  year={2021}
}
```

# Usage

Install PyTorch 1.7.0+, torchvision 0.8.1+, and [pytorch-image-models 0.3.2](https://github.com/rwightman/pytorch-image-models):

```
conda install -c pytorch pytorch torchvision
pip install timm==0.3.2
```

## Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout for the torchvision [`datasets.ImageFolder`](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder), with the training and validation data in the `train` and `val` folders respectively:

```
/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
```

## Evaluation

To evaluate a pretrained ConViT-Ti on the ImageNet validation set, run:

```
python main.py --eval --model convit_tiny --pretrained --data-path /path/to/imagenet
```

This should give:

```
Acc@1 73.116 Acc@5 91.710 loss 1.172
```

## Training

To train ConViT-Ti on ImageNet on a single node with 4 GPUs for 300 epochs, run:

```
python -m torch.distributed.launch --nproc_per_node=4 --use_env main.py --model convit_tiny --batch-size 256 --data-path /path/to/imagenet
```

To train the same model on a subsampled version of ImageNet in which only 10% of the images of each class are used, add the `--sampling_ratio 0.1` flag.

## Multinode training

Distributed training is available via Slurm and [submitit](https://github.com/facebookincubator/submitit):

```
pip install submitit
```

To train ConViT-Base on ImageNet on 2 nodes with 8 GPUs each for 300 epochs, run:

```
python run_with_submitit.py --model convit_base --data-path /path/to/imagenet
```

# License

The majority of this repository is released under the Apache 2.0 license as found in the [LICENSE](LICENSE) file.
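# Examples

The snippets below are illustrative sketches, not part of the repository.

## Checking the data layout

Before launching a long training run, it can help to confirm that the directory layout from the Data preparation section is what torchvision's `datasets.ImageFolder` (which the README references) will actually see. A minimal sketch, where `/path/to/imagenet` is a placeholder as in the commands above:

```
# Sanity-check that the ImageNet layout matches what datasets.ImageFolder expects.
from torchvision import datasets

train_set = datasets.ImageFolder('/path/to/imagenet/train')
val_set = datasets.ImageFolder('/path/to/imagenet/val')

# ImageNet-1k should report 1000 classes in both splits.
print(len(train_set), 'training images across', len(train_set.classes), 'classes')
print(len(val_set), 'validation images across', len(val_set.classes), 'classes')
```

If the class counts differ between splits, or a split reports far fewer classes than expected, the extraction step above likely went wrong.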
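## Loading a pretrained model

For quick inference outside of `main.py --eval`, a pretrained ConViT can be built through timm's `create_model`, since the repository registers its architectures with timm. This sketch assumes it is run from the repository root; the module name `convit` is an assumption about the repo layout, so adjust the import to whichever file holds the model definitions:

```
# Build a pretrained ConViT-Ti and run a forward pass on a dummy input.
import torch
from timm.models import create_model

import convit  # noqa: F401  (assumed module name; importing registers the convit_* models)

model = create_model('convit_tiny', pretrained=True)
model.eval()

dummy = torch.randn(1, 3, 224, 224)  # stand-in for a normalized 224x224 image
with torch.no_grad():
    logits = model(dummy)
print(logits.shape)  # expected: torch.Size([1, 1000]) for ImageNet-1k
```

For real images, apply the same resize, crop, and normalization used at evaluation time before the forward pass.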