# BagofTricks-LT
**Repository Path**: sing_jay_lee/BagofTricks-LT
## Basic Information
- **Project Name**: BagofTricks-LT
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: dev
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-12-01
- **Last Updated**: 2021-12-01
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## Bag of tricks for long-tailed visual recognition with deep convolutional neural networks
This repository is the official PyTorch implementation of [Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks](http://www.lamda.nju.edu.cn/zhangys/papers/AAAI_tricks.pdf), which provides practical and effective tricks used in long-tailed image classification.
- #### Recommend installing [Github Sort Content](https://github.com/Mottie/GitHub-userscripts/wiki/GitHub-sort-content), a userscript that makes the columns of tables on GitHub sortable. With it, you can easily find the most effective trick for each dataset in [trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md). See [this question on Stack Overflow](https://stackoverflow.com/questions/42843288/is-there-any-way-to-make-markdown-tables-sortable) for more information.
## Development log
- [x] `2021-11-08` - Add [InfluenceBalancedLoss ICCV 2021](https://arxiv.org/abs/2110.02444) in [trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md), which belongs to two-stage training.
- [x] `2021-05-19` - Add CONFIGs and experimental results of [BBN-style sampling CVPR2020](https://arxiv.org/abs/1912.02413) in [trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md), which consists of a uniform sampler and a reverse sampler.
Previous logs
- [x] `2021-04-24` - Add the validation running command, which loads a trained model and returns the validation accuracy together with a corresponding confusion matrix figure. See `Usage` in this README for details.
- [x] `2021-04-24` - Add classifier balancing and the corresponding experiments to `Two-stage training` in trick_gallery.md, including $\tau$-normalization, cRT, and LWS.
- [x] `2021-04-23` - Add CrossEntropyLabelAwareSmooth (label-aware smoothing, CVPR 2021) in trick_gallery.md.
- [x] `2021-04-22` - Add an option (TRAIN.APEX) in config.py, so you can set TRAIN.APEX to False to train without apex.
- [x] `2021-02-19` - Test and add the results of two-stage training in trick_gallery.md.
- [x] `2021-01-11` - Add a mixup-related method: Remix, ECCV 2020 workshop.
- [x] `2021-01-10` - Add CDT (class-dependent temperature), arXiv 2020, and BSCE (balanced-softmax cross-entropy), NeurIPS 2020, and support a smooth version of cost-sensitive cross-entropy (smooth CS_CE), which adds a hyper-parameter $\gamma$ to vanilla CS_CE. In smooth CS_CE, the loss weight of class $i$ is defined as $(\frac{N_{min}}{N_i})^\gamma$, where $\gamma \in [0, 1]$ and $N_i$ is the number of images in class $i$. Setting $\gamma = 0.5$ gives a square-root version of CS_CE.
- [x] `2021-01-05` - Add SEQL (softmax equalization loss), CVPR 2020.
- [x] `2021-01-02` - Add LDAMLoss, NeurIPS 2019, and a regularization method: label smoothing cross-entropy, CVPR 2016.
- [x] `2020-12-30` - Add code for torch.nn.parallel.DistributedDataParallel, and support apex in both torch.nn.DataParallel and torch.nn.parallel.DistributedDataParallel.
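The smooth CS_CE weighting above reduces to a few lines of code. The helper below is a sketch of the formula only; the function name is ours, not an identifier from this repo:

```python
import numpy as np

def smooth_cs_ce_weights(class_counts, gamma=0.5):
    """Per-class loss weights (N_min / N_i) ** gamma for smooth CS_CE.

    gamma = 1 recovers vanilla cost-sensitive CE, gamma = 0 gives
    uniform weights, and gamma = 0.5 is the square-root variant.
    """
    counts = np.asarray(class_counts, dtype=np.float64)
    return (counts.min() / counts) ** gamma

# A toy 3-class long-tailed distribution: rarer classes get larger weights.
w = smooth_cs_ce_weights([1000, 100, 10], gamma=0.5)
# w ≈ [0.1, 0.316, 1.0]
```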
## Trick gallery
#### Brief introduction
We divide the long-tailed recognition tricks into four families: re-weighting, re-sampling, mixup training, and two-stage training. For more details on these four trick families, see the [original paper](https://cs.nju.edu.cn/wujx/paper/AAAI2021_Tricks.pdf).
#### Detailed information
- Tricks, corresponding results, experimental settings, and running commands are listed in ***[trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md)***.
## Main requirements
```bash
torch >= 1.4.0
torchvision >= 0.5.0
tensorboardX >= 2.1
tensorflow >= 1.14.0  # convert long-tailed CIFAR datasets from tfrecords to jpgs
Python 3
apex
```
- We provide the detailed requirements in [requirements.txt](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/requirements.txt). You can run `pip install -r requirements.txt` to create the same running environment as ours.
- Installing [apex](https://github.com/NVIDIA/apex) **is recommended, as it saves GPU memory**:
```bash
pip install -U pip
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```
- **If apex is not installed, the `Distributed training with DistributedDataParallel` mode in our codes cannot be used.**
## Preparing the datasets
We provide three datasets in this repo: long-tailed CIFAR (CIFAR-LT), long-tailed ImageNet (ImageNet-LT), and iNaturalist 2018 (iNat18).
Detailed information on these datasets is shown below:
| Datasets | CIFAR-10-LT-100 | CIFAR-10-LT-50 | CIFAR-100-LT-100 | CIFAR-100-LT-50 | ImageNet-LT | iNat18 |
| --- | --- | --- | --- | --- | --- | --- |
| Training images | 12,406 | 13,996 | 10,847 | 12,608 | 115,846 | 437,513 |
| Classes | 10 | 10 | 100 | 100 | 1,000 | 8,142 |
| Max images | 5,000 | 5,000 | 500 | 500 | 1,280 | 1,000 |
| Min images | 50 | 100 | 5 | 10 | 5 | 2 |
| Imbalance factor | 100 | 50 | 100 | 50 | 256 | 500 |
- `Max images` and `Min images` represent the number of training images in the largest and smallest classes, respectively.
- `CIFAR-10-LT-100` means the long-tailed CIFAR-10 dataset with imbalance factor $\beta = 100$.
- `Imbalance factor` is defined as $\beta = \frac{\text{Max images}}{\text{Min images}}$. For example, CIFAR-10-LT-100 keeps 5,000 images in its largest class and 50 in its smallest, so $\beta = 5000 / 50 = 100$.
- #### Data format
The annotation of a dataset is a dict consisting of two fields: `annotations` and `num_classes`.
The field `annotations` is a list of dicts, each with the keys
`image_id`, `fpath`, `im_height`, `im_width`, and `category_id`.
Here is an example.
```
{
'annotations': [
{
'image_id': 1,
'fpath': '/data/iNat18/images/train_val2018/Plantae/7477/3b60c9486db1d2ee875f11a669fbde4a.jpg',
'im_height': 600,
'im_width': 800,
'category_id': 7477
},
...
],
'num_classes': 8142
}
```
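An annotation dict in this format can be assembled and sanity-checked with a few lines of Python. This is a standalone sketch, not part of the repo's tooling; `build_annotation` and the example file path are hypothetical:

```python
import json

def build_annotation(entries, num_classes):
    """Assemble an annotation dict in the format shown above."""
    required = {"image_id", "fpath", "im_height", "im_width", "category_id"}
    for entry in entries:
        missing = required - entry.keys()
        if missing:
            raise ValueError(f"entry {entry.get('image_id')} is missing {missing}")
    return {"annotations": list(entries), "num_classes": num_classes}

anno = build_annotation(
    [{"image_id": 1, "fpath": "images/example.jpg",
      "im_height": 600, "im_width": 800, "category_id": 0}],
    num_classes=8142,
)
serialized = json.dumps(anno)  # ready to write to an annotation file
```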
- #### CIFAR-LT
[Cao et al., NeurIPS 2019](https://arxiv.org/abs/1906.07413) followed the method of [Cui et al., CVPR 2019](https://arxiv.org/abs/1901.05555) to generate CIFAR-LT randomly. They modify the CIFAR datasets provided by PyTorch, as shown in [this file](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/lib/dataset/cao_cifar.py).
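Cui et al.'s exponential profile assigns class $i$ (0-indexed, out of $C$ classes) a count of $N_{max} \cdot \beta^{-i/(C-1)}$. The following is a minimal standalone sketch of that generation step, mirroring the idea rather than reusing code from `cao_cifar.py`:

```python
import numpy as np

def long_tailed_counts(n_max, num_classes, imb_factor):
    """Exponentially decayed per-class image counts, as in Cui et al. (CVPR 2019):
    class i keeps n_max * (1 / imb_factor) ** (i / (num_classes - 1)) images."""
    i = np.arange(num_classes)
    return (n_max * (1.0 / imb_factor) ** (i / (num_classes - 1))).astype(int)

# CIFAR-10-LT-100: the head class keeps 5,000 images, the tail class only 50.
counts = long_tailed_counts(n_max=5000, num_classes=10, imb_factor=100)
```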
- #### ImageNet-LT
You can use the following steps to convert from the original images of ImageNet-LT.
1. Download the original [ILSVRC-2012](http://www.image-net.org/). Suppose you have downloaded and reorganized it at path `/downloaded/ImageNet/`, which should contain two sub-directories: `/downloaded/ImageNet/train` and `/downloaded/ImageNet/val`.
2. Download the train/test splitting files (`ImageNet_LT_train.txt` and `ImageNet_LT_test.txt`) in [GoogleDrive](https://drive.google.com/drive/u/0/folders/19cl6GK5B3p5CxzVBy5i4cWSmBy9-rT_-) or [Baidu Netdisk](https://pan.baidu.com/s/17alnFve8l-oMZFZQLxHrzQ) (password: cj0g). Suppose you have downloaded them at path `/downloaded/ImageNet-LT/`.
3. Run `tools/convert_from_ImageNet.py`, and you will get two JSON files: `ImageNet_LT_train.json` and `ImageNet_LT_val.json`.
```bash
# Convert from the original format of ImageNet-LT
python tools/convert_from_ImageNet.py --input_path /downloaded/ImageNet-LT/ --image_path /downloaded/ImageNet/ --output_path ./
```
- #### iNat18
You can use the following steps to convert from the original format of iNaturalist 2018.
1. Download the images and annotations from [iNaturalist 2018](https://github.com/visipedia/inat_comp/blob/master/2018/README.md) first. Suppose you have downloaded them at path `/downloaded/iNat18/`.
2. Run `tools/convert_from_iNat.py`, and use the generated `iNat18_train.json` and `iNat18_val.json` to train.
```bash
# Convert from the original format of iNaturalist
# See tools/convert_from_iNat.py for more details of args
python tools/convert_from_iNat.py --input_json_file /downloaded/iNat18/train2018.json --image_path /downloaded/iNat18/images --output_json_file ./iNat18_train.json
python tools/convert_from_iNat.py --input_json_file /downloaded/iNat18/val2018.json --image_path /downloaded/iNat18/images --output_json_file ./iNat18_val.json
```
## Usage
In this repo:
- The results on CIFAR-LT (ResNet-32) and ImageNet-LT (ResNet-10), which need only one GPU to train, are obtained with DataParallel training and apex.
- The results on iNat18 (ResNet-50), which need more than one GPU to train, are obtained with DistributedDataParallel training and apex.
- When more than one GPU is used, DistributedDataParallel training is more efficient than DataParallel training, especially when CPU resources are limited.
### Training
#### Parallel training with DataParallel
```bash
# 1. Train
# To train long-tailed CIFAR-10 with an imbalance factor of 50:
# `GPUs` are the GPUs you want to use, such as `0,4`.
bash data_parallel_train.sh configs/test/data_parallel.yaml GPUs
```
#### Distributed training with DistributedDataParallel
```bash
# 1. Change NCCL_SOCKET_IFNAME in run_with_distributed_parallel.sh to your own socket name:
export NCCL_SOCKET_IFNAME=[your own socket name]
# 2. Train
# To train long-tailed CIFAR-10 with an imbalance factor of 50:
# `GPUs` are the GPUs you want to use, such as `0,1,4`.
# `NUM_GPUs` is the number of GPUs you want to use. If you set `GPUs` to `0,1,4`, then `NUM_GPUs` should be `3`.
bash distributed_data_parallel_train.sh configs/test/distributed_data_parallel.yaml NUM_GPUs GPUs
```
### Validation
You can get the validation accuracy and the corresponding confusion matrix after running the following commands.
See [main/valid.py](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/main/valid.py) for more details.
```bash
# 1. Change TEST.MODEL_FILE in the yaml to the path of your trained model first.
# 2. Validate
# `GPUs` are the GPUs you want to use, such as `0,1,4`.
python main/valid.py --cfg [Your yaml] --gpus GPUs
```
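[main/valid.py](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/main/valid.py) handles this end to end; the bookkeeping behind the reported accuracy and confusion matrix boils down to something like the following standalone sketch (an illustration, not the repo's actual code):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

cm = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], num_classes=3)
top1_acc = np.trace(cm) / cm.sum()  # 3 correct out of 4 -> 0.75
```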
## The comparison between the baseline results using our codes and the references [Cui, Kang]
- We use **Top-1 error rates** as our evaluation metric.
- **For ImageNet-LT, we found that the color_jitter augmentation was not included in our original experiments, although it is adopted by other methods. So, in this repo, we add the color_jitter augmentation on ImageNet-LT. The old baseline without color_jitter reaches a top-1 error of 64.89, which is 1.15 points higher than the new baseline.**
- The experimental settings and running commands for `Baselines using our codes` are listed below the table.
| Datasets | CIFAR-10-LT-100 | CIFAR-10-LT-50 | CIFAR-100-LT-100 | CIFAR-100-LT-50 | ImageNet-LT | iNat18 |
| --- | --- | --- | --- | --- | --- | --- |
| Backbones | ResNet-32 | ResNet-32 | ResNet-32 | ResNet-32 | ResNet-10 | ResNet-50 |
| Baselines using our codes | 28.05 | 23.55 | 62.27 | 56.22 | 63.74 | 40.55 |
| Reference [Cui, Kang, Liu] | 29.64 | 25.19 | 61.68 | 56.15 | 64.40 | 42.86 |

- CONFIGs for `Baselines using our codes` (from left to right):
  - `configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}`
  - `configs/ImageNet_LT/imagenetlt_baseline.yaml`
  - `configs/iNat18/iNat18_baseline.yaml`
- Running commands:
  - For CIFAR-LT and ImageNet-LT: `bash data_parallel_train.sh CONFIG GPUs`
  - For iNat18: `bash distributed_data_parallel_train.sh configs/iNat18/iNat18_baseline.yaml NUM_GPUs GPUs`
## Paper collection of long-tailed visual recognition
- [Awesome-of-Long-Tailed-Recognition](https://github.com/zwzhang121/Awesome-of-Long-Tailed-Recognition)
- [Long-Tailed-Classification-Leaderboard](https://github.com/yanyanSann/Long-Tailed-Classification-Leaderboard)
## Citation
```
@inproceedings{zhang2021tricks,
author = {Yongshun Zhang and Xiu{-}Shen Wei and Boyan Zhou and Jianxin Wu},
title = {Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks},
pages = {3447--3455},
booktitle = {AAAI},
year = {2021},
}
```
## Contacts
If you have any questions about our work, please do not hesitate to contact us via the emails provided in the [paper](http://www.lamda.nju.edu.cn/zhangys/papers/AAAI_tricks.pdf).