# BagofTricks-LT
**Repository Path**: sing_jay_lee/BagofTricks-LT
## Basic Information
- **Project Name**: BagofTricks-LT
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: dev
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-12-01
- **Last Updated**: 2021-12-01
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## Bag of tricks for long-tailed visual recognition with deep convolutional neural networks
This repository is the official PyTorch implementation of [Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks](http://www.lamda.nju.edu.cn/zhangys/papers/AAAI_tricks.pdf), which provides practical and effective tricks used in long-tailed image classification.
- #### Recommend installing [Github Sort Content](https://github.com/Mottie/GitHub-userscripts/wiki/GitHub-sort-content), a userscript that makes the columns of tables on GitHub sortable. With it, you can easily find the most effective trick for each dataset in [trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md). See [this question on Stack Overflow](https://stackoverflow.com/questions/42843288/is-there-any-way-to-make-markdown-tables-sortable) for more information.
## Development log
- [x] `2021-11-08` - Add [InfluenceBalancedLoss ICCV 2021](https://arxiv.org/abs/2110.02444) in [trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md), which belongs to two-stage training.
- [x] `2021-05-19` - Add CONFIGs and experimental results of [BBN-style sampling CVPR2020](https://arxiv.org/abs/1912.02413) in [trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md), which consists of a uniform sampler and a reverse sampler.
Previous logs
- [x] `2021-04-24` - Add the validation running command, which loads a trained model and returns the validation accuracy together with a corresponding confusion matrix figure. See `Usage` in this README for details.
- [x] `2021-04-24` - Add classifier balancing and the corresponding experiments to `Two-stage training` in trick_gallery.md, including $\tau$-normalization, cRT, and LWS.
- [x] `2021-04-23` - Add CrossEntropyLabelAwareSmooth (label-aware smoothing, CVPR 2021) in trick_gallery.md.
- [x] `2021-04-22` - Add an option (TRAIN.APEX) in config.py, so you can set TRAIN.APEX to False to train without apex.
- [x] `2021-02-19` - Test and add the results of two-stage training in trick_gallery.md.
- [x] `2021-01-11` - Add a mixup-related method: Remix, ECCV 2020 workshop.
- [x] `2021-01-10` - Add CDT (class-dependent temperature), arXiv 2020, and BSCE (balanced-softmax cross-entropy), NeurIPS 2020, and support a smooth version of cost-sensitive cross-entropy (smooth CS_CE), which adds a hyper-parameter $\gamma$ to vanilla CS_CE. In smooth CS_CE, the loss weight of class $i$ is defined as $(\frac{N_{min}}{N_i})^\gamma$, where $\gamma \in [0, 1]$ and $N_i$ is the number of images in class $i$. Setting $\gamma = 0.5$ gives a square-root version of CS_CE.
- [x] `2021-01-05` - Add SEQL (softmax equalization loss), CVPR 2020.
- [x] `2021-01-02` - Add LDAMLoss, NeurIPS 2019, and a regularization method: label smoothing cross-entropy, CVPR 2016.
- [x] `2020-12-30` - Add code for torch.nn.parallel.DistributedDataParallel, and support apex in both torch.nn.DataParallel and torch.nn.parallel.DistributedDataParallel.
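The smooth CS_CE weighting above reduces to a few lines of code. The helper below is a sketch of the formula only; the function name is ours, not an identifier from this repo:

```python
import numpy as np

def smooth_cs_ce_weights(class_counts, gamma=0.5):
    """Per-class loss weights (N_min / N_i) ** gamma for smooth CS_CE.

    gamma = 1 recovers vanilla cost-sensitive CE, gamma = 0 gives
    uniform weights, and gamma = 0.5 is the square-root variant.
    """
    counts = np.asarray(class_counts, dtype=np.float64)
    return (counts.min() / counts) ** gamma

# A toy 3-class long-tailed distribution: rarer classes get larger weights.
w = smooth_cs_ce_weights([1000, 100, 10], gamma=0.5)
# w ≈ [0.1, 0.316, 1.0]
```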
## Trick gallery
#### Brief introduction
We divide the long-tailed recognition tricks into four families: re-weighting, re-sampling, mixup training, and two-stage training. For more details on these four trick families, see the [original paper](https://cs.nju.edu.cn/wujx/paper/AAAI2021_Tricks.pdf).
#### Detailed information
- Tricks, corresponding results, experimental settings, and running commands are listed in ***[trick_gallery.md](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/trick_gallery.md)***.
## Main requirements
```bash
torch >= 1.4.0
torchvision >= 0.5.0
tensorboardX >= 2.1
tensorflow >= 1.14.0  # convert long-tailed CIFAR datasets from tfrecords to jpgs
Python 3
apex
```
- We provide the detailed requirements in [requirements.txt](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/documents/requirements.txt). You can run `pip install -r requirements.txt` to create the same running environment as ours.
- Installing [apex](https://github.com/NVIDIA/apex) **is recommended, as it saves GPU memory**:
```bash
pip install -U pip
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```
- **If apex is not installed, the `Distributed training with DistributedDataParallel` mode in our codes cannot be used.**
## Preparing the datasets
We provide three datasets in this repo: long-tailed CIFAR (CIFAR-LT), long-tailed ImageNet (ImageNet-LT), and iNaturalist 2018 (iNat18).
Detailed information on these datasets is shown below:
| Datasets | CIFAR-10-LT-100 | CIFAR-10-LT-50 | CIFAR-100-LT-100 | CIFAR-100-LT-50 | ImageNet-LT | iNat18 |
| --- | --- | --- | --- | --- | --- | --- |
| Training images | 12,406 | 13,996 | 10,847 | 12,608 | 115,846 | 437,513 |
| Classes | 10 | 10 | 100 | 100 | 1,000 | 8,142 |
| Max images | 5,000 | 5,000 | 500 | 500 | 1,280 | 1,000 |
| Min images | 50 | 100 | 5 | 10 | 5 | 2 |
| Imbalance factor | 100 | 50 | 100 | 50 | 256 | 500 |
- `Max images` and `Min images` represent the number of training images in the largest and smallest classes, respectively.
- `CIFAR-10-LT-100` means the long-tailed CIFAR-10 dataset with imbalance factor $\beta = 100$.
- `Imbalance factor` is defined as $\beta = \frac{\text{Max images}}{\text{Min images}}$. For example, CIFAR-10-LT-100 keeps 5,000 images in its largest class and 50 in its smallest, so $\beta = 5000 / 50 = 100$.
- #### Data format
The annotation of a dataset is a dict consisting of two fields: `annotations` and `num_classes`.
The field `annotations` is a list of dicts, each with the keys
`image_id`, `fpath`, `im_height`, `im_width`, and `category_id`.
Here is an example.
```
{
'annotations': [
{
'image_id': 1,
'fpath': '/data/iNat18/images/train_val2018/Plantae/7477/3b60c9486db1d2ee875f11a669fbde4a.jpg',
'im_height': 600,
'im_width': 800,
'category_id': 7477
},
...
],
'num_classes': 8142
}
```
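An annotation dict in this format can be assembled and sanity-checked with a few lines of Python. This is a standalone sketch, not part of the repo's tooling; `build_annotation` and the example file path are hypothetical:

```python
import json

def build_annotation(entries, num_classes):
    """Assemble an annotation dict in the format shown above."""
    required = {"image_id", "fpath", "im_height", "im_width", "category_id"}
    for entry in entries:
        missing = required - entry.keys()
        if missing:
            raise ValueError(f"entry {entry.get('image_id')} is missing {missing}")
    return {"annotations": list(entries), "num_classes": num_classes}

anno = build_annotation(
    [{"image_id": 1, "fpath": "images/example.jpg",
      "im_height": 600, "im_width": 800, "category_id": 0}],
    num_classes=8142,
)
serialized = json.dumps(anno)  # ready to write to an annotation file
```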
- #### CIFAR-LT
[Cao et al., NeurIPS 2019](https://arxiv.org/abs/1906.07413) followed the method of [Cui et al., CVPR 2019](https://arxiv.org/abs/1901.05555) to generate CIFAR-LT randomly. They modify the CIFAR datasets provided by PyTorch, as shown in [this file](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/lib/dataset/cao_cifar.py).
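Cui et al.'s exponential profile assigns class $i$ (0-indexed, out of $C$ classes) a count of $N_{max} \cdot \beta^{-i/(C-1)}$. The following is a minimal standalone sketch of that generation step, mirroring the idea rather than reusing code from `cao_cifar.py`:

```python
import numpy as np

def long_tailed_counts(n_max, num_classes, imb_factor):
    """Exponentially decayed per-class image counts, as in Cui et al. (CVPR 2019):
    class i keeps n_max * (1 / imb_factor) ** (i / (num_classes - 1)) images."""
    i = np.arange(num_classes)
    return (n_max * (1.0 / imb_factor) ** (i / (num_classes - 1))).astype(int)

# CIFAR-10-LT-100: the head class keeps 5,000 images, the tail class only 50.
counts = long_tailed_counts(n_max=5000, num_classes=10, imb_factor=100)
```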
- #### ImageNet-LT
You can use the following steps to convert from the original images of ImageNet-LT.
1. Download the original [ILSVRC-2012](http://www.image-net.org/). Suppose you have downloaded and reorganized it at path `/downloaded/ImageNet/`, which should contain two sub-directories: `/downloaded/ImageNet/train` and `/downloaded/ImageNet/val`.
2. Download the train/test splitting files (`ImageNet_LT_train.txt` and `ImageNet_LT_test.txt`) in [GoogleDrive](https://drive.google.com/drive/u/0/folders/19cl6GK5B3p5CxzVBy5i4cWSmBy9-rT_-) or [Baidu Netdisk](https://pan.baidu.com/s/17alnFve8l-oMZFZQLxHrzQ) (password: cj0g). Suppose you have downloaded them at path `/downloaded/ImageNet-LT/`.
3. Run `tools/convert_from_ImageNet.py`, and you will get two JSON files: `ImageNet_LT_train.json` and `ImageNet_LT_val.json`.
```bash
# Convert from the original format of ImageNet-LT
python tools/convert_from_ImageNet.py --input_path /downloaded/ImageNet-LT/ --image_path /downloaded/ImageNet/ --output_path ./
```
- #### iNat18
You can use the following steps to convert from the original format of iNaturalist 2018.
1. Download the images and annotations from [iNaturalist 2018](https://github.com/visipedia/inat_comp/blob/master/2018/README.md) first. Suppose you have downloaded them at path `/downloaded/iNat18/`.
2. Run `tools/convert_from_iNat.py`, and use the generated `iNat18_train.json` and `iNat18_val.json` to train.
```bash
# Convert from the original format of iNaturalist
# See tools/convert_from_iNat.py for more details of args
python tools/convert_from_iNat.py --input_json_file /downloaded/iNat18/train2018.json --image_path /downloaded/iNat18/images --output_json_file ./iNat18_train.json
python tools/convert_from_iNat.py --input_json_file /downloaded/iNat18/val2018.json --image_path /downloaded/iNat18/images --output_json_file ./iNat18_val.json
```
## Usage
In this repo:
- The results on CIFAR-LT (ResNet-32) and ImageNet-LT (ResNet-10), which need only one GPU to train, are obtained with DataParallel training and apex.
- The results on iNat18 (ResNet-50), which need more than one GPU to train, are obtained with DistributedDataParallel training and apex.
- When more than one GPU is used, DistributedDataParallel training is more efficient than DataParallel training, especially when CPU resources are limited.
### Training
#### Parallel training with DataParallel
```bash
# 1. Train
# To train long-tailed CIFAR-10 with an imbalance factor of 50:
# `GPUs` are the GPUs you want to use, such as `0,4`.
bash data_parallel_train.sh configs/test/data_parallel.yaml GPUs
```
#### Distributed training with DistributedDataParallel
```bash
# 1. Change NCCL_SOCKET_IFNAME in run_with_distributed_parallel.sh to your own socket name:
export NCCL_SOCKET_IFNAME=[your own socket name]
# 2. Train
# To train long-tailed CIFAR-10 with an imbalance factor of 50:
# `GPUs` are the GPUs you want to use, such as `0,1,4`.
# `NUM_GPUs` is the number of GPUs you want to use. If you set `GPUs` to `0,1,4`, then `NUM_GPUs` should be `3`.
bash distributed_data_parallel_train.sh configs/test/distributed_data_parallel.yaml NUM_GPUs GPUs
```
### Validation
You can get the validation accuracy and the corresponding confusion matrix after running the following commands.
See [main/valid.py](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/main/valid.py) for more details.
```bash
# 1. Change TEST.MODEL_FILE in the yaml to the path of your trained model first.
# 2. Validate
# `GPUs` are the GPUs you want to use, such as `0,1,4`.
python main/valid.py --cfg [Your yaml] --gpus GPUs
```
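[main/valid.py](https://github.com/zhangyongshun/BagofTricks-LT/blob/main/main/valid.py) handles this end to end; the bookkeeping behind the reported accuracy and confusion matrix boils down to something like the following standalone sketch (an illustration, not the repo's actual code):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

cm = confusion_matrix([0, 0, 1, 2], [0, 1, 1, 2], num_classes=3)
top1_acc = np.trace(cm) / cm.sum()  # 3 correct out of 4 -> 0.75
```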
## The comparison between the baseline results using our codes and the references [Cui, Kang]
- We use **Top-1 error rates** as our evaluation metric.
- **For ImageNet-LT, we found that the color_jitter augmentation was not included in our original experiments, although it is adopted by other methods. So, in this repo, we add the color_jitter augmentation on ImageNet-LT. The old baseline without color_jitter reaches a top-1 error of 64.89, which is 1.15 points higher than the new baseline.**
- The experimental settings and running commands for `Baselines using our codes` are listed below the table.
| Datasets | CIFAR-10-LT-100 | CIFAR-10-LT-50 | CIFAR-100-LT-100 | CIFAR-100-LT-50 | ImageNet-LT | iNat18 |
| --- | --- | --- | --- | --- | --- | --- |
| Backbones | ResNet-32 | ResNet-32 | ResNet-32 | ResNet-32 | ResNet-10 | ResNet-50 |
| Baselines using our codes | 28.05 | 23.55 | 62.27 | 56.22 | 63.74 | 40.55 |
| Reference [Cui, Kang, Liu] | 29.64 | 25.19 | 61.68 | 56.15 | 64.40 | 42.86 |

- CONFIGs for `Baselines using our codes` (from left to right):
  - `configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}`
  - `configs/ImageNet_LT/imagenetlt_baseline.yaml`
  - `configs/iNat18/iNat18_baseline.yaml`
- Running commands:
  - For CIFAR-LT and ImageNet-LT: `bash data_parallel_train.sh CONFIG GPUs`
  - For iNat18: `bash distributed_data_parallel_train.sh configs/iNat18/iNat18_baseline.yaml NUM_GPUs GPUs`
## Paper collection of long-tailed visual recognition
- [Awesome-of-Long-Tailed-Recognition](https://github.com/zwzhang121/Awesome-of-Long-Tailed-Recognition)
- [Long-Tailed-Classification-Leaderboard](https://github.com/yanyanSann/Long-Tailed-Classification-Leaderboard)
## Citation
```
@inproceedings{zhang2021tricks,
author = {Yongshun Zhang and Xiu{-}Shen Wei and Boyan Zhou and Jianxin Wu},
title = {Bag of Tricks for Long-Tailed Visual Recognition with Deep Convolutional Neural Networks},
pages = {3447--3455},
booktitle = {AAAI},
year = {2021},
}
```
## Contacts
If you have any questions about our work, please do not hesitate to contact us via the emails provided in the [paper](http://www.lamda.nju.edu.cn/zhangys/papers/AAAI_tricks.pdf).