# dist_kd
**Repository Path**: xyz-dev-max/dist_kd
## Basic Information
- **Project Name**: dist_kd
- **Description**: Open-source code for the paper --- https://github.com/hunto/DIST_KD
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-05-05
- **Last Updated**: 2023-05-05
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Knowledge Distillation from A Stronger Teacher (DIST)
Official implementation of paper "[Knowledge Distillation from A Stronger Teacher](https://arxiv.org/abs/2205.10536)" (DIST), NeurIPS 2022.
By Tao Huang, Shan You, Fei Wang, Chen Qian, Chang Xu.
:fire: **DIST: a simple and effective KD method.**
## Updates
* **December 27, 2022**: Update CIFAR-100 distillation code and logs.
* **September 20, 2022**: Release code for semantic segmentation task.
* **September 15, 2022**: DIST was accepted by NeurIPS 2022!
* **May 30, 2022**: Code for object detection is available.
* **May 27, 2022**: Code for ImageNet classification is available.
## Getting started
### Clone training code
```shell
git clone https://github.com/hunto/DIST_KD.git --recurse-submodules
cd DIST_KD
```
**The loss function of DIST is in** [classification/lib/models/losses/dist_kd.py](https://github.com/hunto/image_classification_sota/blob/main/lib/models/losses/dist_kd.py).
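DIST replaces the usual KL term with two Pearson-correlation terms: an inter-class relation (correlation across classes for each sample) and an intra-class relation (correlation across the batch for each class). Below is a framework-agnostic NumPy sketch of that math; the repository's actual implementation is in PyTorch in the file linked above, and the default weights here are illustrative rather than the repo's tuned values:

```python
import numpy as np

def softmax(z, tau=1.0):
    """Row-wise softmax of logits z at temperature tau."""
    z = z / tau
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pearson(a, b, eps=1e-8):
    """Pearson correlation between matching rows of a and b."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps
    return num / den

def dist_loss(z_s, z_t, beta=1.0, gamma=1.0, tau=1.0):
    """Sketch of the DIST loss on student logits z_s and teacher logits z_t.

    inter: per-sample correlation across classes (rows of the prob matrix)
    intra: per-class correlation across the batch (columns, via transpose)
    Identical predictions give correlation 1, hence zero loss.
    """
    y_s, y_t = softmax(z_s, tau), softmax(z_t, tau)
    inter = 1.0 - pearson(y_s, y_t).mean()
    intra = 1.0 - pearson(y_s.T, y_t.T).mean()
    return beta * inter + gamma * intra
```

Because Pearson correlation is invariant to shifting and scaling the prediction vectors, this relaxes exact logit matching into preserving the teacher's relative rankings, which is what makes stronger (more confident) teachers easier to distill from.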
* classification: prepare your environment and datasets following the `README.md` in `classification`.
* object detection: coming soon.
* semantic segmentation: coming soon.
## Reproducing our results
### ImageNet
```shell
cd classification
sh tools/dist_train.sh 8 ${CONFIG} ${MODEL} --teacher-model ${T_MODEL} --experiment ${EXP_NAME}
```
* Baseline settings (`R34-R18` and `R50-MBV1`):
```shell
CONFIG=configs/strategies/distill/resnet_dist.yaml
```
|Student|Teacher|DIST|MODEL|T_MODEL|Log|Ckpt|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|ResNet-18 (69.76)|ResNet-34 (73.31)|72.07|`tv_resnet18`|`tv_resnet34`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/baseline_Res34-Res18.txt)|[ckpt](https://drive.google.com/file/d/1_nzcAwxZApLU496iypsdeNhXYzPA4ZF4/view?usp=sharing)|
|MobileNet V1 (70.13)|ResNet-50 (76.16)|73.24|`mobilenet_v1`|`tv_resnet50`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/baseline_Res50-MBV1.txt)|[ckpt](https://drive.google.com/file/d/1uSzFbcY6uudQgfDBxHataBPO1xX8J_yW/view?usp=sharing)|
* Stronger teachers (`R18` and `R34` students with various ResNet teachers):

|Student|Teacher|KD (T=4)|DIST|
|:--:|:--:|:--:|:--:|
|ResNet-18 (69.76)|ResNet-34 (73.31)|71.21|72.07|
|ResNet-18 (69.76)|ResNet-50 (76.13)|71.35|72.12|
|ResNet-18 (69.76)|ResNet-101 (77.37)|71.09|72.08|
|ResNet-18 (69.76)|ResNet-152 (78.31)|71.12|72.24|
|ResNet-34 (73.31)|ResNet-50 (76.13)|74.73|75.06|
|ResNet-34 (73.31)|ResNet-101 (77.37)|74.89|75.36|
|ResNet-34 (73.31)|ResNet-152 (78.31)|74.87|75.42|
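For reference, the `KD (T=4)` baseline column above is the classical temperature-scaled distillation loss of Hinton et al., not code from this repository; a minimal NumPy sketch of it for comparison with DIST:

```python
import numpy as np

def softmax(z, tau=1.0):
    """Row-wise softmax of logits z at temperature tau."""
    z = z / tau
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def kd_loss(z_s, z_t, tau=4.0, eps=1e-12):
    """Hinton-style KD: tau^2 * KL(teacher || student) on softened probabilities.

    The tau^2 factor keeps gradient magnitudes comparable across temperatures.
    """
    p_t = softmax(z_t, tau)
    p_s = softmax(z_s, tau)
    kl = (p_t * (np.log(p_t + eps) - np.log(p_s + eps))).sum(axis=1)
    return tau**2 * kl.mean()
```

Unlike DIST's correlation terms, this KL term forces the student to match the teacher's exact probability values, which the table suggests helps less (and can even hurt) as teachers grow stronger.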
* Stronger training strategies:
```shell
CONFIG=configs/strategies/distill/dist_b2.yaml
```
`ResNet-50-SB`: a stronger ResNet-50 trained with the [TIMM](https://github.com/rwightman/pytorch-image-models) recipe from [ResNet strikes back](https://arxiv.org/abs/2110.00476).

|Student|Teacher|KD (T=4)|DIST|MODEL|T_MODEL|Log|
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
|ResNet-18 (73.4)|ResNet-50-SB (80.1)|72.6|74.5|`tv_resnet18`|`timm_resnet50`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/stronger_Res50SB-Res18.txt)|
|ResNet-34 (76.8)|ResNet-50-SB (80.1)|77.2|77.8|`tv_resnet34`|`timm_resnet50`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/stronger_Res50SB-Res34.txt)|
|MobileNet V2 (73.6)|ResNet-50-SB (80.1)|71.7|74.4|`tv_mobilenet_v2`|`timm_resnet50`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/stronger_Res50SB-MBV2.txt)|
|EfficientNet-B0 (78.0)|ResNet-50-SB (80.1)|77.4|78.6|`timm_tf_efficientnet_b0`|`timm_resnet50` |[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/stronger_Res50SB-EfficientNetB0.txt)|
|ResNet-50 (78.5)|Swin-L (86.3)|80.0|80.2|`tv_resnet50`|`timm_swin_large_patch4_window7_224` |[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/stronger_SwinL-Res50.txt) [ckpt](https://drive.google.com/file/d/1iZFP53i4Yw7lqvfV707aTBddN_waEU1r/view?usp=sharing)|
|Swin-T (81.3)|Swin-L (86.3)|81.5|82.3|-|-|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/stronger_SwinL-SwinT.txt)|
* `Swin-L` student:
We implement our DIST on the official code of [Swin-Transformer](https://github.com/microsoft/Swin-Transformer).
### CIFAR-100
Download and extract the [teacher checkpoints](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/cifar_ckpts.zip) to your disk, then pass the path of the corresponding checkpoint (`.pth` file) via `--teacher-ckpt`:
```shell
cd classification
sh tools/dist_train.sh 1 configs/strategies/distill/dist_cifar.yaml ${MODEL} --teacher-model ${T_MODEL} --experiment ${EXP_NAME} --teacher-ckpt ${CKPT}
```
**NOTE**: For `MobileNetV2`, `ShuffleNetV1`, and `ShuffleNetV2`, `lr` and `warmup-lr` should be `0.01`:
```shell
sh tools/dist_train.sh 1 configs/strategies/distill/dist_cifar.yaml ${MODEL} --teacher-model ${T_MODEL} --experiment ${EXP_NAME} --teacher-ckpt ${CKPT} --lr 0.01 --warmup-lr 0.01
```
|Student|Teacher|DIST|MODEL|T_MODEL|Log|
|:--:|:--:|:--:|:--:|:--:|:--:|
|WRN-40-1 (71.98)|WRN-40-2 (75.61)|74.43±0.24|`cifar_wrn_40_1`|`cifar_wrn_40_2`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/log_cifar100_wrn_40_1-wrn_40_2.zip)|
|ResNet-20 (69.06)|ResNet-56 (72.34)|71.75±0.30|`cifar_resnet20`|`cifar_resnet56`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/log_cifar100_res56-res20.zip)|
|ResNet-8x4 (72.50)|ResNet-32x4 (79.42)|76.31±0.19|`cifar_resnet8x4`|`cifar_resnet32x4`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/log_cifar100_res32x4-res8x4.zip)|
|MobileNetV2 (64.60)|ResNet-50 (79.34)|68.66±0.23|`cifar_mobile_half`|`cifar_ResNet50`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/log_cifar100_res50-mbv2.zip)|
|ShuffleNetV1 (70.50)|ResNet-32x4 (79.42)|76.34±0.18|`cifar_ShuffleV1`|`cifar_resnet32x4`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/log_cifar100_res32x4-shufflev1.zip)|
|ShuffleNetV2 (71.82)|ResNet-32x4 (79.42)|77.35±0.25|`cifar_ShuffleV2`|`cifar_resnet32x4`|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.2/log_cifar100_res32x4-shufflev2.zip)|
### COCO Detection
The training code is in [MasKD/mmrazor](https://github.com/hunto/MasKD/tree/main/mmrazor). An example of training `cascade_mask_rcnn_x101-fpn_r50`:
```shell
sh tools/mmdet/dist_train_mmdet.sh configs/distill/dist/dist_cascade_mask_rcnn_x101-fpn_x50_coco.py 8 work_dirs/dist_cmr_x101-fpn_x50
```
|Student|Teacher|DIST|DIST+mimic|Config|Log|
|:--:|:--:|:--:|:--:|:--:|:--:|
|Faster RCNN-R50 (38.4)|Cascade Mask RCNN-X101 (45.6)|40.4|41.8|[[DIST]](https://github.com/hunto/MasKD/blob/main/mmrazor/configs/distill/dist/dist_cascade_mask_rcnn_x101-fpn_x50_coco.py) [[DIST+Mimic]](https://github.com/hunto/MasKD/blob/main/mmrazor/configs/distill/dist/dist+mimic_cascade_mask_rcnn_x101-fpn_x50_coco.py)|[[DIST]](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/det_DIST_fpn-r50_cascade-rcnn-x101.txt) [[DIST+Mimic]](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/det_DIST+mimic_fpn-r50_cascade-rcnn-x101.txt)|
|RetinaNet-R50 (37.4)|RetinaNet-X101 (41.0)|39.8|40.1|[[DIST]](https://github.com/hunto/MasKD/blob/main/mmrazor/configs/distill/dist/dist_retinanet_x101-retinanet-r50_coco.py) [[DIST+Mimic]](https://github.com/hunto/MasKD/blob/main/mmrazor/configs/distill/dist/dist%2Bmimic_retinanet_x101-retinanet-r50_coco.py)|[[DIST]](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/det_DIST_retinanet-r50_retinanet-x101.txt) [[DIST+Mimic]](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/det_DIST+mimic_retinanet-r50_retinanet-x101.txt)|
### Cityscapes Segmentation
Detailed instructions for reproducing our results are in the `segmentation` folder ([README](./segmentation/README.md)).

|Student|Teacher|DIST|Log|
|:--:|:--:|:--:|:--:|
|DeepLabV3-R18 (74.21)|DeepLabV3-R101 (78.07)|77.10|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/seg_DIST_deeplabv3_resnet101_resnet18_log.txt)|
|PSPNet-R18 (72.55)|DeepLabV3-R101 (78.07)|76.31|[log](https://github.com/hunto/DIST_KD/releases/download/v0.0.1/seg_DIST_psp_resnet101_resnet18_log.txt)|
## License
This project is released under the [Apache 2.0 license](LICENSE).
## Citation
```bibtex
@article{huang2022knowledge,
title={Knowledge Distillation from A Stronger Teacher},
author={Huang, Tao and You, Shan and Wang, Fei and Qian, Chen and Xu, Chang},
journal={arXiv preprint arXiv:2205.10536},
year={2022}
}
```