diff --git a/PyTorch/contrib/cv/detection/GCNet/README.md b/PyTorch/contrib/cv/detection/GCNet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1cd45081045d14eaea2db40d80d179af8c4c8f4a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/README.md
@@ -0,0 +1,151 @@
+
+

+
+
+**News**: We released the technical report on [ArXiv](https://arxiv.org/abs/1906.07155).
+
+Documentation: https://mmdetection.readthedocs.io/
+
+## Introduction
+
+MMDetection is an open source object detection toolbox based on PyTorch. It is
+a part of the OpenMMLab project developed by [Multimedia Laboratory, CUHK](http://mmlab.ie.cuhk.edu.hk/).
+
+The master branch works with **PyTorch 1.3 to 1.6**.
+The old v1.x branch works with PyTorch 1.1 to 1.4, but we strongly recommend v2.0 for its faster speed, higher performance, better design, and friendlier usage.
+
+
+
+### Major features
+
+- **Modular Design**
+
+ We decompose the detection framework into different components and one can easily construct a customized object detection framework by combining different modules.
+
+- **Support of multiple frameworks out of the box**
+
+ The toolbox directly supports popular and contemporary detection frameworks, *e.g.* Faster RCNN, Mask RCNN, RetinaNet, etc.
+
+- **High efficiency**
+
+ All basic bbox and mask operations run on GPUs. The training speed is faster than or comparable to other codebases, including [Detectron2](https://github.com/facebookresearch/detectron2), [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) and [SimpleDet](https://github.com/TuSimple/simpledet).
+
+- **State of the art**
+
+ The toolbox stems from the codebase developed by the *MMDet* team, who won [COCO Detection Challenge](http://cocodataset.org/#detection-leaderboard) in 2018, and we keep pushing it forward.
+
+Apart from MMDetection, we also released [mmcv](https://github.com/open-mmlab/mmcv), a library for computer vision research on which this toolbox heavily depends.
+
+## License
+
+This project is released under the [Apache 2.0 license](LICENSE).
+
+## Changelog
+
+v2.6.0 was released on 1/11/2020.
+Please refer to [changelog.md](docs/changelog.md) for details and release history.
+A comparison between v1.x and v2.0 codebases can be found in [compatibility.md](docs/compatibility.md).
+
+## Benchmark and model zoo
+
+Results and models are available in the [model zoo](docs/model_zoo.md).
+
+Supported backbones:
+- [x] ResNet
+- [x] ResNeXt
+- [x] VGG
+- [x] HRNet
+- [x] RegNet
+- [x] Res2Net
+
+Supported methods:
+- [x] [RPN](configs/rpn)
+- [x] [Fast R-CNN](configs/fast_rcnn)
+- [x] [Faster R-CNN](configs/faster_rcnn)
+- [x] [Mask R-CNN](configs/mask_rcnn)
+- [x] [Cascade R-CNN](configs/cascade_rcnn)
+- [x] [Cascade Mask R-CNN](configs/cascade_rcnn)
+- [x] [SSD](configs/ssd)
+- [x] [RetinaNet](configs/retinanet)
+- [x] [GHM](configs/ghm)
+- [x] [Mask Scoring R-CNN](configs/ms_rcnn)
+- [x] [Double-Head R-CNN](configs/double_heads)
+- [x] [Hybrid Task Cascade](configs/htc)
+- [x] [Libra R-CNN](configs/libra_rcnn)
+- [x] [Guided Anchoring](configs/guided_anchoring)
+- [x] [FCOS](configs/fcos)
+- [x] [RepPoints](configs/reppoints)
+- [x] [Foveabox](configs/foveabox)
+- [x] [FreeAnchor](configs/free_anchor)
+- [x] [NAS-FPN](configs/nas_fpn)
+- [x] [ATSS](configs/atss)
+- [x] [FSAF](configs/fsaf)
+- [x] [PAFPN](configs/pafpn)
+- [x] [Dynamic R-CNN](configs/dynamic_rcnn)
+- [x] [PointRend](configs/point_rend)
+- [x] [CARAFE](configs/carafe/README.md)
+- [x] [DCNv2](configs/dcn/README.md)
+- [x] [Group Normalization](configs/gn/README.md)
+- [x] [Weight Standardization](configs/gn+ws/README.md)
+- [x] [OHEM](configs/faster_rcnn/faster_rcnn_r50_fpn_ohem_1x_coco.py)
+- [x] [Soft-NMS](configs/faster_rcnn/faster_rcnn_r50_fpn_soft_nms_1x_coco.py)
+- [x] [Generalized Attention](configs/empirical_attention/README.md)
+- [x] [GCNet](configs/gcnet/README.md)
+- [x] [Mixed Precision (FP16) Training](configs/fp16/README.md)
+- [x] [InstaBoost](configs/instaboost/README.md)
+- [x] [GRoIE](configs/groie/README.md)
+- [x] [DetectoRS](configs/detectors/README.md)
+- [x] [Generalized Focal Loss](configs/gfl/README.md)
+- [x] [CornerNet](configs/cornernet/README.md)
+- [x] [Side-Aware Boundary Localization](configs/sabl/README.md)
+- [x] [YOLOv3](configs/yolo/README.md)
+- [x] [PAA](configs/paa/README.md)
+- [x] [YOLACT](configs/yolact/README.md)
+- [x] [CentripetalNet](configs/centripetalnet/README.md)
+- [x] [VFNet](configs/vfnet/README.md)
+
+Some other methods are also supported in [projects using MMDetection](./docs/projects.md).
+
+## Installation
+
+Please refer to [get_started.md](docs/get_started.md) for installation.
+
+## Getting Started
+
+Please see [get_started.md](docs/get_started.md) for the basic usage of MMDetection.
+We provide [colab tutorial](demo/MMDet_Tutorial.ipynb), and full guidance for quick run [with existing dataset](docs/1_exist_data_model.md) and [with new dataset](docs/2_new_data_model.md) for beginners.
+There are also tutorials for [finetuning models](docs/tutorials/finetune.md), [adding new dataset](docs/tutorials/new_dataset.md), [designing data pipeline](docs/tutorials/data_pipeline.md), [customizing models](docs/tutorials/customize_models.md), [customizing runtime settings](docs/tutorials/customize_runtime.md) and [useful tools](docs/useful_tools.md).
+
+For troubleshooting, please refer to [trouble_shooting.md](docs/trouble_shooting.md).
+
+## Contributing
+
+We appreciate all contributions to improve MMDetection. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
+
+## Acknowledgement
+
+MMDetection is an open source project contributed to by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedback.
+We hope that the toolbox and benchmark serve the growing research community by providing a flexible toolkit for reimplementing existing methods and developing new detectors.
+
+## Citation
+
+If you use this toolbox or benchmark in your research, please cite this project.
+
+```
+@article{mmdetection,
+ title = {{MMDetection}: Open MMLab Detection Toolbox and Benchmark},
+ author = {Chen, Kai and Wang, Jiaqi and Pang, Jiangmiao and Cao, Yuhang and
+ Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and
+ Liu, Ziwei and Xu, Jiarui and Zhang, Zheng and Cheng, Dazhi and
+ Zhu, Chenchen and Cheng, Tianheng and Zhao, Qijie and Li, Buyu and
+ Lu, Xin and Zhu, Rui and Wu, Yue and Dai, Jifeng and Wang, Jingdong
+ and Shi, Jianping and Ouyang, Wanli and Loy, Chen Change and Lin, Dahua},
+ journal= {arXiv preprint arXiv:1906.07155},
+ year={2019}
+}
+```
+
+## Contact
+
+This repo is currently maintained by Kai Chen ([@hellock](http://github.com/hellock)), Yuhang Cao ([@yhcao6](https://github.com/yhcao6)), Wenwei Zhang ([@ZwwWayne](https://github.com/ZwwWayne)),
+Jiarui Xu ([@xvjiarui](https://github.com/xvjiarui)). Other core developers include Jiangmiao Pang ([@OceanPang](https://github.com/OceanPang)) and Jiaqi Wang ([@myownskyW7](https://github.com/myownskyW7)).
diff --git a/PyTorch/contrib/cv/detection/GCNet/README_raw.md b/PyTorch/contrib/cv/detection/GCNet/README_raw.md
new file mode 100644
index 0000000000000000000000000000000000000000..4273d4f51593c8fc079792921c01c89e32b8986a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/README_raw.md
@@ -0,0 +1,118 @@
+# GCNet
+
+This repository implements training of GCNet on the COCO dataset, mainly modified from [open-mmlab/mmdetection](https://github.com/open-mmlab/mmdetection).
+
+## GCNet Details
+
+GCNet was initially described in [this arXiv paper](https://arxiv.org/abs/1904.11492). By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-and-Excitation Networks (SENet), GCNet provides a simple, fast, and effective approach to global context modeling, and it generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
+
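+A minimal PyTorch sketch of the GC block described in the paper is shown below (softmax attention for context modeling, a bottleneck transform, and broadcast addition). Note that the layer actually enabled by the GCNet configs in this repository is mmcv's `ContextBlock`; the class and names below are purely illustrative.
+
+```
+# Illustrative sketch of a GC block; not the mmcv ContextBlock used by the configs.
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+class GlobalContextBlock(nn.Module):
+    def __init__(self, channels, ratio=1 / 4):
+        super().__init__()
+        hidden = int(channels * ratio)
+        self.attn = nn.Conv2d(channels, 1, kernel_size=1)   # context modeling
+        self.transform = nn.Sequential(                      # bottleneck transform
+            nn.Conv2d(channels, hidden, kernel_size=1),
+            nn.LayerNorm([hidden, 1, 1]),
+            nn.ReLU(inplace=True),
+            nn.Conv2d(hidden, channels, kernel_size=1))
+
+    def forward(self, x):
+        n, c, h, w = x.shape
+        # Softmax-normalized spatial attention shared across all channels.
+        weights = F.softmax(self.attn(x).view(n, 1, h * w), dim=2)         # (N, 1, HW)
+        # Global context: attention-weighted sum over all spatial positions.
+        context = torch.bmm(x.view(n, c, h * w), weights.transpose(1, 2))  # (N, C, 1)
+        context = context.view(n, c, 1, 1)
+        # Fuse by broadcast addition, as in the paper's "add" fusion variant.
+        return x + self.transform(context)
+
+if __name__ == '__main__':
+    block = GlobalContextBlock(channels=256)
+    print(block(torch.randn(2, 256, 32, 32)).shape)  # torch.Size([2, 256, 32, 32])
+```
+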
+## Requirements
+
+- Install the run package matching the NPU
+- Python 3.7.5
+- PyTorch (NPU version)
+- Apex (NPU version)
+
+### Code and data preparation
+
+1. Download the compressed GCNet folder.
+2. Extract the GCNet archive on the NPU server.
+3. Prepare the COCO dataset and place it at the expected location (the path used in the configs is /opt/npu/coco); a quick layout check is sketched after this list.
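+
+As an optional sanity check, the expected COCO 2017 layout can be verified before training. This is only a sketch; adjust `data_root` to your actual location:
+
+```
+# Optional check of the COCO layout assumed by the configs; data_root is illustrative.
+from pathlib import Path
+
+data_root = Path('/opt/npu/coco')
+expected = [
+    'annotations/instances_train2017.json',
+    'annotations/instances_val2017.json',
+    'train2017',
+    'val2017',
+]
+for rel in expected:
+    path = data_root / rel
+    print(f'{path}: {"OK" if path.exists() else "MISSING"}')
+```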
+
+### Download and modify mmcv
+
+1. Download mmcv-full (the version used is 1.3.8) and mmdetection (the version used is 1.2.7):
+
+```
+git clone -b v1.3.8 https://github.com/open-mmlab/mmcv.git
+git clone -b v1.2.7 https://github.com/open-mmlab/mmdetection.git
+```
+
+2. Replace the mmcv package directory inside the cloned mmcv repository (mmcv/mmcv) with the GCNet/mmcv directory,
+
+or install mmcv-full with pip and then replace the library files manually.
+
+3. Replace the mmdet package directory inside the cloned mmdetection repository (mmdetection/mmdet) with the GCNet/mmdet directory,
+
+or install mmdet with pip and then replace the library files manually.
+
+To benchmark 1p performance, use the mmdet_1p directory instead. A sketch of this replacement step follows.
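+
+The directory replacement in steps 2 and 3 can also be scripted. The snippet below is only a rough sketch; the paths assume that GCNet, mmcv, and mmdetection sit side by side and should be adjusted to your actual layout:
+
+```
+# Rough sketch of the replacement in steps 2 and 3; paths are illustrative.
+import shutil
+from pathlib import Path
+
+root = Path('.').resolve()
+replacements = {
+    root / 'GCNet/mmcv': root / 'mmcv/mmcv',            # step 2
+    root / 'GCNet/mmdet': root / 'mmdetection/mmdet',   # step 3 (use GCNet/mmdet_1p for 1p runs)
+}
+for src, dst in replacements.items():
+    if dst.exists():
+        shutil.rmtree(dst)     # drop the original package directory
+    shutil.copytree(src, dst)  # copy in the patched GCNet version
+    print(f'replaced {dst} with {src}')
+```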
+
+### Configure the environment
+
+1. We recommend using conda to manage the environment:
+
+```
+conda create -n gcnet --clone env # clone an existing environment that already contains the required packages
+conda activate gcnet
+```
+
+2. Build and install mmcv:
+
+```
+cd mmcv
+export MMCV_WITH_OPS=1
+export MAX_JOBS=8
+python3.7 setup.py build_ext
+python3.7 setup.py develop
+pip3 list | grep mmcv # check the installed version and path
+```
+
+3. Build and install mmdet:
+
+```
+cd mmdetection
+pip install -r requirements/build.txt
+pip install -v -e . # or "python setup.py develop"
+```
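+
+After both installs, a quick import check (optional, shown only as a sketch) helps confirm that the patched packages are the ones on the Python path:
+
+```
+# Confirm the editable installs are picked up; versions should match the clones above.
+import mmcv
+import mmdet
+
+print('mmcv :', mmcv.__version__, '->', mmcv.__file__)
+print('mmdet:', mmdet.__version__, '->', mmdet.__file__)
+```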
+
+
+
+## Train the model
+
+### Enter the GCNet directory
+
+```
+cd GCNet
+```
+
+### 1p
+
+Set the environment variables, make train_full_1p.sh executable, and run it:
+
+```
+chmod +x ./test/train_full_1p.sh
+bash ./test/train_full_1p.sh
+```
+
+### 8p
+
+Set the environment variables, make train_full_8p.sh executable, and run it:
+
+```
+chmod +x ./test/train_full_8p.sh
+bash ./test/train_full_8p.sh
+```
+
+### 8p_perf
+
+Make train_performance_8p.sh executable and run it:
+
+```
+chmod +x ./test/train_performance_8p.sh
+bash ./test/train_performance_8p.sh
+```
+
+
+
+### Eval
+
+Make eval.sh executable and run it:
+
+```
+chmod +x ./test/eval.sh
+bash ./test/eval.sh
+```
+
+
+
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/cityscapes_detection.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/cityscapes_detection.py
new file mode 100644
index 0000000000000000000000000000000000000000..156aca02588a96a4e279de2e647864b0739e476d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/cityscapes_detection.py
@@ -0,0 +1,55 @@
+dataset_type = 'CityscapesDataset'
+data_root = 'data/cityscapes/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(2048, 1024),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=1,
+ workers_per_gpu=2,
+ train=dict(
+ type='RepeatDataset',
+ times=8,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/instancesonly_filtered_gtFine_train.json',
+ img_prefix=data_root + 'leftImg8bit/train/',
+ pipeline=train_pipeline)),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/instancesonly_filtered_gtFine_val.json',
+ img_prefix=data_root + 'leftImg8bit/val/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/instancesonly_filtered_gtFine_test.json',
+ img_prefix=data_root + 'leftImg8bit/test/',
+ pipeline=test_pipeline))
+evaluation = dict(interval=1, metric='bbox')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/cityscapes_instance.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/cityscapes_instance.py
new file mode 100644
index 0000000000000000000000000000000000000000..3c5472aab09acdd5efa2cee206d94824f06058f9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/cityscapes_instance.py
@@ -0,0 +1,55 @@
+dataset_type = 'CityscapesDataset'
+data_root = 'data/cityscapes/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(
+ type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(2048, 1024),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=1,
+ workers_per_gpu=2,
+ train=dict(
+ type='RepeatDataset',
+ times=8,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/instancesonly_filtered_gtFine_train.json',
+ img_prefix=data_root + 'leftImg8bit/train/',
+ pipeline=train_pipeline)),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/instancesonly_filtered_gtFine_val.json',
+ img_prefix=data_root + 'leftImg8bit/val/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/instancesonly_filtered_gtFine_test.json',
+ img_prefix=data_root + 'leftImg8bit/test/',
+ pipeline=test_pipeline))
+evaluation = dict(metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_detection.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_detection.py
new file mode 100644
index 0000000000000000000000000000000000000000..09a75c404687223c71dcdf0abc7af827f2e498a6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_detection.py
@@ -0,0 +1,48 @@
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
+evaluation = dict(interval=1, metric='bbox')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_instance.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_instance.py
new file mode 100644
index 0000000000000000000000000000000000000000..78dbcc86366c429fbac8143929e54c7234ed193d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_instance.py
@@ -0,0 +1,51 @@
+dataset_type = 'CocoDataset'
+data_root = '/opt/npu/dataset/coco/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=1344),
+ # dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ # dict(type='Pad', size_divisor=32),
+ dict(type='Pad', size_divisor=1344),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
+evaluation = dict(metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_instance_semantic.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_instance_semantic.py
new file mode 100644
index 0000000000000000000000000000000000000000..f7c072ec92731af85952840128f6527bc799913a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/coco_instance_semantic.py
@@ -0,0 +1,53 @@
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='SegRescale', scale_factor=1 / 8),
+ dict(type='DefaultFormatBundle'),
+ dict(
+ type='Collect',
+ keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ seg_prefix=data_root + 'stuffthingmaps/train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
+evaluation = dict(metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/deepfashion.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/deepfashion.py
new file mode 100644
index 0000000000000000000000000000000000000000..308b4b2ac4d9e3516ba4a57e9d3b6af91e97f24b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/deepfashion.py
@@ -0,0 +1,53 @@
+# dataset settings
+dataset_type = 'DeepFashionDataset'
+data_root = 'data/DeepFashion/In-shop/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(750, 1101), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(750, 1101),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ imgs_per_gpu=2,
+ workers_per_gpu=1,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/DeepFashion_segmentation_query.json',
+ img_prefix=data_root + 'Img/',
+ pipeline=train_pipeline,
+ data_root=data_root),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/DeepFashion_segmentation_query.json',
+ img_prefix=data_root + 'Img/',
+ pipeline=test_pipeline,
+ data_root=data_root),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root +
+ 'annotations/DeepFashion_segmentation_gallery.json',
+ img_prefix=data_root + 'Img/',
+ pipeline=test_pipeline,
+ data_root=data_root))
+evaluation = dict(interval=5, metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/lvis_v0.5_instance.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/lvis_v0.5_instance.py
new file mode 100644
index 0000000000000000000000000000000000000000..f3da861d6df05b8da58f361815892a416987a927
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/lvis_v0.5_instance.py
@@ -0,0 +1,23 @@
+_base_ = 'coco_instance.py'
+dataset_type = 'LVISV05Dataset'
+data_root = 'data/lvis_v0.5/'
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ _delete_=True,
+ type='ClassBalancedDataset',
+ oversample_thr=1e-3,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/lvis_v0.5_train.json',
+ img_prefix=data_root + 'train2017/')),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/lvis_v0.5_val.json',
+ img_prefix=data_root + 'val2017/'),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/lvis_v0.5_val.json',
+ img_prefix=data_root + 'val2017/'))
+evaluation = dict(metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/lvis_v1_instance.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/lvis_v1_instance.py
new file mode 100644
index 0000000000000000000000000000000000000000..e8c5d1b14594a6ea38b215635686c04995338ed7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/lvis_v1_instance.py
@@ -0,0 +1,23 @@
+_base_ = 'coco_instance.py'
+dataset_type = 'LVISV1Dataset'
+data_root = 'data/lvis_v1/'
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ _delete_=True,
+ type='ClassBalancedDataset',
+ oversample_thr=1e-3,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/lvis_v1_train.json',
+ img_prefix=data_root)),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/lvis_v1_val.json',
+ img_prefix=data_root),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/lvis_v1_val.json',
+ img_prefix=data_root))
+evaluation = dict(metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/voc0712.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/voc0712.py
new file mode 100644
index 0000000000000000000000000000000000000000..ae09acdd5c9580217815300abbad9f08b71b37ed
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/voc0712.py
@@ -0,0 +1,55 @@
+# dataset settings
+dataset_type = 'VOCDataset'
+data_root = 'data/VOCdevkit/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1000, 600),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ type='RepeatDataset',
+ times=3,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=[
+ data_root + 'VOC2007/ImageSets/Main/trainval.txt',
+ data_root + 'VOC2012/ImageSets/Main/trainval.txt'
+ ],
+ img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
+ pipeline=train_pipeline)),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
+ img_prefix=data_root + 'VOC2007/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
+ img_prefix=data_root + 'VOC2007/',
+ pipeline=test_pipeline))
+evaluation = dict(interval=1, metric='mAP')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/wider_face.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/wider_face.py
new file mode 100644
index 0000000000000000000000000000000000000000..d1d649be42bca2955fb56a784fe80bcc2fdce4e1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/datasets/wider_face.py
@@ -0,0 +1,63 @@
+# dataset settings
+dataset_type = 'WIDERFaceDataset'
+data_root = 'data/WIDERFace/'
+img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(300, 300), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(300, 300),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=60,
+ workers_per_gpu=2,
+ train=dict(
+ type='RepeatDataset',
+ times=2,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root + 'train.txt',
+ img_prefix=data_root + 'WIDER_train/',
+ min_size=17,
+ pipeline=train_pipeline)),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'val.txt',
+ img_prefix=data_root + 'WIDER_val/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'val.txt',
+ img_prefix=data_root + 'WIDER_val/',
+ pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/default_runtime.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/default_runtime.py
new file mode 100644
index 0000000000000000000000000000000000000000..46a39cf7df52159bfa5b82586419ce0bd5885a10
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/default_runtime.py
@@ -0,0 +1,14 @@
+checkpoint_config = dict(interval=1)
+# yapf:disable
+log_config = dict(
+ interval=50,
+ hooks=[
+ dict(type='TextLoggerHook'),
+ # dict(type='TensorboardLoggerHook')
+ ])
+# yapf:enable
+dist_params = dict(backend='hccl')
+log_level = 'INFO'
+load_from = None
+resume_from = None
+workflow = [('train', 1)]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/cascade_mask_rcnn_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/cascade_mask_rcnn_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..f90b78cef38815b004175d94eee023d3b5ef5e25
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/cascade_mask_rcnn_r50_fpn.py
@@ -0,0 +1,200 @@
+# model settings
+model = dict(
+ type='CascadeRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ type='CascadeRoIHead',
+ num_stages=3,
+ stage_loss_weights=[1, 0.5, 0.25],
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=[
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
+ loss_weight=1.0)),
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.05, 0.05, 0.1, 0.1]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
+ loss_weight=1.0)),
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.033, 0.033, 0.067, 0.067]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
+ ],
+ mask_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ mask_head=dict(
+ type='FCNMaskHead',
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=[
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False),
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.6,
+ neg_iou_thr=0.6,
+ min_pos_iou=0.6,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False),
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.7,
+ min_pos_iou=0.7,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False)
+ ])
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100,
+ mask_thr_binary=0.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/cascade_rcnn_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/cascade_rcnn_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..303276b845fecd041d093e240046de08b6016638
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/cascade_rcnn_r50_fpn.py
@@ -0,0 +1,183 @@
+# model settings
+model = dict(
+ type='CascadeRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ type='CascadeRoIHead',
+ num_stages=3,
+ stage_loss_weights=[1, 0.5, 0.25],
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=[
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
+ loss_weight=1.0)),
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.05, 0.05, 0.1, 0.1]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
+ loss_weight=1.0)),
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.033, 0.033, 0.067, 0.067]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
+ ]))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=[
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_weight=-1,
+ debug=False),
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.6,
+ neg_iou_thr=0.6,
+ min_pos_iou=0.6,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_weight=-1,
+ debug=False),
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.7,
+ min_pos_iou=0.7,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_weight=-1,
+ debug=False)
+ ])
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/fast_rcnn_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/fast_rcnn_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..b8d9570deeaaf0cf42b0e16619a1dfc22d38ae5d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/fast_rcnn_r50_fpn.py
@@ -0,0 +1,62 @@
+# model settings
+model = dict(
+ type='FastRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ roi_head=dict(
+ type='StandardRoIHead',
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/faster_rcnn_r50_caffe_c4.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/faster_rcnn_r50_caffe_c4.py
new file mode 100644
index 0000000000000000000000000000000000000000..5a381636382bdd82dc7650e199ef26a3602513e3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/faster_rcnn_r50_caffe_c4.py
@@ -0,0 +1,116 @@
+# model settings
+norm_cfg = dict(type='BN', requires_grad=False)
+model = dict(
+ type='FasterRCNN',
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=3,
+ strides=(1, 2, 2),
+ dilations=(1, 1, 1),
+ out_indices=(2, ),
+ frozen_stages=1,
+ norm_cfg=norm_cfg,
+ norm_eval=True,
+ style='caffe'),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=1024,
+ feat_channels=1024,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[2, 4, 8, 16, 32],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[16]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ roi_head=dict(
+ type='StandardRoIHead',
+ shared_head=dict(
+ type='ResLayer',
+ depth=50,
+ stage=3,
+ stride=2,
+ dilation=1,
+ style='caffe',
+ norm_cfg=norm_cfg,
+ norm_eval=True),
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=1024,
+ featmap_strides=[16]),
+ bbox_head=dict(
+ type='BBoxHead',
+ with_avg_pool=True,
+ roi_feat_size=7,
+ in_channels=2048,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=12000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=6000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/faster_rcnn_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/faster_rcnn_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..338a5c6b604d4bfe316ad35ab51d6b997f74ba9e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/faster_rcnn_r50_fpn.py
@@ -0,0 +1,111 @@
+model = dict(
+ type='FasterRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ roi_head=dict(
+ type='StandardRoIHead',
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
+ # soft-nms is also supported for rcnn testing
+ # e.g., nms=dict(type='soft_nms', iou_threshold=0.5, min_score=0.05)
+)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_npu.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_npu.py
new file mode 100644
index 0000000000000000000000000000000000000000..4472bd0a80d7426278cbb05ab4be9bf411eaef0f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_npu.py
@@ -0,0 +1,124 @@
+# model settings
+model = dict(
+ type='MaskRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ roi_head=dict(
+ type='StandardRoIHead',
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ mask_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ mask_head=dict(
+ type='FCNMaskHead',
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100,
+ mask_thr_binary=0.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_r50_caffe_c4.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_r50_caffe_c4.py
new file mode 100644
index 0000000000000000000000000000000000000000..b9b29b0b99de34caadd1d906b1b9367659524c89
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_r50_caffe_c4.py
@@ -0,0 +1,127 @@
+# model settings
+norm_cfg = dict(type='BN', requires_grad=False)
+model = dict(
+ type='MaskRCNN',
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=3,
+ strides=(1, 2, 2),
+ dilations=(1, 1, 1),
+ out_indices=(2, ),
+ frozen_stages=1,
+ norm_cfg=norm_cfg,
+ norm_eval=True,
+ style='caffe'),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=1024,
+ feat_channels=1024,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[2, 4, 8, 16, 32],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[16]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ roi_head=dict(
+ type='StandardRoIHead',
+ shared_head=dict(
+ type='ResLayer',
+ depth=50,
+ stage=3,
+ stride=2,
+ dilation=1,
+ style='caffe',
+ norm_cfg=norm_cfg,
+ norm_eval=True),
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=1024,
+ featmap_strides=[16]),
+ bbox_head=dict(
+ type='BBoxHead',
+ with_avg_pool=True,
+ roi_feat_size=7,
+ in_channels=2048,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ mask_roi_extractor=None,
+ mask_head=dict(
+ type='FCNMaskHead',
+ num_convs=0,
+ in_channels=2048,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=12000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=False,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=14,
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=6000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100,
+ mask_thr_binary=0.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..4472bd0a80d7426278cbb05ab4be9bf411eaef0f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/mask_rcnn_r50_fpn.py
@@ -0,0 +1,124 @@
+# model settings
+model = dict(
+ type='MaskRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ roi_head=dict(
+ type='StandardRoIHead',
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
+ mask_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ mask_head=dict(
+ type='FCNMaskHead',
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ match_low_quality=True,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100,
+ mask_thr_binary=0.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/retinanet_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/retinanet_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..a08b14f60992a8a5c00c668b37eb9a4dbf0ac7a3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/retinanet_r50_fpn.py
@@ -0,0 +1,60 @@
+# model settings
+model = dict(
+ type='RetinaNet',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_input',
+ num_outs=5),
+ bbox_head=dict(
+ type='RetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/rpn_r50_caffe_c4.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/rpn_r50_caffe_c4.py
new file mode 100644
index 0000000000000000000000000000000000000000..bd5d665e0331711adfb2cb3eeea113ed4762e5db
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/rpn_r50_caffe_c4.py
@@ -0,0 +1,58 @@
+# model settings
+model = dict(
+ type='RPN',
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=3,
+ strides=(1, 2, 2),
+ dilations=(1, 1, 1),
+ out_indices=(2, ),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ norm_eval=True,
+ style='caffe'),
+ neck=None,
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=1024,
+ feat_channels=1024,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[2, 4, 8, 16, 32],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[16]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=12000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/rpn_r50_fpn.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/rpn_r50_fpn.py
new file mode 100644
index 0000000000000000000000000000000000000000..13e96191deb243d1f625d99ac85bf17503f1f8a8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/rpn_r50_fpn.py
@@ -0,0 +1,60 @@
+# model settings
+model = dict(
+ type='RPN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='L1Loss', loss_weight=1.0)))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/ssd300.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/ssd300.py
new file mode 100644
index 0000000000000000000000000000000000000000..ee7cf3adc8aaced804031196c3901f90b0b0d140
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/models/ssd300.py
@@ -0,0 +1,49 @@
+# model settings
+input_size = 300
+model = dict(
+ type='SingleStageDetector',
+ pretrained='open-mmlab://vgg16_caffe',
+ backbone=dict(
+ type='SSDVGG',
+ input_size=input_size,
+ depth=16,
+ with_last_pool=False,
+ ceil_mode=True,
+ out_indices=(3, 4),
+ out_feature_indices=(22, 34),
+ l2_norm_scale=20),
+ neck=None,
+ bbox_head=dict(
+ type='SSDHead',
+ in_channels=(512, 1024, 512, 256, 256, 256),
+ num_classes=80,
+ anchor_generator=dict(
+ type='SSDAnchorGenerator',
+ scale_major=False,
+ input_size=input_size,
+ basesize_ratio_range=(0.15, 0.9),
+ strides=[8, 16, 32, 64, 100, 300],
+ ratios=[[2], [2, 3], [2, 3], [2, 3], [2], [2]]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.1, 0.1, 0.2, 0.2])))
+cudnn_benchmark = True
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.,
+ ignore_iof_thr=-1,
+ gt_max_assign_all=False),
+ smoothl1_beta=1.,
+ allowed_border=-1,
+ pos_weight=-1,
+ neg_pos_ratio=3,
+ debug=False)
+test_cfg = dict(
+ nms=dict(type='nms', iou_threshold=0.45),
+ min_bbox_size=0,
+ score_thr=0.02,
+ max_per_img=200)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_1x.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_1x.py
new file mode 100644
index 0000000000000000000000000000000000000000..f56f386267ae0cd514b8bc889945f8bf6fb5154a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_1x.py
@@ -0,0 +1,11 @@
+# optimizer
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ step=[8, 11])
+total_epochs = 15
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_20e.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_20e.py
new file mode 100644
index 0000000000000000000000000000000000000000..0559030c24ed097d86918bbd589a6a12f8dd8bd5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_20e.py
@@ -0,0 +1,11 @@
+# optimizer
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_2x.py b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_2x.py
new file mode 100644
index 0000000000000000000000000000000000000000..e34095ff2b5ffdb1f9ba07380a6948504715e3d8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/_base_/schedules/schedule_2x.py
@@ -0,0 +1,11 @@
+# optimizer
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/albu_example/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/albu_example/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4af237e1c713f8ca6ea1f4000d2a5b2e808ea727
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/albu_example/README.md
@@ -0,0 +1,5 @@
+## Results and Models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50 | pytorch | 1x | 4.4 | 16.6 | 38.0 | 34.5 |[model](http://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208-ab203bcd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208_225520.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/albu_example/mask_rcnn_r50_fpn_albu_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/albu_example/mask_rcnn_r50_fpn_albu_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b3f879a6c573871ea17b2bf158173aadf14457b6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/albu_example/mask_rcnn_r50_fpn_albu_1x_coco.py
@@ -0,0 +1,73 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+albu_train_transforms = [
+ dict(
+ type='ShiftScaleRotate',
+ shift_limit=0.0625,
+ scale_limit=0.0,
+ rotate_limit=0,
+ interpolation=1,
+ p=0.5),
+ dict(
+ type='RandomBrightnessContrast',
+ brightness_limit=[0.1, 0.3],
+ contrast_limit=[0.1, 0.3],
+ p=0.2),
+ dict(
+ type='OneOf',
+ transforms=[
+ dict(
+ type='RGBShift',
+ r_shift_limit=10,
+ g_shift_limit=10,
+ b_shift_limit=10,
+ p=1.0),
+ dict(
+ type='HueSaturationValue',
+ hue_shift_limit=20,
+ sat_shift_limit=30,
+ val_shift_limit=20,
+ p=1.0)
+ ],
+ p=0.1),
+ dict(type='JpegCompression', quality_lower=85, quality_upper=95, p=0.2),
+ dict(type='ChannelShuffle', p=0.1),
+ dict(
+ type='OneOf',
+ transforms=[
+ dict(type='Blur', blur_limit=3, p=1.0),
+ dict(type='MedianBlur', blur_limit=3, p=1.0)
+ ],
+ p=0.1),
+]
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='Pad', size_divisor=32),
+ dict(
+ type='Albu',
+ transforms=albu_train_transforms,
+ bbox_params=dict(
+ type='BboxParams',
+ format='pascal_voc',
+ label_fields=['gt_labels'],
+ min_visibility=0.0,
+ filter_lost_elements=True),
+ keymap={
+ 'img': 'image',
+ 'gt_masks': 'masks',
+ 'gt_bboxes': 'bboxes'
+ },
+ update_pad_shape=False,
+ skip_img_without_anno=True),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='DefaultFormatBundle'),
+ dict(
+ type='Collect',
+ keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'],
+ meta_keys=('filename', 'ori_shape', 'img_shape', 'img_norm_cfg',
+ 'pad_shape', 'scale_factor'))
+]
+data = dict(train=dict(pipeline=train_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/atss/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/atss/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..e835a8e61f8ae105364ca1c331055245be96ea96
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/atss/README.md
@@ -0,0 +1,21 @@
+# Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
+
+
+## Introduction
+
+```
+@article{zhang2019bridging,
+ title = {Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection},
+ author = {Zhang, Shifeng and Chi, Cheng and Yao, Yongqiang and Lei, Zhen and Li, Stan Z.},
+ journal = {arXiv preprint arXiv:1912.02424},
+ year = {2019}
+}
+```
+
+
+## Results and Models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | pytorch | 1x | 3.7 | 19.7 | 39.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209-985f7bd0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209_102539.log.json) |
+| R-101 | pytorch | 1x | 5.6 | 12.3 | 41.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/atss/atss_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/atss/atss_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..695779ab31b5f848f8c85c13cc4ca637c8590ba7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/atss/atss_r101_fpn_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './atss_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/atss/atss_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/atss/atss_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e787622c24b5e3b424ca3400eab31efb3d7876af
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/atss/atss_r50_fpn_1x_coco.py
@@ -0,0 +1,62 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ type='ATSS',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_output',
+ num_outs=5),
+ bbox_head=dict(
+ type='ATSSHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ octave_base_scale=8,
+ scales_per_octave=1,
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='GIoULoss', loss_weight=2.0),
+ loss_centerness=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(type='ATSSAssigner', topk=9),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.6),
+ max_per_img=100)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/carafe/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/carafe/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a20a6c7770f48b3036cf2de603591c540dd7451f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/carafe/README.md
@@ -0,0 +1,30 @@
+# CARAFE: Content-Aware ReAssembly of FEatures
+
+## Introduction
+
+We provide config files to reproduce the object detection & instance segmentation results in the ICCV 2019 Oral paper for [CARAFE: Content-Aware ReAssembly of FEatures](https://arxiv.org/abs/1905.02188).
+
+```
+@inproceedings{Wang_2019_ICCV,
+ title = {CARAFE: Content-Aware ReAssembly of FEatures},
+ author = {Wang, Jiaqi and Chen, Kai and Xu, Rui and Liu, Ziwei and Loy, Chen Change and Lin, Dahua},
+ booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
+ month = {October},
+ year = {2019}
+}
+```
+
+## Results and Models
+
+The results on COCO 2017 val are shown in the table below.
+
+| Method | Backbone | Style | Lr schd | Test Proposal Num | Inf time (fps) | Box AP | Mask AP | Download |
+|:--------------------:|:--------:|:-------:|:-------:|:-----------------:|:--------------:|:------:|:-------:|:-------:|
+| Faster R-CNN w/ CARAFE | R-50-FPN | pytorch | 1x | 1000 | 16.5 | 38.6 | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/carafe/faster_rcnn_r50_fpn_carafe_1x_coco/faster_rcnn_r50_fpn_carafe_1x_coco_bbox_mAP-0.386_20200504_175733-385a75b7.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/carafe/faster_rcnn_r50_fpn_carafe_1x_coco/faster_rcnn_r50_fpn_carafe_1x_coco_20200504_175733.log.json) |
+| - | - | - | - | 2000 | | | | |
+| Mask R-CNN w/ CARAFE | R-50-FPN | pytorch | 1x | 1000 | 14.0 | 39.3 | 35.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/carafe/mask_rcnn_r50_fpn_carafe_1x_coco/mask_rcnn_r50_fpn_carafe_1x_coco_bbox_mAP-0.393__segm_mAP-0.358_20200503_135957-8687f195.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/carafe/mask_rcnn_r50_fpn_carafe_1x_coco/mask_rcnn_r50_fpn_carafe_1x_coco_20200503_135957.log.json) |
+| - | - | - | - | 2000 | | | | |
+
+## Implementation
+
+The CUDA implementation of CARAFE can be found at https://github.com/myownskyW7/CARAFE.
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/carafe/faster_rcnn_r50_fpn_carafe_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/carafe/faster_rcnn_r50_fpn_carafe_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..dedac3f46b4710d16a8bc66f00663e379b2ebdc7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/carafe/faster_rcnn_r50_fpn_carafe_1x_coco.py
@@ -0,0 +1,50 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ neck=dict(
+ type='FPN_CARAFE',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5,
+ start_level=0,
+ end_level=-1,
+ norm_cfg=None,
+ act_cfg=None,
+ order=('conv', 'norm', 'act'),
+ upsample_cfg=dict(
+ type='carafe',
+ up_kernel=5,
+ up_group=1,
+ encoder_kernel=3,
+ encoder_dilation=1,
+ compressed_channels=64)))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=64),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=64),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/carafe/mask_rcnn_r50_fpn_carafe_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/carafe/mask_rcnn_r50_fpn_carafe_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..668c023981b9d421e5b51a48757c3819d090307f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/carafe/mask_rcnn_r50_fpn_carafe_1x_coco.py
@@ -0,0 +1,60 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ neck=dict(
+ type='FPN_CARAFE',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5,
+ start_level=0,
+ end_level=-1,
+ norm_cfg=None,
+ act_cfg=None,
+ order=('conv', 'norm', 'act'),
+ upsample_cfg=dict(
+ type='carafe',
+ up_kernel=5,
+ up_group=1,
+ encoder_kernel=3,
+ encoder_dilation=1,
+ compressed_channels=64)),
+ roi_head=dict(
+ mask_head=dict(
+ upsample_cfg=dict(
+ type='carafe',
+ scale_factor=2,
+ up_kernel=5,
+ up_group=1,
+ encoder_kernel=3,
+ encoder_dilation=1,
+ compressed_channels=64))))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=64),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=64),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a19377756a15f876a0579ddc147eb239df8f6b90
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/README.md
@@ -0,0 +1,52 @@
+# Cascade R-CNN: High Quality Object Detection and Instance Segmentation
+
+## Introduction
+```
+@article{Cai_2019,
+ title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
+ ISSN={1939-3539},
+ url={http://dx.doi.org/10.1109/tpami.2019.2956516},
+ DOI={10.1109/tpami.2019.2956516},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ publisher={Institute of Electrical and Electronics Engineers (IEEE)},
+ author={Cai, Zhaowei and Vasconcelos, Nuno},
+ year={2019},
+ pages={1–1}
+}
+```
+
+## Results and models
+
+### Cascade R-CNN
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: |:--------:|
+| R-50-FPN | caffe | 1x | 4.2 | | 40.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco/cascade_rcnn_r50_caffe_fpn_1x_coco_bbox_mAP-0.404_20200504_174853-b857be87.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco/cascade_rcnn_r50_caffe_fpn_1x_coco_20200504_174853.log.json) |
+| R-50-FPN | pytorch | 1x | 4.4 | 16.1 | 40.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco/cascade_rcnn_r50_fpn_1x_coco_20200316-3dc56deb.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco/cascade_rcnn_r50_fpn_1x_coco_20200316_214748.log.json) |
+| R-50-FPN | pytorch | 20e | - | - | 41.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r50_fpn_20e_coco/cascade_rcnn_r50_fpn_20e_coco_bbox_mAP-0.41_20200504_175131-e9872a90.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r50_fpn_20e_coco/cascade_rcnn_r50_fpn_20e_coco_20200504_175131.log.json) |
+| R-101-FPN | caffe | 1x | 6.2 | | 42.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r101_caffe_fpn_1x_coco/cascade_rcnn_r101_caffe_fpn_1x_coco_bbox_mAP-0.423_20200504_175649-cab8dbd5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r101_caffe_fpn_1x_coco/cascade_rcnn_r101_caffe_fpn_1x_coco_20200504_175649.log.json) |
+| R-101-FPN | pytorch | 1x | 6.4 | 13.5 | 42.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r101_fpn_1x_coco/cascade_rcnn_r101_fpn_1x_coco_20200317-0b6a2fbf.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r101_fpn_1x_coco/cascade_rcnn_r101_fpn_1x_coco_20200317_101744.log.json) |
+| R-101-FPN | pytorch | 20e | - | - | 42.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r101_fpn_20e_coco/cascade_rcnn_r101_fpn_20e_coco_bbox_mAP-0.425_20200504_231812-5057dcc5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_r101_fpn_20e_coco/cascade_rcnn_r101_fpn_20e_coco_20200504_231812.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 7.6 | 10.9 | 43.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_1x_coco/cascade_rcnn_x101_32x4d_fpn_1x_coco_20200316-95c2deb6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_1x_coco/cascade_rcnn_x101_32x4d_fpn_1x_coco_20200316_055608.log.json) |
+| X-101-32x4d-FPN | pytorch | 20e | 7.6 | | 43.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_20e_coco/cascade_rcnn_x101_32x4d_fpn_20e_coco_20200906_134608-9ae0a720.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_20e_coco/cascade_rcnn_x101_32x4d_fpn_20e_coco_20200906_134608.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 10.7 | | 44.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_1x_coco/cascade_rcnn_x101_64x4d_fpn_1x_coco_20200515_075702-43ce6a30.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_1x_coco/cascade_rcnn_x101_64x4d_fpn_1x_coco_20200515_075702.log.json) |
+| X-101-64x4d-FPN | pytorch | 20e | 10.7 | | 44.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_20e_coco/cascade_rcnn_x101_64x4d_fpn_20e_coco_20200509_224357-051557b1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_20e_coco/cascade_rcnn_x101_64x4d_fpn_20e_coco_20200509_224357.log.json)|
+
+### Cascade Mask R-CNN
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :----------------: |
+| R-50-FPN | caffe | 1x | 5.9 | | 41.2 | 36.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_caffe_fpn_1x_coco/cascade_mask_rcnn_r50_caffe_fpn_1x_coco_bbox_mAP-0.412__segm_mAP-0.36_20200504_174659-5004b251.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_caffe_fpn_1x_coco/cascade_mask_rcnn_r50_caffe_fpn_1x_coco_20200504_174659.log.json) |
+| R-50-FPN | pytorch | 1x | 6.0 | 11.2 | 41.2 | 35.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco/cascade_mask_rcnn_r50_fpn_1x_coco_20200203-9d4dcb24.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco/cascade_mask_rcnn_r50_fpn_1x_coco_20200203_170449.log.json) |
+| R-50-FPN | pytorch | 20e | - | - | 41.9 | 36.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_fpn_20e_coco/cascade_mask_rcnn_r50_fpn_20e_coco_bbox_mAP-0.419__segm_mAP-0.365_20200504_174711-4af8e66e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_fpn_20e_coco/cascade_mask_rcnn_r50_fpn_20e_coco_20200504_174711.log.json)|
+| R-101-FPN | caffe | 1x | 7.8 | | 43.2 | 37.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r101_caffe_fpn_1x_coco/cascade_mask_rcnn_r101_caffe_fpn_1x_coco_bbox_mAP-0.432__segm_mAP-0.376_20200504_174813-5c1e9599.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r101_caffe_fpn_1x_coco/cascade_mask_rcnn_r101_caffe_fpn_1x_coco_20200504_174813.log.json)|
+| R-101-FPN | pytorch | 1x | 7.9 | 9.8 | 42.9 | 37.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r101_fpn_1x_coco/cascade_mask_rcnn_r101_fpn_1x_coco_20200203-befdf6ee.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r101_fpn_1x_coco/cascade_mask_rcnn_r101_fpn_1x_coco_20200203_092521.log.json) |
+| R-101-FPN | pytorch | 20e | - | - | 43.4 | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r101_fpn_20e_coco/cascade_mask_rcnn_r101_fpn_20e_coco_bbox_mAP-0.434__segm_mAP-0.378_20200504_174836-005947da.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r101_fpn_20e_coco/cascade_mask_rcnn_r101_fpn_20e_coco_20200504_174836.log.json)|
+| X-101-32x4d-FPN | pytorch | 1x | 9.2 | 8.6 | 44.3 | 38.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco_20200201-0f411b1f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco_20200201_052416.log.json) |
+| X-101-32x4d-FPN | pytorch | 20e | 9.2 | - | 45.0 | 39.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco_20200528_083917-ed1f4751.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco_20200528_083917.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 12.2 | 6.7 | 45.3 | 39.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco_20200203-9a2db89d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco_20200203_044059.log.json) |
+| X-101-64x4d-FPN | pytorch | 20e | 12.2 | | 45.6 |39.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco_20200512_161033-bdb5126a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco_20200512_161033.log.json)|
+
+**Notes:**
+
+- The `20e` schedule in Cascade (Mask) R-CNN indicates decreasing the lr at epochs 16 and 19, with a total of 20 epochs (see the config sketch below).
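+
+For reference, the sketch below expresses this schedule in config form; it mirrors the `_base_/schedules/schedule_20e.py` file added earlier in this diff.
+
+```python
+# learning policy for the `20e` schedule (cf. configs/_base_/schedules/schedule_20e.py)
+lr_config = dict(
+    policy='step',
+    warmup='linear',
+    warmup_iters=500,
+    warmup_ratio=0.001,
+    step=[16, 19])  # decay the lr after epochs 16 and 19
+total_epochs = 20
+```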
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f42165d9fd14600858681e695de7927aac865652
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './cascade_mask_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9212dda4992b4d18cef9a4916b765ef37850237f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d069f8c9fdbaa55cbc44065740187c242cfa2903
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r101_fpn_20e_coco.py
@@ -0,0 +1,2 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_20e_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b371ed757bf7dd95ef9ecfc2e609ca5ab03795d6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,38 @@
+_base_ = ['./cascade_mask_rcnn_r50_fpn_1x_coco.py']
+
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..49ab539aa4cdf7c396b6f109efe2dc7a6d596a2a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/cascade_mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1296dc45dd89da9c0801e1242080c67957cace74
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_r50_fpn_20e_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/cascade_mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_20e.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d05eb50c7cd501a5bab4ec403a98137b31b9b51b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0cfc7d78a79836ed06cf242f5f5c32af7f065249
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_20e_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..33629ee6cc2b903407372d68c6d7ab599fe6598e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e64c22cdb062a43c082360803caf399fa4141d60
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_20e_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8e8b830fd544b73d2da7a359ea208178a37fc324
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './cascade_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..66666517ad6c7a8427d59cb3efaf33712ef7ed83
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './cascade_rcnn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9cb3581910f74063eb1c62b9345a6493098d4a4a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r101_fpn_20e_coco.py
@@ -0,0 +1,2 @@
+_base_ = './cascade_rcnn_r50_fpn_20e_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c576c7496928eed58400ba11d71af8f4edc1c4b5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,38 @@
+_base_ = './cascade_rcnn_r50_fpn_1x_coco.py'
+
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(norm_cfg=dict(requires_grad=False), style='caffe'))
+
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..87e21fbff82763caf0e14ba641493870a15578b1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/cascade_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..74f24a202074effdf11661f71af32316b4480fb6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_r50_fpn_20e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './cascade_rcnn_r50_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1fbe6ce9f8a91151f2dfb656e90c9586b6dd35e3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1afeeef1212db831dd1f097d30b0354e459daa97
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_32x4d_fpn_20e_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_rcnn_r50_fpn_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b249bfa0df6037f1433ef6d41f7da16b10645aa2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,14 @@
+_base_ = './cascade_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ type='CascadeRCNN',
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..500b48cf7882d3e2ecbe6534e2955948bddb6825
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cascade_rcnn/cascade_rcnn_x101_64x4d_fpn_20e_coco.py
@@ -0,0 +1,14 @@
+_base_ = './cascade_rcnn_r50_fpn_20e_coco.py'
+model = dict(
+ type='CascadeRCNN',
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/centripetalnet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/centripetalnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..37ddc2bdd9ad039aa34a502f92619022e62f980f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/centripetalnet/README.md
@@ -0,0 +1,22 @@
+# CentripetalNet
+
+## Introduction
+```
+@InProceedings{Dong_2020_CVPR,
+author = {Dong, Zhiwei and Li, Guoxuan and Liao, Yue and Wang, Fei and Ren, Pengju and Qian, Chen},
+title = {CentripetalNet: Pursuing High-Quality Keypoint Pairs for Object Detection},
+booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
+month = {June},
+year = {2020}
+}
+```
+
+## Results and models
+
+| Backbone | Batch Size | Step/Total Epochs | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :--------: |:----------------: | :------: | :------------: | :----: | :------: |
+| HourglassNet-104 | [16 x 6](./centripetalnet_hourglass104_mstest_16x6_210e_coco.py) | 190/210 | 16.7 | 3.7 | 44.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/centripetalnet/centripetalnet_hourglass104_mstest_16x6_210e_coco/centripetalnet_hourglass104_mstest_16x6_210e_coco_20200915_204804-3ccc61e5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/centripetalnet/centripetalnet_hourglass104_mstest_16x6_210e_coco/centripetalnet_hourglass104_mstest_16x6_210e_coco_20200915_204804.log.json) |
+
+Note:
+- The TTA setting is single-scale with `flip=True`.
+- The model we released is the best checkpoint rather than the latest checkpoint (box AP 44.8 vs 44.6 in our experiment).
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/centripetalnet/centripetalnet_hourglass104_mstest_16x6_210e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/centripetalnet/centripetalnet_hourglass104_mstest_16x6_210e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..510e5abfdaa392b7bc161b83c34d64aa2e85eb1e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/centripetalnet/centripetalnet_hourglass104_mstest_16x6_210e_coco.py
@@ -0,0 +1,105 @@
+_base_ = [
+ '../_base_/default_runtime.py', '../_base_/datasets/coco_detection.py'
+]
+
+# model settings
+model = dict(
+ type='CornerNet',
+ backbone=dict(
+ type='HourglassNet',
+ downsample_times=5,
+ num_stacks=2,
+ stage_channels=[256, 256, 384, 384, 384, 512],
+ stage_blocks=[2, 2, 2, 2, 2, 4],
+ norm_cfg=dict(type='BN', requires_grad=True)),
+ neck=None,
+ bbox_head=dict(
+ type='CentripetalHead',
+ num_classes=80,
+ in_channels=256,
+ num_feat_levels=2,
+ corner_emb_channels=0,
+ loss_heatmap=dict(
+ type='GaussianFocalLoss', alpha=2.0, gamma=4.0, loss_weight=1),
+ loss_offset=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1),
+ loss_guiding_shift=dict(
+ type='SmoothL1Loss', beta=1.0, loss_weight=0.05),
+ loss_centripetal_shift=dict(
+ type='SmoothL1Loss', beta=1.0, loss_weight=1)))
+# data settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=(511, 511),
+ ratios=(0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3),
+ test_mode=False,
+ test_pad_mode=None,
+ **img_norm_cfg),
+ dict(type='Resize', img_scale=(511, 511), keep_ratio=False),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(
+ type='MultiScaleFlipAug',
+ scale_factor=1.0,
+ flip=True,
+ transforms=[
+ dict(type='Resize'),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=None,
+ ratios=None,
+ border=None,
+ test_mode=True,
+ test_pad_mode=['logical_or', 127],
+ **img_norm_cfg),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(
+ type='Collect',
+ keys=['img'],
+ meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape',
+ 'scale_factor', 'flip', 'img_norm_cfg', 'border')),
+ ])
+]
+data = dict(
+ samples_per_gpu=6,
+ workers_per_gpu=3,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# training and testing settings
+train_cfg = None
+test_cfg = dict(
+ corner_topk=100,
+ local_maximum_kernel=3,
+ distance_threshold=0.5,
+ score_thr=0.05,
+ max_per_img=100,
+ nms_cfg=dict(type='soft_nms', iou_threshold=0.5, method='gaussian'))
+# optimizer
+optimizer = dict(type='Adam', lr=0.0005)
+optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[190])
+total_epochs = 210
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..dc6bf9e42c36c70e127e46ebca0fed64c1ff39b1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/README.md
@@ -0,0 +1,21 @@
+## Common settings
+
+- All baselines were trained on 8 GPUs with a batch size of 8 (1 image per GPU), applying the [linear scaling rule](https://arxiv.org/abs/1706.02677) to scale the learning rate (see the sketch after this list).
+- All models were trained on `cityscapes_train` and tested on `cityscapes_val`.
+- The 1x training schedule indicates 64 epochs, which corresponds to slightly fewer iterations than the 24k reported in the original schedule from the [Mask R-CNN paper](https://arxiv.org/abs/1703.06870).
+- COCO pre-trained weights are used for initialization.
+- A conversion [script](../../tools/convert_datasets/cityscapes.py) is provided to convert Cityscapes into COCO format. Please refer to [install.md](../../docs/install.md#prepare-datasets) for details.
+- `CityscapesDataset` implements three evaluation methods: `bbox` and `segm` are the standard COCO bbox/mask AP, while `cityscapes` uses the official Cityscapes evaluation, which may yield slightly higher numbers than the COCO metrics.
+
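+A minimal sketch of the linear scaling rule mentioned above, assuming the usual base learning rate of 0.02 for a total batch size of 16 (8 GPUs x 2 images per GPU); the resulting value of 0.01 matches the Cityscapes configs in this folder.
+
+```python
+# linear scaling rule: the learning rate scales with the total batch size
+base_lr, base_batch_size = 0.02, 16   # assumed default: 8 GPUs x 2 images per GPU
+batch_size = 8                        # Cityscapes setting: 8 GPUs x 1 image per GPU
+lr = base_lr * batch_size / base_batch_size
+print(lr)  # 0.01 -- the value set in the configs below
+```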
+
+### Faster R-CNN
+
+| Backbone | Style | Lr schd | Scale | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :---: | :------: | :------------: | :----: | :--------: |
+| R-50-FPN | pytorch | 1x | 800-1024 | 5.2 | - | 40.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/cityscapes/faster_rcnn_r50_fpn_1x_cityscapes_20200502-829424c0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cityscapes/faster_rcnn_r50_fpn_1x_cityscapes_20200502_114915.log.json) |
+
+### Mask R-CNN
+
+| Backbone | Style | Lr schd | Scale | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------: | :------------: | :----: | :-----: | :------: |
+| R-50-FPN | pytorch | 1x | 800-1024 | 5.3 | - | 41.0 | 35.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes_20200502-6ea77f0e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes_20200502_114915.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/faster_rcnn_r50_fpn_1x_cityscapes.py b/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/faster_rcnn_r50_fpn_1x_cityscapes.py
new file mode 100644
index 0000000000000000000000000000000000000000..a7cfcaa0dd0747587a9e1bb90cf28ce45e46fc2e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/faster_rcnn_r50_fpn_1x_cityscapes.py
@@ -0,0 +1,38 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/cityscapes_detection.py',
+ '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained=None,
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=8,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+# optimizer
+# lr is set for a batch size of 8
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ # [7] yields higher performance than [6]
+ step=[7])
+total_epochs = 8 # actual epoch = 8 * 8 = 64
+log_config = dict(interval=100)
+# For better, more stable performance, initialize from COCO
+load_from = 'https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth' # noqa
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py b/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py
new file mode 100644
index 0000000000000000000000000000000000000000..b17735366f145029d345c91df9ce2689d9e73dc0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cityscapes/mask_rcnn_r50_fpn_1x_cityscapes.py
@@ -0,0 +1,45 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained=None,
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=8,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
+ mask_head=dict(
+ type='FCNMaskHead',
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=8,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
+# optimizer
+# lr is set for a batch size of 8
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ # [7] yields higher performance than [6]
+ step=[7])
+total_epochs = 8  # actual epoch = 8 * 8 = 64 (the Cityscapes base config wraps the training set in RepeatDataset with times=8)
+log_config = dict(interval=100)
+# For better and more stable performance, initialize from a COCO-pretrained checkpoint
+load_from = 'https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth' # noqa
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..9959f1d6f79f80cea1a8ab0ca43f18c8ad9fcb0e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/README.md
@@ -0,0 +1,29 @@
+# CornerNet
+
+## Introduction
+```
+@inproceedings{law2018cornernet,
+ title={Cornernet: Detecting objects as paired keypoints},
+ author={Law, Hei and Deng, Jia},
+ booktitle={15th European Conference on Computer Vision, ECCV 2018},
+ pages={765--781},
+ year={2018},
+ organization={Springer Verlag}
+}
+```
+
+## Results and models
+
+| Backbone | Batch Size | Step/Total Epochs | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :--------: |:----------------: | :------: | :------------: | :----: | :------: |
+| HourglassNet-104 | [10 x 5](./cornernet_hourglass104_mstest_10x5_210e_coco.py) | 180/210 | 13.9 | 4.2 | 41.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/cornernet/cornernet_hourglass104_mstest_10x5_210e_coco/cornernet_hourglass104_mstest_10x5_210e_coco_20200824_185720-5fefbf1c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cornernet/cornernet_hourglass104_mstest_10x5_210e_coco/cornernet_hourglass104_mstest_10x5_210e_coco_20200824_185720.log.json) |
+| HourglassNet-104 | [8 x 6](./cornernet_hourglass104_mstest_8x6_210e_coco.py) | 180/210 | 15.9 | 4.2 | 41.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/cornernet/cornernet_hourglass104_mstest_8x6_210e_coco/cornernet_hourglass104_mstest_8x6_210e_coco_20200825_150618-79b44c30.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cornernet/cornernet_hourglass104_mstest_8x6_210e_coco/cornernet_hourglass104_mstest_8x6_210e_coco_20200825_150618.log.json) |
+| HourglassNet-104 | [32 x 3](./cornernet_hourglass104_mstest_32x3_210e_coco.py) | 180/210 | 9.5 | 3.9 | 40.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/cornernet/cornernet_hourglass104_mstest_32x3_210e_coco/cornernet_hourglass104_mstest_32x3_210e_coco_20200819_203110-1efaea91.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/cornernet/cornernet_hourglass104_mstest_32x3_210e_coco/cornernet_hourglass104_mstest_32x3_210e_coco_20200819_203110.log.json) |
+
+Note:
+- The TTA setting is single-scale testing with `flip=True`.
+- Experiments with `images_per_gpu=6` are conducted on Tesla V100-SXM2-32GB; experiments with `images_per_gpu=3` are conducted on GeForce GTX 1080 Ti.
+- Here are the descriptions of each experiment setting (a sketch for adapting the batch setting follows the list):
+  - 10 x 5: 10 GPUs with 5 images per GPU. This is the same setting as reported in the original paper.
+  - 8 x 6: 8 GPUs with 6 images per GPU. The total batch size is close to the paper's, and training needs only 1 node.
+  - 32 x 3: 32 GPUs with 3 images per GPU. This is the default setting for GTX 1080 Ti and needs 4 nodes to train.
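+
+The batch setting can be adapted to other GPU budgets by overriding `data` in a derived config. The following is only a minimal sketch for a hypothetical 16-GPU setup (this exact combination is not shipped in this directory); it reuses the 10 x 5 config above as its base:
+
+```python
+# Hypothetical 16 x 3 setting: 16 GPUs with 3 images per GPU
+# (total batch size 48, the same as the 8 x 6 setting above).
+_base_ = './cornernet_hourglass104_mstest_10x5_210e_coco.py'
+
+data = dict(samples_per_gpu=3, workers_per_gpu=3)
+
+# If the total batch size differs a lot from 48-50, the Adam learning rate
+# may need rescaling; linear scaling is a common heuristic, not a rule.
+# optimizer = dict(type='Adam', lr=0.0005)
+```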
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_10x5_210e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_10x5_210e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b0d8771606c8784f6ac1c3343491a2f22a697976
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_10x5_210e_coco.py
@@ -0,0 +1,105 @@
+_base_ = [
+ '../_base_/default_runtime.py', '../_base_/datasets/coco_detection.py'
+]
+
+# model settings
+model = dict(
+ type='CornerNet',
+ backbone=dict(
+ type='HourglassNet',
+ downsample_times=5,
+ num_stacks=2,
+ stage_channels=[256, 256, 384, 384, 384, 512],
+ stage_blocks=[2, 2, 2, 2, 2, 4],
+ norm_cfg=dict(type='BN', requires_grad=True)),
+ neck=None,
+ bbox_head=dict(
+ type='CornerHead',
+ num_classes=80,
+ in_channels=256,
+ num_feat_levels=2,
+ corner_emb_channels=1,
+ loss_heatmap=dict(
+ type='GaussianFocalLoss', alpha=2.0, gamma=4.0, loss_weight=1),
+ loss_embedding=dict(
+ type='AssociativeEmbeddingLoss',
+ pull_weight=0.10,
+ push_weight=0.10),
+ loss_offset=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1)))
+# data settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=(511, 511),
+ ratios=(0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3),
+ test_mode=False,
+ test_pad_mode=None,
+ **img_norm_cfg),
+ dict(type='Resize', img_scale=(511, 511), keep_ratio=False),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(
+ type='MultiScaleFlipAug',
+ scale_factor=1.0,
+ flip=True,
+ transforms=[
+ dict(type='Resize'),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=None,
+ ratios=None,
+ border=None,
+ test_mode=True,
+ test_pad_mode=['logical_or', 127],
+ **img_norm_cfg),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(
+ type='Collect',
+ keys=['img'],
+ meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape',
+ 'scale_factor', 'flip', 'img_norm_cfg', 'border')),
+ ])
+]
+data = dict(
+ samples_per_gpu=5,
+ workers_per_gpu=3,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# training and testing settings
+train_cfg = None
+test_cfg = dict(
+ corner_topk=100,
+ local_maximum_kernel=3,
+ distance_threshold=0.5,
+ score_thr=0.05,
+ max_per_img=100,
+ nms_cfg=dict(type='soft_nms', iou_threshold=0.5, method='gaussian'))
+# optimizer
+optimizer = dict(type='Adam', lr=0.0005)
+optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[180])
+total_epochs = 210
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_32x3_210e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_32x3_210e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b025785df1b2219e993e4588a16fb4fa140ff06f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_32x3_210e_coco.py
@@ -0,0 +1,105 @@
+_base_ = [
+ '../_base_/default_runtime.py', '../_base_/datasets/coco_detection.py'
+]
+
+# model settings
+model = dict(
+ type='CornerNet',
+ backbone=dict(
+ type='HourglassNet',
+ downsample_times=5,
+ num_stacks=2,
+ stage_channels=[256, 256, 384, 384, 384, 512],
+ stage_blocks=[2, 2, 2, 2, 2, 4],
+ norm_cfg=dict(type='BN', requires_grad=True)),
+ neck=None,
+ bbox_head=dict(
+ type='CornerHead',
+ num_classes=80,
+ in_channels=256,
+ num_feat_levels=2,
+ corner_emb_channels=1,
+ loss_heatmap=dict(
+ type='GaussianFocalLoss', alpha=2.0, gamma=4.0, loss_weight=1),
+ loss_embedding=dict(
+ type='AssociativeEmbeddingLoss',
+ pull_weight=0.10,
+ push_weight=0.10),
+ loss_offset=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1)))
+# data settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=(511, 511),
+ ratios=(0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3),
+ test_mode=False,
+ test_pad_mode=None,
+ **img_norm_cfg),
+ dict(type='Resize', img_scale=(511, 511), keep_ratio=False),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(
+ type='MultiScaleFlipAug',
+ scale_factor=1.0,
+ flip=True,
+ transforms=[
+ dict(type='Resize'),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=None,
+ ratios=None,
+ border=None,
+ test_mode=True,
+ test_pad_mode=['logical_or', 127],
+ **img_norm_cfg),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(
+ type='Collect',
+ keys=['img'],
+ meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape',
+ 'scale_factor', 'flip', 'img_norm_cfg', 'border')),
+ ])
+]
+data = dict(
+ samples_per_gpu=3,
+ workers_per_gpu=3,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# training and testing settings
+train_cfg = None
+test_cfg = dict(
+ corner_topk=100,
+ local_maximum_kernel=3,
+ distance_threshold=0.5,
+ score_thr=0.05,
+ max_per_img=100,
+ nms_cfg=dict(type='soft_nms', iou_threshold=0.5, method='gaussian'))
+# optimizer
+optimizer = dict(type='Adam', lr=0.0005)
+optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[180])
+total_epochs = 210
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_8x6_210e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_8x6_210e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0e607d4c6440f405d9f5238e701100385e2ece06
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/cornernet/cornernet_hourglass104_mstest_8x6_210e_coco.py
@@ -0,0 +1,105 @@
+_base_ = [
+ '../_base_/default_runtime.py', '../_base_/datasets/coco_detection.py'
+]
+
+# model settings
+model = dict(
+ type='CornerNet',
+ backbone=dict(
+ type='HourglassNet',
+ downsample_times=5,
+ num_stacks=2,
+ stage_channels=[256, 256, 384, 384, 384, 512],
+ stage_blocks=[2, 2, 2, 2, 2, 4],
+ norm_cfg=dict(type='BN', requires_grad=True)),
+ neck=None,
+ bbox_head=dict(
+ type='CornerHead',
+ num_classes=80,
+ in_channels=256,
+ num_feat_levels=2,
+ corner_emb_channels=1,
+ loss_heatmap=dict(
+ type='GaussianFocalLoss', alpha=2.0, gamma=4.0, loss_weight=1),
+ loss_embedding=dict(
+ type='AssociativeEmbeddingLoss',
+ pull_weight=0.10,
+ push_weight=0.10),
+ loss_offset=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1)))
+# data settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=(511, 511),
+ ratios=(0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3),
+ test_mode=False,
+ test_pad_mode=None,
+ **img_norm_cfg),
+ dict(type='Resize', img_scale=(511, 511), keep_ratio=False),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(
+ type='MultiScaleFlipAug',
+ scale_factor=1.0,
+ flip=True,
+ transforms=[
+ dict(type='Resize'),
+ dict(
+ type='RandomCenterCropPad',
+ crop_size=None,
+ ratios=None,
+ border=None,
+ test_mode=True,
+ test_pad_mode=['logical_or', 127],
+ **img_norm_cfg),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(
+ type='Collect',
+ keys=['img'],
+ meta_keys=('filename', 'ori_shape', 'img_shape', 'pad_shape',
+ 'scale_factor', 'flip', 'img_norm_cfg', 'border')),
+ ])
+]
+data = dict(
+ samples_per_gpu=6,
+ workers_per_gpu=3,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# training and testing settings
+train_cfg = None
+test_cfg = dict(
+ corner_topk=100,
+ local_maximum_kernel=3,
+ distance_threshold=0.5,
+ score_thr=0.05,
+ max_per_img=100,
+ nms_cfg=dict(type='soft_nms', iou_threshold=0.5, method='gaussian'))
+# optimizer
+optimizer = dict(type='Adam', lr=0.0005)
+optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[180])
+total_epochs = 210
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..f1cb1b1121ec6c669e4fadaff1e4ba03a7b36be2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/README.md
@@ -0,0 +1,46 @@
+# Deformable Convolutional Networks
+
+## Introduction
+
+```
+@inproceedings{dai2017deformable,
+ title={Deformable Convolutional Networks},
+ author={Dai, Jifeng and Qi, Haozhi and Xiong, Yuwen and Li, Yi and Zhang, Guodong and Hu, Han and Wei, Yichen},
+ booktitle={Proceedings of the IEEE international conference on computer vision},
+ year={2017}
+}
+
+@article{zhu2018deformable,
+ title={Deformable ConvNets v2: More Deformable, Better Results},
+ author={Zhu, Xizhou and Hu, Han and Lin, Stephen and Dai, Jifeng},
+ journal={arXiv preprint arXiv:1811.11168},
+ year={2018}
+}
+```
+
+## Results and Models
+
+| Backbone | Model | Style | Conv | Pool | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:----------------:|:------------:|:-------:|:-------------:|:------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN | Faster | pytorch | dconv(c3-c5) | - | 1x | 4.0 | 17.8 | 41.3 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200130-d68aed1e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200130_212941.log.json) |
+| R-50-FPN | Faster | pytorch | mdconv(c3-c5) | - | 1x | 4.1 | 17.6 | 41.4 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco_20200130-d099253b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco_20200130_222144.log.json) |
+| *R-50-FPN (dg=4) | Faster | pytorch | mdconv(c3-c5) | - | 1x | 4.2 | 17.4 | 41.5 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco_20200130-01262257.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco_20200130_222058.log.json) |
+| R-50-FPN | Faster | pytorch | - | dpool | 1x | 5.0 | 17.2 | 38.9 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_dpool_1x_coco/faster_rcnn_r50_fpn_dpool_1x_coco_20200307-90d3c01d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_dpool_1x_coco/faster_rcnn_r50_fpn_dpool_1x_coco_20200307_203250.log.json) |
+| R-50-FPN | Faster | pytorch | - | mdpool | 1x | 5.8 | 16.6 | 38.7 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_mdpool_1x_coco/faster_rcnn_r50_fpn_mdpool_1x_coco_20200307-c0df27ff.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r50_fpn_mdpool_1x_coco/faster_rcnn_r50_fpn_mdpool_1x_coco_20200307_203304.log.json) |
+| R-101-FPN | Faster | pytorch | dconv(c3-c5) | - | 1x | 6.0 | 12.5 | 42.7 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200203-1377f13d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200203_230019.log.json) |
+| X-101-32x4d-FPN | Faster | pytorch | dconv(c3-c5) | - | 1x | 7.3 | 10.0 | 44.5 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco_20200203-4f85c69c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco_20200203_001325.log.json) |
+| R-50-FPN | Mask | pytorch | dconv(c3-c5) | - | 1x | 4.5 | 15.4 | 41.8 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200203-4d9ad43b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200203_061339.log.json) |
+| R-50-FPN | Mask | pytorch | mdconv(c3-c5) | - | 1x | 4.5 | 15.1 | 41.5 | 37.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco_20200203-ad97591f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco_20200203_063443.log.json) |
+| R-101-FPN | Mask | pytorch | dconv(c3-c5) | - | 1x | 6.5 | 11.7 | 43.5 | 38.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200216-a71f5bce.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200216_191601.log.json) |
+| R-50-FPN | Cascade | pytorch | dconv(c3-c5) | - | 1x | 4.5 | 14.6 | 43.8 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200130-2f1fca44.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200130_220843.log.json) |
+| R-101-FPN | Cascade | pytorch | dconv(c3-c5) | - | 1x | 6.4 | 11.0 | 45.0 | | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200203-3b2f0594.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200203_224829.log.json) |
+| R-50-FPN | Cascade Mask | pytorch | dconv(c3-c5) | - | 1x | 6.0 | 10.0 | 44.4 | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200202-42e767a2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco_20200202_010309.log.json) |
+| R-101-FPN | Cascade Mask | pytorch | dconv(c3-c5) | - | 1x | 8.0 | 8.6 | 45.8 | 39.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200204-df0c5f10.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco_20200204_134006.log.json) |
+| X-101-32x4d-FPN | Cascade Mask | pytorch | dconv(c3-c5) | - | 1x | 9.2 | | 47.3 | 41.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco-e75f90c8.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dcn/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco-20200606_183737.log.json) |
+
+**Notes:**
+
+- `dconv` and `mdconv` denote deformable and modulated deformable convolution respectively; `c3-c5` means deformable convolution is added in ResNet stages 3 to 5. `dpool` and `mdpool` denote deformable and modulated deformable RoI pooling.
+- The DCN ops are modified from https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch, which should be more memory-efficient and slightly faster.
+- (*) For R-50-FPN (dg=4), dg is short for deformable_group. This model is trained and tested on an Amazon EC2 p3dn.24xlarge instance.
+- **The memory and train/inference time figures are outdated.**
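+
+To make the notation concrete, the `mdconv(c3-c5)` row with `dg=4` corresponds to the backbone override below. This mirrors `faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco.py` in this directory; `DCNv2` is the modulated variant and `DCN` the plain one:
+
+```python
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+    backbone=dict(
+        # DCNv2 = mdconv (modulated); use type='DCN' for plain dconv
+        dcn=dict(type='DCNv2', deform_groups=4, fallback_on_stride=False),
+        # enable the op in ResNet stages c3-c5; c2 keeps regular convolutions
+        stage_with_dcn=(False, True, True, True)))
+```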
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..081b998f6f54d3d805dbab38b26750a378c0d93f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..3b3683af235f46df36d8793e52c2b9c52e0defeb
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..daaa4729c8280107b19107607ec399230713cf93
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_mask_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a01df33c94e1f8b5f51a51a780b30a77ce99b2c0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../cascade_rcnn/cascade_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..aa664bd61c78873a74af229caa8f62feca8daa5e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f5fee7e13cdfd531bf24d7c261e843855124f762
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../faster_rcnn/faster_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8787088f27a09a3f8fd0d05a1144c0abdedd0a21
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_dpool_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_dpool_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1b695f0e19049dc91b7656d7684df151896b7727
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_dpool_1x_coco.py
@@ -0,0 +1,12 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ _delete_=True,
+ type='DeformRoIPoolPack',
+ output_size=7,
+ output_channels=256),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32])))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d1bcf3c102fb660641eda2a1398db3df520caa3a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d0ab89c261f970e16a9c4407620bd16a0df9e9e9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdconv_c3-c5_group4_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCNv2', deform_groups=4, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdpool_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdpool_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ad7b0346a63dfa3c3ca246b624155fc4fd331a3f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_r50_fpn_mdpool_1x_coco.py
@@ -0,0 +1,12 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ _delete_=True,
+ type='ModulatedDeformRoIPoolPack',
+ output_size=7,
+ output_channels=256),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32])))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8357766f50ff638f13ca56bd79d1b1c64e96f3dd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/faster_rcnn_x101_32x4d_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,15 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch',
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..cb340022ea27f563b8c4a570cf89b5f09e6434cd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r101_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ababe58dc3fdfbbc6c366f48271db31bf6e2e9e2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5ca2a67cde62bff078b7c4c0d696a585265e4c3a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dcn/mask_rcnn_r50_fpn_mdconv_c3-c5_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/deepfashion/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/deepfashion/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..6adbe283ac000b911281dd0dd1238dd656a602c5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/deepfashion/README.md
@@ -0,0 +1,43 @@
+# DeepFashion
+
+[MMFashion](https://github.com/open-mmlab/mmfashion) develops a "fashion parsing and segmentation" module
+based on the
+[DeepFashion-Inshop](https://drive.google.com/drive/folders/0B7EVK8r0v71pVDZFQXRsMDZCX1E?usp=sharing) dataset.
+Its annotations follow the COCO style.
+To use it, you need to download the data first. Note that only "img_highres" is used in this task.
+The file tree should look like this:
+
+```sh
+mmdetection
+├── mmdet
+├── tools
+├── configs
+├── data
+│ ├── DeepFashion
+│ │ ├── In-shop
+│ │ ├── Anno
+│ │ │ ├── segmentation
+│ │ │ | ├── DeepFashion_segmentation_train.json
+│ │ │ | ├── DeepFashion_segmentation_query.json
+│ │ │ | ├── DeepFashion_segmentation_gallery.json
+│ │ │ ├── list_bbox_inshop.txt
+│ │ │ ├── list_description_inshop.json
+│ │ │ ├── list_item_inshop.txt
+│ │ │ └── list_landmarks_inshop.txt
+│ │ ├── Eval
+│ │ │ └── list_eval_partition.txt
+│ │ ├── Img
+│ │ │ ├── img
+│ │ │ │ ├──XXX.jpg
+│ │ │ └── img_highres
+│ │ │     └── XXX.jpg
+
+```
+
+After that, you can train a Mask R-CNN R-50 model on the DeepFashion In-shop dataset by launching training with the `mask_rcnn_r50_fpn_1x.py` config
+or by creating your own config file (a sketch follows).
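+
+A minimal sketch of such a config, mirroring `mask_rcnn_r50_fpn_15e_deepfashion.py` shipped in this directory (the 15 classes and the 15-epoch schedule are taken from that file):
+
+```python
+_base_ = [
+    '../_base_/models/mask_rcnn_r50_fpn.py',
+    '../_base_/datasets/deepfashion.py', '../_base_/schedules/schedule_1x.py',
+    '../_base_/default_runtime.py'
+]
+# DeepFashion In-shop uses 15 categories in this task
+model = dict(
+    roi_head=dict(
+        bbox_head=dict(num_classes=15), mask_head=dict(num_classes=15)))
+total_epochs = 15
+```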
+
+## Model Zoo
+| Backbone | Model type | Dataset | bbox detection Average Precision | segmentation Average Precision | Download (Google) |
+| :---------: | :----------: | :-----------------: | :--------------------------------: | :----------------------------: | :-------------------------: |
+| ResNet50 | Mask RCNN | DeepFashion-In-shop | 0.599 | 0.584 | [model](https://drive.google.com/open?id=1q6zF7J6Gb-FFgM87oIORIt6uBozaXp5r) | [log](https://drive.google.com/file/d/1qTK4Dr4FFLa9fkdI6UVko408gkrfTRLP/view?usp=sharing) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/deepfashion/mask_rcnn_r50_fpn_15e_deepfashion.py b/PyTorch/contrib/cv/detection/GCNet/configs/deepfashion/mask_rcnn_r50_fpn_15e_deepfashion.py
new file mode 100644
index 0000000000000000000000000000000000000000..72e1afce8097f20364622f99b285bf6ee2321f06
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/deepfashion/mask_rcnn_r50_fpn_15e_deepfashion.py
@@ -0,0 +1,10 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/deepfashion.py', '../_base_/schedules/schedule_1x.py',
+ '../_base_/default_runtime.py'
+]
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(num_classes=15), mask_head=dict(num_classes=15)))
+# runtime settings
+total_epochs = 15
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..27ba9b16c0ecdedfad80593312da31f3d5c253d8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/README.md
@@ -0,0 +1,37 @@
+# DetectoRS
+
+## Introduction
+
+We provide the config files for [DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution](https://arxiv.org/pdf/2006.02334.pdf).
+
+```BibTeX
+@article{qiao2020detectors,
+ title={DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution},
+ author={Qiao, Siyuan and Chen, Liang-Chieh and Yuille, Alan},
+ journal={arXiv preprint arXiv:2006.02334},
+ year={2020}
+}
+```
+
+## Results and Models
+
+DetectoRS includes two major components:
+
+- Recursive Feature Pyramid (RFP).
+- Switchable Atrous Convolution (SAC).
+
+They can be used independently.
+Combining them results in DetectoRS.
+The results on COCO 2017 val are shown in the table below.
+
+| Method | Detector | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:------:|:--------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| RFP | Cascade + ResNet-50 | 1x | 7.5 | - | 44.8 | | [model](http://download.openmmlab.com/mmdetection/v2.0/detectors/cascade_rcnn_r50_rfp_1x_coco/cascade_rcnn_r50_rfp_1x_coco-8cf51bfd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/detectors/cascade_rcnn_r50_rfp_1x_coco/cascade_rcnn_r50_rfp_1x_coco_20200624_104126.log.json) |
+| SAC | Cascade + ResNet-50 | 1x | 5.6 | - | 45.0| | [model](http://download.openmmlab.com/mmdetection/v2.0/detectors/cascade_rcnn_r50_sac_1x_coco/cascade_rcnn_r50_sac_1x_coco-24bfda62.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/detectors/cascade_rcnn_r50_sac_1x_coco/cascade_rcnn_r50_sac_1x_coco_20200624_104402.log.json) |
+| DetectoRS | Cascade + ResNet-50 | 1x | 9.9 | - | 47.4 | | [model](http://download.openmmlab.com/mmdetection/v2.0/detectors/detectors_cascade_rcnn_r50_1x_coco/detectors_cascade_rcnn_r50_1x_coco-32a10ba0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/detectors/detectors_cascade_rcnn_r50_1x_coco/detectors_cascade_rcnn_r50_1x_coco_20200706_001203.log.json) |
+| RFP | HTC + ResNet-50 | 1x | 11.2 | - | 46.6 | 40.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/detectors/htc_r50_rfp_1x_coco/htc_r50_rfp_1x_coco-8ff87c51.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/detectors/htc_r50_rfp_1x_coco/htc_r50_rfp_1x_coco_20200624_103053.log.json) |
+| SAC | HTC + ResNet-50 | 1x | 9.3 | - | 46.4 | 40.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/detectors/htc_r50_sac_1x_coco/htc_r50_sac_1x_coco-bfa60c54.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/detectors/htc_r50_sac_1x_coco/htc_r50_sac_1x_coco_20200624_103111.log.json) |
+| DetectoRS | HTC + ResNet-50 | 1x | 13.6 | - | 49.1 | 42.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/detectors/detectors_htc_r50_1x_coco/detectors_htc_r50_1x_coco-329b1453.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/detectors/detectors_htc_r50_1x_coco/detectors_htc_r50_1x_coco_20200624_103659.log.json) |
+
+*Note*: This is a re-implementation based on MMDetection-V2.
+The original implementation is based on MMDetection-V1.
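+
+Because the two components are independent, SAC alone can also be attached to other detectors that use a ResNet backbone. The following is only a minimal sketch, assuming a Faster R-CNN base config like the ones elsewhere in this repository (this exact combination is not shipped here and is untested):
+
+```python
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+    backbone=dict(
+        # DetectoRS_ResNet is the ResNet variant that supports the SAC/RFP hooks
+        type='DetectoRS_ResNet',
+        conv_cfg=dict(type='ConvAWS'),
+        sac=dict(type='SAC', use_deform=True),
+        stage_with_sac=(False, True, True, True)))
+```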
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/cascade_rcnn_r50_rfp_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/cascade_rcnn_r50_rfp_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4430d8a677e48f84552eb23403bc874c56bda506
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/cascade_rcnn_r50_rfp_1x_coco.py
@@ -0,0 +1,28 @@
+_base_ = [
+ '../_base_/models/cascade_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ backbone=dict(
+ type='DetectoRS_ResNet',
+ conv_cfg=dict(type='ConvAWS'),
+ output_img=True),
+ neck=dict(
+ type='RFP',
+ rfp_steps=2,
+ aspp_out_channels=64,
+ aspp_dilations=(1, 3, 6, 1),
+ rfp_backbone=dict(
+ rfp_inplanes=256,
+ type='DetectoRS_ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ conv_cfg=dict(type='ConvAWS'),
+ pretrained='torchvision://resnet50',
+ style='pytorch')))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/cascade_rcnn_r50_sac_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/cascade_rcnn_r50_sac_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ccd9319b2d1badebf3b891c8e3bdd55a435a4b7c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/cascade_rcnn_r50_sac_1x_coco.py
@@ -0,0 +1,12 @@
+_base_ = [
+ '../_base_/models/cascade_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ backbone=dict(
+ type='DetectoRS_ResNet',
+ conv_cfg=dict(type='ConvAWS'),
+ sac=dict(type='SAC', use_deform=True),
+ stage_with_sac=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/detectors_cascade_rcnn_r50_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/detectors_cascade_rcnn_r50_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f76040434f1ff07608c83202f779dfacfe91c323
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/detectors_cascade_rcnn_r50_1x_coco.py
@@ -0,0 +1,32 @@
+_base_ = [
+ '../_base_/models/cascade_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ backbone=dict(
+ type='DetectoRS_ResNet',
+ conv_cfg=dict(type='ConvAWS'),
+ sac=dict(type='SAC', use_deform=True),
+ stage_with_sac=(False, True, True, True),
+ output_img=True),
+ neck=dict(
+ type='RFP',
+ rfp_steps=2,
+ aspp_out_channels=64,
+ aspp_dilations=(1, 3, 6, 1),
+ rfp_backbone=dict(
+ rfp_inplanes=256,
+ type='DetectoRS_ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ conv_cfg=dict(type='ConvAWS'),
+ sac=dict(type='SAC', use_deform=True),
+ stage_with_sac=(False, True, True, True),
+ pretrained='torchvision://resnet50',
+ style='pytorch')))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/detectors_htc_r50_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/detectors_htc_r50_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d2fc4f77fcca715c1dfb613306d214b636aa0c0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/detectors_htc_r50_1x_coco.py
@@ -0,0 +1,28 @@
+_base_ = '../htc/htc_r50_fpn_1x_coco.py'
+
+model = dict(
+ backbone=dict(
+ type='DetectoRS_ResNet',
+ conv_cfg=dict(type='ConvAWS'),
+ sac=dict(type='SAC', use_deform=True),
+ stage_with_sac=(False, True, True, True),
+ output_img=True),
+ neck=dict(
+ type='RFP',
+ rfp_steps=2,
+ aspp_out_channels=64,
+ aspp_dilations=(1, 3, 6, 1),
+ rfp_backbone=dict(
+ rfp_inplanes=256,
+ type='DetectoRS_ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ conv_cfg=dict(type='ConvAWS'),
+ sac=dict(type='SAC', use_deform=True),
+ stage_with_sac=(False, True, True, True),
+ pretrained='torchvision://resnet50',
+ style='pytorch')))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/htc_r50_rfp_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/htc_r50_rfp_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..496104e12550a1985f9c9e3748a343f69d7df6d8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/htc_r50_rfp_1x_coco.py
@@ -0,0 +1,24 @@
+_base_ = '../htc/htc_r50_fpn_1x_coco.py'
+
+model = dict(
+ backbone=dict(
+ type='DetectoRS_ResNet',
+ conv_cfg=dict(type='ConvAWS'),
+ output_img=True),
+ neck=dict(
+ type='RFP',
+ rfp_steps=2,
+ aspp_out_channels=64,
+ aspp_dilations=(1, 3, 6, 1),
+ rfp_backbone=dict(
+ rfp_inplanes=256,
+ type='DetectoRS_ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ conv_cfg=dict(type='ConvAWS'),
+ pretrained='torchvision://resnet50',
+ style='pytorch')))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/detectors/htc_r50_sac_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/htc_r50_sac_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..72d4db963ffd95851b945911b3db9941426583ab
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/detectors/htc_r50_sac_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../htc/htc_r50_fpn_1x_coco.py'
+
+model = dict(
+ backbone=dict(
+ type='DetectoRS_ResNet',
+ conv_cfg=dict(type='ConvAWS'),
+ sac=dict(type='SAC', use_deform=True),
+ stage_with_sac=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/double_heads/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/double_heads/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..8920492264f9c6e19792184c988bdeb02adb9fdf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/double_heads/README.md
@@ -0,0 +1,19 @@
+# Rethinking Classification and Localization for Object Detection
+
+## Introduction
+```
+@article{wu2019rethinking,
+ title={Rethinking Classification and Localization for Object Detection},
+ author={Yue Wu and Yinpeng Chen and Lu Yuan and Zicheng Liu and Lijuan Wang and Hongzhi Li and Yun Fu},
+ year={2019},
+ eprint={1904.06493},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
+
+## Results and models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :----------------: |
+| R-50-FPN | pytorch | 1x | 6.8 | 9.5 | 40.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/double_heads/dh_faster_rcnn_r50_fpn_1x_coco/dh_faster_rcnn_r50_fpn_1x_coco_20200130-586b67df.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/double_heads/dh_faster_rcnn_r50_fpn_1x_coco/dh_faster_rcnn_r50_fpn_1x_coco_20200130_220238.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/double_heads/dh_faster_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/double_heads/dh_faster_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9b8118b4b633c78120c370f877f47e951c2fdb38
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/double_heads/dh_faster_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,23 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ type='DoubleHeadRoIHead',
+ reg_roi_scale_factor=1.3,
+ bbox_head=dict(
+ _delete_=True,
+ type='DoubleConvFCBBoxHead',
+ num_convs=4,
+ num_fcs=2,
+ in_channels=256,
+ conv_out_channels=1024,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=2.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=2.0))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dynamic_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/dynamic_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..647831e0aed6ad8e4110d51d38e7d131d390474d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dynamic_rcnn/README.md
@@ -0,0 +1,18 @@
+# Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training
+
+## Introduction
+
+```
+@article{DynamicRCNN,
+ author = {Hongkai Zhang and Hong Chang and Bingpeng Ma and Naiyan Wang and Xilin Chen},
+ title = {Dynamic {R-CNN}: Towards High Quality Object Detection via Dynamic Training},
+ journal = {arXiv preprint arXiv:2004.06002},
+ year = {2020}
+}
+```
+
+## Results and Models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | pytorch | 1x | 3.8 | | 38.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/dynamic_rcnn/dynamic_rcnn_r50_fpn_1x/dynamic_rcnn_r50_fpn_1x-62a3f276.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/dynamic_rcnn/dynamic_rcnn_r50_fpn_1x/dynamic_rcnn_r50_fpn_1x_20200618_095048.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/dynamic_rcnn/dynamic_rcnn_r50_fpn_1x.py b/PyTorch/contrib/cv/detection/GCNet/configs/dynamic_rcnn/dynamic_rcnn_r50_fpn_1x.py
new file mode 100644
index 0000000000000000000000000000000000000000..60f9c5043a6d8e7da0c6038aca868ad7e966c534
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/dynamic_rcnn/dynamic_rcnn_r50_fpn_1x.py
@@ -0,0 +1,28 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ type='DynamicRoIHead',
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False,
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+train_cfg = dict(
+ rpn_proposal=dict(nms_thr=0.85),
+ rcnn=dict(
+ dynamic_rcnn=dict(
+ iou_topk=75,
+ beta_topk=10,
+ update_iter_interval=100,
+ initial_iou=0.4,
+ initial_beta=1.0)))
+test_cfg = dict(rpn=dict(nms_thr=0.85))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..9caa329740d6b2cfa1033bcf980b5a12edbb18f9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/README.md
@@ -0,0 +1,22 @@
+# An Empirical Study of Spatial Attention Mechanisms in Deep Networks
+
+## Introduction
+
+```
+@article{zhu2019empirical,
+ title={An Empirical Study of Spatial Attention Mechanisms in Deep Networks},
+ author={Zhu, Xizhou and Cheng, Dazhi and Zhang, Zheng and Lin, Stephen and Dai, Jifeng},
+ journal={arXiv preprint arXiv:1904.05873},
+ year={2019}
+}
+```
+
+
+## Results and Models
+
+| Backbone | Attention Component | DCN | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------------------:|:----:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | 1111 | N | 1x | 8.0 | 13.8 | 40.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_1111_1x_coco/faster_rcnn_r50_fpn_attention_1111_1x_coco_20200130-403cccba.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_1111_1x_coco/faster_rcnn_r50_fpn_attention_1111_1x_coco_20200130_210344.log.json) |
+| R-50 | 0010 | N | 1x | 4.2 | 18.4 | 39.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_0010_1x_coco/faster_rcnn_r50_fpn_attention_0010_1x_coco_20200130-7cb0c14d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_0010_1x_coco/faster_rcnn_r50_fpn_attention_0010_1x_coco_20200130_210125.log.json) |
+| R-50 | 1111 | Y | 1x | 8.0 | 12.7 | 42.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco_20200130-8b2523a6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco_20200130_204442.log.json) |
+| R-50 | 0010 | Y | 1x | 4.2 | 17.1 | 42.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco_20200130-1a2e831d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/empirical_attention/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco_20200130_210410.log.json) |
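+
+A note on the table, inferred from the configs in this directory: the `Attention Component` code is passed verbatim as `attention_type` to a `GeneralizedAttention` backbone plugin, and `DCN = Y` rows additionally enable deformable convolution in stages c3-c5. For example, the `0010` + DCN row corresponds to:
+
+```python
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+    backbone=dict(
+        plugins=[
+            dict(
+                cfg=dict(
+                    type='GeneralizedAttention',
+                    spatial_range=-1,
+                    num_heads=8,
+                    attention_type='0010',  # the "Attention Component" code
+                    kv_stride=2),
+                stages=(False, False, True, True),
+                position='after_conv2')
+        ],
+        # present only in the DCN = Y configurations
+        dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+        stage_with_dcn=(False, True, True, True)))
+```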
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_0010_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_0010_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a544e3ab636aea0efe56007a0ea40608b6e71ad4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_0010_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(plugins=[
+ dict(
+ cfg=dict(
+ type='GeneralizedAttention',
+ spatial_range=-1,
+ num_heads=8,
+ attention_type='0010',
+ kv_stride=2),
+ stages=(False, False, True, True),
+ position='after_conv2')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..bbefd27aa02f427e27068b37ecf4d30fbd49b519
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_0010_dcn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ plugins=[
+ dict(
+ cfg=dict(
+ type='GeneralizedAttention',
+ spatial_range=-1,
+ num_heads=8,
+ attention_type='0010',
+ kv_stride=2),
+ stages=(False, False, True, True),
+ position='after_conv2')
+ ],
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_1111_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_1111_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..13a4645bfdb50d5a2f04cee49ecc5f7647d10acf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_1111_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(plugins=[
+ dict(
+ cfg=dict(
+ type='GeneralizedAttention',
+ spatial_range=-1,
+ num_heads=8,
+ attention_type='1111',
+ kv_stride=2),
+ stages=(False, False, True, True),
+ position='after_conv2')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b1f26c081da27811f856fe9973eb444c82604727
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/empirical_attention/faster_rcnn_r50_fpn_attention_1111_dcn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ plugins=[
+ dict(
+ cfg=dict(
+ type='GeneralizedAttention',
+ spatial_range=-1,
+ num_heads=8,
+ attention_type='1111',
+ kv_stride=2),
+ stages=(False, False, True, True),
+ position='after_conv2')
+ ],
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b01c4b5956d7beb18a4ebbdfd3845d7156dce63d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/README.md
@@ -0,0 +1,13 @@
+# Fast R-CNN
+
+## Introduction
+```
+@inproceedings{girshick2015fast,
+ title={Fast r-cnn},
+ author={Girshick, Ross},
+ booktitle={Proceedings of the IEEE international conference on computer vision},
+ year={2015}
+}
+```
+
+## Results and models
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6db24b1e8aa26de5b153f4adcc8ae8dbd885186b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './fast_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9a76b3997fbbed5883adde2122dc17ee2262fa80
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './fast_rcnn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c9d5b4bef7cf527dc9af1856b6773fc061bda2a7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r101_fpn_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './fast_rcnn_r50_fpn_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..178deb6036e365815944620bce335aaf1233d3af
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,45 @@
+_base_ = './fast_rcnn_r50_fpn_1x_coco.py'
+
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(type='BN', requires_grad=False), style='caffe'))
+
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadProposals', num_max_proposals=2000),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'proposals', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadProposals', num_max_proposals=None),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='ToTensor', keys=['proposals']),
+ dict(
+ type='ToDataContainer',
+ fields=[dict(key='proposals', stack=False)]),
+ dict(type='Collect', keys=['img', 'proposals']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d2f080e9d3b1ddade22341aa38c6258eaee78a50
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,52 @@
+_base_ = [
+ '../_base_/models/fast_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadProposals', num_max_proposals=2000),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'proposals', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadProposals', num_max_proposals=None),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='ToTensor', keys=['proposals']),
+ dict(
+ type='ToDataContainer',
+ fields=[dict(key='proposals', stack=False)]),
+ dict(type='Collect', keys=['img', 'proposals']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ proposal_file=data_root + 'proposals/rpn_r50_fpn_1x_train2017.pkl',
+ pipeline=train_pipeline),
+ val=dict(
+ proposal_file=data_root + 'proposals/rpn_r50_fpn_1x_val2017.pkl',
+ pipeline=test_pipeline),
+ test=dict(
+ proposal_file=data_root + 'proposals/rpn_r50_fpn_1x_val2017.pkl',
+ pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..03a87c70454d3a2b2f19762f0ca78c15220f8b5b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fast_rcnn/fast_rcnn_r50_fpn_2x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './fast_rcnn_r50_fpn_1x_coco.py'
+
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..6cfcffc0c026ae32093ac1f7037564eb5a3c3115
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/README.md
@@ -0,0 +1,46 @@
+# Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
+
+## Introduction
+```
+@article{Ren_2017,
+ title={Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks},
+ journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+ publisher={Institute of Electrical and Electronics Engineers (IEEE)},
+ author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
+ year={2017},
+ month={Jun},
+}
+```
+
+## Results and models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+| R-50-FPN | caffe | 1x | 3.8 | | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_1x_coco/faster_rcnn_r50_caffe_fpn_1x_coco_bbox_mAP-0.378_20200504_180032-c5925ee5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_1x_coco/faster_rcnn_r50_caffe_fpn_1x_coco_20200504_180032.log.json) |
+| R-50-FPN | pytorch | 1x | 4.0 | 21.4 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130_204655.log.json) |
+| R-50-FPN | pytorch | 2x | - | - | 38.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_2x_coco/faster_rcnn_r50_fpn_2x_coco_bbox_mAP-0.384_20200504_210434-a5d8aa15.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_2x_coco/faster_rcnn_r50_fpn_2x_coco_20200504_210434.log.json) |
+| R-101-FPN | caffe | 1x | 5.7 | | 39.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_caffe_fpn_1x_coco/faster_rcnn_r101_caffe_fpn_1x_coco_bbox_mAP-0.398_20200504_180057-b269e9dd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_caffe_fpn_1x_coco/faster_rcnn_r101_caffe_fpn_1x_coco_20200504_180057.log.json) |
+| R-101-FPN | pytorch | 1x | 6.0 | 15.6 | 39.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_fpn_1x_coco/faster_rcnn_r101_fpn_1x_coco_20200130-f513f705.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_fpn_1x_coco/faster_rcnn_r101_fpn_1x_coco_20200130_204655.log.json) |
+| R-101-FPN | pytorch | 2x | - | - | 39.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_fpn_2x_coco/faster_rcnn_r101_fpn_2x_coco_bbox_mAP-0.398_20200504_210455-1d2dac9c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r101_fpn_2x_coco/faster_rcnn_r101_fpn_2x_coco_20200504_210455.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 7.2 | 13.8 | 41.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_32x4d_fpn_1x_coco/faster_rcnn_x101_32x4d_fpn_1x_coco_20200203-cff10310.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_32x4d_fpn_1x_coco/faster_rcnn_x101_32x4d_fpn_1x_coco_20200203_000520.log.json) |
+| X-101-32x4d-FPN | pytorch | 2x | - | - | 41.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_32x4d_fpn_2x_coco/faster_rcnn_x101_32x4d_fpn_2x_coco_bbox_mAP-0.412_20200506_041400-64a12c0b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_32x4d_fpn_2x_coco/faster_rcnn_x101_32x4d_fpn_2x_coco_20200506_041400.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 10.3 | 9.4 | 42.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_64x4d_fpn_1x_coco/faster_rcnn_x101_64x4d_fpn_1x_coco_20200204-833ee192.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_64x4d_fpn_1x_coco/faster_rcnn_x101_64x4d_fpn_1x_coco_20200204_134340.log.json) |
+| X-101-64x4d-FPN | pytorch | 2x | - | - | 41.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_64x4d_fpn_2x_coco/faster_rcnn_x101_64x4d_fpn_2x_coco_20200512_161033-5961fa95.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_x101_64x4d_fpn_2x_coco/faster_rcnn_x101_64x4d_fpn_2x_coco_20200512_161033.log.json) |
+
+## Different regression loss
+We trained Faster R-CNN with an R-50-FPN PyTorch-style backbone on the 1x schedule, varying only the box regression loss; a minimal config sketch follows the table.
+
+| Backbone | Loss type | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-------: | :------: | :------------: | :----: | :------: |
+| R-50-FPN | L1Loss | 4.0 | 21.4 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130_204655.log.json) |
+| R-50-FPN | IoULoss | | | 37.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_iou_1x_coco-fdd207f3.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_iou_1x_coco_20200506_095954.log.json) |
+| R-50-FPN | GIoULoss | | | 37.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_giou_1x_coco-0eada910.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_giou_1x_coco_20200505_161120.log.json) |
+| R-50-FPN | BoundedIoULoss | | | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_bounded_iou_1x_coco-98ad993b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_bounded_iou_1x_coco_20200505_160738.log.json) |
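+
+Switching the regression loss is a small override of the base config. The sketch below mirrors `faster_rcnn_r50_fpn_giou_1x_coco.py` from this directory; swap in `IoULoss` or `BoundedIoULoss` for the other rows.
+
+```
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+    roi_head=dict(
+        bbox_head=dict(
+            reg_decoded_bbox=True,
+            loss_bbox=dict(type='GIoULoss', loss_weight=10.0))))
+```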
+
+## Pre-trained Models
+We also train some models with longer schedules and multi-scale training. Users can fine-tune them for downstream tasks; a minimal `load_from` sketch follows the table.
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+| [R-50-FPN](./faster_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco.py) | caffe | 2x | 4.3 | | 39.7 |[model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco_bbox_mAP-0.397_20200504_231813-10b2de58.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco_20200504_231813.log.json)
+| [R-50-FPN](./faster_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py) | caffe | 3x | 4.3 | | 40.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco_bbox_mAP-0.398_20200504_163323-30042637.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco_20200504_163323.log.json)
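+
+To fine-tune one of these checkpoints, inherit the matching config and set `load_from` to the released weights; the dataset and class overrides are left to the downstream task. A minimal sketch using the 2x multi-scale model from the table above:
+
+```
+_base_ = './faster_rcnn_r50_caffe_fpn_mstrain_2x_coco.py'
+load_from = 'http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco_bbox_mAP-0.397_20200504_231813-10b2de58.pth'  # noqa
+```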
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..95c7238fcf38a274900599dae6c804829bb600ab
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './faster_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d2edab113649c38cac3c7dc3ff425462f7c40ffd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9367a3c83aeb1e05f38f4db9fb0110e731dd859c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r101_fpn_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './faster_rcnn_r50_fpn_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_c4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_c4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..92344a151be9af53659845b51e4ece7f0a7b636f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_c4_1x_coco.py
@@ -0,0 +1,39 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_caffe_c4.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..762c72be00b94445897adb8b49420628fec9c33b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,37 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4b87b2ce58b2efc2461046df897038fdd5128cee
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py
@@ -0,0 +1,42 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ef34b92683bd58c9527cc560811e793cdd4bc428
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0d95ed61c4bcbba59a93cc46cabf14b4c0b9fa11
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_3x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py'
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car.py
new file mode 100644
index 0000000000000000000000000000000000000000..f41dd86d28271dc727df67e816d1ea9016a3da68
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person-bicycle-car.py
@@ -0,0 +1,8 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+classes = ('person', 'bicycle', 'car')
+data = dict(
+ train=dict(classes=classes),
+ val=dict(classes=classes),
+ test=dict(classes=classes))
+# TODO: Update model url after bumping to V2.0
+load_from = 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth' # noqa
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person.py
new file mode 100644
index 0000000000000000000000000000000000000000..14099650f19ccccdb561999499d5ad6d873226bb
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco-person.py
@@ -0,0 +1,6 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+classes = ('person', )
+data = dict(
+ train=dict(classes=classes),
+ val=dict(classes=classes),
+ test=dict(classes=classes))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..009bd93d06b3284c7b31f33f82d636f774e86b74
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e77a7fa8d6b8c1ad7fe293bc932d621464287e0c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_2x_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_bounded_iou_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_bounded_iou_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..648081f19ca7d3ca9a7362a4a41e514d753ce4e8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_bounded_iou_1x_coco.py
@@ -0,0 +1,6 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(
+ reg_decoded_bbox=True,
+ loss_bbox=dict(type='BoundedIoULoss', loss_weight=10.0))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_giou_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_giou_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5556c4977e221182b013b68fef4b73d1b0605bf3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_giou_1x_coco.py
@@ -0,0 +1,6 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(
+ reg_decoded_bbox=True,
+ loss_bbox=dict(type='GIoULoss', loss_weight=10.0))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_iou_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_iou_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ddf663e4f0e1525490a493674b32b3dc4c781bb2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_iou_1x_coco.py
@@ -0,0 +1,6 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(
+ reg_decoded_bbox=True,
+ loss_bbox=dict(type='IoULoss', loss_weight=10.0))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_ohem_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_ohem_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f79ee70dcdf24497681c57e8a22b9127b050db0f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_ohem_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+train_cfg = dict(rcnn=dict(sampler=dict(type='OHEMSampler')))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_soft_nms_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_soft_nms_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8ba6b017ff6269824cb960700732b6116d2a3981
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_r50_fpn_soft_nms_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+test_cfg = dict(
+ rcnn=dict(
+ score_thr=0.05,
+ nms=dict(type='soft_nms', iou_threshold=0.5),
+ max_per_img=100))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c536fccc5efbc3a0c58d5bdc5df9be8579d15571
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_32x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_32x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..927609206e1323dcf1173c4a5393e3f03d534c0a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_32x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './faster_rcnn_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b588b4eca3df7de341c346aa9ecd0b171194f329
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_64x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_64x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e87d21a4e6a241f5af892eb11aa82e2c6012a31c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/faster_rcnn/faster_rcnn_x101_64x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './faster_rcnn_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c077fca253b8651efccceb18e2190963da6c9ebd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/README.md
@@ -0,0 +1,37 @@
+# FCOS: Fully Convolutional One-Stage Object Detection
+
+## Introduction
+
+```
+@article{tian2019fcos,
+ title={FCOS: Fully Convolutional One-Stage Object Detection},
+ author={Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
+ journal={arXiv preprint arXiv:1904.01355},
+ year={2019}
+}
+```
+
+## Results and Models
+
+| Backbone | Style | GN | MS train | Tricks | DCN | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:-------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | caffe | N | N | N | N | 1x | 5.2 | 22.9 | 36.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_4x4_1x_coco/fcos_r50_caffe_fpn_1x_4gpu_20200218-c229552f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_4x4_1x_coco/20200224_230410.log.json) |
+| R-50 | caffe | Y | N | N | N | 1x | 6.5 | 22.7 | 36.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco/fcos_r50_caffe_fpn_gn_1x_4gpu_20200218-7831950c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco/20200130_004230.log.json) |
+| R-50 | caffe | Y | N | Y | N | 1x | - | - | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco_20200603-67b3859f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco_20200603.log.json)|
+| R-50 | caffe | Y | N | Y | Y | 1x | - | - | 42.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco_20200603-ed16da04.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco_20200603.log.json)|
+| R-50 | caffe | Y | N | N | N | 2x | - | - | 36.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_gn-head_4x4_2x_coco/fcos_r50_caffe_fpn_gn_2x_4gpu_20200218-8ceb5c76.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_gn-head_4x4_2x_coco/20200130_004232.log.json) |
+| R-101 | caffe | Y | N | N | N | 1x | 10.2 | 17.3 | 39.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r101_caffe_fpn_gn-head_4x4_1x_coco/fcos_r101_caffe_fpn_gn_1x_4gpu_20200218-13e2cc55.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r101_caffe_fpn_gn-head_4x4_1x_coco/20200130_004231.log.json) |
+| R-101 | caffe | Y | N | N | N | 2x | - | - | 39.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r101_caffe_fpn_gn-head_4x4_2x_coco/fcos_r101_caffe_fpn_gn_2x_4gpu_20200218-d2261033.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r101_caffe_fpn_gn-head_4x4_2x_coco/20200130_004231.log.json) |
+
+
+| Backbone | Style | GN | MS train | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | caffe | Y | Y | 2x | 6.5 | 22.9 | 38.7 | [model]() | [log]() |
+| R-101 | caffe | Y | Y | 2x | 10.2 | 17.3 | 40.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r101_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco/fcos_mstrain_640_800_r101_caffe_fpn_gn_2x_4gpu_20200218-d8a4f4cf.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r101_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco/20200130_004232.log.json) |
+| X-101 | pytorch | Y | Y | 2x | 10.0 | 9.3 | 42.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco_20200229-11f8c079.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco_20200229_222104.log.json) |
+
+**Notes:**
+- To be consistent with the authors' implementation, we use 4 GPUs with 4 images/GPU for the R-50 and R-101 models, and 8 GPUs with 2 images/GPU for the X-101 models.
+- The X-101 backbone is X-101-64x4d.
+- Tricks means setting `norm_on_bbox`, `centerness_on_reg`, and `center_sampling` to `True`.
+- DCN means using `DCNv2` in both the backbone and the head; a minimal config sketch of these options follows these notes.
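+
+As a rough sketch (not a complete config), the tricks and DCN options map onto the keys below, mirroring `fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco.py` in this directory:
+
+```
+model = dict(
+    backbone=dict(
+        # DCN: DCNv2 in the last three backbone stages
+        dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+        stage_with_dcn=(False, True, True, True)),
+    bbox_head=dict(
+        # tricks: normalized bbox targets, centerness predicted on the
+        # regression branch, and center sampling
+        norm_on_bbox=True,
+        centerness_on_reg=True,
+        dcn_on_last_conv=True,
+        center_sampling=True,
+        conv_bias=True,
+        loss_bbox=dict(type='GIoULoss', loss_weight=1.0)))
+```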
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d83fa17f17379067c2f3f659ac9ed37ccf8e20ee
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_4x4_1x_coco.py
@@ -0,0 +1,51 @@
+_base_ = 'fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ bbox_head=dict(
+ norm_on_bbox=True,
+ centerness_on_reg=True,
+ dcn_on_last_conv=False,
+ center_sampling=True,
+ conv_bias=True,
+ loss_bbox=dict(type='GIoULoss', loss_weight=1.0)))
+# training and testing settings
+test_cfg = dict(nms=dict(type='nms', iou_threshold=0.6))
+
+# dataset settings
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer_config = dict(_delete_=True, grad_clip=None)
+
+lr_config = dict(warmup='linear')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..67edb415c5feabe8a1eb1bfefb6a7368e3a0b2b1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center-normbbox-centeronreg-giou_r50_caffe_fpn_gn-head_dcn_4x4_1x_coco.py
@@ -0,0 +1,54 @@
+_base_ = 'fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)),
+ bbox_head=dict(
+ norm_on_bbox=True,
+ centerness_on_reg=True,
+ dcn_on_last_conv=True,
+ center_sampling=True,
+ conv_bias=True,
+ loss_bbox=dict(type='GIoULoss', loss_weight=1.0)))
+# training and testing settings
+test_cfg = dict(nms=dict(type='nms', iou_threshold=0.6))
+
+# dataset settings
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer_config = dict(_delete_=True, grad_clip=None)
+
+lr_config = dict(warmup='linear')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center_r50_caffe_fpn_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center_r50_caffe_fpn_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..42b030b636cb670a7acd68ddf836e8db59428f16
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_center_r50_caffe_fpn_gn-head_4x4_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+model = dict(bbox_head=dict(center_sampling=True, center_sample_radius=1.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1bab973547ed59c36ab14e493f171cca1492e613
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_4x4_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6632b0c9991468cf0ac99408e8d56050e37b2cf1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_4x4_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = ['./fcos_r50_caffe_fpn_gn-head_4x4_2x_coco.py']
+model = dict(
+ pretrained='open-mmlab://detectron/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..472f7269e46d8f3730b09db5443420ac971058b4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r101_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,44 @@
+_base_ = './fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron/resnet101_caffe',
+ backbone=dict(depth=101))
+img_norm_cfg = dict(
+ mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4697e9e7efc86771b6dfc6dabd36b8e2b1788b09
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_4x4_1x_coco.py
@@ -0,0 +1,106 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ type='FCOS',
+ pretrained='open-mmlab://detectron/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ norm_eval=True,
+ style='caffe'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs=True,
+ extra_convs_on_inputs=False, # use P5
+ num_outs=5,
+ relu_before_extra_convs=True),
+ bbox_head=dict(
+ type='FCOSHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ strides=[8, 16, 32, 64, 128],
+ norm_cfg=None,
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='IoULoss', loss_weight=1.0),
+ loss_centerness=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
+img_norm_cfg = dict(
+ mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(
+ lr=0.01, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='constant',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[8, 11])
+total_epochs = 12
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b0bcad9e101e4a661f8995d7aba54ef86517ba59
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py
@@ -0,0 +1,105 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ type='FCOS',
+ pretrained='open-mmlab://detectron/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ norm_eval=True,
+ style='caffe'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs=True,
+ extra_convs_on_inputs=False, # use P5
+ num_outs=5,
+ relu_before_extra_convs=True),
+ bbox_head=dict(
+ type='FCOSHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ strides=[8, 16, 32, 64, 128],
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='IoULoss', loss_weight=1.0),
+ loss_centerness=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
+img_norm_cfg = dict(
+ mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(
+ lr=0.01, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='constant',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[8, 11])
+total_epochs = 12
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..3a3ccc149b9458bec0e133692e771473d6cd0c18
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_4x4_2x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5983c00f9a005779d71dac9ee84e590e2ee16ec7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_r50_caffe_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,39 @@
+_base_ = './fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+img_norm_cfg = dict(
+ mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..dc27edd6084d867f4b7bb048cd87492fd6d7ed3c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fcos/fcos_x101_64x4d_fpn_gn-head_mstrain_640-800_4x2_2x_coco.py
@@ -0,0 +1,59 @@
+_base_ = './fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(
+ lr=0.01, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ae2db0e1a01d236d6bdacbf92fff77c596815719
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/README.md
@@ -0,0 +1,36 @@
+# FoveaBox: Beyond Anchor-based Object Detector
+
+FoveaBox is an accurate, flexible and completely anchor-free object detection framework, as presented in our paper [https://arxiv.org/abs/1904.03797](https://arxiv.org/abs/1904.03797):
+Different from previous anchor-based methods, FoveaBox directly learns the possibility that an object exists and the bounding box coordinates without anchor references. This is achieved by (a) predicting category-sensitive semantic maps for the object existence possibility, and (b) producing a category-agnostic bounding box for each position that potentially contains an object.
+
+## Main Results
+### Results on R50/101-FPN
+
+| Backbone | Style | align | ms-train| Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------:|:-------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | pytorch | N | N | 1x | 5.6 | 24.1 | 36.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r50_fpn_4x4_1x_coco/fovea_r50_fpn_4x4_1x_coco_20200219-ee4d5303.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r50_fpn_4x4_1x_coco/fovea_r50_fpn_4x4_1x_coco_20200219_223025.log.json) |
+| R-50 | pytorch | N | N | 2x | 5.6 | - | 37.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r50_fpn_4x4_2x_coco/fovea_r50_fpn_4x4_2x_coco_20200203-2df792b1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r50_fpn_4x4_2x_coco/fovea_r50_fpn_4x4_2x_coco_20200203_112043.log.json) |
+| R-50 | pytorch | Y | N | 2x | 8.1 | 19.4 | 37.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r50_fpn_gn-head_4x4_2x_coco/fovea_align_r50_fpn_gn-head_4x4_2x_coco_20200203-8987880d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r50_fpn_gn-head_4x4_2x_coco/fovea_align_r50_fpn_gn-head_4x4_2x_coco_20200203_134252.log.json) |
+| R-50 | pytorch | Y | Y | 2x | 8.1 | 18.3 | 40.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco_20200205-85ce26cb.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco_20200205_112557.log.json) |
+| R-101 | pytorch | N | N | 1x | 9.2 | 17.4 | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r101_fpn_4x4_1x_coco/fovea_r101_fpn_4x4_1x_coco_20200219-05e38f1c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r101_fpn_4x4_1x_coco/fovea_r101_fpn_4x4_1x_coco_20200219_011740.log.json) |
+| R-101 | pytorch | N | N | 2x | 11.7 | - | 40.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r101_fpn_4x4_2x_coco/fovea_r101_fpn_4x4_2x_coco_20200208-02320ea4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_r101_fpn_4x4_2x_coco/fovea_r101_fpn_4x4_2x_coco_20200208_202059.log.json) |
+| R-101 | pytorch | Y | N | 2x | 11.7 | 14.7 | 40.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r101_fpn_gn-head_4x4_2x_coco/fovea_align_r101_fpn_gn-head_4x4_2x_coco_20200208-c39a027a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r101_fpn_gn-head_4x4_2x_coco/fovea_align_r101_fpn_gn-head_4x4_2x_coco_20200208_203337.log.json) |
+| R-101 | pytorch | Y | Y | 2x | 11.7 | 14.7 | 42.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco_20200208-649c5eb6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/foveabox/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco_20200208_202124.log.json) |
+
+[1] *1x and 2x mean the model is trained for 12 and 24 epochs, respectively.* \
+[2] *Align means utilizing deformable convolution to align the cls branch; a minimal config sketch follows these notes.* \
+[3] *All results are obtained with a single model and without any test-time data augmentation.* \
+[4] *We use 4 GPUs for training.*
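+
+The align variants only toggle the head options below on top of the base config; this is a minimal sketch mirroring `fovea_align_r50_fpn_gn-head_4x4_2x_coco.py` in this directory.
+
+```
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+model = dict(
+    bbox_head=dict(
+        with_deform=True,  # align the cls branch with deformable convolution
+        norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)))
+```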
+
+Any pull requests or issues are welcome.
+
+## Citations
+Please consider citing our paper in your publications if this project helps your research. The BibTeX reference is as follows.
+```
+@article{kong2019foveabox,
+ title={FoveaBox: Beyond Anchor-based Object Detector},
+ author={Kong, Tao and Sun, Fuchun and Liu, Huaping and Jiang, Yuning and Shi, Jianbo},
+ journal={arXiv preprint arXiv:1904.03797},
+ year={2019}
+}
+```
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r101_fpn_gn-head_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r101_fpn_gn-head_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..49a99af2b1ce205c70df26b877345b9fccbbdd16
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r101_fpn_gn-head_4x4_2x_coco.py
@@ -0,0 +1,10 @@
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ bbox_head=dict(
+ with_deform=True,
+ norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..3f35dd6d5c207c66ebb0514035290eb05818c1a2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r101_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,27 @@
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ bbox_head=dict(
+ with_deform=True,
+ norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r50_fpn_gn-head_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r50_fpn_gn-head_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..47cf1125fcca6e0b06774377ea10a62c864a13ca
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r50_fpn_gn-head_4x4_2x_coco.py
@@ -0,0 +1,10 @@
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+model = dict(
+ bbox_head=dict(
+ with_deform=True,
+ norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e80310eab6bbaf0b716f3961408e6586ae2d41d2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_align_r50_fpn_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,25 @@
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+model = dict(
+ bbox_head=dict(
+ with_deform=True,
+ norm_cfg=dict(type='GN', num_groups=32, requires_grad=True)))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r101_fpn_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r101_fpn_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..907bede158c7043d2a3b0d9daf64a0b6a13bc83c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r101_fpn_4x4_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r101_fpn_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r101_fpn_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..92963935466ab2db968a8f241420c9795ab2b1b0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r101_fpn_4x4_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './fovea_r50_fpn_4x4_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r50_fpn_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r50_fpn_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4b62c81212e77fedc8581a855077f9b541ff67a2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r50_fpn_4x4_1x_coco.py
@@ -0,0 +1,52 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ type='FOVEA',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ num_outs=5,
+ add_extra_convs='on_input'),
+ bbox_head=dict(
+ type='FoveaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ strides=[8, 16, 32, 64, 128],
+ base_edge_list=[16, 32, 64, 128, 256],
+ scale_ranges=((1, 64), (32, 128), (64, 256), (128, 512), (256, 2048)),
+ sigma=0.4,
+ with_deform=False,
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=1.50,
+ alpha=0.4,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict()
+test_cfg = dict(
+ nms_pre=1000,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
+data = dict(samples_per_gpu=4, workers_per_gpu=4)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r50_fpn_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r50_fpn_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b4559bb3d9ee631f6e3ca38a9692ac886431a7c8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/foveabox/fovea_r50_fpn_4x4_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './fovea_r50_fpn_4x4_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fp16/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5071e0dc70e345b9fa8bf35816fb0772ee5a64e6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/README.md
@@ -0,0 +1,19 @@
+# Mixed Precision Training
+
+## Introduction
+```
+@article{micikevicius2017mixed,
+ title={Mixed precision training},
+ author={Micikevicius, Paulius and Narang, Sharan and Alben, Jonah and Diamos, Gregory and Elsen, Erich and Garcia, David and Ginsburg, Boris and Houston, Michael and Kuchaiev, Oleksii and Venkatesh, Ganesh and others},
+ journal={arXiv preprint arXiv:1710.03740},
+ year={2017}
+}
+```
+
+## Results and Models
+
+| Architecture | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:------------:|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| Faster R-CNN | R-50 | pytorch | 1x | 3.4 | 28.8 | 37.5 | - |[model](http://download.openmmlab.com/mmdetection/v2.0/fp16/faster_rcnn_r50_fpn_fp16_1x_coco/faster_rcnn_r50_fpn_fp16_1x_coco_20200204-d4dc1471.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fp16/faster_rcnn_r50_fpn_fp16_1x_coco/faster_rcnn_r50_fpn_fp16_1x_coco_20200204_143530.log.json) |
+| Mask R-CNN | R-50 | pytorch | 1x | 3.6 | 24.1 | 38.1 | 34.7 |[model](http://download.openmmlab.com/mmdetection/v2.0/fp16/mask_rcnn_r50_fpn_fp16_1x_coco/mask_rcnn_r50_fpn_fp16_1x_coco_20200205-59faf7e4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fp16/mask_rcnn_r50_fpn_fp16_1x_coco/mask_rcnn_r50_fpn_fp16_1x_coco_20200205_130539.log.json) |
+| RetinaNet | R-50 | pytorch | 1x | 2.8 | 31.6 | 36.4 | - |[model](http://download.openmmlab.com/mmdetection/v2.0/fp16/retinanet_r50_fpn_fp16_1x_coco/retinanet_r50_fpn_fp16_1x_coco_20200702-0dbfb212.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fp16/retinanet_r50_fpn_fp16_1x_coco/retinanet_r50_fpn_fp16_1x_coco_20200702_020127.log.json) |
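+
+All three entries are produced by taking the corresponding base config and switching on mixed precision through a single `fp16` field, as in the config files in this folder. For example, the Faster R-CNN entry reduces to:
+
+```python
+# Mixed precision is enabled by adding an `fp16` dict with a static loss
+# scale on top of the unchanged fp32 base config.
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+fp16 = dict(loss_scale=512.)  # static loss scaling factor
+```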
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fp16/faster_rcnn_r50_fpn_fp16_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/faster_rcnn_r50_fpn_fp16_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..78fa5b6c6a895cb04e1813462ed6a7eefd8c1fa6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/faster_rcnn_r50_fpn_fp16_1x_coco.py
@@ -0,0 +1,3 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+# fp16 settings
+fp16 = dict(loss_scale=512.)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fp16/mask_rcnn_r50_fpn_fp16_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/mask_rcnn_r50_fpn_fp16_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f506ea815fedd6faefad9a06d7f466b86e8d2622
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/mask_rcnn_r50_fpn_fp16_1x_coco.py
@@ -0,0 +1,3 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+# fp16 settings
+fp16 = dict(loss_scale=512.)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fp16/retinanet_r50_fpn_fp16_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/retinanet_r50_fpn_fp16_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..519c4dbacb1a876dcd973f2a82ddeef98787619d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fp16/retinanet_r50_fpn_fp16_1x_coco.py
@@ -0,0 +1,3 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+# fp16 settings
+fp16 = dict(loss_scale=512.)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..5353991f347fa36858a2f983594d3a00a963f956
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/README.md
@@ -0,0 +1,24 @@
+# FreeAnchor: Learning to Match Anchors for Visual Object Detection
+
+## Introduction
+
+```
+@inproceedings{zhang2019freeanchor,
+ title = {{FreeAnchor}: Learning to Match Anchors for Visual Object Detection},
+ author = {Zhang, Xiaosong and Wan, Fang and Liu, Chang and Ji, Rongrong and Ye, Qixiang},
+ booktitle = {Neural Information Processing Systems},
+ year = {2019}
+}
+```
+
+## Results and Models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:--------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | pytorch | 1x | 4.9 | 18.4 | 38.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/free_anchor/retinanet_free_anchor_r50_fpn_1x_coco/retinanet_free_anchor_r50_fpn_1x_coco_20200130-0f67375f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/free_anchor/retinanet_free_anchor_r50_fpn_1x_coco/retinanet_free_anchor_r50_fpn_1x_coco_20200130_095625.log.json) |
+| R-101 | pytorch | 1x | 6.8 | 14.9 | 40.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/free_anchor/retinanet_free_anchor_r101_fpn_1x_coco/retinanet_free_anchor_r101_fpn_1x_coco_20200130-358324e6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/free_anchor/retinanet_free_anchor_r101_fpn_1x_coco/retinanet_free_anchor_r101_fpn_1x_coco_20200130_100723.log.json) |
+| X-101-32x4d | pytorch | 1x | 8.1 | 11.1 | 41.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/free_anchor/retinanet_free_anchor_x101_32x4d_fpn_1x_coco/retinanet_free_anchor_x101_32x4d_fpn_1x_coco_20200130-d4846968.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/free_anchor/retinanet_free_anchor_x101_32x4d_fpn_1x_coco/retinanet_free_anchor_x101_32x4d_fpn_1x_coco_20200130_095627.log.json) |
+
+**Notes:**
+- We use 8 GPUs with 2 images/GPU.
+- For more settings and models, please refer to the [official repo](https://github.com/zhangxiaosong18/FreeAnchor).
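+
+In these configs, FreeAnchor is obtained from the standard RetinaNet baseline by swapping in a `FreeAnchorRetinaHead`, which learns the anchor-object matching instead of relying on fixed IoU-based assignment. The R-50 config in this folder is small enough to quote in full:
+
+```python
+# R-50 FreeAnchor config: only the bbox head and the gradient clipping
+# change relative to the RetinaNet baseline.
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+model = dict(
+    bbox_head=dict(
+        _delete_=True,                # discard the default RetinaHead settings
+        type='FreeAnchorRetinaHead',  # learned anchor matching
+        num_classes=80,
+        in_channels=256,
+        stacked_convs=4,
+        feat_channels=256,
+        anchor_generator=dict(
+            type='AnchorGenerator',
+            octave_base_scale=4,
+            scales_per_octave=3,
+            ratios=[0.5, 1.0, 2.0],
+            strides=[8, 16, 32, 64, 128]),
+        bbox_coder=dict(
+            type='DeltaXYWHBBoxCoder',
+            target_means=[.0, .0, .0, .0],
+            target_stds=[0.1, 0.1, 0.2, 0.2]),
+        loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=0.75)))
+optimizer_config = dict(
+    _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+```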
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9917d5c4dc8b9c0149a963e24ecfa1098c1a9995
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './retinanet_free_anchor_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..28f983c29edd071b32a50f18ac7b3f5c1bfdda88
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_r50_fpn_1x_coco.py
@@ -0,0 +1,22 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='FreeAnchorRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=0.75)))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e2640c07e86db2d8cc2e6654c78077df10789b4c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/free_anchor/retinanet_free_anchor_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,12 @@
+_base_ = './retinanet_free_anchor_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..1c70066e8518d4f3d5fc69204b9dff15b7893a30
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/README.md
@@ -0,0 +1,39 @@
+# Feature Selective Anchor-Free Module for Single-Shot Object Detection
+
+FSAF is an anchor-free method published in CVPR 2019 ([https://arxiv.org/pdf/1903.00621.pdf](https://arxiv.org/pdf/1903.00621.pdf)).
+In practice it is equivalent to an anchor-based method with a single anchor at each feature-map position in each FPN level,
+and this is how we implement it here.
+Only the anchor-free branch is released, since it is more compatible with the current framework and has a smaller computational budget.
+
+In the original paper, feature-map locations within the central 0.2-0.5 region of a ground-truth box are tagged as ignored.
+However, we empirically found that a hard threshold (0.2-0.2) gives a further performance gain (see the table below; the config sketch after the notes shows how these thresholds map onto the assigner settings).
+
+## Main Results
+### Results on R50/R101/X101-FPN
+
+| Backbone | ignore range | ms-train | Lr schd | Train Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Download |
+|:----------:| :-------: |:-------:|:-------:|:------------:|:---------------:|:--------------:|:-------------:|:--------:|
+| R-50 | 0.2-0.5 | N | 1x | 3.15 | 0.43 | 12.3 | 36.0 (35.9) | [model](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_pscale0.2_nscale0.5_r50_fpn_1x_coco/fsaf_pscale0.2_nscale0.5_r50_fpn_1x_coco_20200715-b555b0e0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_pscale0.2_nscale0.5_r50_fpn_1x_coco/fsaf_pscale0.2_nscale0.5_r50_fpn_1x_coco_20200715_094657.log.json) |
+| R-50 | 0.2-0.2 | N | 1x | 3.15 | 0.43 | 13.0 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r50_fpn_1x_coco/fsaf_r50_fpn_1x_coco-94ccc51f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r50_fpn_1x_coco/fsaf_r50_fpn_1x_coco_20200428_072327.log.json)|
+| R-101 | 0.2-0.2 | N | 1x | 5.08 | 0.58 | 10.8 | 39.3 (37.9) | [model](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r101_fpn_1x_coco/fsaf_r101_fpn_1x_coco-9e71098f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r101_fpn_1x_coco/fsaf_r101_fpn_1x_coco_20200428_160348.log.json)|
+| X-101 | 0.2-0.2 | N | 1x | 9.38 | 1.23 | 5.6 | 42.4 (41.0) | [model](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_x101_64x4d_fpn_1x_coco/fsaf_x101_64x4d_fpn_1x_coco-e3f6e6fd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_x101_64x4d_fpn_1x_coco/fsaf_x101_64x4d_fpn_1x_coco_20200428_160424.log.json)|
+
+**Notes:**
+ - *1x means the model is trained for 12 epochs.*
+ - *AP values in the brackets represent those reported in the original paper.*
+ - *All results are obtained with a single model and single-scale test.*
+ - *X-101 backbone represents ResNext-101-64x4d.*
+ - *All pretrained backbones use pytorch style.*
+ - *All models are trained on 8 Titan-XP GPUs and tested on a single GPU.*
+
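+The ignore range above maps onto the `CenterRegionAssigner` settings in `fsaf_r50_fpn_1x_coco.py`: `pos_scale` is the central fraction of a ground-truth box labelled as positive and `neg_scale` is the fraction outside of which locations become negative, so everything in between is ignored. As a hypothetical override (no such config file ships in this folder), the 0.2-0.5 row could be reproduced roughly like this:
+
+```python
+# Hypothetical child config: widen the ignored ring from the released
+# 0.2-0.2 setting to the paper's 0.2-0.5 setting by changing neg_scale only.
+_base_ = './fsaf_r50_fpn_1x_coco.py'
+train_cfg = dict(
+    assigner=dict(
+        pos_scale=0.2,   # central region labelled positive
+        neg_scale=0.5))  # locations between 0.2 and 0.5 are ignored
+```
+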
+## Citations
+The BibTeX reference is as follows.
+```
+@inproceedings{zhu2019feature,
+ title={Feature Selective Anchor-Free Module for Single-Shot Object Detection},
+ author={Zhu, Chenchen and He, Yihui and Savvides, Marios},
+ booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+ pages={840--849},
+ year={2019}
+}
+```
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..95a7ae2de598f5c89ddf8f0f82be653aa85bd3e6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './fsaf_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..deb14528efc266e1850e22fb6c171c40e6f7b997
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_r50_fpn_1x_coco.py
@@ -0,0 +1,50 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+# model settings
+model = dict(
+ type='FSAF',
+ bbox_head=dict(
+ type='FSAFHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ reg_decoded_bbox=True,
+ # Only anchor-free branch is implemented. The anchor generator only
+ # generates 1 anchor at each feature point, as a substitute of the
+ # grid of features.
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=1,
+ scales_per_octave=1,
+ ratios=[1.0],
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(_delete_=True, type='TBLRBBoxCoder', normalizer=4.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0,
+ reduction='none'),
+ loss_bbox=dict(
+ _delete_=True,
+ type='IoULoss',
+ eps=1e-6,
+ loss_weight=1.0,
+ reduction='none'),
+ ))
+
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ _delete_=True,
+ type='CenterRegionAssigner',
+ pos_scale=0.2,
+ neg_scale=0.2,
+ min_pos_iof=0.01),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=10, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b966f24969a60b95878b0b86bb8dae7b8cb3f1ae
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/fsaf/fsaf_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './fsaf_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a871fe99a40795e3d02e8bcc0d2b6939d3ebec32
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/README.md
@@ -0,0 +1,56 @@
+# GCNet for Object Detection
+
+By [Yue Cao](http://yue-cao.me), [Jiarui Xu](http://jerryxu.net), [Stephen Lin](https://scholar.google.com/citations?user=c3PYmxUAAAAJ&hl=en), Fangyun Wei, [Han Hu](https://sites.google.com/site/hanhushomepage/).
+
+We provide config files to reproduce the results in the paper for
+["GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond"](https://arxiv.org/abs/1904.11492) on COCO object detection.
+
+## Introduction
+
+**GCNet** is initially described in the [arXiv paper](https://arxiv.org/abs/1904.11492). By absorbing the advantages of Non-Local Networks (NLNet) and Squeeze-Excitation Networks (SENet), GCNet provides a simple, fast and effective approach to global context modeling, which generally outperforms both NLNet and SENet on major benchmarks for various recognition tasks.
+
+## Citing GCNet
+
+```
+@article{cao2019GCNet,
+ title={GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond},
+ author={Cao, Yue and Xu, Jiarui and Lin, Stephen and Wei, Fangyun and Hu, Han},
+ journal={arXiv preprint arXiv:1904.11492},
+ year={2019}
+}
+```
+
+## Results and models
+The results on COCO 2017val are shown in the tables below.
+
+| Backbone | Model | Context | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------: | :--------------: | :------------: | :-----: | :------: | :------------: | :----: | :-----: | :-------: |
+| R-50-FPN | Mask | GC(c3-c5, r16) | 1x | 5.0 | | 39.7 | 35.9 |[model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco_20200515_211915-187da160.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco_20200515_211915.log.json) |
+| R-50-FPN | Mask | GC(c3-c5, r4) | 1x | 5.1 | 15.0 | 39.9 | 36.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco_20200204-17235656.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco_20200204_024626.log.json) |
+| R-101-FPN | Mask | GC(c3-c5, r16) | 1x | 7.6 | 11.4 | 41.3 | 37.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco_20200205-e58ae947.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco_20200205_192835.log.json) |
+| R-101-FPN | Mask | GC(c3-c5, r4) | 1x | 7.8 | 11.6 | 42.2 | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco_20200206-af22dc9d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco_20200206_112128.log.json) |
+
+| Backbone | Model | Context | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------: | :--------------: | :------------: | :-----: | :------: | :------------: | :----: | :-----: | :-------: |
+| R-50-FPN | Mask | - | 1x | 4.4 | 16.6 | 38.4 | 34.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco_20200202-bb3eb55c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco_20200202_214122.log.json) |
+| R-50-FPN | Mask | GC(c3-c5, r16) | 1x | 5.0 | 15.5 | 40.4 | 36.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200202-587b99aa.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200202_174907.log.json) |
+| R-50-FPN | Mask | GC(c3-c5, r4) | 1x | 5.1 | 15.1 | 40.7 | 36.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200202-50b90e5c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200202_085547.log.json) |
+| R-101-FPN | Mask | - | 1x | 6.4 | 13.3 | 40.5 | 36.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco_20200210-81658c8a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco_20200210_220422.log.json) |
+| R-101-FPN | Mask | GC(c3-c5, r16) | 1x | 7.6 | 12.0 | 42.2 | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200207-945e77ca.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200207_015330.log.json) |
+| R-101-FPN | Mask | GC(c3-c5, r4) | 1x | 7.8 | 11.8 | 42.2 | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200206-8407a3f0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200206_142508.log.json) |
+| X-101-FPN | Mask | - | 1x | 7.6 | 11.3 | 42.4 | 37.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco_20200211-7584841c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco_20200211_054326.log.json) |
+| X-101-FPN | Mask | GC(c3-c5, r16) | 1x | 8.8 | 9.8 | 43.5 | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200211-cbed3d2c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200211_164715.log.json) |
+| X-101-FPN | Mask | GC(c3-c5, r4) | 1x | 9.0 | 9.7 | 43.9 | 39.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200212-68164964.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200212_070942.log.json) |
+| X-101-FPN | Cascade Mask | - | 1x | 9.2 | 8.4 | 44.7 | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco_20200310-d5ad2a5e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco_20200310_115217.log.json) |
+| X-101-FPN | Cascade Mask | GC(c3-c5, r16) | 1x | 10.3 | 7.7 | 46.2 | 39.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200211-10bf2463.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco_20200211_184154.log.json) |
+| X-101-FPN | Cascade Mask | GC(c3-c5, r4) | 1x | 10.6 | | 46.4 | 40.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200703_180653-ed035291.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200703_180653.log.json) |
+| X-101-FPN | DCN Cascade Mask | - | 1x | | | 44.9 | 38.9 |[model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco_20200516_182249-680fc3f2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco_20200516_182249.log.json)|
+| X-101-FPN | DCN Cascade Mask | GC(c3-c5, r16) | 1x | | | 44.6 | |[model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco_20200516_015634-08f56b56.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco_20200516_015634.log.json) |
+| X-101-FPN | DCN Cascade Mask | GC(c3-c5, r4) | 1x | | | 45.7 | 39.5 |[model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco_20200518_041145-24cabcfd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco_20200518_041145.log.json) |
+
+**Notes:**
+
+- `SyncBN` is added to the backbone for all models in the second table.
+- `GC` denotes that a Global Context (GC) block is inserted after the 1x1 conv of the backbone.
+- `DCN` denotes that the 3x3 conv is replaced with a 3x3 deformable convolution in the `c3-c5` stages of the backbone.
+- `r4` and `r16` denote ratio 4 and ratio 16 in the GC block, respectively.
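+
+In these configs the GC block is injected into the backbone as a `ContextBlock` plugin placed after `conv3` of each bottleneck in stages c3-c5, and `r4`/`r16` simply change its `ratio` argument. A minimal sketch mirroring `mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco.py`:
+
+```python
+# How the GC(c3-c5, r16) rows are configured: a ContextBlock plugin is
+# enabled for stages c3-c5 (the last three entries of `stages`).
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+    backbone=dict(plugins=[
+        dict(
+            cfg=dict(type='ContextBlock', ratio=1. / 16),  # r16; use 1. / 4 for r4
+            stages=(False, True, True, True),              # skip c2, enable c3-c5
+            position='after_conv3')
+    ]))
+```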
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5118895f00345a42fdbc6d2edba084ccd3f1a3c8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True), norm_eval=False))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..464aef787de3c932dc3244a93e62cc3df83002ec
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True), norm_eval=False))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..fa4b6f12f36be74c6e1f7182db110893f9f4f0c4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../dcn/cascade_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b76e3e6bab7a32e95aec352829324b8865e63631
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_dconv_c3-c5_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../dcn/cascade_mask_rcnn_r50_fpn_dconv_c3-c5_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..50883ffeb16369ea6210f2ece8fc2d7e084b0134
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..31fdd070595ac0512a39075bb045dd18035d3f14
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/cascade_mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ad6ad47696e6aeb2b3505abab0bd2d49d3b7aa83
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..29f91674c6d54bfa6fdcfcb5b7e2ec2a2bbf81fa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6e1c5d0cadfb9fb3a4f8645e28a8e67fc499e900
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True), norm_eval=False))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..781dba78d68e77fa7eee15f5bbcc539731f8378d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..32972de857b3c4f43170dcd3e7fbce76425f094d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d299b69f576a2547de1f7d9edd171d56ab002d0a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..caf23e696c20064dabaa1c805efec1c02485fb80
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,9 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_poly_1x_coco.py'
+# _base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0308a567c147413688c9da679d06f93b0e154d88
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True), norm_eval=False))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e04780c50f96929997c279b23fe5fa427657039b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..980f8191d4c07eb35e338bd87e3b73b06b3214ad
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f0c96e58b6131f2958f28c56b9d8384d5b4746f7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True), norm_eval=False))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7fb8e82ece225ab6f88f1f4f83bea56a42cf1a57
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r16_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 16),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b1ddbee3b4b79e79bb2a3faf30604f2465612728
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gcnet/mask_rcnn_x101_32x4d_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = '../mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ backbone=dict(
+ norm_cfg=dict(type='SyncBN', requires_grad=True),
+ norm_eval=False,
+ plugins=[
+ dict(
+ cfg=dict(type='ContextBlock', ratio=1. / 4),
+ stages=(False, True, True, True),
+ position='after_conv3')
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..3c51b82173f403cb48f4c04465f2e057f829f52d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/README.md
@@ -0,0 +1,32 @@
+# Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
+
+
+## Introduction
+
+We provide config files to reproduce the object detection results in the paper [Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection](https://arxiv.org/abs/2006.04388).
+
+```
+@article{li2020generalized,
+ title={Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection},
+ author={Li, Xiang and Wang, Wenhai and Wu, Lijun and Chen, Shuo and Hu, Xiaolin and Li, Jun and Tang, Jinhui and Yang, Jian},
+ journal={arXiv preprint arXiv:2006.04388},
+ year={2020}
+}
+```
+
+
+## Results and Models
+
+| Backbone | Style | Lr schd | Multi-scale Training | Inf time (fps) | box AP | Download |
+|:-----------------:|:-------:|:-------:|:-------------------:|:--------------:|:------:|:--------:|
+| R-50 | pytorch | 1x | No | 19.5 | 40.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r50_fpn_1x_coco/gfl_r50_fpn_1x_coco_20200629_121244-25944287.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r50_fpn_1x_coco/gfl_r50_fpn_1x_coco_20200629_121244.log.json) |
+| R-50 | pytorch | 2x | Yes | 19.5 | 42.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r50_fpn_mstrain_2x_coco/gfl_r50_fpn_mstrain_2x_coco_20200629_213802-37bb1edc.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r50_fpn_mstrain_2x_coco/gfl_r50_fpn_mstrain_2x_coco_20200629_213802.log.json) |
+| R-101 | pytorch | 2x | Yes | 14.7 | 44.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r101_fpn_mstrain_2x_coco/gfl_r101_fpn_mstrain_2x_coco_20200629_200126-dd12f847.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r101_fpn_mstrain_2x_coco/gfl_r101_fpn_mstrain_2x_coco_20200629_200126.log.json) |
+| R-101-dcnv2 | pytorch | 2x | Yes | 12.9 | 47.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco_20200630_102002-134b07df.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco_20200630_102002.log.json) |
+| X-101-32x4d | pytorch | 2x | Yes | 12.1 | 45.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_x101_32x4d_fpn_mstrain_2x_coco/gfl_x101_32x4d_fpn_mstrain_2x_coco_20200630_102002-50c1ffdb.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_x101_32x4d_fpn_mstrain_2x_coco/gfl_x101_32x4d_fpn_mstrain_2x_coco_20200630_102002.log.json) |
+| X-101-32x4d-dcnv2 | pytorch | 2x | Yes | 10.7 | 48.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco_20200630_102002-14a2bf25.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gfl/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco_20200630_102002.log.json) |
+
+[1] *1x and 2x mean the model is trained for 90K and 180K iterations, respectively.* \
+[2] *All results are obtained with a single model and without any test-time data augmentation such as multi-scale testing or flipping.* \
+[3] *`dcnv2` denotes deformable convolutional networks v2.* \
+[4] *FPS is tested with a single GeForce RTX 2080Ti GPU, using a batch size of 1.*
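+
+The `dcnv2` rows take the corresponding multi-scale 2x config and additionally enable deformable convolutions in the backbone. A condensed equivalent of `gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py` as shipped in this folder:
+
+```python
+# Condensed sketch of the R-101-dcnv2 entry: deformable convolutions are
+# turned on for stages c3-c5 on top of the multi-scale 2x schedule.
+_base_ = './gfl_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+    pretrained='torchvision://resnet101',
+    backbone=dict(
+        depth=101,
+        dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+        stage_with_dcn=(False, True, True, True)))
+```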
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..eab622b2e8bdc03c717b9b04d043da46f25a7cb3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r101_fpn_dconv_c3-c5_mstrain_2x_coco.py
@@ -0,0 +1,14 @@
+_base_ = './gfl_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(
+ type='ResNet',
+ depth=101,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r101_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r101_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c972d0c159676a81d997e033e4db0a2a6d9b87e2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r101_fpn_mstrain_2x_coco.py
@@ -0,0 +1,12 @@
+_base_ = './gfl_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(
+ type='ResNet',
+ depth=101,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..77a15ebce3761fe435dcb3c2bc97dd1300ba6633
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r50_fpn_1x_coco.py
@@ -0,0 +1,57 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ type='GFL',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_output',
+ num_outs=5),
+ bbox_head=dict(
+ type='GFLHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ octave_base_scale=8,
+ scales_per_octave=1,
+ strides=[8, 16, 32, 64, 128]),
+ loss_cls=dict(
+ type='QualityFocalLoss',
+ use_sigmoid=True,
+ beta=2.0,
+ loss_weight=1.0),
+ loss_dfl=dict(type='DistributionFocalLoss', loss_weight=0.25),
+ reg_max=16,
+ loss_bbox=dict(type='GIoULoss', loss_weight=2.0)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(type='ATSSAssigner', topk=9),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.6),
+ max_per_img=100)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r50_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r50_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..bebfee9f8fdebb8da3bf791a65b0dab8de3fb582
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_r50_fpn_mstrain_2x_coco.py
@@ -0,0 +1,22 @@
+_base_ = './gfl_r50_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
+# multi-scale training
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 480), (1333, 800)],
+ multiscale_mode='range',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a2370e234dfec0099aaf74c46a3a85052d882385
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_x101_32x4d_fpn_dconv_c4-c5_mstrain_2x_coco.py
@@ -0,0 +1,17 @@
+_base_ = './gfl_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ type='GFL',
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, False, True, True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_x101_32x4d_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_x101_32x4d_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4e00a059f8d2e58d23d6b77764456be351bd3115
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gfl/gfl_x101_32x4d_fpn_mstrain_2x_coco.py
@@ -0,0 +1,15 @@
+_base_ = './gfl_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ type='GFL',
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ghm/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..46bd05f71e10a8583afcbeb72bf89986283b1632
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/README.md
@@ -0,0 +1,21 @@
+# Gradient Harmonized Single-stage Detector
+
+## Introduction
+
+```
+@inproceedings{li2019gradient,
+ title={Gradient Harmonized Single-stage Detector},
+ author={Li, Buyu and Liu, Yu and Wang, Xiaogang},
+ booktitle={AAAI Conference on Artificial Intelligence},
+ year={2019}
+}
+```
+
+## Results and Models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+| R-50-FPN | pytorch | 1x | 4.0 | 3.3 | 37.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_r50_fpn_1x_coco/retinanet_ghm_r50_fpn_1x_coco_20200130-a437fda3.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_r50_fpn_1x_coco/retinanet_ghm_r50_fpn_1x_coco_20200130_004213.log.json) |
+| R-101-FPN | pytorch | 1x | 6.0 | 4.4 | 39.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_r101_fpn_1x_coco/retinanet_ghm_r101_fpn_1x_coco_20200130-c148ee8f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_r101_fpn_1x_coco/retinanet_ghm_r101_fpn_1x_coco_20200130_145259.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 7.2 | 5.1 | 40.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_x101_32x4d_fpn_1x_coco/retinanet_ghm_x101_32x4d_fpn_1x_coco_20200131-e4333bd0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_x101_32x4d_fpn_1x_coco/retinanet_ghm_x101_32x4d_fpn_1x_coco_20200131_113653.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 10.3 | 5.2 | 41.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_x101_64x4d_fpn_1x_coco/retinanet_ghm_x101_64x4d_fpn_1x_coco_20200131-dd381cef.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ghm/retinanet_ghm_x101_64x4d_fpn_1x_coco/retinanet_ghm_x101_64x4d_fpn_1x_coco_20200131_113723.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..18f899a9b456383a8f74053e4716aee50ee5ec8c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './retinanet_ghm_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..61b9751057f10f2173b8e7edde12cca53ebbd2d0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_r50_fpn_1x_coco.py
@@ -0,0 +1,19 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ bbox_head=dict(
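+ # _delete_=True replaces the losses inherited from the RetinaNet base config instead of merging with them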
+ loss_cls=dict(
+ _delete_=True,
+ type='GHMC',
+ bins=30,
+ momentum=0.75,
+ use_sigmoid=True,
+ loss_weight=1.0),
+ loss_bbox=dict(
+ _delete_=True,
+ type='GHMR',
+ mu=0.02,
+ bins=10,
+ momentum=0.7,
+ loss_weight=10.0)))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a89fc1389ce0f1f9712b4b5d684e632aaee25ce8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './retinanet_ghm_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..88013f5ffa2334fe3eccd30616a0b033c258ad87
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ghm/retinanet_ghm_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './retinanet_ghm_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a341ee9a01a6783dd716810900df90277949c6f0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/README.md
@@ -0,0 +1,42 @@
+# Weight Standardization
+
+## Introduction
+
+```
+@article{weightstandardization,
+ author = {Siyuan Qiao and Huiyu Wang and Chenxi Liu and Wei Shen and Alan Yuille},
+ title = {Weight Standardization},
+ journal = {arXiv preprint arXiv:1903.10520},
+ year = {2019},
+}
+```
+
+## Results and Models
+
+Faster R-CNN
+
+| Backbone | Style | Normalization | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:---------:|:-------:|:-------------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN | pytorch | GN+WS | 1x | 5.9 | 11.7 | 39.7 | - | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_r50_fpn_gn_ws-all_1x_coco/faster_rcnn_r50_fpn_gn_ws-all_1x_coco_20200130-613d9fe2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_r50_fpn_gn_ws-all_1x_coco/faster_rcnn_r50_fpn_gn_ws-all_1x_coco_20200130_210936.log.json) |
+| R-101-FPN | pytorch | GN+WS | 1x | 8.9 | 9.0 | 41.7 | - | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_r101_fpn_gn_ws-all_1x_coco/faster_rcnn_r101_fpn_gn_ws-all_1x_coco_20200205-a93b0d75.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_r101_fpn_gn_ws-all_1x_coco/faster_rcnn_r101_fpn_gn_ws-all_1x_coco_20200205_232146.log.json) |
+| X-50-32x4d-FPN | pytorch | GN+WS | 1x | 7.0 | 10.3 | 40.7 | - | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco_20200203-839c5d9d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco_20200203_220113.log.json) |
+| X-101-32x4d-FPN | pytorch | GN+WS | 1x | 10.8 | 7.6 | 42.1 | - | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco_20200212-27da1bc2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco_20200212_195302.log.json) |
+
+Mask R-CNN
+
+| Backbone | Style | Normalization | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:---------:|:-------:|:-------------:|:---------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN | pytorch | GN+WS | 2x | 7.3 | 10.5 | 40.6 | 36.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r50_fpn_gn_ws-all_2x_coco/mask_rcnn_r50_fpn_gn_ws-all_2x_coco_20200226-16acb762.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r50_fpn_gn_ws-all_2x_coco/mask_rcnn_r50_fpn_gn_ws-all_2x_coco_20200226_062128.log.json) |
+| R-101-FPN | pytorch | GN+WS | 2x | 10.3 | 8.6 | 42.0 | 37.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r101_fpn_gn_ws-all_2x_coco/mask_rcnn_r101_fpn_gn_ws-all_2x_coco_20200212-ea357cd9.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r101_fpn_gn_ws-all_2x_coco/mask_rcnn_r101_fpn_gn_ws-all_2x_coco_20200212_213627.log.json) |
+| X-50-32x4d-FPN | pytorch | GN+WS | 2x | 8.4 | 9.3 | 41.1 | 37.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco_20200216-649fdb6f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco_20200216_201500.log.json) |
+| X-101-32x4d-FPN | pytorch | GN+WS | 2x | 12.2 | 7.1 | 42.1 | 37.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco_20200319-33fb95b5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco_20200319_104101.log.json) |
+| R-50-FPN | pytorch | GN+WS | 20-23-24e | 7.3 | - | 41.1 | 37.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco_20200213-487d1283.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco_20200213_035123.log.json) |
+| R-101-FPN | pytorch | GN+WS | 20-23-24e | 10.3 | - | 43.1 | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco_20200213-57b5a50f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco_20200213_130142.log.json) |
+| X-50-32x4d-FPN | pytorch | GN+WS | 20-23-24e | 8.4 | - | 42.1 | 38.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco_20200226-969bcb2c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco_20200226_093732.log.json) |
+| X-101-32x4d-FPN | pytorch | GN+WS | 20-23-24e | 12.2 | - | 42.7 | 38.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco_20200316-e6cd35ef.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn%2Bws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco_20200316_013741.log.json) |
+
+Note:
+
+- GN+WS requires about 5% more memory than GN, and it is only 5% slower than GN.
+- In the paper, a 20-23-24e lr schedule is used instead of 2x, i.e. the learning rate steps at epochs 20 and 23 and training runs for 24 epochs in total (see the config sketch below).
+- The X-50-GN and X-101-GN pretrained models are also shared by the authors.
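+
+For reference, a minimal sketch of how these configs enable weight-standardized
+convolutions and GN, mirroring `faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py` below:
+
+```python
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+# Weight Standardization is enabled through the conv layer type;
+# GN replaces BN in the backbone, neck and bbox head via norm_cfg.
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+    pretrained='open-mmlab://jhu/resnet50_gn_ws',
+    backbone=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg),
+    neck=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg),
+    roi_head=dict(
+        bbox_head=dict(
+            type='Shared4Conv1FCBBoxHead',
+            conv_out_channels=256,
+            conv_cfg=conv_cfg,
+            norm_cfg=norm_cfg)))
+```
+
+The 20-23-24e variants below additionally set `lr_config = dict(step=[20, 23])` and
+`total_epochs = 24` on top of the corresponding `2x` configs.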
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_r101_fpn_gn_ws-all_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_r101_fpn_gn_ws-all_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a5f6bd2292f4c1dfbd59de968e0dc3acf7579424
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_r101_fpn_gn_ws-all_1x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://jhu/resnet101_gn_ws', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..497267b6b50b3c160a4f8807230d4f986cf8eb3f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://jhu/resnet50_gn_ws',
+ backbone=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg),
+ neck=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg),
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared4Conv1FCBBoxHead',
+ conv_out_channels=256,
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..061ca6993606fe2c7bdb020eaf3b5ea8b91a9b8e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_x101_32x4d_fpn_gn_ws-all_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py'
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://jhu/resnext101_32x4d_gn_ws',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch',
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1268980615b69009a33b785eeb59322372633d10
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/faster_rcnn_x50_32x4d_fpn_gn_ws-all_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './faster_rcnn_r50_fpn_gn_ws-all_1x_coco.py'
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://jhu/resnext50_32x4d_gn_ws',
+ backbone=dict(
+ type='ResNeXt',
+ depth=50,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch',
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0912329cbe7c8da1b100945c978a274d60254aaa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r101_fpn_gn_ws-all_20_23_24e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_r101_fpn_gn_ws-all_2x_coco.py'
+# learning policy
+lr_config = dict(step=[20, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r101_fpn_gn_ws-all_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r101_fpn_gn_ws-all_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4be68176d2ed6f9b209823187f1367d204fe67d1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r101_fpn_gn_ws-all_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://jhu/resnet101_gn_ws', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..938910482f764e5a7ad31c29e9db9e29d65c2db7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r50_fpn_gn_ws-all_20_23_24e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py'
+# learning policy
+lr_config = dict(step=[20, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2032b932b1da461180ca9be08c56b5cd66d25873
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py
@@ -0,0 +1,17 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://jhu/resnet50_gn_ws',
+ backbone=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg),
+ neck=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg),
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared4Conv1FCBBoxHead',
+ conv_out_channels=256,
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg),
+ mask_head=dict(conv_cfg=conv_cfg, norm_cfg=norm_cfg)))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d3084e5cad5e0e909c18a2738e9cfd4e9586a48b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_20_23_24e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco.py'
+# learning policy
+lr_config = dict(step=[20, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..dbe88770ae5dffbed5229ed4a4e62f10b1c8d12b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x101_32x4d_fpn_gn_ws-all_2x_coco.py
@@ -0,0 +1,17 @@
+_base_ = './mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py'
+# model settings
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://jhu/resnext101_32x4d_gn_ws',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch',
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..68792e16c9e3533cb2e0e4d02c6eb049f0f72ed2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_20_23_24e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco.py'
+# learning policy
+lr_config = dict(step=[20, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9bbc86ead7003ab75264f8cf0cd18edb735fe9fd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn+ws/mask_rcnn_x50_32x4d_fpn_gn_ws-all_2x_coco.py
@@ -0,0 +1,17 @@
+_base_ = './mask_rcnn_r50_fpn_gn_ws-all_2x_coco.py'
+# model settings
+conv_cfg = dict(type='ConvWS')
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://jhu/resnext50_32x4d_gn_ws',
+ backbone=dict(
+ type='ResNeXt',
+ depth=50,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch',
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/gn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..9188f536762540327f653d0773011ecc51ffdfc6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/README.md
@@ -0,0 +1,28 @@
+# Group Normalization
+
+## Introduction
+
+```
+@inproceedings{wu2018group,
+ title={Group Normalization},
+ author={Wu, Yuxin and He, Kaiming},
+ booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
+ year={2018}
+}
+```
+
+## Results and Models
+
+| Backbone | model | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:-------------:|:----------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN (d) | Mask R-CNN | 2x | 7.1 | 11.0 | 40.2 | 36.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_2x_coco/mask_rcnn_r50_fpn_gn-all_2x_coco_20200206-8eee02a6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_2x_coco/mask_rcnn_r50_fpn_gn-all_2x_coco_20200206_050355.log.json) |
+| R-50-FPN (d) | Mask R-CNN | 3x | 7.1 | - | 40.5 | 36.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_3x_coco/mask_rcnn_r50_fpn_gn-all_3x_coco_20200214-8b23b1e5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_3x_coco/mask_rcnn_r50_fpn_gn-all_3x_coco_20200214_063512.log.json) |
+| R-101-FPN (d) | Mask R-CNN | 2x | 9.9 | 9.0 | 41.9 | 37.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r101_fpn_gn-all_2x_coco/mask_rcnn_r101_fpn_gn-all_2x_coco_20200205-d96b1b50.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r101_fpn_gn-all_2x_coco/mask_rcnn_r101_fpn_gn-all_2x_coco_20200205_234402.log.json) |
+| R-101-FPN (d) | Mask R-CNN | 3x | 9.9 | | 42.1 | 38.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r101_fpn_gn-all_3x_coco/mask_rcnn_r101_fpn_gn-all_3x_coco_20200513_181609-0df864f4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r101_fpn_gn-all_3x_coco/mask_rcnn_r101_fpn_gn-all_3x_coco_20200513_181609.log.json) |
+| R-50-FPN (c) | Mask R-CNN | 2x | 7.1 | 10.9 | 40.0 | 36.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco_20200207-20d3e849.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco_20200207_225832.log.json) |
+| R-50-FPN (c) | Mask R-CNN | 3x | 7.1 | - | 40.1 | 36.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco_20200225-542aefbc.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gn/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco_20200225_235135.log.json) |
+
+**Notes:**
+- (d) means pretrained model converted from Detectron, and (c) means the contributed model pretrained by [@thangvubk](https://github.com/thangvubk).
+- The `3x` schedule decays the learning rate at epochs 28 and 34 and trains for 36 epochs in total (see the snippet below).
+- **The memory and train/inference time figures are outdated.**
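+
+For reference, the `3x` variants in this folder simply override the schedule of the
+corresponding `2x` config, as in `mask_rcnn_r50_fpn_gn-all_3x_coco.py` below:
+
+```python
+_base_ = './mask_rcnn_r50_fpn_gn-all_2x_coco.py'
+# 3x schedule: decay the learning rate at epochs 28 and 34, train for 36 epochs
+lr_config = dict(step=[28, 34])
+total_epochs = 36
+```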
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r101_fpn_gn-all_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r101_fpn_gn-all_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0fcc558018b69beedbd05781163c8043d93f7277
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r101_fpn_gn-all_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './mask_rcnn_r50_fpn_gn-all_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron/resnet101_gn', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r101_fpn_gn-all_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r101_fpn_gn-all_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..67890c2a154e0e5c82bfeacd1d7355878bcdf19b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r101_fpn_gn-all_3x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './mask_rcnn_r101_fpn_gn-all_2x_coco.py'
+
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7cede4147a32d374ca8d048513493429410f699c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_2x_coco.py
@@ -0,0 +1,46 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://detectron/resnet50_gn',
+ backbone=dict(norm_cfg=norm_cfg),
+ neck=dict(norm_cfg=norm_cfg),
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared4Conv1FCBBoxHead',
+ conv_out_channels=256,
+ norm_cfg=norm_cfg),
+ mask_head=dict(norm_cfg=norm_cfg)))
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c0b0013829909ea7b3b68415fd89f35037eb77a8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_3x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './mask_rcnn_r50_fpn_gn-all_2x_coco.py'
+
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..3c690aecb9662b9e433200e4cd1e1ad3c330f3d9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_contrib_2x_coco.py
@@ -0,0 +1,15 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='open-mmlab://contrib/resnet50_gn',
+ backbone=dict(norm_cfg=norm_cfg),
+ neck=dict(norm_cfg=norm_cfg),
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared4Conv1FCBBoxHead',
+ conv_out_channels=256,
+ norm_cfg=norm_cfg),
+ mask_head=dict(norm_cfg=norm_cfg)))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6174861dfa53a5b3465d7e777a5a54b684077788
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/gn/mask_rcnn_r50_fpn_gn-all_contrib_3x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './mask_rcnn_r50_fpn_gn-all_contrib_2x_coco.py'
+
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..f5eb91d51055b9874e42ba58fd388a4bb8ccac13
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/README.md
@@ -0,0 +1,32 @@
+# Grid R-CNN
+
+## Introduction
+
+```
+@inproceedings{lu2019grid,
+ title={Grid r-cnn},
+ author={Lu, Xin and Li, Buyu and Yue, Yuxin and Li, Quanquan and Yan, Junjie},
+ booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2019}
+}
+
+@article{lu2019grid,
+ title={Grid R-CNN Plus: Faster and Better},
+ author={Lu, Xin and Li, Buyu and Yue, Yuxin and Li, Quanquan and Yan, Junjie},
+ journal={arXiv preprint arXiv:1906.05688},
+ year={2019}
+}
+```
+
+## Results and Models
+
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:-----------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50 | 2x | 5.1 | 15.0 | 40.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_r50_fpn_gn-head_2x_coco/grid_rcnn_r50_fpn_gn-head_2x_coco_20200130-6cca8223.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_r50_fpn_gn-head_2x_coco/grid_rcnn_r50_fpn_gn-head_2x_coco_20200130_221140.log.json) |
+| R-101 | 2x | 7.0 | 12.6 | 41.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_r101_fpn_gn-head_2x_coco/grid_rcnn_r101_fpn_gn-head_2x_coco_20200309-d6eca030.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_r101_fpn_gn-head_2x_coco/grid_rcnn_r101_fpn_gn-head_2x_coco_20200309_164224.log.json) |
+| X-101-32x4d | 2x | 8.3 | 10.8 | 42.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco_20200130-d8f0e3ff.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco_20200130_215413.log.json) |
+| X-101-64x4d | 2x | 11.3 | 7.7 | 43.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco_20200204-ec76a754.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/grid_rcnn/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco_20200204_080641.log.json) |
+
+**Notes:**
+- All models are trained with 8 GPUs instead of the 32 GPUs used in the original paper.
+- Warmup lasts for 1 epoch and `2x` here indicates 25 epochs (see the schedule snippet below).
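+
+For reference, the schedule described above corresponds to the following settings in
+`grid_rcnn_r50_fpn_gn-head_2x_coco.py` below:
+
+```python
+# learning policy: linear warmup for ~1 epoch, lr steps at epochs 17 and 23, 25 epochs total
+lr_config = dict(
+    policy='step',
+    warmup='linear',
+    warmup_iters=3665,
+    warmup_ratio=1.0 / 80,
+    step=[17, 23])
+total_epochs = 25
+```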
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r101_fpn_gn-head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r101_fpn_gn-head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..cf8b648a4291db4a172bf031f301110963f38dd6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r101_fpn_gn-head_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './grid_rcnn_r50_fpn_gn-head_2x_coco.py'
+
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r50_fpn_gn-head_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r50_fpn_gn-head_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..cc3e3ef594243be1335aa3b3d2f78f50f4477082
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r50_fpn_gn-head_1x_coco.py
@@ -0,0 +1,11 @@
+_base_ = ['../grid_rcnn/grid_rcnn_r50_fpn_gn-head_2x_coco.py']
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ step=[8, 11])
+checkpoint_config = dict(interval=1)
+# runtime settings
+total_epochs = 12
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r50_fpn_gn-head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r50_fpn_gn-head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1b40e039c1e8fd584908794755385e62416dd38f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_r50_fpn_gn-head_2x_coco.py
@@ -0,0 +1,135 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ type='GridRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ type='GridRoIHead',
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=dict(
+ type='Shared2FCBBoxHead',
+ with_reg=False,
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=False),
+ grid_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ grid_head=dict(
+ type='GridHead',
+ grid_points=9,
+ num_convs=8,
+ in_channels=256,
+ point_feat_channels=64,
+ norm_cfg=dict(type='GN', num_groups=36),
+ loss_grid=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=15))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ pos_radius=1,
+ pos_weight=-1,
+ max_num_grid=192,
+ debug=False))
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.03,
+ nms=dict(type='nms', iou_threshold=0.3),
+ max_per_img=100))
+# optimizer
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=3665,
+ warmup_ratio=1.0 / 80,
+ step=[17, 23])
+total_epochs = 25
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..515bbdf0aa8840c4bec273d1753f34faecf903c5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco.py
@@ -0,0 +1,23 @@
+_base_ = './grid_rcnn_r50_fpn_gn-head_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch'))
+# optimizer
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=3665,
+ warmup_ratio=1.0 / 80,
+ step=[17, 23])
+total_epochs = 25
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2fdc53c8c04c12bed16a31281127f9774bb70b64
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/grid_rcnn/grid_rcnn_x101_64x4d_fpn_gn-head_2x_coco.py
@@ -0,0 +1,12 @@
+_base_ = './grid_rcnn_x101_32x4d_fpn_gn-head_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/groie/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/groie/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..07168deb2eb2598d0571a1f68c55180a81dc6616
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/groie/README.md
@@ -0,0 +1,64 @@
+# GRoIE
+
+## A novel Region of Interest Extraction Layer for Instance Segmentation
+
+By Leonardo Rossi, Akbar Karimi and Andrea Prati from
+[IMPLab](http://implab.ce.unipr.it/).
+
+We provide configs to reproduce the results in the paper for
+"*A novel Region of Interest Extraction Layer for Instance Segmentation*"
+on COCO object detection.
+
+## Introduction
+
+This paper is motivated by the need to overcome the limitations of existing
+RoI extractors, which select only one (the best) FPN layer per RoI.
+
+Our intuition is that all FPN layers retain useful information.
+
+Therefore, the proposed layer (called Generic RoI Extractor - **GRoIE**)
+aggregates features from every FPN layer and introduces non-local building
+blocks and attention mechanisms to boost performance.
+
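+The core change made by these configs is to replace the single-level RoI extractor
+with a `GenericRoIExtractor` that aggregates every FPN level, applying a small
+convolution to each level before the aggregation and an attention block after it.
+A trimmed sketch, mirroring `faster_rcnn_r50_fpn_groie_1x_coco.py` in this folder:
+
+```python
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+    roi_head=dict(
+        bbox_roi_extractor=dict(
+            type='GenericRoIExtractor',
+            aggregation='sum',  # sum RoI features pooled from all FPN levels
+            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
+            out_channels=256,
+            featmap_strides=[4, 8, 16, 32],  # use every level, not just the "best" one
+            # per-level 5x5 conv applied before aggregation
+            pre_cfg=dict(
+                type='ConvModule',
+                in_channels=256,
+                out_channels=256,
+                kernel_size=5,
+                padding=2,
+                inplace=False),
+            # attention module applied to the aggregated features
+            post_cfg=dict(
+                type='GeneralizedAttention',
+                in_channels=256,
+                spatial_range=-1,
+                num_heads=6,
+                attention_type='0100',
+                kv_stride=2))))
+```
+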
+## Results and models
+
+The results on COCO 2017 minival (5k images) are shown in the table below.
+The trained models are available
+[here](https://drive.google.com/drive/folders/19ssstbq_h0Z1cgxHmJYFO8s1arf3QJbT).
+
+### Application of GRoIE to different architectures
+
+| Backbone | Method | Lr schd | box AP | mask AP | Config| Download|
+| :-------: | :--------------: | :-----: | :----: | :-----: | :-------:| :-------:|
+| R-50-FPN | Faster Original | 1x | 37.4 | | [config](../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py) | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130_204655.log.json) |
+| R-50-FPN | + GRoIE | 1x | 38.3 | | [config](./faster_rcnn_r50_fpn_groie_1x_coco.py) |[model](http://download.openmmlab.com/mmdetection/v2.0/groie/faster_rcnn_r50_fpn_groie_1x_coco/faster_rcnn_r50_fpn_groie_1x_coco_20200604_211715-66ee9516.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/groie/faster_rcnn_r50_fpn_groie_1x_coco/faster_rcnn_r50_fpn_groie_1x_coco_20200604_211715.log.json) |
+| R-50-FPN | Grid R-CNN | 1x | 39.1 | | [config](./grid_rcnn_r50_fpn_gn-head_1x_coco.py)|[model](http://download.openmmlab.com/mmdetection/v2.0/groie/grid_rcnn_r50_fpn_gn-head_1x_coco/grid_rcnn_r50_fpn_gn-head_1x_coco_20200605_202059-64f00ee8.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/groie/grid_rcnn_r50_fpn_gn-head_1x_coco/grid_rcnn_r50_fpn_gn-head_1x_coco_20200605_202059.log.json) |
+| R-50-FPN | + GRoIE | 1x | | | [config](./grid_rcnn_r50_fpn_gn-head_groie_1x_coco.py)||
+| R-50-FPN | Mask R-CNN | 1x | 38.2 | 34.7 | [config](../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py)|[model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205_050542.log.json) |
+| R-50-FPN | + GRoIE | 1x | 39.0 | 36.0 | [config](./mask_rcnn_r50_fpn_groie_1x_coco.py) |[model](http://download.openmmlab.com/mmdetection/v2.0/groie/mask_rcnn_r50_fpn_groie_1x_coco/mask_rcnn_r50_fpn_groie_1x_coco_20200604_211715-50d90c74.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/groie/mask_rcnn_r50_fpn_groie_1x_coco/mask_rcnn_r50_fpn_groie_1x_coco_20200604_211715.log.json) |
+| R-50-FPN | GC-Net | 1x | 40.7 | 36.5 | [config](../gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py) | [model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200202-50b90e5c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200202_085547.log.json) |
+| R-50-FPN | + GRoIE | 1x | 41.0 | 37.8 | [config](./mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py) |[model](http://download.openmmlab.com/mmdetection/v2.0/groie/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco_20200604_211715-42eb79e1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/groie/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco_20200604_211715-42eb79e1.pth) |
+| R-101-FPN | GC-Net | 1x | 42.2 | 37.8 | [config](../gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py) |[model](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200206-8407a3f0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco_20200206_142508.log.json) |
+| R-101-FPN | + GRoIE | 1x | | | [config](./mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py)|[model](http://download.openmmlab.com/mmdetection/v2.0/groie/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco_20200607_224507-8daae01c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/groie/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco_20200607_224507.log.json) |
+
+
+## Citation
+
+If you use this work or benchmark in your research, please cite this project.
+
+```
+@misc{rossi2020novel,
+ title={A novel Region of Interest Extraction Layer for Instance Segmentation},
+ author={Leonardo Rossi and Akbar Karimi and Andrea Prati},
+ year={2020},
+ eprint={2004.13665},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
+
+## Contact
+
+The implementation of GRoIE is currently maintained by
+[Leonardo Rossi](https://github.com/hachreak/).
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/groie/faster_rcnn_r50_fpn_groie_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/groie/faster_rcnn_r50_fpn_groie_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0fc528bfd49bfc9a262692db78a5f94b46c285af
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/groie/faster_rcnn_r50_fpn_groie_1x_coco.py
@@ -0,0 +1,25 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+# model settings
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ aggregation='sum',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/groie/grid_rcnn_r50_fpn_gn-head_groie_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/groie/grid_rcnn_r50_fpn_gn-head_groie_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8e4b4ab23513a97adf4471ab3b33ca8abdb6dbe5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/groie/grid_rcnn_r50_fpn_gn-head_groie_1x_coco.py
@@ -0,0 +1,45 @@
+_base_ = '../grid_rcnn/grid_rcnn_r50_fpn_gn-head_1x_coco.py'
+# model settings
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ aggregation='sum',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2)),
+ grid_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8b83722197c69a51907f43bcb05883deedc37f0c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py
@@ -0,0 +1,45 @@
+_base_ = '../gcnet/mask_rcnn_r101_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py'
+# model settings
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ aggregation='sum',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2)),
+ mask_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r50_fpn_groie_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r50_fpn_groie_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..81dfb4873bdb587626200a3007dc4d57a92c0fd9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r50_fpn_groie_1x_coco.py
@@ -0,0 +1,45 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+# model settings
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ aggregation='sum',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2)),
+ mask_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..852c5ca7c5c4ba04f6a5f7dd6dbaf6b2c357a2fa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/groie/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_groie_1x_coco.py
@@ -0,0 +1,45 @@
+_base_ = '../gcnet/mask_rcnn_r50_fpn_syncbn-backbone_r4_gcb_c3-c5_1x_coco.py'
+# model settings
+model = dict(
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ aggregation='sum',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2)),
+ mask_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=2),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32],
+ pre_cfg=dict(
+ type='ConvModule',
+ in_channels=256,
+ out_channels=256,
+ kernel_size=5,
+ padding=2,
+ inplace=False,
+ ),
+ post_cfg=dict(
+ type='GeneralizedAttention',
+ in_channels=256,
+ spatial_range=-1,
+ num_heads=6,
+ attention_type='0100',
+ kv_stride=2))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c1a392acc1e803ce2817bb6161d162428dc0a785
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/README.md
@@ -0,0 +1,51 @@
+# Region Proposal by Guided Anchoring
+
+## Introduction
+
+We provide config files to reproduce the results in the CVPR 2019 paper for [Region Proposal by Guided Anchoring](https://arxiv.org/abs/1901.03278).
+
+```
+@inproceedings{wang2019region,
+ title={Region Proposal by Guided Anchoring},
+ author={Jiaqi Wang and Kai Chen and Shuo Yang and Chen Change Loy and Dahua Lin},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2019}
+}
+```
+
+## Results and Models
+
+The results on COCO 2017 val are shown in the table below (results on test-dev are usually slightly higher than on val).
+
+| Method | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | AR 1000 | Download |
+| :----: | :-------------: | :-----: | :-----: | :------: | :------------: | :-----: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| GA-RPN | R-50-FPN | caffe | 1x | 5.3 | 15.8 | 68.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_r50_caffe_fpn_1x_coco/ga_rpn_r50_caffe_fpn_1x_coco_20200531-899008a6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_r50_caffe_fpn_1x_coco/ga_rpn_r50_caffe_fpn_1x_coco_20200531_011819.log.json) |
+| GA-RPN | R-101-FPN | caffe | 1x | 7.3 | 13.0 | 69.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_r101_caffe_fpn_1x_coco/ga_rpn_r101_caffe_fpn_1x_coco_20200531-ca9ba8fb.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_r101_caffe_fpn_1x_coco/ga_rpn_r101_caffe_fpn_1x_coco_20200531_011812.log.json) |
+| GA-RPN | X-101-32x4d-FPN | pytorch | 1x | 8.5 | 10.0 | 70.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_x101_32x4d_fpn_1x_coco/ga_rpn_x101_32x4d_fpn_1x_coco_20200220-c28d1b18.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_x101_32x4d_fpn_1x_coco/ga_rpn_x101_32x4d_fpn_1x_coco_20200220_221326.log.json) |
+| GA-RPN | X-101-64x4d-FPN | pytorch | 1x | 7.1 | 7.5 | 71.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_x101_64x4d_fpn_1x_coco/ga_rpn_x101_64x4d_fpn_1x_coco_20200225-3c6e1aa2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_rpn_x101_64x4d_fpn_1x_coco/ga_rpn_x101_64x4d_fpn_1x_coco_20200225_152704.log.json) |
+
+
+| Method | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :------------: | :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| GA-Faster RCNN | R-50-FPN | caffe | 1x | 5.5 | | 39.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_r50_caffe_fpn_1x_coco/ga_faster_r50_caffe_fpn_1x_coco_20200702_000718-a11ccfe6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_r50_caffe_fpn_1x_coco/ga_faster_r50_caffe_fpn_1x_coco_20200702_000718.log.json) |
+| GA-Faster RCNN | R-101-FPN | caffe | 1x | 7.5 | | 41.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_r101_caffe_fpn_1x_coco/ga_faster_r101_caffe_fpn_1x_coco_bbox_mAP-0.415_20200505_115528-fb82e499.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_r101_caffe_fpn_1x_coco/ga_faster_r101_caffe_fpn_1x_coco_20200505_115528.log.json) |
+| GA-Faster RCNN | X-101-32x4d-FPN | pytorch | 1x | 8.7 | 9.7 | 43.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_x101_32x4d_fpn_1x_coco/ga_faster_x101_32x4d_fpn_1x_coco_20200215-1ded9da3.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_x101_32x4d_fpn_1x_coco/ga_faster_x101_32x4d_fpn_1x_coco_20200215_184547.log.json) |
+| GA-Faster RCNN | X-101-64x4d-FPN | pytorch | 1x | 11.8 | 7.3 | 43.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_x101_64x4d_fpn_1x_coco/ga_faster_x101_64x4d_fpn_1x_coco_20200215-0fa7bde7.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_faster_x101_64x4d_fpn_1x_coco/ga_faster_x101_64x4d_fpn_1x_coco_20200215_104455.log.json) |
+| GA-RetinaNet | R-50-FPN | caffe | 1x | 3.5 | 16.8 | 36.9 | [model](https://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_r50_caffe_fpn_1x_coco/ga_retinanet_r50_caffe_fpn_1x_coco_20201020-39581c6f.pth) | [log](https://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_r50_caffe_fpn_1x_coco/ga_retinanet_r50_caffe_fpn_1x_coco_20201020_225450.log.json) |
+| GA-RetinaNet | R-101-FPN | caffe | 1x | 5.5 | 12.9 | 39.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_r101_caffe_fpn_1x_coco/ga_retinanet_r101_caffe_fpn_1x_coco_20200531-6266453c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_r101_caffe_fpn_1x_coco/ga_retinanet_r101_caffe_fpn_1x_coco_20200531_012847.log.json) |
+| GA-RetinaNet | X-101-32x4d-FPN | pytorch | 1x | 6.9 | 10.6 | 40.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_x101_32x4d_fpn_1x_coco/ga_retinanet_x101_32x4d_fpn_1x_coco_20200219-40c56caa.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_x101_32x4d_fpn_1x_coco/ga_retinanet_x101_32x4d_fpn_1x_coco_20200219_223025.log.json) |
+| GA-RetinaNet | X-101-64x4d-FPN | pytorch | 1x | 9.9 | 7.7 | 41.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_x101_64x4d_fpn_1x_coco/ga_retinanet_x101_64x4d_fpn_1x_coco_20200226-ef9f7f1f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/guided_anchoring/ga_retinanet_x101_64x4d_fpn_1x_coco/ga_retinanet_x101_64x4d_fpn_1x_coco_20200226_221123.log.json) |
+
+
+
+- In the Guided Anchoring paper, `score_thr` is set to 0.001 in Fast/Faster RCNN and 0.05 in RetinaNet, for both the baselines and Guided Anchoring (a sketch for overriding it follows the table below).
+
+- Performance on the COCO test-dev benchmark is shown in the following table.
+
+
+| Method | Backbone | Style | Lr schd | Aug Train | Score thr | AP | AP_50 | AP_75 | AP_small | AP_medium | AP_large | Download |
+| :------------: | :-------: | :---: | :-----: | :-------: | :-------: | :---: | :---: | :---: | :------: | :-------: | :------: | :------: |
+| GA-Faster RCNN | R-101-FPN | caffe | 1x | F | 0.05 | | | | | | | |
+| GA-Faster RCNN | R-101-FPN | caffe | 1x | F | 0.001 | | | | | | | |
+| GA-RetinaNet | R-101-FPN | caffe | 1x | F | 0.05 | | | | | | | |
+| GA-RetinaNet | R-101-FPN | caffe | 2x | T | 0.05 | | | | | | | |
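+
+- For reference, here is a minimal sketch (assuming an MMDetection v2.x environment with `mmcv` installed; not part of the upstream configs) of loading one of the Guided Anchoring configs in this folder and overriding `score_thr` before evaluation:
+
+```
+from mmcv import Config
+
+# Load one of the Guided Anchoring configs added in this folder.
+cfg = Config.fromfile('configs/guided_anchoring/ga_faster_r50_fpn_1x_coco.py')
+
+# The GA Fast/Faster RCNN configs already set score_thr to 1e-3, matching the
+# paper; the GA-RetinaNet configs keep the default 0.05.
+cfg.test_cfg.rcnn.score_thr = 0.001
+print(cfg.test_cfg.rcnn)
+```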
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_fast_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_fast_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a1258bd905aced4acfc17c4afb22958cb21d4104
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_fast_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,63 @@
+_base_ = '../fast_rcnn/fast_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ norm_eval=True,
+ style='caffe'),
+ roi_head=dict(
+ bbox_head=dict(bbox_coder=dict(target_stds=[0.05, 0.05, 0.1, 0.1]))))
+# model training and testing settings
+train_cfg = dict(
+ rcnn=dict(
+ assigner=dict(pos_iou_thr=0.6, neg_iou_thr=0.6, min_pos_iou=0.6),
+ sampler=dict(num=256)))
+test_cfg = dict(rcnn=dict(score_thr=1e-3))
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadProposals', num_max_proposals=300),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'proposals', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadProposals', num_max_proposals=None),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img', 'proposals']),
+ ])
+]
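+# Fast R-CNN trains and tests on precomputed proposals; the GA-RPN proposal
+# .pkl files referenced below are assumed to have been generated beforehand
+# under data/coco/proposals/.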
+data = dict(
+ train=dict(
+ proposal_file=data_root + 'proposals/ga_rpn_r50_fpn_1x_train2017.pkl',
+ pipeline=train_pipeline),
+ val=dict(
+ proposal_file=data_root + 'proposals/ga_rpn_r50_fpn_1x_val2017.pkl',
+ pipeline=test_pipeline),
+ test=dict(
+ proposal_file=data_root + 'proposals/ga_rpn_r50_fpn_1x_val2017.pkl',
+ pipeline=test_pipeline))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f438a4792e9aa4bcef35a42349156f1eab044477
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './ga_faster_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..40e75128441c45ef77a77e00391c46e378b27a8c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,64 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ rpn_head=dict(
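+        # _delete_=True makes this rpn_head replace, rather than merge with,
+        # the rpn_head defined in the base config.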
+ _delete_=True,
+ type='GARPNHead',
+ in_channels=256,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=8,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[8],
+ strides=[4, 8, 16, 32, 64]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.14, 0.14]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.11, 0.11]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
+ roi_head=dict(
+ bbox_head=dict(bbox_coder=dict(target_stds=[0.05, 0.05, 0.1, 0.1]))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ center_ratio=0.2,
+ ignore_ratio=0.5),
+ rpn_proposal=dict(max_num=300),
+ rcnn=dict(
+ assigner=dict(pos_iou_thr=0.6, neg_iou_thr=0.6, min_pos_iou=0.6),
+ sampler=dict(type='RandomSampler', num=256)))
+test_cfg = dict(rpn=dict(max_num=300), rcnn=dict(score_thr=1e-3))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ab19e5b675f1aa1b3b03c2db51defe517f852444
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_r50_fpn_1x_coco.py
@@ -0,0 +1,64 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ rpn_head=dict(
+ _delete_=True,
+ type='GARPNHead',
+ in_channels=256,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=8,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[8],
+ strides=[4, 8, 16, 32, 64]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.14, 0.14]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.11, 0.11]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
+ roi_head=dict(
+ bbox_head=dict(bbox_coder=dict(target_stds=[0.05, 0.05, 0.1, 0.1]))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ center_ratio=0.2,
+ ignore_ratio=0.5),
+ rpn_proposal=dict(max_num=300),
+ rcnn=dict(
+ assigner=dict(pos_iou_thr=0.6, neg_iou_thr=0.6, min_pos_iou=0.6),
+ sampler=dict(type='RandomSampler', num=256)))
+test_cfg = dict(rpn=dict(max_num=300), rcnn=dict(score_thr=1e-3))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c9a035f15cfad12ddbbfa87ed0d579c1cde0c4ce
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ga_faster_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..87bbfdc827eb17654527ad5305ec80bd9e84b78a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_faster_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ga_faster_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0048965d5b4d2257eed860f9bd69256795b44fa6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './ga_retinanet_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r101_caffe_fpn_mstrain_2x.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r101_caffe_fpn_mstrain_2x.py
new file mode 100644
index 0000000000000000000000000000000000000000..f6c487bf18fe6bcee9a9b7d62ca99a4d98cafa17
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r101_caffe_fpn_mstrain_2x.py
@@ -0,0 +1,172 @@
+# model settings
+model = dict(
+ type='RetinaNet',
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=101,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ norm_eval=True,
+ style='caffe'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs=True,
+ num_outs=5),
+ bbox_head=dict(
+ type='GARetinaHead',
+        num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.04, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.4,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ center_ratio=0.2,
+ ignore_ratio=0.5,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 480), (1333, 960)],
+ keep_ratio=True,
+ multiscale_mode='range'),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
+evaluation = dict(interval=1, metric='bbox')
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=1.0 / 3,
+ step=[16, 22])
+checkpoint_config = dict(interval=1)
+# yapf:disable
+log_config = dict(
+ interval=50,
+ hooks=[
+ dict(type='TextLoggerHook'),
+ # dict(type='TensorboardLoggerHook')
+ ])
+# yapf:enable
+# runtime settings
+total_epochs = 24
+dist_params = dict(backend='nccl')
+log_level = 'INFO'
+work_dir = './work_dirs/ga_retinanet_r101_caffe_fpn_mstrain_2x'
+load_from = None
+resume_from = None
+workflow = [('train', 1)]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8c73cebe0f1c748ca0ac14065179aeceab4d54f8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,62 @@
+_base_ = '../retinanet/retinanet_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='GARetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.04, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.4,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ assigner=dict(neg_iou_thr=0.5, min_pos_iou=0.0),
+ center_ratio=0.2,
+ ignore_ratio=0.5)
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a5b595d8bb351ed8f507d0aa349fe127d4fc0708
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_r50_fpn_1x_coco.py
@@ -0,0 +1,62 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='GARetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.04, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.4,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ assigner=dict(neg_iou_thr=0.5, min_pos_iou=0.0),
+ center_ratio=0.2,
+ ignore_ratio=0.5)
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..18daadd6a9d3024f30157aea1f1cef3e13326b5a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ga_retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1b18c2ba41d1493380bab3515be8e29547988ebf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_retinanet_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ga_retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8d154763bf810dc9f668988f05f53dd32a354a31
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './ga_rpn_r50_caffe_fpn_1x_coco.py'
+# model settings
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d61fba8abd471adbbbc029864be5909f4c8c7379
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,57 @@
+_base_ = '../rpn/rpn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ rpn_head=dict(
+ _delete_=True,
+ type='GARPNHead',
+ in_channels=256,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=8,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[8],
+ strides=[4, 8, 16, 32, 64]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.14, 0.14]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.11, 0.11]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ center_ratio=0.2,
+ ignore_ratio=0.5))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9c6eb91890b78c7215852525d181e75db434582b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_r50_fpn_1x_coco.py
@@ -0,0 +1,57 @@
+_base_ = '../rpn/rpn_r50_fpn_1x_coco.py'
+model = dict(
+ rpn_head=dict(
+ _delete_=True,
+ type='GARPNHead',
+ in_channels=256,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=8,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[8],
+ strides=[4, 8, 16, 32, 64]),
+ anchor_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.14, 0.14]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.07, 0.07, 0.11, 0.11]),
+ loc_filter_thr=0.01,
+ loss_loc=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_shape=dict(type='BoundedIoULoss', beta=0.2, loss_weight=1.0),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ ga_assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ ga_sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=-1,
+ center_ratio=0.2,
+ ignore_ratio=0.5))
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1e0fe4931e9cb340fcf3b80a4f9380abee500238
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ga_rpn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..bf66b6b9283042ce6eabc437219f0b16be96d613
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/guided_anchoring/ga_rpn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ga_rpn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..51e244c464986da0d081ede9b67d1d2b59215e66
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/README.md
@@ -0,0 +1,92 @@
+# High-resolution networks (HRNets) for object detection
+
+## Introduction
+
+```
+@inproceedings{SunXLW19,
+ title={Deep High-Resolution Representation Learning for Human Pose Estimation},
+ author={Ke Sun and Bin Xiao and Dong Liu and Jingdong Wang},
+ booktitle={CVPR},
+ year={2019}
+}
+
+@article{SunZJCXLMWLW19,
+ title={High-Resolution Representations for Labeling Pixels and Regions},
+ author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao
+ and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
+ journal = {CoRR},
+ volume = {abs/1904.04514},
+ year={2019}
+}
+```
+
+## Results and Models
+
+
+### Faster R-CNN
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :-------------:|:------:| :-------:|
+| HRNetV2p-W18 | pytorch | 1x | 6.6 | 13.4 | 36.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco/faster_rcnn_hrnetv2p_w18_1x_coco_20200130-56651a6d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco/faster_rcnn_hrnetv2p_w18_1x_coco_20200130_211246.log.json) |
+| HRNetV2p-W18 | pytorch | 2x | 6.6 | | 38.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco/faster_rcnn_hrnetv2p_w18_2x_coco_20200702_085731-a4ec0611.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco/faster_rcnn_hrnetv2p_w18_2x_coco_20200702_085731.log.json) |
+| HRNetV2p-W32 | pytorch | 1x | 9.0 | 12.4 | 40.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco/faster_rcnn_hrnetv2p_w32_1x_coco_20200130-6e286425.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco/faster_rcnn_hrnetv2p_w32_1x_coco_20200130_204442.log.json) |
+| HRNetV2p-W32 | pytorch | 2x | 9.0 | | 41.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco/faster_rcnn_hrnetv2p_w32_2x_coco_20200529_015927-976a9c15.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco/faster_rcnn_hrnetv2p_w32_2x_coco_20200529_015927.log.json) |
+| HRNetV2p-W40 | pytorch | 1x | 10.4 | 10.5 | 41.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco/faster_rcnn_hrnetv2p_w40_1x_coco_20200210-95c1f5ce.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco/faster_rcnn_hrnetv2p_w40_1x_coco_20200210_125315.log.json) |
+| HRNetV2p-W40 | pytorch | 2x | 10.4 | | 42.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco/faster_rcnn_hrnetv2p_w40_2x_coco_20200512_161033-0f236ef4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco/faster_rcnn_hrnetv2p_w40_2x_coco_20200512_161033.log.json) |
+
+### Mask R-CNN
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :-------------:|:------:| :------:|:--------:|
+| HRNetV2p-W18 | pytorch | 1x | 7.0 | 11.7 | 37.7 | 34.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco/mask_rcnn_hrnetv2p_w18_1x_coco_20200205-1c3d78ed.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco/mask_rcnn_hrnetv2p_w18_1x_coco_20200205_232523.log.json) |
+| HRNetV2p-W18 | pytorch | 2x | 7.0 | - | 39.8 | 36.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco/mask_rcnn_hrnetv2p_w18_2x_coco_20200212-b3c825b1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco/mask_rcnn_hrnetv2p_w18_2x_coco_20200212_134222.log.json) |
+| HRNetV2p-W32 | pytorch | 1x | 9.4 | 11.3 | 41.2 | 37.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco/mask_rcnn_hrnetv2p_w32_1x_coco_20200207-b29f616e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco/mask_rcnn_hrnetv2p_w32_1x_coco_20200207_055017.log.json) |
+| HRNetV2p-W32 | pytorch | 2x | 9.4 | - | 42.5 | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco/mask_rcnn_hrnetv2p_w32_2x_coco_20200213-45b75b4d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco/mask_rcnn_hrnetv2p_w32_2x_coco_20200213_150518.log.json) |
+| HRNetV2p-W40 | pytorch | 1x | 10.9 | | 42.1 | 37.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco/mask_rcnn_hrnetv2p_w40_1x_coco_20200511_015646-66738b35.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco/mask_rcnn_hrnetv2p_w40_1x_coco_20200511_015646.log.json) |
+| HRNetV2p-W40 | pytorch | 2x | 10.9 | | 42.8 | 38.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco/mask_rcnn_hrnetv2p_w40_2x_coco_20200512_163732-aed5e4ab.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco/mask_rcnn_hrnetv2p_w40_2x_coco_20200512_163732.log.json) |
+
+
+### Cascade R-CNN
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :-------------:|:------:| :-------:|
+| HRNetV2p-W18 | pytorch | 20e | 7.0 | 11.0 | 41.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco/cascade_rcnn_hrnetv2p_w18_20e_coco_20200210-434be9d7.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco/cascade_rcnn_hrnetv2p_w18_20e_coco_20200210_105632.log.json) |
+| HRNetV2p-W32 | pytorch | 20e | 9.4 | 11.0 | 43.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco/cascade_rcnn_hrnetv2p_w32_20e_coco_20200208-928455a4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco/cascade_rcnn_hrnetv2p_w32_20e_coco_20200208_160511.log.json) |
+| HRNetV2p-W40 | pytorch | 20e | 10.8 | | 43.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco/cascade_rcnn_hrnetv2p_w40_20e_coco_20200512_161112-75e47b04.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco/cascade_rcnn_hrnetv2p_w40_20e_coco_20200512_161112.log.json) |
+
+
+### Cascade Mask R-CNN
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :-------------:|:------:| :------:|:--------:|
+| HRNetV2p-W18 | pytorch | 20e | 8.5 | 8.5 |41.6 |36.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco/cascade_mask_rcnn_hrnetv2p_w18_20e_coco_20200210-b543cd2b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco/cascade_mask_rcnn_hrnetv2p_w18_20e_coco_20200210_093149.log.json) |
+| HRNetV2p-W32 | pytorch | 20e | | 8.3 |44.3 |38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco/cascade_mask_rcnn_hrnetv2p_w32_20e_coco_20200512_154043-39d9cf7b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco/cascade_mask_rcnn_hrnetv2p_w32_20e_coco_20200512_154043.log.json) |
+| HRNetV2p-W40 | pytorch | 20e | 12.5 | |45.1 |39.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco/cascade_mask_rcnn_hrnetv2p_w40_20e_coco_20200527_204922-969c4610.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco/cascade_mask_rcnn_hrnetv2p_w40_20e_coco_20200527_204922.log.json) |
+
+### Hybrid Task Cascade (HTC)
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :-------------:|:------:| :------:|:--------:|
+| HRNetV2p-W18 | pytorch | 20e | 10.8 | 4.7 | 42.8 | 37.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w18_20e_coco/htc_hrnetv2p_w18_20e_coco_20200210-b266988c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w18_20e_coco/htc_hrnetv2p_w18_20e_coco_20200210_182735.log.json) |
+| HRNetV2p-W32 | pytorch | 20e | 13.1 | 4.9 | 45.4 | 39.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w32_20e_coco/htc_hrnetv2p_w32_20e_coco_20200207-7639fa12.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w32_20e_coco/htc_hrnetv2p_w32_20e_coco_20200207_193153.log.json) |
+| HRNetV2p-W40 | pytorch | 20e | 14.6 | | 46.4 | 40.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w40_20e_coco/htc_hrnetv2p_w40_20e_coco_20200529_183411-417c4d5b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/htc_hrnetv2p_w40_20e_coco/htc_hrnetv2p_w40_20e_coco_20200529_183411.log.json) |
+
+
+### FCOS
+
+| Backbone | Style | GN | MS train | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:-------:|:------:|:------:|:------:|:--------:|
+|HRNetV2p-W18| pytorch | Y | N | 1x | 13.0 | 12.9 | 35.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco_20200316-c24bac34.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco_20200316_103815.log.json) |
+|HRNetV2p-W18| pytorch | Y | N | 2x | 13.0 | - | 37.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco_20200316-15348c5b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco_20200316_103815.log.json) |
+|HRNetV2p-W32| pytorch | Y | N | 1x | 17.5 | 12.9 | 39.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco_20200314-59a7807f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco_20200314_150555.log.json) |
+|HRNetV2p-W32| pytorch | Y | N | 2x | 17.5 | - | 40.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco_20200314-faf8f0b8.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco_20200314_145136.log.json) |
+|HRNetV2p-W18| pytorch | Y | Y | 2x | 13.0 | 12.9 | 38.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco_20200316-a668468b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco_20200316_104027.log.json) |
+|HRNetV2p-W32| pytorch | Y | Y | 2x | 17.5 | 12.4 | 41.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco_20200314-065d37a6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco_20200314_145356.log.json) |
+|HRNetV2p-W40| pytorch | Y | Y | 2x | 20.3 | 10.8 | 42.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco_20200314-e201886d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco_20200314_150607.log.json) |
+
+
+
+**Note:**
+
+- The `28e` schedule in HTC indicates decreasing the lr at epochs 24 and 27, with a total of 28 epochs (sketched in the snippet below).
+- HRNetV2 ImageNet-pretrained models are available from [HRNets for Image Classification](https://github.com/HRNet/HRNet-Image-Classification).
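+
+As a hedged illustration of the `28e` schedule described above (the body of `htc_hrnetv2p_w40_28e_coco.py` is not included in this excerpt, so this is a sketch of the schedule settings rather than the exact file):
+
+```
+_base_ = './htc_hrnetv2p_w40_20e_coco.py'
+# learning policy: decay the lr at epochs 24 and 27, 28 epochs in total
+lr_config = dict(step=[24, 27])
+total_epochs = 28
+```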
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e8df265edefee1b7e5892fe373c1c0f80f59bf7b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w18_20e_coco.py
@@ -0,0 +1,10 @@
+_base_ = './cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py'
+# model settings
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0f394c886b0aedeb1c5f034cd46b0e1cae544da7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py
@@ -0,0 +1,39 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w32',
+ backbone=dict(
+ _delete_=True,
+ type='HRNet',
+ extra=dict(
+ stage1=dict(
+ num_modules=1,
+ num_branches=1,
+ block='BOTTLENECK',
+ num_blocks=(4, ),
+ num_channels=(64, )),
+ stage2=dict(
+ num_modules=1,
+ num_branches=2,
+ block='BASIC',
+ num_blocks=(4, 4),
+ num_channels=(32, 64)),
+ stage3=dict(
+ num_modules=4,
+ num_branches=3,
+ block='BASIC',
+ num_blocks=(4, 4, 4),
+ num_channels=(32, 64, 128)),
+ stage4=dict(
+ num_modules=3,
+ num_branches=4,
+ block='BASIC',
+ num_blocks=(4, 4, 4, 4),
+ num_channels=(32, 64, 128, 256)))),
+ neck=dict(
+ _delete_=True,
+ type='HRFPN',
+ in_channels=[32, 64, 128, 256],
+ out_channels=256))
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..29b1469fa9f455a3235b323fa3b1e39d5c095f3d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_mask_rcnn_hrnetv2p_w40_20e_coco.py
@@ -0,0 +1,11 @@
+_base_ = './cascade_mask_rcnn_hrnetv2p_w32_20e_coco.py'
+# model settings
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w40',
+ backbone=dict(
+ type='HRNet',
+ extra=dict(
+ stage2=dict(num_channels=(40, 80)),
+ stage3=dict(num_channels=(40, 80, 160)),
+ stage4=dict(num_channels=(40, 80, 160, 320)))),
+ neck=dict(type='HRFPN', in_channels=[40, 80, 160, 320], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9585a4f35d9151b42beac05066a1a231dd1777a9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w18_20e_coco.py
@@ -0,0 +1,10 @@
+_base_ = './cascade_rcnn_hrnetv2p_w32_20e_coco.py'
+# model settings
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c5746337a45bec7bf5ea0e8dc709c7c69685a7b2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w32_20e_coco.py
@@ -0,0 +1,39 @@
+_base_ = '../cascade_rcnn/cascade_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w32',
+ backbone=dict(
+ _delete_=True,
+ type='HRNet',
+ extra=dict(
+ stage1=dict(
+ num_modules=1,
+ num_branches=1,
+ block='BOTTLENECK',
+ num_blocks=(4, ),
+ num_channels=(64, )),
+ stage2=dict(
+ num_modules=1,
+ num_branches=2,
+ block='BASIC',
+ num_blocks=(4, 4),
+ num_channels=(32, 64)),
+ stage3=dict(
+ num_modules=4,
+ num_branches=3,
+ block='BASIC',
+ num_blocks=(4, 4, 4),
+ num_channels=(32, 64, 128)),
+ stage4=dict(
+ num_modules=3,
+ num_branches=4,
+ block='BASIC',
+ num_blocks=(4, 4, 4, 4),
+ num_channels=(32, 64, 128, 256)))),
+ neck=dict(
+ _delete_=True,
+ type='HRFPN',
+ in_channels=[32, 64, 128, 256],
+ out_channels=256))
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..bd43e47254be7a153fadf26e734f0756d9b4b02e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/cascade_rcnn_hrnetv2p_w40_20e_coco.py
@@ -0,0 +1,11 @@
+_base_ = './cascade_rcnn_hrnetv2p_w32_20e_coco.py'
+# model settings
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w40',
+ backbone=dict(
+ type='HRNet',
+ extra=dict(
+ stage2=dict(num_channels=(40, 80)),
+ stage3=dict(num_channels=(40, 80, 160)),
+ stage4=dict(num_channels=(40, 80, 160, 320)))),
+ neck=dict(type='HRFPN', in_channels=[40, 80, 160, 320], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9907bcbf6464fb964664a318533bf9edda4e34fd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w18_1x_coco.py
@@ -0,0 +1,10 @@
+_base_ = './faster_rcnn_hrnetv2p_w32_1x_coco.py'
+# model settings
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ff3e7cae4aeb1f380f00a7f7f72f1c1ed47e7583
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w18_2x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './faster_rcnn_hrnetv2p_w18_1x_coco.py'
+
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..190e81c710b0e5e9eb34bafff01c9dd4a8ef130c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w32_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w32',
+ backbone=dict(
+ _delete_=True,
+ type='HRNet',
+ extra=dict(
+ stage1=dict(
+ num_modules=1,
+ num_branches=1,
+ block='BOTTLENECK',
+ num_blocks=(4, ),
+ num_channels=(64, )),
+ stage2=dict(
+ num_modules=1,
+ num_branches=2,
+ block='BASIC',
+ num_blocks=(4, 4),
+ num_channels=(32, 64)),
+ stage3=dict(
+ num_modules=4,
+ num_branches=3,
+ block='BASIC',
+ num_blocks=(4, 4, 4),
+ num_channels=(32, 64, 128)),
+ stage4=dict(
+ num_modules=3,
+ num_branches=4,
+ block='BASIC',
+ num_blocks=(4, 4, 4, 4),
+ num_channels=(32, 64, 128, 256)))),
+ neck=dict(
+ _delete_=True,
+ type='HRFPN',
+ in_channels=[32, 64, 128, 256],
+ out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2876e3fdae70a0398e7772d81e24d31d2bc1d6fb
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w32_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './faster_rcnn_hrnetv2p_w32_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d0fd9fa0284f17272c0785701f2ae81860bc04b6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w40_1x_coco.py
@@ -0,0 +1,10 @@
+_base_ = './faster_rcnn_hrnetv2p_w32_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w40',
+ backbone=dict(
+ type='HRNet',
+ extra=dict(
+ stage2=dict(num_channels=(40, 80)),
+ stage3=dict(num_channels=(40, 80, 160)),
+ stage4=dict(num_channels=(40, 80, 160, 320)))),
+ neck=dict(type='HRFPN', in_channels=[40, 80, 160, 320], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ddb4bd83381851456279541b7f6ed5a4f12ff0a3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/faster_rcnn_hrnetv2p_w40_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './faster_rcnn_hrnetv2p_w40_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..20bffb95616d4358007d0825820f4a91ea223649
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_1x_coco.py
@@ -0,0 +1,9 @@
+_base_ = './fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7847fb438b9954327066535e4ff810aefba0f214
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_4x4_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './fcos_hrnetv2p_w18_gn-head_4x4_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b845128de51d2080f6444e2c849f4642a43ad942
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w18_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,9 @@
+_base_ = './fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b3640224511b4a1fd38e999a82f1723431dc5cb3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py
@@ -0,0 +1,38 @@
+_base_ = '../fcos/fcos_r50_caffe_fpn_gn-head_4x4_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w32',
+ backbone=dict(
+ _delete_=True,
+ type='HRNet',
+ extra=dict(
+ stage1=dict(
+ num_modules=1,
+ num_branches=1,
+ block='BOTTLENECK',
+ num_blocks=(4, ),
+ num_channels=(64, )),
+ stage2=dict(
+ num_modules=1,
+ num_branches=2,
+ block='BASIC',
+ num_blocks=(4, 4),
+ num_channels=(32, 64)),
+ stage3=dict(
+ num_modules=4,
+ num_branches=3,
+ block='BASIC',
+ num_blocks=(4, 4, 4),
+ num_channels=(32, 64, 128)),
+ stage4=dict(
+ num_modules=3,
+ num_branches=4,
+ block='BASIC',
+ num_blocks=(4, 4, 4, 4),
+ num_channels=(32, 64, 128, 256)))),
+ neck=dict(
+ _delete_=True,
+ type='HRFPN',
+ in_channels=[32, 64, 128, 256],
+ out_channels=256,
+ stride=2,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..65717e3b2f942df98f17574c0442e343fb869782
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_4x4_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6866b1ae3d8399d69d5f875bca771a102af4e815
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,39 @@
+_base_ = './fcos_hrnetv2p_w32_gn-head_4x4_1x_coco.py'
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..452b0fe2d89566a998744d9c7812e550596462e3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/fcos_hrnetv2p_w40_gn-head_mstrain_640-800_4x4_2x_coco.py
@@ -0,0 +1,10 @@
+_base_ = './fcos_hrnetv2p_w32_gn-head_mstrain_640-800_4x4_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w40',
+ backbone=dict(
+ type='HRNet',
+ extra=dict(
+ stage2=dict(num_channels=(40, 80)),
+ stage3=dict(num_channels=(40, 80, 160)),
+ stage4=dict(num_channels=(40, 80, 160, 320)))),
+ neck=dict(type='HRFPN', in_channels=[40, 80, 160, 320], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w18_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w18_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..391636ff452471af367ed14be5faa49c0b7e1be6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w18_20e_coco.py
@@ -0,0 +1,9 @@
+_base_ = './htc_hrnetv2p_w32_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w32_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w32_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..aee78089b9e32d3c0bcd6a29f51c22d1af96d2ce
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w32_20e_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../htc/htc_r50_fpn_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w32',
+ backbone=dict(
+ _delete_=True,
+ type='HRNet',
+ extra=dict(
+ stage1=dict(
+ num_modules=1,
+ num_branches=1,
+ block='BOTTLENECK',
+ num_blocks=(4, ),
+ num_channels=(64, )),
+ stage2=dict(
+ num_modules=1,
+ num_branches=2,
+ block='BASIC',
+ num_blocks=(4, 4),
+ num_channels=(32, 64)),
+ stage3=dict(
+ num_modules=4,
+ num_branches=3,
+ block='BASIC',
+ num_blocks=(4, 4, 4),
+ num_channels=(32, 64, 128)),
+ stage4=dict(
+ num_modules=3,
+ num_branches=4,
+ block='BASIC',
+ num_blocks=(4, 4, 4, 4),
+ num_channels=(32, 64, 128, 256)))),
+ neck=dict(
+ _delete_=True,
+ type='HRFPN',
+ in_channels=[32, 64, 128, 256],
+ out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w40_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w40_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..abf6fb550e4dfff4e749e15b001c37e6db8ae476
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w40_20e_coco.py
@@ -0,0 +1,10 @@
+_base_ = './htc_hrnetv2p_w32_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w40',
+ backbone=dict(
+ type='HRNet',
+ extra=dict(
+ stage2=dict(num_channels=(40, 80)),
+ stage3=dict(num_channels=(40, 80, 160)),
+ stage4=dict(num_channels=(40, 80, 160, 320)))),
+ neck=dict(type='HRFPN', in_channels=[40, 80, 160, 320], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w40_28e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w40_28e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..40c97d1fdb1b5b86030d9aef436129d24b3dbb0e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_hrnetv2p_w40_28e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './htc_hrnetv2p_w40_20e_coco.py'
+# learning policy
+lr_config = dict(step=[24, 27])
+total_epochs = 28
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_x101_64x4d_fpn_16x1_28e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_x101_64x4d_fpn_16x1_28e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..459af318e785d119b5afef5f25a3095c1cd4e665
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/htc_x101_64x4d_fpn_16x1_28e_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../htc/htc_x101_64x4d_fpn_16x1_20e_coco.py'
+# learning policy
+lr_config = dict(step=[24, 27])
+total_epochs = 28
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..82a5f464ed9b31ec6a513efc6a9fa20953cf1689
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w18_1x_coco.py
@@ -0,0 +1,9 @@
+_base_ = './mask_rcnn_hrnetv2p_w32_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w18',
+ backbone=dict(
+ extra=dict(
+ stage2=dict(num_channels=(18, 36)),
+ stage3=dict(num_channels=(18, 36, 72)),
+ stage4=dict(num_channels=(18, 36, 72, 144)))),
+ neck=dict(type='HRFPN', in_channels=[18, 36, 72, 144], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..afde2daa2729316d29a0a56c9c0380b8f2b8aa95
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w18_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_hrnetv2p_w18_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f533af6d867466ff3ee70a3941b7bfbe90f5b3ba
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w32_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w32',
+ backbone=dict(
+ _delete_=True,
+ type='HRNet',
+ extra=dict(
+ stage1=dict(
+ num_modules=1,
+ num_branches=1,
+ block='BOTTLENECK',
+ num_blocks=(4, ),
+ num_channels=(64, )),
+ stage2=dict(
+ num_modules=1,
+ num_branches=2,
+ block='BASIC',
+ num_blocks=(4, 4),
+ num_channels=(32, 64)),
+ stage3=dict(
+ num_modules=4,
+ num_branches=3,
+ block='BASIC',
+ num_blocks=(4, 4, 4),
+ num_channels=(32, 64, 128)),
+ stage4=dict(
+ num_modules=3,
+ num_branches=4,
+ block='BASIC',
+ num_blocks=(4, 4, 4, 4),
+ num_channels=(32, 64, 128, 256)))),
+ neck=dict(
+ _delete_=True,
+ type='HRFPN',
+ in_channels=[32, 64, 128, 256],
+ out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..24dce1ce5520060805f94cb0b9c6900912e44d0b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w32_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_hrnetv2p_w32_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5b10c166cf36601bdb895de81874970aebc83310
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w40_1x_coco.py
@@ -0,0 +1,10 @@
+_base_ = './mask_rcnn_hrnetv2p_w18_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://msra/hrnetv2_w40',
+ backbone=dict(
+ type='HRNet',
+ extra=dict(
+ stage2=dict(num_channels=(40, 80)),
+ stage3=dict(num_channels=(40, 80, 160)),
+ stage4=dict(num_channels=(40, 80, 160, 320)))),
+ neck=dict(type='HRFPN', in_channels=[40, 80, 160, 320], out_channels=256))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..fa7ec1c6e09742f5e4e92ed0fe066ac5ed75fe94
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/hrnet/mask_rcnn_hrnetv2p_w40_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_hrnetv2p_w40_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/htc/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ad99a1ba377a5150165df873be19b14865e1aeab
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/README.md
@@ -0,0 +1,55 @@
+# Hybrid Task Cascade for Instance Segmentation
+
+## Introduction
+
+We provide config files to reproduce the results of the CVPR 2019 paper [Hybrid Task Cascade](https://arxiv.org/abs/1901.07518).
+
+```
+@inproceedings{chen2019hybrid,
+ title={Hybrid task cascade for instance segmentation},
+  author={Chen, Kai and Pang, Jiangmiao and Wang, Jiaqi and Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and Liu, Ziwei and Shi, Jianping and Ouyang, Wanli and Loy, Chen Change and Lin, Dahua},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2019}
+}
+```
+
+## Dataset
+
+HTC requires the COCO and COCO-stuff datasets for training. You need to download and extract them under the COCO dataset path.
+The directory structure should look like this; a minimal config sketch that points the semantic branch at the stuff maps follows the tree.
+
+```
+mmdetection
+├── mmdet
+├── tools
+├── configs
+├── data
+│ ├── coco
+│ │ ├── annotations
+│ │ ├── train2017
+│ │ ├── val2017
+│ │ ├── test2017
+│ │ ├── stuffthingmaps
+```
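+
+The HTC configs in this repository already point the semantic branch at `stuffthingmaps` (see `htc_r50_fpn_1x_coco.py`). As a minimal sketch, assuming the default `data/coco/` layout, the relevant override looks like this:
+
+```python
+# sketch: use the COCO-stuff maps as the semantic segmentation source
+data_root = 'data/coco/'
+data = dict(
+    train=dict(seg_prefix=data_root + 'stuffthingmaps/train2017/'))
+```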
+
+## Results and Models
+
+The results on COCO 2017 val are shown in the table below. (Results on test-dev are usually slightly higher than those on val.)
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN | pytorch | 1x | 8.2 | 5.8 | 42.3 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_1x_coco/htc_r50_fpn_1x_coco_20200317-7332cf16.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_1x_coco/htc_r50_fpn_1x_coco_20200317_070435.log.json) |
+| R-50-FPN | pytorch | 20e | 8.2 | - | 43.3 | 38.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_20e_coco/htc_r50_fpn_20e_coco_20200319-fe28c577.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_r50_fpn_20e_coco/htc_r50_fpn_20e_coco_20200319_070313.log.json) |
+| R-101-FPN | pytorch | 20e | 10.2 | 5.5 | 44.8 | 39.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_r101_fpn_20e_coco/htc_r101_fpn_20e_coco_20200317-9b41b48f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_r101_fpn_20e_coco/htc_r101_fpn_20e_coco_20200317_153107.log.json) |
+| X-101-32x4d-FPN | pytorch |20e| 11.4 | 5.0 | 46.1 | 40.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_32x4d_fpn_16x1_20e_coco/htc_x101_32x4d_fpn_16x1_20e_coco_20200318-de97ae01.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_32x4d_fpn_16x1_20e_coco/htc_x101_32x4d_fpn_16x1_20e_coco_20200318_034519.log.json) |
+| X-101-64x4d-FPN | pytorch |20e| 14.5 | 4.4 | 47.0 | 41.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_16x1_20e_coco/htc_x101_64x4d_fpn_16x1_20e_coco_20200318-b181fd7a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_16x1_20e_coco/htc_x101_64x4d_fpn_16x1_20e_coco_20200318_081711.log.json) |
+
+- In the HTC paper and COCO 2018 Challenge, `score_thr` is set to 0.001 for both baselines and HTC.
+- We use 8 GPUs with 2 images/GPU for R-50 and R-101 models, and 16 GPUs with 1 image/GPU for X-101 models.
+If you would like to train X-101 HTC with 8 GPUs, you need to change the lr from 0.02 to 0.01, as in the sketch below.
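+
+As a minimal sketch (this is not a config shipped in this directory, and the optimizer override is assumed to merge with the base SGD settings), an 8-GPU X-101 setup could look like:
+
+```python
+# hypothetical config: X-101 HTC on 8 GPUs with the learning rate halved
+_base_ = './htc_x101_64x4d_fpn_16x1_20e_coco.py'
+data = dict(samples_per_gpu=1, workers_per_gpu=1)
+optimizer = dict(lr=0.01)  # assumed to merge with the base SGD optimizer
+```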
+
+We also provide a stronger HTC model trained with DCN and multi-scale training. No test-time augmentation is used.
+
+| Backbone | Style | DCN | training scales | Lr schd | box AP | mask AP | Download |
+|:----------------:|:-------:|:-----:|:---------------:|:-------:|:------:|:-------:|:--------:|
+| X-101-64x4d-FPN | pytorch | c3-c5 | 400~1400 | 20e | 50.4 | 43.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco_20200312-946fd751.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco_20200312_203410.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r101_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r101_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d543f028fe7ee3984f498fd05c94ddb265070061
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r101_fpn_20e_coco.py
@@ -0,0 +1,5 @@
+_base_ = './htc_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..929cf464f6091f8380fd1057b282f29f4f7a8b5f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r50_fpn_1x_coco.py
@@ -0,0 +1,56 @@
+_base_ = './htc_without_semantic_r50_fpn_1x_coco.py'
+model = dict(
+ roi_head=dict(
+ semantic_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[8]),
+ semantic_head=dict(
+ type='FusedSemanticHead',
+ num_ins=5,
+ fusion_level=1,
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=183,
+ ignore_label=255,
+ loss_weight=0.2)))
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='SegRescale', scale_factor=1 / 8),
+ dict(type='DefaultFormatBundle'),
+ dict(
+ type='Collect',
+ keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(
+ seg_prefix=data_root + 'stuffthingmaps/train2017/',
+ pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r50_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r50_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b05a92cd8a4d45f6c8733b0d9a44d357cf8a3308
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_r50_fpn_20e_coco.py
@@ -0,0 +1,4 @@
+_base_ = './htc_r50_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_without_semantic_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_without_semantic_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..81ed3a8a03a36fcc3d183844d7405b755cc03540
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_without_semantic_r50_fpn_1x_coco.py
@@ -0,0 +1,240 @@
+_base_ = [
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ type='HybridTaskCascade',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ type='RPNHead',
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0]),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ type='HybridTaskCascadeRoIHead',
+ interleaved=True,
+ mask_info_flow=True,
+ num_stages=3,
+ stage_loss_weights=[1, 0.5, 0.25],
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=[
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
+ loss_weight=1.0)),
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.05, 0.05, 0.1, 0.1]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
+ loss_weight=1.0)),
+ dict(
+ type='Shared2FCBBoxHead',
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.033, 0.033, 0.067, 0.067]),
+ reg_class_agnostic=True,
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
+ ],
+ mask_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ mask_head=[
+ dict(
+ type='HTCMaskHead',
+ with_conv_res=False,
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
+ dict(
+ type='HTCMaskHead',
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
+ dict(
+ type='HTCMaskHead',
+ num_convs=4,
+ in_channels=256,
+ conv_out_channels=256,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))
+ ]))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.3,
+ min_pos_iou=0.3,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=256,
+ pos_fraction=0.5,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=False),
+ allowed_border=0,
+ pos_weight=-1,
+ debug=False),
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=[
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.5,
+ min_pos_iou=0.5,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False),
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.6,
+ neg_iou_thr=0.6,
+ min_pos_iou=0.6,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False),
+ dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.7,
+ neg_iou_thr=0.7,
+ min_pos_iou=0.7,
+ ignore_iof_thr=-1),
+ sampler=dict(
+ type='RandomSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True),
+ mask_size=28,
+ pos_weight=-1,
+ debug=False)
+ ])
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=1000,
+ nms_post=1000,
+ max_num=1000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ score_thr=0.001,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100,
+ mask_thr_binary=0.5))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ val=dict(pipeline=test_pipeline), test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_32x4d_fpn_16x1_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_32x4d_fpn_16x1_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..19b3447cd71a7339669b3b18471858d0adae016a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_32x4d_fpn_16x1_20e_coco.py
@@ -0,0 +1,18 @@
+_base_ = './htc_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
+data = dict(samples_per_gpu=1, workers_per_gpu=1)
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_64x4d_fpn_16x1_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_64x4d_fpn_16x1_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e76cff2a21fec34eeef25ef65f053ad0a2cde16f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_64x4d_fpn_16x1_20e_coco.py
@@ -0,0 +1,18 @@
+_base_ = './htc_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
+data = dict(samples_per_gpu=1, workers_per_gpu=1)
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4a98ff2858895b0e6730634b2a559eba1ce72ea4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/htc/htc_x101_64x4d_fpn_dconv_c3-c5_mstrain_400_1400_16x1_20e_coco.py
@@ -0,0 +1,42 @@
+_base_ = './htc_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch',
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
+# dataset settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations', with_bbox=True, with_mask=True, with_seg=True),
+ dict(
+ type='Resize',
+ img_scale=[(1600, 400), (1600, 1400)],
+ multiscale_mode='range',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='SegRescale', scale_factor=1 / 8),
+ dict(type='DefaultFormatBundle'),
+ dict(
+ type='Collect',
+ keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_semantic_seg']),
+]
+data = dict(
+ samples_per_gpu=1, workers_per_gpu=1, train=dict(pipeline=train_pipeline))
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..8652101f72216cfc018c4f22ecad0129d18fc1f5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/README.md
@@ -0,0 +1,43 @@
+# InstaBoost for MMDetection
+
+Configs in this directory are the implementation of the ICCV 2019 paper "InstaBoost: Boosting Instance Segmentation Via Probability Map Guided Copy-Pasting" and are provided by the authors of the paper. InstaBoost is a data augmentation method for object detection and instance segmentation. The paper has been released on [`arXiv`](https://arxiv.org/abs/1908.07801).
+
+```
+@inproceedings{fang2019instaboost,
+ title={Instaboost: Boosting instance segmentation via probability map guided copy-pasting},
+ author={Fang, Hao-Shu and Sun, Jianhua and Wang, Runzhong and Gou, Minghao and Li, Yong-Lu and Lu, Cewu},
+ booktitle={Proceedings of the IEEE International Conference on Computer Vision},
+ pages={682--691},
+ year={2019}
+}
+```
+
+## Usage
+
+### Requirements
+
+You need to install `instaboostfast` before using it.
+
+```
+pip install instaboostfast
+```
+
+The code and more details can be found [here](https://github.com/GothicAi/Instaboost).
+
+### Integration with MMDetection
+
+InstaBoost has already been integrated into the data pipeline, so all you need to do is add or change the **InstaBoost** configuration after **LoadImageFromFile**, as in the sketch below. We have provided examples like [this](mask_rcnn_r50_fpn_instaboost_4x#L121). You can refer to [`InstaBoostConfig`](https://github.com/GothicAi/InstaBoost-pypi#instaboostconfig) for more details.
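+
+For reference, the training pipelines in this directory insert **InstaBoost** right after **LoadImageFromFile**; the following is abridged from `mask_rcnn_r50_fpn_instaboost_4x_coco.py`:
+
+```python
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(
+        type='InstaBoost',
+        action_candidate=('normal', 'horizontal', 'skip'),
+        action_prob=(1, 0, 0),
+        scale=(0.8, 1.2),
+        dx=15,
+        dy=15,
+        theta=(-1, 1),
+        color_prob=0.5,
+        hflag=False,
+        aug_ratio=0.5),
+    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+    # ... the remaining transforms are unchanged from the base config
+]
+```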
+
+## Results and Models
+
+ - All models were trained on `coco_2017_train` and tested on `coco_2017_val` for convenience of evaluation and comparison. In the paper, the results are obtained from `test-dev`.
+ - To balance accuracy and training time when using InstaBoost, models released on this page are all trained for 48 epochs. Other training and testing configs strictly follow the original framework.
+ - For results and models in MMDetection V1.x, please refer to [Instaboost](https://github.com/GothicAi/Instaboost).
+
+
+| Network | Backbone | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :--------: | :-----: | :------: | :------------: | :------:| :-----: | :-----------------: |
+| Mask R-CNN | R-50-FPN | 4x | 4.4 | 17.5 | 40.6 | 36.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco/mask_rcnn_r50_fpn_instaboost_4x_coco_20200307-d025f83a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco/mask_rcnn_r50_fpn_instaboost_4x_coco_20200307_223635.log.json) |
+| Mask R-CNN | R-101-FPN | 4x | 6.4 | | 42.5 | 38.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco/mask_rcnn_r101_fpn_instaboost_4x_coco_20200703_235738-f23f3a5f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco/mask_rcnn_r101_fpn_instaboost_4x_coco_20200703_235738.log.json) |
+| Mask R-CNN | X-101-64x4d-FPN | 4x | 10.7 | | 44.7 | 39.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco_20200515_080947-8ed58c1b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco_20200515_080947.log.json) |
+| Cascade R-CNN | R-101-FPN | 4x | 6.0 | 12.0 | 43.7 | 38.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco_20200307-c19d98d9.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco_20200307_223646.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_r101_fpn_instaboost_4x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_r101_fpn_instaboost_4x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..723ab0295f8457c03114ca535dede951e7d5b169
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_r101_fpn_instaboost_4x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py'
+
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6c234b62aa439aac37cb0ea3867f73e42edf8d78
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py
@@ -0,0 +1,28 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco.py'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='InstaBoost',
+ action_candidate=('normal', 'horizontal', 'skip'),
+ action_prob=(1, 0, 0),
+ scale=(0.8, 1.2),
+ dx=15,
+ dy=15,
+ theta=(-1, 1),
+ color_prob=0.5,
+ hflag=False,
+ aug_ratio=0.5),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+# learning policy
+lr_config = dict(step=[32, 44])
+total_epochs = 48
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7cf5f307442e56b29460fb5477cef64bfd3476b9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/cascade_mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './cascade_mask_rcnn_r50_fpn_instaboost_4x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c2819477abb070b724d0295ccf028025918b263a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_r101_fpn_instaboost_4x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './mask_rcnn_r50_fpn_instaboost_4x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ca4b312fca68e02aeea331a59d5541a74e6723bc
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_r50_fpn_instaboost_4x_coco.py
@@ -0,0 +1,28 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='InstaBoost',
+ action_candidate=('normal', 'horizontal', 'skip'),
+ action_prob=(1, 0, 0),
+ scale=(0.8, 1.2),
+ dx=15,
+ dy=15,
+ theta=(-1, 1),
+ color_prob=0.5,
+ hflag=False,
+ aug_ratio=0.5),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+# learning policy
+lr_config = dict(step=[32, 44])
+total_epochs = 48
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0acd088a469e682011a90b770efa51116f6c42ca
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/instaboost/mask_rcnn_x101_64x4d_fpn_instaboost_4x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r50_fpn_instaboost_4x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..17f407cb6a8e9e62c8027634c884daa868b00d5f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/README.md
@@ -0,0 +1,49 @@
+# Legacy Configs in MMDetection V1.x
+
+Configs in this directory reproduce the legacy settings used by MMDetection V1.x and its model zoo.
+
+To help users convert their models from V1.x to MMDetection V2.0, we provide V1.x configs for running inference with the converted V1.x models.
+Due to the BC-breaking changes introduced in MMDetection V2.0, running inference with the same model weights in these two versions produces different results. The difference is within 1% absolute AP, as shown in the following table.
+
+## Usage
+
+To upgrade the model version, the users need to do the following steps.
+
+### 1. Convert model weights
+There are three main differences in the model weights between the V1.x and V2.0 codebases.
+
+1. Since the class order in every detector's classification branch is changed, all the legacy model weights need to go through the conversion process.
+2. The regression and segmentation heads no longer contain the background channel. Weights in these background channels should be removed to be compatible with the current codebase.
+3. For two-stage detectors, their weights need to be upgraded since MMDetection V2.0 refactors all two-stage detectors with `RoIHead`.
+
+Users can apply the same modifications to their self-implemented detectors.
+We provide a script, `tools/upgrade_model_version.py`, to convert the model weights in the V1.x model zoo.
+
+```bash
+python tools/upgrade_model_version.py ${OLD_MODEL_PATH} ${NEW_MODEL_PATH} --num-classes ${NUM_CLASSES}
+
+```
+- OLD_MODEL_PATH: the path of the model weights in the 1.x version.
+- NEW_MODEL_PATH: the path to save the converted model weights in the 2.0 version.
+- NUM_CLASSES: the number of classes of the original model weights. Usually it is 81 for the COCO dataset and 21 for the VOC dataset.
+The number of classes in V2.0 models equals that in V1.x models minus 1.
+
+### 2. Use configs with legacy settings
+
+After converting the model weights, check out the v1.2 release to find the corresponding config file that uses the legacy settings.
+The V1.x models usually need these three legacy modules: `LegacyAnchorGenerator`, `LegacyDeltaXYWHBBoxCoder`, and `RoIAlign(align=False)`.
+For models using Caffe-style ResNet backbones, you also need to change the pretrained model name and the corresponding `img_norm_cfg`.
+An example is [`retinanet_r50_caffe_fpn_1x_coco_v1.py`](retinanet_r50_caffe_fpn_1x_coco_v1.py).
+Then use the config to test the model weights. For most models, the obtained results should be close to those in V1.x.
+We provide configs of some common structures in this directory; a minimal sketch of the legacy modules follows.
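+
+As a minimal sketch, the legacy overrides boil down to settings like the following (abridged from `mask_rcnn_r50_fpn_1x_coco_v1.py` in this directory):
+
+```python
+model = dict(
+    rpn_head=dict(
+        anchor_generator=dict(type='LegacyAnchorGenerator', center_offset=0.5),
+        bbox_coder=dict(type='LegacyDeltaXYWHBBoxCoder')),
+    roi_head=dict(
+        bbox_roi_extractor=dict(
+            roi_layer=dict(
+                type='RoIAlign',
+                output_size=7,
+                sampling_ratio=2,
+                aligned=False))))
+```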
+
+## Performance
+
+The performance changes after converting the models in this directory are listed below.
+
+| Method | Style | Lr schd | V1.x box AP | V1.x mask AP | V2.0 box AP | V2.0 mask AP |Download |
+| :-------------: | :-----: | :-----: | :------:| :-----: |:------:| :-----: |:------------------------------------------------------------------------------------------------------------------------------: |
+|[Mask R-CNN R-50-FPN](./mask_rcnn_r50_fpn_1x_coco_v1.py) | pytorch | 1x | 37.3 | 34.2 | 36.8 | 33.9 |[model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth)|
+|[RetinaNet R-50-FPN](./retinanet_r50_caffe_fpn_1x_coco_v1.py)| caffe | 1x | 35.8 | - | 35.4 | - |
+|[RetinaNet R-50-FPN](./retinanet_r50_fpn_1x_coco_v1.py)| pytorch | 1x | 35.6 |-|35.2| -|[model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/retinanet_r50_fpn_1x_20181125-7b0c2548.pth) |
+|[Cascade Mask R-CNN R-50-FPN](./cascade_mask_rcnn_r50_fpn_1x_coco_v1.py) | pytorch | 1x | 41.2 | 35.7 |40.8| 35.6| [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/cascade_mask_rcnn_r50_fpn_1x_20181123-88b170c9.pth) |
+|[SSD300-VGG16](./ssd300_coco_v1.py) | caffe | 120e | 25.7 |-|25.4|-| [model](https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/ssd300_coco_vgg16_caffe_120e_20181221-84d7110b.pth) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/cascade_mask_rcnn_r50_fpn_1x_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/cascade_mask_rcnn_r50_fpn_1x_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..5899444adf0c7309367fb52e1f6d135e788f2b57
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/cascade_mask_rcnn_r50_fpn_1x_coco_v1.py
@@ -0,0 +1,79 @@
+_base_ = [
+ '../_base_/models/cascade_mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ type='CascadeRCNN',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ rpn_head=dict(
+ anchor_generator=dict(type='LegacyAnchorGenerator', center_offset=0.5),
+ bbox_coder=dict(
+ type='LegacyDeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[1.0, 1.0, 1.0, 1.0])),
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=7,
+ sampling_ratio=2,
+ aligned=False)),
+ bbox_head=[
+ dict(
+ type='Shared2FCBBoxHead',
+ reg_class_agnostic=True,
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='LegacyDeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.1, 0.1, 0.2, 0.2])),
+ dict(
+ type='Shared2FCBBoxHead',
+ reg_class_agnostic=True,
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='LegacyDeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.05, 0.05, 0.1, 0.1])),
+ dict(
+ type='Shared2FCBBoxHead',
+ reg_class_agnostic=True,
+ in_channels=256,
+ fc_out_channels=1024,
+ roi_feat_size=7,
+ num_classes=80,
+ bbox_coder=dict(
+ type='LegacyDeltaXYWHBBoxCoder',
+ target_means=[0., 0., 0., 0.],
+ target_stds=[0.033, 0.033, 0.067, 0.067])),
+ ],
+ mask_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=14,
+ sampling_ratio=2,
+ aligned=False))))
+dist_params = dict(backend='nccl', port=29515)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/faster_rcnn_r50_fpn_1x_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/faster_rcnn_r50_fpn_1x_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..1cb833cfbcdbe420deece2d5fd806b7b99df5a24
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/faster_rcnn_r50_fpn_1x_coco_v1.py
@@ -0,0 +1,37 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ type='FasterRCNN',
+ pretrained='torchvision://resnet50',
+ rpn_head=dict(
+ type='RPNHead',
+ anchor_generator=dict(
+ type='LegacyAnchorGenerator',
+ center_offset=0.5,
+ scales=[8],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[4, 8, 16, 32, 64]),
+ bbox_coder=dict(type='LegacyDeltaXYWHBBoxCoder'),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ type='StandardRoIHead',
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=7,
+ sampling_ratio=2,
+ aligned=False),
+ out_channels=256,
+ featmap_strides=[4, 8, 16, 32]),
+ bbox_head=dict(
+ bbox_coder=dict(type='LegacyDeltaXYWHBBoxCoder'),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn_proposal=dict(nms_post=2000, max_num=2000),
+ rcnn=dict(assigner=dict(match_low_quality=True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/mask_rcnn_r50_fpn_1x_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/mask_rcnn_r50_fpn_1x_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..0b200610191369da8d3581478f9013b4467755e4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/mask_rcnn_r50_fpn_1x_coco_v1.py
@@ -0,0 +1,33 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ rpn_head=dict(
+ anchor_generator=dict(type='LegacyAnchorGenerator', center_offset=0.5),
+ bbox_coder=dict(type='LegacyDeltaXYWHBBoxCoder'),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=7,
+ sampling_ratio=2,
+ aligned=False)),
+ mask_roi_extractor=dict(
+ type='SingleRoIExtractor',
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=14,
+ sampling_ratio=2,
+ aligned=False)),
+ bbox_head=dict(
+ bbox_coder=dict(type='LegacyDeltaXYWHBBoxCoder'),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn_proposal=dict(nms_post=2000, max_num=2000),
+ rcnn=dict(assigner=dict(match_low_quality=True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/retinanet_r50_caffe_fpn_1x_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/retinanet_r50_caffe_fpn_1x_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..ef9392f7e351f489d6d9e97936925b6a16d1212e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/retinanet_r50_caffe_fpn_1x_coco_v1.py
@@ -0,0 +1,37 @@
+_base_ = './retinanet_r50_fpn_1x_coco_v1.py'
+model = dict(
+ pretrained='open-mmlab://detectron/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[102.9801, 115.9465, 122.7717], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/retinanet_r50_fpn_1x_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/retinanet_r50_fpn_1x_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..6198b9717957374ce734ca74de5f54dda44123b9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/retinanet_r50_fpn_1x_coco_v1.py
@@ -0,0 +1,17 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ bbox_head=dict(
+ type='RetinaHead',
+ anchor_generator=dict(
+ type='LegacyAnchorGenerator',
+ center_offset=0.5,
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(type='LegacyDeltaXYWHBBoxCoder'),
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/ssd300_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/ssd300_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..b194e7651ede006c5101bff1056749edf4d249cd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/legacy_1.x/ssd300_coco_v1.py
@@ -0,0 +1,79 @@
+_base_ = [
+ '../_base_/models/ssd300.py', '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
+# model settings
+input_size = 300
+model = dict(
+ bbox_head=dict(
+ type='SSDHead',
+ anchor_generator=dict(
+ type='LegacySSDAnchorGenerator',
+ scale_major=False,
+ input_size=input_size,
+ basesize_ratio_range=(0.15, 0.9),
+ strides=[8, 16, 32, 64, 100, 300],
+ ratios=[[2], [2, 3], [2, 3], [2, 3], [2], [2]]),
+ bbox_coder=dict(
+ type='LegacyDeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.1, 0.1, 0.2, 0.2])))
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(300, 300), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(300, 300),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=3,
+ train=dict(
+ _delete_=True,
+ type='RepeatDataset',
+ times=5,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline)),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=2e-3, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict(_delete_=True)
+dist_params = dict(backend='nccl', port=29555)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..8cfe5251fcd33121b9e9c37a1fed90ab76235334
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/README.md
@@ -0,0 +1,26 @@
+# Libra R-CNN: Towards Balanced Learning for Object Detection
+
+## Introduction
+
+We provide config files to reproduce the results in the CVPR 2019 paper [Libra R-CNN](https://arxiv.org/pdf/1904.02701.pdf).
+
+```
+@inproceedings{pang2019libra,
+ title={Libra R-CNN: Towards Balanced Learning for Object Detection},
+  author={Pang, Jiangmiao and Chen, Kai and Shi, Jianping and Feng, Huajun and Ouyang, Wanli and Lin, Dahua},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2019}
+}
+```
+
+## Results and models
+
+The results on COCO 2017 val are shown in the table below. (Results on test-dev are usually slightly higher than those on val.)
+
+| Architecture | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:------------:|:---------------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| Faster R-CNN | R-50-FPN | pytorch | 1x | 4.6 | 19.0 | 38.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_faster_rcnn_r50_fpn_1x_coco/libra_faster_rcnn_r50_fpn_1x_coco_20200130-3afee3a9.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_faster_rcnn_r50_fpn_1x_coco/libra_faster_rcnn_r50_fpn_1x_coco_20200130_204655.log.json) |
+| Fast R-CNN | R-50-FPN | pytorch | 1x | | | | |
+| Faster R-CNN | R-101-FPN | pytorch | 1x | 6.5 | 14.4 | 40.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_faster_rcnn_r101_fpn_1x_coco/libra_faster_rcnn_r101_fpn_1x_coco_20200203-8dba6a5a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_faster_rcnn_r101_fpn_1x_coco/libra_faster_rcnn_r101_fpn_1x_coco_20200203_001405.log.json) |
+| Faster R-CNN | X-101-64x4d-FPN | pytorch | 1x | 10.8 | 8.5 | 42.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_faster_rcnn_x101_64x4d_fpn_1x_coco/libra_faster_rcnn_x101_64x4d_fpn_1x_coco_20200315-3a7d0488.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_faster_rcnn_x101_64x4d_fpn_1x_coco/libra_faster_rcnn_x101_64x4d_fpn_1x_coco_20200315_231625.log.json) |
+| RetinaNet | R-50-FPN | pytorch | 1x | 4.2 | 17.7 | 37.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_retinanet_r50_fpn_1x_coco/libra_retinanet_r50_fpn_1x_coco_20200205-804d94ce.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/libra_rcnn/libra_retinanet_r50_fpn_1x_coco/libra_retinanet_r50_fpn_1x_coco_20200205_112757.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_fast_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_fast_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b416c8d035146edc68f0d7198f15aed0bc0093cd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_fast_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,50 @@
+_base_ = '../fast_rcnn/fast_rcnn_r50_fpn_1x_coco.py'
+# model settings
+model = dict(
+ neck=[
+ dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ dict(
+ type='BFP',
+ in_channels=256,
+ num_levels=5,
+ refine_level=2,
+ refine_type='non_local')
+ ],
+ roi_head=dict(
+ bbox_head=dict(
+ loss_bbox=dict(
+ _delete_=True,
+ type='BalancedL1Loss',
+ alpha=0.5,
+ gamma=1.5,
+ beta=1.0,
+ loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rcnn=dict(
+ sampler=dict(
+ _delete_=True,
+ type='CombinedSampler',
+ num=512,
+ pos_fraction=0.25,
+ add_gt_as_proposals=True,
+ pos_sampler=dict(type='InstanceBalancedPosSampler'),
+ neg_sampler=dict(
+ type='IoUBalancedNegSampler',
+ floor_thr=-1,
+ floor_fraction=0,
+ num_bins=3))))
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+data = dict(
+ train=dict(proposal_file=data_root +
+ 'libra_proposals/rpn_r50_fpn_1x_train2017.pkl'),
+ val=dict(proposal_file=data_root +
+ 'libra_proposals/rpn_r50_fpn_1x_val2017.pkl'),
+ test=dict(proposal_file=data_root +
+ 'libra_proposals/rpn_r50_fpn_1x_val2017.pkl'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8e36c9b3a506eacd97bfadee8d167886eef74cb7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './libra_faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9e9b6172158af7f6c63e159916f85f3676096b6f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,41 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+# model settings
+model = dict(
+ neck=[
+ dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5),
+ dict(
+ type='BFP',
+ in_channels=256,
+ num_levels=5,
+ refine_level=2,
+ refine_type='non_local')
+ ],
+ roi_head=dict(
+ bbox_head=dict(
+ loss_bbox=dict(
+ _delete_=True,
+ type='BalancedL1Loss',
+ alpha=0.5,
+ gamma=1.5,
+ beta=1.0,
+ loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rpn=dict(sampler=dict(neg_pos_ub=5), allowed_border=-1),
+ rcnn=dict(
+ sampler=dict(
+ _delete_=True,
+ type='CombinedSampler',
+ num=512,
+ pos_fraction=0.25,
+ add_gt_as_proposals=True,
+ pos_sampler=dict(type='InstanceBalancedPosSampler'),
+ neg_sampler=dict(
+ type='IoUBalancedNegSampler',
+ floor_thr=-1,
+ floor_fraction=0,
+ num_bins=3))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e94553294294fa49952f2dfe0e3c64a5e00bc878
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_faster_rcnn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './libra_faster_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_retinanet_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_retinanet_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..be2742098fb8f1e46bbb16c9d3e2e20c2e3083aa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/libra_rcnn/libra_retinanet_r50_fpn_1x_coco.py
@@ -0,0 +1,26 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+# model settings
+model = dict(
+ neck=[
+ dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_input',
+ num_outs=5),
+ dict(
+ type='BFP',
+ in_channels=256,
+ num_levels=5,
+ refine_level=1,
+ refine_type='non_local')
+ ],
+ bbox_head=dict(
+ loss_bbox=dict(
+ _delete_=True,
+ type='BalancedL1Loss',
+ alpha=0.5,
+ gamma=1.5,
+ beta=0.11,
+ loss_weight=1.0)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b4ef28ec2bb2397ccd86d8c963228be4d40a9db5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/README.md
@@ -0,0 +1,43 @@
+# LVIS dataset
+
+## Introduction
+```
+@inproceedings{gupta2019lvis,
+ title={{LVIS}: A Dataset for Large Vocabulary Instance Segmentation},
+ author={Gupta, Agrim and Dollar, Piotr and Girshick, Ross},
+ booktitle={Proceedings of the {IEEE} Conference on Computer Vision and Pattern Recognition},
+ year={2019}
+}
+```
+
+## Common Settings
+* Please follow the [install guide](../../docs/install.md#install-mmdetection) to install the open-mmlab forked cocoapi first.
+* Run one of the following commands to install our forked lvis-api.
+ ```
+ # mmlvis is fully compatible with official lvis
+ pip install mmlvis
+ ```
+ or
+ ```
+ pip install -r requirements/optional.txt
+ ```
+* All experiments use the oversampling strategy described [here](../../docs/tutorials/new_dataset.md#class-balanced-dataset) with an oversample threshold of `1e-3` (see the config sketch below).
+* LVIS v0.5 is about half the size of COCO, so a `2x` schedule on LVIS runs roughly the same number of iterations as a `1x` schedule on COCO.
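+
+The config sketch below shows how this oversampling is typically wired up: the LVIS dataset is wrapped in a `ClassBalancedDataset` with `oversample_thr=1e-3`. It is only an illustration; the annotation and image paths are placeholders, and the actual settings live in the `lvis_v1_instance.py` / `lvis_v0.5_instance.py` base configs.
+
+```
+# minimal sketch, assuming data/lvis_v1/ holds the LVIS annotations and the COCO images
+dataset_type = 'LVISV1Dataset'
+data_root = 'data/lvis_v1/'
+data = dict(
+    train=dict(
+        type='ClassBalancedDataset',
+        oversample_thr=1e-3,  # categories appearing in fewer than this fraction of images get oversampled
+        dataset=dict(
+            type=dataset_type,
+            ann_file=data_root + 'annotations/lvis_v1_train.json',
+            img_prefix=data_root,
+            pipeline=train_pipeline)))  # train_pipeline as defined in the configs below
+```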
+
+## Results and models of LVIS v0.5
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| R-50-FPN | pytorch | 2x | - | - | 26.1 | 25.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis-dbd06831.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_20200531_160435.log.json) |
+| R-101-FPN | pytorch | 2x | - | - | 27.1 | 27.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis-54582ee2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis_20200601_134748.log.json) |
+| X-101-32x4d-FPN | pytorch | 2x | - | - | 26.7 | 26.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis-3cf55ea2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis_20200531_221749.log.json) |
+| X-101-64x4d-FPN | pytorch | 2x | - | - | 26.4 | 26.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis-1c99a5ad.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis_20200601_194651.log.json) |
+
+## Results and models of LVIS v1
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| R-50-FPN | pytorch | 1x | 9.1 | - | 22.5 | 21.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1-aa78ac3d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1-20200829_061305.log.json) |
+| R-101-FPN | pytorch | 1x | 10.8 | - | 24.6 | 23.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1-ec55ce32.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1-20200829_070959.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 11.8 | - | 26.7 | 25.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1-ebbc5c81.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1-20200829_071317.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 14.6 | - | 27.2 | 25.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1-43d9edfe.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1-20200830_060206.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..188186502d56674fa4e6073b39819a209b9a2c1f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1.py
@@ -0,0 +1,2 @@
+_base_ = './mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
new file mode 100644
index 0000000000000000000000000000000000000000..2d2816c2dee68b60376e67e78e9fba277da826c0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r101_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
@@ -0,0 +1,2 @@
+_base_ = './mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..6ca6098f689f38a2be8e80b9ec944b1129ab0b46
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py
@@ -0,0 +1,31 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/lvis_v1_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(num_classes=1203), mask_head=dict(num_classes=1203)))
+test_cfg = dict(
+ rcnn=dict(
+ score_thr=0.0001,
+ # LVIS allows up to 300
+ max_per_img=300))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+data = dict(train=dict(dataset=dict(pipeline=train_pipeline)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
new file mode 100644
index 0000000000000000000000000000000000000000..ff1da67187d92ca3ca3cb9cdc9118b0d1584ec0f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
@@ -0,0 +1,31 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/lvis_v0.5_instance.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(num_classes=1230), mask_head=dict(num_classes=1230)))
+test_cfg = dict(
+ rcnn=dict(
+ score_thr=0.0001,
+ # LVIS allows up to 300
+ max_per_img=300))
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+data = dict(train=dict(dataset=dict(pipeline=train_pipeline)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..5abcc2e014fe57b862422fa2fe18dd651761b56e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
new file mode 100644
index 0000000000000000000000000000000000000000..439c39a93a8a12119ffa408987c8cea6d8cb313a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..f77adba2f150f62900571f5f32b2083ee53b7003
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_1x_lvis_v1.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
new file mode 100644
index 0000000000000000000000000000000000000000..2136255464715bcee89b47f1437a9dd4040e04c7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/lvis/mask_rcnn_x101_64x4d_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r50_fpn_sample1e-3_mstrain_2x_lvis_v0.5.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..d34f59341ac5e3faf7d16ba00cf7f9f48ffcfdd3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/README.md
@@ -0,0 +1,40 @@
+# Mask R-CNN
+
+## Introduction
+```
+@article{He_2017,
+ title={Mask R-CNN},
+ journal={2017 IEEE International Conference on Computer Vision (ICCV)},
+ publisher={IEEE},
+ author={He, Kaiming and Gkioxari, Georgia and Dollar, Piotr and Girshick, Ross},
+ year={2017},
+ month={Oct}
+}
+```
+
+## Results and models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| R-50-FPN | caffe | 1x | 4.3 | | 38.0 | 34.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco/mask_rcnn_r50_caffe_fpn_1x_coco_bbox_mAP-0.38__segm_mAP-0.344_20200504_231812-0ebd1859.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco/mask_rcnn_r50_caffe_fpn_1x_coco_20200504_231812.log.json) |
+| R-50-FPN | pytorch | 1x | 4.4 | 16.1 | 38.2 | 34.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205_050542.log.json) |
+| R-50-FPN | pytorch | 2x | - | - | 39.2 | 35.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_2x_coco/mask_rcnn_r50_fpn_2x_coco_bbox_mAP-0.392__segm_mAP-0.354_20200505_003907-3e542a40.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_2x_coco/mask_rcnn_r50_fpn_2x_coco_20200505_003907.log.json) |
+| R-101-FPN | caffe | 1x | | | 40.4 | 36.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco/mask_rcnn_r101_caffe_fpn_1x_coco_20200601_095758-805e06c1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco/mask_rcnn_r101_caffe_fpn_1x_coco_20200601_095758.log.json)|
+| R-101-FPN | pytorch | 1x | 6.4 | 13.5 | 40.0 | 36.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_1x_coco/mask_rcnn_r101_fpn_1x_coco_20200204-1efe0ed5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_1x_coco/mask_rcnn_r101_fpn_1x_coco_20200204_144809.log.json) |
+| R-101-FPN | pytorch | 2x | - | - | 40.8 | 36.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_2x_coco/mask_rcnn_r101_fpn_2x_coco_bbox_mAP-0.408__segm_mAP-0.366_20200505_071027-14b391c7.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_2x_coco/mask_rcnn_r101_fpn_2x_coco_20200505_071027.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 7.6 | 11.3 | 41.9 | 37.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco/mask_rcnn_x101_32x4d_fpn_1x_coco_20200205-478d0b67.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco/mask_rcnn_x101_32x4d_fpn_1x_coco_20200205_034906.log.json) |
+| X-101-32x4d-FPN | pytorch | 2x | - | - | 42.2 | 37.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco/mask_rcnn_x101_32x4d_fpn_2x_coco_bbox_mAP-0.422__segm_mAP-0.378_20200506_004702-faef898c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco/mask_rcnn_x101_32x4d_fpn_2x_coco_20200506_004702.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 10.7 | 8.0 | 42.8 | 38.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco/mask_rcnn_x101_64x4d_fpn_1x_coco_20200201-9352eb0d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco/mask_rcnn_x101_64x4d_fpn_1x_coco_20200201_124310.log.json) |
+| X-101-64x4d-FPN | pytorch | 2x | - | - | 42.7 | 38.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco/mask_rcnn_x101_64x4d_fpn_2x_coco_20200509_224208-39d6f70c.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco/mask_rcnn_x101_64x4d_fpn_2x_coco_20200509_224208.log.json)|
+| X-101-32x8d-FPN | pytorch | 1x | - | - | 42.8 | 38.3 | |
+
+
+## Pre-trained Models
+We also train some models with longer schedules and multi-scale training. Users can fine-tune them for downstream tasks; a minimal fine-tuning sketch follows the table below.
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| [R-50-FPN](./mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco.py) | caffe | 2x | 4.3 | | 40.3 | 36.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco_bbox_mAP-0.403__segm_mAP-0.365_20200504_231822-a75c98ce.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco_20200504_231822.log.json)
+| [R-50-FPN](./mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py) | caffe | 3x | 4.3 | | 40.8 | 37.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_20200504_163245.log.json)
+| [X-101-32x8d-FPN](./mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco.py) | pytorch | 1x | - | | 43.6 | 39.0 |
+| [X-101-32x8d-FPN](./mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco.py) | pytorch | 3x | - | | 44.0 | 39.3 |
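+
+As an example of such fine-tuning, a downstream config can inherit one of the pre-trained configs above and point `load_from` at the released checkpoint. This is only a sketch: the learning rate and schedule are placeholder values, and it assumes the downstream dataset is already set up in COCO format.
+
+```
+_base_ = './mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py'
+# initialize from the released multi-scale 3x checkpoint (URL taken from the table above)
+load_from = 'http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
+optimizer = dict(lr=0.0025)     # placeholder: reduced lr for fine-tuning
+lr_config = dict(step=[8, 11])  # placeholder: short step schedule
+total_epochs = 12
+```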
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..230181cbeeb9c070dad926892f62d8f482d0ab1e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..db02d9b880c7de447da881efe184e532ad0ee215
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c8cb2d87eedae2777ac8727dff5f398e1c477ab1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r101_fpn_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './mask_rcnn_r50_fpn_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_c4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_c4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a44c01831b508da0a5e1ca3720bb437bcea086d1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_c4_1x_coco.py
@@ -0,0 +1,39 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_caffe_c4.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0471fe86eb50b0fd644f10d77ab0ea7e150c95cf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(norm_cfg=dict(requires_grad=False), style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5d6215d6f6e2f81fa284af0e639f3568429e3a75
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py
@@ -0,0 +1,45 @@
+_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(norm_cfg=dict(requires_grad=False), style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations',
+ with_bbox=True,
+ with_mask=True,
+ poly2mask=False),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..374b86446af40b643c4e68501e8215c4817579cf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..44f7e039fce0d1162c9f1bb11530dd7977439a11
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './mask_rcnn_r50_caffe_fpn_mstrain-poly_1x_coco.py'
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..86c5b13343b637ce218eed231240195a6768c5d1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain_1x_coco.py
@@ -0,0 +1,41 @@
+_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(norm_cfg=dict(requires_grad=False), style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_poly_1x_coco_v1.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_poly_1x_coco_v1.py
new file mode 100644
index 0000000000000000000000000000000000000000..431e5ab33675290d27e232f4fc5402279b7cf14c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_caffe_fpn_poly_1x_coco_v1.py
@@ -0,0 +1,57 @@
+_base_ = './mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnet50_caffe_bgr',
+ backbone=dict(norm_cfg=dict(requires_grad=False), style='caffe'),
+ rpn_head=dict(
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
+ roi_head=dict(
+ bbox_roi_extractor=dict(
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=7,
+ sampling_ratio=2,
+ aligned=False)),
+ bbox_head=dict(
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
+ mask_roi_extractor=dict(
+ roi_layer=dict(
+ type='RoIAlign',
+ output_size=14,
+ sampling_ratio=2,
+ aligned=False))))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations',
+ with_bbox=True,
+ with_mask=True,
+ poly2mask=False),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6a6c92460f1d58b8e8d361fb56ee123f2668ad9f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..932b1f905155a0d3285daefc4891f5194705e30d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_2x_coco.py
@@ -0,0 +1,5 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_poly_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_poly_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c952e67bd29e9d23de6d8d43fcac80acfb5beb58
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_r50_fpn_poly_1x_coco.py
@@ -0,0 +1,24 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations',
+ with_bbox=True,
+ with_mask=True,
+ poly2mask=False),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+    # dict(type='Pad', size_divisor=32),
+    # pad both dims up to a multiple of 1344, giving a fixed 1344x1344 input after the (1333, 800) resize
+    dict(type='Pad', size_divisor=1344),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d0016d1f1df4534ae27de95c4f7ec9976b3ab6d0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d4189c6fa2a6a3481bf666b713f6ab91812f3d86
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_r101_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ee034b716d6e20bfad03abe769f91fa3cc44c5e9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_1x_coco.py
@@ -0,0 +1,63 @@
+_base_ = './mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnext101_32x8d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=8,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ style='pytorch'))
+
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675],
+ std=[57.375, 57.120, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1c124328286c659d800d2c44a2c4e4fee15f26e5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_1x_coco.py
@@ -0,0 +1,58 @@
+_base_ = './mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnext101_32x8d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=8,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ style='pytorch'))
+
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675],
+ std=[57.375, 57.120, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations',
+ with_bbox=True,
+ with_mask=True,
+ poly2mask=False),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f326441d6226c469ae544052c92ac0c6fd210159
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_32x8d_fpn_mstrain-poly_3x_coco.py
@@ -0,0 +1,61 @@
+_base_ = './mask_rcnn_r101_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnext101_32x8d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=8,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False),
+ style='pytorch'))
+
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675],
+ std=[57.375, 57.120, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='LoadAnnotations',
+ with_bbox=True,
+ with_mask=True,
+ poly2mask=False),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..31e5943216f19a87a2f1e6f666efead573f72626
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9ba92c5b0b6dcaf10746aeacf7a868348133ff80
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/mask_rcnn/mask_rcnn_x101_64x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './mask_rcnn_x101_32x4d_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ddc1d7d9bb0441a44b0efed524ee1cb1d45b38f7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/README.md
@@ -0,0 +1,24 @@
+# Mask Scoring R-CNN
+
+## Introduction
+
+```
+@inproceedings{huang2019msrcnn,
+ title={Mask Scoring R-CNN},
+ author={Zhaojin Huang and Lichao Huang and Yongchao Gong and Chang Huang and Xinggang Wang},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2019},
+}
+```
+
+## Results and Models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:-------------:|:----------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN | caffe | 1x | 4.5 | | 38.2 | 36.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r50_caffe_fpn_1x_coco/ms_rcnn_r50_caffe_fpn_1x_coco_20200702_180848-61c9355e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r50_caffe_fpn_1x_coco/ms_rcnn_r50_caffe_fpn_1x_coco_20200702_180848.log.json) |
+| R-50-FPN | caffe | 2x | - | - | 38.8 | 36.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r50_caffe_fpn_2x_coco/ms_rcnn_r50_caffe_fpn_2x_coco_bbox_mAP-0.388__segm_mAP-0.363_20200506_004738-ee87b137.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r50_caffe_fpn_2x_coco/ms_rcnn_r50_caffe_fpn_2x_coco_20200506_004738.log.json) |
+| R-101-FPN | caffe | 1x | 6.5 | | 40.4 | 37.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r101_caffe_fpn_1x_coco/ms_rcnn_r101_caffe_fpn_1x_coco_bbox_mAP-0.404__segm_mAP-0.376_20200506_004755-b9b12a37.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r101_caffe_fpn_1x_coco/ms_rcnn_r101_caffe_fpn_1x_coco_20200506_004755.log.json) |
+| R-101-FPN | caffe | 2x | - | - | 41.1 | 38.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r101_caffe_fpn_2x_coco/ms_rcnn_r101_caffe_fpn_2x_coco_bbox_mAP-0.411__segm_mAP-0.381_20200506_011134-5f3cc74f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_r101_caffe_fpn_2x_coco/ms_rcnn_r101_caffe_fpn_2x_coco_20200506_011134.log.json) |
+| R-X101-32x4d | pytorch | 2x | 7.9 | 11.0 | 41.8 | 38.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_x101_32x4d_fpn_1x_coco/ms_rcnn_x101_32x4d_fpn_1x_coco_20200206-81fd1740.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_x101_32x4d_fpn_1x_coco/ms_rcnn_x101_32x4d_fpn_1x_coco_20200206_100113.log.json) |
+| R-X101-64x4d | pytorch | 1x | 11.0 | 8.0 | 43.0 | 39.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_x101_64x4d_fpn_1x_coco/ms_rcnn_x101_64x4d_fpn_1x_coco_20200206-86ba88d2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_x101_64x4d_fpn_1x_coco/ms_rcnn_x101_64x4d_fpn_1x_coco_20200206_091744.log.json) |
+| R-X101-64x4d | pytorch | 2x | 11.0 | 8.0 | 42.6 | 39.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_x101_64x4d_fpn_2x_coco/ms_rcnn_x101_64x4d_fpn_2x_coco_20200308-02a445e2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ms_rcnn/ms_rcnn_x101_64x4d_fpn_2x_coco/ms_rcnn_x101_64x4d_fpn_2x_coco_20200308_012247.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..3bd33c40263fc3a5bc44d09f5e3368ea9a859b0f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './ms_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r101_caffe_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r101_caffe_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8d4a30a3f446d7af065ff0921667fc7a813b65a2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r101_caffe_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './ms_rcnn_r101_caffe_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f0781996623e48a475f2d3fb3cc77abebbf7aa2f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ type='MaskScoringRCNN',
+ roi_head=dict(
+ type='MaskScoringRoIHead',
+ mask_iou_head=dict(
+ type='MaskIoUHead',
+ num_convs=4,
+ num_fcs=2,
+ roi_feat_size=14,
+ in_channels=256,
+ conv_out_channels=256,
+ fc_out_channels=1024,
+ num_classes=80)))
+# model training and testing settings
+train_cfg = dict(rcnn=dict(mask_thr_binary=0.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_caffe_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_caffe_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a14317ad90b31a6ecaf4a8452afa9df4ff5b66c0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_caffe_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './ms_rcnn_r50_caffe_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..daf4c37584b79a8017d040b0fd0f23d40989f6a0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ type='MaskScoringRCNN',
+ roi_head=dict(
+ type='MaskScoringRoIHead',
+ mask_iou_head=dict(
+ type='MaskIoUHead',
+ num_convs=4,
+ num_fcs=2,
+ roi_feat_size=14,
+ in_channels=256,
+ conv_out_channels=256,
+ fc_out_channels=1024,
+ num_classes=80)))
+# model training and testing settings
+train_cfg = dict(rcnn=dict(mask_thr_binary=0.5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4a78a252a9a49889c288ec6cb7d8114c78da5c57
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ms_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..61a0cefe4e20b55cd3caaab7dde325a111275726
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './ms_rcnn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_64x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_64x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..92ce4fbdd88727ceed7c688cc6ec954380fd2cc9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ms_rcnn/ms_rcnn_x101_64x4d_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './ms_rcnn_x101_64x4d_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..3cf4d5d986d9e212f3207ee21e52ad342e41947a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/README.md
@@ -0,0 +1,22 @@
+# NAS-FCOS: Fast Neural Architecture Search for Object Detection
+
+## Introduction
+
+```
+@article{wang2019fcos,
+ title={Nas-fcos: Fast neural architecture search for object detection},
+ author={Wang, Ning and Gao, Yang and Chen, Hao and Wang, Peng and Tian, Zhi and Shen, Chunhua},
+ journal={arXiv preprint arXiv:1906.04423},
+ year={2019}
+}
+```
+
+## Results and Models
+
+| Head | Backbone | Style | GN-head | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:---------:|:-------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| NAS-FCOSHead | R-50 | caffe | Y | 1x | | | 39.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/nas_fcos/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco_20200520-1bdba3ce.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/nas_fcos/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco_20200520.log.json) |
+| FCOSHead | R-50 | caffe | Y | 1x | | | 38.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/nas_fcos/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco_20200521-7fdcbce0.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/nas_fcos/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco_20200521.log.json) |
+
+**Notes:**
+- To be consistent with the authors' implementation, we train with 4 GPUs and 4 images per GPU (see the snippet below).
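+
+In config terms, the 4 images/GPU setting corresponds to the batch settings below, which the NAS-FCOS configs in this folder already contain; the 4-GPU part comes from how training is launched rather than from the config file.
+
+```
+# per-GPU dataloader settings used by the NAS-FCOS configs in this folder
+data = dict(
+    samples_per_gpu=4,  # 4 images per GPU
+    workers_per_gpu=2)  # dataloader worker processes per GPU
+```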
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..76dde57d8a42d5bf9ce1a188270d98bc7fcdb49e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/nas_fcos_fcoshead_r50_caffe_fpn_gn-head_4x4_1x_coco.py
@@ -0,0 +1,99 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ type='NASFCOS',
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False, eps=0),
+ style='caffe'),
+ neck=dict(
+ type='NASFCOS_FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs=True,
+ num_outs=5,
+ norm_cfg=dict(type='BN'),
+ conv_cfg=dict(type='DCNv2', deform_groups=2)),
+ bbox_head=dict(
+ type='FCOSHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ strides=[8, 16, 32, 64, 128],
+ norm_cfg=dict(type='GN', num_groups=32),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='IoULoss', loss_weight=1.0),
+ loss_centerness=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
+
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.6),
+ max_per_img=100)
+
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=2,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+
+optimizer = dict(
+ lr=0.01, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a22f8f1998c46b38f56223837330d2014029ca11
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fcos/nas_fcos_nashead_r50_caffe_fpn_gn-head_4x4_1x_coco.py
@@ -0,0 +1,98 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+
+model = dict(
+ type='NASFCOS',
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=False, eps=0),
+ style='caffe'),
+ neck=dict(
+ type='NASFCOS_FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs=True,
+ num_outs=5,
+ norm_cfg=dict(type='BN'),
+ conv_cfg=dict(type='DCNv2', deform_groups=2)),
+ bbox_head=dict(
+ type='NASFCOSHead',
+ num_classes=80,
+ in_channels=256,
+ feat_channels=256,
+ strides=[8, 16, 32, 64, 128],
+ norm_cfg=dict(type='GN', num_groups=32),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='IoULoss', loss_weight=1.0),
+ loss_centerness=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
+
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.6),
+ max_per_img=100)
+
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+
+data = dict(
+ samples_per_gpu=4,
+ workers_per_gpu=2,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+
+optimizer = dict(
+ lr=0.01, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c7c27497652d844693324a870536a7f89352d639
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/README.md
@@ -0,0 +1,25 @@
+# NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
+
+## Introduction
+
+```
+@inproceedings{ghiasi2019fpn,
+ title={Nas-fpn: Learning scalable feature pyramid architecture for object detection},
+ author={Ghiasi, Golnaz and Lin, Tsung-Yi and Le, Quoc V},
+ booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
+ pages={7036--7045},
+ year={2019}
+}
+```
+
+## Results and Models
+
+We benchmark the new training schedule (crop training, large batch, unfrozen BN, 50 epochs) introduced in NAS-FPN. RetinaNet is used as the base detector, as in the paper; a condensed sketch of how these schedule elements appear in the configs follows the table below.
+
+| Backbone | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:-----------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| R-50-FPN | 50e | 12.9 | 22.9 | 37.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/nas_fpn/retinanet_r50_fpn_crop640_50e_coco/retinanet_r50_fpn_crop640_50e_coco-9b953d76.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/nas_fpn/retinanet_r50_fpn_crop640_50e_coco/retinanet_r50_fpn_crop640_50e_coco_20200529_095329.log.json) |
+| R-50-NASFPN | 50e | 13.2 | 23.0 | 40.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/nas_fpn/retinanet_r50_nasfpn_crop640_50e_coco/retinanet_r50_nasfpn_crop640_50e_coco-0ad1f644.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/nas_fpn/retinanet_r50_nasfpn_crop640_50e_coco/retinanet_r50_nasfpn_crop640_50e_coco_20200528_230008.log.json) |
+
+
+**Note**: We find that training NAS-FPN is unstable, and there is a small chance that results are about 3% mAP lower.
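+
+For reference, the schedule elements above map onto the configs in this folder roughly as follows. This is a condensed, illustrative sketch rather than a complete config; see `retinanet_r50_nasfpn_crop640_50e_coco.py` below for the full settings.
+
+```python
+# Condensed sketch of how the NAS-FPN schedule appears in the configs below.
+norm_cfg = dict(type='BN', requires_grad=True)
+model = dict(
+    backbone=dict(norm_cfg=norm_cfg, norm_eval=False),  # unfrozen BN
+    neck=dict(type='NASFPN', stack_times=7, norm_cfg=norm_cfg))
+train_pipeline = [
+    dict(type='Resize', img_scale=(640, 640), ratio_range=(0.8, 1.2), keep_ratio=True),
+    dict(type='RandomCrop', crop_size=(640, 640)),  # crop training
+]
+data = dict(samples_per_gpu=8)  # large batch: 8 images per GPU
+optimizer = dict(type='SGD', lr=0.08, momentum=0.9, weight_decay=0.0001)
+total_epochs = 50  # 50-epoch schedule
+```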
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/retinanet_r50_fpn_crop640_50e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/retinanet_r50_fpn_crop640_50e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..feeabc7119ba72279dc0ad266ec19b7146aec3e6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/retinanet_r50_fpn_crop640_50e_coco.py
@@ -0,0 +1,80 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py', '../_base_/default_runtime.py'
+]
+cudnn_benchmark = True
+norm_cfg = dict(type='BN', requires_grad=True)
+model = dict(
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=norm_cfg,
+ norm_eval=False,
+ style='pytorch'),
+ neck=dict(
+ relu_before_extra_convs=True,
+ no_norm_on_lateral=True,
+ norm_cfg=norm_cfg),
+ bbox_head=dict(type='RetinaSepBNHead', num_ins=5, norm_cfg=norm_cfg))
+# training and testing settings
+train_cfg = dict(assigner=dict(neg_iou_thr=0.5))
+# dataset settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=(640, 640),
+ ratio_range=(0.8, 1.2),
+ keep_ratio=True),
+ dict(type='RandomCrop', crop_size=(640, 640)),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size=(640, 640)),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(640, 640),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=64),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(
+ type='SGD',
+ lr=0.08,
+ momentum=0.9,
+ weight_decay=0.0001,
+ paramwise_cfg=dict(norm_decay_mult=0, bypass_duplicate=True))
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=1000,
+ warmup_ratio=0.1,
+ step=[30, 40])
+# runtime settings
+total_epochs = 50
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/retinanet_r50_nasfpn_crop640_50e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/retinanet_r50_nasfpn_crop640_50e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..72fbb0445a4b778d86b935051042d98bac37538b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/nas_fpn/retinanet_r50_nasfpn_crop640_50e_coco.py
@@ -0,0 +1,79 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py', '../_base_/default_runtime.py'
+]
+cudnn_benchmark = True
+# model settings
+norm_cfg = dict(type='BN', requires_grad=True)
+model = dict(
+ type='RetinaNet',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=norm_cfg,
+ norm_eval=False,
+ style='pytorch'),
+ neck=dict(type='NASFPN', stack_times=7, norm_cfg=norm_cfg),
+ bbox_head=dict(type='RetinaSepBNHead', num_ins=5, norm_cfg=norm_cfg))
+# training and testing settings
+train_cfg = dict(assigner=dict(neg_iou_thr=0.5))
+# dataset settings
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=(640, 640),
+ ratio_range=(0.8, 1.2),
+ keep_ratio=True),
+ dict(type='RandomCrop', crop_size=(640, 640)),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size=(640, 640)),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(640, 640),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=128),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=4,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(
+ type='SGD',
+ lr=0.08,
+ momentum=0.9,
+ weight_decay=0.0001,
+ paramwise_cfg=dict(norm_decay_mult=0, bypass_duplicate=True))
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=1000,
+ warmup_ratio=0.1,
+ step=[30, 40])
+# runtime settings
+total_epochs = 50
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/paa/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/paa/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b5cc84554570d46f0f9e1fc7bcd6adee31f84c59
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/paa/README.md
@@ -0,0 +1,22 @@
+# Probabilistic Anchor Assignment with IoU Prediction for Object Detection
+
+
+
+## Results and Models
+We provide config files to reproduce the object detection results in the
+ECCV 2020 paper for Probabilistic Anchor Assignment with IoU
+Prediction for Object Detection.
+
+| Backbone | Lr schd | Mem (GB) | Score voting | box AP | Download |
+|:-----------:|:-------:|:--------:|:------------:|:------:|:--------:|
+| R-50-FPN | 12e | 3.7 | True | 40.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r50_fpn_1x_coco/paa_r50_fpn_1x_coco_20200821-936edec3.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r50_fpn_1x_coco/paa_r50_fpn_1x_coco_20200821-936edec3.log.json) |
+| R-50-FPN | 12e | 3.7 | False | 40.2 | - |
+| R-50-FPN | 18e | 3.7 | True | 41.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r50_fpn_1.5x_coco/paa_r50_fpn_1.5x_coco_20200823-805d6078.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r50_fpn_1.5x_coco/paa_r50_fpn_1.5x_coco_20200823-805d6078.log.json) |
+| R-50-FPN | 18e | 3.7 | False | 41.2 | - |
+| R-50-FPN | 24e | 3.7 | True | 41.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r50_fpn_2x_coco/paa_r50_fpn_2x_coco_20200821-c98bfc4e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r50_fpn_2x_coco/paa_r50_fpn_2x_coco_20200821-c98bfc4e.log.json) |
+| R-101-FPN | 12e | 6.2 | True | 42.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r101_fpn_1x_coco/paa_r101_fpn_1x_coco_20200821-0a1825a4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r101_fpn_1x_coco/paa_r101_fpn_1x_coco_20200821-0a1825a4.log.json) |
+| R-101-FPN | 12e | 6.2 | False | 42.4 | - |
+| R-101-FPN | 24e | 6.2 | True | 43.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r101_fpn_2x_coco/paa_r101_fpn_2x_coco_20200821-6829f96b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/paa/paa_r101_fpn_2x_coco/paa_r101_fpn_2x_coco_20200821-6829f96b.log.json) |
+
+**Note**:
+1. We find that the performance is unstable with the 1x setting and may fluctuate by about 0.2 mAP; we report the best results.
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a64a012dd32c1c4b857a21bc996778c923c7c461
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r101_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './paa_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a3bc60f91e42244876aee34a8f330af9e5711ea2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r101_fpn_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './paa_r101_fpn_1x_coco.py'
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_1.5x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_1.5x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7de45783b8114fe15892e9e9f242d5283e1fceea
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_1.5x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './paa_r50_fpn_1x_coco.py'
+lr_config = dict(step=[12, 16])
+total_epochs = 18
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e66cd1b77968459a01eec82c819c33a0403a2358
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_1x_coco.py
@@ -0,0 +1,70 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ type='PAA',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_output',
+ num_outs=5),
+ bbox_head=dict(
+ type='PAAHead',
+ reg_decoded_bbox=True,
+ score_voting=True,
+ topk=9,
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ octave_base_scale=8,
+ scales_per_octave=1,
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox=dict(type='GIoULoss', loss_weight=1.3),
+ loss_centerness=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.1,
+ neg_iou_thr=0.1,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.6),
+ max_per_img=100)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..529f07439e00789fe7f378b4d7b13da708db1fa6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/paa/paa_r50_fpn_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './paa_r50_fpn_1x_coco.py'
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pafpn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/pafpn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c7e8f311c59e04bd74cecc16979ed6d8a42d9d95
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pafpn/README.md
@@ -0,0 +1,24 @@
+# Path Aggregation Network for Instance Segmentation
+
+## Introduction
+
+```
+@inproceedings{liu2018path,
+ author = {Shu Liu and
+ Lu Qi and
+ Haifang Qin and
+ Jianping Shi and
+ Jiaya Jia},
+ title = {Path Aggregation Network for Instance Segmentation},
+ booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+ year = {2018}
+}
+```
+
+## Results and Models
+
+| Backbone | style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+|:-------------:|:----------:|:-------:|:--------:|:--------------:|:------:|:-------:|:--------:|
+| R-50-FPN | pytorch | 1x | 4.0 | 17.2 | 37.5 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pafpn/faster_rcnn_r50_pafpn_1x_coco/faster_rcnn_r50_pafpn_1x_coco_bbox_mAP-0.375_20200503_105836-b7b4b9bd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pafpn/faster_rcnn_r50_pafpn_1x_coco/faster_rcnn_r50_pafpn_1x_coco_20200503_105836.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pafpn/faster_rcnn_r50_pafpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pafpn/faster_rcnn_r50_pafpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b2fdef91c5cc8396baee9c2d8a09556162443078
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pafpn/faster_rcnn_r50_pafpn_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+
+model = dict(
+ neck=dict(
+ type='PAFPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c6d3715acadacda7b910b9d7a5b14b1baeff3b1c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/README.md
@@ -0,0 +1,6 @@
+## Results and Models
+
+| Architecture | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:------------:|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| Faster R-CNN | R-50 | pytorch | 1x | 2.6 | - | 79.5 |[model](http://download.openmmlab.com/mmdetection/v2.0/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712/faster_rcnn_r50_fpn_1x_voc0712_20200624-c9895d40.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712/20200623_015208.log.json) |
+| RetinaNet | R-50 | pytorch | 1x | 2.1 | - | 77.3 |[model](http://download.openmmlab.com/mmdetection/v2.0/pascal_voc/retinanet_r50_fpn_1x_voc0712/retinanet_r50_fpn_1x_voc0712_20200617-47cbdd0e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pascal_voc/retinanet_r50_fpn_1x_voc0712/retinanet_r50_fpn_1x_voc0712_20200616_014642.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py
new file mode 100644
index 0000000000000000000000000000000000000000..b48203a54a5ee06b22f35c5c80b9da9647caec8d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712.py
@@ -0,0 +1,13 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py', '../_base_/datasets/voc0712.py',
+ '../_base_/default_runtime.py'
+]
+model = dict(roi_head=dict(bbox_head=dict(num_classes=20)))
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+# actual epoch = 3 * 3 = 9
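+# (the x3 factor assumes the base voc0712.py dataset config repeats the VOC trainval set 3 times via RepeatDataset)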
+lr_config = dict(policy='step', step=[3])
+# runtime settings
+total_epochs = 4 # actual epoch = 4 * 3 = 12
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/retinanet_r50_fpn_1x_voc0712.py b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/retinanet_r50_fpn_1x_voc0712.py
new file mode 100644
index 0000000000000000000000000000000000000000..cf8b9bf6f69eedebd2d982b53a24a5bfa226a02c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/retinanet_r50_fpn_1x_voc0712.py
@@ -0,0 +1,13 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py', '../_base_/datasets/voc0712.py',
+ '../_base_/default_runtime.py'
+]
+model = dict(bbox_head=dict(num_classes=20))
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
+optimizer_config = dict(grad_clip=None)
+# learning policy
+# actual epoch = 3 * 3 = 9
+lr_config = dict(policy='step', step=[3])
+# runtime settings
+total_epochs = 4 # actual epoch = 4 * 3 = 12
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/ssd300_voc0712.py b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/ssd300_voc0712.py
new file mode 100644
index 0000000000000000000000000000000000000000..677ed07c3a590bc2ca8a2d5949194a9f282b6dc9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/ssd300_voc0712.py
@@ -0,0 +1,69 @@
+_base_ = [
+ '../_base_/models/ssd300.py', '../_base_/datasets/voc0712.py',
+ '../_base_/default_runtime.py'
+]
+model = dict(
+ bbox_head=dict(
+ num_classes=20, anchor_generator=dict(basesize_ratio_range=(0.2,
+ 0.9))))
+# dataset settings
+dataset_type = 'VOCDataset'
+data_root = 'data/VOCdevkit/'
+img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(300, 300), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(300, 300),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=3,
+ train=dict(
+ type='RepeatDataset', times=10, dataset=dict(pipeline=train_pipeline)),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=1e-3, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict()
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.001,
+ step=[16, 20])
+checkpoint_config = dict(interval=1)
+# runtime settings
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/ssd512_voc0712.py b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/ssd512_voc0712.py
new file mode 100644
index 0000000000000000000000000000000000000000..365a65fc64bf693d812c97855942827b10bd8e64
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pascal_voc/ssd512_voc0712.py
@@ -0,0 +1,53 @@
+_base_ = 'ssd300_voc0712.py'
+input_size = 512
+model = dict(
+ backbone=dict(input_size=input_size),
+ bbox_head=dict(
+ in_channels=(512, 1024, 512, 256, 256, 256, 256),
+ anchor_generator=dict(
+ input_size=input_size,
+ strides=[8, 16, 32, 64, 128, 256, 512],
+ basesize_ratio_range=(0.15, 0.9),
+ ratios=([2], [2, 3], [2, 3], [2, 3], [2, 3], [2], [2]))))
+img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(512, 512), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(512, 512),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(dataset=dict(pipeline=train_pipeline)),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..da4c9a2967f478abf940590cd28969b4b4c6d9e5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/README.md
@@ -0,0 +1,38 @@
+# Prime Sample Attention in Object Detection
+
+## Introduction
+
+```
+@inproceedings{cao2019prime,
+ title={Prime sample attention in object detection},
+ author={Cao, Yuhang and Chen, Kai and Loy, Chen Change and Lin, Dahua},
+ booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
+ year={2020}
+}
+```
+
+## Results and models
+
+
+| PISA | Network | Backbone | Lr schd | box AP | mask AP | Download |
+|:----:|:-------:|:-------------------:|:-------:|:------:|:-------:|:--------:|
+| × | Faster R-CNN | R-50-FPN | 1x | 36.4 | | - |
+| √ | Faster R-CNN | R-50-FPN | 1x | 38.4 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_faster_rcnn_r50_fpn_1x_coco/pisa_faster_rcnn_r50_fpn_1x_coco-dea93523.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_faster_rcnn_r50_fpn_1x_coco/pisa_faster_rcnn_r50_fpn_1x_coco_20200506_185619.log.json) |
+| × | Faster R-CNN | X101-32x4d-FPN | 1x | 40.1 | | - |
+| √ | Faster R-CNN | X101-32x4d-FPN | 1x | 41.9 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco-e4accec4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco_20200505_181503.log.json) |
+| × | Mask R-CNN | R-50-FPN | 1x | 37.3 | 34.2 | - |
+| √ | Mask R-CNN | R-50-FPN | 1x | 39.1 | 35.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_mask_rcnn_r50_fpn_1x_coco/pisa_mask_rcnn_r50_fpn_1x_coco-dfcedba6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_mask_rcnn_r50_fpn_1x_coco/pisa_mask_rcnn_r50_fpn_1x_coco_20200508_150500.log.json) |
+| × | Mask R-CNN | X101-32x4d-FPN | 1x | 41.1 | 37.1 | - |
+| √ | Mask R-CNN | X101-32x4d-FPN | 1x | | | |
+| × | RetinaNet | R-50-FPN | 1x | 35.6 | | - |
+| √ | RetinaNet | R-50-FPN | 1x | 36.9 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_retinanet_r50_fpn_1x_coco/pisa_retinanet_r50_fpn_1x_coco-76409952.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_retinanet_r50_fpn_1x_coco/pisa_retinanet_r50_fpn_1x_coco_20200504_014311.log.json) |
+| × | RetinaNet | X101-32x4d-FPN | 1x | 39.0 | | - |
+| √ | RetinaNet | X101-32x4d-FPN | 1x | 40.7 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_retinanet_x101_32x4d_fpn_1x_coco/pisa_retinanet_x101_32x4d_fpn_1x_coco-a0c13c73.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_retinanet_x101_32x4d_fpn_1x_coco/pisa_retinanet_x101_32x4d_fpn_1x_coco_20200505_001404.log.json) |
+| × | SSD300 | VGG16 | 1x | 25.6 | | - |
+| √ | SSD300 | VGG16 | 1x | 27.6 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_ssd300_coco/pisa_ssd300_coco-710e3ac9.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_ssd300_coco/pisa_ssd300_coco_20200504_144325.log.json) |
+| × | SSD512 | VGG16 | 1x | 29.3 | | - |
+| √ | SSD512 | VGG16 | 1x | 31.8 | | [model](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_ssd512_coco/pisa_ssd512_coco-247addee.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/pisa/pisa_ssd512_coco/pisa_ssd512_coco_20200508_131030.log.json) |
+
+**Notes:**
+- In the original paper, all models were trained and tested with mmdet v1.x, so results may not exactly match this v2.0 release.
+- Note that PISA only modifies the training pipeline, so the inference time is the same as the baseline.
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_faster_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_faster_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ab70f464ce45b27a27f2c4fde610b6a997ac0553
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_faster_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
+
+model = dict(
+ roi_head=dict(
+ type='PISARoIHead',
+ bbox_head=dict(
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+
+train_cfg = dict(
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ sampler=dict(
+ type='ScoreHLRSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True,
+ k=0.5,
+ bias=0.),
+ isr=dict(k=2, bias=0),
+ carl=dict(k=1, bias=0.2)))
+
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e735ecad36877f318ea97e9686378bd0ed0f11b1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_faster_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../faster_rcnn/faster_rcnn_x101_32x4d_fpn_1x_coco.py'
+
+model = dict(
+ roi_head=dict(
+ type='PISARoIHead',
+ bbox_head=dict(
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+
+train_cfg = dict(
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ sampler=dict(
+ type='ScoreHLRSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True,
+ k=0.5,
+ bias=0.),
+ isr=dict(k=2, bias=0),
+ carl=dict(k=1, bias=0.2)))
+
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_mask_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_mask_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d1008c3f0e6d7f004fed6dd6a93ed7f8a9ee7003
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_mask_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py'
+
+model = dict(
+ roi_head=dict(
+ type='PISARoIHead',
+ bbox_head=dict(
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+
+train_cfg = dict(
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ sampler=dict(
+ type='ScoreHLRSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True,
+ k=0.5,
+ bias=0.),
+ isr=dict(k=2, bias=0),
+ carl=dict(k=1, bias=0.2)))
+
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_mask_rcnn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_mask_rcnn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..568792588456ef57b6f90189bf5dfec2a5765236
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_mask_rcnn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = '../mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py'
+
+model = dict(
+ roi_head=dict(
+ type='PISARoIHead',
+ bbox_head=dict(
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))))
+
+train_cfg = dict(
+ rpn_proposal=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0),
+ rcnn=dict(
+ sampler=dict(
+ type='ScoreHLRSampler',
+ num=512,
+ pos_fraction=0.25,
+ neg_pos_ub=-1,
+ add_gt_as_proposals=True,
+ k=0.5,
+ bias=0.),
+ isr=dict(k=2, bias=0),
+ carl=dict(k=1, bias=0.2)))
+
+test_cfg = dict(
+ rpn=dict(
+ nms_across_levels=False,
+ nms_pre=2000,
+ nms_post=2000,
+ max_num=2000,
+ nms_thr=0.7,
+ min_bbox_size=0))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_retinanet_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_retinanet_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b4aa4db51672eee8a5ab8d94522e0f9fadd28108
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_retinanet_r50_fpn_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../retinanet/retinanet_r50_fpn_1x_coco.py'
+
+model = dict(
+ bbox_head=dict(
+ type='PISARetinaHead',
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))
+
+train_cfg = dict(isr=dict(k=2., bias=0.), carl=dict(k=1., bias=0.2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_retinanet_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_retinanet_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4f8f273d3976677aed3e8697dee4b39e808922c1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_retinanet_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../retinanet/retinanet_x101_32x4d_fpn_1x_coco.py'
+
+model = dict(
+ bbox_head=dict(
+ type='PISARetinaHead',
+ loss_bbox=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0)))
+
+train_cfg = dict(isr=dict(k=2., bias=0.), carl=dict(k=1., bias=0.2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_ssd300_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_ssd300_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..fe5f4f6d05cb4a9efddaae868d859490db53ae1c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_ssd300_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../ssd/ssd300_coco.py'
+
+model = dict(bbox_head=dict(type='PISASSDHead'))
+
+train_cfg = dict(isr=dict(k=2., bias=0.), carl=dict(k=1., bias=0.2))
+
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_ssd512_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_ssd512_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1404ee05340523169562f93999e024561324940e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/pisa/pisa_ssd512_coco.py
@@ -0,0 +1,8 @@
+_base_ = '../ssd/ssd512_coco.py'
+
+model = dict(bbox_head=dict(type='PISASSDHead'))
+
+train_cfg = dict(isr=dict(k=2., bias=0.), carl=dict(k=1., bias=0.2))
+
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b2e5a3684941df2d50c24817adc2b6625b8467d3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/README.md
@@ -0,0 +1,20 @@
+# PointRend
+
+## Introduction
+```
+@InProceedings{kirillov2019pointrend,
+ title={{PointRend}: Image Segmentation as Rendering},
+ author={Alexander Kirillov and Yuxin Wu and Kaiming He and Ross Girshick},
+ journal={ArXiv:1912.08193},
+ year={2019}
+}
+```
+
+## Results and models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| R-50-FPN | caffe | 1x | 4.6 | | 38.4 | 36.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/point_rend/point_rend_r50_caffe_fpn_mstrain_1x_coco/point_rend_r50_caffe_fpn_mstrain_1x_coco-1bcb5fb4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/point_rend/point_rend_r50_caffe_fpn_mstrain_1x_coco/point_rend_r50_caffe_fpn_mstrain_1x_coco_20200612_161407.log.json) |
+| R-50-FPN | caffe | 3x | 4.6 | | 41.0 | 38.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/point_rend/point_rend_r50_caffe_fpn_mstrain_3x_coco/point_rend_r50_caffe_fpn_mstrain_3x_coco-e0ebb6b7.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/point_rend/point_rend_r50_caffe_fpn_mstrain_3x_coco/point_rend_r50_caffe_fpn_mstrain_3x_coco_20200614_002632.log.json) |
+
+Note: All models are trained with multi-scale augmentation; the shorter side of the input image is randomly scaled to one of (640, 672, 704, 736, 768, 800).
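+
+For reference, this corresponds to a `Resize` step of roughly the following form in the training pipeline. This is an illustrative sketch, assuming the standard 1333-pixel cap on the longer side used by the base mstrain configs.
+
+```python
+# Illustrative multi-scale Resize: the shorter side is randomly picked from
+# 640-800 (values from the note above), with the longer side capped at 1333.
+dict(
+    type='Resize',
+    img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+               (1333, 768), (1333, 800)],
+    multiscale_mode='value',
+    keep_ratio=True)
+```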
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/point_rend_r50_caffe_fpn_mstrain_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/point_rend_r50_caffe_fpn_mstrain_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..dc7f97554b2ca905ad098b487cd7e0393d30cd1d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/point_rend_r50_caffe_fpn_mstrain_1x_coco.py
@@ -0,0 +1,42 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_caffe_fpn_mstrain_1x_coco.py'
+# model settings
+model = dict(
+ type='PointRend',
+ roi_head=dict(
+ type='PointRendRoIHead',
+ mask_roi_extractor=dict(
+ type='GenericRoIExtractor',
+ aggregation='concat',
+ roi_layer=dict(
+ _delete_=True, type='SimpleRoIAlign', output_size=14),
+ out_channels=256,
+ featmap_strides=[4]),
+ mask_head=dict(
+ _delete_=True,
+ type='CoarseMaskHead',
+ num_fcs=2,
+ in_channels=256,
+ conv_out_channels=256,
+ fc_out_channels=1024,
+ num_classes=80,
+ loss_mask=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
+ point_head=dict(
+ type='MaskPointHead',
+ num_fcs=3,
+ in_channels=256,
+ fc_channels=256,
+ num_classes=80,
+ coarse_pred_each_layer=True,
+ loss_point=dict(
+ type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))
+# model training and testing settings
+train_cfg = dict(
+ rcnn=dict(
+ mask_size=7,
+ num_points=14 * 14,
+ oversample_ratio=3,
+ importance_sample_ratio=0.75))
+test_cfg = dict(
+ rcnn=dict(
+ subdivision_steps=5, subdivision_num_points=28 * 28, scale_factor=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/point_rend_r50_caffe_fpn_mstrain_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/point_rend_r50_caffe_fpn_mstrain_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4e00eb744c76a770b035ecb5f3751e95df02025a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/point_rend/point_rend_r50_caffe_fpn_mstrain_3x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './point_rend_r50_caffe_fpn_mstrain_1x_coco.py'
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..988a43bf5636a68ed7c2c320db05435647dee87e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/README.md
@@ -0,0 +1,90 @@
+# Designing Network Design Spaces
+
+## Introduction
+
+We implement RegNetX and RegNetY models in detection systems and provide their first results on Mask R-CNN, Faster R-CNN and RetinaNet.
+
+The pre-trained models are converted from the [pycls model zoo](https://github.com/facebookresearch/pycls/blob/master/MODEL_ZOO.md).
+
+```
+@article{radosavovic2020designing,
+ title={Designing Network Design Spaces},
+ author={Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Dollár},
+ year={2020},
+ eprint={2003.13678},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
+
+## Usage
+
+To use a RegNet model, two steps are needed:
+1. Convert the model to a ResNet-style checkpoint supported by MMDetection
+2. Modify the backbone and neck in the config accordingly
+
+### Convert model
+
+We already provide models with FLOPs ranging from 400M to 12G in our model zoo.
+
+For more general usage, we also provide the script `regnet2mmdet.py` in the tools directory to convert the keys of models pretrained with [pycls](https://github.com/facebookresearch/pycls/) into
+ResNet-style checkpoints used in MMDetection.
+
+```bash
+python -u tools/regnet2mmdet.py ${PRETRAIN_PATH} ${STORE_PATH}
+```
+This script converts the model at `PRETRAIN_PATH` and stores the converted model at `STORE_PATH`.
+
+
+### Modify config
+
+Users can modify the backbone's `depth` in the config and the corresponding keys in `arch` according to the configs in the [pycls model zoo](https://github.com/facebookresearch/pycls/blob/master/MODEL_ZOO.md).
+The FPN parameter `in_channels` can be found in Figures 15 & 16 of the paper (`wi` in the legend).
+This directory already provides some configs with their performance, using RegNetX models from the 800MF to 12GF level.
+For other pre-trained models or self-implemented RegNet models, users are responsible for checking these parameters themselves.
+
+**Note**: Although Figures 15 & 16 also provide `w0`, `wa`, `wm`, `group_w`, and `bot_mul` for `arch`, they are quantized and thus inaccurate; using them sometimes produces a backbone whose keys do not match those in the pre-trained model.
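+
+As an illustration, switching an existing RegNet config to RegNetX-12GF only requires overriding `arch`, the pretrained weights, and the FPN `in_channels` (the stage widths `wi`). This is a minimal sketch using the values from `mask_rcnn_regnetx-12GF_fpn_1x_coco.py` below; the actual config file spells out all backbone fields explicitly.
+
+```python
+# Minimal sketch: switch the backbone of an existing RegNet config to RegNetX-12GF.
+_base_ = './mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+    pretrained='open-mmlab://regnetx_12gf',
+    backbone=dict(type='RegNet', arch='regnetx_12gf'),
+    # stage widths w_i of RegNetX-12GF, used as FPN inputs
+    neck=dict(type='FPN', in_channels=[224, 448, 896, 2240]))
+```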
+
+## Results
+
+### Mask R-CNN
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :---------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| [R-50-FPN](../mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py)| pytorch | 1x | 4.4 | 12.0 | 38.2 | 34.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205_050542.log.json) |
+|[RegNetX-3.2GF-FPN](./mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py)| pytorch | 1x |5.0 ||40.3|36.6|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-3.2GF_fpn_1x_coco/mask_rcnn_regnetx-3.2GF_fpn_1x_coco_20200520_163141-2a9d1814.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-3.2GF_fpn_1x_coco/mask_rcnn_regnetx-3.2GF_fpn_1x_coco_20200520_163141.log.json) |
+|[RegNetX-4.0GF-FPN](./mask_rcnn_regnetx-4GF_fpn_1x_coco.py)| pytorch | 1x |5.5||41.5|37.4|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-4GF_fpn_1x_coco/mask_rcnn_regnetx-4GF_fpn_1x_coco_20200517_180217-32e9c92d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-4GF_fpn_1x_coco/mask_rcnn_regnetx-4GF_fpn_1x_coco_20200517_180217.log.json) |
+| [R-101-FPN](../mask_rcnn/mask_rcnn_r101_fpn_1x_coco.py)| pytorch | 1x | 6.4 | 10.3 | 40.0 | 36.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_1x_coco/mask_rcnn_r101_fpn_1x_coco_20200204-1efe0ed5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r101_fpn_1x_coco/mask_rcnn_r101_fpn_1x_coco_20200204_144809.log.json) |
+|[RegNetX-6.4GF-FPN](./mask_rcnn_regnetx-6.4GF_fpn_1x_coco.py)| pytorch | 1x |6.1 ||41.0|37.1|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-6.4GF_fpn_1x_coco/mask_rcnn_regnetx-6.4GF_fpn_1x_coco_20200517_180439-3a7aae83.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-6.4GF_fpn_1x_coco/mask_rcnn_regnetx-6.4GF_fpn_1x_coco_20200517_180439.log.json) |
+| [X-101-32x4d-FPN](../mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco.py) | pytorch | 1x | 7.6 | 9.4 | 41.9 | 37.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco/mask_rcnn_x101_32x4d_fpn_1x_coco_20200205-478d0b67.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_x101_32x4d_fpn_1x_coco/mask_rcnn_x101_32x4d_fpn_1x_coco_20200205_034906.log.json) |
+|[RegNetX-8.0GF-FPN](./mask_rcnn_regnetx-8GF_fpn_1x_coco.py)| pytorch | 1x |6.4 ||41.7|37.5|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-8GF_fpn_1x_coco/mask_rcnn_regnetx-8GF_fpn_1x_coco_20200517_180515-09daa87e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-8GF_fpn_1x_coco/mask_rcnn_regnetx-8GF_fpn_1x_coco_20200517_180515.log.json) |
+|[RegNetX-12GF-FPN](./mask_rcnn_regnetx-12GF_fpn_1x_coco.py)| pytorch | 1x |7.4 ||42.2|38|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-12GF_fpn_1x_coco/mask_rcnn_regnetx-12GF_fpn_1x_coco_20200517_180552-b538bd8b.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-12GF_fpn_1x_coco/mask_rcnn_regnetx-12GF_fpn_1x_coco_20200517_180552.log.json) |
+|[RegNetX-3.2GF-FPN-DCN-C3-C5](./mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco.py)| pytorch | 1x |5.0 ||40.3|36.6|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco_20200520_172726-75f40794.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco_20200520_172726.log.json) |
+
+### Faster R-CNN
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :---------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+| [R-50-FPN](../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py)| pytorch | 1x | 4.0 | 18.2 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130_204655.log.json) |
+|[RegNetX-3.2GF-FPN](./faster_rcnn_regnetx-3.2GF_fpn_1x_coco.py)| pytorch | 1x | 4.5||39.9|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/faster_rcnn_regnetx-3.2GF_fpn_1x_coco/faster_rcnn_regnetx-3.2GF_fpn_1x_coco_20200517_175927-126fd9bf.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/faster_rcnn_regnetx-3.2GF_fpn_1x_coco/faster_rcnn_regnetx-3.2GF_fpn_1x_coco_20200517_175927.log.json) |
+|[RegNetX-3.2GF-FPN](./faster_rcnn_regnetx-3.2GF_fpn_2x_coco.py)| pytorch | 2x | 4.5||41.1|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/faster_rcnn_regnetx-3.2GF_fpn_2x_coco/faster_rcnn_regnetx-3.2GF_fpn_2x_coco_20200520_223955-e2081918.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/faster_rcnn_regnetx-3.2GF_fpn_2x_coco/faster_rcnn_regnetx-3.2GF_fpn_2x_coco_20200520_223955.log.json) |
+
+### RetinaNet
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :---------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+| [R-50-FPN](../retinanet/retinanet_r50_fpn_1x_coco.py) | pytorch | 1x | 3.8 | 16.6 | 36.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_1x_coco/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_1x_coco/retinanet_r50_fpn_1x_coco_20200130_002941.log.json) |
+|[RegNetX-800MF-FPN](./retinanet_regnetx-800MF_fpn_1x_coco.py)| pytorch | 1x |2.5||35.6|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/retinanet_regnetx-800MF_fpn_1x_coco/retinanet_regnetx-800MF_fpn_1x_coco_20200517_191403-f6f91d10.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/retinanet_regnetx-800MF_fpn_1x_coco/retinanet_regnetx-800MF_fpn_1x_coco_20200517_191403.log.json) |
+|[RegNetX-1.6GF-FPN](./retinanet_regnetx-1.6GF_fpn_1x_coco.py)| pytorch | 1x |3.3||37.3|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/retinanet_regnetx-1.6GF_fpn_1x_coco/retinanet_regnetx-1.6GF_fpn_1x_coco_20200517_191403-37009a9d.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/retinanet_regnetx-1.6GF_fpn_1x_coco/retinanet_regnetx-1.6GF_fpn_1x_coco_20200517_191403.log.json) |
+|[RegNetX-3.2GF-FPN](./retinanet_regnetx-3.2GF_fpn_1x_coco.py)| pytorch | 1x |4.2 ||39.1|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/retinanet_regnetx-3.2GF_fpn_1x_coco/retinanet_regnetx-3.2GF_fpn_1x_coco_20200520_163141-cb1509e8.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/retinanet_regnetx-3.2GF_fpn_1x_coco/retinanet_regnetx-3.2GF_fpn_1x_coco_20200520_163141.log.json) |
+
+### Pre-trained models
+
+We also train some models with longer schedules and multi-scale training. Users can fine-tune them for downstream tasks.
+
+| Method | Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-----: | :-----: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+|Faster RCNN |[RegNetX-3.2GF-FPN](./faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py)| pytorch | 3x |5.0 ||42.2|-|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco_20200520_224253-bf85ae3e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco_20200520_224253.log.json) |
+|Mask RCNN |[RegNetX-3.2GF-FPN](./mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py)| pytorch | 3x |5.0 ||43.1|38.7|[model](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco_20200521_202221-99879813.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/regnet/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco_20200521_202221.log.json) |
+
+### Notice
+1. The models are trained with a different weight decay, i.e., `weight_decay=5e-5`, following the ImageNet training setting. This brings an improvement of at least 0.7 absolute AP but does not improve models that use a ResNet-50 backbone.
+2. RetinaNets using RegNets are trained with a learning rate of 0.02 and gradient clipping. We find that a learning rate of 0.02 improves the results by at least 0.7 absolute AP, and that gradient clipping is necessary to stabilize training.
+However, this does not improve the performance of the ResNet-50-FPN RetinaNet (see the sketch below for the corresponding config overrides).
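+
+A minimal sketch of the corresponding overrides is given below. The learning rate and weight decay match the RegNet configs in this directory; the gradient-clipping values are an assumption, reusing the common `max_norm=35, norm_type=2` setting found elsewhere in this repo.
+
+```python
+# Sketch of the training tweaks described above (not copied verbatim from a
+# specific config file).
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.00005)
+# Gradient-clipping values are assumed; adjust to the config you start from.
+optimizer_config = dict(_delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
+```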
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4fc61a3b523e0b29447e858d98d683a9df00921a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_1x_coco.py
@@ -0,0 +1,56 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained='open-mmlab://regnetx_3.2gf',
+ backbone=dict(
+ _delete_=True,
+ type='RegNet',
+ arch='regnetx_3.2gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[96, 192, 432, 1008],
+ out_channels=256,
+ num_outs=5))
+img_norm_cfg = dict(
+ # The mean and std are used in PyCls when training RegNets
+ mean=[103.53, 116.28, 123.675],
+ std=[57.375, 57.12, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.00005)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4645b694eb7b1d55361279d8fef965924f67b6aa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './faster_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..66e636ae5ceb9b6f012fc0e94207cb4c63fad8fc
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/faster_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py
@@ -0,0 +1,63 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained='open-mmlab://regnetx_3.2gf',
+ backbone=dict(
+ _delete_=True,
+ type='RegNet',
+ arch='regnetx_3.2gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[96, 192, 432, 1008],
+ out_channels=256,
+ num_outs=5))
+img_norm_cfg = dict(
+ # The mean and std are used in PyCls when training RegNets
+ mean=[103.53, 116.28, 123.675],
+ std=[57.375, 57.12, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.00005)
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-12GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-12GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..104d6d43bd958d49f75d54965b326ebac29ae330
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-12GF_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_12gf',
+ backbone=dict(
+ type='RegNet',
+ arch='regnetx_12gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[224, 448, 896, 2240],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..19168b54d9e22ddf7b48f753844b9983b68c47f1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py
@@ -0,0 +1,57 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained='open-mmlab://regnetx_3.2gf',
+ backbone=dict(
+ _delete_=True,
+ type='RegNet',
+ arch='regnetx_3.2gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[96, 192, 432, 1008],
+ out_channels=256,
+ num_outs=5))
+img_norm_cfg = dict(
+ # The mean and std are used in PyCls when training RegNets
+ mean=[103.53, 116.28, 123.675],
+ std=[57.375, 57.12, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ # Images are converted to float32 directly after loading in PyCls
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.00005)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..dd5153e6ef0ef16b8607279634ce6f1593bd3c1c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_mdconv_c3-c5_1x_coco.py
@@ -0,0 +1,6 @@
+_base_ = 'mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_3.2gf',
+ backbone=dict(
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..59255b43483d85d582748ebf31a6047a51bc9794
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-3.2GF_fpn_mstrain_3x_coco.py
@@ -0,0 +1,65 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained='open-mmlab://regnetx_3.2gf',
+ backbone=dict(
+ _delete_=True,
+ type='RegNet',
+ arch='regnetx_3.2gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[96, 192, 432, 1008],
+ out_channels=256,
+ num_outs=5))
+img_norm_cfg = dict(
+ # The mean and std are used in PyCls when training RegNets
+ mean=[103.53, 116.28, 123.675],
+ std=[57.375, 57.12, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.00005)
+lr_config = dict(step=[28, 34])
+total_epochs = 36
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-4GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-4GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8830ef08481bae863bd1401223f4cbd14210e87f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-4GF_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_4.0gf',
+ backbone=dict(
+ type='RegNet',
+ arch='regnetx_4.0gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[80, 240, 560, 1360],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-6.4GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-6.4GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7569ef3825737cfbf4c2680a655c1b197e0a8053
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-6.4GF_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_6.4gf',
+ backbone=dict(
+ type='RegNet',
+ arch='regnetx_6.4gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[168, 392, 784, 1624],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-8GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-8GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b5890264672f0996d98db422365746e85fcea8e6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/mask_rcnn_regnetx-8GF_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './mask_rcnn_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_8.0gf',
+ backbone=dict(
+ type='RegNet',
+ arch='regnetx_8.0gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[80, 240, 720, 1920],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-1.6GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-1.6GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4f2beb850ded95402d6b44c80553f224e15fb557
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-1.6GF_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './retinanet_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_1.6gf',
+ backbone=dict(
+ type='RegNet',
+ arch='regnetx_1.6gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[72, 168, 408, 912],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-3.2GF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-3.2GF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8f483a17ace5c101548f640b95cc94030f37a0b3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-3.2GF_fpn_1x_coco.py
@@ -0,0 +1,58 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained='open-mmlab://regnetx_3.2gf',
+ backbone=dict(
+ _delete_=True,
+ type='RegNet',
+ arch='regnetx_3.2gf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[96, 192, 432, 1008],
+ out_channels=256,
+ num_outs=5))
+img_norm_cfg = dict(
+ # The mean and std are used in PyCls when training RegNets
+ mean=[103.53, 116.28, 123.675],
+ std=[57.375, 57.12, 58.395],
+ to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.00005)
+optimizer_config = dict(
+ _delete_=True, grad_clip=dict(max_norm=35, norm_type=2))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-800MF_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-800MF_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..fe1d659f1a58ddb6e662d74a41c77005d2ee0638
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/regnet/retinanet_regnetx-800MF_fpn_1x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './retinanet_regnetx-3.2GF_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://regnetx_800mf',
+ backbone=dict(
+ type='RegNet',
+ arch='regnetx_800mf',
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[64, 128, 288, 672],
+ out_channels=256,
+ num_outs=5))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..2d0e7cdbbd75701e84711edc38a6de445bc08825
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/README.md
@@ -0,0 +1,52 @@
+# RepPoints: Point Set Representation for Object Detection
+
+By [Ze Yang](https://yangze.tech/), [Shaohui Liu](http://b1ueber2y.me/), and [Han Hu](https://ancientmooner.github.io/).
+
+We provide code support and configuration files to reproduce the results in the paper for
+["RepPoints: Point Set Representation for Object Detection"](https://arxiv.org/abs/1904.11490) on COCO object detection.
+
+## Introduction
+
+**RepPoints**, initially described in [arXiv](https://arxiv.org/abs/1904.11490), is a new representation method for visual objects, on which visual understanding tasks are typically centered. Visual object representation, aiming at both geometric description and appearance feature extraction, is conventionally achieved by `bounding box + RoIPool (RoIAlign)`. The bounding box representation is convenient to use; however, it provides only a rectangular localization of objects that lacks geometric precision and may consequently degrade feature quality. Our new representation, RepPoints, models an object by a `point set` instead of a `bounding box`; the points learn to adaptively position themselves over the object in a manner that circumscribes its `spatial extent` and enables `semantically aligned feature extraction`. This richer and more flexible representation maintains the convenience of bounding boxes while facilitating various visual understanding applications. This repo demonstrates the effectiveness of RepPoints for COCO object detection.
+
+Another feature of this repo is the demonstration of an `anchor-free detector`, which can be as effective as state-of-the-art anchor-based detection methods. The anchor-free detector can utilize either `bounding box` or `RepPoints` as the basic object representation.
+
+
+*Figure: Learning RepPoints in Object Detection.*
+
+## Citing RepPoints
+
+```
+@inproceedings{yang2019reppoints,
+ title={RepPoints: Point Set Representation for Object Detection},
+ author={Yang, Ze and Liu, Shaohui and Hu, Han and Wang, Liwei and Lin, Stephen},
+ booktitle={The IEEE International Conference on Computer Vision (ICCV)},
+ month={Oct},
+ year={2019}
+}
+```
+
+## Results and models
+
+The results on COCO 2017val are shown in the table below.
+
+| Method | Backbone | GN | Anchor | convert func | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+|:---------:|:-------------:|:---:|:------:|:------------:|:-------:|:--------:|:--------------:|:------:|:--------:|
+| BBox | R-50-FPN | Y | single | - | 1x | 3.9 | 15.9 | 36.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/bbox_r50_grid_fpn_gn-neck%2Bhead_1x_coco/bbox_r50_grid_fpn_gn-neck%2Bhead_1x_coco_20200329-c98bfa96.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/bbox_r50_grid_fpn_gn-neck%2Bhead_1x_coco/bbox_r50_grid_fpn_gn-neck%2Bhead_1x_coco_20200329_145916.log.json) |
+| BBox | R-50-FPN | Y | none | - | 1x | 3.9 | 15.4 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/bbox_r50_grid_center_fpn_gn-neck%2Bhead_1x_coco/bbox_r50_grid_center_fpn_gn-neck%2Bhead_1x_coco_20200330-00f73d58.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/bbox_r50_grid_center_fpn_gn-neck%2Bhead_1x_coco/bbox_r50_grid_center_fpn_gn-neck%2Bhead_1x_coco_20200330_233609.log.json) |
+| RepPoints | R-50-FPN | N | none | moment | 1x | 3.3 | 18.5 | 37.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r50_fpn_1x_coco/reppoints_moment_r50_fpn_1x_coco_20200330-b73db8d1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r50_fpn_1x_coco/reppoints_moment_r50_fpn_1x_coco_20200330_233609.log.json) |
+| RepPoints | R-50-FPN | Y | none | moment | 1x | 3.9 | 17.5 | 38.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r50_fpn_gn-neck%2Bhead_1x_coco/reppoints_moment_r50_fpn_gn-neck%2Bhead_1x_coco_20200329-4b38409a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r50_fpn_gn-neck%2Bhead_1x_coco/reppoints_moment_r50_fpn_gn-neck%2Bhead_1x_coco_20200329_145952.log.json) |
+| RepPoints | R-50-FPN | Y | none | moment | 2x | 3.9 | - | 38.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r50_fpn_gn-neck%2Bhead_2x_coco/reppoints_moment_r50_fpn_gn-neck%2Bhead_2x_coco_20200329-91babaa2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r50_fpn_gn-neck%2Bhead_2x_coco/reppoints_moment_r50_fpn_gn-neck%2Bhead_2x_coco_20200329_150020.log.json) |
+| RepPoints | R-101-FPN | Y | none | moment | 2x | 5.8 | 13.7 | 40.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r101_fpn_gn-neck%2Bhead_2x_coco/reppoints_moment_r101_fpn_gn-neck%2Bhead_2x_coco_20200329-4fbc7310.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r101_fpn_gn-neck%2Bhead_2x_coco/reppoints_moment_r101_fpn_gn-neck%2Bhead_2x_coco_20200329_132205.log.json) |
+| RepPoints | R-101-FPN-DCN | Y | none | moment | 2x | 5.9 | 12.1 | 42.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco_20200329-3309fbf2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco_20200329_132134.log.json) |
+| RepPoints | X-101-FPN-DCN | Y | none | moment | 2x | 7.1 | 9.3 | 44.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco_20200329-f87da1ea.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/reppoints/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck%2Bhead_2x_coco_20200329_132201.log.json) |
+
+**Notes:**
+
+- `R-xx`, `X-xx` denote the ResNet and ResNeXt architectures, respectively.
+- `DCN` denotes replacing 3x3 conv with the 3x3 deformable convolution in `c3-c5` stages of backbone.
+- `none` in the `anchor` column means 2-d `center point` (x,y) is used to represent the initial object hypothesis. `single` denotes one 4-d anchor box (x,y,w,h) with IoU based label assign criterion is adopted.
+- `moment`, `partial MinMax`, `MinMax` in the `convert func` column are three functions to convert a point set to a pseudo box.
+- Note the results here are slightly different from those reported in the paper, due to framework change. While the original paper uses an [MXNet](https://mxnet.apache.org/) implementation, we re-implement the method in [PyTorch](https://pytorch.org/) based on mmdetection.
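+
+The two most common `convert func` choices can be sketched in a few lines. This is an illustration only, not the code used in `RepPointsHead`: the real head learns the moment-based scaling and partially detaches point gradients (see `gradient_mul` in the configs), both of which are replaced by fixed placeholders here.
+
+```python
+import torch
+
+def points_to_bbox_minmax(points):
+    """points: (N, num_points, 2) (x, y) coordinates -> (N, 4) boxes (x1, y1, x2, y2)."""
+    x_min, _ = points[..., 0].min(dim=1)
+    y_min, _ = points[..., 1].min(dim=1)
+    x_max, _ = points[..., 0].max(dim=1)
+    y_max, _ = points[..., 1].max(dim=1)
+    return torch.stack([x_min, y_min, x_max, y_max], dim=1)
+
+def points_to_bbox_moment(points, moment_transfer=None):
+    """Box center = mean of the points, half size = their std scaled by exp(moment_transfer).
+    `moment_transfer` is a zero placeholder here; it is learned in the actual head."""
+    center = points.mean(dim=1)          # (N, 2)
+    std = points.std(dim=1)              # (N, 2)
+    if moment_transfer is None:
+        moment_transfer = torch.zeros(2)
+    half_wh = std * torch.exp(moment_transfer)
+    return torch.cat([center - half_wh, center + half_wh], dim=1)
+
+pts = torch.rand(3, 9, 2) * 100  # 3 objects, 9 points each (num_points=9 as in the configs)
+print(points_to_bbox_minmax(pts).shape)  # torch.Size([3, 4])
+print(points_to_bbox_moment(pts).shape)  # torch.Size([3, 4])
+```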
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/bbox_r50_grid_center_fpn_gn-neck+head_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/bbox_r50_grid_center_fpn_gn-neck+head_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b24c8db768423de12d1e8582bb26dd71218f52ee
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/bbox_r50_grid_center_fpn_gn-neck+head_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py'
+model = dict(bbox_head=dict(transform_method='minmax', use_grid_points=True))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/bbox_r50_grid_fpn_gn-neck+head_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/bbox_r50_grid_fpn_gn-neck+head_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f225a32080c749c2908360a998e383323fbd317c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/bbox_r50_grid_fpn_gn-neck+head_1x_coco.py
@@ -0,0 +1,12 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py'
+model = dict(bbox_head=dict(transform_method='minmax', use_grid_points=True))
+# training and testing settings
+train_cfg = dict(
+ init=dict(
+ assigner=dict(
+ _delete_=True,
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_minmax_r50_fpn_gn-neck+head_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_minmax_r50_fpn_gn-neck+head_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0f56a46b3c002cdec630bb06df66a4fc9e7804a8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_minmax_r50_fpn_gn-neck+head_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py'
+model = dict(bbox_head=dict(transform_method='minmax'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck+head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck+head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..241754cfb45ed998e7c2e3bb8e662a49fa341e89
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r101_fpn_dconv_c3-c5_gn-neck+head_2x_coco.py
@@ -0,0 +1,7 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_2x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(
+ depth=101,
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r101_fpn_gn-neck+head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r101_fpn_gn-neck+head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..19efa0dd756993c9f51a3b9589e558beb2eb5f83
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r101_fpn_gn-neck+head_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6d1c89b208217f71add73b76c7e2daeb67b23979
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_1x_coco.py
@@ -0,0 +1,67 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ type='RepPointsDetector',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_input',
+ num_outs=5),
+ bbox_head=dict(
+ type='RepPointsHead',
+ num_classes=80,
+ in_channels=256,
+ feat_channels=256,
+ point_feat_channels=256,
+ stacked_convs=3,
+ num_points=9,
+ gradient_mul=0.1,
+ point_strides=[8, 16, 32, 64, 128],
+ point_base_scale=4,
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_init=dict(type='SmoothL1Loss', beta=0.11, loss_weight=0.5),
+ loss_bbox_refine=dict(type='SmoothL1Loss', beta=0.11, loss_weight=1.0),
+ transform_method='moment'))
+# training and testing settings
+train_cfg = dict(
+ init=dict(
+ assigner=dict(type='PointAssigner', scale=4, pos_num=1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False),
+ refine=dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False))
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.5),
+ max_per_img=100)
+optimizer = dict(lr=0.01)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..337f167c820979f345eef120a936195d8f5975c2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './reppoints_moment_r50_fpn_1x_coco.py'
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(neck=dict(norm_cfg=norm_cfg), bbox_head=dict(norm_cfg=norm_cfg))
+optimizer = dict(lr=0.01)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_gn-neck+head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_gn-neck+head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b9c712d998092bdd7bf7c2d03dac22c58f253c08
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_r50_fpn_gn-neck+head_2x_coco.py
@@ -0,0 +1,3 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py'
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck+head_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck+head_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c33019da0ccbc3b37bd58bfa4e6f2cfca68cbd48
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_moment_x101_fpn_dconv_c3-c5_gn-neck+head_2x_coco.py
@@ -0,0 +1,15 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch',
+ dcn=dict(type='DCN', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_partial_minmax_r50_fpn_gn-neck+head_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_partial_minmax_r50_fpn_gn-neck+head_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9a63bd0862be6d5f363c5d481bade3e8e2e8433a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/reppoints/reppoints_partial_minmax_r50_fpn_gn-neck+head_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './reppoints_moment_r50_fpn_gn-neck+head_1x_coco.py'
+model = dict(bbox_head=dict(transform_method='partial_minmax'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/res2net/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..a3158b6c6f5bff8d70f2e82bb0c2d57656a7135c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/README.md
@@ -0,0 +1,52 @@
+# Res2Net for object detection and instance segmentation
+
+## Introduction
+
+We propose a novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within one single residual block. The Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer.
+
+| Backbone |Params. | GFLOPs | top-1 err. | top-5 err. |
+| :-------------: |:----: | :-----: | :--------: | :--------: |
+| ResNet-101 |44.6 M | 7.8 | 22.63 | 6.44 |
+| ResNeXt-101-64x4d |83.5M | 15.5 | 20.40 | - |
+| HRNetV2p-W48 | 77.5M | 16.1 | 20.70 | 5.50 |
+| Res2Net-101 | 45.2M | 8.3 | 18.77 | 4.64 |
+
+Compared with other backbone networks, Res2Net requires fewer parameters and FLOPs.
+
+**Note:**
+- GFLOPs for classification are calculated with image size (224x224).
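+
+The core idea can be sketched as a drop-in module: the single 3x3 conv inside a bottleneck is replaced by smaller 3x3 convs over channel groups, where each group also receives the output of the previous group. The snippet below is a simplified illustration (no 1x1 convs, no BN/ReLU, no downsampling variant) and is not the backbone implementation used by these configs.
+
+```python
+import torch
+import torch.nn as nn
+
+class Res2NetSplitConv(nn.Module):
+    """Hierarchical residual-like 3x3 convs inside one bottleneck (illustrative)."""
+    def __init__(self, channels=64, scales=4):
+        super().__init__()
+        assert channels % scales == 0
+        self.scales = scales
+        width = channels // scales
+        # one 3x3 conv per group; the first group is passed through unchanged
+        self.convs = nn.ModuleList(
+            nn.Conv2d(width, width, 3, padding=1) for _ in range(scales - 1))
+
+    def forward(self, x):
+        splits = torch.chunk(x, self.scales, dim=1)
+        outs, prev = [splits[0]], None
+        for i, conv in enumerate(self.convs):
+            inp = splits[i + 1] if prev is None else splits[i + 1] + prev
+            prev = conv(inp)
+            outs.append(prev)
+        return torch.cat(outs, dim=1)
+
+x = torch.randn(1, 64, 56, 56)
+print(Res2NetSplitConv()(x).shape)  # torch.Size([1, 64, 56, 56])
+```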
+
+```
+@article{gao2019res2net,
+ title={Res2Net: A New Multi-scale Backbone Architecture},
+ author={Gao, Shang-Hua and Cheng, Ming-Ming and Zhao, Kai and Zhang, Xin-Yu and Yang, Ming-Hsuan and Torr, Philip},
+ journal={IEEE TPAMI},
+ year={2020},
+ doi={10.1109/TPAMI.2019.2938758},
+}
+```
+## Results and Models
+### Faster R-CNN
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+|R2-101-FPN | pytorch | 2x | 7.4 | - | 43.0 |[model](http://download.openmmlab.com/mmdetection/v2.0/res2net/faster_rcnn_r2_101_fpn_2x_coco/faster_rcnn_r2_101_fpn_2x_coco-175f1da6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/res2net/faster_rcnn_r2_101_fpn_2x_coco/faster_rcnn_r2_101_fpn_2x_coco_20200514_231734.log.json) |
+### Mask R-CNN
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+|R2-101-FPN | pytorch | 2x | 7.9 | - | 43.6 | 38.7 |[model](http://download.openmmlab.com/mmdetection/v2.0/res2net/mask_rcnn_r2_101_fpn_2x_coco/mask_rcnn_r2_101_fpn_2x_coco-17f061e8.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/res2net/mask_rcnn_r2_101_fpn_2x_coco/mask_rcnn_r2_101_fpn_2x_coco_20200515_002413.log.json) |
+### Cascade R-CNN
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :------: |
+|R2-101-FPN | pytorch | 20e | 7.8 | - | 45.7 |[model](http://download.openmmlab.com/mmdetection/v2.0/res2net/cascade_rcnn_r2_101_fpn_20e_coco/cascade_rcnn_r2_101_fpn_20e_coco-f4b7b7db.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/res2net/cascade_rcnn_r2_101_fpn_20e_coco/cascade_rcnn_r2_101_fpn_20e_coco_20200515_091644.log.json) |
+### Cascade Mask R-CNN
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+|R2-101-FPN | pytorch | 20e | 9.5 | - | 46.4 | 40.0 |[model](http://download.openmmlab.com/mmdetection/v2.0/res2net/cascade_mask_rcnn_r2_101_fpn_20e_coco/cascade_mask_rcnn_r2_101_fpn_20e_coco-8a7b41e1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/res2net/cascade_mask_rcnn_r2_101_fpn_20e_coco/cascade_mask_rcnn_r2_101_fpn_20e_coco_20200515_091645.log.json) |
+### Hybrid Task Cascade (HTC)
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-----: | :------: |
+| R2-101-FPN | pytorch | 20e | - | - | 47.5 | 41.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/res2net/htc_r2_101_fpn_20e_coco/htc_r2_101_fpn_20e_coco-3a8d2112.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/res2net/htc_r2_101_fpn_20e_coco/htc_r2_101_fpn_20e_coco_20200515_150029.log.json) |
+
+
+- Res2Net ImageNet-pretrained models are available at [Res2Net-PretrainedModels](https://github.com/Res2Net/Res2Net-PretrainedModels).
+- More applications of Res2Net can be found at [Res2Net-Github](https://github.com/Res2Net/).
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/res2net/cascade_mask_rcnn_r2_101_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/cascade_mask_rcnn_r2_101_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..50df4e2db500d575eaddd7538b49cc808e30b50e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/cascade_mask_rcnn_r2_101_fpn_20e_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../cascade_rcnn/cascade_mask_rcnn_r50_fpn_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(type='Res2Net', depth=101, scales=4, base_width=26))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/res2net/cascade_rcnn_r2_101_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/cascade_rcnn_r2_101_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1cac759ab66323cf034f21a9afff770f79c10035
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/cascade_rcnn_r2_101_fpn_20e_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../cascade_rcnn/cascade_rcnn_r50_fpn_20e_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(type='Res2Net', depth=101, scales=4, base_width=26))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/res2net/faster_rcnn_r2_101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/faster_rcnn_r2_101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..85004e02c31edeb487f765835815c6f80c18fb6f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/faster_rcnn_r2_101_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(type='Res2Net', depth=101, scales=4, base_width=26))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/res2net/htc_r2_101_fpn_20e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/htc_r2_101_fpn_20e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8e7647a6a148615a6b72e6b7a11a8d7be0742b77
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/htc_r2_101_fpn_20e_coco.py
@@ -0,0 +1,7 @@
+_base_ = '../htc/htc_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(type='Res2Net', depth=101, scales=4, base_width=26))
+# learning policy
+lr_config = dict(step=[16, 19])
+total_epochs = 20
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/res2net/mask_rcnn_r2_101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/mask_rcnn_r2_101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a620188807218a9c80ad89ac6002dda3ea4b830c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/res2net/mask_rcnn_r2_101_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = '../mask_rcnn/mask_rcnn_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(type='Res2Net', depth=101, scales=4, base_width=26))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..fbc2ee31d10275f407592048e7a1fdc995550578
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/README.md
@@ -0,0 +1,26 @@
+# Focal Loss for Dense Object Detection
+
+## Introduction
+```
+@inproceedings{lin2017focal,
+ title={Focal loss for dense object detection},
+ author={Lin, Tsung-Yi and Goyal, Priya and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr},
+ booktitle={Proceedings of the IEEE international conference on computer vision},
+ year={2017}
+}
+```
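+
+For reference, the focal loss that gives the paper its name down-weights well-classified examples so that training focuses on hard ones. The sketch below is a minimal per-element version using the standard `gamma=2.0` / `alpha=0.25` settings; it is not mmdetection's `FocalLoss` module, which adds per-sample weighting and reduction options.
+
+```python
+import torch
+import torch.nn.functional as F
+
+def sigmoid_focal_loss(logits, targets, gamma=2.0, alpha=0.25):
+    """FL = -alpha_t * (1 - p_t)^gamma * log(p_t); targets are 0/1, same shape as logits."""
+    p = torch.sigmoid(logits)
+    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
+    p_t = p * targets + (1 - p) * (1 - targets)
+    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
+    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
+
+logits = torch.randn(8, 80)   # 8 anchors, 80 classes
+targets = torch.zeros(8, 80)
+targets[0, 3] = 1.0           # one positive anchor/class pair
+print(sigmoid_focal_loss(logits, targets))
+```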
+
+## Results and models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-------: |
+| R-50-FPN | caffe | 1x | 3.5 | 18.6 | 36.3 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_caffe_fpn_1x_coco/retinanet_r50_caffe_fpn_1x_coco_20200531-f11027c5.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_caffe_fpn_1x_coco/retinanet_r50_caffe_fpn_1x_coco_20200531_012518.log.json) |
+| R-50-FPN | pytorch | 1x | 3.8 | 19.0 | 36.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_1x_coco/retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_1x_coco/retinanet_r50_fpn_1x_coco_20200130_002941.log.json) |
+| R-50-FPN | pytorch | 2x | - | - | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_2x_coco/retinanet_r50_fpn_2x_coco_20200131-fdb43119.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r50_fpn_2x_coco/retinanet_r50_fpn_2x_coco_20200131_114738.log.json) |
+| R-101-FPN | caffe | 1x | 5.5 | 14.7 | 38.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_caffe_fpn_1x_coco/retinanet_r101_caffe_fpn_1x_coco_20200531-b428fa0f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_caffe_fpn_1x_coco/retinanet_r101_caffe_fpn_1x_coco_20200531_012536.log.json) |
+| R-101-FPN | pytorch | 1x | 5.7 | 15.0 | 38.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_fpn_1x_coco/retinanet_r101_fpn_1x_coco_20200130-7a93545f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_fpn_1x_coco/retinanet_r101_fpn_1x_coco_20200130_003055.log.json) |
+| R-101-FPN | pytorch | 2x | - | - | 38.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_fpn_2x_coco/retinanet_r101_fpn_2x_coco_20200131-5560aee8.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_r101_fpn_2x_coco/retinanet_r101_fpn_2x_coco_20200131_114859.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 7.0 | 12.1 | 39.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_32x4d_fpn_1x_coco/retinanet_x101_32x4d_fpn_1x_coco_20200130-5c8b7ec4.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_32x4d_fpn_1x_coco/retinanet_x101_32x4d_fpn_1x_coco_20200130_003004.log.json) |
+| X-101-32x4d-FPN | pytorch | 2x | - | - | 40.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_32x4d_fpn_2x_coco/retinanet_x101_32x4d_fpn_2x_coco_20200131-237fc5e1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_32x4d_fpn_2x_coco/retinanet_x101_32x4d_fpn_2x_coco_20200131_114812.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 10.0 | 8.7 | 41.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_64x4d_fpn_1x_coco/retinanet_x101_64x4d_fpn_1x_coco_20200130-366f5af1.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_64x4d_fpn_1x_coco/retinanet_x101_64x4d_fpn_1x_coco_20200130_003008.log.json) |
+| X-101-64x4d-FPN | pytorch | 2x | - | - | 40.8 | [model](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_64x4d_fpn_2x_coco/retinanet_x101_64x4d_fpn_2x_coco_20200131-bca068ab.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/retinanet/retinanet_x101_64x4d_fpn_2x_coco/retinanet_x101_64x4d_fpn_2x_coco_20200131_114833.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..21d227b044728a30890b93fc769743d2124956c1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './retinanet_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1e6f46340d551abaa22ff2176bec22824188d6cb
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './retinanet_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..c12088a266d7ccad31bd2233ee5a9ee90f4c2b14
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r101_fpn_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './retinanet_r50_fpn_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..028c1a3ad48f49ee22e0ee70d07555d58f3c73d1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,37 @@
+_base_ = './retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f2a0decf8fb46f0dde87e8e5f9d1608ce8ffe576
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_1x_coco.py
@@ -0,0 +1,42 @@
+_base_ = './retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
+ (1333, 768), (1333, 800)],
+ multiscale_mode='value',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..a42c4925e10ef2fa591893aa2e05de3c47f18ab4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './retinanet_r50_caffe_fpn_mstrain_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 23])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_3x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_3x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2fb73e51ef02ca582b125387278ee50406d4ea1c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_caffe_fpn_mstrain_3x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './retinanet_r50_caffe_fpn_mstrain_1x_coco.py'
+# learning policy
+lr_config = dict(step=[28, 34])
+total_epochs = 36
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..04bd696b9589e37ad34c9fdd035b97e271d3b214
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_fpn_1x_coco.py
@@ -0,0 +1,7 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..1c61d36404e712efdce5cbdb06cec6d0a3e1225a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_r50_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './retinanet_r50_fpn_1x_coco.py'
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9927f8f07510b2bc6d1c92f397bc2075e38c104c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_32x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_32x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..cd78b6df320aea7b23412b2f734e8684f84b9822
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_32x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './retinanet_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..cc40f26020731817dd3c3ff702427280760e67d1
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './retinanet_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_64x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_64x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..eac05a64a22f28d597eb4c8b1c31351b52829056
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/retinanet/retinanet_x101_64x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './retinanet_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..ab3fd6bea2614585f17d7a3d2c443ca3b260fbae
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/README.md
@@ -0,0 +1,26 @@
+# Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
+
+## Introduction
+```
+@inproceedings{ren2015faster,
+ title={Faster r-cnn: Towards real-time object detection with region proposal networks},
+ author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
+ booktitle={Advances in neural information processing systems},
+ year={2015}
+}
+```
+
+## Results and models
+
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | AR1000 | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :--------: |
+| R-50-FPN | caffe | 1x | 3.5 | 22.6 | 58.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_caffe_fpn_1x_coco/rpn_r50_caffe_fpn_1x_coco_20200531-5b903a37.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_caffe_fpn_1x_coco/rpn_r50_caffe_fpn_1x_coco_20200531_012334.log.json) |
+| R-50-FPN | pytorch | 1x | 3.8 | 22.3 | 58.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_fpn_1x_coco/rpn_r50_fpn_1x_coco_20200218-5525fa2e.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_fpn_1x_coco/rpn_r50_fpn_1x_coco_20200218_151240.log.json) |
+| R-50-FPN | pytorch | 2x | - | - | 58.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_fpn_2x_coco/rpn_r50_fpn_2x_coco_20200131-0728c9b3.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r50_fpn_2x_coco/rpn_r50_fpn_2x_coco_20200131_190631.log.json) |
+| R-101-FPN | caffe | 1x | 5.4 | 17.3 | 60.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r101_caffe_fpn_1x_coco/rpn_r101_caffe_fpn_1x_coco_20200531-0629a2e2.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r101_caffe_fpn_1x_coco/rpn_r101_caffe_fpn_1x_coco_20200531_012345.log.json) |
+| R-101-FPN | pytorch | 1x | 5.8 | 16.5 | 59.7 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r101_fpn_1x_coco/rpn_r101_fpn_1x_coco_20200131-2ace2249.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r101_fpn_1x_coco/rpn_r101_fpn_1x_coco_20200131_191000.log.json) |
+| R-101-FPN | pytorch | 2x | - | - | 60.2 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r101_fpn_2x_coco/rpn_r101_fpn_2x_coco_20200131-24e3db1a.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_r101_fpn_2x_coco/rpn_r101_fpn_2x_coco_20200131_191106.log.json) |
+| X-101-32x4d-FPN | pytorch | 1x | 7.0 | 13.0 | 60.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_32x4d_fpn_1x_coco/rpn_x101_32x4d_fpn_1x_coco_20200219-b02646c6.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_32x4d_fpn_1x_coco/rpn_x101_32x4d_fpn_1x_coco_20200219_012037.log.json) |
+| X-101-32x4d-FPN | pytorch | 2x | - | - | 61.1 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_32x4d_fpn_2x_coco/rpn_x101_32x4d_fpn_2x_coco_20200208-d22bd0bb.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_32x4d_fpn_2x_coco/rpn_x101_32x4d_fpn_2x_coco_20200208_200752.log.json) |
+| X-101-64x4d-FPN | pytorch | 1x | 10.1 | 9.1 | 61.0 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_64x4d_fpn_1x_coco/rpn_x101_64x4d_fpn_1x_coco_20200208-cde6f7dd.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_64x4d_fpn_1x_coco/rpn_x101_64x4d_fpn_1x_coco_20200208_200752.log.json) |
+| X-101-64x4d-FPN | pytorch | 2x | - | - | 61.5 | [model](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_64x4d_fpn_2x_coco/rpn_x101_64x4d_fpn_2x_coco_20200208-c65f524f.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/rpn/rpn_x101_64x4d_fpn_2x_coco/rpn_x101_64x4d_fpn_2x_coco_20200208_200752.log.json) |
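+
+The `AR1000` column reports average recall with at most 1000 proposals per image (the `proposal_fast` evaluation metric used in these configs). As a rough illustration of what proposal recall means, the sketch below computes, for a single image and a single IoU threshold, the fraction of ground-truth boxes covered by at least one proposal; the reported AR additionally averages over a range of IoU thresholds and over the whole dataset. The box values and names here are illustrative only.
+
+```python
+import numpy as np
+
+def recall_at_iou(proposals, gt_boxes, iou_thr=0.5):
+    """proposals: (P, 4), gt_boxes: (G, 4), boxes as (x1, y1, x2, y2)."""
+    if len(gt_boxes) == 0:
+        return 1.0
+    x1 = np.maximum(proposals[:, None, 0], gt_boxes[None, :, 0])
+    y1 = np.maximum(proposals[:, None, 1], gt_boxes[None, :, 1])
+    x2 = np.minimum(proposals[:, None, 2], gt_boxes[None, :, 2])
+    y2 = np.minimum(proposals[:, None, 3], gt_boxes[None, :, 3])
+    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
+    area_p = (proposals[:, 2] - proposals[:, 0]) * (proposals[:, 3] - proposals[:, 1])
+    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
+    iou = inter / (area_p[:, None] + area_g[None, :] - inter)
+    # a GT box counts as recalled if some proposal overlaps it above the threshold
+    return float((iou.max(axis=0) >= iou_thr).mean())
+
+proposals = np.array([[0, 0, 50, 50], [30, 30, 100, 100]], dtype=float)
+gt_boxes = np.array([[5, 5, 45, 45], [200, 200, 250, 250]], dtype=float)
+print(recall_at_iou(proposals, gt_boxes))  # 0.5 -> one of the two GT boxes is recalled
+```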
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..e616fdf46ef82fb1de0519541d20156e789f03ec
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_caffe_fpn_1x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './rpn_r50_caffe_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet101_caffe',
+ backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b2af6119319c03a8e213b2c352fc48e66bc8a822
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './rpn_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6908d3001d89ee3efe2b1e508759fbda94b7bf7a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r101_fpn_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './rpn_r50_fpn_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_caffe_c4_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_caffe_c4_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6da0ee94906fd8febaf69786976e478ef8f35c9e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_caffe_c4_1x_coco.py
@@ -0,0 +1,38 @@
+_base_ = [
+ '../_base_/models/rpn_r50_caffe_c4.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# dataset settings
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_label=False),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+evaluation = dict(interval=1, metric='proposal_fast')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_caffe_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_caffe_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..398f3c14db1d63343b08bd5280d69aaae6c70a99
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_caffe_fpn_1x_coco.py
@@ -0,0 +1,37 @@
+_base_ = './rpn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://detectron2/resnet50_caffe',
+ backbone=dict(
+ norm_cfg=dict(requires_grad=False), norm_eval=True, style='caffe'))
+# use caffe img_norm
+img_norm_cfg = dict(
+ mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_label=False),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..26f95a3402f9fd2d54c5919484e2f4958beb8a34
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_fpn_1x_coco.py
@@ -0,0 +1,18 @@
+_base_ = [
+ '../_base_/models/rpn_r50_fpn.py', '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True, with_label=False),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+evaluation = dict(interval=1, metric='proposal_fast')
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..3a92d8d3f65776c1fe72c9909c36fca428267afd
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_r50_fpn_2x_coco.py
@@ -0,0 +1,5 @@
+_base_ = './rpn_r50_fpn_1x_coco.py'
+
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_32x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_32x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..83bd70032cb24be6b96f988522ef84f7b4cc0e6a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_32x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './rpn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_32x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_32x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..979afb97073a92e228ed302dab161d8f9bbade32
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_32x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './rpn_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_64x4d_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_64x4d_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..bb7f0a630b9f2e9263183e003c288a33eb972e71
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_64x4d_fpn_1x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './rpn_r50_fpn_1x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_64x4d_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_64x4d_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8c766f05f4ee61273670ce74ed60c91c89beb50e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/rpn/rpn_x101_64x4d_fpn_2x_coco.py
@@ -0,0 +1,13 @@
+_base_ = './rpn_r50_fpn_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..b8f9738a6a20598d846421a50f93922e40022ceb
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/README.md
@@ -0,0 +1,36 @@
+# Side-Aware Boundary Localization for More Precise Object Detection
+
+## Introduction
+
+We provide config files to reproduce the object detection results of the ECCV 2020 Spotlight paper [Side-Aware Boundary Localization for More Precise Object Detection](https://arxiv.org/abs/1912.04260).
+
+```
+@inproceedings{Wang_2020_ECCV,
+ title = {Side-Aware Boundary Localization for More Precise Object Detection},
+ author = {Wang, Jiaqi and Zhang, Wenwei and Cao, Yuhang and Chen, Kai and Pang, Jiangmiao and Gong, Tao and Shi, Jianping and Loy, Chen Change and Lin, Dahua},
+ booktitle = {ECCV},
+ year = {2020}
+}
+```
+
+## Results and Models
+
+The results on COCO 2017 val are shown in the table below (results on test-dev are usually slightly higher than on val).
+Single-scale testing (1333x800) is adopted in all results. A minimal inference example using one of these configs is given after the tables.
+
+
+| Method | Backbone | Lr schd | ms-train | box AP | Download |
+| :----------------: | :-------: | :-----: | :------: | :----: | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| SABL Faster R-CNN | R-50-FPN | 1x | N | 39.9 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r50_fpn_1x_coco/sabl_faster_rcnn_r50_fpn_1x_coco-e867595b.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r50_fpn_1x_coco/20200830_130324.log.json) |
+| SABL Faster R-CNN | R-101-FPN | 1x | N | 41.7 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r101_fpn_1x_coco/sabl_faster_rcnn_r101_fpn_1x_coco-f804c6c1.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_faster_rcnn_r101_fpn_1x_coco/20200830_183949.log.json) |
+| SABL Cascade R-CNN | R-50-FPN | 1x | N | 41.6 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco/sabl_cascade_rcnn_r50_fpn_1x_coco-e1748e5e.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco/20200831_033726.log.json) |
+| SABL Cascade R-CNN | R-101-FPN | 1x | N | 43.0 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco/sabl_cascade_rcnn_r101_fpn_1x_coco-2b83e87c.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco/20200831_141745.log.json) |
+
+| Method | Backbone | GN | Lr schd | ms-train | box AP | Download |
+| :------------: | :-------: | :---: | :-----: | :---------: | :----: | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
+| SABL RetinaNet | R-50-FPN | N | 1x | N | 37.7 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_1x_coco/sabl_retinanet_r50_fpn_1x_coco-6c54fd4f.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_1x_coco/20200830_053451.log.json) |
+| SABL RetinaNet | R-50-FPN | Y | 1x | N | 38.8 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_gn_1x_coco/sabl_retinanet_r50_fpn_gn_1x_coco-e16dfcf1.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_gn_1x_coco/20200831_141955.log.json) |
+| SABL RetinaNet | R-101-FPN | N | 1x | N | 39.7 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_1x_coco/sabl_retinanet_r101_fpn_1x_coco-42026904.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_1x_coco/20200831_034256.log.json) |
+| SABL RetinaNet | R-101-FPN | Y | 1x | N | 40.5 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_1x_coco/sabl_retinanet_r101_fpn_gn_1x_coco-40a893e8.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_1x_coco/20200830_201422.log.json) |
+| SABL RetinaNet | R-101-FPN | Y | 2x | Y (640~800) | 42.9 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco-1e63382c.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco/20200830_144807.log.json) |
+| SABL RetinaNet | R-101-FPN | Y | 2x | Y (480~960) | 43.6 | [model](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco-5342f857.pth) | [log](https://open-mmlab.s3.ap-northeast-2.amazonaws.com/mmdetection/v2.0/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco/20200830_164537.log.json) |
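+
+As a minimal usage sketch for the checkpoints listed above (this assumes a standard mmdet 2.x environment with a CUDA device, that the checkpoint file named in the table has been downloaded locally, and that `demo/demo.jpg` is replaced by any local test image):
+
+```python
+from mmdet.apis import inference_detector, init_detector
+
+# Build SABL Faster R-CNN (R-50-FPN, 1x) from its config and a downloaded checkpoint.
+config_file = 'configs/sabl/sabl_faster_rcnn_r50_fpn_1x_coco.py'
+checkpoint_file = 'sabl_faster_rcnn_r50_fpn_1x_coco-e867595b.pth'
+model = init_detector(config_file, checkpoint_file, device='cuda:0')
+
+# Single-image inference; the result is a list of per-class bbox arrays.
+result = inference_detector(model, 'demo/demo.jpg')
+```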
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..0322006464e158a238525e91449cc81a6143375c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_cascade_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,88 @@
+_base_ = [
+ '../_base_/models/cascade_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ roi_head=dict(bbox_head=[
+ dict(
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1,
+ loss_weight=1.0)),
+ dict(
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.5),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1,
+ loss_weight=1.0)),
+ dict(
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.3),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, loss_weight=1.0))
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4b28a59280e6701d31afeeaae7ae12cdbd4fb95e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_cascade_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,86 @@
+_base_ = [
+ '../_base_/models/cascade_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ roi_head=dict(bbox_head=[
+ dict(
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1,
+ loss_weight=1.0)),
+ dict(
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.5),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1,
+ loss_weight=1.0)),
+ dict(
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.3),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1, loss_weight=1.0))
+ ]))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_faster_rcnn_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_faster_rcnn_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4c797cad1c693ba3578fd6852f8d055d3e7406fe
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_faster_rcnn_r101_fpn_1x_coco.py
@@ -0,0 +1,36 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ roi_head=dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1,
+ loss_weight=1.0))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_faster_rcnn_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_faster_rcnn_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..732c7ba3f607e2ac68f16acceddd16b1269aa2cf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_faster_rcnn_r50_fpn_1x_coco.py
@@ -0,0 +1,34 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+model = dict(
+ roi_head=dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLHead',
+ num_classes=80,
+ cls_in_channels=256,
+ reg_in_channels=256,
+ roi_feat_size=7,
+ reg_feat_up_ratio=2,
+ reg_pre_kernel=3,
+ reg_post_kernel=3,
+ reg_pre_num=2,
+ reg_post_num=1,
+ cls_out_channels=1024,
+ reg_offset_out_channels=256,
+ reg_cls_out_channels=256,
+ num_cls_fcs=1,
+ num_reg_fcs=0,
+ reg_class_agnostic=True,
+ norm_cfg=None,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=1.7),
+ loss_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
+ loss_bbox_reg=dict(type='SmoothL1Loss', beta=0.1,
+ loss_weight=1.0))))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..7504fe216056e7710caf29935e5cd4fdb1b695fb
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_1x_coco.py
@@ -0,0 +1,52 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5),
+ loss_bbox_reg=dict(
+ type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8143af21297eaf40f46217fa7fa65f7ecee2c11f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_1x_coco.py
@@ -0,0 +1,54 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ norm_cfg=norm_cfg,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5),
+ loss_bbox_reg=dict(
+ type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4e2b71bfe673dea67263d0f9bf21a68f7abc48f4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_480_960_coco.py
@@ -0,0 +1,71 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
+# model settings
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ norm_cfg=norm_cfg,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5),
+ loss_bbox_reg=dict(
+ type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 480), (1333, 960)],
+ multiscale_mode='range',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..013020105a06f18b4fee33dc65ed3ca5f3ccdcef
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r101_fpn_gn_2x_ms_640_800_coco.py
@@ -0,0 +1,71 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
+# model settings
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(depth=101),
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ norm_cfg=norm_cfg,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5),
+ loss_bbox_reg=dict(
+ type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 640), (1333, 800)],
+ multiscale_mode='range',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+data = dict(train=dict(pipeline=train_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ce518306b570eba94f71da7da84967b5de7765fe
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r50_fpn_1x_coco.py
@@ -0,0 +1,50 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5),
+ loss_bbox_reg=dict(
+ type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r50_fpn_gn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r50_fpn_gn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..bb1dad59b6312e9df2742e7775f10635ebb13431
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/sabl/sabl_retinanet_r50_fpn_gn_1x_coco.py
@@ -0,0 +1,52 @@
+_base_ = [
+ '../_base_/models/retinanet_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ bbox_head=dict(
+ _delete_=True,
+ type='SABLRetinaHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=4,
+ feat_channels=256,
+ approx_anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=4,
+ scales_per_octave=3,
+ ratios=[0.5, 1.0, 2.0],
+ strides=[8, 16, 32, 64, 128]),
+ square_anchor_generator=dict(
+ type='AnchorGenerator',
+ ratios=[1.0],
+ scales=[4],
+ strides=[8, 16, 32, 64, 128]),
+ norm_cfg=norm_cfg,
+ bbox_coder=dict(
+ type='BucketingBBoxCoder', num_buckets=14, scale_factor=3.0),
+ loss_cls=dict(
+ type='FocalLoss',
+ use_sigmoid=True,
+ gamma=2.0,
+ alpha=0.25,
+ loss_weight=1.0),
+ loss_bbox_cls=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.5),
+ loss_bbox_reg=dict(
+ type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.5)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='ApproxMaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.0,
+ ignore_iof_thr=-1),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/scratch/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/scratch/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4022d05bddb079bdffbdff495c160c00303edeec
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/scratch/README.md
@@ -0,0 +1,22 @@
+# Rethinking ImageNet Pre-training
+
+## Introduction
+
+```
+@article{he2018rethinking,
+ title={Rethinking imagenet pre-training},
+ author={He, Kaiming and Girshick, Ross and Doll{\'a}r, Piotr},
+ journal={arXiv preprint arXiv:1811.08883},
+ year={2018}
+}
+```
+
+## Results and Models
+
+| Model | Backbone | Style | Lr schd | box AP | mask AP | Download |
+|:------------:|:---------:|:-------:|:-------:|:------:|:-------:|:--------:|
+| Faster R-CNN | R-50-FPN | pytorch | 6x | 40.7 | | [model](http://download.openmmlab.com/mmdetection/v2.0/scratch/faster_rcnn_r50_fpn_gn-all_scratch_6x_coco/scratch_faster_rcnn_r50_fpn_gn_6x_bbox_mAP-0.407_20200201_193013-90813d01.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/scratch/faster_rcnn_r50_fpn_gn-all_scratch_6x_coco/scratch_faster_rcnn_r50_fpn_gn_6x_20200201_193013.log.json) |
+| Mask R-CNN | R-50-FPN | pytorch | 6x | 41.2 | 37.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/scratch/mask_rcnn_r50_fpn_gn-all_scratch_6x_coco/scratch_mask_rcnn_r50_fpn_gn_6x_bbox_mAP-0.412__segm_mAP-0.374_20200201_193051-1e190a40.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/scratch/mask_rcnn_r50_fpn_gn-all_scratch_6x_coco/scratch_mask_rcnn_r50_fpn_gn_6x_20200201_193051.log.json) |
+
+Note:
+- The above models are trained with 16 GPUs.
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/scratch/faster_rcnn_r50_fpn_gn-all_scratch_6x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/scratch/faster_rcnn_r50_fpn_gn-all_scratch_6x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ebe87d11f41f164882a1d787b26a8c9cc55b4107
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/scratch/faster_rcnn_r50_fpn_gn-all_scratch_6x_coco.py
@@ -0,0 +1,22 @@
+_base_ = [
+ '../_base_/models/faster_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained=None,
+ backbone=dict(
+ frozen_stages=-1, zero_init_residual=False, norm_cfg=norm_cfg),
+ neck=dict(norm_cfg=norm_cfg),
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared4Conv1FCBBoxHead',
+ conv_out_channels=256,
+ norm_cfg=norm_cfg)))
+# optimizer
+optimizer = dict(paramwise_cfg=dict(norm_decay_mult=0))
+optimizer_config = dict(_delete_=True, grad_clip=None)
+# learning policy
+lr_config = dict(warmup_ratio=0.1, step=[65, 71])
+total_epochs = 73
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/scratch/mask_rcnn_r50_fpn_gn-all_scratch_6x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/scratch/mask_rcnn_r50_fpn_gn-all_scratch_6x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2da1750dd3842edcc1e9653e3efc635337941f76
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/scratch/mask_rcnn_r50_fpn_gn-all_scratch_6x_coco.py
@@ -0,0 +1,23 @@
+_base_ = [
+ '../_base_/models/mask_rcnn_r50_fpn.py',
+ '../_base_/datasets/coco_instance.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+norm_cfg = dict(type='GN', num_groups=32, requires_grad=True)
+model = dict(
+ pretrained=None,
+ backbone=dict(
+ frozen_stages=-1, zero_init_residual=False, norm_cfg=norm_cfg),
+ neck=dict(norm_cfg=norm_cfg),
+ roi_head=dict(
+ bbox_head=dict(
+ type='Shared4Conv1FCBBoxHead',
+ conv_out_channels=256,
+ norm_cfg=norm_cfg),
+ mask_head=dict(norm_cfg=norm_cfg)))
+# optimizer
+optimizer = dict(paramwise_cfg=dict(norm_decay_mult=0))
+optimizer_config = dict(_delete_=True, grad_clip=None)
+# learning policy
+lr_config = dict(warmup_ratio=0.1, step=[65, 71])
+total_epochs = 73
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ssd/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/ssd/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..dd436ccf4560a66e1de3b476bda0b4a25b94415b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ssd/README.md
@@ -0,0 +1,18 @@
+# SSD: Single Shot MultiBox Detector
+
+## Introduction
+```
+@article{Liu_2016,
+ title={SSD: Single Shot MultiBox Detector},
+ journal={ECCV},
+ author={Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
+ year={2016},
+}
+```
+
+## Results and models
+
+| Backbone | Size | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :------: | :---: | :---: | :-----: | :------: | :------------: | :----: | :-------------------------------------------------------------------------------------------------------------------------------: |
+| VGG16 | 300 | caffe | 120e | 10.2 | 43.7 | 25.6 | [model](http://download.openmmlab.com/mmdetection/v2.0/ssd/ssd300_coco/ssd300_coco_20200307-a92d2092.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ssd/ssd300_coco/ssd300_coco_20200307_174216.log.json) |
+| VGG16 | 512 | caffe | 120e | 9.3 | 30.7 | 29.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/ssd/ssd512_coco/ssd512_coco_20200308-038c5591.pth) | [log](http://download.openmmlab.com/mmdetection/v2.0/ssd/ssd512_coco/ssd512_coco_20200308_134447.log.json) |
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ssd/ssd300_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ssd/ssd300_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..75c5e4e5b81a320a7e6bd7bc31e7d5cf49a0b92d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ssd/ssd300_coco.py
@@ -0,0 +1,62 @@
+_base_ = [
+ '../_base_/models/ssd300.py', '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_2x.py', '../_base_/default_runtime.py'
+]
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(300, 300), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(300, 300),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=3,
+ train=dict(
+ _delete_=True,
+ type='RepeatDataset',
+ times=5,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline)),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=2e-3, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict(_delete_=True)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/ssd/ssd512_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/ssd/ssd512_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..44d2920f4289c351c27e0d70dc03de0deb064a54
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/ssd/ssd512_coco.py
@@ -0,0 +1,71 @@
+_base_ = 'ssd300_coco.py'
+input_size = 512
+model = dict(
+ backbone=dict(input_size=input_size),
+ bbox_head=dict(
+ in_channels=(512, 1024, 512, 256, 256, 256, 256),
+ anchor_generator=dict(
+ type='SSDAnchorGenerator',
+ scale_major=False,
+ input_size=input_size,
+ basesize_ratio_range=(0.1, 0.9),
+ strides=[8, 16, 32, 64, 128, 256, 512],
+ ratios=[[2], [2, 3], [2, 3], [2, 3], [2, 3], [2], [2]])))
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(512, 512), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(512, 512),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=3,
+ train=dict(
+ _delete_=True,
+ type='RepeatDataset',
+ times=5,
+ dataset=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline)),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=2e-3, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict(_delete_=True)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..4dc892bec2965bb0cb6b9dbc3bd2704148d60c02
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/README.md
@@ -0,0 +1,40 @@
+# VarifocalNet: An IoU-aware Dense Object Detector
+
+## Introduction
+**VarifocalNet (VFNet)** learns to predict the IoU-aware classification score which mixes the object presence confidence and localization accuracy together as the detection score for a bounding box. The learning is supervised by the proposed Varifocal Loss (VFL), based on a new star-shaped bounding box feature representation (the features at nine yellow sampling points). Given the new representation, the object localization accuracy is further improved by refining the initially regressed bounding box. The full paper is available at: [https://arxiv.org/abs/2008.13367](https://arxiv.org/abs/2008.13367).
+
+*Learning to Predict the IoU-aware Classification Score.*
+
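+For reference, a sketch of the Varifocal Loss as defined in the paper (this reproduces the published formulation and is not taken from this repository's code): `p` is the predicted IoU-aware classification score and `q` is the target score, i.e. the IoU between the predicted box and its ground-truth box for a positive example and 0 for a negative example.
+
+```latex
+\mathrm{VFL}(p, q) =
+\begin{cases}
+  -q \bigl( q \log p + (1 - q) \log(1 - p) \bigr), & q > 0 \\
+  -\alpha \, p^{\gamma} \log(1 - p),               & q = 0
+\end{cases}
+```
+
+The configs in this folder set `alpha=0.75` and `gamma=2.0` in `loss_cls`, so negatives are down-weighted by the focal factor while positives are weighted by their target IoU `q`.
+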
+## Citing VarifocalNet
+
+```
+@article{zhang2020varifocalnet,
+ title={VarifocalNet: An IoU-aware Dense Object Detector},
+ author={Zhang, Haoyang and Wang, Ying and Dayoub, Feras and S{\"u}nderhauf, Niko},
+ journal={arXiv preprint arXiv:2008.13367},
+ year={2020}
+}
+```
+
+## Results and Models
+
+| Backbone | Style | DCN | MS train | Lr schd |Inf time (fps) | box AP (val) | box AP (test-dev) | Download |
+|:------------:|:---------:|:-------:|:--------:|:-------:|:-------------:|:------------:|:-----------------:|:--------:|
+| R-50 | pytorch | N | N | 1x | - | 41.6 | 41.6 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r50_fpn_1x_coco/vfnet_r50_fpn_1x_coco_20201027-38db6f58.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r50_fpn_1x_coco/vfnet_r50_fpn_1x_coco.json)|
+| R-50 | pytorch | N | Y | 2x | - | 44.5 | 44.8 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r50_fpn_mstrain_2x_coco/vfnet_r50_fpn_mstrain_2x_coco_20201027-7cc75bd2.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r50_fpn_mstrain_2x_coco/vfnet_r50_fpn_mstrain_2x_coco.json)|
+| R-50 | pytorch | Y | Y | 2x | - | 47.8 | 48.0 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco_20201027pth-6879c318.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.json)|
+| R-101 | pytorch | N | N | 1x | - | 43.0 | 43.6 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r101_fpn_1x_coco/vfnet_r101_fpn_1x_coco_20201027pth-c831ece7.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r101_fpn_1x_coco/vfnet_r101_fpn_1x_coco.json)|
+| R-101 | pytorch | N | Y | 2x | - | 46.2 | 46.7 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r101_fpn_mstrain_2x_coco/vfnet_r101_fpn_mstrain_2x_coco_20201027pth-4a5d53f1.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r101_fpn_mstrain_2x_coco/vfnet_r101_fpn_mstrain_2x_coco.json)|
+| R-101 | pytorch | Y | Y | 2x | - | 49.0 | 49.2 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco_20201027pth-7729adb5.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco.json)|
+| X-101-32x4d | pytorch | Y | Y | 2x | - | 49.7 | 50.0 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco_20201027pth-d300a6fc.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.json)|
+| X-101-64x4d | pytorch | Y | Y | 2x | - | 50.4 | 50.8 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco_20201027pth-b5f6da5e.pth) | [log](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/vfnet/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.json)|
+
+
+**Notes:**
+- The MS-train scale range is 1333x[480:960] (`range` mode) and the inference scale is kept at 1333x800.
+- DCN means using `DCNv2` in both backbone and head.
+- Inference time will be updated soon.
+- More results and pre-trained models can be found at [VarifocalNet-Github](https://github.com/hyz-xmaster/VarifocalNet).
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..09521310523f38be90518e9c7db6856db1225c1b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_1x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './vfnet_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d0a1f569463972dc5b7fe10c35f8fb5d3321a261
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_2x_coco.py
@@ -0,0 +1,4 @@
+_base_ = './vfnet_r50_fpn_1x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..f8ef6ec092db2e454ca5359b6df89d31365672c0
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_mdconv_c3-c5_mstrain_2x_coco.py
@@ -0,0 +1,14 @@
+_base_ = './vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py'
+model = dict(
+ pretrained='torchvision://resnet101',
+ backbone=dict(
+ type='ResNet',
+ depth=101,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch',
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..be7f075fea00a4570d50fd30f1685139b70a8bb6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r101_fpn_mstrain_2x_coco.py
@@ -0,0 +1,2 @@
+_base_ = './vfnet_r50_fpn_mstrain_2x_coco.py'
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r2_101_fpn_mdconv_c3-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r2_101_fpn_mdconv_c3-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..8da3122657adc2785129c28a84473c25777abba3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r2_101_fpn_mdconv_c3-c5_mstrain_2x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(
+ type='Res2Net',
+ depth=101,
+ scales=4,
+ base_width=26,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch',
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r2_101_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r2_101_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2bcf779db008dbbf0c8f3b1fdc84a9940967f78a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r2_101_fpn_mstrain_2x_coco.py
@@ -0,0 +1,14 @@
+_base_ = './vfnet_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://res2net101_v1d_26w_4s',
+ backbone=dict(
+ type='Res2Net',
+ depth=101,
+ scales=4,
+ base_width=26,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_1x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_1x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..6875e5f38c4dae0d10888fa90ead55af736b67aa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_1x_coco.py
@@ -0,0 +1,114 @@
+_base_ = [
+ '../_base_/datasets/coco_detection.py',
+ '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
+]
+# model settings
+model = dict(
+ type='VFNet',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs=True,
+ extra_convs_on_inputs=False, # use P5
+ num_outs=5,
+ relu_before_extra_convs=True),
+ bbox_head=dict(
+ type='VFNetHead',
+ num_classes=80,
+ in_channels=256,
+ stacked_convs=3,
+ feat_channels=256,
+ strides=[8, 16, 32, 64, 128],
+ center_sampling=False,
+ dcn_on_last_conv=False,
+ use_atss=True,
+ use_vfl=True,
+ loss_cls=dict(
+ type='VarifocalLoss',
+ use_sigmoid=True,
+ alpha=0.75,
+ gamma=2.0,
+ iou_weighted=True,
+ loss_weight=1.0),
+ loss_bbox=dict(type='GIoULoss', loss_weight=1.5),
+ loss_bbox_refine=dict(type='GIoULoss', loss_weight=2.0)))
+
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(type='ATSSAssigner', topk=9),
+ allowed_border=-1,
+ pos_weight=-1,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ nms=dict(type='nms', iou_threshold=0.6),
+ max_per_img=100)
+
+# data setting
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=2,
+ workers_per_gpu=2,
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+
+# optimizer
+optimizer = dict(
+ lr=0.01, paramwise_cfg=dict(bias_lr_mult=2., bias_decay_mult=0.))
+optimizer_config = dict(grad_clip=None)
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.1,
+ step=[8, 11])
+total_epochs = 12
+
+# runtime
+load_from = None
+resume_from = None
+workflow = [('train', 1)]
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..24d2093b8b537a365c3e07261921b120b422918c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py
@@ -0,0 +1,6 @@
+_base_ = './vfnet_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ backbone=dict(
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)),
+ bbox_head=dict(dcn_on_last_conv=True))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..31b54fb8fe1ef3e620198adf851a97d8f9a071df
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_r50_fpn_mstrain_2x_coco.py
@@ -0,0 +1,39 @@
+_base_ = './vfnet_r50_fpn_1x_coco.py'
+img_norm_cfg = dict(
+ mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(
+ type='Resize',
+ img_scale=[(1333, 480), (1333, 960)],
+ multiscale_mode='range',
+ keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(1333, 800),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
+# learning policy
+lr_config = dict(step=[16, 22])
+total_epochs = 24
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..ebeef6ff6640e83378391d3ce7072aa296826c32
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_32x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch',
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_32x4d_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_32x4d_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..5ed26504af131f3806426fcbd343bb7c4c9e229c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_32x4d_fpn_mstrain_2x_coco.py
@@ -0,0 +1,14 @@
+_base_ = './vfnet_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_32x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=32,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2e19078e2830a2fa6dd2d3b703b0bbf711b7e1e4
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_64x4d_fpn_mdconv_c3-c5_mstrain_2x_coco.py
@@ -0,0 +1,16 @@
+_base_ = './vfnet_r50_fpn_mdconv_c3-c5_mstrain_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch',
+ dcn=dict(type='DCNv2', deform_groups=1, fallback_on_stride=False),
+ stage_with_dcn=(False, True, True, True)))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_64x4d_fpn_mstrain_2x_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_64x4d_fpn_mstrain_2x_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..4329b34bee03d219cdd94b600055eb5d5a7cc8ef
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/vfnet/vfnet_x101_64x4d_fpn_mstrain_2x_coco.py
@@ -0,0 +1,14 @@
+_base_ = './vfnet_r50_fpn_mstrain_2x_coco.py'
+model = dict(
+ pretrained='open-mmlab://resnext101_64x4d',
+ backbone=dict(
+ type='ResNeXt',
+ depth=101,
+ groups=64,
+ base_width=4,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=1,
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=True,
+ style='pytorch'))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/wider_face/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/wider_face/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..f7392007a1ce6379aee4c5e4544111f8207fe823
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/wider_face/README.md
@@ -0,0 +1,32 @@
+## WIDER Face Dataset
+
+To use the WIDER Face dataset you need to download it
+and extract it to the `data/WIDERFace` folder. Annotations in the VOC format
+can be found in this [repo](https://github.com/sovrasov/wider-face-pascal-voc-annotations.git).
+You should move the annotation files from the `WIDER_train_annotations` and `WIDER_val_annotations` folders
+to the `Annotations` folders inside the corresponding `WIDER_train` and `WIDER_val` directories.
+The annotation lists `val.txt` and `train.txt` should also be copied to `data/WIDERFace` from `WIDER_train_annotations` and `WIDER_val_annotations`.
+The directory structure should look like this:
+
+```
+mmdetection
+├── mmdet
+├── tools
+├── configs
+├── data
+│   ├── WIDERFace
+│   │   ├── WIDER_train
+│   │   │   ├── 0--Parade
+│   │   │   ├── ...
+│   │   │   ├── Annotations
+│   │   ├── WIDER_val
+│   │   │   ├── 0--Parade
+│   │   │   ├── ...
+│   │   │   ├── Annotations
+│   │   ├── val.txt
+│   │   ├── train.txt
+
+```
+
+After that you can train SSD300 on WIDER Face by launching training with the `ssd300_wider_face.py` config,
+or create your own config based on the presented one.
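+
+As a rough sketch (assuming the standard MMDetection launch scripts under `tools/`; the GPU count below is only an example), training could be started like this:
+
+```Shell
+# Single-GPU training on WIDER Face with the provided config
+python tools/train.py configs/wider_face/ssd300_wider_face.py
+
+# Or distributed training, e.g. on 8 GPUs
+./tools/dist_train.sh configs/wider_face/ssd300_wider_face.py 8
+```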
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/wider_face/ssd300_wider_face.py b/PyTorch/contrib/cv/detection/GCNet/configs/wider_face/ssd300_wider_face.py
new file mode 100644
index 0000000000000000000000000000000000000000..d0e89a83d9828bf2188664da22b91ec87cbada74
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/wider_face/ssd300_wider_face.py
@@ -0,0 +1,18 @@
+_base_ = [
+ '../_base_/models/ssd300.py', '../_base_/datasets/wider_face.py',
+ '../_base_/default_runtime.py'
+]
+model = dict(bbox_head=dict(num_classes=1))
+# optimizer
+optimizer = dict(type='SGD', lr=0.012, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict()
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=1000,
+ warmup_ratio=0.001,
+ step=[16, 20])
+# runtime settings
+total_epochs = 24
+log_config = dict(interval=1)
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolact/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..02a5a11452321360055491e64fb27e08959050f9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/README.md
@@ -0,0 +1,60 @@
+# **Y**ou **O**nly **L**ook **A**t **C**oefficien**T**s
+```
+ ██╗ ██╗ ██████╗ ██╗ █████╗ ██████╗████████╗
+ ╚██╗ ██╔╝██╔═══██╗██║ ██╔══██╗██╔════╝╚══██╔══╝
+ ╚████╔╝ ██║ ██║██║ ███████║██║ ██║
+ ╚██╔╝ ██║ ██║██║ ██╔══██║██║ ██║
+ ██║ ╚██████╔╝███████╗██║ ██║╚██████╗ ██║
+ ╚═╝ ╚═════╝ ╚══════╝╚═╝ ╚═╝ ╚═════╝ ╚═╝
+```
+
+A simple, fully convolutional model for real-time instance segmentation. This is the code for our paper:
+ - [YOLACT: Real-time Instance Segmentation](https://arxiv.org/abs/1904.02689)
+
+
+#### For a real-time demo, check out our ICCV video:
+[Watch the ICCV video on YouTube](https://www.youtube.com/watch?v=0pMfmo8qfpQ)
+
+# Evaluation
+Here are our YOLACT models along with their FPS on a Titan Xp and mAP on COCO's `val`:
+
+| Image Size | GPU x BS | Backbone | *FPS | mAP | Weights |
+|:----------:|:--------:|:-------------:|:-----:|:----:|---------|
+| 550 | 1x8 | Resnet50-FPN | 42.5 | 29.0 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/yolact/yolact_r50_1x8_coco_20200908-f38d58df.pth) |
+| 550 | 8x8 | Resnet50-FPN | 42.5 | 28.4 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/yolact/yolact_r50_8x8_coco_20200908-ca34f5db.pth) |
+| 550 | 1x8 | Resnet101-FPN | 33.5 | 30.4 | [model](https://openmmlab.oss-cn-hangzhou.aliyuncs.com/mmdetection/v2.0/yolact/yolact_r101_1x8_coco_20200908-4cbe9101.pth) |
+
+*Note: The FPS is measured with the [original implementation](https://github.com/dbolya/yolact). Only the model inference time is taken into account; data loading and post-processing operations such as converting masks to RLE code, generating COCO JSON results, and image rendering are not included.
+
+# Training
+All the aforementioned models are trained with a single GPU. Training typically takes ~12 GB of VRAM when using ResNet-101 as the backbone. If you want to try multi-GPU training, you may have to modify the configuration files accordingly, for example by adjusting the training schedule and freezing batch norm (see the sketch after the command below).
+```Shell
+# Train with the ResNet-101 backbone and a batch size of 8 on a single GPU.
+./tools/dist_train.sh configs/yolact/yolact_r101_1x8_coco.py 1
+```
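+
+For reference, the `yolact_r50_8x8_coco.py` config in this folder already applies the usual multi-GPU adjustments (scaled learning rate, longer warmup and gradient clipping), so a hedged sketch of an 8-GPU launch with the same script would be:
+
+```Shell
+# Multi-GPU sketch: 8 GPUs with the pre-adjusted 8x8 config
+./tools/dist_train.sh configs/yolact/yolact_r50_8x8_coco.py 8
+```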
+
+# Testing
+Please refer to [mmdetection/docs/getting_started.md](https://github.com/open-mmlab/mmdetection/blob/master/docs/getting_started.md#inference-with-pretrained-models).
+
+# Citation
+If you use YOLACT or this code base in your work, please cite
+```
+@inproceedings{yolact-iccv2019,
+ author = {Daniel Bolya and Chong Zhou and Fanyi Xiao and Yong Jae Lee},
+ title = {YOLACT: {Real-time} Instance Segmentation},
+ booktitle = {ICCV},
+ year = {2019},
+}
+```
+
+
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r101_1x8_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r101_1x8_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..2864b590b5538b735a16df3b2690b29a95384df8
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r101_1x8_coco.py
@@ -0,0 +1,3 @@
+_base_ = './yolact_r50_1x8_coco.py'
+
+model = dict(pretrained='torchvision://resnet101', backbone=dict(depth=101))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r50_1x8_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r50_1x8_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..9c9a8c8ff3449a013190765c8342cb3998c70dd5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r50_1x8_coco.py
@@ -0,0 +1,160 @@
+_base_ = '../_base_/default_runtime.py'
+
+# model settings
+img_size = 550
+model = dict(
+ type='YOLACT',
+ pretrained='torchvision://resnet50',
+ backbone=dict(
+ type='ResNet',
+ depth=50,
+ num_stages=4,
+ out_indices=(0, 1, 2, 3),
+ frozen_stages=-1, # do not freeze stem
+ norm_cfg=dict(type='BN', requires_grad=True),
+ norm_eval=False, # update the statistics of bn
+ zero_init_residual=False,
+ style='pytorch'),
+ neck=dict(
+ type='FPN',
+ in_channels=[256, 512, 1024, 2048],
+ out_channels=256,
+ start_level=1,
+ add_extra_convs='on_input',
+ num_outs=5,
+ upsample_cfg=dict(mode='bilinear')),
+ bbox_head=dict(
+ type='YOLACTHead',
+ num_classes=80,
+ in_channels=256,
+ feat_channels=256,
+ anchor_generator=dict(
+ type='AnchorGenerator',
+ octave_base_scale=3,
+ scales_per_octave=1,
+ base_sizes=[8, 16, 32, 64, 128],
+ ratios=[0.5, 1.0, 2.0],
+ strides=[550.0 / x for x in [69, 35, 18, 9, 5]],
+ centers=[(550 * 0.5 / x, 550 * 0.5 / x)
+ for x in [69, 35, 18, 9, 5]]),
+ bbox_coder=dict(
+ type='DeltaXYWHBBoxCoder',
+ target_means=[.0, .0, .0, .0],
+ target_stds=[0.1, 0.1, 0.2, 0.2]),
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=False,
+ reduction='none',
+ loss_weight=1.0),
+ loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.5),
+ num_head_convs=1,
+ num_protos=32,
+ use_ohem=True),
+ mask_head=dict(
+ type='YOLACTProtonet',
+ in_channels=256,
+ num_protos=32,
+ num_classes=80,
+ max_masks_to_train=100,
+ loss_mask_weight=6.125),
+ segm_head=dict(
+ type='YOLACTSegmHead',
+ num_classes=80,
+ in_channels=256,
+ loss_segm=dict(
+ type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='MaxIoUAssigner',
+ pos_iou_thr=0.5,
+ neg_iou_thr=0.4,
+ min_pos_iou=0.,
+ ignore_iof_thr=-1,
+ gt_max_assign_all=False),
+ # smoothl1_beta=1.,
+ allowed_border=-1,
+ pos_weight=-1,
+ neg_pos_ratio=3,
+ debug=False)
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ iou_thr=0.5,
+ top_k=200,
+ max_per_img=100)
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(
+ mean=[123.68, 116.78, 103.94], std=[58.40, 57.12, 57.38], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
+ dict(type='FilterAnnotations', min_gt_bbox_wh=(4.0, 4.0)),
+ dict(
+ type='PhotoMetricDistortion',
+ brightness_delta=32,
+ contrast_range=(0.5, 1.5),
+ saturation_range=(0.5, 1.5),
+ hue_delta=18),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 4)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(img_size, img_size), keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(img_size, img_size),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=False),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img']),
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=4,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=1e-3, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict()
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=500,
+ warmup_ratio=0.1,
+ step=[20, 42, 49, 52])
+total_epochs = 55
+cudnn_benchmark = True
+evaluation = dict(metric=['bbox', 'segm'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r50_8x8_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r50_8x8_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..b3adcb74a6155a0ab7303ab9ae90ee120f3eb4ad
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolact/yolact_r50_8x8_coco.py
@@ -0,0 +1,11 @@
+_base_ = 'yolact_r50_1x8_coco.py'
+
+optimizer = dict(type='SGD', lr=8e-3, momentum=0.9, weight_decay=5e-4)
+optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=1000,
+ warmup_ratio=0.1,
+ step=[20, 42, 49, 52])
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolo/README.md b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..274f1877031714b38810e589fd84db25595e5d22
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/README.md
@@ -0,0 +1,25 @@
+# YOLOv3
+
+## Introduction
+```
+@misc{redmon2018yolov3,
+ title={YOLOv3: An Incremental Improvement},
+ author={Joseph Redmon and Ali Farhadi},
+ year={2018},
+ eprint={1804.02767},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
+}
+```
+
+## Results and Models
+
+| Backbone | Scale | Lr schd | Mem (GB) | Inf time (fps) | box AP | Download |
+| :-------------: | :-----: | :-----: | :------: | :------------: | :----: | :-------: |
+| DarkNet-53 | 320 | 273e | 2.7 | 63.9 | 27.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-421362b6.pth) &#124; [log](http://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_320_273e_coco/yolov3_d53_320_273e_coco-20200819_172101.log.json) |
+| DarkNet-53 | 416 | 273e | 3.8 | 61.2 | 30.9 | [model](http://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-2b60fcd9.pth) &#124; [log](http://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-416_273e_coco/yolov3_d53_mstrain-416_273e_coco-20200819_173424.log.json) |
+| DarkNet-53 | 608 | 273e | 7.1 | 48.1 | 33.4 | [model](http://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco-139f5633.pth) &#124; [log](http://download.openmmlab.com/mmdetection/v2.0/yolo/yolov3_d53_mstrain-608_273e_coco/yolov3_d53_mstrain-608_273e_coco-20200819_170820.log.json) |
+
+
+## Credit
+This implementation originates from the project of Haoyu Wu (@wuhy08) at Western Digital.
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_320_273e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_320_273e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..87359f6fb66d94de10b8e3797ee3eec93a19cb26
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_320_273e_coco.py
@@ -0,0 +1,42 @@
+_base_ = './yolov3_d53_mstrain-608_273e_coco.py'
+# dataset settings
+img_norm_cfg = dict(mean=[0, 0, 0], std=[255., 255., 255.], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='PhotoMetricDistortion'),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 2)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=(320, 320), keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(320, 320),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img'])
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_mstrain-416_273e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_mstrain-416_273e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..d029b5cdd6b3dad09b16a6f2a23e66be684a6412
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_mstrain-416_273e_coco.py
@@ -0,0 +1,42 @@
+_base_ = './yolov3_d53_mstrain-608_273e_coco.py'
+# dataset settings
+img_norm_cfg = dict(mean=[0, 0, 0], std=[255., 255., 255.], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='PhotoMetricDistortion'),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 2)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=[(320, 320), (416, 416)], keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(416, 416),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img'])
+ ])
+]
+data = dict(
+ train=dict(pipeline=train_pipeline),
+ val=dict(pipeline=test_pipeline),
+ test=dict(pipeline=test_pipeline))
diff --git a/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_mstrain-608_273e_coco.py b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_mstrain-608_273e_coco.py
new file mode 100644
index 0000000000000000000000000000000000000000..049984d01cfbf78e09e609e8de381460747faa0b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/configs/yolo/yolov3_d53_mstrain-608_273e_coco.py
@@ -0,0 +1,121 @@
+_base_ = '../_base_/default_runtime.py'
+# model settings
+model = dict(
+ type='YOLOV3',
+ pretrained='open-mmlab://darknet53',
+ backbone=dict(type='Darknet', depth=53, out_indices=(3, 4, 5)),
+ neck=dict(
+ type='YOLOV3Neck',
+ num_scales=3,
+ in_channels=[1024, 512, 256],
+ out_channels=[512, 256, 128]),
+ bbox_head=dict(
+ type='YOLOV3Head',
+ num_classes=80,
+ in_channels=[512, 256, 128],
+ out_channels=[1024, 512, 256],
+ anchor_generator=dict(
+ type='YOLOAnchorGenerator',
+ base_sizes=[[(116, 90), (156, 198), (373, 326)],
+ [(30, 61), (62, 45), (59, 119)],
+ [(10, 13), (16, 30), (33, 23)]],
+ strides=[32, 16, 8]),
+ bbox_coder=dict(type='YOLOBBoxCoder'),
+ featmap_strides=[32, 16, 8],
+ loss_cls=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=True,
+ loss_weight=1.0,
+ reduction='sum'),
+ loss_conf=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=True,
+ loss_weight=1.0,
+ reduction='sum'),
+ loss_xy=dict(
+ type='CrossEntropyLoss',
+ use_sigmoid=True,
+ loss_weight=2.0,
+ reduction='sum'),
+ loss_wh=dict(type='MSELoss', loss_weight=2.0, reduction='sum')))
+# training and testing settings
+train_cfg = dict(
+ assigner=dict(
+ type='GridAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0))
+test_cfg = dict(
+ nms_pre=1000,
+ min_bbox_size=0,
+ score_thr=0.05,
+ conf_thr=0.005,
+ nms=dict(type='nms', iou_threshold=0.45),
+ max_per_img=100)
+# dataset settings
+dataset_type = 'CocoDataset'
+data_root = 'data/coco/'
+img_norm_cfg = dict(mean=[0, 0, 0], std=[255., 255., 255.], to_rgb=True)
+train_pipeline = [
+ dict(type='LoadImageFromFile', to_float32=True),
+ dict(type='LoadAnnotations', with_bbox=True),
+ dict(type='PhotoMetricDistortion'),
+ dict(
+ type='Expand',
+ mean=img_norm_cfg['mean'],
+ to_rgb=img_norm_cfg['to_rgb'],
+ ratio_range=(1, 2)),
+ dict(
+ type='MinIoURandomCrop',
+ min_ious=(0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
+ min_crop_size=0.3),
+ dict(type='Resize', img_scale=[(320, 320), (608, 608)], keep_ratio=True),
+ dict(type='RandomFlip', flip_ratio=0.5),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='DefaultFormatBundle'),
+ dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
+]
+test_pipeline = [
+ dict(type='LoadImageFromFile'),
+ dict(
+ type='MultiScaleFlipAug',
+ img_scale=(608, 608),
+ flip=False,
+ transforms=[
+ dict(type='Resize', keep_ratio=True),
+ dict(type='RandomFlip'),
+ dict(type='Normalize', **img_norm_cfg),
+ dict(type='Pad', size_divisor=32),
+ dict(type='ImageToTensor', keys=['img']),
+ dict(type='Collect', keys=['img'])
+ ])
+]
+data = dict(
+ samples_per_gpu=8,
+ workers_per_gpu=4,
+ train=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_train2017.json',
+ img_prefix=data_root + 'train2017/',
+ pipeline=train_pipeline),
+ val=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline),
+ test=dict(
+ type=dataset_type,
+ ann_file=data_root + 'annotations/instances_val2017.json',
+ img_prefix=data_root + 'val2017/',
+ pipeline=test_pipeline))
+# optimizer
+optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005)
+optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
+# learning policy
+lr_config = dict(
+ policy='step',
+ warmup='linear',
+ warmup_iters=2000, # same as burn-in in darknet
+ warmup_ratio=0.1,
+ step=[218, 246])
+# runtime settings
+total_epochs = 273
+evaluation = dict(interval=1, metric=['bbox'])
diff --git a/PyTorch/contrib/cv/detection/GCNet/docker/Dockerfile b/PyTorch/contrib/cv/detection/GCNet/docker/Dockerfile
new file mode 100644
index 0000000000000000000000000000000000000000..81e458fc1c9b1a50a457c196de1e6da619ac0695
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/docker/Dockerfile
@@ -0,0 +1,24 @@
+ARG PYTORCH="1.6.0"
+ARG CUDA="10.1"
+ARG CUDNN="7"
+
+FROM pytorch/pytorch:${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel
+
+ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX"
+ENV TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
+ENV CMAKE_PREFIX_PATH="$(dirname $(which conda))/../"
+
+RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git ninja-build libglib2.0-0 libxrender-dev \
+ && apt-get clean \
+ && rm -rf /var/lib/apt/lists/*
+
+# Install MMCV
+RUN pip install mmcv-full==latest+torch1.6.0+cu101 -f https://openmmlab.oss-accelerate.aliyuncs.com/mmcv/dist/index.html
+
+# Install MMDetection
+RUN conda clean --all
+RUN git clone https://github.com/open-mmlab/mmdetection.git /mmdetection
+WORKDIR /mmdetection
+ENV FORCE_CUDA="1"
+RUN pip install -r requirements/build.txt
+RUN pip install --no-cache-dir -e .
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..74ee0442fc47e2dd508c77b49774b2e6adec7bfa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/__init__.py
@@ -0,0 +1,15 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+# flake8: noqa
+from .arraymisc import *
+from .fileio import *
+from .image import *
+from .utils import *
+from .version import *
+from .video import *
+from .visualization import *
+
+# The following modules are not imported to this level, so mmcv may be used
+# without PyTorch.
+# - runner
+# - parallel
+# - op
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/arraymisc/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/arraymisc/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..2e3934ca4524e87a0bc8d64016770030254e41a5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/arraymisc/__init__.py
@@ -0,0 +1,4 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from .quantization import dequantize, quantize
+
+__all__ = ['quantize', 'dequantize']
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/arraymisc/quantization.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/arraymisc/quantization.py
new file mode 100644
index 0000000000000000000000000000000000000000..47b6fa2a0b26996afe3408815fe4c97309fe1693
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/arraymisc/quantization.py
@@ -0,0 +1,55 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import numpy as np
+
+
+def quantize(arr, min_val, max_val, levels, dtype=np.int64):
+ """Quantize an array of (-inf, inf) to [0, levels-1].
+
+ Args:
+ arr (ndarray): Input array.
+ min_val (scalar): Minimum value to be clipped.
+ max_val (scalar): Maximum value to be clipped.
+ levels (int): Quantization levels.
+ dtype (np.type): The type of the quantized array.
+
+ Returns:
+        ndarray: Quantized array.
+ """
+ if not (isinstance(levels, int) and levels > 1):
+ raise ValueError(
+ f'levels must be a positive integer, but got {levels}')
+ if min_val >= max_val:
+ raise ValueError(
+ f'min_val ({min_val}) must be smaller than max_val ({max_val})')
+
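+    # Clip to [min_val, max_val], shift so values start at 0, then map linearly
+    # onto integer bins of width (max_val - min_val) / levels, capping the top
+    # bin at levels - 1.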
+ arr = np.clip(arr, min_val, max_val) - min_val
+ quantized_arr = np.minimum(
+ np.floor(levels * arr / (max_val - min_val)).astype(dtype), levels - 1)
+
+ return quantized_arr
+
+
+def dequantize(arr, min_val, max_val, levels, dtype=np.float64):
+ """Dequantize an array.
+
+ Args:
+ arr (ndarray): Input array.
+ min_val (scalar): Minimum value to be clipped.
+ max_val (scalar): Maximum value to be clipped.
+ levels (int): Quantization levels.
+ dtype (np.type): The type of the dequantized array.
+
+ Returns:
+        ndarray: Dequantized array.
+ """
+ if not (isinstance(levels, int) and levels > 1):
+ raise ValueError(
+ f'levels must be a positive integer, but got {levels}')
+ if min_val >= max_val:
+ raise ValueError(
+ f'min_val ({min_val}) must be smaller than max_val ({max_val})')
+
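+    # Map each bin index back to the center of its value range.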
+ dequantized_arr = (arr + 0.5).astype(dtype) * (max_val -
+ min_val) / levels + min_val
+
+ return dequantized_arr
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..f7522fa784968e063e91d22b8f8464e041795b1c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/__init__.py
@@ -0,0 +1,41 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from .alexnet import AlexNet
+# yapf: disable
+from .bricks import (ACTIVATION_LAYERS, CONV_LAYERS, NORM_LAYERS,
+ PADDING_LAYERS, PLUGIN_LAYERS, UPSAMPLE_LAYERS,
+ ContextBlock, Conv2d, Conv3d, ConvAWS2d, ConvModule,
+ ConvTranspose2d, ConvTranspose3d, ConvWS2d,
+ DepthwiseSeparableConvModule, GeneralizedAttention,
+ HSigmoid, HSwish, Linear, MaxPool2d, MaxPool3d,
+ NonLocal1d, NonLocal2d, NonLocal3d, Scale, Swish,
+ build_activation_layer, build_conv_layer,
+ build_norm_layer, build_padding_layer, build_plugin_layer,
+ build_upsample_layer, conv_ws_2d, is_norm)
+from .builder import MODELS, build_model_from_cfg
+# yapf: enable
+from .resnet import ResNet, make_res_layer
+from .utils import (INITIALIZERS, Caffe2XavierInit, ConstantInit, KaimingInit,
+ NormalInit, PretrainedInit, TruncNormalInit, UniformInit,
+ XavierInit, bias_init_with_prob, caffe2_xavier_init,
+ constant_init, fuse_conv_bn, get_model_complexity_info,
+ initialize, kaiming_init, normal_init, trunc_normal_init,
+ uniform_init, xavier_init)
+from .vgg import VGG, make_vgg_layer
+
+__all__ = [
+ 'AlexNet', 'VGG', 'make_vgg_layer', 'ResNet', 'make_res_layer',
+ 'constant_init', 'xavier_init', 'normal_init', 'trunc_normal_init',
+ 'uniform_init', 'kaiming_init', 'caffe2_xavier_init',
+ 'bias_init_with_prob', 'ConvModule', 'build_activation_layer',
+ 'build_conv_layer', 'build_norm_layer', 'build_padding_layer',
+ 'build_upsample_layer', 'build_plugin_layer', 'is_norm', 'NonLocal1d',
+ 'NonLocal2d', 'NonLocal3d', 'ContextBlock', 'HSigmoid', 'Swish', 'HSwish',
+ 'GeneralizedAttention', 'ACTIVATION_LAYERS', 'CONV_LAYERS', 'NORM_LAYERS',
+ 'PADDING_LAYERS', 'UPSAMPLE_LAYERS', 'PLUGIN_LAYERS', 'Scale',
+ 'get_model_complexity_info', 'conv_ws_2d', 'ConvAWS2d', 'ConvWS2d',
+ 'fuse_conv_bn', 'DepthwiseSeparableConvModule', 'Linear', 'Conv2d',
+ 'ConvTranspose2d', 'MaxPool2d', 'ConvTranspose3d', 'MaxPool3d', 'Conv3d',
+ 'initialize', 'INITIALIZERS', 'ConstantInit', 'XavierInit', 'NormalInit',
+ 'TruncNormalInit', 'UniformInit', 'KaimingInit', 'PretrainedInit',
+ 'Caffe2XavierInit', 'MODELS', 'build_model_from_cfg'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/alexnet.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/alexnet.py
new file mode 100644
index 0000000000000000000000000000000000000000..3938d5cd2868c48f5f875287a4a4fea3c970072f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/alexnet.py
@@ -0,0 +1,61 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import logging
+
+import torch.nn as nn
+
+
+class AlexNet(nn.Module):
+ """AlexNet backbone.
+
+ Args:
+ num_classes (int): number of classes for classification.
+ """
+
+ def __init__(self, num_classes=-1):
+ super(AlexNet, self).__init__()
+ self.num_classes = num_classes
+ self.features = nn.Sequential(
+ nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
+ nn.ReLU(inplace=True),
+ nn.MaxPool2d(kernel_size=3, stride=2),
+ nn.Conv2d(64, 192, kernel_size=5, padding=2),
+ nn.ReLU(inplace=True),
+ nn.MaxPool2d(kernel_size=3, stride=2),
+ nn.Conv2d(192, 384, kernel_size=3, padding=1),
+ nn.ReLU(inplace=True),
+ nn.Conv2d(384, 256, kernel_size=3, padding=1),
+ nn.ReLU(inplace=True),
+ nn.Conv2d(256, 256, kernel_size=3, padding=1),
+ nn.ReLU(inplace=True),
+ nn.MaxPool2d(kernel_size=3, stride=2),
+ )
+ if self.num_classes > 0:
+ self.classifier = nn.Sequential(
+ nn.Dropout(),
+ nn.Linear(256 * 6 * 6, 4096),
+ nn.ReLU(inplace=True),
+ nn.Dropout(),
+ nn.Linear(4096, 4096),
+ nn.ReLU(inplace=True),
+ nn.Linear(4096, num_classes),
+ )
+
+ def init_weights(self, pretrained=None):
+ if isinstance(pretrained, str):
+ logger = logging.getLogger()
+ from ..runner import load_checkpoint
+ load_checkpoint(self, pretrained, strict=False, logger=logger)
+ elif pretrained is None:
+ # use default initializer
+ pass
+ else:
+ raise TypeError('pretrained must be a str or None')
+
+ def forward(self, x):
+
+ x = self.features(x)
+ if self.num_classes > 0:
+ x = x.view(x.size(0), 256 * 6 * 6)
+ x = self.classifier(x)
+
+ return x
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..78da6f39a1b5c5fbc637402e16082ae5de9ba303
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/__init__.py
@@ -0,0 +1,34 @@
+from .activation import build_activation_layer
+from .context_block import ContextBlock
+from .conv import build_conv_layer
+from .conv2d_adaptive_padding import Conv2dAdaptivePadding
+from .conv_module import ConvModule
+from .conv_ws import ConvAWS2d, ConvWS2d, conv_ws_2d
+from .depthwise_separable_conv_module import DepthwiseSeparableConvModule
+from .drop import Dropout, DropPath
+from .generalized_attention import GeneralizedAttention
+from .hsigmoid import HSigmoid
+from .hswish import HSwish
+from .non_local import NonLocal1d, NonLocal2d, NonLocal3d
+from .norm import build_norm_layer, is_norm
+from .padding import build_padding_layer
+from .plugin import build_plugin_layer
+from .registry import (ACTIVATION_LAYERS, CONV_LAYERS, NORM_LAYERS,
+ PADDING_LAYERS, PLUGIN_LAYERS, UPSAMPLE_LAYERS)
+from .scale import Scale
+from .swish import Swish
+from .upsample import build_upsample_layer
+from .wrappers import (Conv2d, Conv3d, ConvTranspose2d, ConvTranspose3d,
+ Linear, MaxPool2d, MaxPool3d)
+
+__all__ = [
+ 'ConvModule', 'build_activation_layer', 'build_conv_layer',
+ 'build_norm_layer', 'build_padding_layer', 'build_upsample_layer',
+ 'build_plugin_layer', 'is_norm', 'HSigmoid', 'HSwish', 'NonLocal1d',
+ 'NonLocal2d', 'NonLocal3d', 'ContextBlock', 'GeneralizedAttention',
+ 'ACTIVATION_LAYERS', 'CONV_LAYERS', 'NORM_LAYERS', 'PADDING_LAYERS',
+ 'UPSAMPLE_LAYERS', 'PLUGIN_LAYERS', 'Scale', 'ConvAWS2d', 'ConvWS2d',
+ 'conv_ws_2d', 'DepthwiseSeparableConvModule', 'Swish', 'Linear',
+ 'Conv2dAdaptivePadding', 'Conv2d', 'ConvTranspose2d', 'MaxPool2d',
+ 'ConvTranspose3d', 'MaxPool3d', 'Conv3d', 'Dropout', 'DropPath'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/activation.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/activation.py
new file mode 100644
index 0000000000000000000000000000000000000000..f50241b192677f5372eabbe7948ddf6277e45a0f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/activation.py
@@ -0,0 +1,90 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+from mmcv.utils import TORCH_VERSION, build_from_cfg
+from .registry import ACTIVATION_LAYERS
+
+for module in [
+ nn.ReLU, nn.LeakyReLU, nn.PReLU, nn.RReLU, nn.ReLU6, nn.ELU,
+ nn.Sigmoid, nn.Tanh
+]:
+ ACTIVATION_LAYERS.register_module(module=module)
+
+
+@ACTIVATION_LAYERS.register_module(name='Clip')
+@ACTIVATION_LAYERS.register_module()
+class Clamp(nn.Module):
+ """Clamp activation layer.
+
+ This activation function is to clamp the feature map value within
+ :math:`[min, max]`. More details can be found in ``torch.clamp()``.
+
+ Args:
+        min (Number, optional): Lower bound of the range to be clamped to.
+            Default: -1.
+        max (Number, optional): Upper bound of the range to be clamped to.
+            Default: 1.
+ """
+
+ def __init__(self, min=-1., max=1.):
+ super(Clamp, self).__init__()
+ self.min = min
+ self.max = max
+
+ def forward(self, x):
+ """Forward function.
+
+ Args:
+ x (torch.Tensor): The input tensor.
+
+ Returns:
+ torch.Tensor: Clamped tensor.
+ """
+ return torch.clamp(x, min=self.min, max=self.max)
+
+
+class GELU(nn.Module):
+ r"""Applies the Gaussian Error Linear Units function:
+
+ .. math::
+ \text{GELU}(x) = x * \Phi(x)
+ where :math:`\Phi(x)` is the Cumulative Distribution Function for
+ Gaussian Distribution.
+
+ Shape:
+ - Input: :math:`(N, *)` where `*` means, any number of additional
+ dimensions
+ - Output: :math:`(N, *)`, same shape as the input
+
+ .. image:: scripts/activation_images/GELU.png
+
+ Examples::
+
+ >>> m = nn.GELU()
+ >>> input = torch.randn(2)
+ >>> output = m(input)
+ """
+
+ def forward(self, input):
+ return F.gelu(input)
+
+
+if TORCH_VERSION == 'parrots' or TORCH_VERSION < '1.4':
+ ACTIVATION_LAYERS.register_module(module=GELU)
+else:
+ ACTIVATION_LAYERS.register_module(module=nn.GELU)
+
+
+def build_activation_layer(cfg):
+ """Build activation layer.
+
+ Args:
+ cfg (dict): The activation layer config, which should contain:
+ - type (str): Layer type.
+ - layer args: Args needed to instantiate an activation layer.
+
+ Returns:
+ nn.Module: Created activation layer.
+ """
+ return build_from_cfg(cfg, ACTIVATION_LAYERS)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/context_block.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/context_block.py
new file mode 100644
index 0000000000000000000000000000000000000000..5c0703e7022af6954aa74110db18ffc5f8c5fee2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/context_block.py
@@ -0,0 +1,290 @@
+# import torch
+# from torch import nn
+
+# from ..utils import constant_init, kaiming_init
+# from .registry import PLUGIN_LAYERS
+
+
+# def last_zero_init(m):
+# if isinstance(m, nn.Sequential):
+# constant_init(m[-1], val=0)
+# else:
+# constant_init(m, val=0)
+
+
+# @PLUGIN_LAYERS.register_module()
+# class ContextBlock(nn.Module):
+# """ContextBlock module in GCNet.
+
+# See 'GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond'
+# (https://arxiv.org/abs/1904.11492) for details.
+
+# Args:
+# in_channels (int): Channels of the input feature map.
+# ratio (float): Ratio of channels of transform bottleneck
+# pooling_type (str): Pooling method for context modeling.
+# Options are 'att' and 'avg', stand for attention pooling and
+# average pooling respectively. Default: 'att'.
+# fusion_types (Sequence[str]): Fusion method for feature fusion,
+# Options are 'channels_add', 'channel_mul', stand for channelwise
+# addition and multiplication respectively. Default: ('channel_add',)
+# """
+
+# _abbr_ = 'context_block'
+
+# def __init__(self,
+# in_channels,
+# ratio,
+# pooling_type='att',
+# fusion_types=('channel_add', )):
+# super(ContextBlock, self).__init__()
+# assert pooling_type in ['avg', 'att']
+# assert isinstance(fusion_types, (list, tuple))
+# valid_fusion_types = ['channel_add', 'channel_mul']
+# assert all([f in valid_fusion_types for f in fusion_types])
+# assert len(fusion_types) > 0, 'at least one fusion should be used'
+# self.in_channels = in_channels
+# self.ratio = ratio
+# self.planes = int(in_channels * ratio)
+# self.pooling_type = pooling_type
+# self.fusion_types = fusion_types
+# if pooling_type == 'att':
+# self.conv_mask = nn.Conv2d(in_channels, 1, kernel_size=1)
+# self.softmax = nn.Softmax(dim=2)
+# else:
+# self.avg_pool = nn.AdaptiveAvgPool2d(1)
+# if 'channel_add' in fusion_types:
+# self.channel_add_conv = nn.Sequential(
+# nn.Conv2d(self.in_channels, self.planes, kernel_size=1),
+# nn.LayerNorm([self.planes, 1, 1]),
+# nn.ReLU(inplace=True), # yapf: disable
+# nn.Conv2d(self.planes, self.in_channels, kernel_size=1))
+# else:
+# self.channel_add_conv = None
+# if 'channel_mul' in fusion_types:
+# self.channel_mul_conv = nn.Sequential(
+# nn.Conv2d(self.in_channels, self.planes, kernel_size=1),
+# nn.LayerNorm([self.planes, 1, 1]),
+# nn.ReLU(inplace=True), # yapf: disable
+# nn.Conv2d(self.planes, self.in_channels, kernel_size=1))
+# else:
+# self.channel_mul_conv = None
+# self.reset_parameters()
+
+# def reset_parameters(self):
+# if self.pooling_type == 'att':
+# kaiming_init(self.conv_mask, mode='fan_in')
+# self.conv_mask.inited = True
+
+# if self.channel_add_conv is not None:
+# last_zero_init(self.channel_add_conv)
+# if self.channel_mul_conv is not None:
+# last_zero_init(self.channel_mul_conv)
+
+# def spatial_pool(self, x):
+# batch, channel, height, width = x.size()
+# if self.pooling_type == 'att':
+# input_x = x
+# # [N, C, H * W]
+# input_x = input_x.view(batch, channel, height * width)
+# # [N, 1, C, H * W]
+# input_x = input_x.unsqueeze(1)
+# # [N, 1, H, W]
+# context_mask = self.conv_mask(x)
+# # [N, 1, H * W]
+# context_mask = context_mask.view(batch, 1, height * width)
+# # [N, 1, H * W]
+# context_mask = self.softmax(context_mask)
+# # [N, 1, H * W, 1]
+# context_mask = context_mask.unsqueeze(-1)
+# # [N, 1, C, 1]
+# context = torch.matmul(input_x, context_mask)
+# # [N, C, 1, 1]
+# context = context.view(batch, channel, 1, 1)
+# else:
+# # [N, C, 1, 1]
+# context = self.avg_pool(x)
+
+# return context
+
+# def forward(self, x):
+# # [N, C, 1, 1]
+# context = self.spatial_pool(x)
+
+# out = x
+# if self.channel_mul_conv is not None:
+# # [N, C, 1, 1]
+# channel_mul_term = torch.sigmoid(self.channel_mul_conv(context))
+# out = out * channel_mul_term
+# if self.channel_add_conv is not None:
+# # [N, C, 1, 1]
+# channel_add_term = self.channel_add_conv(context)
+# out = out + channel_add_term
+
+# return out
+
+# Copyright (c) OpenMMLab. All rights reserved.
+import torch
+from torch import nn
+
+from ..utils import constant_init, kaiming_init
+from .registry import PLUGIN_LAYERS
+
+
+# def last_zero_init(m):
+# if isinstance(m, nn.Sequential):
+# constant_init(m[-1], val=0)
+# else:
+# constant_init(m, val=0)
+
+
+
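+# Local re-implementations of kaiming_init and constant_init; note that they
+# shadow the versions imported from ..utils at the top of this file.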
+def kaiming_init(module,
+ a=0,
+ mode='fan_out',
+ nonlinearity='relu',
+ bias=0,
+ distribution='normal'):
+ assert distribution in ['uniform', 'normal']
+ if distribution == 'uniform':
+ nn.init.kaiming_uniform_(
+ module.weight, a=a, mode=mode, nonlinearity=nonlinearity)
+ else:
+ nn.init.kaiming_normal_(
+ module.weight, a=a, mode=mode, nonlinearity=nonlinearity)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+def constant_init(module, val, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.constant_(module.weight, val)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def last_zero_init(m):
+ if isinstance(m, nn.Sequential):
+ constant_init(m[-1], val=0)
+ else:
+ constant_init(m, val=0)
+
+
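+# Thin wrapper around nn.LayerNorm: normalizes an (N, C, 1, 1) context tensor
+# over its channel dimension by squeezing the trailing singleton dims, then
+# restores the original shape.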
+class LayerNorm(nn.LayerNorm):
+ def __init__(self, normalized_shape):
+ super(LayerNorm, self).__init__(normalized_shape[0])
+
+ def forward(self, x):
+ shape_raw = x.shape
+ x = super(LayerNorm, self).forward(x.squeeze()).reshape(shape_raw)
+ return x
+
+@PLUGIN_LAYERS.register_module()
+class ContextBlock(nn.Module):
+ """ContextBlock module in GCNet.
+
+ See 'GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond'
+ (https://arxiv.org/abs/1904.11492) for details.
+
+ Args:
+ in_channels (int): Channels of the input feature map.
+        ratio (float): Ratio of channels of the transform bottleneck.
+        pooling_type (str): Pooling method for context modeling.
+            Options are 'att' and 'avg', which stand for attention pooling and
+            average pooling respectively. Default: 'att'.
+        fusion_types (Sequence[str]): Fusion methods for feature fusion.
+            Options are 'channel_add' and 'channel_mul', which stand for
+            channel-wise addition and multiplication respectively.
+            Default: ('channel_add',)
+ """
+
+ _abbr_ = 'context_block'
+
+ def __init__(self,
+ in_channels,
+ ratio,
+ pooling_type='att',
+ fusion_types=('channel_add', )):
+ super(ContextBlock, self).__init__()
+ assert pooling_type in ['avg', 'att']
+ assert isinstance(fusion_types, (list, tuple))
+ valid_fusion_types = ['channel_add', 'channel_mul']
+ assert all([f in valid_fusion_types for f in fusion_types])
+ assert len(fusion_types) > 0, 'at least one fusion should be used'
+ self.in_channels = in_channels
+ self.ratio = ratio
+ self.planes = int(in_channels * ratio)
+ self.pooling_type = pooling_type
+ self.fusion_types = fusion_types
+ if pooling_type == 'att':
+ self.conv_mask = nn.Conv2d(in_channels, 1, kernel_size=1)
+ self.softmax = nn.Softmax(dim=2)
+ else:
+ self.avg_pool = nn.AdaptiveAvgPool2d(1)
+ if 'channel_add' in fusion_types:
+ self.channel_add_conv = nn.Sequential(
+ nn.Conv2d(self.in_channels, self.planes, kernel_size=1),
+ LayerNorm([self.planes, 1, 1]),
+ nn.ReLU(inplace=True), # yapf: disable
+ nn.Conv2d(self.planes, self.in_channels, kernel_size=1))
+ else:
+ self.channel_add_conv = None
+ if 'channel_mul' in fusion_types:
+ self.channel_mul_conv = nn.Sequential(
+ nn.Conv2d(self.in_channels, self.planes, kernel_size=1),
+ LayerNorm([self.planes, 1, 1]),
+ nn.ReLU(inplace=True), # yapf: disable
+ nn.Conv2d(self.planes, self.in_channels, kernel_size=1))
+ else:
+ self.channel_mul_conv = None
+ self.reset_parameters()
+
+ def reset_parameters(self):
+ if self.pooling_type == 'att':
+ kaiming_init(self.conv_mask, mode='fan_in')
+ self.conv_mask.inited = True
+
+ if self.channel_add_conv is not None:
+ last_zero_init(self.channel_add_conv)
+ if self.channel_mul_conv is not None:
+ last_zero_init(self.channel_mul_conv)
+
+ def spatial_pool(self, x):
+ batch, channel, height, width = x.size()
+ if self.pooling_type == 'att':
+ input_x = x
+ # [N, C, H * W]
+ input_x = input_x.view(batch, channel, height * width)
+ # [N, 1, C, H * W]
+ input_x = input_x.unsqueeze(1)
+ # [N, 1, H, W]
+ context_mask = self.conv_mask(x)
+ # [N, 1, H * W]
+ context_mask = context_mask.view(batch, 1, height * width)
+ # [N, 1, H * W]
+ context_mask = self.softmax(context_mask)
+ # [N, 1, H * W, 1]
+ context_mask = context_mask.unsqueeze(-1)
+ # [N, 1, C, 1]
+ context = torch.matmul(input_x, context_mask)
+ # [N, C, 1, 1]
+ context = context.view(batch, channel, 1, 1)
+ else:
+ # [N, C, 1, 1]
+ context = self.avg_pool(x)
+
+ return context
+
+ def forward(self, x):
+ # [N, C, 1, 1]
+ context = self.spatial_pool(x)
+
+ out = x
+ if self.channel_mul_conv is not None:
+ # [N, C, 1, 1]
+ channel_mul_term = torch.sigmoid(self.channel_mul_conv(context))
+ out = out * channel_mul_term
+ if self.channel_add_conv is not None:
+ # [N, C, 1, 1]
+ channel_add_term = self.channel_add_conv(context)
+ out = out + channel_add_term
+
+ return out
\ No newline at end of file
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv.py
new file mode 100644
index 0000000000000000000000000000000000000000..bd3928cc59e12d89025fada53e7c3052f5150764
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv.py
@@ -0,0 +1,43 @@
+from torch import nn
+
+from .registry import CONV_LAYERS
+
+CONV_LAYERS.register_module('Conv1d', module=nn.Conv1d)
+CONV_LAYERS.register_module('Conv2d', module=nn.Conv2d)
+CONV_LAYERS.register_module('Conv3d', module=nn.Conv3d)
+CONV_LAYERS.register_module('Conv', module=nn.Conv2d)
+
+
+def build_conv_layer(cfg, *args, **kwargs):
+ """Build convolution layer.
+
+ Args:
+ cfg (None or dict): The conv layer config, which should contain:
+ - type (str): Layer type.
+            - layer args: Args needed to instantiate a conv layer.
+ args (argument list): Arguments passed to the `__init__`
+ method of the corresponding conv layer.
+ kwargs (keyword arguments): Keyword arguments passed to the `__init__`
+ method of the corresponding conv layer.
+
+ Returns:
+ nn.Module: Created conv layer.
+ """
+ if cfg is None:
+ cfg_ = dict(type='Conv2d')
+ else:
+ if not isinstance(cfg, dict):
+ raise TypeError('cfg must be a dict')
+ if 'type' not in cfg:
+ raise KeyError('the cfg dict must contain the key "type"')
+ cfg_ = cfg.copy()
+
+ layer_type = cfg_.pop('type')
+ if layer_type not in CONV_LAYERS:
+        raise KeyError(f'Unrecognized conv type {layer_type}')
+ else:
+ conv_layer = CONV_LAYERS.get(layer_type)
+
+ layer = conv_layer(*args, **kwargs, **cfg_)
+
+ return layer
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv2d_adaptive_padding.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv2d_adaptive_padding.py
new file mode 100644
index 0000000000000000000000000000000000000000..6b636b034559e5a74f60642e9ec7c6202674a057
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv2d_adaptive_padding.py
@@ -0,0 +1,61 @@
+import math
+
+from torch import nn
+from torch.nn import functional as F
+
+from .registry import CONV_LAYERS
+
+
+@CONV_LAYERS.register_module()
+class Conv2dAdaptivePadding(nn.Conv2d):
+ """Implementation of 2D convolution in tensorflow with `padding` as "same",
+ which applies padding to input (if needed) so that input image gets fully
+ covered by filter and stride you specified. For stride 1, this will ensure
+ that output image size is same as input. For stride of 2, output dimensions
+ will be half, for example.
+
+ Args:
+ in_channels (int): Number of channels in the input image
+ out_channels (int): Number of channels produced by the convolution
+ kernel_size (int or tuple): Size of the convolving kernel
+ stride (int or tuple, optional): Stride of the convolution. Default: 1
+ padding (int or tuple, optional): Zero-padding added to both sides of
+ the input. Default: 0
+ dilation (int or tuple, optional): Spacing between kernel elements.
+ Default: 1
+ groups (int, optional): Number of blocked connections from input
+ channels to output channels. Default: 1
+ bias (bool, optional): If ``True``, adds a learnable bias to the
+ output. Default: ``True``
+ """
+
+ def __init__(self,
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias=True):
+ super().__init__(in_channels, out_channels, kernel_size, stride, 0,
+ dilation, groups, bias)
+
+ def forward(self, x):
+ img_h, img_w = x.size()[-2:]
+ kernel_h, kernel_w = self.weight.size()[-2:]
+ stride_h, stride_w = self.stride
+ output_h = math.ceil(img_h / stride_h)
+ output_w = math.ceil(img_w / stride_w)
+ pad_h = (
+ max((output_h - 1) * self.stride[0] +
+ (kernel_h - 1) * self.dilation[0] + 1 - img_h, 0))
+ pad_w = (
+ max((output_w - 1) * self.stride[1] +
+ (kernel_w - 1) * self.dilation[1] + 1 - img_w, 0))
+ if pad_h > 0 or pad_w > 0:
+ x = F.pad(x, [
+ pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2
+ ])
+ return F.conv2d(x, self.weight, self.bias, self.stride, self.padding,
+ self.dilation, self.groups)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv_module.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv_module.py
new file mode 100644
index 0000000000000000000000000000000000000000..d4c4d772bcebf228fae5143b3665393da50ec8df
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv_module.py
@@ -0,0 +1,203 @@
+import warnings
+
+import torch.nn as nn
+
+from ..utils import constant_init, kaiming_init
+from .activation import build_activation_layer
+from .conv import build_conv_layer
+from .norm import build_norm_layer
+from .padding import build_padding_layer
+from .registry import PLUGIN_LAYERS
+
+
+@PLUGIN_LAYERS.register_module()
+class ConvModule(nn.Module):
+ """A conv block that bundles conv/norm/activation layers.
+
+ This block simplifies the usage of convolution layers, which are commonly
+ used with a norm layer (e.g., BatchNorm) and activation layer (e.g., ReLU).
+ It is based upon three build methods: `build_conv_layer()`,
+ `build_norm_layer()` and `build_activation_layer()`.
+
+ Besides, we add some additional features in this module.
+ 1. Automatically set `bias` of the conv layer.
+ 2. Spectral norm is supported.
+ 3. More padding modes are supported. Before PyTorch 1.5, nn.Conv2d only
+ supports zero and circular padding, and we add "reflect" padding mode.
+
+ Args:
+ in_channels (int): Number of channels in the input feature map.
+ Same as that in ``nn._ConvNd``.
+ out_channels (int): Number of channels produced by the convolution.
+ Same as that in ``nn._ConvNd``.
+ kernel_size (int | tuple[int]): Size of the convolving kernel.
+ Same as that in ``nn._ConvNd``.
+ stride (int | tuple[int]): Stride of the convolution.
+ Same as that in ``nn._ConvNd``.
+ padding (int | tuple[int]): Zero-padding added to both sides of
+ the input. Same as that in ``nn._ConvNd``.
+ dilation (int | tuple[int]): Spacing between kernel elements.
+ Same as that in ``nn._ConvNd``.
+ groups (int): Number of blocked connections from input channels to
+ output channels. Same as that in ``nn._ConvNd``.
+ bias (bool | str): If specified as `auto`, it will be decided by the
+ norm_cfg. Bias will be set as True if `norm_cfg` is None, otherwise
+ False. Default: "auto".
+ conv_cfg (dict): Config dict for convolution layer. Default: None,
+ which means using conv2d.
+ norm_cfg (dict): Config dict for normalization layer. Default: None.
+ act_cfg (dict): Config dict for activation layer.
+ Default: dict(type='ReLU').
+ inplace (bool): Whether to use inplace mode for activation.
+ Default: True.
+ with_spectral_norm (bool): Whether use spectral norm in conv module.
+ Default: False.
+ padding_mode (str): If the `padding_mode` has not been supported by
+ current `Conv2d` in PyTorch, we will use our own padding layer
+ instead. Currently, we support ['zeros', 'circular'] with official
+ implementation and ['reflect'] with our own implementation.
+ Default: 'zeros'.
+ order (tuple[str]): The order of conv/norm/activation layers. It is a
+ sequence of "conv", "norm" and "act". Common examples are
+ ("conv", "norm", "act") and ("act", "conv", "norm").
+ Default: ('conv', 'norm', 'act').
+ """
+
+ _abbr_ = 'conv_block'
+
+ def __init__(self,
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias='auto',
+ conv_cfg=None,
+ norm_cfg=None,
+ act_cfg=dict(type='ReLU'),
+ inplace=True,
+ with_spectral_norm=False,
+ padding_mode='zeros',
+ order=('conv', 'norm', 'act')):
+ super(ConvModule, self).__init__()
+ assert conv_cfg is None or isinstance(conv_cfg, dict)
+ assert norm_cfg is None or isinstance(norm_cfg, dict)
+ assert act_cfg is None or isinstance(act_cfg, dict)
+ official_padding_mode = ['zeros', 'circular']
+ self.conv_cfg = conv_cfg
+ self.norm_cfg = norm_cfg
+ self.act_cfg = act_cfg
+ self.inplace = inplace
+ self.with_spectral_norm = with_spectral_norm
+ self.with_explicit_padding = padding_mode not in official_padding_mode
+ self.order = order
+ assert isinstance(self.order, tuple) and len(self.order) == 3
+ assert set(order) == set(['conv', 'norm', 'act'])
+
+ self.with_norm = norm_cfg is not None
+ self.with_activation = act_cfg is not None
+ # if the conv layer is before a norm layer, bias is unnecessary.
+ if bias == 'auto':
+ bias = not self.with_norm
+ self.with_bias = bias
+
+ if self.with_norm and self.with_bias:
+ warnings.warn('ConvModule has norm and bias at the same time')
+
+ if self.with_explicit_padding:
+ pad_cfg = dict(type=padding_mode)
+ self.padding_layer = build_padding_layer(pad_cfg, padding)
+
+ # reset padding to 0 for conv module
+ conv_padding = 0 if self.with_explicit_padding else padding
+ # build convolution layer
+ self.conv = build_conv_layer(
+ conv_cfg,
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=stride,
+ padding=conv_padding,
+ dilation=dilation,
+ groups=groups,
+ bias=bias)
+ # export the attributes of self.conv to a higher level for convenience
+ self.in_channels = self.conv.in_channels
+ self.out_channels = self.conv.out_channels
+ self.kernel_size = self.conv.kernel_size
+ self.stride = self.conv.stride
+ self.padding = padding
+ self.dilation = self.conv.dilation
+ self.transposed = self.conv.transposed
+ self.output_padding = self.conv.output_padding
+ self.groups = self.conv.groups
+
+ if self.with_spectral_norm:
+ self.conv = nn.utils.spectral_norm(self.conv)
+
+ # build normalization layers
+ if self.with_norm:
+ # norm layer is after conv layer
+ if order.index('norm') > order.index('conv'):
+ norm_channels = out_channels
+ else:
+ norm_channels = in_channels
+ self.norm_name, norm = build_norm_layer(norm_cfg, norm_channels)
+ self.add_module(self.norm_name, norm)
+ else:
+ self.norm_name = None
+
+ # build activation layer
+ if self.with_activation:
+ act_cfg_ = act_cfg.copy()
+ # nn.Tanh has no 'inplace' argument
+ if act_cfg_['type'] not in [
+ 'Tanh', 'PReLU', 'Sigmoid', 'HSigmoid', 'Swish'
+ ]:
+ act_cfg_.setdefault('inplace', inplace)
+ self.activate = build_activation_layer(act_cfg_)
+
+ # Use msra init by default
+ self.init_weights()
+
+ @property
+ def norm(self):
+ if self.norm_name:
+ return getattr(self, self.norm_name)
+ else:
+ return None
+
+ def init_weights(self):
+ # 1. This is mainly for customized conv layers that define their own
+ # initialization manners (i.e., they implement ``init_weights()``);
+ # we do not want ConvModule to override that initialization.
+ # 2. Customized conv layers without their own ``init_weights()`` and
+ # PyTorch's conv layers will be initialized by this method with the
+ # default ``kaiming_init``.
+ # Note: For PyTorch's conv layers, the default PyTorch initialization
+ # is overwritten by our ``kaiming_init`` implementation.
+ if not hasattr(self.conv, 'init_weights'):
+ if self.with_activation and self.act_cfg['type'] == 'LeakyReLU':
+ nonlinearity = 'leaky_relu'
+ a = self.act_cfg.get('negative_slope', 0.01)
+ else:
+ nonlinearity = 'relu'
+ a = 0
+ kaiming_init(self.conv, a=a, nonlinearity=nonlinearity)
+ if self.with_norm:
+ constant_init(self.norm, 1, bias=0)
+
+ def forward(self, x, activate=True, norm=True):
+ for layer in self.order:
+ if layer == 'conv':
+ if self.with_explicit_padding:
+ x = self.padding_layer(x)
+ x = self.conv(x)
+ elif layer == 'norm' and norm and self.with_norm:
+ x = self.norm(x)
+ elif layer == 'act' and activate and self.with_activation:
+ x = self.activate(x)
+ return x
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv_ws.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv_ws.py
new file mode 100644
index 0000000000000000000000000000000000000000..5dea2312fb9eb70adba18d602879845e02c0c696
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/conv_ws.py
@@ -0,0 +1,147 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+from .registry import CONV_LAYERS
+
+
+def conv_ws_2d(input,
+ weight,
+ bias=None,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ eps=1e-5):
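+ # Weight Standardization: normalize each output filter to zero mean and
+ # unit std (plus eps) before running the standard conv2d.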
+ c_in = weight.size(0)
+ weight_flat = weight.view(c_in, -1)
+ mean = weight_flat.mean(dim=1, keepdim=True).view(c_in, 1, 1, 1)
+ std = weight_flat.std(dim=1, keepdim=True).view(c_in, 1, 1, 1)
+ weight = (weight - mean) / (std + eps)
+ return F.conv2d(input, weight, bias, stride, padding, dilation, groups)
+
+
+@CONV_LAYERS.register_module('ConvWS')
+class ConvWS2d(nn.Conv2d):
+
+ def __init__(self,
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias=True,
+ eps=1e-5):
+ super(ConvWS2d, self).__init__(
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=stride,
+ padding=padding,
+ dilation=dilation,
+ groups=groups,
+ bias=bias)
+ self.eps = eps
+
+ def forward(self, x):
+ return conv_ws_2d(x, self.weight, self.bias, self.stride, self.padding,
+ self.dilation, self.groups, self.eps)
+
+
+@CONV_LAYERS.register_module(name='ConvAWS')
+class ConvAWS2d(nn.Conv2d):
+ """AWS (Adaptive Weight Standardization)
+
+ This is a variant of Weight Standardization
+ (https://arxiv.org/pdf/1903.10520.pdf)
+ It is used in DetectoRS to avoid NaN
+ (https://arxiv.org/pdf/2006.02334.pdf)
+
+ Args:
+ in_channels (int): Number of channels in the input image
+ out_channels (int): Number of channels produced by the convolution
+ kernel_size (int or tuple): Size of the conv kernel
+ stride (int or tuple, optional): Stride of the convolution. Default: 1
+ padding (int or tuple, optional): Zero-padding added to both sides of
+ the input. Default: 0
+ dilation (int or tuple, optional): Spacing between kernel elements.
+ Default: 1
+ groups (int, optional): Number of blocked connections from input
+ channels to output channels. Default: 1
+ bias (bool, optional): If set True, adds a learnable bias to the
+ output. Default: True
+ """
+
+ def __init__(self,
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ groups=1,
+ bias=True):
+ super().__init__(
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=stride,
+ padding=padding,
+ dilation=dilation,
+ groups=groups,
+ bias=bias)
+ self.register_buffer('weight_gamma',
+ torch.ones(self.out_channels, 1, 1, 1))
+ self.register_buffer('weight_beta',
+ torch.zeros(self.out_channels, 1, 1, 1))
+
+ def _get_weight(self, weight):
+ weight_flat = weight.view(weight.size(0), -1)
+ mean = weight_flat.mean(dim=1).view(-1, 1, 1, 1)
+ std = torch.sqrt(weight_flat.var(dim=1) + 1e-5).view(-1, 1, 1, 1)
+ weight = (weight - mean) / std
+ weight = self.weight_gamma * weight + self.weight_beta
+ return weight
+
+ def forward(self, x):
+ weight = self._get_weight(self.weight)
+ return F.conv2d(x, weight, self.bias, self.stride, self.padding,
+ self.dilation, self.groups)
+
+ def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
+ missing_keys, unexpected_keys, error_msgs):
+ """Override default load function.
+
+ AWS overrides the function _load_from_state_dict to recover
+ weight_gamma and weight_beta if they are missing. If weight_gamma and
+ weight_beta are found in the checkpoint, this function will return
+ after super()._load_from_state_dict. Otherwise, it will compute the
+ mean and std of the pretrained weights and store them in weight_beta
+ and weight_gamma.
+ """
+
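+ # Use -1 as a sentinel: if the checkpoint provides `weight_gamma`, the
+ # value below is overwritten and its mean becomes positive, so we can
+ # tell whether gamma/beta need to be recovered from the conv weights.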
+ self.weight_gamma.data.fill_(-1)
+ local_missing_keys = []
+ super()._load_from_state_dict(state_dict, prefix, local_metadata,
+ strict, local_missing_keys,
+ unexpected_keys, error_msgs)
+ if self.weight_gamma.data.mean() > 0:
+ for k in local_missing_keys:
+ missing_keys.append(k)
+ return
+ weight = self.weight.data
+ weight_flat = weight.view(weight.size(0), -1)
+ mean = weight_flat.mean(dim=1).view(-1, 1, 1, 1)
+ std = torch.sqrt(weight_flat.var(dim=1) + 1e-5).view(-1, 1, 1, 1)
+ self.weight_beta.data.copy_(mean)
+ self.weight_gamma.data.copy_(std)
+ missing_gamma_beta = [
+ k for k in local_missing_keys
+ if k.endswith('weight_gamma') or k.endswith('weight_beta')
+ ]
+ for k in missing_gamma_beta:
+ local_missing_keys.remove(k)
+ for k in local_missing_keys:
+ missing_keys.append(k)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/depthwise_separable_conv_module.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/depthwise_separable_conv_module.py
new file mode 100644
index 0000000000000000000000000000000000000000..aee8b7f63bfae94b358fab01de74d6158145ea3d
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/depthwise_separable_conv_module.py
@@ -0,0 +1,95 @@
+import torch.nn as nn
+
+from .conv_module import ConvModule
+
+
+class DepthwiseSeparableConvModule(nn.Module):
+ """Depthwise separable convolution module.
+
+ See https://arxiv.org/pdf/1704.04861.pdf for details.
+
+ This module can replace a ConvModule, with the single conv block replaced
+ by two conv blocks: a depthwise conv block and a pointwise conv block. The
+ depthwise conv block contains depthwise-conv/norm/activation layers and
+ the pointwise conv block contains pointwise-conv/norm/activation layers.
+ Note that the norm/activation layers are inserted in the depthwise conv
+ block as well when `norm_cfg` and `act_cfg` are specified.
+
+ Args:
+ in_channels (int): Number of channels in the input feature map.
+ Same as that in ``nn._ConvNd``.
+ out_channels (int): Number of channels produced by the convolution.
+ Same as that in ``nn._ConvNd``.
+ kernel_size (int | tuple[int]): Size of the convolving kernel.
+ Same as that in ``nn._ConvNd``.
+ stride (int | tuple[int]): Stride of the convolution.
+ Same as that in ``nn._ConvNd``. Default: 1.
+ padding (int | tuple[int]): Zero-padding added to both sides of
+ the input. Same as that in ``nn._ConvNd``. Default: 0.
+ dilation (int | tuple[int]): Spacing between kernel elements.
+ Same as that in ``nn._ConvNd``. Default: 1.
+ norm_cfg (dict): Default norm config for both depthwise ConvModule and
+ pointwise ConvModule. Default: None.
+ act_cfg (dict): Default activation config for both depthwise ConvModule
+ and pointwise ConvModule. Default: dict(type='ReLU').
+ dw_norm_cfg (dict): Norm config of depthwise ConvModule. If it is
+ 'default', it will be the same as `norm_cfg`. Default: 'default'.
+ dw_act_cfg (dict): Activation config of depthwise ConvModule. If it is
+ 'default', it will be the same as `act_cfg`. Default: 'default'.
+ pw_norm_cfg (dict): Norm config of pointwise ConvModule. If it is
+ 'default', it will be the same as `norm_cfg`. Default: 'default'.
+ pw_act_cfg (dict): Activation config of pointwise ConvModule. If it is
+ 'default', it will be the same as `act_cfg`. Default: 'default'.
+ kwargs (optional): Other shared arguments for depthwise and pointwise
+ ConvModule. See ConvModule for ref.
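+
+ Example::
+
+ >>> # Illustrative sketch: 3x3 depthwise conv followed by a 1x1 pointwise conv.
+ >>> conv = DepthwiseSeparableConvModule(16, 32, 3, padding=1, norm_cfg=dict(type='BN'))
+ >>> x = torch.rand(2, 16, 64, 64)
+ >>> out = conv(x)  # (2, 32, 64, 64)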
+ """
+
+ def __init__(self,
+ in_channels,
+ out_channels,
+ kernel_size,
+ stride=1,
+ padding=0,
+ dilation=1,
+ norm_cfg=None,
+ act_cfg=dict(type='ReLU'),
+ dw_norm_cfg='default',
+ dw_act_cfg='default',
+ pw_norm_cfg='default',
+ pw_act_cfg='default',
+ **kwargs):
+ super(DepthwiseSeparableConvModule, self).__init__()
+ assert 'groups' not in kwargs, 'groups should not be specified'
+
+ # if norm/activation config of depthwise/pointwise ConvModule is not
+ # specified, use default config.
+ dw_norm_cfg = dw_norm_cfg if dw_norm_cfg != 'default' else norm_cfg
+ dw_act_cfg = dw_act_cfg if dw_act_cfg != 'default' else act_cfg
+ pw_norm_cfg = pw_norm_cfg if pw_norm_cfg != 'default' else norm_cfg
+ pw_act_cfg = pw_act_cfg if pw_act_cfg != 'default' else act_cfg
+
+ # depthwise convolution
+ self.depthwise_conv = ConvModule(
+ in_channels,
+ in_channels,
+ kernel_size,
+ stride=stride,
+ padding=padding,
+ dilation=dilation,
+ groups=in_channels,
+ norm_cfg=dw_norm_cfg,
+ act_cfg=dw_act_cfg,
+ **kwargs)
+
+ self.pointwise_conv = ConvModule(
+ in_channels,
+ out_channels,
+ 1,
+ norm_cfg=pw_norm_cfg,
+ act_cfg=pw_act_cfg,
+ **kwargs)
+
+ def forward(self, x):
+ x = self.depthwise_conv(x)
+ x = self.pointwise_conv(x)
+ return x
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/drop.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/drop.py
new file mode 100644
index 0000000000000000000000000000000000000000..dd380c21628faae4e6896b29492f81085dfa5417
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/drop.py
@@ -0,0 +1,64 @@
+import torch
+import torch.nn as nn
+
+from mmcv import build_from_cfg
+from .registry import DROPOUT_LAYERS
+
+
+def drop_path(x, drop_prob=0., training=False):
+ """Drop paths (Stochastic Depth) per sample (when applied in main path of
+ residual blocks).
+
+ We follow the implementation
+ https://github.com/rwightman/pytorch-image-models/blob/a2727c1bf78ba0d7b5727f5f95e37fb7f8866b1f/timm/models/layers/drop.py # noqa: E501
+ """
+ if drop_prob == 0. or not training:
+ return x
+ keep_prob = 1 - drop_prob
+ # handle tensors with different dimensions, not just 4D tensors.
+ shape = (x.shape[0], ) + (1, ) * (x.ndim - 1)
+ random_tensor = keep_prob + torch.rand(
+ shape, dtype=x.dtype, device=x.device)
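+ # floor() binarizes random_tensor to 0/1 with P(1) = keep_prob; dividing x
+ # by keep_prob keeps the expected value of the output unchanged.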
+ output = x.div(keep_prob) * random_tensor.floor()
+ return output
+
+
+@DROPOUT_LAYERS.register_module()
+class DropPath(nn.Module):
+ """Drop paths (Stochastic Depth) per sample (when applied in main path of
+ residual blocks).
+
+ We follow the implementation
+ https://github.com/rwightman/pytorch-image-models/blob/a2727c1bf78ba0d7b5727f5f95e37fb7f8866b1f/timm/models/layers/drop.py # noqa: E501
+
+ Args:
+ drop_prob (float): Probability of the path to be zeroed. Default: 0.1
+ """
+
+ def __init__(self, drop_prob=0.1):
+ super(DropPath, self).__init__()
+ self.drop_prob = drop_prob
+
+ def forward(self, x):
+ return drop_path(x, self.drop_prob, self.training)
+
+
+@DROPOUT_LAYERS.register_module()
+class Dropout(nn.Dropout):
+ """A wrapper for ``torch.nn.Dropout``, We rename the ``p`` of
+ ``torch.nn.Dropout`` to ``drop_prob`` so as to be consistent with
+ ``DropPath``
+
+ Args:
+ drop_prob (float): Probability of the elements to be
+ zeroed. Default: 0.5.
+ inplace (bool): Do the operation inplace or not. Default: False.
+ """
+
+ def __init__(self, drop_prob=0.5, inplace=False):
+ super().__init__(p=drop_prob, inplace=inplace)
+
+
+def build_dropout(cfg, default_args=None):
+ """Builder for drop out layers."""
+ return build_from_cfg(cfg, DROPOUT_LAYERS, default_args)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/generalized_attention.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/generalized_attention.py
new file mode 100644
index 0000000000000000000000000000000000000000..c6e4f00d35bf48836520f6e3db88bd7a7e2d5b6b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/generalized_attention.py
@@ -0,0 +1,411 @@
+import math
+
+import numpy as np
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+
+from ..utils import kaiming_init
+from .registry import PLUGIN_LAYERS
+
+
+@PLUGIN_LAYERS.register_module()
+class GeneralizedAttention(nn.Module):
+ """GeneralizedAttention module.
+
+ See 'An Empirical Study of Spatial Attention Mechanisms in Deep Networks'
+ (https://arxiv.org/abs/1904.05873) for details.
+
+ Args:
+ in_channels (int): Channels of the input feature map.
+ spatial_range (int): The spatial range. -1 indicates no spatial range
+ constraint. Default: -1.
+ num_heads (int): The head number of empirical_attention module.
+ Default: 9.
+ position_embedding_dim (int): The position embedding dimension.
+ Default: -1.
+ position_magnitude (int): A multiplier acting on coord difference.
+ Default: 1.
+ kv_stride (int): The feature stride acting on key/value feature map.
+ Default: 2.
+ q_stride (int): The feature stride acting on query feature map.
+ Default: 1.
+ attention_type (str): A binary indicator string for indicating which
+ items in generalized empirical_attention module are used.
+ Default: '1111'.
+
+ - '1000' indicates 'query and key content' (appr - appr) item,
+ - '0100' indicates 'query content and relative position'
+ (appr - position) item,
+ - '0010' indicates 'key content only' (bias - appr) item,
+ - '0001' indicates 'relative position only' (bias - position) item.
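+
+ Example::
+
+ >>> # Illustrative sketch: full '1111' attention on a 256-channel feature map.
+ >>> attn = GeneralizedAttention(in_channels=256, num_heads=8)
+ >>> x = torch.rand(1, 256, 32, 32)
+ >>> out = attn(x)  # residual output, same shape as x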
+ """
+
+ _abbr_ = 'gen_attention_block'
+
+ def __init__(self,
+ in_channels,
+ spatial_range=-1,
+ num_heads=9,
+ position_embedding_dim=-1,
+ position_magnitude=1,
+ kv_stride=2,
+ q_stride=1,
+ attention_type='1111'):
+
+ super(GeneralizedAttention, self).__init__()
+
+ # hard range means local range for non-local operation
+ self.position_embedding_dim = (
+ position_embedding_dim
+ if position_embedding_dim > 0 else in_channels)
+
+ self.position_magnitude = position_magnitude
+ self.num_heads = num_heads
+ self.in_channels = in_channels
+ self.spatial_range = spatial_range
+ self.kv_stride = kv_stride
+ self.q_stride = q_stride
+ self.attention_type = [bool(int(_)) for _ in attention_type]
+ self.qk_embed_dim = in_channels // num_heads
+ out_c = self.qk_embed_dim * num_heads
+
+ if self.attention_type[0] or self.attention_type[1]:
+ self.query_conv = nn.Conv2d(
+ in_channels=in_channels,
+ out_channels=out_c,
+ kernel_size=1,
+ bias=False)
+ self.query_conv.kaiming_init = True
+
+ if self.attention_type[0] or self.attention_type[2]:
+ self.key_conv = nn.Conv2d(
+ in_channels=in_channels,
+ out_channels=out_c,
+ kernel_size=1,
+ bias=False)
+ self.key_conv.kaiming_init = True
+
+ self.v_dim = in_channels // num_heads
+ self.value_conv = nn.Conv2d(
+ in_channels=in_channels,
+ out_channels=self.v_dim * num_heads,
+ kernel_size=1,
+ bias=False)
+ self.value_conv.kaiming_init = True
+
+ if self.attention_type[1] or self.attention_type[3]:
+ self.appr_geom_fc_x = nn.Linear(
+ self.position_embedding_dim // 2, out_c, bias=False)
+ self.appr_geom_fc_x.kaiming_init = True
+
+ self.appr_geom_fc_y = nn.Linear(
+ self.position_embedding_dim // 2, out_c, bias=False)
+ self.appr_geom_fc_y.kaiming_init = True
+
+ if self.attention_type[2]:
+ stdv = 1.0 / math.sqrt(self.qk_embed_dim * 2)
+ appr_bias_value = -2 * stdv * torch.rand(out_c) + stdv
+ self.appr_bias = nn.Parameter(appr_bias_value)
+
+ if self.attention_type[3]:
+ stdv = 1.0 / math.sqrt(self.qk_embed_dim * 2)
+ geom_bias_value = -2 * stdv * torch.rand(out_c) + stdv
+ self.geom_bias = nn.Parameter(geom_bias_value)
+
+ self.proj_conv = nn.Conv2d(
+ in_channels=self.v_dim * num_heads,
+ out_channels=in_channels,
+ kernel_size=1,
+ bias=True)
+ self.proj_conv.kaiming_init = True
+ self.gamma = nn.Parameter(torch.zeros(1))
+
+ if self.spatial_range >= 0:
+ # only works when non local is after 3*3 conv
+ if in_channels == 256:
+ max_len = 84
+ elif in_channels == 512:
+ max_len = 42
+
+ max_len_kv = int((max_len - 1.0) / self.kv_stride + 1)
+ local_constraint_map = np.ones(
+ (max_len, max_len, max_len_kv, max_len_kv), dtype=int)  # np.int is deprecated in recent NumPy
+ for iy in range(max_len):
+ for ix in range(max_len):
+ local_constraint_map[
+ iy, ix,
+ max((iy - self.spatial_range) //
+ self.kv_stride, 0):min((iy + self.spatial_range +
+ 1) // self.kv_stride +
+ 1, max_len),
+ max((ix - self.spatial_range) //
+ self.kv_stride, 0):min((ix + self.spatial_range +
+ 1) // self.kv_stride +
+ 1, max_len)] = 0
+
+ self.local_constraint_map = nn.Parameter(
+ torch.from_numpy(local_constraint_map).byte(),
+ requires_grad=False)
+
+ if self.q_stride > 1:
+ self.q_downsample = nn.AvgPool2d(
+ kernel_size=1, stride=self.q_stride)
+ else:
+ self.q_downsample = None
+
+ if self.kv_stride > 1:
+ self.kv_downsample = nn.AvgPool2d(
+ kernel_size=1, stride=self.kv_stride)
+ else:
+ self.kv_downsample = None
+
+ self.init_weights()
+
+ def get_position_embedding(self,
+ h,
+ w,
+ h_kv,
+ w_kv,
+ q_stride,
+ kv_stride,
+ device,
+ dtype,
+ feat_dim,
+ wave_length=1000):
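+ # Build sinusoidal embeddings of the (query - key) coordinate offsets,
+ # analogous to the positional encoding in 'Attention Is All You Need'.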
+ # the default type of Tensor is float32, leading to type mismatch
+ # in fp16 mode. Cast it to support fp16 mode.
+ h_idxs = torch.linspace(0, h - 1, h).to(device=device, dtype=dtype)
+ h_idxs = h_idxs.view((h, 1)) * q_stride
+
+ w_idxs = torch.linspace(0, w - 1, w).to(device=device, dtype=dtype)
+ w_idxs = w_idxs.view((w, 1)) * q_stride
+
+ h_kv_idxs = torch.linspace(0, h_kv - 1, h_kv).to(
+ device=device, dtype=dtype)
+ h_kv_idxs = h_kv_idxs.view((h_kv, 1)) * kv_stride
+
+ w_kv_idxs = torch.linspace(0, w_kv - 1, w_kv).to(
+ device=device, dtype=dtype)
+ w_kv_idxs = w_kv_idxs.view((w_kv, 1)) * kv_stride
+
+ # (h, h_kv, 1)
+ h_diff = h_idxs.unsqueeze(1) - h_kv_idxs.unsqueeze(0)
+ h_diff *= self.position_magnitude
+
+ # (w, w_kv, 1)
+ w_diff = w_idxs.unsqueeze(1) - w_kv_idxs.unsqueeze(0)
+ w_diff *= self.position_magnitude
+
+ feat_range = torch.arange(0, feat_dim / 4).to(
+ device=device, dtype=dtype)
+
+ dim_mat = torch.Tensor([wave_length]).to(device=device, dtype=dtype)
+ dim_mat = dim_mat**((4. / feat_dim) * feat_range)
+ dim_mat = dim_mat.view((1, 1, -1))
+
+ embedding_x = torch.cat(
+ ((w_diff / dim_mat).sin(), (w_diff / dim_mat).cos()), dim=2)
+
+ embedding_y = torch.cat(
+ ((h_diff / dim_mat).sin(), (h_diff / dim_mat).cos()), dim=2)
+
+ return embedding_x, embedding_y
+
+ def forward(self, x_input):
+ num_heads = self.num_heads
+
+ # use empirical_attention
+ if self.q_downsample is not None:
+ x_q = self.q_downsample(x_input)
+ else:
+ x_q = x_input
+ n, _, h, w = x_q.shape
+
+ if self.kv_downsample is not None:
+ x_kv = self.kv_downsample(x_input)
+ else:
+ x_kv = x_input
+ _, _, h_kv, w_kv = x_kv.shape
+
+ if self.attention_type[0] or self.attention_type[1]:
+ proj_query = self.query_conv(x_q).view(
+ (n, num_heads, self.qk_embed_dim, h * w))
+ proj_query = proj_query.permute(0, 1, 3, 2)
+
+ if self.attention_type[0] or self.attention_type[2]:
+ proj_key = self.key_conv(x_kv).view(
+ (n, num_heads, self.qk_embed_dim, h_kv * w_kv))
+
+ if self.attention_type[1] or self.attention_type[3]:
+ position_embed_x, position_embed_y = self.get_position_embedding(
+ h, w, h_kv, w_kv, self.q_stride, self.kv_stride,
+ x_input.device, x_input.dtype, self.position_embedding_dim)
+ # (n, num_heads, w, w_kv, dim)
+ position_feat_x = self.appr_geom_fc_x(position_embed_x).\
+ view(1, w, w_kv, num_heads, self.qk_embed_dim).\
+ permute(0, 3, 1, 2, 4).\
+ repeat(n, 1, 1, 1, 1)
+
+ # (n, num_heads, h, h_kv, dim)
+ position_feat_y = self.appr_geom_fc_y(position_embed_y).\
+ view(1, h, h_kv, num_heads, self.qk_embed_dim).\
+ permute(0, 3, 1, 2, 4).\
+ repeat(n, 1, 1, 1, 1)
+
+ position_feat_x /= math.sqrt(2)
+ position_feat_y /= math.sqrt(2)
+
+ # accelerate for saliency only
+ if (np.sum(self.attention_type) == 1) and self.attention_type[2]:
+ appr_bias = self.appr_bias.\
+ view(1, num_heads, 1, self.qk_embed_dim).\
+ repeat(n, 1, 1, 1)
+
+ energy = torch.matmul(appr_bias, proj_key).\
+ view(n, num_heads, 1, h_kv * w_kv)
+
+ h = 1
+ w = 1
+ else:
+ # (n, num_heads, h*w, h_kv*w_kv), query before key, 540mb for
+ if not self.attention_type[0]:
+ energy = torch.zeros(
+ n,
+ num_heads,
+ h,
+ w,
+ h_kv,
+ w_kv,
+ dtype=x_input.dtype,
+ device=x_input.device)
+
+ # attention_type[0]: appr - appr
+ # attention_type[1]: appr - position
+ # attention_type[2]: bias - appr
+ # attention_type[3]: bias - position
+ if self.attention_type[0] or self.attention_type[2]:
+ if self.attention_type[0] and self.attention_type[2]:
+ appr_bias = self.appr_bias.\
+ view(1, num_heads, 1, self.qk_embed_dim)
+ energy = torch.matmul(proj_query + appr_bias, proj_key).\
+ view(n, num_heads, h, w, h_kv, w_kv)
+
+ elif self.attention_type[0]:
+ energy = torch.matmul(proj_query, proj_key).\
+ view(n, num_heads, h, w, h_kv, w_kv)
+
+ elif self.attention_type[2]:
+ appr_bias = self.appr_bias.\
+ view(1, num_heads, 1, self.qk_embed_dim).\
+ repeat(n, 1, 1, 1)
+
+ energy += torch.matmul(appr_bias, proj_key).\
+ view(n, num_heads, 1, 1, h_kv, w_kv)
+
+ if self.attention_type[1] or self.attention_type[3]:
+ if self.attention_type[1] and self.attention_type[3]:
+ geom_bias = self.geom_bias.\
+ view(1, num_heads, 1, self.qk_embed_dim)
+
+ proj_query_reshape = (proj_query + geom_bias).\
+ view(n, num_heads, h, w, self.qk_embed_dim)
+
+ energy_x = torch.matmul(
+ proj_query_reshape.permute(0, 1, 3, 2, 4),
+ position_feat_x.permute(0, 1, 2, 4, 3))
+ energy_x = energy_x.\
+ permute(0, 1, 3, 2, 4).unsqueeze(4)
+
+ energy_y = torch.matmul(
+ proj_query_reshape,
+ position_feat_y.permute(0, 1, 2, 4, 3))
+ energy_y = energy_y.unsqueeze(5)
+
+ energy += energy_x + energy_y
+
+ elif self.attention_type[1]:
+ proj_query_reshape = proj_query.\
+ view(n, num_heads, h, w, self.qk_embed_dim)
+ proj_query_reshape = proj_query_reshape.\
+ permute(0, 1, 3, 2, 4)
+ position_feat_x_reshape = position_feat_x.\
+ permute(0, 1, 2, 4, 3)
+ position_feat_y_reshape = position_feat_y.\
+ permute(0, 1, 2, 4, 3)
+
+ energy_x = torch.matmul(proj_query_reshape,
+ position_feat_x_reshape)
+ energy_x = energy_x.permute(0, 1, 3, 2, 4).unsqueeze(4)
+
+ energy_y = torch.matmul(proj_query_reshape,
+ position_feat_y_reshape)
+ energy_y = energy_y.unsqueeze(5)
+
+ energy += energy_x + energy_y
+
+ elif self.attention_type[3]:
+ geom_bias = self.geom_bias.\
+ view(1, num_heads, self.qk_embed_dim, 1).\
+ repeat(n, 1, 1, 1)
+
+ position_feat_x_reshape = position_feat_x.\
+ view(n, num_heads, w*w_kv, self.qk_embed_dim)
+
+ position_feat_y_reshape = position_feat_y.\
+ view(n, num_heads, h * h_kv, self.qk_embed_dim)
+
+ energy_x = torch.matmul(position_feat_x_reshape, geom_bias)
+ energy_x = energy_x.view(n, num_heads, 1, w, 1, w_kv)
+
+ energy_y = torch.matmul(position_feat_y_reshape, geom_bias)
+ energy_y = energy_y.view(n, num_heads, h, 1, h_kv, 1)
+
+ energy += energy_x + energy_y
+
+ energy = energy.view(n, num_heads, h * w, h_kv * w_kv)
+
+ if self.spatial_range >= 0:
+ cur_local_constraint_map = \
+ self.local_constraint_map[:h, :w, :h_kv, :w_kv].\
+ contiguous().\
+ view(1, 1, h*w, h_kv*w_kv)
+
+ energy = energy.masked_fill_(cur_local_constraint_map,
+ float('-inf'))
+
+ attention = F.softmax(energy, 3)
+
+ proj_value = self.value_conv(x_kv)
+ proj_value_reshape = proj_value.\
+ view((n, num_heads, self.v_dim, h_kv * w_kv)).\
+ permute(0, 1, 3, 2)
+
+ out = torch.matmul(attention, proj_value_reshape).\
+ permute(0, 1, 3, 2).\
+ contiguous().\
+ view(n, self.v_dim * self.num_heads, h, w)
+
+ out = self.proj_conv(out)
+
+ # output is downsampled, upsample back to input size
+ if self.q_downsample is not None:
+ out = F.interpolate(
+ out,
+ size=x_input.shape[2:],
+ mode='bilinear',
+ align_corners=False)
+
+ out = self.gamma * out + x_input
+ return out
+
+ def init_weights(self):
+ for m in self.modules():
+ if hasattr(m, 'kaiming_init') and m.kaiming_init:
+ kaiming_init(
+ m,
+ mode='fan_in',
+ nonlinearity='leaky_relu',
+ bias=0,
+ distribution='uniform',
+ a=1)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/hsigmoid.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/hsigmoid.py
new file mode 100644
index 0000000000000000000000000000000000000000..0956ef0ea3291bc6cb5a8cc3b6e5844f31442978
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/hsigmoid.py
@@ -0,0 +1,33 @@
+import torch.nn as nn
+
+from .registry import ACTIVATION_LAYERS
+
+
+@ACTIVATION_LAYERS.register_module()
+class HSigmoid(nn.Module):
+ """Hard Sigmoid Module. Apply the hard sigmoid function:
+ Hsigmoid(x) = min(max((x + bias) / divisor, min_value), max_value)
+ Default: Hsigmoid(x) = min(max((x + 1) / 2, 0), 1)
+
+ Args:
+ bias (float): Bias of the input feature map. Default: 1.0.
+ divisor (float): Divisor of the input feature map. Default: 2.0.
+ min_value (float): Lower bound value. Default: 0.0.
+ max_value (float): Upper bound value. Default: 1.0.
+
+ Returns:
+ Tensor: The output tensor.
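+
+ Example::
+
+ >>> # With the defaults this computes clamp((x + 1) / 2, 0, 1).
+ >>> m = HSigmoid()
+ >>> m(torch.tensor([-3., 0., 3.]))  # -> tensor([0.0, 0.5, 1.0])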
+ """
+
+ def __init__(self, bias=1.0, divisor=2.0, min_value=0.0, max_value=1.0):
+ super(HSigmoid, self).__init__()
+ self.bias = bias
+ self.divisor = divisor
+ assert self.divisor != 0
+ self.min_value = min_value
+ self.max_value = max_value
+
+ def forward(self, x):
+ x = (x + self.bias) / self.divisor
+
+ return x.clamp_(self.min_value, self.max_value)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/hswish.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/hswish.py
new file mode 100644
index 0000000000000000000000000000000000000000..f1a22adbca185e84b6fd044f88d88f64c4196d78
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/hswish.py
@@ -0,0 +1,28 @@
+import torch.nn as nn
+
+from .registry import ACTIVATION_LAYERS
+
+
+@ACTIVATION_LAYERS.register_module()
+class HSwish(nn.Module):
+ """Hard Swish Module.
+
+ This module applies the hard swish function:
+
+ .. math::
+ Hswish(x) = x * ReLU6(x + 3) / 6
+
+ Args:
+ inplace (bool): can optionally do the operation in-place.
+ Default: False.
+
+ Returns:
+ Tensor: The output tensor.
+ """
+
+ def __init__(self, inplace=False):
+ super(HSwish, self).__init__()
+ self.act = nn.ReLU6(inplace)
+
+ def forward(self, x):
+ return x * self.act(x + 3) / 6
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/non_local.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/non_local.py
new file mode 100644
index 0000000000000000000000000000000000000000..3ee0656653d8af00cd8edc2296df14a7f20d7c6f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/non_local.py
@@ -0,0 +1,305 @@
+from abc import ABCMeta
+
+import torch
+import torch.nn as nn
+
+from ..utils import constant_init, normal_init
+from .conv_module import ConvModule
+from .registry import PLUGIN_LAYERS
+
+
+class _NonLocalNd(nn.Module, metaclass=ABCMeta):
+ """Basic Non-local module.
+
+ This module is proposed in
+ "Non-local Neural Networks"
+ Paper reference: https://arxiv.org/abs/1711.07971
+ Code reference: https://github.com/AlexHex7/Non-local_pytorch
+
+ Args:
+ in_channels (int): Channels of the input feature map.
+ reduction (int): Channel reduction ratio. Default: 2.
+ use_scale (bool): Whether to scale pairwise_weight by
+ `1/sqrt(inter_channels)` when the mode is `embedded_gaussian`.
+ Default: True.
+ conv_cfg (None | dict): The config dict for convolution layers.
+ If not specified, it will use `nn.Conv2d` for convolution layers.
+ Default: None.
+ norm_cfg (None | dict): The config dict for normalization layers.
+ Default: None. (This parameter is only applicable to conv_out.)
+ mode (str): Options are `gaussian`, `concatenation`,
+ `embedded_gaussian` and `dot_product`. Default: embedded_gaussian.
+ """
+
+ def __init__(self,
+ in_channels,
+ reduction=2,
+ use_scale=True,
+ conv_cfg=None,
+ norm_cfg=None,
+ mode='embedded_gaussian',
+ **kwargs):
+ super(_NonLocalNd, self).__init__()
+ self.in_channels = in_channels
+ self.reduction = reduction
+ self.use_scale = use_scale
+ self.inter_channels = max(in_channels // reduction, 1)
+ self.mode = mode
+
+ if mode not in [
+ 'gaussian', 'embedded_gaussian', 'dot_product', 'concatenation'
+ ]:
+ raise ValueError("Mode should be in 'gaussian', 'concatenation', "
+ f"'embedded_gaussian' or 'dot_product', but got "
+ f'{mode} instead.')
+
+ # g, theta, phi are defaulted as `nn.ConvNd`.
+ # Here we use ConvModule for potential usage.
+ self.g = ConvModule(
+ self.in_channels,
+ self.inter_channels,
+ kernel_size=1,
+ conv_cfg=conv_cfg,
+ act_cfg=None)
+ self.conv_out = ConvModule(
+ self.inter_channels,
+ self.in_channels,
+ kernel_size=1,
+ conv_cfg=conv_cfg,
+ norm_cfg=norm_cfg,
+ act_cfg=None)
+
+ if self.mode != 'gaussian':
+ self.theta = ConvModule(
+ self.in_channels,
+ self.inter_channels,
+ kernel_size=1,
+ conv_cfg=conv_cfg,
+ act_cfg=None)
+ self.phi = ConvModule(
+ self.in_channels,
+ self.inter_channels,
+ kernel_size=1,
+ conv_cfg=conv_cfg,
+ act_cfg=None)
+
+ if self.mode == 'concatenation':
+ self.concat_project = ConvModule(
+ self.inter_channels * 2,
+ 1,
+ kernel_size=1,
+ stride=1,
+ padding=0,
+ bias=False,
+ act_cfg=dict(type='ReLU'))
+
+ self.init_weights(**kwargs)
+
+ def init_weights(self, std=0.01, zeros_init=True):
+ if self.mode != 'gaussian':
+ for m in [self.g, self.theta, self.phi]:
+ normal_init(m.conv, std=std)
+ else:
+ normal_init(self.g.conv, std=std)
+ if zeros_init:
+ if self.conv_out.norm_cfg is None:
+ constant_init(self.conv_out.conv, 0)
+ else:
+ constant_init(self.conv_out.norm, 0)
+ else:
+ if self.conv_out.norm_cfg is None:
+ normal_init(self.conv_out.conv, std=std)
+ else:
+ normal_init(self.conv_out.norm, std=std)
+
+ def gaussian(self, theta_x, phi_x):
+ # NonLocal1d pairwise_weight: [N, H, H]
+ # NonLocal2d pairwise_weight: [N, HxW, HxW]
+ # NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
+ pairwise_weight = torch.matmul(theta_x, phi_x)
+ pairwise_weight = pairwise_weight.softmax(dim=-1)
+ return pairwise_weight
+
+ def embedded_gaussian(self, theta_x, phi_x):
+ # NonLocal1d pairwise_weight: [N, H, H]
+ # NonLocal2d pairwise_weight: [N, HxW, HxW]
+ # NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
+ pairwise_weight = torch.matmul(theta_x, phi_x)
+ if self.use_scale:
+ # theta_x.shape[-1] is `self.inter_channels`
+ pairwise_weight /= theta_x.shape[-1]**0.5
+ pairwise_weight = pairwise_weight.softmax(dim=-1)
+ return pairwise_weight
+
+ def dot_product(self, theta_x, phi_x):
+ # NonLocal1d pairwise_weight: [N, H, H]
+ # NonLocal2d pairwise_weight: [N, HxW, HxW]
+ # NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
+ pairwise_weight = torch.matmul(theta_x, phi_x)
+ pairwise_weight /= pairwise_weight.shape[-1]
+ return pairwise_weight
+
+ def concatenation(self, theta_x, phi_x):
+ # NonLocal1d pairwise_weight: [N, H, H]
+ # NonLocal2d pairwise_weight: [N, HxW, HxW]
+ # NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
+ h = theta_x.size(2)
+ w = phi_x.size(3)
+ theta_x = theta_x.repeat(1, 1, 1, w)
+ phi_x = phi_x.repeat(1, 1, h, 1)
+
+ concat_feature = torch.cat([theta_x, phi_x], dim=1)
+ pairwise_weight = self.concat_project(concat_feature)
+ n, _, h, w = pairwise_weight.size()
+ pairwise_weight = pairwise_weight.view(n, h, w)
+ pairwise_weight /= pairwise_weight.shape[-1]
+
+ return pairwise_weight
+
+ def forward(self, x):
+ # Assume `reduction = 1`, then `inter_channels = C`.
+ # When `mode="gaussian"`, theta_x and phi_x are taken directly from x,
+ # so they keep the original `in_channels`.
+
+ # NonLocal1d x: [N, C, H]
+ # NonLocal2d x: [N, C, H, W]
+ # NonLocal3d x: [N, C, T, H, W]
+ n = x.size(0)
+
+ # NonLocal1d g_x: [N, H, C]
+ # NonLocal2d g_x: [N, HxW, C]
+ # NonLocal3d g_x: [N, TxHxW, C]
+ g_x = self.g(x).view(n, self.inter_channels, -1)
+ g_x = g_x.permute(0, 2, 1)
+
+ # NonLocal1d theta_x: [N, H, C], phi_x: [N, C, H]
+ # NonLocal2d theta_x: [N, HxW, C], phi_x: [N, C, HxW]
+ # NonLocal3d theta_x: [N, TxHxW, C], phi_x: [N, C, TxHxW]
+ if self.mode == 'gaussian':
+ theta_x = x.view(n, self.in_channels, -1)
+ theta_x = theta_x.permute(0, 2, 1)
+ if self.sub_sample:
+ phi_x = self.phi(x).view(n, self.in_channels, -1)
+ else:
+ phi_x = x.view(n, self.in_channels, -1)
+ elif self.mode == 'concatenation':
+ theta_x = self.theta(x).view(n, self.inter_channels, -1, 1)
+ phi_x = self.phi(x).view(n, self.inter_channels, 1, -1)
+ else:
+ theta_x = self.theta(x).view(n, self.inter_channels, -1)
+ theta_x = theta_x.permute(0, 2, 1)
+ phi_x = self.phi(x).view(n, self.inter_channels, -1)
+
+ pairwise_func = getattr(self, self.mode)
+ # NonLocal1d pairwise_weight: [N, H, H]
+ # NonLocal2d pairwise_weight: [N, HxW, HxW]
+ # NonLocal3d pairwise_weight: [N, TxHxW, TxHxW]
+ pairwise_weight = pairwise_func(theta_x, phi_x)
+
+ # NonLocal1d y: [N, H, C]
+ # NonLocal2d y: [N, HxW, C]
+ # NonLocal3d y: [N, TxHxW, C]
+ y = torch.matmul(pairwise_weight, g_x)
+ # NonLocal1d y: [N, C, H]
+ # NonLocal2d y: [N, C, H, W]
+ # NonLocal3d y: [N, C, T, H, W]
+ y = y.permute(0, 2, 1).contiguous().reshape(n, self.inter_channels,
+ *x.size()[2:])
+
+ output = x + self.conv_out(y)
+
+ return output
+
+
+class NonLocal1d(_NonLocalNd):
+ """1D Non-local module.
+
+ Args:
+ in_channels (int): Same as `NonLocalND`.
+ sub_sample (bool): Whether to apply max pooling after pairwise
+ function (Note that the `sub_sample` is applied on spatial only).
+ Default: False.
+ conv_cfg (None | dict): Same as `NonLocalND`.
+ Default: dict(type='Conv1d').
+ """
+
+ def __init__(self,
+ in_channels,
+ sub_sample=False,
+ conv_cfg=dict(type='Conv1d'),
+ **kwargs):
+ super(NonLocal1d, self).__init__(
+ in_channels, conv_cfg=conv_cfg, **kwargs)
+
+ self.sub_sample = sub_sample
+
+ if sub_sample:
+ max_pool_layer = nn.MaxPool1d(kernel_size=2)
+ self.g = nn.Sequential(self.g, max_pool_layer)
+ if self.mode != 'gaussian':
+ self.phi = nn.Sequential(self.phi, max_pool_layer)
+ else:
+ self.phi = max_pool_layer
+
+
+@PLUGIN_LAYERS.register_module()
+class NonLocal2d(_NonLocalNd):
+ """2D Non-local module.
+
+ Args:
+ in_channels (int): Same as `NonLocalND`.
+ sub_sample (bool): Whether to apply max pooling after pairwise
+ function (Note that the `sub_sample` is applied on spatial only).
+ Default: False.
+ conv_cfg (None | dict): Same as `NonLocalND`.
+ Default: dict(type='Conv2d').
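+
+ Example::
+
+ >>> # Illustrative sketch: embedded-gaussian non-local block on a 2D feature map.
+ >>> block = NonLocal2d(in_channels=32)
+ >>> x = torch.rand(2, 32, 20, 20)
+ >>> out = block(x)  # residual output, same shape as x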
+ """
+
+ _abbr_ = 'nonlocal_block'
+
+ def __init__(self,
+ in_channels,
+ sub_sample=False,
+ conv_cfg=dict(type='Conv2d'),
+ **kwargs):
+ super(NonLocal2d, self).__init__(
+ in_channels, conv_cfg=conv_cfg, **kwargs)
+
+ self.sub_sample = sub_sample
+
+ if sub_sample:
+ max_pool_layer = nn.MaxPool2d(kernel_size=(2, 2))
+ self.g = nn.Sequential(self.g, max_pool_layer)
+ if self.mode != 'gaussian':
+ self.phi = nn.Sequential(self.phi, max_pool_layer)
+ else:
+ self.phi = max_pool_layer
+
+
+class NonLocal3d(_NonLocalNd):
+ """3D Non-local module.
+
+ Args:
+ in_channels (int): Same as `NonLocalND`.
+ sub_sample (bool): Whether to apply max pooling after pairwise
+ function (Note that the `sub_sample` is applied on spatial only).
+ Default: False.
+ conv_cfg (None | dict): Same as `NonLocalND`.
+ Default: dict(type='Conv3d').
+ """
+
+ def __init__(self,
+ in_channels,
+ sub_sample=False,
+ conv_cfg=dict(type='Conv3d'),
+ **kwargs):
+ super(NonLocal3d, self).__init__(
+ in_channels, conv_cfg=conv_cfg, **kwargs)
+ self.sub_sample = sub_sample
+
+ if sub_sample:
+ max_pool_layer = nn.MaxPool3d(kernel_size=(1, 2, 2))
+ self.g = nn.Sequential(self.g, max_pool_layer)
+ if self.mode != 'gaussian':
+ self.phi = nn.Sequential(self.phi, max_pool_layer)
+ else:
+ self.phi = max_pool_layer
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/norm.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/norm.py
new file mode 100644
index 0000000000000000000000000000000000000000..00352258537fa923d75a5fa8a3511f36c1680a78
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/norm.py
@@ -0,0 +1,143 @@
+import inspect
+
+import torch.nn as nn
+
+from mmcv.utils import is_tuple_of
+from mmcv.utils.parrots_wrapper import SyncBatchNorm, _BatchNorm, _InstanceNorm
+from .registry import NORM_LAYERS
+
+NORM_LAYERS.register_module('BN', module=nn.BatchNorm2d)
+NORM_LAYERS.register_module('BN1d', module=nn.BatchNorm1d)
+NORM_LAYERS.register_module('BN2d', module=nn.BatchNorm2d)
+NORM_LAYERS.register_module('BN3d', module=nn.BatchNorm3d)
+NORM_LAYERS.register_module('SyncBN', module=SyncBatchNorm)
+NORM_LAYERS.register_module('GN', module=nn.GroupNorm)
+NORM_LAYERS.register_module('LN', module=nn.LayerNorm)
+NORM_LAYERS.register_module('IN', module=nn.InstanceNorm2d)
+NORM_LAYERS.register_module('IN1d', module=nn.InstanceNorm1d)
+NORM_LAYERS.register_module('IN2d', module=nn.InstanceNorm2d)
+NORM_LAYERS.register_module('IN3d', module=nn.InstanceNorm3d)
+
+
+def infer_abbr(class_type):
+ """Infer abbreviation from the class name.
+
+ When we build a norm layer with `build_norm_layer()`, we want to preserve
+ the norm type in variable names, e.g., self.bn1, self.gn. This method will
+ infer the abbreviation to map class types to abbreviations.
+
+ Rule 1: If the class has the property "_abbr_", return the property.
+ Rule 2: If the parent class is _BatchNorm, GroupNorm, LayerNorm or
+ InstanceNorm, the abbreviation of this layer will be "bn", "gn", "ln" and
+ "in" respectively.
+ Rule 3: If the class name contains "batch", "group", "layer" or "instance",
+ the abbreviation of this layer will be "bn", "gn", "ln" and "in"
+ respectively.
+ Rule 4: Otherwise, the abbreviation falls back to "norm".
+
+ Args:
+ class_type (type): The norm layer type.
+
+ Returns:
+ str: The inferred abbreviation.
+ """
+ if not inspect.isclass(class_type):
+ raise TypeError(
+ f'class_type must be a type, but got {type(class_type)}')
+ if hasattr(class_type, '_abbr_'):
+ return class_type._abbr_
+ if issubclass(class_type, _InstanceNorm): # IN is a subclass of BN
+ return 'in'
+ elif issubclass(class_type, _BatchNorm):
+ return 'bn'
+ elif issubclass(class_type, nn.GroupNorm):
+ return 'gn'
+ elif issubclass(class_type, nn.LayerNorm):
+ return 'ln'
+ else:
+ class_name = class_type.__name__.lower()
+ if 'batch' in class_name:
+ return 'bn'
+ elif 'group' in class_name:
+ return 'gn'
+ elif 'layer' in class_name:
+ return 'ln'
+ elif 'instance' in class_name:
+ return 'in'
+ else:
+ return 'norm_layer'
+
+
+def build_norm_layer(cfg, num_features, postfix=''):
+ """Build normalization layer.
+
+ Args:
+ cfg (dict): The norm layer config, which should contain:
+
+ - type (str): Layer type.
+ - layer args: Args needed to instantiate a norm layer.
+ - requires_grad (bool, optional): Whether the layer parameters require gradient updates. Default: True.
+ num_features (int): Number of input channels.
+ postfix (int | str): The postfix to be appended into norm abbreviation
+ to create named layer.
+
+ Returns:
+ (str, nn.Module): The first element is the layer name consisting of
+ abbreviation and postfix, e.g., bn1, gn. The second element is the
+ created norm layer.
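+
+ Example::
+
+ >>> # Illustrative usage: returns ('bn1', nn.BatchNorm2d(64)).
+ >>> name, layer = build_norm_layer(dict(type='BN', requires_grad=True), 64, postfix=1)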
+ """
+ if not isinstance(cfg, dict):
+ raise TypeError('cfg must be a dict')
+ if 'type' not in cfg:
+ raise KeyError('the cfg dict must contain the key "type"')
+ cfg_ = cfg.copy()
+
+ layer_type = cfg_.pop('type')
+ if layer_type not in NORM_LAYERS:
+ raise KeyError(f'Unrecognized norm type {layer_type}')
+
+ norm_layer = NORM_LAYERS.get(layer_type)
+ abbr = infer_abbr(norm_layer)
+
+ assert isinstance(postfix, (int, str))
+ name = abbr + str(postfix)
+
+ requires_grad = cfg_.pop('requires_grad', True)
+ cfg_.setdefault('eps', 1e-5)
+ if layer_type != 'GN':
+ layer = norm_layer(num_features, **cfg_)
+ if layer_type == 'SyncBN':
+ layer._specify_ddp_gpu_num(1)
+ else:
+ assert 'num_groups' in cfg_
+ layer = norm_layer(num_channels=num_features, **cfg_)
+
+ for param in layer.parameters():
+ param.requires_grad = requires_grad
+
+ return name, layer
+
+
+def is_norm(layer, exclude=None):
+ """Check if a layer is a normalization layer.
+
+ Args:
+ layer (nn.Module): The layer to be checked.
+ exclude (type | tuple[type]): Types to be excluded.
+
+ Returns:
+ bool: Whether the layer is a norm layer.
+ """
+ if exclude is not None:
+ if not isinstance(exclude, tuple):
+ exclude = (exclude, )
+ if not is_tuple_of(exclude, type):
+ raise TypeError(
+ f'"exclude" must be either None or type or a tuple of types, '
+ f'but got {type(exclude)}: {exclude}')
+
+ if exclude and isinstance(layer, exclude):
+ return False
+
+ all_norm_bases = (_BatchNorm, _InstanceNorm, nn.GroupNorm, nn.LayerNorm)
+ return isinstance(layer, all_norm_bases)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/padding.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/padding.py
new file mode 100644
index 0000000000000000000000000000000000000000..b7e82129c1f1a2bf57c86d50d46fe2f65a6d8f75
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/padding.py
@@ -0,0 +1,35 @@
+import torch.nn as nn
+
+from .registry import PADDING_LAYERS
+
+PADDING_LAYERS.register_module('zero', module=nn.ZeroPad2d)
+PADDING_LAYERS.register_module('reflect', module=nn.ReflectionPad2d)
+PADDING_LAYERS.register_module('replicate', module=nn.ReplicationPad2d)
+
+
+def build_padding_layer(cfg, *args, **kwargs):
+ """Build padding layer.
+
+ Args:
+ cfg (None or dict): The padding layer config, which should contain:
+ - type (str): Layer type.
+ - layer args: Args needed to instantiate a padding layer.
+
+ Returns:
+ nn.Module: Created padding layer.
+ """
+ if not isinstance(cfg, dict):
+ raise TypeError('cfg must be a dict')
+ if 'type' not in cfg:
+ raise KeyError('the cfg dict must contain the key "type"')
+
+ cfg_ = cfg.copy()
+ padding_type = cfg_.pop('type')
+ if padding_type not in PADDING_LAYERS:
+ raise KeyError(f'Unrecognized padding type {padding_type}.')
+ else:
+ padding_layer = PADDING_LAYERS.get(padding_type)
+
+ layer = padding_layer(*args, **kwargs, **cfg_)
+
+ return layer
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/plugin.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/plugin.py
new file mode 100644
index 0000000000000000000000000000000000000000..07c010d4053174dd41107aa654ea67e82b46a25c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/plugin.py
@@ -0,0 +1,88 @@
+import inspect
+import platform
+
+from .registry import PLUGIN_LAYERS
+
+if platform.system() == 'Windows':
+ import regex as re
+else:
+ import re
+
+
+def infer_abbr(class_type):
+ """Infer abbreviation from the class name.
+
+ This method will infer the abbreviation to map class types to
+ abbreviations.
+
+ Rule 1: If the class has the property "abbr", return the property.
+ Rule 2: Otherwise, the abbreviation falls back to snake case of class
+ name, e.g. the abbreviation of ``FancyBlock`` will be ``fancy_block``.
+
+ Args:
+ class_type (type): The norm layer type.
+
+ Returns:
+ str: The inferred abbreviation.
+ """
+
+ def camel2snack(word):
+ """Convert camel case word into snack case.
+
+ Modified from `inflection lib
+ `_.
+
+ Example::
+
+ >>> camel2snack("FancyBlock")
+ 'fancy_block'
+ """
+
+ word = re.sub(r'([A-Z]+)([A-Z][a-z])', r'\1_\2', word)
+ word = re.sub(r'([a-z\d])([A-Z])', r'\1_\2', word)
+ word = word.replace('-', '_')
+ return word.lower()
+
+ if not inspect.isclass(class_type):
+ raise TypeError(
+ f'class_type must be a type, but got {type(class_type)}')
+ if hasattr(class_type, '_abbr_'):
+ return class_type._abbr_
+ else:
+ return camel2snack(class_type.__name__)
+
+
+def build_plugin_layer(cfg, postfix='', **kwargs):
+ """Build plugin layer.
+
+ Args:
+ cfg (None or dict): cfg should contain:
+ type (str): identify plugin layer type.
+ layer args: args needed to instantiate a plugin layer.
+ postfix (int, str): appended into norm abbreviation to
+ create named layer. Default: ''.
+
+ Returns:
+ tuple[str, nn.Module]:
+ name (str): abbreviation + postfix
+ layer (nn.Module): created plugin layer
+ """
+ if not isinstance(cfg, dict):
+ raise TypeError('cfg must be a dict')
+ if 'type' not in cfg:
+ raise KeyError('the cfg dict must contain the key "type"')
+ cfg_ = cfg.copy()
+
+ layer_type = cfg_.pop('type')
+ if layer_type not in PLUGIN_LAYERS:
+ raise KeyError(f'Unrecognized plugin type {layer_type}')
+
+ plugin_layer = PLUGIN_LAYERS.get(layer_type)
+ abbr = infer_abbr(plugin_layer)
+
+ assert isinstance(postfix, (int, str))
+ name = abbr + str(postfix)
+
+ layer = plugin_layer(**kwargs, **cfg_)
+
+ return name, layer
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/registry.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/registry.py
new file mode 100644
index 0000000000000000000000000000000000000000..31c1ccc196ae75a42353d97693258c209f9d3a98
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/registry.py
@@ -0,0 +1,15 @@
+from mmcv.utils import Registry
+
+CONV_LAYERS = Registry('conv layer')
+NORM_LAYERS = Registry('norm layer')
+ACTIVATION_LAYERS = Registry('activation layer')
+PADDING_LAYERS = Registry('padding layer')
+UPSAMPLE_LAYERS = Registry('upsample layer')
+PLUGIN_LAYERS = Registry('plugin layer')
+
+DROPOUT_LAYERS = Registry('drop out layers')
+POSITIONAL_ENCODING = Registry('position encoding')
+ATTENTION = Registry('attention')
+FEEDFORWARD_NETWORK = Registry('feed-forward Network')
+TRANSFORMER_LAYER = Registry('transformerLayer')
+TRANSFORMER_LAYER_SEQUENCE = Registry('transformer-layers sequence')
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/scale.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/scale.py
new file mode 100644
index 0000000000000000000000000000000000000000..be7109b82403b1f15de017c738a76e13d0ecaae7
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/scale.py
@@ -0,0 +1,20 @@
+import torch
+import torch.nn as nn
+
+
+class Scale(nn.Module):
+ """A learnable scale parameter.
+
+ This layer scales the input by a learnable factor. It multiplies a
+ learnable scale parameter of shape (1,) with input of any shape.
+
+ Args:
+ scale (float): Initial value of scale factor. Default: 1.0
+ """
+
+ def __init__(self, scale=1.0):
+ super(Scale, self).__init__()
+ self.scale = nn.Parameter(torch.tensor(scale, dtype=torch.float))
+
+ def forward(self, x):
+ return x * self.scale
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/swish.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/swish.py
new file mode 100644
index 0000000000000000000000000000000000000000..f396dc59b7bfa29b5b414bff45bb8393dffd839e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/swish.py
@@ -0,0 +1,24 @@
+import torch
+import torch.nn as nn
+
+from .registry import ACTIVATION_LAYERS
+
+
+@ACTIVATION_LAYERS.register_module()
+class Swish(nn.Module):
+ """Swish Module.
+
+ This module applies the swish function:
+
+ .. math::
+ Swish(x) = x * Sigmoid(x)
+
+ Returns:
+ Tensor: The output tensor.
+ """
+
+ def __init__(self):
+ super(Swish, self).__init__()
+
+ def forward(self, x):
+ return x * torch.sigmoid(x)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/transformer.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/transformer.py
new file mode 100644
index 0000000000000000000000000000000000000000..06715cde601c8da243bf15d4d023b1a371880726
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/transformer.py
@@ -0,0 +1,601 @@
+import copy
+import warnings
+
+import torch
+import torch.nn as nn
+
+from mmcv import ConfigDict, deprecated_api_warning
+from mmcv.cnn import Linear, build_activation_layer, build_norm_layer
+from mmcv.runner.base_module import BaseModule, ModuleList, Sequential
+from mmcv.utils import build_from_cfg
+from .drop import build_dropout
+from .registry import (ATTENTION, FEEDFORWARD_NETWORK, POSITIONAL_ENCODING,
+ TRANSFORMER_LAYER, TRANSFORMER_LAYER_SEQUENCE)
+
+# Avoid BC-breaking of importing MultiScaleDeformableAttention from this file
+try:
+ from mmcv.ops.multi_scale_deform_attn import MultiScaleDeformableAttention # noqa F401
+ warnings.warn(
+ ImportWarning(
+ '``MultiScaleDeformableAttention`` has been moved to '
+ '``mmcv.ops.multi_scale_deform_attn``, please change original path ' # noqa E501
+ '``from mmcv.cnn.bricks.transformer import MultiScaleDeformableAttention`` ' # noqa E501
+ 'to ``from mmcv.ops.multi_scale_deform_attn import MultiScaleDeformableAttention`` ' # noqa E501
+ ))
+
+except ImportError:
+ warnings.warn('Fail to import ``MultiScaleDeformableAttention`` from '
+ '``mmcv.ops.multi_scale_deform_attn``, '
+ 'You should install ``mmcv-full`` if you need this module. ')
+
+
+def build_positional_encoding(cfg, default_args=None):
+ """Builder for Position Encoding."""
+ return build_from_cfg(cfg, POSITIONAL_ENCODING, default_args)
+
+
+def build_attention(cfg, default_args=None):
+ """Builder for attention."""
+ return build_from_cfg(cfg, ATTENTION, default_args)
+
+
+def build_feedforward_network(cfg, default_args=None):
+ """Builder for feed-forward network (FFN)."""
+ return build_from_cfg(cfg, FEEDFORWARD_NETWORK, default_args)
+
+
+def build_transformer_layer(cfg, default_args=None):
+ """Builder for transformer layer."""
+ return build_from_cfg(cfg, TRANSFORMER_LAYER, default_args)
+
+
+def build_transformer_layer_sequence(cfg, default_args=None):
+ """Builder for transformer encoder and transformer decoder."""
+ return build_from_cfg(cfg, TRANSFORMER_LAYER_SEQUENCE, default_args)
+
+
+@ATTENTION.register_module()
+class MultiheadAttention(BaseModule):
+ """A wrapper for ``torch.nn.MultiheadAttention``.
+
+ This module implements MultiheadAttention with identity connection,
+ and positional encoding is also passed as input.
+
+ Args:
+ embed_dims (int): The embedding dimension.
+ num_heads (int): Parallel attention heads.
+ attn_drop (float): A Dropout layer on attn_output_weights.
+ Default: 0.0.
+ proj_drop (float): A Dropout layer after `nn.MultiheadAttention`.
+ Default: 0.0.
+ dropout_layer (obj:`ConfigDict`): The dropout_layer used
+ when adding the shortcut.
+ init_cfg (obj:`mmcv.ConfigDict`): The Config for initialization.
+ Default: None.
+ batch_first (bool): When it is True, Key, Query and Value have shape
+ (batch, n, embed_dim); otherwise (n, batch, embed_dim).
+ Defaults to False.
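+
+ Example::
+
+ >>> # Illustrative sketch: self-attention (key/value default to query).
+ >>> self_attn = MultiheadAttention(embed_dims=256, num_heads=8)
+ >>> query = torch.rand(10, 2, 256)  # (num_queries, bs, embed_dims)
+ >>> out = self_attn(query)  # identity shortcut added; shape (10, 2, 256)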
+ """
+
+ def __init__(self,
+ embed_dims,
+ num_heads,
+ attn_drop=0.,
+ proj_drop=0.,
+ dropout_layer=dict(type='Dropout', drop_prob=0.),
+ init_cfg=None,
+ batch_first=False,
+ **kwargs):
+ super(MultiheadAttention, self).__init__(init_cfg)
+ if 'dropout' in kwargs:
+ warnings.warn('The argument `dropout` in MultiheadAttention '
+ 'has been deprecated; now you can separately '
+ 'set `attn_drop` (float), `proj_drop` (float), '
+ 'and `dropout_layer` (dict).')
+ attn_drop = kwargs['dropout']
+ dropout_layer['drop_prob'] = kwargs.pop('dropout')
+
+ self.embed_dims = embed_dims
+ self.num_heads = num_heads
+ self.batch_first = batch_first
+
+ self.attn = nn.MultiheadAttention(embed_dims, num_heads, attn_drop,
+ **kwargs)
+ if self.batch_first:
+
+ def _bnc_to_nbc(forward):
+ """Because the dataflow('key', 'query', 'value') of
+ ``torch.nn.MultiheadAttention`` is (num_query, batch,
+ embed_dims), We should adjust the shape of dataflow from
+ batch_first (batch, num_query, embed_dims) to num_query_first
+ (num_query ,batch, embed_dims), and recover ``attn_output``
+ from num_query_first to batch_first."""
+
+ def forward_wrapper(**kwargs):
+ convert_keys = ('key', 'query', 'value')
+ for key in kwargs.keys():
+ if key in convert_keys:
+ kwargs[key] = kwargs[key].transpose(0, 1)
+ attn_output, attn_output_weights = forward(**kwargs)
+ return attn_output.transpose(0, 1), attn_output_weights
+
+ return forward_wrapper
+
+ self.attn.forward = _bnc_to_nbc(self.attn.forward)
+
+ self.proj_drop = nn.Dropout(proj_drop)
+ self.dropout_layer = build_dropout(
+ dropout_layer) if dropout_layer else nn.Identity()
+
+ @deprecated_api_warning({'residual': 'identity'},
+ cls_name='MultiheadAttention')
+ def forward(self,
+ query,
+ key=None,
+ value=None,
+ identity=None,
+ query_pos=None,
+ key_pos=None,
+ attn_mask=None,
+ key_padding_mask=None,
+ **kwargs):
+ """Forward function for `MultiheadAttention`.
+
+ **kwargs allow passing a more general data flow when combining
+ with other operations in `transformerlayer`.
+
+ Args:
+ query (Tensor): The input query with shape [num_queries, bs,
+ embed_dims] if self.batch_first is False, else
+ [bs, num_queries, embed_dims].
+ key (Tensor): The key tensor with shape [num_keys, bs,
+ embed_dims] if self.batch_first is False, else
+ [bs, num_keys, embed_dims] .
+ If None, the ``query`` will be used. Defaults to None.
+ value (Tensor): The value tensor with same shape as `key`.
+ Same in `nn.MultiheadAttention.forward`. Defaults to None.
+ If None, the `key` will be used.
+ identity (Tensor): This tensor, with the same shape as x,
+ will be used for the identity link.
+ If None, `x` will be used. Defaults to None.
+ query_pos (Tensor): The positional encoding for query, with
+ the same shape as `x`. If not None, it will
+ be added to `x` before forward function. Defaults to None.
+ key_pos (Tensor): The positional encoding for `key`, with the
+ same shape as `key`. Defaults to None. If not None, it will
+ be added to `key` before forward function. If None, and
+ `query_pos` has the same shape as `key`, then `query_pos`
+ will be used for `key_pos`. Defaults to None.
+ attn_mask (Tensor): ByteTensor mask with shape [num_queries,
+ num_keys]. Same in `nn.MultiheadAttention.forward`.
+ Defaults to None.
+ key_padding_mask (Tensor): ByteTensor with shape [bs, num_keys].
+ Defaults to None.
+
+ Returns:
+ Tensor: forwarded results with shape
+ [num_queries, bs, embed_dims]
+ if self.batch_first is False, else
+ [bs, num_queries, embed_dims].
+ """
+
+ if key is None:
+ key = query
+ if value is None:
+ value = key
+ if identity is None:
+ identity = query
+ if key_pos is None:
+ if query_pos is not None:
+ # use query_pos if key_pos is not available
+ if query_pos.shape == key.shape:
+ key_pos = query_pos
+ else:
+ warnings.warn(f'position encoding of key is '
+ f'missing in {self.__class__.__name__}.')
+ if query_pos is not None:
+ query = query + query_pos
+ if key_pos is not None:
+ key = key + key_pos
+
+ out = self.attn(
+ query=query,
+ key=key,
+ value=value,
+ attn_mask=attn_mask,
+ key_padding_mask=key_padding_mask)[0]
+
+ return identity + self.dropout_layer(self.proj_drop(out))
+
+
+@FEEDFORWARD_NETWORK.register_module()
+class FFN(BaseModule):
+ """Implements feed-forward networks (FFNs) with identity connection.
+
+ Args:
+ embed_dims (int): The feature dimension. Same as
+ `MultiheadAttention`. Defaults: 256.
+ feedforward_channels (int): The hidden dimension of FFNs.
+ Defaults: 1024.
+ num_fcs (int, optional): The number of fully-connected layers in
+ FFNs. Default: 2.
+ act_cfg (dict, optional): The activation config for FFNs.
+ Default: dict(type='ReLU')
+ ffn_drop (float, optional): Probability of an element to be
+ zeroed in FFN. Default 0.0.
+ add_identity (bool, optional): Whether to add the
+ identity connection. Default: `True`.
+ dropout_layer (obj:`ConfigDict`): The dropout_layer used
+ when adding the shortcut.
+ init_cfg (obj:`mmcv.ConfigDict`): The Config for initialization.
+ Default: None.
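+
+ Example::
+
+ >>> # Illustrative sketch: two-layer FFN with an identity connection.
+ >>> ffn = FFN(embed_dims=256, feedforward_channels=1024)
+ >>> x = torch.rand(2, 100, 256)
+ >>> out = ffn(x)  # x is added back as the identity; shape (2, 100, 256)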
+ """
+
+ @deprecated_api_warning(
+ {
+ 'dropout': 'ffn_drop',
+ 'add_residual': 'add_identity'
+ },
+ cls_name='FFN')
+ def __init__(self,
+ embed_dims=256,
+ feedforward_channels=1024,
+ num_fcs=2,
+ act_cfg=dict(type='ReLU', inplace=True),
+ ffn_drop=0.,
+ dropout_layer=None,
+ add_identity=True,
+ init_cfg=None,
+ **kwargs):
+ super(FFN, self).__init__(init_cfg)
+ assert num_fcs >= 2, 'num_fcs should be no less ' \
+ f'than 2. got {num_fcs}.'
+ self.embed_dims = embed_dims
+ self.feedforward_channels = feedforward_channels
+ self.num_fcs = num_fcs
+ self.act_cfg = act_cfg
+ self.activate = build_activation_layer(act_cfg)
+
+ layers = []
+ in_channels = embed_dims
+ for _ in range(num_fcs - 1):
+ layers.append(
+ Sequential(
+ Linear(in_channels, feedforward_channels), self.activate,
+ nn.Dropout(ffn_drop)))
+ in_channels = feedforward_channels
+ layers.append(Linear(feedforward_channels, embed_dims))
+ layers.append(nn.Dropout(ffn_drop))
+ self.layers = Sequential(*layers)
+ self.dropout_layer = build_dropout(
+ dropout_layer) if dropout_layer else torch.nn.Identity()
+ self.add_identity = add_identity
+
+ @deprecated_api_warning({'residual': 'identity'}, cls_name='FFN')
+ def forward(self, x, identity=None):
+ """Forward function for `FFN`.
+
+ The function adds `x` to the output tensor if `identity` is None.
+ """
+ out = self.layers(x)
+ if not self.add_identity:
+ return self.dropout_layer(out)
+ if identity is None:
+ identity = x
+ return identity + self.dropout_layer(out)
+
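+# A minimal usage sketch for FFN; the shapes are illustrative and any input
+# whose last dimension equals `embed_dims` works:
+#
+#   ffn = FFN(embed_dims=256, feedforward_channels=1024)
+#   x = torch.rand(2, 100, 256)
+#   out = ffn(x)                          # identity defaults to x
+#   assert out.shape == x.shape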
+
+@TRANSFORMER_LAYER.register_module()
+class BaseTransformerLayer(BaseModule):
+ """Base `TransformerLayer` for vision transformer.
+
+ It can be built from `mmcv.ConfigDict` and supports more flexible
+ customization, for example, using any number of `FFN` or `LN` layers,
+ and using different kinds of `attention` by specifying a list of
+ `ConfigDict` named `attn_cfgs`. It is worth mentioning that it supports
+ `prenorm` when you specify `norm` as the first element of
+ `operation_order`. More details about `prenorm`: `On Layer Normalization
+ in the Transformer Architecture <https://arxiv.org/abs/2002.04745>`_ .
+
+ Args:
+ attn_cfgs (list[`mmcv.ConfigDict`] | obj:`mmcv.ConfigDict` | None):
+ Configs for `self_attention` or `cross_attention` modules.
+ The order of the configs in the list should be consistent with
+ corresponding attentions in operation_order.
+ If it is a dict, all of the attention modules in operation_order
+ will be built with this config. Default: None.
+ ffn_cfgs (list[`mmcv.ConfigDict`] | obj:`mmcv.ConfigDict` | None):
+ Configs for FFN. The order of the configs in the list should be
+ consistent with corresponding ffn in operation_order.
+ If it is a dict, all of the FFN modules in operation_order
+ will be built with this config.
+ operation_order (tuple[str]): The execution order of operations
+ in the transformer, such as ('self_attn', 'norm', 'ffn', 'norm').
+ Supports `prenorm` when you specify the first element as `norm`.
+ Default: None.
+ norm_cfg (dict): Config dict for normalization layer.
+ Default: dict(type='LN').
+ init_cfg (obj:`mmcv.ConfigDict`): The Config for initialization.
+ Default: None.
+ batch_first (bool): Whether Key, Query and Value are of shape
+ (batch, n, embed_dim) (if True) or (n, batch, embed_dim)
+ (if False). Default: False.
+ """
+
+ def __init__(self,
+ attn_cfgs=None,
+ ffn_cfgs=dict(
+ type='FFN',
+ embed_dims=256,
+ feedforward_channels=1024,
+ num_fcs=2,
+ ffn_drop=0.,
+ act_cfg=dict(type='ReLU', inplace=True),
+ ),
+ operation_order=None,
+ norm_cfg=dict(type='LN'),
+ init_cfg=None,
+ batch_first=False,
+ **kwargs):
+
+ deprecated_args = dict(
+ feedforward_channels='feedforward_channels',
+ ffn_dropout='ffn_drop',
+ ffn_num_fcs='num_fcs')
+ for ori_name, new_name in deprecated_args.items():
+ if ori_name in kwargs:
+ warnings.warn(
+ f'The argument `{ori_name}` in BaseTransformerLayer '
+ f'has been deprecated, now you should set `{new_name}` '
+ f'and other FFN related arguments '
+ f'to a dict named `ffn_cfgs`. ')
+ ffn_cfgs[new_name] = kwargs[ori_name]
+
+ super(BaseTransformerLayer, self).__init__(init_cfg)
+
+ self.batch_first = batch_first
+
+ assert set(operation_order) & set(
+ ['self_attn', 'norm', 'ffn', 'cross_attn']) == \
+ set(operation_order), f'The operation_order of' \
+ f' {self.__class__.__name__} should ' \
+ f'only contain operation types from ' \
+ f"{['self_attn', 'norm', 'ffn', 'cross_attn']}"
+
+ num_attn = operation_order.count('self_attn') + operation_order.count(
+ 'cross_attn')
+ if isinstance(attn_cfgs, dict):
+ attn_cfgs = [copy.deepcopy(attn_cfgs) for _ in range(num_attn)]
+ else:
+ assert num_attn == len(attn_cfgs), f'The length ' \
+ f'of attn_cfgs {len(attn_cfgs)} is ' \
+ f'not consistent with the number of attentions ' \
+ f'in operation_order {operation_order}.'
+
+ self.num_attn = num_attn
+ self.operation_order = operation_order
+ self.norm_cfg = norm_cfg
+ self.pre_norm = operation_order[0] == 'norm'
+ self.attentions = ModuleList()
+
+ index = 0
+ for operation_name in operation_order:
+ if operation_name in ['self_attn', 'cross_attn']:
+ if 'batch_first' in attn_cfgs[index]:
+ assert self.batch_first == attn_cfgs[index]['batch_first']
+ else:
+ attn_cfgs[index]['batch_first'] = self.batch_first
+ attention = build_attention(attn_cfgs[index])
+ # Some custom attentions used as `self_attn`
+ # or `cross_attn` can have different behavior.
+ attention.operation_name = operation_name
+ self.attentions.append(attention)
+ index += 1
+
+ self.embed_dims = self.attentions[0].embed_dims
+
+ self.ffns = ModuleList()
+ num_ffns = operation_order.count('ffn')
+ if isinstance(ffn_cfgs, dict):
+ ffn_cfgs = ConfigDict(ffn_cfgs)
+ if isinstance(ffn_cfgs, dict):
+ ffn_cfgs = [copy.deepcopy(ffn_cfgs) for _ in range(num_ffns)]
+ assert len(ffn_cfgs) == num_ffns
+ for ffn_index in range(num_ffns):
+ if 'embed_dims' not in ffn_cfgs[ffn_index]:
+ ffn_cfgs[ffn_index]['embed_dims'] = self.embed_dims
+ else:
+ assert ffn_cfgs[ffn_index]['embed_dims'] == self.embed_dims
+ self.ffns.append(
+ build_feedforward_network(ffn_cfgs[ffn_index],
+ dict(type='FFN')))
+
+ self.norms = ModuleList()
+ num_norms = operation_order.count('norm')
+ for _ in range(num_norms):
+ self.norms.append(build_norm_layer(norm_cfg, self.embed_dims)[1])
+
+ def forward(self,
+ query,
+ key=None,
+ value=None,
+ query_pos=None,
+ key_pos=None,
+ attn_masks=None,
+ query_key_padding_mask=None,
+ key_padding_mask=None,
+ **kwargs):
+ """Forward function for `TransformerDecoderLayer`.
+
+ **kwargs contains some specific arguments of attentions.
+
+ Args:
+ query (Tensor): The input query with shape
+ [num_queries, bs, embed_dims] if
+ self.batch_first is False, else
+ [bs, num_queries, embed_dims].
+ key (Tensor): The key tensor with shape [num_keys, bs,
+ embed_dims] if self.batch_first is False, else
+ [bs, num_keys, embed_dims].
+ value (Tensor): The value tensor with same shape as `key`.
+ query_pos (Tensor): The positional encoding for `query`.
+ Default: None.
+ key_pos (Tensor): The positional encoding for `key`.
+ Default: None.
+ attn_masks (List[Tensor] | None): 2D Tensor used in
+ calculation of corresponding attention. Its length
+ should equal the number of `attention` operations in
+ `operation_order`. Default: None.
+ query_key_padding_mask (Tensor): ByteTensor for `query`, with
+ shape [bs, num_queries]. Only used in `self_attn` layer.
+ Defaults to None.
+ key_padding_mask (Tensor): ByteTensor for `key`, with
+ shape [bs, num_keys]. Default: None.
+
+ Returns:
+ Tensor: forwarded results with shape [num_queries, bs, embed_dims].
+ """
+
+ norm_index = 0
+ attn_index = 0
+ ffn_index = 0
+ identity = query
+ if attn_masks is None:
+ attn_masks = [None for _ in range(self.num_attn)]
+ elif isinstance(attn_masks, torch.Tensor):
+ attn_masks = [
+ copy.deepcopy(attn_masks) for _ in range(self.num_attn)
+ ]
+ warnings.warn(f'Use same attn_mask in all attentions in '
+ f'{self.__class__.__name__} ')
+ else:
+ assert len(attn_masks) == self.num_attn, f'The length of ' \
+ f'attn_masks {len(attn_masks)} must be equal ' \
+ f'to the number of attention in ' \
+ f'operation_order {self.num_attn}'
+
+ for layer in self.operation_order:
+ if layer == 'self_attn':
+ temp_key = temp_value = query
+ query = self.attentions[attn_index](
+ query,
+ temp_key,
+ temp_value,
+ identity if self.pre_norm else None,
+ query_pos=query_pos,
+ key_pos=query_pos,
+ attn_mask=attn_masks[attn_index],
+ key_padding_mask=query_key_padding_mask,
+ **kwargs)
+ attn_index += 1
+ identity = query
+
+ elif layer == 'norm':
+ query = self.norms[norm_index](query)
+ norm_index += 1
+
+ elif layer == 'cross_attn':
+ query = self.attentions[attn_index](
+ query,
+ key,
+ value,
+ identity if self.pre_norm else None,
+ query_pos=query_pos,
+ key_pos=key_pos,
+ attn_mask=attn_masks[attn_index],
+ key_padding_mask=key_padding_mask,
+ **kwargs)
+ attn_index += 1
+ identity = query
+
+ elif layer == 'ffn':
+ query = self.ffns[ffn_index](
+ query, identity if self.pre_norm else None)
+ ffn_index += 1
+
+ return query
+
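+# A minimal config sketch for building a decoder-style BaseTransformerLayer.
+# The 'MultiheadAttention' type name is assumed to be registered in the
+# ATTENTION registry, as in the module above; shapes are illustrative:
+#
+#   layer = BaseTransformerLayer(
+#       attn_cfgs=dict(type='MultiheadAttention', embed_dims=256,
+#                      num_heads=8),
+#       ffn_cfgs=dict(type='FFN', embed_dims=256, feedforward_channels=1024),
+#       operation_order=('self_attn', 'norm', 'cross_attn', 'norm',
+#                        'ffn', 'norm'))
+#   # query: [num_queries, bs, 256], key/value: [num_keys, bs, 256]
+#   out = layer(query, key=key, value=value)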
+
+@TRANSFORMER_LAYER_SEQUENCE.register_module()
+class TransformerLayerSequence(BaseModule):
+ """Base class for TransformerEncoder and TransformerDecoder in vision
+ transformer.
+
+ As the base class of Encoder and Decoder in vision transformer,
+ it supports customization such as specifying different kinds
+ of `transformer_layer` in `transformer_coder`.
+
+ Args:
+ transformerlayers (list[obj:`mmcv.ConfigDict`] |
+ obj:`mmcv.ConfigDict`): Config of the transformer layers
+ in TransformerCoder. If it is obj:`mmcv.ConfigDict`,
+ it will be repeated `num_layers` times to a
+ list[`mmcv.ConfigDict`]. Default: None.
+ num_layers (int): The number of `TransformerLayer`. Default: None.
+ init_cfg (obj:`mmcv.ConfigDict`): The Config for initialization.
+ Default: None.
+ """
+
+ def __init__(self, transformerlayers=None, num_layers=None, init_cfg=None):
+ super(TransformerLayerSequence, self).__init__(init_cfg)
+ if isinstance(transformerlayers, dict):
+ transformerlayers = [
+ copy.deepcopy(transformerlayers) for _ in range(num_layers)
+ ]
+ else:
+ assert isinstance(transformerlayers, list) and \
+ len(transformerlayers) == num_layers
+ self.num_layers = num_layers
+ self.layers = ModuleList()
+ for i in range(num_layers):
+ self.layers.append(build_transformer_layer(transformerlayers[i]))
+ self.embed_dims = self.layers[0].embed_dims
+ self.pre_norm = self.layers[0].pre_norm
+
+ def forward(self,
+ query,
+ key,
+ value,
+ query_pos=None,
+ key_pos=None,
+ attn_masks=None,
+ query_key_padding_mask=None,
+ key_padding_mask=None,
+ **kwargs):
+ """Forward function for `TransformerCoder`.
+
+ Args:
+ query (Tensor): Input query with shape
+ `(num_queries, bs, embed_dims)`.
+ key (Tensor): The key tensor with shape
+ `(num_keys, bs, embed_dims)`.
+ value (Tensor): The value tensor with shape
+ `(num_keys, bs, embed_dims)`.
+ query_pos (Tensor): The positional encoding for `query`.
+ Default: None.
+ key_pos (Tensor): The positional encoding for `key`.
+ Default: None.
+ attn_masks (List[Tensor], optional): Each element is 2D Tensor
+ which is used in calculation of corresponding attention in
+ operation_order. Default: None.
+ query_key_padding_mask (Tensor): ByteTensor for `query`, with
+ shape [bs, num_queries]. Only used in self-attention.
+ Default: None.
+ key_padding_mask (Tensor): ByteTensor for `key`, with
+ shape [bs, num_keys]. Default: None.
+
+ Returns:
+ Tensor: results with shape [num_queries, bs, embed_dims].
+ """
+ for layer in self.layers:
+ query = layer(
+ query,
+ key,
+ value,
+ query_pos=query_pos,
+ key_pos=key_pos,
+ attn_masks=attn_masks,
+ query_key_padding_mask=query_key_padding_mask,
+ key_padding_mask=key_padding_mask,
+ **kwargs)
+ return query
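+# A minimal sketch of stacking a transformer layer config into a 6-layer
+# encoder/decoder; the inner configs are assumptions for illustration and
+# shapes follow the [n, bs, embed_dims] convention used above:
+#
+#   coder = TransformerLayerSequence(
+#       transformerlayers=dict(
+#           type='BaseTransformerLayer',
+#           attn_cfgs=dict(type='MultiheadAttention', embed_dims=256,
+#                          num_heads=8),
+#           operation_order=('self_attn', 'norm', 'ffn', 'norm')),
+#       num_layers=6)
+#   out = coder(query, query, query)      # encoder-style: key = value = query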
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/upsample.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/upsample.py
new file mode 100644
index 0000000000000000000000000000000000000000..c1388c39bf6c1693c16987682299938b82e3c311
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/upsample.py
@@ -0,0 +1,83 @@
+import torch.nn as nn
+import torch.nn.functional as F
+
+from ..utils import xavier_init
+from .registry import UPSAMPLE_LAYERS
+
+UPSAMPLE_LAYERS.register_module('nearest', module=nn.Upsample)
+UPSAMPLE_LAYERS.register_module('bilinear', module=nn.Upsample)
+
+
+@UPSAMPLE_LAYERS.register_module(name='pixel_shuffle')
+class PixelShufflePack(nn.Module):
+ """Pixel Shuffle upsample layer.
+
+ This module packs `F.pixel_shuffle()` and a nn.Conv2d module together to
+ achieve a simple upsampling with pixel shuffle.
+
+ Args:
+ in_channels (int): Number of input channels.
+ out_channels (int): Number of output channels.
+ scale_factor (int): Upsample ratio.
+ upsample_kernel (int): Kernel size of the conv layer to expand the
+ channels.
+ """
+
+ def __init__(self, in_channels, out_channels, scale_factor,
+ upsample_kernel):
+ super(PixelShufflePack, self).__init__()
+ self.in_channels = in_channels
+ self.out_channels = out_channels
+ self.scale_factor = scale_factor
+ self.upsample_kernel = upsample_kernel
+ self.upsample_conv = nn.Conv2d(
+ self.in_channels,
+ self.out_channels * scale_factor * scale_factor,
+ self.upsample_kernel,
+ padding=(self.upsample_kernel - 1) // 2)
+ self.init_weights()
+
+ def init_weights(self):
+ xavier_init(self.upsample_conv, distribution='uniform')
+
+ def forward(self, x):
+ x = self.upsample_conv(x)
+ x = F.pixel_shuffle(x, self.scale_factor)
+ return x
+
+
+def build_upsample_layer(cfg, *args, **kwargs):
+ """Build upsample layer.
+
+ Args:
+ cfg (dict): The upsample layer config, which should contain:
+
+ - type (str): Layer type.
+ - scale_factor (int): Upsample ratio, which is not applicable to
+ deconv.
+ - layer args: Args needed to instantiate an upsample layer.
+ args (argument list): Arguments passed to the ``__init__``
+ method of the corresponding upsample layer.
+ kwargs (keyword arguments): Keyword arguments passed to the
+ ``__init__`` method of the corresponding upsample layer.
+
+ Returns:
+ nn.Module: Created upsample layer.
+ """
+ if not isinstance(cfg, dict):
+ raise TypeError(f'cfg must be a dict, but got {type(cfg)}')
+ if 'type' not in cfg:
+ raise KeyError(
+ f'the cfg dict must contain the key "type", but got {cfg}')
+ cfg_ = cfg.copy()
+
+ layer_type = cfg_.pop('type')
+ if layer_type not in UPSAMPLE_LAYERS:
+ raise KeyError(f'Unrecognized upsample type {layer_type}')
+ else:
+ upsample = UPSAMPLE_LAYERS.get(layer_type)
+
+ if upsample is nn.Upsample:
+ cfg_['mode'] = layer_type
+ layer = upsample(*args, **kwargs, **cfg_)
+ return layer
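+# Minimal usage sketches for build_upsample_layer; the channel and scale
+# values are illustrative:
+#
+#   up1 = build_upsample_layer(dict(type='nearest', scale_factor=2))
+#   up2 = build_upsample_layer(
+#       dict(type='pixel_shuffle', in_channels=16, out_channels=16,
+#            scale_factor=2, upsample_kernel=3))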
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/wrappers.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/wrappers.py
new file mode 100644
index 0000000000000000000000000000000000000000..6e125b41ca92e1edeb76e5fd8c5abc69004eab8f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/bricks/wrappers.py
@@ -0,0 +1,179 @@
+r"""Modified from https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/wrappers.py # noqa: E501
+
+Wrap some nn modules to support empty tensor input. Currently, these wrappers
+are mainly used in mask heads like fcn_mask_head and maskiou_heads since mask
+heads are trained on only positive RoIs.
+"""
+import math
+
+import torch
+import torch.nn as nn
+from torch.nn.modules.utils import _pair, _triple
+
+from .registry import CONV_LAYERS, UPSAMPLE_LAYERS
+
+if torch.__version__ == 'parrots':
+ TORCH_VERSION = torch.__version__
+else:
+ # torch.__version__ could be 1.3.1+cu92, we only need the first two
+ # for comparison
+ TORCH_VERSION = tuple(int(x) for x in torch.__version__.split('.')[:2])
+
+
+def obsolete_torch_version(torch_version, version_threshold):
+ return torch_version == 'parrots' or torch_version <= version_threshold
+
+
+class NewEmptyTensorOp(torch.autograd.Function):
+
+ @staticmethod
+ def forward(ctx, x, new_shape):
+ ctx.shape = x.shape
+ return x.new_empty(new_shape)
+
+ @staticmethod
+ def backward(ctx, grad):
+ shape = ctx.shape
+ return NewEmptyTensorOp.apply(grad, shape), None
+
+
+@CONV_LAYERS.register_module('Conv', force=True)
+class Conv2d(nn.Conv2d):
+
+ def forward(self, x):
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
+ out_shape = [x.shape[0], self.out_channels]
+ for i, k, p, s, d in zip(x.shape[-2:], self.kernel_size,
+ self.padding, self.stride, self.dilation):
+ o = (i + 2 * p - (d * (k - 1) + 1)) // s + 1
+ out_shape.append(o)
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ if self.training:
+ # produce dummy gradient to avoid DDP warning.
+ dummy = sum(x.view(-1)[0] for x in self.parameters()) * 0.0
+ return empty + dummy
+ else:
+ return empty
+
+ return super().forward(x)
+
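+# Shape bookkeeping example for the empty-tensor path above: for an input of
+# shape (0, 3, 32, 32) and a 3x3 conv with stride 1, padding 1, dilation 1
+# and 8 output channels, each spatial size is
+# (32 + 2*1 - (1*(3 - 1) + 1)) // 1 + 1 = 32, so the empty output has shape
+# (0, 8, 32, 32).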
+
+@CONV_LAYERS.register_module('Conv3d', force=True)
+class Conv3d(nn.Conv3d):
+
+ def forward(self, x):
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
+ out_shape = [x.shape[0], self.out_channels]
+ for i, k, p, s, d in zip(x.shape[-3:], self.kernel_size,
+ self.padding, self.stride, self.dilation):
+ o = (i + 2 * p - (d * (k - 1) + 1)) // s + 1
+ out_shape.append(o)
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ if self.training:
+ # produce dummy gradient to avoid DDP warning.
+ dummy = sum(x.view(-1)[0] for x in self.parameters()) * 0.0
+ return empty + dummy
+ else:
+ return empty
+
+ return super().forward(x)
+
+
+@CONV_LAYERS.register_module()
+@CONV_LAYERS.register_module('deconv')
+@UPSAMPLE_LAYERS.register_module('deconv', force=True)
+class ConvTranspose2d(nn.ConvTranspose2d):
+
+ def forward(self, x):
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
+ out_shape = [x.shape[0], self.out_channels]
+ for i, k, p, s, d, op in zip(x.shape[-2:], self.kernel_size,
+ self.padding, self.stride,
+ self.dilation, self.output_padding):
+ out_shape.append((i - 1) * s - 2 * p + (d * (k - 1) + 1) + op)
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ if self.training:
+ # produce dummy gradient to avoid DDP warning.
+ dummy = sum(x.view(-1)[0] for x in self.parameters()) * 0.0
+ return empty + dummy
+ else:
+ return empty
+
+ return super().forward(x)
+
+
+@CONV_LAYERS.register_module()
+@CONV_LAYERS.register_module('deconv3d')
+@UPSAMPLE_LAYERS.register_module('deconv3d', force=True)
+class ConvTranspose3d(nn.ConvTranspose3d):
+
+ def forward(self, x):
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 4)):
+ out_shape = [x.shape[0], self.out_channels]
+ for i, k, p, s, d, op in zip(x.shape[-3:], self.kernel_size,
+ self.padding, self.stride,
+ self.dilation, self.output_padding):
+ out_shape.append((i - 1) * s - 2 * p + (d * (k - 1) + 1) + op)
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ if self.training:
+ # produce dummy gradient to avoid DDP warning.
+ dummy = sum(x.view(-1)[0] for x in self.parameters()) * 0.0
+ return empty + dummy
+ else:
+ return empty
+
+ return super().forward(x)
+
+
+class MaxPool2d(nn.MaxPool2d):
+
+ def forward(self, x):
+ # PyTorch 1.9 does not support empty tensor inference yet
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 9)):
+ out_shape = list(x.shape[:2])
+ for i, k, p, s, d in zip(x.shape[-2:], _pair(self.kernel_size),
+ _pair(self.padding), _pair(self.stride),
+ _pair(self.dilation)):
+ o = (i + 2 * p - (d * (k - 1) + 1)) / s + 1
+ o = math.ceil(o) if self.ceil_mode else math.floor(o)
+ out_shape.append(o)
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ return empty
+
+ return super().forward(x)
+
+
+class MaxPool3d(nn.MaxPool3d):
+
+ def forward(self, x):
+ # PyTorch 1.9 does not support empty tensor inference yet
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 9)):
+ out_shape = list(x.shape[:2])
+ for i, k, p, s, d in zip(x.shape[-3:], _triple(self.kernel_size),
+ _triple(self.padding),
+ _triple(self.stride),
+ _triple(self.dilation)):
+ o = (i + 2 * p - (d * (k - 1) + 1)) / s + 1
+ o = math.ceil(o) if self.ceil_mode else math.floor(o)
+ out_shape.append(o)
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ return empty
+
+ return super().forward(x)
+
+
+class Linear(torch.nn.Linear):
+
+ def forward(self, x):
+ # empty tensor forward of Linear layer is supported in PyTorch 1.6
+ if x.numel() == 0 and obsolete_torch_version(TORCH_VERSION, (1, 5)):
+ out_shape = [x.shape[0], self.out_features]
+ empty = NewEmptyTensorOp.apply(x, out_shape)
+ if self.training:
+ # produce dummy gradient to avoid DDP warning.
+ dummy = sum(x.view(-1)[0] for x in self.parameters()) * 0.0
+ return empty + dummy
+ else:
+ return empty
+
+ return super().forward(x)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/builder.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/builder.py
new file mode 100644
index 0000000000000000000000000000000000000000..7d16a61581d430769a30668c6888ccc480e7f5f2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/builder.py
@@ -0,0 +1,29 @@
+from ..runner import Sequential
+from ..utils import Registry, build_from_cfg
+
+
+def build_model_from_cfg(cfg, registry, default_args=None):
+ """Build a PyTorch model from config dict(s). Different from
+ ``build_from_cfg``, if cfg is a list, a ``nn.Sequential`` will be built.
+
+ Args:
+ cfg (dict, list[dict]): The config of modules; it is either a config
+ dict or a list of config dicts. If cfg is a list,
+ the built modules will be wrapped with ``nn.Sequential``.
+ registry (:obj:`Registry`): A registry the module belongs to.
+ default_args (dict, optional): Default arguments to build the module.
+ Defaults to None.
+
+ Returns:
+ nn.Module: A built nn module.
+ """
+ if isinstance(cfg, list):
+ modules = [
+ build_from_cfg(cfg_, registry, default_args) for cfg_ in cfg
+ ]
+ return Sequential(*modules)
+ else:
+ return build_from_cfg(cfg, registry, default_args)
+
+
+MODELS = Registry('model', build_func=build_model_from_cfg)
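+# A minimal sketch of how this registry is meant to be used; 'ExampleNet' is
+# a hypothetical module assumed to be registered in MODELS beforehand, and
+# ``Registry.build`` is assumed to dispatch to the build_func above:
+#
+#   single = MODELS.build(dict(type='ExampleNet'))
+#   seq = MODELS.build([dict(type='ExampleNet'), dict(type='ExampleNet')])
+#   # the list form wraps the built modules in an ``nn.Sequential``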
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/resnet.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/resnet.py
new file mode 100644
index 0000000000000000000000000000000000000000..8fe9a3320a46d39d7422929f59340e2e511c2e27
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/resnet.py
@@ -0,0 +1,316 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import logging
+
+import torch.nn as nn
+import torch.utils.checkpoint as cp
+
+from .utils import constant_init, kaiming_init
+
+
+def conv3x3(in_planes, out_planes, stride=1, dilation=1):
+ """3x3 convolution with padding."""
+ return nn.Conv2d(
+ in_planes,
+ out_planes,
+ kernel_size=3,
+ stride=stride,
+ padding=dilation,
+ dilation=dilation,
+ bias=False)
+
+
+class BasicBlock(nn.Module):
+ expansion = 1
+
+ def __init__(self,
+ inplanes,
+ planes,
+ stride=1,
+ dilation=1,
+ downsample=None,
+ style='pytorch',
+ with_cp=False):
+ super(BasicBlock, self).__init__()
+ assert style in ['pytorch', 'caffe']
+ self.conv1 = conv3x3(inplanes, planes, stride, dilation)
+ self.bn1 = nn.BatchNorm2d(planes)
+ self.relu = nn.ReLU(inplace=True)
+ self.conv2 = conv3x3(planes, planes)
+ self.bn2 = nn.BatchNorm2d(planes)
+ self.downsample = downsample
+ self.stride = stride
+ self.dilation = dilation
+ assert not with_cp
+
+ def forward(self, x):
+ residual = x
+
+ out = self.conv1(x)
+ out = self.bn1(out)
+ out = self.relu(out)
+
+ out = self.conv2(out)
+ out = self.bn2(out)
+
+ if self.downsample is not None:
+ residual = self.downsample(x)
+
+ out += residual
+ out = self.relu(out)
+
+ return out
+
+
+class Bottleneck(nn.Module):
+ expansion = 4
+
+ def __init__(self,
+ inplanes,
+ planes,
+ stride=1,
+ dilation=1,
+ downsample=None,
+ style='pytorch',
+ with_cp=False):
+ """Bottleneck block.
+
+ If style is "pytorch", the stride-two layer is the 3x3 conv layer, if
+ it is "caffe", the stride-two layer is the first 1x1 conv layer.
+ """
+ super(Bottleneck, self).__init__()
+ assert style in ['pytorch', 'caffe']
+ if style == 'pytorch':
+ conv1_stride = 1
+ conv2_stride = stride
+ else:
+ conv1_stride = stride
+ conv2_stride = 1
+ self.conv1 = nn.Conv2d(
+ inplanes, planes, kernel_size=1, stride=conv1_stride, bias=False)
+ self.conv2 = nn.Conv2d(
+ planes,
+ planes,
+ kernel_size=3,
+ stride=conv2_stride,
+ padding=dilation,
+ dilation=dilation,
+ bias=False)
+
+ self.bn1 = nn.BatchNorm2d(planes)
+ self.bn2 = nn.BatchNorm2d(planes)
+ self.conv3 = nn.Conv2d(
+ planes, planes * self.expansion, kernel_size=1, bias=False)
+ self.bn3 = nn.BatchNorm2d(planes * self.expansion)
+ self.relu = nn.ReLU(inplace=True)
+ self.downsample = downsample
+ self.stride = stride
+ self.dilation = dilation
+ self.with_cp = with_cp
+
+ def forward(self, x):
+
+ def _inner_forward(x):
+ residual = x
+
+ out = self.conv1(x)
+ out = self.bn1(out)
+ out = self.relu(out)
+
+ out = self.conv2(out)
+ out = self.bn2(out)
+ out = self.relu(out)
+
+ out = self.conv3(out)
+ out = self.bn3(out)
+
+ if self.downsample is not None:
+ residual = self.downsample(x)
+
+ out += residual
+
+ return out
+
+ if self.with_cp and x.requires_grad:
+ out = cp.checkpoint(_inner_forward, x)
+ else:
+ out = _inner_forward(x)
+
+ out = self.relu(out)
+
+ return out
+
+
+def make_res_layer(block,
+ inplanes,
+ planes,
+ blocks,
+ stride=1,
+ dilation=1,
+ style='pytorch',
+ with_cp=False):
+ downsample = None
+ if stride != 1 or inplanes != planes * block.expansion:
+ downsample = nn.Sequential(
+ nn.Conv2d(
+ inplanes,
+ planes * block.expansion,
+ kernel_size=1,
+ stride=stride,
+ bias=False),
+ nn.BatchNorm2d(planes * block.expansion),
+ )
+
+ layers = []
+ layers.append(
+ block(
+ inplanes,
+ planes,
+ stride,
+ dilation,
+ downsample,
+ style=style,
+ with_cp=with_cp))
+ inplanes = planes * block.expansion
+ for _ in range(1, blocks):
+ layers.append(
+ block(inplanes, planes, 1, dilation, style=style, with_cp=with_cp))
+
+ return nn.Sequential(*layers)
+
+
+class ResNet(nn.Module):
+ """ResNet backbone.
+
+ Args:
+ depth (int): Depth of resnet, from {18, 34, 50, 101, 152}.
+ num_stages (int): Resnet stages, normally 4.
+ strides (Sequence[int]): Strides of the first block of each stage.
+ dilations (Sequence[int]): Dilation of each stage.
+ out_indices (Sequence[int]): Output from which stages.
+ style (str): `pytorch` or `caffe`. If set to "pytorch", the stride-two
+ layer is the 3x3 conv layer, otherwise the stride-two layer is
+ the first 1x1 conv layer.
+ frozen_stages (int): Stages to be frozen (all param fixed). -1 means
+ not freezing any parameters.
+ bn_eval (bool): Whether to set BN layers as eval mode, namely, freeze
+ running stats (mean and var).
+ bn_frozen (bool): Whether to freeze weight and bias of BN layers.
+ with_cp (bool): Use checkpoint or not. Using checkpoint will save some
+ memory while slowing down the training speed.
+ """
+
+ arch_settings = {
+ 18: (BasicBlock, (2, 2, 2, 2)),
+ 34: (BasicBlock, (3, 4, 6, 3)),
+ 50: (Bottleneck, (3, 4, 6, 3)),
+ 101: (Bottleneck, (3, 4, 23, 3)),
+ 152: (Bottleneck, (3, 8, 36, 3))
+ }
+
+ def __init__(self,
+ depth,
+ num_stages=4,
+ strides=(1, 2, 2, 2),
+ dilations=(1, 1, 1, 1),
+ out_indices=(0, 1, 2, 3),
+ style='pytorch',
+ frozen_stages=-1,
+ bn_eval=True,
+ bn_frozen=False,
+ with_cp=False):
+ super(ResNet, self).__init__()
+ if depth not in self.arch_settings:
+ raise KeyError(f'invalid depth {depth} for resnet')
+ assert num_stages >= 1 and num_stages <= 4
+ block, stage_blocks = self.arch_settings[depth]
+ stage_blocks = stage_blocks[:num_stages]
+ assert len(strides) == len(dilations) == num_stages
+ assert max(out_indices) < num_stages
+
+ self.out_indices = out_indices
+ self.style = style
+ self.frozen_stages = frozen_stages
+ self.bn_eval = bn_eval
+ self.bn_frozen = bn_frozen
+ self.with_cp = with_cp
+
+ self.inplanes = 64
+ self.conv1 = nn.Conv2d(
+ 3, 64, kernel_size=7, stride=2, padding=3, bias=False)
+ self.bn1 = nn.BatchNorm2d(64)
+ self.relu = nn.ReLU(inplace=True)
+ self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
+
+ self.res_layers = []
+ for i, num_blocks in enumerate(stage_blocks):
+ stride = strides[i]
+ dilation = dilations[i]
+ planes = 64 * 2**i
+ res_layer = make_res_layer(
+ block,
+ self.inplanes,
+ planes,
+ num_blocks,
+ stride=stride,
+ dilation=dilation,
+ style=self.style,
+ with_cp=with_cp)
+ self.inplanes = planes * block.expansion
+ layer_name = f'layer{i + 1}'
+ self.add_module(layer_name, res_layer)
+ self.res_layers.append(layer_name)
+
+ self.feat_dim = block.expansion * 64 * 2**(len(stage_blocks) - 1)
+
+ def init_weights(self, pretrained=None):
+ if isinstance(pretrained, str):
+ logger = logging.getLogger()
+ from ..runner import load_checkpoint
+ load_checkpoint(self, pretrained, strict=False, logger=logger)
+ elif pretrained is None:
+ for m in self.modules():
+ if isinstance(m, nn.Conv2d):
+ kaiming_init(m)
+ elif isinstance(m, nn.BatchNorm2d):
+ constant_init(m, 1)
+ else:
+ raise TypeError('pretrained must be a str or None')
+
+ def forward(self, x):
+ x = self.conv1(x)
+ x = self.bn1(x)
+ x = self.relu(x)
+ x = self.maxpool(x)
+ outs = []
+ for i, layer_name in enumerate(self.res_layers):
+ res_layer = getattr(self, layer_name)
+ x = res_layer(x)
+ if i in self.out_indices:
+ outs.append(x)
+ if len(outs) == 1:
+ return outs[0]
+ else:
+ return tuple(outs)
+
+ def train(self, mode=True):
+ super(ResNet, self).train(mode)
+ if self.bn_eval:
+ for m in self.modules():
+ if isinstance(m, nn.BatchNorm2d):
+ m.eval()
+ if self.bn_frozen:
+ for params in m.parameters():
+ params.requires_grad = False
+ if mode and self.frozen_stages >= 0:
+ for param in self.conv1.parameters():
+ param.requires_grad = False
+ for param in self.bn1.parameters():
+ param.requires_grad = False
+ self.bn1.eval()
+ self.bn1.weight.requires_grad = False
+ self.bn1.bias.requires_grad = False
+ for i in range(1, self.frozen_stages + 1):
+ mod = getattr(self, f'layer{i}')
+ mod.eval()
+ for param in mod.parameters():
+ param.requires_grad = False
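+# A minimal usage sketch for this ResNet backbone; the input size is
+# illustrative:
+#
+#   model = ResNet(depth=50, out_indices=(0, 1, 2, 3))
+#   model.init_weights()
+#   x = torch.rand(1, 3, 224, 224)
+#   feats = model(x)    # tuple of 4 feature maps with strides 4/8/16/32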
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..c8a4bd51f83fc29d04a8166d5070d135b121ab47
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/__init__.py
@@ -0,0 +1,18 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from .flops_counter import get_model_complexity_info
+from .fuse_conv_bn import fuse_conv_bn
+from .weight_init import (INITIALIZERS, Caffe2XavierInit, ConstantInit,
+ KaimingInit, NormalInit, PretrainedInit,
+ TruncNormalInit, UniformInit, XavierInit,
+ bias_init_with_prob, caffe2_xavier_init,
+ constant_init, initialize, kaiming_init, normal_init,
+ trunc_normal_init, uniform_init, xavier_init)
+
+__all__ = [
+ 'get_model_complexity_info', 'bias_init_with_prob', 'caffe2_xavier_init',
+ 'constant_init', 'kaiming_init', 'normal_init', 'trunc_normal_init',
+ 'uniform_init', 'xavier_init', 'fuse_conv_bn', 'initialize',
+ 'INITIALIZERS', 'ConstantInit', 'XavierInit', 'NormalInit',
+ 'TruncNormalInit', 'UniformInit', 'KaimingInit', 'PretrainedInit',
+ 'Caffe2XavierInit'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/flops_counter.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/flops_counter.py
new file mode 100644
index 0000000000000000000000000000000000000000..dceeb398bfc8a562d406136028381326ef55e0dc
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/flops_counter.py
@@ -0,0 +1,599 @@
+# Modified from flops-counter.pytorch by Vladislav Sovrasov
+# original repo: https://github.com/sovrasov/flops-counter.pytorch
+
+# MIT License
+
+# Copyright (c) 2018 Vladislav Sovrasov
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in
+# all copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+import sys
+from functools import partial
+
+import numpy as np
+import torch
+import torch.nn as nn
+
+import mmcv
+
+
+def get_model_complexity_info(model,
+ input_shape,
+ print_per_layer_stat=True,
+ as_strings=True,
+ input_constructor=None,
+ flush=False,
+ ost=sys.stdout):
+ """Get complexity information of a model.
+
+ This method can calculate FLOPs and parameter counts of a model with
+ corresponding input shape. It can also print complexity information for
+ each layer in a model.
+
+ Supported layers are listed as below:
+ - Convolutions: ``nn.Conv1d``, ``nn.Conv2d``, ``nn.Conv3d``.
+ - Activations: ``nn.ReLU``, ``nn.PReLU``, ``nn.ELU``, ``nn.LeakyReLU``,
+ ``nn.ReLU6``.
+ - Poolings: ``nn.MaxPool1d``, ``nn.MaxPool2d``, ``nn.MaxPool3d``,
+ ``nn.AvgPool1d``, ``nn.AvgPool2d``, ``nn.AvgPool3d``,
+ ``nn.AdaptiveMaxPool1d``, ``nn.AdaptiveMaxPool2d``,
+ ``nn.AdaptiveMaxPool3d``, ``nn.AdaptiveAvgPool1d``,
+ ``nn.AdaptiveAvgPool2d``, ``nn.AdaptiveAvgPool3d``.
+ - BatchNorms: ``nn.BatchNorm1d``, ``nn.BatchNorm2d``,
+ ``nn.BatchNorm3d``, ``nn.GroupNorm``, ``nn.InstanceNorm1d``,
+ ``InstanceNorm2d``, ``InstanceNorm3d``, ``nn.LayerNorm``.
+ - Linear: ``nn.Linear``.
+ - Deconvolution: ``nn.ConvTranspose2d``.
+ - Upsample: ``nn.Upsample``.
+
+ Args:
+ model (nn.Module): The model for complexity calculation.
+ input_shape (tuple): Input shape used for calculation.
+ print_per_layer_stat (bool): Whether to print complexity information
+ for each layer in a model. Default: True.
+ as_strings (bool): Output FLOPs and params counts in a string form.
+ Default: True.
+ input_constructor (None | callable): If specified, it takes a callable
+ method that generates input. Otherwise, it will generate a random
+ tensor with input shape to calculate FLOPs. Default: None.
+ flush (bool): same as that in :func:`print`. Default: False.
+ ost (stream): same as ``file`` param in :func:`print`.
+ Default: sys.stdout.
+
+ Returns:
+ tuple[float | str]: If ``as_strings`` is set to True, it will return
+ FLOPs and parameter counts in a string format. Otherwise, it will
+ return those in a float number format.
+ """
+ assert type(input_shape) is tuple
+ assert len(input_shape) >= 1
+ assert isinstance(model, nn.Module)
+ flops_model = add_flops_counting_methods(model)
+ flops_model.eval()
+ flops_model.start_flops_count()
+ if input_constructor:
+ input = input_constructor(input_shape)
+ _ = flops_model(**input)
+ else:
+ try:
+ batch = torch.ones(()).new_empty(
+ (1, *input_shape),
+ dtype=next(flops_model.parameters()).dtype,
+ device=next(flops_model.parameters()).device)
+ except StopIteration:
+ # Avoid StopIteration for models which have no parameters,
+ # like `nn.Relu()`, `nn.AvgPool2d`, etc.
+ batch = torch.ones(()).new_empty((1, *input_shape))
+
+ _ = flops_model(batch)
+
+ flops_count, params_count = flops_model.compute_average_flops_cost()
+ if print_per_layer_stat:
+ print_model_with_flops(
+ flops_model, flops_count, params_count, ost=ost, flush=flush)
+ flops_model.stop_flops_count()
+
+ if as_strings:
+ return flops_to_string(flops_count), params_to_string(params_count)
+
+ return flops_count, params_count
+
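+# A minimal usage sketch; the model and input shape are illustrative:
+#
+#   model = nn.Conv2d(3, 8, 3)
+#   flops, params = get_model_complexity_info(
+#       model, (3, 32, 32), print_per_layer_stat=False)
+#   # with as_strings=True (the default) both values come back as
+#   # human-readable strings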
+
+def flops_to_string(flops, units='GFLOPs', precision=2):
+ """Convert FLOPs number into a string.
+
+ Note that here we take one multiply-add as one FLOP.
+
+ Args:
+ flops (float): FLOPs number to be converted.
+ units (str | None): Converted FLOPs units. Options are None, 'GFLOPs',
+ 'MFLOPs', 'KFLOPs', 'FLOPs'. If set to None, it will automatically
+ choose the most suitable unit for FLOPs. Default: 'GFLOPs'.
+ precision (int): Digit number after the decimal point. Default: 2.
+
+ Returns:
+ str: The converted FLOPs number with units.
+
+ Examples:
+ >>> flops_to_string(1e9)
+ '1.0 GFLOPs'
+ >>> flops_to_string(2e5, 'MFLOPs')
+ '0.2 MFLOPs'
+ >>> flops_to_string(3e-9, None)
+ '3e-09 FLOPs'
+ """
+ if units is None:
+ if flops // 10**9 > 0:
+ return str(round(flops / 10.**9, precision)) + ' GFLOPs'
+ elif flops // 10**6 > 0:
+ return str(round(flops / 10.**6, precision)) + ' MFLOPs'
+ elif flops // 10**3 > 0:
+ return str(round(flops / 10.**3, precision)) + ' KFLOPs'
+ else:
+ return str(flops) + ' FLOPs'
+ else:
+ if units == 'GFLOPs':
+ return str(round(flops / 10.**9, precision)) + ' ' + units
+ elif units == 'MFLOPs':
+ return str(round(flops / 10.**6, precision)) + ' ' + units
+ elif units == 'KFLOPs':
+ return str(round(flops / 10.**3, precision)) + ' ' + units
+ else:
+ return str(flops) + ' FLOPs'
+
+
+def params_to_string(num_params, units=None, precision=2):
+ """Convert parameter number into a string.
+
+ Args:
+ num_params (float): Parameter number to be converted.
+ units (str | None): Converted FLOPs units. Options are None, 'M',
+ 'K' and ''. If set to None, it will automatically choose the most
+ suitable unit for Parameter number. Default: None.
+ precision (int): Digit number after the decimal point. Default: 2.
+
+ Returns:
+ str: The converted parameter number with units.
+
+ Examples:
+ >>> params_to_string(1e9)
+ '1000.0 M'
+ >>> params_to_string(2e5)
+ '200.0 k'
+ >>> params_to_string(3e-9)
+ '3e-09'
+ """
+ if units is None:
+ if num_params // 10**6 > 0:
+ return str(round(num_params / 10**6, precision)) + ' M'
+ elif num_params // 10**3:
+ return str(round(num_params / 10**3, precision)) + ' k'
+ else:
+ return str(num_params)
+ else:
+ if units == 'M':
+ return str(round(num_params / 10.**6, precision)) + ' ' + units
+ elif units == 'K':
+ return str(round(num_params / 10.**3, precision)) + ' ' + units
+ else:
+ return str(num_params)
+
+
+def print_model_with_flops(model,
+ total_flops,
+ total_params,
+ units='GFLOPs',
+ precision=3,
+ ost=sys.stdout,
+ flush=False):
+ """Print a model with FLOPs for each layer.
+
+ Args:
+ model (nn.Module): The model to be printed.
+ total_flops (float): Total FLOPs of the model.
+ total_params (float): Total parameter counts of the model.
+ units (str | None): Converted FLOPs units. Default: 'GFLOPs'.
+ precision (int): Digit number after the decimal point. Default: 3.
+ ost (stream): same as `file` param in :func:`print`.
+ Default: sys.stdout.
+ flush (bool): same as that in :func:`print`. Default: False.
+
+ Example:
+ >>> class ExampleModel(nn.Module):
+
+ >>> def __init__(self):
+ >>> super().__init__()
+ >>> self.conv1 = nn.Conv2d(3, 8, 3)
+ >>> self.conv2 = nn.Conv2d(8, 256, 3)
+ >>> self.conv3 = nn.Conv2d(256, 8, 3)
+ >>> self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
+ >>> self.flatten = nn.Flatten()
+ >>> self.fc = nn.Linear(8, 1)
+
+ >>> def forward(self, x):
+ >>> x = self.conv1(x)
+ >>> x = self.conv2(x)
+ >>> x = self.conv3(x)
+ >>> x = self.avg_pool(x)
+ >>> x = self.flatten(x)
+ >>> x = self.fc(x)
+ >>> return x
+
+ >>> model = ExampleModel()
+ >>> x = (3, 16, 16)
+ to print the complexity information state for each layer, you can use
+ >>> get_model_complexity_info(model, x)
+ or directly use
+ >>> print_model_with_flops(model, 4579784.0, 37361)
+ ExampleModel(
+ 0.037 M, 100.000% Params, 0.005 GFLOPs, 100.000% FLOPs,
+ (conv1): Conv2d(0.0 M, 0.600% Params, 0.0 GFLOPs, 0.959% FLOPs, 3, 8, kernel_size=(3, 3), stride=(1, 1)) # noqa: E501
+ (conv2): Conv2d(0.019 M, 50.020% Params, 0.003 GFLOPs, 58.760% FLOPs, 8, 256, kernel_size=(3, 3), stride=(1, 1))
+ (conv3): Conv2d(0.018 M, 49.356% Params, 0.002 GFLOPs, 40.264% FLOPs, 256, 8, kernel_size=(3, 3), stride=(1, 1))
+ (avg_pool): AdaptiveAvgPool2d(0.0 M, 0.000% Params, 0.0 GFLOPs, 0.017% FLOPs, output_size=(1, 1))
+ (flatten): Flatten(0.0 M, 0.000% Params, 0.0 GFLOPs, 0.000% FLOPs, )
+ (fc): Linear(0.0 M, 0.024% Params, 0.0 GFLOPs, 0.000% FLOPs, in_features=8, out_features=1, bias=True)
+ )
+ """
+
+ def accumulate_params(self):
+ if is_supported_instance(self):
+ return self.__params__
+ else:
+ sum = 0
+ for m in self.children():
+ sum += m.accumulate_params()
+ return sum
+
+ def accumulate_flops(self):
+ if is_supported_instance(self):
+ return self.__flops__ / model.__batch_counter__
+ else:
+ sum = 0
+ for m in self.children():
+ sum += m.accumulate_flops()
+ return sum
+
+ def flops_repr(self):
+ accumulated_num_params = self.accumulate_params()
+ accumulated_flops_cost = self.accumulate_flops()
+ return ', '.join([
+ params_to_string(
+ accumulated_num_params, units='M', precision=precision),
+ '{:.3%} Params'.format(accumulated_num_params / total_params),
+ flops_to_string(
+ accumulated_flops_cost, units=units, precision=precision),
+ '{:.3%} FLOPs'.format(accumulated_flops_cost / total_flops),
+ self.original_extra_repr()
+ ])
+
+ def add_extra_repr(m):
+ m.accumulate_flops = accumulate_flops.__get__(m)
+ m.accumulate_params = accumulate_params.__get__(m)
+ flops_extra_repr = flops_repr.__get__(m)
+ if m.extra_repr != flops_extra_repr:
+ m.original_extra_repr = m.extra_repr
+ m.extra_repr = flops_extra_repr
+ assert m.extra_repr != m.original_extra_repr
+
+ def del_extra_repr(m):
+ if hasattr(m, 'original_extra_repr'):
+ m.extra_repr = m.original_extra_repr
+ del m.original_extra_repr
+ if hasattr(m, 'accumulate_flops'):
+ del m.accumulate_flops
+
+ model.apply(add_extra_repr)
+ print(model, file=ost, flush=flush)
+ model.apply(del_extra_repr)
+
+
+def get_model_parameters_number(model):
+ """Calculate parameter number of a model.
+
+ Args:
+ model (nn.module): The model for parameter number calculation.
+
+ Returns:
+ int: Parameter number of the model.
+ """
+ num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
+ return num_params
+
+
+def add_flops_counting_methods(net_main_module):
+ # adding additional methods to the existing module object,
+ # this is done this way so that each function has access to self object
+ net_main_module.start_flops_count = start_flops_count.__get__(
+ net_main_module)
+ net_main_module.stop_flops_count = stop_flops_count.__get__(
+ net_main_module)
+ net_main_module.reset_flops_count = reset_flops_count.__get__(
+ net_main_module)
+ net_main_module.compute_average_flops_cost = compute_average_flops_cost.__get__( # noqa: E501
+ net_main_module)
+
+ net_main_module.reset_flops_count()
+
+ return net_main_module
+
+
+def compute_average_flops_cost(self):
+ """Compute average FLOPs cost.
+
+ A method to compute average FLOPs cost, which will be available after
+ `add_flops_counting_methods()` is called on a desired net object.
+
+ Returns:
+ tuple: Current mean flops consumption per image and the
+ parameter number of the model.
+ """
+ batches_count = self.__batch_counter__
+ flops_sum = 0
+ for module in self.modules():
+ if is_supported_instance(module):
+ flops_sum += module.__flops__
+ params_sum = get_model_parameters_number(self)
+ return flops_sum / batches_count, params_sum
+
+
+def start_flops_count(self):
+ """Activate the computation of mean flops consumption per image.
+
+ A method to activate the computation of mean flops consumption per image,
+ which will be available after ``add_flops_counting_methods()`` is called on
+ a desired net object. It should be called before running the network.
+ """
+ add_batch_counter_hook_function(self)
+
+ def add_flops_counter_hook_function(module):
+ if is_supported_instance(module):
+ if hasattr(module, '__flops_handle__'):
+ return
+
+ else:
+ handle = module.register_forward_hook(
+ get_modules_mapping()[type(module)])
+
+ module.__flops_handle__ = handle
+
+ self.apply(partial(add_flops_counter_hook_function))
+
+
+def stop_flops_count(self):
+ """Stop computing the mean flops consumption per image.
+
+ A method to stop computing the mean flops consumption per image, which will
+ be available after ``add_flops_counting_methods()`` is called on a desired
+ net object. It can be called to pause the computation at any time.
+ """
+ remove_batch_counter_hook_function(self)
+ self.apply(remove_flops_counter_hook_function)
+
+
+def reset_flops_count(self):
+ """Reset statistics computed so far.
+
+ A method to reset computed statistics, which will be available after
+ `add_flops_counting_methods()` is called on a desired net object.
+ """
+ add_batch_counter_variables_or_reset(self)
+ self.apply(add_flops_counter_variable_or_reset)
+
+
+# ---- Internal functions
+def empty_flops_counter_hook(module, input, output):
+ module.__flops__ += 0
+
+
+def upsample_flops_counter_hook(module, input, output):
+ output_size = output[0]
+ batch_size = output_size.shape[0]
+ output_elements_count = batch_size
+ for val in output_size.shape[1:]:
+ output_elements_count *= val
+ module.__flops__ += int(output_elements_count)
+
+
+def relu_flops_counter_hook(module, input, output):
+ active_elements_count = output.numel()
+ module.__flops__ += int(active_elements_count)
+
+
+def linear_flops_counter_hook(module, input, output):
+ input = input[0]
+ output_last_dim = output.shape[
+ -1] # pytorch checks dimensions, so here we don't care much
+ module.__flops__ += int(np.prod(input.shape) * output_last_dim)
+
+
+def pool_flops_counter_hook(module, input, output):
+ input = input[0]
+ module.__flops__ += int(np.prod(input.shape))
+
+
+def norm_flops_counter_hook(module, input, output):
+ input = input[0]
+
+ batch_flops = np.prod(input.shape)
+ if (getattr(module, 'affine', False)
+ or getattr(module, 'elementwise_affine', False)):
+ batch_flops *= 2
+ module.__flops__ += int(batch_flops)
+
+
+def deconv_flops_counter_hook(conv_module, input, output):
+ # Can have multiple inputs, getting the first one
+ input = input[0]
+
+ batch_size = input.shape[0]
+ input_height, input_width = input.shape[2:]
+
+ kernel_height, kernel_width = conv_module.kernel_size
+ in_channels = conv_module.in_channels
+ out_channels = conv_module.out_channels
+ groups = conv_module.groups
+
+ filters_per_channel = out_channels // groups
+ conv_per_position_flops = (
+ kernel_height * kernel_width * in_channels * filters_per_channel)
+
+ active_elements_count = batch_size * input_height * input_width
+ overall_conv_flops = conv_per_position_flops * active_elements_count
+ bias_flops = 0
+ if conv_module.bias is not None:
+ output_height, output_width = output.shape[2:]
+ bias_flops = out_channels * batch_size * output_height * output_width
+ overall_flops = overall_conv_flops + bias_flops
+
+ conv_module.__flops__ += int(overall_flops)
+
+
+def conv_flops_counter_hook(conv_module, input, output):
+ # Can have multiple inputs, getting the first one
+ input = input[0]
+
+ batch_size = input.shape[0]
+ output_dims = list(output.shape[2:])
+
+ kernel_dims = list(conv_module.kernel_size)
+ in_channels = conv_module.in_channels
+ out_channels = conv_module.out_channels
+ groups = conv_module.groups
+
+ filters_per_channel = out_channels // groups
+ conv_per_position_flops = int(
+ np.prod(kernel_dims)) * in_channels * filters_per_channel
+
+ active_elements_count = batch_size * int(np.prod(output_dims))
+
+ overall_conv_flops = conv_per_position_flops * active_elements_count
+
+ bias_flops = 0
+
+ if conv_module.bias is not None:
+
+ bias_flops = out_channels * active_elements_count
+
+ overall_flops = overall_conv_flops + bias_flops
+
+ conv_module.__flops__ += int(overall_flops)
+
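+# Worked example for the conv hook above: a 3x3 conv with 3 input and 8
+# output channels (groups=1, bias, padding=1, stride=1) on a 1x3x32x32 input
+# produces a 32x32 output, so it counts 3*3*3*8 = 216 multiply-adds per
+# output position over 1*32*32 = 1024 positions, plus 8*1024 bias adds:
+# 216*1024 + 8*1024 = 229376 FLOPs (one multiply-add counted as one FLOP).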
+
+def batch_counter_hook(module, input, output):
+ batch_size = 1
+ if len(input) > 0:
+ # Can have multiple inputs, getting the first one
+ input = input[0]
+ batch_size = len(input)
+ else:
+ print('Warning! No positional inputs found for a module, '
+ 'assuming batch size is 1.')
+ module.__batch_counter__ += batch_size
+
+
+def add_batch_counter_variables_or_reset(module):
+
+ module.__batch_counter__ = 0
+
+
+def add_batch_counter_hook_function(module):
+ if hasattr(module, '__batch_counter_handle__'):
+ return
+
+ handle = module.register_forward_hook(batch_counter_hook)
+ module.__batch_counter_handle__ = handle
+
+
+def remove_batch_counter_hook_function(module):
+ if hasattr(module, '__batch_counter_handle__'):
+ module.__batch_counter_handle__.remove()
+ del module.__batch_counter_handle__
+
+
+def add_flops_counter_variable_or_reset(module):
+ if is_supported_instance(module):
+ if hasattr(module, '__flops__') or hasattr(module, '__params__'):
+ print('Warning: variables __flops__ or __params__ are already '
+ 'defined for the module ' + type(module).__name__ +
+ '. ptflops can affect your code!')
+ module.__flops__ = 0
+ module.__params__ = get_model_parameters_number(module)
+
+
+def is_supported_instance(module):
+ if type(module) in get_modules_mapping():
+ return True
+ return False
+
+
+def remove_flops_counter_hook_function(module):
+ if is_supported_instance(module):
+ if hasattr(module, '__flops_handle__'):
+ module.__flops_handle__.remove()
+ del module.__flops_handle__
+
+
+def get_modules_mapping():
+ return {
+ # convolutions
+ nn.Conv1d: conv_flops_counter_hook,
+ nn.Conv2d: conv_flops_counter_hook,
+ mmcv.cnn.bricks.Conv2d: conv_flops_counter_hook,
+ nn.Conv3d: conv_flops_counter_hook,
+ mmcv.cnn.bricks.Conv3d: conv_flops_counter_hook,
+ # activations
+ nn.ReLU: relu_flops_counter_hook,
+ nn.PReLU: relu_flops_counter_hook,
+ nn.ELU: relu_flops_counter_hook,
+ nn.LeakyReLU: relu_flops_counter_hook,
+ nn.ReLU6: relu_flops_counter_hook,
+ # poolings
+ nn.MaxPool1d: pool_flops_counter_hook,
+ nn.AvgPool1d: pool_flops_counter_hook,
+ nn.AvgPool2d: pool_flops_counter_hook,
+ nn.MaxPool2d: pool_flops_counter_hook,
+ mmcv.cnn.bricks.MaxPool2d: pool_flops_counter_hook,
+ nn.MaxPool3d: pool_flops_counter_hook,
+ mmcv.cnn.bricks.MaxPool3d: pool_flops_counter_hook,
+ nn.AvgPool3d: pool_flops_counter_hook,
+ nn.AdaptiveMaxPool1d: pool_flops_counter_hook,
+ nn.AdaptiveAvgPool1d: pool_flops_counter_hook,
+ nn.AdaptiveMaxPool2d: pool_flops_counter_hook,
+ nn.AdaptiveAvgPool2d: pool_flops_counter_hook,
+ nn.AdaptiveMaxPool3d: pool_flops_counter_hook,
+ nn.AdaptiveAvgPool3d: pool_flops_counter_hook,
+ # normalizations
+ nn.BatchNorm1d: norm_flops_counter_hook,
+ nn.BatchNorm2d: norm_flops_counter_hook,
+ nn.BatchNorm3d: norm_flops_counter_hook,
+ nn.GroupNorm: norm_flops_counter_hook,
+ nn.InstanceNorm1d: norm_flops_counter_hook,
+ nn.InstanceNorm2d: norm_flops_counter_hook,
+ nn.InstanceNorm3d: norm_flops_counter_hook,
+ nn.LayerNorm: norm_flops_counter_hook,
+ # FC
+ nn.Linear: linear_flops_counter_hook,
+ mmcv.cnn.bricks.Linear: linear_flops_counter_hook,
+ # Upscale
+ nn.Upsample: upsample_flops_counter_hook,
+ # Deconvolution
+ nn.ConvTranspose2d: deconv_flops_counter_hook,
+ mmcv.cnn.bricks.ConvTranspose2d: deconv_flops_counter_hook,
+ }
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/fuse_conv_bn.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/fuse_conv_bn.py
new file mode 100644
index 0000000000000000000000000000000000000000..31578be9202d080c01c281d399036efa01a64d61
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/fuse_conv_bn.py
@@ -0,0 +1,58 @@
+import torch
+import torch.nn as nn
+
+
+def _fuse_conv_bn(conv, bn):
+ """Fuse conv and bn into one module.
+
+ Args:
+ conv (nn.Module): Conv to be fused.
+ bn (nn.Module): BN to be fused.
+
+ Returns:
+ nn.Module: Fused module.
+ """
+ conv_w = conv.weight
+ conv_b = conv.bias if conv.bias is not None else torch.zeros_like(
+ bn.running_mean)
+
+ factor = bn.weight / torch.sqrt(bn.running_var + bn.eps)
+ conv.weight = nn.Parameter(conv_w *
+ factor.reshape([conv.out_channels, 1, 1, 1]))
+ conv.bias = nn.Parameter((conv_b - bn.running_mean) * factor + bn.bias)
+ return conv
+
+
+def fuse_conv_bn(module):
+ """Recursively fuse conv and bn in a module.
+
+ During inference, the functionality of batch norm layers is turned
+ off and only the per-channel running mean and variance are used,
+ which makes it possible to fuse them with the preceding conv layers
+ to save computation and simplify network structures.
+
+ Args:
+ module (nn.Module): Module to be fused.
+
+ Returns:
+ nn.Module: Fused module.
+ """
+ last_conv = None
+ last_conv_name = None
+
+ for name, child in module.named_children():
+ if isinstance(child,
+ (nn.modules.batchnorm._BatchNorm, nn.SyncBatchNorm)):
+ if last_conv is None: # only fuse BN that is after Conv
+ continue
+ fused_conv = _fuse_conv_bn(last_conv, child)
+ module._modules[last_conv_name] = fused_conv
+ # To reduce changes, set BN as Identity instead of deleting it.
+ module._modules[name] = nn.Identity()
+ last_conv = None
+ elif isinstance(child, nn.Conv2d):
+ last_conv = child
+ last_conv_name = name
+ else:
+ fuse_conv_bn(child)
+ return module
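+# A minimal usage sketch; the tiny model here is illustrative and fusion is
+# intended for inference, after the BN running statistics are populated:
+#
+#   model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
+#   model.eval()
+#   fused = fuse_conv_bn(model)   # BN folded into the conv, BN -> Identity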
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/weight_init.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/weight_init.py
new file mode 100644
index 0000000000000000000000000000000000000000..36303a22c38837a4660839f02bca5d553ebde3a3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/utils/weight_init.py
@@ -0,0 +1,599 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import copy
+import math
+import warnings
+
+import numpy as np
+import torch
+import torch.nn as nn
+from torch import Tensor
+
+from mmcv.utils import Registry, build_from_cfg, get_logger, print_log
+
+INITIALIZERS = Registry('initializer')
+
+
+def constant_init(module, val, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.constant_(module.weight, val)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def xavier_init(module, gain=1, bias=0, distribution='normal'):
+ assert distribution in ['uniform', 'normal']
+ if hasattr(module, 'weight') and module.weight is not None:
+ if distribution == 'uniform':
+ nn.init.xavier_uniform_(module.weight, gain=gain)
+ else:
+ nn.init.xavier_normal_(module.weight, gain=gain)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def normal_init(module, mean=0, std=1, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.normal_(module.weight, mean, std)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def trunc_normal_init(module: nn.Module,
+ mean: float = 0,
+ std: float = 1,
+ a: float = -2,
+ b: float = 2,
+ bias: float = 0) -> None:
+ if hasattr(module, 'weight') and module.weight is not None:
+ trunc_normal_(module.weight, mean, std, a, b) # type: ignore
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias) # type: ignore
+
+
+def uniform_init(module, a=0, b=1, bias=0):
+ if hasattr(module, 'weight') and module.weight is not None:
+ nn.init.uniform_(module.weight, a, b)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def kaiming_init(module,
+ a=0,
+ mode='fan_out',
+ nonlinearity='relu',
+ bias=0,
+ distribution='normal'):
+ assert distribution in ['uniform', 'normal']
+ if hasattr(module, 'weight') and module.weight is not None:
+ if distribution == 'uniform':
+ nn.init.kaiming_uniform_(
+ module.weight, a=a, mode=mode, nonlinearity=nonlinearity)
+ else:
+ nn.init.kaiming_normal_(
+ module.weight, a=a, mode=mode, nonlinearity=nonlinearity)
+ if hasattr(module, 'bias') and module.bias is not None:
+ nn.init.constant_(module.bias, bias)
+
+
+def caffe2_xavier_init(module, bias=0):
+ # `XavierFill` in Caffe2 corresponds to `kaiming_uniform_` in PyTorch
+ # Acknowledgment to FAIR's internal code
+ kaiming_init(
+ module,
+ a=1,
+ mode='fan_in',
+ nonlinearity='leaky_relu',
+ bias=bias,
+ distribution='uniform')
+
+
+def bias_init_with_prob(prior_prob):
+ """initialize conv/fc bias value according to a given probability value."""
+ bias_init = float(-np.log((1 - prior_prob) / prior_prob))
+ return bias_init
+
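+# Worked example: with prior_prob = 0.01 the bias is initialized to
+# -log((1 - 0.01) / 0.01) which is about -4.595, so a sigmoid over the
+# initial logits yields roughly 0.01, the desired prior foreground
+# probability.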
+
+def _get_bases_name(m):
+ return [b.__name__ for b in m.__class__.__bases__]
+
+
+class BaseInit(object):
+
+ def __init__(self, *, bias=0, bias_prob=None, layer=None):
+ self.wholemodule = False
+ if not isinstance(bias, (int, float)):
+ raise TypeError(f'bias must be a number, but got a {type(bias)}')
+
+ if bias_prob is not None:
+ if not isinstance(bias_prob, float):
+ raise TypeError(f'bias_prob type must be float, \
+ but got {type(bias_prob)}')
+
+ if layer is not None:
+ if not isinstance(layer, (str, list)):
+ raise TypeError(f'layer must be a str or a list of str, \
+ but got a {type(layer)}')
+ else:
+ layer = []
+
+ if bias_prob is not None:
+ self.bias = bias_init_with_prob(bias_prob)
+ else:
+ self.bias = bias
+ self.layer = [layer] if isinstance(layer, str) else layer
+
+
+@INITIALIZERS.register_module(name='Constant')
+class ConstantInit(BaseInit):
+ """Initialize module parameters with constant values.
+
+ Args:
+ val (int | float): the value to fill the weights in the module with
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self, val, **kwargs):
+ super().__init__(**kwargs)
+ self.val = val
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ constant_init(m, self.val, self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ constant_init(m, self.val, self.bias)
+
+ module.apply(init)
+
+
+@INITIALIZERS.register_module(name='Xavier')
+class XavierInit(BaseInit):
+ r"""Initialize module parameters with values according to the method
+ described in `Understanding the difficulty of training deep feedforward
+ neural networks - Glorot, X. & Bengio, Y. (2010).
+ `_
+
+ Args:
+ gain (int | float): an optional scaling factor. Defaults to 1.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ distribution (str): distribution either be ``'normal'``
+ or ``'uniform'``. Defaults to ``'normal'``.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self, gain=1, distribution='normal', **kwargs):
+ super().__init__(**kwargs)
+ self.gain = gain
+ self.distribution = distribution
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ xavier_init(m, self.gain, self.bias, self.distribution)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ xavier_init(m, self.gain, self.bias, self.distribution)
+
+ module.apply(init)
+
+
+@INITIALIZERS.register_module(name='Normal')
+class NormalInit(BaseInit):
+ r"""Initialize module parameters with the values drawn from the normal
+ distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)`.
+
+ Args:
+ mean (int | float): the mean of the normal distribution. Defaults to 0.
+ std (int | float): the standard deviation of the normal distribution.
+ Defaults to 1.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+
+ """
+
+ def __init__(self, mean=0, std=1, **kwargs):
+ super().__init__(**kwargs)
+ self.mean = mean
+ self.std = std
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ normal_init(m, self.mean, self.std, self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ normal_init(m, self.mean, self.std, self.bias)
+
+ module.apply(init)
+
+
+@INITIALIZERS.register_module(name='TruncNormal')
+class TruncNormalInit(BaseInit):
+ r"""Initialize module parameters with the values drawn from the normal
+ distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)` with values
+ outside :math:`[a, b]` redrawn until they are within the bounds.
+
+ Args:
+ mean (float): the mean of the normal distribution. Defaults to 0.
+ std (float): the standard deviation of the normal distribution.
+ Defaults to 1.
+ a (float): the minimum cutoff value. Defaults to -2.
+ b (float): the maximum cutoff value. Defaults to 2.
+ bias (float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+
+ """
+
+ def __init__(self,
+ mean: float = 0,
+ std: float = 1,
+ a: float = -2,
+ b: float = 2,
+ **kwargs) -> None:
+ super().__init__(**kwargs)
+ self.mean = mean
+ self.std = std
+ self.a = a
+ self.b = b
+
+ def __call__(self, module: nn.Module) -> None:
+
+ def init(m):
+ if self.wholemodule:
+ trunc_normal_init(m, self.mean, self.std, self.a, self.b,
+ self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ trunc_normal_init(m, self.mean, self.std, self.a, self.b,
+ self.bias)
+
+ module.apply(init)
+
+
+@INITIALIZERS.register_module(name='Uniform')
+class UniformInit(BaseInit):
+ r"""Initialize module parameters with values drawn from the uniform
+ distribution :math:`\mathcal{U}(a, b)`.
+
+ Args:
+ a (int | float): the lower bound of the uniform distribution.
+ Defaults to 0.
+ b (int | float): the upper bound of the uniform distribution.
+ Defaults to 1.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self, a=0, b=1, **kwargs):
+ super().__init__(**kwargs)
+ self.a = a
+ self.b = b
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ uniform_init(m, self.a, self.b, self.bias)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ uniform_init(m, self.a, self.b, self.bias)
+
+ module.apply(init)
+
+
+@INITIALIZERS.register_module(name='Kaiming')
+class KaimingInit(BaseInit):
+ r"""Initialize module paramters with the valuse according to the method
+ described in `Delving deep into rectifiers: Surpassing human-level
+ performance on ImageNet classification - He, K. et al. (2015).
+ `_
+
+ Args:
+ a (int | float): the negative slope of the rectifier used after this
+ layer (only used with ``'leaky_relu'``). Defaults to 0.
+ mode (str): either ``'fan_in'`` or ``'fan_out'``. Choosing
+ ``'fan_in'`` preserves the magnitude of the variance of the weights
+ in the forward pass. Choosing ``'fan_out'`` preserves the
+ magnitudes in the backwards pass. Defaults to ``'fan_out'``.
+ nonlinearity (str): the non-linear function (`nn.functional` name),
+ recommended to use only with ``'relu'`` or ``'leaky_relu'`` .
+ Defaults to 'relu'.
+ bias (int | float): the value to fill the bias. Defaults to 0.
+ bias_prob (float, optional): the probability for bias initialization.
+ Defaults to None.
+ distribution (str): distribution either be ``'normal'`` or
+ ``'uniform'``. Defaults to ``'normal'``.
+ layer (str | list[str], optional): the layer(s) to be initialized.
+ Defaults to None.
+ """
+
+ def __init__(self,
+ a=0,
+ mode='fan_out',
+ nonlinearity='relu',
+ distribution='normal',
+ **kwargs):
+ super().__init__(**kwargs)
+ self.a = a
+ self.mode = mode
+ self.nonlinearity = nonlinearity
+ self.distribution = distribution
+
+ def __call__(self, module):
+
+ def init(m):
+ if self.wholemodule:
+ kaiming_init(m, self.a, self.mode, self.nonlinearity,
+ self.bias, self.distribution)
+ else:
+ layername = m.__class__.__name__
+ basesname = _get_bases_name(m)
+ if len(set(self.layer) & set([layername] + basesname)):
+ kaiming_init(m, self.a, self.mode, self.nonlinearity,
+ self.bias, self.distribution)
+
+ module.apply(init)
+
+
+@INITIALIZERS.register_module(name='Caffe2Xavier')
+class Caffe2XavierInit(KaimingInit):
+ # `XavierFill` in Caffe2 corresponds to `kaiming_uniform_` in PyTorch
+ # Acknowledgment to FAIR's internal code
+ def __init__(self, **kwargs):
+ super().__init__(
+ a=1,
+ mode='fan_in',
+ nonlinearity='leaky_relu',
+ distribution='uniform',
+ **kwargs)
+
+ def __call__(self, module):
+ super().__call__(module)
+
+
+@INITIALIZERS.register_module(name='Pretrained')
+class PretrainedInit(object):
+ """Initialize module by loading a pretrained model.
+
+ Args:
+ checkpoint (str): the checkpoint file of the pretrained model to be
+ loaded.
+ prefix (str, optional): the prefix of a sub-module in the pretrained
+ model. It is for loading a part of the pretrained model to
+ initialize. For example, if we would like to load only the
+ backbone of a detector model, we can set ``prefix='backbone.'``.
+ Defaults to None.
+ map_location (str): map tensors into proper locations.
+ """
+
+ def __init__(self, checkpoint, prefix=None, map_location=None):
+ self.checkpoint = checkpoint
+ self.prefix = prefix
+ self.map_location = map_location
+
+ def __call__(self, module):
+ from mmcv.runner import (_load_checkpoint_with_prefix, load_checkpoint,
+ load_state_dict)
+ logger = get_logger('mmcv')
+ if self.prefix is None:
+ print_log(f'load model from: {self.checkpoint}', logger=logger)
+ load_checkpoint(
+ module,
+ self.checkpoint,
+ map_location=self.map_location,
+ strict=False,
+ logger=logger)
+ else:
+ print_log(
+ f'load {self.prefix} in model from: {self.checkpoint}',
+ logger=logger)
+ state_dict = _load_checkpoint_with_prefix(
+ self.prefix, self.checkpoint, map_location=self.map_location)
+ load_state_dict(module, state_dict, strict=False, logger=logger)
+
+
+def _initialize(module, cfg, wholemodule=False):
+ func = build_from_cfg(cfg, INITIALIZERS)
+ # The `wholemodule` flag is for the override mode: an override dict has
+ # no `layer` key, so the initializer assigns init values to the whole
+ # sub-module whose name is given in the override.
+ func.wholemodule = wholemodule
+ func(module)
+
+
+def _initialize_override(module, override, cfg):
+ if not isinstance(override, (dict, list)):
+ raise TypeError(f'override must be a dict or a list of dict, \
+ but got {type(override)}')
+
+ override = [override] if isinstance(override, dict) else override
+
+ for override_ in override:
+
+ cp_override = copy.deepcopy(override_)
+ name = cp_override.pop('name', None)
+ if name is None:
+ raise ValueError('`override` must contain the key "name",'
+ f'but got {cp_override}')
+ # if override only has the name key, it means using the args in init_cfg
+ if not cp_override:
+ cp_override.update(cfg)
+ # if override has the name key and other args but no type key, it
+ # will raise an error
+ elif 'type' not in cp_override.keys():
+ raise ValueError(
+ f'`override` need "type" key, but got {cp_override}')
+
+ if hasattr(module, name):
+ _initialize(getattr(module, name), cp_override, wholemodule=True)
+ else:
+ raise RuntimeError(f'module did not have attribute {name}, '
+ f'but init_cfg is {cp_override}.')
+
+
+def initialize(module, init_cfg):
+ """Initialize a module.
+
+ Args:
+ module (``torch.nn.Module``): the module to be initialized.
+ init_cfg (dict | list[dict]): initialization configuration dict to
+ define initializer. OpenMMLab has implemented several initializers
+ including ``Constant``, ``Xavier``, ``Normal``, ``TruncNormal``,
+ ``Uniform``, ``Kaiming``, ``Caffe2Xavier`` and ``Pretrained``.
+ Example:
+ >>> module = nn.Linear(2, 3, bias=True)
+ >>> init_cfg = dict(type='Constant', layer='Linear', val=1, bias=2)
+ >>> initialize(module, init_cfg)
+
+ >>> module = nn.Sequential(nn.Conv1d(3, 1, 3), nn.Linear(1,2))
+ >>> # define key ``'layer'`` for initializing layer with different
+ >>> # configuration
+ >>> init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
+ dict(type='Constant', layer='Linear', val=2)]
+ >>> initialize(module, init_cfg)
+
+ >>> # define key``'override'`` to initialize some specific part in
+ >>> # module
+ >>> class FooNet(nn.Module):
+ >>> def __init__(self):
+ >>> super().__init__()
+ >>> self.feat = nn.Conv2d(3, 16, 3)
+ >>> self.reg = nn.Conv2d(16, 10, 3)
+ >>> self.cls = nn.Conv2d(16, 5, 3)
+ >>> model = FooNet()
+ >>> init_cfg = dict(type='Constant', val=1, bias=2, layer='Conv2d',
+ >>> override=dict(type='Constant', name='reg', val=3, bias=4))
+ >>> initialize(model, init_cfg)
+
+ >>> model = ResNet(depth=50)
+ >>> # Initialize weights with the pretrained model.
+ >>> init_cfg = dict(type='Pretrained',
+ checkpoint='torchvision://resnet50')
+ >>> initialize(model, init_cfg)
+
+ >>> # Initialize weights of a sub-module with the specific part of
+ >>> # a pretrained model by using "prefix".
+ >>> url = 'http://download.openmmlab.com/mmdetection/v2.0/retinanet/'\
+ >>> 'retinanet_r50_fpn_1x_coco/'\
+ >>> 'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'
+ >>> init_cfg = dict(type='Pretrained',
+ checkpoint=url, prefix='backbone.')
+ """
+ if not isinstance(init_cfg, (dict, list)):
+ raise TypeError(f'init_cfg must be a dict or a list of dict, \
+ but got {type(init_cfg)}')
+
+ if isinstance(init_cfg, dict):
+ init_cfg = [init_cfg]
+
+ for cfg in init_cfg:
+ # should deeply copy the original config because cfg may be used by
+ # other modules, e.g., one init_cfg shared by multiple bottleneck
+ # blocks, the expected cfg will be changed after pop and will change
+ # the initialization behavior of other modules
+ cp_cfg = copy.deepcopy(cfg)
+ override = cp_cfg.pop('override', None)
+ _initialize(module, cp_cfg)
+
+ if override is not None:
+ cp_cfg.pop('layer', None)
+ _initialize_override(module, override, cp_cfg)
+ else:
+ # All attributes in module have same initialization.
+ pass
+
+
+def _no_grad_trunc_normal_(tensor: Tensor, mean: float, std: float, a: float,
+ b: float) -> Tensor:
+ # Method based on
+ # https://people.sc.fsu.edu/~jburkardt/presentations/truncated_normal.pdf
+ # Modified from
+ # https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py
+ def norm_cdf(x):
+ # Computes standard normal cumulative distribution function
+ return (1. + math.erf(x / math.sqrt(2.))) / 2.
+
+ if (mean < a - 2 * std) or (mean > b + 2 * std):
+ warnings.warn(
+ 'mean is more than 2 std from [a, b] in nn.init.trunc_normal_. '
+ 'The distribution of values may be incorrect.',
+ stacklevel=2)
+
+ with torch.no_grad():
+ # Values are generated by using a truncated uniform distribution and
+ # then using the inverse CDF for the normal distribution.
+ # Get upper and lower cdf values
+ lower = norm_cdf((a - mean) / std)
+ upper = norm_cdf((b - mean) / std)
+
+ # Uniformly fill tensor with values from [lower, upper], then translate
+ # to [2lower-1, 2upper-1].
+ tensor.uniform_(2 * lower - 1, 2 * upper - 1)
+
+ # Use inverse cdf transform for normal distribution to get truncated
+ # standard normal
+ tensor.erfinv_()
+
+ # Transform to proper mean, std
+ tensor.mul_(std * math.sqrt(2.))
+ tensor.add_(mean)
+
+ # Clamp to ensure it's in the proper range
+ tensor.clamp_(min=a, max=b)
+ return tensor
+
+
+def trunc_normal_(tensor: Tensor,
+ mean: float = 0.,
+ std: float = 1.,
+ a: float = -2.,
+ b: float = 2.) -> Tensor:
+ r"""Fills the input Tensor with values drawn from a truncated
+ normal distribution. The values are effectively drawn from the
+ normal distribution :math:`\mathcal{N}(\text{mean}, \text{std}^2)`
+ with values outside :math:`[a, b]` redrawn until they are within
+ the bounds. The method used for generating the random values works
+ best when :math:`a \leq \text{mean} \leq b`.
+
+ Modified from
+ https://github.com/pytorch/pytorch/blob/master/torch/nn/init.py
+
+ Args:
+ tensor (``torch.Tensor``): an n-dimensional `torch.Tensor`.
+ mean (float): the mean of the normal distribution.
+ std (float): the standard deviation of the normal distribution.
+ a (float): the minimum cutoff value.
+ b (float): the maximum cutoff value.
+ """
+ return _no_grad_trunc_normal_(tensor, mean, std, a, b)
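The registry and helpers above are normally driven through `initialize`. Below is a minimal, illustrative sketch of how an `init_cfg` with an `override` entry is resolved; the `ToyHead` module and the import path are assumptions, not part of this file.

```python
import torch.nn as nn
from mmcv.cnn import initialize  # assumed import; matches upstream mmcv exports

# A toy head: every Conv2d gets constant weights, but the `cls` branch is
# overridden with its own values via the `override` key.
class ToyHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.feat = nn.Conv2d(3, 8, 3)
        self.cls = nn.Conv2d(8, 5, 3)

model = ToyHead()
init_cfg = dict(
    type='Constant', layer='Conv2d', val=1, bias=0,
    override=dict(type='Constant', name='cls', val=0, bias_prob=0.01))
initialize(model, init_cfg)

print(model.feat.weight.unique())  # tensor([1.])
print(model.cls.bias.unique())     # about -4.60, i.e. -log((1 - 0.01) / 0.01)
```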
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/vgg.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/vgg.py
new file mode 100644
index 0000000000000000000000000000000000000000..82f8ba10932703da198d7834e041afe2cfb9d346
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/cnn/vgg.py
@@ -0,0 +1,175 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import logging
+
+import torch.nn as nn
+
+from .utils import constant_init, kaiming_init, normal_init
+
+
+def conv3x3(in_planes, out_planes, dilation=1):
+ """3x3 convolution with padding."""
+ return nn.Conv2d(
+ in_planes,
+ out_planes,
+ kernel_size=3,
+ padding=dilation,
+ dilation=dilation)
+
+
+def make_vgg_layer(inplanes,
+ planes,
+ num_blocks,
+ dilation=1,
+ with_bn=False,
+ ceil_mode=False):
+ layers = []
+ for _ in range(num_blocks):
+ layers.append(conv3x3(inplanes, planes, dilation))
+ if with_bn:
+ layers.append(nn.BatchNorm2d(planes))
+ layers.append(nn.ReLU(inplace=True))
+ inplanes = planes
+ layers.append(nn.MaxPool2d(kernel_size=2, stride=2, ceil_mode=ceil_mode))
+
+ return layers
+
+
+class VGG(nn.Module):
+ """VGG backbone.
+
+ Args:
+ depth (int): Depth of vgg, from {11, 13, 16, 19}.
+ with_bn (bool): Use BatchNorm or not.
+ num_classes (int): number of classes for classification.
+ num_stages (int): VGG stages, normally 5.
+ dilations (Sequence[int]): Dilation of each stage.
+ out_indices (Sequence[int]): Output from which stages.
+ frozen_stages (int): Stages to be frozen (all param fixed). -1 means
+ not freezing any parameters.
+ bn_eval (bool): Whether to set BN layers as eval mode, namely, freeze
+ running stats (mean and var).
+ bn_frozen (bool): Whether to freeze weight and bias of BN layers.
+ """
+
+ arch_settings = {
+ 11: (1, 1, 2, 2, 2),
+ 13: (2, 2, 2, 2, 2),
+ 16: (2, 2, 3, 3, 3),
+ 19: (2, 2, 4, 4, 4)
+ }
+
+ def __init__(self,
+ depth,
+ with_bn=False,
+ num_classes=-1,
+ num_stages=5,
+ dilations=(1, 1, 1, 1, 1),
+ out_indices=(0, 1, 2, 3, 4),
+ frozen_stages=-1,
+ bn_eval=True,
+ bn_frozen=False,
+ ceil_mode=False,
+ with_last_pool=True):
+ super(VGG, self).__init__()
+ if depth not in self.arch_settings:
+ raise KeyError(f'invalid depth {depth} for vgg')
+ assert num_stages >= 1 and num_stages <= 5
+ stage_blocks = self.arch_settings[depth]
+ self.stage_blocks = stage_blocks[:num_stages]
+ assert len(dilations) == num_stages
+ assert max(out_indices) <= num_stages
+
+ self.num_classes = num_classes
+ self.out_indices = out_indices
+ self.frozen_stages = frozen_stages
+ self.bn_eval = bn_eval
+ self.bn_frozen = bn_frozen
+
+ self.inplanes = 3
+ start_idx = 0
+ vgg_layers = []
+ self.range_sub_modules = []
+ for i, num_blocks in enumerate(self.stage_blocks):
+ num_modules = num_blocks * (2 + with_bn) + 1
+ end_idx = start_idx + num_modules
+ dilation = dilations[i]
+ planes = 64 * 2**i if i < 4 else 512
+ vgg_layer = make_vgg_layer(
+ self.inplanes,
+ planes,
+ num_blocks,
+ dilation=dilation,
+ with_bn=with_bn,
+ ceil_mode=ceil_mode)
+ vgg_layers.extend(vgg_layer)
+ self.inplanes = planes
+ self.range_sub_modules.append([start_idx, end_idx])
+ start_idx = end_idx
+ if not with_last_pool:
+ vgg_layers.pop(-1)
+ self.range_sub_modules[-1][1] -= 1
+ self.module_name = 'features'
+ self.add_module(self.module_name, nn.Sequential(*vgg_layers))
+
+ if self.num_classes > 0:
+ self.classifier = nn.Sequential(
+ nn.Linear(512 * 7 * 7, 4096),
+ nn.ReLU(True),
+ nn.Dropout(),
+ nn.Linear(4096, 4096),
+ nn.ReLU(True),
+ nn.Dropout(),
+ nn.Linear(4096, num_classes),
+ )
+
+ def init_weights(self, pretrained=None):
+ if isinstance(pretrained, str):
+ logger = logging.getLogger()
+ from ..runner import load_checkpoint
+ load_checkpoint(self, pretrained, strict=False, logger=logger)
+ elif pretrained is None:
+ for m in self.modules():
+ if isinstance(m, nn.Conv2d):
+ kaiming_init(m)
+ elif isinstance(m, nn.BatchNorm2d):
+ constant_init(m, 1)
+ elif isinstance(m, nn.Linear):
+ normal_init(m, std=0.01)
+ else:
+ raise TypeError('pretrained must be a str or None')
+
+ def forward(self, x):
+ outs = []
+ vgg_layers = getattr(self, self.module_name)
+ for i in range(len(self.stage_blocks)):
+ for j in range(*self.range_sub_modules[i]):
+ vgg_layer = vgg_layers[j]
+ x = vgg_layer(x)
+ if i in self.out_indices:
+ outs.append(x)
+ if self.num_classes > 0:
+ x = x.view(x.size(0), -1)
+ x = self.classifier(x)
+ outs.append(x)
+ if len(outs) == 1:
+ return outs[0]
+ else:
+ return tuple(outs)
+
+ def train(self, mode=True):
+ super(VGG, self).train(mode)
+ if self.bn_eval:
+ for m in self.modules():
+ if isinstance(m, nn.BatchNorm2d):
+ m.eval()
+ if self.bn_frozen:
+ for params in m.parameters():
+ params.requires_grad = False
+ vgg_layers = getattr(self, self.module_name)
+ if mode and self.frozen_stages >= 0:
+ for i in range(self.frozen_stages):
+ for j in range(*self.range_sub_modules[i]):
+ mod = vgg_layers[j]
+ mod.eval()
+ for param in mod.parameters():
+ param.requires_grad = False
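For reference, a small sketch of how the `VGG` backbone above is used as a multi-stage feature extractor; the import path and the printed shapes (for a 224x224 input) are illustrative assumptions.

```python
import torch
from mmcv.cnn import VGG  # assumed import; VGG is re-exported from mmcv.cnn upstream

# VGG-16 without BN, returning the features of the last two stages.
model = VGG(depth=16, out_indices=(3, 4), num_classes=-1)
model.init_weights()
model.eval()

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    outs = model(x)

# One feature map per requested stage; each stage ends with a stride-2 pool.
for out in outs:
    print(out.shape)  # e.g. (1, 512, 14, 14) then (1, 512, 7, 7)
```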
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/engine/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/engine/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..8bec565dfc54efecedd75599048004ffc58cc9ae
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/engine/__init__.py
@@ -0,0 +1,7 @@
+from .test import (collect_results_cpu, collect_results_gpu, multi_gpu_test,
+ single_gpu_test)
+
+__all__ = [
+ 'collect_results_cpu', 'collect_results_gpu', 'multi_gpu_test',
+ 'single_gpu_test'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/engine/test.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/engine/test.py
new file mode 100644
index 0000000000000000000000000000000000000000..a0fe57a22255b523d5c5a70391e28b5a1a52c784
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/engine/test.py
@@ -0,0 +1,201 @@
+import os.path as osp
+import pickle
+import shutil
+import tempfile
+import time
+
+import torch
+import torch.distributed as dist
+
+import mmcv
+from mmcv.runner import get_dist_info
+
+
+def single_gpu_test(model, data_loader):
+ """Test model with a single gpu.
+
+ This method tests the model with a single gpu and displays a test progress bar.
+
+ Args:
+ model (nn.Module): Model to be tested.
+ data_loader (DataLoader): PyTorch data loader.
+
+ Returns:
+ list: The prediction results.
+ """
+ model.eval()
+ results = []
+ dataset = data_loader.dataset
+ prog_bar = mmcv.ProgressBar(len(dataset))
+ for data in data_loader:
+ with torch.no_grad():
+ result = model(return_loss=False, **data)
+ results.extend(result)
+
+ # Assume result has the same length as batch_size
+ # refer to https://github.com/open-mmlab/mmcv/issues/985
+ batch_size = len(result)
+ for _ in range(batch_size):
+ prog_bar.update()
+ return results
+
+
+def multi_gpu_test(model, data_loader, tmpdir=None, gpu_collect=False):
+ """Test model with multiple gpus.
+
+ This method tests the model with multiple gpus and collects the results
+ in two different modes: gpu and cpu. By setting ``gpu_collect=True``,
+ it encodes results to gpu tensors and uses gpu communication for
+ result collection. In cpu mode, it saves the results on different gpus
+ to ``tmpdir`` and the rank 0 worker collects them.
+
+ Args:
+ model (nn.Module): Model to be tested.
+ data_loader (DataLoader): PyTorch data loader.
+ tmpdir (str): Path of directory to save the temporary results from
+ different gpus under cpu mode.
+ gpu_collect (bool): Option to use either gpu or cpu to collect results.
+
+ Returns:
+ list: The prediction results.
+ """
+ model.eval()
+ results = []
+ dataset = data_loader.dataset
+ rank, world_size = get_dist_info()
+ if rank == 0:
+ prog_bar = mmcv.ProgressBar(len(dataset))
+ time.sleep(2) # This line can prevent deadlock problem in some cases.
+ for i, data in enumerate(data_loader):
+ with torch.no_grad():
+ result = model(return_loss=False, **data)
+ results.extend(result)
+
+ if rank == 0:
+ batch_size = len(result)
+ batch_size_all = batch_size * world_size
+ if batch_size_all + prog_bar.completed > len(dataset):
+ batch_size_all = len(dataset) - prog_bar.completed
+ for _ in range(batch_size_all):
+ prog_bar.update()
+
+ # collect results from all ranks
+ if gpu_collect:
+ results = collect_results_gpu(results, len(dataset))
+ else:
+ results = collect_results_cpu(results, len(dataset), tmpdir)
+ return results
+
+
+def collect_results_cpu(result_part, size, tmpdir=None):
+ """Collect results under cpu mode.
+
+ In cpu mode, this function saves the results from different gpus to
+ ``tmpdir`` and collects them on the rank 0 worker.
+
+ Args:
+ result_part (list): Result list containing result parts
+ to be collected.
+ size (int): Size of the results, commonly equal to length of
+ the results.
+ tmpdir (str | None): Temporary directory where the collected results
+ are stored. If set to None, a random temporary directory will be
+ created for them.
+
+ Returns:
+ list: The collected results.
+ """
+ rank, world_size = get_dist_info()
+ # create a tmp dir if it is not specified
+ if tmpdir is None:
+ MAX_LEN = 512
+ # 32 is whitespace
+ dir_tensor = torch.full((MAX_LEN, ),
+ 32,
+ dtype=torch.uint8,
+ device='cuda')
+ if rank == 0:
+ mmcv.mkdir_or_exist('.dist_test')
+ tmpdir = tempfile.mkdtemp(dir='.dist_test')
+ tmpdir = torch.tensor(
+ bytearray(tmpdir.encode()), dtype=torch.uint8, device='cuda')
+ dir_tensor[:len(tmpdir)] = tmpdir
+ dist.broadcast(dir_tensor, 0)
+ tmpdir = dir_tensor.cpu().numpy().tobytes().decode().rstrip()
+ else:
+ mmcv.mkdir_or_exist(tmpdir)
+ # dump the part result to the dir
+ mmcv.dump(result_part, osp.join(tmpdir, f'part_{rank}.pkl'))
+ dist.barrier()
+ # collect all parts
+ if rank != 0:
+ return None
+ else:
+ # load results of all parts from tmp dir
+ part_list = []
+ for i in range(world_size):
+ part_file = osp.join(tmpdir, f'part_{i}.pkl')
+ part_result = mmcv.load(part_file)
+ # When data is severely insufficient, an empty part_result
+ # on a certain gpu could make the overall outputs empty.
+ if part_result:
+ part_list.append(part_result)
+ # sort the results
+ ordered_results = []
+ for res in zip(*part_list):
+ ordered_results.extend(list(res))
+ # the dataloader may pad some samples
+ ordered_results = ordered_results[:size]
+ # remove tmp dir
+ shutil.rmtree(tmpdir)
+ return ordered_results
+
+
+def collect_results_gpu(result_part, size):
+ """Collect results under gpu mode.
+
+ On gpu mode, this function will encode results to gpu tensors and use gpu
+ communication for results collection.
+
+ Args:
+ result_part (list): Result list containing result parts
+ to be collected.
+ size (int): Size of the results, commonly equal to length of
+ the results.
+
+ Returns:
+ list: The collected results.
+ """
+ rank, world_size = get_dist_info()
+ # dump result part to tensor with pickle
+ part_tensor = torch.tensor(
+ bytearray(pickle.dumps(result_part)), dtype=torch.uint8, device='cuda')
+ # gather all result part tensor shape
+ shape_tensor = torch.tensor(part_tensor.shape, device='cuda')
+ shape_list = [shape_tensor.clone() for _ in range(world_size)]
+ dist.all_gather(shape_list, shape_tensor)
+ # padding result part tensor to max length
+ shape_max = torch.tensor(shape_list).max()
+ part_send = torch.zeros(shape_max, dtype=torch.uint8, device='cuda')
+ part_send[:shape_tensor[0]] = part_tensor
+ part_recv_list = [
+ part_tensor.new_zeros(shape_max) for _ in range(world_size)
+ ]
+ # gather all result part
+ dist.all_gather(part_recv_list, part_send)
+
+ if rank == 0:
+ part_list = []
+ for recv, shape in zip(part_recv_list, shape_list):
+ part_result = pickle.loads(recv[:shape[0]].cpu().numpy().tobytes())
+ # When data is severely insufficient, an empty part_result
+ # on a certain gpu could make the overall outputs empty.
+ if part_result:
+ part_list.append(part_result)
+ # sort the results
+ ordered_results = []
+ for res in zip(*part_list):
+ ordered_results.extend(list(res))
+ # the dataloader may pad some samples
+ ordered_results = ordered_results[:size]
+ return ordered_results
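Both collectors restore the original sample order with `zip(*part_list)`: with a distributed sampler, rank `r` holds the results for samples `r, r + world_size, ...`, and every rank holds the same number of entries (padded if needed), so interleaving the parts and trimming to `size` recovers the dataset order. A pure-Python sketch with made-up values:

```python
# world_size = 3, dataset size = 8. With a distributed sampler, rank r gets
# samples r, r + 3, r + 6, ... and the sampler pads so every rank processes
# the same number of samples (rank 2 re-uses sample 0 as padding here).
part_list = [
    [0, 3, 6],   # results collected on rank 0
    [1, 4, 7],   # results collected on rank 1
    [2, 5, 0],   # results collected on rank 2 (last entry is padding)
]

ordered_results = []
for res in zip(*part_list):          # (0, 1, 2), (3, 4, 5), (6, 7, 0)
    ordered_results.extend(list(res))

size = 8
ordered_results = ordered_results[:size]  # drop the sampler padding
print(ordered_results)                    # [0, 1, 2, 3, 4, 5, 6, 7]
```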
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..b307027ad973e024e1e081b410017395c4ca28db
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/__init__.py
@@ -0,0 +1,11 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from .file_client import BaseStorageBackend, FileClient
+from .handlers import BaseFileHandler, JsonHandler, PickleHandler, YamlHandler
+from .io import dump, load, register_handler
+from .parse import dict_from_file, list_from_file
+
+__all__ = [
+ 'BaseStorageBackend', 'FileClient', 'load', 'dump', 'register_handler',
+ 'BaseFileHandler', 'JsonHandler', 'PickleHandler', 'YamlHandler',
+ 'list_from_file', 'dict_from_file'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/file_client.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/file_client.py
new file mode 100644
index 0000000000000000000000000000000000000000..f496f6ee4dfd1e04da6f2eac46af237fcc78aa80
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/file_client.py
@@ -0,0 +1,309 @@
+import inspect
+from abc import ABCMeta, abstractmethod
+from urllib.request import urlopen
+
+
+class BaseStorageBackend(metaclass=ABCMeta):
+ """Abstract class of storage backends.
+
+ All backends need to implement two apis: ``get()`` and ``get_text()``.
+ ``get()`` reads the file as a byte stream and ``get_text()`` reads the file
+ as text.
+ """
+
+ @abstractmethod
+ def get(self, filepath):
+ pass
+
+ @abstractmethod
+ def get_text(self, filepath):
+ pass
+
+
+class CephBackend(BaseStorageBackend):
+ """Ceph storage backend.
+
+ Args:
+ path_mapping (dict|None): path mapping dict from local path to Petrel
+ path. When ``path_mapping={'src': 'dst'}``, ``src`` in ``filepath``
+ will be replaced by ``dst``. Default: None.
+ """
+
+ def __init__(self, path_mapping=None):
+ try:
+ import ceph
+ except ImportError:
+ raise ImportError('Please install ceph to enable CephBackend.')
+
+ self._client = ceph.S3Client()
+ assert isinstance(path_mapping, dict) or path_mapping is None
+ self.path_mapping = path_mapping
+
+ def get(self, filepath):
+ filepath = str(filepath)
+ if self.path_mapping is not None:
+ for k, v in self.path_mapping.items():
+ filepath = filepath.replace(k, v)
+ value = self._client.Get(filepath)
+ value_buf = memoryview(value)
+ return value_buf
+
+ def get_text(self, filepath):
+ raise NotImplementedError
+
+
+class PetrelBackend(BaseStorageBackend):
+ """Petrel storage backend (for internal use).
+
+ Args:
+ path_mapping (dict|None): path mapping dict from local path to Petrel
+ path. When `path_mapping={'src': 'dst'}`, `src` in `filepath` will
+ be replaced by `dst`. Default: None.
+ enable_mc (bool): whether to enable memcached support. Default: True.
+ """
+
+ def __init__(self, path_mapping=None, enable_mc=True):
+ try:
+ from petrel_client import client
+ except ImportError:
+ raise ImportError('Please install petrel_client to enable '
+ 'PetrelBackend.')
+
+ self._client = client.Client(enable_mc=enable_mc)
+ assert isinstance(path_mapping, dict) or path_mapping is None
+ self.path_mapping = path_mapping
+
+ def get(self, filepath):
+ filepath = str(filepath)
+ if self.path_mapping is not None:
+ for k, v in self.path_mapping.items():
+ filepath = filepath.replace(k, v)
+ value = self._client.Get(filepath)
+ value_buf = memoryview(value)
+ return value_buf
+
+ def get_text(self, filepath):
+ raise NotImplementedError
+
+
+class MemcachedBackend(BaseStorageBackend):
+ """Memcached storage backend.
+
+ Attributes:
+ server_list_cfg (str): Config file for memcached server list.
+ client_cfg (str): Config file for memcached client.
+ sys_path (str | None): Additional path to be appended to `sys.path`.
+ Default: None.
+ """
+
+ def __init__(self, server_list_cfg, client_cfg, sys_path=None):
+ if sys_path is not None:
+ import sys
+ sys.path.append(sys_path)
+ try:
+ import mc
+ except ImportError:
+ raise ImportError(
+ 'Please install memcached to enable MemcachedBackend.')
+
+ self.server_list_cfg = server_list_cfg
+ self.client_cfg = client_cfg
+ self._client = mc.MemcachedClient.GetInstance(self.server_list_cfg,
+ self.client_cfg)
+ # mc.pyvector serves as a pointer to a memory cache
+ self._mc_buffer = mc.pyvector()
+
+ def get(self, filepath):
+ filepath = str(filepath)
+ import mc
+ self._client.Get(filepath, self._mc_buffer)
+ value_buf = mc.ConvertBuffer(self._mc_buffer)
+ return value_buf
+
+ def get_text(self, filepath):
+ raise NotImplementedError
+
+
+class LmdbBackend(BaseStorageBackend):
+ """Lmdb storage backend.
+
+ Args:
+ db_path (str): Lmdb database path.
+ readonly (bool, optional): Lmdb environment parameter. If True,
+ disallow any write operations. Default: True.
+ lock (bool, optional): Lmdb environment parameter. If False, when
+ concurrent access occurs, do not lock the database. Default: False.
+ readahead (bool, optional): Lmdb environment parameter. If False,
+ disable the OS filesystem readahead mechanism, which may improve
+ random read performance when a database is larger than RAM.
+ Default: False.
+
+ Attributes:
+ db_path (str): Lmdb database path.
+ """
+
+ def __init__(self,
+ db_path,
+ readonly=True,
+ lock=False,
+ readahead=False,
+ **kwargs):
+ try:
+ import lmdb
+ except ImportError:
+ raise ImportError('Please install lmdb to enable LmdbBackend.')
+
+ self.db_path = str(db_path)
+ self._client = lmdb.open(
+ self.db_path,
+ readonly=readonly,
+ lock=lock,
+ readahead=readahead,
+ **kwargs)
+
+ def get(self, filepath):
+ """Get values according to the filepath.
+
+ Args:
+ filepath (str | obj:`Path`): Here, filepath is the lmdb key.
+ """
+ filepath = str(filepath)
+ with self._client.begin(write=False) as txn:
+ value_buf = txn.get(filepath.encode('ascii'))
+ return value_buf
+
+ def get_text(self, filepath):
+ raise NotImplementedError
+
+
+class HardDiskBackend(BaseStorageBackend):
+ """Raw hard disks storage backend."""
+
+ def get(self, filepath):
+ filepath = str(filepath)
+ with open(filepath, 'rb') as f:
+ value_buf = f.read()
+ return value_buf
+
+ def get_text(self, filepath):
+ filepath = str(filepath)
+ with open(filepath, 'r') as f:
+ value_buf = f.read()
+ return value_buf
+
+
+class HTTPBackend(BaseStorageBackend):
+ """HTTP and HTTPS storage bachend."""
+
+ def get(self, filepath):
+ value_buf = urlopen(filepath).read()
+ return value_buf
+
+ def get_text(self, filepath):
+ value_buf = urlopen(filepath).read()
+ return value_buf.decode('utf-8')
+
+
+class FileClient:
+ """A general file client to access files in different backend.
+
+ The client loads a file or text in a specified backend from its path
+ and return it as a binary file. it can also register other backend
+ accessor with a given name and backend class.
+
+ Attributes:
+ backend (str): The storage backend type. Options are "disk", "ceph",
+ "memcached", "lmdb" and "http".
+ client (:obj:`BaseStorageBackend`): The backend object.
+ """
+
+ _backends = {
+ 'disk': HardDiskBackend,
+ 'ceph': CephBackend,
+ 'memcached': MemcachedBackend,
+ 'lmdb': LmdbBackend,
+ 'petrel': PetrelBackend,
+ 'http': HTTPBackend,
+ }
+
+ def __init__(self, backend='disk', **kwargs):
+ if backend not in self._backends:
+ raise ValueError(
+ f'Backend {backend} is not supported. Currently supported ones'
+ f' are {list(self._backends.keys())}')
+ self.backend = backend
+ self.client = self._backends[backend](**kwargs)
+
+ @classmethod
+ def _register_backend(cls, name, backend, force=False):
+ if not isinstance(name, str):
+ raise TypeError('the backend name should be a string, '
+ f'but got {type(name)}')
+ if not inspect.isclass(backend):
+ raise TypeError(
+ f'backend should be a class but got {type(backend)}')
+ if not issubclass(backend, BaseStorageBackend):
+ raise TypeError(
+ f'backend {backend} is not a subclass of BaseStorageBackend')
+ if not force and name in cls._backends:
+ raise KeyError(
+ f'{name} is already registered as a storage backend, '
+ 'add "force=True" if you want to override it')
+
+ cls._backends[name] = backend
+
+ @classmethod
+ def register_backend(cls, name, backend=None, force=False):
+ """Register a backend to FileClient.
+
+ This method can be used as a normal class method or a decorator.
+
+ .. code-block:: python
+
+ class NewBackend(BaseStorageBackend):
+
+ def get(self, filepath):
+ return filepath
+
+ def get_text(self, filepath):
+ return filepath
+
+ FileClient.register_backend('new', NewBackend)
+
+ or
+
+ .. code-block:: python
+
+ @FileClient.register_backend('new')
+ class NewBackend(BaseStorageBackend):
+
+ def get(self, filepath):
+ return filepath
+
+ def get_text(self, filepath):
+ return filepath
+
+ Args:
+ name (str): The name of the registered backend.
+ backend (class, optional): The backend class to be registered,
+ which must be a subclass of :class:`BaseStorageBackend`.
+ When this method is used as a decorator, backend is None.
+ Defaults to None.
+ force (bool, optional): Whether to override the backend if the name
+ has already been registered. Defaults to False.
+ """
+ if backend is not None:
+ cls._register_backend(name, backend, force=force)
+ return
+
+ def _register(backend_cls):
+ cls._register_backend(name, backend_cls, force=force)
+ return backend_cls
+
+ return _register
+
+ def get(self, filepath):
+ return self.client.get(filepath)
+
+ def get_text(self, filepath):
+ return self.client.get_text(filepath)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..2fbc6ec92b18623cc4d3375a26d5977f77f497dc
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/__init__.py
@@ -0,0 +1,7 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from .base import BaseFileHandler
+from .json_handler import JsonHandler
+from .pickle_handler import PickleHandler
+from .yaml_handler import YamlHandler
+
+__all__ = ['BaseFileHandler', 'JsonHandler', 'PickleHandler', 'YamlHandler']
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/base.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/base.py
new file mode 100644
index 0000000000000000000000000000000000000000..91f3fe1fbc6d588b2cb8e90efc5d11500f600298
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/base.py
@@ -0,0 +1,25 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from abc import ABCMeta, abstractmethod
+
+
+class BaseFileHandler(metaclass=ABCMeta):
+
+ @abstractmethod
+ def load_from_fileobj(self, file, **kwargs):
+ pass
+
+ @abstractmethod
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ pass
+
+ @abstractmethod
+ def dump_to_str(self, obj, **kwargs):
+ pass
+
+ def load_from_path(self, filepath, mode='r', **kwargs):
+ with open(filepath, mode) as f:
+ return self.load_from_fileobj(f, **kwargs)
+
+ def dump_to_path(self, obj, filepath, mode='w', **kwargs):
+ with open(filepath, mode) as f:
+ self.dump_to_fileobj(obj, f, **kwargs)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/json_handler.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/json_handler.py
new file mode 100644
index 0000000000000000000000000000000000000000..d92c397f14b081757e910d5f454aec7f5f74c246
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/json_handler.py
@@ -0,0 +1,36 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import json
+
+import numpy as np
+
+from .base import BaseFileHandler
+
+
+def set_default(obj):
+ """Set default json values for non-serializable values.
+
+ It helps convert ``set``, ``range`` and ``np.ndarray`` data types to list.
+ It also converts ``np.generic`` (including ``np.int32``, ``np.float32``,
+ etc.) into plain numbers of plain python built-in types.
+ """
+ if isinstance(obj, (set, range)):
+ return list(obj)
+ elif isinstance(obj, np.ndarray):
+ return obj.tolist()
+ elif isinstance(obj, np.generic):
+ return obj.item()
+ raise TypeError(f'{type(obj)} is unsupported for json dump')
+
+
+class JsonHandler(BaseFileHandler):
+
+ def load_from_fileobj(self, file):
+ return json.load(file)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('default', set_default)
+ json.dump(obj, file, **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('default', set_default)
+ return json.dumps(obj, **kwargs)
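`set_default` is what makes JSON dumping work for numpy values and sets; a quick illustration (assuming numpy is available):

```python
import json

import numpy as np

from mmcv.fileio.handlers.json_handler import set_default  # path as in this file

data = {'scores': np.array([0.1, 0.9]), 'num': np.int64(3), 'ids': {1, 2}}
print(json.dumps(data, default=set_default))
# {"scores": [0.1, 0.9], "num": 3, "ids": [1, 2]}
```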
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/pickle_handler.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/pickle_handler.py
new file mode 100644
index 0000000000000000000000000000000000000000..b22b1dc1dfd3aa994803ddc13f9b6745fb87c42c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/pickle_handler.py
@@ -0,0 +1,26 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import pickle
+
+from .base import BaseFileHandler
+
+
+class PickleHandler(BaseFileHandler):
+
+ def load_from_fileobj(self, file, **kwargs):
+ return pickle.load(file, **kwargs)
+
+ def load_from_path(self, filepath, **kwargs):
+ return super(PickleHandler, self).load_from_path(
+ filepath, mode='rb', **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ return pickle.dumps(obj, **kwargs)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('protocol', 2)
+ pickle.dump(obj, file, **kwargs)
+
+ def dump_to_path(self, obj, filepath, **kwargs):
+ super(PickleHandler, self).dump_to_path(
+ obj, filepath, mode='wb', **kwargs)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/yaml_handler.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/yaml_handler.py
new file mode 100644
index 0000000000000000000000000000000000000000..c93eba8d36412ec0887ad8cdd52dcc470734b7c3
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/handlers/yaml_handler.py
@@ -0,0 +1,24 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import yaml
+
+try:
+ from yaml import CLoader as Loader, CDumper as Dumper
+except ImportError:
+ from yaml import Loader, Dumper
+
+from .base import BaseFileHandler # isort:skip
+
+
+class YamlHandler(BaseFileHandler):
+
+ def load_from_fileobj(self, file, **kwargs):
+ kwargs.setdefault('Loader', Loader)
+ return yaml.load(file, **kwargs)
+
+ def dump_to_fileobj(self, obj, file, **kwargs):
+ kwargs.setdefault('Dumper', Dumper)
+ yaml.dump(obj, file, **kwargs)
+
+ def dump_to_str(self, obj, **kwargs):
+ kwargs.setdefault('Dumper', Dumper)
+ return yaml.dump(obj, **kwargs)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/io.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/io.py
new file mode 100644
index 0000000000000000000000000000000000000000..777df97a6ea80061ad73974bea1fad78ca26209f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/io.py
@@ -0,0 +1,112 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from pathlib import Path
+
+from ..utils import is_list_of, is_str
+from .handlers import BaseFileHandler, JsonHandler, PickleHandler, YamlHandler
+
+file_handlers = {
+ 'json': JsonHandler(),
+ 'yaml': YamlHandler(),
+ 'yml': YamlHandler(),
+ 'pickle': PickleHandler(),
+ 'pkl': PickleHandler()
+}
+
+
+def load(file, file_format=None, **kwargs):
+ """Load data from json/yaml/pickle files.
+
+ This method provides a unified api for loading data from serialized files.
+
+ Args:
+ file (str or :obj:`Path` or file-like object): Filename or a file-like
+ object.
+ file_format (str, optional): If not specified, the file format will be
+ inferred from the file extension, otherwise use the specified one.
+ Currently supported formats include "json", "yaml/yml" and
+ "pickle/pkl".
+
+ Returns:
+ The content from the file.
+ """
+ if isinstance(file, Path):
+ file = str(file)
+ if file_format is None and is_str(file):
+ file_format = file.split('.')[-1]
+ if file_format not in file_handlers:
+ raise TypeError(f'Unsupported format: {file_format}')
+
+ handler = file_handlers[file_format]
+ if is_str(file):
+ obj = handler.load_from_path(file, **kwargs)
+ elif hasattr(file, 'read'):
+ obj = handler.load_from_fileobj(file, **kwargs)
+ else:
+ raise TypeError('"file" must be a filepath str or a file-object')
+ return obj
+
+
+def dump(obj, file=None, file_format=None, **kwargs):
+ """Dump data to json/yaml/pickle strings or files.
+
+ This method provides a unified api for dumping data as strings or to files,
+ and also supports custom arguments for each file format.
+
+ Args:
+ obj (any): The python object to be dumped.
+ file (str or :obj:`Path` or file-like object, optional): If not
+ specified, then the object is dumped to a str, otherwise to a file
+ specified by the filename or file-like object.
+ file_format (str, optional): Same as :func:`load`.
+
+ Returns:
+ str | None: If ``file`` is None, the dumped string is returned;
+ otherwise None is returned.
+ """
+ if isinstance(file, Path):
+ file = str(file)
+ if file_format is None:
+ if is_str(file):
+ file_format = file.split('.')[-1]
+ elif file is None:
+ raise ValueError(
+ 'file_format must be specified since file is None')
+ if file_format not in file_handlers:
+ raise TypeError(f'Unsupported format: {file_format}')
+
+ handler = file_handlers[file_format]
+ if file is None:
+ return handler.dump_to_str(obj, **kwargs)
+ elif is_str(file):
+ handler.dump_to_path(obj, file, **kwargs)
+ elif hasattr(file, 'write'):
+ handler.dump_to_fileobj(obj, file, **kwargs)
+ else:
+ raise TypeError('"file" must be a filename str or a file-object')
+
+
+def _register_handler(handler, file_formats):
+ """Register a handler for some file extensions.
+
+ Args:
+ handler (:obj:`BaseFileHandler`): Handler to be registered.
+ file_formats (str or list[str]): File formats to be handled by this
+ handler.
+ """
+ if not isinstance(handler, BaseFileHandler):
+ raise TypeError(
+ f'handler must be a child of BaseFileHandler, not {type(handler)}')
+ if isinstance(file_formats, str):
+ file_formats = [file_formats]
+ if not is_list_of(file_formats, str):
+ raise TypeError('file_formats must be a str or a list of str')
+ for ext in file_formats:
+ file_handlers[ext] = handler
+
+
+def register_handler(file_formats, **kwargs):
+
+ def wrap(cls):
+ _register_handler(cls(**kwargs), file_formats)
+ return cls
+
+ return wrap
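A short sketch of how `load`/`dump` pick a handler by file extension and how `register_handler` extends the table; the `TxtHandler` and the `/tmp` paths are illustrative assumptions.

```python
from mmcv.fileio import BaseFileHandler, dump, load, register_handler

# Round-trip through the built-in handlers.
dump({'a': 1}, '/tmp/demo.json')
print(load('/tmp/demo.json'))                 # {'a': 1}
print(dump([1, 2, 3], file_format='yaml'))    # '- 1\n- 2\n- 3\n'

# Register a (toy) handler for .txt files.
@register_handler('txt')
class TxtHandler(BaseFileHandler):

    def load_from_fileobj(self, file):
        return file.read()

    def dump_to_fileobj(self, obj, file):
        file.write(str(obj))

    def dump_to_str(self, obj):
        return str(obj)

dump('hello', '/tmp/demo.txt')
print(load('/tmp/demo.txt'))                  # 'hello'
```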
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/parse.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/parse.py
new file mode 100644
index 0000000000000000000000000000000000000000..5640029c17e58d338fb7178edc1f967cda40e12c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/fileio/parse.py
@@ -0,0 +1,52 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+def list_from_file(filename, prefix='', offset=0, max_num=0, encoding='utf-8'):
+ """Load a text file and parse the content as a list of strings.
+
+ Args:
+ filename (str): Filename.
+ prefix (str): The prefix to be inserted at the beginning of each item.
+ offset (int): The offset of lines.
+ max_num (int): The maximum number of lines to be read;
+ zero and negative values mean no limitation.
+ encoding (str): Encoding used to open the file. Default utf-8.
+
+ Returns:
+ list[str]: A list of strings.
+ """
+ cnt = 0
+ item_list = []
+ with open(filename, 'r', encoding=encoding) as f:
+ for _ in range(offset):
+ f.readline()
+ for line in f:
+ if 0 < max_num <= cnt:
+ break
+ item_list.append(prefix + line.rstrip('\n\r'))
+ cnt += 1
+ return item_list
+
+
+def dict_from_file(filename, key_type=str):
+ """Load a text file and parse the content as a dict.
+
+ Each line of the text file should contain two or more columns split by
+ whitespace or tabs. The first column will be parsed as dict keys, and
+ the following columns will be parsed as dict values.
+
+ Args:
+ filename (str): Filename.
+ key_type (type): Type of the dict keys. str is used by default and
+ type conversion will be performed if specified.
+
+ Returns:
+ dict: The parsed contents.
+ """
+ mapping = {}
+ with open(filename, 'r') as f:
+ for line in f:
+ items = line.rstrip('\n').split()
+ assert len(items) >= 2
+ key = key_type(items[0])
+ val = items[1:] if len(items) > 2 else items[1]
+ mapping[key] = val
+ return mapping
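A brief usage sketch for the two parsers above; the file contents and paths are made up.

```python
from mmcv.fileio import dict_from_file, list_from_file

# Suppose /tmp/classes.txt lists one class name per line.
with open('/tmp/classes.txt', 'w') as f:
    f.write('person\ncar\nbicycle\n')

print(list_from_file('/tmp/classes.txt'))
# ['person', 'car', 'bicycle']
print(list_from_file('/tmp/classes.txt', prefix='cls_', offset=1))
# ['cls_car', 'cls_bicycle']

# Suppose /tmp/mapping.txt maps an id to one or more names.
with open('/tmp/mapping.txt', 'w') as f:
    f.write('1 cat\n2 dog wolf\n')

print(dict_from_file('/tmp/mapping.txt', key_type=int))
# {1: 'cat', 2: ['dog', 'wolf']}
```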
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/image/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..1a45f4e0c84056fd27a299e24a1377e37223d18c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/__init__.py
@@ -0,0 +1,28 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+from .colorspace import (bgr2gray, bgr2hls, bgr2hsv, bgr2rgb, bgr2ycbcr,
+ gray2bgr, gray2rgb, hls2bgr, hsv2bgr, imconvert,
+ rgb2bgr, rgb2gray, rgb2ycbcr, ycbcr2bgr, ycbcr2rgb)
+from .geometric import (cutout, imcrop, imflip, imflip_, impad,
+ impad_to_multiple, imrescale, imresize, imresize_like,
+ imresize_to_multiple, imrotate, imshear, imtranslate,
+ rescale_size)
+from .io import imfrombytes, imread, imwrite, supported_backends, use_backend
+from .misc import tensor2imgs
+from .photometric import (adjust_brightness, adjust_color, adjust_contrast,
+ adjust_lighting, adjust_sharpness, auto_contrast,
+ clahe, imdenormalize, imequalize, iminvert,
+ imnormalize, imnormalize_, lut_transform, posterize,
+ solarize)
+
+__all__ = [
+ 'bgr2gray', 'bgr2hls', 'bgr2hsv', 'bgr2rgb', 'gray2bgr', 'gray2rgb',
+ 'hls2bgr', 'hsv2bgr', 'imconvert', 'rgb2bgr', 'rgb2gray', 'imrescale',
+ 'imresize', 'imresize_like', 'imresize_to_multiple', 'rescale_size',
+ 'imcrop', 'imflip', 'imflip_', 'impad', 'impad_to_multiple', 'imrotate',
+ 'imfrombytes', 'imread', 'imwrite', 'supported_backends', 'use_backend',
+ 'imdenormalize', 'imnormalize', 'imnormalize_', 'iminvert', 'posterize',
+ 'solarize', 'rgb2ycbcr', 'bgr2ycbcr', 'ycbcr2rgb', 'ycbcr2bgr',
+ 'tensor2imgs', 'imshear', 'imtranslate', 'adjust_color', 'imequalize',
+ 'adjust_brightness', 'adjust_contrast', 'lut_transform', 'clahe',
+ 'adjust_sharpness', 'auto_contrast', 'cutout', 'adjust_lighting'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/image/colorspace.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/colorspace.py
new file mode 100644
index 0000000000000000000000000000000000000000..56cfe657704faa8bff1c6d1345d473909226a9ae
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/colorspace.py
@@ -0,0 +1,306 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import cv2
+import numpy as np
+
+
+def imconvert(img, src, dst):
+ """Convert an image from the src colorspace to dst colorspace.
+
+ Args:
+ img (ndarray): The input image.
+ src (str): The source colorspace, e.g., 'rgb', 'hsv'.
+ dst (str): The destination colorspace, e.g., 'rgb', 'hsv'.
+
+ Returns:
+ ndarray: The converted image.
+ """
+ code = getattr(cv2, f'COLOR_{src.upper()}2{dst.upper()}')
+ out_img = cv2.cvtColor(img, code)
+ return out_img
+
+
+def bgr2gray(img, keepdim=False):
+ """Convert a BGR image to grayscale image.
+
+ Args:
+ img (ndarray): The input image.
+ keepdim (bool): If False (by default), then return the grayscale image
+ with 2 dims, otherwise 3 dims.
+
+ Returns:
+ ndarray: The converted grayscale image.
+ """
+ out_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
+ if keepdim:
+ out_img = out_img[..., None]
+ return out_img
+
+
+def rgb2gray(img, keepdim=False):
+ """Convert a RGB image to grayscale image.
+
+ Args:
+ img (ndarray): The input image.
+ keepdim (bool): If False (by default), then return the grayscale image
+ with 2 dims, otherwise 3 dims.
+
+ Returns:
+ ndarray: The converted grayscale image.
+ """
+ out_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
+ if keepdim:
+ out_img = out_img[..., None]
+ return out_img
+
+
+def gray2bgr(img):
+ """Convert a grayscale image to BGR image.
+
+ Args:
+ img (ndarray): The input image.
+
+ Returns:
+ ndarray: The converted BGR image.
+ """
+ img = img[..., None] if img.ndim == 2 else img
+ out_img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
+ return out_img
+
+
+def gray2rgb(img):
+ """Convert a grayscale image to RGB image.
+
+ Args:
+ img (ndarray): The input image.
+
+ Returns:
+ ndarray: The converted RGB image.
+ """
+ img = img[..., None] if img.ndim == 2 else img
+ out_img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
+ return out_img
+
+
+def _convert_input_type_range(img):
+ """Convert the type and range of the input image.
+
+ It converts the input image to np.float32 type and range of [0, 1].
+ It is mainly used for pre-processing the input image in colorspace
+ conversion functions such as rgb2ycbcr and ycbcr2rgb.
+
+ Args:
+ img (ndarray): The input image. It accepts:
+ 1. np.uint8 type with range [0, 255];
+ 2. np.float32 type with range [0, 1].
+
+ Returns:
+ (ndarray): The converted image with type of np.float32 and range of
+ [0, 1].
+ """
+ img_type = img.dtype
+ img = img.astype(np.float32)
+ if img_type == np.float32:
+ pass
+ elif img_type == np.uint8:
+ img /= 255.
+ else:
+ raise TypeError('The img type should be np.float32 or np.uint8, '
+ f'but got {img_type}')
+ return img
+
+
+def _convert_output_type_range(img, dst_type):
+ """Convert the type and range of the image according to dst_type.
+
+ It converts the image to desired type and range. If `dst_type` is np.uint8,
+ images will be converted to np.uint8 type with range [0, 255]. If
+ `dst_type` is np.float32, it converts the image to np.float32 type with
+ range [0, 1].
+ It is mainly used for post-processing images in colorspace conversion
+ functions such as rgb2ycbcr and ycbcr2rgb.
+
+ Args:
+ img (ndarray): The image to be converted with np.float32 type and
+ range [0, 255].
+ dst_type (np.uint8 | np.float32): If dst_type is np.uint8, it
+ converts the image to np.uint8 type with range [0, 255]. If
+ dst_type is np.float32, it converts the image to np.float32 type
+ with range [0, 1].
+
+ Returns:
+ (ndarray): The converted image with desired type and range.
+ """
+ if dst_type not in (np.uint8, np.float32):
+ raise TypeError('The dst_type should be np.float32 or np.uint8, '
+ f'but got {dst_type}')
+ if dst_type == np.uint8:
+ img = img.round()
+ else:
+ img /= 255.
+ return img.astype(dst_type)
+
+
+def rgb2ycbcr(img, y_only=False):
+ """Convert a RGB image to YCbCr image.
+
+ This function produces the same results as Matlab's `rgb2ycbcr` function.
+ It implements the ITU-R BT.601 conversion for standard-definition
+ television. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.
+
+ It differs from a similar function in cv2.cvtColor: `RGB <-> YCrCb`.
+ In OpenCV, it implements a JPEG conversion. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.
+
+ Args:
+ img (ndarray): The input image. It accepts:
+ 1. np.uint8 type with range [0, 255];
+ 2. np.float32 type with range [0, 1].
+ y_only (bool): Whether to only return Y channel. Default: False.
+
+ Returns:
+ ndarray: The converted YCbCr image. The output image has the same type
+ and range as input image.
+ """
+ img_type = img.dtype
+ img = _convert_input_type_range(img)
+ if y_only:
+ out_img = np.dot(img, [65.481, 128.553, 24.966]) + 16.0
+ else:
+ out_img = np.matmul(
+ img, [[65.481, -37.797, 112.0], [128.553, -74.203, -93.786],
+ [24.966, 112.0, -18.214]]) + [16, 128, 128]
+ out_img = _convert_output_type_range(out_img, img_type)
+ return out_img
+
+
+def bgr2ycbcr(img, y_only=False):
+ """Convert a BGR image to YCbCr image.
+
+ The bgr version of rgb2ycbcr.
+ It implements the ITU-R BT.601 conversion for standard-definition
+ television. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.
+
+ It differs from a similar function in cv2.cvtColor: `BGR <-> YCrCb`.
+ In OpenCV, it implements a JPEG conversion. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.
+
+ Args:
+ img (ndarray): The input image. It accepts:
+ 1. np.uint8 type with range [0, 255];
+ 2. np.float32 type with range [0, 1].
+ y_only (bool): Whether to only return Y channel. Default: False.
+
+ Returns:
+ ndarray: The converted YCbCr image. The output image has the same type
+ and range as input image.
+ """
+ img_type = img.dtype
+ img = _convert_input_type_range(img)
+ if y_only:
+ out_img = np.dot(img, [24.966, 128.553, 65.481]) + 16.0
+ else:
+ out_img = np.matmul(
+ img, [[24.966, 112.0, -18.214], [128.553, -74.203, -93.786],
+ [65.481, -37.797, 112.0]]) + [16, 128, 128]
+ out_img = _convert_output_type_range(out_img, img_type)
+ return out_img
+
+
+def ycbcr2rgb(img):
+ """Convert a YCbCr image to RGB image.
+
+ This function produces the same results as Matlab's ycbcr2rgb function.
+ It implements the ITU-R BT.601 conversion for standard-definition
+ television. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.
+
+ It differs from a similar function in cv2.cvtColor: `YCrCb <-> RGB`.
+ In OpenCV, it implements a JPEG conversion. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.
+
+ Args:
+ img (ndarray): The input image. It accepts:
+ 1. np.uint8 type with range [0, 255];
+ 2. np.float32 type with range [0, 1].
+
+ Returns:
+ ndarray: The converted RGB image. The output image has the same type
+ and range as input image.
+ """
+ img_type = img.dtype
+ img = _convert_input_type_range(img) * 255
+ out_img = np.matmul(img, [[0.00456621, 0.00456621, 0.00456621],
+ [0, -0.00153632, 0.00791071],
+ [0.00625893, -0.00318811, 0]]) * 255.0 + [
+ -222.921, 135.576, -276.836
+ ]
+ out_img = _convert_output_type_range(out_img, img_type)
+ return out_img
+
+
+def ycbcr2bgr(img):
+ """Convert a YCbCr image to BGR image.
+
+ The bgr version of ycbcr2rgb.
+ It implements the ITU-R BT.601 conversion for standard-definition
+ television. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.
+
+ It differs from a similar function in cv2.cvtColor: `YCrCb <-> BGR`.
+ In OpenCV, it implements a JPEG conversion. See more details in
+ https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.
+
+ Args:
+ img (ndarray): The input image. It accepts:
+ 1. np.uint8 type with range [0, 255];
+ 2. np.float32 type with range [0, 1].
+
+ Returns:
+ ndarray: The converted BGR image. The output image has the same type
+ and range as input image.
+ """
+ img_type = img.dtype
+ img = _convert_input_type_range(img) * 255
+ out_img = np.matmul(img, [[0.00456621, 0.00456621, 0.00456621],
+ [0.00791071, -0.00153632, 0],
+ [0, -0.00318811, 0.00625893]]) * 255.0 + [
+ -276.836, 135.576, -222.921
+ ]
+ out_img = _convert_output_type_range(out_img, img_type)
+ return out_img
+
+
+def convert_color_factory(src, dst):
+
+ code = getattr(cv2, f'COLOR_{src.upper()}2{dst.upper()}')
+
+ def convert_color(img):
+ out_img = cv2.cvtColor(img, code)
+ return out_img
+
+ convert_color.__doc__ = f"""Convert a {src.upper()} image to {dst.upper()}
+ image.
+
+ Args:
+        img (ndarray): The input image.
+
+ Returns:
+ ndarray: The converted {dst.upper()} image.
+ """
+
+ return convert_color
+
+
+bgr2rgb = convert_color_factory('bgr', 'rgb')
+
+rgb2bgr = convert_color_factory('rgb', 'bgr')
+
+bgr2hsv = convert_color_factory('bgr', 'hsv')
+
+hsv2bgr = convert_color_factory('hsv', 'bgr')
+
+bgr2hls = convert_color_factory('bgr', 'hls')
+
+hls2bgr = convert_color_factory('hls', 'bgr')
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/image/geometric.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/geometric.py
new file mode 100644
index 0000000000000000000000000000000000000000..f81aa4599b4bc8e8d8f2344c5797d075f0d32d1f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/geometric.py
@@ -0,0 +1,728 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import numbers
+
+import cv2
+import numpy as np
+
+from ..utils import to_2tuple
+from .io import imread_backend
+
+try:
+ from PIL import Image
+except ImportError:
+ Image = None
+
+
+def _scale_size(size, scale):
+ """Rescale a size by a ratio.
+
+ Args:
+ size (tuple[int]): (w, h).
+ scale (float | tuple(float)): Scaling factor.
+
+ Returns:
+ tuple[int]: scaled size.
+ """
+ if isinstance(scale, (float, int)):
+ scale = (scale, scale)
+ w, h = size
+ return int(w * float(scale[0]) + 0.5), int(h * float(scale[1]) + 0.5)
+
+
+cv2_interp_codes = {
+ 'nearest': cv2.INTER_NEAREST,
+ 'bilinear': cv2.INTER_LINEAR,
+ 'bicubic': cv2.INTER_CUBIC,
+ 'area': cv2.INTER_AREA,
+ 'lanczos': cv2.INTER_LANCZOS4
+}
+
+if Image is not None:
+ pillow_interp_codes = {
+ 'nearest': Image.NEAREST,
+ 'bilinear': Image.BILINEAR,
+ 'bicubic': Image.BICUBIC,
+ 'box': Image.BOX,
+ 'lanczos': Image.LANCZOS,
+ 'hamming': Image.HAMMING
+ }
+
+
+def imresize(img,
+ size,
+ return_scale=False,
+ interpolation='bilinear',
+ out=None,
+ backend=None):
+ """Resize image to a given size.
+
+ Args:
+ img (ndarray): The input image.
+ size (tuple[int]): Target size (w, h).
+ return_scale (bool): Whether to return `w_scale` and `h_scale`.
+ interpolation (str): Interpolation method, accepted values are
+ "nearest", "bilinear", "bicubic", "area", "lanczos" for 'cv2'
+ backend, "nearest", "bilinear" for 'pillow' backend.
+ out (ndarray): The output destination.
+ backend (str | None): The image resize backend type. Options are `cv2`,
+ `pillow`, `None`. If backend is None, the global imread_backend
+ specified by ``mmcv.use_backend()`` will be used. Default: None.
+
+ Returns:
+ tuple | ndarray: (`resized_img`, `w_scale`, `h_scale`) or
+ `resized_img`.
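+
+    Example:
+        >>> # Illustrative usage sketch: ``size`` is given as (w, h), while
+        >>> # the returned array keeps the usual (h, w, c) layout.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (300, 400, 3), dtype=np.uint8)
+        >>> imresize(img, (200, 100)).shape
+        (100, 200, 3)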
+ """
+ h, w = img.shape[:2]
+ if backend is None:
+ backend = imread_backend
+ if backend not in ['cv2', 'pillow']:
+ raise ValueError(f'backend: {backend} is not supported for resize.'
+ f"Supported backends are 'cv2', 'pillow'")
+
+ if backend == 'pillow':
+        assert img.dtype == np.uint8, \
+            'Pillow backend only supports uint8 type'
+ pil_image = Image.fromarray(img)
+ pil_image = pil_image.resize(size, pillow_interp_codes[interpolation])
+ resized_img = np.array(pil_image)
+ else:
+ resized_img = cv2.resize(
+ img, size, dst=out, interpolation=cv2_interp_codes[interpolation])
+ if not return_scale:
+ return resized_img
+ else:
+ w_scale = size[0] / w
+ h_scale = size[1] / h
+ return resized_img, w_scale, h_scale
+
+
+def imresize_to_multiple(img,
+ divisor,
+ size=None,
+ scale_factor=None,
+ keep_ratio=False,
+ return_scale=False,
+ interpolation='bilinear',
+ out=None,
+ backend=None):
+    """Resize an image according to a given size or scale factor and then
+    round the resized or rescaled image size up to the nearest value that can
+    be divided by the divisor.
+
+ Args:
+ img (ndarray): The input image.
+ divisor (int | tuple): Resized image size will be a multiple of
+ divisor. If divisor is a tuple, divisor should be
+ (w_divisor, h_divisor).
+ size (None | int | tuple[int]): Target size (w, h). Default: None.
+ scale_factor (None | float | tuple[float]): Multiplier for spatial
+ size. Should match input size if it is a tuple and the 2D style is
+ (w_scale_factor, h_scale_factor). Default: None.
+ keep_ratio (bool): Whether to keep the aspect ratio when resizing the
+ image. Default: False.
+ return_scale (bool): Whether to return `w_scale` and `h_scale`.
+ interpolation (str): Interpolation method, accepted values are
+ "nearest", "bilinear", "bicubic", "area", "lanczos" for 'cv2'
+ backend, "nearest", "bilinear" for 'pillow' backend.
+ out (ndarray): The output destination.
+ backend (str | None): The image resize backend type. Options are `cv2`,
+ `pillow`, `None`. If backend is None, the global imread_backend
+ specified by ``mmcv.use_backend()`` will be used. Default: None.
+
+ Returns:
+ tuple | ndarray: (`resized_img`, `w_scale`, `h_scale`) or
+ `resized_img`.
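+
+    Example:
+        >>> # Illustrative usage sketch: keep the original scale but round
+        >>> # both edges up to the nearest multiple of 32.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (110, 100, 3), dtype=np.uint8)
+        >>> imresize_to_multiple(img, 32, scale_factor=1.0).shape
+        (128, 128, 3)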
+ """
+ h, w = img.shape[:2]
+ if size is not None and scale_factor is not None:
+ raise ValueError('only one of size or scale_factor should be defined')
+ elif size is None and scale_factor is None:
+ raise ValueError('one of size or scale_factor should be defined')
+ elif size is not None:
+ size = to_2tuple(size)
+ if keep_ratio:
+ size = rescale_size((w, h), size, return_scale=False)
+ else:
+ size = _scale_size((w, h), scale_factor)
+
+ divisor = to_2tuple(divisor)
+ size = tuple([int(np.ceil(s / d)) * d for s, d in zip(size, divisor)])
+ resized_img, w_scale, h_scale = imresize(
+ img,
+ size,
+ return_scale=True,
+ interpolation=interpolation,
+ out=out,
+ backend=backend)
+ if return_scale:
+ return resized_img, w_scale, h_scale
+ else:
+ return resized_img
+
+
+def imresize_like(img,
+ dst_img,
+ return_scale=False,
+ interpolation='bilinear',
+ backend=None):
+ """Resize image to the same size of a given image.
+
+ Args:
+ img (ndarray): The input image.
+ dst_img (ndarray): The target image.
+ return_scale (bool): Whether to return `w_scale` and `h_scale`.
+ interpolation (str): Same as :func:`resize`.
+ backend (str | None): Same as :func:`resize`.
+
+ Returns:
+ tuple or ndarray: (`resized_img`, `w_scale`, `h_scale`) or
+ `resized_img`.
+ """
+ h, w = dst_img.shape[:2]
+ return imresize(img, (w, h), return_scale, interpolation, backend=backend)
+
+
+def rescale_size(old_size, scale, return_scale=False):
+ """Calculate the new size to be rescaled to.
+
+ Args:
+ old_size (tuple[int]): The old size (w, h) of image.
+ scale (float | tuple[int]): The scaling factor or maximum size.
+ If it is a float number, then the image will be rescaled by this
+ factor, else if it is a tuple of 2 integers, then the image will
+ be rescaled as large as possible within the scale.
+ return_scale (bool): Whether to return the scaling factor besides the
+ rescaled image size.
+
+ Returns:
+ tuple[int]: The new rescaled image size.
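+
+    Example:
+        >>> # Illustrative usage sketch: fit a 1024x768 (w, h) image into a
+        >>> # (1000, 600) scale range, or rescale it by a plain factor.
+        >>> rescale_size((1024, 768), (1000, 600))
+        (800, 600)
+        >>> rescale_size((1024, 768), 0.5)
+        (512, 384)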
+ """
+ w, h = old_size
+ if isinstance(scale, (float, int)):
+ if scale <= 0:
+ raise ValueError(f'Invalid scale {scale}, must be positive.')
+ scale_factor = scale
+ elif isinstance(scale, tuple):
+ max_long_edge = max(scale)
+ max_short_edge = min(scale)
+ scale_factor = min(max_long_edge / max(h, w),
+ max_short_edge / min(h, w))
+ else:
+ raise TypeError(
+ f'Scale must be a number or tuple of int, but got {type(scale)}')
+
+ new_size = _scale_size((w, h), scale_factor)
+
+ if return_scale:
+ return new_size, scale_factor
+ else:
+ return new_size
+
+
+def imrescale(img,
+ scale,
+ return_scale=False,
+ interpolation='bilinear',
+ backend=None):
+ """Resize image while keeping the aspect ratio.
+
+ Args:
+ img (ndarray): The input image.
+ scale (float | tuple[int]): The scaling factor or maximum size.
+ If it is a float number, then the image will be rescaled by this
+ factor, else if it is a tuple of 2 integers, then the image will
+ be rescaled as large as possible within the scale.
+ return_scale (bool): Whether to return the scaling factor besides the
+ rescaled image.
+ interpolation (str): Same as :func:`resize`.
+ backend (str | None): Same as :func:`resize`.
+
+ Returns:
+ ndarray: The rescaled image.
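+
+    Example:
+        >>> # Illustrative usage sketch: halve both edges of a random image.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (400, 600, 3), dtype=np.uint8)
+        >>> imrescale(img, 0.5).shape
+        (200, 300, 3)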
+ """
+ h, w = img.shape[:2]
+ new_size, scale_factor = rescale_size((w, h), scale, return_scale=True)
+ rescaled_img = imresize(
+ img, new_size, interpolation=interpolation, backend=backend)
+ if return_scale:
+ return rescaled_img, scale_factor
+ else:
+ return rescaled_img
+
+
+def imflip(img, direction='horizontal'):
+ """Flip an image horizontally or vertically.
+
+ Args:
+ img (ndarray): Image to be flipped.
+ direction (str): The flip direction, either "horizontal" or
+ "vertical" or "diagonal".
+
+ Returns:
+ ndarray: The flipped image.
+ """
+ assert direction in ['horizontal', 'vertical', 'diagonal']
+ if direction == 'horizontal':
+ return np.flip(img, axis=1)
+ elif direction == 'vertical':
+ return np.flip(img, axis=0)
+ else:
+ return np.flip(img, axis=(0, 1))
+
+
+def imflip_(img, direction='horizontal'):
+ """Inplace flip an image horizontally or vertically.
+
+ Args:
+ img (ndarray): Image to be flipped.
+ direction (str): The flip direction, either "horizontal" or
+ "vertical" or "diagonal".
+
+ Returns:
+ ndarray: The flipped image (inplace).
+ """
+ assert direction in ['horizontal', 'vertical', 'diagonal']
+ if direction == 'horizontal':
+ return cv2.flip(img, 1, img)
+ elif direction == 'vertical':
+ return cv2.flip(img, 0, img)
+ else:
+ return cv2.flip(img, -1, img)
+
+
+def imrotate(img,
+ angle,
+ center=None,
+ scale=1.0,
+ border_value=0,
+ interpolation='bilinear',
+ auto_bound=False):
+ """Rotate an image.
+
+ Args:
+ img (ndarray): Image to be rotated.
+ angle (float): Rotation angle in degrees, positive values mean
+ clockwise rotation.
+ center (tuple[float], optional): Center point (w, h) of the rotation in
+ the source image. If not specified, the center of the image will be
+ used.
+ scale (float): Isotropic scale factor.
+ border_value (int): Border value.
+ interpolation (str): Same as :func:`resize`.
+ auto_bound (bool): Whether to adjust the image size to cover the whole
+ rotated image.
+
+ Returns:
+ ndarray: The rotated image.
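+
+    Example:
+        >>> # Illustrative usage sketch: without ``auto_bound`` the output
+        >>> # keeps the input size; with it, the canvas grows to fit.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (200, 100, 3), dtype=np.uint8)
+        >>> imrotate(img, 30).shape
+        (200, 100, 3)
+        >>> imrotate(img, 90, auto_bound=True).shape
+        (100, 200, 3)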
+ """
+ if center is not None and auto_bound:
+ raise ValueError('`auto_bound` conflicts with `center`')
+ h, w = img.shape[:2]
+ if center is None:
+ center = ((w - 1) * 0.5, (h - 1) * 0.5)
+ assert isinstance(center, tuple)
+
+ matrix = cv2.getRotationMatrix2D(center, -angle, scale)
+ if auto_bound:
+ cos = np.abs(matrix[0, 0])
+ sin = np.abs(matrix[0, 1])
+ new_w = h * sin + w * cos
+ new_h = h * cos + w * sin
+ matrix[0, 2] += (new_w - w) * 0.5
+ matrix[1, 2] += (new_h - h) * 0.5
+ w = int(np.round(new_w))
+ h = int(np.round(new_h))
+ rotated = cv2.warpAffine(
+ img,
+ matrix, (w, h),
+ flags=cv2_interp_codes[interpolation],
+ borderValue=border_value)
+ return rotated
+
+
+def bbox_clip(bboxes, img_shape):
+ """Clip bboxes to fit the image shape.
+
+ Args:
+ bboxes (ndarray): Shape (..., 4*k)
+ img_shape (tuple[int]): (height, width) of the image.
+
+ Returns:
+ ndarray: Clipped bboxes.
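+
+    Example:
+        >>> # Illustrative usage sketch: clip a box that sticks out of a
+        >>> # 100x100 image (``img_shape`` is given as (h, w)).
+        >>> import numpy as np
+        >>> bbox_clip(np.array([[-5., 10., 120., 130.]]), (100, 100)).tolist()
+        [[0.0, 10.0, 99.0, 99.0]]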
+ """
+ assert bboxes.shape[-1] % 4 == 0
+ cmin = np.empty(bboxes.shape[-1], dtype=bboxes.dtype)
+ cmin[0::2] = img_shape[1] - 1
+ cmin[1::2] = img_shape[0] - 1
+ clipped_bboxes = np.maximum(np.minimum(bboxes, cmin), 0)
+ return clipped_bboxes
+
+
+def bbox_scaling(bboxes, scale, clip_shape=None):
+    """Scale bboxes w.r.t. the box center.
+
+ Args:
+ bboxes (ndarray): Shape(..., 4).
+ scale (float): Scaling factor.
+ clip_shape (tuple[int], optional): If specified, bboxes that exceed the
+ boundary will be clipped according to the given shape (h, w).
+
+ Returns:
+ ndarray: Scaled bboxes.
+ """
+ if float(scale) == 1.0:
+ scaled_bboxes = bboxes.copy()
+ else:
+ w = bboxes[..., 2] - bboxes[..., 0] + 1
+ h = bboxes[..., 3] - bboxes[..., 1] + 1
+ dw = (w * (scale - 1)) * 0.5
+ dh = (h * (scale - 1)) * 0.5
+ scaled_bboxes = bboxes + np.stack((-dw, -dh, dw, dh), axis=-1)
+ if clip_shape is not None:
+ return bbox_clip(scaled_bboxes, clip_shape)
+ else:
+ return scaled_bboxes
+
+
+def imcrop(img, bboxes, scale=1.0, pad_fill=None):
+ """Crop image patches.
+
+ 3 steps: scale the bboxes -> clip bboxes -> crop and pad.
+
+ Args:
+ img (ndarray): Image to be cropped.
+ bboxes (ndarray): Shape (k, 4) or (4, ), location of cropped bboxes.
+        scale (float, optional): Scale ratio of bboxes, the default value
+            1.0 means no scaling.
+ pad_fill (Number | list[Number]): Value to be filled for padding.
+ Default: None, which means no padding.
+
+ Returns:
+ list[ndarray] | ndarray: The cropped image patches.
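+
+    Example:
+        >>> # Illustrative usage sketch: a single (4, ) bbox returns one
+        >>> # patch, cropped inclusively from (x1, y1) to (x2, y2).
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
+        >>> imcrop(img, np.array([10, 20, 59, 79])).shape
+        (60, 50, 3)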
+ """
+ chn = 1 if img.ndim == 2 else img.shape[2]
+ if pad_fill is not None:
+ if isinstance(pad_fill, (int, float)):
+ pad_fill = [pad_fill for _ in range(chn)]
+ assert len(pad_fill) == chn
+
+ _bboxes = bboxes[None, ...] if bboxes.ndim == 1 else bboxes
+ scaled_bboxes = bbox_scaling(_bboxes, scale).astype(np.int32)
+ clipped_bbox = bbox_clip(scaled_bboxes, img.shape)
+
+ patches = []
+ for i in range(clipped_bbox.shape[0]):
+ x1, y1, x2, y2 = tuple(clipped_bbox[i, :])
+ if pad_fill is None:
+ patch = img[y1:y2 + 1, x1:x2 + 1, ...]
+ else:
+ _x1, _y1, _x2, _y2 = tuple(scaled_bboxes[i, :])
+ if chn == 1:
+ patch_shape = (_y2 - _y1 + 1, _x2 - _x1 + 1)
+ else:
+ patch_shape = (_y2 - _y1 + 1, _x2 - _x1 + 1, chn)
+ patch = np.array(
+ pad_fill, dtype=img.dtype) * np.ones(
+ patch_shape, dtype=img.dtype)
+ x_start = 0 if _x1 >= 0 else -_x1
+ y_start = 0 if _y1 >= 0 else -_y1
+ w = x2 - x1 + 1
+ h = y2 - y1 + 1
+ patch[y_start:y_start + h, x_start:x_start + w,
+ ...] = img[y1:y1 + h, x1:x1 + w, ...]
+ patches.append(patch)
+
+ if bboxes.ndim == 1:
+ return patches[0]
+ else:
+ return patches
+
+
+def impad(img,
+ *,
+ shape=None,
+ padding=None,
+ pad_val=0,
+ padding_mode='constant'):
+ """Pad the given image to a certain shape or pad on all sides with
+ specified padding mode and padding value.
+
+ Args:
+ img (ndarray): Image to be padded.
+ shape (tuple[int]): Expected padding shape (h, w). Default: None.
+ padding (int or tuple[int]): Padding on each border. If a single int is
+ provided this is used to pad all borders. If tuple of length 2 is
+ provided this is the padding on left/right and top/bottom
+ respectively. If a tuple of length 4 is provided this is the
+ padding for the left, top, right and bottom borders respectively.
+ Default: None. Note that `shape` and `padding` can not be both
+ set.
+ pad_val (Number | Sequence[Number]): Values to be filled in padding
+ areas when padding_mode is 'constant'. Default: 0.
+ padding_mode (str): Type of padding. Should be: constant, edge,
+ reflect or symmetric. Default: constant.
+
+ - constant: pads with a constant value, this value is specified
+ with pad_val.
+ - edge: pads with the last value at the edge of the image.
+ - reflect: pads with reflection of image without repeating the
+ last value on the edge. For example, padding [1, 2, 3, 4]
+ with 2 elements on both sides in reflect mode will result
+ in [3, 2, 1, 2, 3, 4, 3, 2].
+ - symmetric: pads with reflection of image repeating the last
+ value on the edge. For example, padding [1, 2, 3, 4] with
+ 2 elements on both sides in symmetric mode will result in
+ [2, 1, 1, 2, 3, 4, 4, 3]
+
+ Returns:
+ ndarray: The padded image.
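+
+    Example:
+        >>> # Illustrative usage sketch: pad to a fixed (h, w) shape, or pad
+        >>> # each border individually as (left, top, right, bottom).
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (100, 200, 3), dtype=np.uint8)
+        >>> impad(img, shape=(128, 256)).shape
+        (128, 256, 3)
+        >>> impad(img, padding=(10, 20, 30, 40)).shape
+        (160, 240, 3)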
+ """
+
+ assert (shape is not None) ^ (padding is not None)
+ if shape is not None:
+ padding = (0, 0, shape[1] - img.shape[1], shape[0] - img.shape[0])
+
+ # check pad_val
+ if isinstance(pad_val, tuple):
+ assert len(pad_val) == img.shape[-1]
+ elif not isinstance(pad_val, numbers.Number):
+        raise TypeError('pad_val must be an int or a tuple. '
+ f'But received {type(pad_val)}')
+
+ # check padding
+ if isinstance(padding, tuple) and len(padding) in [2, 4]:
+ if len(padding) == 2:
+ padding = (padding[0], padding[1], padding[0], padding[1])
+ elif isinstance(padding, numbers.Number):
+ padding = (padding, padding, padding, padding)
+ else:
+        raise ValueError('Padding must be an int or a 2- or 4-element tuple. '
+ f'But received {padding}')
+
+ # check padding mode
+ assert padding_mode in ['constant', 'edge', 'reflect', 'symmetric']
+
+ border_type = {
+ 'constant': cv2.BORDER_CONSTANT,
+ 'edge': cv2.BORDER_REPLICATE,
+ 'reflect': cv2.BORDER_REFLECT_101,
+ 'symmetric': cv2.BORDER_REFLECT
+ }
+ img = cv2.copyMakeBorder(
+ img,
+ padding[1],
+ padding[3],
+ padding[0],
+ padding[2],
+ border_type[padding_mode],
+ value=pad_val)
+
+ return img
+
+
+def impad_to_multiple(img, divisor, pad_val=0):
+    """Pad an image to ensure each edge is a multiple of some number.
+
+ Args:
+ img (ndarray): Image to be padded.
+        divisor (int): Padded image edges will be multiples of the divisor.
+ pad_val (Number | Sequence[Number]): Same as :func:`impad`.
+
+ Returns:
+ ndarray: The padded image.
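+
+    Example:
+        >>> # Illustrative usage sketch: both edges are padded up to the next
+        >>> # multiple of 32.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (253, 301, 3), dtype=np.uint8)
+        >>> impad_to_multiple(img, 32).shape
+        (256, 320, 3)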
+ """
+ pad_h = int(np.ceil(img.shape[0] / divisor)) * divisor
+ pad_w = int(np.ceil(img.shape[1] / divisor)) * divisor
+ return impad(img, shape=(pad_h, pad_w), pad_val=pad_val)
+
+
+def cutout(img, shape, pad_val=0):
+ """Randomly cut out a rectangle from the original img.
+
+ Args:
+ img (ndarray): Image to be cutout.
+        shape (int | tuple[int]): Expected cutout shape (h, w). If given as
+            an int, the value will be used for both h and w.
+ pad_val (int | float | tuple[int | float]): Values to be filled in the
+ cut area. Defaults to 0.
+
+ Returns:
+ ndarray: The cutout image.
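+
+    Example:
+        >>> # Illustrative usage sketch: shape and dtype are preserved; only
+        >>> # a random rectangle is filled with ``pad_val``.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
+        >>> out = cutout(img, 16, pad_val=128)
+        >>> out.shape == img.shape and out.dtype == img.dtype
+        True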
+ """
+
+ channels = 1 if img.ndim == 2 else img.shape[2]
+ if isinstance(shape, int):
+ cut_h, cut_w = shape, shape
+ else:
+ assert isinstance(shape, tuple) and len(shape) == 2, \
+            f'shape must be an int or a tuple with length 2, but got type ' \
+ f'{type(shape)} instead.'
+ cut_h, cut_w = shape
+ if isinstance(pad_val, (int, float)):
+ pad_val = tuple([pad_val] * channels)
+ elif isinstance(pad_val, tuple):
+ assert len(pad_val) == channels, \
+            'Expected the number of elements in tuple to equal the ' \
+            'channels of the input image. Found {} vs {}'.format(
+ len(pad_val), channels)
+ else:
+ raise TypeError(f'Invalid type {type(pad_val)} for `pad_val`')
+
+ img_h, img_w = img.shape[:2]
+ y0 = np.random.uniform(img_h)
+ x0 = np.random.uniform(img_w)
+
+ y1 = int(max(0, y0 - cut_h / 2.))
+ x1 = int(max(0, x0 - cut_w / 2.))
+ y2 = min(img_h, y1 + cut_h)
+ x2 = min(img_w, x1 + cut_w)
+
+ if img.ndim == 2:
+ patch_shape = (y2 - y1, x2 - x1)
+ else:
+ patch_shape = (y2 - y1, x2 - x1, channels)
+
+ img_cutout = img.copy()
+ patch = np.array(
+ pad_val, dtype=img.dtype) * np.ones(
+ patch_shape, dtype=img.dtype)
+ img_cutout[y1:y2, x1:x2, ...] = patch
+
+ return img_cutout
+
+
+def _get_shear_matrix(magnitude, direction='horizontal'):
+ """Generate the shear matrix for transformation.
+
+ Args:
+ magnitude (int | float): The magnitude used for shear.
+        direction (str): The shear direction, either "horizontal"
+ or "vertical".
+
+ Returns:
+ ndarray: The shear matrix with dtype float32.
+ """
+ if direction == 'horizontal':
+ shear_matrix = np.float32([[1, magnitude, 0], [0, 1, 0]])
+ elif direction == 'vertical':
+ shear_matrix = np.float32([[1, 0, 0], [magnitude, 1, 0]])
+ return shear_matrix
+
+
+def imshear(img,
+ magnitude,
+ direction='horizontal',
+ border_value=0,
+ interpolation='bilinear'):
+ """Shear an image.
+
+ Args:
+ img (ndarray): Image to be sheared with format (h, w)
+ or (h, w, c).
+ magnitude (int | float): The magnitude used for shear.
+        direction (str): The shear direction, either "horizontal"
+ or "vertical".
+ border_value (int | tuple[int]): Value used in case of a
+ constant border.
+ interpolation (str): Same as :func:`resize`.
+
+ Returns:
+ ndarray: The sheared image.
+ """
+ assert direction in ['horizontal',
+ 'vertical'], f'Invalid direction: {direction}'
+ height, width = img.shape[:2]
+ if img.ndim == 2:
+ channels = 1
+ elif img.ndim == 3:
+ channels = img.shape[-1]
+ if isinstance(border_value, int):
+ border_value = tuple([border_value] * channels)
+ elif isinstance(border_value, tuple):
+ assert len(border_value) == channels, \
+            'Expected the number of elements in tuple to equal the ' \
+            'channels of the input image. Found {} vs {}'.format(
+ len(border_value), channels)
+ else:
+ raise ValueError(
+ f'Invalid type {type(border_value)} for `border_value`')
+ shear_matrix = _get_shear_matrix(magnitude, direction)
+ sheared = cv2.warpAffine(
+ img,
+ shear_matrix,
+ (width, height),
+        # Note that when the number of elements in `border_value` is greater
+        # than 3 (e.g. shearing masks whose number of channels is larger
+        # than 3), cv2.warpAffine will raise a TypeError. Here we simply
+        # slice the first 3 values in `border_value`.
+ borderValue=border_value[:3],
+ flags=cv2_interp_codes[interpolation])
+ return sheared
+
+
+def _get_translate_matrix(offset, direction='horizontal'):
+ """Generate the translate matrix.
+
+ Args:
+ offset (int | float): The offset used for translate.
+ direction (str): The translate direction, either
+ "horizontal" or "vertical".
+
+ Returns:
+ ndarray: The translate matrix with dtype float32.
+ """
+ if direction == 'horizontal':
+ translate_matrix = np.float32([[1, 0, offset], [0, 1, 0]])
+ elif direction == 'vertical':
+ translate_matrix = np.float32([[1, 0, 0], [0, 1, offset]])
+ return translate_matrix
+
+
+def imtranslate(img,
+ offset,
+ direction='horizontal',
+ border_value=0,
+ interpolation='bilinear'):
+ """Translate an image.
+
+ Args:
+ img (ndarray): Image to be translated with format
+ (h, w) or (h, w, c).
+ offset (int | float): The offset used for translate.
+ direction (str): The translate direction, either "horizontal"
+ or "vertical".
+ border_value (int | tuple[int]): Value used in case of a
+ constant border.
+ interpolation (str): Same as :func:`resize`.
+
+ Returns:
+ ndarray: The translated image.
+ """
+ assert direction in ['horizontal',
+ 'vertical'], f'Invalid direction: {direction}'
+ height, width = img.shape[:2]
+ if img.ndim == 2:
+ channels = 1
+ elif img.ndim == 3:
+ channels = img.shape[-1]
+ if isinstance(border_value, int):
+ border_value = tuple([border_value] * channels)
+ elif isinstance(border_value, tuple):
+ assert len(border_value) == channels, \
+            'Expected the number of elements in tuple to equal the ' \
+            'channels of the input image. Found {} vs {}'.format(
+ len(border_value), channels)
+ else:
+ raise ValueError(
+ f'Invalid type {type(border_value)} for `border_value`.')
+ translate_matrix = _get_translate_matrix(offset, direction)
+ translated = cv2.warpAffine(
+ img,
+ translate_matrix,
+ (width, height),
+        # Note that when the number of elements in `border_value` is greater
+        # than 3 (e.g. translating masks whose number of channels is larger
+        # than 3), cv2.warpAffine will raise a TypeError. Here we simply
+        # slice the first 3 values in `border_value`.
+ borderValue=border_value[:3],
+ flags=cv2_interp_codes[interpolation])
+ return translated
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/image/io.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/io.py
new file mode 100644
index 0000000000000000000000000000000000000000..8c64e0eff67e596426318be7d167fe28a49be909
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/io.py
@@ -0,0 +1,258 @@
+# Copyright (c) Open-MMLab. All rights reserved.
+import io
+import os.path as osp
+from pathlib import Path
+
+import cv2
+import numpy as np
+from cv2 import (IMREAD_COLOR, IMREAD_GRAYSCALE, IMREAD_IGNORE_ORIENTATION,
+ IMREAD_UNCHANGED)
+
+from mmcv.utils import check_file_exist, is_str, mkdir_or_exist
+
+try:
+ from turbojpeg import TJCS_RGB, TJPF_BGR, TJPF_GRAY, TurboJPEG
+except ImportError:
+ TJCS_RGB = TJPF_GRAY = TJPF_BGR = TurboJPEG = None
+
+try:
+ from PIL import Image, ImageOps
+except ImportError:
+ Image = None
+
+try:
+ import tifffile
+except ImportError:
+ tifffile = None
+
+jpeg = None
+supported_backends = ['cv2', 'turbojpeg', 'pillow', 'tifffile']
+
+imread_flags = {
+ 'color': IMREAD_COLOR,
+ 'grayscale': IMREAD_GRAYSCALE,
+ 'unchanged': IMREAD_UNCHANGED,
+ 'color_ignore_orientation': IMREAD_IGNORE_ORIENTATION | IMREAD_COLOR,
+ 'grayscale_ignore_orientation':
+ IMREAD_IGNORE_ORIENTATION | IMREAD_GRAYSCALE
+}
+
+imread_backend = 'cv2'
+
+
+def use_backend(backend):
+ """Select a backend for image decoding.
+
+ Args:
+ backend (str): The image decoding backend type. Options are `cv2`,
+ `pillow`, `turbojpeg` (see https://github.com/lilohuang/PyTurboJPEG)
+ and `tifffile`. `turbojpeg` is faster but it only supports `.jpeg`
+ file format.
+ """
+ assert backend in supported_backends
+ global imread_backend
+ imread_backend = backend
+ if imread_backend == 'turbojpeg':
+ if TurboJPEG is None:
+ raise ImportError('`PyTurboJPEG` is not installed')
+ global jpeg
+ if jpeg is None:
+ jpeg = TurboJPEG()
+ elif imread_backend == 'pillow':
+ if Image is None:
+ raise ImportError('`Pillow` is not installed')
+ elif imread_backend == 'tifffile':
+ if tifffile is None:
+ raise ImportError('`tifffile` is not installed')
+
+
+def _jpegflag(flag='color', channel_order='bgr'):
+ channel_order = channel_order.lower()
+ if channel_order not in ['rgb', 'bgr']:
+ raise ValueError('channel order must be either "rgb" or "bgr"')
+
+ if flag == 'color':
+ if channel_order == 'bgr':
+ return TJPF_BGR
+ elif channel_order == 'rgb':
+ return TJCS_RGB
+ elif flag == 'grayscale':
+ return TJPF_GRAY
+ else:
+ raise ValueError('flag must be "color" or "grayscale"')
+
+
+def _pillow2array(img, flag='color', channel_order='bgr'):
+ """Convert a pillow image to numpy array.
+
+ Args:
+ img (:obj:`PIL.Image.Image`): The image loaded using PIL
+ flag (str): Flags specifying the color type of a loaded image,
+ candidates are 'color', 'grayscale' and 'unchanged'.
+ Default to 'color'.
+ channel_order (str): The channel order of the output image array,
+ candidates are 'bgr' and 'rgb'. Default to 'bgr'.
+
+ Returns:
+ np.ndarray: The converted numpy array
+ """
+ channel_order = channel_order.lower()
+ if channel_order not in ['rgb', 'bgr']:
+ raise ValueError('channel order must be either "rgb" or "bgr"')
+
+ if flag == 'unchanged':
+ array = np.array(img)
+ if array.ndim >= 3 and array.shape[2] >= 3: # color image
+ array[:, :, :3] = array[:, :, (2, 1, 0)] # RGB to BGR
+ else:
+ # Handle exif orientation tag
+ if flag in ['color', 'grayscale']:
+ img = ImageOps.exif_transpose(img)
+ # If the image mode is not 'RGB', convert it to 'RGB' first.
+ if img.mode != 'RGB':
+ if img.mode != 'LA':
+ # Most formats except 'LA' can be directly converted to RGB
+ img = img.convert('RGB')
+ else:
+ # When the mode is 'LA', the default conversion will fill in
+ # the canvas with black, which sometimes shadows black objects
+ # in the foreground.
+ #
+ # Therefore, a random color (124, 117, 104) is used for canvas
+ img_rgba = img.convert('RGBA')
+ img = Image.new('RGB', img_rgba.size, (124, 117, 104))
+ img.paste(img_rgba, mask=img_rgba.split()[3]) # 3 is alpha
+ if flag in ['color', 'color_ignore_orientation']:
+ array = np.array(img)
+ if channel_order != 'rgb':
+ array = array[:, :, ::-1] # RGB to BGR
+ elif flag in ['grayscale', 'grayscale_ignore_orientation']:
+ img = img.convert('L')
+ array = np.array(img)
+ else:
+ raise ValueError(
+ 'flag must be "color", "grayscale", "unchanged", '
+ f'"color_ignore_orientation" or "grayscale_ignore_orientation"'
+ f' but got {flag}')
+ return array
+
+
+def imread(img_or_path, flag='color', channel_order='bgr', backend=None):
+ """Read an image.
+
+ Args:
+ img_or_path (ndarray or str or Path): Either a numpy array or str or
+ pathlib.Path. If it is a numpy array (loaded image), then
+ it will be returned as is.
+ flag (str): Flags specifying the color type of a loaded image,
+ candidates are `color`, `grayscale`, `unchanged`,
+ `color_ignore_orientation` and `grayscale_ignore_orientation`.
+ By default, `cv2` and `pillow` backend would rotate the image
+ according to its EXIF info unless called with `unchanged` or
+ `*_ignore_orientation` flags. `turbojpeg` and `tifffile` backend
+ always ignore image's EXIF info regardless of the flag.
+ The `turbojpeg` backend only supports `color` and `grayscale`.
+ channel_order (str): Order of channel, candidates are `bgr` and `rgb`.
+ backend (str | None): The image decoding backend type. Options are
+ `cv2`, `pillow`, `turbojpeg`, `tifffile`, `None`.
+ If backend is None, the global imread_backend specified by
+ ``mmcv.use_backend()`` will be used. Default: None.
+
+ Returns:
+ ndarray: Loaded image array.
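+
+    Example:
+        >>> # Illustrative usage sketch; 'demo.jpg' is a placeholder path.
+        >>> img = imread('demo.jpg')  # BGR array of shape (h, w, 3)
+        >>> gray = imread('demo.jpg', flag='grayscale')  # shape (h, w)
+        >>> rgb = imread('demo.jpg', channel_order='rgb')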
+ """
+
+ if backend is None:
+ backend = imread_backend
+ if backend not in supported_backends:
+ raise ValueError(f'backend: {backend} is not supported. Supported '
+                         "backends are 'cv2', 'turbojpeg', 'pillow', "
+                         "'tifffile'")
+ if isinstance(img_or_path, Path):
+ img_or_path = str(img_or_path)
+
+ if isinstance(img_or_path, np.ndarray):
+ return img_or_path
+ elif is_str(img_or_path):
+ check_file_exist(img_or_path,
+ f'img file does not exist: {img_or_path}')
+ if backend == 'turbojpeg':
+ with open(img_or_path, 'rb') as in_file:
+ img = jpeg.decode(in_file.read(),
+ _jpegflag(flag, channel_order))
+ if img.shape[-1] == 1:
+ img = img[:, :, 0]
+ return img
+ elif backend == 'pillow':
+ img = Image.open(img_or_path)
+ img = _pillow2array(img, flag, channel_order)
+ return img
+ elif backend == 'tifffile':
+ img = tifffile.imread(img_or_path)
+ return img
+ else:
+ flag = imread_flags[flag] if is_str(flag) else flag
+ img = cv2.imread(img_or_path, flag)
+ if flag == IMREAD_COLOR and channel_order == 'rgb':
+ cv2.cvtColor(img, cv2.COLOR_BGR2RGB, img)
+ return img
+ else:
+ raise TypeError('"img" must be a numpy array or a str or '
+ 'a pathlib.Path object')
+
+
+def imfrombytes(content, flag='color', channel_order='bgr', backend=None):
+ """Read an image from bytes.
+
+ Args:
+ content (bytes): Image bytes got from files or other streams.
+        flag (str): Same as :func:`imread`.
+        channel_order (str): The channel order of the output image array,
+            candidates are `bgr` and `rgb`. Default to `bgr`.
+ backend (str | None): The image decoding backend type. Options are
+ `cv2`, `pillow`, `turbojpeg`, `None`. If backend is None, the
+ global imread_backend specified by ``mmcv.use_backend()`` will be
+ used. Default: None.
+
+ Returns:
+ ndarray: Loaded image array.
+ """
+
+ if backend is None:
+ backend = imread_backend
+ if backend not in supported_backends:
+ raise ValueError(f'backend: {backend} is not supported. Supported '
+ "backends are 'cv2', 'turbojpeg', 'pillow'")
+ if backend == 'turbojpeg':
+ img = jpeg.decode(content, _jpegflag(flag, channel_order))
+ if img.shape[-1] == 1:
+ img = img[:, :, 0]
+ return img
+ elif backend == 'pillow':
+ buff = io.BytesIO(content)
+ img = Image.open(buff)
+ img = _pillow2array(img, flag, channel_order)
+ return img
+ else:
+ img_np = np.frombuffer(content, np.uint8)
+ flag = imread_flags[flag] if is_str(flag) else flag
+ img = cv2.imdecode(img_np, flag)
+ if flag == IMREAD_COLOR and channel_order == 'rgb':
+ cv2.cvtColor(img, cv2.COLOR_BGR2RGB, img)
+ return img
+
+
+def imwrite(img, file_path, params=None, auto_mkdir=True):
+ """Write image to file.
+
+ Args:
+ img (ndarray): Image array to be written.
+ file_path (str): Image file path.
+ params (None or list): Same as opencv :func:`imwrite` interface.
+ auto_mkdir (bool): If the parent folder of `file_path` does not exist,
+ whether to create it automatically.
+
+ Returns:
+ bool: Successful or not.
+ """
+ if auto_mkdir:
+ dir_name = osp.abspath(osp.dirname(file_path))
+ mkdir_or_exist(dir_name)
+ return cv2.imwrite(file_path, img, params)
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/image/misc.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/misc.py
new file mode 100644
index 0000000000000000000000000000000000000000..1e02b952e2a36b4964a8812d08dcf753007c0280
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/misc.py
@@ -0,0 +1,43 @@
+import numpy as np
+
+import mmcv
+
+try:
+ import torch
+except ImportError:
+ torch = None
+
+
+def tensor2imgs(tensor, mean=(0, 0, 0), std=(1, 1, 1), to_rgb=True):
+ """Convert tensor to 3-channel images.
+
+ Args:
+ tensor (torch.Tensor): Tensor that contains multiple images, shape (
+ N, C, H, W).
+ mean (tuple[float], optional): Mean of images. Defaults to (0, 0, 0).
+ std (tuple[float], optional): Standard deviation of images.
+ Defaults to (1, 1, 1).
+ to_rgb (bool, optional): Whether the tensor was converted to RGB
+ format in the first place. If so, convert it back to BGR.
+ Defaults to True.
+
+ Returns:
+ list[np.ndarray]: A list that contains multiple images.
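+
+    Example:
+        >>> # Illustrative usage sketch: a batch of two 3x32x32 tensors
+        >>> # becomes a list of two (32, 32, 3) uint8 arrays.
+        >>> import torch
+        >>> tensor = torch.rand(2, 3, 32, 32)
+        >>> imgs = tensor2imgs(tensor)
+        >>> len(imgs), imgs[0].shape
+        (2, (32, 32, 3))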
+ """
+
+ if torch is None:
+ raise RuntimeError('pytorch is not installed')
+ assert torch.is_tensor(tensor) and tensor.ndim == 4
+ assert len(mean) == 3
+ assert len(std) == 3
+
+ num_imgs = tensor.size(0)
+ mean = np.array(mean, dtype=np.float32)
+ std = np.array(std, dtype=np.float32)
+ imgs = []
+ for img_id in range(num_imgs):
+ img = tensor[img_id, ...].cpu().numpy().transpose(1, 2, 0)
+ img = mmcv.imdenormalize(
+ img, mean, std, to_bgr=to_rgb).astype(np.uint8)
+ imgs.append(np.ascontiguousarray(img))
+ return imgs
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/image/photometric.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/photometric.py
new file mode 100644
index 0000000000000000000000000000000000000000..c43c33dd9903ec66ce75c3a72bb931798d4564d9
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/image/photometric.py
@@ -0,0 +1,425 @@
+import cv2
+import numpy as np
+
+from ..utils import is_tuple_of
+from .colorspace import bgr2gray, gray2bgr
+
+
+def imnormalize(img, mean, std, to_rgb=True):
+ """Normalize an image with mean and std.
+
+ Args:
+ img (ndarray): Image to be normalized.
+        mean (ndarray): The mean to be used for normalization.
+        std (ndarray): The std to be used for normalization.
+ to_rgb (bool): Whether to convert to rgb.
+
+ Returns:
+ ndarray: The normalized image.
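+
+    Example:
+        >>> # Illustrative usage sketch; the mean/std below are the commonly
+        >>> # used ImageNet statistics, given only as an example.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
+        >>> mean = np.array([123.675, 116.28, 103.53])
+        >>> std = np.array([58.395, 57.12, 57.375])
+        >>> imnormalize(img, mean, std).dtype
+        dtype('float32')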
+ """
+ img = img.copy().astype(np.float32)
+ return imnormalize_(img, mean, std, to_rgb)
+
+
+def imnormalize_(img, mean, std, to_rgb=True):
+ """Inplace normalize an image with mean and std.
+
+ Args:
+ img (ndarray): Image to be normalized.
+        mean (ndarray): The mean to be used for normalization.
+        std (ndarray): The std to be used for normalization.
+ to_rgb (bool): Whether to convert to rgb.
+
+ Returns:
+ ndarray: The normalized image.
+ """
+ # cv2 inplace normalization does not accept uint8
+ assert img.dtype != np.uint8
+ mean = np.float64(mean.reshape(1, -1))
+ stdinv = 1 / np.float64(std.reshape(1, -1))
+ if to_rgb:
+ cv2.cvtColor(img, cv2.COLOR_BGR2RGB, img) # inplace
+ cv2.subtract(img, mean, img) # inplace
+ cv2.multiply(img, stdinv, img) # inplace
+ return img
+
+
+def imdenormalize(img, mean, std, to_bgr=True):
+ assert img.dtype != np.uint8
+ mean = mean.reshape(1, -1).astype(np.float64)
+ std = std.reshape(1, -1).astype(np.float64)
+ img = cv2.multiply(img, std) # make a copy
+ cv2.add(img, mean, img) # inplace
+ if to_bgr:
+ cv2.cvtColor(img, cv2.COLOR_RGB2BGR, img) # inplace
+ return img
+
+
+def iminvert(img):
+ """Invert (negate) an image.
+
+ Args:
+ img (ndarray): Image to be inverted.
+
+ Returns:
+ ndarray: The inverted image.
+ """
+ return np.full_like(img, 255) - img
+
+
+def solarize(img, thr=128):
+ """Solarize an image (invert all pixel values above a threshold)
+
+ Args:
+ img (ndarray): Image to be solarized.
+ thr (int): Threshold for solarizing (0 - 255).
+
+ Returns:
+ ndarray: The solarized image.
+ """
+ img = np.where(img < thr, img, 255 - img)
+ return img
+
+
+def posterize(img, bits):
+ """Posterize an image (reduce the number of bits for each color channel)
+
+ Args:
+ img (ndarray): Image to be posterized.
+ bits (int): Number of bits (1 to 8) to use for posterizing.
+
+ Returns:
+ ndarray: The posterized image.
+ """
+ shift = 8 - bits
+ img = np.left_shift(np.right_shift(img, shift), shift)
+ return img
+
+
+def adjust_color(img, alpha=1, beta=None, gamma=0):
+ r"""It blends the source image and its gray image:
+
+ .. math::
+ output = img * alpha + gray\_img * beta + gamma
+
+ Args:
+ img (ndarray): The input source image.
+ alpha (int | float): Weight for the source image. Default 1.
+ beta (int | float): Weight for the converted gray image.
+ If None, it's assigned the value (1 - `alpha`).
+ gamma (int | float): Scalar added to each sum.
+ Same as :func:`cv2.addWeighted`. Default 0.
+
+ Returns:
+ ndarray: Colored image which has the same size and dtype as input.
+ """
+ gray_img = bgr2gray(img)
+ gray_img = np.tile(gray_img[..., None], [1, 1, 3])
+ if beta is None:
+ beta = 1 - alpha
+ colored_img = cv2.addWeighted(img, alpha, gray_img, beta, gamma)
+ if not colored_img.dtype == np.uint8:
+ # Note when the dtype of `img` is not the default `np.uint8`
+ # (e.g. np.float32), the value in `colored_img` got from cv2
+ # is not guaranteed to be in range [0, 255], so here clip
+ # is needed.
+ colored_img = np.clip(colored_img, 0, 255)
+ return colored_img
+
+
+def imequalize(img):
+ """Equalize the image histogram.
+
+ This function applies a non-linear mapping to the input image,
+ in order to create a uniform distribution of grayscale values
+ in the output image.
+
+ Args:
+ img (ndarray): Image to be equalized.
+
+ Returns:
+ ndarray: The equalized image.
+ """
+
+ def _scale_channel(im, c):
+ """Scale the data in the corresponding channel."""
+ im = im[:, :, c]
+ # Compute the histogram of the image channel.
+ histo = np.histogram(im, 256, (0, 255))[0]
+ # For computing the step, filter out the nonzeros.
+ nonzero_histo = histo[histo > 0]
+ step = (np.sum(nonzero_histo) - nonzero_histo[-1]) // 255
+ if not step:
+ lut = np.array(range(256))
+ else:
+ # Compute the cumulative sum, shifted by step // 2
+ # and then normalized by step.
+ lut = (np.cumsum(histo) + (step // 2)) // step
+ # Shift lut, prepending with 0.
+ lut = np.concatenate([[0], lut[:-1]], 0)
+ # If step is zero, return the original image.
+ # Otherwise, index from lut.
+ return np.where(np.equal(step, 0), im, lut[im])
+
+ # Scales each channel independently and then stacks
+ # the result.
+ s1 = _scale_channel(img, 0)
+ s2 = _scale_channel(img, 1)
+ s3 = _scale_channel(img, 2)
+ equalized_img = np.stack([s1, s2, s3], axis=-1)
+ return equalized_img.astype(img.dtype)
+
+
+def adjust_brightness(img, factor=1.):
+ """Adjust image brightness.
+
+ This function controls the brightness of an image. An
+ enhancement factor of 0.0 gives a black image.
+ A factor of 1.0 gives the original image. This function
+ blends the source image and the degenerated black image:
+
+ .. math::
+ output = img * factor + degenerated * (1 - factor)
+
+ Args:
+ img (ndarray): Image to be brightened.
+        factor (float): A value that controls the enhancement.
+ Factor 1.0 returns the original image, lower
+ factors mean less color (brightness, contrast,
+ etc), and higher values more. Default 1.
+
+ Returns:
+ ndarray: The brightened image.
+ """
+ degenerated = np.zeros_like(img)
+ # Note manually convert the dtype to np.float32, to
+ # achieve as close results as PIL.ImageEnhance.Brightness.
+ # Set beta=1-factor, and gamma=0
+ brightened_img = cv2.addWeighted(
+ img.astype(np.float32), factor, degenerated.astype(np.float32),
+ 1 - factor, 0)
+ brightened_img = np.clip(brightened_img, 0, 255)
+ return brightened_img.astype(img.dtype)
+
+
+def adjust_contrast(img, factor=1.):
+ """Adjust image contrast.
+
+ This function controls the contrast of an image. An
+ enhancement factor of 0.0 gives a solid grey
+ image. A factor of 1.0 gives the original image. It
+ blends the source image and the degenerated mean image:
+
+ .. math::
+ output = img * factor + degenerated * (1 - factor)
+
+ Args:
+ img (ndarray): Image to be contrasted. BGR order.
+ factor (float): Same as :func:`mmcv.adjust_brightness`.
+
+ Returns:
+ ndarray: The contrasted image.
+ """
+ gray_img = bgr2gray(img)
+ hist = np.histogram(gray_img, 256, (0, 255))[0]
+ mean = round(np.sum(gray_img) / np.sum(hist))
+ degenerated = (np.ones_like(img[..., 0]) * mean).astype(img.dtype)
+ degenerated = gray2bgr(degenerated)
+ contrasted_img = cv2.addWeighted(
+ img.astype(np.float32), factor, degenerated.astype(np.float32),
+ 1 - factor, 0)
+ contrasted_img = np.clip(contrasted_img, 0, 255)
+ return contrasted_img.astype(img.dtype)
+
+
+def auto_contrast(img, cutoff=0):
+ """Auto adjust image contrast.
+
+    This function maximizes (normalizes) image contrast by first removing the
+    cutoff percent of the lightest and darkest pixels from the histogram and
+    remapping the image so that the darkest pixel becomes black (0), and the
+    lightest becomes white (255).
+
+ Args:
+ img (ndarray): Image to be contrasted. BGR order.
+ cutoff (int | float | tuple): The cutoff percent of the lightest and
+ darkest pixels to be removed. If given as tuple, it shall be
+ (low, high). Otherwise, the single value will be used for both.
+ Defaults to 0.
+
+ Returns:
+ ndarray: The contrasted image.
+ """
+
+ def _auto_contrast_channel(im, c, cutoff):
+ im = im[:, :, c]
+ # Compute the histogram of the image channel.
+ histo = np.histogram(im, 256, (0, 255))[0]
+ # Remove cut-off percent pixels from histo
+ histo_sum = np.cumsum(histo)
+ cut_low = histo_sum[-1] * cutoff[0] // 100
+ cut_high = histo_sum[-1] - histo_sum[-1] * cutoff[1] // 100
+ histo_sum = np.clip(histo_sum, cut_low, cut_high) - cut_low
+ histo = np.concatenate([[histo_sum[0]], np.diff(histo_sum)], 0)
+
+ # Compute mapping
+ low, high = np.nonzero(histo)[0][0], np.nonzero(histo)[0][-1]
+ # If all the values have been cut off, return the origin img
+ if low >= high:
+ return im
+ scale = 255.0 / (high - low)
+ offset = -low * scale
+ lut = np.array(range(256))
+ lut = lut * scale + offset
+ lut = np.clip(lut, 0, 255)
+ return lut[im]
+
+ if isinstance(cutoff, (int, float)):
+ cutoff = (cutoff, cutoff)
+ else:
+ assert isinstance(cutoff, tuple), 'cutoff must be of type int, ' \
+ f'float or tuple, but got {type(cutoff)} instead.'
+ # Auto adjusts contrast for each channel independently and then stacks
+ # the result.
+ s1 = _auto_contrast_channel(img, 0, cutoff)
+ s2 = _auto_contrast_channel(img, 1, cutoff)
+ s3 = _auto_contrast_channel(img, 2, cutoff)
+ contrasted_img = np.stack([s1, s2, s3], axis=-1)
+ return contrasted_img.astype(img.dtype)
+
+
+def adjust_sharpness(img, factor=1., kernel=None):
+ """Adjust image sharpness.
+
+ This function controls the sharpness of an image. An
+ enhancement factor of 0.0 gives a blurred image. A
+ factor of 1.0 gives the original image. And a factor
+ of 2.0 gives a sharpened image. It blends the source
+ image and the degenerated mean image:
+
+ .. math::
+ output = img * factor + degenerated * (1 - factor)
+
+ Args:
+ img (ndarray): Image to be sharpened. BGR order.
+ factor (float): Same as :func:`mmcv.adjust_brightness`.
+ kernel (np.ndarray, optional): Filter kernel to be applied on the img
+ to obtain the degenerated img. Defaults to None.
+
+ Notes:
+ No value sanity check is enforced on the kernel set by users. So with
+        an inappropriate kernel, `adjust_sharpness` may fail to perform the
+        function its name indicates and instead apply whatever transform is
+        determined by the kernel.
+
+ Returns:
+ ndarray: The sharpened image.
+ """
+
+ if kernel is None:
+ # adopted from PIL.ImageFilter.SMOOTH
+ kernel = np.array([[1., 1., 1.], [1., 5., 1.], [1., 1., 1.]]) / 13
+ assert isinstance(kernel, np.ndarray), \
+ f'kernel must be of type np.ndarray, but got {type(kernel)} instead.'
+ assert kernel.ndim == 2, \
+ f'kernel must have a dimension of 2, but got {kernel.ndim} instead.'
+
+ degenerated = cv2.filter2D(img, -1, kernel)
+ sharpened_img = cv2.addWeighted(
+ img.astype(np.float32), factor, degenerated.astype(np.float32),
+ 1 - factor, 0)
+ sharpened_img = np.clip(sharpened_img, 0, 255)
+ return sharpened_img.astype(img.dtype)
+
+
+def adjust_lighting(img, eigval, eigvec, alphastd=0.1, to_rgb=True):
+ """AlexNet-style PCA jitter.
+
+ This data augmentation is proposed in `ImageNet Classification with Deep
+ Convolutional Neural Networks
+ `_.
+
+ Args:
+ img (ndarray): Image to be adjusted lighting. BGR order.
+        eigval (ndarray): the eigenvalues of the covariance matrix of pixel
+            values.
+        eigvec (ndarray): the eigenvectors of the covariance matrix of pixel
+            values.
+ alphastd (float): The standard deviation for distribution of alpha.
+ Defaults to 0.1
+ to_rgb (bool): Whether to convert img to rgb.
+
+ Returns:
+ ndarray: The adjusted image.
+ """
+ assert isinstance(eigval, np.ndarray) and isinstance(eigvec, np.ndarray), \
+ f'eigval and eigvec should both be of type np.ndarray, got ' \
+ f'{type(eigval)} and {type(eigvec)} instead.'
+
+ assert eigval.ndim == 1 and eigvec.ndim == 2
+ assert eigvec.shape == (3, eigval.shape[0])
+ n_eigval = eigval.shape[0]
+ assert isinstance(alphastd, float), 'alphastd should be of type float, ' \
+ f'got {type(alphastd)} instead.'
+
+ img = img.copy().astype(np.float32)
+ if to_rgb:
+ cv2.cvtColor(img, cv2.COLOR_BGR2RGB, img) # inplace
+
+ alpha = np.random.normal(0, alphastd, n_eigval)
+ alter = eigvec \
+ * np.broadcast_to(alpha.reshape(1, n_eigval), (3, n_eigval)) \
+ * np.broadcast_to(eigval.reshape(1, n_eigval), (3, n_eigval))
+ alter = np.broadcast_to(alter.sum(axis=1).reshape(1, 1, 3), img.shape)
+ img_adjusted = img + alter
+ return img_adjusted
+
+
+def lut_transform(img, lut_table):
+ """Transform array by look-up table.
+
+ The function lut_transform fills the output array with values from the
+ look-up table. Indices of the entries are taken from the input array.
+
+ Args:
+ img (ndarray): Image to be transformed.
+ lut_table (ndarray): look-up table of 256 elements; in case of
+ multi-channel input array, the table should either have a single
+ channel (in this case the same table is used for all channels) or
+ the same number of channels as in the input array.
+
+ Returns:
+ ndarray: The transformed image.
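+
+    Example:
+        >>> # Illustrative usage sketch: a reversed identity LUT inverts the
+        >>> # image, matching ``iminvert`` for uint8 inputs.
+        >>> import numpy as np
+        >>> img = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)
+        >>> lut = np.arange(255, -1, -1, dtype=np.uint8)
+        >>> np.array_equal(lut_transform(img, lut), 255 - img)
+        True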
+ """
+ assert isinstance(img, np.ndarray)
+ assert 0 <= np.min(img) and np.max(img) <= 255
+ assert isinstance(lut_table, np.ndarray)
+ assert lut_table.shape == (256, )
+
+ return cv2.LUT(np.array(img, dtype=np.uint8), lut_table)
+
+
+def clahe(img, clip_limit=40.0, tile_grid_size=(8, 8)):
+ """Use CLAHE method to process the image.
+
+ See `ZUIDERVELD,K. Contrast Limited Adaptive Histogram Equalization[J].
+ Graphics Gems, 1994:474-485.` for more information.
+
+ Args:
+ img (ndarray): Image to be processed.
+ clip_limit (float): Threshold for contrast limiting. Default: 40.0.
+ tile_grid_size (tuple[int]): Size of grid for histogram equalization.
+ Input image will be divided into equally sized rectangular tiles.
+ It defines the number of tiles in row and column. Default: (8, 8).
+
+ Returns:
+ ndarray: The processed image.
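+
+    Example:
+        >>> # Illustrative usage sketch: CLAHE expects a single-channel
+        >>> # (grayscale) image and returns an array of the same shape.
+        >>> import numpy as np
+        >>> gray = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
+        >>> clahe(gray).shape
+        (64, 64)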
+ """
+ assert isinstance(img, np.ndarray)
+ assert img.ndim == 2
+ assert isinstance(clip_limit, (float, int))
+ assert is_tuple_of(tile_grid_size, int)
+ assert len(tile_grid_size) == 2
+
+ clahe = cv2.createCLAHE(clip_limit, tile_grid_size)
+ return clahe.apply(np.array(img, dtype=np.uint8))
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/deprecated.json b/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/deprecated.json
new file mode 100644
index 0000000000000000000000000000000000000000..25cf6f28caecc22a77e3136fefa6b8dfc0e6cb5b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/deprecated.json
@@ -0,0 +1,6 @@
+{
+ "resnet50_caffe": "detectron/resnet50_caffe",
+ "resnet50_caffe_bgr": "detectron2/resnet50_caffe_bgr",
+ "resnet101_caffe": "detectron/resnet101_caffe",
+ "resnet101_caffe_bgr": "detectron2/resnet101_caffe_bgr"
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/mmcls.json b/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/mmcls.json
new file mode 100644
index 0000000000000000000000000000000000000000..51a2a071985cd4cb94c20850478475e3c2917709
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/mmcls.json
@@ -0,0 +1,31 @@
+{
+ "vgg11": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg11_batch256_imagenet_20210208-4271cd6c.pth",
+ "vgg13": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg13_batch256_imagenet_20210208-4d1d6080.pth",
+ "vgg16": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg16_batch256_imagenet_20210208-db26f1a5.pth",
+ "vgg19": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg19_bn_batch256_imagenet_20210208-da620c4f.pth",
+ "vgg11_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg11_bn_batch256_imagenet_20210207-f244902c.pth",
+ "vgg13_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg13_bn_batch256_imagenet_20210207-1a8b7864.pth",
+ "vgg16_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg16_bn_batch256_imagenet_20210208-7e55cd29.pth",
+ "vgg19_bn": "https://download.openmmlab.com/mmclassification/v0/vgg/vgg19_bn_batch256_imagenet_20210208-da620c4f.pth",
+ "resnet18": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet18_batch256_imagenet_20200708-34ab8f90.pth",
+ "resnet34": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet34_batch256_imagenet_20200708-32ffb4f7.pth",
+ "resnet50": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth",
+ "resnet101": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet101_batch256_imagenet_20200708-753f3608.pth",
+ "resnet152": "https://download.openmmlab.com/mmclassification/v0/resnet/resnet152_batch256_imagenet_20200708-ec25b1f9.pth",
+ "resnet50_v1d": "https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d50_batch256_imagenet_20200708-1ad0ce94.pth",
+ "resnet101_v1d": "https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d101_batch256_imagenet_20200708-9cb302ef.pth",
+ "resnet152_v1d": "https://download.openmmlab.com/mmclassification/v0/resnet/resnetv1d152_batch256_imagenet_20200708-e79cb6a2.pth",
+ "resnext50_32x4d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext50_32x4d_b32x8_imagenet_20210429-56066e27.pth",
+ "resnext101_32x4d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext101_32x4d_b32x8_imagenet_20210506-e0fa3dd5.pth",
+ "resnext101_32x8d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext101_32x8d_b32x8_imagenet_20210506-23a247d5.pth",
+ "resnext152_32x4d": "https://download.openmmlab.com/mmclassification/v0/resnext/resnext152_32x4d_b32x8_imagenet_20210524-927787be.pth",
+ "se-resnet50": "https://download.openmmlab.com/mmclassification/v0/se-resnet/se-resnet50_batch256_imagenet_20200804-ae206104.pth",
+ "se-resnet101": "https://download.openmmlab.com/mmclassification/v0/se-resnet/se-resnet101_batch256_imagenet_20200804-ba5b51d4.pth",
+ "resnest50": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest50_imagenet_converted-1ebf0afe.pth",
+ "resnest101": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest101_imagenet_converted-032caa52.pth",
+ "resnest200": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest200_imagenet_converted-581a60f2.pth",
+ "resnest269": "https://download.openmmlab.com/mmclassification/v0/resnest/resnest269_imagenet_converted-59930960.pth",
+ "shufflenet_v1": "https://download.openmmlab.com/mmclassification/v0/shufflenet_v1/shufflenet_v1_batch1024_imagenet_20200804-5d6cec73.pth",
+ "shufflenet_v2": "https://download.openmmlab.com/mmclassification/v0/shufflenet_v2/shufflenet_v2_batch1024_imagenet_20200812-5bf4721e.pth",
+ "mobilenet_v2": "https://download.openmmlab.com/mmclassification/v0/mobilenet_v2/mobilenet_v2_batch256_imagenet_20200708-3b2dc3af.pth"
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/open_mmlab.json b/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/open_mmlab.json
new file mode 100644
index 0000000000000000000000000000000000000000..44c24f6bfecd8d8a18d55015b2099049c43c0732
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/model_zoo/open_mmlab.json
@@ -0,0 +1,49 @@
+{
+ "vgg16_caffe": "https://download.openmmlab.com/pretrain/third_party/vgg16_caffe-292e1171.pth",
+ "detectron/resnet50_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet50_caffe-788b5fa3.pth",
+ "detectron2/resnet50_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet50_msra-5891d200.pth",
+ "detectron/resnet101_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet101_caffe-3ad79236.pth",
+ "detectron2/resnet101_caffe": "https://download.openmmlab.com/pretrain/third_party/resnet101_msra-6cc46731.pth",
+ "detectron2/resnext101_32x8d": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x8d-1516f1aa.pth",
+ "resnext50_32x4d": "https://download.openmmlab.com/pretrain/third_party/resnext50-32x4d-0ab1a123.pth",
+ "resnext101_32x4d": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x4d-a5af3160.pth",
+ "resnext101_64x4d": "https://download.openmmlab.com/pretrain/third_party/resnext101_64x4d-ee2c6f71.pth",
+ "contrib/resnet50_gn": "https://download.openmmlab.com/pretrain/third_party/resnet50_gn_thangvubk-ad1730dd.pth",
+ "detectron/resnet50_gn": "https://download.openmmlab.com/pretrain/third_party/resnet50_gn-9186a21c.pth",
+ "detectron/resnet101_gn": "https://download.openmmlab.com/pretrain/third_party/resnet101_gn-cac0ab98.pth",
+ "jhu/resnet50_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnet50_gn_ws-15beedd8.pth",
+ "jhu/resnet101_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnet101_gn_ws-3e3c308c.pth",
+ "jhu/resnext50_32x4d_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnext50_32x4d_gn_ws-0d87ac85.pth",
+ "jhu/resnext101_32x4d_gn_ws": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x4d_gn_ws-34ac1a9e.pth",
+ "jhu/resnext50_32x4d_gn": "https://download.openmmlab.com/pretrain/third_party/resnext50_32x4d_gn-c7e8b754.pth",
+ "jhu/resnext101_32x4d_gn": "https://download.openmmlab.com/pretrain/third_party/resnext101_32x4d_gn-ac3bb84e.pth",
+ "msra/hrnetv2_w18_small": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w18_small-b5a04e21.pth",
+ "msra/hrnetv2_w18": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w18-00eb2006.pth",
+ "msra/hrnetv2_w32": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w32-dc9eeb4f.pth",
+ "msra/hrnetv2_w40": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w40-ed0b031c.pth",
+ "msra/hrnetv2_w48": "https://download.openmmlab.com/pretrain/third_party/hrnetv2_w48-d2186c55.pth",
+ "bninception_caffe": "https://download.openmmlab.com/pretrain/third_party/bn_inception_caffe-ed2e8665.pth",
+ "kin400/i3d_r50_f32s2_k400": "https://download.openmmlab.com/pretrain/third_party/i3d_r50_f32s2_k400-2c57e077.pth",
+ "kin400/nl3d_r50_f32s2_k400": "https://download.openmmlab.com/pretrain/third_party/nl3d_r50_f32s2_k400-fa7e7caa.pth",
+ "res2net101_v1d_26w_4s": "https://download.openmmlab.com/pretrain/third_party/res2net101_v1d_26w_4s_mmdetv2-f0a600f9.pth",
+ "regnetx_400mf": "https://download.openmmlab.com/pretrain/third_party/regnetx_400mf-a5b10d96.pth",
+ "regnetx_800mf": "https://download.openmmlab.com/pretrain/third_party/regnetx_800mf-1f4be4c7.pth",
+ "regnetx_1.6gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_1.6gf-5791c176.pth",
+ "regnetx_3.2gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_3.2gf-c2599b0f.pth",
+ "regnetx_4.0gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_4.0gf-a88f671e.pth",
+ "regnetx_6.4gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_6.4gf-006af45d.pth",
+ "regnetx_8.0gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_8.0gf-3c68abe7.pth",
+ "regnetx_12gf": "https://download.openmmlab.com/pretrain/third_party/regnetx_12gf-4c2a3350.pth",
+ "resnet18_v1c": "https://download.openmmlab.com/pretrain/third_party/resnet18_v1c-b5776b93.pth",
+ "resnet50_v1c": "https://download.openmmlab.com/pretrain/third_party/resnet50_v1c-2cccc1ad.pth",
+ "resnet101_v1c": "https://download.openmmlab.com/pretrain/third_party/resnet101_v1c-e67eebb6.pth",
+ "mmedit/vgg16": "https://download.openmmlab.com/mmediting/third_party/vgg_state_dict.pth",
+ "mmedit/res34_en_nomixup": "https://download.openmmlab.com/mmediting/third_party/model_best_resnet34_En_nomixup.pth",
+ "mmedit/mobilenet_v2": "https://download.openmmlab.com/mmediting/third_party/mobilenet_v2.pth",
+ "contrib/mobilenet_v3_large": "https://download.openmmlab.com/pretrain/third_party/mobilenet_v3_large-bc2c3fd3.pth",
+ "contrib/mobilenet_v3_small": "https://download.openmmlab.com/pretrain/third_party/mobilenet_v3_small-47085aa1.pth",
+ "resnest50": "https://download.openmmlab.com/pretrain/third_party/resnest50_d2-7497a55b.pth",
+ "resnest101": "https://download.openmmlab.com/pretrain/third_party/resnest101_d2-f3b931b2.pth",
+ "resnest200": "https://download.openmmlab.com/pretrain/third_party/resnest200_d2-ca88e41f.pth",
+ "darknet53": "https://download.openmmlab.com/pretrain/third_party/darknet53-a628ea1b.pth"
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..12c57c07a9a12d8df1a90cb59fab189a837ec742
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/__init__.py
@@ -0,0 +1,4 @@
+from .info import is_custom_op_loaded
+from .symbolic import register_extra_symbolics
+
+__all__ = ['register_extra_symbolics', 'is_custom_op_loaded']
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/info.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/info.py
new file mode 100644
index 0000000000000000000000000000000000000000..6c8ba391df5ff69b9b4a5278a5b84527f75ba2cf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/info.py
@@ -0,0 +1,18 @@
+import os
+
+
+def is_custom_op_loaded():
+ flag = False
+ try:
+ from ..tensorrt import is_tensorrt_plugin_loaded
+ flag = is_tensorrt_plugin_loaded()
+ except (ImportError, ModuleNotFoundError):
+ pass
+ if not flag:
+ try:
+ from ..ops import get_onnxruntime_op_path
+ ort_lib_path = get_onnxruntime_op_path()
+ flag = os.path.exists(ort_lib_path)
+ except (ImportError, ModuleNotFoundError):
+ pass
+ return flag
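
A quick usage note: `is_custom_op_loaded()` only reports whether either the TensorRT plugin or the ONNX Runtime custom-op library can be located; a minimal sketch of how a caller might use it (assuming mmcv is installed with its compiled extensions) is shown below.

```python
# Hedged sketch: gate behaviour on the availability of mmcv custom ONNX ops.
from mmcv.onnx import is_custom_op_loaded

if is_custom_op_loaded():
    print('mmcv custom ONNX ops are available (TensorRT plugin or ONNX Runtime library).')
else:
    print('No custom op library found; export will rely on standard ONNX ops only.')
```
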
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/onnx_utils/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/onnx_utils/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/onnx_utils/symbolic_helper.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/onnx_utils/symbolic_helper.py
new file mode 100644
index 0000000000000000000000000000000000000000..032d4b1b059c9ffc5d0592714b49759d5a4f4c57
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/onnx_utils/symbolic_helper.py
@@ -0,0 +1,330 @@
+"""Modified from https://github.com/pytorch/pytorch."""
+import warnings
+from functools import wraps
+from sys import maxsize
+
+import torch
+import torch.onnx
+# This import monkey-patches graph manipulation methods on Graph, used for the
+# ONNX symbolics
+import torch.onnx.utils
+from torch._C import ListType
+
+# ---------------------------------------------------------------------------------
+# Helper functions
+# ---------------------------------------------------------------------------------
+
+# Save some builtins as locals, because we'll shadow them below
+_sum = sum
+
+
+def _parse_arg(value, desc):
+ if desc == 'none':
+ return value
+ if desc == 'v' or not _is_value(value):
+ return value
+ if value.node().mustBeNone():
+ return None
+ if value.node().kind() == 'onnx::Constant':
+ tval = value.node()['value']
+ if desc == 'i':
+ return int(tval)
+ elif desc == 'f':
+ return float(tval)
+ elif desc == 'b':
+ return bool(tval)
+ elif desc == 's':
+ return str(tval)
+ elif desc == 't':
+ return tval
+ elif desc == 'is':
+ return [int(v) for v in tval]
+ elif desc == 'fs':
+ return [float(v) for v in tval]
+ else:
+ raise RuntimeError(
+                "ONNX symbolic doesn't know how to interpret Constant node")
+ elif value.node().kind() == 'prim::ListConstruct':
+ if desc == 'is':
+ for v in value.node().inputs():
+ if v.node().kind() != 'onnx::Constant':
+ raise RuntimeError(
+ "Failed to export an ONNX attribute '" +
+ v.node().kind() +
+ "', since it's not constant, please try to make "
+ 'things (e.g., kernel size) static if possible')
+ return [int(v.node()['value']) for v in value.node().inputs()]
+ else:
+ raise RuntimeError(
+                "ONNX symbolic doesn't know how to interpret ListConstruct node")
+
+ raise RuntimeError('Unexpected node type: {}'.format(value.node().kind()))
+
+
+def _maybe_get_const(value, desc):
+ if _is_value(value) and value.node().kind() == 'onnx::Constant':
+ return _parse_arg(value, desc)
+ return value
+
+
+def _maybe_get_scalar(value):
+ value_t = _maybe_get_const(value, 't')
+ if isinstance(value_t, torch.Tensor) and value_t.shape == ():
+ return value_t
+ return value
+
+
+def _get_const(value, desc, arg_name):
+ if _is_value(value) and value.node().kind() not in ('onnx::Constant',
+ 'prim::Constant'):
+ raise RuntimeError('ONNX symbolic expected a constant'
+ ' value of the {} argument, got `{}`'.format(
+ arg_name, value))
+ return _parse_arg(value, desc)
+
+
+def _unpack_list(list_value):
+ list_node = list_value.node()
+ assert list_node.kind() == 'prim::ListConstruct'
+ return list(list_node.inputs())
+
+
+# Check if list_value is output from prim::ListConstruct
+# This is usually called before _unpack_list to ensure the list can be
+# unpacked.
+def _is_packed_list(list_value):
+ return _is_value(
+ list_value) and list_value.node().kind() == 'prim::ListConstruct'
+
+
+def parse_args(*arg_descriptors):
+
+ def decorator(fn):
+ fn._arg_descriptors = arg_descriptors
+
+ def wrapper(g, *args):
+ # some args may be optional, so the length may be smaller
+ assert len(arg_descriptors) >= len(args)
+ args = [
+ _parse_arg(arg, arg_desc)
+ for arg, arg_desc in zip(args, arg_descriptors)
+ ]
+ return fn(g, *args)
+
+ # In Python 2 functools.wraps chokes on partially applied functions, so
+ # we need this as a workaround
+ try:
+ wrapper = wraps(fn)(wrapper)
+ except Exception:
+ pass
+ return wrapper
+
+ return decorator
+
+
+def _scalar(x):
+ """Convert a scalar tensor into a Python value."""
+ assert x.numel() == 1
+ return x.item()
+
+
+def _if_scalar_type_as(g, self, tensor):
+ """Convert self into the same type of tensor, as necessary."""
+ if isinstance(self, torch._C.Value):
+ return self
+
+ scalar_type = tensor.type().scalarType()
+ if scalar_type:
+ ty = scalar_type.lower()
+ return getattr(self, ty)()
+
+ return self
+
+
+def _is_none(x):
+ return x.node().mustBeNone()
+
+
+def _is_value(x):
+ return isinstance(x, torch._C.Value)
+
+
+def _is_tensor_list(x):
+ return x.type().isSubtypeOf(ListType.ofTensors())
+
+
+def _unimplemented(op, msg):
+ warnings.warn('ONNX export failed on ' + op + ' because ' + msg +
+ ' not supported')
+
+
+def _try_get_scalar_type(*args):
+ for arg in args:
+ try:
+ return arg.type().scalarType()
+ except RuntimeError:
+ pass
+ return None
+
+
+def _topk_helper(g, input, k, dim, largest=True, sorted=False, out=None):
+ if out is not None:
+ _unimplemented('TopK', 'Out parameter is not supported')
+ if not _is_value(k):
+ k = g.op('Constant', value_t=torch.tensor([k], dtype=torch.int64))
+ else:
+ k = g.op('Reshape', k, g.op('Constant', value_t=torch.tensor([1])))
+ return g.op(
+ 'TopK',
+ input,
+ k,
+ axis_i=dim,
+ largest_i=largest,
+ sorted_i=sorted,
+ outputs=2)
+
+
+def _slice_helper(g,
+ input,
+ axes,
+ starts,
+ ends,
+ steps=None,
+ dynamic_slice=False):
+ # TODO(ruobing): add support for opset<10
+ from torch.onnx.symbolic_opset10 import _slice
+ return _slice(g, input, axes, starts, ends, steps, dynamic_slice)
+
+
+def _unsqueeze_helper(g, input, dim):
+ from torch.onnx.symbolic_opset9 import unsqueeze
+ return unsqueeze(g, input, dim)
+
+
+def _interpolate_size_to_scales(g, input, output_size, dim):
+ output_size = _maybe_get_const(output_size, 'is')
+ if _is_value(output_size):
+ offset = 2
+ offsets = g.op(
+ 'Constant', value_t=torch.ones(offset, dtype=torch.float32))
+ dividend = g.op(
+ 'Cast', output_size, to_i=cast_pytorch_to_onnx['Float'])
+ divisor = _slice_helper(
+ g, g.op('Shape', input), axes=[0], ends=[maxsize], starts=[offset])
+ divisor = g.op('Cast', divisor, to_i=cast_pytorch_to_onnx['Float'])
+ scale_dims = g.op('Div', dividend, divisor)
+ scales = g.op('Concat', offsets, scale_dims, axis_i=0)
+ else:
+ scales_constant = [
+ 1. if i < 2 else float(output_size[-(dim - i)]) /
+ float(input.type().sizes()[-(dim - i)]) for i in range(0, dim)
+ ]
+ scales = g.op(
+ 'Constant',
+ value_t=torch.tensor(scales_constant, dtype=torch.float32))
+ return scales
+
+
+def _interpolate_get_scales_if_available(g, scales):
+ if len(scales) == 0:
+ return None
+ # scales[0] is NoneType in Pytorch == 1.5.1
+ # scales[0] is TensorType with sizes = [] in Pytorch == 1.6.0
+ # scales[0] is ListType in Pytorch == 1.7.0
+ # scales[0] is TensorType with sizes = [2] in Pytorch == 1.8.0
+ scale_desc = 'fs' if scales[0].type().kind() == 'ListType' or (
+ scales[0].type().kind() == 'TensorType' and
+ (sum(scales[0].type().sizes()) > 1)) else 'f'
+ available_scales = _maybe_get_const(
+ scales[0], scale_desc) != -1 and not _is_none(scales[0])
+
+ if not available_scales:
+ return None
+
+ offsets = g.op('Constant', value_t=torch.ones(2, dtype=torch.float32))
+ if scale_desc == 'fs':
+ scales_list = g.op(
+ 'Constant',
+ value_t=torch.tensor(_maybe_get_const(scales[0], scale_desc)))
+ # modify to support PyTorch==1.7.0
+ # https://github.com/pytorch/pytorch/blob/75ee5756715e7161314ce037474843b68f69fc04/torch/onnx/symbolic_helper.py#L375 # noqa: E501
+ scales = g.op('Concat', offsets, scales_list, axis_i=0)
+ else:
+ # for PyTorch < 1.7.0
+ scales_list = []
+ for scale in scales:
+ unsqueezed_scale = _unsqueeze_helper(g, scale, 0)
+ # ONNX only supports float for the scales. double -> float.
+ unsqueezed_scale = g.op(
+ 'Cast', unsqueezed_scale, to_i=cast_pytorch_to_onnx['Float'])
+ scales_list.append(unsqueezed_scale)
+ scales = g.op('Concat', offsets, *scales_list, axis_i=0)
+ return scales
+
+
+def _get_interpolate_attributes(g, mode, args):
+ if mode == 'nearest':
+ align_corners = None
+ scales = args[0:]
+ else:
+ align_corners = args[0]
+ scales = args[1:]
+ scales = _interpolate_get_scales_if_available(g, scales)
+ return scales, align_corners
+
+
+def _interpolate_get_scales(g, scale_factor, dim):
+ offsets = g.op('Constant', value_t=torch.ones(2, dtype=torch.float32))
+ if isinstance(scale_factor.type(), torch._C.ListType):
+ return g.op('Concat', offsets, scale_factor, axis_i=0)
+ else:
+ scale_factor = _unsqueeze_helper(g, scale_factor, 0)
+ scale_factor = g.op(
+ 'Cast', scale_factor, to_i=cast_pytorch_to_onnx['Float'])
+ scales = [scale_factor for i in range(dim - 2)]
+ scale_factor = g.op('Concat', offsets, *scales, axis_i=0)
+ return scale_factor
+
+
+def _size_helper(g, self, dim):
+ full_shape = g.op('Shape', self)
+ from torch.onnx.symbolic_opset9 import select
+ return select(g, full_shape, g.op('Constant', value_t=torch.tensor([0])),
+ dim)
+
+
+def _avgpool_helper(tuple_fn, padding, kernel_size, stride, divisor_override,
+ name):
+ if divisor_override and divisor_override.node().kind() != 'prim::Constant':
+ return _unimplemented(name, 'divisor_override')
+ if not stride:
+ stride = kernel_size
+ padding = tuple(tuple_fn(padding))
+ return padding
+
+
+# Metaprogram symbolics for each ATen native specialized cast operator.
+# For e.g. we specify a function named `_cast_uint8_t` that instantiates an
+# ONNX cast node with `to` attribute 'UINT8'
+#
+# TODO: remove these once we support Type's in the JIT IR and we can once again
+# use the unified toType operator
+cast_pytorch_to_onnx = {
+ 'Byte': torch.onnx.TensorProtoDataType.UINT8,
+ 'Char': torch.onnx.TensorProtoDataType.INT8,
+ 'Double': torch.onnx.TensorProtoDataType.DOUBLE,
+ 'Float': torch.onnx.TensorProtoDataType.FLOAT,
+ 'Half': torch.onnx.TensorProtoDataType.FLOAT16,
+ 'Int': torch.onnx.TensorProtoDataType.INT32,
+ 'Long': torch.onnx.TensorProtoDataType.INT64,
+ 'Short': torch.onnx.TensorProtoDataType.INT16,
+ 'Bool': torch.onnx.TensorProtoDataType.BOOL,
+ 'ComplexFloat': torch.onnx.TensorProtoDataType.COMPLEX64,
+ 'ComplexDouble': torch.onnx.TensorProtoDataType.COMPLEX128,
+ 'Undefined': torch.onnx.TensorProtoDataType.UNDEFINED,
+}
+
+# Global set to store the list of quantized operators in the network.
+# This is currently only used in the conversion of quantized ops from PT
+# -> C2 via ONNX.
+_quantized_ops = set()
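
For orientation, the sketch below shows how the helpers above are typically consumed: a toy symbolic function using the same `parse_args` descriptor pattern (`'v'` keeps a graph value, `'f'` parses a constant to a Python float). The function name is hypothetical and not part of this patch.

```python
# Hedged sketch of the parse_args pattern used throughout this module.
import torch
from torch.onnx.symbolic_helper import parse_args


@parse_args('v', 'f')  # input stays a graph Value, alpha is parsed to float
def add_scalar(g, input, alpha):
    # Emit a Constant node for the parsed scalar and add it to the input.
    alpha_const = g.op(
        'Constant', value_t=torch.tensor(alpha, dtype=torch.float32))
    return g.op('Add', input, alpha_const)
```
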
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/symbolic.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/symbolic.py
new file mode 100644
index 0000000000000000000000000000000000000000..1990e3c24822db0397755aa065f3b0926f90ec0c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/onnx/symbolic.py
@@ -0,0 +1,435 @@
+"""Modified from https://github.com/pytorch/pytorch."""
+import os
+
+import numpy as np
+import torch
+from torch.nn.modules.utils import _pair, _single, _triple
+from torch.onnx.symbolic_helper import parse_args
+from torch.onnx.symbolic_registry import register_op
+
+from .onnx_utils import symbolic_helper as sym_help
+
+
+def _interpolate(name, dim, interpolate_mode):
+
+ def symbolic_fn(g, input, output_size, *args):
+ scales, align_corners = sym_help._get_interpolate_attributes(
+ g, interpolate_mode, args)
+ align_corners = sym_help._maybe_get_scalar(align_corners)
+ transformation_mode = 'asymmetric' \
+ if interpolate_mode == 'nearest' \
+ else 'align_corners' if align_corners else 'pytorch_half_pixel'
+ empty_tensor = g.op(
+ 'Constant', value_t=torch.tensor([], dtype=torch.float32))
+
+ if scales is None:
+ if 'ONNX_BACKEND' in os.environ and os.environ[
+ 'ONNX_BACKEND'] == 'TensorRT':
+ input_size = input.type().sizes()
+ # slice the first two dim
+ input_size = input_size[:2]
+ # convert output_size to int type
+ output_size = sym_help._maybe_get_const(output_size, 'is')
+ input_size.extend(output_size)
+ output_size = g.op(
+ 'Constant',
+ value_t=torch.tensor(input_size, dtype=torch.int64))
+ else:
+ input_size = g.op('Shape', input)
+ input_size_beg = sym_help._slice_helper(
+ g, input_size, axes=[0], ends=[2], starts=[0])
+ output_size = g.op(
+ 'Cast',
+ output_size,
+ to_i=sym_help.cast_pytorch_to_onnx['Long'])
+ output_size = g.op(
+ 'Concat', input_size_beg, output_size, axis_i=0)
+ scales = g.op(
+ 'Constant', value_t=torch.tensor([], dtype=torch.float32))
+ return g.op(
+ 'Resize',
+ input,
+ empty_tensor,
+                # roi only takes effect with
+ # coordinate_transformation_mode="tf_crop_and_resize"
+ scales, # scales is not needed since we are sending out_size
+ output_size,
+ coordinate_transformation_mode_s=transformation_mode,
+ cubic_coeff_a_f=-0.75, # only valid when mode="cubic"
+ mode_s=interpolate_mode, # nearest, linear, or cubic
+ nearest_mode_s='floor') # only valid when mode="nearest"
+ else:
+ return g.op(
+ 'Resize',
+ input,
+ empty_tensor,
+ # roi only takes effect with
+ # coordinate_transformation_mode="tf_crop_and_resize"
+ scales, # scales is not needed since we are sending out_size
+ coordinate_transformation_mode_s=transformation_mode,
+ cubic_coeff_a_f=-0.75, # only valid when mode="cubic"
+ mode_s=interpolate_mode, # nearest, linear, or cubic
+ nearest_mode_s='floor') # only valid when mode="nearest"
+
+ return symbolic_fn
+
+
+upsample_nearest1d = _interpolate('upsample_nearest1d', 3, 'nearest')
+upsample_nearest2d = _interpolate('upsample_nearest2d', 4, 'nearest')
+upsample_nearest3d = _interpolate('upsample_nearest3d', 5, 'nearest')
+upsample_linear1d = _interpolate('upsample_linear1d', 3, 'linear')
+upsample_bilinear2d = _interpolate('upsample_bilinear2d', 4, 'linear')
+upsample_trilinear3d = _interpolate('upsample_trilinear3d', 5, 'linear')
+upsample_bicubic2d = _interpolate('upsample_bicubic2d', 4, 'cubic')
+
+
+@parse_args('v', 'v', 'i', 'i', 'i', 'none')
+def topk(g, self, k, dim, largest, sorted, out=None):
+ return sym_help._topk_helper(
+ g, self, k, dim, largest=largest, sorted=sorted, out=out)
+
+
+def masked_select(g, self, mask):
+ from torch.onnx.symbolic_opset9 import nonzero, expand_as
+ index = nonzero(g, expand_as(g, mask, self))
+ return g.op('GatherND', self, index)
+
+
+def _prepare_onnx_paddings(g, dim, pad):
+ pad_len = torch.onnx.symbolic_opset9.size(
+ g, pad, g.op('Constant', value_t=torch.tensor([0])))
+ # Set extension = [0] * (dim * 2 - len(pad))
+ extension = g.op(
+ 'Sub',
+ g.op('Mul',
+ g.op('Constant', value_t=torch.tensor(dim, dtype=torch.int64)),
+ g.op('Constant', value_t=torch.tensor(2, dtype=torch.int64))),
+ pad_len)
+ pad = g.op('Cast', pad, to_i=sym_help.cast_pytorch_to_onnx['Long'])
+ paddings = g.op(
+ 'Concat',
+ pad,
+ g.op(
+ 'ConstantOfShape',
+ extension,
+ value_t=torch.tensor([0], dtype=torch.int64)),
+ axis_i=0)
+ paddings = g.op('Reshape', paddings,
+ g.op('Constant', value_t=torch.tensor([-1, 2])))
+ paddings = g.op(
+ 'Transpose',
+ torch.onnx.symbolic_opset10.flip(g, paddings, [0]),
+ perm_i=[1, 0])
+ paddings = g.op('Reshape', paddings,
+ g.op('Constant', value_t=torch.tensor([-1])))
+ padding_c = g.op(
+ 'Cast', paddings, to_i=sym_help.cast_pytorch_to_onnx['Long'])
+ return padding_c
+
+
+def constant_pad_nd(g, input, padding, value=None):
+ mode = 'constant'
+ value = sym_help._maybe_get_scalar(value)
+ value = sym_help._if_scalar_type_as(g, value, input)
+ pad = _prepare_onnx_paddings(g, input.type().dim(), padding)
+ return g.op('Pad', input, pad, value, mode_s=mode)
+
+
+def reflection_pad(g, input, padding):
+ mode = 'reflect'
+ paddings = _prepare_onnx_paddings(g, input.type().dim(), padding)
+ return g.op('Pad', input, paddings, mode_s=mode)
+
+
+reflection_pad1d = reflection_pad
+reflection_pad2d = reflection_pad
+reflection_pad3d = reflection_pad
+
+
+def _avg_pool(name, tuple_fn):
+
+ @parse_args('v', 'is', 'is', 'is', 'i', 'i', 'none')
+ def symbolic_fn(g,
+ input,
+ kernel_size,
+ stride,
+ padding,
+ ceil_mode,
+ count_include_pad,
+ divisor_override=None):
+ padding = sym_help._avgpool_helper(tuple_fn, padding, kernel_size,
+ stride, divisor_override, name)
+ if not stride:
+ stride = kernel_size
+ if count_include_pad:
+ input = g.op(
+ 'Pad',
+ input,
+ g.op(
+ 'Constant',
+ value_t=torch.tensor(((0, ) * 2 + padding) * 2)),
+ mode_s='constant')
+ padding = (0, ) * len(padding)
+ output = g.op(
+ 'AveragePool',
+ input,
+ kernel_shape_i=tuple_fn(kernel_size),
+ strides_i=tuple_fn(stride),
+ pads_i=padding * 2,
+ ceil_mode_i=ceil_mode)
+ return output
+
+ return symbolic_fn
+
+
+avg_pool1d = _avg_pool('avg_pool1d', _single)
+avg_pool2d = _avg_pool('avg_pool2d', _pair)
+avg_pool3d = _avg_pool('avg_pool3d', _triple)
+
+
+def _get_im2col_indices_along_dim(g, input_d, kernel_size_d, dilation_d,
+ padding_d, stride_d):
+ # Input is always 4-D (N, C, H, W)
+ # Calculate indices of sliding blocks along spatial dimension
+ # Slide kernel over input each dim d:
+ # each dimension d ranges from 0 to
+ # input[d]+2xpadding[d]-dilation[d]x(kernel_size[d]-1)
+ # with steps = stride
+
+ blocks_d = g.op('Add', input_d,
+ g.op('Constant', value_t=torch.tensor(padding_d * 2)))
+ blocks_d = g.op(
+ 'Sub', blocks_d,
+ g.op(
+ 'Constant',
+ value_t=torch.tensor(dilation_d * (kernel_size_d - 1))))
+
+ # Stride kernel over input and find starting indices along dim d
+ blocks_d_indices = g.op('Range', g.op('Constant', value_t=torch.tensor(0)),
+ blocks_d,
+ g.op('Constant', value_t=torch.tensor(stride_d)))
+
+ # Apply dilation on kernel and find its indices along dim d
+ kernel_grid = np.arange(0, kernel_size_d * dilation_d, dilation_d)
+ kernel_grid = g.op('Constant', value_t=torch.tensor([kernel_grid]))
+
+    # Broadcast and add kernel starting positions (indices) with
+ # kernel_grid along dim d, to get block indices along dim d
+ blocks_d_indices = g.op(
+ 'Unsqueeze', blocks_d_indices, axes_i=[0]) # Reshape to [1, -1]
+ kernel_mask = g.op('Reshape', kernel_grid,
+ g.op('Constant', value_t=torch.tensor([-1, 1])))
+ block_mask = g.op('Add', blocks_d_indices, kernel_mask)
+
+ return block_mask
+
+
+def _get_im2col_padded_input(g, input, padding_h, padding_w):
+ # Input is always 4-D tensor (N, C, H, W)
+ # Padding tensor has the following format: (padding_h, padding_w)
+ # Reshape the padding to follow ONNX format:
+ # (dim1_begin, dim2_begin,...,dim1_end, dim2_end,...)
+ pad = g.op(
+ 'Constant', value_t=torch.LongTensor([0, 0, padding_h, padding_w] * 2))
+ return g.op('Pad', input, pad)
+
+
+def _get_im2col_output_shape(g, input, kernel_h, kernel_w):
+ batch_dim = size(g, input, g.op('Constant', value_t=torch.tensor(0)))
+ channel_dim = size(g, input, g.op('Constant', value_t=torch.tensor(1)))
+ channel_unfolded = g.op(
+ 'Mul', channel_dim,
+ g.op('Constant', value_t=torch.tensor(kernel_h * kernel_w)))
+
+ return g.op(
+ 'Concat',
+ g.op('Unsqueeze', batch_dim, axes_i=[0]),
+ g.op('Unsqueeze', channel_unfolded, axes_i=[0]),
+ g.op('Constant', value_t=torch.tensor([-1])),
+ axis_i=0)
+
+
+def size(g, self, dim=None):
+ if dim is None:
+ return g.op('Shape', self)
+ return sym_help._size_helper(g, self, dim)
+
+
+@parse_args('v', 'is', 'is', 'is', 'is')
+def im2col(g, input, kernel_size, dilation, padding, stride):
+ # Input is always 4-D tensor (N, C, H, W)
+ # All other args are int[2]
+
+ input_h = size(g, input, g.op('Constant', value_t=torch.tensor(2)))
+ input_w = size(g, input, g.op('Constant', value_t=torch.tensor(3)))
+
+ stride_h, stride_w = stride[0], stride[1]
+ padding_h, padding_w = padding[0], padding[1]
+ dilation_h, dilation_w = dilation[0], dilation[1]
+ kernel_h, kernel_w = kernel_size[0], kernel_size[1]
+
+ blocks_row_indices = _get_im2col_indices_along_dim(g, input_h, kernel_h,
+ dilation_h, padding_h,
+ stride_h)
+ blocks_col_indices = _get_im2col_indices_along_dim(g, input_w, kernel_w,
+ dilation_w, padding_w,
+ stride_w)
+
+ output_shape = _get_im2col_output_shape(g, input, kernel_h, kernel_w)
+ padded_input = _get_im2col_padded_input(g, input, padding_h, padding_w)
+
+ output = g.op('Gather', padded_input, blocks_row_indices, axis_i=2)
+ output = g.op('Gather', output, blocks_col_indices, axis_i=4)
+ output = g.op('Transpose', output, perm_i=[0, 1, 2, 4, 3, 5])
+ return g.op('Reshape', output, output_shape)
+
+
+@parse_args('v', 'i')
+def one_hot(g, self, num_classes):
+ values = g.op('Constant', value_t=torch.LongTensor([0, 1]))
+ depth = g.op('Constant', value_t=torch.LongTensor([num_classes]))
+ return g.op('OneHot', self, depth, values, axis_i=-1)
+
+
+@parse_args('v', 'i', 'none')
+def softmax(g, input, dim, dtype=None):
+ input_dim = input.type().dim()
+ if input_dim:
+ # TODO: remove this as onnx opset 11 spec allows negative axes
+ if dim < 0:
+ dim = input_dim + dim
+ if input_dim == dim + 1:
+ softmax = g.op('Softmax', input, axis_i=dim)
+ if dtype and dtype.node().kind() != 'prim::Constant':
+ parsed_dtype = sym_help._get_const(dtype, 'i', 'dtype')
+ softmax = g.op(
+ 'Cast',
+ softmax,
+ to_i=sym_help.scalar_type_to_onnx[parsed_dtype])
+ return softmax
+
+ max_value = g.op('ReduceMax', input, axes_i=[dim], keepdims_i=1)
+ input = g.op('Sub', input, max_value)
+ exp = g.op('Exp', input)
+ sum = g.op('ReduceSum', exp, axes_i=[dim])
+ softmax = g.op('Div', exp, sum)
+ if dtype and dtype.node().kind() != 'prim::Constant':
+ parsed_dtype = sym_help._get_const(dtype, 'i', 'dtype')
+ softmax = g.op(
+ 'Cast', softmax, to_i=sym_help.scalar_type_to_onnx[parsed_dtype])
+ return softmax
+
+
+def _adaptive_pool(name, type, tuple_fn, fn=None):
+
+ @parse_args('v', 'is')
+ def symbolic_fn(g, input, output_size):
+ if output_size == [1] * len(output_size) and type == 'AveragePool':
+ return g.op('GlobalAveragePool', input)
+ if not input.isCompleteTensor():
+ if output_size == [1] * len(output_size):
+ return g.op('GlobalMaxPool', input), None
+ raise NotImplementedError(
+ '[Adaptive pool]:input size not accessible')
+ dim = input.type().sizes()[2:]
+ if output_size == [1] * len(output_size) and type == 'MaxPool':
+ return g.op('GlobalMaxPool', input), None
+
+ # compute stride = floor(input_size / output_size)
+ s = [int(dim[i] / output_size[i]) for i in range(0, len(dim))]
+
+ # compute kernel_size = input_size - (output_size - 1) * stride
+ k = [dim[i] - (output_size[i] - 1) * s[i] for i in range(0, len(dim))]
+
+ # call max_poolxd_with_indices to get indices in the output
+ if type == 'MaxPool':
+ return fn(g, input, k, k, (0, ) * len(dim), (1, ) * len(dim),
+ False)
+ output = g.op(
+ type,
+ input,
+ kernel_shape_i=tuple_fn(k),
+ strides_i=tuple_fn(s),
+ ceil_mode_i=False)
+ return output
+
+ return symbolic_fn
+
+
+adaptive_avg_pool1d = _adaptive_pool('adaptive_avg_pool1d', 'AveragePool',
+ _single)
+adaptive_avg_pool2d = _adaptive_pool('adaptive_avg_pool2d', 'AveragePool',
+ _pair)
+adaptive_avg_pool3d = _adaptive_pool('adaptive_avg_pool3d', 'AveragePool',
+ _triple)
+
+
+def new_full(g,
+ self,
+ size,
+ fill_value,
+ dtype,
+ layout,
+ device,
+ pin_memory=False):
+ from torch.onnx.symbolic_opset9 import full
+ if dtype is None and self.isCompleteTensor():
+ dtype = self.type().scalarType()
+ dtype = sym_help.scalar_type_to_onnx.index(
+ sym_help.cast_pytorch_to_onnx[dtype])
+ return full(g, size, fill_value, dtype, layout, device, pin_memory)
+
+
+@parse_args('v', 'v', 'i', 'i', 'i')
+def grid_sampler(g,
+ input,
+ grid,
+ interpolation_mode,
+ padding_mode,
+ align_corners=False):
+ return g.op(
+ 'mmcv::grid_sampler',
+ input,
+ grid,
+ interpolation_mode_i=interpolation_mode,
+ padding_mode_i=padding_mode,
+ align_corners_i=align_corners)
+
+
+@parse_args('v', 'i')
+def cummax(g, input, dim):
+ return g.op('mmcv::cummax', input, dim_i=dim, outputs=2)
+
+
+@parse_args('v', 'i')
+def cummin(g, input, dim):
+ return g.op('mmcv::cummin', input, dim_i=dim, outputs=2)
+
+
+def register_extra_symbolics(opset=11):
+ register_op('one_hot', one_hot, '', opset)
+ register_op('im2col', im2col, '', opset)
+ register_op('topk', topk, '', opset)
+ register_op('softmax', softmax, '', opset)
+ register_op('constant_pad_nd', constant_pad_nd, '', opset)
+ register_op('reflection_pad1d', reflection_pad1d, '', opset)
+ register_op('reflection_pad2d', reflection_pad2d, '', opset)
+ register_op('reflection_pad3d', reflection_pad3d, '', opset)
+ register_op('avg_pool1d', avg_pool1d, '', opset)
+ register_op('avg_pool2d', avg_pool2d, '', opset)
+ register_op('avg_pool3d', avg_pool3d, '', opset)
+ register_op('adaptive_avg_pool1d', adaptive_avg_pool1d, '', opset)
+ register_op('adaptive_avg_pool2d', adaptive_avg_pool2d, '', opset)
+ register_op('adaptive_avg_pool3d', adaptive_avg_pool3d, '', opset)
+ register_op('masked_select', masked_select, '', opset)
+ register_op('upsample_nearest1d', upsample_nearest1d, '', opset)
+ register_op('upsample_nearest2d', upsample_nearest2d, '', opset)
+ register_op('upsample_nearest3d', upsample_nearest3d, '', opset)
+ register_op('upsample_linear1d', upsample_linear1d, '', opset)
+ register_op('upsample_bilinear2d', upsample_bilinear2d, '', opset)
+ register_op('upsample_trilinear3d', upsample_trilinear3d, '', opset)
+ register_op('upsample_bicubic2d', upsample_bicubic2d, '', opset)
+ register_op('new_full', new_full, '', opset)
+ register_op('grid_sampler', grid_sampler, '', opset)
+ register_op('cummax', cummax, '', opset)
+ register_op('cummin', cummin, '', opset)
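
A minimal export sketch using the registration entry point above; the model and output file name are illustrative only.

```python
# Hedged sketch: register the extra symbolics before exporting a model whose
# forward pass hits one of the overridden ops (here, nearest upsampling).
import torch
from mmcv.onnx import register_extra_symbolics

model = torch.nn.Upsample(scale_factor=2, mode='nearest')
dummy_input = torch.randn(1, 3, 32, 32)

register_extra_symbolics(opset=11)
torch.onnx.export(model, dummy_input, 'upsample.onnx', opset_version=11)
```
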
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/.roi_align.py.swo b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/.roi_align.py.swo
new file mode 100644
index 0000000000000000000000000000000000000000..ee2b98ec6063ba0d5b7d469c2fbf069a93b2fae7
Binary files /dev/null and b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/.roi_align.py.swo differ
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/.roi_align.py.swp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/.roi_align.py.swp
new file mode 100644
index 0000000000000000000000000000000000000000..34fe7ff3a0d289bde81f4c127f336f44bf6195ba
Binary files /dev/null and b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/.roi_align.py.swp differ
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/__init__.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/__init__.py
new file mode 100644
index 0000000000000000000000000000000000000000..ac9987b160c5bec46556d1b25789832c5d7ea4b5
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/__init__.py
@@ -0,0 +1,55 @@
+from .bbox import bbox_overlaps
+from .border_align import BorderAlign, border_align
+from .box_iou_rotated import box_iou_rotated
+from .carafe import CARAFE, CARAFENaive, CARAFEPack, carafe, carafe_naive
+from .cc_attention import CrissCrossAttention
+from .contour_expand import contour_expand
+from .corner_pool import CornerPool
+from .deform_conv import DeformConv2d, DeformConv2dPack, deform_conv2d
+from .deform_roi_pool import (DeformRoIPool, DeformRoIPoolPack,
+ ModulatedDeformRoIPoolPack, deform_roi_pool)
+from .deprecated_wrappers import Conv2d_deprecated as Conv2d
+from .deprecated_wrappers import ConvTranspose2d_deprecated as ConvTranspose2d
+from .deprecated_wrappers import Linear_deprecated as Linear
+from .deprecated_wrappers import MaxPool2d_deprecated as MaxPool2d
+from .focal_loss import (SigmoidFocalLoss, SoftmaxFocalLoss,
+ sigmoid_focal_loss, softmax_focal_loss)
+from .fused_bias_leakyrelu import FusedBiasLeakyReLU, fused_bias_leakyrelu
+from .info import (get_compiler_version, get_compiling_cuda_version,
+ get_onnxruntime_op_path)
+from .masked_conv import MaskedConv2d, masked_conv2d
+from .modulated_deform_conv import (ModulatedDeformConv2d,
+ ModulatedDeformConv2dPack,
+ modulated_deform_conv2d)
+from .multi_scale_deform_attn import MultiScaleDeformableAttention
+from .nms import batched_nms, nms, nms_match, nms_rotated, soft_nms
+from .pixel_group import pixel_group
+from .point_sample import (SimpleRoIAlign, point_sample,
+ rel_roi_point_to_rel_img_point)
+from .psa_mask import PSAMask
+from .roi_align import RoIAlign, roi_align
+from .roi_align_rotated import RoIAlignRotated, roi_align_rotated
+from .roi_pool import RoIPool, roi_pool
+from .saconv import SAConv2d
+from .sync_bn import SyncBatchNorm
+from .tin_shift import TINShift, tin_shift
+from .upfirdn2d import upfirdn2d
+
+__all__ = [
+ 'bbox_overlaps', 'CARAFE', 'CARAFENaive', 'CARAFEPack', 'carafe',
+ 'carafe_naive', 'CornerPool', 'DeformConv2d', 'DeformConv2dPack',
+ 'deform_conv2d', 'DeformRoIPool', 'DeformRoIPoolPack',
+ 'ModulatedDeformRoIPoolPack', 'deform_roi_pool', 'SigmoidFocalLoss',
+ 'SoftmaxFocalLoss', 'sigmoid_focal_loss', 'softmax_focal_loss',
+ 'get_compiler_version', 'get_compiling_cuda_version',
+ 'get_onnxruntime_op_path', 'MaskedConv2d', 'masked_conv2d',
+ 'ModulatedDeformConv2d', 'ModulatedDeformConv2dPack',
+ 'modulated_deform_conv2d', 'batched_nms', 'nms', 'soft_nms', 'nms_match',
+ 'RoIAlign', 'roi_align', 'RoIPool', 'roi_pool', 'SyncBatchNorm', 'Conv2d',
+ 'ConvTranspose2d', 'Linear', 'MaxPool2d', 'CrissCrossAttention', 'PSAMask',
+ 'point_sample', 'rel_roi_point_to_rel_img_point', 'SimpleRoIAlign',
+ 'SAConv2d', 'TINShift', 'tin_shift', 'box_iou_rotated', 'nms_rotated',
+ 'upfirdn2d', 'FusedBiasLeakyReLU', 'fused_bias_leakyrelu',
+ 'RoIAlignRotated', 'roi_align_rotated', 'pixel_group', 'contour_expand',
+ 'MultiScaleDeformableAttention', 'BorderAlign', 'border_align'
+]
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/bbox.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/bbox.py
new file mode 100644
index 0000000000000000000000000000000000000000..855009ad149a49b1b3dbbbf497960107accf0c18
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/bbox.py
@@ -0,0 +1,71 @@
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext('_ext', ['bbox_overlaps'])
+
+
+def bbox_overlaps(bboxes1, bboxes2, mode='iou', aligned=False, offset=0):
+    """Calculate overlap between two sets of bboxes.
+
+ If ``aligned`` is ``False``, then calculate the ious between each bbox
+ of bboxes1 and bboxes2, otherwise the ious between each aligned pair of
+ bboxes1 and bboxes2.
+
+ Args:
+        bboxes1 (Tensor): shape (m, 4) in (x1, y1, x2, y2) format or empty.
+        bboxes2 (Tensor): shape (n, 4) in (x1, y1, x2, y2) format or empty.
+ If aligned is ``True``, then m and n must be equal.
+ mode (str): "iou" (intersection over union) or iof (intersection over
+ foreground).
+
+ Returns:
+ ious(Tensor): shape (m, n) if aligned == False else shape (m, 1)
+
+ Example:
+ >>> bboxes1 = torch.FloatTensor([
+ >>> [0, 0, 10, 10],
+ >>> [10, 10, 20, 20],
+ >>> [32, 32, 38, 42],
+ >>> ])
+ >>> bboxes2 = torch.FloatTensor([
+ >>> [0, 0, 10, 20],
+ >>> [0, 10, 10, 19],
+ >>> [10, 10, 20, 20],
+ >>> ])
+ >>> bbox_overlaps(bboxes1, bboxes2)
+ tensor([[0.5000, 0.0000, 0.0000],
+ [0.0000, 0.0000, 1.0000],
+ [0.0000, 0.0000, 0.0000]])
+
+ Example:
+ >>> empty = torch.FloatTensor([])
+ >>> nonempty = torch.FloatTensor([
+ >>> [0, 0, 10, 9],
+ >>> ])
+ >>> assert tuple(bbox_overlaps(empty, nonempty).shape) == (0, 1)
+ >>> assert tuple(bbox_overlaps(nonempty, empty).shape) == (1, 0)
+ >>> assert tuple(bbox_overlaps(empty, empty).shape) == (0, 0)
+ """
+
+ mode_dict = {'iou': 0, 'iof': 1}
+ assert mode in mode_dict.keys()
+ mode_flag = mode_dict[mode]
+ # Either the boxes are empty or the length of boxes' last dimension is 4
+ assert (bboxes1.size(-1) == 4 or bboxes1.size(0) == 0)
+ assert (bboxes2.size(-1) == 4 or bboxes2.size(0) == 0)
+ assert offset == 1 or offset == 0
+
+ rows = bboxes1.size(0)
+ cols = bboxes2.size(0)
+ if aligned:
+ assert rows == cols
+
+ if rows * cols == 0:
+ return bboxes1.new(rows, 1) if aligned else bboxes1.new(rows, cols)
+
+ if aligned:
+ ious = bboxes1.new_zeros(rows)
+ else:
+ ious = bboxes1.new_zeros((rows, cols))
+ ext_module.bbox_overlaps(
+ bboxes1, bboxes2, ious, mode=mode_flag, aligned=aligned, offset=offset)
+ return ious
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/border_align.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/border_align.py
new file mode 100644
index 0000000000000000000000000000000000000000..e111d69550c1d175a243c75f6811ab5fbaede8c6
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/border_align.py
@@ -0,0 +1,108 @@
+# modified from
+# https://github.com/Megvii-BaseDetection/cvpods/blob/master/cvpods/layers/border_align.py
+
+import torch
+import torch.nn as nn
+from torch.autograd import Function
+from torch.autograd.function import once_differentiable
+
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext(
+ '_ext', ['border_align_forward', 'border_align_backward'])
+
+
+class BorderAlignFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input, boxes, pool_size):
+ return g.op(
+ 'mmcv::MMCVBorderAlign', input, boxes, pool_size_i=pool_size)
+
+ @staticmethod
+ def forward(ctx, input, boxes, pool_size):
+ ctx.pool_size = pool_size
+ ctx.input_shape = input.size()
+
+ assert boxes.ndim == 3, 'boxes must be with shape [B, H*W, 4]'
+ assert boxes.size(2) == 4, \
+ 'the last dimension of boxes must be (x1, y1, x2, y2)'
+ assert input.size(1) % 4 == 0, \
+ 'the channel for input feature must be divisible by factor 4'
+
+ # [B, C//4, H*W, 4]
+ output_shape = (input.size(0), input.size(1) // 4, boxes.size(1), 4)
+ output = input.new_zeros(output_shape)
+ # `argmax_idx` only used for backward
+ argmax_idx = input.new_zeros(output_shape).to(torch.int)
+
+ ext_module.border_align_forward(
+ input, boxes, output, argmax_idx, pool_size=ctx.pool_size)
+
+ ctx.save_for_backward(boxes, argmax_idx)
+ return output
+
+ @staticmethod
+ @once_differentiable
+ def backward(ctx, grad_output):
+ boxes, argmax_idx = ctx.saved_tensors
+ grad_input = grad_output.new_zeros(ctx.input_shape)
+        # complex head architecture may make grad_output non-contiguous
+ grad_output = grad_output.contiguous()
+ ext_module.border_align_backward(
+ grad_output,
+ boxes,
+ argmax_idx,
+ grad_input,
+ pool_size=ctx.pool_size)
+ return grad_input, None, None
+
+
+border_align = BorderAlignFunction.apply
+
+
+class BorderAlign(nn.Module):
+ r"""Border align pooling layer.
+
+ Applies border_align over the input feature based on predicted bboxes.
+ The details were described in the paper
+ `BorderDet: Border Feature for Dense Object Detection
+    <https://arxiv.org/abs/2007.11056>`_.
+
+ For each border line (e.g. top, left, bottom or right) of each box,
+ border_align does the following:
+ 1. uniformly samples `pool_size`+1 positions on this line, involving \
+ the start and end points.
+ 2. the corresponding features on these points are computed by \
+ bilinear interpolation.
+ 3. max pooling over all the `pool_size`+1 positions are used for \
+ computing pooled feature.
+
+ Args:
+ pool_size (int): number of positions sampled over the boxes' borders
+ (e.g. top, bottom, left, right).
+
+ """
+
+ def __init__(self, pool_size):
+ super(BorderAlign, self).__init__()
+ self.pool_size = pool_size
+
+ def forward(self, input, boxes):
+ """
+ Args:
+ input: Features with shape [N,4C,H,W]. Channels ranged in [0,C),
+ [C,2C), [2C,3C), [3C,4C) represent the top, left, bottom,
+ right features respectively.
+ boxes: Boxes with shape [N,H*W,4]. Coordinate format (x1,y1,x2,y2).
+
+ Returns:
+ Tensor: Pooled features with shape [N,C,H*W,4]. The order is
+ (top,left,bottom,right) for the last dimension.
+ """
+ return border_align(input, boxes, self.pool_size)
+
+ def __repr__(self):
+ s = self.__class__.__name__
+ s += f'(pool_size={self.pool_size})'
+ return s
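
A shape-oriented usage sketch of the module above (assuming the compiled `_ext` extension and a CUDA device; the sizes are arbitrary).

```python
# Hedged sketch: pooled output has shape [N, C, H*W, 4] for a [N, 4C, H, W] input.
import torch
from mmcv.ops import BorderAlign

n, c, h, w, pool_size = 2, 16, 32, 32, 10
feats = torch.randn(n, 4 * c, h, w, device='cuda')
# one (x1, y1, x2, y2) box per spatial location
boxes = torch.rand(n, h * w, 4, device='cuda') * (w - 1)
boxes[..., 2:] = boxes[..., :2] + 1  # ensure x2 > x1 and y2 > y1

pooled = BorderAlign(pool_size)(feats, boxes)
print(pooled.shape)  # torch.Size([2, 16, 1024, 4])
```
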
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/box_iou_rotated.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/box_iou_rotated.py
new file mode 100644
index 0000000000000000000000000000000000000000..fbfcef2acce58a4b1212a69c28f030c7bd77d3b2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/box_iou_rotated.py
@@ -0,0 +1,44 @@
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext('_ext', ['box_iou_rotated'])
+
+
+def box_iou_rotated(bboxes1, bboxes2, mode='iou', aligned=False):
+ """Return intersection-over-union (Jaccard index) of boxes.
+
+ Both sets of boxes are expected to be in
+ (x_center, y_center, width, height, angle) format.
+
+ If ``aligned`` is ``False``, then calculate the ious between each bbox
+ of bboxes1 and bboxes2, otherwise the ious between each aligned pair of
+ bboxes1 and bboxes2.
+
+ Arguments:
+        bboxes1 (Tensor): rotated bboxes 1. \
+ It has shape (N, 5), indicating (x, y, w, h, theta) for each row.
+ Note that theta is in radian.
+        bboxes2 (Tensor): rotated bboxes 2. \
+ It has shape (M, 5), indicating (x, y, w, h, theta) for each row.
+ Note that theta is in radian.
+ mode (str): "iou" (intersection over union) or iof (intersection over
+ foreground).
+
+ Returns:
+ ious(Tensor): shape (N, M) if aligned == False else shape (N,)
+ """
+ assert mode in ['iou', 'iof']
+ mode_dict = {'iou': 0, 'iof': 1}
+ mode_flag = mode_dict[mode]
+ rows = bboxes1.size(0)
+ cols = bboxes2.size(0)
+ if aligned:
+ ious = bboxes1.new_zeros(rows)
+ else:
+ ious = bboxes1.new_zeros((rows * cols))
+ bboxes1 = bboxes1.contiguous()
+ bboxes2 = bboxes2.contiguous()
+ ext_module.box_iou_rotated(
+ bboxes1, bboxes2, ious, mode_flag=mode_flag, aligned=aligned)
+ if not aligned:
+ ious = ious.view(rows, cols)
+ return ious
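
For reference, a small sketch of the expected call (requires the compiled extension; the box values are illustrative).

```python
# Hedged sketch: IoU of rotated boxes in (x_center, y_center, w, h, theta)
# format, with theta in radians.
import torch
from mmcv.ops import box_iou_rotated

bboxes1 = torch.tensor([[10.0, 10.0, 8.0, 4.0, 0.0]])
bboxes2 = torch.tensor([[10.0, 10.0, 8.0, 4.0, 0.5],
                        [30.0, 30.0, 8.0, 4.0, 0.0]])
ious = box_iou_rotated(bboxes1, bboxes2)  # shape (1, 2)
```
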
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/carafe.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/carafe.py
new file mode 100644
index 0000000000000000000000000000000000000000..4ec679189185f1a5fc5d507c6547ac53577cfb64
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/carafe.py
@@ -0,0 +1,286 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from torch.autograd import Function
+from torch.nn.modules.module import Module
+
+from ..cnn import UPSAMPLE_LAYERS, normal_init, xavier_init
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext('_ext', [
+ 'carafe_naive_forward', 'carafe_naive_backward', 'carafe_forward',
+ 'carafe_backward'
+])
+
+
+class CARAFENaiveFunction(Function):
+
+ @staticmethod
+ def symbolic(g, features, masks, kernel_size, group_size, scale_factor):
+ return g.op(
+ 'MMCVCARAFENaive',
+ features,
+ masks,
+ kernel_size=kernel_size,
+ group_size=group_size,
+ scale_factor=scale_factor)
+
+ @staticmethod
+ def forward(ctx, features, masks, kernel_size, group_size, scale_factor):
+ assert scale_factor >= 1
+ assert masks.size(1) == kernel_size * kernel_size * group_size
+ assert masks.size(-1) == features.size(-1) * scale_factor
+ assert masks.size(-2) == features.size(-2) * scale_factor
+ assert features.size(1) % group_size == 0
+ assert (kernel_size - 1) % 2 == 0 and kernel_size >= 1
+ ctx.kernel_size = kernel_size
+ ctx.group_size = group_size
+ ctx.scale_factor = scale_factor
+ ctx.feature_size = features.size()
+ ctx.mask_size = masks.size()
+
+ n, c, h, w = features.size()
+ output = features.new_zeros((n, c, h * scale_factor, w * scale_factor))
+ ext_module.carafe_naive_forward(
+ features,
+ masks,
+ output,
+ kernel_size=kernel_size,
+ group_size=group_size,
+ scale_factor=scale_factor)
+
+ if features.requires_grad or masks.requires_grad:
+ ctx.save_for_backward(features, masks)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ assert grad_output.is_cuda
+
+ features, masks = ctx.saved_tensors
+ kernel_size = ctx.kernel_size
+ group_size = ctx.group_size
+ scale_factor = ctx.scale_factor
+
+ grad_input = torch.zeros_like(features)
+ grad_masks = torch.zeros_like(masks)
+ ext_module.carafe_naive_backward(
+ grad_output.contiguous(),
+ features,
+ masks,
+ grad_input,
+ grad_masks,
+ kernel_size=kernel_size,
+ group_size=group_size,
+ scale_factor=scale_factor)
+
+ return grad_input, grad_masks, None, None, None
+
+
+carafe_naive = CARAFENaiveFunction.apply
+
+
+class CARAFENaive(Module):
+
+ def __init__(self, kernel_size, group_size, scale_factor):
+ super(CARAFENaive, self).__init__()
+
+ assert isinstance(kernel_size, int) and isinstance(
+ group_size, int) and isinstance(scale_factor, int)
+ self.kernel_size = kernel_size
+ self.group_size = group_size
+ self.scale_factor = scale_factor
+
+ def forward(self, features, masks):
+ return carafe_naive(features, masks, self.kernel_size, self.group_size,
+ self.scale_factor)
+
+
+class CARAFEFunction(Function):
+
+ @staticmethod
+ def symbolic(g, features, masks, kernel_size, group_size, scale_factor):
+ return g.op(
+ 'MMCVCARAFE',
+ features,
+ masks,
+ kernel_size=kernel_size,
+ group_size=group_size,
+ scale_factor=scale_factor)
+
+ @staticmethod
+ def forward(ctx, features, masks, kernel_size, group_size, scale_factor):
+ assert scale_factor >= 1
+ assert masks.size(1) == kernel_size * kernel_size * group_size
+ assert masks.size(-1) == features.size(-1) * scale_factor
+ assert masks.size(-2) == features.size(-2) * scale_factor
+ assert features.size(1) % group_size == 0
+ assert (kernel_size - 1) % 2 == 0 and kernel_size >= 1
+ ctx.kernel_size = kernel_size
+ ctx.group_size = group_size
+ ctx.scale_factor = scale_factor
+ ctx.feature_size = features.size()
+ ctx.mask_size = masks.size()
+
+ n, c, h, w = features.size()
+ output = features.new_zeros((n, c, h * scale_factor, w * scale_factor))
+ routput = features.new_zeros(output.size(), requires_grad=False)
+ rfeatures = features.new_zeros(features.size(), requires_grad=False)
+ rmasks = masks.new_zeros(masks.size(), requires_grad=False)
+ ext_module.carafe_forward(
+ features,
+ masks,
+ rfeatures,
+ routput,
+ rmasks,
+ output,
+ kernel_size=kernel_size,
+ group_size=group_size,
+ scale_factor=scale_factor)
+
+ if features.requires_grad or masks.requires_grad:
+ ctx.save_for_backward(features, masks, rfeatures)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ assert grad_output.is_cuda
+
+ features, masks, rfeatures = ctx.saved_tensors
+ kernel_size = ctx.kernel_size
+ group_size = ctx.group_size
+ scale_factor = ctx.scale_factor
+
+ rgrad_output = torch.zeros_like(grad_output, requires_grad=False)
+ rgrad_input_hs = torch.zeros_like(grad_output, requires_grad=False)
+ rgrad_input = torch.zeros_like(features, requires_grad=False)
+ rgrad_masks = torch.zeros_like(masks, requires_grad=False)
+ grad_input = torch.zeros_like(features, requires_grad=False)
+ grad_masks = torch.zeros_like(masks, requires_grad=False)
+ ext_module.carafe_backward(
+ grad_output.contiguous(),
+ rfeatures,
+ masks,
+ rgrad_output,
+ rgrad_input_hs,
+ rgrad_input,
+ rgrad_masks,
+ grad_input,
+ grad_masks,
+ kernel_size=kernel_size,
+ group_size=group_size,
+ scale_factor=scale_factor)
+ return grad_input, grad_masks, None, None, None
+
+
+carafe = CARAFEFunction.apply
+
+
+class CARAFE(Module):
+ """ CARAFE: Content-Aware ReAssembly of FEatures
+
+ Please refer to https://arxiv.org/abs/1905.02188 for more details.
+
+ Args:
+ kernel_size (int): reassemble kernel size
+ group_size (int): reassemble group size
+ scale_factor (int): upsample ratio
+
+ Returns:
+ upsampled feature map
+ """
+
+ def __init__(self, kernel_size, group_size, scale_factor):
+ super(CARAFE, self).__init__()
+
+ assert isinstance(kernel_size, int) and isinstance(
+ group_size, int) and isinstance(scale_factor, int)
+ self.kernel_size = kernel_size
+ self.group_size = group_size
+ self.scale_factor = scale_factor
+
+ def forward(self, features, masks):
+ return carafe(features, masks, self.kernel_size, self.group_size,
+ self.scale_factor)
+
+
+@UPSAMPLE_LAYERS.register_module(name='carafe')
+class CARAFEPack(nn.Module):
+ """A unified package of CARAFE upsampler that contains: 1) channel
+ compressor 2) content encoder 3) CARAFE op.
+
+ Official implementation of ICCV 2019 paper
+ CARAFE: Content-Aware ReAssembly of FEatures
+ Please refer to https://arxiv.org/abs/1905.02188 for more details.
+
+ Args:
+ channels (int): input feature channels
+ scale_factor (int): upsample ratio
+ up_kernel (int): kernel size of CARAFE op
+ up_group (int): group size of CARAFE op
+ encoder_kernel (int): kernel size of content encoder
+ encoder_dilation (int): dilation of content encoder
+ compressed_channels (int): output channels of channels compressor
+
+ Returns:
+ upsampled feature map
+ """
+
+ def __init__(self,
+ channels,
+ scale_factor,
+ up_kernel=5,
+ up_group=1,
+ encoder_kernel=3,
+ encoder_dilation=1,
+ compressed_channels=64):
+ super(CARAFEPack, self).__init__()
+ self.channels = channels
+ self.scale_factor = scale_factor
+ self.up_kernel = up_kernel
+ self.up_group = up_group
+ self.encoder_kernel = encoder_kernel
+ self.encoder_dilation = encoder_dilation
+ self.compressed_channels = compressed_channels
+ self.channel_compressor = nn.Conv2d(channels, self.compressed_channels,
+ 1)
+ self.content_encoder = nn.Conv2d(
+ self.compressed_channels,
+ self.up_kernel * self.up_kernel * self.up_group *
+ self.scale_factor * self.scale_factor,
+ self.encoder_kernel,
+ padding=int((self.encoder_kernel - 1) * self.encoder_dilation / 2),
+ dilation=self.encoder_dilation,
+ groups=1)
+ self.init_weights()
+
+ def init_weights(self):
+ for m in self.modules():
+ if isinstance(m, nn.Conv2d):
+ xavier_init(m, distribution='uniform')
+ normal_init(self.content_encoder, std=0.001)
+
+ def kernel_normalizer(self, mask):
+ mask = F.pixel_shuffle(mask, self.scale_factor)
+ n, mask_c, h, w = mask.size()
+ # use float division explicitly,
+        # to avoid inconsistency while exporting to onnx
+ mask_channel = int(mask_c / float(self.up_kernel**2))
+ mask = mask.view(n, mask_channel, -1, h, w)
+
+ mask = F.softmax(mask, dim=2)
+ mask = mask.view(n, mask_c, h, w).contiguous()
+
+ return mask
+
+ def feature_reassemble(self, x, mask):
+ x = carafe(x, mask, self.up_kernel, self.up_group, self.scale_factor)
+ return x
+
+ def forward(self, x):
+ compressed_x = self.channel_compressor(x)
+ mask = self.content_encoder(compressed_x)
+ mask = self.kernel_normalizer(mask)
+
+ x = self.feature_reassemble(x, mask)
+ return x
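
A minimal usage sketch of the packed upsampler above (assuming the compiled CUDA op; channel count and input size are arbitrary).

```python
# Hedged sketch: 2x content-aware upsampling of a 64-channel feature map.
import torch
from mmcv.ops import CARAFEPack

upsampler = CARAFEPack(channels=64, scale_factor=2).cuda()
x = torch.randn(1, 64, 28, 28, device='cuda')
y = upsampler(x)
print(y.shape)  # torch.Size([1, 64, 56, 56])
```
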
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/cc_attention.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/cc_attention.py
new file mode 100644
index 0000000000000000000000000000000000000000..6f59d29fd08ccddcd9148f7403986a673afedd19
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/cc_attention.py
@@ -0,0 +1,95 @@
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from torch.autograd.function import once_differentiable
+
+from mmcv.cnn import Scale
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext(
+ '_ext', ['ca_forward', 'ca_backward', 'ca_map_forward', 'ca_map_backward'])
+
+
+class CAWeightFunction(torch.autograd.Function):
+
+ @staticmethod
+ def symbolic(g, t, f):
+ return g.op('MMCVCAWeight', t, f)
+
+ @staticmethod
+ def forward(ctx, t, f):
+ n, c, h, w = t.size()
+ weight = torch.zeros(n, h + w - 1, h, w).to(t.device)
+ ext_module.ca_forward(t, f, weight)
+
+ ctx.save_for_backward(t, f)
+
+ return weight
+
+ @staticmethod
+ @once_differentiable
+ def backward(ctx, dw):
+ t, f = ctx.saved_tensors
+ dt = torch.zeros_like(t)
+ df = torch.zeros_like(f)
+ ext_module.ca_backward(dw, t, f, dt, df)
+ return dt, df
+
+
+class CAMapFunction(torch.autograd.Function):
+
+ @staticmethod
+ def symbolic(g, weight, v):
+ return g.op('MMCVCAMap', weight, v)
+
+ @staticmethod
+ def forward(ctx, weight, v):
+ out = torch.zeros_like(v)
+ ext_module.ca_map_forward(weight, v, out)
+
+ ctx.save_for_backward(weight, v)
+
+ return out
+
+ @staticmethod
+ @once_differentiable
+ def backward(ctx, dout):
+ weight, v = ctx.saved_tensors
+ dw = torch.zeros_like(weight)
+ dv = torch.zeros_like(v)
+ ext_module.ca_map_backward(dout, weight, v, dw, dv)
+
+ return dw, dv
+
+
+ca_weight = CAWeightFunction.apply
+ca_map = CAMapFunction.apply
+
+
+class CrissCrossAttention(nn.Module):
+ """Criss-Cross Attention Module."""
+
+ def __init__(self, in_channels):
+ super(CrissCrossAttention, self).__init__()
+ self.query_conv = nn.Conv2d(in_channels, in_channels // 8, 1)
+ self.key_conv = nn.Conv2d(in_channels, in_channels // 8, 1)
+ self.value_conv = nn.Conv2d(in_channels, in_channels, 1)
+ self.gamma = Scale(0.)
+ self.in_channels = in_channels
+
+ def forward(self, x):
+ proj_query = self.query_conv(x)
+ proj_key = self.key_conv(x)
+ proj_value = self.value_conv(x)
+
+ energy = ca_weight(proj_query, proj_key)
+ attention = F.softmax(energy, 1)
+ out = ca_map(attention, proj_value)
+ out = self.gamma(out) + x
+
+ return out
+
+ def __repr__(self):
+ s = self.__class__.__name__
+ s += f'(in_channels={self.in_channels})'
+ return s
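
Usage is a single module call per recurrence step; a hedged sketch (requires the compiled extension and a CUDA device) follows.

```python
# Hedged sketch: two recurrent passes of criss-cross attention, as in the
# original CCNet formulation (R=2); the shapes are illustrative.
import torch
from mmcv.ops import CrissCrossAttention

cca = CrissCrossAttention(in_channels=64).cuda()
x = torch.randn(1, 64, 17, 17, device='cuda')
out = cca(cca(x))  # output keeps the input shape [1, 64, 17, 17]
```
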
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/contour_expand.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/contour_expand.py
new file mode 100644
index 0000000000000000000000000000000000000000..241f4db4af45e68a9e84f6491690dea40b39b68e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/contour_expand.py
@@ -0,0 +1,37 @@
+import numpy as np
+import torch
+
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext('_ext', ['contour_expand'])
+
+
+def contour_expand(kernel_mask, internal_kernel_label, min_kernel_area,
+ kernel_num):
+    """Expand kernel contours so that foreground pixels are assigned to
+ instances.
+
+ Arguments:
+ kernel_mask (np.array or Tensor): The instance kernel mask with
+ size hxw.
+ internal_kernel_label (np.array or Tensor): The instance internal
+ kernel label with size hxw.
+ min_kernel_area (int): The minimum kernel area.
+ kernel_num (int): The instance kernel number.
+
+ Returns:
+ label (np.array or Tensor): The instance index map with size hxw.
+ """
+ assert isinstance(kernel_mask, (torch.Tensor, np.ndarray))
+ assert isinstance(internal_kernel_label, (torch.Tensor, np.ndarray))
+ assert isinstance(min_kernel_area, int)
+ assert isinstance(kernel_num, int)
+
+ if isinstance(kernel_mask, np.ndarray):
+ kernel_mask = torch.from_numpy(kernel_mask)
+ if isinstance(internal_kernel_label, np.ndarray):
+ internal_kernel_label = torch.from_numpy(internal_kernel_label)
+
+ label = ext_module.contour_expand(kernel_mask, internal_kernel_label,
+ min_kernel_area, kernel_num)
+ return label
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/corner_pool.py b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/corner_pool.py
new file mode 100644
index 0000000000000000000000000000000000000000..f1593369e5721853c947f47beca0775d70966178
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/corner_pool.py
@@ -0,0 +1,160 @@
+import torch
+from torch import nn
+from torch.autograd import Function
+
+from ..utils import ext_loader
+
+ext_module = ext_loader.load_ext('_ext', [
+ 'top_pool_forward', 'top_pool_backward', 'bottom_pool_forward',
+ 'bottom_pool_backward', 'left_pool_forward', 'left_pool_backward',
+ 'right_pool_forward', 'right_pool_backward'
+])
+
+_mode_dict = {'top': 0, 'bottom': 1, 'left': 2, 'right': 3}
+
+
+class TopPoolFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['top']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.top_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.top_pool_backward(input, grad_output)
+ return output
+
+
+class BottomPoolFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['bottom']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.bottom_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.bottom_pool_backward(input, grad_output)
+ return output
+
+
+class LeftPoolFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['left']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.left_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.left_pool_backward(input, grad_output)
+ return output
+
+
+class RightPoolFunction(Function):
+
+ @staticmethod
+ def symbolic(g, input):
+ output = g.op(
+ 'mmcv::MMCVCornerPool', input, mode_i=int(_mode_dict['right']))
+ return output
+
+ @staticmethod
+ def forward(ctx, input):
+ output = ext_module.right_pool_forward(input)
+ ctx.save_for_backward(input)
+ return output
+
+ @staticmethod
+ def backward(ctx, grad_output):
+ input, = ctx.saved_tensors
+ output = ext_module.right_pool_backward(input, grad_output)
+ return output
+
+
+class CornerPool(nn.Module):
+ """Corner Pooling.
+
+ Corner Pooling is a new type of pooling layer that helps a
+ convolutional network better localize corners of bounding boxes.
+
+ Please refer to https://arxiv.org/abs/1808.01244 for more details.
+ Code is modified from https://github.com/princeton-vl/CornerNet-Lite.
+
+ Args:
+ mode(str): Pooling orientation for the pooling layer
+
+ - 'bottom': Bottom Pooling
+ - 'left': Left Pooling
+ - 'right': Right Pooling
+ - 'top': Top Pooling
+
+ Returns:
+ Feature map after pooling.
+ """
+
+ pool_functions = {
+ 'bottom': BottomPoolFunction,
+ 'left': LeftPoolFunction,
+ 'right': RightPoolFunction,
+ 'top': TopPoolFunction,
+ }
+
+ cummax_dim_flip = {
+ 'bottom': (2, False),
+ 'left': (3, True),
+ 'right': (3, False),
+ 'top': (2, True),
+ }
+
+ def __init__(self, mode):
+ super(CornerPool, self).__init__()
+ assert mode in self.pool_functions
+ self.mode = mode
+ self.corner_pool = self.pool_functions[mode]
+
+ def forward(self, x):
+ if torch.__version__ != 'parrots' and torch.__version__ >= '1.5.0':
+ if torch.onnx.is_in_onnx_export():
+ assert torch.__version__ >= '1.7.0', \
+ 'When `cummax` serves as an intermediate component whose '\
+                    'outputs are used as inputs to other modules, the '\
+                    'PyTorch version must be >= 1.7.0; otherwise an error '\
+                    'occurs, such as: `RuntimeError: tuple '\
+ 'appears in op that does not forward tuples, unsupported '\
+ 'kind: prim::PythonOp`.'
+
+ dim, flip = self.cummax_dim_flip[self.mode]
+ if flip:
+ x = x.flip(dim)
+ pool_tensor, _ = torch.cummax(x, dim=dim)
+ if flip:
+ pool_tensor = pool_tensor.flip(dim)
+ return pool_tensor
+ else:
+ return self.corner_pool.apply(x)
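
A small sketch of the module above (on PyTorch >= 1.5 this path uses `torch.cummax` directly, so no compiled extension is needed).

```python
# Hedged sketch: left corner pooling keeps the spatial size of the input.
import torch
from mmcv.ops import CornerPool

pool = CornerPool('left')
x = torch.randn(1, 8, 16, 16)
y = pool(x)
print(y.shape)  # torch.Size([1, 8, 16, 16])
```
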
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/bbox_overlaps_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/bbox_overlaps_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..e5fccabae47fb45b5800c45dd3755b03d7e505fa
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/bbox_overlaps_cuda_kernel.cuh
@@ -0,0 +1,83 @@
+#ifndef BBOX_OVERLAPS_CUDA_KERNEL_CUH
+#define BBOX_OVERLAPS_CUDA_KERNEL_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+
+template <typename T>
+__global__ void bbox_overlaps_cuda_kernel(const T* bbox1, const T* bbox2,
+ T* ious, const int num_bbox1,
+ const int num_bbox2, const int mode,
+ const bool aligned,
+ const int offset) {
+ if (aligned) {
+ CUDA_1D_KERNEL_LOOP(index, num_bbox1) {
+ int b1 = index;
+ int b2 = index;
+
+ int base1 = b1 * 4;
+ T b1_x1 = bbox1[base1];
+ T b1_y1 = bbox1[base1 + 1];
+ T b1_x2 = bbox1[base1 + 2];
+ T b1_y2 = bbox1[base1 + 3];
+ T b1_area = (b1_x2 - b1_x1 + offset) * (b1_y2 - b1_y1 + offset);
+
+ int base2 = b2 * 4;
+ T b2_x1 = bbox2[base2];
+ T b2_y1 = bbox2[base2 + 1];
+ T b2_x2 = bbox2[base2 + 2];
+ T b2_y2 = bbox2[base2 + 3];
+ T b2_area = (b2_x2 - b2_x1 + offset) * (b2_y2 - b2_y1 + offset);
+
+ T left = fmaxf(b1_x1, b2_x1), right = fminf(b1_x2, b2_x2);
+ T top = fmaxf(b1_y1, b2_y1), bottom = fminf(b1_y2, b2_y2);
+ T width = fmaxf(right - left + offset, 0.f);
+ T height = fmaxf(bottom - top + offset, 0.f);
+ T interS = width * height;
+ T baseS = 1.0;
+ if (mode == 0) {
+ baseS = fmaxf(b1_area + b2_area - interS, T(offset));
+ } else if (mode == 1) {
+ baseS = fmaxf(b1_area, T(offset));
+ }
+ ious[index] = interS / baseS;
+ }
+ } else {
+ CUDA_1D_KERNEL_LOOP(index, num_bbox1 * num_bbox2) {
+ int b1 = index / num_bbox2;
+ int b2 = index % num_bbox2;
+
+ int base1 = b1 * 4;
+ T b1_x1 = bbox1[base1];
+ T b1_y1 = bbox1[base1 + 1];
+ T b1_x2 = bbox1[base1 + 2];
+ T b1_y2 = bbox1[base1 + 3];
+ T b1_area = (b1_x2 - b1_x1 + offset) * (b1_y2 - b1_y1 + offset);
+
+ int base2 = b2 * 4;
+ T b2_x1 = bbox2[base2];
+ T b2_y1 = bbox2[base2 + 1];
+ T b2_x2 = bbox2[base2 + 2];
+ T b2_y2 = bbox2[base2 + 3];
+ T b2_area = (b2_x2 - b2_x1 + offset) * (b2_y2 - b2_y1 + offset);
+
+ T left = fmaxf(b1_x1, b2_x1), right = fminf(b1_x2, b2_x2);
+ T top = fmaxf(b1_y1, b2_y1), bottom = fminf(b1_y2, b2_y2);
+ T width = fmaxf(right - left + offset, 0.f);
+ T height = fmaxf(bottom - top + offset, 0.f);
+ T interS = width * height;
+ T baseS = 1.0;
+ if (mode == 0) {
+ baseS = fmaxf(b1_area + b2_area - interS, T(offset));
+ } else if (mode == 1) {
+ baseS = fmaxf(b1_area, T(offset));
+ }
+ ious[index] = interS / baseS;
+ }
+ }
+}
+
+#endif // BBOX_OVERLAPS_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/border_align_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/border_align_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..143dce5ddc2e644df0c028d707d86c9786959d8f
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/border_align_cuda_kernel.cuh
@@ -0,0 +1,199 @@
+// modified from
+// https://github.com/Megvii-BaseDetection/cvpods/blob/master/cvpods/layers/csrc/border_align/border_align_kernel.cu.
+// The main differences: (1) `argmax_idx` is used to speed up the gradient
+// computation in the backward pass; (2) `wh` is computed directly from
+// `boxes` instead of being passed as an argument to the forward or backward
+// functions.
+
+#ifndef BORDER_ALIGN_CUDA_KERNEL_CUH
+#define BORDER_ALIGN_CUDA_KERNEL_CUH
+
+#include <float.h>
+#ifdef MMCV_WITH_TRT
+#include "common_cuda_helper.hpp"
+#else // MMCV_WITH_TRT
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else // MMCV_USE_PARROTS
+#include "pytorch_cuda_helper.hpp"
+#endif // MMCV_USE_PARROTS
+#endif // MMCV_WITH_TRT
+
+enum BorderMode { Top = 0, Left = 1, Bottom = 2, Right = 3 };
+
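+// Border alignment pools features along each of the four box borders.
+// `extreme_idx` is read from threadIdx.y, so the launcher is expected to use
+// a 2-D thread block whose y-dimension enumerates the four BorderMode values
+// (an assumption based on this kernel's indexing, not shown in this diff).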
+/*** Forward ***/
+template <typename T>
+__global__ void border_align_forward_cuda_kernel(
+ const int nthreads, const T* input, const T* boxes, T* output,
+ int* argmax_idx, const int channels, const int box_size, const int height,
+ const int width, const int pool_size) {
+ CUDA_1D_KERNEL_LOOP(index, nthreads) {
+ // (batch_idx, c_idx, box_idx) is one output element computed in parallel,
+ // and `extreme_idx` is in range [0,3]
+ int batch_idx, c_idx, box_idx, extreme_idx, maxidx, *offset_argmax_idx;
+ const T *offset_box, *offset_input, *offset_box_x;
+ T *offset_output, box_width, box_height, stride, x_stride, y_stride, x, y,
+ val, maxval;
+
+ extreme_idx = threadIdx.y;
+ // shape (N, C, box_size, 4) for output
+ batch_idx = index / channels / box_size;
+ // shape (N, box_size, 4) for boxes
+ box_idx = index % box_size + batch_idx * box_size;
+ c_idx = (index / box_size) % channels;
+
+ offset_box = boxes + box_idx * 4;
+ box_width = *(offset_box + 2) - *offset_box;
+ box_height = *(offset_box + 3) - *(offset_box + 1);
+ offset_output = output + index * 4 + extreme_idx;
+ offset_argmax_idx = argmax_idx + index * 4 + extreme_idx;
+ // shape (N, 4C, h, w) for input.
+ // [0,C) for top feature, [C,2C) for left feature,
+ // [2C,3C) for bottom feature, [3C,4C) for right feature
+ offset_input =
+ input + (batch_idx * channels * 4 + extreme_idx * channels + c_idx) *
+ height * width;
+
+ // extreme_idx in [0,1] -> offset_box_x indexed at x1
+ // extreme_idx in [2,3] -> offset_box_x indexed at x2
+ offset_box_x = offset_box + extreme_idx / 2 * 2;
+
+ // (x1,y1) or (x2,y2) for (x,y)
+ x = *offset_box_x;
+ y = *(offset_box_x + 1);
+
+ switch (extreme_idx) {
+ // top
+ case BorderMode::Top:
+ stride = box_width / pool_size;
+ x_stride = stride;
+ y_stride = 0;
+ break;
+ // left
+ case BorderMode::Left:
+ stride = box_height / pool_size;
+ x_stride = 0;
+ y_stride = stride;
+ break;
+ // bottom
+ case BorderMode::Bottom:
+ stride = box_width / pool_size;
+ x_stride = -stride;
+ y_stride = 0;
+ break;
+ // right
+ case BorderMode::Right:
+ stride = box_height / pool_size;
+ x_stride = 0;
+ y_stride = -stride;
+ break;
+ }
+
+ // initialize maxval and maxidx with the start position (e.g. (x1,y1) or
+ // (x2,y2))
+ maxval = bilinear_interpolate(offset_input, height, width, y, x, index);
+ maxidx = 0;
+
+ // do max_pool along the border
+ for (int i = 1; i <= pool_size; i++) {
+ x += x_stride;
+ y += y_stride;
+ val = bilinear_interpolate(offset_input, height, width, y, x, index);
+ if (val > maxval) {
+ maxval = val;
+ maxidx = i;
+ }
+ }
+
+ // update output and argmax_idx
+ *offset_output = maxval;
+ *offset_argmax_idx = maxidx;
+ }
+}
+
+/*** Backward ***/
+template <typename T>
+__global__ void border_align_backward_cuda_kernel(
+ const int nthreads, const T* grad_output, const T* boxes,
+ const int* argmax_idx, T* grad_input, const int channels,
+ const int box_size, const int height, const int width,
+ const int pool_size) {
+ CUDA_1D_KERNEL_LOOP(index, nthreads) {
+ // (batch_idx, c_idx, box_idx) is one output element computed in parallel,
+ // and `extreme_idx` is in range [0,3]
+ int batch_idx, c_idx, box_idx, extreme_idx;
+ const int* offset_argmax_idx;
+ const T *offset_grad_output, *offset_box, *offset_box_x;
+ T *offset_grad_input, box_width, box_height, stride, x_stride, y_stride, x,
+ y;
+
+ extreme_idx = threadIdx.y;
+ batch_idx = index / channels / box_size;
+ box_idx = index % box_size + batch_idx * box_size;
+ c_idx = (index / box_size) % channels;
+
+ offset_box = boxes + box_idx * 4;
+ box_width = *(offset_box + 2) - *offset_box;
+ box_height = *(offset_box + 3) - *(offset_box + 1);
+ offset_grad_output = grad_output + index * 4 + extreme_idx;
+ offset_argmax_idx = argmax_idx + index * 4 + extreme_idx;
+ // [0,C) for top feature grad, [C,2C) for left feature grad,
+ // [2C,3C) for bottom feature grad, [3C,4C) for right feature grad
+ offset_grad_input = grad_input + (batch_idx * channels * 4 +
+ extreme_idx * channels + c_idx) *
+ height * width;
+
+ // extreme_idx in [0,1] -> offset_box_x indexed at x1
+ // extreme_idx in [2,3] -> offset_box_x indexed at x2
+ offset_box_x = offset_box + extreme_idx / 2 * 2;
+
+ switch (extreme_idx) {
+ // top
+ case BorderMode::Top:
+ stride = box_width / pool_size;
+ x_stride = stride;
+ y_stride = 0;
+ break;
+ // left
+ case BorderMode::Left:
+ stride = box_height / pool_size;
+ x_stride = 0;
+ y_stride = stride;
+ break;
+ // bottom
+ case BorderMode::Bottom:
+ stride = box_width / pool_size;
+ x_stride = -stride;
+ y_stride = 0;
+ break;
+ // right
+ case BorderMode::Right:
+ stride = box_height / pool_size;
+ x_stride = 0;
+ y_stride = -stride;
+ break;
+ }
+
+ // get position (x,y) which has maximum value during forward
+ x = *offset_box_x;
+ y = *(offset_box_x + 1);
+ x += x_stride * (T)(*offset_argmax_idx);
+ y += y_stride * (T)(*offset_argmax_idx);
+
+ T w1, w2, w3, w4;
+ int x_low, x_high, y_low, y_high;
+ bilinear_interpolate_gradient(height, width, y, x, w1, w2, w3, w4, x_low,
+ x_high, y_low, y_high, index);
+
+ // accumulate the gradient into grad_input at the four bilinear corners
+ atomicAdd(offset_grad_input + y_low * width + x_low,
+ *offset_grad_output * w1);
+ atomicAdd(offset_grad_input + y_low * width + x_high,
+ *offset_grad_output * w2);
+ atomicAdd(offset_grad_input + y_high * width + x_low,
+ *offset_grad_output * w3);
+ atomicAdd(offset_grad_input + y_high * width + x_high,
+ *offset_grad_output * w4);
+ }
+}
+
+#endif // BORDER_ALIGN_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/box_iou_rotated_cuda.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/box_iou_rotated_cuda.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..abd47cd85437804310886de057b5a839a49481b2
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/box_iou_rotated_cuda.cuh
@@ -0,0 +1,81 @@
+// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
+// modified from
+// https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cuda.cu
+#ifndef BOX_IOU_ROTATED_CUDA_CUH
+#define BOX_IOU_ROTATED_CUDA_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+#include "box_iou_rotated_utils.hpp"
+
+// 2D block with 32 * 16 = 512 threads per block
+const int BLOCK_DIM_X = 32;
+const int BLOCK_DIM_Y = 16;
+
+inline int divideUP(const int x, const int y) { return (((x) + (y)-1) / (y)); }
+
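+// Rotated boxes are packed as five consecutive values per box; following the
+// usual mmcv/Detectron2 convention this is (x_ctr, y_ctr, w, h, angle), with
+// the angle already in radians (see the MODIFIED note in
+// box_iou_rotated_utils.hpp below).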
+template <typename T>
+__global__ void box_iou_rotated_cuda_kernel(
+ const int n_boxes1, const int n_boxes2, const T* dev_boxes1,
+ const T* dev_boxes2, T* dev_ious, const int mode_flag, const bool aligned) {
+ if (aligned) {
+ CUDA_1D_KERNEL_LOOP(index, n_boxes1) {
+ int b1 = index;
+ int b2 = index;
+
+ int base1 = b1 * 5;
+
+ float block_boxes1[5];
+ float block_boxes2[5];
+
+ block_boxes1[0] = dev_boxes1[base1 + 0];
+ block_boxes1[1] = dev_boxes1[base1 + 1];
+ block_boxes1[2] = dev_boxes1[base1 + 2];
+ block_boxes1[3] = dev_boxes1[base1 + 3];
+ block_boxes1[4] = dev_boxes1[base1 + 4];
+
+ int base2 = b2 * 5;
+
+ block_boxes2[0] = dev_boxes2[base2 + 0];
+ block_boxes2[1] = dev_boxes2[base2 + 1];
+ block_boxes2[2] = dev_boxes2[base2 + 2];
+ block_boxes2[3] = dev_boxes2[base2 + 3];
+ block_boxes2[4] = dev_boxes2[base2 + 4];
+
+ dev_ious[index] =
+ single_box_iou_rotated<T>(block_boxes1, block_boxes2, mode_flag);
+ }
+ } else {
+ CUDA_1D_KERNEL_LOOP(index, n_boxes1 * n_boxes2) {
+ int b1 = index / n_boxes2;
+ int b2 = index % n_boxes2;
+
+ int base1 = b1 * 5;
+
+ float block_boxes1[5];
+ float block_boxes2[5];
+
+ block_boxes1[0] = dev_boxes1[base1 + 0];
+ block_boxes1[1] = dev_boxes1[base1 + 1];
+ block_boxes1[2] = dev_boxes1[base1 + 2];
+ block_boxes1[3] = dev_boxes1[base1 + 3];
+ block_boxes1[4] = dev_boxes1[base1 + 4];
+
+ int base2 = b2 * 5;
+
+ block_boxes2[0] = dev_boxes2[base2 + 0];
+ block_boxes2[1] = dev_boxes2[base2 + 1];
+ block_boxes2[2] = dev_boxes2[base2 + 2];
+ block_boxes2[3] = dev_boxes2[base2 + 3];
+ block_boxes2[4] = dev_boxes2[base2 + 4];
+
+ dev_ious[index] =
+ single_box_iou_rotated<T>(block_boxes1, block_boxes2, mode_flag);
+ }
+ }
+}
+
+#endif
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/box_iou_rotated_utils.hpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/box_iou_rotated_utils.hpp
new file mode 100644
index 0000000000000000000000000000000000000000..67190dc10eb245bb2bea23133ac984cd1c5a4888
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/box_iou_rotated_utils.hpp
@@ -0,0 +1,343 @@
+// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
+// modified from
+// https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_utils.h
+#pragma once
+#include <cassert>
+#include <cmath>
+
+#ifdef __CUDACC__
+// Designates functions callable from the host (CPU) and the device (GPU)
+#define HOST_DEVICE __host__ __device__
+#define HOST_DEVICE_INLINE HOST_DEVICE __forceinline__
+#else
+#include <algorithm>
+#define HOST_DEVICE
+#define HOST_DEVICE_INLINE HOST_DEVICE inline
+#endif
+
+namespace {
+
+template <typename T>
+struct RotatedBox {
+ T x_ctr, y_ctr, w, h, a;
+};
+
+template <typename T>
+struct Point {
+ T x, y;
+ HOST_DEVICE_INLINE Point(const T& px = 0, const T& py = 0) : x(px), y(py) {}
+ HOST_DEVICE_INLINE Point operator+(const Point& p) const {
+ return Point(x + p.x, y + p.y);
+ }
+ HOST_DEVICE_INLINE Point& operator+=(const Point& p) {
+ x += p.x;
+ y += p.y;
+ return *this;
+ }
+ HOST_DEVICE_INLINE Point operator-(const Point& p) const {
+ return Point(x - p.x, y - p.y);
+ }
+ HOST_DEVICE_INLINE Point operator*(const T coeff) const {
+ return Point(x * coeff, y * coeff);
+ }
+};
+
+template <typename T>
+HOST_DEVICE_INLINE T dot_2d(const Point<T>& A, const Point<T>& B) {
+ return A.x * B.x + A.y * B.y;
+}
+
+template <typename T>
+HOST_DEVICE_INLINE T cross_2d(const Point<T>& A, const Point<T>& B) {
+ return A.x * B.y - B.x * A.y;
+}
+
+template <typename T>
+HOST_DEVICE_INLINE void get_rotated_vertices(const RotatedBox<T>& box,
+ Point<T> (&pts)[4]) {
+ // M_PI / 180. == 0.01745329251
+ // double theta = box.a * 0.01745329251;
+ // MODIFIED
+ double theta = box.a;
+ T cosTheta2 = (T)cos(theta) * 0.5f;
+ T sinTheta2 = (T)sin(theta) * 0.5f;
+
+ // y: top --> down; x: left --> right
+ pts[0].x = box.x_ctr - sinTheta2 * box.h - cosTheta2 * box.w;
+ pts[0].y = box.y_ctr + cosTheta2 * box.h - sinTheta2 * box.w;
+ pts[1].x = box.x_ctr + sinTheta2 * box.h - cosTheta2 * box.w;
+ pts[1].y = box.y_ctr - cosTheta2 * box.h - sinTheta2 * box.w;
+ pts[2].x = 2 * box.x_ctr - pts[0].x;
+ pts[2].y = 2 * box.y_ctr - pts[0].y;
+ pts[3].x = 2 * box.x_ctr - pts[1].x;
+ pts[3].y = 2 * box.y_ctr - pts[1].y;
+}
+
+template <typename T>
+HOST_DEVICE_INLINE int get_intersection_points(const Point<T> (&pts1)[4],
+ const Point<T> (&pts2)[4],
+ Point<T> (&intersections)[24]) {
+ // Line vector
+ // A line from p1 to p2 is: p1 + (p2-p1)*t, t=[0,1]
+ Point<T> vec1[4], vec2[4];
+ for (int i = 0; i < 4; i++) {
+ vec1[i] = pts1[(i + 1) % 4] - pts1[i];
+ vec2[i] = pts2[(i + 1) % 4] - pts2[i];
+ }
+
+ // Line test - test all line combos for intersection
+ int num = 0; // number of intersections
+ for (int i = 0; i < 4; i++) {
+ for (int j = 0; j < 4; j++) {
+ // Solve for 2x2 Ax=b
+ T det = cross_2d(vec2[j], vec1[i]);
+
+ // This takes care of parallel lines
+ if (fabs(det) <= 1e-14) {
+ continue;
+ }
+
+ auto vec12 = pts2[j] - pts1[i];
+
+ T t1 = cross_2d(vec2[j], vec12) / det;
+ T t2 = cross_2d(vec1[i], vec12) / det;
+
+ if (t1 >= 0.0f && t1 <= 1.0f && t2 >= 0.0f && t2 <= 1.0f) {
+ intersections[num++] = pts1[i] + vec1[i] * t1;
+ }
+ }
+ }
+
+ // Check for vertices of rect1 inside rect2
+ {
+ const auto& AB = vec2[0];
+ const auto& DA = vec2[3];
+ auto ABdotAB = dot_2d(AB, AB);
+ auto ADdotAD = dot_2d(DA, DA);
+ for (int i = 0; i < 4; i++) {
+ // assume ABCD is the rectangle, and P is the point to be judged
+ // P is inside ABCD iff. P's projection on AB lies within AB
+ // and P's projection on AD lies within AD
+
+ auto AP = pts1[i] - pts2[0];
+
+ auto APdotAB = dot_2d(AP, AB);
+ auto APdotAD = -dot_2d(AP, DA);
+
+ if ((APdotAB >= 0) && (APdotAD >= 0) && (APdotAB <= ABdotAB) &&
+ (APdotAD <= ADdotAD)) {
+ intersections[num++] = pts1[i];
+ }
+ }
+ }
+
+ // Reverse the check - check for vertices of rect2 inside rect1
+ {
+ const auto& AB = vec1[0];
+ const auto& DA = vec1[3];
+ auto ABdotAB = dot_2d(AB, AB);
+ auto ADdotAD = dot_2d(DA, DA);
+ for (int i = 0; i < 4; i++) {
+ auto AP = pts2[i] - pts1[0];
+
+ auto APdotAB = dot_2d(AP, AB);
+ auto APdotAD = -dot_2d(AP, DA);
+
+ if ((APdotAB >= 0) && (APdotAD >= 0) && (APdotAB <= ABdotAB) &&
+ (APdotAD <= ADdotAD)) {
+ intersections[num++] = pts2[i];
+ }
+ }
+ }
+
+ return num;
+}
+
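+// Graham scan over at most 24 candidate points: the sort key is the polar
+// angle around the lowest point, computed via cross products so no
+// trigonometry is needed on the device.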
+template <typename T>
+HOST_DEVICE_INLINE int convex_hull_graham(const Point<T> (&p)[24],
+ const int& num_in, Point<T> (&q)[24],
+ bool shift_to_zero = false) {
+ assert(num_in >= 2);
+
+ // Step 1:
+ // Find point with minimum y
+ // if more than 1 points have the same minimum y,
+ // pick the one with the minimum x.
+ int t = 0;
+ for (int i = 1; i < num_in; i++) {
+ if (p[i].y < p[t].y || (p[i].y == p[t].y && p[i].x < p[t].x)) {
+ t = i;
+ }
+ }
+ auto& start = p[t]; // starting point
+
+ // Step 2:
+ // Subtract the starting point from every point (for sorting in the next step)
+ for (int i = 0; i < num_in; i++) {
+ q[i] = p[i] - start;
+ }
+
+ // Swap the starting point to position 0
+ auto tmp = q[0];
+ q[0] = q[t];
+ q[t] = tmp;
+
+ // Step 3:
+ // Sort point 1 ~ num_in according to their relative cross-product values
+ // (essentially sorting according to angles)
+ // If the angles are the same, sort according to their distance to origin
+ T dist[24];
+ for (int i = 0; i < num_in; i++) {
+ dist[i] = dot_2d(q[i], q[i]);
+ }
+
+#ifdef __CUDACC__
+ // CUDA version
+ // In the future, we can potentially use thrust
+ // for sorting here to improve speed (though not guaranteed)
+ for (int i = 1; i < num_in - 1; i++) {
+ for (int j = i + 1; j < num_in; j++) {
+ T crossProduct = cross_2d(q[i], q[j]);
+ if ((crossProduct < -1e-6) ||
+ (fabs(crossProduct) < 1e-6 && dist[i] > dist[j])) {
+ auto q_tmp = q[i];
+ q[i] = q[j];
+ q[j] = q_tmp;
+ auto dist_tmp = dist[i];
+ dist[i] = dist[j];
+ dist[j] = dist_tmp;
+ }
+ }
+ }
+#else
+ // CPU version
+ std::sort(q + 1, q + num_in,
+ [](const Point<T>& A, const Point<T>& B) -> bool {
+ T temp = cross_2d(A, B);
+ if (fabs(temp) < 1e-6) {
+ return dot_2d(A, A) < dot_2d(B, B);
+ } else {
+ return temp > 0;
+ }
+ });
+#endif
+
+ // Step 4:
+ // Make sure there are at least 2 points (that don't overlap with each other)
+ // in the stack
+ int k; // index of the non-overlapped second point
+ for (k = 1; k < num_in; k++) {
+ if (dist[k] > 1e-8) {
+ break;
+ }
+ }
+ if (k == num_in) {
+ // We reach the end, which means the convex hull is just one point
+ q[0] = p[t];
+ return 1;
+ }
+ q[1] = q[k];
+ int m = 2; // 2 points in the stack
+ // Step 5:
+ // Finally we can start the scanning process.
+ // When a non-convex relationship between the 3 points is found
+ // (either concave shape or duplicated points),
+ // we pop the previous point from the stack
+ // until the 3-point relationship is convex again, or
+ // until the stack only contains two points
+ for (int i = k + 1; i < num_in; i++) {
+ while (m > 1 && cross_2d(q[i] - q[m - 2], q[m - 1] - q[m - 2]) >= 0) {
+ m--;
+ }
+ q[m++] = q[i];
+ }
+
+ // Step 6 (Optional):
+ // In general sense we need the original coordinates, so we
+ // need to shift the points back (reverting Step 2)
+ // But if we're only interested in getting the area/perimeter of the shape
+ // We can simply return.
+ if (!shift_to_zero) {
+ for (int i = 0; i < m; i++) {
+ q[i] += start;
+ }
+ }
+
+ return m;
+}
+
+template <typename T>
+HOST_DEVICE_INLINE T polygon_area(const Point<T> (&q)[24], const int& m) {
+ if (m <= 2) {
+ return 0;
+ }
+
+ T area = 0;
+ for (int i = 1; i < m - 1; i++) {
+ area += fabs(cross_2d(q[i] - q[0], q[i + 1] - q[0]));
+ }
+
+ return area / 2.0;
+}
+
+template <typename T>
+HOST_DEVICE_INLINE T rotated_boxes_intersection(const RotatedBox<T>& box1,
+ const RotatedBox<T>& box2) {
+ // There are up to 4 x 4 + 4 + 4 = 24 intersections (including dups) returned
+ // from rotated_rect_intersection_pts
+ Point<T> intersectPts[24], orderedPts[24];
+
+ Point<T> pts1[4];
+ Point<T> pts2[4];
+ get_rotated_vertices(box1, pts1);
+ get_rotated_vertices(box2, pts2);
+
+ int num = get_intersection_points(pts1, pts2, intersectPts);
+
+ if (num <= 2) {
+ return 0.0;
+ }
+
+ // Convex Hull to order the intersection points in clockwise order and find
+ // the contour area.
+ int num_convex = convex_hull_graham(intersectPts, num, orderedPts, true);
+ return polygon_area(orderedPts, num_convex);
+}
+
+} // namespace
+
+template <typename T>
+HOST_DEVICE_INLINE T single_box_iou_rotated(T const* const box1_raw,
+ T const* const box2_raw,
+ const int mode_flag) {
+ // shift center to the middle point to achieve higher precision in result
+ RotatedBox<T> box1, box2;
+ auto center_shift_x = (box1_raw[0] + box2_raw[0]) / 2.0;
+ auto center_shift_y = (box1_raw[1] + box2_raw[1]) / 2.0;
+ box1.x_ctr = box1_raw[0] - center_shift_x;
+ box1.y_ctr = box1_raw[1] - center_shift_y;
+ box1.w = box1_raw[2];
+ box1.h = box1_raw[3];
+ box1.a = box1_raw[4];
+ box2.x_ctr = box2_raw[0] - center_shift_x;
+ box2.y_ctr = box2_raw[1] - center_shift_y;
+ box2.w = box2_raw[2];
+ box2.h = box2_raw[3];
+ box2.a = box2_raw[4];
+
+ const T area1 = box1.w * box1.h;
+ const T area2 = box2.w * box2.h;
+ if (area1 < 1e-14 || area2 < 1e-14) {
+ return 0.f;
+ }
+
+ const T intersection = rotated_boxes_intersection(box1, box2);
+ T baseS = 1.0;
+ if (mode_flag == 0) {
+ baseS = (area1 + area2 - intersection);
+ } else if (mode_flag == 1) {
+ baseS = area1;
+ }
+ const T iou = intersection / baseS;
+ return iou;
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/carafe_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/carafe_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..e9b569d3b5e67ec470812a0a786d2f141f63f113
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/carafe_cuda_kernel.cuh
@@ -0,0 +1,314 @@
+#ifndef CARAFE_CUDA_KERNEL_CUH
+#define CARAFE_CUDA_KERNEL_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+
+#define WARP_SIZE 32
+#define THREADS_PER_PIXEL 32
+#define MAX_SHARED_MEMORY 49152
+#define MAX_SHARED_SCALAR_T 6144 // 49152 / 8 = 6144
+#define MAXIMIZE_KERNEL_SIZE true
+#define kTileDim 32
+#define kBlockRows 8
+#define FULL_MASK 0xffffffff
+
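+// Thread layout used by the CARAFE kernels below: every output pixel is
+// served by THREADS_PER_PIXEL (= one warp of 32) threads that split the
+// channel dimension, and the per-pixel reassembly masks are first staged in
+// shared memory (shared_mask) before the weighted sum over the k x k window.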
+inline int divideUP(const int x, const int y) { return (((x) + (y)-1) / (y)); }
+
+__device__ inline int Loc2Index(const int n, const int c, const int h,
+ const int w, const int channel_num,
+ const int height, const int width) {
+ int index = w + (h + (c + n * channel_num) * height) * width;
+ return index;
+}
+/* TODO: move this to a common place */
+template <typename scalar_t>
+__device__ inline scalar_t min(scalar_t a, scalar_t b) {
+ return a < b ? a : b;
+}
+
+template <typename scalar_t>
+__device__ inline scalar_t max(scalar_t a, scalar_t b) {
+ return a > b ? a : b;
+}
+
+template <typename scalar_t>
+__device__ __forceinline__ scalar_t warpReduceSum(scalar_t val) {
+ for (int offset = 16; offset > 0; offset /= 2)
+ val += __shfl_down_sync(FULL_MASK, val, offset);
+ return val;
+}
+
+template <>
+__device__ __forceinline__ phalf warpReduceSum(phalf val) {
+ for (int offset = 16; offset > 0; offset /= 2)
+ __PHALF(val) +=
+ __shfl_down_sync(FULL_MASK, static_cast<__half>(__PHALF(val)), offset);
+ return val;
+}
+
+// Splits the original matrix into submatrices with size 32 * 32.
+// Each block transposes one submatrix by loading it into shared memory.
+// Reference https://devblogs.nvidia.com/efficient-matrix-transpose-cuda-cc/
+template <typename scalar_t>
+__global__ void BatchTranspose2DCUDAKernel(const int N, const int H,
+ const int W, const int dh,
+ const int dw,
+ const scalar_t *__restrict__ X,
+ scalar_t *__restrict__ Y) {
+ __shared__ scalar_t tile[kTileDim][kTileDim + 1];
+ const int n = blockIdx.x / (dh * dw);
+ const int k = blockIdx.x % (dh * dw);
+ const int r = k / dw;
+ const int c = k % dw;
+ const int offset = n * H * W;
+ int x = c * kTileDim + threadIdx.x;
+ int y = r * kTileDim + threadIdx.y;
+ if (x < W) {
+ for (int i = 0; threadIdx.y + i < kTileDim && y + i < H; i += kBlockRows) {
+ tile[threadIdx.y + i][threadIdx.x] = X[offset + (y + i) * W + x];
+ }
+ }
+ __syncthreads();
+ x = r * kTileDim + threadIdx.x;
+ y = c * kTileDim + threadIdx.y;
+ if (x < H) {
+ for (int i = 0; threadIdx.y + i < kTileDim && y + i < W; i += kBlockRows) {
+ Y[offset + (y + i) * H + x] = tile[threadIdx.x][threadIdx.y + i];
+ }
+ }
+}
+template <typename scalar_t>
+__global__ void CARAFEForward(
+ const int num_kernels, const scalar_t *__restrict__ bottom_data,
+ const scalar_t *__restrict__ bottom_masks, const int kernel_size,
+ const int group_size, const int scale_factor, const int channels,
+ const int down_height, const int down_width, const int height,
+ const int width, const int mask_channels, scalar_t *__restrict__ top_data) {
+#if MAXIMIZE_KERNEL_SIZE
+ __shared__ float shared_mask[MAX_SHARED_SCALAR_T * 2];
+#else
+ __shared__ scalar_t shared_mask[MAX_SHARED_SCALAR_T];
+#endif
+
+ int index = threadIdx.x + blockIdx.x * blockDim.x;
+ if (index > num_kernels - 1) {
+ return;
+ }
+ const int pixel_id = threadIdx.x / THREADS_PER_PIXEL;
+ const int split_id = threadIdx.x % THREADS_PER_PIXEL;
+ index = index / THREADS_PER_PIXEL;
+ const int pw = index % width;
+ const int ph = (index / width) % height;
+ const int n = index / width / height;
+
+ const int down_pw = pw / scale_factor;
+ const int down_ph = ph / scale_factor;
+
+ const int start_w = down_pw - (kernel_size - 1) / 2;
+ const int end_w = down_pw + (kernel_size - 1) / 2 + 1;
+ const int start_h = down_ph - (kernel_size - 1) / 2;
+ const int end_h = down_ph + (kernel_size - 1) / 2 + 1;
+ for (int c = split_id; c < mask_channels; c += THREADS_PER_PIXEL) {
+ int mask_index = Loc2Index(n, ph, pw, c, height, width, mask_channels);
+ shared_mask[c * WARP_SIZE + pixel_id] = bottom_masks[mask_index];
+ }
+ __syncthreads();
+
+ const int channels_per_group = ceilf(channels / (float)group_size);
+#pragma unroll
+ for (int c = split_id; c < channels; c += THREADS_PER_PIXEL) {
+ int mask_group = c / channels_per_group;
+ scalar_t output_val = 0;
+#pragma unroll
+ for (int iy = start_h; iy < end_h; iy++) {
+#pragma unroll
+ for (int ix = start_w; ix < end_w; ix++) {
+ if (iy < 0 || iy > down_height - 1 || ix < 0 || ix > down_width - 1) {
+ continue;
+ }
+ int mask_iy = iy - down_ph + (kernel_size - 1) / 2;
+ int mask_ix = ix - down_pw + (kernel_size - 1) / 2;
+ int mask_c =
+ (mask_group * kernel_size + mask_iy) * kernel_size + mask_ix;
+ int feat_index =
+ Loc2Index(n, iy, ix, c, down_height, down_width, channels);
+
+ output_val += bottom_data[feat_index] *
+ shared_mask[mask_c * WARP_SIZE + pixel_id];
+ }
+ }
+
+ int top_index = Loc2Index(n, ph, pw, c, height, width, channels);
+ top_data[top_index] = output_val;
+ }
+}
+
+template <typename scalar_t>
+__global__ void CARAFEBackward_Feature(
+ const int num_kernels, const scalar_t *__restrict__ top_diff,
+ const scalar_t *__restrict__ bottom_masks, const int kernel_size,
+ const int group_size, const int scale_factor, const int channels,
+ const int down_height, const int down_width, const int height,
+ const int width, const int mask_channels,
+ scalar_t *__restrict__ bottom_diff) {
+#if MAXIMIZE_KERNEL_SIZE
+ __shared__ float shared_mask[MAX_SHARED_SCALAR_T * 2];
+#else
+ __shared__ scalar_t shared_mask[MAX_SHARED_SCALAR_T];
+#endif
+
+ int index = threadIdx.x + blockIdx.x * blockDim.x;
+ if (index > num_kernels - 1) {
+ return;
+ }
+
+ const int pixel_id = threadIdx.x / THREADS_PER_PIXEL;
+ const int split_id = threadIdx.x % THREADS_PER_PIXEL;
+ // (n, c, ph, pw) is an element in the bottom_data
+ index = index / THREADS_PER_PIXEL;
+ const int pw = index % width;
+ const int ph = (index / width) % height;
+ const int n = index / width / height;
+
+ const int start_w = pw - (kernel_size - 1) * scale_factor / 2;
+ const int end_w = pw + (kernel_size - 1) * scale_factor / 2 + 1;
+ const int start_h = ph - (kernel_size - 1) * scale_factor / 2;
+ const int end_h = ph + (kernel_size - 1) * scale_factor / 2 + 1;
+ for (int c = split_id; c < mask_channels; c += THREADS_PER_PIXEL) {
+ const int mask_w = (c % kernel_size) * scale_factor;
+ const int mask_h = (c / kernel_size % kernel_size) * scale_factor;
+ const int mask_x = start_w + mask_w;
+ const int mask_y = start_h + mask_h;
+ if (mask_y < 0 || mask_y > height - 1 || mask_x < 0 || mask_x > width - 1) {
+ shared_mask[c * WARP_SIZE + pixel_id] = 0;
+ continue;
+ }
+ const int mask_group = c / (kernel_size * kernel_size);
+ const int mask_c = (2 * mask_group + 1) * kernel_size * kernel_size - c - 1;
+ int mask_index =
+ Loc2Index(n, mask_c, mask_y, mask_x, mask_channels, height, width);
+ shared_mask[c * WARP_SIZE + pixel_id] = bottom_masks[mask_index];
+ }
+ __syncthreads();
+ const int channels_per_group = ceilf(channels / (float)group_size);
+#pragma unroll
+ for (int c = split_id; c < channels; c += THREADS_PER_PIXEL) {
+ int mask_group = c / channels_per_group;
+ int top_index = Loc2Index(n, ph, pw, c, height, width, channels);
+ scalar_t output_val = 0;
+#pragma unroll
+ for (int iy = start_h; iy < end_h; iy += scale_factor) {
+#pragma unroll
+ for (int ix = start_w; ix < end_w; ix += scale_factor) {
+ if (iy < 0 || iy > height - 1 || ix < 0 || ix > width - 1) {
+ continue;
+ }
+ int mask_iy =
+ (iy - ph + (kernel_size - 1) * scale_factor / 2) / scale_factor;
+ int mask_ix =
+ (ix - pw + (kernel_size - 1) * scale_factor / 2) / scale_factor;
+ int mask_c =
+ (mask_group * kernel_size + mask_iy) * kernel_size + mask_ix;
+ int feat_index = Loc2Index(n, iy, ix, c, height, width, channels);
+ output_val +=
+ shared_mask[mask_c * WARP_SIZE + pixel_id] * top_diff[feat_index];
+ }
+ }
+ bottom_diff[top_index] = output_val;
+ }
+}
+
+template <typename scalar_t>
+__global__ void FeatureSum(const int num_kernels,
+ const scalar_t *__restrict__ input_data,
+ const int scale_factor, const int channels,
+ const int height, const int width,
+ scalar_t *__restrict__ output_data) {
+ int index = threadIdx.x + blockIdx.x * blockDim.x;
+ if (index > num_kernels - 1) {
+ return;
+ }
+ const int split_id = threadIdx.x % THREADS_PER_PIXEL;
+ index = index / THREADS_PER_PIXEL;
+ const int pw = index % width;
+ const int ph = (index / width) % height;
+ const int n = index / width / height;
+ for (int c = split_id; c < channels; c += THREADS_PER_PIXEL) {
+ scalar_t output_val = 0;
+ for (int iy = ph * scale_factor; iy < (ph + 1) * scale_factor; iy++) {
+ for (int ix = pw * scale_factor; ix < (pw + 1) * scale_factor; ix++) {
+ int input_id = Loc2Index(n, iy, ix, c, height * scale_factor,
+ width * scale_factor, channels);
+ output_val += input_data[input_id];
+ }
+ }
+ const int output_id = Loc2Index(n, ph, pw, c, height, width, channels);
+ output_data[output_id] = output_val;
+ }
+}
+
+template <typename scalar_t>
+__global__ void CARAFEBackward_Mask(const int num_kernels,
+ const scalar_t *__restrict__ top_diff,
+ const scalar_t *__restrict__ bottom_data,
+ const int kernel_size, const int group_size,
+ const int scale_factor, const int channels,
+ const int down_height, const int down_width,
+ const int height, const int width,
+ const int mask_channels,
+ scalar_t *__restrict__ mask_diff) {
+ int index = threadIdx.x + blockIdx.x * blockDim.x;
+ if (index > num_kernels - 1) {
+ return;
+ }
+
+ const int lane_id = index % WARP_SIZE;
+ index = index / WARP_SIZE;
+ const int mask_c = index % mask_channels;
+ // (n, c, ph, pw) is an element in the bottom_data
+ index = index / mask_channels;
+ const int pw = index % width;
+ const int ph = (index / width) % height;
+ const int n = index / width / height;
+
+ const int down_pw = pw / scale_factor;
+ const int down_ph = ph / scale_factor;
+
+ const int mask_group = mask_c / (kernel_size * kernel_size);
+ const int mask_loc = mask_c % (kernel_size * kernel_size);
+
+ const int offset_x = mask_loc % kernel_size - (kernel_size - 1) / 2;
+ const int offset_y =
+ mask_loc / kernel_size % kernel_size - (kernel_size - 1) / 2;
+
+ const int down_x = down_pw + offset_x;
+ const int down_y = down_ph + offset_y;
+
+ scalar_t output_val = 0;
+
+ if (down_y >= 0 && down_y <= down_height - 1 && down_x >= 0 &&
+ down_x <= down_width - 1) {
+ const int channels_per_mask = ceilf(channels / (float)group_size);
+ const int start = channels_per_mask * mask_group;
+ const int end = min(channels_per_mask * (mask_group + 1), channels);
+ for (int c = start + lane_id; c < end; c += WARP_SIZE) {
+ int bottom_id =
+ Loc2Index(n, down_y, down_x, c, down_height, down_width, channels);
+ int top_id = Loc2Index(n, ph, pw, c, height, width, channels);
+ output_val += top_diff[top_id] * bottom_data[bottom_id];
+ }
+ }
+ __syncwarp();
+ output_val = warpReduceSum(output_val);
+ if (lane_id == 0) {
+ const int mask_id =
+ Loc2Index(n, ph, pw, mask_c, height, width, mask_channels);
+ mask_diff[mask_id] = output_val;
+ }
+}
+
+#endif // CARAFE_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/carafe_naive_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/carafe_naive_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..6f375162c0819d829d93c4755a2a15f39e6ced37
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/carafe_naive_cuda_kernel.cuh
@@ -0,0 +1,110 @@
+#ifndef CARAFE_NAIVE_CUDA_KERNEL_CUH
+#define CARAFE_NAIVE_CUDA_KERNEL_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+
+__device__ inline int Loc2Index(const int n, const int c, const int h,
+ const int w, const int channel_num,
+ const int height, const int width) {
+ int index = w + (h + (c + n * channel_num) * height) * width;
+ return index;
+}
+
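+// Naive CARAFE: one thread per output element (n, c, ph, pw); the k x k
+// reassembly window is read directly from global memory instead of being
+// staged in shared memory as in the optimized kernels above.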
+template <typename scalar_t>
+__global__ void carafe_naive_forward_cuda_kernel(
+ const int nthreads, const scalar_t *bottom_data,
+ const scalar_t *bottom_masks, scalar_t *top_data, const int kernel_size,
+ const int group_size, const int scale_factor, const int channels,
+ const int height, const int width) {
+ CUDA_1D_KERNEL_LOOP(index, nthreads) {
+ // (n, c, ph, pw) is an element in the bottom_data
+ int pw = index % width;
+ int ph = (index / width) % height;
+ int c = (index / width / height) % channels;
+ int n = index / width / height / channels;
+
+ int mask_channels = kernel_size * kernel_size * group_size;
+ int mask_group = c / (channels / group_size);
+
+ int down_pw = pw / scale_factor;
+ int down_ph = ph / scale_factor;
+ int down_width = width / scale_factor;
+ int down_height = height / scale_factor;
+ int start_w = down_pw - (kernel_size - 1) / 2;
+ int end_w = down_pw + (kernel_size - 1) / 2 + 1;
+ int start_h = down_ph - (kernel_size - 1) / 2;
+ int end_h = down_ph + (kernel_size - 1) / 2 + 1;
+
+ scalar_t output_val = 0;
+ for (int iy = start_h; iy < end_h; iy++) {
+ for (int ix = start_w; ix < end_w; ix++) {
+ if (iy < 0 || iy > down_height - 1 || ix < 0 || ix > down_width - 1) {
+ continue;
+ }
+ int mask_iy = iy - down_ph + (kernel_size - 1) / 2;
+ int mask_ix = ix - down_pw + (kernel_size - 1) / 2;
+ int mask_c =
+ (mask_group * kernel_size + mask_iy) * kernel_size + mask_ix;
+ int feat_index =
+ Loc2Index(n, c, iy, ix, channels, down_height, down_width);
+ int mask_index =
+ Loc2Index(n, mask_c, ph, pw, mask_channels, height, width);
+ output_val += bottom_data[feat_index] * bottom_masks[mask_index];
+ }
+ }
+ top_data[index] = output_val;
+ }
+}
+
+template <typename scalar_t>
+__global__ void carafe_naive_backward_cuda_kernel(
+ const int nthreads, const scalar_t *top_diff, const scalar_t *bottom_data,
+ const scalar_t *bottom_masks, scalar_t *bottom_diff, scalar_t *mask_diff,
+ const int kernel_size, const int group_size, const int scale_factor,
+ const int channels, const int height, const int width) {
+ CUDA_1D_KERNEL_LOOP(index, nthreads) {
+ // (n, c, ph, pw) is an element in the bottom_data
+ int pw = index % width;
+ int ph = (index / width) % height;
+ int c = (index / width / height) % channels;
+ int n = index / width / height / channels;
+
+ int mask_channels = kernel_size * kernel_size * group_size;
+ int mask_group = c / (channels / group_size);
+
+ int down_pw = pw / scale_factor;
+ int down_ph = ph / scale_factor;
+ int down_width = width / scale_factor;
+ int down_height = height / scale_factor;
+ int start_w = down_pw - (kernel_size - 1) / 2;
+ int end_w = down_pw + (kernel_size - 1) / 2 + 1;
+ int start_h = down_ph - (kernel_size - 1) / 2;
+ int end_h = down_ph + (kernel_size - 1) / 2 + 1;
+
+ for (int iy = start_h; iy < end_h; iy++) {
+ for (int ix = start_w; ix < end_w; ix++) {
+ if (iy < 0 || iy > down_height - 1 || ix < 0 || ix > down_width - 1) {
+ continue;
+ }
+ int mask_iy = iy - down_ph + (kernel_size - 1) / 2;
+ int mask_ix = ix - down_pw + (kernel_size - 1) / 2;
+ int mask_c =
+ (mask_group * kernel_size + mask_iy) * kernel_size + mask_ix;
+ int feat_index =
+ Loc2Index(n, c, iy, ix, channels, down_height, down_width);
+ int mask_index =
+ Loc2Index(n, mask_c, ph, pw, mask_channels, height, width);
+ atomicAdd(bottom_diff + feat_index,
+ bottom_masks[mask_index] * top_diff[index]);
+ atomicAdd(mask_diff + mask_index,
+ bottom_data[feat_index] * top_diff[index]);
+ }
+ }
+ }
+}
+
+#endif // CARAFE_NAIVE_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/cc_attention_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/cc_attention_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..15e07d19702fcdf0a03f6a361b178d9c6ad6a075
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/cc_attention_cuda_kernel.cuh
@@ -0,0 +1,167 @@
+#ifndef CC_ATTENTION_CUDA_KERNEL_CUH
+#define CC_ATTENTION_CUDA_KERNEL_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+
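+// Criss-Cross attention: every position attends to the `width` positions in
+// its row plus the `height - 1` remaining positions in its column, hence
+// len = height + width - 1 attention weights per position in `weight`.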
+template <typename T>
+__global__ void ca_forward_kernel(const T *t, const T *f, T *weight, int num,
+ int chn, int height, int width) {
+ int x = blockIdx.x * blockDim.x + threadIdx.x;
+ int y = blockIdx.y * blockDim.y + threadIdx.y;
+ int sp = height * width;
+ int len = height + width - 1;
+ int z = blockIdx.z % len;
+ int batch = blockIdx.z / len;
+
+ if (x < width && y < height) {
+ T *weight_ptr = weight + (batch * len + z) * sp + y * width + x;
+ const int t_offset = y * width + x;
+ const int j = (z - width < y) ? z - width : z - width + 1;
+ const int f_offset = z < width ? y * width + z : j * width + x;
+ for (int plane = 0; plane < chn; ++plane) {
+ const int tf_base = (batch * chn + plane) * sp;
+ *weight_ptr += t[tf_base + t_offset] * f[tf_base + f_offset];
+ }
+ }
+}
+
+template <typename T>
+__global__ void ca_backward_kernel_t(const T *dw, const T *t, const T *f, T *dt,
+ int num, int chn, int height, int width) {
+ int x = blockIdx.x * blockDim.x + threadIdx.x;
+ int y = blockIdx.y * blockDim.y + threadIdx.y;
+ int sp = height * width;
+ int len = height + width - 1;
+ int plane = blockIdx.z % chn;
+ int batch = blockIdx.z / chn;
+
+ if (x < width && y < height) {
+ for (int i = 0; i < width; ++i) {
+ T _dw = dw[(batch * len + i) * sp + y * width + x];
+ T _f = f[(batch * chn + plane) * sp + y * width + i];
+ dt[(batch * chn + plane) * sp + y * width + x] += _dw * _f;
+ }
+ for (int i = 0; i < height; ++i) {
+ if (i == y) continue;
+ int j = i < y ? i : i - 1;
+
+ T _dw = dw[(batch * len + width + j) * sp + y * width + x];
+ T _f = f[(batch * chn + plane) * sp + i * width + x];
+ dt[(batch * chn + plane) * sp + y * width + x] += _dw * _f;
+ }
+ }
+}
+
+template <typename T>
+__global__ void ca_backward_kernel_f(const T *dw, const T *t, const T *f, T *df,
+ int num, int chn, int height, int width) {
+ int x = blockIdx.x * blockDim.x + threadIdx.x;
+ int y = blockIdx.y * blockDim.y + threadIdx.y;
+ int sp = height * width;
+ int len = height + width - 1;
+ int plane = blockIdx.z % chn;
+ int batch = blockIdx.z / chn;
+
+ if (x < width && y < height) {
+ for (int i = 0; i < width; ++i) {
+ T _dw = dw[(batch * len + x) * sp + y * width + i];
+ T _t = t[(batch * chn + plane) * sp + y * width + i];
+ df[(batch * chn + plane) * sp + y * width + x] += _dw * _t;
+ }
+ for (int i = 0; i < height; ++i) {
+ if (i == y) continue;
+ int j = i > y ? y : y - 1;
+
+ T _dw = dw[(batch * len + width + j) * sp + i * width + x];
+ T _t = t[(batch * chn + plane) * sp + i * width + x];
+ df[(batch * chn + plane) * sp + y * width + x] += _dw * _t;
+ }
+ }
+}
+
+template <typename T>
+__global__ void ca_map_forward_kernel(const T *weight, const T *g, T *out,
+ int num, int chn, int height, int width) {
+ int x = blockIdx.x * blockDim.x + threadIdx.x;
+ int y = blockIdx.y * blockDim.y + threadIdx.y;
+ int sp = height * width;
+ int len = height + width - 1;
+ int plane = blockIdx.z % chn;
+ int batch = blockIdx.z / chn;
+ if (x < width && y < height) {
+ for (int i = 0; i < width; ++i) {
+ T _g = g[(batch * chn + plane) * sp + y * width + i];
+ T _w = weight[(batch * len + i) * sp + y * width + x];
+ out[(batch * chn + plane) * sp + y * width + x] += _g * _w;
+ }
+ for (int i = 0; i < height; ++i) {
+ if (i == y) continue;
+
+ int j = i < y ? i : i - 1;
+
+ T _g = g[(batch * chn + plane) * sp + i * width + x];
+ T _w = weight[(batch * len + width + j) * sp + y * width + x];
+ out[(batch * chn + plane) * sp + y * width + x] += _g * _w;
+ }
+ }
+}
+
+template <typename T>
+__global__ void ca_map_backward_kernel_w(const T *dout, const T *weight,
+ const T *g, T *dw, int num, int chn,
+ int height, int width) {
+ int x = blockIdx.x * blockDim.x + threadIdx.x;
+ int y = blockIdx.y * blockDim.y + threadIdx.y;
+ int sp = height * width;
+ int len = height + width - 1;
+
+ int z = blockIdx.z % len;
+ int batch = blockIdx.z / len;
+
+ if (x < width && y < height) {
+ int widx = (batch * len + z) * sp + y * width + x;
+ int dout_idx = batch * chn * sp + y * width + x;
+ int gidx = batch * chn * sp;
+ if (z < width) {
+ gidx += y * width + z;
+ } else {
+ int j = z - width;
+ j = j < y ? j : j + 1;
+ gidx += j * width + x;
+ }
+ for (int plane = 0; plane < chn; plane++) {
+ dw[widx] += dout[dout_idx + plane * sp] * g[gidx + plane * sp];
+ }
+ }
+}
+
+template <typename T>
+__global__ void ca_map_backward_kernel_g(const T *dout, const T *weight,
+ const T *g, T *dg, int num, int chn,
+ int height, int width) {
+ int x = blockIdx.x * blockDim.x + threadIdx.x;
+ int y = blockIdx.y * blockDim.y + threadIdx.y;
+ int sp = height * width;
+ int len = height + width - 1;
+ int plane = blockIdx.z % chn;
+ int batch = blockIdx.z / chn;
+ int index = (batch * chn + plane) * sp + y * width + x;
+
+ if (x < width && y < height) {
+ for (int i = 0; i < width; ++i) {
+ dg[index] += dout[(batch * chn + plane) * sp + y * width + i] *
+ weight[(batch * len + x) * sp + y * width + i];
+ }
+ for (int i = 0; i < height; ++i) {
+ if (i == y) continue;
+ int j = i > y ? y : y - 1;
+ dg[index] += dout[(batch * chn + plane) * sp + i * width + x] *
+ weight[(batch * len + width + j) * sp + i * width + x];
+ }
+ }
+}
+#endif // CC_ATTENTION_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/common_cuda_helper.hpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/common_cuda_helper.hpp
new file mode 100644
index 0000000000000000000000000000000000000000..a9ab6e82f1f50f1ea6fc27b42888efa73290eb28
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/common_cuda_helper.hpp
@@ -0,0 +1,110 @@
+#ifndef COMMON_CUDA_HELPER
+#define COMMON_CUDA_HELPER
+
+#include <cuda.h>
+
+#define CUDA_1D_KERNEL_LOOP(i, n) \
+ for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); \
+ i += blockDim.x * gridDim.x)
+
+#define THREADS_PER_BLOCK 512
+
+inline int GET_BLOCKS(const int N) {
+ int optimal_block_num = (N + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
+ int max_block_num = 4096;
+ return min(optimal_block_num, max_block_num);
+}
+
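+// Typical usage (illustrative, not part of this header): a kernel written
+// with CUDA_1D_KERNEL_LOOP(i, n) is launched as
+//   my_kernel<<<GET_BLOCKS(n), THREADS_PER_BLOCK, 0, stream>>>(...);
+// the grid-stride loop lets the capped grid (at most 4096 blocks) cover any n.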
+template <typename T>
+__device__ T bilinear_interpolate(const T* input, const int height,
+ const int width, T y, T x,
+ const int index /* index for debug only*/) {
+ // deal with cases that inverse elements are out of feature map boundary
+ if (y < -1.0 || y > height || x < -1.0 || x > width) return 0;
+
+ if (y <= 0) y = 0;
+ if (x <= 0) x = 0;
+
+ int y_low = (int)y;
+ int x_low = (int)x;
+ int y_high;
+ int x_high;
+
+ if (y_low >= height - 1) {
+ y_high = y_low = height - 1;
+ y = (T)y_low;
+ } else {
+ y_high = y_low + 1;
+ }
+
+ if (x_low >= width - 1) {
+ x_high = x_low = width - 1;
+ x = (T)x_low;
+ } else {
+ x_high = x_low + 1;
+ }
+
+ T ly = y - y_low;
+ T lx = x - x_low;
+ T hy = 1. - ly, hx = 1. - lx;
+ // do bilinear interpolation
+ T v1 = input[y_low * width + x_low];
+ T v2 = input[y_low * width + x_high];
+ T v3 = input[y_high * width + x_low];
+ T v4 = input[y_high * width + x_high];
+ T w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+ T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+
+ return val;
+}
+
+template <typename T>
+__device__ void bilinear_interpolate_gradient(
+ const int height, const int width, T y, T x, T& w1, T& w2, T& w3, T& w4,
+ int& x_low, int& x_high, int& y_low, int& y_high,
+ const int index /* index for debug only*/) {
+ // deal with cases that inverse elements are out of feature map boundary
+ if (y < -1.0 || y > height || x < -1.0 || x > width) {
+ // empty
+ w1 = w2 = w3 = w4 = 0.;
+ x_low = x_high = y_low = y_high = -1;
+ return;
+ }
+
+ if (y <= 0) y = 0;
+ if (x <= 0) x = 0;
+
+ y_low = (int)y;
+ x_low = (int)x;
+
+ if (y_low >= height - 1) {
+ y_high = y_low = height - 1;
+ y = (T)y_low;
+ } else {
+ y_high = y_low + 1;
+ }
+
+ if (x_low >= width - 1) {
+ x_high = x_low = width - 1;
+ x = (T)x_low;
+ } else {
+ x_high = x_low + 1;
+ }
+
+ T ly = y - y_low;
+ T lx = x - x_low;
+ T hy = 1. - ly, hx = 1. - lx;
+
+ // reference in forward
+ // T v1 = input[y_low * width + x_low];
+ // T v2 = input[y_low * width + x_high];
+ // T v3 = input[y_high * width + x_low];
+ // T v4 = input[y_high * width + x_high];
+ // T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+
+ w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;
+
+ return;
+}
+#endif // COMMON_CUDA_HELPER
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/deform_conv_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/deform_conv_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..6b4d1bbd85bad1b87ee5d6b8a3cd3b29e3cbc411
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/deform_conv_cuda_kernel.cuh
@@ -0,0 +1,367 @@
+/*!
+ ******************* BEGIN Caffe Copyright Notice and Disclaimer
+ *****************
+ *
+ * COPYRIGHT
+ *
+ * All contributions by the University of California:
+ * Copyright (c) 2014-2017 The Regents of the University of California (Regents)
+ * All rights reserved.
+ *
+ * All other contributions:
+ * Copyright (c) 2014-2017, the respective contributors
+ * All rights reserved.
+ *
+ * Caffe uses a shared copyright model: each contributor holds copyright over
+ * their contributions to Caffe. The project versioning records all such
+ * contribution and copyright details. If a contributor wants to further mark
+ * their specific copyright on a particular contribution, they should indicate
+ * their copyright solely in the commit message of the change when it is
+ * committed.
+ *
+ * LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ *AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ *IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+ *FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ *DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ *SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ *CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ *OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * CONTRIBUTION AGREEMENT
+ *
+ * By contributing to the BVLC/caffe repository through pull-request, comment,
+ * or otherwise, the contributor releases their content to the
+ * license and copyright terms herein.
+ *
+ ***************** END Caffe Copyright Notice and Disclaimer
+ *********************
+ *
+ * Copyright (c) 2018 Microsoft
+ * Licensed under The MIT License [see LICENSE for details]
+ * \file modulated_deformable_im2col.cuh
+ * \brief Function definitions of converting an image to
+ * column matrix based on kernel, padding, dilation, and offset.
+ * These functions are mainly used in deformable convolution operators.
+ * \ref: https://arxiv.org/abs/1703.06211
+ * \author Yuwen Xiong, Haozhi Qi, Jifeng Dai, Xizhou Zhu, Han Hu, Dazhi Cheng
+ */
+
+// modified from
+// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/deform_conv_cuda_kernel.cu
+
+#ifndef DEFORM_CONV_CUDA_KERNEL_CUH
+#define DEFORM_CONV_CUDA_KERNEL_CUH
+
+#include <float.h>
+#ifdef MMCV_WITH_TRT
+#include "common_cuda_helper.hpp"
+#else // MMCV_WITH_TRT
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else // MMCV_USE_PARROTS
+#include "pytorch_cuda_helper.hpp"
+#endif // MMCV_USE_PARROTS
+#endif // MMCV_WITH_TRT
+
+template <typename T>
+__device__ T deformable_im2col_bilinear(const T *input, const int data_width,
+ const int height, const int width, T h,
+ T w) {
+ if (h <= -1 || height <= h || w <= -1 || width <= w) {
+ return 0;
+ }
+
+ int h_low = floorf(h);
+ int w_low = floorf(w);
+ int h_high = h_low + 1;
+ int w_high = w_low + 1;
+
+ T lh = h - h_low;
+ T lw = w - w_low;
+ T hh = 1 - lh, hw = 1 - lw;
+
+ T v1 = 0;
+ if (h_low >= 0 && w_low >= 0) v1 = input[h_low * data_width + w_low];
+ T v2 = 0;
+ if (h_low >= 0 && w_high <= width - 1)
+ v2 = input[h_low * data_width + w_high];
+ T v3 = 0;
+ if (h_high <= height - 1 && w_low >= 0)
+ v3 = input[h_high * data_width + w_low];
+ T v4 = 0;
+ if (h_high <= height - 1 && w_high <= width - 1)
+ v4 = input[h_high * data_width + w_high];
+
+ T w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;
+
+ T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+ return val;
+}
+
+template <typename T>
+__device__ T get_gradient_weight(T argmax_h, T argmax_w, const int h,
+ const int w, const int height,
+ const int width) {
+ if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
+ argmax_w >= width) {
+ // empty
+ return 0;
+ }
+
+ int argmax_h_low = floorf(argmax_h);
+ int argmax_w_low = floorf(argmax_w);
+ int argmax_h_high = argmax_h_low + 1;
+ int argmax_w_high = argmax_w_low + 1;
+
+ T weight = 0;
+ if (h == argmax_h_low && w == argmax_w_low)
+ weight = (h + 1 - argmax_h) * (w + 1 - argmax_w);
+ if (h == argmax_h_low && w == argmax_w_high)
+ weight = (h + 1 - argmax_h) * (argmax_w + 1 - w);
+ if (h == argmax_h_high && w == argmax_w_low)
+ weight = (argmax_h + 1 - h) * (w + 1 - argmax_w);
+ if (h == argmax_h_high && w == argmax_w_high)
+ weight = (argmax_h + 1 - h) * (argmax_w + 1 - w);
+ return weight;
+}
+
+template <typename T>
+__device__ T get_coordinate_weight(T argmax_h, T argmax_w, const int height,
+ const int width, const T *im_data,
+ const int data_width, const int bp_dir) {
+ if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
+ argmax_w >= width) {
+ // empty
+ return 0;
+ }
+
+ int argmax_h_low = floorf(argmax_h);
+ int argmax_w_low = floorf(argmax_w);
+ int argmax_h_high = argmax_h_low + 1;
+ int argmax_w_high = argmax_w_low + 1;
+
+ T weight = 0;
+
+ if (bp_dir == 0) {
+ if (argmax_h_low >= 0 && argmax_w_low >= 0)
+ weight += -1 * (argmax_w_low + 1 - argmax_w) *
+ im_data[argmax_h_low * data_width + argmax_w_low];
+ if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
+ weight += -1 * (argmax_w - argmax_w_low) *
+ im_data[argmax_h_low * data_width + argmax_w_high];
+ if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
+ weight += (argmax_w_low + 1 - argmax_w) *
+ im_data[argmax_h_high * data_width + argmax_w_low];
+ if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
+ weight += (argmax_w - argmax_w_low) *
+ im_data[argmax_h_high * data_width + argmax_w_high];
+ } else if (bp_dir == 1) {
+ if (argmax_h_low >= 0 && argmax_w_low >= 0)
+ weight += -1 * (argmax_h_low + 1 - argmax_h) *
+ im_data[argmax_h_low * data_width + argmax_w_low];
+ if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
+ weight += (argmax_h_low + 1 - argmax_h) *
+ im_data[argmax_h_low * data_width + argmax_w_high];
+ if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
+ weight += -1 * (argmax_h - argmax_h_low) *
+ im_data[argmax_h_high * data_width + argmax_w_low];
+ if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
+ weight += (argmax_h - argmax_h_low) *
+ im_data[argmax_h_high * data_width + argmax_w_high];
+ }
+
+ return weight;
+}
+
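+// deformable_im2col: each thread fills one column entry. The column buffer is
+// ordered as (c_im * kernel_h * kernel_w, batch, h_col, w_col), i.e.
+// channels-and-kernel first, then batch, then the output spatial grid,
+// matching the pointer arithmetic in the kernel below.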
+template <typename T>
+__global__ void deformable_im2col_gpu_kernel(
+ const int n, const T *data_im, const T *data_offset, const int height,
+ const int width, const int kernel_h, const int kernel_w, const int pad_h,
+ const int pad_w, const int stride_h, const int stride_w,
+ const int dilation_h, const int dilation_w,
+ const int channel_per_deformable_group, const int batch_size,
+ const int num_channels, const int deformable_group, const int height_col,
+ const int width_col, T *data_col) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ // index index of output matrix
+ const int w_col = index % width_col;
+ const int h_col = (index / width_col) % height_col;
+ const int b_col = (index / width_col / height_col) % batch_size;
+ const int c_im = (index / width_col / height_col) / batch_size;
+ const int c_col = c_im * kernel_h * kernel_w;
+
+ // compute deformable group index
+ const int deformable_group_index = c_im / channel_per_deformable_group;
+
+ const int h_in = h_col * stride_h - pad_h;
+ const int w_in = w_col * stride_w - pad_w;
+ T *data_col_ptr =
+ data_col +
+ ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;
+ const T *data_im_ptr =
+ data_im + (b_col * num_channels + c_im) * height * width;
+ const T *data_offset_ptr =
+ data_offset + (b_col * deformable_group + deformable_group_index) * 2 *
+ kernel_h * kernel_w * height_col * width_col;
+
+ for (int i = 0; i < kernel_h; ++i) {
+ for (int j = 0; j < kernel_w; ++j) {
+ const int data_offset_h_ptr =
+ ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;
+ const int data_offset_w_ptr =
+ ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col +
+ w_col;
+ const T offset_h = data_offset_ptr[data_offset_h_ptr];
+ const T offset_w = data_offset_ptr[data_offset_w_ptr];
+ T val = static_cast<T>(0);
+ const T h_im = h_in + i * dilation_h + offset_h;
+ const T w_im = w_in + j * dilation_w + offset_w;
+ if (h_im > -1 && w_im > -1 && h_im < height && w_im < width)
+ val = deformable_im2col_bilinear(data_im_ptr, width, height, width,
+ h_im, w_im);
+ *data_col_ptr = val;
+ data_col_ptr += batch_size * height_col * width_col;
+ }
+ }
+ }
+}
+
+template <typename T>
+__global__ void deformable_col2im_gpu_kernel(
+ const int n, const T *data_col, const T *data_offset, const int channels,
+ const int height, const int width, const int kernel_h, const int kernel_w,
+ const int pad_h, const int pad_w, const int stride_h, const int stride_w,
+ const int dilation_h, const int dilation_w,
+ const int channel_per_deformable_group, const int batch_size,
+ const int deformable_group, const int height_col, const int width_col,
+ T *grad_im) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ const int j = (index / width_col / height_col / batch_size) % kernel_w;
+ const int i =
+ (index / width_col / height_col / batch_size / kernel_w) % kernel_h;
+ const int c =
+ index / width_col / height_col / batch_size / kernel_w / kernel_h;
+ // compute the start and end of the output
+
+ const int deformable_group_index = c / channel_per_deformable_group;
+
+ int w_out = index % width_col;
+ int h_out = (index / width_col) % height_col;
+ int b = (index / width_col / height_col) % batch_size;
+ int w_in = w_out * stride_w - pad_w;
+ int h_in = h_out * stride_h - pad_h;
+
+ const T *data_offset_ptr =
+ data_offset + (b * deformable_group + deformable_group_index) * 2 *
+ kernel_h * kernel_w * height_col * width_col;
+ const int data_offset_h_ptr =
+ ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;
+ const int data_offset_w_ptr =
+ ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;
+ const T offset_h = data_offset_ptr[data_offset_h_ptr];
+ const T offset_w = data_offset_ptr[data_offset_w_ptr];
+ const T cur_inv_h_data = h_in + i * dilation_h + offset_h;
+ const T cur_inv_w_data = w_in + j * dilation_w + offset_w;
+
+ const T cur_top_grad = data_col[index];
+ const int cur_h = (int)cur_inv_h_data;
+ const int cur_w = (int)cur_inv_w_data;
+ for (int dy = -2; dy <= 2; dy++) {
+ for (int dx = -2; dx <= 2; dx++) {
+ if (cur_h + dy >= 0 && cur_h + dy < height && cur_w + dx >= 0 &&
+ cur_w + dx < width && abs(cur_inv_h_data - (cur_h + dy)) < 1 &&
+ abs(cur_inv_w_data - (cur_w + dx)) < 1) {
+ int cur_bottom_grad_pos =
+ ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;
+ T weight = get_gradient_weight(cur_inv_h_data, cur_inv_w_data,
+ cur_h + dy, cur_w + dx, height, width);
+ atomicAdd(grad_im + cur_bottom_grad_pos, weight * cur_top_grad);
+ }
+ }
+ }
+ }
+}
+
+template <typename T>
+__global__ void deformable_col2im_coord_gpu_kernel(
+ const int n, const T *data_col, const T *data_im, const T *data_offset,
+ const int channels, const int height, const int width, const int kernel_h,
+ const int kernel_w, const int pad_h, const int pad_w, const int stride_h,
+ const int stride_w, const int dilation_h, const int dilation_w,
+ const int channel_per_deformable_group, const int batch_size,
+ const int offset_channels, const int deformable_group, const int height_col,
+ const int width_col, T *grad_offset) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ T val = 0;
+ int w = index % width_col;
+ int h = (index / width_col) % height_col;
+ int c = (index / width_col / height_col) % offset_channels;
+ int b = (index / width_col / height_col) / offset_channels;
+ // compute the start and end of the output
+
+ const int deformable_group_index = c / (2 * kernel_h * kernel_w);
+ const int col_step = kernel_h * kernel_w;
+ int cnt = 0;
+ const T *data_col_ptr = data_col + deformable_group_index *
+ channel_per_deformable_group *
+ batch_size * width_col * height_col;
+ const T *data_im_ptr =
+ data_im + (b * deformable_group + deformable_group_index) *
+ channel_per_deformable_group / kernel_h / kernel_w *
+ height * width;
+ const T *data_offset_ptr =
+ data_offset + (b * deformable_group + deformable_group_index) * 2 *
+ kernel_h * kernel_w * height_col * width_col;
+
+ const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;
+
+ for (int col_c = (offset_c / 2); col_c < channel_per_deformable_group;
+ col_c += col_step) {
+ const int col_pos =
+ (((col_c * batch_size + b) * height_col) + h) * width_col + w;
+ const int bp_dir = offset_c % 2;
+
+ int j = (col_pos / width_col / height_col / batch_size) % kernel_w;
+ int i =
+ (col_pos / width_col / height_col / batch_size / kernel_w) % kernel_h;
+ int w_out = col_pos % width_col;
+ int h_out = (col_pos / width_col) % height_col;
+ int w_in = w_out * stride_w - pad_w;
+ int h_in = h_out * stride_h - pad_h;
+ const int data_offset_h_ptr =
+ (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out);
+ const int data_offset_w_ptr =
+ (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col +
+ w_out);
+ const T offset_h = data_offset_ptr[data_offset_h_ptr];
+ const T offset_w = data_offset_ptr[data_offset_w_ptr];
+ T inv_h = h_in + i * dilation_h + offset_h;
+ T inv_w = w_in + j * dilation_w + offset_w;
+ if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width)
+ inv_h = inv_w = -2;
+ const T weight = get_coordinate_weight(inv_h, inv_w, height, width,
+ data_im_ptr + cnt * height * width,
+ width, bp_dir);
+ val += weight * data_col_ptr[col_pos];
+ cnt += 1;
+ }
+
+ grad_offset[index] = val;
+ }
+}
+
+#endif // DEFORM_CONV_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/deform_roi_pool_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/deform_roi_pool_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..cddb8d5e9edf5a8737a547ba388473a6b222931e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/deform_roi_pool_cuda_kernel.cuh
@@ -0,0 +1,185 @@
+#ifndef DEFORM_ROI_POOL_CUDA_KERNEL_CUH
+#define DEFORM_ROI_POOL_CUDA_KERNEL_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+
+template <typename T>
+__global__ void deform_roi_pool_forward_cuda_kernel(
+ const int nthreads, const T* input, const T* rois, const T* offset,
+ T* output, const int pooled_height, const int pooled_width,
+ const T spatial_scale, const int sampling_ratio, const T gamma,
+ const int channels, const int height, const int width) {
+ CUDA_1D_KERNEL_LOOP(index, nthreads) {
+ // (n, c, ph, pw) is an element in the pooled output
+ int pw = index % pooled_width;
+ int ph = (index / pooled_width) % pooled_height;
+ int c = (index / pooled_width / pooled_height) % channels;
+ int n = index / pooled_width / pooled_height / channels;
+
+ const T* offset_rois = rois + n * 5;
+ int roi_batch_ind = offset_rois[0];
+
+ // Do not use rounding; this implementation detail is critical
+ T roi_start_w = offset_rois[1] * spatial_scale - 0.5;
+ T roi_start_h = offset_rois[2] * spatial_scale - 0.5;
+ T roi_end_w = offset_rois[3] * spatial_scale - 0.5;
+ T roi_end_h = offset_rois[4] * spatial_scale - 0.5;
+
+ T roi_width = roi_end_w - roi_start_w;
+ T roi_height = roi_end_h - roi_start_h;
+
+ T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
+ T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);
+
+ const T* offset_input =
+ input + (roi_batch_ind * channels + c) * height * width;
+
+ // We use roi_bin_grid to sample the grid and mimic integral
+ int roi_bin_grid_h =
+ (sampling_ratio > 0)
+ ? sampling_ratio
+ : static_cast<int>(ceilf(roi_height / pooled_height));
+ int roi_bin_grid_w =
+ (sampling_ratio > 0)
+ ? sampling_ratio
+ : static_cast<int>(ceilf(roi_width / pooled_width));
+
+ // Compute roi offset
+ if (offset != NULL) {
+ const T* offset_cur_w = offset + n * pooled_width * pooled_height * 2 +
+ ph * pooled_width + pw;
+ T offset_roi_w = gamma * roi_width * offset_cur_w[0];
+ T offset_roi_h =
+ gamma * roi_height * offset_cur_w[pooled_width * pooled_height];
+ roi_start_w += offset_roi_w;
+ roi_start_h += offset_roi_h;
+ }
+
+ // We do average pooling inside a bin
+ const T count = max(roi_bin_grid_h * roi_bin_grid_w, 1);
+ T output_val = 0.;
+ for (int iy = 0; iy < roi_bin_grid_h; iy++) {
+ const T y = roi_start_h + ph * bin_size_h +
+ static_cast<T>(iy + .5f) * bin_size_h /
+ static_cast<T>(roi_bin_grid_h);
+ for (int ix = 0; ix < roi_bin_grid_w; ix++) {
+ const T x = roi_start_w + pw * bin_size_w +
+ static_cast<T>(ix + .5f) * bin_size_w /
+ static_cast<T>(roi_bin_grid_w);
+ T val = bilinear_interpolate(offset_input, height, width, y, x, index);
+ output_val += val;
+ }
+ }
+ output[index] = output_val / count;
+ }
+}
+
+template <typename T>
+__global__ void deform_roi_pool_backward_cuda_kernel(
+ const int nthreads, const T* grad_output, const T* input, const T* rois,
+ const T* offset, T* grad_input, T* grad_offset, const int pooled_height,
+ const int pooled_width, const T spatial_scale, const int sampling_ratio,
+ const T gamma, const int channels, const int height, const int width) {
+ CUDA_1D_KERNEL_LOOP(index, nthreads) {
+ // (n, c, ph, pw) is an element in the pooled output
+ int pw = index % pooled_width;
+ int ph = (index / pooled_width) % pooled_height;
+ int c = (index / pooled_width / pooled_height) % channels;
+ int n = index / pooled_width / pooled_height / channels;
+
+ const T* offset_rois = rois + n * 5;
+ int roi_batch_ind = offset_rois[0];
+ const T* offset_input =
+ input + ((roi_batch_ind * channels + c) * height * width);
+ T* offset_grad_input =
+ grad_input + ((roi_batch_ind * channels + c) * height * width);
+
+ // Do not use rounding; this implementation detail is critical
+ T roi_start_w = offset_rois[1] * spatial_scale - 0.5;
+ T roi_start_h = offset_rois[2] * spatial_scale - 0.5;
+ T roi_end_w = offset_rois[3] * spatial_scale - 0.5;
+ T roi_end_h = offset_rois[4] * spatial_scale - 0.5;
+
+ T roi_width = roi_end_w - roi_start_w;
+ T roi_height = roi_end_h - roi_start_h;
+
+ T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
+ T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);
+
+ // We use roi_bin_grid to sample the grid and mimic integral
+ int roi_bin_grid_h =
+ (sampling_ratio > 0)
+ ? sampling_ratio
+ : static_cast<int>(ceilf(roi_height / pooled_height));
+ int roi_bin_grid_w =
+ (sampling_ratio > 0)
+ ? sampling_ratio
+ : static_cast<int>(ceilf(roi_width / pooled_width));
+
+ // Compute roi offset
+ if (offset != NULL) {
+ const T* offset_cur_w = offset + n * pooled_width * pooled_height * 2 +
+ ph * pooled_width + pw;
+ T offset_roi_w = gamma * roi_width * offset_cur_w[0];
+ T offset_roi_h =
+ gamma * roi_height * offset_cur_w[pooled_width * pooled_height];
+ roi_start_w += offset_roi_w;
+ roi_start_h += offset_roi_h;
+ }
+
+ // We do average (integral) pooling inside a bin
+ const T count = roi_bin_grid_h * roi_bin_grid_w; // e.g. = 4
+ const T grad_output_this_bin = grad_output[index] / count;
+
+ for (int iy = 0; iy < roi_bin_grid_h; iy++) {
+ const T y = roi_start_h + ph * bin_size_h +
+ static_cast<T>(iy + .5f) * bin_size_h /
+ static_cast<T>(roi_bin_grid_h);
+ for (int ix = 0; ix < roi_bin_grid_w; ix++) {
+ const T x = roi_start_w + pw * bin_size_w +
+ static_cast<T>(ix + .5f) * bin_size_w /
+ static_cast<T>(roi_bin_grid_w);
+
+ T w1, w2, w3, w4;
+ int x_low, x_high, y_low, y_high;
+ bilinear_interpolate_gradient(height, width, y, x, w1, w2, w3, w4,
+ x_low, x_high, y_low, y_high, index);
+
+ if (x_low >= 0 && x_high >= 0 && y_low >= 0 && y_high >= 0) {
+ atomicAdd(offset_grad_input + y_low * width + x_low,
+ grad_output_this_bin * w1);
+ atomicAdd(offset_grad_input + y_low * width + x_high,
+ grad_output_this_bin * w2);
+ atomicAdd(offset_grad_input + y_high * width + x_low,
+ grad_output_this_bin * w3);
+ atomicAdd(offset_grad_input + y_high * width + x_high,
+ grad_output_this_bin * w4);
+ if (offset != NULL) {
+ T input_00 = offset_input[y_low * width + x_low];
+ T input_10 = offset_input[y_low * width + x_high];
+ T input_01 = offset_input[y_high * width + x_low];
+ T input_11 = offset_input[y_high * width + x_high];
+ T ogx = gamma * roi_width * grad_output_this_bin *
+ (input_11 * (y - y_low) + input_10 * (y_high - y) +
+ input_01 * (y_low - y) + input_00 * (y - y_high));
+ T ogy = gamma * roi_height * grad_output_this_bin *
+ (input_11 * (x - x_low) + input_01 * (x_high - x) +
+ input_10 * (x_low - x) + input_00 * (x - x_high));
+ atomicAdd(grad_offset + n * pooled_width * pooled_height * 2 +
+ ph * pooled_width + pw,
+ ogx);
+ atomicAdd(grad_offset + n * pooled_width * pooled_height * 2 +
+ pooled_width * pooled_height + ph * pooled_width + pw,
+ ogy);
+ }
+ }
+ }
+ }
+ }
+}
+
+#endif // DEFORM_ROI_POOL_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/masked_conv2d_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/masked_conv2d_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..4be8329ae30fc0598cd37cc63f9a4bd07c400a27
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/masked_conv2d_cuda_kernel.cuh
@@ -0,0 +1,61 @@
+#ifndef MASKED_CONV2D_CUDA_KERNEL_CUH
+#define MASKED_CONV2D_CUDA_KERNEL_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+
+template <typename scalar_t>
+__global__ void MaskedIm2colForward(const int n, const scalar_t *data_im,
+ const int height, const int width,
+ const int kernel_h, const int kernel_w,
+ const int pad_h, const int pad_w,
+ const int64_t *mask_h_idx,
+ const int64_t *mask_w_idx,
+ const int mask_cnt, scalar_t *data_col) {
+ // mask_cnt * channels
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ const int m_index = index % mask_cnt;
+ const int h_col = mask_h_idx[m_index];
+ const int w_col = mask_w_idx[m_index];
+ const int c_im = index / mask_cnt;
+ const int c_col = c_im * kernel_h * kernel_w;
+ const int h_offset = h_col - pad_h;
+ const int w_offset = w_col - pad_w;
+ scalar_t *data_col_ptr = data_col + c_col * mask_cnt + m_index;
+ for (int i = 0; i < kernel_h; ++i) {
+ int h_im = h_offset + i;
+ for (int j = 0; j < kernel_w; ++j) {
+ int w_im = w_offset + j;
+ if (h_im >= 0 && w_im >= 0 && h_im < height && w_im < width) {
+ *data_col_ptr =
+ (scalar_t)data_im[(c_im * height + h_im) * width + w_im];
+ } else {
+ *data_col_ptr = 0.0;
+ }
+ data_col_ptr += mask_cnt;
+ }
+ }
+ }
+}
+
+template <typename scalar_t>
+__global__ void MaskedCol2imForward(const int n, const scalar_t *data_col,
+ const int height, const int width,
+ const int channels,
+ const int64_t *mask_h_idx,
+ const int64_t *mask_w_idx,
+ const int mask_cnt, scalar_t *data_im) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ const int m_index = index % mask_cnt;
+ const int h_im = mask_h_idx[m_index];
+ const int w_im = mask_w_idx[m_index];
+ const int c_im = index / mask_cnt;
+ // compute the start and end of the output
+ data_im[(c_im * height + h_im) * width + w_im] = data_col[index];
+ }
+}
+
+#endif // MASKED_CONV2D_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/modulated_deform_conv_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/modulated_deform_conv_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..ca0e91a25246569bb7de04649ab4f5afe233670c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/modulated_deform_conv_cuda_kernel.cuh
@@ -0,0 +1,399 @@
+/*!
+ ******************* BEGIN Caffe Copyright Notice and Disclaimer
+ *****************
+ *
+ * COPYRIGHT
+ *
+ * All contributions by the University of California:
+ * Copyright (c) 2014-2017 The Regents of the University of California (Regents)
+ * All rights reserved.
+ *
+ * All other contributions:
+ * Copyright (c) 2014-2017, the respective contributors
+ * All rights reserved.
+ *
+ * Caffe uses a shared copyright model: each contributor holds copyright over
+ * their contributions to Caffe. The project versioning records all such
+ * contribution and copyright details. If a contributor wants to further mark
+ * their specific copyright on a particular contribution, they should indicate
+ * their copyright solely in the commit message of the change when it is
+ * committed.
+ *
+ * LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ *this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright notice,
+ * this list of conditions and the following disclaimer in the documentation
+ * and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ *AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ *IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+ *FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ *DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ *SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+ *CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ *OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * CONTRIBUTION AGREEMENT
+ *
+ * By contributing to the BVLC/caffe repository through pull-request, comment,
+ * or otherwise, the contributor releases their content to the
+ * license and copyright terms herein.
+ *
+ ***************** END Caffe Copyright Notice and Disclaimer
+ *********************
+ *
+ * Copyright (c) 2018 Microsoft
+ * Licensed under The MIT License [see LICENSE for details]
+ * \file modulated_deformable_im2col.cuh
+ * \brief Function definitions of converting an image to
+ * column matrix based on kernel, padding, dilation, and offset.
+ * These functions are mainly used in deformable convolution operators.
+ * \ref: https://arxiv.org/abs/1703.06211
+ * \author Yuwen Xiong, Haozhi Qi, Jifeng Dai, Xizhou Zhu, Han Hu, Dazhi Cheng
+ */
+
+// modified from
+// https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/blob/mmdetection/mmdet/ops/dcn/src/deform_conv_cuda_kernel.cu
+
+#ifndef MODULATED_DEFORM_CONV_CUDA_KERNEL_CUH
+#define MODULATED_DEFORM_CONV_CUDA_KERNEL_CUH
+
+#include <float.h>
+#ifdef MMCV_WITH_TRT
+#include "common_cuda_helper.hpp"
+#else // MMCV_WITH_TRT
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else // MMCV_USE_PARROTS
+#include "pytorch_cuda_helper.hpp"
+#endif // MMCV_USE_PARROTS
+#endif // MMCV_WITH_TRT
+
+template <typename T>
+__device__ T dmcn_im2col_bilinear(const T *input, const int data_width,
+ const int height, const int width, T h, T w) {
+ int h_low = floorf(h);
+ int w_low = floorf(w);
+ int h_high = h_low + 1;
+ int w_high = w_low + 1;
+
+ T lh = h - h_low;
+ T lw = w - w_low;
+ T hh = 1 - lh, hw = 1 - lw;
+
+ T v1 = 0;
+ if (h_low >= 0 && w_low >= 0) v1 = input[h_low * data_width + w_low];
+ T v2 = 0;
+ if (h_low >= 0 && w_high <= width - 1)
+ v2 = input[h_low * data_width + w_high];
+ T v3 = 0;
+ if (h_high <= height - 1 && w_low >= 0)
+ v3 = input[h_high * data_width + w_low];
+ T v4 = 0;
+ if (h_high <= height - 1 && w_high <= width - 1)
+ v4 = input[h_high * data_width + w_high];
+
+ T w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;
+
+ T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+ return val;
+}
+
+template <typename T>
+__device__ T dmcn_get_gradient_weight(T argmax_h, T argmax_w, const int h,
+ const int w, const int height,
+ const int width) {
+ if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
+ argmax_w >= width) {
+ // empty
+ return 0;
+ }
+
+ int argmax_h_low = floorf(argmax_h);
+ int argmax_w_low = floorf(argmax_w);
+ int argmax_h_high = argmax_h_low + 1;
+ int argmax_w_high = argmax_w_low + 1;
+
+ T weight = 0;
+ if (h == argmax_h_low && w == argmax_w_low)
+ weight = (h + 1 - argmax_h) * (w + 1 - argmax_w);
+ if (h == argmax_h_low && w == argmax_w_high)
+ weight = (h + 1 - argmax_h) * (argmax_w + 1 - w);
+ if (h == argmax_h_high && w == argmax_w_low)
+ weight = (argmax_h + 1 - h) * (w + 1 - argmax_w);
+ if (h == argmax_h_high && w == argmax_w_high)
+ weight = (argmax_h + 1 - h) * (argmax_w + 1 - w);
+ return weight;
+}
+
+template <typename T>
+__device__ T dmcn_get_coordinate_weight(T argmax_h, T argmax_w,
+ const int height, const int width,
+ const T *im_data, const int data_width,
+ const int bp_dir) {
+ if (argmax_h <= -1 || argmax_h >= height || argmax_w <= -1 ||
+ argmax_w >= width) {
+ // empty
+ return 0;
+ }
+
+ int argmax_h_low = floorf(argmax_h);
+ int argmax_w_low = floorf(argmax_w);
+ int argmax_h_high = argmax_h_low + 1;
+ int argmax_w_high = argmax_w_low + 1;
+
+ T weight = 0;
+
+ if (bp_dir == 0) {
+ if (argmax_h_low >= 0 && argmax_w_low >= 0)
+ weight += -1 * (argmax_w_low + 1 - argmax_w) *
+ im_data[argmax_h_low * data_width + argmax_w_low];
+ if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
+ weight += -1 * (argmax_w - argmax_w_low) *
+ im_data[argmax_h_low * data_width + argmax_w_high];
+ if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
+ weight += (argmax_w_low + 1 - argmax_w) *
+ im_data[argmax_h_high * data_width + argmax_w_low];
+ if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
+ weight += (argmax_w - argmax_w_low) *
+ im_data[argmax_h_high * data_width + argmax_w_high];
+ } else if (bp_dir == 1) {
+ if (argmax_h_low >= 0 && argmax_w_low >= 0)
+ weight += -1 * (argmax_h_low + 1 - argmax_h) *
+ im_data[argmax_h_low * data_width + argmax_w_low];
+ if (argmax_h_low >= 0 && argmax_w_high <= width - 1)
+ weight += (argmax_h_low + 1 - argmax_h) *
+ im_data[argmax_h_low * data_width + argmax_w_high];
+ if (argmax_h_high <= height - 1 && argmax_w_low >= 0)
+ weight += -1 * (argmax_h - argmax_h_low) *
+ im_data[argmax_h_high * data_width + argmax_w_low];
+ if (argmax_h_high <= height - 1 && argmax_w_high <= width - 1)
+ weight += (argmax_h - argmax_h_low) *
+ im_data[argmax_h_high * data_width + argmax_w_high];
+ }
+
+ return weight;
+}
+
+template <typename T>
+__global__ void modulated_deformable_im2col_gpu_kernel(
+ const int n, const T *data_im, const T *data_offset, const T *data_mask,
+ const int height, const int width, const int kernel_h, const int kernel_w,
+ const int pad_h, const int pad_w, const int stride_h, const int stride_w,
+ const int dilation_h, const int dilation_w,
+ const int channel_per_deformable_group, const int batch_size,
+ const int num_channels, const int deformable_group, const int height_col,
+ const int width_col, T *data_col) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ // index: index of the output matrix
+ const int w_col = index % width_col;
+ const int h_col = (index / width_col) % height_col;
+ const int b_col = (index / width_col / height_col) % batch_size;
+ const int c_im = (index / width_col / height_col) / batch_size;
+ const int c_col = c_im * kernel_h * kernel_w;
+
+ // compute deformable group index
+ const int deformable_group_index = c_im / channel_per_deformable_group;
+
+ const int h_in = h_col * stride_h - pad_h;
+ const int w_in = w_col * stride_w - pad_w;
+
+ T *data_col_ptr =
+ data_col +
+ ((c_col * batch_size + b_col) * height_col + h_col) * width_col + w_col;
+ const T *data_im_ptr =
+ data_im + (b_col * num_channels + c_im) * height * width;
+ const T *data_offset_ptr =
+ data_offset + (b_col * deformable_group + deformable_group_index) * 2 *
+ kernel_h * kernel_w * height_col * width_col;
+
+ const T *data_mask_ptr =
+ data_mask + (b_col * deformable_group + deformable_group_index) *
+ kernel_h * kernel_w * height_col * width_col;
+
+ for (int i = 0; i < kernel_h; ++i) {
+ for (int j = 0; j < kernel_w; ++j) {
+ const int data_offset_h_ptr =
+ ((2 * (i * kernel_w + j)) * height_col + h_col) * width_col + w_col;
+ const int data_offset_w_ptr =
+ ((2 * (i * kernel_w + j) + 1) * height_col + h_col) * width_col +
+ w_col;
+ const int data_mask_hw_ptr =
+ ((i * kernel_w + j) * height_col + h_col) * width_col + w_col;
+ const T offset_h = data_offset_ptr[data_offset_h_ptr];
+ const T offset_w = data_offset_ptr[data_offset_w_ptr];
+ const T mask = data_mask_ptr[data_mask_hw_ptr];
+ T val = static_cast<T>(0);
+ const T h_im = h_in + i * dilation_h + offset_h;
+ const T w_im = w_in + j * dilation_w + offset_w;
+ if (h_im > -1 && w_im > -1 && h_im < height && w_im < width)
+ val = dmcn_im2col_bilinear(data_im_ptr, width, height, width, h_im,
+ w_im);
+ *data_col_ptr = val * mask;
+ data_col_ptr += batch_size * height_col * width_col;
+ }
+ }
+ }
+}
+
+template <typename T>
+__global__ void modulated_deformable_col2im_gpu_kernel(
+ const int n, const T *data_col, const T *data_offset, const T *data_mask,
+ const int channels, const int height, const int width, const int kernel_h,
+ const int kernel_w, const int pad_h, const int pad_w, const int stride_h,
+ const int stride_w, const int dilation_h, const int dilation_w,
+ const int channel_per_deformable_group, const int batch_size,
+ const int deformable_group, const int height_col, const int width_col,
+ T *grad_im) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ const int j = (index / width_col / height_col / batch_size) % kernel_w;
+ const int i =
+ (index / width_col / height_col / batch_size / kernel_w) % kernel_h;
+ const int c =
+ index / width_col / height_col / batch_size / kernel_w / kernel_h;
+ // compute the start and end of the output
+
+ const int deformable_group_index = c / channel_per_deformable_group;
+
+ int w_out = index % width_col;
+ int h_out = (index / width_col) % height_col;
+ int b = (index / width_col / height_col) % batch_size;
+ int w_in = w_out * stride_w - pad_w;
+ int h_in = h_out * stride_h - pad_h;
+
+ const T *data_offset_ptr =
+ data_offset + (b * deformable_group + deformable_group_index) * 2 *
+ kernel_h * kernel_w * height_col * width_col;
+ const T *data_mask_ptr =
+ data_mask + (b * deformable_group + deformable_group_index) * kernel_h *
+ kernel_w * height_col * width_col;
+ const int data_offset_h_ptr =
+ ((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out;
+ const int data_offset_w_ptr =
+ ((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col + w_out;
+ const int data_mask_hw_ptr =
+ ((i * kernel_w + j) * height_col + h_out) * width_col + w_out;
+ const T offset_h = data_offset_ptr[data_offset_h_ptr];
+ const T offset_w = data_offset_ptr[data_offset_w_ptr];
+ const T mask = data_mask_ptr[data_mask_hw_ptr];
+ const T cur_inv_h_data = h_in + i * dilation_h + offset_h;
+ const T cur_inv_w_data = w_in + j * dilation_w + offset_w;
+
+ const T cur_top_grad = data_col[index] * mask;
+ const int cur_h = (int)cur_inv_h_data;
+ const int cur_w = (int)cur_inv_w_data;
+ for (int dy = -2; dy <= 2; dy++) {
+ for (int dx = -2; dx <= 2; dx++) {
+ if (cur_h + dy >= 0 && cur_h + dy < height && cur_w + dx >= 0 &&
+ cur_w + dx < width && abs(cur_inv_h_data - (cur_h + dy)) < 1 &&
+ abs(cur_inv_w_data - (cur_w + dx)) < 1) {
+ int cur_bottom_grad_pos =
+ ((b * channels + c) * height + cur_h + dy) * width + cur_w + dx;
+ T weight =
+ dmcn_get_gradient_weight(cur_inv_h_data, cur_inv_w_data,
+ cur_h + dy, cur_w + dx, height, width);
+ atomicAdd(grad_im + cur_bottom_grad_pos, weight * cur_top_grad);
+ }
+ }
+ }
+ }
+}
+
+template <typename T>
+__global__ void modulated_deformable_col2im_coord_gpu_kernel(
+ const int n, const T *data_col, const T *data_im, const T *data_offset,
+ const T *data_mask, const int channels, const int height, const int width,
+ const int kernel_h, const int kernel_w, const int pad_h, const int pad_w,
+ const int stride_h, const int stride_w, const int dilation_h,
+ const int dilation_w, const int channel_per_deformable_group,
+ const int batch_size, const int offset_channels, const int deformable_group,
+ const int height_col, const int width_col, T *grad_offset, T *grad_mask) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ T val = 0, mval = 0;
+ int w = index % width_col;
+ int h = (index / width_col) % height_col;
+ int c = (index / width_col / height_col) % offset_channels;
+ int b = (index / width_col / height_col) / offset_channels;
+ // compute the start and end of the output
+
+ const int deformable_group_index = c / (2 * kernel_h * kernel_w);
+ const int col_step = kernel_h * kernel_w;
+ int cnt = 0;
+ const T *data_col_ptr = data_col + deformable_group_index *
+ channel_per_deformable_group *
+ batch_size * width_col * height_col;
+ const T *data_im_ptr =
+ data_im + (b * deformable_group + deformable_group_index) *
+ channel_per_deformable_group / kernel_h / kernel_w *
+ height * width;
+ const T *data_offset_ptr =
+ data_offset + (b * deformable_group + deformable_group_index) * 2 *
+ kernel_h * kernel_w * height_col * width_col;
+ const T *data_mask_ptr =
+ data_mask + (b * deformable_group + deformable_group_index) * kernel_h *
+ kernel_w * height_col * width_col;
+
+ const int offset_c = c - deformable_group_index * 2 * kernel_h * kernel_w;
+
+ for (int col_c = (offset_c / 2); col_c < channel_per_deformable_group;
+ col_c += col_step) {
+ const int col_pos =
+ (((col_c * batch_size + b) * height_col) + h) * width_col + w;
+ const int bp_dir = offset_c % 2;
+
+ int j = (col_pos / width_col / height_col / batch_size) % kernel_w;
+ int i =
+ (col_pos / width_col / height_col / batch_size / kernel_w) % kernel_h;
+ int w_out = col_pos % width_col;
+ int h_out = (col_pos / width_col) % height_col;
+ int w_in = w_out * stride_w - pad_w;
+ int h_in = h_out * stride_h - pad_h;
+ const int data_offset_h_ptr =
+ (((2 * (i * kernel_w + j)) * height_col + h_out) * width_col + w_out);
+ const int data_offset_w_ptr =
+ (((2 * (i * kernel_w + j) + 1) * height_col + h_out) * width_col +
+ w_out);
+ const int data_mask_hw_ptr =
+ (((i * kernel_w + j) * height_col + h_out) * width_col + w_out);
+ const T offset_h = data_offset_ptr[data_offset_h_ptr];
+ const T offset_w = data_offset_ptr[data_offset_w_ptr];
+ const T mask = data_mask_ptr[data_mask_hw_ptr];
+ T inv_h = h_in + i * dilation_h + offset_h;
+ T inv_w = w_in + j * dilation_w + offset_w;
+ if (inv_h <= -1 || inv_w <= -1 || inv_h >= height || inv_w >= width)
+ inv_h = inv_w = -2;
+ else
+ mval += data_col_ptr[col_pos] *
+ dmcn_im2col_bilinear(data_im_ptr + cnt * height * width, width,
+ height, width, inv_h, inv_w);
+ const T weight = dmcn_get_coordinate_weight(
+ inv_h, inv_w, height, width, data_im_ptr + cnt * height * width,
+ width, bp_dir);
+ val += weight * data_col_ptr[col_pos] * mask;
+ cnt += 1;
+ }
+ // KERNEL_ASSIGN(grad_offset[index], offset_req, val);
+ grad_offset[index] = val;
+ if (offset_c % 2 == 0)
+ // KERNEL_ASSIGN(grad_mask[(((b * deformable_group +
+ // deformable_group_index) * kernel_h * kernel_w + offset_c / 2) *
+ // height_col + h) * width_col + w], mask_req, mval);
+ grad_mask[(((b * deformable_group + deformable_group_index) * kernel_h *
+ kernel_w +
+ offset_c / 2) *
+ height_col +
+ h) *
+ width_col +
+ w] = mval;
+ }
+}
+
+#endif // MODULATED_DEFORM_CONV_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/ms_deform_attn_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/ms_deform_attn_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..c888fabb3e919ca8b1a7f37aeee0ed7ae1a14369
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/ms_deform_attn_cuda_kernel.cuh
@@ -0,0 +1,807 @@
+/*!
+**************************************************************************************************
+* Deformable DETR
+* Copyright (c) 2020 SenseTime. All Rights Reserved.
+* Licensed under the Apache License, Version 2.0 [see LICENSE for details]
+**************************************************************************************************
+* Modified from
+*https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch/tree/pytorch_1.0.0
+**************************************************************************************************
+*/
+#ifndef DEFORM_ATTN_CUDA_KERNEL
+#define DEFORM_ATTN_CUDA_KERNEL
+
+#include "common_cuda_helper.hpp"
+#include "pytorch_cuda_helper.hpp"
+
+const int CUDA_NUM_THREADS = 1024;
+inline int GET_BLOCKS(const int N, const int num_threads) {
+ return (N + num_threads - 1) / num_threads;
+}
+
+template <typename scalar_t>
+__device__ scalar_t ms_deform_attn_im2col_bilinear(
+ const scalar_t *&bottom_data, const int &height, const int &width,
+ const int &nheads, const int &channels, const scalar_t &h,
+ const scalar_t &w, const int &m, const int &c) {
+ const int h_low = floor(h);
+ const int w_low = floor(w);
+ const int h_high = h_low + 1;
+ const int w_high = w_low + 1;
+
+ const scalar_t lh = h - h_low;
+ const scalar_t lw = w - w_low;
+ const scalar_t hh = 1 - lh, hw = 1 - lw;
+
+ const int w_stride = nheads * channels;
+ const int h_stride = width * w_stride;
+ const int h_low_ptr_offset = h_low * h_stride;
+ const int h_high_ptr_offset = h_low_ptr_offset + h_stride;
+ const int w_low_ptr_offset = w_low * w_stride;
+ const int w_high_ptr_offset = w_low_ptr_offset + w_stride;
+ const int base_ptr = m * channels + c;
+
+ scalar_t v1 = 0;
+ if (h_low >= 0 && w_low >= 0) {
+ const int ptr1 = h_low_ptr_offset + w_low_ptr_offset + base_ptr;
+ v1 = bottom_data[ptr1];
+ }
+ scalar_t v2 = 0;
+ if (h_low >= 0 && w_high <= width - 1) {
+ const int ptr2 = h_low_ptr_offset + w_high_ptr_offset + base_ptr;
+ v2 = bottom_data[ptr2];
+ }
+ scalar_t v3 = 0;
+ if (h_high <= height - 1 && w_low >= 0) {
+ const int ptr3 = h_high_ptr_offset + w_low_ptr_offset + base_ptr;
+ v3 = bottom_data[ptr3];
+ }
+ scalar_t v4 = 0;
+ if (h_high <= height - 1 && w_high <= width - 1) {
+ const int ptr4 = h_high_ptr_offset + w_high_ptr_offset + base_ptr;
+ v4 = bottom_data[ptr4];
+ }
+
+ const scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;
+
+ const scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+ return val;
+}
+
+template <typename scalar_t>
+__device__ void ms_deform_attn_col2im_bilinear(
+ const scalar_t *&bottom_data, const int &height, const int &width,
+ const int &nheads, const int &channels, const scalar_t &h,
+ const scalar_t &w, const int &m, const int &c, const scalar_t &top_grad,
+ const scalar_t &attn_weight, scalar_t *&grad_value,
+ scalar_t *grad_sampling_loc, scalar_t *grad_attn_weight) {
+ const int h_low = floor(h);
+ const int w_low = floor(w);
+ const int h_high = h_low + 1;
+ const int w_high = w_low + 1;
+
+ const scalar_t lh = h - h_low;
+ const scalar_t lw = w - w_low;
+ const scalar_t hh = 1 - lh, hw = 1 - lw;
+
+ const int w_stride = nheads * channels;
+ const int h_stride = width * w_stride;
+ const int h_low_ptr_offset = h_low * h_stride;
+ const int h_high_ptr_offset = h_low_ptr_offset + h_stride;
+ const int w_low_ptr_offset = w_low * w_stride;
+ const int w_high_ptr_offset = w_low_ptr_offset + w_stride;
+ const int base_ptr = m * channels + c;
+
+ const scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;
+ const scalar_t top_grad_value = top_grad * attn_weight;
+ scalar_t grad_h_weight = 0, grad_w_weight = 0;
+
+ scalar_t v1 = 0;
+ if (h_low >= 0 && w_low >= 0) {
+ const int ptr1 = h_low_ptr_offset + w_low_ptr_offset + base_ptr;
+ v1 = bottom_data[ptr1];
+ grad_h_weight -= hw * v1;
+ grad_w_weight -= hh * v1;
+ atomicAdd(grad_value + ptr1, w1 * top_grad_value);
+ }
+ scalar_t v2 = 0;
+ if (h_low >= 0 && w_high <= width - 1) {
+ const int ptr2 = h_low_ptr_offset + w_high_ptr_offset + base_ptr;
+ v2 = bottom_data[ptr2];
+ grad_h_weight -= lw * v2;
+ grad_w_weight += hh * v2;
+ atomicAdd(grad_value + ptr2, w2 * top_grad_value);
+ }
+ scalar_t v3 = 0;
+ if (h_high <= height - 1 && w_low >= 0) {
+ const int ptr3 = h_high_ptr_offset + w_low_ptr_offset + base_ptr;
+ v3 = bottom_data[ptr3];
+ grad_h_weight += hw * v3;
+ grad_w_weight -= lh * v3;
+ atomicAdd(grad_value + ptr3, w3 * top_grad_value);
+ }
+ scalar_t v4 = 0;
+ if (h_high <= height - 1 && w_high <= width - 1) {
+ const int ptr4 = h_high_ptr_offset + w_high_ptr_offset + base_ptr;
+ v4 = bottom_data[ptr4];
+ grad_h_weight += lw * v4;
+ grad_w_weight += lh * v4;
+ atomicAdd(grad_value + ptr4, w4 * top_grad_value);
+ }
+
+ const scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+ *grad_attn_weight = top_grad * val;
+ *grad_sampling_loc = width * grad_w_weight * top_grad_value;
+ *(grad_sampling_loc + 1) = height * grad_h_weight * top_grad_value;
+}
+
+template <typename scalar_t>
+__device__ void ms_deform_attn_col2im_bilinear_gm(
+ const scalar_t *&bottom_data, const int &height, const int &width,
+ const int &nheads, const int &channels, const scalar_t &h,
+ const scalar_t &w, const int &m, const int &c, const scalar_t &top_grad,
+ const scalar_t &attn_weight, scalar_t *&grad_value,
+ scalar_t *grad_sampling_loc, scalar_t *grad_attn_weight) {
+ const int h_low = floor(h);
+ const int w_low = floor(w);
+ const int h_high = h_low + 1;
+ const int w_high = w_low + 1;
+
+ const scalar_t lh = h - h_low;
+ const scalar_t lw = w - w_low;
+ const scalar_t hh = 1 - lh, hw = 1 - lw;
+
+ const int w_stride = nheads * channels;
+ const int h_stride = width * w_stride;
+ const int h_low_ptr_offset = h_low * h_stride;
+ const int h_high_ptr_offset = h_low_ptr_offset + h_stride;
+ const int w_low_ptr_offset = w_low * w_stride;
+ const int w_high_ptr_offset = w_low_ptr_offset + w_stride;
+ const int base_ptr = m * channels + c;
+
+ const scalar_t w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;
+ const scalar_t top_grad_value = top_grad * attn_weight;
+ scalar_t grad_h_weight = 0, grad_w_weight = 0;
+
+ scalar_t v1 = 0;
+ if (h_low >= 0 && w_low >= 0) {
+ const int ptr1 = h_low_ptr_offset + w_low_ptr_offset + base_ptr;
+ v1 = bottom_data[ptr1];
+ grad_h_weight -= hw * v1;
+ grad_w_weight -= hh * v1;
+ atomicAdd(grad_value + ptr1, w1 * top_grad_value);
+ }
+ scalar_t v2 = 0;
+ if (h_low >= 0 && w_high <= width - 1) {
+ const int ptr2 = h_low_ptr_offset + w_high_ptr_offset + base_ptr;
+ v2 = bottom_data[ptr2];
+ grad_h_weight -= lw * v2;
+ grad_w_weight += hh * v2;
+ atomicAdd(grad_value + ptr2, w2 * top_grad_value);
+ }
+ scalar_t v3 = 0;
+ if (h_high <= height - 1 && w_low >= 0) {
+ const int ptr3 = h_high_ptr_offset + w_low_ptr_offset + base_ptr;
+ v3 = bottom_data[ptr3];
+ grad_h_weight += hw * v3;
+ grad_w_weight -= lh * v3;
+ atomicAdd(grad_value + ptr3, w3 * top_grad_value);
+ }
+ scalar_t v4 = 0;
+ if (h_high <= height - 1 && w_high <= width - 1) {
+ const int ptr4 = h_high_ptr_offset + w_high_ptr_offset + base_ptr;
+ v4 = bottom_data[ptr4];
+ grad_h_weight += lw * v4;
+ grad_w_weight += lh * v4;
+ atomicAdd(grad_value + ptr4, w4 * top_grad_value);
+ }
+
+ const scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
+ atomicAdd(grad_attn_weight, top_grad * val);
+ atomicAdd(grad_sampling_loc, width * grad_w_weight * top_grad_value);
+ atomicAdd(grad_sampling_loc + 1, height * grad_h_weight * top_grad_value);
+}
+
+template <typename scalar_t>
+__global__ void ms_deformable_im2col_gpu_kernel(
+ const int n, const scalar_t *data_value, const int64_t *data_spatial_shapes,
+ const int64_t *data_level_start_index, const scalar_t *data_sampling_loc,
+ const scalar_t *data_attn_weight, const int batch_size,
+ const int spatial_size, const int num_heads, const int channels,
+ const int num_levels, const int num_query, const int num_point,
+ scalar_t *data_col) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ scalar_t *data_col_ptr = data_col + index;
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+ scalar_t col = 0;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const scalar_t *data_value_ptr =
+ data_value +
+ (data_value_ptr_init_offset + level_start_id * qid_stride);
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ col += ms_deform_attn_im2col_bilinear(data_value_ptr, spatial_h,
+ spatial_w, num_heads, channels,
+ h_im, w_im, m_col, c_col) *
+ weight;
+ }
+
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ }
+ }
+ *data_col_ptr = col;
+ }
+}
+
+template <typename scalar_t, unsigned int blockSize>
+__global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v1(
+ const int n, const scalar_t *grad_col, const scalar_t *data_value,
+ const int64_t *data_spatial_shapes, const int64_t *data_level_start_index,
+ const scalar_t *data_sampling_loc, const scalar_t *data_attn_weight,
+ const int batch_size, const int spatial_size, const int num_heads,
+ const int channels, const int num_levels, const int num_query,
+ const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
+ scalar_t *grad_attn_weight) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ __shared__ scalar_t cache_grad_sampling_loc[blockSize * 2];
+ __shared__ scalar_t cache_grad_attn_weight[blockSize];
+ unsigned int tid = threadIdx.x;
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ const scalar_t top_grad = grad_col[index];
+
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int grad_sampling_ptr = data_weight_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
+ const int grad_weight_stride = 1;
+ const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const int value_ptr_offset =
+ data_value_ptr_init_offset + level_start_id * qid_stride;
+ const scalar_t *data_value_ptr = data_value + value_ptr_offset;
+ scalar_t *grad_value_ptr = grad_value + value_ptr_offset;
+
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+ *(cache_grad_sampling_loc + (threadIdx.x << 1)) = 0;
+ *(cache_grad_sampling_loc + ((threadIdx.x << 1) + 1)) = 0;
+ *(cache_grad_attn_weight + threadIdx.x) = 0;
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ ms_deform_attn_col2im_bilinear(
+ data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
+ w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
+ cache_grad_sampling_loc + (threadIdx.x << 1),
+ cache_grad_attn_weight + threadIdx.x);
+ }
+
+ __syncthreads();
+ if (tid == 0) {
+ scalar_t _grad_w = cache_grad_sampling_loc[0],
+ _grad_h = cache_grad_sampling_loc[1],
+ _grad_a = cache_grad_attn_weight[0];
+ int sid = 2;
+ for (unsigned int tid = 1; tid < blockSize; ++tid) {
+ _grad_w += cache_grad_sampling_loc[sid];
+ _grad_h += cache_grad_sampling_loc[sid + 1];
+ _grad_a += cache_grad_attn_weight[tid];
+ sid += 2;
+ }
+
+ *grad_sampling_loc = _grad_w;
+ *(grad_sampling_loc + 1) = _grad_h;
+ *grad_attn_weight = _grad_a;
+ }
+ __syncthreads();
+
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
+ }
+ }
+ }
+}
+
+template <typename scalar_t, unsigned int blockSize>
+__global__ void ms_deformable_col2im_gpu_kernel_shm_blocksize_aware_reduce_v2(
+ const int n, const scalar_t *grad_col, const scalar_t *data_value,
+ const int64_t *data_spatial_shapes, const int64_t *data_level_start_index,
+ const scalar_t *data_sampling_loc, const scalar_t *data_attn_weight,
+ const int batch_size, const int spatial_size, const int num_heads,
+ const int channels, const int num_levels, const int num_query,
+ const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
+ scalar_t *grad_attn_weight) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ __shared__ scalar_t cache_grad_sampling_loc[blockSize * 2];
+ __shared__ scalar_t cache_grad_attn_weight[blockSize];
+ unsigned int tid = threadIdx.x;
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ const scalar_t top_grad = grad_col[index];
+
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int grad_sampling_ptr = data_weight_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
+ const int grad_weight_stride = 1;
+ const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const int value_ptr_offset =
+ data_value_ptr_init_offset + level_start_id * qid_stride;
+ const scalar_t *data_value_ptr = data_value + value_ptr_offset;
+ scalar_t *grad_value_ptr = grad_value + value_ptr_offset;
+
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+ *(cache_grad_sampling_loc + (threadIdx.x << 1)) = 0;
+ *(cache_grad_sampling_loc + ((threadIdx.x << 1) + 1)) = 0;
+ *(cache_grad_attn_weight + threadIdx.x) = 0;
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ ms_deform_attn_col2im_bilinear(
+ data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
+ w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
+ cache_grad_sampling_loc + (threadIdx.x << 1),
+ cache_grad_attn_weight + threadIdx.x);
+ }
+
+ __syncthreads();
+
+ for (unsigned int s = blockSize / 2; s > 0; s >>= 1) {
+ if (tid < s) {
+ const unsigned int xid1 = tid << 1;
+ const unsigned int xid2 = (tid + s) << 1;
+ cache_grad_attn_weight[tid] += cache_grad_attn_weight[tid + s];
+ cache_grad_sampling_loc[xid1] += cache_grad_sampling_loc[xid2];
+ cache_grad_sampling_loc[xid1 + 1] +=
+ cache_grad_sampling_loc[xid2 + 1];
+ }
+ __syncthreads();
+ }
+
+ if (tid == 0) {
+ *grad_sampling_loc = cache_grad_sampling_loc[0];
+ *(grad_sampling_loc + 1) = cache_grad_sampling_loc[1];
+ *grad_attn_weight = cache_grad_attn_weight[0];
+ }
+ __syncthreads();
+
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
+ }
+ }
+ }
+}
+
+template <typename scalar_t>
+__global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v1(
+ const int n, const scalar_t *grad_col, const scalar_t *data_value,
+ const int64_t *data_spatial_shapes, const int64_t *data_level_start_index,
+ const scalar_t *data_sampling_loc, const scalar_t *data_attn_weight,
+ const int batch_size, const int spatial_size, const int num_heads,
+ const int channels, const int num_levels, const int num_query,
+ const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
+ scalar_t *grad_attn_weight) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ extern __shared__ int _s[];
+ scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
+ scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
+ unsigned int tid = threadIdx.x;
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ const scalar_t top_grad = grad_col[index];
+
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int grad_sampling_ptr = data_weight_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
+ const int grad_weight_stride = 1;
+ const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const int value_ptr_offset =
+ data_value_ptr_init_offset + level_start_id * qid_stride;
+ const scalar_t *data_value_ptr = data_value + value_ptr_offset;
+ scalar_t *grad_value_ptr = grad_value + value_ptr_offset;
+
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+ *(cache_grad_sampling_loc + (threadIdx.x << 1)) = 0;
+ *(cache_grad_sampling_loc + ((threadIdx.x << 1) + 1)) = 0;
+ *(cache_grad_attn_weight + threadIdx.x) = 0;
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ ms_deform_attn_col2im_bilinear(
+ data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
+ w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
+ cache_grad_sampling_loc + (threadIdx.x << 1),
+ cache_grad_attn_weight + threadIdx.x);
+ }
+
+ __syncthreads();
+ if (tid == 0) {
+ scalar_t _grad_w = cache_grad_sampling_loc[0],
+ _grad_h = cache_grad_sampling_loc[1],
+ _grad_a = cache_grad_attn_weight[0];
+ int sid = 2;
+ for (unsigned int tid = 1; tid < blockDim.x; ++tid) {
+ _grad_w += cache_grad_sampling_loc[sid];
+ _grad_h += cache_grad_sampling_loc[sid + 1];
+ _grad_a += cache_grad_attn_weight[tid];
+ sid += 2;
+ }
+
+ *grad_sampling_loc = _grad_w;
+ *(grad_sampling_loc + 1) = _grad_h;
+ *grad_attn_weight = _grad_a;
+ }
+ __syncthreads();
+
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
+ }
+ }
+ }
+}
+
+template <typename scalar_t>
+__global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2(
+ const int n, const scalar_t *grad_col, const scalar_t *data_value,
+ const int64_t *data_spatial_shapes, const int64_t *data_level_start_index,
+ const scalar_t *data_sampling_loc, const scalar_t *data_attn_weight,
+ const int batch_size, const int spatial_size, const int num_heads,
+ const int channels, const int num_levels, const int num_query,
+ const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
+ scalar_t *grad_attn_weight) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ extern __shared__ int _s[];
+ scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
+ scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
+ unsigned int tid = threadIdx.x;
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ const scalar_t top_grad = grad_col[index];
+
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int grad_sampling_ptr = data_weight_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
+ const int grad_weight_stride = 1;
+ const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const int value_ptr_offset =
+ data_value_ptr_init_offset + level_start_id * qid_stride;
+ const scalar_t *data_value_ptr = data_value + value_ptr_offset;
+ scalar_t *grad_value_ptr = grad_value + value_ptr_offset;
+
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+ *(cache_grad_sampling_loc + (threadIdx.x << 1)) = 0;
+ *(cache_grad_sampling_loc + ((threadIdx.x << 1) + 1)) = 0;
+ *(cache_grad_attn_weight + threadIdx.x) = 0;
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ ms_deform_attn_col2im_bilinear(
+ data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
+ w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
+ cache_grad_sampling_loc + (threadIdx.x << 1),
+ cache_grad_attn_weight + threadIdx.x);
+ }
+
+ __syncthreads();
+
+ for (unsigned int s = blockDim.x / 2, spre = blockDim.x; s > 0;
+ s >>= 1, spre >>= 1) {
+ if (tid < s) {
+ const unsigned int xid1 = tid << 1;
+ const unsigned int xid2 = (tid + s) << 1;
+ cache_grad_attn_weight[tid] += cache_grad_attn_weight[tid + s];
+ cache_grad_sampling_loc[xid1] += cache_grad_sampling_loc[xid2];
+ cache_grad_sampling_loc[xid1 + 1] +=
+ cache_grad_sampling_loc[xid2 + 1];
+ if (tid + (s << 1) < spre) {
+ cache_grad_attn_weight[tid] +=
+ cache_grad_attn_weight[tid + (s << 1)];
+ cache_grad_sampling_loc[xid1] +=
+ cache_grad_sampling_loc[xid2 + (s << 1)];
+ cache_grad_sampling_loc[xid1 + 1] +=
+ cache_grad_sampling_loc[xid2 + 1 + (s << 1)];
+ }
+ }
+ __syncthreads();
+ }
+
+ if (tid == 0) {
+ *grad_sampling_loc = cache_grad_sampling_loc[0];
+ *(grad_sampling_loc + 1) = cache_grad_sampling_loc[1];
+ *grad_attn_weight = cache_grad_attn_weight[0];
+ }
+ __syncthreads();
+
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
+ }
+ }
+ }
+}
+
+template <typename scalar_t>
+__global__ void ms_deformable_col2im_gpu_kernel_shm_reduce_v2_multi_blocks(
+ const int n, const scalar_t *grad_col, const scalar_t *data_value,
+ const int64_t *data_spatial_shapes, const int64_t *data_level_start_index,
+ const scalar_t *data_sampling_loc, const scalar_t *data_attn_weight,
+ const int batch_size, const int spatial_size, const int num_heads,
+ const int channels, const int num_levels, const int num_query,
+ const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
+ scalar_t *grad_attn_weight) {
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ extern __shared__ int _s[];
+ scalar_t *cache_grad_sampling_loc = reinterpret_cast<scalar_t *>(_s);
+ scalar_t *cache_grad_attn_weight = cache_grad_sampling_loc + 2 * blockDim.x;
+ unsigned int tid = threadIdx.x;
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ const scalar_t top_grad = grad_col[index];
+
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int grad_sampling_ptr = data_weight_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
+ const int grad_weight_stride = 1;
+ const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const int value_ptr_offset =
+ data_value_ptr_init_offset + level_start_id * qid_stride;
+ const scalar_t *data_value_ptr = data_value + value_ptr_offset;
+ scalar_t *grad_value_ptr = grad_value + value_ptr_offset;
+
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+ *(cache_grad_sampling_loc + (threadIdx.x << 1)) = 0;
+ *(cache_grad_sampling_loc + ((threadIdx.x << 1) + 1)) = 0;
+ *(cache_grad_attn_weight + threadIdx.x) = 0;
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ ms_deform_attn_col2im_bilinear(
+ data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
+ w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
+ cache_grad_sampling_loc + (threadIdx.x << 1),
+ cache_grad_attn_weight + threadIdx.x);
+ }
+
+ __syncthreads();
+
+ for (unsigned int s = blockDim.x / 2, spre = blockDim.x; s > 0;
+ s >>= 1, spre >>= 1) {
+ if (tid < s) {
+ const unsigned int xid1 = tid << 1;
+ const unsigned int xid2 = (tid + s) << 1;
+ cache_grad_attn_weight[tid] += cache_grad_attn_weight[tid + s];
+ cache_grad_sampling_loc[xid1] += cache_grad_sampling_loc[xid2];
+ cache_grad_sampling_loc[xid1 + 1] +=
+ cache_grad_sampling_loc[xid2 + 1];
+ if (tid + (s << 1) < spre) {
+ cache_grad_attn_weight[tid] +=
+ cache_grad_attn_weight[tid + (s << 1)];
+ cache_grad_sampling_loc[xid1] +=
+ cache_grad_sampling_loc[xid2 + (s << 1)];
+ cache_grad_sampling_loc[xid1 + 1] +=
+ cache_grad_sampling_loc[xid2 + 1 + (s << 1)];
+ }
+ }
+ __syncthreads();
+ }
+
+ if (tid == 0) {
+ atomicAdd(grad_sampling_loc, cache_grad_sampling_loc[0]);
+ atomicAdd(grad_sampling_loc + 1, cache_grad_sampling_loc[1]);
+ atomicAdd(grad_attn_weight, cache_grad_attn_weight[0]);
+ }
+ __syncthreads();
+
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
+ }
+ }
+ }
+}
+
+template <typename scalar_t>
+__global__ void ms_deformable_col2im_gpu_kernel_gm(
+ const int n, const scalar_t *grad_col, const scalar_t *data_value,
+ const int64_t *data_spatial_shapes, const int64_t *data_level_start_index,
+ const scalar_t *data_sampling_loc, const scalar_t *data_attn_weight,
+ const int batch_size, const int spatial_size, const int num_heads,
+ const int channels, const int num_levels, const int num_query,
+ const int num_point, scalar_t *grad_value, scalar_t *grad_sampling_loc,
+ scalar_t *grad_attn_weight) {
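+  // Global-memory variant: gradients are accumulated directly into global
+  // memory (via atomics inside ms_deform_attn_col2im_bilinear_gm) instead of
+  // a shared-memory block reduction.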
+ CUDA_1D_KERNEL_LOOP(index, n) {
+ int _temp = index;
+ const int c_col = _temp % channels;
+ _temp /= channels;
+ const int sampling_index = _temp;
+ const int m_col = _temp % num_heads;
+ _temp /= num_heads;
+ const int q_col = _temp % num_query;
+ _temp /= num_query;
+ const int b_col = _temp;
+
+ const scalar_t top_grad = grad_col[index];
+
+ int data_weight_ptr = sampling_index * num_levels * num_point;
+ int data_loc_w_ptr = data_weight_ptr << 1;
+ const int grad_sampling_ptr = data_weight_ptr;
+ grad_sampling_loc += grad_sampling_ptr << 1;
+ grad_attn_weight += grad_sampling_ptr;
+ const int grad_weight_stride = 1;
+ const int grad_loc_stride = 2;
+ const int qid_stride = num_heads * channels;
+ const int data_value_ptr_init_offset = b_col * spatial_size * qid_stride;
+
+ for (int l_col = 0; l_col < num_levels; ++l_col) {
+ const int level_start_id = data_level_start_index[l_col];
+ const int spatial_h_ptr = l_col << 1;
+ const int spatial_h = data_spatial_shapes[spatial_h_ptr];
+ const int spatial_w = data_spatial_shapes[spatial_h_ptr + 1];
+ const int value_ptr_offset =
+ data_value_ptr_init_offset + level_start_id * qid_stride;
+ const scalar_t *data_value_ptr = data_value + value_ptr_offset;
+ scalar_t *grad_value_ptr = grad_value + value_ptr_offset;
+
+ for (int p_col = 0; p_col < num_point; ++p_col) {
+ const scalar_t loc_w = data_sampling_loc[data_loc_w_ptr];
+ const scalar_t loc_h = data_sampling_loc[data_loc_w_ptr + 1];
+ const scalar_t weight = data_attn_weight[data_weight_ptr];
+
+ const scalar_t h_im = loc_h * spatial_h - 0.5;
+ const scalar_t w_im = loc_w * spatial_w - 0.5;
+ if (h_im > -1 && w_im > -1 && h_im < spatial_h && w_im < spatial_w) {
+ ms_deform_attn_col2im_bilinear_gm(
+ data_value_ptr, spatial_h, spatial_w, num_heads, channels, h_im,
+ w_im, m_col, c_col, top_grad, weight, grad_value_ptr,
+ grad_sampling_loc, grad_attn_weight);
+ }
+ data_weight_ptr += 1;
+ data_loc_w_ptr += 2;
+ grad_attn_weight += grad_weight_stride;
+ grad_sampling_loc += grad_loc_stride;
+ }
+ }
+ }
+}
+#endif // DEFORM_ATTN_CUDA_KERNEL
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/nms_cuda_kernel.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/nms_cuda_kernel.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..363d4947107c9569f15ad96d7628ddde23f70b8b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/nms_cuda_kernel.cuh
@@ -0,0 +1,74 @@
+#ifndef NMS_CUDA_KERNEL_CUH
+#define NMS_CUDA_KERNEL_CUH
+
+#include <vector>
+#ifdef MMCV_WITH_TRT
+#include "common_cuda_helper.hpp"
+#else // MMCV_WITH_TRT
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else // MMCV_USE_PARROTS
+#include "pytorch_cuda_helper.hpp"
+#endif // MMCV_USE_PARROTS
+#endif // MMCV_WITH_TRT
+
+#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))
+int const threadsPerBlock = sizeof(unsigned long long int) * 8;
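+// DIVUP is integer ceil-division; threadsPerBlock equals 64 (the bit width of
+// unsigned long long), so each bit of a mask word covers one box of a block.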
+
+__device__ inline bool devIoU(float const *const a, float const *const b,
+ const int offset, const float threshold) {
+ float left = fmaxf(a[0], b[0]), right = fminf(a[2], b[2]);
+ float top = fmaxf(a[1], b[1]), bottom = fminf(a[3], b[3]);
+ float width = fmaxf(right - left + offset, 0.f),
+ height = fmaxf(bottom - top + offset, 0.f);
+ float interS = width * height;
+ float Sa = (a[2] - a[0] + offset) * (a[3] - a[1] + offset);
+ float Sb = (b[2] - b[0] + offset) * (b[3] - b[1] + offset);
+ return interS > threshold * (Sa + Sb - interS);
+}
+
+__global__ void nms_cuda(const int n_boxes, const float iou_threshold,
+ const int offset, const float *dev_boxes,
+ unsigned long long *dev_mask) {
+ const int row_start = blockIdx.y;
+ const int col_start = blockIdx.x;
+ const int tid = threadIdx.x;
+
+ if (row_start > col_start) return;
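+  // Only overlaps with higher-indexed boxes are recorded, so block pairs in
+  // the lower triangle can be skipped.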
+
+ const int row_size =
+ fminf(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
+ const int col_size =
+ fminf(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
+
+ __shared__ float block_boxes[threadsPerBlock * 4];
+ if (tid < col_size) {
+ block_boxes[tid * 4 + 0] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 0];
+ block_boxes[tid * 4 + 1] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 1];
+ block_boxes[tid * 4 + 2] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 2];
+ block_boxes[tid * 4 + 3] =
+ dev_boxes[(threadsPerBlock * col_start + tid) * 4 + 3];
+ }
+ __syncthreads();
+
+ if (tid < row_size) {
+ const int cur_box_idx = threadsPerBlock * row_start + tid;
+ const float *cur_box = dev_boxes + cur_box_idx * 4;
+ int i = 0;
+ unsigned long long int t = 0;
+ int start = 0;
+ if (row_start == col_start) {
+ start = tid + 1;
+ }
+ for (i = start; i < col_size; i++) {
+ if (devIoU(cur_box, block_boxes + i * 4, offset, iou_threshold)) {
+ t |= 1ULL << i;
+ }
+ }
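+    // One 64-bit word per (box, column-block) pair records which boxes in
+    // that block overlap the current box above the IoU threshold.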
+ dev_mask[cur_box_idx * gridDim.y + col_start] = t;
+ }
+}
+#endif // NMS_CUDA_KERNEL_CUH
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/nms_rotated_cuda.cuh b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/nms_rotated_cuda.cuh
new file mode 100644
index 0000000000000000000000000000000000000000..80bed9681f748390999a2963bd3448570b0dbf6a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/nms_rotated_cuda.cuh
@@ -0,0 +1,135 @@
+// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
+// modified from
+// https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/csrc/nms_rotated/nms_rotated_cuda.cu
+#ifndef NMS_ROTATED_CUDA_CUH
+#define NMS_ROTATED_CUDA_CUH
+
+#ifdef MMCV_USE_PARROTS
+#include "parrots_cuda_helper.hpp"
+#else
+#include "pytorch_cuda_helper.hpp"
+#endif
+#include "box_iou_rotated_utils.hpp"
+
+__host__ __device__ inline int divideUP(const int x, const int y) {
+ return (((x) + (y)-1) / (y));
+}
+
+namespace {
+int const threadsPerBlock = sizeof(unsigned long long) * 8;
+}
+
+template <typename T>
+__global__ void nms_rotated_cuda_kernel(const int n_boxes,
+ const float iou_threshold,
+ const T* dev_boxes,
+ unsigned long long* dev_mask,
+ const int multi_label) {
+ // nms_rotated_cuda_kernel is modified from torchvision's nms_cuda_kernel
+
+ if (multi_label == 1) {
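+    // In multi-label mode each box carries one extra trailing value (its
+    // class label), so boxes are read and cached with a stride of 6 below.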
+ const int row_start = blockIdx.y;
+ const int col_start = blockIdx.x;
+
+ // if (row_start > col_start) return;
+
+ const int row_size =
+ min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
+ const int col_size =
+ min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
+
+ // Compared to nms_cuda_kernel, where each box is represented with 4 values
+ // (x1, y1, x2, y2), each rotated box is represented with 5 values
+ // (x_center, y_center, width, height, angle_degrees) here.
+    __shared__ T block_boxes[threadsPerBlock * 6];
+ if (threadIdx.x < col_size) {
+ block_boxes[threadIdx.x * 6 + 0] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 0];
+ block_boxes[threadIdx.x * 6 + 1] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 1];
+ block_boxes[threadIdx.x * 6 + 2] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 2];
+ block_boxes[threadIdx.x * 6 + 3] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 3];
+ block_boxes[threadIdx.x * 6 + 4] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 4];
+ block_boxes[threadIdx.x * 6 + 5] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 6 + 5];
+ }
+ __syncthreads();
+
+ if (threadIdx.x < row_size) {
+ const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
+ const T* cur_box = dev_boxes + cur_box_idx * 6;
+ int i = 0;
+ unsigned long long t = 0;
+ int start = 0;
+ if (row_start == col_start) {
+ start = threadIdx.x + 1;
+ }
+ for (i = start; i < col_size; i++) {
+ // Instead of devIoU used by original horizontal nms, here
+ // we use the single_box_iou_rotated function from
+ // box_iou_rotated_utils.h
+        if (single_box_iou_rotated<T>(cur_box, block_boxes + i * 6, 0) >
+ iou_threshold) {
+ t |= 1ULL << i;
+ }
+ }
+ const int col_blocks = divideUP(n_boxes, threadsPerBlock);
+ dev_mask[cur_box_idx * col_blocks + col_start] = t;
+ }
+ } else {
+ const int row_start = blockIdx.y;
+ const int col_start = blockIdx.x;
+
+ // if (row_start > col_start) return;
+
+ const int row_size =
+ min(n_boxes - row_start * threadsPerBlock, threadsPerBlock);
+ const int col_size =
+ min(n_boxes - col_start * threadsPerBlock, threadsPerBlock);
+
+ // Compared to nms_cuda_kernel, where each box is represented with 4 values
+ // (x1, y1, x2, y2), each rotated box is represented with 5 values
+ // (x_center, y_center, width, height, angle_degrees) here.
+ __shared__ T block_boxes[threadsPerBlock * 5];
+ if (threadIdx.x < col_size) {
+ block_boxes[threadIdx.x * 5 + 0] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 0];
+ block_boxes[threadIdx.x * 5 + 1] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 1];
+ block_boxes[threadIdx.x * 5 + 2] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 2];
+ block_boxes[threadIdx.x * 5 + 3] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 3];
+ block_boxes[threadIdx.x * 5 + 4] =
+ dev_boxes[(threadsPerBlock * col_start + threadIdx.x) * 5 + 4];
+ }
+ __syncthreads();
+
+ if (threadIdx.x < row_size) {
+ const int cur_box_idx = threadsPerBlock * row_start + threadIdx.x;
+ const T* cur_box = dev_boxes + cur_box_idx * 5;
+ int i = 0;
+ unsigned long long t = 0;
+ int start = 0;
+ if (row_start == col_start) {
+ start = threadIdx.x + 1;
+ }
+ for (i = start; i < col_size; i++) {
+ // Instead of devIoU used by original horizontal nms, here
+ // we use the single_box_iou_rotated function from
+ // box_iou_rotated_utils.h
+        if (single_box_iou_rotated<T>(cur_box, block_boxes + i * 5, 0) >
+ iou_threshold) {
+ t |= 1ULL << i;
+ }
+ }
+ const int col_blocks = divideUP(n_boxes, threadsPerBlock);
+ dev_mask[cur_box_idx * col_blocks + col_start] = t;
+ }
+ }
+}
+
+#endif
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..23bf7d43474838318d4c819dea5d22b9847ad253
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps.cpp
@@ -0,0 +1,29 @@
+#include "pytorch_cpp_helper.hpp"
+
+#ifdef MMCV_WITH_CUDA
+void BBoxOverlapsCUDAKernelLauncher(const Tensor bboxes1, const Tensor bboxes2,
+ Tensor ious, const int mode,
+ const bool aligned, const int offset);
+
+void bbox_overlaps_cuda(const Tensor bboxes1, const Tensor bboxes2, Tensor ious,
+ const int mode, const bool aligned, const int offset) {
+ BBoxOverlapsCUDAKernelLauncher(bboxes1, bboxes2, ious, mode, aligned, offset);
+}
+#endif
+
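+// Dispatcher: mode selects IoU (0) or IoF (1), aligned restricts the
+// computation to matched pairs bboxes1[i] vs bboxes2[i], and offset (0 or 1)
+// is added when measuring box widths and heights.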
+void bbox_overlaps(const Tensor bboxes1, const Tensor bboxes2, Tensor ious,
+ const int mode, const bool aligned, const int offset) {
+ if (bboxes1.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(bboxes1);
+ CHECK_CUDA_INPUT(bboxes2);
+ CHECK_CUDA_INPUT(ious);
+
+ bbox_overlaps_cuda(bboxes1, bboxes2, ious, mode, aligned, offset);
+#else
+ AT_ERROR("bbox_overlaps is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("bbox_overlaps is not implemented on CPU");
+ }
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps_cuda.cu b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps_cuda.cu
new file mode 100644
index 0000000000000000000000000000000000000000..d6e26c24d1f8e8d8da47b42f176a598c84ee6a89
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps_cuda.cu
@@ -0,0 +1,22 @@
+#include "bbox_overlaps_cuda_kernel.cuh"
+#include "pytorch_cuda_helper.hpp"
+
+void BBoxOverlapsCUDAKernelLauncher(const Tensor bboxes1, const Tensor bboxes2,
+ Tensor ious, const int mode,
+ const bool aligned, const int offset) {
+ int output_size = ious.numel();
+ int num_bbox1 = bboxes1.size(0);
+ int num_bbox2 = bboxes2.size(0);
+
+ at::cuda::CUDAGuard device_guard(bboxes1.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ bboxes1.scalar_type(), "bbox_overlaps_cuda_kernel", ([&] {
+        bbox_overlaps_cuda_kernel<scalar_t>
+            <<<GET_BLOCKS(output_size), THREADS_PER_BLOCK, 0, stream>>>(
+                bboxes1.data_ptr<scalar_t>(), bboxes2.data_ptr<scalar_t>(),
+                ious.data_ptr<scalar_t>(), num_bbox1, num_bbox2, mode, aligned,
+                offset);
+ }));
+ AT_CUDA_CHECK(cudaGetLastError());
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps_parrots.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps_parrots.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..35bb5f5c87803297e803d235d04e4cb08eb21669
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/bbox_overlaps_parrots.cpp
@@ -0,0 +1,39 @@
+#include <parrots/compute/aten.hpp>
+#include <parrots/extension.hpp>
+#include <parrots/foundation/ssattrs.hpp>
+
+#include "bbox_overlaps_pytorch.h"
+
+using namespace parrots;
+
+#ifdef MMCV_WITH_CUDA
+/*
+ * void bbox_overlaps_cuda(const Tensor bboxes1, const Tensor bboxes2, Tensor
+ * ious, const int mode, const bool aligned, const int offset);
+ */
+void bbox_overlaps_parrots(CudaContext& ctx, const SSElement& attr,
+ const OperatorBase::in_list_t& ins,
+ OperatorBase::out_list_t& outs) {
+ int mode, offset;
+ bool aligned;
+ SSAttrs(attr)
+ .get("mode", mode)
+ .get("aligned", aligned)
+ .get("offset", offset)
+ .done();
+
+ const auto& bboxes1 = buildATensor(ctx, ins[0]);
+ const auto& bboxes2 = buildATensor(ctx, ins[1]);
+ auto ious = buildATensor(ctx, outs[0]);
+ bbox_overlaps_cuda(bboxes1, bboxes2, ious, mode, aligned, offset);
+}
+
+PARROTS_EXTENSION_REGISTER(bbox_overlaps)
+ .attr("mode")
+ .attr("aligned")
+ .attr("offset")
+ .input(2)
+ .output(1)
+ .apply(bbox_overlaps_parrots)
+ .done();
+#endif
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..78351e2a5fe5c57f9548bb4d4c01dd7569ae1e4a
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align.cpp
@@ -0,0 +1,67 @@
+#include "pytorch_cpp_helper.hpp"
+
+#ifdef MMCV_WITH_CUDA
+void BorderAlignForwardCUDAKernelLauncher(const Tensor &input,
+ const Tensor &boxes, Tensor output,
+ Tensor argmax_idx,
+ const int pool_size);
+
+void BorderAlignBackwardCUDAKernelLauncher(const Tensor &grad_output,
+ const Tensor &boxes,
+ const Tensor &argmax_idx,
+ Tensor grad_input,
+ const int pool_size);
+
+void border_align_forward_cuda(const Tensor &input, const Tensor &boxes,
+ Tensor output, Tensor argmax_idx,
+ const int pool_size) {
+ BorderAlignForwardCUDAKernelLauncher(input, boxes, output, argmax_idx,
+ pool_size);
+}
+
+void border_align_backward_cuda(const Tensor &grad_output, const Tensor &boxes,
+ const Tensor &argmax_idx, Tensor grad_input,
+ const int pool_size) {
+ BorderAlignBackwardCUDAKernelLauncher(grad_output, boxes, argmax_idx,
+ grad_input, pool_size);
+}
+#endif
+
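+// BorderAlign pools the input feature along the four borders (top, left,
+// bottom, right) of each box; the channel dimension is therefore split into
+// four groups, which is why the CUDA launcher uses feat_channels / 4.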
+void border_align_forward(const Tensor &input, const Tensor &boxes,
+ Tensor output, Tensor argmax_idx,
+ const int pool_size) {
+ if (input.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(input);
+ CHECK_CUDA_INPUT(boxes);
+ CHECK_CUDA_INPUT(output);
+ CHECK_CUDA_INPUT(argmax_idx);
+
+ border_align_forward_cuda(input, boxes, output, argmax_idx, pool_size);
+#else
+ AT_ERROR("BorderAlign is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("BorderAlign is not implemented on CPU");
+ }
+}
+
+void border_align_backward(const Tensor &grad_output, const Tensor &boxes,
+ const Tensor &argmax_idx, Tensor grad_input,
+ const int pool_size) {
+ if (grad_output.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(grad_output);
+ CHECK_CUDA_INPUT(boxes);
+ CHECK_CUDA_INPUT(argmax_idx);
+ CHECK_CUDA_INPUT(grad_input);
+
+ border_align_backward_cuda(grad_output, boxes, argmax_idx, grad_input,
+ pool_size);
+#else
+ AT_ERROR("BorderAlign is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("BorderAlign is not implemented on CPU");
+ }
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align_cuda.cu b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align_cuda.cu
new file mode 100644
index 0000000000000000000000000000000000000000..06ba452f65c15945385aa2127bb4a2f94b9bcf8c
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align_cuda.cu
@@ -0,0 +1,67 @@
+#include "border_align_cuda_kernel.cuh"
+#include "pytorch_cuda_helper.hpp"
+
+void BorderAlignForwardCUDAKernelLauncher(const Tensor &input,
+ const Tensor &boxes, Tensor output,
+ Tensor argmax_idx,
+ const int pool_size) {
+ // shape assertion
+ AT_ASSERTM(input.ndimension() == 4,
+ "non-empty 4D(batch mode) tensor expected for input feature");
+ AT_ASSERTM(boxes.ndimension() == 3,
+ "boxes must be 3D tensor with size of [B, H*W, 4]");
+
+ int batch_size = input.size(0);
+ int feat_channels = input.size(1);
+ int channels = feat_channels / 4;
+ int height = input.size(2);
+ int width = input.size(3);
+ // shape [N, box_size, 4] for boxes. (x1, y1, x2, y2) format
+ int box_size = boxes.size(1);
+ // shape [N, channels, box_size, 4] for output
+ int nthreads = batch_size * channels * box_size;
+
+ at::cuda::CUDAGuard device_guard(input.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ dim3 block(128, 4);
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ input.scalar_type(), "border_align_forward_cuda_kernel", [&] {
+        border_align_forward_cuda_kernel<scalar_t>
+            <<<GET_BLOCKS(nthreads), block, 0, stream>>>(
+                nthreads, input.data_ptr<scalar_t>(),
+                boxes.data_ptr<scalar_t>(), output.data_ptr<scalar_t>(),
+                argmax_idx.data_ptr<int>(), channels, box_size, height, width,
+                pool_size);
+ });
+
+ AT_CUDA_CHECK(cudaGetLastError());
+}
+
+void BorderAlignBackwardCUDAKernelLauncher(const Tensor &grad_output,
+ const Tensor &boxes,
+ const Tensor &argmax_idx,
+ Tensor grad_input,
+ const int pool_size) {
+ int batch_size = grad_input.size(0);
+ int feat_channels = grad_input.size(1);
+ int channels = feat_channels / 4;
+ int height = grad_input.size(2);
+ int width = grad_input.size(3);
+ int box_size = boxes.size(1);
+ int nthreads = batch_size * channels * box_size;
+
+ at::cuda::CUDAGuard device_guard(grad_output.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ dim3 block(128, 4);
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ grad_output.scalar_type(), "border_align_backward_cuda_kernel", [&] {
+        border_align_backward_cuda_kernel<scalar_t>
+            <<<GET_BLOCKS(nthreads), block, 0, stream>>>(
+                nthreads, grad_output.data_ptr<scalar_t>(),
+                boxes.data_ptr<scalar_t>(), argmax_idx.data_ptr<int>(),
+                grad_input.data_ptr<scalar_t>(), channels, box_size, height,
+                width, pool_size);
+ });
+
+ AT_CUDA_CHECK(cudaGetLastError());
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align_parrots.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align_parrots.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..a4564b09e1a6bddaba2e1b88513cf93d9cf36437
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/border_align_parrots.cpp
@@ -0,0 +1,50 @@
+#include <parrots/compute/aten.hpp>
+#include <parrots/extension.hpp>
+#include <parrots/foundation/ssattrs.hpp>
+
+#include "border_align_pytorch.h"
+
+using namespace parrots;
+
+void border_align_forward_cuda_parrots(CudaContext& ctx, const SSElement& attr,
+ const OperatorBase::in_list_t& ins,
+ OperatorBase::out_list_t& outs) {
+ int pool_size;
+ SSAttrs(attr).get("pool_size", pool_size).done();
+
+ const auto& input = buildATensor(ctx, ins[0]);
+ const auto& boxes = buildATensor(ctx, ins[1]);
+
+ auto output = buildATensor(ctx, outs[0]);
+ auto argmax_idx = buildATensor(ctx, outs[1]);
+ border_align_forward_cuda(input, boxes, output, argmax_idx, pool_size);
+}
+
+void border_align_backward_cuda_parrots(CudaContext& ctx, const SSElement& attr,
+ const OperatorBase::in_list_t& ins,
+ OperatorBase::out_list_t& outs) {
+ int pool_size;
+ SSAttrs(attr).get("pool_size", pool_size).done();
+
+ const auto& top_grad = buildATensor(ctx, ins[0]);
+ const auto& boxes = buildATensor(ctx, ins[1]);
+ const auto& argmax_idx = buildATensor(ctx, ins[2]);
+
+ auto bottom_grad = buildATensor(ctx, outs[0]);
+ border_align_backward_cuda(top_grad, boxes, argmax_idx, bottom_grad,
+ pool_size);
+}
+
+PARROTS_EXTENSION_REGISTER(border_align_forward)
+ .attr("pool_size")
+ .input(2)
+ .output(2)
+ .apply(border_align_forward_cuda_parrots)
+ .done();
+
+PARROTS_EXTENSION_REGISTER(border_align_backward)
+ .attr("pool_size")
+ .input(3)
+ .output(1)
+ .apply(border_align_backward_cuda_parrots)
+ .done();
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..01fc02f550d9e77cdb279e96af3f033a861eb6ba
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated.cpp
@@ -0,0 +1,29 @@
+// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
+// modified from
+// https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated.h
+#include "pytorch_cpp_helper.hpp"
+
+void box_iou_rotated_cpu(const Tensor boxes1, const Tensor boxes2, Tensor ious,
+ const int mode_flag, const bool aligned);
+
+#ifdef MMCV_WITH_CUDA
+void box_iou_rotated_cuda(const Tensor boxes1, const Tensor boxes2, Tensor ious,
+ const int mode_flag, const bool aligned);
+#endif
+
+// Interface for Python
+// inline is needed to prevent multiple function definitions when this header is
+// included by different cpps
+void box_iou_rotated(const Tensor boxes1, const Tensor boxes2, Tensor ious,
+ const int mode_flag, const bool aligned) {
+ assert(boxes1.device().is_cuda() == boxes2.device().is_cuda());
+ if (boxes1.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ box_iou_rotated_cuda(boxes1, boxes2, ious, mode_flag, aligned);
+#else
+ AT_ERROR("Not compiled with GPU support");
+#endif
+ } else {
+ box_iou_rotated_cpu(boxes1, boxes2, ious, mode_flag, aligned);
+ }
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_cpu.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_cpu.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..2b434885a82ed76cf326520df908d303a25bb060
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_cpu.cpp
@@ -0,0 +1,33 @@
+// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
+// modified from
+// https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cpu.cpp
+#include "box_iou_rotated_utils.hpp"
+#include "pytorch_cpp_helper.hpp"
+
+template <typename T>
+void box_iou_rotated_cpu_kernel(const Tensor boxes1, const Tensor boxes2,
+ Tensor ious, const int mode_flag,
+ const bool aligned) {
+ int output_size = ious.numel();
+ auto num_boxes1 = boxes1.size(0);
+ auto num_boxes2 = boxes2.size(0);
+
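+  // aligned: IoU of matched pairs, one value per box; otherwise the full
+  // num_boxes1 x num_boxes2 matrix is filled row by row.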
+ if (aligned) {
+ for (int i = 0; i < output_size; i++) {
+      ious[i] = single_box_iou_rotated<T>(boxes1[i].data_ptr<T>(),
+                                          boxes2[i].data_ptr<T>(), mode_flag);
+ }
+ } else {
+ for (int i = 0; i < num_boxes1; i++) {
+ for (int j = 0; j < num_boxes2; j++) {
+        ious[i * num_boxes2 + j] = single_box_iou_rotated<T>(
+            boxes1[i].data_ptr<T>(), boxes2[j].data_ptr<T>(), mode_flag);
+ }
+ }
+ }
+}
+
+void box_iou_rotated_cpu(const Tensor boxes1, const Tensor boxes2, Tensor ious,
+ const int mode_flag, const bool aligned) {
+  box_iou_rotated_cpu_kernel<float>(boxes1, boxes2, ious, mode_flag, aligned);
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_cuda.cu b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_cuda.cu
new file mode 100644
index 0000000000000000000000000000000000000000..d399b5ce7f158d27f5becc62a912e2104feac27b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_cuda.cu
@@ -0,0 +1,25 @@
+// Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
+// modified from
+// https://github.com/facebookresearch/detectron2/blob/master/detectron2/layers/csrc/box_iou_rotated/box_iou_rotated_cuda.cu
+#include "box_iou_rotated_cuda.cuh"
+#include "pytorch_cuda_helper.hpp"
+
+void box_iou_rotated_cuda(const Tensor boxes1, const Tensor boxes2, Tensor ious,
+ const int mode_flag, const bool aligned) {
+ using scalar_t = float;
+ AT_ASSERTM(boxes1.type().is_cuda(), "boxes1 must be a CUDA tensor");
+ AT_ASSERTM(boxes2.type().is_cuda(), "boxes2 must be a CUDA tensor");
+
+ int output_size = ious.numel();
+ int num_boxes1 = boxes1.size(0);
+ int num_boxes2 = boxes2.size(0);
+
+ at::cuda::CUDAGuard device_guard(boxes1.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+  box_iou_rotated_cuda_kernel<scalar_t>
+      <<<GET_BLOCKS(output_size), THREADS_PER_BLOCK, 0, stream>>>(
+          num_boxes1, num_boxes2, boxes1.data_ptr<scalar_t>(),
+          boxes2.data_ptr<scalar_t>(), (scalar_t*)ious.data_ptr<scalar_t>(),
+          mode_flag, aligned);
+ AT_CUDA_CHECK(cudaGetLastError());
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_parrots.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_parrots.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..27114fea942c504daf68c171e8623229608454ff
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/box_iou_rotated_parrots.cpp
@@ -0,0 +1,60 @@
+#include <parrots/compute/aten.hpp>
+#include <parrots/extension.hpp>
+#include <parrots/foundation/ssattrs.hpp>
+
+#include "box_iou_rotated_pytorch.h"
+
+using namespace parrots;
+
+/*
+ * void box_iou_rotated_cpu(const Tensor boxes1, const Tensor boxes2, Tensor
+ * ious, const int mode_flag, const bool aligned);
+ */
+void box_iou_rotated_cpu_parrots(HostContext& ctx, const SSElement& attr,
+ const OperatorBase::in_list_t& ins,
+ OperatorBase::out_list_t& outs) {
+ bool aligned;
+ int mode_flag;
+ SSAttrs(attr)
+ .get("aligned", aligned)
+ .get("mode_flag", mode_flag)
+ .done();
+
+ const auto& boxes1 = buildATensor(ctx, ins[0]);
+ const auto& boxes2 = buildATensor(ctx, ins[1]);
+ auto ious = buildATensor(ctx, outs[0]);
+ box_iou_rotated_cpu(boxes1, boxes2, ious, mode_flag, aligned);
+}
+
+#ifdef MMCV_WITH_CUDA
+/*
+ * void box_iou_rotated_cuda(const Tensor boxes1, const Tensor boxes2, Tensor
+ * ious, const int mode_flag, const bool aligned);
+ */
+void box_iou_rotated_cuda_parrots(CudaContext& ctx, const SSElement& attr,
+ const OperatorBase::in_list_t& ins,
+ OperatorBase::out_list_t& outs) {
+ bool aligned;
+ int mode_flag;
+ SSAttrs(attr)
+ .get("aligned", aligned)
+ .get("mode_flag", mode_flag)
+ .done();
+
+ const auto& boxes1 = buildATensor(ctx, ins[0]);
+ const auto& boxes2 = buildATensor(ctx, ins[1]);
+ auto ious = buildATensor(ctx, outs[0]);
+ box_iou_rotated_cuda(boxes1, boxes2, ious, mode_flag, aligned);
+}
+#endif
+
+PARROTS_EXTENSION_REGISTER(box_iou_rotated)
+ .attr("aligned")
+ .attr("mode_flag")
+ .input(2)
+ .output(1)
+ .apply(box_iou_rotated_cpu_parrots)
+#ifdef MMCV_WITH_CUDA
+ .apply(box_iou_rotated_cuda_parrots)
+#endif
+ .done();
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..67619284fade9b752ddb831f58da71a1224fdc26
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe.cpp
@@ -0,0 +1,83 @@
+#include "pytorch_cpp_helper.hpp"
+
+#ifdef MMCV_WITH_CUDA
+void CARAFEForwardCUDAKernelLauncher(const Tensor features, const Tensor masks,
+ Tensor rfeatures, Tensor routput,
+ Tensor rmasks, Tensor output,
+ const int kernel_size,
+ const int group_size,
+ const int scale_factor);
+
+void CARAFEBackwardCUDAKernelLauncher(
+ const Tensor top_grad, const Tensor rfeatures, const Tensor masks,
+ Tensor rtop_grad, Tensor rbottom_grad_hs, Tensor rbottom_grad,
+ Tensor rmask_grad, Tensor bottom_grad, Tensor mask_grad,
+ const int kernel_size, const int group_size, const int scale_factor);
+
+void carafe_forward_cuda(Tensor features, Tensor masks, Tensor rfeatures,
+ Tensor routput, Tensor rmasks, Tensor output,
+ int kernel_size, int group_size, int scale_factor) {
+ CARAFEForwardCUDAKernelLauncher(features, masks, rfeatures, routput, rmasks,
+ output, kernel_size, group_size,
+ scale_factor);
+}
+
+void carafe_backward_cuda(Tensor top_grad, Tensor rfeatures, Tensor masks,
+ Tensor rtop_grad, Tensor rbottom_grad_hs,
+ Tensor rbottom_grad, Tensor rmask_grad,
+ Tensor bottom_grad, Tensor mask_grad, int kernel_size,
+ int group_size, int scale_factor) {
+ CARAFEBackwardCUDAKernelLauncher(top_grad, rfeatures, masks, rtop_grad,
+ rbottom_grad_hs, rbottom_grad, rmask_grad,
+ bottom_grad, mask_grad, kernel_size,
+ group_size, scale_factor);
+}
+#endif
+
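+// rfeatures / routput / rmasks are caller-provided scratch tensors; the CUDA
+// launcher resizes them to NHWC layouts before running the CARAFE kernels.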
+void carafe_forward(Tensor features, Tensor masks, Tensor rfeatures,
+ Tensor routput, Tensor rmasks, Tensor output,
+ int kernel_size, int group_size, int scale_factor) {
+ if (features.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(features);
+ CHECK_CUDA_INPUT(masks);
+ CHECK_CUDA_INPUT(rfeatures);
+ CHECK_CUDA_INPUT(routput);
+ CHECK_CUDA_INPUT(rmasks);
+ CHECK_CUDA_INPUT(output);
+ carafe_forward_cuda(features, masks, rfeatures, routput, rmasks, output,
+ kernel_size, group_size, scale_factor);
+#else
+ AT_ERROR("Carafe is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("Carafe is not implemented on CPU");
+ }
+}
+
+void carafe_backward(Tensor top_grad, Tensor rfeatures, Tensor masks,
+ Tensor rtop_grad, Tensor rbottom_grad_hs,
+ Tensor rbottom_grad, Tensor rmask_grad, Tensor bottom_grad,
+ Tensor mask_grad, int kernel_size, int group_size,
+ int scale_factor) {
+ if (top_grad.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(top_grad);
+ CHECK_CUDA_INPUT(rfeatures);
+ CHECK_CUDA_INPUT(masks);
+ CHECK_CUDA_INPUT(rtop_grad);
+ CHECK_CUDA_INPUT(rbottom_grad_hs);
+ CHECK_CUDA_INPUT(rbottom_grad);
+ CHECK_CUDA_INPUT(rmask_grad);
+ CHECK_CUDA_INPUT(bottom_grad);
+ CHECK_CUDA_INPUT(mask_grad);
+ carafe_backward_cuda(top_grad, rfeatures, masks, rtop_grad, rbottom_grad_hs,
+ rbottom_grad, rmask_grad, bottom_grad, mask_grad,
+ kernel_size, group_size, scale_factor);
+#else
+ AT_ERROR("Carafe is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("Carafe is not implemented on CPU");
+ }
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_cuda.cu b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_cuda.cu
new file mode 100644
index 0000000000000000000000000000000000000000..2f9ac053024f59dc7e26c21ab9b0845a813f3cbf
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_cuda.cu
@@ -0,0 +1,179 @@
+#include "carafe_cuda_kernel.cuh"
+#include "pytorch_cuda_helper.hpp"
+
+void CARAFEForwardCUDAKernelLauncher(const Tensor features, const Tensor masks,
+ Tensor rfeatures, Tensor routput,
+ Tensor rmasks, Tensor output,
+ const int kernel_size,
+ const int group_size,
+ const int scale_factor) {
+ const int batch_size = output.size(0);
+ const int channels = output.size(1);
+ const int output_height = output.size(2);
+ const int output_width = output.size(3);
+
+ const int input_height = features.size(2);
+ const int input_width = features.size(3);
+
+ const int mask_channels = masks.size(1);
+
+ rfeatures.resize_({batch_size, input_height, input_width, channels});
+ routput.resize_({batch_size, output_height, output_width, channels});
+ rmasks.resize_({batch_size, output_height, output_width, mask_channels});
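+  // Scratch buffers are laid out NHWC so the per-pixel CARAFE kernels read
+  // contiguous channel data after the batched transposes below.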
+
+ // one warp per pixel
+ at::cuda::CUDAGuard device_guard(features.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ features.scalar_type(), "NCHW2NHWC_Feature", ([&] {
+        const scalar_t *bottom_data = features.data_ptr<scalar_t>();
+        scalar_t *top_data = rfeatures.data_ptr<scalar_t>();
+        const int dh = divideUP(channels, kTileDim);
+        const int dw = divideUP(input_height * input_width, kTileDim);
+        BatchTranspose2DCUDAKernel<scalar_t>
+            <<<batch_size * dh * dw, dim3(kTileDim, kBlockRows), 0, stream>>>(
+ batch_size, channels, input_height * input_width, dh, dw,
+ bottom_data, top_data);
+ }));
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ features.scalar_type(), "NCHW2NHWC_Masks", ([&] {
+        const scalar_t *bottom_data = masks.data_ptr<scalar_t>();
+        scalar_t *top_data = rmasks.data_ptr<scalar_t>();
+        const int dh = divideUP(mask_channels, kTileDim);
+        const int dw = divideUP(output_height * output_width, kTileDim);
+        BatchTranspose2DCUDAKernel<scalar_t>
+            <<<batch_size * dh * dw, dim3(kTileDim, kBlockRows), 0, stream>>>(
+ batch_size, mask_channels, output_height * output_width, dh, dw,
+ bottom_data, top_data);
+ }));
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ features.scalar_type(), "CARAFELaucherForward", ([&] {
+ const int num_kernels =
+ batch_size * output_height * output_width * THREADS_PER_PIXEL;
+        const scalar_t *bottom_data = rfeatures.data_ptr<scalar_t>();
+        const scalar_t *bottom_masks = rmasks.data_ptr<scalar_t>();
+        scalar_t *top_data = routput.data_ptr<scalar_t>();
+
+        CARAFEForward<scalar_t><<<divideUP(num_kernels, THREADS_PER_BLOCK),
+                                  THREADS_PER_BLOCK, 0, stream>>>(
+ num_kernels, bottom_data, bottom_masks, kernel_size, group_size,
+ scale_factor, channels, input_height, input_width, output_height,
+ output_width, mask_channels, top_data);
+ }));
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ features.scalar_type(), "NHWC2NCHW", ([&] {
+        const scalar_t *bottom_data = routput.data_ptr<scalar_t>();
+        scalar_t *top_data = output.data_ptr<scalar_t>();
+        const int dh = divideUP(output_height * output_width, kTileDim);
+        const int dw = divideUP(channels, kTileDim);
+        BatchTranspose2DCUDAKernel<scalar_t>
+            <<<batch_size * dh * dw, dim3(kTileDim, kBlockRows), 0, stream>>>(
+ batch_size, output_height * output_width, channels, dh, dw,
+ bottom_data, top_data);
+ }));
+
+ AT_CUDA_CHECK(cudaGetLastError());
+}
+
+void CARAFEBackwardCUDAKernelLauncher(
+ const Tensor top_grad, const Tensor rfeatures, const Tensor masks,
+ Tensor rtop_grad, Tensor rbottom_grad_hs, Tensor rbottom_grad,
+ Tensor rmask_grad, Tensor bottom_grad, Tensor mask_grad,
+ const int kernel_size, const int group_size, const int scale_factor) {
+ const int batch_size = top_grad.size(0);
+ const int channels = top_grad.size(1);
+ const int output_height = top_grad.size(2);
+ const int output_width = top_grad.size(3);
+
+ const int input_height = bottom_grad.size(2);
+ const int input_width = bottom_grad.size(3);
+
+ const int mask_channels = masks.size(1);
+
+ rtop_grad.resize_({batch_size, output_height, output_width, channels});
+ rbottom_grad.resize_({batch_size, input_height, input_width, channels});
+ rbottom_grad_hs.resize_({batch_size, output_height, output_width, channels});
+ rmask_grad.resize_({batch_size, output_height, output_width, mask_channels});
+
+ at::cuda::CUDAGuard device_guard(top_grad.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "NCHW2NHWC_Top_Grad", ([&] {
+        const scalar_t *bottom_data = top_grad.data_ptr<scalar_t>();
+        scalar_t *top_data = rtop_grad.data_ptr<scalar_t>();
+        const int dh = divideUP(channels, kTileDim);
+        const int dw = divideUP(output_height * output_width, kTileDim);
+        BatchTranspose2DCUDAKernel<scalar_t>
+            <<<batch_size * dh * dw, dim3(kTileDim, kBlockRows), 0, stream>>>(
+ batch_size, channels, output_height * output_width, dh, dw,
+ bottom_data, top_data);
+ }));
+
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "CARAFELaucherBackward_Feature", ([&] {
+ const int num_kernels =
+ batch_size * output_height * output_width * THREADS_PER_PIXEL;
+        const scalar_t *top_diff = rtop_grad.data_ptr<scalar_t>();
+        const scalar_t *bottom_masks = masks.data_ptr<scalar_t>();
+        scalar_t *bottom_diff = rbottom_grad_hs.data_ptr<scalar_t>();
+
+        CARAFEBackward_Feature<scalar_t>
+            <<<divideUP(num_kernels, THREADS_PER_BLOCK), THREADS_PER_BLOCK, 0,
+               stream>>>(num_kernels, top_diff, bottom_masks, kernel_size,
+ group_size, scale_factor, channels, input_height,
+ input_width, output_height, output_width,
+ mask_channels, bottom_diff);
+ }));
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "FeatureSum", ([&] {
+ const int num_kernels =
+ batch_size * input_height * input_width * THREADS_PER_PIXEL;
+        const scalar_t *bottom_diff_hs = rbottom_grad_hs.data_ptr<scalar_t>();
+        scalar_t *bottom_diff = rbottom_grad.data_ptr<scalar_t>();
+
+        FeatureSum<scalar_t>
+            <<<divideUP(num_kernels, THREADS_PER_BLOCK), THREADS_PER_BLOCK, 0,
+               stream>>>(num_kernels, bottom_diff_hs, scale_factor, channels,
+ input_height, input_width, bottom_diff);
+ }));
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "NHWC2NCHW_Bottom_Grad", ([&] {
+        const scalar_t *bottom_data = rbottom_grad.data_ptr<scalar_t>();
+        scalar_t *top_data = bottom_grad.data_ptr<scalar_t>();
+        const int dh = divideUP(input_height * input_width, kTileDim);
+        const int dw = divideUP(channels, kTileDim);
+        BatchTranspose2DCUDAKernel<scalar_t>
+            <<<batch_size * dh * dw, dim3(kTileDim, kBlockRows), 0, stream>>>(
+ batch_size, input_height * input_width, channels, dh, dw,
+ bottom_data, top_data);
+ }));
+
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "CARAFELaucherBackward_Mask", ([&] {
+ const int num_kernels = batch_size * output_height * output_width *
+ mask_channels * WARP_SIZE;
+        const scalar_t *top_diff = rtop_grad.data_ptr<scalar_t>();
+        const scalar_t *bottom_data = rfeatures.data_ptr<scalar_t>();
+        scalar_t *mask_diff = rmask_grad.data_ptr<scalar_t>();
+
+        CARAFEBackward_Mask<scalar_t>
+            <<<divideUP(num_kernels, THREADS_PER_BLOCK), THREADS_PER_BLOCK, 0,
+               stream>>>(num_kernels, top_diff, bottom_data, kernel_size,
+ group_size, scale_factor, channels, input_height,
+ input_width, output_height, output_width,
+ mask_channels, mask_diff);
+ }));
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "NHWC2NCHW_Mask_Grad", ([&] {
+        const scalar_t *bottom_data = rmask_grad.data_ptr<scalar_t>();
+        scalar_t *top_data = mask_grad.data_ptr<scalar_t>();
+        const int dh = divideUP(output_height * output_width, kTileDim);
+        const int dw = divideUP(mask_channels, kTileDim);
+        BatchTranspose2DCUDAKernel<scalar_t>
+            <<<batch_size * dh * dw, dim3(kTileDim, kBlockRows), 0, stream>>>(
+ batch_size, output_height * output_width, mask_channels, dh, dw,
+ bottom_data, top_data);
+ }));
+
+ AT_CUDA_CHECK(cudaGetLastError());
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..bb0aa0978b4a8331db0e167bd29e1653717253df
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive.cpp
@@ -0,0 +1,68 @@
+#include "pytorch_cpp_helper.hpp"
+
+#ifdef MMCV_WITH_CUDA
+void CARAFENAIVEForwardCUDAKernelLauncher(const Tensor features,
+ const Tensor masks, Tensor output,
+ const int kernel_size,
+ const int group_size,
+ const int scale_factor);
+
+void CARAFENAIVEBackwardCUDAKernelLauncher(
+ const Tensor top_grad, const Tensor features, const Tensor masks,
+ Tensor bottom_grad, Tensor mask_grad, const int kernel_size,
+ const int group_size, const int scale_factor);
+
+void carafe_naive_forward_cuda(Tensor features, Tensor masks, Tensor output,
+ int kernel_size, int group_size,
+ int scale_factor) {
+ CARAFENAIVEForwardCUDAKernelLauncher(features, masks, output, kernel_size,
+ group_size, scale_factor);
+}
+
+void carafe_naive_backward_cuda(Tensor top_grad, Tensor features, Tensor masks,
+ Tensor bottom_grad, Tensor mask_grad,
+ int kernel_size, int group_size,
+ int scale_factor) {
+ CARAFENAIVEBackwardCUDAKernelLauncher(top_grad, features, masks, bottom_grad,
+ mask_grad, kernel_size, group_size,
+ scale_factor);
+}
+#endif
+
+void carafe_naive_forward(Tensor features, Tensor masks, Tensor output,
+ int kernel_size, int group_size, int scale_factor) {
+ if (features.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(features);
+ CHECK_CUDA_INPUT(masks);
+ CHECK_CUDA_INPUT(output);
+ carafe_naive_forward_cuda(features, masks, output, kernel_size, group_size,
+ scale_factor);
+#else
+ AT_ERROR("CarafeNaive is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("CarafeNaive is not implemented on CPU");
+ }
+}
+
+void carafe_naive_backward(Tensor top_grad, Tensor features, Tensor masks,
+ Tensor bottom_grad, Tensor mask_grad,
+ int kernel_size, int group_size, int scale_factor) {
+ if (top_grad.device().is_cuda()) {
+#ifdef MMCV_WITH_CUDA
+ CHECK_CUDA_INPUT(top_grad);
+ CHECK_CUDA_INPUT(features);
+ CHECK_CUDA_INPUT(masks);
+ CHECK_CUDA_INPUT(bottom_grad);
+ CHECK_CUDA_INPUT(mask_grad);
+ carafe_naive_backward_cuda(top_grad, features, masks, bottom_grad,
+ mask_grad, kernel_size, group_size,
+ scale_factor);
+#else
+ AT_ERROR("CarafeNaive is not compiled with GPU support");
+#endif
+ } else {
+ AT_ERROR("CarafeNaive is not implemented on CPU");
+ }
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive_cuda.cu b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive_cuda.cu
new file mode 100644
index 0000000000000000000000000000000000000000..ffc05c8fa588b98ee5ab3432ec146a928ac2509e
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive_cuda.cu
@@ -0,0 +1,51 @@
+#include "carafe_naive_cuda_kernel.cuh"
+#include "pytorch_cuda_helper.hpp"
+
+void CARAFENAIVEForwardCUDAKernelLauncher(const Tensor features,
+ const Tensor masks, Tensor output,
+ const int kernel_size,
+ const int group_size,
+ const int scale_factor) {
+ int output_size = output.numel();
+ int channels = output.size(1);
+ int height = output.size(2);
+ int width = output.size(3);
+
+ at::cuda::CUDAGuard device_guard(features.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ features.scalar_type(), "CARAFENAIVEForward", ([&] {
+        carafe_naive_forward_cuda_kernel<scalar_t>
+            <<<GET_BLOCKS(output_size), THREADS_PER_BLOCK, 0, stream>>>(
+                output_size, features.data_ptr<scalar_t>(),
+                masks.data_ptr<scalar_t>(), output.data_ptr<scalar_t>(),
+ kernel_size, group_size, scale_factor, channels, height, width);
+ }));
+
+ AT_CUDA_CHECK(cudaGetLastError());
+}
+
+void CARAFENAIVEBackwardCUDAKernelLauncher(
+ const Tensor top_grad, const Tensor features, const Tensor masks,
+ Tensor bottom_grad, Tensor mask_grad, const int kernel_size,
+ const int group_size, const int scale_factor) {
+ int output_size = top_grad.numel();
+ int channels = top_grad.size(1);
+ int height = top_grad.size(2);
+ int width = top_grad.size(3);
+
+ at::cuda::CUDAGuard device_guard(top_grad.device());
+ cudaStream_t stream = at::cuda::getCurrentCUDAStream();
+ AT_DISPATCH_FLOATING_TYPES_AND_HALF(
+ top_grad.scalar_type(), "CARAFENAIVEBackward", ([&] {
+        carafe_naive_backward_cuda_kernel<scalar_t>
+            <<<GET_BLOCKS(output_size), THREADS_PER_BLOCK, 0, stream>>>(
+                output_size, top_grad.data_ptr<scalar_t>(),
+                features.data_ptr<scalar_t>(), masks.data_ptr<scalar_t>(),
+                bottom_grad.data_ptr<scalar_t>(),
+                mask_grad.data_ptr<scalar_t>(), kernel_size, group_size,
+ scale_factor, channels, height, width);
+ }));
+
+ AT_CUDA_CHECK(cudaGetLastError());
+}
diff --git a/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive_parrots.cpp b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive_parrots.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..78dfe09d424367826bcc7dafd6c6840466fc0c3b
--- /dev/null
+++ b/PyTorch/contrib/cv/detection/GCNet/mmcv/ops/csrc/parrots/carafe_naive_parrots.cpp
@@ -0,0 +1,73 @@
+#include <parrots/compute/aten.hpp>
+#include <parrots/extension.hpp>