# CenterMask

**Repository Path**: wan_xin_jun/CenterMask

## Basic Information

- **Project Name**: CenterMask
- **Description**: CenterMask : Real-Time Anchor-Free Instance Segmentation, in CVPR 2020
- **Primary Language**: Unknown
- **License**: BSD-2-Clause
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-06-28
- **Last Updated**: 2024-07-06

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# [CenterMask](https://arxiv.org/abs/1911.06667) : Real-Time Anchor-Free Instance Segmentation

![architecture](figures/architecture.png)

## Abstract

We propose a simple yet efficient anchor-free instance segmentation method, called **CenterMask**, that adds a novel spatial attention-guided mask (SAG-Mask) branch to the anchor-free one-stage object detector (FCOS), in the same vein as Mask R-CNN. Plugged into the FCOS object detector, the SAG-Mask branch predicts a segmentation mask on each box with a spatial attention map that helps it focus on informative pixels and suppress noise. We also present an improved **VoVNetV2** backbone network with two effective strategies: (1) residual connections for alleviating the saturation problem of larger VoVNets and (2) effective Squeeze-Excitation (eSE) for dealing with the information loss problem of the original SE. With SAG-Mask and VoVNetV2, we design CenterMask and CenterMask-Lite, targeted at large and small models, respectively. CenterMask outperforms all previous state-of-the-art models at a much faster speed. CenterMask-Lite also achieves 33.4% mask AP / 38.0% box AP, outperforming YOLACT by 2.6 / 7.0 AP, respectively, at over 35 fps on a Titan Xp. We hope that CenterMask and VoVNetV2 can serve as solid baselines for real-time instance segmentation and as a backbone network for various vision tasks, respectively.
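A minimal PyTorch sketch of the two modules described above, the eSE channel attention and the SAG-Mask spatial attention map; class names, kernel sizes, and the shape check are illustrative and follow the paper's description rather than the repository's exact implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ESEModule(nn.Module):
    """effective Squeeze-Excitation (eSE): a single channel-preserving 1x1 conv
    replaces the two reduce/expand FC layers of the original SE, avoiding the
    channel information loss the abstract mentions."""

    def __init__(self, channels):
        super().__init__()
        self.fc = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        w = F.adaptive_avg_pool2d(x, 1)       # global channel descriptor
        w = F.relu6(self.fc(w) + 3.0) / 6.0   # hard-sigmoid gate
        return x * w                          # channel-wise reweighting


class SpatialAttention(nn.Module):
    """SAG-Mask attention map: channel-wise average- and max-pooled maps are
    concatenated, convolved, and passed through a sigmoid to weight the mask
    features spatially."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=3, padding=1)

    def forward(self, x):
        avg_map = torch.mean(x, dim=1, keepdim=True)
        max_map, _ = torch.max(x, dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn


# Shape check on a dummy RoI feature map (batch of 2, 256 channels, 14x14):
feat = torch.randn(2, 256, 14, 14)
print(ESEModule(256)(feat).shape)       # torch.Size([2, 256, 14, 14])
print(SpatialAttention()(feat).shape)   # torch.Size([2, 256, 14, 14])
```

Both modules are drop-in reweightings: they preserve the input shape, so they can be inserted after any backbone stage (eSE) or mask-branch feature map (spatial attention) without changing the surrounding architecture.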
## Highlights

- ***First* anchor-free one-stage instance segmentation.** To the best of our knowledge, **CenterMask** is the first instance segmentation method built on top of an anchor-free object detector (15/11/2019).
- **Toward real-time: CenterMask-Lite.** This work provides not only the large-scale CenterMask but also the lightweight CenterMask-Lite, which can run at real-time speed (> 30 fps).
- **State-of-the-art performance.** CenterMask outperforms Mask R-CNN, TensorMask, and ShapeMask at a much faster speed, and the CenterMask-Lite models surpass YOLACT and YOLACT++ by large margins.
- **Well-balanced (speed/accuracy) backbone network, VoVNetV2.** VoVNetV2 shows better performance and faster speed than ResNe(X)t or HRNet.

## Updates

- Opened the official repo; code will be released after refactoring. (05/12/2019)
- Released code and the MobileNetV2 & ResNet backbone models shown in the [[`paper`]](https://arxiv.org/abs/1911.06667). (10/12/2019)
- Uploaded the VoVNetV2 backbone models. (02/01/2020)
- Opened the VoVNetV2 backbone for [Detectron2](https://github.com/youngwanLEE/detectron2/tree/vovnet/projects/VoVNet) --> [vovnet-detectron2](https://github.com/youngwanLEE/vovnet-detectron2). (08/01/2020)
- Uploaded CenterMask-Lite models trained for 48 epochs, outperforming [YOLACT](https://arxiv.org/abs/1904.02689) and [YOLACT++](https://arxiv.org/abs/1912.06218). (14/01/2020)
- [centermask2](https://github.com/youngwanLEE/centermask2) has been released. (20/02/2020)

## Models

### Environment

- V100 or Titan Xp GPU
- CUDA 10.0
- cuDNN 7.3
- PyTorch 1.1
- Implemented on [FCOS](https://github.com/tianzhi0549/FCOS) and [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark)
- [Google Drive weight download](https://drive.google.com/drive/folders/1llkxG5lKK7lZZ0W__7u5M5m4Ddf4YIWr?usp=sharing)
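The released weights and compiled CUDA ops are tied to these versions, so a quick comparison of the local setup against the list above can save debugging time; this sketch uses only standard PyTorch introspection calls:

```python
import torch

# Reference setup for the released models: PyTorch 1.1, CUDA 10.0, cuDNN 7.3.
print("PyTorch :", torch.__version__)
print("CUDA    :", torch.version.cuda)
print("cuDNN   :", torch.backends.cudnn.version())
print("GPU     :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
```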
### coco test-dev results

| Detector | Backbone | epoch | Mask AP (AP/APs/APm/APl) | Box AP (AP/APs/APm/APl) | Time (ms) | GPU | Weight |
|----------|----------|:---:|:-------------------:|:------------------------:|:---:|:---:|:---:|
| [ShapeMask](https://arxiv.org/abs/1904.03239) | R-101-FPN | N/A | 37.4/16.1/40.1/53.8 | 42.2/24.9/45.2/52.7 | 125 | V100 | - |
| [TensorMask](https://arxiv.org/abs/1903.12174) | R-101-FPN | 72 | 37.1/17.4/39.1/51.6 | - | 380 | V100 | - |
| [RetinaMask](https://arxiv.org/abs/1901.03353) | R-101-FPN | 24 | 34.7/14.3/36.7/50.5 | 41.4/23.0/44.5/53.0 | 98 | V100 | - |
| [Mask R-CNN](https://arxiv.org/abs/1703.06870) | R-101-FPN | 24 | 37.9/18.1/40.3/53.3 | 42.2/24.9/45.2/52.7 | 94 | V100 | [link](https://www.dropbox.com/s/rs1rgl5lupw576a/FRCN-V-57-FPN-2x-norm.pth?dl=1) |
| **CenterMask** | R-101-FPN | 24 | 38.3/17.7/40.8/54.5 | 43.1/25.2/46.1/54.4 | **72** | V100 | [link](https://www.dropbox.com/s/9w17k9iiihob8vx/centermask-R-101-ms-2x.pth?dl=1) |
| **CenterMask** | X-101-FPN | 36 | 39.6/19.7/42.0/55.2 | 44.6/27.1/47.2/55.2 | 123 | V100 | [link](https://www.dropbox.com/s/yrczyb1u49hv05a/centermask-X-101-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | V2-99-FPN | 36 | 40.6/20.1/42.8/57.0 | 45.8/27.8/48.3/57.6 | 84 | V100 | [link](https://www.dropbox.com/s/99i7ydsz2ngrvu1/centermask-V2-99-FPN-ms-3x.pth?dl=1) |
||
| [YOLACT-400](https://arxiv.org/abs/1904.02689) | R-101-FPN | 48 | 24.9/5.0/25.3/45.0 | 28.4/10.7/28.9/43.1 | 22 | Xp | - |
| **CenterMask-Lite** | MV2-FPN | 48 | 26.7/9.0/27.0/40.9 | 30.2/14.2/31.9/40.9 | **20** | Xp | [link](https://www.dropbox.com/s/fk9m4uqkhrpkqc6/centermask-lite-M-v2-bs16-4x.pth?dl=1) |
||
| [YOLACT-550](https://arxiv.org/abs/1904.02689) | R-50-FPN | 48 | 28.2/9.2/29.3/44.8 | 30.3/14.0/31.2/43.0 | 23 | Xp | - |
| **CenterMask-Lite** | V2-19-FPN | 48 | 32.4/13.6/33.8/47.2 | 35.9/19.6/38.0/45.9 | **23** | Xp | [link](https://www.dropbox.com/s/alifk31z3roife1/centermask-lite-V-19-eSE-ms-bs16-4x.pth?dl=1) |
||
| [YOLACT-550](https://arxiv.org/abs/1904.02689) | R-101-FPN | 48 | 29.8/9.9/31.3/47.7 | 31.0/14.4/31.8/43.7 | 30 | Xp | - |
| [YOLACT-550++](https://arxiv.org/abs/1912.06218) | R-50-FPN | 48 | 34.1/11.7/36.1/53.6 | - | 29 | Xp | - |
| [YOLACT-550++](https://arxiv.org/abs/1912.06218) | R-101-FPN | 48 | 34.6/11.9/36.8/55.1 | - | 36 | Xp | - |
| **CenterMask-Lite** | R-50-FPN | 48 | 32.9/12.9/34.7/48.7 | 36.7/18.7/39.4/48.2 | 29 | Xp | [link](https://www.dropbox.com/s/nbuoit8ewd7ii4f/centermask-lite-R-50-ms-bs16-4x.pth?dl=1) |
| **CenterMask-Lite** | V2-39-FPN | 48 | 36.3/15.6/38.1/53.1 | 40.7/22.4/43.2/53.5 | **28** | Xp | [link](https://www.dropbox.com/s/s3atq9nzqtmdvpi/centermask-lite-V-39-eSE-ms-bs16-4x.pth?dl=1) |

*Note that RetinaMask, Mask R-CNN, and CenterMask are implemented using the same baseline code ([maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark)) and all models are trained using multi-scale training augmentation.*\
*We expect that implementing CenterMask on [detectron2](https://github.com/facebookresearch/detectron2) would yield better performance.*\
*24/36/48/72 epochs correspond to the 2x/3x/4x/6x training schedules in [detectron](https://github.com/facebookresearch/Detectron), respectively.*\
*Training the CenterMask-Lite models longer (24 --> 48 epochs, the same as YOLACT) boosts their performance, widening the gap over YOLACT and even YOLACT++.*

### coco val2017 results

| Detector | Backbone | epoch | Mask AP (AP/APs/APm/APl) | Box AP (AP/APs/APm/APl) | Time (ms) | Weight |
|----------|----------|:---:|:-------------------:|:------------------------:|:---:|:---:|
| **CenterMask** | MV2-FPN | 36 | 31.2/14.5/32.8/46.3 | 35.5/20.6/38.0/46.8 | **56** | [link](https://www.dropbox.com/s/t1vjdqgix7a632a/centermask-M-v2-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | **V2-19-FPN** | 36 | 34.7/17.3/37.5/49.6 | 39.7/24.6/42.7/50.8 | 59 | [link](https://www.dropbox.com/s/guy4b2cstnsvddj/centermask-V-19-eSE-FPN-ms-3x.pth?dl=1) |
||
| Mask R-CNN | R-50-FPN | 24 | 35.9/17.1/38.9/52.0 | 39.7/24.0/43.0/50.8 | 77 | [link](https://www.dropbox.com/s/r3ocl8ls45wsbgo/MRCN-R-50-FPN-ms-2x.pth?dl=1) |
| **CenterMask** | R-50-FPN | 24 | 36.4/17.3/39.5/52.7 | 41.2/24.9/45.1/53.0 | 72 | [link](https://www.dropbox.com/s/bhpf6jud8ovvxmh/centermask-R-50-FPN-ms-2x.pth?dl=1) |
| **CenterMask** | **V2-39-FPN** | 24 | 37.7/17.9/40.8/54.3 | 42.6/25.3/46.3/55.2 | **70** | [link](https://www.dropbox.com/s/ugcpzcx5b4btvjc/centermask-V2-39-FPN-ms-2x.pth?dl=1) |
| Mask R-CNN | R-50-FPN | 36 | 36.5/17.9/39.2/52.5 | 40.5/24.7/43.7/52.2 | 77 | [link](https://www.dropbox.com/s/09ny9ofj5t1r883/MRCN-R-50-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | R-50-FPN | 36 | 37.0/17.6/39.7/53.8 | 41.7/24.8/45.1/54.5 | 72 | [link](https://www.dropbox.com/s/438pbeuqlj1spf0/centermask-R-50-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | **V2-39-FPN** | 36 | 38.5/19.0/41.5/54.7 | 43.5/27.1/46.9/55.9 | **70** | [link](https://www.dropbox.com/s/5mmq2ok0yopupnz/centermask-V2-39-FPN-ms-3x.pth?dl=1) |
||
| Mask R-CNN | R-101-FPN | 24 | 37.8/18.5/40.7/54.9 | 42.2/25.8/45.8/54.0 | 94 | [link](https://www.dropbox.com/s/ptjc4qorps5gbwe/MRCN-R-101-FPN-ms-2x.pth?dl=1) |
| **CenterMask** | R-101-FPN | 24 | 38.0/18.2/41.3/55.2 | 43.1/25.7/47.0/55.6 | 91 | [link](https://dl.dropbox.com/s/9w17k9iiihob8vx/centermask-R-101-ms-2x.pth?dl=1) |
| **CenterMask** | **V2-57-FPN** | 24 | 38.5/18.6/41.9/56.2 | 43.8/26.7/47.4/57.1 | **76** | [link](https://www.dropbox.com/s/949k1ednumtd2rk/centermask-V2-57-FPN-ms-2x.pth?dl=1) |
| Mask R-CNN | R-101-FPN | 36 | 38.0/18.4/40.8/55.2 | 42.4/25.4/45.5/55.2 | 94 | [link](https://www.dropbox.com/s/hev2k4vfh362d3s/MRCN-R-101-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | R-101-FPN | 36 | 38.6/19.2/42.0/56.1 | 43.7/27.2/47.6/56.7 | 91 | [link](https://www.dropbox.com/s/1uxpfh8z0sp8tr2/centermask-R-101-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | **V2-57-FPN** | 36 | 39.4/19.6/42.9/55.9 | 44.6/27.7/48.3/57.3 | **76** | [link](https://www.dropbox.com/s/5m5tc4h30tqp2it/centermask-V2-57-FPN-ms-3x.pth?dl=1) |
||
| Mask R-CNN | X-101-32x8d-FPN | 24 | 38.9/19.6/41.6/55.7 | 43.7/27.6/46.9/55.9 | 165 | [link](https://www.dropbox.com/s/o6uu0nft0a8iu5s/MRCN-X-101-FPN-ms-2x.pth?dl=1) |
| **CenterMask** | X-101-32x8d-FPN | 24 | 39.1/19.6/42.5/56.1 | 44.3/26.9/48.5/57.0 | 157 | [link](https://www.dropbox.com/s/ovhzjz43nph14mo/centermask-X-101-FPN-ms-2x.pth?dl=1) |
| **CenterMask** | **V2-99-FPN** | 24 | 39.6/19.6/43.1/56.9 | 44.8/27.6/49.0/57.7 | **106** | [link](https://www.dropbox.com/s/lemwoq6qwoqnbzm/centermask-V2-99-FPN-ms-2x.pth?dl=1) |
| Mask R-CNN | X-101-32x8d-FPN | 36 | 38.6/19.7/41.1/55.2 | 43.6/27.3/46.7/55.6 | 165 | [link](https://www.dropbox.com/s/yl3zmeaxghvni43/MRCN-X-101-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | X-101-32x8d-FPN | 36 | 39.1/18.5/42.3/56.4 | 44.4/26.7/47.7/57.1 | 157 | [link](https://www.dropbox.com/s/yrczyb1u49hv05a/centermask-X-101-FPN-ms-3x.pth?dl=1) |
| **CenterMask** | **V2-99-FPN** | 36 | 40.2/20.6/43.5/57.3 | 45.6/29.2/49.3/58.8 | **106** | [link](https://www.dropbox.com/s/99i7ydsz2ngrvu1/centermask-V2-99-FPN-ms-3x.pth?dl=1) |

*Note that all models are trained using **train-time augmentation (multi-scale)**.*\
*The inference time of all models is measured on a **Titan Xp** GPU.*\
*24/36 epochs correspond to the 2x/3x training schedules in [detectron](https://github.com/facebookresearch/Detectron), respectively.*

## Installation

Check [INSTALL.md](INSTALL.md) for installation instructions, which originate from [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark).

## Training

Follow [the instructions](https://github.com/facebookresearch/maskrcnn-benchmark#multi-gpu-training) of the [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark) guide. For multi-GPU (e.g., 8-GPU) training:

```bash
export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file "configs/centermask/centermask_R_50_FPN_1x.yaml"
```

## Evaluation

Follow [the instructions](https://github.com/facebookresearch/maskrcnn-benchmark#evaluation) of [maskrcnn-benchmark](https://github.com/facebookresearch/maskrcnn-benchmark).

First of all, you have to download the weight file you want to run inference with. For example (CenterMask-Lite-R-50):

##### Multi-GPU evaluation & test batch size 16

```bash
wget https://www.dropbox.com/s/2enqxenccz4xy6l/centermask-lite-R-50-ms-bs32-1x.pth
export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/test_net.py --config-file "configs/centermask/centermask_R_50_FPN_lite_res600_ms_bs32_1x.yaml" TEST.IMS_PER_BATCH 16 MODEL.WEIGHT centermask-lite-R-50-ms-bs32-1x.pth
```

##### Single-GPU evaluation & test batch size 1

```bash
wget https://www.dropbox.com/s/2enqxenccz4xy6l/centermask-lite-R-50-ms-bs32-1x.pth
CUDA_VISIBLE_DEVICES=0 python tools/test_net.py --config-file "configs/centermask/centermask_R_50_FPN_lite_res600_ms_bs32_1x.yaml" TEST.IMS_PER_BATCH 1 MODEL.WEIGHT centermask-lite-R-50-ms-bs32-1x.pth
```
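A quick demo is still an open item in the TODO below. Because CenterMask is implemented on maskrcnn-benchmark, single-image inference can be sketched with that codebase's `COCODemo` helper from `demo/predictor.py`; it is an assumption that this helper is present in this repository, and the file paths below are illustrative:

```python
# Hypothetical single-image inference using maskrcnn-benchmark's COCODemo
# helper; run with the repo's demo/ directory on PYTHONPATH.
import cv2
from maskrcnn_benchmark.config import cfg
from predictor import COCODemo

# Config and weight file from the evaluation example above.
cfg.merge_from_file("configs/centermask/centermask_R_50_FPN_lite_res600_ms_bs32_1x.yaml")
cfg.merge_from_list(["MODEL.WEIGHT", "centermask-lite-R-50-ms-bs32-1x.pth"])

demo = COCODemo(cfg, min_image_size=600, confidence_threshold=0.5)
image = cv2.imread("input.jpg")           # BGR image, as OpenCV loads it
result = demo.run_on_opencv_image(image)  # boxes and masks drawn on a copy
cv2.imwrite("output.jpg", result)
```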
## TODO

- [x] train-time augmentation + 3x schedule for comparison with detectron2 models
- [x] ResNet-50 & ResNeXt-101-32x8d
- [x] VoVNetV2 backbones
- [x] VoVNetV2 backbones for [Detectron2](https://github.com/youngwanLEE/detectron2/tree/vovnet/projects/VoVNet)
- [x] CenterMask in [Detectron2](https://github.com/youngwanLEE/detectron2/tree/vovnet/projects/VoVNet)
- [ ] quick-demo
- [ ] arxiv paper update

## Performance

![visualization](figures/quality.png)
![results_table](figures/results.png)

## Citing CenterMask

Please cite our paper in your publications if it helps your research:

```BibTeX
@inproceedings{lee2019centermask,
  title={CenterMask: Real-Time Anchor-Free Instance Segmentation},
  author={Lee, Youngwan and Park, Jongyoul},
  booktitle={CVPR},
  year={2020}
}
```