diff --git a/models b/models new file mode 160000 index 0000000000000000000000000000000000000000..1329e3bb2a78bccd508a630eec0376c9ece1e104 --- /dev/null +++ b/models @@ -0,0 +1 @@ +Subproject commit 1329e3bb2a78bccd508a630eec0376c9ece1e104 diff --git a/research/cv/ssd_resnet50/Dockerfile b/research/cv/ssd_resnet50/Dockerfile index fcb31f207f23664ca2d60bda9a15463af1042dd9..1b847a3ec546252b0d6c8531f89cb202a111c9da 100644 --- a/research/cv/ssd_resnet50/Dockerfile +++ b/research/cv/ssd_resnet50/Dockerfile @@ -1,6 +1,6 @@ -ARG FROM_IMAGE_NAME -FROM ${FROM_IMAGE_NAME} - -RUN apt install libgl1-mesa-glx -y -COPY requirements.txt . -RUN pip3.7 install -r requirements.txt +ARG FROM_IMAGE_NAME +FROM ${FROM_IMAGE_NAME} + +RUN apt install libgl1-mesa-glx -y +COPY requirements.txt . +RUN pip3.7 install -r requirements.txt diff --git a/research/cv/ssd_resnet50/README.md b/research/cv/ssd_resnet50/README.md index 6a9c4bbd6c2b8b1f05b67437b210df30947f1c85..3cb5d376123e3ed1d2a398a088df42f6303eee0e 100644 --- a/research/cv/ssd_resnet50/README.md +++ b/research/cv/ssd_resnet50/README.md @@ -1,352 +1,352 @@ -# Contents - -- [Contents](#contents) - - [SSD Description](#ssd-description) - - [Model Architecture](#model-architecture) - - [Dataset](#dataset) - - [Environment Requirements](#environment-requirements) - - [Quick Start](#quick-start) - - [Prepare the model](#prepare-the-model) - - [Run the scripts](#run-the-scripts) - - [Script Description](#script-description) - - [Script and Sample Code](#script-and-sample-code) - - [Script Parameters](#script-parameters) - - [Training Process](#training-process) - - [Training on Ascend](#training-on-ascend) - - [Evaluation Process](#evaluation-process) - - [Evaluation on Ascend](#evaluation-on-ascend) - - [Performance](#performance) - - [Export Process](#Export-process) - - [Export](#Export) - - [Inference Process](#Inference-process) - - [Inference](#Inference) - - [Description of Random Situation](#description-of-random-situation) - - [ModelZoo Homepage](#modelzoo-homepage) - -## [SSD Description](#contents) - -SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape.Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. - -[Paper](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.European Conference on Computer Vision (ECCV), 2016 (In press). - -## [Model Architecture](#contents) - -The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step to produce the final detections. The early network layers are based on a standard architecture used for high quality image classification, which is called the base network. Then add auxiliary structure to the network to produce detections. - -## [Dataset](#contents) - -Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. 
- -Dataset used: [COCO2017]() - -- Dataset size:19G - - Train:18G,118000 images - - Val:1G,5000 images - - Annotations:241M,instances,captions,person_keypoints etc -- Data format:image and json files - - Note:Data will be processed in dataset.py - -## [Environment Requirements](#contents) - -- Install [MindSpore](https://www.mindspore.cn/install/en). - -- Download the dataset COCO2017. - -- We use COCO2017 as training dataset in this example by default, and you can also use your own datasets. - First, install Cython ,pycocotool and opencv to process data and to get evaluation result. - - ```shell - pip install Cython - - pip install pycocotools - - pip install opencv-python - ``` - - 1. If coco dataset is used. **Select dataset to coco when run script.** - - Change the `coco_root` and other settings you need in `src/config.py`. The directory structure is as follows: - - ```shell - . - └─coco_dataset - ├─annotations - ├─instance_train2017.json - └─instance_val2017.json - ├─val2017 - └─train2017 - ``` - - 2. If VOC dataset is used. **Select dataset to voc when run script.** - Change `classes`, `num_classes`, `voc_json` and `voc_root` in `src/config.py`. `voc_json` is the path of json file with coco format for evaluation, `voc_root` is the path of VOC dataset, the directory structure is as follows: - - ```shell - . - └─voc_dataset - └─train - ├─0001.jpg - └─0001.xml - ... - ├─xxxx.jpg - └─xxxx.xml - └─eval - ├─0001.jpg - └─0001.xml - ... - ├─xxxx.jpg - └─xxxx.xml - ``` - - 3. If your own dataset is used. **Select dataset to other when run script.** - Organize the dataset information into a TXT file, each row in the file is as follows: - - ```shell - train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2 - ``` - - Each row is an image annotation which split by space, the first column is a relative path of image, the others are box and class infomations of the format [xmin,ymin,xmax,ymax,class]. We read image from an image path joined by the `image_dir`(dataset directory) and the relative path in `anno_path`(the TXT file path), `image_dir` and `anno_path` are setting in `src/config.py`. - -## [Quick Start](#contents) - -### Run the scripts - -After installing MindSpore via the official website, you can start training and evaluation as follows: - -- running on Ascend - -```shell -# distributed training on Ascend -bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] - -# training on single NPU -bash run_standalone_train.sh - -# run eval on Ascend -bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] - -``` - -- Run on docker - -Build docker images(Change version to the one you actually used) - -```shell -# build docker -docker build -t ssd:20.1.0 . --build-arg FROM_IMAGE_NAME=ascend-mindspore-arm:20.1.0 -``` - -Create a container layer over the created image and start it - -```shell -# start docker -bash scripts/docker_start.sh ssd:20.1.0 [DATA_DIR] [MODEL_DIR] -``` - -Then you can run everything just like on ascend. - -## [Script Description](#contents) - -### [Script and Sample Code](#contents) - -```shell -. 
-└─ cv - └─ ssd - ├─ README.md # descriptions about SSD - ├─ ascend310_infer # application for 310 inference - ├─ scripts - ├─ run_distribute_train.sh # shell script for distributed on ascend - ├─ run_eval.sh # shell script for eval on ascend - └─ run_infer_310.sh # shell script for 310 inference - ├─ src - ├─ __init__.py # init file - ├─ box_utils.py # bbox utils - ├─ eval_utils.py # metrics utils - ├─ config.py # total config - ├─ dataset.py # create dataset and process dataset - ├─ init_params.py # parameters utils - ├─ lr_schedule.py # learning ratio generator - └─ ssd.py # ssd architecture - ├─ eval.py # eval scripts - ├─ train.py # train scripts - ├─ export.py # export mindir script - ├─ postprogress.py # post process for 310 inference - └─ mindspore_hub_conf.py # mindspore hub interface -``` - -### [Script Parameters](#contents) - - ```shell - Major parameters in train.py and config.py as follows: - - "device_num": 1 # Use device nums - "lr": 0.05 # Learning rate init value - "dataset": coco # Dataset name - "epoch_size": 500 # Epoch size - "batch_size": 32 # Batch size of input tensor - "pre_trained": None # Pretrained checkpoint file path - "pre_trained_epoch_size": 0 # Pretrained epoch size - "save_checkpoint_epochs": 10 # The epoch interval between two checkpoints. By default, the checkpoint will be saved per 10 epochs - "loss_scale": 1024 # Loss scale - "filter_weight": False # Load parameters in head layer or not. If the class numbers of train dataset is different from the class numbers in pre_trained checkpoint, please set True. - "freeze_layer": "none" # Freeze the backbone parameters or not, support none and backbone. - - "class_num": 81 # Dataset class number - "image_shape": [300, 300] # Image height and width used as input to the model - "mindrecord_dir": "/data/MindRecord_COCO" # MindRecord path - "coco_root": "/data/coco2017" # COCO2017 dataset path - "voc_root": "/data/voc_dataset" # VOC original dataset path - "voc_json": "annotations/voc_instances_val.json" # is the path of json file with coco format for evaluation - "image_dir": "" # Other dataset image path, if coco or voc used, it will be useless - "anno_path": "" # Other dataset annotation path, if coco or voc used, it will be useless - - ``` - -### [Training Process](#contents) - -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/programming_guide/en/master/convert_dataset.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** - -#### Training on Ascend - -- Distribute mode - -```shell - bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) -``` - -We need five or seven parameters for this scripts. - -- `DEVICE_NUM`: the device number for distributed train. -- `EPOCH_NUM`: epoch num for distributed train. -- `LR`: learning rate init value for distributed train. -- `DATASET`:the dataset mode for distributed train. -- `RANK_TABLE_FILE :` the path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools), it is better to use absolute path. -- `PRE_TRAINED :` the path of pretrained checkpoint file, it is better to use absolute path. -- `PRE_TRAINED_EPOCH_SIZE :` the epoch num of pretrained. 
- -Training result will be stored in the current path, whose folder name begins with "LOG". Under this, you can find checkpoint file together with result like the followings in log - -```shell -epoch: 1 step: 458, loss is 3.1681802 -epoch time: 228752.4654865265, per step time: 499.4595316299705 -epoch: 2 step: 458, loss is 2.8847265 -epoch time: 38912.93382644653, per step time: 84.96273761232868 -epoch: 3 step: 458, loss is 2.8398118 -epoch time: 38769.184827804565, per step time: 84.64887516987896 -... - -epoch: 498 step: 458, loss is 0.70908034 -epoch time: 38771.079778671265, per step time: 84.65301261718616 -epoch: 499 step: 458, loss is 0.7974688 -epoch time: 38787.413120269775, per step time: 84.68867493508685 -epoch: 500 step: 458, loss is 0.5548882 -epoch time: 39064.8467540741, per step time: 85.29442522723602 -``` - -### [Evaluation Process](#contents) - -#### Evaluation on Ascend - -```shell -bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] -``` - -We need two parameters for this scripts. - -- `DATASET`:the dataset mode of evaluation dataset. -- `CHECKPOINT_PATH`: the absolute path for checkpoint file. -- `DEVICE_ID`: the device id for eval. - -> checkpoint can be produced in training process. - -Inference result will be stored in the example path, whose folder name begins with "eval". Under this, you can find result like the followings in log. - -```shell -Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 -Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474 -Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 -Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120 -Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.350 -Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.315 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.511 -Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208 -Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.557 -Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.689 - -======================================== - -mAP: 0.32719216721918915 - -``` - -## [Export Process](#contents) - -### [Export](#content) - -```shell -python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format[EXPORT_FORMAT] -``` - -`EXPORT_FORMAT` should be in ["AIR", "MINDIR"] - -## [Inference Process](#contents) - -### [Inference](#content) - -Before performing inference, we need to export model first. Air model can only be exported in Ascend 910 environment, mindir model can be exported in any environment. -Current batch_ Size can only be set to 1. - -```shell -# Ascend310 inference -bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID] -``` - -Inference result will be stored in the example path, you can find result like the followings in acc.log. 
- -```shell -Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 -Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.475 -Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 -Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.115 -Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.353 -Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.314 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.485 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.509 -Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200 -Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.554 -Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.692 - -mAP: 0.3266651054070853 -``` - -### [Performance](#contents) - -| 参数 | Ascend | -| ------------------- | --------------------- | -| 模型版本 | SSD resnet50 | -| 资源 | Ascend 910 | -| 上传日期 | 2020-03-29 | -| MindSpore版本 | 1.1.0 | -| 数据集 | COCO2017 | -| mAP | IoU=0.50: 32.7% | -| 模型大小 | 281M(.ckpt文件) | - -## [Export MindIR](#contents) - -```shell -python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT] -``` - -The ckpt_file parameter is required, -`EXPORT_FORMAT` should be in ["AIR", "MINDIR"] - -## [Description of Random Situation](#contents) - -In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py. - -## [ModelZoo Homepage](#contents) - - Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). +# Contents + +- [Contents](#contents) + - [SSD Description](#ssd-description) + - [Model Architecture](#model-architecture) + - [Dataset](#dataset) + - [Environment Requirements](#environment-requirements) + - [Quick Start](#quick-start) + - [Prepare the model](#prepare-the-model) + - [Run the scripts](#run-the-scripts) + - [Script Description](#script-description) + - [Script and Sample Code](#script-and-sample-code) + - [Script Parameters](#script-parameters) + - [Training Process](#training-process) + - [Training on Ascend](#training-on-ascend) + - [Evaluation Process](#evaluation-process) + - [Evaluation on Ascend](#evaluation-on-ascend) + - [Performance](#performance) + - [Export Process](#Export-process) + - [Export](#Export) + - [Inference Process](#Inference-process) + - [Inference](#Inference) + - [Description of Random Situation](#description-of-random-situation) + - [ModelZoo Homepage](#modelzoo-homepage) + +## [SSD Description](#contents) + +SSD discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape.Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. + +[Paper](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.European Conference on Computer Vision (ECCV), 2016 (In press). 
+ +## [Model Architecture](#contents) + +The SSD approach is based on a feed-forward convolutional network that produces a fixed-size collection of bounding boxes and scores for the presence of object class instances in those boxes, followed by a non-maximum suppression step to produce the final detections. The early network layers are based on a standard architecture used for high quality image classification, which is called the base network. Then add auxiliary structure to the network to produce detections. + +## [Dataset](#contents) + +Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. + +Dataset used: [COCO2017]() + +- Dataset size:19G + - Train:18G,118000 images + - Val:1G,5000 images + - Annotations:241M,instances,captions,person_keypoints etc +- Data format:image and json files + - Note:Data will be processed in dataset.py + +## [Environment Requirements](#contents) + +- Install [MindSpore](https://www.mindspore.cn/install/en). + +- Download the dataset COCO2017. + +- We use COCO2017 as training dataset in this example by default, and you can also use your own datasets. + First, install Cython ,pycocotool and opencv to process data and to get evaluation result. + + ```shell + pip install Cython + + pip install pycocotools + + pip install opencv-python + ``` + + 1. If coco dataset is used. **Select dataset to coco when run script.** + + Change the `coco_root` and other settings you need in `src/config.py`. The directory structure is as follows: + + ```shell + . + └─coco_dataset + ├─annotations + ├─instance_train2017.json + └─instance_val2017.json + ├─val2017 + └─train2017 + ``` + + 2. If VOC dataset is used. **Select dataset to voc when run script.** + Change `classes`, `num_classes`, `voc_json` and `voc_root` in `src/config.py`. `voc_json` is the path of json file with coco format for evaluation, `voc_root` is the path of VOC dataset, the directory structure is as follows: + + ```shell + . + └─voc_dataset + └─train + ├─0001.jpg + └─0001.xml + ... + ├─xxxx.jpg + └─xxxx.xml + └─eval + ├─0001.jpg + └─0001.xml + ... + ├─xxxx.jpg + └─xxxx.xml + ``` + + 3. If your own dataset is used. **Select dataset to other when run script.** + Organize the dataset information into a TXT file, each row in the file is as follows: + + ```shell + train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2 + ``` + + Each row is an image annotation which split by space, the first column is a relative path of image, the others are box and class infomations of the format [xmin,ymin,xmax,ymax,class]. We read image from an image path joined by the `image_dir`(dataset directory) and the relative path in `anno_path`(the TXT file path), `image_dir` and `anno_path` are setting in `src/config.py`. + +## [Quick Start](#contents) + +### Run the scripts + +After installing MindSpore via the official website, you can start training and evaluation as follows: + +- running on Ascend + +```shell +# distributed training on Ascend +bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] + +# training on single NPU +bash run_standalone_train.sh + +# run eval on Ascend +bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] + +``` + +- Run on docker + +Build docker images(Change version to the one you actually used) + +```shell +# build docker +docker build -t ssd:20.1.0 . 
--build-arg FROM_IMAGE_NAME=ascend-mindspore-arm:20.1.0 +``` + +Create a container layer over the created image and start it + +```shell +# start docker +bash scripts/docker_start.sh ssd:20.1.0 [DATA_DIR] [MODEL_DIR] +``` + +Then you can run everything just like on ascend. + +## [Script Description](#contents) + +### [Script and Sample Code](#contents) + +```shell +. +└─ cv + └─ ssd + ├─ README.md # descriptions about SSD + ├─ ascend310_infer # application for 310 inference + ├─ scripts + ├─ run_distribute_train.sh # shell script for distributed on ascend + ├─ run_eval.sh # shell script for eval on ascend + └─ run_infer_310.sh # shell script for 310 inference + ├─ src + ├─ __init__.py # init file + ├─ box_utils.py # bbox utils + ├─ eval_utils.py # metrics utils + ├─ config.py # total config + ├─ dataset.py # create dataset and process dataset + ├─ init_params.py # parameters utils + ├─ lr_schedule.py # learning ratio generator + └─ ssd.py # ssd architecture + ├─ eval.py # eval scripts + ├─ train.py # train scripts + ├─ export.py # export mindir script + ├─ postprogress.py # post process for 310 inference + └─ mindspore_hub_conf.py # mindspore hub interface +``` + +### [Script Parameters](#contents) + + ```shell + Major parameters in train.py and config.py as follows: + + "device_num": 1 # Use device nums + "lr": 0.05 # Learning rate init value + "dataset": coco # Dataset name + "epoch_size": 500 # Epoch size + "batch_size": 32 # Batch size of input tensor + "pre_trained": None # Pretrained checkpoint file path + "pre_trained_epoch_size": 0 # Pretrained epoch size + "save_checkpoint_epochs": 10 # The epoch interval between two checkpoints. By default, the checkpoint will be saved per 10 epochs + "loss_scale": 1024 # Loss scale + "filter_weight": False # Load parameters in head layer or not. If the class numbers of train dataset is different from the class numbers in pre_trained checkpoint, please set True. + "freeze_layer": "none" # Freeze the backbone parameters or not, support none and backbone. + + "class_num": 81 # Dataset class number + "image_shape": [300, 300] # Image height and width used as input to the model + "mindrecord_dir": "/data/MindRecord_COCO" # MindRecord path + "coco_root": "/data/coco2017" # COCO2017 dataset path + "voc_root": "/data/voc_dataset" # VOC original dataset path + "voc_json": "annotations/voc_instances_val.json" # is the path of json file with coco format for evaluation + "image_dir": "" # Other dataset image path, if coco or voc used, it will be useless + "anno_path": "" # Other dataset annotation path, if coco or voc used, it will be useless + + ``` + +### [Training Process](#contents) + +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/programming_guide/en/master/convert_dataset.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** + +#### Training on Ascend + +- Distribute mode + +```shell + bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) +``` + +We need five or seven parameters for this scripts. + +- `DEVICE_NUM`: the device number for distributed train. +- `EPOCH_NUM`: epoch num for distributed train. +- `LR`: learning rate init value for distributed train. +- `DATASET`:the dataset mode for distributed train. 
+- `RANK_TABLE_FILE :` the path of [rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools), it is better to use absolute path. +- `PRE_TRAINED :` the path of pretrained checkpoint file, it is better to use absolute path. +- `PRE_TRAINED_EPOCH_SIZE :` the epoch num of pretrained. + +Training result will be stored in the current path, whose folder name begins with "LOG". Under this, you can find checkpoint file together with result like the followings in log + +```shell +epoch: 1 step: 458, loss is 3.1681802 +epoch time: 228752.4654865265, per step time: 499.4595316299705 +epoch: 2 step: 458, loss is 2.8847265 +epoch time: 38912.93382644653, per step time: 84.96273761232868 +epoch: 3 step: 458, loss is 2.8398118 +epoch time: 38769.184827804565, per step time: 84.64887516987896 +... + +epoch: 498 step: 458, loss is 0.70908034 +epoch time: 38771.079778671265, per step time: 84.65301261718616 +epoch: 499 step: 458, loss is 0.7974688 +epoch time: 38787.413120269775, per step time: 84.68867493508685 +epoch: 500 step: 458, loss is 0.5548882 +epoch time: 39064.8467540741, per step time: 85.29442522723602 +``` + +### [Evaluation Process](#contents) + +#### Evaluation on Ascend + +```shell +bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] +``` + +We need two parameters for this scripts. + +- `DATASET`:the dataset mode of evaluation dataset. +- `CHECKPOINT_PATH`: the absolute path for checkpoint file. +- `DEVICE_ID`: the device id for eval. + +> checkpoint can be produced in training process. + +Inference result will be stored in the example path, whose folder name begins with "eval". Under this, you can find result like the followings in log. + +```shell +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.350 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.315 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.511 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.557 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.689 + +======================================== + +mAP: 0.32719216721918915 + +``` + +## [Export Process](#contents) + +### [Export](#content) + +```shell +python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format[EXPORT_FORMAT] +``` + +`EXPORT_FORMAT` should be in ["AIR", "MINDIR"] + +## [Inference Process](#contents) + +### [Inference](#content) + +Before performing inference, we need to export model first. Air model can only be exported in Ascend 910 environment, mindir model can be exported in any environment. +Current batch_ Size can only be set to 1. + +```shell +# Ascend310 inference +bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID] +``` + +Inference result will be stored in the example path, you can find result like the followings in acc.log. 
+ +```shell +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.475 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.115 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.353 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.314 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.485 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.509 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.554 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.692 + +mAP: 0.3266651054070853 +``` + +### [Performance](#contents) + +| 参数 | Ascend | +| ------------------- | --------------------- | +| 模型版本 | SSD resnet50 | +| 资源 | Ascend 910 | +| 上传日期 | 2020-03-29 | +| MindSpore版本 | 1.1.0 | +| 数据集 | COCO2017 | +| mAP | IoU=0.50: 32.7% | +| 模型大小 | 281M(.ckpt文件) | + +## [Export MindIR](#contents) + +```shell +python export.py --ckpt_file [CKPT_PATH] --file_name [FILE_NAME] --file_format [FILE_FORMAT] +``` + +The ckpt_file parameter is required, +`EXPORT_FORMAT` should be in ["AIR", "MINDIR"] + +## [Description of Random Situation](#contents) + +In dataset.py, we set the seed inside “create_dataset" function. We also use random seed in train.py. + +## [ModelZoo Homepage](#contents) + + Please check the official [homepage](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). diff --git a/research/cv/ssd_resnet50/README_CN.md b/research/cv/ssd_resnet50/README_CN.md index ba92664d1e0f142b670df405caeec8fecaaf1419..081e8678603060c2e143fe8a15926bb558927ff5 100644 --- a/research/cv/ssd_resnet50/README_CN.md +++ b/research/cv/ssd_resnet50/README_CN.md @@ -1,304 +1,304 @@ -# 目录 - - - -- [目录](#目录) -- [SSD说明](#ssd说明) -- [模型架构](#模型架构) -- [数据集](#数据集) -- [环境要求](#环境要求) -- [快速入门](#快速入门) -- [脚本说明](#脚本说明) - - [脚本及样例代码](#脚本及样例代码) - - [脚本参数](#脚本参数) - - [训练过程](#训练过程) - - [Ascend上训练](#ascend上训练) - - [评估过程](#评估过程) - - [Ascend处理器环境评估](#ascend处理器环境评估) - - [性能](#性能) - - [导出过程](#导出过程) - - [导出](#导出) - - [推理过程](#推理过程) - - [推理](#推理) -- [随机情况说明](#随机情况说明) -- [ModelZoo主页](#modelzoo主页) - - - -# SSD说明 - -SSD将边界框的输出空间离散成一组默认框,每个特征映射位置具有不同的纵横比和尺度。在预测时,网络对每个默认框中存在的对象类别进行评分,并对框进行调整以更好地匹配对象形状。此外,网络将多个不同分辨率的特征映射的预测组合在一起,自然处理各种大小的对象。 - -[论文](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.European Conference on Computer Vision (ECCV), 2016 (In press). - -# 模型架构 - -SSD方法基于前向卷积网络,该网络产生固定大小的边界框集合,并针对这些框内存在的对象类实例进行评分,然后通过非极大值抑制步骤进行最终检测。早期的网络层基于高质量图像分类的标准体系结构,被称为基础网络。后来通过向网络添加辅助结构进行检测。 - -# 数据集 - -使用的数据集: [COCO2017]() - -- 数据集大小:19 GB - - 训练集:18 GB,118000张图像 - - 验证集:1 GB,5000张图像 - - 标注:241 MB,实例,字幕,person_keypoints等 -- 数据格式:图像和json文件 - - 注意:数据在dataset.py中处理 - -# 环境要求 - -- 安装[MindSpore](https://www.mindspore.cn/install)。 - -- 下载数据集COCO2017。 - -- 本示例默认使用COCO2017作为训练数据集,您也可以使用自己的数据集。 - - 1. 如果使用coco数据集。**执行脚本时选择数据集coco。** - 安装Cython和pycocotool,也可以安装mmcv进行数据处理。 - - ```python - pip install Cython - - pip install pycocotools - - ``` - - 并在`config.py`中更改COCO_ROOT和其他您需要的设置。目录结构如下: - - ```text - . 
- └─cocodataset - ├─annotations - ├─instance_train2017.json - └─instance_val2017.json - ├─val2017 - └─train2017 - - ``` - - 2. 如果使用自己的数据集。**执行脚本时选择数据集为other。** - 将数据集信息整理成TXT文件,每行如下: - - ```text - train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2 - - ``` - - 每行是按空间分割的图像标注,第一列是图像的相对路径,其余为[xmin,ymin,xmax,ymax,class]格式的框和类信息。我们从`IMAGE_DIR`(数据集目录)和`ANNO_PATH`(TXT文件路径)的相对路径连接起来的图像路径中读取图像。在`config.py`中设置`IMAGE_DIR`和`ANNO_PATH`。 - -# 快速入门 - -通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: - -```shell script -# Ascend分布式训练 -bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] -``` - -```shell script -# 单卡训练 -bash run_standalone_train.sh -``` - -```shell script -# Ascend处理器环境运行eval -bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] -``` - -# 脚本说明 - -## 脚本及样例代码 - -```text -. -└─ cv - └─ ssd - ├─ README.md ## SSD相关说明 - ├─ ascend310_infer ## 实现310推理源代码 - ├─ scripts - ├─ run_distribute_train.sh ## Ascend分布式shell脚本 - ├─ run_infer_310.sh ## Ascend推理shell脚本 - └─ run_eval.sh ## Ascend评估shell脚本 - ├─ src - ├─ __init__.py ## 初始化文件 - ├─ box_util.py ## bbox工具 - ├─ coco_eval.py ## coco指标工具 - ├─ config.py ## 总配置 - ├─ dataset.py ## 创建并处理数据集 - ├─ init_params.py ## 参数工具 - ├─ lr_schedule.py ## 学习率生成器 - └─ ssd.py ## SSD架构 - ├─ eval.py ## 评估脚本 - ├─ export.py ## 导出 AIR,MINDIR模型的脚本 - ├─ postprogress.py ## 310推理后处理脚本 - ├─ train.py ## 训练脚本 - └─ mindspore_hub_conf.py ## MindSpore Hub接口 -``` - -## 脚本参数 - - ```text - train.py和config.py中主要参数如下: - - "device_num": 1 # 使用设备数量 - "lr": 0.05 # 学习率初始值 - "dataset": coco # 数据集名称 - "epoch_size": 500 # 轮次大小 - "batch_size": 32 # 输入张量的批次大小 - "pre_trained": None # 预训练检查点文件路径 - "pre_trained_epoch_size": 0 # 预训练轮次大小 - "save_checkpoint_epochs": 10 # 两个检查点之间的轮次间隔。默认情况下,每10个轮次都会保存检查点。 - "loss_scale": 1024 # 损失放大 - - "class_num": 81 # 数据集类数 - "image_shape": [300, 300] # 作为模型输入的图像高和宽 - "mindrecord_dir": "/data/MindRecord_COCO" # MindRecord路径 - "coco_root": "/data/coco2017" # COCO2017数据集路径 - "voc_root": "" # VOC原始数据集路径 - "image_dir": "" # 其他数据集图片路径,如果使用coco或voc,此参数无效。 - "anno_path": "" # 其他数据集标注路径,如果使用coco或voc,此参数无效。 - - ``` - -## 训练过程 - -运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/docs/programming_guide/zh-CN/master/convert_dataset.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** - -### Ascend上训练 - -- 分布式 - -```shell script - bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) -``` - -此脚本需要五或七个参数。 - -- `DEVICE_NUM`:分布式训练的设备数。 -- `EPOCH_NUM`:分布式训练的轮次数。 -- `LR`:分布式训练的学习率初始值。 -- `DATASET`:分布式训练的数据集模式。 -- `RANK_TABLE_FILE`:[rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools)的路径。最好使用绝对路径。 -- `PRE_TRAINED`:预训练检查点文件的路径。最好使用绝对路径。 -- `PRE_TRAINED_EPOCH_SIZE`:预训练的轮次数。 - - 训练结果保存在当前路径中,文件夹名称以"LOG"开头。 您可在此文件夹中找到检查点文件以及结果,如下所示。 - -```text -epoch: 1 step: 458, loss is 3.1681802 -epoch time: 228752.4654865265, per step time: 499.4595316299705 -epoch: 2 step: 458, loss is 2.8847265 -epoch time: 38912.93382644653, per step time: 84.96273761232868 -epoch: 3 step: 458, loss is 2.8398118 -epoch time: 38769.184827804565, per step time: 84.64887516987896 -... 
- -epoch: 498 step: 458, loss is 0.70908034 -epoch time: 38771.079778671265, per step time: 84.65301261718616 -epoch: 499 step: 458, loss is 0.7974688 -epoch time: 38787.413120269775, per step time: 84.68867493508685 -epoch: 500 step: 458, loss is 0.5548882 -epoch time: 39064.8467540741, per step time: 85.29442522723602 -``` - -## 评估过程 - -### Ascend处理器环境评估 - -```shell script -bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] -``` - -此脚本需要两个参数。 - -- `DATASET`:评估数据集的模式。 -- `CHECKPOINT_PATH`:检查点文件的绝对路径。 -- `DEVICE_ID`: 评估的设备ID。 - -> 在训练过程中可以生成检查点。 - -推理结果保存在示例路径中,文件夹名称以“eval”开头。您可以在日志中找到类似以下的结果。 - -```text -Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 -Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474 -Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 -Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120 -Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.350 -Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.315 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.511 -Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208 -Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.557 -Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.689 - -======================================== - -mAP: 0.32719216721918915 - -``` - -## 导出过程 - -### 导出 - -```shell -python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format[EXPORT_FORMAT] -``` - -`EXPORT_FORMAT`可选 ["AIR", "MINDIR"] - -## 推理过程 - -### 推理 - -在还行推理之前我们需要先导出模型。Air模型只能在昇腾910环境上导出,mindir可以在任意环境上导出。batch_size只支持1。 - -```shell -# Ascend310 inference -bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID] -``` - -推理结果被保存到了当前目录,可以在acc.log中获得类似下面的结果。 - -```shell -Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 -Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.475 -Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 -Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.115 -Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.353 -Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.314 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.485 -Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.509 -Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200 -Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.554 -Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.692 - -mAP: 0.3266651054070853 -``` - -### 性能 - -| 参数 | Ascend | -| ------------------- | --------------------- | -| 模型版本 | SSD resnet50 | -| 资源 | Ascend 910 | -| 上传日期 | 2021-03-29 | -| MindSpore版本 | 1.1.0 | -| 数据集 | COCO2017 | -| mAP | IoU=0.50: 32.7% | -| 模型大小 | 281M(.ckpt文件) | - -参数ckpt_file为必填项, -`EXPORT_FORMAT` 必须在 ["AIR", "MINDIR"]中选择。 - -# 随机情况说明 - -dataset.py中设置了“create_dataset”函数内的种子,同时还使用了train.py中的随机种子。 - -# ModelZoo主页 - - 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 +# 目录 + + + +- [目录](#目录) +- [SSD说明](#ssd说明) +- [模型架构](#模型架构) +- [数据集](#数据集) +- [环境要求](#环境要求) 
+- [快速入门](#快速入门) +- [脚本说明](#脚本说明) + - [脚本及样例代码](#脚本及样例代码) + - [脚本参数](#脚本参数) + - [训练过程](#训练过程) + - [Ascend上训练](#ascend上训练) + - [评估过程](#评估过程) + - [Ascend处理器环境评估](#ascend处理器环境评估) + - [性能](#性能) + - [导出过程](#导出过程) + - [导出](#导出) + - [推理过程](#推理过程) + - [推理](#推理) +- [随机情况说明](#随机情况说明) +- [ModelZoo主页](#modelzoo主页) + + + +# SSD说明 + +SSD将边界框的输出空间离散成一组默认框,每个特征映射位置具有不同的纵横比和尺度。在预测时,网络对每个默认框中存在的对象类别进行评分,并对框进行调整以更好地匹配对象形状。此外,网络将多个不同分辨率的特征映射的预测组合在一起,自然处理各种大小的对象。 + +[论文](https://arxiv.org/abs/1512.02325): Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg.European Conference on Computer Vision (ECCV), 2016 (In press). + +# 模型架构 + +SSD方法基于前向卷积网络,该网络产生固定大小的边界框集合,并针对这些框内存在的对象类实例进行评分,然后通过非极大值抑制步骤进行最终检测。早期的网络层基于高质量图像分类的标准体系结构,被称为基础网络。后来通过向网络添加辅助结构进行检测。 + +# 数据集 + +使用的数据集: [COCO2017]() + +- 数据集大小:19 GB + - 训练集:18 GB,118000张图像 + - 验证集:1 GB,5000张图像 + - 标注:241 MB,实例,字幕,person_keypoints等 +- 数据格式:图像和json文件 + - 注意:数据在dataset.py中处理 + +# 环境要求 + +- 安装[MindSpore](https://www.mindspore.cn/install)。 + +- 下载数据集COCO2017。 + +- 本示例默认使用COCO2017作为训练数据集,您也可以使用自己的数据集。 + + 1. 如果使用coco数据集。**执行脚本时选择数据集coco。** + 安装Cython和pycocotool,也可以安装mmcv进行数据处理。 + + ```python + pip install Cython + + pip install pycocotools + + ``` + + 并在`config.py`中更改COCO_ROOT和其他您需要的设置。目录结构如下: + + ```text + . + └─cocodataset + ├─annotations + ├─instance_train2017.json + └─instance_val2017.json + ├─val2017 + └─train2017 + + ``` + + 2. 如果使用自己的数据集。**执行脚本时选择数据集为other。** + 将数据集信息整理成TXT文件,每行如下: + + ```text + train2017/0000001.jpg 0,259,401,459,7 35,28,324,201,2 0,30,59,80,2 + + ``` + + 每行是按空间分割的图像标注,第一列是图像的相对路径,其余为[xmin,ymin,xmax,ymax,class]格式的框和类信息。我们从`IMAGE_DIR`(数据集目录)和`ANNO_PATH`(TXT文件路径)的相对路径连接起来的图像路径中读取图像。在`config.py`中设置`IMAGE_DIR`和`ANNO_PATH`。 + +# 快速入门 + +通过官方网站安装MindSpore后,您可以按照如下步骤进行训练和评估: + +```shell script +# Ascend分布式训练 +bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] +``` + +```shell script +# 单卡训练 +bash run_standalone_train.sh +``` + +```shell script +# Ascend处理器环境运行eval +bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] +``` + +# 脚本说明 + +## 脚本及样例代码 + +```text +. 
+└─ cv + └─ ssd + ├─ README.md ## SSD相关说明 + ├─ ascend310_infer ## 实现310推理源代码 + ├─ scripts + ├─ run_distribute_train.sh ## Ascend分布式shell脚本 + ├─ run_infer_310.sh ## Ascend推理shell脚本 + └─ run_eval.sh ## Ascend评估shell脚本 + ├─ src + ├─ __init__.py ## 初始化文件 + ├─ box_util.py ## bbox工具 + ├─ coco_eval.py ## coco指标工具 + ├─ config.py ## 总配置 + ├─ dataset.py ## 创建并处理数据集 + ├─ init_params.py ## 参数工具 + ├─ lr_schedule.py ## 学习率生成器 + └─ ssd.py ## SSD架构 + ├─ eval.py ## 评估脚本 + ├─ export.py ## 导出 AIR,MINDIR模型的脚本 + ├─ postprogress.py ## 310推理后处理脚本 + ├─ train.py ## 训练脚本 + └─ mindspore_hub_conf.py ## MindSpore Hub接口 +``` + +## 脚本参数 + + ```text + train.py和config.py中主要参数如下: + + "device_num": 1 # 使用设备数量 + "lr": 0.05 # 学习率初始值 + "dataset": coco # 数据集名称 + "epoch_size": 500 # 轮次大小 + "batch_size": 32 # 输入张量的批次大小 + "pre_trained": None # 预训练检查点文件路径 + "pre_trained_epoch_size": 0 # 预训练轮次大小 + "save_checkpoint_epochs": 10 # 两个检查点之间的轮次间隔。默认情况下,每10个轮次都会保存检查点。 + "loss_scale": 1024 # 损失放大 + + "class_num": 81 # 数据集类数 + "image_shape": [300, 300] # 作为模型输入的图像高和宽 + "mindrecord_dir": "/data/MindRecord_COCO" # MindRecord路径 + "coco_root": "/data/coco2017" # COCO2017数据集路径 + "voc_root": "" # VOC原始数据集路径 + "image_dir": "" # 其他数据集图片路径,如果使用coco或voc,此参数无效。 + "anno_path": "" # 其他数据集标注路径,如果使用coco或voc,此参数无效。 + + ``` + +## 训练过程 + +运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/docs/programming_guide/zh-CN/master/convert_dataset.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** + +### Ascend上训练 + +- 分布式 + +```shell script + bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) +``` + +此脚本需要五或七个参数。 + +- `DEVICE_NUM`:分布式训练的设备数。 +- `EPOCH_NUM`:分布式训练的轮次数。 +- `LR`:分布式训练的学习率初始值。 +- `DATASET`:分布式训练的数据集模式。 +- `RANK_TABLE_FILE`:[rank_table.json](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/utils/hccl_tools)的路径。最好使用绝对路径。 +- `PRE_TRAINED`:预训练检查点文件的路径。最好使用绝对路径。 +- `PRE_TRAINED_EPOCH_SIZE`:预训练的轮次数。 + + 训练结果保存在当前路径中,文件夹名称以"LOG"开头。 您可在此文件夹中找到检查点文件以及结果,如下所示。 + +```text +epoch: 1 step: 458, loss is 3.1681802 +epoch time: 228752.4654865265, per step time: 499.4595316299705 +epoch: 2 step: 458, loss is 2.8847265 +epoch time: 38912.93382644653, per step time: 84.96273761232868 +epoch: 3 step: 458, loss is 2.8398118 +epoch time: 38769.184827804565, per step time: 84.64887516987896 +... 
+ +epoch: 498 step: 458, loss is 0.70908034 +epoch time: 38771.079778671265, per step time: 84.65301261718616 +epoch: 499 step: 458, loss is 0.7974688 +epoch time: 38787.413120269775, per step time: 84.68867493508685 +epoch: 500 step: 458, loss is 0.5548882 +epoch time: 39064.8467540741, per step time: 85.29442522723602 +``` + +## 评估过程 + +### Ascend处理器环境评估 + +```shell script +bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] +``` + +此脚本需要两个参数。 + +- `DATASET`:评估数据集的模式。 +- `CHECKPOINT_PATH`:检查点文件的绝对路径。 +- `DEVICE_ID`: 评估的设备ID。 + +> 在训练过程中可以生成检查点。 + +推理结果保存在示例路径中,文件夹名称以“eval”开头。您可以在日志中找到类似以下的结果。 + +```text +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.474 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.120 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.350 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.459 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.315 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.489 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.511 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.208 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.557 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.689 + +======================================== + +mAP: 0.32719216721918915 + +``` + +## 导出过程 + +### 导出 + +```shell +python export.py --ckpt_file [CKPT_PATH] --device_target [DEVICE_TARGET] --file_format[EXPORT_FORMAT] +``` + +`EXPORT_FORMAT`可选 ["AIR", "MINDIR"] + +## 推理过程 + +### 推理 + +在还行推理之前我们需要先导出模型。Air模型只能在昇腾910环境上导出,mindir可以在任意环境上导出。batch_size只支持1。 + +```shell +# Ascend310 inference +bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID] +``` + +推理结果被保存到了当前目录,可以在acc.log中获得类似下面的结果。 + +```shell +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.327 +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.475 +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.358 +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.115 +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.353 +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.455 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.314 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.485 +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.509 +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.200 +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.554 +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.692 + +mAP: 0.3266651054070853 +``` + +### 性能 + +| 参数 | Ascend | +| ------------------- | --------------------- | +| 模型版本 | SSD resnet50 | +| 资源 | Ascend 910 | +| 上传日期 | 2021-03-29 | +| MindSpore版本 | 1.1.0 | +| 数据集 | COCO2017 | +| mAP | IoU=0.50: 32.7% | +| 模型大小 | 281M(.ckpt文件) | + +参数ckpt_file为必填项, +`EXPORT_FORMAT` 必须在 ["AIR", "MINDIR"]中选择。 + +# 随机情况说明 + +dataset.py中设置了“create_dataset”函数内的种子,同时还使用了train.py中的随机种子。 + +# ModelZoo主页 + + 请浏览官网[主页](https://gitee.com/mindspore/mindspore/tree/master/model_zoo)。 diff --git a/research/cv/ssd_resnet50/ascend310_infer/CMakeLists.txt 
b/research/cv/ssd_resnet50/ascend310_infer/CMakeLists.txt index ee3c85447340e0449ff2b70ed24f60a17e07b2b6..5da775f3746b97dae4573ee089b4db4414591bec 100644 --- a/research/cv/ssd_resnet50/ascend310_infer/CMakeLists.txt +++ b/research/cv/ssd_resnet50/ascend310_infer/CMakeLists.txt @@ -1,14 +1,14 @@ -cmake_minimum_required(VERSION 3.14.1) -project(Ascend310Infer) -add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0) -set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined") -set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/) -option(MINDSPORE_PATH "mindspore install path" "") -include_directories(${MINDSPORE_PATH}) -include_directories(${MINDSPORE_PATH}/include) -include_directories(${PROJECT_SRC_ROOT}) -find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib) -file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*) - -add_executable(main src/main.cc src/utils.cc) -target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags) +cmake_minimum_required(VERSION 3.14.1) +project(Ascend310Infer) +add_compile_definitions(_GLIBCXX_USE_CXX11_ABI=0) +set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O0 -g -std=c++17 -Werror -Wall -fPIE -Wl,--allow-shlib-undefined") +set(PROJECT_SRC_ROOT ${CMAKE_CURRENT_LIST_DIR}/) +option(MINDSPORE_PATH "mindspore install path" "") +include_directories(${MINDSPORE_PATH}) +include_directories(${MINDSPORE_PATH}/include) +include_directories(${PROJECT_SRC_ROOT}) +find_library(MS_LIB libmindspore.so ${MINDSPORE_PATH}/lib) +file(GLOB_RECURSE MD_LIB ${MINDSPORE_PATH}/_c_dataengine*) + +add_executable(main src/main.cc src/utils.cc) +target_link_libraries(main ${MS_LIB} ${MD_LIB} gflags) diff --git a/research/cv/ssd_resnet50/ascend310_infer/aipp.cfg b/research/cv/ssd_resnet50/ascend310_infer/aipp.cfg index 363d5d36fd1f24e3a6e880745d7150076f777bd0..83acabb343e82acc95877b0fbe5445b90cb47104 100644 --- a/research/cv/ssd_resnet50/ascend310_infer/aipp.cfg +++ b/research/cv/ssd_resnet50/ascend310_infer/aipp.cfg @@ -1,26 +1,26 @@ -aipp_op { - aipp_mode : static - input_format : YUV420SP_U8 - related_input_rank : 0 - csc_switch : true - rbuv_swap_switch : false - matrix_r0c0 : 256 - matrix_r0c1 : 0 - matrix_r0c2 : 359 - matrix_r1c0 : 256 - matrix_r1c1 : -88 - matrix_r1c2 : -183 - matrix_r2c0 : 256 - matrix_r2c1 : 454 - matrix_r2c2 : 0 - input_bias_0 : 0 - input_bias_1 : 128 - input_bias_2 : 128 - - mean_chn_0 : 124 - mean_chn_1 : 117 - mean_chn_2 : 104 - var_reci_chn_0 : 0.0171247538316637 - var_reci_chn_1 : 0.0175070028011204 - var_reci_chn_2 : 0.0174291938997821 +aipp_op { + aipp_mode : static + input_format : YUV420SP_U8 + related_input_rank : 0 + csc_switch : true + rbuv_swap_switch : false + matrix_r0c0 : 256 + matrix_r0c1 : 0 + matrix_r0c2 : 359 + matrix_r1c0 : 256 + matrix_r1c1 : -88 + matrix_r1c2 : -183 + matrix_r2c0 : 256 + matrix_r2c1 : 454 + matrix_r2c2 : 0 + input_bias_0 : 0 + input_bias_1 : 128 + input_bias_2 : 128 + + mean_chn_0 : 124 + mean_chn_1 : 117 + mean_chn_2 : 104 + var_reci_chn_0 : 0.0171247538316637 + var_reci_chn_1 : 0.0175070028011204 + var_reci_chn_2 : 0.0174291938997821 } \ No newline at end of file diff --git a/research/cv/ssd_resnet50/ascend310_infer/build.sh b/research/cv/ssd_resnet50/ascend310_infer/build.sh index 68255919420d6270a91677f9052af77434187068..5382250c235ce8904c29c303a07e4a724588f43d 100644 --- a/research/cv/ssd_resnet50/ascend310_infer/build.sh +++ b/research/cv/ssd_resnet50/ascend310_infer/build.sh @@ -1,28 +1,28 @@ -#!/bin/bash -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache 
License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ -if [ ! -d out ]; then - mkdir out -fi - -cd out - -if [ -f "Makefile" ]; then - make clean -fi - -cmake .. \ - -DMINDSPORE_PATH="`pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`" -make +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +if [ ! -d out ]; then + mkdir out +fi + +cd out + +if [ -f "Makefile" ]; then + make clean +fi + +cmake .. \ + -DMINDSPORE_PATH="`pip3.7 show mindspore-ascend | grep Location | awk '{print $2"/mindspore"}' | xargs realpath`" +make diff --git a/research/cv/ssd_resnet50/ascend310_infer/inc/utils.h b/research/cv/ssd_resnet50/ascend310_infer/inc/utils.h index efebe03a8c1179f5a1f9d5f7ee07e0352a9937c6..6d66b313019614e146c5dc8d15a906ac170e8648 100644 --- a/research/cv/ssd_resnet50/ascend310_infer/inc/utils.h +++ b/research/cv/ssd_resnet50/ascend310_infer/inc/utils.h @@ -1,32 +1,32 @@ -/** - * Copyright 2021 Huawei Technologies Co., Ltd - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -#ifndef MINDSPORE_INFERENCE_UTILS_H_ -#define MINDSPORE_INFERENCE_UTILS_H_ - -#include -#include -#include -#include -#include -#include "include/api/types.h" - -std::vector GetAllFiles(std::string_view dirName); -DIR *OpenDir(std::string_view dirName); -std::string RealPath(std::string_view path); -mindspore::MSTensor ReadFileToTensor(const std::string &file); -int WriteResult(const std::string& imageFile, const std::vector &outputs); -#endif +/** + * Copyright 2021 Huawei Technologies Co., Ltd + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#ifndef MINDSPORE_INFERENCE_UTILS_H_ +#define MINDSPORE_INFERENCE_UTILS_H_ + +#include +#include +#include +#include +#include +#include "include/api/types.h" + +std::vector GetAllFiles(std::string_view dirName); +DIR *OpenDir(std::string_view dirName); +std::string RealPath(std::string_view path); +mindspore::MSTensor ReadFileToTensor(const std::string &file); +int WriteResult(const std::string& imageFile, const std::vector &outputs); +#endif diff --git a/research/cv/ssd_resnet50/ascend310_infer/src/main.cc b/research/cv/ssd_resnet50/ascend310_infer/src/main.cc index 4b92f1a0719d29ed69e4d015a40025dd727d3416..f38cf71f57ebef706f0b13ad35776c2ede28ea2c 100644 --- a/research/cv/ssd_resnet50/ascend310_infer/src/main.cc +++ b/research/cv/ssd_resnet50/ascend310_infer/src/main.cc @@ -1,165 +1,165 @@ -/** - * Copyright 2021 Huawei Technologies Co., Ltd - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "include/api/model.h" -#include "include/api/context.h" -#include "include/api/types.h" -#include "include/api/serialization.h" -#include "include/dataset/vision_ascend.h" -#include "include/dataset/execute.h" -#include "include/dataset/vision.h" -#include "inc/utils.h" - -using mindspore::Context; -using mindspore::Serialization; -using mindspore::Model; -using mindspore::Status; -using mindspore::ModelType; -using mindspore::GraphCell; -using mindspore::kSuccess; -using mindspore::MSTensor; -using mindspore::dataset::Execute; -using mindspore::dataset::TensorTransform; -using mindspore::dataset::vision::DvppDecodeResizeJpeg; -using mindspore::dataset::vision::Resize; -using mindspore::dataset::vision::HWC2CHW; -using mindspore::dataset::vision::Normalize; -using mindspore::dataset::vision::Decode; - -DEFINE_string(mindir_path, "", "mindir path"); -DEFINE_string(dataset_path, ".", "dataset path"); -DEFINE_int32(device_id, 0, "device id"); -DEFINE_string(aipp_path, "./aipp.cfg", "aipp path"); -DEFINE_string(cpu_dvpp, "DVPP", "cpu or dvpp process"); -DEFINE_int32(image_height, 640, "image height"); -DEFINE_int32(image_width, 640, "image width"); - -int main(int argc, char **argv) { - gflags::ParseCommandLineFlags(&argc, &argv, true); - if (RealPath(FLAGS_mindir_path).empty()) { - std::cout << "Invalid mindir" << std::endl; - return 1; - } - - auto context = std::make_shared(); - auto ascend310 = std::make_shared(); - ascend310->SetDeviceID(FLAGS_device_id); - ascend310->SetBufferOptimizeMode("off_optimize"); - context->MutableDeviceInfo().push_back(ascend310); - mindspore::Graph graph; - Serialization::Load(FLAGS_mindir_path, ModelType::kMindIR, &graph); - if (FLAGS_cpu_dvpp == "DVPP") { - if (RealPath(FLAGS_aipp_path).empty()) { - std::cout << "Invalid aipp path" << std::endl; - return 1; - } else { - ascend310->SetInsertOpConfigPath(FLAGS_aipp_path); - } - } - - Model model; - Status ret = model.Build(GraphCell(graph), context); - if (ret != kSuccess) { - std::cout << "ERROR: Build failed." << std::endl; - return 1; - } - - auto all_files = GetAllFiles(FLAGS_dataset_path); - if (all_files.empty()) { - std::cout << "ERROR: no input data." 
<< std::endl; - return 1; - } - - std::map costTime_map; - size_t size = all_files.size(); - - for (size_t i = 0; i < size; ++i) { - struct timeval start = {0}; - struct timeval end = {0}; - double startTimeMs; - double endTimeMs; - std::vector inputs; - std::vector outputs; - std::cout << "Start predict input files:" << all_files[i] << std::endl; - if (FLAGS_cpu_dvpp == "DVPP") { - auto resizeShape = {static_cast (FLAGS_image_height), static_cast (FLAGS_image_width)}; - Execute resize_op(std::shared_ptr(new DvppDecodeResizeJpeg(resizeShape))); - auto imgDvpp = std::make_shared(); - resize_op(ReadFileToTensor(all_files[i]), imgDvpp.get()); - inputs.emplace_back(imgDvpp->Name(), imgDvpp->DataType(), imgDvpp->Shape(), - imgDvpp->Data().get(), imgDvpp->DataSize()); - } else { - std::shared_ptr decode(new Decode()); - std::shared_ptr hwc2chw(new HWC2CHW()); - std::shared_ptr normalize( - new Normalize({123.675, 116.28, 103.53}, {58.395, 57.120, 57.375})); - auto resizeShape = {FLAGS_image_height, FLAGS_image_width}; - std::shared_ptr resize(new Resize(resizeShape)); - Execute composeDecode({decode, resize, normalize, hwc2chw}); - auto img = MSTensor(); - auto image = ReadFileToTensor(all_files[i]); - composeDecode(image, &img); - std::vector model_inputs = model.GetInputs(); - if (model_inputs.empty()) { - std::cout << "Invalid model, inputs is empty." << std::endl; - return 1; - } - inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(), - img.Data().get(), img.DataSize()); - } - - gettimeofday(&start, nullptr); - ret = model.Predict(inputs, &outputs); - gettimeofday(&end, nullptr); - if (ret != kSuccess) { - std::cout << "Predict " << all_files[i] << " failed." << std::endl; - return 1; - } - startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000; - endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000; - costTime_map.insert(std::pair(startTimeMs, endTimeMs)); - WriteResult(all_files[i], outputs); - } - double average = 0.0; - int inferCount = 0; - - for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) { - double diff = 0.0; - diff = iter->second - iter->first; - average += diff; - inferCount++; - } - average = average / inferCount; - std::stringstream timeCost; - timeCost << "NN inference cost average time: "<< average << " ms of infer_count " << inferCount << std::endl; - std::cout << "NN inference cost average time: "<< average << "ms of infer_count " << inferCount << std::endl; - std::string fileName = "./time_Result" + std::string("/test_perform_static.txt"); - std::ofstream fileStream(fileName.c_str(), std::ios::trunc); - fileStream << timeCost.str(); - fileStream.close(); - costTime_map.clear(); - return 0; -} +/** + * Copyright 2021 Huawei Technologies Co., Ltd + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+#include <sys/time.h>
+#include <gflags/gflags.h>
+#include <dirent.h>
+#include <iostream>
+#include <string>
+#include <algorithm>
+#include <iosfwd>
+#include <vector>
+#include <fstream>
+#include <sstream>
+
+#include "include/api/model.h"
+#include "include/api/context.h"
+#include "include/api/types.h"
+#include "include/api/serialization.h"
+#include "include/dataset/vision_ascend.h"
+#include "include/dataset/execute.h"
+#include "include/dataset/vision.h"
+#include "inc/utils.h"
+
+using mindspore::Context;
+using mindspore::Serialization;
+using mindspore::Model;
+using mindspore::Status;
+using mindspore::ModelType;
+using mindspore::GraphCell;
+using mindspore::kSuccess;
+using mindspore::MSTensor;
+using mindspore::dataset::Execute;
+using mindspore::dataset::TensorTransform;
+using mindspore::dataset::vision::DvppDecodeResizeJpeg;
+using mindspore::dataset::vision::Resize;
+using mindspore::dataset::vision::HWC2CHW;
+using mindspore::dataset::vision::Normalize;
+using mindspore::dataset::vision::Decode;
+
+DEFINE_string(mindir_path, "", "mindir path");
+DEFINE_string(dataset_path, ".", "dataset path");
+DEFINE_int32(device_id, 0, "device id");
+DEFINE_string(aipp_path, "./aipp.cfg", "aipp path");
+DEFINE_string(cpu_dvpp, "DVPP", "cpu or dvpp process");
+DEFINE_int32(image_height, 640, "image height");
+DEFINE_int32(image_width, 640, "image width");
+
+int main(int argc, char **argv) {
+  gflags::ParseCommandLineFlags(&argc, &argv, true);
+  if (RealPath(FLAGS_mindir_path).empty()) {
+    std::cout << "Invalid mindir" << std::endl;
+    return 1;
+  }
+
+  auto context = std::make_shared<Context>();
+  auto ascend310 = std::make_shared<mindspore::Ascend310DeviceInfo>();
+  ascend310->SetDeviceID(FLAGS_device_id);
+  ascend310->SetBufferOptimizeMode("off_optimize");
+  context->MutableDeviceInfo().push_back(ascend310);
+  mindspore::Graph graph;
+  Serialization::Load(FLAGS_mindir_path, ModelType::kMindIR, &graph);
+  if (FLAGS_cpu_dvpp == "DVPP") {
+    if (RealPath(FLAGS_aipp_path).empty()) {
+      std::cout << "Invalid aipp path" << std::endl;
+      return 1;
+    } else {
+      ascend310->SetInsertOpConfigPath(FLAGS_aipp_path);
+    }
+  }
+
+  Model model;
+  Status ret = model.Build(GraphCell(graph), context);
+  if (ret != kSuccess) {
+    std::cout << "ERROR: Build failed." << std::endl;
+    return 1;
+  }
+
+  auto all_files = GetAllFiles(FLAGS_dataset_path);
+  if (all_files.empty()) {
+    std::cout << "ERROR: no input data." << std::endl;
+    return 1;
+  }
+
+  std::map<double, double> costTime_map;
+  size_t size = all_files.size();
+
+  for (size_t i = 0; i < size; ++i) {
+    struct timeval start = {0};
+    struct timeval end = {0};
+    double startTimeMs;
+    double endTimeMs;
+    std::vector<MSTensor> inputs;
+    std::vector<MSTensor> outputs;
+    std::cout << "Start predict input files:" << all_files[i] << std::endl;
+    if (FLAGS_cpu_dvpp == "DVPP") {
+      auto resizeShape = {static_cast<uint32_t>(FLAGS_image_height), static_cast<uint32_t>(FLAGS_image_width)};
+      Execute resize_op(std::shared_ptr<DvppDecodeResizeJpeg>(new DvppDecodeResizeJpeg(resizeShape)));
+      auto imgDvpp = std::make_shared<MSTensor>();
+      resize_op(ReadFileToTensor(all_files[i]), imgDvpp.get());
+      inputs.emplace_back(imgDvpp->Name(), imgDvpp->DataType(), imgDvpp->Shape(),
+                          imgDvpp->Data().get(), imgDvpp->DataSize());
+    } else {
+      std::shared_ptr<TensorTransform> decode(new Decode());
+      std::shared_ptr<TensorTransform> hwc2chw(new HWC2CHW());
+      std::shared_ptr<TensorTransform> normalize(
+          new Normalize({123.675, 116.28, 103.53}, {58.395, 57.120, 57.375}));
+      auto resizeShape = {FLAGS_image_height, FLAGS_image_width};
+      std::shared_ptr<TensorTransform> resize(new Resize(resizeShape));
+      Execute composeDecode({decode, resize, normalize, hwc2chw});
+      auto img = MSTensor();
+      auto image = ReadFileToTensor(all_files[i]);
+      composeDecode(image, &img);
+      std::vector<MSTensor> model_inputs = model.GetInputs();
+      if (model_inputs.empty()) {
+        std::cout << "Invalid model, inputs is empty." << std::endl;
+        return 1;
+      }
+      inputs.emplace_back(model_inputs[0].Name(), model_inputs[0].DataType(), model_inputs[0].Shape(),
+                          img.Data().get(), img.DataSize());
+    }
+
+    gettimeofday(&start, nullptr);
+    ret = model.Predict(inputs, &outputs);
+    gettimeofday(&end, nullptr);
+    if (ret != kSuccess) {
+      std::cout << "Predict " << all_files[i] << " failed." << std::endl;
+      return 1;
+    }
+    startTimeMs = (1.0 * start.tv_sec * 1000000 + start.tv_usec) / 1000;
+    endTimeMs = (1.0 * end.tv_sec * 1000000 + end.tv_usec) / 1000;
+    costTime_map.insert(std::pair<double, double>(startTimeMs, endTimeMs));
+    WriteResult(all_files[i], outputs);
+  }
+  double average = 0.0;
+  int inferCount = 0;
+
+  for (auto iter = costTime_map.begin(); iter != costTime_map.end(); iter++) {
+    double diff = 0.0;
+    diff = iter->second - iter->first;
+    average += diff;
+    inferCount++;
+  }
+  average = average / inferCount;
+  std::stringstream timeCost;
+  timeCost << "NN inference cost average time: " << average << " ms of infer_count " << inferCount << std::endl;
+  std::cout << "NN inference cost average time: " << average << "ms of infer_count " << inferCount << std::endl;
+  std::string fileName = "./time_Result" + std::string("/test_perform_static.txt");
+  std::ofstream fileStream(fileName.c_str(), std::ios::trunc);
+  fileStream << timeCost.str();
+  fileStream.close();
+  costTime_map.clear();
+  return 0;
+}
diff --git a/research/cv/ssd_resnet50/ascend310_infer/src/utils.cc b/research/cv/ssd_resnet50/ascend310_infer/src/utils.cc
index c947e4d5f451b90bd4728aa3a92c4cfab174f5e6..5987daee43935da835a01872487fdcb2049405c4 100644
--- a/research/cv/ssd_resnet50/ascend310_infer/src/utils.cc
+++ b/research/cv/ssd_resnet50/ascend310_infer/src/utils.cc
@@ -1,129 +1,129 @@
-/**
- * Copyright 2021 Huawei Technologies Co., Ltd
- *
- * Licensed under the Apache License, Version 2.0 (the "License");
- * you may not use this file except in compliance with the License.
- * You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-#include <fstream>
-#include <algorithm>
-#include <iostream>
-#include "inc/utils.h"
-
-using mindspore::MSTensor;
-using mindspore::DataType;
-
-std::vector<std::string> GetAllFiles(std::string_view dirName) {
-  struct dirent *filename;
-  DIR *dir = OpenDir(dirName);
-  if (dir == nullptr) {
-    return {};
-  }
-  std::vector<std::string> res;
-  while ((filename = readdir(dir)) != nullptr) {
-    std::string dName = std::string(filename->d_name);
-    if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
-      continue;
-    }
-    res.emplace_back(std::string(dirName) + "/" + filename->d_name);
-  }
-  std::sort(res.begin(), res.end());
-  for (auto &f : res) {
-    std::cout << "image file: " << f << std::endl;
-  }
-  return res;
-}
-
-int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
-  std::string homePath = "./result_Files";
-  for (size_t i = 0; i < outputs.size(); ++i) {
-    size_t outputSize;
-    std::shared_ptr<const void> netOutput;
-    netOutput = outputs[i].Data();
-    outputSize = outputs[i].DataSize();
-    int pos = imageFile.rfind('/');
-    std::string fileName(imageFile, pos + 1);
-    fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin");
-    std::string outFileName = homePath + "/" + fileName;
-    FILE * outputFile = fopen(outFileName.c_str(), "wb");
-    fwrite(netOutput.get(), outputSize, sizeof(char), outputFile);
-    fclose(outputFile);
-    outputFile = nullptr;
-  }
-  return 0;
-}
-
-mindspore::MSTensor ReadFileToTensor(const std::string &file) {
-  if (file.empty()) {
-    std::cout << "Pointer file is nullptr" << std::endl;
-    return mindspore::MSTensor();
-  }
-
-  std::ifstream ifs(file);
-  if (!ifs.good()) {
-    std::cout << "File: " << file << " is not exist" << std::endl;
-    return mindspore::MSTensor();
-  }
-
-  if (!ifs.is_open()) {
-    std::cout << "File: " << file << "open failed" << std::endl;
-    return mindspore::MSTensor();
-  }
-
-  ifs.seekg(0, std::ios::end);
-  size_t size = ifs.tellg();
-  mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
-
-  ifs.seekg(0, std::ios::beg);
-  ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
-  ifs.close();
-
-  return buffer;
-}
-
-
-DIR *OpenDir(std::string_view dirName) {
-  if (dirName.empty()) {
-    std::cout << " dirName is null ! " << std::endl;
-    return nullptr;
-  }
-  std::string realPath = RealPath(dirName);
-  struct stat s;
-  lstat(realPath.c_str(), &s);
-  if (!S_ISDIR(s.st_mode)) {
-    std::cout << "dirName is not a valid directory !" << std::endl;
-    return nullptr;
-  }
-  DIR *dir;
-  dir = opendir(realPath.c_str());
-  if (dir == nullptr) {
-    std::cout << "Can not open dir " << dirName << std::endl;
-    return nullptr;
-  }
-  std::cout << "Successfully opened the dir " << dirName << std::endl;
-  return dir;
-}
-
-std::string RealPath(std::string_view path) {
-  char realPathMem[PATH_MAX] = {0};
-  char *realPathRet = nullptr;
-  realPathRet = realpath(path.data(), realPathMem);
-
-  if (realPathRet == nullptr) {
-    std::cout << "File: " << path << " is not exist.";
-    return "";
-  }
-
-  std::string realPath(realPathMem);
-  std::cout << path << " realpath is: " << realPath << std::endl;
-  return realPath;
-}
+/**
+ * Copyright 2021 Huawei Technologies Co., Ltd
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <fstream>
+#include <algorithm>
+#include <iostream>
+#include "inc/utils.h"
+
+using mindspore::MSTensor;
+using mindspore::DataType;
+
+std::vector<std::string> GetAllFiles(std::string_view dirName) {
+  struct dirent *filename;
+  DIR *dir = OpenDir(dirName);
+  if (dir == nullptr) {
+    return {};
+  }
+  std::vector<std::string> res;
+  while ((filename = readdir(dir)) != nullptr) {
+    std::string dName = std::string(filename->d_name);
+    if (dName == "." || dName == ".." || filename->d_type != DT_REG) {
+      continue;
+    }
+    res.emplace_back(std::string(dirName) + "/" + filename->d_name);
+  }
+  std::sort(res.begin(), res.end());
+  for (auto &f : res) {
+    std::cout << "image file: " << f << std::endl;
+  }
+  return res;
+}
+
+int WriteResult(const std::string& imageFile, const std::vector<MSTensor> &outputs) {
+  std::string homePath = "./result_Files";
+  for (size_t i = 0; i < outputs.size(); ++i) {
+    size_t outputSize;
+    std::shared_ptr<const void> netOutput;
+    netOutput = outputs[i].Data();
+    outputSize = outputs[i].DataSize();
+    int pos = imageFile.rfind('/');
+    std::string fileName(imageFile, pos + 1);
+    fileName.replace(fileName.find('.'), fileName.size() - fileName.find('.'), '_' + std::to_string(i) + ".bin");
+    std::string outFileName = homePath + "/" + fileName;
+    FILE * outputFile = fopen(outFileName.c_str(), "wb");
+    fwrite(netOutput.get(), outputSize, sizeof(char), outputFile);
+    fclose(outputFile);
+    outputFile = nullptr;
+  }
+  return 0;
+}
+
+mindspore::MSTensor ReadFileToTensor(const std::string &file) {
+  if (file.empty()) {
+    std::cout << "Pointer file is nullptr" << std::endl;
+    return mindspore::MSTensor();
+  }
+
+  std::ifstream ifs(file);
+  if (!ifs.good()) {
+    std::cout << "File: " << file << " is not exist" << std::endl;
+    return mindspore::MSTensor();
+  }
+
+  if (!ifs.is_open()) {
+    std::cout << "File: " << file << "open failed" << std::endl;
+    return mindspore::MSTensor();
+  }
+
+  ifs.seekg(0, std::ios::end);
+  size_t size = ifs.tellg();
+  mindspore::MSTensor buffer(file, mindspore::DataType::kNumberTypeUInt8, {static_cast<int64_t>(size)}, nullptr, size);
+
+  ifs.seekg(0, std::ios::beg);
+  ifs.read(reinterpret_cast<char *>(buffer.MutableData()), size);
+  ifs.close();
+
+  return buffer;
+}
+
+
+DIR *OpenDir(std::string_view dirName) {
+  if (dirName.empty()) {
+    std::cout << " dirName is
null ! " << std::endl; + return nullptr; + } + std::string realPath = RealPath(dirName); + struct stat s; + lstat(realPath.c_str(), &s); + if (!S_ISDIR(s.st_mode)) { + std::cout << "dirName is not a valid directory !" << std::endl; + return nullptr; + } + DIR *dir; + dir = opendir(realPath.c_str()); + if (dir == nullptr) { + std::cout << "Can not open dir " << dirName << std::endl; + return nullptr; + } + std::cout << "Successfully opened the dir " << dirName << std::endl; + return dir; +} + +std::string RealPath(std::string_view path) { + char realPathMem[PATH_MAX] = {0}; + char *realPathRet = nullptr; + realPathRet = realpath(path.data(), realPathMem); + + if (realPathRet == nullptr) { + std::cout << "File: " << path << " is not exist."; + return ""; + } + + std::string realPath(realPathMem); + std::cout << path << " realpath is: " << realPath << std::endl; + return realPath; +} diff --git a/research/cv/ssd_resnet50/eval.py b/research/cv/ssd_resnet50/eval.py index 830398b2ab7ba3eebf5ea5cf32a36a587363c3a8..a3e9fede3ac16300a32b45eb5394c321066a4238 100644 --- a/research/cv/ssd_resnet50/eval.py +++ b/research/cv/ssd_resnet50/eval.py @@ -1,98 +1,98 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -"""Evaluation for SSD""" - -import os -import argparse -import time -import numpy as np -from mindspore import context, Tensor -from mindspore.train.serialization import load_checkpoint, load_param_into_net -from src.ssd import SsdInferWithDecoder, ssd_resnet50 -from src.dataset import create_ssd_dataset, create_mindrecord -from src.config import config -from src.eval_utils import metrics -from src.box_utils import default_boxes - -def ssd_eval(dataset_path, ckpt_path, anno_json): - """SSD evaluation.""" - batch_size = 1 - ds = create_ssd_dataset(dataset_path, batch_size=batch_size, repeat_num=1, - is_training=False, use_multiprocessing=False) - if config.model == "ssd_resnet50": - net = ssd_resnet50(config=config) - else: - raise ValueError(f'config.model: {config.model} is not supported') - net = SsdInferWithDecoder(net, Tensor(default_boxes), config) - - print("Load Checkpoint!") - param_dict = load_checkpoint(ckpt_path) - net.init_parameters_data() - load_param_into_net(net, param_dict) - - net.set_train(False) - i = batch_size - total = ds.get_dataset_size() * batch_size - start = time.time() - pred_data = [] - print("\n========================================\n") - print("total images num: ", total) - print("Processing, please wait a moment.") - for data in ds.create_dict_iterator(output_numpy=True, num_epochs=1): - img_id = data['img_id'] - img_np = data['image'] - image_shape = data['image_shape'] - - output = net(Tensor(img_np)) - for batch_idx in range(img_np.shape[0]): - pred_data.append({"boxes": output[0].asnumpy()[batch_idx], - "box_scores": output[1].asnumpy()[batch_idx], - "img_id": int(np.squeeze(img_id[batch_idx])), - "image_shape": image_shape[batch_idx]}) 
- percent = round(i / total * 100., 2) - - print(f' {str(percent)} [{i}/{total}]', end='\r') - i += batch_size - cost_time = int((time.time() - start) * 1000) - print(f' 100% [{total}/{total}] cost {cost_time} ms') - mAP = metrics(pred_data, anno_json) - print("\n========================================\n") - print(f"mAP: {mAP}") - -def get_eval_args(): - parser = argparse.ArgumentParser(description='SSD evaluation') - parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.") - parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.") - parser.add_argument("--checkpoint_path", type=str, required=True, help="Checkpoint file path.") - parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"), - help="run platform, support Ascend ,GPU and CPU.") - return parser.parse_args() - -if __name__ == '__main__': - args_opt = get_eval_args() - if args_opt.dataset == "coco": - json_path = os.path.join(config.coco_root, config.instances_set.format(config.val_data_type)) - elif args_opt.dataset == "voc": - json_path = os.path.join(config.voc_root, config.voc_json) - else: - raise ValueError('SSD eval only support dataset mode is coco and voc!') - - context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id) - - mindrecord_file = create_mindrecord(args_opt.dataset, "ssd_eval.mindrecord", False) - - print("Start Eval!") - ssd_eval(mindrecord_file, args_opt.checkpoint_path, json_path) +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +"""Evaluation for SSD""" + +import os +import argparse +import time +import numpy as np +from mindspore import context, Tensor +from mindspore.train.serialization import load_checkpoint, load_param_into_net +from src.ssd import SsdInferWithDecoder, ssd_resnet50 +from src.dataset import create_ssd_dataset, create_mindrecord +from src.config import config +from src.eval_utils import metrics +from src.box_utils import default_boxes + +def ssd_eval(dataset_path, ckpt_path, anno_json): + """SSD evaluation.""" + batch_size = 1 + ds = create_ssd_dataset(dataset_path, batch_size=batch_size, repeat_num=1, + is_training=False, use_multiprocessing=False) + if config.model == "ssd_resnet50": + net = ssd_resnet50(config=config) + else: + raise ValueError(f'config.model: {config.model} is not supported') + net = SsdInferWithDecoder(net, Tensor(default_boxes), config) + + print("Load Checkpoint!") + param_dict = load_checkpoint(ckpt_path) + net.init_parameters_data() + load_param_into_net(net, param_dict) + + net.set_train(False) + i = batch_size + total = ds.get_dataset_size() * batch_size + start = time.time() + pred_data = [] + print("\n========================================\n") + print("total images num: ", total) + print("Processing, please wait a moment.") + for data in ds.create_dict_iterator(output_numpy=True, num_epochs=1): + img_id = data['img_id'] + img_np = data['image'] + image_shape = data['image_shape'] + + output = net(Tensor(img_np)) + for batch_idx in range(img_np.shape[0]): + pred_data.append({"boxes": output[0].asnumpy()[batch_idx], + "box_scores": output[1].asnumpy()[batch_idx], + "img_id": int(np.squeeze(img_id[batch_idx])), + "image_shape": image_shape[batch_idx]}) + percent = round(i / total * 100., 2) + + print(f' {str(percent)} [{i}/{total}]', end='\r') + i += batch_size + cost_time = int((time.time() - start) * 1000) + print(f' 100% [{total}/{total}] cost {cost_time} ms') + mAP = metrics(pred_data, anno_json) + print("\n========================================\n") + print(f"mAP: {mAP}") + +def get_eval_args(): + parser = argparse.ArgumentParser(description='SSD evaluation') + parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.") + parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.") + parser.add_argument("--checkpoint_path", type=str, required=True, help="Checkpoint file path.") + parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"), + help="run platform, support Ascend ,GPU and CPU.") + return parser.parse_args() + +if __name__ == '__main__': + args_opt = get_eval_args() + if args_opt.dataset == "coco": + json_path = os.path.join(config.coco_root, config.instances_set.format(config.val_data_type)) + elif args_opt.dataset == "voc": + json_path = os.path.join(config.voc_root, config.voc_json) + else: + raise ValueError('SSD eval only support dataset mode is coco and voc!') + + context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id) + + mindrecord_file = create_mindrecord(args_opt.dataset, "ssd_eval.mindrecord", False) + + print("Start Eval!") + ssd_eval(mindrecord_file, args_opt.checkpoint_path, json_path) diff --git a/research/cv/ssd_resnet50/export.py b/research/cv/ssd_resnet50/export.py index 290ed3763a306a1d515742fa9e8e3855742299f0..21c90210cd90c5ec732d77fd621e4d6491c1ae0e 100644 --- 
a/research/cv/ssd_resnet50/export.py +++ b/research/cv/ssd_resnet50/export.py @@ -1,52 +1,52 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ -"""export""" -import argparse -import numpy as np -from mindspore import context, Tensor, dtype -from mindspore.train.serialization import load_checkpoint, load_param_into_net, export -from src.ssd import SsdInferWithDecoder, ssd_resnet50 -from src.config import config -from src.box_utils import default_boxes - -parser = argparse.ArgumentParser(description='SSD export') -parser.add_argument("--device_id", type=int, default=0, help="Device id") -parser.add_argument("--batch_size", type=int, default=1, help="batch size") -parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.") -parser.add_argument("--file_name", type=str, default="ssd", help="output file name.") -parser.add_argument('--file_format', type=str, choices=["AIR", "MINDIR"], default='AIR', help='file format') -parser.add_argument("--device_target", type=str, choices=["Ascend", "GPU", "CPU"], default="Ascend", - help="device target") -args = parser.parse_args() - -context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target) -if args.device_target == "Ascend": - context.set_context(device_id=args.device_id) - -if __name__ == '__main__': - if config.model == "ssd_resnet50": - net = ssd_resnet50(config=config) - else: - raise ValueError(f'config.model: {config.model} is not supported') - net = SsdInferWithDecoder(net, Tensor(default_boxes), config) - - param_dict = load_checkpoint(args.ckpt_file) - net.init_parameters_data() - load_param_into_net(net, param_dict) - net.set_train(False) - - input_shp = [args.batch_size, 3] + config.img_shape - input_array = Tensor(np.random.uniform(-1.0, 1.0, size=input_shp), dtype.float32) - export(net, input_array, file_name=args.file_name, file_format=args.file_format) +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""export""" +import argparse +import numpy as np +from mindspore import context, Tensor, dtype +from mindspore.train.serialization import load_checkpoint, load_param_into_net, export +from src.ssd import SsdInferWithDecoder, ssd_resnet50 +from src.config import config +from src.box_utils import default_boxes + +parser = argparse.ArgumentParser(description='SSD export') +parser.add_argument("--device_id", type=int, default=0, help="Device id") +parser.add_argument("--batch_size", type=int, default=1, help="batch size") +parser.add_argument("--ckpt_file", type=str, required=True, help="Checkpoint file path.") +parser.add_argument("--file_name", type=str, default="ssd", help="output file name.") +parser.add_argument('--file_format', type=str, choices=["AIR", "MINDIR"], default='AIR', help='file format') +parser.add_argument("--device_target", type=str, choices=["Ascend", "GPU", "CPU"], default="Ascend", + help="device target") +args = parser.parse_args() + +context.set_context(mode=context.GRAPH_MODE, device_target=args.device_target) +if args.device_target == "Ascend": + context.set_context(device_id=args.device_id) + +if __name__ == '__main__': + if config.model == "ssd_resnet50": + net = ssd_resnet50(config=config) + else: + raise ValueError(f'config.model: {config.model} is not supported') + net = SsdInferWithDecoder(net, Tensor(default_boxes), config) + + param_dict = load_checkpoint(args.ckpt_file) + net.init_parameters_data() + load_param_into_net(net, param_dict) + net.set_train(False) + + input_shp = [args.batch_size, 3] + config.img_shape + input_array = Tensor(np.random.uniform(-1.0, 1.0, size=input_shp), dtype.float32) + export(net, input_array, file_name=args.file_name, file_format=args.file_format) diff --git a/research/cv/ssd_resnet50/infer/__init__.py b/research/cv/ssd_resnet50/infer/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..5561fecbf78c85332d430b130e077a13dcf4280b --- /dev/null +++ b/research/cv/ssd_resnet50/infer/__init__.py @@ -0,0 +1,13 @@ +# Copyright (C) 2021.Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
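Before converting the exported file in the next step, it can be worth a quick smoke test that the exported graph actually loads and accepts the expected input. A minimal sketch, assuming a MindIR export named `ssd.mindir` (from `export.py --file_format MINDIR`) and the 640x640 input shape this config uses:

```python
# Hedged smoke test; "ssd.mindir" and the 640x640 shape are assumptions
# based on export.py's defaults and this model's config.img_shape.
import numpy as np
import mindspore as ms
from mindspore import Tensor, nn

graph = ms.load("ssd.mindir")
net = nn.GraphCell(graph)
dummy = Tensor(np.random.uniform(-1.0, 1.0, (1, 3, 640, 640)).astype(np.float32))
boxes, box_scores = net(dummy)  # SsdInferWithDecoder outputs boxes and per-class scores
print(boxes.shape, box_scores.shape)
```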
diff --git a/research/cv/ssd_resnet50/infer/convert/aipp.cfg b/research/cv/ssd_resnet50/infer/convert/aipp.cfg
new file mode 100644
index 0000000000000000000000000000000000000000..f40bc1b108ec4fd1f400ca916b91c524bfb162b1
--- /dev/null
+++ b/research/cv/ssd_resnet50/infer/convert/aipp.cfg
@@ -0,0 +1,12 @@
+aipp_op {
+    aipp_mode : static
+    input_format : RGB888_U8
+    csc_switch : false
+    rbuv_swap_switch : true
+    mean_chn_0 : 124
+    mean_chn_1 : 117
+    mean_chn_2 : 104
+    var_reci_chn_0 : 0.0171247538316637
+    var_reci_chn_1 : 0.0175070028011204
+    var_reci_chn_2 : 0.0174291938997821
+}
\ No newline at end of file
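The AIPP constants above are not arbitrary: `mean_chn_*` are the rounded means (123.675, 116.28, 103.53) and `var_reci_chn_*` are the reciprocals of the standard deviations (58.395, 57.120, 57.375) used by the `Normalize` op in `ascend310_infer/src/main.cc`, so DVPP preprocessing stays numerically aligned with the CPU path. A quick sketch to verify:

```python
# Check that each var_reci_chn_i in aipp.cfg equals 1 / std_i from main.cc.
stds = (58.395, 57.120, 57.375)
for i, s in enumerate(stds):
    print(f"var_reci_chn_{i} : {1.0 / s:.16f}")
# -> 0.0171247538316637, 0.0175070028011204, 0.0174291938997821
```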
diff --git a/research/cv/ssd_resnet50/infer/convert/convert_om.sh b/research/cv/ssd_resnet50/infer/convert/convert_om.sh
new file mode 100644
index 0000000000000000000000000000000000000000..c9bd774fc19f683a8a01a2935e294c8f7385596a
--- /dev/null
+++ b/research/cv/ssd_resnet50/infer/convert/convert_om.sh
@@ -0,0 +1,25 @@
+# Convert the AIR model to an OM model
+input_air_path=$1
+output_om_path=$2
+aipp_cfg=$3
+
+export ASCEND_ATC_PATH=/usr/local/Ascend/atc/bin/atc
+export LD_LIBRARY_PATH=/usr/local/Ascend/atc/lib64:$LD_LIBRARY_PATH
+export PATH=/usr/local/python3.7.5/bin:/usr/local/Ascend/atc/ccec_compiler/bin:/usr/local/Ascend/atc/bin:$PATH
+export PYTHONPATH=/usr/local/Ascend/atc/python/site-packages:/usr/local/Ascend/atc/python/site-packages/auto_tune.egg/auto_tune:/usr/local/Ascend/atc/python/site-packages/schedule_search.egg
+export ASCEND_OPP_PATH=/usr/local/Ascend/opp
+
+export ASCEND_SLOG_PRINT_TO_STDOUT=1
+
+echo "Input AIR file path: ${input_air_path}"
+echo "Output OM file path: ${output_om_path}"
+echo "AIPP cfg file path: ${aipp_cfg}"
+
+atc --input_format=NCHW --framework=1 \
+--model=${input_air_path} \
+--output=${output_om_path} \
+--soc_version=Ascend310 \
+--disable_reuse_memory=0 \
+--insert_op_conf=${aipp_cfg} \
+--precision_mode=allow_fp32_to_fp16 \
+--op_select_implmode=high_precision
\ No newline at end of file
diff --git a/research/cv/ssd_resnet50/infer/mxbase/C++/CMakeLists.txt b/research/cv/ssd_resnet50/infer/mxbase/C++/CMakeLists.txt
new file mode 100644
index 0000000000000000000000000000000000000000..f509e7d8bf1de35d7e0557cb230cf34a1010f9ae
--- /dev/null
+++ b/research/cv/ssd_resnet50/infer/mxbase/C++/CMakeLists.txt
@@ -0,0 +1,59 @@
+cmake_minimum_required(VERSION 3.10.0)
+project(ssd_ms)
+
+set(TARGET ssd_resnet50)
+
+SET(CMAKE_BUILD_TYPE "Debug")
+SET(CMAKE_CXX_FLAGS_DEBUG "$ENV{CXXFLAGS} -O0 -Wall -g -ggdb")
+SET(CMAKE_CXX_FLAGS_RELEASE "$ENV{CXXFLAGS} -O3 -Wall")
+
+add_definitions(-DENABLE_DVPP_INTERFACE)
+add_definitions(-D_GLIBCXX_USE_CXX11_ABI=0)
+add_definitions(-Dgoogle=mindxsdk_private)
+add_compile_options(-std=c++11 -fPIE -fstack-protector-all -fPIC -Wall)
+add_link_options(-Wl,-z,relro,-z,now,-z,noexecstack -pie)
+
+# Check environment variables
+if(NOT DEFINED ENV{ASCEND_HOME})
+    message(FATAL_ERROR "please define environment variable:ASCEND_HOME")
+endif()
+if(NOT DEFINED ENV{ASCEND_VERSION})
+    message(WARNING "please define environment variable:ASCEND_VERSION")
+endif()
+if(NOT DEFINED ENV{ARCH_PATTERN})
+    message(WARNING "please define environment variable:ARCH_PATTERN")
+endif()
+
+# Headers and shared libraries of acllib
+set(ACL_INC_DIR $ENV{ASCEND_HOME}/${ASCEND_VERSION}/${ARCH_PATTERN}/acllib/include)
+set(ACL_LIB_DIR $ENV{ASCEND_HOME}/${ASCEND_VERSION}/${ARCH_PATTERN}/acllib/lib64)
+
+# Headers and shared libraries of MxBase
+set(MXBASE_ROOT_DIR $ENV{MX_SDK_HOME})
+set(MXBASE_INC ${MXBASE_ROOT_DIR}/include)
+set(MXBASE_LIB_DIR ${MXBASE_ROOT_DIR}/lib)
+set(MXBASE_POST_LIB_DIR ${MXBASE_ROOT_DIR}/lib/modelpostprocessors)
+set(MXBASE_POST_PROCESS_DIR ${MXBASE_ROOT_DIR}/include/MxBase/postprocess/include)
+
+# Headers and shared libraries of the bundled open-source packages
+# (mainly OpenCV, Google log and other open-source libraries)
+set(OPENSOURCE_DIR ${MXBASE_ROOT_DIR}/opensource)
+
+include_directories(${ACL_INC_DIR})
+include_directories(${OPENSOURCE_DIR}/include)
+include_directories(${OPENSOURCE_DIR}/include/opencv4)
+
+include_directories(${MXBASE_INC})
+include_directories(${MXBASE_POST_PROCESS_DIR})
+link_directories(${ACL_LIB_DIR})
+link_directories(${OPENSOURCE_DIR}/lib)
+link_directories(${MXBASE_LIB_DIR})
+link_directories(${MXBASE_POST_LIB_DIR})
+
+# Local sources to compile and link; adjust to the files under the mxbase directory
+add_executable(${TARGET} ResNet50_main.cpp ResNet50Base.cpp)
+
+# Libraries to link; replace the post-process lib (here SsdMobilenetFpn_MindsporePost) as needed
+target_link_libraries(${TARGET} glog cpprest mxbase SsdMobilenetFpn_MindsporePost opencv_world stdc++fs)
+
+install(TARGETS ${TARGET} RUNTIME DESTINATION ${PROJECT_SOURCE_DIR}/)
\ No newline at end of file
diff --git a/research/cv/ssd_resnet50/infer/mxbase/C++/ResNet50Base.cpp b/research/cv/ssd_resnet50/infer/mxbase/C++/ResNet50Base.cpp
new file mode 100644
index 0000000000000000000000000000000000000000..64a88de4e95c6066c0f4a9712a3dd29b94880021
--- /dev/null
+++ b/research/cv/ssd_resnet50/infer/mxbase/C++/ResNet50Base.cpp
@@ -0,0 +1,223 @@
+/*
+ * Copyright (c) 2021. Huawei Technologies Co., Ltd. All rights reserved.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// Main processing logic
+#include "SSDResNet50.h"
+#include <map>
+#include <memory>
+#include <string>
+#include <vector>
+#include "MxBase/DeviceManager/DeviceManager.h"
+#include "MxBase/Log/Log.h"
+
+using namespace MxBase;
+
+APP_ERROR SSDResNet50::Init(const InitParam &initParam) {
+    deviceId_ = initParam.deviceId;
+    APP_ERROR ret = MxBase::DeviceManager::GetInstance()->InitDevices();
+    if (ret != APP_ERR_OK) {
+        LogError << "Init devices failed, ret=" << ret << ".";
+        return ret;
+    }
+    ret = MxBase::TensorContext::GetInstance()->SetContext(initParam.deviceId);
+    if (ret != APP_ERR_OK) {
+        LogError << "Set context failed, ret=" << ret << ".";
+        return ret;
+    }
+    dvppWrapper_ = std::make_shared<MxBase::DvppWrapper>();
+    ret = dvppWrapper_->Init();
+    if (ret != APP_ERR_OK) {
+        LogError << "DvppWrapper init failed, ret=" << ret << ".";
+        return ret;
+    }
+    model_ = std::make_shared<MxBase::ModelInferenceProcessor>();
+    ret = model_->Init(initParam.modelPath, modelDesc_);
+    if (ret != APP_ERR_OK) {
+        LogError << "ModelInferenceProcessor init failed, ret=" << ret << ".";
+        return ret;
+    }
+    MxBase::ConfigData configData;
+    const std::string checkTensor = initParam.checkTensor ? "true" : "false";
"true" : "false"; + + configData.SetJsonValue("CLASS_NUM", std::to_string(initParam.classNum)); + configData.SetJsonValue("SCORE_THRESH", std::to_string(initParam.score_thresh)); + configData.SetJsonValue("IOU_THRESH", std::to_string(initParam.iou_thresh)); + configData.SetJsonValue("CHECK_MODEL", checkTensor); + + auto jsonStr = configData.GetCfgJson().serialize(); + std::map> config; + config["postProcessConfigContent"] = std::make_shared(jsonStr); + config["labelPath"] = std::make_shared(initParam.labelPath); + + post_ = std::make_shared(); + ret = post_->Init(config); + if (ret != APP_ERR_OK) { + LogError << "SSDResNet50 init failed, ret=" << ret << "."; + return ret; + } + return APP_ERR_OK; +} + +APP_ERROR SSDResNet50::DeInit() { + dvppWrapper_->DeInit(); + model_->DeInit(); + post_->DeInit(); + MxBase::DeviceManager::GetInstance()->DestroyDevices(); + return APP_ERR_OK; +} + + +APP_ERROR SSDResNet50::ReadImage(const std::string &imgPath, cv::Mat &imageMat) +{ + imageMat = cv::imread(imgPath, cv::IMREAD_COLOR); + return APP_ERR_OK; +} + +APP_ERROR SSDResNet50::ResizeImage(const cv::Mat &srcImageMat, cv::Mat &dstImageMat) +{ + // static constexpr uint32_t resizeHeight = 300; + // static constexpr uint32_t resizeWidth = 300; + static constexpr uint32_t resizeHeight = 640; + static constexpr uint32_t resizeWidth = 640; + + cv::resize(srcImageMat, dstImageMat, cv::Size(resizeWidth, resizeHeight)); + return APP_ERR_OK; +} + +APP_ERROR SSDResNet50::CVMatToTensorBase(const cv::Mat &imageMat, MxBase::TensorBase &tensorBase) +{ + const uint32_t dataSize = imageMat.cols * imageMat.rows * YUV444_RGB_WIDTH_NU; + LogInfo << "image size after crop" << imageMat.cols << " " << imageMat.rows; + MemoryData memoryDataDst(dataSize, MemoryData::MEMORY_DEVICE, deviceId_); + MemoryData memoryDataSrc(imageMat.data, dataSize, MemoryData::MEMORY_HOST_MALLOC); + + APP_ERROR ret = MemoryHelper::MxbsMallocAndCopy(memoryDataDst, memoryDataSrc); + if (ret != APP_ERR_OK) { + LogError << GetError(ret) << "Memory malloc failed."; + return ret; + } + + std::vector shape = {imageMat.rows * YUV444_RGB_WIDTH_NU, static_cast(imageMat.cols)}; + tensorBase = TensorBase(memoryDataDst, false, shape, TENSOR_DTYPE_UINT8); + return APP_ERR_OK; +} + +APP_ERROR SSDResNet50::Inference(const std::vector &inputs, + std::vector &outputs) { + auto dtypes = model_->GetOutputDataType(); + for (size_t i = 0; i < modelDesc_.outputTensors.size(); ++i) { + std::vector shape = {}; + for (size_t j = 0; j < modelDesc_.outputTensors[i].tensorDims.size(); ++j) { + shape.push_back((uint32_t) modelDesc_.outputTensors[i].tensorDims[j]); + } + TensorBase tensor(shape, dtypes[i], MemoryData::MemoryType::MEMORY_DEVICE, deviceId_); + APP_ERROR ret = TensorBase::TensorBaseMalloc(tensor); + if (ret != APP_ERR_OK) { + LogError << "TensorBaseMalloc failed, ret=" << ret << "."; + return ret; + } + outputs.push_back(tensor); + } + DynamicInfo dynamicInfo = {}; + dynamicInfo.dynamicType = DynamicType::STATIC_BATCH; + dynamicInfo.batchSize = 1; + + APP_ERROR ret = model_->ModelInference(inputs, outputs, dynamicInfo); + if (ret != APP_ERR_OK) { + LogError << "ModelInference failed, ret=" << ret << "."; + return ret; + } + return APP_ERR_OK; +} + +APP_ERROR SSDResNet50::PostProcess(const std::vector &inputs, + std::vector> &objectInfos, + const std::vector &resizedImageInfos, + const std::map> &configParamMap) { + APP_ERROR ret = post_->Process(inputs, objectInfos, resizedImageInfos, configParamMap); + if (ret != APP_ERR_OK) { + LogError << "Process failed, ret=" 
<< ret << "."; + return ret; + } + return APP_ERR_OK; +} + +APP_ERROR SSDResNet50::Process(const std::string &imgPath) { + cv::Mat imageMat; + APP_ERROR ret = ReadImage(imgPath, imageMat); + + const uint32_t originHeight = imageMat.rows; + const uint32_t originWidth = imageMat.cols; + + LogInfo << "image shape, size=" << originWidth << "," << originHeight << "."; + if (ret != APP_ERR_OK) { + LogError << "ReadImage failed, ret=" << ret << "."; + return ret; + } + + ResizeImage(imageMat, imageMat); + + TensorBase tensorBase; + ret = CVMatToTensorBase(imageMat, tensorBase); + + if (ret != APP_ERR_OK) { + LogError << "Resize failed, ret=" << ret << "."; + return ret; + } + std::vector inputs = {}; + std::vector outputs = {}; + inputs.push_back(tensorBase); + ret = Inference(inputs, outputs); + if (ret != APP_ERR_OK) { + LogError << "Inference failed, ret=" << ret << "."; + return ret; + } + LogInfo << "Inference success, ret=" << ret << "."; + std::vector resizedImageInfos = {}; + + ResizedImageInfo imgInfo; + + imgInfo.widthOriginal = originWidth; + imgInfo.heightOriginal = originHeight; + imgInfo.widthResize = 640; + imgInfo.heightResize = 640; + imgInfo.resizeType = MxBase::RESIZER_STRETCHING; + + // ResizedImageInfo imgInfo = {300, 300, originWidth, originHeight, MxBase::RESIZER_STRETCHING, 0.0}; + resizedImageInfos.push_back(imgInfo); + std::vector> objectInfos = {}; + std::map> configParamMap = {}; + + std::vector> BatchClsInfos = {}; + ret = PostProcess(outputs, objectInfos, resizedImageInfos, configParamMap); + if (ret != APP_ERR_OK) { + LogError << "PostProcess failed, ret=" << ret << "."; + return ret; + } + if (objectInfos.empty()) { + LogInfo << "No object detected." << std::endl; + return APP_ERR_OK; + } + + std::vector objects = objectInfos.at(0); + for (size_t i = 0; i < objects.size(); i++) { + ObjectInfo obj = objects.at(i); + LogInfo << "BBox[" << i << "]:[x0=" << obj.x0 << ", y0=" << obj.y0 << ", x1=" << obj.x1 << ", y1=" << obj.y1 + << "], confidence=" << obj.confidence << ", classId=" << obj.classId << ", className=" << obj.className + << std::endl; + } + return APP_ERR_OK; +} diff --git a/research/cv/ssd_resnet50/infer/mxbase/C++/ResNet50_main.cpp b/research/cv/ssd_resnet50/infer/mxbase/C++/ResNet50_main.cpp new file mode 100644 index 0000000000000000000000000000000000000000..f49d2cd8cde41d4945952678b805894ec287b924 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/mxbase/C++/ResNet50_main.cpp @@ -0,0 +1,62 @@ +/** + * Copyright 2021 Huawei Technologies Co., Ltd + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+// C++ entry point
+#include <iostream>
+#include <map>
+#include <memory>
+#include <string>
+#include <vector>
+#include "SSDResNet50.h"
+#include "MxBase/Log/Log.h"
+
+namespace {
+    const uint32_t CLASS_NUM = 81;
+}
+
+int main(int argc, char *argv[]) {
+    if (argc <= 2) {
+        // LogWarn << "Please input image path, such as './ssd_resnet50 ssd_resnet50.om test.jpg'.";
+        return APP_ERR_OK;
+    }
+
+    InitParam initParam = {};
+    initParam.deviceId = 0;
+    initParam.classNum = CLASS_NUM;
+    initParam.labelPath = "../models/coco.names";
+
+    initParam.iou_thresh = 0.6;
+    initParam.score_thresh = 0.6;
+    initParam.checkTensor = true;
+
+    initParam.modelPath = argv[1];
+    auto ssdResnet50 = std::make_shared<SSDResNet50>();
+    APP_ERROR ret = ssdResnet50->Init(initParam);
+    if (ret != APP_ERR_OK) {
+        LogError << "SsdResnet50 init failed, ret=" << ret << ".";
+        return ret;
+    }
+
+    std::string imgPath = argv[2];
+    ret = ssdResnet50->Process(imgPath);
+    if (ret != APP_ERR_OK) {
+        LogError << "SsdResnet50 process failed, ret=" << ret << ".";
+        ssdResnet50->DeInit();
+        return ret;
+    }
+    ssdResnet50->DeInit();
+    return APP_ERR_OK;
+}
\ No newline at end of file
diff --git a/research/cv/ssd_resnet50/infer/mxbase/C++/SSDResNet50.h b/research/cv/ssd_resnet50/infer/mxbase/C++/SSDResNet50.h
new file mode 100644
index 0000000000000000000000000000000000000000..ff7c3f99ffe404b2e20100327eb3bc61dbd64fb8
--- /dev/null
+++ b/research/cv/ssd_resnet50/infer/mxbase/C++/SSDResNet50.h
@@ -0,0 +1,64 @@
+/*
+ * Copyright (c) 2021. Huawei Technologies Co., Ltd. All rights reserved.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+// Include guard to avoid duplicate-compilation errors
+#ifndef SSD_ResNet50
+#define SSD_ResNet50
+
+#include <map>
+#include <memory>
+#include <string>
+#include <vector>
+#include <opencv2/opencv.hpp>
+#include "MxBase/DvppWrapper/DvppWrapper.h"
+#include "MxBase/ModelInfer/ModelInferenceProcessor.h"
+#include "ObjectPostProcessors/SsdMobilenetFpn_MindsporePost.h"
+#include "MxBase/Tensor/TensorContext/TensorContext.h"
+
+// Parameters used to initialize the SSDResNet50 wrapper.
+struct InitParam {
+    uint32_t deviceId;
+    std::string labelPath;
+    uint32_t classNum;
+    float iou_thresh;
+    float score_thresh;
+    bool checkTensor;
+    std::string modelPath;
+};
+
+class SSDResNet50 {
+ public:
+    APP_ERROR Init(const InitParam &initParam);
+    APP_ERROR DeInit();
+    APP_ERROR ReadImage(const std::string &imgPath, cv::Mat &imageMat);
+    APP_ERROR ResizeImage(const cv::Mat &srcImageMat, cv::Mat &dstImageMat);
+    APP_ERROR CVMatToTensorBase(const cv::Mat &imageMat, MxBase::TensorBase &tensorBase);
+    APP_ERROR Inference(const std::vector<MxBase::TensorBase> &inputs, std::vector<MxBase::TensorBase> &outputs);
+    APP_ERROR PostProcess(const std::vector<MxBase::TensorBase> &inputs,
+                          std::vector<std::vector<MxBase::ObjectInfo>> &objectInfos,
+                          const std::vector<MxBase::ResizedImageInfo> &resizedImageInfos,
+                          const std::map<std::string, std::shared_ptr<void>> &configParamMap);
+    APP_ERROR Process(const std::string &imgPath);
+
+ private:
+    std::shared_ptr<MxBase::DvppWrapper> dvppWrapper_;
+    std::shared_ptr<MxBase::ModelInferenceProcessor> model_;
+    std::shared_ptr<MxBase::SsdMobilenetFpn_MindsporePost> post_;
+    MxBase::ModelDesc modelDesc_;
+    uint32_t deviceId_ = 0;
+};
+#endif
\ No newline at end of file
diff --git a/research/cv/ssd_resnet50/infer/mxbase/C++/run.sh b/research/cv/ssd_resnet50/infer/mxbase/C++/run.sh
new file mode 100644
index 0000000000000000000000000000000000000000..c11a1e85966b430fb7737e405ac58bf5f79f323e
--- /dev/null
+++ b/research/cv/ssd_resnet50/infer/mxbase/C++/run.sh
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+export ASCEND_HOME=/usr/local/Ascend
+export ARCH_PATTERN=x86_64-linux
+export ASCEND_VERSION=nnrt/latest
+# OM_FILE=/home/data/sjtu_liu/CLY/SSD_ResNet50/ssd_resnet50.om
+
+MXBASE_CODE_DIR=/home/sjtu_liu/CLY/SSD_ResNet50/mxbase/C++
+OM_FILE=/home/sjtu_liu/CLY/SSD_ResNet50/ssd_resnet50.om
+cd $MXBASE_CODE_DIR
+rm -rf dist
+mkdir dist
+cd dist
+cmake ..
+make -j
+make install
+
+cd ${MXBASE_CODE_DIR}
+cp ${MXBASE_CODE_DIR}/dist/ssd_resnet50 .
+ +# ./ssd_resnet50 ${OM_FILE} ./test.jpg diff --git a/research/cv/ssd_resnet50/infer/mxbase/models/coco.names b/research/cv/ssd_resnet50/infer/mxbase/models/coco.names new file mode 100644 index 0000000000000000000000000000000000000000..65b54571fb68d12ad38304fd4fe6928cfc321378 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/mxbase/models/coco.names @@ -0,0 +1,82 @@ +# This file is originally from https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_complete_label_map.pbtxt +background +person +bicycle +car +motorcycle +airplane +bus +train +truck +boat +traffic light +fire hydrant +stop sign +parking meter +bench +bird +cat +dog +horse +sheep +cow +elephant +bear +zebra +giraffe +backpack +umbrella +handbag +tie +suitcase +frisbee +skis +snowboard +sports ball +kite +baseball bat +baseball glove +skateboard +surfboard +tennis racket +bottle +wine glass +cup +fork +knife +spoon +bowl +banana +apple +sandwich +orange +broccoli +carrot +hot dog +pizza +donut +cake +chair +couch +potted plant +bed +dining table +toilet +tv +laptop +mouse +remote +keyboard +cell phone +microwave +oven +toaster +sink +refrigerator +book +clock +vase +scissors +teddy bear +hair drier +toothbrush \ No newline at end of file diff --git a/research/cv/ssd_resnet50/infer/sdk/.infer_by_sdk.py.swp b/research/cv/ssd_resnet50/infer/sdk/.infer_by_sdk.py.swp new file mode 100644 index 0000000000000000000000000000000000000000..63ba50890b4237e114384457978ccf155b0e276b Binary files /dev/null and b/research/cv/ssd_resnet50/infer/sdk/.infer_by_sdk.py.swp differ diff --git a/research/cv/ssd_resnet50/infer/sdk/conf/coco.names b/research/cv/ssd_resnet50/infer/sdk/conf/coco.names new file mode 100644 index 0000000000000000000000000000000000000000..65b54571fb68d12ad38304fd4fe6928cfc321378 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/conf/coco.names @@ -0,0 +1,82 @@ +# This file is originally from https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_complete_label_map.pbtxt +background +person +bicycle +car +motorcycle +airplane +bus +train +truck +boat +traffic light +fire hydrant +stop sign +parking meter +bench +bird +cat +dog +horse +sheep +cow +elephant +bear +zebra +giraffe +backpack +umbrella +handbag +tie +suitcase +frisbee +skis +snowboard +sports ball +kite +baseball bat +baseball glove +skateboard +surfboard +tennis racket +bottle +wine glass +cup +fork +knife +spoon +bowl +banana +apple +sandwich +orange +broccoli +carrot +hot dog +pizza +donut +cake +chair +couch +potted plant +bed +dining table +toilet +tv +laptop +mouse +remote +keyboard +cell phone +microwave +oven +toaster +sink +refrigerator +book +clock +vase +scissors +teddy bear +hair drier +toothbrush \ No newline at end of file diff --git a/research/cv/ssd_resnet50/infer/sdk/conf/sdk_infer_env.rc b/research/cv/ssd_resnet50/infer/sdk/conf/sdk_infer_env.rc new file mode 100644 index 0000000000000000000000000000000000000000..78f96b3e6bb642ce105d99b79ee1a62596b702a8 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/conf/sdk_infer_env.rc @@ -0,0 +1,6 @@ +export MX_SDK_HOME=/home/sam/mxManufacture +export ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/20.2.rc1/arm64-linux +export LD_LIBRARY_PATH=${MX_SDK_HOME}/lib:${MX_SDK_HOME}/opensource/lib:${LD_LIBRARY_PATH} +export PYTHONPATH=${MX_SDK_HOME}/python:${PYTHONPATH} +export GST_PLUGIN_PATH=${MX_SDK_HOME}/opensource/lib/gstreamer-1.0:${MX_SDK_HOME}/lib/plugins +export 
GST_PLUGIN_SCANNER=${MX_SDK_HOME}/opensource/libexec/gstreamer-1.0/gst-plugin-scanner \ No newline at end of file diff --git a/research/cv/ssd_resnet50/infer/sdk/conf/ssd-resnet50.pipeline b/research/cv/ssd_resnet50/infer/sdk/conf/ssd-resnet50.pipeline new file mode 100644 index 0000000000000000000000000000000000000000..e82f0dea05fcf4b4c226cd746e2dce5389064dbc --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/conf/ssd-resnet50.pipeline @@ -0,0 +1,61 @@ +{ + "detection": { + "stream_config": { + "deviceId": "0" + }, + "appsrc0": { + "props": { + "blocksize": "409600" + }, + "factory": "appsrc", + "next": "mxpi_imagedecoder0" + }, + "mxpi_imagedecoder0": { + "props": { + "handleMethod": "opencv" + }, + "factory": "mxpi_imagedecoder", + "next": "mxpi_imageresize0" + }, + "mxpi_imageresize0": { + "props": { + "parentName": "mxpi_imagedecoder0", + "handleMethod": "opencv", + "resizeHeight": "640", + "resizeWidth": "640", + "resizeType": "Resizer_Stretch" + }, + "factory": "mxpi_imageresize", + "next": "mxpi_tensorinfer0" + }, + "mxpi_tensorinfer0": { + "props": { + "waitingTime": "3000", + "dataSource": "mxpi_imageresize0", + "modelPath": "/home/data/sjtu_liu/CLY/SSD_ResNet50/ssd_resnet50.om" + }, + "factory": "mxpi_tensorinfer", + "next": "mxpi_objectpostprocessor0" + }, + "mxpi_objectpostprocessor0": { + "props": { + "dataSource": "mxpi_tensorinfer0", + "postProcessConfigPath": "./ssd_mobilenet_v1_fpn_ms_on_coco_postprocess.cfg", + "labelPath": "./coco.names", + "postProcessLibPath": "./libSsdMobilenetFpn_MindsporePost.so" + }, + "factory": "mxpi_objectpostprocessor", + "next": "mxpi_dataserialize0" + }, + "mxpi_dataserialize0": { + "props": { + "outputDataKeys": "mxpi_objectpostprocessor0" + }, + "factory": "mxpi_dataserialize", + "next": "appsink0" + }, + "appsink0": { + "factory": "appsink" + } + } +} diff --git a/research/cv/ssd_resnet50/infer/sdk/conf/ssd_mobilenet_v1_fpn_ms_on_coco_postprocess.cfg b/research/cv/ssd_resnet50/infer/sdk/conf/ssd_mobilenet_v1_fpn_ms_on_coco_postprocess.cfg new file mode 100644 index 0000000000000000000000000000000000000000..9d8bb9f25bfadace3c9207252169db51d461fa42 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/conf/ssd_mobilenet_v1_fpn_ms_on_coco_postprocess.cfg @@ -0,0 +1,4 @@ +CLASS_NUM=81 +SCORE_THRESH=0.1 +IOU_THRESH=0.6 +CHECK_MODEL=true diff --git a/research/cv/ssd_resnet50/infer/sdk/infer_by_sdk.py b/research/cv/ssd_resnet50/infer/sdk/infer_by_sdk.py new file mode 100644 index 0000000000000000000000000000000000000000..a2655b017d9bc87e07ee6205b3ff255b9bf73131 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/infer_by_sdk.py @@ -0,0 +1,218 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================
+
+import argparse
+import json
+import os
+from StreamManagerApi import MxDataInput
+from StreamManagerApi import StreamManagerApi
+
+# Supported image suffixes.
+SUPPORT_IMG_SUFFIX = (".jpg", ".JPG", ".jpeg", ".JPEG")
+
+# Absolute path of the directory containing this script.
+current_path = os.path.abspath(os.path.dirname(__file__))
+
+parser = argparse.ArgumentParser(
+    description="SSD ResNet50 infer " "example.",
+    fromfile_prefix_chars="@",
+)
+
+parser.add_argument(
+    "--pipeline_path",
+    type=str,
+    help="mxManufacture pipeline file path",
+    default=os.path.join(current_path, "../conf/ssd-resnet50.pipeline"),
+)
+parser.add_argument(
+    "--stream_name",
+    type=str,
+    help="Infer stream name in the pipeline config file",
+    default="detection",
+)
+parser.add_argument(
+    "--img_path",
+    type=str,
+    help="Image pathname, can be a image file or image directory",
+    default=os.path.join(current_path, "../coco/val2017"),
+)
+# Where the inference results are stored.
+parser.add_argument(
+    "--res_path",
+    type=str,
+    help="Directory to store the inferred result",
+    default=None,
+    required=False,
+)
+
+args = parser.parse_args()
+
+
+def infer():
+    """Infer images by DVPP + OM. """
+    pipeline_path = args.pipeline_path
+    # The stream name must be passed to the API as bytes.
+    stream_name = args.stream_name.encode()
+    img_path = os.path.abspath(args.img_path)
+    res_dir_name = args.res_path
+
+    # StreamManagerApi manages the stream: it loads the pipeline config,
+    # creates streams, sends data into a stream and fetches the results.
+    stream_manager_api = StreamManagerApi()
+    ret = stream_manager_api.InitManager()
+    if ret != 0:
+        print("Failed to init Stream manager, ret=%s" % str(ret))
+        exit()
+
+    # Create streams from the pipeline config file.
+    with open(pipeline_path, "rb") as f:
+        pipeline_str = f.read()
+
+    ret = stream_manager_api.CreateMultipleStreams(pipeline_str)
+    if ret != 0:
+        print("Failed to create Stream, ret=%s" % str(ret))
+        exit()
+
+    # Id of the target input plugin (appsrc0).
+    in_plugin_id = 0
+    # MxDataInput is the data structure a stream accepts as input.
+    data_input = MxDataInput()
+
+    # Accept either a single image file or a directory of images,
+    # keeping only files with a supported suffix.
+    if os.path.isfile(img_path) and img_path.endswith(SUPPORT_IMG_SUFFIX):
+        file_list = [os.path.abspath(img_path)]
+    else:
+        file_list = os.listdir(img_path)
+        file_list = [
+            os.path.join(img_path, img)
+            for img in file_list
+            if img.endswith(SUPPORT_IMG_SUFFIX)
+        ]
+
+    if not res_dir_name:
+        res_dir_name = os.path.join(".", "infer_res")
+    print(f"res_dir_name={res_dir_name}")
+    # exist_ok=True: no FileExistsError if the directory already exists.
+    os.makedirs(res_dir_name, exist_ok=True)
+    pic_infer_dict_list = []
+    for file_name in file_list:
+        # Read each image as raw bytes.
+        with open(file_name, "rb") as f:
+            img_data = f.read()
+        if not img_data:
+            print(f"read empty data from img:{file_name}")
+            continue
+        data_input.data = img_data
+        # Send the image to the input plugin of the stream.
+        unique_id = stream_manager_api.SendDataWithUniqueId(
+            stream_name, in_plugin_id, data_input
+        )
+        if unique_id < 0:
+            print("Failed to send data to stream.")
stream.") + exit() + # 获得Stream上的输出元件的结果(appsink), 延时3000ms + infer_result = stream_manager_api.GetResultWithUniqueId( + stream_name, unique_id, 3000 + ) + if infer_result.errorCode != 0: + print( + "GetResultWithUniqueId error. errorCode=%d, errorMsg=%s" + % (infer_result.errorCode, infer_result.data.decode()) + ) + exit() + # 将推理的结果parse_img_infer_result追加到pic_infer_dict_list数组中 + pic_infer_dict_list.extend( + parse_img_infer_result(file_name, infer_result) + ) + + print(f"Inferred image:{file_name} success!") + + with open(os.path.join(res_dir_name, "det_result.json"), "w") as fw: + # 将Python格式转为json格式并且写入 + fw.write(json.dumps(pic_infer_dict_list)) + + stream_manager_api.DestroyAllStreams() + +def trans_class_id(k): + if k >= 1 and k <= 11: + return k + elif k >= 12 and k <= 24: + return k + 1 + elif k >= 25 and k <= 26: + return k + 2 + elif k >= 27 and k <= 40: + return k + 4 + elif k >= 41 and k <= 60: + return k + 5 + elif k == 61: + return k + 6 + elif k == 62: + return k + 8 + elif k >= 63 and k <= 73: + return k + 9 + elif k >= 74 and k <= 80: + return k + 10 + +def parse_img_infer_result(file_name, infer_result): + # 将infer_result.data即元器件返回的结果转为dict格式,用get("MxpiObject", [])新建一个MxpiObject的Key且复制为[] + obj_list = json.loads(infer_result.data.decode()).get("MxpiObject", []) + det_obj_list = [] + for o in obj_list: + # 把图片框一个框,一个正方形,四个角的位置,坐标位置 + x0, y0, x1, y1 = ( + # round()函数,四舍五入到第四位 + round(o.get("x0"), 4), + round(o.get("y0"), 4), + round(o.get("x1"), 4), + round(o.get("y1"), 4), + ) + bbox_for_map = [int(x0), int(y0), int(x1 - x0), int(y1 - y0)] + score = o.get("classVec")[0].get("confidence") + category_id = o.get("classVec")[0].get("classId") + # basename()用于选取最后的文件名,即image的name,.split(".")用于把后缀给分割掉 + img_fname_without_suffix = os.path.basename(file_name).split(".")[0] + try: + image_id = int(img_fname_without_suffix) + except: + print("exception getting image id.") + image_id = img_fname_without_suffix + det_obj_list.append( + dict( + image_id=image_id, + bbox=bbox_for_map, + # 目录id,把图片归类 + category_id=trans_class_id(category_id), + # 置信度的问题,过滤掉比较小的,意思就是说假如我这边猜个猫的机率为0.1,那就是大概率不是猫,那这个数据就可以筛掉了 + # ssd_mobilenet_v1_fpn_ms_on_coco_postprocess.cfg文件里面的 SCORE_THRESH=0.6设置 + score=score, + ) + ) + return det_obj_list +if __name__ == "__main__": + infer() diff --git a/research/cv/ssd_resnet50/infer/sdk/perf/__init__.py b/research/cv/ssd_resnet50/infer/sdk/perf/__init__.py new file mode 100644 index 0000000000000000000000000000000000000000..45c53ab7c77e354418115dfd460b582e3288a536 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/perf/__init__.py @@ -0,0 +1,13 @@ +# Copyright (C) 2020.Huawei Technologies Co., Ltd. All rights reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
diff --git a/research/cv/ssd_resnet50/infer/sdk/perf/generate_map_report.py b/research/cv/ssd_resnet50/infer/sdk/perf/generate_map_report.py new file mode 100644 index 0000000000000000000000000000000000000000..f25a4f9f766f007dbce72d9ea12489b310af732b --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/perf/generate_map_report.py @@ -0,0 +1,115 @@ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +import os +from datetime import datetime + +from absl import flags +from absl import app +from pycocotools.coco import COCO +from pycocotools.cocoeval import COCOeval + +PRINT_LINES_TEMPLATE = """ +Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = %.3f +Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = %.3f +Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = %.3f +Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = %.3f +Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = %.3f +Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = %.3f +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = %.3f +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = %.3f +Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = %.3f +Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = %.3f +Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = %.3f +Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = %.3f +""" + +FLAGS = flags.FLAGS +flags.DEFINE_string( + name="annotations_json", + default=None, + help="annotations_json file path name", +) + +flags.DEFINE_string( + name="det_result_json", default=None, help="det_result json file" +) + +flags.DEFINE_enum( + name="anno_type", + default="bbox", + enum_values=["segm", "bbox", "keypoints"], + help="Annotation type", +)\ + +flags.DEFINE_string( + name="output_path_name", + default=None, + help="Where to out put the result files.", +) + +flags.mark_flag_as_required("annotations_json") +flags.mark_flag_as_required("det_result_json") +flags.mark_flag_as_required("output_path_name") + +def main(unused_arg): + del unused_arg + out_put_dir = os.path.dirname(FLAGS.output_path_name) + if not os.path.exists(out_put_dir): + os.makedirs(out_put_dir) + + fw = open(FLAGS.output_path_name, "a+") + now_time_str = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + head_info = f"{'-'*50}mAP Test starts @ {now_time_str}{'-'*50}\n" + fw.write(head_info) + fw.flush() + # 把脚本里面的蓝样本过滤 + cocoGt = COCO(FLAGS.annotations_json) + + image_ids = cocoGt.getImgIds() + need_img_ids = [] + for img_id in image_ids: + iscrowd = False + anno_ids = cocoGt.getAnnIds(imgIds=img_id, iscrowd=None) + anno = cocoGt.loadAnns(anno_ids) + for label in anno: + iscrowd = iscrowd or label["iscrowd"] + + if iscrowd: + continue + need_img_ids.append(img_id) + + cocoDt = cocoGt.loadRes(FLAGS.det_result_json) + 
cocoEval = COCOeval(cocoGt, cocoDt, iouType=FLAGS.anno_type) + cocoEval.params.imgIds = sorted(need_img_ids) + print(cocoEval.params.imgIds) + cocoEval.evaluate() + cocoEval.accumulate() + cocoEval.summarize() + + format_lines = [ + line for line in PRINT_LINES_TEMPLATE.splitlines() if line.strip() + ] + for i, line in enumerate(format_lines): + fw.write(line % cocoEval.stats[i] + "\n") + + end_time_str = datetime.now().strftime("%Y-%m-%d %H:%M:%S") + tail_info = f"{'-'*50}mAP Test ends @ {end_time_str}{'-'*50}\n" + fw.write(tail_info) + fw.close() + + +if __name__ == "__main__": + app.run(main) diff --git a/research/cv/ssd_resnet50/infer/sdk/perf/run_map_test.sh b/research/cv/ssd_resnet50/infer/sdk/perf/run_map_test.sh new file mode 100644 index 0000000000000000000000000000000000000000..3c0f37117296159ecd7a6d47e7a68497b0e0b344 --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/perf/run_map_test.sh @@ -0,0 +1,15 @@ +#!/bin/bash + +PY=/usr/bin/python3.7 + +export PYTHONPATH=${PYTHONPATH}:. + +annotations_json=$1 +det_result_json=$2 +output_path_name=$3 + +${PY} generate_map_report.py \ +--annotations_json=${annotations_json} \ +--det_result_json=${det_result_json} \ +--output_path_name=${output_path_name} \ +--anno_type=bbox \ No newline at end of file diff --git a/research/cv/ssd_resnet50/infer/sdk/run.sh b/research/cv/ssd_resnet50/infer/sdk/run.sh new file mode 100644 index 0000000000000000000000000000000000000000..95945001a99ff97e486d7b4fa9e3232f0e4f592a --- /dev/null +++ b/research/cv/ssd_resnet50/infer/sdk/run.sh @@ -0,0 +1,39 @@ +#!/bin/bash + +# Copyright 2020 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +set -e + +CUR_PATH=$(cd "$(dirname "$0")" || { warn "Failed to check path/to/run.sh" ; exit ; } ; pwd) + +# Simple log helper functions +info() { echo -e "\033[1;34m[INFO ][MxStream] $1\033[1;37m" ; } +warn() { echo >&2 -e "\033[1;31m[WARN ][MxStream] $1\033[1;37m" ; } + +export MX_SDK_HOME="/home/data/sjtu_liu/mxVision" +export LD_LIBRARY_PATH=${MX_SDK_HOME}/lib:${MX_SDK_HOME}/opensource/lib:${MX_SDK_HOME}/opensource/lib64:/usr/local/Ascend/ascend-toolkit/latest/acllib/lib64:${LD_LIBRARY_PATH} +export GST_PLUGIN_SCANNER=${MX_SDK_HOME}/opensource/libexec/gstreamer-1.0/gst-plugin-scanner +export GST_PLUGIN_PATH=${MX_SDK_HOME}/opensource/lib/gstreamer-1.0:${MX_SDK_HOME}/lib/plugins + +#to set PYTHONPATH, import the StreamManagerApi.py +export PYTHONPATH=$PYTHONPATH:${MX_SDK_HOME}/python + +pipeline_path=$1 +stream_name=$2 +img_path=$3 +res_path=$4 + +python3.7 infer_by_sdk.py ${pipeline_path} ${stream_name} ${img_path} ${res_path} +exit 0 diff --git a/research/cv/ssd_resnet50/mindspore_hub_conf.py b/research/cv/ssd_resnet50/mindspore_hub_conf.py index 0f24698b48f8c2acfd45f4a3f7ccdd97a2b41967..dbc0f7490eeafa816648771b428c8da4e4f0fafc 100644 --- a/research/cv/ssd_resnet50/mindspore_hub_conf.py +++ b/research/cv/ssd_resnet50/mindspore_hub_conf.py @@ -1,24 +1,24 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ -"""hub config.""" -from src.ssd import SSD300, ssd_mobilenet_v2 -from src.config import config - -def create_network(name, *args, **kwargs): - if name == "ssd300": - backbone = ssd_mobilenet_v2() - ssd = SSD300(backbone=backbone, config=config, *args, **kwargs) - return ssd - raise NotImplementedError(f"{name} is not implemented in the repo") +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""hub config.""" +from src.ssd import SSD300, ssd_mobilenet_v2 +from src.config import config + +def create_network(name, *args, **kwargs): + if name == "ssd300": + backbone = ssd_mobilenet_v2() + ssd = SSD300(backbone=backbone, config=config, *args, **kwargs) + return ssd + raise NotImplementedError(f"{name} is not implemented in the repo") diff --git a/research/cv/ssd_resnet50/modelarts/readme.md b/research/cv/ssd_resnet50/modelarts/readme.md new file mode 100644 index 0000000000000000000000000000000000000000..7d751096558065d3506c8617d9cb6426d11f2cba --- /dev/null +++ b/research/cv/ssd_resnet50/modelarts/readme.md @@ -0,0 +1,95 @@ + + +# SSD_Resnet50-Ascend (目标检测/MindSpore) + + + +## 1.概述 + +SSD GhostNet 将边界框的输出空间离散为一组默认框,每个特征地图位置的纵横比和比例不同。在预测时,网络为每个默认框中每个对象类别 生成分数,并对该框进行调整,以更好地匹配对象形状。此外,该网络结合了来自不同分辨率的多个特征图的预测,从而自然地处理不同尺寸的物体。 + +## 2.训练 +### 2.1.算法基本信息 +- 任务类型: 目标检测 +- 支持的框架引擎: Ascend-Powered-Engine- Mindspore-1.1.1-python3.7-aarch64 +- 算法输入: + - obs数据集路径,下面存放使用coco2017数据集。数据集的格式见训练手册说明。 +- 算法输出: + - 训练生成的ckpt模型 + +### 2.2.训练参数说明 +名称|默认值|类型|是否必填|描述 +---|---|---|---|--- +lr|0.05|float|True|初始学习率 +dataset|coco|string|True|数据集格式,可选值coco、voc、other +epoch_size|500|int|True|训练轮数 +batch_size|32|int|True|一次训练所抓取的数据样本数量 +save_checkpoint_epochs|10|int|False|保存checkpoint的轮数。 +num_classes|81|string|True|数据集类别数+1。 +voc_json|-|string|False|dataset为voc时,用于指定数据集标注文件,填相对于data_url的路径。 +anno_path|-|string|False|dataset为other时,用于指定数据集标注文件,填相对于data_url的路径。 +pre_trained|-|string|False|迁移学习时,预训练模型路径,模型放在data_url下,填相对于data_url的路径。 +loss_scale|1024|int|False|Loss scale. +filter_weight|False|Boolean|False|Filter head weight paramaters,迁移学习时需要设置为True。 + + + +### 2.3. 训练输出文件 + +训练完成后的输出文件如下 +``` +训练输出目录 V000X +├── ssd-10_12.ckpt +├── ssd-10_12.ckpt.air   +├── ssd-graph.meta +├── kernel_meta +│   ├── ApplyMomentum_13796921261177776697_0.info +│   ├── AddN_4688903218960634315_0.json +│   ├── ... + +``` + +## 3.迁移学习指导 +### 3.1. 数据集准备: + +参考训练手册:`迁移学习指导`->`数据集准备` + +### 3.2. 上传预训练模型ckpt文件到obs数据目录pretrain_model中,示例如下: + +``` +MicrocontrollerDetection # obs数据目录 + |- train # 训练图片数据集目录 + |- IMG_20181228_102033.jpg + |- IMG_20181228_102041.jpg + |- ..。 + |- train_labels.txt # 训练图片数据标注 + |- pretrain_model + |- ssd-3-61.ckpt # 预训练模型 ckpt文件 +``` + +classes_label_path参数对应的train_labels.txt的内容如下所示: + +``` +background +Arduino Nano +ESP8266 +Raspberry Pi 3 +Heltec ESP32 Lora +``` + + + +### 3.3. 修改调优参数 + +目前迁移学习支持修改数据集类别,订阅算法创建训练任务,创建训练作业时需要修改如下调优参数: + +* dataset改为other。 +* num_classes改为迁移学习数据集的类别数+1。 +* anno_path指定迁移学习数据集的标注文件路径。 +* filter_weight改为True。 +* pre_trained指定预训练模型路径。 + +以上参数的说明见`训练参数说明`。 + +### 3.4. 创建训练作业 +指定数据存储位置、模型输出位置和作业日志路径,创建训练作业进行迁移学习。 \ No newline at end of file diff --git a/research/cv/ssd_resnet50/modelarts/start.py b/research/cv/ssd_resnet50/modelarts/start.py new file mode 100644 index 0000000000000000000000000000000000000000..a9b497c9f20f7815895b642fa36fe1d6993b1ad6 --- /dev/null +++ b/research/cv/ssd_resnet50/modelarts/start.py @@ -0,0 +1,369 @@ +# coding=utf-8 +""" +Copyright 2021 Huawei Technologies Co., Ltd + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. 
+You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" +import os +import argparse +import logging +import ast +import glob +import numpy as np +import mindspore +import mindspore.nn as nn +from mindspore import context, Tensor +from mindspore.communication.management import init, get_rank +from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, LossMonitor, TimeMonitor +from mindspore.train import Model +from mindspore.context import ParallelMode +from mindspore.train.serialization import export as export_model +from mindspore.train.serialization import load_checkpoint, load_param_into_net +import sys +sys.path.append(os.path.join(os.path.abspath(os.path.dirname(__file__)), '../')) + +from mindspore.common import set_seed, dtype +from src.ssd import SSDWithLossCell, TrainingWrapper, ssd_resnet50 +from src.config import config +from src.dataset import create_ssd_dataset, create_mindrecord,data_to_mindrecord_byte_image,voc_data_to_mindrecord +from src.lr_schedule import get_lr +from src.init_params import init_net_param, filter_checkpoint_parameter_by_list +import moxing as mox + +CACHE_TRAIN_DATA_URL = "/cache/train_data_url" +CACHE_TRAIN_OUT_URL = "/cache/train_out_url" + + +def get_args(): + """ + Parse arguments + """ + parser = argparse.ArgumentParser(description="SSD training") + parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"), + help="run platform, support Ascend, GPU and CPU.") + + # 模型输出目录 + parser.add_argument("--train_url", + type=str, default='', help='the path model saved') + # 数据集目录 + parser.add_argument("--data_url", + type=str, default='', help='the training data') + + parser.add_argument("--only_create_dataset", type=ast.literal_eval, default=False, + help="If set it true, only create Mindrecord, default is False.") + parser.add_argument("--distribute", type=ast.literal_eval, default=False, + help="Run distribute, default is False.") + parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.") + parser.add_argument("--device_num", type=int, default=1, help="Use device nums, default is 1.") + parser.add_argument("--lr", type=float, default=0.05, help="Learning rate, default is 0.05.") + parser.add_argument("--mode", type=str, default="sink", help="Run sink mode or not, default is sink.") + parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.") + parser.add_argument("--epoch_size", type=int, default=10, help="Epoch size, default is 500.") + parser.add_argument("--batch_size", type=int, default=32, help="Batch size, default is 32.") + parser.add_argument("--pre_trained", type=str, default=None, help="Pretrained Checkpoint file path.") + parser.add_argument("--pre_trained_epoch_size", type=int, default=0, help="Pretrained epoch size.") + parser.add_argument("--save_checkpoint_epochs", type=int, default=10, help="Save checkpoint epochs, default is 10.") + parser.add_argument("--loss_scale", type=int, default=1024, help="Loss scale, default is 1024.") + parser.add_argument("--filter_weight", type=ast.literal_eval, default=False, + help="Filter head weight parameters, default is False.") + 
parser.add_argument('--freeze_layer', type=str, default="none", choices=["none", "backbone"], + help="freeze the weights of network, support freeze the backbone's weights, " + "default is not freezing.") + + # 适配config.py中的参数 + parser.add_argument("--feature_extractor_base_param", type=str, default="") + parser.add_argument("--coco_root", type=str, default="") + # parser.add_argument("--classes_label_path", type=str, default="labels.txt") + parser.add_argument("--num_classes", type=int, default=81) + parser.add_argument("--voc_root", type=str, default="") + parser.add_argument("--voc_json", type=str, default="") + parser.add_argument("--image_dir", type=str, default="") + parser.add_argument("--anno_path", type=str, default="coco_labels.txt") + + + args_opt = parser.parse_args() + return args_opt + + +def update_config(args_opt): + """ + 补全在config中的数据集路径 + Args: + args_opt: + config: + + Returns: + + """ + """ + 补全在config中的数据集路径 + Args: + args_opt: + config: + + Returns: + + """ + if config.num_ssd_boxes == -1: + num = 0 + h, w = config.img_shape + for i in range(len(config.steps)): + num += (h // config.steps[i]) * (w // config.steps[i]) * \ + config.num_default[i] + config.num_ssd_boxes = num + + data_dir = CACHE_TRAIN_DATA_URL + + # mindrecord格式数据集路径 更新为选择的数据集路径 + config.mindrecord_dir = data_dir + + # 补全数据集路径 + dataset = args_opt.dataset + if dataset == 'coco': + coco_root = args_opt.coco_root + config.coco_root = os.path.join(data_dir, coco_root) + print(f"update config.coco_root {coco_root} to {config.coco_root}") + elif dataset == 'voc': + voc_root = args_opt.voc_root + config.voc_root = os.path.join(data_dir, voc_root) + print(f"update config.voc_root {voc_root} to {config.voc_root}") + else: + image_dir = args_opt.image_dir + anno_path = args_opt.anno_path + config.image_dir = os.path.join(data_dir, image_dir) + config.anno_path = os.path.join(data_dir, anno_path) + print(f"update config.image_dir {image_dir} to {config.image_dir}") + print(f"update config.anno_path {anno_path} to {config.anno_path}") + + # with open(os.path.join(data_dir, args_opt.classes_label_path), 'r') as f: + # config.classes = [line.strip() for line in f.readlines()] + config.num_classes = args_opt.num_classes + + # 补全预训练模型路径 + # feature_extractor_base_param = args_opt.feature_extractor_base_param + # if args_opt.pre_trained: + # # 迁移学习不需要该参数 + # config.feature_extractor_base_param = "" + # print('update config.feature_extractor_base_param to "" on pretrain.') + # elif feature_extractor_base_param: + # # 需要将预训练模型放在数据目录中 + # config.feature_extractor_base_param = os.path.join( + # data_dir, feature_extractor_base_param) + # print(f"update config.feature_extractor_base_param " + # f"{feature_extractor_base_param} to " + # f"{config.feature_extractor_base_param}") + print(f"config: {config}") + + +def get_last_ckpt(): + ckpt_pattern = os.path.join(CACHE_TRAIN_OUT_URL, "*.ckpt") + ckpts = glob.glob(ckpt_pattern) + if not ckpts: + print(f"Cant't found ckpt in {CACHE_TRAIN_OUT_URL}") + return + ckpts.sort(key=os.path.getmtime) + return ckpts[-1] + + +def export(net, device_id, ckpt_file, file_format="AIR", batch_size=1): + print(f"start export {ckpt_file} to {file_format}, device_id {device_id}") + context.set_context(mode=context.GRAPH_MODE, device_target="Ascend") + context.set_context(device_id=device_id) + + param_dict = load_checkpoint(ckpt_file) + net.init_parameters_data() + load_param_into_net(net, param_dict) + net.set_train(False) + + input_shp = [batch_size, 3] + config.img_shape + input_array = 
Tensor(np.random.uniform(-1.0, 1.0, size=input_shp), + mindspore.float32) + export_model(net, input_array, file_name=ckpt_file, file_format=file_format) + print(f"export {ckpt_file} to {file_format} success.") + + +def export_air(net, args): + ckpt = get_last_ckpt() + if not ckpt: + return + export(net, args.device_id, ckpt, "AIR") + + + +def ssd_model_build(args_opt): + """ + build ssd model + """ + if config.model == "ssd_resnet50": + ssd = ssd_resnet50(config=config) + init_net_param(ssd) + if args_opt.feature_extractor_base_param != "": + print("args_opt.feature_extractor_base_param的值是****=",args_opt.feature_extractor_base_param) + print("args_opt.pre_trained****=", args_opt.pre_trained) + print("load_checkpoint(args_opt.feature_extractor_base_param)****=",load_checkpoint(args_opt.feature_extractor_base_param)) + param_dict = load_checkpoint(args_opt.feature_extractor_base_param) + for x in list(param_dict.keys()): + param_dict["network.feature_extractor.resnet." + x] = param_dict[x] + del param_dict[x] + load_param_into_net(ssd.feature_extractor.resnet, param_dict) + else: + raise ValueError(f'config.model: {args_opt.model} is not supported') + return ssd + +def main(): + logging.basicConfig(level=logging.INFO, + format='%(levelname)s: %(message)s') + args_opt = get_args() + print("Training setting args:", args_opt) + + os.makedirs(CACHE_TRAIN_DATA_URL, exist_ok=True) + mox.file.copy_parallel(args_opt.data_url, CACHE_TRAIN_DATA_URL) + + update_config(args_opt) + + if args_opt.distribute: + device_num = args_opt.device_num + context.reset_auto_parallel_context() + context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True, + device_num=device_num) + init() + rank = args_opt.device_id % device_num + else: + rank = 0 + device_num = 1 + print("Start create dataset!") + + + print("bbbbbbbbbb") + if args_opt.run_platform == "CPU": + print(args_opt) + print("lpf") + context.set_context(mode=context.GRAPH_MODE, device_target="CPU") + else: + context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id) + if args_opt.distribute: + device_num = args_opt.device_num + context.reset_auto_parallel_context() + context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True, + device_num=device_num) + init() + context.set_auto_parallel_context(all_reduce_fusion_config=[29, 58, 89]) + rank = get_rank() + + + print("****args_opt.dataset=",args_opt.dataset) + prefix = "ssd.mindrecord" + mindrecord_dir = config.mindrecord_dir + mindrecord_file = os.path.join(mindrecord_dir, prefix + "0") + if not os.path.exists(mindrecord_file): + if not os.path.isdir(mindrecord_dir): + os.makedirs(mindrecord_dir) + if args_opt.dataset == "coco": + if os.path.isdir(config.coco_root): + print("Create Mindrecord.") + data_to_mindrecord_byte_image("coco", True, prefix) + print("Create Mindrecord Done, at {}".format(mindrecord_dir)) + else: + print("coco_root not exits.") + elif args_opt.dataset == "voc": + if os.path.isdir(config.voc_dir): + print("Create Mindrecord.") + voc_data_to_mindrecord(mindrecord_dir, True, prefix) + print("Create Mindrecord Done, at {}".format(mindrecord_dir)) + else: + print("voc_dir not exits.") + else: + if os.path.isdir(config.image_dir) and os.path.exists(config.anno_path): + print("Create Mindrecord.") + data_to_mindrecord_byte_image("other", True, prefix) + print("Create Mindrecord Done, at {}".format(mindrecord_dir)) + else: + print("image_dir or anno_path not exits.") + + 
print("*********mindrecord_file",mindrecord_file) + + + if args_opt.only_create_dataset: + return + + loss_scale = float(args_opt.loss_scale) + if args_opt.run_platform == "CPU": + loss_scale = 1.0 + + # When create MindDataset, using the fitst mindrecord file, such as ssd.mindrecord0. + use_multiprocessing = (args_opt.run_platform != "CPU") + dataset = create_ssd_dataset(mindrecord_file, repeat_num=1, batch_size=args_opt.batch_size, + device_num=device_num, rank=rank, use_multiprocessing=use_multiprocessing) + print("********dataset", dataset) + dataset_size = dataset.get_dataset_size() + print("****-----****dataset_size", dataset) + print(f"Create dataset done! dataset size is {dataset_size}") + ssd = ssd_model_build(args_opt) + print("finish ssd model building ...............") + + if ("use_float16" in config and config.use_float16) or args_opt.run_platform == "GPU": + ssd.to_float(dtype.float16) + net = SSDWithLossCell(ssd, config) + + # checkpoint + # ckpt_config = CheckpointConfig(save_checkpoint_steps=dataset_size * args_opt.save_checkpoint_epochs) + # save_ckpt_path = './ckpt_' + config.model + '_' + str(rank) + '/' + # ckpoint_cb = ModelCheckpoint(prefix="ssd", directory=save_ckpt_path, config=ckpt_config) + + ckpt_config = CheckpointConfig(save_checkpoint_steps=dataset_size * args_opt.save_checkpoint_epochs, keep_checkpoint_max=60) + ckpoint_cb = ModelCheckpoint(prefix="ssd", directory=CACHE_TRAIN_OUT_URL, config=ckpt_config) + + if args_opt.pre_trained: + param_dict = load_checkpoint(args_opt.pre_trained) + if args_opt.filter_weight: + filter_checkpoint_parameter_by_list(param_dict, config.checkpoint_filter_list) + load_param_into_net(net, param_dict, True) + + lr = Tensor(get_lr(global_step=args_opt.pre_trained_epoch_size * dataset_size, + lr_init=config.lr_init, lr_end=config.lr_end_rate * args_opt.lr, lr_max=args_opt.lr, + warmup_epochs=config.warmup_epochs, + total_epochs=args_opt.epoch_size, + steps_per_epoch=dataset_size)) + + if "use_global_norm" in config and config.use_global_norm: + opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr, + config.momentum, config.weight_decay, 1.0) + net = TrainingWrapper(net, opt, loss_scale, True) + else: + opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr, + config.momentum, config.weight_decay, loss_scale) + net = TrainingWrapper(net, opt, loss_scale) + + + callback = [TimeMonitor(data_size=dataset_size), LossMonitor(), ckpoint_cb] + model = Model(net) + dataset_sink_mode = False + if args_opt.mode == "sink" and args_opt.run_platform != "CPU": + print("In sink mode, one epoch return a loss.") + dataset_sink_mode = True + print("Start train SSD, the first epoch will be slower because of the graph compilation.") + + # 改变工作目录,用于模型保存 + os.makedirs(CACHE_TRAIN_OUT_URL, exist_ok=True) + os.chdir(CACHE_TRAIN_OUT_URL) + model.train(args_opt.epoch_size, dataset, callbacks=callback, dataset_sink_mode=dataset_sink_mode) + + + # net = SSDWithLossCell(ssd, config) + net = ssd_resnet50(config=config) + export_air(net, args_opt) + mox.file.copy_parallel(CACHE_TRAIN_OUT_URL, args_opt.train_url) + + +if __name__ == '__main__': + main() \ No newline at end of file diff --git a/research/cv/ssd_resnet50/postprocess.py b/research/cv/ssd_resnet50/postprocess.py index 3639e32504c69a680e54972a052608364a7f5904..7688fa0a69c09de8fd3ea7b2160d36e37a67623b 100644 --- a/research/cv/ssd_resnet50/postprocess.py +++ b/research/cv/ssd_resnet50/postprocess.py @@ -1,89 +1,89 @@ -# Copyright 2021 Huawei Technologies 
Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ -"""post process for 310 inference""" -import os -import argparse -import numpy as np -from PIL import Image - -from src.config import config -from src.eval_utils import metrics - -batch_size = 1 -parser = argparse.ArgumentParser(description="ssd acc calculation") -parser.add_argument("--result_path", type=str, required=True, help="result files path.") -parser.add_argument("--img_path", type=str, required=True, help="image file path.") -parser.add_argument("--anno_file", type=str, required=True, help="annotation file.") -parser.add_argument("--drop", action="store_true", help="drop iscrowd images or not.") -args = parser.parse_args() - -def get_imgSize(file_name): - img = Image.open(file_name) - return img.size - -def get_result(result_path, img_id_file_path): - """print the mAP""" - if args.drop: - from pycocotools.coco import COCO - train_cls = config.classes - train_cls_dict = {} - for i, cls in enumerate(train_cls): - train_cls_dict[cls] = i - coco = COCO(args.anno_file) - classs_dict = {} - cat_ids = coco.loadCats(coco.getCatIds()) - for cat in cat_ids: - classs_dict[cat["id"]] = cat["name"] - - files = os.listdir(img_id_file_path) - pred_data = [] - - for file in files: - img_ids_name = file.split('.')[0] - img_id = int(np.squeeze(img_ids_name)) - if args.drop: - anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None) - anno = coco.loadAnns(anno_ids) - annos = [] - iscrowd = False - for label in anno: - bbox = label["bbox"] - class_name = classs_dict[label["category_id"]] - iscrowd = iscrowd or label["iscrowd"] - if class_name in train_cls: - x_min, x_max = bbox[0], bbox[0] + bbox[2] - y_min, y_max = bbox[1], bbox[1] + bbox[3] - annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]]) - if iscrowd or (not annos): - continue - - img_size = get_imgSize(os.path.join(img_id_file_path, file)) - image_shape = np.array([img_size[1], img_size[0]]) - result_path_0 = os.path.join(result_path, img_ids_name + "_0.bin") - result_path_1 = os.path.join(result_path, img_ids_name + "_1.bin") - boxes = np.fromfile(result_path_0, dtype=np.float32).reshape(config.num_ssd_boxes, 4) - box_scores = np.fromfile(result_path_1, dtype=np.float32).reshape(config.num_ssd_boxes, config.num_classes) - - pred_data.append({ - "boxes": boxes, - "box_scores": box_scores, - "img_id": img_id, - "image_shape": image_shape - }) - mAP = metrics(pred_data, args.anno_file) - print(f" mAP:{mAP}") - -if __name__ == '__main__': - get_result(args.result_path, args.img_path) +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +"""post process for 310 inference""" +import os +import argparse +import numpy as np +from PIL import Image + +from src.config import config +from src.eval_utils import metrics + +batch_size = 1 +parser = argparse.ArgumentParser(description="ssd acc calculation") +parser.add_argument("--result_path", type=str, required=True, help="result files path.") +parser.add_argument("--img_path", type=str, required=True, help="image file path.") +parser.add_argument("--anno_file", type=str, required=True, help="annotation file.") +parser.add_argument("--drop", action="store_true", help="drop iscrowd images or not.") +args = parser.parse_args() + +def get_imgSize(file_name): + img = Image.open(file_name) + return img.size + +def get_result(result_path, img_id_file_path): + """print the mAP""" + if args.drop: + from pycocotools.coco import COCO + train_cls = config.classes + train_cls_dict = {} + for i, cls in enumerate(train_cls): + train_cls_dict[cls] = i + coco = COCO(args.anno_file) + classs_dict = {} + cat_ids = coco.loadCats(coco.getCatIds()) + for cat in cat_ids: + classs_dict[cat["id"]] = cat["name"] + + files = os.listdir(img_id_file_path) + pred_data = [] + + for file in files: + img_ids_name = file.split('.')[0] + img_id = int(np.squeeze(img_ids_name)) + if args.drop: + anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None) + anno = coco.loadAnns(anno_ids) + annos = [] + iscrowd = False + for label in anno: + bbox = label["bbox"] + class_name = classs_dict[label["category_id"]] + iscrowd = iscrowd or label["iscrowd"] + if class_name in train_cls: + x_min, x_max = bbox[0], bbox[0] + bbox[2] + y_min, y_max = bbox[1], bbox[1] + bbox[3] + annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]]) + if iscrowd or (not annos): + continue + + img_size = get_imgSize(os.path.join(img_id_file_path, file)) + image_shape = np.array([img_size[1], img_size[0]]) + result_path_0 = os.path.join(result_path, img_ids_name + "_0.bin") + result_path_1 = os.path.join(result_path, img_ids_name + "_1.bin") + boxes = np.fromfile(result_path_0, dtype=np.float32).reshape(config.num_ssd_boxes, 4) + box_scores = np.fromfile(result_path_1, dtype=np.float32).reshape(config.num_ssd_boxes, config.num_classes) + + pred_data.append({ + "boxes": boxes, + "box_scores": box_scores, + "img_id": img_id, + "image_shape": image_shape + }) + mAP = metrics(pred_data, args.anno_file) + print(f" mAP:{mAP}") + +if __name__ == '__main__': + get_result(args.result_path, args.img_path) diff --git a/research/cv/ssd_resnet50/requirements.txt b/research/cv/ssd_resnet50/requirements.txt index 15287919a73f91639a67c249ea5f4db8f80c880c..6fc0b4bed46b1eb1cadea52ada75c33407415151 100644 --- a/research/cv/ssd_resnet50/requirements.txt +++ b/research/cv/ssd_resnet50/requirements.txt @@ -1,5 +1,5 @@ -pycocotools -opencv-python -xml-python -Pillow -numpy +pycocotools +opencv-python +xml-python +Pillow +numpy diff --git a/research/cv/ssd_resnet50/scripts/docker_start.sh b/research/cv/ssd_resnet50/scripts/docker_start.sh index 
9fdd0e872e3d2d5b8a3408623f9c53ad7e86e03c..bd04ce6c1dbb0f31601f89ae14f130586aef8335 100644 --- a/research/cv/ssd_resnet50/scripts/docker_start.sh +++ b/research/cv/ssd_resnet50/scripts/docker_start.sh @@ -1,29 +1,29 @@ -#!/bin/bash - -docker_image=$1 -data_dir=$2 -model_dir=$3 - -docker run -it --ipc=host \ - --device=/dev/davinci0 \ - --device=/dev/davinci1 \ - --device=/dev/davinci2 \ - --device=/dev/davinci3 \ - --device=/dev/davinci4 \ - --device=/dev/davinci5 \ - --device=/dev/davinci6 \ - --device=/dev/davinci7 \ - --device=/dev/davinci_manager \ - --device=/dev/devmm_svm \ - --device=/dev/hisi_hdc \ - --privileged \ - -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \ - -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons \ - -v ${data_dir}:${data_dir} \ - -v ${model_dir}:${model_dir} \ - -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf \ - -v /var/log/npu/slog/:/var/log/npu/slog/ \ - -v /var/log/npu/profiling/:/var/log/npu/profiling \ - -v /var/log/npu/dump/:/var/log/npu/dump \ - -v /var/log/npu/:/usr/slog ${docker_image} \ - /bin/bash +#!/bin/bash + +docker_image=$1 +data_dir=$2 +model_dir=$3 + +docker run -it --ipc=host \ + --device=/dev/davinci0 \ + --device=/dev/davinci1 \ + --device=/dev/davinci2 \ + --device=/dev/davinci3 \ + --device=/dev/davinci4 \ + --device=/dev/davinci5 \ + --device=/dev/davinci6 \ + --device=/dev/davinci7 \ + --device=/dev/davinci_manager \ + --device=/dev/devmm_svm \ + --device=/dev/hisi_hdc \ + --privileged \ + -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \ + -v /usr/local/Ascend/add-ons/:/usr/local/Ascend/add-ons \ + -v ${data_dir}:${data_dir} \ + -v ${model_dir}:${model_dir} \ + -v /var/log/npu/conf/slog/slog.conf:/var/log/npu/conf/slog/slog.conf \ + -v /var/log/npu/slog/:/var/log/npu/slog/ \ + -v /var/log/npu/profiling/:/var/log/npu/profiling \ + -v /var/log/npu/dump/:/var/log/npu/dump \ + -v /var/log/npu/:/usr/slog ${docker_image} \ + /bin/bash diff --git a/research/cv/ssd_resnet50/scripts/run_distribute_train.sh b/research/cv/ssd_resnet50/scripts/run_distribute_train.sh index a25ae704b383fe39343dee608962ad2cb7c80e7e..f0d1503b409f434231f5eb8531639631f6234ffb 100644 --- a/research/cv/ssd_resnet50/scripts/run_distribute_train.sh +++ b/research/cv/ssd_resnet50/scripts/run_distribute_train.sh @@ -1,84 +1,84 @@ -#!/bin/bash -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -echo "==============================================================================================================" -echo "Please run the script as: " -echo "sh run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PRE_TRAINED PRE_TRAINED_EPOCH_SIZE" -echo "for example: sh run_distribute_train.sh 8 500 0.2 coco /data/hccl.json /opt/ssd-300.ckpt(optional) 200(optional)" -echo "It is better to use absolute path." 
-echo "=================================================================================================================" - -if [ $# != 5 ] && [ $# != 7 ] -then - echo "Usage: sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] \ -[RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)" - exit 1 -fi - -# Before start distribute train, first create mindrecord files. -BASE_PATH=$(cd "`dirname $0`" || exit; pwd) -cd $BASE_PATH/../ || exit -python train.py --only_create_dataset=True --dataset=$4 - -echo "After running the script, the network runs in the background. The log will be generated in LOGx/log.txt" - -export RANK_SIZE=$1 -EPOCH_SIZE=$2 -LR=$3 -DATASET=$4 -PRE_TRAINED=$6 -PRE_TRAINED_EPOCH_SIZE=$7 -export RANK_TABLE_FILE=$5 - -start_device=0 -for((i=0;i env.log - if [ $# == 5 ] - then - python train.py \ - --distribute=True \ - --lr=$LR \ - --dataset=$DATASET \ - --device_num=$RANK_SIZE \ - --device_id=$DEVICE_ID \ - --epoch_size=$EPOCH_SIZE > log.txt 2>&1 & - fi - - if [ $# == 7 ] - then - python train.py \ - --distribute=True \ - --lr=$LR \ - --dataset=$DATASET \ - --device_num=$RANK_SIZE \ - --device_id=$DEVICE_ID \ - --pre_trained=$PRE_TRAINED \ - --pre_trained_epoch_size=$PRE_TRAINED_EPOCH_SIZE \ - --epoch_size=$EPOCH_SIZE > log.txt 2>&1 & - fi - - cd ../ -done +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +echo "==============================================================================================================" +echo "Please run the script as: " +echo "sh run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PRE_TRAINED PRE_TRAINED_EPOCH_SIZE" +echo "for example: sh run_distribute_train.sh 8 500 0.2 coco /data/hccl.json /opt/ssd-300.ckpt(optional) 200(optional)" +echo "It is better to use absolute path." +echo "=================================================================================================================" + +if [ $# != 5 ] && [ $# != 7 ] +then + echo "Usage: sh run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] \ +[RANK_TABLE_FILE] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional)" + exit 1 +fi + +# Before start distribute train, first create mindrecord files. +BASE_PATH=$(cd "`dirname $0`" || exit; pwd) +cd $BASE_PATH/../ || exit +python train.py --only_create_dataset=True --dataset=$4 + +echo "After running the script, the network runs in the background. 
The log will be generated in LOGx/log.txt" + +export RANK_SIZE=$1 +EPOCH_SIZE=$2 +LR=$3 +DATASET=$4 +PRE_TRAINED=$6 +PRE_TRAINED_EPOCH_SIZE=$7 +export RANK_TABLE_FILE=$5 + +start_device=0 +for((i=0;i env.log + if [ $# == 5 ] + then + python train.py \ + --distribute=True \ + --lr=$LR \ + --dataset=$DATASET \ + --device_num=$RANK_SIZE \ + --device_id=$DEVICE_ID \ + --epoch_size=$EPOCH_SIZE > log.txt 2>&1 & + fi + + if [ $# == 7 ] + then + python train.py \ + --distribute=True \ + --lr=$LR \ + --dataset=$DATASET \ + --device_num=$RANK_SIZE \ + --device_id=$DEVICE_ID \ + --pre_trained=$PRE_TRAINED \ + --pre_trained_epoch_size=$PRE_TRAINED_EPOCH_SIZE \ + --epoch_size=$EPOCH_SIZE > log.txt 2>&1 & + fi + + cd ../ +done diff --git a/research/cv/ssd_resnet50/scripts/run_eval.sh b/research/cv/ssd_resnet50/scripts/run_eval.sh index 77054ad87f642ad9de43b22634450784ee691a32..367879f4391865b387b1f1d786e34b35856b3d80 100644 --- a/research/cv/ssd_resnet50/scripts/run_eval.sh +++ b/research/cv/ssd_resnet50/scripts/run_eval.sh @@ -1,65 +1,65 @@ -#!/bin/bash -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -if [ $# != 3 ] -then - echo "Usage: sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]" -exit 1 -fi - -get_real_path(){ - if [ "${1:0:1}" == "/" ]; then - echo "$1" - else - echo "$(realpath -m $PWD/$1)" - fi -} - -DATASET=$1 -CHECKPOINT_PATH=$(get_real_path $2) -echo $DATASET -echo $CHECKPOINT_PATH - -if [ ! -f $CHECKPOINT_PATH ] -then - echo "error: CHECKPOINT_PATH=$PATH2 is not a file" -exit 1 -fi - -export DEVICE_NUM=1 -export DEVICE_ID=$3 -export RANK_SIZE=$DEVICE_NUM -export RANK_ID=0 - -BASE_PATH=$(cd "`dirname $0`" || exit; pwd) -cd $BASE_PATH/../ || exit - -if [ -d "eval$3" ]; -then - rm -rf ./eval$3 -fi - -mkdir ./eval$3 -cp ./*.py ./eval$3 -cp -r ./src ./eval$3 -cd ./eval$3 || exit -env > env.log -echo "start inferring for device $DEVICE_ID" -python eval.py \ - --dataset=$DATASET \ - --checkpoint_path=$CHECKPOINT_PATH \ - --device_id=$3 > log.txt 2>&1 & -cd .. +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +if [ $# != 3 ] +then + echo "Usage: sh run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID]" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} + +DATASET=$1 +CHECKPOINT_PATH=$(get_real_path $2) +echo $DATASET +echo $CHECKPOINT_PATH + +if [ ! -f $CHECKPOINT_PATH ] +then + echo "error: CHECKPOINT_PATH=$PATH2 is not a file" +exit 1 +fi + +export DEVICE_NUM=1 +export DEVICE_ID=$3 +export RANK_SIZE=$DEVICE_NUM +export RANK_ID=0 + +BASE_PATH=$(cd "`dirname $0`" || exit; pwd) +cd $BASE_PATH/../ || exit + +if [ -d "eval$3" ]; +then + rm -rf ./eval$3 +fi + +mkdir ./eval$3 +cp ./*.py ./eval$3 +cp -r ./src ./eval$3 +cd ./eval$3 || exit +env > env.log +echo "start inferring for device $DEVICE_ID" +python eval.py \ + --dataset=$DATASET \ + --checkpoint_path=$CHECKPOINT_PATH \ + --device_id=$3 > log.txt 2>&1 & +cd .. diff --git a/research/cv/ssd_resnet50/scripts/run_infer_310.sh b/research/cv/ssd_resnet50/scripts/run_infer_310.sh index 808468ad82d3e6e8ee9ad37576a8f26a551109ac..b64cca30f663da3605cff954697d09b974c84639 100644 --- a/research/cv/ssd_resnet50/scripts/run_infer_310.sh +++ b/research/cv/ssd_resnet50/scripts/run_infer_310.sh @@ -1,108 +1,108 @@ -#!/bin/bash -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -if [[ $# -lt 4 || $# -gt 5 ]]; then - echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID] - DVPP is mandatory, and must choose from [DVPP|CPU], it's case-insensitive - ANNO_PATH is mandatory, and should specify annotation file path of your data including file name. 
- DEVICE_ID is optional, it can be set by environment variable device_id, otherwise the value is zero" -exit 1 -fi - -get_real_path(){ - if [ "${1:0:1}" == "/" ]; then - echo "$1" - else - echo "$(realpath -m $PWD/$1)" - fi -} -model=$(get_real_path $1) -data_path=$(get_real_path $2) -DVPP=${3^^} -anno=$(get_real_path $4) - -device_id=0 -if [ $# == 5 ]; then - device_id=$5 -fi - -echo "mindir name: "$model -echo "dataset path: "$data_path -echo "image process mode: "$DVPP -echo "anno file: "$anno -echo "device id: "$device_id - -export ASCEND_HOME=/usr/local/Ascend/ -if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then - export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH - export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH - export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe - export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH - export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp -else - export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH - export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH - export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH - export ASCEND_OPP_PATH=$ASCEND_HOME/opp -fi - -function compile_app() -{ - cd ../ascend310_infer || exit - bash build.sh &> build.log -} - -function infer() -{ - cd - || exit - if [ -d result_Files ]; then - rm -rf ./result_Files - fi - if [ -d time_Result ]; then - rm -rf ./time_Result - fi - mkdir result_Files - mkdir time_Result - if [ "$DVPP" == "DVPP" ];then - ../ascend310_infer/out/main --mindir_path=$model --dataset_path=$data_path --device_id=$device_id --cpu_dvpp=$DVPP --aipp_path=../ascend310_infer/aipp.cfg --image_height=640 --image_width=640 &> infer.log - elif [ "$DVPP" == "CPU" ]; then - ../ascend310_infer/out/main --mindir_path=$model --dataset_path=$data_path --cpu_dvpp=$DVPP --device_id=$device_id --image_height=300 --image_width=300 &> infer.log - else - echo "image process mode must be in [DVPP|CPU]" - exit 1 - fi -} - -function cal_acc() -{ - python3.7 ../postprocess.py --result_path=./result_Files --img_path=$data_path --anno_file=$anno --drop &> acc.log & -} - -compile_app -if [ $? -ne 0 ]; then - echo "compile app code failed" - exit 1 -fi -infer -if [ $? -ne 0 ]; then - echo " execute inference failed" - exit 1 -fi -cal_acc -if [ $? -ne 0 ]; then - echo "calculate accuracy failed" - exit 1 +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +if [[ $# -lt 4 || $# -gt 5 ]]; then + echo "Usage: bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DVPP] [ANNO_FILE] [DEVICE_ID] + DVPP is mandatory, and must choose from [DVPP|CPU], it's case-insensitive + ANNO_PATH is mandatory, and should specify annotation file path of your data including file name. + DEVICE_ID is optional, it can be set by environment variable device_id, otherwise the value is zero" +exit 1 +fi + +get_real_path(){ + if [ "${1:0:1}" == "/" ]; then + echo "$1" + else + echo "$(realpath -m $PWD/$1)" + fi +} +model=$(get_real_path $1) +data_path=$(get_real_path $2) +DVPP=${3^^} +anno=$(get_real_path $4) + +device_id=0 +if [ $# == 5 ]; then + device_id=$5 +fi + +echo "mindir name: "$model +echo "dataset path: "$data_path +echo "image process mode: "$DVPP +echo "anno file: "$anno +echo "device id: "$device_id + +export ASCEND_HOME=/usr/local/Ascend/ +if [ -d ${ASCEND_HOME}/ascend-toolkit ]; then + export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/ascend-toolkit/latest/atc/bin:$PATH + export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/ascend-toolkit/latest/atc/lib64:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH + export TBE_IMPL_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe + export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:${TBE_IMPL_PATH}:$ASCEND_HOME/ascend-toolkit/latest/fwkacllib/python/site-packages:$PYTHONPATH + export ASCEND_OPP_PATH=$ASCEND_HOME/ascend-toolkit/latest/opp +else + export PATH=$ASCEND_HOME/fwkacllib/bin:$ASCEND_HOME/fwkacllib/ccec_compiler/bin:$ASCEND_HOME/atc/ccec_compiler/bin:$ASCEND_HOME/atc/bin:$PATH + export LD_LIBRARY_PATH=$ASCEND_HOME/fwkacllib/lib64:/usr/local/lib:$ASCEND_HOME/atc/lib64:$ASCEND_HOME/acllib/lib64:$ASCEND_HOME/driver/lib64:$ASCEND_HOME/add-ons:$LD_LIBRARY_PATH + export PYTHONPATH=$ASCEND_HOME/fwkacllib/python/site-packages:$ASCEND_HOME/atc/python/site-packages:$PYTHONPATH + export ASCEND_OPP_PATH=$ASCEND_HOME/opp +fi + +function compile_app() +{ + cd ../ascend310_infer || exit + bash build.sh &> build.log +} + +function infer() +{ + cd - || exit + if [ -d result_Files ]; then + rm -rf ./result_Files + fi + if [ -d time_Result ]; then + rm -rf ./time_Result + fi + mkdir result_Files + mkdir time_Result + if [ "$DVPP" == "DVPP" ];then + ../ascend310_infer/out/main --mindir_path=$model --dataset_path=$data_path --device_id=$device_id --cpu_dvpp=$DVPP --aipp_path=../ascend310_infer/aipp.cfg --image_height=640 --image_width=640 &> infer.log + elif [ "$DVPP" == "CPU" ]; then + ../ascend310_infer/out/main --mindir_path=$model --dataset_path=$data_path --cpu_dvpp=$DVPP --device_id=$device_id --image_height=300 --image_width=300 &> infer.log + else + echo "image process mode must be in [DVPP|CPU]" + exit 1 + fi +} + +function cal_acc() +{ + python3.7 ../postprocess.py --result_path=./result_Files --img_path=$data_path --anno_file=$anno --drop &> acc.log & +} + +compile_app +if [ $? -ne 0 ]; then + echo "compile app code failed" + exit 1 +fi +infer +if [ $? -ne 0 ]; then + echo " execute inference failed" + exit 1 +fi +cal_acc +if [ $? 
-ne 0 ]; then + echo "calculate accuracy failed" + exit 1 fi \ No newline at end of file diff --git a/research/cv/ssd_resnet50/scripts/run_standalone_train.sh b/research/cv/ssd_resnet50/scripts/run_standalone_train.sh index 87b522403d7ab74aa1e83580fdcc7c777257aa01..f56ac21026ac3188dccc4e8c5e52a9c2aefa28c2 100644 --- a/research/cv/ssd_resnet50/scripts/run_standalone_train.sh +++ b/research/cv/ssd_resnet50/scripts/run_standalone_train.sh @@ -1,25 +1,25 @@ -#!/bin/bash -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -python train.py \ - --distribute=False \ - --lr=0.02 \ - --dataset=coco \ - --device_num=1 \ - --device_id=1 \ - --epoch_size=12 \ - --save_checkpoint_epochs=2 \ - > log.txt 2>&1 & +#!/bin/bash +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +python train.py \ + --distribute=False \ + --lr=0.02 \ + --dataset=coco \ + --device_num=1 \ + --device_id=1 \ + --epoch_size=12 \ + --save_checkpoint_epochs=2 \ + > log.txt 2>&1 & diff --git a/research/cv/ssd_resnet50/src/anchor_generator.py b/research/cv/ssd_resnet50/src/anchor_generator.py index b0862fd648329ed5fb7be2c61390b20801112ea4..95121bc547d245a2c31fa79ce3828d85a1f4767d 100644 --- a/research/cv/ssd_resnet50/src/anchor_generator.py +++ b/research/cv/ssd_resnet50/src/anchor_generator.py @@ -1,94 +1,94 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# ============================================================================ - -"""Anchor Generator""" - -import numpy as np - - -class GridAnchorGenerator: - """ - Anchor Generator - """ - def __init__(self, image_shape, scale, scales_per_octave, aspect_ratios): - super(GridAnchorGenerator, self).__init__() - self.scale = scale - self.scales_per_octave = scales_per_octave - self.aspect_ratios = aspect_ratios - self.image_shape = image_shape - - - def generate(self, step): - """generate anchors for one layer""" - scales = np.array([2**(float(scale) / self.scales_per_octave) - for scale in range(self.scales_per_octave)]).astype(np.float32) - aspects = np.array(list(self.aspect_ratios)).astype(np.float32) - - scales_grid, aspect_ratios_grid = np.meshgrid(scales, aspects) - scales_grid = scales_grid.reshape([-1]) - aspect_ratios_grid = aspect_ratios_grid.reshape([-1]) - - feature_size = [self.image_shape[0] / step, self.image_shape[1] / step] - grid_height, grid_width = feature_size - - base_size = np.array([self.scale * step, self.scale * step]).astype(np.float32) - anchor_offset = step / 2.0 - - ratio_sqrt = np.sqrt(aspect_ratios_grid) - heights = scales_grid / ratio_sqrt * base_size[0] - widths = scales_grid * ratio_sqrt * base_size[1] - - y_centers = np.arange(grid_height).astype(np.float32) - y_centers = y_centers * step + anchor_offset - x_centers = np.arange(grid_width).astype(np.float32) - x_centers = x_centers * step + anchor_offset - x_centers, y_centers = np.meshgrid(x_centers, y_centers) - - x_centers_shape = x_centers.shape - y_centers_shape = y_centers.shape - - widths_grid, x_centers_grid = np.meshgrid(widths, x_centers.reshape([-1])) - heights_grid, y_centers_grid = np.meshgrid(heights, y_centers.reshape([-1])) - - x_centers_grid = x_centers_grid.reshape(*x_centers_shape, -1) - y_centers_grid = y_centers_grid.reshape(*y_centers_shape, -1) - widths_grid = widths_grid.reshape(-1, *x_centers_shape) - heights_grid = heights_grid.reshape(-1, *y_centers_shape) - - - bbox_centers = np.stack([y_centers_grid, x_centers_grid], axis=3) - bbox_sizes = np.stack([heights_grid, widths_grid], axis=3) - bbox_centers = bbox_centers.reshape([-1, 2]) - bbox_sizes = bbox_sizes.reshape([-1, 2]) - bbox_corners = np.concatenate([bbox_centers - 0.5 * bbox_sizes, bbox_centers + 0.5 * bbox_sizes], axis=1) - self.bbox_corners = bbox_corners / np.array([*self.image_shape, *self.image_shape]).astype(np.float32) - self.bbox_centers = np.concatenate([bbox_centers, bbox_sizes], axis=1) - self.bbox_centers = self.bbox_centers / np.array([*self.image_shape, *self.image_shape]).astype(np.float32) - - print(self.bbox_centers.shape) - return self.bbox_centers, self.bbox_corners - - def generate_multi_levels(self, steps): - """generate anchor for multi layer""" - bbox_centers_list = [] - bbox_corners_list = [] - for step in steps: - bbox_centers, bbox_corners = self.generate(step) - bbox_centers_list.append(bbox_centers) - bbox_corners_list.append(bbox_corners) - - self.bbox_centers = np.concatenate(bbox_centers_list, axis=0) - self.bbox_corners = np.concatenate(bbox_corners_list, axis=0) - return self.bbox_centers, self.bbox_corners +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Anchor Generator""" + +import numpy as np + + +class GridAnchorGenerator: + """ + Anchor Generator + """ + def __init__(self, image_shape, scale, scales_per_octave, aspect_ratios): + super(GridAnchorGenerator, self).__init__() + self.scale = scale + self.scales_per_octave = scales_per_octave + self.aspect_ratios = aspect_ratios + self.image_shape = image_shape + + + def generate(self, step): + """generate anchors for one layer""" + scales = np.array([2**(float(scale) / self.scales_per_octave) + for scale in range(self.scales_per_octave)]).astype(np.float32) + aspects = np.array(list(self.aspect_ratios)).astype(np.float32) + + scales_grid, aspect_ratios_grid = np.meshgrid(scales, aspects) + scales_grid = scales_grid.reshape([-1]) + aspect_ratios_grid = aspect_ratios_grid.reshape([-1]) + + feature_size = [self.image_shape[0] / step, self.image_shape[1] / step] + grid_height, grid_width = feature_size + + base_size = np.array([self.scale * step, self.scale * step]).astype(np.float32) + anchor_offset = step / 2.0 + + ratio_sqrt = np.sqrt(aspect_ratios_grid) + heights = scales_grid / ratio_sqrt * base_size[0] + widths = scales_grid * ratio_sqrt * base_size[1] + + y_centers = np.arange(grid_height).astype(np.float32) + y_centers = y_centers * step + anchor_offset + x_centers = np.arange(grid_width).astype(np.float32) + x_centers = x_centers * step + anchor_offset + x_centers, y_centers = np.meshgrid(x_centers, y_centers) + + x_centers_shape = x_centers.shape + y_centers_shape = y_centers.shape + + widths_grid, x_centers_grid = np.meshgrid(widths, x_centers.reshape([-1])) + heights_grid, y_centers_grid = np.meshgrid(heights, y_centers.reshape([-1])) + + x_centers_grid = x_centers_grid.reshape(*x_centers_shape, -1) + y_centers_grid = y_centers_grid.reshape(*y_centers_shape, -1) + widths_grid = widths_grid.reshape(-1, *x_centers_shape) + heights_grid = heights_grid.reshape(-1, *y_centers_shape) + + + bbox_centers = np.stack([y_centers_grid, x_centers_grid], axis=3) + bbox_sizes = np.stack([heights_grid, widths_grid], axis=3) + bbox_centers = bbox_centers.reshape([-1, 2]) + bbox_sizes = bbox_sizes.reshape([-1, 2]) + bbox_corners = np.concatenate([bbox_centers - 0.5 * bbox_sizes, bbox_centers + 0.5 * bbox_sizes], axis=1) + self.bbox_corners = bbox_corners / np.array([*self.image_shape, *self.image_shape]).astype(np.float32) + self.bbox_centers = np.concatenate([bbox_centers, bbox_sizes], axis=1) + self.bbox_centers = self.bbox_centers / np.array([*self.image_shape, *self.image_shape]).astype(np.float32) + + print(self.bbox_centers.shape) + return self.bbox_centers, self.bbox_corners + + def generate_multi_levels(self, steps): + """generate anchor for multi layer""" + bbox_centers_list = [] + bbox_corners_list = [] + for step in steps: + bbox_centers, bbox_corners = self.generate(step) + bbox_centers_list.append(bbox_centers) + bbox_corners_list.append(bbox_corners) + + self.bbox_centers = np.concatenate(bbox_centers_list, axis=0) + self.bbox_corners = np.concatenate(bbox_corners_list, axis=0) + return 
self.bbox_centers, self.bbox_corners diff --git a/research/cv/ssd_resnet50/src/box_utils.py b/research/cv/ssd_resnet50/src/box_utils.py index 0e6544055f16c00dbd87cef95da40fa2049a5e6d..60d3ef8573d8960d475b2036938906d4ccdabd48 100644 --- a/research/cv/ssd_resnet50/src/box_utils.py +++ b/research/cv/ssd_resnet50/src/box_utils.py @@ -1,170 +1,170 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -"""Bbox utils""" - -import math -import itertools as it -import numpy as np -from .config import config -from .anchor_generator import GridAnchorGenerator - - -class GeneratDefaultBoxes(): - """ - Generate Default boxes for SSD, follows the order of (W, H, archor_sizes). - `self.default_boxes` has a shape of [archor_sizes, H, W, 4], the last dimension is [y, x, h, w]. - `self.default_boxes_tlbr` has a shape as `self.default_boxes`, the last dimension is [y1, x1, y2, x2]. - """ - def __init__(self): - fk = config.img_shape[0] / np.array(config.steps) - scale_rate = (config.max_scale - config.min_scale) / (len(config.num_default) - 1) - scales = [config.min_scale + scale_rate * i for i in range(len(config.num_default))] + [1.0] - self.default_boxes = [] - for idex, feature_size in enumerate(config.feature_size): - sk1 = scales[idex] - sk2 = scales[idex + 1] - sk3 = math.sqrt(sk1 * sk2) - if idex == 0 and not config.aspect_ratios[idex]: - w, h = sk1 * math.sqrt(2), sk1 / math.sqrt(2) - all_sizes = [(0.1, 0.1), (w, h), (h, w)] - else: - all_sizes = [(sk1, sk1)] - for aspect_ratio in config.aspect_ratios[idex]: - w, h = sk1 * math.sqrt(aspect_ratio), sk1 / math.sqrt(aspect_ratio) - all_sizes.append((w, h)) - all_sizes.append((h, w)) - all_sizes.append((sk3, sk3)) - - assert len(all_sizes) == config.num_default[idex] - - for i, j in it.product(range(feature_size), repeat=2): - for w, h in all_sizes: - cx, cy = (j + 0.5) / fk[idex], (i + 0.5) / fk[idex] - self.default_boxes.append([cy, cx, h, w]) - - def to_tlbr(cy, cx, h, w): - return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2 - - # For IoU calculation - self.default_boxes_tlbr = np.array(tuple(to_tlbr(*i) for i in self.default_boxes), dtype='float32') - self.default_boxes = np.array(self.default_boxes, dtype='float32') - -if 'use_anchor_generator' in config and config.use_anchor_generator: - generator = GridAnchorGenerator(config.img_shape, 4, 2, [1.0, 2.0, 0.5]) - default_boxes, default_boxes_tlbr = generator.generate_multi_levels(config.steps) -else: - default_boxes_tlbr = GeneratDefaultBoxes().default_boxes_tlbr - default_boxes = GeneratDefaultBoxes().default_boxes -y1, x1, y2, x2 = np.split(default_boxes_tlbr[:, :4], 4, axis=-1) -vol_anchors = (x2 - x1) * (y2 - y1) -matching_threshold = config.match_threshold - - -def ssd_bboxes_encode(boxes): - """ - Labels anchors with ground truth inputs. - - Args: - boxex: ground truth with shape [N, 5], for each row, it stores [y, x, h, w, cls]. 
- - Returns: - gt_loc: location ground truth with shape [num_anchors, 4]. - gt_label: class ground truth with shape [num_anchors, 1]. - num_matched_boxes: number of positives in an image. - """ - - def jaccard_with_anchors(bbox): - """Compute jaccard score a box and the anchors.""" - # Intersection bbox and volume. - ymin = np.maximum(y1, bbox[0]) - xmin = np.maximum(x1, bbox[1]) - ymax = np.minimum(y2, bbox[2]) - xmax = np.minimum(x2, bbox[3]) - w = np.maximum(xmax - xmin, 0.) - h = np.maximum(ymax - ymin, 0.) - - # Volumes. - inter_vol = h * w - union_vol = vol_anchors + (bbox[2] - bbox[0]) * (bbox[3] - bbox[1]) - inter_vol - jaccard = inter_vol / union_vol - return np.squeeze(jaccard) - - pre_scores = np.zeros((config.num_ssd_boxes), dtype=np.float32) - t_boxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32) - t_label = np.zeros((config.num_ssd_boxes), dtype=np.int64) - for bbox in boxes: - label = int(bbox[4]) - scores = jaccard_with_anchors(bbox) - idx = np.argmax(scores) - scores[idx] = 2.0 - mask = (scores > matching_threshold) - mask = mask & (scores > pre_scores) - pre_scores = np.maximum(pre_scores, scores * mask) - t_label = mask * label + (1 - mask) * t_label - for i in range(4): - t_boxes[:, i] = mask * bbox[i] + (1 - mask) * t_boxes[:, i] - - index = np.nonzero(t_label) - - # Transform to tlbr. - bboxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32) - bboxes[:, [0, 1]] = (t_boxes[:, [0, 1]] + t_boxes[:, [2, 3]]) / 2 - bboxes[:, [2, 3]] = t_boxes[:, [2, 3]] - t_boxes[:, [0, 1]] - - # Encode features. - bboxes_t = bboxes[index] - default_boxes_t = default_boxes[index] - bboxes_t[:, :2] = (bboxes_t[:, :2] - default_boxes_t[:, :2]) / (default_boxes_t[:, 2:] * config.prior_scaling[0]) - tmp = np.maximum(bboxes_t[:, 2:4] / default_boxes_t[:, 2:4], 0.000001) - bboxes_t[:, 2:4] = np.log(tmp) / config.prior_scaling[1] - bboxes[index] = bboxes_t - - num_match = np.array([len(np.nonzero(t_label)[0])], dtype=np.int32) - return bboxes, t_label.astype(np.int32), num_match - - -def ssd_bboxes_decode(boxes): - """Decode predict boxes to [y, x, h, w]""" - boxes_t = boxes.copy() - default_boxes_t = default_boxes.copy() - boxes_t[:, :2] = boxes_t[:, :2] * config.prior_scaling[0] * default_boxes_t[:, 2:] + default_boxes_t[:, :2] - boxes_t[:, 2:4] = np.exp(boxes_t[:, 2:4] * config.prior_scaling[1]) * default_boxes_t[:, 2:4] - - bboxes = np.zeros((len(boxes_t), 4), dtype=np.float32) - - bboxes[:, [0, 1]] = boxes_t[:, [0, 1]] - boxes_t[:, [2, 3]] / 2 - bboxes[:, [2, 3]] = boxes_t[:, [0, 1]] + boxes_t[:, [2, 3]] / 2 - - return np.clip(bboxes, 0, 1) - - -def intersect(box_a, box_b): - """Compute the intersect of two sets of boxes.""" - max_yx = np.minimum(box_a[:, 2:4], box_b[2:4]) - min_yx = np.maximum(box_a[:, :2], box_b[:2]) - inter = np.clip((max_yx - min_yx), a_min=0, a_max=np.inf) - return inter[:, 0] * inter[:, 1] - - -def jaccard_numpy(box_a, box_b): - """Compute the jaccard overlap of two sets of boxes.""" - inter = intersect(box_a, box_b) - area_a = ((box_a[:, 2] - box_a[:, 0]) * - (box_a[:, 3] - box_a[:, 1])) - area_b = ((box_b[2] - box_b[0]) * - (box_b[3] - box_b[1])) - union = area_a + area_b - inter - return inter / union +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Bbox utils""" + +import math +import itertools as it +import numpy as np +from .config import config +from .anchor_generator import GridAnchorGenerator + + +class GeneratDefaultBoxes(): + """ + Generate Default boxes for SSD, follows the order of (W, H, archor_sizes). + `self.default_boxes` has a shape of [archor_sizes, H, W, 4], the last dimension is [y, x, h, w]. + `self.default_boxes_tlbr` has a shape as `self.default_boxes`, the last dimension is [y1, x1, y2, x2]. + """ + def __init__(self): + fk = config.img_shape[0] / np.array(config.steps) + scale_rate = (config.max_scale - config.min_scale) / (len(config.num_default) - 1) + scales = [config.min_scale + scale_rate * i for i in range(len(config.num_default))] + [1.0] + self.default_boxes = [] + for idex, feature_size in enumerate(config.feature_size): + sk1 = scales[idex] + sk2 = scales[idex + 1] + sk3 = math.sqrt(sk1 * sk2) + if idex == 0 and not config.aspect_ratios[idex]: + w, h = sk1 * math.sqrt(2), sk1 / math.sqrt(2) + all_sizes = [(0.1, 0.1), (w, h), (h, w)] + else: + all_sizes = [(sk1, sk1)] + for aspect_ratio in config.aspect_ratios[idex]: + w, h = sk1 * math.sqrt(aspect_ratio), sk1 / math.sqrt(aspect_ratio) + all_sizes.append((w, h)) + all_sizes.append((h, w)) + all_sizes.append((sk3, sk3)) + + assert len(all_sizes) == config.num_default[idex] + + for i, j in it.product(range(feature_size), repeat=2): + for w, h in all_sizes: + cx, cy = (j + 0.5) / fk[idex], (i + 0.5) / fk[idex] + self.default_boxes.append([cy, cx, h, w]) + + def to_tlbr(cy, cx, h, w): + return cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2 + + # For IoU calculation + self.default_boxes_tlbr = np.array(tuple(to_tlbr(*i) for i in self.default_boxes), dtype='float32') + self.default_boxes = np.array(self.default_boxes, dtype='float32') + +if 'use_anchor_generator' in config and config.use_anchor_generator: + generator = GridAnchorGenerator(config.img_shape, 4, 2, [1.0, 2.0, 0.5]) + default_boxes, default_boxes_tlbr = generator.generate_multi_levels(config.steps) +else: + default_boxes_tlbr = GeneratDefaultBoxes().default_boxes_tlbr + default_boxes = GeneratDefaultBoxes().default_boxes +y1, x1, y2, x2 = np.split(default_boxes_tlbr[:, :4], 4, axis=-1) +vol_anchors = (x2 - x1) * (y2 - y1) +matching_threshold = config.match_threshold + + +def ssd_bboxes_encode(boxes): + """ + Labels anchors with ground truth inputs. + + Args: + boxex: ground truth with shape [N, 5], for each row, it stores [y, x, h, w, cls]. + + Returns: + gt_loc: location ground truth with shape [num_anchors, 4]. + gt_label: class ground truth with shape [num_anchors, 1]. + num_matched_boxes: number of positives in an image. + """ + + def jaccard_with_anchors(bbox): + """Compute jaccard score a box and the anchors.""" + # Intersection bbox and volume. + ymin = np.maximum(y1, bbox[0]) + xmin = np.maximum(x1, bbox[1]) + ymax = np.minimum(y2, bbox[2]) + xmax = np.minimum(x2, bbox[3]) + w = np.maximum(xmax - xmin, 0.) + h = np.maximum(ymax - ymin, 0.) + + # Volumes. 
+ inter_vol = h * w + union_vol = vol_anchors + (bbox[2] - bbox[0]) * (bbox[3] - bbox[1]) - inter_vol + jaccard = inter_vol / union_vol + return np.squeeze(jaccard) + + pre_scores = np.zeros((config.num_ssd_boxes), dtype=np.float32) + t_boxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32) + t_label = np.zeros((config.num_ssd_boxes), dtype=np.int64) + for bbox in boxes: + label = int(bbox[4]) + scores = jaccard_with_anchors(bbox) + idx = np.argmax(scores) + scores[idx] = 2.0 + mask = (scores > matching_threshold) + mask = mask & (scores > pre_scores) + pre_scores = np.maximum(pre_scores, scores * mask) + t_label = mask * label + (1 - mask) * t_label + for i in range(4): + t_boxes[:, i] = mask * bbox[i] + (1 - mask) * t_boxes[:, i] + + index = np.nonzero(t_label) + + # Transform to tlbr. + bboxes = np.zeros((config.num_ssd_boxes, 4), dtype=np.float32) + bboxes[:, [0, 1]] = (t_boxes[:, [0, 1]] + t_boxes[:, [2, 3]]) / 2 + bboxes[:, [2, 3]] = t_boxes[:, [2, 3]] - t_boxes[:, [0, 1]] + + # Encode features. + bboxes_t = bboxes[index] + default_boxes_t = default_boxes[index] + bboxes_t[:, :2] = (bboxes_t[:, :2] - default_boxes_t[:, :2]) / (default_boxes_t[:, 2:] * config.prior_scaling[0]) + tmp = np.maximum(bboxes_t[:, 2:4] / default_boxes_t[:, 2:4], 0.000001) + bboxes_t[:, 2:4] = np.log(tmp) / config.prior_scaling[1] + bboxes[index] = bboxes_t + + num_match = np.array([len(np.nonzero(t_label)[0])], dtype=np.int32) + return bboxes, t_label.astype(np.int32), num_match + + +def ssd_bboxes_decode(boxes): + """Decode predict boxes to [y, x, h, w]""" + boxes_t = boxes.copy() + default_boxes_t = default_boxes.copy() + boxes_t[:, :2] = boxes_t[:, :2] * config.prior_scaling[0] * default_boxes_t[:, 2:] + default_boxes_t[:, :2] + boxes_t[:, 2:4] = np.exp(boxes_t[:, 2:4] * config.prior_scaling[1]) * default_boxes_t[:, 2:4] + + bboxes = np.zeros((len(boxes_t), 4), dtype=np.float32) + + bboxes[:, [0, 1]] = boxes_t[:, [0, 1]] - boxes_t[:, [2, 3]] / 2 + bboxes[:, [2, 3]] = boxes_t[:, [0, 1]] + boxes_t[:, [2, 3]] / 2 + + return np.clip(bboxes, 0, 1) + + +def intersect(box_a, box_b): + """Compute the intersect of two sets of boxes.""" + max_yx = np.minimum(box_a[:, 2:4], box_b[2:4]) + min_yx = np.maximum(box_a[:, :2], box_b[:2]) + inter = np.clip((max_yx - min_yx), a_min=0, a_max=np.inf) + return inter[:, 0] * inter[:, 1] + + +def jaccard_numpy(box_a, box_b): + """Compute the jaccard overlap of two sets of boxes.""" + inter = intersect(box_a, box_b) + area_a = ((box_a[:, 2] - box_a[:, 0]) * + (box_a[:, 3] - box_a[:, 1])) + area_b = ((box_b[2] - box_b[0]) * + (box_b[3] - box_b[1])) + union = area_a + area_b - inter + return inter / union diff --git a/research/cv/ssd_resnet50/src/config.py b/research/cv/ssd_resnet50/src/config.py index ef4478bc7437d304ea09328a280388e445ae6db6..99a65757f1adccf7bfbb19f602d37de8958e4490 100644 --- a/research/cv/ssd_resnet50/src/config.py +++ b/research/cv/ssd_resnet50/src/config.py @@ -1,36 +1,36 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
-# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -"""Config parameters for SSD models.""" - -from .config_ssd_resnet50 import config as config_ssd_resnet50 - -using_model = "ssd_resnet50" - -config_map = { - "ssd_resnet50": config_ssd_resnet50 -} - - -print("...............using "+using_model+" model................") -config = config_map[using_model] - - -if config.num_ssd_boxes == -1: - num = 0 - h, w = config.img_shape - for i in range(len(config.steps)): - num += (h // config.steps[i]) * (w // config.steps[i]) * config.num_default[i] - config.num_ssd_boxes = num +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ + +"""Config parameters for SSD models.""" + +from .config_ssd_resnet50 import config as config_ssd_resnet50 + +using_model = "ssd_resnet50" + +config_map = { + "ssd_resnet50": config_ssd_resnet50 +} + + +print("...............using "+using_model+" model................") +config = config_map[using_model] + + +if config.num_ssd_boxes == -1: + num = 0 + h, w = config.img_shape + for i in range(len(config.steps)): + num += (h // config.steps[i]) * (w // config.steps[i]) * config.num_default[i] + config.num_ssd_boxes = num diff --git a/research/cv/ssd_resnet50/src/config_ssd_resnet50.py b/research/cv/ssd_resnet50/src/config_ssd_resnet50.py index 25b75990118d44c953bc29c74a50596d52f0e888..cc201e1e9f1e25efac3b8dd6c26f6c12b524f682 100644 --- a/research/cv/ssd_resnet50/src/config_ssd_resnet50.py +++ b/research/cv/ssd_resnet50/src/config_ssd_resnet50.py @@ -1,88 +1,88 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-#" ============================================================================ - -"""Config parameters for SSD models.""" - -from easydict import EasyDict as ed - -config = ed({ - "model": "ssd_resnet50", - "img_shape": [640, 640], - "num_ssd_boxes": -1, - "match_threshold": 0.5, - "nms_threshold": 0.6, - "min_score": 0.1, - "max_boxes": 100, - - # learning rate settings - "global_step": 0, - "lr_init": 0.01333, - "lr_end_rate": 0.0, - "warmup_epochs": 2, - "weight_decay": 4e-4, - "momentum": 0.9, - - # network - "num_default": [6, 6, 6, 6, 6], - "extras_in_channels": [256, 512, 1024, 256, 256], - "extras_out_channels": [512, 1024, 512, 256, 256], - # "extras_out_channels": [256, 256, 256, 256, 256], - "extras_strides": [1, 1, 2, 2, 2, 2], - "extras_ratio": [0.2, 0.2, 0.2, 0.25, 0.5, 0.25], - "feature_size": [80, 40, 20, 10, 5], - "min_scale": 0.2, - "max_scale": 0.95, - "aspect_ratios": [(2, 3), (2, 3), (2, 3), (2, 3), (2, 3), (2, 3)], - "steps": (8, 16, 32, 64, 128), - "prior_scaling": (0.1, 0.2), - "gamma": 2.0, - "alpha": 0.25, - "num_addition_layers": 4, - "use_anchor_generator": True, - "use_global_norm": True, - "use_float16": True, - - # `mindrecord_dir` and `coco_root` are better to use absolute path. - "feature_extractor_base_param": "/ckpt/resnet50.ckpt", - "checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'], - "mindrecord_dir": "/data/MindRecord_COCO", - "coco_root": "/data/coco2017", - "train_data_type": "train2017", - "val_data_type": "val2017", - "instances_set": "annotations/instances_{}.json", - "classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', - 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', - 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', - 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', - 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', - 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', - 'kite', 'baseball bat', 'baseball glove', 'skateboard', - 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', - 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', - 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', - 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', - 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', - 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', - 'refrigerator', 'book', 'clock', 'vase', 'scissors', - 'teddy bear', 'hair drier', 'toothbrush'), - "num_classes": 81, - # The annotation.json position of voc validation dataset. - "voc_json": "annotations/voc_instances_val.json", - # voc original dataset. - "voc_root": "/data/voc_dataset", - # if coco or voc used, `image_dir` and `anno_path` are useless. - "image_dir": "", - "anno_path": "" -}) +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+#" ============================================================================ + +"""Config parameters for SSD models.""" + +from easydict import EasyDict as ed + +config = ed({ + "model": "ssd_resnet50", + "img_shape": [640, 640], + "num_ssd_boxes": -1, + "match_threshold": 0.5, + "nms_threshold": 0.6, + "min_score": 0.1, + "max_boxes": 100, + + # learning rate settings + "global_step": 0, + "lr_init": 0.01333, + "lr_end_rate": 0.0, + "warmup_epochs": 2, + "weight_decay": 4e-4, + "momentum": 0.9, + + # network + "num_default": [6, 6, 6, 6, 6], + "extras_in_channels": [256, 512, 1024, 256, 256], + "extras_out_channels": [512, 1024, 512, 256, 256], + # "extras_out_channels": [256, 256, 256, 256, 256], + "extras_strides": [1, 1, 2, 2, 2, 2], + "extras_ratio": [0.2, 0.2, 0.2, 0.25, 0.5, 0.25], + "feature_size": [80, 40, 20, 10, 5], + "min_scale": 0.2, + "max_scale": 0.95, + "aspect_ratios": [(2, 3), (2, 3), (2, 3), (2, 3), (2, 3), (2, 3)], + "steps": (8, 16, 32, 64, 128), + "prior_scaling": (0.1, 0.2), + "gamma": 2.0, + "alpha": 0.25, + "num_addition_layers": 4, + "use_anchor_generator": True, + "use_global_norm": True, + "use_float16": True, + + # `mindrecord_dir` and `coco_root` are better to use absolute path. + "feature_extractor_base_param": "/ckpt/resnet50.ckpt", + "checkpoint_filter_list": ['multi_loc_layers', 'multi_cls_layers'], + "mindrecord_dir": "/data/MindRecord_COCO", + "coco_root": "/data/coco2017", + "train_data_type": "train2017", + "val_data_type": "val2017", + "instances_set": "annotations/instances_{}.json", + "classes": ('background', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', + 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', + 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', + 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', + 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', + 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', + 'kite', 'baseball bat', 'baseball glove', 'skateboard', + 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', + 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', + 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', + 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', + 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', + 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', + 'refrigerator', 'book', 'clock', 'vase', 'scissors', + 'teddy bear', 'hair drier', 'toothbrush'), + "num_classes": 81, + # The annotation.json position of voc validation dataset. + "voc_json": "annotations/voc_instances_val.json", + # voc original dataset. + "voc_root": "/data/voc_dataset", + # if coco or voc used, `image_dir` and `anno_path` are useless. + "image_dir": "", + "anno_path": "" +}) diff --git a/research/cv/ssd_resnet50/src/dataset.py b/research/cv/ssd_resnet50/src/dataset.py index a102b4729b1c6d3d6513700467892337975ef0f4..1f1e13ce5c37c3d9472993d92b25c2f5c0f4b9e0 100644 --- a/research/cv/ssd_resnet50/src/dataset.py +++ b/research/cv/ssd_resnet50/src/dataset.py @@ -1,453 +1,453 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -"""SSD dataset""" - -from __future__ import division - -import os -import json -import xml.etree.ElementTree as et -import numpy as np -import cv2 - -import mindspore.dataset as de -import mindspore.dataset.vision.c_transforms as C -from mindspore.mindrecord import FileWriter -from .config import config -from .box_utils import jaccard_numpy, ssd_bboxes_encode - - -def _rand(a=0., b=1.): - """Generate random.""" - return np.random.rand() * (b - a) + a - - -def get_imageId_from_fileName(filename, id_iter): - """Get imageID from fileName if fileName is int, else return id_iter.""" - filename = os.path.splitext(filename)[0] - if filename.isdigit(): - return int(filename) - return id_iter - - -def random_sample_crop(image, boxes): - """Random Crop the image and boxes""" - height, width, _ = image.shape - min_iou = np.random.choice([None, 0.1, 0.3, 0.5, 0.7, 0.9]) - - if min_iou is None: - return image, boxes - - # max trails (50) - for _ in range(50): - image_t = image - - w = _rand(0.3, 1.0) * width - h = _rand(0.3, 1.0) * height - - # aspect ratio constraint b/t .5 & 2 - if h / w < 0.5 or h / w > 2: - continue - - left = _rand() * (width - w) - top = _rand() * (height - h) - - rect = np.array([int(top), int(left), int(top + h), int(left + w)]) - overlap = jaccard_numpy(boxes, rect) - - # dropout some boxes - drop_mask = overlap > 0 - if not drop_mask.any(): - continue - - if overlap[drop_mask].min() < min_iou and overlap[drop_mask].max() > (min_iou + 0.2): - continue - - image_t = image_t[rect[0]:rect[2], rect[1]:rect[3], :] - - centers = (boxes[:, :2] + boxes[:, 2:4]) / 2.0 - - m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1]) - m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1]) - - # mask in that both m1 and m2 are true - mask = m1 * m2 * drop_mask - - # have any valid boxes? 
try again if not - if not mask.any(): - continue - - # take only matching gt boxes - boxes_t = boxes[mask, :].copy() - - boxes_t[:, :2] = np.maximum(boxes_t[:, :2], rect[:2]) - boxes_t[:, :2] -= rect[:2] - boxes_t[:, 2:4] = np.minimum(boxes_t[:, 2:4], rect[2:4]) - boxes_t[:, 2:4] -= rect[:2] - - return image_t, boxes_t - return image, boxes - - -def preprocess_fn(img_id, image, box, is_training): - """Preprocess function for dataset.""" - cv2.setNumThreads(2) - - def _infer_data(image, input_shape): - img_h, img_w, _ = image.shape - input_h, input_w = input_shape - - image = cv2.resize(image, (input_w, input_h)) - - # When the channels of image is 1 - if len(image.shape) == 2: - image = np.expand_dims(image, axis=-1) - image = np.concatenate([image, image, image], axis=-1) - - return img_id, image, np.array((img_h, img_w), np.float32) - - def _data_aug(image, box, is_training, image_size=(300, 300)): - """Data augmentation function.""" - ih, iw, _ = image.shape - h, w = image_size - - if not is_training: - return _infer_data(image, image_size) - - # Random crop - box = box.astype(np.float32) - image, box = random_sample_crop(image, box) - ih, iw, _ = image.shape - - # Resize image - image = cv2.resize(image, (w, h)) - - # Flip image or not - flip = _rand() < .5 - if flip: - image = cv2.flip(image, 1, dst=None) - - # When the channels of image is 1 - if len(image.shape) == 2: - image = np.expand_dims(image, axis=-1) - image = np.concatenate([image, image, image], axis=-1) - - box[:, [0, 2]] = box[:, [0, 2]] / ih - box[:, [1, 3]] = box[:, [1, 3]] / iw - - if flip: - box[:, [1, 3]] = 1 - box[:, [3, 1]] - - box, label, num_match = ssd_bboxes_encode(box) - return image, box, label, num_match - - return _data_aug(image, box, is_training, image_size=config.img_shape) - - -def create_voc_label(is_training): - """Get image path and annotation from VOC.""" - voc_root = config.voc_root - cls_map = {name: i for i, name in enumerate(config.classes)} - sub_dir = 'train' if is_training else 'eval' - voc_dir = os.path.join(voc_root, sub_dir) - if not os.path.isdir(voc_dir): - raise ValueError(f'Cannot find {sub_dir} dataset path.') - - image_dir = anno_dir = voc_dir - if os.path.isdir(os.path.join(voc_dir, 'Images')): - image_dir = os.path.join(voc_dir, 'Images') - if os.path.isdir(os.path.join(voc_dir, 'Annotations')): - anno_dir = os.path.join(voc_dir, 'Annotations') - - if not is_training: - json_file = os.path.join(config.voc_root, config.voc_json) - file_dir = os.path.split(json_file)[0] - if not os.path.isdir(file_dir): - os.makedirs(file_dir) - json_dict = {"images": [], "type": "instances", "annotations": [], - "categories": []} - bnd_id = 1 - - image_files_dict = {} - image_anno_dict = {} - images = [] - id_iter = 0 - for anno_file in os.listdir(anno_dir): - print(anno_file) - if not anno_file.endswith('xml'): - continue - tree = et.parse(os.path.join(anno_dir, anno_file)) - root_node = tree.getroot() - file_name = root_node.find('filename').text - img_id = get_imageId_from_fileName(file_name, id_iter) - id_iter += 1 - image_path = os.path.join(image_dir, file_name) - print(image_path) - if not os.path.isfile(image_path): - print(f'Cannot find image {file_name} according to annotations.') - continue - - labels = [] - for obj in root_node.iter('object'): - cls_name = obj.find('name').text - if cls_name not in cls_map: - print(f'Label "{cls_name}" not in "{config.classes}"') - continue - bnd_box = obj.find('bndbox') - x_min = int(float(bnd_box.find('xmin').text)) - 1 - y_min = 
int(float(bnd_box.find('ymin').text)) - 1 - x_max = int(float(bnd_box.find('xmax').text)) - 1 - y_max = int(float(bnd_box.find('ymax').text)) - 1 - labels.append([y_min, x_min, y_max, x_max, cls_map[cls_name]]) - - if not is_training: - o_width = abs(x_max - x_min) - o_height = abs(y_max - y_min) - ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': \ - img_id, 'bbox': [x_min, y_min, o_width, o_height], \ - 'category_id': cls_map[cls_name], 'id': bnd_id, \ - 'ignore': 0, \ - 'segmentation': []} - json_dict['annotations'].append(ann) - bnd_id = bnd_id + 1 - - if labels: - images.append(img_id) - image_files_dict[img_id] = image_path - image_anno_dict[img_id] = np.array(labels) - - if not is_training: - size = root_node.find("size") - width = int(size.find('width').text) - height = int(size.find('height').text) - image = {'file_name': file_name, 'height': height, 'width': width, - 'id': img_id} - json_dict['images'].append(image) - - if not is_training: - for cls_name, cid in cls_map.items(): - cat = {'supercategory': 'none', 'id': cid, 'name': cls_name} - json_dict['categories'].append(cat) - json_fp = open(json_file, 'w') - json_str = json.dumps(json_dict) - json_fp.write(json_str) - json_fp.close() - - return images, image_files_dict, image_anno_dict - - -def create_coco_label(is_training): - """Get image path and annotation from COCO.""" - from pycocotools.coco import COCO - - coco_root = config.coco_root - data_type = config.val_data_type - if is_training: - data_type = config.train_data_type - - # Classes need to train or test. - train_cls = config.classes - train_cls_dict = {} - for i, cls in enumerate(train_cls): - train_cls_dict[cls] = i - - anno_json = os.path.join(coco_root, config.instances_set.format(data_type)) - - coco = COCO(anno_json) - classs_dict = {} - cat_ids = coco.loadCats(coco.getCatIds()) - for cat in cat_ids: - classs_dict[cat["id"]] = cat["name"] - - image_ids = coco.getImgIds() - images = [] - image_path_dict = {} - image_anno_dict = {} - - for img_id in image_ids: - image_info = coco.loadImgs(img_id) - file_name = image_info[0]["file_name"] - anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None) - anno = coco.loadAnns(anno_ids) - image_path = os.path.join(coco_root, data_type, file_name) - annos = [] - iscrowd = False - for label in anno: - bbox = label["bbox"] - class_name = classs_dict[label["category_id"]] - iscrowd = iscrowd or label["iscrowd"] - if class_name in train_cls: - x_min, x_max = bbox[0], bbox[0] + bbox[2] - y_min, y_max = bbox[1], bbox[1] + bbox[3] - annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]]) - - if not is_training and iscrowd: - continue - if len(annos) >= 1: - images.append(img_id) - image_path_dict[img_id] = image_path - image_anno_dict[img_id] = np.array(annos) - - return images, image_path_dict, image_anno_dict - - -def anno_parser(annos_str): - """Parse annotation from string to list.""" - annos = [] - for anno_str in annos_str: - anno = list(map(int, anno_str.strip().split(','))) - annos.append(anno) - return annos - - -def filter_valid_data(image_dir, anno_path): - """Filter valid image file, which both in image_dir and anno_path.""" - images = [] - image_path_dict = {} - image_anno_dict = {} - if not os.path.isdir(image_dir): - raise RuntimeError("Path given is not valid.") - if not os.path.isfile(anno_path): - raise RuntimeError("Annotation file is not valid.") - - with open(anno_path, "rb") as f: - lines = f.readlines() - for img_id, line in enumerate(lines): - line_str = 
line.decode("utf-8").strip() - line_split = str(line_str).split(' ') - file_name = line_split[0] - image_path = os.path.join(image_dir, file_name) - if os.path.isfile(image_path): - images.append(img_id) - image_path_dict[img_id] = image_path - image_anno_dict[img_id] = anno_parser(line_split[1:]) - - return images, image_path_dict, image_anno_dict - - -def voc_data_to_mindrecord(mindrecord_dir, is_training, prefix="ssd.mindrecord", file_num=8): - """Create MindRecord file by image_dir and anno_path.""" - mindrecord_path = os.path.join(mindrecord_dir, prefix) - writer = FileWriter(mindrecord_path, file_num) - images, image_path_dict, image_anno_dict = create_voc_label(is_training) - - ssd_json = { - "img_id": {"type": "int32", "shape": [1]}, - "image": {"type": "bytes"}, - "annotation": {"type": "int32", "shape": [-1, 5]}, - } - writer.add_schema(ssd_json, "ssd_json") - - for img_id in images: - image_path = image_path_dict[img_id] - with open(image_path, 'rb') as f: - img = f.read() - annos = np.array(image_anno_dict[img_id], dtype=np.int32) - img_id = np.array([img_id], dtype=np.int32) - row = {"img_id": img_id, "image": img, "annotation": annos} - writer.write_raw_data([row]) - writer.commit() - - -def data_to_mindrecord_byte_image(dataset="coco", is_training=True, prefix="ssd.mindrecord", file_num=8): - """Create MindRecord file.""" - mindrecord_dir = config.mindrecord_dir - mindrecord_path = os.path.join(mindrecord_dir, prefix) - writer = FileWriter(mindrecord_path, file_num) - if dataset == "coco": - images, image_path_dict, image_anno_dict = create_coco_label(is_training) - else: - images, image_path_dict, image_anno_dict = filter_valid_data(config.image_dir, config.anno_path) - - ssd_json = { - "img_id": {"type": "int32", "shape": [1]}, - "image": {"type": "bytes"}, - "annotation": {"type": "int32", "shape": [-1, 5]}, - } - writer.add_schema(ssd_json, "ssd_json") - - for img_id in images: - image_path = image_path_dict[img_id] - with open(image_path, 'rb') as f: - img = f.read() - annos = np.array(image_anno_dict[img_id], dtype=np.int32) - img_id = np.array([img_id], dtype=np.int32) - row = {"img_id": img_id, "image": img, "annotation": annos} - writer.write_raw_data([row]) - writer.commit() - - -def create_ssd_dataset(mindrecord_file, batch_size=32, repeat_num=10, device_num=1, rank=0, - is_training=True, num_parallel_workers=4, use_multiprocessing=True): - """Create SSD dataset with MindDataset.""" - ds = de.MindDataset(mindrecord_file, columns_list=["img_id", "image", "annotation"], num_shards=device_num, - shard_id=rank, num_parallel_workers=num_parallel_workers, shuffle=is_training) - decode = C.Decode() - ds = ds.map(operations=decode, input_columns=["image"]) - change_swap_op = C.HWC2CHW() - normalize_op = C.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255], - std=[0.229 * 255, 0.224 * 255, 0.225 * 255]) - color_adjust_op = C.RandomColorAdjust(brightness=0.4, contrast=0.4, saturation=0.4) - compose_map_func = (lambda img_id, image, annotation: preprocess_fn(img_id, image, annotation, is_training)) - if is_training: - output_columns = ["image", "box", "label", "num_match"] - trans = [color_adjust_op, normalize_op, change_swap_op] - else: - output_columns = ["img_id", "image", "image_shape"] - trans = [normalize_op, change_swap_op] - ds = ds.map(operations=compose_map_func, input_columns=["img_id", "image", "annotation"], - output_columns=output_columns, column_order=output_columns, - python_multiprocessing=use_multiprocessing, - num_parallel_workers=num_parallel_workers) 
- ds = ds.map(operations=trans, input_columns=["image"], python_multiprocessing=use_multiprocessing, - num_parallel_workers=num_parallel_workers) - ds = ds.batch(batch_size, drop_remainder=True) - ds = ds.repeat(repeat_num) - return ds - - -def create_mindrecord(dataset="coco", prefix="ssd.mindrecord", is_training=True): - """ It will generate mindrecord file in config.mindrecord_dir, - and the file name is ssd.mindrecord0, 1, ... file_num. - """ - print("Start create dataset!") - mindrecord_dir = config.mindrecord_dir - mindrecord_file = os.path.join(mindrecord_dir, prefix + "0") - if not os.path.exists(mindrecord_file): - if not os.path.isdir(mindrecord_dir): - os.makedirs(mindrecord_dir) - if dataset == "coco": - if os.path.isdir(config.coco_root): - print("Create Mindrecord.") - data_to_mindrecord_byte_image("coco", is_training, prefix) - print("Create Mindrecord Done, at {}".format(mindrecord_dir)) - else: - print("coco_root not exits.") - elif dataset == "voc": - if os.path.isdir(config.voc_root): - print("Create Mindrecord.") - voc_data_to_mindrecord(mindrecord_dir, is_training, prefix) - print("Create Mindrecord Done, at {}".format(mindrecord_dir)) - else: - print("voc_root not exits.") - else: - if os.path.isdir(config.image_dir) and os.path.exists(config.anno_path): - print("Create Mindrecord.") - data_to_mindrecord_byte_image("other", is_training, prefix) - print("Create Mindrecord Done, at {}".format(mindrecord_dir)) - else: - print("image_dir or anno_path not exits.") - return mindrecord_file +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +"""SSD dataset""" + +from __future__ import division + +import os +import json +import xml.etree.ElementTree as et +import numpy as np +import cv2 + +import mindspore.dataset as de +import mindspore.dataset.vision.c_transforms as C +from mindspore.mindrecord import FileWriter +from .config import config +from .box_utils import jaccard_numpy, ssd_bboxes_encode + + +def _rand(a=0., b=1.): + """Generate random.""" + return np.random.rand() * (b - a) + a + + +def get_imageId_from_fileName(filename, id_iter): + """Get imageID from fileName if fileName is int, else return id_iter.""" + filename = os.path.splitext(filename)[0] + if filename.isdigit(): + return int(filename) + return id_iter + + +def random_sample_crop(image, boxes): + """Random Crop the image and boxes""" + height, width, _ = image.shape + min_iou = np.random.choice([None, 0.1, 0.3, 0.5, 0.7, 0.9]) + + if min_iou is None: + return image, boxes + + # max trails (50) + for _ in range(50): + image_t = image + + w = _rand(0.3, 1.0) * width + h = _rand(0.3, 1.0) * height + + # aspect ratio constraint b/t .5 & 2 + if h / w < 0.5 or h / w > 2: + continue + + left = _rand() * (width - w) + top = _rand() * (height - h) + + rect = np.array([int(top), int(left), int(top + h), int(left + w)]) + overlap = jaccard_numpy(boxes, rect) + + # dropout some boxes + drop_mask = overlap > 0 + if not drop_mask.any(): + continue + + if overlap[drop_mask].min() < min_iou and overlap[drop_mask].max() > (min_iou + 0.2): + continue + + image_t = image_t[rect[0]:rect[2], rect[1]:rect[3], :] + + centers = (boxes[:, :2] + boxes[:, 2:4]) / 2.0 + + m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1]) + m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1]) + + # mask in that both m1 and m2 are true + mask = m1 * m2 * drop_mask + + # have any valid boxes? 
try again if not + if not mask.any(): + continue + + # take only matching gt boxes + boxes_t = boxes[mask, :].copy() + + boxes_t[:, :2] = np.maximum(boxes_t[:, :2], rect[:2]) + boxes_t[:, :2] -= rect[:2] + boxes_t[:, 2:4] = np.minimum(boxes_t[:, 2:4], rect[2:4]) + boxes_t[:, 2:4] -= rect[:2] + + return image_t, boxes_t + return image, boxes + + +def preprocess_fn(img_id, image, box, is_training): + """Preprocess function for dataset.""" + cv2.setNumThreads(2) + + def _infer_data(image, input_shape): + img_h, img_w, _ = image.shape + input_h, input_w = input_shape + + image = cv2.resize(image, (input_w, input_h)) + + # When the channels of image is 1 + if len(image.shape) == 2: + image = np.expand_dims(image, axis=-1) + image = np.concatenate([image, image, image], axis=-1) + + return img_id, image, np.array((img_h, img_w), np.float32) + + def _data_aug(image, box, is_training, image_size=(300, 300)): + """Data augmentation function.""" + ih, iw, _ = image.shape + h, w = image_size + + if not is_training: + return _infer_data(image, image_size) + + # Random crop + box = box.astype(np.float32) + image, box = random_sample_crop(image, box) + ih, iw, _ = image.shape + + # Resize image + image = cv2.resize(image, (w, h)) + + # Flip image or not + flip = _rand() < .5 + if flip: + image = cv2.flip(image, 1, dst=None) + + # When the channels of image is 1 + if len(image.shape) == 2: + image = np.expand_dims(image, axis=-1) + image = np.concatenate([image, image, image], axis=-1) + + box[:, [0, 2]] = box[:, [0, 2]] / ih + box[:, [1, 3]] = box[:, [1, 3]] / iw + + if flip: + box[:, [1, 3]] = 1 - box[:, [3, 1]] + + box, label, num_match = ssd_bboxes_encode(box) + return image, box, label, num_match + + return _data_aug(image, box, is_training, image_size=config.img_shape) + + +def create_voc_label(is_training): + """Get image path and annotation from VOC.""" + voc_root = config.voc_root + cls_map = {name: i for i, name in enumerate(config.classes)} + sub_dir = 'train' if is_training else 'eval' + voc_dir = os.path.join(voc_root, sub_dir) + if not os.path.isdir(voc_dir): + raise ValueError(f'Cannot find {sub_dir} dataset path.') + + image_dir = anno_dir = voc_dir + if os.path.isdir(os.path.join(voc_dir, 'Images')): + image_dir = os.path.join(voc_dir, 'Images') + if os.path.isdir(os.path.join(voc_dir, 'Annotations')): + anno_dir = os.path.join(voc_dir, 'Annotations') + + if not is_training: + json_file = os.path.join(config.voc_root, config.voc_json) + file_dir = os.path.split(json_file)[0] + if not os.path.isdir(file_dir): + os.makedirs(file_dir) + json_dict = {"images": [], "type": "instances", "annotations": [], + "categories": []} + bnd_id = 1 + + image_files_dict = {} + image_anno_dict = {} + images = [] + id_iter = 0 + for anno_file in os.listdir(anno_dir): + print(anno_file) + if not anno_file.endswith('xml'): + continue + tree = et.parse(os.path.join(anno_dir, anno_file)) + root_node = tree.getroot() + file_name = root_node.find('filename').text + img_id = get_imageId_from_fileName(file_name, id_iter) + id_iter += 1 + image_path = os.path.join(image_dir, file_name) + print(image_path) + if not os.path.isfile(image_path): + print(f'Cannot find image {file_name} according to annotations.') + continue + + labels = [] + for obj in root_node.iter('object'): + cls_name = obj.find('name').text + if cls_name not in cls_map: + print(f'Label "{cls_name}" not in "{config.classes}"') + continue + bnd_box = obj.find('bndbox') + x_min = int(float(bnd_box.find('xmin').text)) - 1 + y_min = 
int(float(bnd_box.find('ymin').text)) - 1 + x_max = int(float(bnd_box.find('xmax').text)) - 1 + y_max = int(float(bnd_box.find('ymax').text)) - 1 + labels.append([y_min, x_min, y_max, x_max, cls_map[cls_name]]) + + if not is_training: + o_width = abs(x_max - x_min) + o_height = abs(y_max - y_min) + ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': \ + img_id, 'bbox': [x_min, y_min, o_width, o_height], \ + 'category_id': cls_map[cls_name], 'id': bnd_id, \ + 'ignore': 0, \ + 'segmentation': []} + json_dict['annotations'].append(ann) + bnd_id = bnd_id + 1 + + if labels: + images.append(img_id) + image_files_dict[img_id] = image_path + image_anno_dict[img_id] = np.array(labels) + + if not is_training: + size = root_node.find("size") + width = int(size.find('width').text) + height = int(size.find('height').text) + image = {'file_name': file_name, 'height': height, 'width': width, + 'id': img_id} + json_dict['images'].append(image) + + if not is_training: + for cls_name, cid in cls_map.items(): + cat = {'supercategory': 'none', 'id': cid, 'name': cls_name} + json_dict['categories'].append(cat) + json_fp = open(json_file, 'w') + json_str = json.dumps(json_dict) + json_fp.write(json_str) + json_fp.close() + + return images, image_files_dict, image_anno_dict + + +def create_coco_label(is_training): + """Get image path and annotation from COCO.""" + from pycocotools.coco import COCO + + coco_root = config.coco_root + data_type = config.val_data_type + if is_training: + data_type = config.train_data_type + + # Classes need to train or test. + train_cls = config.classes + train_cls_dict = {} + for i, cls in enumerate(train_cls): + train_cls_dict[cls] = i + + anno_json = os.path.join(coco_root, config.instances_set.format(data_type)) + + coco = COCO(anno_json) + classs_dict = {} + cat_ids = coco.loadCats(coco.getCatIds()) + for cat in cat_ids: + classs_dict[cat["id"]] = cat["name"] + + image_ids = coco.getImgIds() + images = [] + image_path_dict = {} + image_anno_dict = {} + + for img_id in image_ids: + image_info = coco.loadImgs(img_id) + file_name = image_info[0]["file_name"] + anno_ids = coco.getAnnIds(imgIds=img_id, iscrowd=None) + anno = coco.loadAnns(anno_ids) + image_path = os.path.join(coco_root, data_type, file_name) + annos = [] + iscrowd = False + for label in anno: + bbox = label["bbox"] + class_name = classs_dict[label["category_id"]] + iscrowd = iscrowd or label["iscrowd"] + if class_name in train_cls: + x_min, x_max = bbox[0], bbox[0] + bbox[2] + y_min, y_max = bbox[1], bbox[1] + bbox[3] + annos.append(list(map(round, [y_min, x_min, y_max, x_max])) + [train_cls_dict[class_name]]) + + if not is_training and iscrowd: + continue + if len(annos) >= 1: + images.append(img_id) + image_path_dict[img_id] = image_path + image_anno_dict[img_id] = np.array(annos) + + return images, image_path_dict, image_anno_dict + + +def anno_parser(annos_str): + """Parse annotation from string to list.""" + annos = [] + for anno_str in annos_str: + anno = list(map(int, anno_str.strip().split(','))) + annos.append(anno) + return annos + + +def filter_valid_data(image_dir, anno_path): + """Filter valid image file, which both in image_dir and anno_path.""" + images = [] + image_path_dict = {} + image_anno_dict = {} + if not os.path.isdir(image_dir): + raise RuntimeError("Path given is not valid.") + if not os.path.isfile(anno_path): + raise RuntimeError("Annotation file is not valid.") + + with open(anno_path, "rb") as f: + lines = f.readlines() + for img_id, line in enumerate(lines): + line_str = 
line.decode("utf-8").strip() + line_split = str(line_str).split(' ') + file_name = line_split[0] + image_path = os.path.join(image_dir, file_name) + if os.path.isfile(image_path): + images.append(img_id) + image_path_dict[img_id] = image_path + image_anno_dict[img_id] = anno_parser(line_split[1:]) + + return images, image_path_dict, image_anno_dict + + +def voc_data_to_mindrecord(mindrecord_dir, is_training, prefix="ssd.mindrecord", file_num=8): + """Create MindRecord file by image_dir and anno_path.""" + mindrecord_path = os.path.join(mindrecord_dir, prefix) + writer = FileWriter(mindrecord_path, file_num) + images, image_path_dict, image_anno_dict = create_voc_label(is_training) + + ssd_json = { + "img_id": {"type": "int32", "shape": [1]}, + "image": {"type": "bytes"}, + "annotation": {"type": "int32", "shape": [-1, 5]}, + } + writer.add_schema(ssd_json, "ssd_json") + + for img_id in images: + image_path = image_path_dict[img_id] + with open(image_path, 'rb') as f: + img = f.read() + annos = np.array(image_anno_dict[img_id], dtype=np.int32) + img_id = np.array([img_id], dtype=np.int32) + row = {"img_id": img_id, "image": img, "annotation": annos} + writer.write_raw_data([row]) + writer.commit() + + +def data_to_mindrecord_byte_image(dataset="coco", is_training=True, prefix="ssd.mindrecord", file_num=8): + """Create MindRecord file.""" + mindrecord_dir = config.mindrecord_dir + mindrecord_path = os.path.join(mindrecord_dir, prefix) + writer = FileWriter(mindrecord_path, file_num) + if dataset == "coco": + images, image_path_dict, image_anno_dict = create_coco_label(is_training) + else: + images, image_path_dict, image_anno_dict = filter_valid_data(config.image_dir, config.anno_path) + + ssd_json = { + "img_id": {"type": "int32", "shape": [1]}, + "image": {"type": "bytes"}, + "annotation": {"type": "int32", "shape": [-1, 5]}, + } + writer.add_schema(ssd_json, "ssd_json") + + for img_id in images: + image_path = image_path_dict[img_id] + with open(image_path, 'rb') as f: + img = f.read() + annos = np.array(image_anno_dict[img_id], dtype=np.int32) + img_id = np.array([img_id], dtype=np.int32) + row = {"img_id": img_id, "image": img, "annotation": annos} + writer.write_raw_data([row]) + writer.commit() + + +def create_ssd_dataset(mindrecord_file, batch_size=32, repeat_num=10, device_num=1, rank=0, + is_training=True, num_parallel_workers=4, use_multiprocessing=True): + """Create SSD dataset with MindDataset.""" + ds = de.MindDataset(mindrecord_file, columns_list=["img_id", "image", "annotation"], num_shards=device_num, + shard_id=rank, num_parallel_workers=num_parallel_workers, shuffle=is_training) + decode = C.Decode() + ds = ds.map(operations=decode, input_columns=["image"]) + change_swap_op = C.HWC2CHW() + normalize_op = C.Normalize(mean=[0.485 * 255, 0.456 * 255, 0.406 * 255], + std=[0.229 * 255, 0.224 * 255, 0.225 * 255]) + color_adjust_op = C.RandomColorAdjust(brightness=0.4, contrast=0.4, saturation=0.4) + compose_map_func = (lambda img_id, image, annotation: preprocess_fn(img_id, image, annotation, is_training)) + if is_training: + output_columns = ["image", "box", "label", "num_match"] + trans = [color_adjust_op, normalize_op, change_swap_op] + else: + output_columns = ["img_id", "image", "image_shape"] + trans = [normalize_op, change_swap_op] + ds = ds.map(operations=compose_map_func, input_columns=["img_id", "image", "annotation"], + output_columns=output_columns, column_order=output_columns, + python_multiprocessing=use_multiprocessing, + num_parallel_workers=num_parallel_workers) 
+ ds = ds.map(operations=trans, input_columns=["image"], python_multiprocessing=use_multiprocessing, + num_parallel_workers=num_parallel_workers) + ds = ds.batch(batch_size, drop_remainder=True) + ds = ds.repeat(repeat_num) + return ds + + +def create_mindrecord(dataset="coco", prefix="ssd.mindrecord", is_training=True): + """ It will generate mindrecord file in config.mindrecord_dir, + and the file name is ssd.mindrecord0, 1, ... file_num. + """ + print("Start create dataset!") + mindrecord_dir = config.mindrecord_dir + mindrecord_file = os.path.join(mindrecord_dir, prefix + "0") + if not os.path.exists(mindrecord_file): + if not os.path.isdir(mindrecord_dir): + os.makedirs(mindrecord_dir) + if dataset == "coco": + if os.path.isdir(config.coco_root): + print("Create Mindrecord.") + data_to_mindrecord_byte_image("coco", is_training, prefix) + print("Create Mindrecord Done, at {}".format(mindrecord_dir)) + else: + print("coco_root not exits.") + elif dataset == "voc": + if os.path.isdir(config.voc_root): + print("Create Mindrecord.") + voc_data_to_mindrecord(mindrecord_dir, is_training, prefix) + print("Create Mindrecord Done, at {}".format(mindrecord_dir)) + else: + print("voc_root not exits.") + else: + if os.path.isdir(config.image_dir) and os.path.exists(config.anno_path): + print("Create Mindrecord.") + data_to_mindrecord_byte_image("other", is_training, prefix) + print("Create Mindrecord Done, at {}".format(mindrecord_dir)) + else: + print("image_dir or anno_path not exits.") + return mindrecord_file diff --git a/research/cv/ssd_resnet50/src/eval_utils.py b/research/cv/ssd_resnet50/src/eval_utils.py index fd2590ebdbdfa562be1370add55301c7d096d9de..9212a76f13aa3e64ba8aecec92804497df13606e 100644 --- a/research/cv/ssd_resnet50/src/eval_utils.py +++ b/research/cv/ssd_resnet50/src/eval_utils.py @@ -1,119 +1,119 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# ============================================================================ -"""Coco metrics utils""" - -import json -import numpy as np -from .config import config - - -def apply_nms(all_boxes, all_scores, thres, max_boxes): - """Apply NMS to bboxes.""" - y1 = all_boxes[:, 0] - x1 = all_boxes[:, 1] - y2 = all_boxes[:, 2] - x2 = all_boxes[:, 3] - areas = (x2 - x1 + 1) * (y2 - y1 + 1) - - order = all_scores.argsort()[::-1] - keep = [] - - while order.size > 0: - i = order[0] - keep.append(i) - - if len(keep) >= max_boxes: - break - - xx1 = np.maximum(x1[i], x1[order[1:]]) - yy1 = np.maximum(y1[i], y1[order[1:]]) - xx2 = np.minimum(x2[i], x2[order[1:]]) - yy2 = np.minimum(y2[i], y2[order[1:]]) - - w = np.maximum(0.0, xx2 - xx1 + 1) - h = np.maximum(0.0, yy2 - yy1 + 1) - inter = w * h - - ovr = inter / (areas[i] + areas[order[1:]] - inter) - - inds = np.where(ovr <= thres)[0] - - order = order[inds + 1] - return keep - - -def metrics(pred_data, anno_json): - """Calculate mAP of predicted bboxes.""" - from pycocotools.coco import COCO - from pycocotools.cocoeval import COCOeval - num_classes = config.num_classes - - #Classes need to train or test. - val_cls = config.classes - val_cls_dict = {} - for i, cls in enumerate(val_cls): - val_cls_dict[i] = cls - coco_gt = COCO(anno_json) - classs_dict = {} - cat_ids = coco_gt.loadCats(coco_gt.getCatIds()) - for cat in cat_ids: - classs_dict[cat["name"]] = cat["id"] - - predictions = [] - img_ids = [] - - for sample in pred_data: - pred_boxes = sample['boxes'] - box_scores = sample['box_scores'] - img_id = sample['img_id'] - h, w = sample['image_shape'] - - final_boxes = [] - final_label = [] - final_score = [] - img_ids.append(img_id) - - for c in range(1, num_classes): - class_box_scores = box_scores[:, c] - score_mask = class_box_scores > config.min_score - class_box_scores = class_box_scores[score_mask] - class_boxes = pred_boxes[score_mask] * [h, w, h, w] - - if score_mask.any(): - nms_index = apply_nms(class_boxes, class_box_scores, config.nms_threshold, config.max_boxes) - class_boxes = class_boxes[nms_index] - class_box_scores = class_box_scores[nms_index] - - final_boxes += class_boxes.tolist() - final_score += class_box_scores.tolist() - final_label += [classs_dict[val_cls_dict[c]]] * len(class_box_scores) - - for loc, label, score in zip(final_boxes, final_label, final_score): - res = {} - res['image_id'] = img_id - res['bbox'] = [loc[1], loc[0], loc[3] - loc[1], loc[2] - loc[0]] - res['score'] = score - res['category_id'] = label - predictions.append(res) - with open('predictions.json', 'w') as f: - json.dump(predictions, f) - - coco_dt = coco_gt.loadRes('predictions.json') - E = COCOeval(coco_gt, coco_dt, iouType='bbox') - E.params.imgIds = img_ids - E.evaluate() - E.accumulate() - E.summarize() - return E.stats[0] +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""Coco metrics utils""" + +import json +import numpy as np +from .config import config + + +def apply_nms(all_boxes, all_scores, thres, max_boxes): + """Apply NMS to bboxes.""" + y1 = all_boxes[:, 0] + x1 = all_boxes[:, 1] + y2 = all_boxes[:, 2] + x2 = all_boxes[:, 3] + areas = (x2 - x1 + 1) * (y2 - y1 + 1) + + order = all_scores.argsort()[::-1] + keep = [] + + while order.size > 0: + i = order[0] + keep.append(i) + + if len(keep) >= max_boxes: + break + + xx1 = np.maximum(x1[i], x1[order[1:]]) + yy1 = np.maximum(y1[i], y1[order[1:]]) + xx2 = np.minimum(x2[i], x2[order[1:]]) + yy2 = np.minimum(y2[i], y2[order[1:]]) + + w = np.maximum(0.0, xx2 - xx1 + 1) + h = np.maximum(0.0, yy2 - yy1 + 1) + inter = w * h + + ovr = inter / (areas[i] + areas[order[1:]] - inter) + + inds = np.where(ovr <= thres)[0] + + order = order[inds + 1] + return keep + + +def metrics(pred_data, anno_json): + """Calculate mAP of predicted bboxes.""" + from pycocotools.coco import COCO + from pycocotools.cocoeval import COCOeval + num_classes = config.num_classes + + #Classes need to train or test. + val_cls = config.classes + val_cls_dict = {} + for i, cls in enumerate(val_cls): + val_cls_dict[i] = cls + coco_gt = COCO(anno_json) + classs_dict = {} + cat_ids = coco_gt.loadCats(coco_gt.getCatIds()) + for cat in cat_ids: + classs_dict[cat["name"]] = cat["id"] + + predictions = [] + img_ids = [] + + for sample in pred_data: + pred_boxes = sample['boxes'] + box_scores = sample['box_scores'] + img_id = sample['img_id'] + h, w = sample['image_shape'] + + final_boxes = [] + final_label = [] + final_score = [] + img_ids.append(img_id) + + for c in range(1, num_classes): + class_box_scores = box_scores[:, c] + score_mask = class_box_scores > config.min_score + class_box_scores = class_box_scores[score_mask] + class_boxes = pred_boxes[score_mask] * [h, w, h, w] + + if score_mask.any(): + nms_index = apply_nms(class_boxes, class_box_scores, config.nms_threshold, config.max_boxes) + class_boxes = class_boxes[nms_index] + class_box_scores = class_box_scores[nms_index] + + final_boxes += class_boxes.tolist() + final_score += class_box_scores.tolist() + final_label += [classs_dict[val_cls_dict[c]]] * len(class_box_scores) + + for loc, label, score in zip(final_boxes, final_label, final_score): + res = {} + res['image_id'] = img_id + res['bbox'] = [loc[1], loc[0], loc[3] - loc[1], loc[2] - loc[0]] + res['score'] = score + res['category_id'] = label + predictions.append(res) + with open('predictions.json', 'w') as f: + json.dump(predictions, f) + + coco_dt = coco_gt.loadRes('predictions.json') + E = COCOeval(coco_gt, coco_dt, iouType='bbox') + E.params.imgIds = img_ids + E.evaluate() + E.accumulate() + E.summarize() + return E.stats[0] diff --git a/research/cv/ssd_resnet50/src/init_params.py b/research/cv/ssd_resnet50/src/init_params.py index 64833e798657d6c11a49f80f62be9f78646174bc..ef34ff694530bd1d9c5aece8b50bc5ef42c66ba2 100644 --- a/research/cv/ssd_resnet50/src/init_params.py +++ b/research/cv/ssd_resnet50/src/init_params.py @@ -1,50 +1,50 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ -"""Parameters utils""" - -from mindspore.common.initializer import initializer, TruncatedNormal - -def init_net_param(network, initialize_mode='TruncatedNormal'): - """Init the parameters in net.""" - params = network.trainable_params() - for p in params: - if 'beta' not in p.name and 'gamma' not in p.name and 'bias' not in p.name: - if initialize_mode == 'TruncatedNormal': - p.set_data(initializer(TruncatedNormal(0.02), p.data.shape, p.data.dtype)) - else: - p.set_data(initialize_mode, p.data.shape, p.data.dtype) - - -def load_backbone_params(network, param_dict): - """Init the parameters from pre-train model, default is mobilenetv2.""" - for _, param in network.parameters_and_names(): - param_name = param.name.replace('network.backbone.', '') - name_split = param_name.split('.') - if 'features_1' in param_name: - param_name = param_name.replace('features_1', 'features') - if 'features_2' in param_name: - param_name = '.'.join(['features', str(int(name_split[1]) + 14)] + name_split[2:]) - if param_name in param_dict: - param.set_data(param_dict[param_name].data) - - -def filter_checkpoint_parameter_by_list(param_dict, filter_list): - """remove useless parameters according to filter_list""" - for key in list(param_dict.keys()): - for name in filter_list: - if name in key: - print("Delete parameter from checkpoint: ", key) - del param_dict[key] - break +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""Parameters utils""" + +from mindspore.common.initializer import initializer, TruncatedNormal + +def init_net_param(network, initialize_mode='TruncatedNormal'): + """Init the parameters in net.""" + params = network.trainable_params() + for p in params: + if 'beta' not in p.name and 'gamma' not in p.name and 'bias' not in p.name: + if initialize_mode == 'TruncatedNormal': + p.set_data(initializer(TruncatedNormal(0.02), p.data.shape, p.data.dtype)) + else: + p.set_data(initialize_mode, p.data.shape, p.data.dtype) + + +def load_backbone_params(network, param_dict): + """Init the parameters from pre-train model, default is mobilenetv2.""" + for _, param in network.parameters_and_names(): + param_name = param.name.replace('network.backbone.', '') + name_split = param_name.split('.') + if 'features_1' in param_name: + param_name = param_name.replace('features_1', 'features') + if 'features_2' in param_name: + param_name = '.'.join(['features', str(int(name_split[1]) + 14)] + name_split[2:]) + if param_name in param_dict: + param.set_data(param_dict[param_name].data) + + +def filter_checkpoint_parameter_by_list(param_dict, filter_list): + """remove useless parameters according to filter_list""" + for key in list(param_dict.keys()): + for name in filter_list: + if name in key: + print("Delete parameter from checkpoint: ", key) + del param_dict[key] + break diff --git a/research/cv/ssd_resnet50/src/lr_schedule.py b/research/cv/ssd_resnet50/src/lr_schedule.py index 893ccbe2300076d03dd78b04896c4bf4b6e66a42..9bc918e48d5833ee7fc66517021297a739eea6e9 100644 --- a/research/cv/ssd_resnet50/src/lr_schedule.py +++ b/research/cv/ssd_resnet50/src/lr_schedule.py @@ -1,55 +1,55 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ -"""Learning rate schedule""" - -import math -import numpy as np - - -def get_lr(global_step, lr_init, lr_end, lr_max, warmup_epochs, total_epochs, steps_per_epoch): - """ - generate learning rate array - - Args: - global_step(int): total steps of the training - lr_init(float): init learning rate - lr_end(float): end learning rate - lr_max(float): max learning rate - warmup_epochs(float): number of warmup epochs - total_epochs(int): total epoch of training - steps_per_epoch(int): steps of one epoch - - Returns: - np.array, learning rate array - """ - lr_each_step = [] - total_steps = steps_per_epoch * total_epochs - warmup_steps = steps_per_epoch * warmup_epochs - for i in range(total_steps): - if i < warmup_steps: - lr = lr_init + (lr_max - lr_init) * i / warmup_steps - else: - lr = lr_end + \ - (lr_max - lr_end) * \ - (1. + math.cos(math.pi * (i - warmup_steps) / (total_steps - warmup_steps))) / 2. 
- if lr < 0.0: - lr = 0.0 - lr_each_step.append(lr) - - current_step = global_step - lr_each_step = np.array(lr_each_step).astype(np.float32) - learning_rate = lr_each_step[current_step:] - - return learning_rate +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +"""Learning rate schedule""" + +import math +import numpy as np + + +def get_lr(global_step, lr_init, lr_end, lr_max, warmup_epochs, total_epochs, steps_per_epoch): + """ + generate learning rate array + + Args: + global_step(int): total steps of the training + lr_init(float): init learning rate + lr_end(float): end learning rate + lr_max(float): max learning rate + warmup_epochs(float): number of warmup epochs + total_epochs(int): total epoch of training + steps_per_epoch(int): steps of one epoch + + Returns: + np.array, learning rate array + """ + lr_each_step = [] + total_steps = steps_per_epoch * total_epochs + warmup_steps = steps_per_epoch * warmup_epochs + for i in range(total_steps): + if i < warmup_steps: + lr = lr_init + (lr_max - lr_init) * i / warmup_steps + else: + lr = lr_end + \ + (lr_max - lr_end) * \ + (1. + math.cos(math.pi * (i - warmup_steps) / (total_steps - warmup_steps))) / 2. + if lr < 0.0: + lr = 0.0 + lr_each_step.append(lr) + + current_step = global_step + lr_each_step = np.array(lr_each_step).astype(np.float32) + learning_rate = lr_each_step[current_step:] + + return learning_rate diff --git a/research/cv/ssd_resnet50/src/resnet.py b/research/cv/ssd_resnet50/src/resnet.py index 75d314ff14187e68874a5ec0309fed17ba867e94..92c83e25c97a2e31ec47364ad14e4293e5bfe6dd 100644 --- a/research/cv/ssd_resnet50/src/resnet.py +++ b/research/cv/ssd_resnet50/src/resnet.py @@ -1,222 +1,222 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# ============================================================================ -"""ResNet.""" -import mindspore.nn as nn -from mindspore.ops import operations as P - - -def _conv3x3(in_channel, out_channel, stride=1): - return nn.Conv2d(in_channel, out_channel, - kernel_size=3, stride=stride, padding=0, pad_mode='same') - - -def _conv1x1(in_channel, out_channel, stride=1): - return nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=stride, padding=0, pad_mode='same') - - -def _conv7x7(in_channel, out_channel, stride=1): - return nn.Conv2d(in_channel, out_channel, kernel_size=7, stride=stride, padding=0, pad_mode='same') - - -def _bn(channel): - return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997, - gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1) - - -def _bn_last(channel): - return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997, - gamma_init=0, beta_init=0, moving_mean_init=0, moving_var_init=1) - -class ResidualBlock(nn.Cell): - """ - ResNet V1 residual block definition. - - Args: - in_channel (int): Input channel. - out_channel (int): Output channel. - stride (int): Stride size for the first convolutional layer. Default: 1. - - Returns: - Tensor, output tensor. - - Examples: - >>> ResidualBlock(3, 256, stride=2) - """ - expansion = 4 - - def __init__(self, - in_channel, - out_channel, - stride=1): - super(ResidualBlock, self).__init__() - self.stride = stride - channel = out_channel // self.expansion - self.conv1 = _conv1x1(in_channel, channel, stride=1) - self.bn1 = _bn(channel) - self.conv2 = _conv3x3(channel, channel, stride=stride) - self.bn2 = _bn(channel) - - self.conv3 = _conv1x1(channel, out_channel, stride=1) - self.bn3 = _bn_last(out_channel) - self.relu = nn.ReLU() - - self.down_sample = False - - if stride != 1 or in_channel != out_channel: - self.down_sample = True - self.down_sample_layer = None - - if self.down_sample: - self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride), _bn(out_channel)]) - self.add = P.Add() - - def construct(self, x): - """ - Forward - """ - identity = x - - out = self.conv1(x) - out = self.bn1(out) - out = self.relu(out) - out = self.conv2(out) - out = self.bn2(out) - out = self.relu(out) - out = self.conv3(out) - out = self.bn3(out) - - if self.down_sample: - identity = self.down_sample_layer(identity) - - out = self.add(out, identity) - out = self.relu(out) - - return out - - -class ResNet(nn.Cell): - """ - ResNet architecture. - - Args: - block (Cell): Block for network. - layer_nums (list): Numbers of block in different layers. - in_channels (list): Input channel in each layer. - out_channels (list): Output channel in each layer. - strides (list): Stride size in each layer. - num_classes (int): The number of classes that the training images are belonging to. - Returns: - Tensor, output tensor. 
- - Examples: - >>> ResNet(ResidualBlock, - >>> [3, 4, 6, 3], - >>> [64, 256, 512, 1024], - >>> [256, 512, 1024, 2048], - >>> [1, 2, 2, 2], - >>> 10) - """ - - def __init__(self, - block, - layer_nums, - in_channels, - out_channels, - strides): - super(ResNet, self).__init__() - - if not len(layer_nums) == len(in_channels) == len(out_channels) == 4: - raise ValueError("the length of layer_num, in_channels, out_channels list must be 4!") - self.conv1 = _conv7x7(3, 64, stride=2) - self.bn1 = _bn(64) - self.relu = P.ReLU() - self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same") - self.layer1 = self._make_layer(block, - layer_nums[0], - in_channel=in_channels[0], - out_channel=out_channels[0], - stride=strides[0]) - self.layer2 = self._make_layer(block, - layer_nums[1], - in_channel=in_channels[1], - out_channel=out_channels[1], - stride=strides[1]) - self.layer3 = self._make_layer(block, - layer_nums[2], - in_channel=in_channels[2], - out_channel=out_channels[2], - stride=strides[2]) - self.layer4 = self._make_layer(block, - layer_nums[3], - in_channel=in_channels[3], - out_channel=out_channels[3], - stride=strides[3]) - - def _make_layer(self, block, layer_num, in_channel, out_channel, stride): - """ - Make stage network of ResNet. - - Args: - block (Cell): Resnet block. - layer_num (int): Layer number. - in_channel (int): Input channel. - out_channel (int): Output channel. - stride (int): Stride size for the first convolutional layer. - Returns: - SequentialCell, the output layer. - - Examples: - >>> _make_layer(ResidualBlock, 3, 128, 256, 2) - """ - layers = [] - - resnet_block = block(in_channel, out_channel, stride=stride) - layers.append(resnet_block) - for _ in range(1, layer_num): - resnet_block = block(out_channel, out_channel, stride=1) - layers.append(resnet_block) - return nn.SequentialCell(layers) - - def construct(self, x): - """ - Forward - """ - x = self.conv1(x) - x = self.bn1(x) - x = self.relu(x) - c1 = self.maxpool(x) - - c2 = self.layer1(c1) - c3 = self.layer2(c2) - c4 = self.layer3(c3) - c5 = self.layer4(c4) - return c1, c2, c3, c4, c5 - - -def resnet50(): - """ - Get ResNet50 neural network. - - Returns: - Cell, cell instance of ResNet50 neural network. - - Examples: - >>> net = resnet50() - """ - return ResNet(ResidualBlock, - [3, 4, 6, 3], - [64, 256, 512, 1024], - [256, 512, 1024, 2048], - [1, 2, 2, 2]) +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""ResNet.""" +import mindspore.nn as nn +from mindspore.ops import operations as P + + +def _conv3x3(in_channel, out_channel, stride=1): + return nn.Conv2d(in_channel, out_channel, + kernel_size=3, stride=stride, padding=0, pad_mode='same') + + +def _conv1x1(in_channel, out_channel, stride=1): + return nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=stride, padding=0, pad_mode='same') + + +def _conv7x7(in_channel, out_channel, stride=1): + return nn.Conv2d(in_channel, out_channel, kernel_size=7, stride=stride, padding=0, pad_mode='same') + + +def _bn(channel): + return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997, + gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1) + + +def _bn_last(channel): + return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.997, + gamma_init=0, beta_init=0, moving_mean_init=0, moving_var_init=1) + +class ResidualBlock(nn.Cell): + """ + ResNet V1 residual block definition. + + Args: + in_channel (int): Input channel. + out_channel (int): Output channel. + stride (int): Stride size for the first convolutional layer. Default: 1. + + Returns: + Tensor, output tensor. + + Examples: + >>> ResidualBlock(3, 256, stride=2) + """ + expansion = 4 + + def __init__(self, + in_channel, + out_channel, + stride=1): + super(ResidualBlock, self).__init__() + self.stride = stride + channel = out_channel // self.expansion + self.conv1 = _conv1x1(in_channel, channel, stride=1) + self.bn1 = _bn(channel) + self.conv2 = _conv3x3(channel, channel, stride=stride) + self.bn2 = _bn(channel) + + self.conv3 = _conv1x1(channel, out_channel, stride=1) + self.bn3 = _bn_last(out_channel) + self.relu = nn.ReLU() + + self.down_sample = False + + if stride != 1 or in_channel != out_channel: + self.down_sample = True + self.down_sample_layer = None + + if self.down_sample: + self.down_sample_layer = nn.SequentialCell([_conv1x1(in_channel, out_channel, stride), _bn(out_channel)]) + self.add = P.Add() + + def construct(self, x): + """ + Forward + """ + identity = x + + out = self.conv1(x) + out = self.bn1(out) + out = self.relu(out) + out = self.conv2(out) + out = self.bn2(out) + out = self.relu(out) + out = self.conv3(out) + out = self.bn3(out) + + if self.down_sample: + identity = self.down_sample_layer(identity) + + out = self.add(out, identity) + out = self.relu(out) + + return out + + +class ResNet(nn.Cell): + """ + ResNet architecture. + + Args: + block (Cell): Block for network. + layer_nums (list): Numbers of block in different layers. + in_channels (list): Input channel in each layer. + out_channels (list): Output channel in each layer. + strides (list): Stride size in each layer. + num_classes (int): The number of classes that the training images are belonging to. + Returns: + Tensor, output tensor. 
+ + Examples: + >>> ResNet(ResidualBlock, + >>> [3, 4, 6, 3], + >>> [64, 256, 512, 1024], + >>> [256, 512, 1024, 2048], + >>> [1, 2, 2, 2], + >>> 10) + """ + + def __init__(self, + block, + layer_nums, + in_channels, + out_channels, + strides): + super(ResNet, self).__init__() + + if not len(layer_nums) == len(in_channels) == len(out_channels) == 4: + raise ValueError("the length of layer_num, in_channels, out_channels list must be 4!") + self.conv1 = _conv7x7(3, 64, stride=2) + self.bn1 = _bn(64) + self.relu = P.ReLU() + self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, pad_mode="same") + self.layer1 = self._make_layer(block, + layer_nums[0], + in_channel=in_channels[0], + out_channel=out_channels[0], + stride=strides[0]) + self.layer2 = self._make_layer(block, + layer_nums[1], + in_channel=in_channels[1], + out_channel=out_channels[1], + stride=strides[1]) + self.layer3 = self._make_layer(block, + layer_nums[2], + in_channel=in_channels[2], + out_channel=out_channels[2], + stride=strides[2]) + self.layer4 = self._make_layer(block, + layer_nums[3], + in_channel=in_channels[3], + out_channel=out_channels[3], + stride=strides[3]) + + def _make_layer(self, block, layer_num, in_channel, out_channel, stride): + """ + Make stage network of ResNet. + + Args: + block (Cell): Resnet block. + layer_num (int): Layer number. + in_channel (int): Input channel. + out_channel (int): Output channel. + stride (int): Stride size for the first convolutional layer. + Returns: + SequentialCell, the output layer. + + Examples: + >>> _make_layer(ResidualBlock, 3, 128, 256, 2) + """ + layers = [] + + resnet_block = block(in_channel, out_channel, stride=stride) + layers.append(resnet_block) + for _ in range(1, layer_num): + resnet_block = block(out_channel, out_channel, stride=1) + layers.append(resnet_block) + return nn.SequentialCell(layers) + + def construct(self, x): + """ + Forward + """ + x = self.conv1(x) + x = self.bn1(x) + x = self.relu(x) + c1 = self.maxpool(x) + + c2 = self.layer1(c1) + c3 = self.layer2(c2) + c4 = self.layer3(c3) + c5 = self.layer4(c4) + return c1, c2, c3, c4, c5 + + +def resnet50(): + """ + Get ResNet50 neural network. + + Returns: + Cell, cell instance of ResNet50 neural network. + + Examples: + >>> net = resnet50() + """ + return ResNet(ResidualBlock, + [3, 4, 6, 3], + [64, 256, 512, 1024], + [256, 512, 1024, 2048], + [1, 2, 2, 2]) diff --git a/research/cv/ssd_resnet50/src/resnet_extra.py b/research/cv/ssd_resnet50/src/resnet_extra.py index 8864c3929e835d0290a3af4cf9feed045d9567d3..022d7a708f712f4480168a87b665bbea9305586d 100644 --- a/research/cv/ssd_resnet50/src/resnet_extra.py +++ b/research/cv/ssd_resnet50/src/resnet_extra.py @@ -1,68 +1,68 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
-# ============================================================================ -"""resnet extractor""" -import mindspore.nn as nn -from .resnet import resnet50 - -def conv_bn_relu(in_channel, out_channel, kernel_size, stride, depthwise, activation='relu6'): - output = [] - output.append(nn.Conv2d(in_channel, out_channel, kernel_size, stride, pad_mode="same", - group=1 if not depthwise else in_channel)) - output.append(nn.BatchNorm2d(out_channel)) - if activation: - output.append(nn.get_activation(activation)) - return nn.SequentialCell(output) - -class ExtraLayer(nn.Cell): - """ - extra feature extractor - """ - def __init__(self, levels, res_channels, channels, kernel_size, stride): - super(ExtraLayer, self).__init__() - self.levels = levels - self.Channel_cover = conv_bn_relu(512, channels, kernel_size, 1, False) - bottom_up_cells = [ - conv_bn_relu(channels, channels, kernel_size, stride, False) for x in range(self.levels) - ] - self.blocks = nn.CellList(bottom_up_cells) - - def construct(self, features): - """ - Forward - """ - mid_feature = self.Channel_cover(features[-1]) - features = features + (self.blocks[0](mid_feature),) - features = features + (self.blocks[1](features[-1]),) - return features - - -class resnet50_extra(nn.Cell): - """ - ResNet with extra feature. - """ - def __init__(self): - super(resnet50_extra, self).__init__() - self.resnet = resnet50() - self.extra = ExtraLayer(2, 512, 256, 3, 2) - self.Channel_cover = conv_bn_relu(2048, 512, 3, 1, False) - - def construct(self, x): - """ - Forward - """ - _, _, c3, c4, c5 = self.resnet(x) - c5 = self.Channel_cover(c5) - features = self.extra((c3, c4, c5)) - return features +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ +"""resnet extractor""" +import mindspore.nn as nn +from .resnet import resnet50 + +def conv_bn_relu(in_channel, out_channel, kernel_size, stride, depthwise, activation='relu6'): + output = [] + output.append(nn.Conv2d(in_channel, out_channel, kernel_size, stride, pad_mode="same", + group=1 if not depthwise else in_channel)) + output.append(nn.BatchNorm2d(out_channel)) + if activation: + output.append(nn.get_activation(activation)) + return nn.SequentialCell(output) + +class ExtraLayer(nn.Cell): + """ + extra feature extractor + """ + def __init__(self, levels, res_channels, channels, kernel_size, stride): + super(ExtraLayer, self).__init__() + self.levels = levels + self.Channel_cover = conv_bn_relu(512, channels, kernel_size, 1, False) + bottom_up_cells = [ + conv_bn_relu(channels, channels, kernel_size, stride, False) for x in range(self.levels) + ] + self.blocks = nn.CellList(bottom_up_cells) + + def construct(self, features): + """ + Forward + """ + mid_feature = self.Channel_cover(features[-1]) + features = features + (self.blocks[0](mid_feature),) + features = features + (self.blocks[1](features[-1]),) + return features + + +class resnet50_extra(nn.Cell): + """ + ResNet with extra feature. + """ + def __init__(self): + super(resnet50_extra, self).__init__() + self.resnet = resnet50() + self.extra = ExtraLayer(2, 512, 256, 3, 2) + self.Channel_cover = conv_bn_relu(2048, 512, 3, 1, False) + + def construct(self, x): + """ + Forward + """ + _, _, c3, c4, c5 = self.resnet(x) + c5 = self.Channel_cover(c5) + features = self.extra((c3, c4, c5)) + return features diff --git a/research/cv/ssd_resnet50/src/ssd.py b/research/cv/ssd_resnet50/src/ssd.py index 7ec90034385196eb2475146cec5621c2e1a3c9e2..ef793224a1160e16955fbbf7de96b268e09416b6 100644 --- a/research/cv/ssd_resnet50/src/ssd.py +++ b/research/cv/ssd_resnet50/src/ssd.py @@ -1,502 +1,502 @@ -# Copyright 2021 Huawei Technologies Co., Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -"""SSD net based MobilenetV2.""" - -import mindspore.common.dtype as mstype -import mindspore as ms -import mindspore.nn as nn -from mindspore import context, Tensor -from mindspore.context import ParallelMode -from mindspore.parallel._auto_parallel_context import auto_parallel_context -from mindspore.communication.management import get_group_size -from mindspore.ops import operations as P -from mindspore.ops import functional as F -from mindspore.ops import composite as C -from .resnet_extra import resnet50_extra - -def _make_divisible(v, divisor, min_value=None): - """nsures that all layers have a channel number that is divisible by 8.""" - if min_value is None: - min_value = divisor - new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) - # Make sure that round down does not go down by more than 10%. 
- if new_v < 0.9 * v: - new_v += divisor - return new_v - - -def _conv2d(in_channel, out_channel, kernel_size=3, stride=1, pad_mod='same'): - return nn.Conv2d(in_channel, out_channel, kernel_size=kernel_size, stride=stride, - padding=0, pad_mode=pad_mod, has_bias=True) - - -def _bn(channel): - return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.97, - gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1) - - -def _last_conv2d(in_channel, out_channel, kernel_size=3, stride=1, pad_mod='same', pad=0): - in_channels = in_channel - out_channels = in_channel - depthwise_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', - padding=pad, group=in_channels) - conv = _conv2d(in_channel, out_channel, kernel_size=1) - return nn.SequentialCell([depthwise_conv, _bn(in_channel), nn.ReLU6(), conv]) - - -class ConvBNReLU(nn.Cell): - """ - Convolution/Depthwise fused with Batchnorm and ReLU block definition. - - Args: - in_planes (int): Input channel. - out_planes (int): Output channel. - kernel_size (int): Input kernel size. - stride (int): Stride size for the first convolutional layer. Default: 1. - groups (int): channel group. Convolution is 1 while Depthiwse is input channel. Default: 1. - shared_conv(Cell): Use the weight shared conv, default: None. - - Returns: - Tensor, output tensor. - - Examples: - >>> ConvBNReLU(16, 256, kernel_size=1, stride=1, groups=1) - """ - def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1, shared_conv=None): - super(ConvBNReLU, self).__init__() - padding = 0 - in_channels = in_planes - out_channels = out_planes - if shared_conv is None: - if groups == 1: - conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', padding=padding) - else: - out_channels = in_planes - conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', - padding=padding, group=in_channels) - layers = [conv, _bn(out_planes), nn.ReLU6()] - else: - layers = [shared_conv, _bn(out_planes), nn.ReLU6()] - self.features = nn.SequentialCell(layers) - - def construct(self, x): - output = self.features(x) - return output - - -class InvertedResidual(nn.Cell): - """ - Residual block definition. - - Args: - inp (int): Input channel. - oup (int): Output channel. - stride (int): Stride size for the first convolutional layer. Default: 1. - expand_ratio (int): expand ration of input channel - - Returns: - Tensor, output tensor. - - Examples: - >>> ResidualBlock(3, 256, 1, 1) - """ - def __init__(self, inp, oup, stride, expand_ratio, last_relu=False): - super(InvertedResidual, self).__init__() - assert stride in [1, 2] - - hidden_dim = int(round(inp * expand_ratio)) - self.use_res_connect = stride == 1 and inp == oup - - layers = [] - if expand_ratio != 1: - layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) - layers.extend([ - # dw - ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), - # pw-linear - nn.Conv2d(hidden_dim, oup, kernel_size=1, stride=1, has_bias=False), - _bn(oup), - ]) - self.conv = nn.SequentialCell(layers) - self.add = P.Add() - self.cast = P.Cast() - self.last_relu = last_relu - self.relu = nn.ReLU6() - - def construct(self, x): - identity = x - x = self.conv(x) - if self.use_res_connect: - x = self.add(identity, x) - if self.last_relu: - x = self.relu(x) - return x - - -class FlattenConcat(nn.Cell): - """ - Concatenate predictions into a single tensor. - - Args: - config (dict): The default config of SSD. - - Returns: - Tensor, flatten predictions. 
- """ - def __init__(self, config): - super(FlattenConcat, self).__init__() - self.num_ssd_boxes = config.num_ssd_boxes - self.concat = P.Concat(axis=1) - self.transpose = P.Transpose() - def construct(self, inputs): - output = () - batch_size = F.shape(inputs[0])[0] - for x in inputs: - x = self.transpose(x, (0, 2, 3, 1)) - output += (F.reshape(x, (batch_size, -1)),) - res = self.concat(output) - return F.reshape(res, (batch_size, self.num_ssd_boxes, -1)) - - -class MultiBox(nn.Cell): - """ - Multibox conv layers. Each multibox layer contains class conf scores and localization predictions. - - Args: - config (dict): The default config of SSD. - - Returns: - Tensor, localization predictions. - Tensor, class conf scores. - """ - def __init__(self, config): - super(MultiBox, self).__init__() - num_classes = config.num_classes - out_channels = config.extras_out_channels - num_default = config.num_default - - loc_layers = [] - cls_layers = [] - for k, out_channel in enumerate(out_channels): - loc_layers += [_last_conv2d(out_channel, 4 * num_default[k], - kernel_size=3, stride=1, pad_mod='same', pad=0)] - cls_layers += [_last_conv2d(out_channel, num_classes * num_default[k], - kernel_size=3, stride=1, pad_mod='same', pad=0)] - - self.multi_loc_layers = nn.layer.CellList(loc_layers) - self.multi_cls_layers = nn.layer.CellList(cls_layers) - self.flatten_concat = FlattenConcat(config) - - def construct(self, inputs): - loc_outputs = () - cls_outputs = () - for i in range(len(self.multi_loc_layers)): - loc_outputs += (self.multi_loc_layers[i](inputs[i]),) - cls_outputs += (self.multi_cls_layers[i](inputs[i]),) - return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs) - - -class WeightSharedMultiBox(nn.Cell): - """ - Weight shared Multi-box conv layers. Each multi-box layer contains class conf scores and localization predictions. - All box predictors shares the same conv weight in different features. - - Args: - config (dict): The default config of SSD. - loc_cls_shared_addition(bool): Whether the location predictor and classifier prediction share the - same addition layer. - Returns: - Tensor, localization predictions. - Tensor, class conf scores. 
- """ - def __init__(self, config, loc_cls_shared_addition=False): - super(WeightSharedMultiBox, self).__init__() - num_classes = config.num_classes - out_channels = config.extras_out_channels[0] - num_default = config.num_default[0] - num_features = len(config.feature_size) - num_addition_layers = config.num_addition_layers - self.loc_cls_shared_addition = loc_cls_shared_addition - - if not loc_cls_shared_addition: - loc_convs = [ - _conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers) - ] - cls_convs = [ - _conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers) - ] - addition_loc_layer_list = [] - addition_cls_layer_list = [] - for _ in range(num_features): - addition_loc_layer = [ - ConvBNReLU(out_channels, out_channels, 3, 1, 1, loc_convs[x]) for x in range(num_addition_layers) - ] - addition_cls_layer = [ - ConvBNReLU(out_channels, out_channels, 3, 1, 1, cls_convs[x]) for x in range(num_addition_layers) - ] - addition_loc_layer_list.append(nn.SequentialCell(addition_loc_layer)) - addition_cls_layer_list.append(nn.SequentialCell(addition_cls_layer)) - self.addition_layer_loc = nn.CellList(addition_loc_layer_list) - self.addition_layer_cls = nn.CellList(addition_cls_layer_list) - else: - convs = [ - _conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers) - ] - addition_layer_list = [] - for _ in range(num_features): - addition_layers = [ - ConvBNReLU(out_channels, out_channels, 3, 1, 1, convs[x]) for x in range(num_addition_layers) - ] - addition_layer_list.append(nn.SequentialCell(addition_layers)) - self.addition_layer = nn.SequentialCell(addition_layer_list) - - loc_layers = [_conv2d(out_channels, 4 * num_default, - kernel_size=3, stride=1, pad_mod='same')] - cls_layers = [_conv2d(out_channels, num_classes * num_default, - kernel_size=3, stride=1, pad_mod='same')] - - self.loc_layers = nn.SequentialCell(loc_layers) - self.cls_layers = nn.SequentialCell(cls_layers) - self.flatten_concat = FlattenConcat(config) - - def construct(self, inputs): - """ - Forward - """ - loc_outputs = () - cls_outputs = () - num_heads = len(inputs) - for i in range(num_heads): - if self.loc_cls_shared_addition: - features = self.addition_layer[i](inputs[i]) - loc_outputs += (self.loc_layers(features),) - cls_outputs += (self.cls_layers(features),) - else: - features = self.addition_layer_loc[i](inputs[i]) - loc_outputs += (self.loc_layers(features),) - features = self.addition_layer_cls[i](inputs[i]) - cls_outputs += (self.cls_layers(features),) - return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs) - -class SsdResNet50(nn.Cell): - """ - SSD Network using ResNet50 to extract features - - Args: - config (dict): The default config of SSD. - - Returns: - Tensor, localization predictions. - Tensor, class conf scores. - - Examples:backbone - SsdResNet50(config). - """ - def __init__(self, config): - super(SsdResNet50, self).__init__() - self.multi_box = MultiBox(config) - self.activation = P.Sigmoid() - self.feature_extractor = resnet50_extra() - - def construct(self, x): - """ - Forward - """ - features = self.feature_extractor(x) - pred_loc, pred_label = self.multi_box(features) - if not self.training: - pred_label = self.activation(pred_label) - pred_loc = F.cast(pred_loc, mstype.float32) - pred_label = F.cast(pred_label, mstype.float32) - return pred_loc, pred_label - -class SigmoidFocalClassificationLoss(nn.Cell): - """" - Sigmoid focal-loss for classification. 
- - Args: - gamma (float): Hyper-parameter to balance the easy and hard examples. Default: 2.0 - alpha (float): Hyper-parameter to balance the positive and negative example. Default: 0.25 - - Returns: - Tensor, the focal loss. - """ - def __init__(self, gamma=2.0, alpha=0.25): - super(SigmoidFocalClassificationLoss, self).__init__() - self.sigmiod_cross_entropy = P.SigmoidCrossEntropyWithLogits() - self.sigmoid = P.Sigmoid() - self.pow = P.Pow() - self.onehot = P.OneHot() - self.on_value = Tensor(1.0, mstype.float32) - self.off_value = Tensor(0.0, mstype.float32) - self.gamma = gamma - self.alpha = alpha - - def construct(self, logits, label): - """ - Forward - """ - label = self.onehot(label, F.shape(logits)[-1], self.on_value, self.off_value) - sigmiod_cross_entropy = self.sigmiod_cross_entropy(logits, label) - sigmoid = self.sigmoid(logits) - label = F.cast(label, mstype.float32) - p_t = label * sigmoid + (1 - label) * (1 - sigmoid) - modulating_factor = self.pow(1 - p_t, self.gamma) - alpha_weight_factor = label * self.alpha + (1 - label) * (1 - self.alpha) - focal_loss = modulating_factor * alpha_weight_factor * sigmiod_cross_entropy - return focal_loss - - -class SSDWithLossCell(nn.Cell): - """" - Provide SSD training loss through network. - - Args: - network (Cell): The training network. - config (dict): SSD config. - - Returns: - Tensor, the loss of the network. - """ - def __init__(self, network, config): - super(SSDWithLossCell, self).__init__() - self.network = network - self.less = P.Less() - self.tile = P.Tile() - self.reduce_sum = P.ReduceSum() - self.expand_dims = P.ExpandDims() - self.class_loss = SigmoidFocalClassificationLoss(config.gamma, config.alpha) - self.loc_loss = nn.SmoothL1Loss() - - def construct(self, x, gt_loc, gt_label, num_matched_boxes): - """ - Forward - """ - pred_loc, pred_label = self.network(x) - mask = F.cast(self.less(0, gt_label), mstype.float32) - num_matched_boxes = self.reduce_sum(F.cast(num_matched_boxes, mstype.float32)) - - # Localization Loss - mask_loc = self.tile(self.expand_dims(mask, -1), (1, 1, 4)) - smooth_l1 = self.loc_loss(pred_loc, gt_loc) * mask_loc - loss_loc = self.reduce_sum(self.reduce_sum(smooth_l1, -1), -1) - - # Classification Loss - loss_cls = self.class_loss(pred_label, gt_label) - loss_cls = self.reduce_sum(loss_cls, (1, 2)) - - return self.reduce_sum((loss_cls + loss_loc) / num_matched_boxes) - - -grad_scale = C.MultitypeFuncGraph("grad_scale") -@grad_scale.register("Tensor", "Tensor") -def tensor_grad_scale(scale, grad): - return grad * P.Reciprocal()(scale) - - -class TrainingWrapper(nn.Cell): - """ - Encapsulation class of SSD network training. - - Append an optimizer to the training network after that the construct - function can be called to create the backward graph. - - Args: - network (Cell): The training network. Note that loss function should have been added. - optimizer (Optimizer): Optimizer for updating the weights. - sens (Number): The adjust parameter. Default: 1.0. - use_global_nrom(bool): Whether apply global norm before optimizer. 
Default: False - """ - def __init__(self, network, optimizer, sens=1.0, use_global_norm=False): - super(TrainingWrapper, self).__init__(auto_prefix=False) - self.network = network - self.network.set_grad() - self.weights = ms.ParameterTuple(network.trainable_params()) - self.optimizer = optimizer - self.grad = C.GradOperation(get_by_list=True, sens_param=True) - self.sens = sens - self.reducer_flag = False - self.grad_reducer = None - self.use_global_norm = use_global_norm - self.parallel_mode = context.get_auto_parallel_context("parallel_mode") - if self.parallel_mode in [ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL]: - self.reducer_flag = True - if self.reducer_flag: - mean = context.get_auto_parallel_context("gradients_mean") - if auto_parallel_context().get_device_num_is_set(): - degree = context.get_auto_parallel_context("device_num") - else: - degree = get_group_size() - self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree) - self.hyper_map = C.HyperMap() - - def construct(self, *args): - """ - Forward - """ - weights = self.weights - loss = self.network(*args) - sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens) - grads = self.grad(self.network, weights)(*args, sens) - if self.reducer_flag: - # apply grad reducer on grads - grads = self.grad_reducer(grads) - if self.use_global_norm: - grads = self.hyper_map(F.partial(grad_scale, F.scalar_to_array(self.sens)), grads) - grads = C.clip_by_global_norm(grads) - self.optimizer(grads) - return loss - -class SsdInferWithDecoder(nn.Cell): - """ - SSD Infer wrapper to decode the bbox locations. - - Args: - network (Cell): the origin ssd infer network without bbox decoder. - default_boxes (Tensor): the default_boxes from anchor generator - config (dict): ssd config - Returns: - Tensor, the locations for bbox after decoder representing (y0,x0,y1,x1) - Tensor, the prediction labels. - - """ - def __init__(self, network, default_boxes, config): - super(SsdInferWithDecoder, self).__init__() - self.network = network - self.default_boxes = default_boxes - self.prior_scaling_xy = config.prior_scaling[0] - self.prior_scaling_wh = config.prior_scaling[1] - - def construct(self, x): - """ - Forward - """ - pred_loc, pred_label = self.network(x) - - default_bbox_xy = self.default_boxes[..., :2] - default_bbox_wh = self.default_boxes[..., 2:] - pred_xy = pred_loc[..., :2] * self.prior_scaling_xy * default_bbox_wh + default_bbox_xy - pred_wh = P.Exp()(pred_loc[..., 2:] * self.prior_scaling_wh) * default_bbox_wh - - pred_xy_0 = pred_xy - pred_wh / 2.0 - pred_xy_1 = pred_xy + pred_wh / 2.0 - pred_xy = P.Concat(-1)((pred_xy_0, pred_xy_1)) - pred_xy = P.Maximum()(pred_xy, 0) - pred_xy = P.Minimum()(pred_xy, 1) - return pred_xy, pred_label - -def ssd_resnet50(**kwargs): - return SsdResNet50(**kwargs) +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ============================================================================ + +"""SSD net based MobilenetV2.""" + +import mindspore.common.dtype as mstype +import mindspore as ms +import mindspore.nn as nn +from mindspore import context, Tensor +from mindspore.context import ParallelMode +from mindspore.parallel._auto_parallel_context import auto_parallel_context +from mindspore.communication.management import get_group_size +from mindspore.ops import operations as P +from mindspore.ops import functional as F +from mindspore.ops import composite as C +from .resnet_extra import resnet50_extra + +def _make_divisible(v, divisor, min_value=None): + """nsures that all layers have a channel number that is divisible by 8.""" + if min_value is None: + min_value = divisor + new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) + # Make sure that round down does not go down by more than 10%. + if new_v < 0.9 * v: + new_v += divisor + return new_v + + +def _conv2d(in_channel, out_channel, kernel_size=3, stride=1, pad_mod='same'): + return nn.Conv2d(in_channel, out_channel, kernel_size=kernel_size, stride=stride, + padding=0, pad_mode=pad_mod, has_bias=True) + + +def _bn(channel): + return nn.BatchNorm2d(channel, eps=1e-3, momentum=0.97, + gamma_init=1, beta_init=0, moving_mean_init=0, moving_var_init=1) + + +def _last_conv2d(in_channel, out_channel, kernel_size=3, stride=1, pad_mod='same', pad=0): + in_channels = in_channel + out_channels = in_channel + depthwise_conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', + padding=pad, group=in_channels) + conv = _conv2d(in_channel, out_channel, kernel_size=1) + return nn.SequentialCell([depthwise_conv, _bn(in_channel), nn.ReLU6(), conv]) + + +class ConvBNReLU(nn.Cell): + """ + Convolution/Depthwise fused with Batchnorm and ReLU block definition. + + Args: + in_planes (int): Input channel. + out_planes (int): Output channel. + kernel_size (int): Input kernel size. + stride (int): Stride size for the first convolutional layer. Default: 1. + groups (int): channel group. Convolution is 1 while Depthiwse is input channel. Default: 1. + shared_conv(Cell): Use the weight shared conv, default: None. + + Returns: + Tensor, output tensor. + + Examples: + >>> ConvBNReLU(16, 256, kernel_size=1, stride=1, groups=1) + """ + def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1, shared_conv=None): + super(ConvBNReLU, self).__init__() + padding = 0 + in_channels = in_planes + out_channels = out_planes + if shared_conv is None: + if groups == 1: + conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', padding=padding) + else: + out_channels = in_planes + conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, pad_mode='same', + padding=padding, group=in_channels) + layers = [conv, _bn(out_planes), nn.ReLU6()] + else: + layers = [shared_conv, _bn(out_planes), nn.ReLU6()] + self.features = nn.SequentialCell(layers) + + def construct(self, x): + output = self.features(x) + return output + + +class InvertedResidual(nn.Cell): + """ + Residual block definition. + + Args: + inp (int): Input channel. + oup (int): Output channel. + stride (int): Stride size for the first convolutional layer. Default: 1. + expand_ratio (int): expand ration of input channel + + Returns: + Tensor, output tensor. 
+ + Examples: + >>> ResidualBlock(3, 256, 1, 1) + """ + def __init__(self, inp, oup, stride, expand_ratio, last_relu=False): + super(InvertedResidual, self).__init__() + assert stride in [1, 2] + + hidden_dim = int(round(inp * expand_ratio)) + self.use_res_connect = stride == 1 and inp == oup + + layers = [] + if expand_ratio != 1: + layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) + layers.extend([ + # dw + ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), + # pw-linear + nn.Conv2d(hidden_dim, oup, kernel_size=1, stride=1, has_bias=False), + _bn(oup), + ]) + self.conv = nn.SequentialCell(layers) + self.add = P.Add() + self.cast = P.Cast() + self.last_relu = last_relu + self.relu = nn.ReLU6() + + def construct(self, x): + identity = x + x = self.conv(x) + if self.use_res_connect: + x = self.add(identity, x) + if self.last_relu: + x = self.relu(x) + return x + + +class FlattenConcat(nn.Cell): + """ + Concatenate predictions into a single tensor. + + Args: + config (dict): The default config of SSD. + + Returns: + Tensor, flatten predictions. + """ + def __init__(self, config): + super(FlattenConcat, self).__init__() + self.num_ssd_boxes = config.num_ssd_boxes + self.concat = P.Concat(axis=1) + self.transpose = P.Transpose() + def construct(self, inputs): + output = () + batch_size = F.shape(inputs[0])[0] + for x in inputs: + x = self.transpose(x, (0, 2, 3, 1)) + output += (F.reshape(x, (batch_size, -1)),) + res = self.concat(output) + return F.reshape(res, (batch_size, self.num_ssd_boxes, -1)) + + +class MultiBox(nn.Cell): + """ + Multibox conv layers. Each multibox layer contains class conf scores and localization predictions. + + Args: + config (dict): The default config of SSD. + + Returns: + Tensor, localization predictions. + Tensor, class conf scores. + """ + def __init__(self, config): + super(MultiBox, self).__init__() + num_classes = config.num_classes + out_channels = config.extras_out_channels + num_default = config.num_default + + loc_layers = [] + cls_layers = [] + for k, out_channel in enumerate(out_channels): + loc_layers += [_last_conv2d(out_channel, 4 * num_default[k], + kernel_size=3, stride=1, pad_mod='same', pad=0)] + cls_layers += [_last_conv2d(out_channel, num_classes * num_default[k], + kernel_size=3, stride=1, pad_mod='same', pad=0)] + + self.multi_loc_layers = nn.layer.CellList(loc_layers) + self.multi_cls_layers = nn.layer.CellList(cls_layers) + self.flatten_concat = FlattenConcat(config) + + def construct(self, inputs): + loc_outputs = () + cls_outputs = () + for i in range(len(self.multi_loc_layers)): + loc_outputs += (self.multi_loc_layers[i](inputs[i]),) + cls_outputs += (self.multi_cls_layers[i](inputs[i]),) + return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs) + + +class WeightSharedMultiBox(nn.Cell): + """ + Weight shared Multi-box conv layers. Each multi-box layer contains class conf scores and localization predictions. + All box predictors shares the same conv weight in different features. + + Args: + config (dict): The default config of SSD. + loc_cls_shared_addition(bool): Whether the location predictor and classifier prediction share the + same addition layer. + Returns: + Tensor, localization predictions. + Tensor, class conf scores. 
+ """ + def __init__(self, config, loc_cls_shared_addition=False): + super(WeightSharedMultiBox, self).__init__() + num_classes = config.num_classes + out_channels = config.extras_out_channels[0] + num_default = config.num_default[0] + num_features = len(config.feature_size) + num_addition_layers = config.num_addition_layers + self.loc_cls_shared_addition = loc_cls_shared_addition + + if not loc_cls_shared_addition: + loc_convs = [ + _conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers) + ] + cls_convs = [ + _conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers) + ] + addition_loc_layer_list = [] + addition_cls_layer_list = [] + for _ in range(num_features): + addition_loc_layer = [ + ConvBNReLU(out_channels, out_channels, 3, 1, 1, loc_convs[x]) for x in range(num_addition_layers) + ] + addition_cls_layer = [ + ConvBNReLU(out_channels, out_channels, 3, 1, 1, cls_convs[x]) for x in range(num_addition_layers) + ] + addition_loc_layer_list.append(nn.SequentialCell(addition_loc_layer)) + addition_cls_layer_list.append(nn.SequentialCell(addition_cls_layer)) + self.addition_layer_loc = nn.CellList(addition_loc_layer_list) + self.addition_layer_cls = nn.CellList(addition_cls_layer_list) + else: + convs = [ + _conv2d(out_channels, out_channels, 3, 1) for x in range(num_addition_layers) + ] + addition_layer_list = [] + for _ in range(num_features): + addition_layers = [ + ConvBNReLU(out_channels, out_channels, 3, 1, 1, convs[x]) for x in range(num_addition_layers) + ] + addition_layer_list.append(nn.SequentialCell(addition_layers)) + self.addition_layer = nn.SequentialCell(addition_layer_list) + + loc_layers = [_conv2d(out_channels, 4 * num_default, + kernel_size=3, stride=1, pad_mod='same')] + cls_layers = [_conv2d(out_channels, num_classes * num_default, + kernel_size=3, stride=1, pad_mod='same')] + + self.loc_layers = nn.SequentialCell(loc_layers) + self.cls_layers = nn.SequentialCell(cls_layers) + self.flatten_concat = FlattenConcat(config) + + def construct(self, inputs): + """ + Forward + """ + loc_outputs = () + cls_outputs = () + num_heads = len(inputs) + for i in range(num_heads): + if self.loc_cls_shared_addition: + features = self.addition_layer[i](inputs[i]) + loc_outputs += (self.loc_layers(features),) + cls_outputs += (self.cls_layers(features),) + else: + features = self.addition_layer_loc[i](inputs[i]) + loc_outputs += (self.loc_layers(features),) + features = self.addition_layer_cls[i](inputs[i]) + cls_outputs += (self.cls_layers(features),) + return self.flatten_concat(loc_outputs), self.flatten_concat(cls_outputs) + +class SsdResNet50(nn.Cell): + """ + SSD Network using ResNet50 to extract features + + Args: + config (dict): The default config of SSD. + + Returns: + Tensor, localization predictions. + Tensor, class conf scores. + + Examples:backbone + SsdResNet50(config). + """ + def __init__(self, config): + super(SsdResNet50, self).__init__() + self.multi_box = MultiBox(config) + self.activation = P.Sigmoid() + self.feature_extractor = resnet50_extra() + + def construct(self, x): + """ + Forward + """ + features = self.feature_extractor(x) + pred_loc, pred_label = self.multi_box(features) + if not self.training: + pred_label = self.activation(pred_label) + pred_loc = F.cast(pred_loc, mstype.float32) + pred_label = F.cast(pred_label, mstype.float32) + return pred_loc, pred_label + +class SigmoidFocalClassificationLoss(nn.Cell): + """" + Sigmoid focal-loss for classification. 
+
+
+class SigmoidFocalClassificationLoss(nn.Cell):
+    """
+    Sigmoid focal loss for classification.
+
+    Args:
+        gamma (float): Hyper-parameter to balance the easy and hard examples. Default: 2.0.
+        alpha (float): Hyper-parameter to balance the positive and negative examples. Default: 0.25.
+
+    Returns:
+        Tensor, the focal loss.
+    """
+    def __init__(self, gamma=2.0, alpha=0.25):
+        super(SigmoidFocalClassificationLoss, self).__init__()
+        self.sigmoid_cross_entropy = P.SigmoidCrossEntropyWithLogits()
+        self.sigmoid = P.Sigmoid()
+        self.pow = P.Pow()
+        self.onehot = P.OneHot()
+        self.on_value = Tensor(1.0, mstype.float32)
+        self.off_value = Tensor(0.0, mstype.float32)
+        self.gamma = gamma
+        self.alpha = alpha
+
+    def construct(self, logits, label):
+        """
+        Forward
+        """
+        label = self.onehot(label, F.shape(logits)[-1], self.on_value, self.off_value)
+        sigmoid_cross_entropy = self.sigmoid_cross_entropy(logits, label)
+        sigmoid = self.sigmoid(logits)
+        label = F.cast(label, mstype.float32)
+        p_t = label * sigmoid + (1 - label) * (1 - sigmoid)
+        modulating_factor = self.pow(1 - p_t, self.gamma)
+        alpha_weight_factor = label * self.alpha + (1 - label) * (1 - self.alpha)
+        focal_loss = modulating_factor * alpha_weight_factor * sigmoid_cross_entropy
+        return focal_loss
+
+
+class SSDWithLossCell(nn.Cell):
+    """
+    Provides the SSD training loss through the network.
+
+    Args:
+        network (Cell): The training network.
+        config (dict): SSD config.
+
+    Returns:
+        Tensor, the loss of the network.
+    """
+    def __init__(self, network, config):
+        super(SSDWithLossCell, self).__init__()
+        self.network = network
+        self.less = P.Less()
+        self.tile = P.Tile()
+        self.reduce_sum = P.ReduceSum()
+        self.expand_dims = P.ExpandDims()
+        self.class_loss = SigmoidFocalClassificationLoss(config.gamma, config.alpha)
+        self.loc_loss = nn.SmoothL1Loss()
+
+    def construct(self, x, gt_loc, gt_label, num_matched_boxes):
+        """
+        Forward
+        """
+        pred_loc, pred_label = self.network(x)
+        mask = F.cast(self.less(0, gt_label), mstype.float32)
+        num_matched_boxes = self.reduce_sum(F.cast(num_matched_boxes, mstype.float32))
+
+        # Localization Loss
+        mask_loc = self.tile(self.expand_dims(mask, -1), (1, 1, 4))
+        smooth_l1 = self.loc_loss(pred_loc, gt_loc) * mask_loc
+        loss_loc = self.reduce_sum(self.reduce_sum(smooth_l1, -1), -1)
+
+        # Classification Loss
+        loss_cls = self.class_loss(pred_label, gt_label)
+        loss_cls = self.reduce_sum(loss_cls, (1, 2))
+
+        return self.reduce_sum((loss_cls + loss_loc) / num_matched_boxes)
+
+
+grad_scale = C.MultitypeFuncGraph("grad_scale")
+@grad_scale.register("Tensor", "Tensor")
+def tensor_grad_scale(scale, grad):
+    return grad * P.Reciprocal()(scale)
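+
+
+# Note (descriptive comment): grad_scale is only applied in TrainingWrapper below
+# when use_global_norm is True. The gradient seed is filled with `sens` (the loss
+# scale), so every gradient is multiplied by 1/sens here to undo that scaling
+# before clip_by_global_norm is applied.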
+
+
+class TrainingWrapper(nn.Cell):
+    """
+    Encapsulation class of SSD network training.
+
+    Appends an optimizer to the training network; after that, the construct
+    function can be called to create the backward graph.
+
+    Args:
+        network (Cell): The training network. Note that the loss function should have been added.
+        optimizer (Optimizer): Optimizer for updating the weights.
+        sens (Number): The scaling factor for the backpropagated sensitivity, i.e. the loss scale. Default: 1.0.
+        use_global_norm (bool): Whether to apply global-norm gradient clipping before the optimizer step.
+            Default: False.
+    """
+    def __init__(self, network, optimizer, sens=1.0, use_global_norm=False):
+        super(TrainingWrapper, self).__init__(auto_prefix=False)
+        self.network = network
+        self.network.set_grad()
+        self.weights = ms.ParameterTuple(network.trainable_params())
+        self.optimizer = optimizer
+        self.grad = C.GradOperation(get_by_list=True, sens_param=True)
+        self.sens = sens
+        self.reducer_flag = False
+        self.grad_reducer = None
+        self.use_global_norm = use_global_norm
+        self.parallel_mode = context.get_auto_parallel_context("parallel_mode")
+        if self.parallel_mode in [ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL]:
+            self.reducer_flag = True
+        if self.reducer_flag:
+            mean = context.get_auto_parallel_context("gradients_mean")
+            if auto_parallel_context().get_device_num_is_set():
+                degree = context.get_auto_parallel_context("device_num")
+            else:
+                degree = get_group_size()
+            self.grad_reducer = nn.DistributedGradReducer(optimizer.parameters, mean, degree)
+        self.hyper_map = C.HyperMap()
+
+    def construct(self, *args):
+        """
+        Forward
+        """
+        weights = self.weights
+        loss = self.network(*args)
+        sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
+        grads = self.grad(self.network, weights)(*args, sens)
+        if self.reducer_flag:
+            # apply grad reducer on grads
+            grads = self.grad_reducer(grads)
+        if self.use_global_norm:
+            grads = self.hyper_map(F.partial(grad_scale, F.scalar_to_array(self.sens)), grads)
+            grads = C.clip_by_global_norm(grads)
+        self.optimizer(grads)
+        return loss
+
+
+class SsdInferWithDecoder(nn.Cell):
+    """
+    SSD infer wrapper to decode the bbox locations.
+
+    Args:
+        network (Cell): The origin SSD infer network without the bbox decoder.
+        default_boxes (Tensor): The default boxes from the anchor generator.
+        config (dict): SSD config.
+
+    Returns:
+        Tensor, the locations for bbox after decoding, representing (y0, x0, y1, x1).
+        Tensor, the prediction labels.
+    """
+    def __init__(self, network, default_boxes, config):
+        super(SsdInferWithDecoder, self).__init__()
+        self.network = network
+        self.default_boxes = default_boxes
+        self.prior_scaling_xy = config.prior_scaling[0]
+        self.prior_scaling_wh = config.prior_scaling[1]
+
+    def construct(self, x):
+        """
+        Forward
+        """
+        pred_loc, pred_label = self.network(x)
+
+        # Decode regression offsets back to absolute centers and sizes:
+        # xy = d_xy * scaling_xy * anchor_wh + anchor_xy, wh = exp(d_wh * scaling_wh) * anchor_wh.
+        default_bbox_xy = self.default_boxes[..., :2]
+        default_bbox_wh = self.default_boxes[..., 2:]
+        pred_xy = pred_loc[..., :2] * self.prior_scaling_xy * default_bbox_wh + default_bbox_xy
+        pred_wh = P.Exp()(pred_loc[..., 2:] * self.prior_scaling_wh) * default_bbox_wh
+
+        # Convert center-size form to corner form and clip to the unit square.
+        pred_xy_0 = pred_xy - pred_wh / 2.0
+        pred_xy_1 = pred_xy + pred_wh / 2.0
+        pred_xy = P.Concat(-1)((pred_xy_0, pred_xy_1))
+        pred_xy = P.Maximum()(pred_xy, 0)
+        pred_xy = P.Minimum()(pred_xy, 1)
+        return pred_xy, pred_label
+
+
+def ssd_resnet50(**kwargs):
+    return SsdResNet50(**kwargs)
diff --git a/research/cv/ssd_resnet50/train.py b/research/cv/ssd_resnet50/train.py
index aad9bd369fb6821d019a6d90925bddca7dbf6d78..deda2bb87ca3ca0e8ef36d6efad869062e92fa99 100644
--- a/research/cv/ssd_resnet50/train.py
+++ b/research/cv/ssd_resnet50/train.py
@@ -1,160 +1,160 @@
-# Copyright 2021 Huawei Technologies Co., Ltd
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================ - -"""Train SSD and get checkpoint files.""" - -import argparse -import ast -import mindspore.nn as nn -from mindspore import context, Tensor -from mindspore.communication.management import init, get_rank -from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, LossMonitor, TimeMonitor -from mindspore.train import Model -from mindspore.context import ParallelMode -from mindspore.train.serialization import load_checkpoint, load_param_into_net -from mindspore.common import set_seed, dtype -from src.ssd import SSDWithLossCell, TrainingWrapper, ssd_resnet50 -from src.config import config -from src.dataset import create_ssd_dataset, create_mindrecord -from src.lr_schedule import get_lr -from src.init_params import init_net_param, filter_checkpoint_parameter_by_list - -set_seed(1) - -def get_args(): - """ - get args - """ - parser = argparse.ArgumentParser(description="SSD training") - parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"), - help="run platform, support Ascend, GPU and CPU.") - parser.add_argument("--only_create_dataset", type=ast.literal_eval, default=False, - help="If set it true, only create Mindrecord, default is False.") - parser.add_argument("--distribute", type=ast.literal_eval, default=False, - help="Run distribute, default is False.") - parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.") - parser.add_argument("--device_num", type=int, default=1, help="Use device nums, default is 1.") - parser.add_argument("--lr", type=float, default=0.05, help="Learning rate, default is 0.05.") - parser.add_argument("--mode", type=str, default="sink", help="Run sink mode or not, default is sink.") - parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.") - parser.add_argument("--epoch_size", type=int, default=500, help="Epoch size, default is 500.") - parser.add_argument("--batch_size", type=int, default=32, help="Batch size, default is 32.") - parser.add_argument("--pre_trained", type=str, default=None, help="Pretrained Checkpoint file path.") - parser.add_argument("--pre_trained_epoch_size", type=int, default=0, help="Pretrained epoch size.") - parser.add_argument("--save_checkpoint_epochs", type=int, default=10, help="Save checkpoint epochs, default is 10.") - parser.add_argument("--loss_scale", type=int, default=1024, help="Loss scale, default is 1024.") - parser.add_argument("--filter_weight", type=ast.literal_eval, default=False, - help="Filter head weight parameters, default is False.") - parser.add_argument('--freeze_layer', type=str, default="none", choices=["none", "backbone"], - help="freeze the weights of network, support freeze the backbone's weights, " - "default is not freezing.") - args_opt = parser.parse_args() - return args_opt - -def ssd_model_build(args_opt): - """ - build ssd model - """ - if config.model == "ssd_resnet50": - ssd = ssd_resnet50(config=config) - init_net_param(ssd) - if config.feature_extractor_base_param != "": - param_dict = 
load_checkpoint(config.feature_extractor_base_param) - for x in list(param_dict.keys()): - param_dict["network.feature_extractor.resnet." + x] = param_dict[x] - del param_dict[x] - load_param_into_net(ssd.feature_extractor.resnet, param_dict) - else: - raise ValueError(f'config.model: {config.model} is not supported') - return ssd - -def main(): - args_opt = get_args() - rank = 0 - device_num = 1 - if args_opt.run_platform == "CPU": - context.set_context(mode=context.GRAPH_MODE, device_target="CPU") - else: - context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id) - if args_opt.distribute: - device_num = args_opt.device_num - context.reset_auto_parallel_context() - context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True, - device_num=device_num) - init() - context.set_auto_parallel_context(all_reduce_fusion_config=[29, 58, 89]) - rank = get_rank() - - mindrecord_file = create_mindrecord(args_opt.dataset, "ssd.mindrecord", True) - - if args_opt.only_create_dataset: - return - - loss_scale = float(args_opt.loss_scale) - if args_opt.run_platform == "CPU": - loss_scale = 1.0 - - # When create MindDataset, using the fitst mindrecord file, such as ssd.mindrecord0. - use_multiprocessing = (args_opt.run_platform != "CPU") - dataset = create_ssd_dataset(mindrecord_file, repeat_num=1, batch_size=args_opt.batch_size, - device_num=device_num, rank=rank, use_multiprocessing=use_multiprocessing) - - dataset_size = dataset.get_dataset_size() - print(f"Create dataset done! dataset size is {dataset_size}") - ssd = ssd_model_build(args_opt) - print("finish ssd model building ...............") - - if ("use_float16" in config and config.use_float16) or args_opt.run_platform == "GPU": - ssd.to_float(dtype.float16) - net = SSDWithLossCell(ssd, config) - - # checkpoint - ckpt_config = CheckpointConfig(save_checkpoint_steps=dataset_size * args_opt.save_checkpoint_epochs) - save_ckpt_path = './ckpt_' + config.model + '_' + str(rank) + '/' - ckpoint_cb = ModelCheckpoint(prefix="ssd", directory=save_ckpt_path, config=ckpt_config) - - if args_opt.pre_trained: - param_dict = load_checkpoint(args_opt.pre_trained) - if args_opt.filter_weight: - filter_checkpoint_parameter_by_list(param_dict, config.checkpoint_filter_list) - load_param_into_net(net, param_dict, True) - - lr = Tensor(get_lr(global_step=args_opt.pre_trained_epoch_size * dataset_size, - lr_init=config.lr_init, lr_end=config.lr_end_rate * args_opt.lr, lr_max=args_opt.lr, - warmup_epochs=config.warmup_epochs, - total_epochs=args_opt.epoch_size, - steps_per_epoch=dataset_size)) - - if "use_global_norm" in config and config.use_global_norm: - opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr, - config.momentum, config.weight_decay, 1.0) - net = TrainingWrapper(net, opt, loss_scale, True) - else: - opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr, - config.momentum, config.weight_decay, loss_scale) - net = TrainingWrapper(net, opt, loss_scale) - - - callback = [TimeMonitor(data_size=dataset_size), LossMonitor(), ckpoint_cb] - model = Model(net) - dataset_sink_mode = False - if args_opt.mode == "sink" and args_opt.run_platform != "CPU": - print("In sink mode, one epoch return a loss.") - dataset_sink_mode = True - print("Start train SSD, the first epoch will be slower because of the graph compilation.") - model.train(args_opt.epoch_size, dataset, callbacks=callback, dataset_sink_mode=dataset_sink_mode) - -if 
__name__ == '__main__':
-    main()
+# Copyright 2021 Huawei Technologies Co., Ltd
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ============================================================================
+
+"""Train SSD and get checkpoint files."""
+
+import argparse
+import ast
+import mindspore.nn as nn
+from mindspore import context, Tensor
+from mindspore.communication.management import init, get_rank
+from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, LossMonitor, TimeMonitor
+from mindspore.train import Model
+from mindspore.context import ParallelMode
+from mindspore.train.serialization import load_checkpoint, load_param_into_net
+from mindspore.common import set_seed, dtype
+from src.ssd import SSDWithLossCell, TrainingWrapper, ssd_resnet50
+from src.config import config
+from src.dataset import create_ssd_dataset, create_mindrecord
+from src.lr_schedule import get_lr
+from src.init_params import init_net_param, filter_checkpoint_parameter_by_list
+
+set_seed(1)
+
+def get_args():
+    """
+    Get command line arguments.
+    """
+    parser = argparse.ArgumentParser(description="SSD training")
+    parser.add_argument("--run_platform", type=str, default="Ascend", choices=("Ascend", "GPU", "CPU"),
+                        help="run platform, support Ascend, GPU and CPU.")
+    parser.add_argument("--only_create_dataset", type=ast.literal_eval, default=False,
+                        help="If set to True, only create the MindRecord dataset, default is False.")
+    parser.add_argument("--distribute", type=ast.literal_eval, default=False,
+                        help="Run distributed training, default is False.")
+    parser.add_argument("--device_id", type=int, default=0, help="Device id, default is 0.")
+    parser.add_argument("--device_num", type=int, default=1, help="Number of devices to use, default is 1.")
+    parser.add_argument("--lr", type=float, default=0.05, help="Learning rate, default is 0.05.")
+    parser.add_argument("--mode", type=str, default="sink", help="Run in sink mode or not, default is sink.")
+    parser.add_argument("--dataset", type=str, default="coco", help="Dataset, default is coco.")
+    parser.add_argument("--epoch_size", type=int, default=500, help="Epoch size, default is 500.")
+    parser.add_argument("--batch_size", type=int, default=32, help="Batch size, default is 32.")
+    parser.add_argument("--pre_trained", type=str, default=None, help="Pretrained checkpoint file path.")
+    parser.add_argument("--pre_trained_epoch_size", type=int, default=0, help="Pretrained epoch size.")
+    parser.add_argument("--save_checkpoint_epochs", type=int, default=10, help="Save checkpoint epochs, default is 10.")
+    parser.add_argument("--loss_scale", type=int, default=1024, help="Loss scale, default is 1024.")
+    parser.add_argument("--filter_weight", type=ast.literal_eval, default=False,
+                        help="Filter head weight parameters, default is False.")
+    parser.add_argument('--freeze_layer', type=str, default="none", choices=["none", "backbone"],
+                        help="freeze the weights of the network; supports freezing the backbone's weights, "
+                             "default is not freezing.")
+    args_opt = parser.parse_args()
+    return args_opt
+
+def ssd_model_build(args_opt):
+    """
+    Build the SSD model.
+    """
+    if config.model == "ssd_resnet50":
+        ssd = ssd_resnet50(config=config)
+        init_net_param(ssd)
+        if config.feature_extractor_base_param != "":
+            param_dict = load_checkpoint(config.feature_extractor_base_param)
+            for x in list(param_dict.keys()):
+                param_dict["network.feature_extractor.resnet." + x] = param_dict[x]
+                del param_dict[x]
+            load_param_into_net(ssd.feature_extractor.resnet, param_dict)
+    else:
+        raise ValueError(f'config.model: {config.model} is not supported')
+    return ssd
+
+def main():
+    args_opt = get_args()
+    rank = 0
+    device_num = 1
+    if args_opt.run_platform == "CPU":
+        context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
+    else:
+        context.set_context(mode=context.GRAPH_MODE, device_target=args_opt.run_platform, device_id=args_opt.device_id)
+        if args_opt.distribute:
+            device_num = args_opt.device_num
+            context.reset_auto_parallel_context()
+            context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, gradients_mean=True,
+                                              device_num=device_num)
+            init()
+            context.set_auto_parallel_context(all_reduce_fusion_config=[29, 58, 89])
+            rank = get_rank()
+
+    mindrecord_file = create_mindrecord(args_opt.dataset, "ssd.mindrecord", True)
+
+    if args_opt.only_create_dataset:
+        return
+
+    loss_scale = float(args_opt.loss_scale)
+    if args_opt.run_platform == "CPU":
+        loss_scale = 1.0
+
+    # When creating MindDataset, use the first mindrecord file, such as ssd.mindrecord0.
+    use_multiprocessing = (args_opt.run_platform != "CPU")
+    dataset = create_ssd_dataset(mindrecord_file, repeat_num=1, batch_size=args_opt.batch_size,
+                                 device_num=device_num, rank=rank, use_multiprocessing=use_multiprocessing)
+
+    dataset_size = dataset.get_dataset_size()
+    print(f"Create dataset done! dataset size is {dataset_size}")
+    ssd = ssd_model_build(args_opt)
+    print("Finished building the SSD model.")
+
+    if ("use_float16" in config and config.use_float16) or args_opt.run_platform == "GPU":
+        ssd.to_float(dtype.float16)
+    net = SSDWithLossCell(ssd, config)
+
+    # checkpoint
+    ckpt_config = CheckpointConfig(save_checkpoint_steps=dataset_size * args_opt.save_checkpoint_epochs)
+    save_ckpt_path = './ckpt_' + config.model + '_' + str(rank) + '/'
+    ckpoint_cb = ModelCheckpoint(prefix="ssd", directory=save_ckpt_path, config=ckpt_config)
+
+    if args_opt.pre_trained:
+        param_dict = load_checkpoint(args_opt.pre_trained)
+        if args_opt.filter_weight:
+            filter_checkpoint_parameter_by_list(param_dict, config.checkpoint_filter_list)
+        load_param_into_net(net, param_dict, True)
+
+    lr = Tensor(get_lr(global_step=args_opt.pre_trained_epoch_size * dataset_size,
+                       lr_init=config.lr_init, lr_end=config.lr_end_rate * args_opt.lr, lr_max=args_opt.lr,
+                       warmup_epochs=config.warmup_epochs,
+                       total_epochs=args_opt.epoch_size,
+                       steps_per_epoch=dataset_size))
+
+    if "use_global_norm" in config and config.use_global_norm:
+        opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr,
+                          config.momentum, config.weight_decay, 1.0)
+        net = TrainingWrapper(net, opt, loss_scale, True)
+    else:
+        opt = nn.Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), lr,
+                          config.momentum, config.weight_decay, loss_scale)
+        net = TrainingWrapper(net, opt, loss_scale)
+
+
+    callback = [TimeMonitor(data_size=dataset_size), LossMonitor(), ckpoint_cb]
+    model = Model(net)
+    dataset_sink_mode = False
+    if args_opt.mode == "sink" and args_opt.run_platform != "CPU":
+        print("In sink mode, one epoch returns a loss.")
+        dataset_sink_mode = True
+    print("Start training SSD. The first epoch will be slower because of graph compilation.")
+    model.train(args_opt.epoch_size, dataset, callbacks=callback, dataset_sink_mode=dataset_sink_mode)
+
+if __name__ == '__main__':
+    main()
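+
+# Example launches (illustrative only; the flags are those defined in get_args above):
+#   python train.py --run_platform Ascend --device_id 0 --dataset coco --lr 0.05 --epoch_size 500
+#   python train.py --distribute True --device_num 8 --dataset coco --lr 0.05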