From 15ea1e5424363a2e5c55d748fc5e402871718531 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 01:00:09 +0000
Subject: [PATCH 01/10] =?UTF-8?q?=E5=88=A0=E9=99=A4=E6=96=87=E4=BB=B6=20AC?=
 =?UTF-8?q?L=5FPyTorch/contrib/cv/super=5Fresolution/RDN/env.sh?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 ACL_PyTorch/contrib/cv/super_resolution/RDN/env.sh | 6 ------
 1 file changed, 6 deletions(-)
 delete mode 100644 ACL_PyTorch/contrib/cv/super_resolution/RDN/env.sh

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/env.sh b/ACL_PyTorch/contrib/cv/super_resolution/RDN/env.sh
deleted file mode 100644
index 52554cfca2..0000000000
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/env.sh
+++ /dev/null
@@ -1,6 +0,0 @@
-export install_path=/usr/local/Ascend/ascend-toolkit/latest
-export PATH=/usr/local/python3.7.5/bin:${install_path}/atc/ccec_compiler/bin:${install_path}/atc/bin:$PATH
-export PYTHONPATH=${install_path}/atc/python/site-packages:$PYTHONPATH
-export LD_LIBRARY_PATH=${install_path}/atc/lib64:${install_path}/acllib/lib64:$LD_LIBRARY_PATH
-export ASCEND_OPP_PATH=${install_path}/opp
-export ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/latest
-- 
Gitee

From 421824b87c301bc02cfe09677e97eb08331590f9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 01:51:50 +0000
Subject: [PATCH 02/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md.

---
 .../contrib/cv/super_resolution/RDN/README.md | 159 ++++++++----------
 1 file changed, 72 insertions(+), 87 deletions(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md b/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md
index 8cddc35c62..59bebb133b 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md
@@ -19,7 +19,7 @@
 #### 2.1 深度学习框架
 
 ```
-CANN 5.0.2
+CANN 5.1
 pytorch = 1.5.0
 torchvision = 0.6.0
 onnx = 1.7.0
@@ -30,9 +30,9 @@ onnx = 1.7.0
 #### 2.2 python第三方库
 
 ```
-numpy == 1.20.3
-Pillow == 8.2.0
-opencv-python == 4.5.2.54
+numpy == 1.21.2
+Pillow == 9.1.0
+opencv-python == 4.5.5.64
 mmcv == 1.3.12
 ```
 
@@ -83,61 +83,16 @@ mmcv == 1.3.12
 1. 设置环境变量
 
    ```
-   source env.sh
+   source /usr/local/Ascend/ascend-toolkit/set_env.sh
    ```
 
 2. 使用atc将onnx模型转换为om模型文件,工具使用方法可以参考CANN 5.0.1 开发辅助工具指南 (推理) 01
 
    ```
-   atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --input_shape="image:1,3,114,114" --log=debug --soc_version=Ascend310
+   atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --input_shape="image:1,3,114,114" --log=debug --soc_version=Ascend710
    ```
 
 
-
-#### 3.3 om模型性能优化
-
-直接使用atc工具转换会将Transpose算子翻译为TransposeD,而TransposeD在一些输入形状下会有很差的性能。因此首先需要将Transpose的输入形状加入atc转换的白名单中。
-
-1. 获取om模型的profilling文件
-
-   运行[profilling分析脚本](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/tree/master/Ascend-PyTorch%E7%A6%BB%E7%BA%BF%E6%8E%A8%E7%90%86%E6%8C%87%E5%AF%BC/%E4%B8%93%E9%A2%98%E6%A1%88%E4%BE%8B/%E7%9B%B8%E5%85%B3%E5%B7%A5%E5%85%B7/run_profiling)生成profilling目录,其中其中op_statistic_0_1.csv文件统计了模型中每类算子总体耗时与百分比,op_summary_0_1.csv中包含了模型每个算子的aicore耗时
-
-2. 分析profilling文件
-
-   使用profiling工具分析,可以从输出的csv文件看到算子统计结果
-
-   | Model Name | OP Type | Core Type | Count | Total Time(us) | Min Time(us) | Avg Time(us) | Max Time(us) | Ratio(%) |
-   | ---------- | ---------- | --------- | ----- | -------------- | ------------ | ------------ | ------------ | -------- |
-   | rdn_x2_bs1 | TransData | AI Core | 4 | 935253.6 | 63.80209 | 233813.4 | 934225.7 | 44.2515 |
-   | rdn_x2_bs1 | TransposeD | AI Core | 1 | 934287.8 | 934287.8 | 934287.8 | 934287.8 | 44.2058 |
-   | rdn_x2_bs1 | Conv2D | AI Core | 150 | 126733.2 | 78.2812 | 844.8882 | 1563.906 | 5.996379 |
-   | rdn_x2_bs1 | ConcatD | AI Core | 129 | 117099.7 | 193.9584 | 907.7495 | 1612.917 | 5.540568 |
-   | rdn_x2_bs1 | Cast | AI Core | 2 | 121.6667 | 59.4792 | 60.83335 | 62.1875 | 0.005757 |
-
-   可以看到TransposeD和TransData算子耗时很长,重点对TransposeD算子进行优化。
-
-3. TransposeD算子优化
-
-   将TransposeD算子的输入shape添加到白名单中,/usr/local/Ascend/ascend-toolkit/5.0.2/x86_64-linux/opp/op_impl/built-in/ai_core/tbe/impl/dynamic/transpose.py里添加shape白名单:[1, 64, 2, 2, 114, 114]
-
-   ```python
-   white_list_shape = [
-       ...
-       [1, 64, 2, 2, 114, 114]
-   ]
-   ```
-
-4. 优化前后性能对比
-
-   | | TransposeD Avg Time(us) | 310 推理性能(FPS) |
-   | :----: | :---------------------: | :---------------: |
-   | 优化前 | 934287.8125 | 3.76 |
-   | 优化后 | 10141.094 | 29.56 |
-
-   优化TransposeD算子后,310性能超过T4基准性能,满足验收要求
-
-
-
 ### 4 数据集预处理
 
 #### 4.1 数据集获取
@@ -176,7 +131,7 @@ python3.7 RDN_preprocess.py --src-path=/root/dataset/set5 --save-path=./prep_dat
 
 #### 5.1 benchmark工具概述
 
-benchmark工具为华为自研的模型推理工具,支持多种模型的离线推理,能够迅速统计出模型在Ascend310上的性能,支持真实数据和纯推理两种模式,配合后处理脚本,可以实现诸多模型的端到端过程,获取工具及使用方法可以参考CANN V100R020C10 推理benchmark工具用户指南 01
+benchmark工具为华为自研的模型推理工具,支持多种模型的离线推理,能够迅速统计出模型在Ascend710上的性能,支持真实数据和纯推理两种模式,配合后处理脚本,可以实现诸多模型的端到端过程,获取工具及使用方法可以参考CANN V100R020C10 推理benchmark工具用户指南 01
 
 
 
@@ -204,34 +159,34 @@ python3.7 RDN_postprocess.py --pred-path=./result/dumpOutput_device0 --label-pat
 
 ```json
 {
-    "title": "Overall statistical evaluation", 
+    "title": "Overall statistical evaluation",
     "value": [
         {
-            "key": "Number of images", 
+            "key": "Number of images",
             "value": "5"
-        }, 
+        },
         {
-            "key": "Top1 PSNR", 
+            "key": "Top1 PSNR",
             "value": "43.16"
-        }, 
+        },
         {
-            "key": "Top2 PSNR", 
+            "key": "Top2 PSNR",
             "value": "38.9"
-        }, 
+        },
         {
-            "key": "Top3 PSNR", 
+            "key": "Top3 PSNR",
             "value": "37.56"
-        }, 
+        },
         {
-            "key": "Top4 PSNR", 
+            "key": "Top4 PSNR",
             "value": "36.8"
-        }, 
+        },
         {
-            "key": "Top5 PSNR", 
+            "key": "Top5 PSNR",
             "value": "34.94"
-        }, 
+        },
         {
-            "key": "Avg PSNR", 
+            "key": "Avg PSNR",
             "value": "38.27"
         }
     ]
@@ -242,18 +197,17 @@ python3.7 RDN_postprocess.py --pred-path=./result/dumpOutput_device0 --label-pat
 
 #### 6.2 精度对比
 
-github仓库中给出的官方精度为PSNR:38.18,npu离线推理的精度为PSNR:38.27,故精度达标
+github仓库中给出的官方精度为PSNR:38.18,310离线推理的精度为PSNR:38.27,710离线推理的精度为PSNR:38.27,故精度达标
 
 
 
 ### 7 性能对比
 
-#### 7.1 npu性能数据
+#### 7.1 npu(310)性能数据
 
-参照3.3所描述的方法,将Transpose的输入[1, 64, 2, 2, 114, 114]加入优化白名单。随后运行
+运行
 
 ```
-source env.sh
 atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --input_shape="image:1,3,114,114" --log=debug --soc_version=Ascend310
 ```
 
@@ -266,15 +220,44 @@ atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --
 batch1的性能,benchmark工具在整个数据集上推理后生成result/perf_vision_batchsize_1_device_0.txt:
 
 ```
-[e2e] throughputRate: 1.32351, latency: 3777.82
-[data read] throughputRate: 263.518, moduleLatency: 3.7948
-[preprocess] throughputRate: 3.93749, moduleLatency: 253.969
-[infer] throughputRate: 6.8117, Interface throughputRate: 7.39423, moduleLatency: 139.064
-[post] throughputRate: 3.24091, moduleLatency: 308.555
+[e2e] throughputRate: 3.60508, latency: 1386.93
+[data read] throughputRate: 5656.11, moduleLatency: 0.1768
+[preprocess] throughputRate: 25.072, moduleLatency: 39.8852
+[inference] throughputRate: 7.36218, Interface throughputRate: 7.41332, moduleLatency: 135.745
+[postprocess] throughputRate: 7.93869, moduleLatency: 125.965
+```
+
+Interface throughputRate: 7.41332,7.41332x4=29.653即是batch1 310单卡吞吐率
+bs1 310单卡吞吐率:7.41332x4=29.653fps/card
+
+
+
+#### 7.2 npu(710)性能数据
+
+运行
+
+```
+atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --input_shape="image:1,3,114,114" --log=debug --soc_version=Ascend710
+```
+
+得到size为114的om模型
+
+
+
+**benchmark工具在整个数据集上推理获得性能数据**
+
+batch1的性能,benchmark工具在整个数据集上推理后生成result/perf_vision_batchsize_1_device_0.txt:
+
+```
+[e2e] throughputRate: 3.29436, latency: 1517.74
+[data read] throughputRate: 3700.96, moduleLatency: 0.2702
+[preprocess] throughputRate: 28.5571, moduleLatency: 35.0176
+[inference] throughputRate: 39.8584, Interface throughputRate: 49.4688, moduleLatency: 24.9406
+[postprocess] throughputRate: 18.6048, moduleLatency: 53.7495
 ```
 
-Interface throughputRate: 7.39423,7.39423x4=29.577既是batch1 310单卡吞吐率
-bs1 310单卡吞吐率:7.39423x4=29.577fps/card
+Interface throughputRate: 49.4688
+bs1 710单卡吞吐率:49.469fps/card
 
 
 
@@ -291,21 +274,23 @@ trtexec --onnx=rdn_x2.onnx --fp16 --shapes=image:1x3x114x114
 gpu T4是4个device并行执行的结果,mean是时延(tensorrt的时延是batch个数据的推理时间),即吞吐率的倒数乘以batch
 
 ```
-[09/04/2021-07:06:46] [I] GPU Compute
-[09/04/2021-07:06:46] [I] min: 38.6577 ms
-[09/04/2021-07:06:46] [I] max: 44.8702 ms
-[09/04/2021-07:06:46] [I] mean: 39.3814 ms
-[09/04/2021-07:06:46] [I] median: 39.0232 ms
-[09/04/2021-07:06:46] [I] percentile: 44.8702 ms at 99%
-[09/04/2021-07:06:46] [I] total compute time: 3.07175 s
+[05/10/2022-20:47:53] [I] GPU Compute
+[05/10/2022-20:47:53] [I] min: 39.166 ms
+[05/10/2022-20:47:53] [I] max: 48.2043 ms
+[05/10/2022-20:47:53] [I] mean: 44.9095 ms
+[05/10/2022-20:47:53] [I] median: 44.9874 ms
+[05/10/2022-20:47:53] [I] percentile: 48.2043 ms at 99%
+[05/10/2022-20:47:53] [I] total compute time: 3.09876 s
 ```
 
-batch1 t4单卡吞吐率:1000/(39.3814/1)=25.393fps
+batch1 t4单卡吞吐率:1000/(44.9095/1)=22.267fps
 
 
 
 #### 7.3 性能对比
 
-batch1:29.577fps > 25.393fps
+batch1:
+710/310 = 49.469fps/29.653fps = 1.668 > 1.2
+710/T4 = 49.469fps/22.267fps = 2.222 > 1.6
 
-310单个device的吞吐率乘4即单卡吞吐率比T4单卡的吞吐率大,故310性能高于T4性能,性能达标。
\ No newline at end of file
+710单卡吞吐率大于1.2倍310单卡吞吐率,且大于1.6倍T4单卡吞吐率,故性能达标。
\ No newline at end of file
-- 
Gitee

From bcae5067aad3058899958487992e8cd667881471 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:06:09 +0000
Subject: [PATCH 03/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt.

---
 .../contrib/cv/super_resolution/RDN/requirements.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt b/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt
index 3286880e56..25aad1e4ee 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt
@@ -1,7 +1,7 @@
 torch == 1.5.0
 torchvision == 0.6.0
 onnx == 1.7.0
-numpy == 1.20.3
-Pillow == 8.2.0
-opencv-python == 4.5.2.54
+numpy == 1.21.2
+Pillow == 9.1.0
+opencv-python == 4.5.5.64
 mmcv == 1.3.12
\ No newline at end of file
-- 
Gitee

From 59b3064a7d939bfe6332155b5948b72aed0939a8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:16:42 +0000
Subject: [PATCH 04/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md.

---
 .../contrib/cv/super_resolution/RDN/test/README.md | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md
index f3fc08cb2b..19b0bc39ba 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md
@@ -28,13 +28,11 @@
 
    将benchmark.x86_64放到当前目录
 
-6. TransposeD算子性能优化
-   由于om模型中存在低性能的TransposeD算子,通过添加白名单使用高性能的Transpose算子。/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/opp/op_impl/built-in/ai_core/tbe/impl/dynamic/transpose.py里添加shape白名单:[1, 64, 2, 2, 114, 114]
 
 
 ### 2 离线推理
 
-310上执行,执行时使npu-smi info查看设备状态,确保device空闲
+710上执行,执行时使npu-smi info查看设备状态,确保device空闲
 
 ```
 bash test/pth2om.sh
@@ -45,8 +43,8 @@ bash test/eval_acc_perf.sh --datasets_path=/root/datasets
 
 **评测结果:**
 
-| 模型 | 官网pth精度 | 310离线推理精度 | 基准性能 | 310性能 |
-| :-----: | :---------: | :-------------: | :--------: | :-------: |
-| RDN bs1 | PSNR:38.18 | PSNR:38.27 | fps:25.393 | fps:29.577 |
+| 模型 | 官网pth精度 | 310离线推理精度 | 710离线推理精度 | 基准性能 | 310性能 | 710性能 |
+| :-----: | :---------: | :-------------: | :-------------: | :-------: | :-------: | :-------: |
+| RDN bs1 | PSNR:38.18 | PSNR:38.27 | PSNR:38.27 | fps:49.469 | fps:29.653 | fps:22.267 |
 
 - 因Set5数据集只有5张图片,因此仅使用了bs1进行评测。
\ No newline at end of file
-- 
Gitee

From a7534a21cd42179cee6d5632b6377a655b9ba37b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:18:49 +0000
Subject: [PATCH 05/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md.

---
 ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md
index 19b0bc39ba..00bb5c3b63 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/README.md
@@ -45,6 +45,6 @@ bash test/eval_acc_perf.sh --datasets_path=/root/datasets
 
 | 模型 | 官网pth精度 | 310离线推理精度 | 710离线推理精度 | 基准性能 | 310性能 | 710性能 |
 | :-----: | :---------: | :-------------: | :-------------: | :-------: | :-------: | :-------: |
-| RDN bs1 | PSNR:38.18 | PSNR:38.27 | PSNR:38.27 | fps:49.469 | fps:29.653 | fps:22.267 |
+| RDN bs1 | PSNR:38.18 | PSNR:38.27 | PSNR:38.27 | fps:22.267 | fps:29.653 | fps:49.469 |
 
 - 因Set5数据集只有5张图片,因此仅使用了bs1进行评测。
\ No newline at end of file
-- 
Gitee

From 75ce5507194751ae3b3305c4f24d3322c4f00ecb Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:22:53 +0000
Subject: [PATCH 06/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/test/pth2om.sh.

---
 ACL_PyTorch/contrib/cv/super_resolution/RDN/test/pth2om.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/pth2om.sh b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/pth2om.sh
index 3f131616c6..09ac1b6c85 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/pth2om.sh
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/pth2om.sh
@@ -9,7 +9,7 @@ fi
 
 rm -rf rdn_x2_bs1.om
 source env.sh
-atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --input_shape="image:1,3,114,114" --log=debug --soc_version=Ascend310
+atc --framework=5 --model=rdn_x2.onnx --output=rdn_x2_bs1 --input_format=NCHW --input_shape="image:1,3,114,114" --log=debug --soc_version=Ascend710
 
 if [ -f "rdn_x2_bs1.om" ]; then
     echo "success"
-- 
Gitee

From 1bc5d4846b8003b8e97e0ca6b912023c9c28aed1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:29:18 +0000
Subject: [PATCH 07/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md.

---
 ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md b/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md
index 59bebb133b..5d782767b7 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/README.md
@@ -33,7 +33,7 @@ onnx = 1.7.0
 numpy == 1.21.2
 Pillow == 9.1.0
 opencv-python == 4.5.5.64
-mmcv == 1.3.12
+mmcv == 1.5.1
 ```
 
 **说明:**
-- 
Gitee

From fb50112c13d243598be8262e99a988661989263d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:29:37 +0000
Subject: [PATCH 08/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt.

---
 ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt b/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt
index 25aad1e4ee..ee15f80932 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/requirements.txt
@@ -4,4 +4,4 @@ onnx == 1.7.0
 numpy == 1.21.2
 Pillow == 9.1.0
 opencv-python == 4.5.5.64
-mmcv == 1.3.12
\ No newline at end of file
+mmcv == 1.5.1
\ No newline at end of file
-- 
Gitee

From 3aa1c1f9c9847a127d90eb46bd3747d344f7d150 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 15:33:13 +0000
Subject: [PATCH 09/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/test/parse.py.

---
 ACL_PyTorch/contrib/cv/super_resolution/RDN/test/parse.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/parse.py b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/parse.py
index 8c265696ff..9363115260 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/parse.py
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/parse.py
@@ -27,5 +27,5 @@ if __name__ == '__main__':
         with open(result_txt, 'r') as f:
             content = f.read()
         txt_data_list = [i.strip() for i in re.findall(r':(.*?),', content.replace('\n', ',') + ',')]
-        fps = float(txt_data_list[7].replace('samples/s', '')) * 4
-        print('310 bs{} fps:{:.3f}'.format(result_txt.split('_')[3], fps))
\ No newline at end of file
+        fps = float(txt_data_list[7].replace('samples/s', ''))
+        print('710 bs{} fps:{:.3f}'.format(result_txt.split('_')[3], fps))
\ No newline at end of file
-- 
Gitee

From fd0f80da3a01938484728e75273df89a2c2d7d26 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E9=83=91=E7=90=B3?= <10843194+zl2170906@user.noreply.gitee.com>
Date: Tue, 24 May 2022 16:31:09 +0000
Subject: [PATCH 10/10] update ACL_PyTorch/contrib/cv/super_resolution/RDN/test/eval_acc_perf.sh.

---
 .../contrib/cv/super_resolution/RDN/test/eval_acc_perf.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/eval_acc_perf.sh b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/eval_acc_perf.sh
index 551c59a730..2a70867264 100644
--- a/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/eval_acc_perf.sh
+++ b/ACL_PyTorch/contrib/cv/super_resolution/RDN/test/eval_acc_perf.sh
@@ -22,7 +22,7 @@ if [ $? != 0 ]; then
     echo "fail!"
     exit -1
 fi
-source env.sh
+source /usr/local/Ascend/ascend-toolkit/set_env.sh
 rm -rf result/dumpOutput_device0
 ./benchmark.${arch} -model_type=vision -device_id=0 -batch_size=1 -om_path=./rdn_x2_bs1.om -input_text_path=./RDN_prep_bin.info -input_width=114 -input_height=114 -output_binary=False -useDvpp=False
 if [ $? != 0 ]; then
-- 
Gitee
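
Reviewer note (not part of the patch series): a minimal sketch of the throughput arithmetic behind README section 7 and the updated test/parse.py. It assumes the benchmark perf file has the layout quoted in section 7.2. The regular expression and the index 7 are taken from parse.py; the x4 factor for the 310 card and the T4 formula `1000/(mean/batch)` are taken from the README. The names `PERF_710`, `interface_throughput`, `fps_310`, `fps_710` and `fps_t4` are hypothetical and exist only for this illustration.

```python
import re

# Sample perf output copied from README section 7.2 (Ascend710, batch size 1).
PERF_710 = """[e2e] throughputRate: 3.29436, latency: 1517.74
[data read] throughputRate: 3700.96, moduleLatency: 0.2702
[preprocess] throughputRate: 28.5571, moduleLatency: 35.0176
[inference] throughputRate: 39.8584, Interface throughputRate: 49.4688, moduleLatency: 24.9406
[postprocess] throughputRate: 18.6048, moduleLatency: 53.7495
"""


def interface_throughput(content):
    """Return the 'Interface throughputRate' value, using parse.py's regex.

    Every 'key: value' pair is captured in file order; the eighth value
    (index 7) is the Interface throughputRate of the inference stage.
    """
    values = [v.strip() for v in
              re.findall(r':(.*?),', content.replace('\n', ',') + ',')]
    return float(values[7].replace('samples/s', ''))


# 710: the per-card rate is used directly (the patch drops the "* 4").
fps_710 = interface_throughput(PERF_710)
# 310: the README multiplies the per-device rate by 4 devices per card.
fps_310 = 7.41332 * 4
# T4: throughput is the reciprocal of the mean latency in ms, times batch size.
fps_t4 = 1000 / (44.9095 / 1)

print('710 fps: {:.3f}'.format(fps_710))
print('310 fps: {:.3f}'.format(fps_310))
print('T4  fps: {:.3f}'.format(fps_t4))
print('710/310 = {:.3f} (README threshold 1.2)'.format(fps_710 / fps_310))
print('710/T4  = {:.3f} (README threshold 1.6)'.format(fps_710 / fps_t4))
```

The sketch only re-derives the figures quoted in the README from the logged values; it does not run the benchmark tool or replace test/parse.py.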