From e996db6c6dd39c25a62c5e422f6f744333babe40 Mon Sep 17 00:00:00 2001
From: gzhcv
Date: Sun, 7 Feb 2021 20:24:32 +0800
Subject: [PATCH] Add the gpu step_trace docs

---
 .../source_en/advanced_use/performance_profiling.md     | 4 ++--
 .../source_en/advanced_use/performance_profiling_gpu.md | 8 ++++++++
 .../source_zh_cn/advanced_use/performance_profiling.md  | 4 ++--
 .../advanced_use/performance_profiling_gpu.md           | 8 ++++++++
 4 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/tutorials/training/source_en/advanced_use/performance_profiling.md b/tutorials/training/source_en/advanced_use/performance_profiling.md
index 9b6a0473c2..db1df910bd 100644
--- a/tutorials/training/source_en/advanced_use/performance_profiling.md
+++ b/tutorials/training/source_en/advanced_use/performance_profiling.md
@@ -95,8 +95,8 @@ Figure 2 displays the Step Trace page. The Step Trace detail will show the start
 
 In order to divide the stages, the Step Trace Component need to figure out the forward propagation start operator and the backward propagation end operator. MindSpore will automatically figure out the two operators to reduce the profiler configuration work. The first operator after `get_next` will be selected as the forward start operator and the operator before the last all reduce will be selected as the backward end operator. **However, Profiler do not guarantee that the automatically selected operators will meet the user's expectation in all cases.** Users can set the two operators manually as follows:
 
-- Set environment variable `FP_POINT` to configure the forward start operator, for example, `export FP_POINT=fp32_vars/conv2d/BatchNorm`.
-- Set environment variable `BP_POINT` to configure the backward end operator, for example, `export BP_POINT=loss_scale/gradients/AddN_70`.
+- Set environment variable `PROFILING_FP_START` to configure the forward start operator, for example, `export PROFILING_FP_START=fp32_vars/conv2d/BatchNorm`.
+- Set environment variable `PROFILING_BP_END` to configure the backward end operator, for example, `export PROFILING_BP_END=loss_scale/gradients/AddN_70`.
 
 #### Operator Performance Analysis
 
diff --git a/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md b/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md
index 3ed1296399..05ec588d77 100644
--- a/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md
+++ b/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md
@@ -130,6 +130,14 @@ The usage is almost the same as that in Ascend. The difference is GPU Timeline d
 >
 >
 
+#### Timeline Analysis
+
+The usage is almost the same as that in Ascend.
+
+> The usage is described as follows:
+>
+>
+
 #### Data Preparation Analysis
 
 The usage is almost the same as that in Ascend.
diff --git a/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md b/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md
index b0ef4c194a..669965e09c 100644
--- a/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md
+++ b/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md
@@ -99,8 +99,8 @@ profiler.analyse()
 
 迭代轨迹在做阶段划分时,需要识别前向计算开始的算子和反向计算结束的算子。为了降低用户使用Profiler的门槛,MindSpore会对这两个算子做自动识别,方法为: 前向计算开始的算子指定为`get_next`算子之后连接的第一个算子,反向计算结束的算子指定为最后一次all reduce之前连接的算子。**Profiler不保证在所有情况下自动识别的结果和用户的预期一致,用户可以根据网络的特点自行调整**,调整方法如下:
 
-- 设置`FP_POINT`环境变量指定前向计算开始的算子,如`export FP_POINT=fp32_vars/conv2d/BatchNorm`。
-- 设置`BP_POINT`环境变量指定反向计算结束的算子,如`export BP_POINT=loss_scale/gradients/AddN_70`。
+- 设置`PROFILING_FP_START`环境变量指定前向计算开始的算子,如`export PROFILING_FP_START=fp32_vars/conv2d/BatchNorm`。
+- 设置`PROFILING_BP_END`环境变量指定反向计算结束的算子,如`export PROFILING_BP_END=loss_scale/gradients/AddN_70`。
 
 #### 算子性能分析
 
diff --git a/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md b/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md
index 1be993d8a8..4c6a35e005 100644
--- a/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md
+++ b/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md
@@ -132,6 +132,14 @@ GPU场景下,Timeline分析的使用方法和Ascend场景相同,不同之处
 >
 >
 
+#### 迭代轨迹分析
+
+GPU场景下,迭代轨迹分析的使用方法和Ascend场景相同,使用方法参考:
+
+> 与Ascend使用方式一致,可以参考:
+>
+>
+
 #### 数据准备性能分析
 
 GPU场景下,数据准备性能分析的使用方法和Ascend场景相同,使用方法参考:
-- 
Gitee