diff --git a/tutorials/source_en/advanced_use/customized_debugging_information.md b/tutorials/source_en/advanced_use/customized_debugging_information.md
index c59a0eb40c4f192e30ae614606e24519a1a07d47..cf5fed5aac7266be739e0aac4bc849b4392b0025 100644
--- a/tutorials/source_en/advanced_use/customized_debugging_information.md
+++ b/tutorials/source_en/advanced_use/customized_debugging_information.md
@@ -221,6 +221,8 @@ val:[[1 1]
 
 When the training result deviates from the expectation on Ascend, the input and output of the operator can be dumped for debugging through Asynchronous Data Dump.
 
+> Operators in the `comm_ops` category are not supported by Asynchronous Data Dump. The `comm_ops` category can be found in the [Operator List](https://www.mindspore.cn/docs/en/master/operator_list.html).
+
 1. Turn on the switch to save graph IR: `context.set_context(save_graphs=True)`.
 2. Execute training script.
 3. Open `hwopt_d_end_graph_{graph id}.ir` in the directory you execute the script and find the name of the operators you want to Dump.
@@ -244,6 +246,9 @@ When the training result deviates from the expectation on Ascend, the input and
     }
     ```
 
+    > - In non-data-sink mode, `iteration` should be set to 0, and data of every iteration will be dumped.
+    > - In data sink mode, `iteration` needs to be increased by 1. For example, data of the `GetNext` operator is dumped in iteration 0, while data of the real compute graph is dumped in iteration 1.
+
 5. Set environment variables.
 
     ```bash
@@ -252,9 +257,8 @@ When the training result deviates from the expectation on Ascend, the input and
     export DATA_DUMP_CONFIG_PATH=data_dump.json
     ```
 
-    > Set the environment variables before executing the training script. Setting environment variables during training will not take effect.
-
-    > Dump environment variables need to be configured before calling `mindspore.communication.management.init`.
+    > - Set the environment variables before executing the training script; setting them while training is running will not take effect.
+    > - In distributed scenarios, the Dump environment variables need to be configured before calling `mindspore.communication.management.init`.
 
 6. Execute the training script again.
 7. Parse the Dump file.
diff --git a/tutorials/source_zh_cn/advanced_use/customized_debugging_information.md b/tutorials/source_zh_cn/advanced_use/customized_debugging_information.md
index d59e35de20445c9c377d726292ff6ece753f3b88..c2fdbb5213809c09d4db141602dc291c8e708e2d 100644
--- a/tutorials/source_zh_cn/advanced_use/customized_debugging_information.md
+++ b/tutorials/source_zh_cn/advanced_use/customized_debugging_information.md
@@ -221,6 +221,8 @@ val:[[1 1]
 
 在Ascend环境上执行训练,当训练结果和预期有偏差时,可以通过异步数据Dump功能保存算子的输入输出进行调试。
 
+> 异步数据Dump不支持`comm_ops`类别的算子,算子类别详见[算子支持列表](https://www.mindspore.cn/docs/zh-CN/master/operator_list.html)。
+
 1. 开启IR保存开关: `context.set_context(save_graphs=True)`。
 2. 执行网络脚本。
 3. 查看执行目录下的`hwopt_d_end_graph_{graph id}.ir`,找到需要Dump的算子名称。
@@ -244,6 +246,9 @@ val:[[1 1]
     }
     ```
 
+    > - 非数据下沉模式下,iteration需要设置成0,并且会Dump出每个iteration的数据。
+    > - 数据下沉模式下,iteration需要增加1。例如iteration设置为0时会Dump出GetNext算子的数据,而设置为1时才会Dump真正的计算图的数据。
+
 5. 设置数据Dump的环境变量。
 
     ```bash
@@ -252,9 +257,8 @@ val:[[1 1]
     export DATA_DUMP_CONFIG_PATH=data_dump.json
     ```
 
-    > 在网络脚本执行前,设置好环境变量;网络脚本执行过程中设置将会不生效。
-
-    > 在分布式场景下,Dump环境变量需要调用`mindspore.communication.management.init`之前配置。
+    > - 在网络脚本执行前,设置好环境变量;网络脚本执行过程中设置将不会生效。
+    > - 在分布式场景下,Dump环境变量需要在调用`mindspore.communication.management.init`之前配置。
 
 6. 再次执行用例进行异步数据Dump。
 7. 解析文件。
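
The notes in both hunks prescribe an ordering that is easy to get wrong when the switches are set from Python rather than from a shell script: the Dump environment variables must exist before the training script does any real work and, in distributed scenarios, before `mindspore.communication.management.init` is called. Below is a minimal sketch of that ordering. Only `DATA_DUMP_CONFIG_PATH` is visible in the hunks; `ENABLE_DATA_DUMP` and `DATA_DUMP_PATH` are assumed names for the other exports in the surrounding bash block, and the dump path is a placeholder.

```python
import os

# Dump switches must be in place before the training script starts real work.
# ENABLE_DATA_DUMP and DATA_DUMP_PATH are assumed variable names; only
# DATA_DUMP_CONFIG_PATH appears in the hunks above.
os.environ["ENABLE_DATA_DUMP"] = "1"               # assumed master switch
os.environ["DATA_DUMP_PATH"] = "/tmp/async_dump"   # assumed dump directory
os.environ["DATA_DUMP_CONFIG_PATH"] = "data_dump.json"

from mindspore import context
from mindspore.communication.management import init

# Step 1 of the tutorial: save graph IR so hwopt_d_end_graph_{graph id}.ir is
# generated and the operator names for data_dump.json can be looked up.
context.set_context(save_graphs=True)

# In distributed scenarios, init() must run only after the variables above are
# set; configuring them later has no effect on the current run.
init()
```

In practice the exports usually live in the launch shell script, as in the bash block both hunks touch; the sketch only illustrates the before-`init` requirement when everything is driven from Python.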