From e907816cd82571c47b25c5677fecf149728d9023 Mon Sep 17 00:00:00 2001 From: qqqhhhbbb Date: Mon, 21 Jul 2025 15:27:10 +0800 Subject: [PATCH 1/3] debug docs modified --- tutorials/source_en/debug/dump.md | 42 ++++++++++++++-------------- tutorials/source_zh_cn/debug/dump.md | 38 ++++++++++++------------- 2 files changed, 40 insertions(+), 40 deletions(-) diff --git a/tutorials/source_en/debug/dump.md b/tutorials/source_en/debug/dump.md index 158abd0b8f..161d79a65d 100644 --- a/tutorials/source_en/debug/dump.md +++ b/tutorials/source_en/debug/dump.md @@ -10,7 +10,7 @@ The MindSpore Dump functionality has been gradually migrated to the [msprobe too > [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe) is a toolkit under the MindStudio Training Tools suite, specifically for accuracy debugging. It primarily includes functionalities such as accuracy pre-inspection, overflow detection, and accuracy comparison. Currently, it is compatible with the PyTorch and MindSpore frameworks. -The Dump features for dynamic graphs and static graphs in Ascend O2 mode have been fully migrated to the msprobe tool and are enabled through the msprobe tool entry point. For more details, please refer to the [msprobe Tool MindSpore Scenario Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). +The Dump features for dynamic graphs and static graphs in Ascend GE mode have been fully migrated to the msprobe tool and are enabled through the msprobe tool entry point. For more details, please refer to the [msprobe Tool MindSpore Scenario Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). For graphs in Ascend OO/O1 modes and CPU/GPU modes, these functionalities are still enabled through the framework entry points but will be gradually migrated to the msprobe tool in subsequent updates. @@ -18,13 +18,13 @@ For graphs in Ascend OO/O1 modes and CPU/GPU modes, these functionalities are st In different modes, the Dump features supported by MindSpore are not entirely the same, and the required configuration files and the generated data formats vary accordingly. Therefore, you need to select the corresponding Dump configuration based on the running mode: -- [Dump in Ascend O0/O1 Mode](#dump-in-ascend-o0o1-mode) -- [Dump in Ascend O2 Mode](#dump-in-ascend-o2-mode) +- [Dump in Ascend ms_backend Mode](#dump-in-ascend-o0o1-mode) +- [Dump in Ascend GE Mode](#dump-in-ascend-GE-mode) - [Dump in CPU/GPU mode](#dump-in-cpugpu-mode) -> - The differences between Ascend O0, O1, and O2 modes can be found in [the parameter jit_level of the set_context method](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.set_context.html). +> - The differences between Ascend O0, O1, and GE modes can be found in [the parameter jit_level of the set_context method](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.set_context.html). > -> - Dumping constant data is only supported in CPU/GPU mode, while not supported in Ascend O0/O1/O2 mode. +> - Dumping constant data is only supported in CPU/GPU mode, while not supported in Ascend ms_backend/GE mode. > > - Currently, Dump does not support heterogeneous training, meaning it does not support CPU/Ascend mixed training or GPU/Ascend mixed training. 
@@ -33,7 +33,7 @@ MindSpore supports different Dump functionalities under various modes, as shown - + @@ -100,7 +100,7 @@ MindSpore supports different Dump functionalities under various modes, as shown > In terms of statistics, the computing speed of the device is faster than that of the host(currently only supported on Ascend backend), but the host has more statistical indicators than the device. Refer to the `statistic_category` option for details. -## Dump in Ascend O0/O1 Mode +## Dump in Ascend ms_backend Mode ### Dump Step @@ -139,7 +139,7 @@ MindSpore supports different Dump functionalities under various modes, as shown - `input_output`: 0: dump input and output of kernel, 1:dump input of kernel, 2:dump output of kernel. When `op_debug_mode` is set to 3, `input_output` can only be set to save both the operator's inputs and outputs. Only input of kernel can be saved when "op_debug_mode" is set to `4`. - `kernels`: This item can be configured in three formats: 1. List of operator names. Turn on the IR save switch by setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 and execute the network to obtain the operator name from the generated `trace_code_graph_{graph_id}`IR file. For details, please refer to [Saving IR](https://www.mindspore.cn/tutorials/en/master/debug/error_analysis/mindir.html#saving-ir). - Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend O0/O1 Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). + Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend ms_backend Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). 2. You can also specify an operator type. When there is no operator scope information or operator id information in the string, the background considers it as an operator type, such as "conv". The matching rule of operator type is: when the operator name contains an operator type string, the matching is considered successful (case insensitive). For example, "conv" can match operators "Conv2D-op1234" and "Conv3D-op1221". 3. Regular expressions are supported. When the string conforms to the format of "name-regex(xxx)", it would be considered a regular expression. For example, "name-regex(Default/.+)" can match all operators with names starting with "Default/". - `support_device`: Supported devices, default setting is `[0,1,2,3,4,5,6,7]`. In distributed training scenarios where data on individual devices needs to be dumped, you can specify only the device Id that needs to be dumped in `support_device`. This configuration parameter is invalid on the CPU, because there is no concept of device on the CPU, but it is still need to reserve this parameter in the json file. @@ -208,11 +208,11 @@ MindSpore supports different Dump functionalities under various modes, as shown You can set `set_context(reserve_class_name_in_scope=False)` in your training script to avoid dump failure because of file name is too long. -4. 
Read and parse dump data through `numpy.load`, refer to [Introduction to Ascend O0/O1 Dump Data File](#introduction-to-data-object-directory-and-data-file). +4. Read and parse dump data through `numpy.load`, refer to [Introduction to Ascend ms_backend Dump Data File](#introduction-to-data-object-directory-and-data-file). ### Introduction to Data Object Directory and Data File -After starting the training, the data objects saved under the Ascend O0/O1 Dump mode include the final execution graph (`ms_output_trace_code_graph_{graph_id}.ir` file) and the input and output data of the operators in the graph. The data directory structure is as follows: +After starting the training, the data objects saved under the Ascend ms_backend Dump mode include the final execution graph (`ms_output_trace_code_graph_{graph_id}.ir` file) and the input and output data of the operators in the graph. The data directory structure is as follows: ```text {path}/ @@ -266,7 +266,7 @@ Only when `save_kernel_args` is `true`, `{op_type}.{op_name}.json` is generated This JSON indicates that both initialization parameters `transpose_a` and `transpose_b` of the `Matmul` operator have the value `False`. -The data file generated by the Ascend O0/O1 Dump is a binary file with the suffix `.npy`, and the file naming format is: +The data file generated by the Ascend ms_backend Dump is a binary file with the suffix `.npy`, and the file naming format is: ```text {op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.{dtype}.npy @@ -274,9 +274,9 @@ The data file generated by the Ascend O0/O1 Dump is a binary file with the suffi User can use Numpy interface `numpy.load` to read the data. -The statistics file generated by the Ascend O0/O1 dump is named `statistic.csv`. This file stores key statistics for all tensors dumped under the same directory as itself (with the file names `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`). Each row in `statistic.csv` summarizes a single tensor, each row contains the statistics: Op Type, Op Name, Task ID, Stream ID, Timestamp, IO, Slot, Data Size, Data Type, Shape, and statistics items configured by the user. Note that opening this file with Excel may cause data to be displayed incorrectly. Please use commands like `vi` or `cat`, or use Excel to import csv from text for viewing. +The statistics file generated by the Ascend ms_backend dump is named `statistic.csv`. This file stores key statistics for all tensors dumped under the same directory as itself (with the file names `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`). Each row in `statistic.csv` summarizes a single tensor, each row contains the statistics: Op Type, Op Name, Task ID, Stream ID, Timestamp, IO, Slot, Data Size, Data Type, Shape, and statistics items configured by the user. Note that opening this file with Excel may cause data to be displayed incorrectly. Please use commands like `vi` or `cat`, or use Excel to import csv from text for viewing. 
-The suffixes of the final execution graph files generated by Ascend O0/O1 Dump are `.pb` and `.ir` respectively, and the file naming format is: +The suffixes of the final execution graph files generated by Ascend ms_backend Dump are `.pb` and `.ir` respectively, and the file naming format is: ```text ms_output_trace_code_graph_{graph_id}.pb @@ -285,7 +285,7 @@ ms_output_trace_code_graph_{graph_id}.ir The files with the suffix `.ir` can be opened and viewed by the `vi` command. -The suffix of the node execution sequence file generated by the Ascend O0/O1 Dump is `.csv`, and the file naming format is: +The suffix of the node execution sequence file generated by the Ascend ms_backend Dump is `.csv`, and the file naming format is: ```text ms_execution_order_graph_{graph_id}.csv @@ -293,7 +293,7 @@ ms_execution_order_graph_{graph_id}.csv ### Data Analysis Sample -In order to better demonstrate the process of using dump to save and analyze data, we provide a set of [complete sample script](https://gitee.com/mindspore/docs/tree/master/docs/sample_code/dump) , you only need to execute `bash dump_sync_dump.sh` for Ascend O0/O1 dump. +In order to better demonstrate the process of using dump to save and analyze data, we provide a set of [complete sample script](https://gitee.com/mindspore/docs/tree/master/docs/sample_code/dump) , you only need to execute `bash dump_sync_dump.sh` for Ascend ms_backend dump. After the graph corresponding to the script is saved to the disk through the Dump function, the final execution graph file `ms_output_trace_code_graph_{graph_id}.ir` will be generated. This file saves the stack information of each operator in the corresponding graph, and records the generation script corresponding to the operator. @@ -431,9 +431,9 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.float Generate the numpy.array data. -## Dump in Ascend O2 Mode +## Dump in Ascend GE Mode -O2 mode Dump under Ascend has been migrated to the msprobe tool. For more details, please see [msprobe Tool MindSpore Scene Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). +GE mode Dump under Ascend has been migrated to the msprobe tool. For more details, please see [msprobe Tool MindSpore Scene Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). For data collection methods, please refer to the example code in [Graph Scenario Data Collection with msprobe](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#71-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF); @@ -489,7 +489,7 @@ For detailed configuration descriptions, please refer to the [Introduction to co - `input_output`: 0: dump input and output of kernel, 1: dump input of kernel, 2: dump output of kernel. Only input of kernel can be saved when "op_debug_mode" is set to `4`. - `kernels`: This item can be configured in three formats: 1. List of operator names. Turn on the IR save switch by setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 and execute the network to obtain the operator name from the generated `trace_code_graph_{graph_id}`IR file. For details, please refer to [Saving IR](https://www.mindspore.cn/tutorials/en/master/debug/error_analysis/mindir.html#saving-ir). 
- Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend O0/O1 Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). + Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend ms_backend Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). 2. You can also specify an operator type. When there is no operator scope information or operator id information in the string, the background considers it as an operator type, such as "conv". The matching rule of operator type is: when the operator name contains an operator type string, the matching is considered successful (case insensitive). For example, "conv" can match operators "Conv2D-op1234" and "Conv3D-op1221". 3. Regular expressions are supported. When the string conforms to the format of "name-regex(xxx)", it would be considered a regular expression. For example, "name-regex(Default/.+)" can match all operators with names starting with "Default/". - `support_device`: Supported devices, default setting is `[0,1,2,3,4,5,6,7]`. You can specify specific device ids to dump specific device data. This configuration parameter is invalid on the CPU, because there is no concept of device on the CPU, but it is still need to reserve this parameter in the json file. @@ -786,7 +786,7 @@ Generate the numpy.array data. - Dump only supports saving data with type of bool, int, int8, in16, int32, int64, uint, uint8, uint16, uint32, uint64, float, float16, float32, float64, bfloat16, double, complex64 and complex128. - Complex64 and complex128 only support saving as npy files, not as statistics information. - The Print operator has an input parameter with type of string, which is not a data type supported by Dump. Therefore, when the Print operator is included in the script, there will be an error log, which will not affect the saving data of other types. -- When Ascend O2 dump is enabled, lite exception dump is not supported by using set_context(ascend_config={"exception_dump": "2"}), while full exception dump is supported by using set_context(ascend_config={"exception_dump": "1"}). -- When Ascend O2 dump is enabled, sink size can only be set to 1. User can use [Model.train()](https://www.mindspore.cn/docs/en/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train) or [data_sink()](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.data_sink.html) to set up sink size. -- When Ascend O2 dump is enabled, if **statistical value dumping** is performed in scenarios with a large amount of data (such as when the network itself is of a large scale or multiple steps are dumped consecutively), it may cause the host-side memory to become full, leading to a failure in data flow synchronization. 
It is recommended to replace it with the new version of [**statistical value dumping**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF). +- When Ascend GE dump is enabled, lite exception dump is not supported by using set_context(ascend_config={"exception_dump": "2"}), while full exception dump is supported by using set_context(ascend_config={"exception_dump": "1"}). +- When Ascend GE dump is enabled, sink size can only be set to 1. User can use [Model.train()](https://www.mindspore.cn/docs/en/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train) or [data_sink()](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.data_sink.html) to set up sink size. +- When Ascend GE dump is enabled, if **statistical value dumping** is performed in scenarios with a large amount of data (such as when the network itself is of a large scale or multiple steps are dumped consecutively), it may cause the host-side memory to become full, leading to a failure in data flow synchronization. It is recommended to replace it with the new version of [**statistical value dumping**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF). - By default, Dump ignores invalid operator outputs, such as the outputs of the Send/Print operator or the third reserved output of the FlashAttentionScore operator. If you need to retain these invalid outputs, you can set the environment variable `MINDSPORE_DUMP_IGNORE_USELESS_OUTPUT` to `0`. For details, please refer to [Environment Variables - Dump Debugging](https://www.mindspore.cn/docs/en/master/api_python/env_var_list.html#dump-debugging). 
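The English page patched above keeps the `numpy.load` workflow and the `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.{dtype}.npy` naming format unchanged; only the mode names are renamed. As a quick way to sanity-check a dump directory produced the way that page describes, a small host-side sketch such as the one below can help. It is an illustration only, not part of this patch or of MindSpore: the `dump_path` value is a placeholder, the recursive walk deliberately assumes nothing about the rank/graph/iteration sub-directories, and the field split simply follows the naming format quoted above.

```python
from pathlib import Path

import numpy as np

# Placeholder: the directory configured as "path" in the dump JSON file.
dump_path = Path("/tmp/mindspore_dump")

for npy_file in sorted(dump_path.rglob("*.npy")):
    tensor = np.load(npy_file)
    if tensor.size == 0:  # skip empty tensors so min/max stay well-defined
        continue
    # Names follow {op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.
    # {input_output_index}.{slot}.{format}.{dtype}.npy; splitting from the
    # right keeps operator names that contain dots intact.
    fields = npy_file.stem.rsplit(".", 7)
    if len(fields) == 8:
        op, task_id, stream_id, timestamp, io, slot, fmt, dtype = fields
    else:  # unexpected name, fall back to the raw stem
        op, io, slot = npy_file.stem, "?", "?"
    print(f"{op} {io}[{slot}] shape={tensor.shape} dtype={tensor.dtype} "
          f"min={tensor.min()} max={tensor.max()} l2norm={np.linalg.norm(tensor)}")
```

The printed min/max/l2norm mirror the host-side statistics that the `statistic.csv` file described above records for the same tensor files.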
diff --git a/tutorials/source_zh_cn/debug/dump.md b/tutorials/source_zh_cn/debug/dump.md index 9daebda6ef..13934390d5 100644 --- a/tutorials/source_zh_cn/debug/dump.md +++ b/tutorials/source_zh_cn/debug/dump.md @@ -10,7 +10,7 @@ MindSpore Dump功能已陆续迁移到[msprobe工具](https://gitee.com/ascend/m > [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe) 是 MindStudio Training Tools 工具链下精度调试部分的工具包。主要包括精度预检、溢出检测和精度比对等功能,目前适配 PyTorch 和 MindSpore 框架。 -其中动态图、静态图Ascend O2模式Dump已完全迁移到msprobe工具,通过msprobe工具入口使能,详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 +其中动态图、静态图Ascend GE模式Dump已完全迁移到msprobe工具,通过msprobe工具入口使能,详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 静态图Ascend OO/O1和CPU/GPU模式仍然通过框架入口使能,后续会陆续迁移到msprobe工具。 @@ -18,13 +18,13 @@ MindSpore Dump功能已陆续迁移到[msprobe工具](https://gitee.com/ascend/m MindSpore在不同模式下支持的Dump功能不完全相同,需要的配置文件和以及生成的数据格式也不同,因此需要根据运行的模式选择对应的Dump配置: -- [Ascend下O0/O1模式Dump](#ascend下o0o1模式dump) -- [Ascend下O2模式Dump](#ascend下o2模式dump) +- [Ascend下ms_backend模式Dump](#ascend下o0o1模式dump) +- [Ascend下GE模式Dump](#ascend下GE模式dump) - [CPU/GPU模式Dump](#cpugpu模式dump) -> - Ascend下O0/O1/O2模式的区别请见[set_context的参数jit_level](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.set_context.html)。 +> - Ascend下ms_backend/GE模式的区别请见[set_context的参数jit_level](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.set_context.html)。 > -> - CPU/GPU模式支持dump常量数据,Ascend O0/O1/O2模式不支持Dump常量数据。 +> - CPU/GPU模式支持dump常量数据,Ascend ms_backend/GE模式不支持Dump常量数据。 > > - Dump暂不支持异构训练,即不支持CPU/Ascend混合训练或GPU/Ascend混合训练。 @@ -33,7 +33,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示:
FeatureAscend O0/O1Ascend ms_backend CPU/GPU
- + @@ -100,7 +100,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: > 在统计信息方面,device计算速度较host快(目前仅支持Ascend后端),但host统计指标比device多,详见`statistic_category`选项。 -## Ascend下O0/O1模式Dump +## Ascend下ms_backend模式Dump ### 操作步骤 @@ -139,7 +139,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: - `input_output`:设置成0,表示Dump出算子的输入和算子的输出;设置成1,表示Dump出算子的输入;设置成2,表示Dump出算子的输出。在op_debug_mode设置为3时,只能设置`input_output`为同时保存算子输入和算子输出。在op_debug_mode设置为4时,只能保存算子输入。 - `kernels`:该项可以配置三种格式: 1. 算子的名称列表。通过设置环境变量`MS_DEV_SAVE_GRAPHS`的值为2开启IR保存开关并执行用例,从生成的IR文件`trace_code_graph_{graph_id}`中获取算子名称。详细说明可以参照教程:[如何保存IR](https://www.mindspore.cn/tutorials/zh-CN/master/debug/error_analysis/mindir.html#如何保存ir)。 - 需要注意的是,是否设置环境变量`MS_DEV_SAVE_GRAPHS`的值为2可能会导致同一个算子的id不同,所以在Dump指定算子时要在获取算子名称之后保持这一项设置不变。或者也可以在Dump保存的`ms_output_trace_code_graph_{graph_id}.ir`文件中获取算子名称,参考[Ascend O0/O1模式下Dump数据对象目录](#数据对象目录和数据文件介绍)。 + 需要注意的是,是否设置环境变量`MS_DEV_SAVE_GRAPHS`的值为2可能会导致同一个算子的id不同,所以在Dump指定算子时要在获取算子名称之后保持这一项设置不变。或者也可以在Dump保存的`ms_output_trace_code_graph_{graph_id}.ir`文件中获取算子名称,参考[Ascend ms_backend模式下Dump数据对象目录](#数据对象目录和数据文件介绍)。 2. 还可以指定算子类型。当字符串中不带算子scope信息和算子id信息时,后台则认为其为算子类型,例如:"conv"。算子类型的匹配规则为:当发现算子名中包含算子类型字符串时,则认为匹配成功(不区分大小写),例如:"conv" 可以匹配算子 "Conv2D-op1234"、"Conv3D-op1221"。 3. 算子名称的正则表达式。当字符串符合"name-regex(xxx)"格式时,后台则会将其作为正则表达式。例如,"name-regex(Default/.+)"可匹配算子名称以"Default/"开头的所有算子。 - `support_device`:支持的设备,默认设置成0到7即可;在分布式训练场景下,需要dump个别设备上的数据,可以只在`support_device`中指定需要Dump的设备Id。该配置参数在CPU上无效,因为CPU下没有device这个概念,但是在json格式的配置文件中仍需保留该字段。 @@ -208,11 +208,11 @@ MindSpore在不同模式下支持的Dump功能如下表所示: 可以在训练脚本中设置`set_context(reserve_class_name_in_scope=False)`,避免Dump文件名称过长导致Dump数据文件生成失败。 -4. 通过`numpy.load`读取和解析Dump数据,参考[Ascend O0/O1模式下Dump数据文件介绍](#数据对象目录和数据文件介绍)。 +4. 通过`numpy.load`读取和解析Dump数据,参考[Ascend ms_backend模式下Dump数据文件介绍](#数据对象目录和数据文件介绍)。 ### 数据对象目录和数据文件介绍 -启动训练后,Ascend O0/O1模式下Dump保存的数据对象包括最终执行图(`ms_output_trace_code_graph_{graph_id}.ir`文件)以及图中算子的输入和输出数据,数据目录结构如下所示: +启动训练后,Ascend ms_backend模式下Dump保存的数据对象包括最终执行图(`ms_output_trace_code_graph_{graph_id}.ir`文件)以及图中算子的输入和输出数据,数据目录结构如下所示: ```text {path}/ @@ -266,7 +266,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: 代表`Matmul`算子的两个初始化参数`transpose_a`和`transpose_b`的值均为`False`。 -Ascend O0/O1模式下Dump生成的数据文件是后缀名为`.npy`的文件,文件命名格式为: +Ascend ms_backend模式下Dump生成的数据文件是后缀名为`.npy`的文件,文件命名格式为: ```text {op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.{dtype}.npy @@ -274,9 +274,9 @@ Ascend O0/O1模式下Dump生成的数据文件是后缀名为`.npy`的文件, 可以用Numpy的`numpy.load`接口读取数据。 -Ascend O0/O1模式下生成的统计数据文件名为`statistic.csv`,此文件存有相同目录下所有落盘张量(文件名为`{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`)的统计信息。每个张量一行,每行有张量的 Op Type、Op Name、Task ID、Stream ID、Timestamp、IO、Slot、Data Size、Data Type、Shape以及用户配置的统计信息项。注意,如果用Excel来打开此文件,数据可能无法正确显示。请用`vi`、`cat`等命令查看,或者使用Excel自文本导入csv查看。 +Ascend ms_backend模式下生成的统计数据文件名为`statistic.csv`,此文件存有相同目录下所有落盘张量(文件名为`{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`)的统计信息。每个张量一行,每行有张量的 Op Type、Op Name、Task ID、Stream ID、Timestamp、IO、Slot、Data Size、Data Type、Shape以及用户配置的统计信息项。注意,如果用Excel来打开此文件,数据可能无法正确显示。请用`vi`、`cat`等命令查看,或者使用Excel自文本导入csv查看。 -Ascend O0/O1模式下生成的最终执行图文件后缀名分别为`.pb`和`.ir`,文件命名格式为: +Ascend ms_backend模式下生成的最终执行图文件后缀名分别为`.pb`和`.ir`,文件命名格式为: ```text ms_output_trace_code_graph_{graph_id}.pb @@ -285,7 +285,7 @@ ms_output_trace_code_graph_{graph_id}.ir 其中以`.ir`为后缀的文件可以通过`vi`命令打开查看。 -Ascend O0/O1模式下Dump生成的节点执行序文件后缀名为`.csv`,文件命名格式为: +Ascend ms_backend模式下Dump生成的节点执行序文件后缀名为`.csv`,文件命名格式为: ```text ms_execution_order_graph_{graph_id}.csv @@ -431,9 
+431,9 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.float 生成numpy.array数据。 -## Ascend下O2模式Dump +## Ascend GE模式Dump -Ascend下O2模式Dump已迁移到msprobe工具,更多详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 +Ascend下GE模式Dump已迁移到msprobe工具,更多详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 采集方式请参考示例代码[《msprobe静态图场景采集》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#71-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF); @@ -786,7 +786,7 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy") - Dump仅支持bool、int、int8、in16、int32、int64、uint、uint8、uint16、uint32、uint64、float、float16、float32、float64、bfloat16、double、complex64、complex128类型数据的保存。 - complex64和complex128仅支持保存为npy文件,不支持保存为统计值信息。 - Print算子内部有一个输入参数为string类型,string类型不属于Dump支持的数据类型,所以在脚本中包含Print算子时,会有错误日志,这不会影响其他类型数据的保存。 -- 使能Ascend O2模式下Dump时,不支持同时使用set_context(ascend_config={"exception_dump": "2"})配置轻量异常dump; 支持同时使用set_context(ascend_config={"exception_dump": "1"})配置全量异常dump。 -- 使能Ascend O2模式下Dump时,sink size只能设置为1。用户通常可以使用[Model.train()](https://www.mindspore.cn/docs/zh-CN/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train)或[data_sink()](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.data_sink.html)接口配置sink size。 -- 使能Ascend O2模式下Dump时,**统计值dump**如果是大数据量dump场景(如网络本身规模庞大,连续dump多个step等),可能会导致host侧内存被占满,导致数据流同步失败,建议使用新版[**统计值dump**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF)替代。 +- 使能Ascend GE模式下Dump时,不支持同时使用set_context(ascend_config={"exception_dump": "2"})配置轻量异常dump; 支持同时使用set_context(ascend_config={"exception_dump": "1"})配置全量异常dump。 +- 使能Ascend GE模式下Dump时,sink size只能设置为1。用户通常可以使用[Model.train()](https://www.mindspore.cn/docs/zh-CN/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train)或[data_sink()](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.data_sink.html)接口配置sink size。 +- 使能Ascend GE模式下Dump时,**统计值dump**如果是大数据量dump场景(如网络本身规模庞大,连续dump多个step等),可能会导致host侧内存被占满,导致数据流同步失败,建议使用新版[**统计值dump**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF)替代。 - 默认情况下,Dump会忽略算子的无效输出,比如Send/Print算子的输出、FlashAttentionScore算子的第三个预留输出等。如果需要保留这些无效输出,可以将环境变量`MINDSPORE_DUMP_IGNORE_USELESS_OUTPUT`设置为`0`。详情请参阅[环境变量-Dump调试](https://www.mindspore.cn/docs/zh-CN/master/api_python/env_var_list.html#dump%E8%B0%83%E8%AF%95)。 -- Gitee From 42c9725ee0cacba6b5273502ad711df45ab74451 Mon Sep 17 00:00:00 2001 From: qqqhhhbbb Date: Mon, 21 Jul 2025 16:07:34 +0800 Subject: [PATCH 2/3] change O0/O1/O2 to ms_backend/GE --- tutorials/source_en/debug/dump.md | 9 +++------ tutorials/source_zh_cn/debug/dump.md | 11 ++++------- 2 files changed, 7 insertions(+), 13 deletions(-) diff --git a/tutorials/source_en/debug/dump.md b/tutorials/source_en/debug/dump.md index 161d79a65d..d3de032f10 100644 --- a/tutorials/source_en/debug/dump.md +++ b/tutorials/source_en/debug/dump.md @@ -18,11 +18,11 @@ For graphs in Ascend OO/O1 modes and CPU/GPU modes, these functionalities are st In different modes, the Dump features supported by MindSpore are not entirely the same, and the required configuration files 
and the generated data formats vary accordingly. Therefore, you need to select the corresponding Dump configuration based on the running mode: -- [Dump in Ascend ms_backend Mode](#dump-in-ascend-o0o1-mode) +- [Dump in Ascend ms_backend Mode](#dump-in-ascend-ms_backend-mode) - [Dump in Ascend GE Mode](#dump-in-ascend-GE-mode) - [Dump in CPU/GPU mode](#dump-in-cpugpu-mode) -> - The differences between Ascend O0, O1, and GE modes can be found in [the parameter jit_level of the set_context method](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.set_context.html). +> - The differences between Ascend ms_backend and GE modes can be found in [the parameter jit_level of the set_context method](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.jit.html#mindspore.jit). > > - Dumping constant data is only supported in CPU/GPU mode, while not supported in Ascend ms_backend/GE mode. > @@ -170,7 +170,7 @@ MindSpore supports different Dump functionalities under various modes, as shown - `trans_flag`: Enable trans flag. Transform the device data format into NCHW. If it is `true`, the data will be saved in the 4D format (NCHW) format on the Host side; if it is `false`, the data format on the Device side will be retained. Default: `true`. - `stat_calc_mode`: Select the backend for statistical calculations. Options are "host" and "device". Choosing "device" enables device computation of statistics, currently only effective on Ascend, and supports only min/max/avg/l2norm statistics. When `op_debug_mode` is set to 3, only `stat_calc_mode` set to "host" is supported. - `device_stat_precision_mode`(Optional): Precision mode of device statistics, and the value can be "high" or "low". When "high" is selected, avg/l2norm statistics will be calculated using float32, which will increase device memory usage and have higher precision; when "low" is selected, the same type as the original data will be used for calculation, which will occupy less device memory, but statistics overflow may be caused when processing large values. The default value is "high". - - `sample_mode`(Optional): Setting it to 0 means the sample dump function is not enabled. Enable the sample dump function in graph compilation with optimization level O0 or O1. This field is effective only when "op_debug_mode" is set to `0`, sample dump cannot be enabled in other scene. + - `sample_mode`(Optional): Setting it to 0 means the sample dump function is not enabled. Enable the sampling dump feature during graph compilation using the ms_backend backend. This field is effective only when "op_debug_mode" is set to `0`, sample dump cannot be enabled in other scene. - `sample_num`(Optional): Used to control the size of sample in sample dump. The default value is 100. - `save_kernel_args`(Optional): When set to true, the initialization information of kernels will be saved. This field is effective only when `enable` is set to `true`. @@ -206,7 +206,6 @@ MindSpore supports different Dump functionalities under various modes, as shown After the training is started, if the `MINDSPORE_DUMP_CONFIG` environment variable is correctly configured, the content of the configuration file will be read and the operator data will be saved according to the data storage path specified in the Dump configuration. If `model.train` or `DatasetHelper` is not called in the script, the default is non-data sinking mode. Using the Dump function will automatically generate the IR file of the final execution graph. 
- You can set `set_context(reserve_class_name_in_scope=False)` in your training script to avoid dump failure because of file name is too long. 4. Read and parse dump data through `numpy.load`, refer to [Introduction to Ascend ms_backend Dump Data File](#introduction-to-data-object-directory-and-data-file). @@ -549,7 +548,6 @@ For detailed configuration descriptions, please refer to the [Introduction to co If you want to dump data in GPU environment, you must use the non-data sink mode (set the `dataset_sink_mode` parameter in `model.train` or `DatasetHelper` to `False`) to ensure that you can get the dump data of each step. If `model.train` or `DatasetHelper` is not called in the script, the default is non-data sinking mode. Using the Dump function will automatically generate the IR file of the final execution graph. - You can set `set_context(reserve_class_name_in_scope=False)` in your training script to avoid dump failure because of file name is too long. 4. Read and parse dump data through `numpy.load`, refer to [Introduction to CPU/GPU Dump Data File](#introduction-to-data-object-directory-and-data-file-1). @@ -786,7 +784,6 @@ Generate the numpy.array data. - Dump only supports saving data with type of bool, int, int8, in16, int32, int64, uint, uint8, uint16, uint32, uint64, float, float16, float32, float64, bfloat16, double, complex64 and complex128. - Complex64 and complex128 only support saving as npy files, not as statistics information. - The Print operator has an input parameter with type of string, which is not a data type supported by Dump. Therefore, when the Print operator is included in the script, there will be an error log, which will not affect the saving data of other types. -- When Ascend GE dump is enabled, lite exception dump is not supported by using set_context(ascend_config={"exception_dump": "2"}), while full exception dump is supported by using set_context(ascend_config={"exception_dump": "1"}). - When Ascend GE dump is enabled, sink size can only be set to 1. User can use [Model.train()](https://www.mindspore.cn/docs/en/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train) or [data_sink()](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.data_sink.html) to set up sink size. - When Ascend GE dump is enabled, if **statistical value dumping** is performed in scenarios with a large amount of data (such as when the network itself is of a large scale or multiple steps are dumped consecutively), it may cause the host-side memory to become full, leading to a failure in data flow synchronization. It is recommended to replace it with the new version of [**statistical value dumping**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF). - By default, Dump ignores invalid operator outputs, such as the outputs of the Send/Print operator or the third reserved output of the FlashAttentionScore operator. If you need to retain these invalid outputs, you can set the environment variable `MINDSPORE_DUMP_IGNORE_USELESS_OUTPUT` to `0`. For details, please refer to [Environment Variables - Dump Debugging](https://www.mindspore.cn/docs/en/master/api_python/env_var_list.html#dump-debugging). 
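Both of the paths that remain on the framework entry after this patch (Ascend ms_backend and CPU/GPU) are driven by a JSON file referenced through the `MINDSPORE_DUMP_CONFIG` environment variable. As a rough illustration of that flow, and not as part of this patch, a launcher script could prepare the configuration as below; the `common_dump_settings`/`e2e_dump_settings` grouping and every sample value are assumptions to be checked against the full configuration reference that the page links to.

```python
import json
import os

# Sketch only: field names follow the descriptions in the patched page, while the
# grouping and the concrete values are assumptions, not a verified configuration.
dump_config = {
    "common_dump_settings": {
        "op_debug_mode": 0,                  # 0: ordinary dump
        "dump_mode": 0,                      # 0: all kernels; 1: only those listed in "kernels"
        "path": "/tmp/mindspore_dump",       # hypothetical absolute output directory
        "net_name": "ResNet50",              # hypothetical network name
        "iteration": "0|5-8",                # steps to dump
        "saved_data": "tensor",              # "tensor", "statistic" or "full"
        "input_output": 0,                   # 0: dump both inputs and outputs of each kernel
        "kernels": ["Default/Conv-op12"],    # hypothetical name, ignored while dump_mode is 0
        "support_device": [0, 1, 2, 3, 4, 5, 6, 7],
    },
    "e2e_dump_settings": {"enable": True, "trans_flag": True},
}

config_file = os.path.abspath("dump_config.json")
with open(config_file, "w") as f:
    json.dump(dump_config, f, indent=4)

# Set before the training script starts building graphs, e.g. in the launch shell
# or at the very top of the training entry point.
os.environ["MINDSPORE_DUMP_CONFIG"] = config_file
```

With `saved_data` switched to `"statistic"`, the same configuration exercises the `statistic.csv` output and the `statistic_category` option discussed above instead of saving full tensors.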
diff --git a/tutorials/source_zh_cn/debug/dump.md b/tutorials/source_zh_cn/debug/dump.md index 13934390d5..44aef5ef05 100644 --- a/tutorials/source_zh_cn/debug/dump.md +++ b/tutorials/source_zh_cn/debug/dump.md @@ -12,17 +12,17 @@ MindSpore Dump功能已陆续迁移到[msprobe工具](https://gitee.com/ascend/m 其中动态图、静态图Ascend GE模式Dump已完全迁移到msprobe工具,通过msprobe工具入口使能,详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 -静态图Ascend OO/O1和CPU/GPU模式仍然通过框架入口使能,后续会陆续迁移到msprobe工具。 +静态图Ascend ms_backend和CPU/GPU模式仍然通过框架入口使能,后续会陆续迁移到msprobe工具。 ## 配置指南 MindSpore在不同模式下支持的Dump功能不完全相同,需要的配置文件和以及生成的数据格式也不同,因此需要根据运行的模式选择对应的Dump配置: -- [Ascend下ms_backend模式Dump](#ascend下o0o1模式dump) +- [Ascend下ms_backend模式Dump](#ascend下ms_backend模式dump) - [Ascend下GE模式Dump](#ascend下GE模式dump) - [CPU/GPU模式Dump](#cpugpu模式dump) -> - Ascend下ms_backend/GE模式的区别请见[set_context的参数jit_level](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.set_context.html)。 +> - Ascend下ms_backend/GE模式的区别请见[set_context的参数jit_level](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.jit.html#mindspore.jit)。 > > - CPU/GPU模式支持dump常量数据,Ascend ms_backend/GE模式不支持Dump常量数据。 > @@ -170,7 +170,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: - `trans_flag`:开启格式转换,将设备上的数据格式转换成NCHW格式。若为`true`,则数据会以Host侧的4D格式(NCHW)格式保存;若为`false`,则保留Device侧的数据格式。该配置参数在CPU上无效,因为CPU上没有format转换。默认值:true。 - `stat_calc_mode`:选择统计信息计算后端,可选"host"和"device"。选择"device"后可以使能device计算统计信息,当前只在Ascend生效,只支持`min/max/avg/l2norm`统计量。在op_debug_mode设置为3时,仅支持将`stat_calc_mode`设置为"host"。 - `device_stat_precision_mode`(可选):device统计信息精度模式,可选"high"和"low"。选择"high"时,`avg/l2norm`统计量使用float32进行计算,会增加device内存占用,精度更高;为"low"时使用与原始数据相同的类型进行计算,device内存占用较少,但在处理较大数值时可能会导致统计量溢出。默认值为"high"。 - - `sample_mode`(可选):设置成0,表示不开启切片dump功能;设置成1时,在图编译等级为O0或O1的情况下开启切片dump功能。仅在op_debug_mode设置为0时生效,其他场景不会开启切片dump功能。 + - `sample_mode`(可选):设置成0,表示不开启切片dump功能;设置成1时,在图编译后端为ms_backend的情况下开启切片dump功能。仅在op_debug_mode设置为0时生效,其他场景不会开启切片dump功能。 - `sample_num`(可选):用于控制切片dump中切片的大小。默认值为100。 - `save_kernel_args`(可选): 设置成true时,会保存算子的初始化信息。仅当`enable`设置为`true`时生效。 @@ -206,7 +206,6 @@ MindSpore在不同模式下支持的Dump功能如下表所示: 训练启动后,若正确配置了`MINDSPORE_DUMP_CONFIG`环境变量,则会读取配置文件的内容,并按照Dump配置中指定的数据保存路径保存算子数据。 若脚本中都不调用`model.train`或`DatasetHelper`,则默认为非数据下沉模式。使用Dump功能将自动生成最终执行图的IR文件。 - 可以在训练脚本中设置`set_context(reserve_class_name_in_scope=False)`,避免Dump文件名称过长导致Dump数据文件生成失败。 4. 通过`numpy.load`读取和解析Dump数据,参考[Ascend ms_backend模式下Dump数据文件介绍](#数据对象目录和数据文件介绍)。 @@ -549,7 +548,6 @@ Ascend下GE模式Dump已迁移到msprobe工具,更多详情请查看[《msprob GPU环境如果要Dump数据,必须采用非数据下沉模式(设置`model.train`或`DatasetHelper`中的`dataset_sink_mode`参数为`False`),以保证可以获取每个step的Dump数据。 若脚本中都不调用`model.train`或`DatasetHelper`,则默认为非数据下沉模式。使用Dump功能将自动生成最终执行图的IR文件。 - 可以在训练脚本中设置`set_context(reserve_class_name_in_scope=False)`,避免Dump文件名称过长导致Dump数据文件生成失败。 4. 
通过`numpy.load`读取和解析CPU/GPU模式下Dump数据,参考[CPU/GPU模式下Dump数据文件介绍](#数据对象目录和数据文件介绍-1)。 @@ -786,7 +784,6 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy") - Dump仅支持bool、int、int8、in16、int32、int64、uint、uint8、uint16、uint32、uint64、float、float16、float32、float64、bfloat16、double、complex64、complex128类型数据的保存。 - complex64和complex128仅支持保存为npy文件,不支持保存为统计值信息。 - Print算子内部有一个输入参数为string类型,string类型不属于Dump支持的数据类型,所以在脚本中包含Print算子时,会有错误日志,这不会影响其他类型数据的保存。 -- 使能Ascend GE模式下Dump时,不支持同时使用set_context(ascend_config={"exception_dump": "2"})配置轻量异常dump; 支持同时使用set_context(ascend_config={"exception_dump": "1"})配置全量异常dump。 - 使能Ascend GE模式下Dump时,sink size只能设置为1。用户通常可以使用[Model.train()](https://www.mindspore.cn/docs/zh-CN/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train)或[data_sink()](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.data_sink.html)接口配置sink size。 - 使能Ascend GE模式下Dump时,**统计值dump**如果是大数据量dump场景(如网络本身规模庞大,连续dump多个step等),可能会导致host侧内存被占满,导致数据流同步失败,建议使用新版[**统计值dump**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF)替代。 - 默认情况下,Dump会忽略算子的无效输出,比如Send/Print算子的输出、FlashAttentionScore算子的第三个预留输出等。如果需要保留这些无效输出,可以将环境变量`MINDSPORE_DUMP_IGNORE_USELESS_OUTPUT`设置为`0`。详情请参阅[环境变量-Dump调试](https://www.mindspore.cn/docs/zh-CN/master/api_python/env_var_list.html#dump%E8%B0%83%E8%AF%95)。 -- Gitee From 10426684641b06763270217553a04757c67d8e14 Mon Sep 17 00:00:00 2001 From: qqqhhhbbb Date: Mon, 21 Jul 2025 15:27:10 +0800 Subject: [PATCH 3/3] debug docs modified change O0/O1/O2 to ms_backend/GE debug modified --- tutorials/source_en/debug/dump.md | 45 +++++++++++++--------------- tutorials/source_zh_cn/debug/dump.md | 43 +++++++++++++------------- 2 files changed, 41 insertions(+), 47 deletions(-) diff --git a/tutorials/source_en/debug/dump.md b/tutorials/source_en/debug/dump.md index 158abd0b8f..f998028ac5 100644 --- a/tutorials/source_en/debug/dump.md +++ b/tutorials/source_en/debug/dump.md @@ -10,7 +10,7 @@ The MindSpore Dump functionality has been gradually migrated to the [msprobe too > [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe) is a toolkit under the MindStudio Training Tools suite, specifically for accuracy debugging. It primarily includes functionalities such as accuracy pre-inspection, overflow detection, and accuracy comparison. Currently, it is compatible with the PyTorch and MindSpore frameworks. -The Dump features for dynamic graphs and static graphs in Ascend O2 mode have been fully migrated to the msprobe tool and are enabled through the msprobe tool entry point. For more details, please refer to the [msprobe Tool MindSpore Scenario Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). +The Dump features for dynamic graphs and static graphs in Ascend GE mode have been fully migrated to the msprobe tool and are enabled through the msprobe tool entry point. For more details, please refer to the [msprobe Tool MindSpore Scenario Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). For graphs in Ascend OO/O1 modes and CPU/GPU modes, these functionalities are still enabled through the framework entry points but will be gradually migrated to the msprobe tool in subsequent updates. 
@@ -18,13 +18,13 @@ For graphs in Ascend OO/O1 modes and CPU/GPU modes, these functionalities are st In different modes, the Dump features supported by MindSpore are not entirely the same, and the required configuration files and the generated data formats vary accordingly. Therefore, you need to select the corresponding Dump configuration based on the running mode: -- [Dump in Ascend O0/O1 Mode](#dump-in-ascend-o0o1-mode) -- [Dump in Ascend O2 Mode](#dump-in-ascend-o2-mode) +- [Dump in Ascend ms_backend Mode](#dump-in-ascend-ms_backend-mode) +- [Dump in Ascend GE Mode](#dump-in-ascend-ge-mode) - [Dump in CPU/GPU mode](#dump-in-cpugpu-mode) -> - The differences between Ascend O0, O1, and O2 modes can be found in [the parameter jit_level of the set_context method](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.set_context.html). +> - The differences between Ascend ms_backend and GE modes can be found in [the mindspore.jit interface](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.jit.html#mindspore.jit). > -> - Dumping constant data is only supported in CPU/GPU mode, while not supported in Ascend O0/O1/O2 mode. +> - Dumping constant data is only supported in CPU/GPU mode, while not supported in Ascend ms_backend/GE mode. > > - Currently, Dump does not support heterogeneous training, meaning it does not support CPU/Ascend mixed training or GPU/Ascend mixed training. @@ -33,7 +33,7 @@ MindSpore supports different Dump functionalities under various modes, as shown
功能Ascend O0/O1Ascend ms_backend CPU/GPU
- + @@ -100,7 +100,7 @@ MindSpore supports different Dump functionalities under various modes, as shown > In terms of statistics, the computing speed of the device is faster than that of the host(currently only supported on Ascend backend), but the host has more statistical indicators than the device. Refer to the `statistic_category` option for details. -## Dump in Ascend O0/O1 Mode +## Dump in Ascend ms_backend Mode ### Dump Step @@ -139,7 +139,7 @@ MindSpore supports different Dump functionalities under various modes, as shown - `input_output`: 0: dump input and output of kernel, 1:dump input of kernel, 2:dump output of kernel. When `op_debug_mode` is set to 3, `input_output` can only be set to save both the operator's inputs and outputs. Only input of kernel can be saved when "op_debug_mode" is set to `4`. - `kernels`: This item can be configured in three formats: 1. List of operator names. Turn on the IR save switch by setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 and execute the network to obtain the operator name from the generated `trace_code_graph_{graph_id}`IR file. For details, please refer to [Saving IR](https://www.mindspore.cn/tutorials/en/master/debug/error_analysis/mindir.html#saving-ir). - Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend O0/O1 Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). + Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend ms_backend Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). 2. You can also specify an operator type. When there is no operator scope information or operator id information in the string, the background considers it as an operator type, such as "conv". The matching rule of operator type is: when the operator name contains an operator type string, the matching is considered successful (case insensitive). For example, "conv" can match operators "Conv2D-op1234" and "Conv3D-op1221". 3. Regular expressions are supported. When the string conforms to the format of "name-regex(xxx)", it would be considered a regular expression. For example, "name-regex(Default/.+)" can match all operators with names starting with "Default/". - `support_device`: Supported devices, default setting is `[0,1,2,3,4,5,6,7]`. In distributed training scenarios where data on individual devices needs to be dumped, you can specify only the device Id that needs to be dumped in `support_device`. This configuration parameter is invalid on the CPU, because there is no concept of device on the CPU, but it is still need to reserve this parameter in the json file. @@ -170,7 +170,7 @@ MindSpore supports different Dump functionalities under various modes, as shown - `trans_flag`: Enable trans flag. Transform the device data format into NCHW. If it is `true`, the data will be saved in the 4D format (NCHW) format on the Host side; if it is `false`, the data format on the Device side will be retained. 
Default: `true`. - `stat_calc_mode`: Select the backend for statistical calculations. Options are "host" and "device". Choosing "device" enables device computation of statistics, currently only effective on Ascend, and supports only min/max/avg/l2norm statistics. When `op_debug_mode` is set to 3, only `stat_calc_mode` set to "host" is supported. - `device_stat_precision_mode`(Optional): Precision mode of device statistics, and the value can be "high" or "low". When "high" is selected, avg/l2norm statistics will be calculated using float32, which will increase device memory usage and have higher precision; when "low" is selected, the same type as the original data will be used for calculation, which will occupy less device memory, but statistics overflow may be caused when processing large values. The default value is "high". - - `sample_mode`(Optional): Setting it to 0 means the sample dump function is not enabled. Enable the sample dump function in graph compilation with optimization level O0 or O1. This field is effective only when "op_debug_mode" is set to `0`, sample dump cannot be enabled in other scene. + - `sample_mode`(Optional): Setting it to 0 means the sample dump function is not enabled. Enable the sampling dump feature during graph compilation using the ms_backend backend. This field is effective only when "op_debug_mode" is set to `0`, sample dump cannot be enabled in other scene. - `sample_num`(Optional): Used to control the size of sample in sample dump. The default value is 100. - `save_kernel_args`(Optional): When set to true, the initialization information of kernels will be saved. This field is effective only when `enable` is set to `true`. @@ -206,13 +206,12 @@ MindSpore supports different Dump functionalities under various modes, as shown After the training is started, if the `MINDSPORE_DUMP_CONFIG` environment variable is correctly configured, the content of the configuration file will be read and the operator data will be saved according to the data storage path specified in the Dump configuration. If `model.train` or `DatasetHelper` is not called in the script, the default is non-data sinking mode. Using the Dump function will automatically generate the IR file of the final execution graph. - You can set `set_context(reserve_class_name_in_scope=False)` in your training script to avoid dump failure because of file name is too long. -4. Read and parse dump data through `numpy.load`, refer to [Introduction to Ascend O0/O1 Dump Data File](#introduction-to-data-object-directory-and-data-file). +4. Read and parse dump data through `numpy.load`, refer to [Introduction to Ascend ms_backend Dump Data File](#introduction-to-data-object-directory-and-data-file). ### Introduction to Data Object Directory and Data File -After starting the training, the data objects saved under the Ascend O0/O1 Dump mode include the final execution graph (`ms_output_trace_code_graph_{graph_id}.ir` file) and the input and output data of the operators in the graph. The data directory structure is as follows: +After starting the training, the data objects saved under the Ascend ms_backend Dump mode include the final execution graph (`ms_output_trace_code_graph_{graph_id}.ir` file) and the input and output data of the operators in the graph. 
The data directory structure is as follows: ```text {path}/ @@ -266,7 +265,7 @@ Only when `save_kernel_args` is `true`, `{op_type}.{op_name}.json` is generated This JSON indicates that both initialization parameters `transpose_a` and `transpose_b` of the `Matmul` operator have the value `False`. -The data file generated by the Ascend O0/O1 Dump is a binary file with the suffix `.npy`, and the file naming format is: +The data file generated by the Ascend ms_backend Dump is a binary file with the suffix `.npy`, and the file naming format is: ```text {op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.{dtype}.npy @@ -274,9 +273,9 @@ The data file generated by the Ascend O0/O1 Dump is a binary file with the suffi User can use Numpy interface `numpy.load` to read the data. -The statistics file generated by the Ascend O0/O1 dump is named `statistic.csv`. This file stores key statistics for all tensors dumped under the same directory as itself (with the file names `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`). Each row in `statistic.csv` summarizes a single tensor, each row contains the statistics: Op Type, Op Name, Task ID, Stream ID, Timestamp, IO, Slot, Data Size, Data Type, Shape, and statistics items configured by the user. Note that opening this file with Excel may cause data to be displayed incorrectly. Please use commands like `vi` or `cat`, or use Excel to import csv from text for viewing. +The statistics file generated by the Ascend ms_backend dump is named `statistic.csv`. This file stores key statistics for all tensors dumped under the same directory as itself (with the file names `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`). Each row in `statistic.csv` summarizes a single tensor, each row contains the statistics: Op Type, Op Name, Task ID, Stream ID, Timestamp, IO, Slot, Data Size, Data Type, Shape, and statistics items configured by the user. Note that opening this file with Excel may cause data to be displayed incorrectly. Please use commands like `vi` or `cat`, or use Excel to import csv from text for viewing. -The suffixes of the final execution graph files generated by Ascend O0/O1 Dump are `.pb` and `.ir` respectively, and the file naming format is: +The suffixes of the final execution graph files generated by Ascend ms_backend Dump are `.pb` and `.ir` respectively, and the file naming format is: ```text ms_output_trace_code_graph_{graph_id}.pb @@ -285,7 +284,7 @@ ms_output_trace_code_graph_{graph_id}.ir The files with the suffix `.ir` can be opened and viewed by the `vi` command. -The suffix of the node execution sequence file generated by the Ascend O0/O1 Dump is `.csv`, and the file naming format is: +The suffix of the node execution sequence file generated by the Ascend ms_backend Dump is `.csv`, and the file naming format is: ```text ms_execution_order_graph_{graph_id}.csv @@ -293,7 +292,7 @@ ms_execution_order_graph_{graph_id}.csv ### Data Analysis Sample -In order to better demonstrate the process of using dump to save and analyze data, we provide a set of [complete sample script](https://gitee.com/mindspore/docs/tree/master/docs/sample_code/dump) , you only need to execute `bash dump_sync_dump.sh` for Ascend O0/O1 dump. 
+In order to better demonstrate the process of using dump to save and analyze data, we provide a set of [complete sample script](https://gitee.com/mindspore/docs/tree/master/docs/sample_code/dump) , you only need to execute `bash dump_sync_dump.sh` for Ascend ms_backend dump. After the graph corresponding to the script is saved to the disk through the Dump function, the final execution graph file `ms_output_trace_code_graph_{graph_id}.ir` will be generated. This file saves the stack information of each operator in the corresponding graph, and records the generation script corresponding to the operator. @@ -431,9 +430,9 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.float Generate the numpy.array data. -## Dump in Ascend O2 Mode +## Dump in Ascend GE Mode -O2 mode Dump under Ascend has been migrated to the msprobe tool. For more details, please see [msprobe Tool MindSpore Scene Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). +GE mode Dump under Ascend has been migrated to the msprobe tool. For more details, please see [msprobe Tool MindSpore Scene Accuracy Data Collection Guide](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md). For data collection methods, please refer to the example code in [Graph Scenario Data Collection with msprobe](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#71-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF); @@ -489,7 +488,7 @@ For detailed configuration descriptions, please refer to the [Introduction to co - `input_output`: 0: dump input and output of kernel, 1: dump input of kernel, 2: dump output of kernel. Only input of kernel can be saved when "op_debug_mode" is set to `4`. - `kernels`: This item can be configured in three formats: 1. List of operator names. Turn on the IR save switch by setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 and execute the network to obtain the operator name from the generated `trace_code_graph_{graph_id}`IR file. For details, please refer to [Saving IR](https://www.mindspore.cn/tutorials/en/master/debug/error_analysis/mindir.html#saving-ir). - Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend O0/O1 Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). + Note that whether setting the environment variable `MS_DEV_SAVE_GRAPHS` to 2 may cause the different IDs of the same operator, so when dump specified operators, keep this setting unchanged after obtaining the operator name. Or you can obtain the operator names from the file `ms_output_trace_code_graph_{graph_id}.ir` saved by Dump. Refer to [Ascend ms_backend Dump Data Object Directory](#introduction-to-data-object-directory-and-data-file). 2. You can also specify an operator type. When there is no operator scope information or operator id information in the string, the background considers it as an operator type, such as "conv". The matching rule of operator type is: when the operator name contains an operator type string, the matching is considered successful (case insensitive). 
For example, "conv" can match operators "Conv2D-op1234" and "Conv3D-op1221". 3. Regular expressions are supported. When the string conforms to the format of "name-regex(xxx)", it would be considered a regular expression. For example, "name-regex(Default/.+)" can match all operators with names starting with "Default/". - `support_device`: Supported devices, default setting is `[0,1,2,3,4,5,6,7]`. You can specify specific device ids to dump specific device data. This configuration parameter is invalid on the CPU, because there is no concept of device on the CPU, but it is still need to reserve this parameter in the json file. @@ -549,7 +548,6 @@ For detailed configuration descriptions, please refer to the [Introduction to co If you want to dump data in GPU environment, you must use the non-data sink mode (set the `dataset_sink_mode` parameter in `model.train` or `DatasetHelper` to `False`) to ensure that you can get the dump data of each step. If `model.train` or `DatasetHelper` is not called in the script, the default is non-data sinking mode. Using the Dump function will automatically generate the IR file of the final execution graph. - You can set `set_context(reserve_class_name_in_scope=False)` in your training script to avoid dump failure because of file name is too long. 4. Read and parse dump data through `numpy.load`, refer to [Introduction to CPU/GPU Dump Data File](#introduction-to-data-object-directory-and-data-file-1). @@ -786,7 +784,6 @@ Generate the numpy.array data. - Dump only supports saving data with type of bool, int, int8, in16, int32, int64, uint, uint8, uint16, uint32, uint64, float, float16, float32, float64, bfloat16, double, complex64 and complex128. - Complex64 and complex128 only support saving as npy files, not as statistics information. - The Print operator has an input parameter with type of string, which is not a data type supported by Dump. Therefore, when the Print operator is included in the script, there will be an error log, which will not affect the saving data of other types. -- When Ascend O2 dump is enabled, lite exception dump is not supported by using set_context(ascend_config={"exception_dump": "2"}), while full exception dump is supported by using set_context(ascend_config={"exception_dump": "1"}). -- When Ascend O2 dump is enabled, sink size can only be set to 1. User can use [Model.train()](https://www.mindspore.cn/docs/en/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train) or [data_sink()](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.data_sink.html) to set up sink size. -- When Ascend O2 dump is enabled, if **statistical value dumping** is performed in scenarios with a large amount of data (such as when the network itself is of a large scale or multiple steps are dumped consecutively), it may cause the host-side memory to become full, leading to a failure in data flow synchronization. It is recommended to replace it with the new version of [**statistical value dumping**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF). +- When Ascend GE dump is enabled, sink size can only be set to 1. User can use [Model.train()](https://www.mindspore.cn/docs/en/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train) or [data_sink()](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.data_sink.html) to set up sink size. 
+- When Ascend GE dump is enabled, if **statistical value dumping** is performed in scenarios with a large amount of data (such as when the network itself is of a large scale or multiple steps are dumped consecutively), it may cause the host-side memory to become full, leading to a failure in data flow synchronization. It is recommended to replace it with the new version of [**statistical value dumping**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF). - By default, Dump ignores invalid operator outputs, such as the outputs of the Send/Print operator or the third reserved output of the FlashAttentionScore operator. If you need to retain these invalid outputs, you can set the environment variable `MINDSPORE_DUMP_IGNORE_USELESS_OUTPUT` to `0`. For details, please refer to [Environment Variables - Dump Debugging](https://www.mindspore.cn/docs/en/master/api_python/env_var_list.html#dump-debugging). diff --git a/tutorials/source_zh_cn/debug/dump.md b/tutorials/source_zh_cn/debug/dump.md index 9daebda6ef..9ba0709c7e 100644 --- a/tutorials/source_zh_cn/debug/dump.md +++ b/tutorials/source_zh_cn/debug/dump.md @@ -10,21 +10,21 @@ MindSpore Dump功能已陆续迁移到[msprobe工具](https://gitee.com/ascend/m > [msprobe](https://gitee.com/ascend/mstt/tree/master/debug/accuracy_tools/msprobe) 是 MindStudio Training Tools 工具链下精度调试部分的工具包。主要包括精度预检、溢出检测和精度比对等功能,目前适配 PyTorch 和 MindSpore 框架。 -其中动态图、静态图Ascend O2模式Dump已完全迁移到msprobe工具,通过msprobe工具入口使能,详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 +其中动态图、静态图Ascend GE模式Dump已完全迁移到msprobe工具,通过msprobe工具入口使能,详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 -静态图Ascend OO/O1和CPU/GPU模式仍然通过框架入口使能,后续会陆续迁移到msprobe工具。 +静态图Ascend ms_backend和CPU/GPU模式仍然通过框架入口使能,后续会陆续迁移到msprobe工具。 ## 配置指南 MindSpore在不同模式下支持的Dump功能不完全相同,需要的配置文件和以及生成的数据格式也不同,因此需要根据运行的模式选择对应的Dump配置: -- [Ascend下O0/O1模式Dump](#ascend下o0o1模式dump) -- [Ascend下O2模式Dump](#ascend下o2模式dump) +- [Ascend下ms_backend模式Dump](#ascend下ms_backend模式dump) +- [Ascend下GE模式Dump](#ascend下ge模式dump) - [CPU/GPU模式Dump](#cpugpu模式dump) -> - Ascend下O0/O1/O2模式的区别请见[set_context的参数jit_level](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.set_context.html)。 +> - Ascend下ms_backend/GE模式的区别请见[jit接口](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.jit.html#mindspore.jit)。 > -> - CPU/GPU模式支持dump常量数据,Ascend O0/O1/O2模式不支持Dump常量数据。 +> - CPU/GPU模式支持dump常量数据,Ascend ms_backend/GE模式不支持Dump常量数据。 > > - Dump暂不支持异构训练,即不支持CPU/Ascend混合训练或GPU/Ascend混合训练。 @@ -33,7 +33,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: 
FeatureAscend O0/O1Ascend ms_backend CPU/GPU
- + @@ -100,7 +100,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: > 在统计信息方面,device计算速度较host快(目前仅支持Ascend后端),但host统计指标比device多,详见`statistic_category`选项。 -## Ascend下O0/O1模式Dump +## Ascend下ms_backend模式Dump ### 操作步骤 @@ -139,7 +139,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: - `input_output`:设置成0,表示Dump出算子的输入和算子的输出;设置成1,表示Dump出算子的输入;设置成2,表示Dump出算子的输出。在op_debug_mode设置为3时,只能设置`input_output`为同时保存算子输入和算子输出。在op_debug_mode设置为4时,只能保存算子输入。 - `kernels`:该项可以配置三种格式: 1. 算子的名称列表。通过设置环境变量`MS_DEV_SAVE_GRAPHS`的值为2开启IR保存开关并执行用例,从生成的IR文件`trace_code_graph_{graph_id}`中获取算子名称。详细说明可以参照教程:[如何保存IR](https://www.mindspore.cn/tutorials/zh-CN/master/debug/error_analysis/mindir.html#如何保存ir)。 - 需要注意的是,是否设置环境变量`MS_DEV_SAVE_GRAPHS`的值为2可能会导致同一个算子的id不同,所以在Dump指定算子时要在获取算子名称之后保持这一项设置不变。或者也可以在Dump保存的`ms_output_trace_code_graph_{graph_id}.ir`文件中获取算子名称,参考[Ascend O0/O1模式下Dump数据对象目录](#数据对象目录和数据文件介绍)。 + 需要注意的是,是否设置环境变量`MS_DEV_SAVE_GRAPHS`的值为2可能会导致同一个算子的id不同,所以在Dump指定算子时要在获取算子名称之后保持这一项设置不变。或者也可以在Dump保存的`ms_output_trace_code_graph_{graph_id}.ir`文件中获取算子名称,参考[Ascend ms_backend模式下Dump数据对象目录](#数据对象目录和数据文件介绍)。 2. 还可以指定算子类型。当字符串中不带算子scope信息和算子id信息时,后台则认为其为算子类型,例如:"conv"。算子类型的匹配规则为:当发现算子名中包含算子类型字符串时,则认为匹配成功(不区分大小写),例如:"conv" 可以匹配算子 "Conv2D-op1234"、"Conv3D-op1221"。 3. 算子名称的正则表达式。当字符串符合"name-regex(xxx)"格式时,后台则会将其作为正则表达式。例如,"name-regex(Default/.+)"可匹配算子名称以"Default/"开头的所有算子。 - `support_device`:支持的设备,默认设置成0到7即可;在分布式训练场景下,需要dump个别设备上的数据,可以只在`support_device`中指定需要Dump的设备Id。该配置参数在CPU上无效,因为CPU下没有device这个概念,但是在json格式的配置文件中仍需保留该字段。 @@ -170,7 +170,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: - `trans_flag`:开启格式转换,将设备上的数据格式转换成NCHW格式。若为`true`,则数据会以Host侧的4D格式(NCHW)格式保存;若为`false`,则保留Device侧的数据格式。该配置参数在CPU上无效,因为CPU上没有format转换。默认值:true。 - `stat_calc_mode`:选择统计信息计算后端,可选"host"和"device"。选择"device"后可以使能device计算统计信息,当前只在Ascend生效,只支持`min/max/avg/l2norm`统计量。在op_debug_mode设置为3时,仅支持将`stat_calc_mode`设置为"host"。 - `device_stat_precision_mode`(可选):device统计信息精度模式,可选"high"和"low"。选择"high"时,`avg/l2norm`统计量使用float32进行计算,会增加device内存占用,精度更高;为"low"时使用与原始数据相同的类型进行计算,device内存占用较少,但在处理较大数值时可能会导致统计量溢出。默认值为"high"。 - - `sample_mode`(可选):设置成0,表示不开启切片dump功能;设置成1时,在图编译等级为O0或O1的情况下开启切片dump功能。仅在op_debug_mode设置为0时生效,其他场景不会开启切片dump功能。 + - `sample_mode`(可选):设置成0,表示不开启切片dump功能;设置成1时,在图编译后端为ms_backend的情况下开启切片dump功能。仅在op_debug_mode设置为0时生效,其他场景不会开启切片dump功能。 - `sample_num`(可选):用于控制切片dump中切片的大小。默认值为100。 - `save_kernel_args`(可选): 设置成true时,会保存算子的初始化信息。仅当`enable`设置为`true`时生效。 @@ -206,13 +206,12 @@ MindSpore在不同模式下支持的Dump功能如下表所示: 训练启动后,若正确配置了`MINDSPORE_DUMP_CONFIG`环境变量,则会读取配置文件的内容,并按照Dump配置中指定的数据保存路径保存算子数据。 若脚本中都不调用`model.train`或`DatasetHelper`,则默认为非数据下沉模式。使用Dump功能将自动生成最终执行图的IR文件。 - 可以在训练脚本中设置`set_context(reserve_class_name_in_scope=False)`,避免Dump文件名称过长导致Dump数据文件生成失败。 -4. 通过`numpy.load`读取和解析Dump数据,参考[Ascend O0/O1模式下Dump数据文件介绍](#数据对象目录和数据文件介绍)。 +4. 
通过`numpy.load`读取和解析Dump数据,参考[Ascend ms_backend模式下Dump数据文件介绍](#数据对象目录和数据文件介绍)。 ### 数据对象目录和数据文件介绍 -启动训练后,Ascend O0/O1模式下Dump保存的数据对象包括最终执行图(`ms_output_trace_code_graph_{graph_id}.ir`文件)以及图中算子的输入和输出数据,数据目录结构如下所示: +启动训练后,Ascend ms_backend模式下Dump保存的数据对象包括最终执行图(`ms_output_trace_code_graph_{graph_id}.ir`文件)以及图中算子的输入和输出数据,数据目录结构如下所示: ```text {path}/ @@ -266,7 +265,7 @@ MindSpore在不同模式下支持的Dump功能如下表所示: 代表`Matmul`算子的两个初始化参数`transpose_a`和`transpose_b`的值均为`False`。 -Ascend O0/O1模式下Dump生成的数据文件是后缀名为`.npy`的文件,文件命名格式为: +Ascend ms_backend模式下Dump生成的数据文件是后缀名为`.npy`的文件,文件命名格式为: ```text {op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.{dtype}.npy @@ -274,9 +273,9 @@ 可以用Numpy的`numpy.load`接口读取数据。 -Ascend O0/O1模式下生成的统计数据文件名为`statistic.csv`,此文件存有相同目录下所有落盘张量(文件名为`{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`)的统计信息。每个张量一行,每行有张量的 Op Type、Op Name、Task ID、Stream ID、Timestamp、IO、Slot、Data Size、Data Type、Shape以及用户配置的统计信息项。注意,如果用Excel来打开此文件,数据可能无法正确显示。请用`vi`、`cat`等命令查看,或者使用Excel自文本导入csv查看。 +Ascend ms_backend模式下生成的统计数据文件名为`statistic.csv`,此文件存有相同目录下所有落盘张量(文件名为`{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.npy`)的统计信息。每个张量一行,每行有张量的 Op Type、Op Name、Task ID、Stream ID、Timestamp、IO、Slot、Data Size、Data Type、Shape以及用户配置的统计信息项。注意,如果用Excel来打开此文件,数据可能无法正确显示。请用`vi`、`cat`等命令查看,或者使用Excel自文本导入csv查看。 -Ascend O0/O1模式下生成的最终执行图文件后缀名分别为`.pb`和`.ir`,文件命名格式为: +Ascend ms_backend模式下生成的最终执行图文件后缀名分别为`.pb`和`.ir`,文件命名格式为: ```text ms_output_trace_code_graph_{graph_id}.pb ms_output_trace_code_graph_{graph_id}.ir ``` 其中以`.ir`为后缀的文件可以通过`vi`命令打开查看。 -Ascend O0/O1模式下Dump生成的节点执行序文件后缀名为`.csv`,文件命名格式为: +Ascend ms_backend模式下Dump生成的节点执行序文件后缀名为`.csv`,文件命名格式为: ```text ms_execution_order_graph_{graph_id}.csv @@ -431,9 +430,9 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.float 生成numpy.array数据。 -## Ascend下O2模式Dump +## Ascend下GE模式Dump -Ascend下O2模式Dump已迁移到msprobe工具,更多详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 +Ascend下GE模式Dump已迁移到msprobe工具,更多详情请查看[《msprobe 工具 MindSpore场景精度数据采集指南》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md)。 采集方式请参考示例代码[《msprobe静态图场景采集》](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#71-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF); @@ -549,7 +548,6 @@ Ascend下O2模式Dump已迁移到msprobe工具,更多详情请查看[《msprob GPU环境如果要Dump数据,必须采用非数据下沉模式(设置`model.train`或`DatasetHelper`中的`dataset_sink_mode`参数为`False`),以保证可以获取每个step的Dump数据。 若脚本中都不调用`model.train`或`DatasetHelper`,则默认为非数据下沉模式。使用Dump功能将自动生成最终执行图的IR文件。 - 可以在训练脚本中设置`set_context(reserve_class_name_in_scope=False)`,避免Dump文件名称过长导致Dump数据文件生成失败。 4. 
通过`numpy.load`读取和解析CPU/GPU模式下Dump数据,参考[CPU/GPU模式下Dump数据文件介绍](#数据对象目录和数据文件介绍-1)。 @@ -786,7 +784,6 @@ numpy.load("Conv2D.Conv2D-op12.0.0.1623124369613540.output.0.DefaultFormat.npy") - Dump仅支持bool、int、int8、in16、int32、int64、uint、uint8、uint16、uint32、uint64、float、float16、float32、float64、bfloat16、double、complex64、complex128类型数据的保存。 - complex64和complex128仅支持保存为npy文件,不支持保存为统计值信息。 - Print算子内部有一个输入参数为string类型,string类型不属于Dump支持的数据类型,所以在脚本中包含Print算子时,会有错误日志,这不会影响其他类型数据的保存。 -- 使能Ascend O2模式下Dump时,不支持同时使用set_context(ascend_config={"exception_dump": "2"})配置轻量异常dump; 支持同时使用set_context(ascend_config={"exception_dump": "1"})配置全量异常dump。 -- 使能Ascend O2模式下Dump时,sink size只能设置为1。用户通常可以使用[Model.train()](https://www.mindspore.cn/docs/zh-CN/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train)或[data_sink()](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.data_sink.html)接口配置sink size。 -- 使能Ascend O2模式下Dump时,**统计值dump**如果是大数据量dump场景(如网络本身规模庞大,连续dump多个step等),可能会导致host侧内存被占满,导致数据流同步失败,建议使用新版[**统计值dump**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF)替代。 +- 使能Ascend GE模式下Dump时,sink size只能设置为1。用户通常可以使用[Model.train()](https://www.mindspore.cn/docs/zh-CN/master/api_python/train/mindspore.train.Model.html#mindspore.train.Model.train)或[data_sink()](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.data_sink.html)接口配置sink size。 +- 使能Ascend GE模式下Dump时,**统计值dump**如果是大数据量dump场景(如网络本身规模庞大,连续dump多个step等),可能会导致host侧内存被占满,导致数据流同步失败,建议使用新版[**统计值dump**](https://gitee.com/ascend/mstt/blob/master/debug/accuracy_tools/msprobe/docs/06.data_dump_MindSpore.md#51-%E9%9D%99%E6%80%81%E5%9B%BE%E5%9C%BA%E6%99%AF)替代。 - 默认情况下,Dump会忽略算子的无效输出,比如Send/Print算子的输出、FlashAttentionScore算子的第三个预留输出等。如果需要保留这些无效输出,可以将环境变量`MINDSPORE_DUMP_IGNORE_USELESS_OUTPUT`设置为`0`。详情请参阅[环境变量-Dump调试](https://www.mindspore.cn/docs/zh-CN/master/api_python/env_var_list.html#dump%E8%B0%83%E8%AF%95)。 -- Gitee
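The configuration fields referenced throughout the patch above (`op_debug_mode`, `input_output`, `kernels`, `support_device`, `sample_mode`, `sample_num`, `save_kernel_args`, `stat_calc_mode`) are set in a JSON file that the framework reads through the `MINDSPORE_DUMP_CONFIG` environment variable. The snippet below is a minimal sketch of preparing such a file from Python; the top-level grouping (`common_dump_settings`/`e2e_dump_settings`), the example path and network name, and the chosen field values are illustrative assumptions, and the authoritative schema is the configuration file introduction linked from the tutorial.

```python
import json
import os

# Illustrative Ascend ms_backend dump configuration. Only the fields
# discussed in the tutorial are shown; the exact schema should be taken
# from the configuration file introduction it links to.
dump_config = {
    "common_dump_settings": {
        "op_debug_mode": 0,                     # assumed: 0 dumps the configured iterations
        "path": "/tmp/dump_output",             # hypothetical absolute output path
        "net_name": "Net",                      # hypothetical network name
        "input_output": 0,                      # 0: dump both inputs and outputs of kernels
        "kernels": ["name-regex(Default/.+)"],  # regex form described in the patch
        "support_device": [0, 1, 2, 3, 4, 5, 6, 7],
    },
    "e2e_dump_settings": {
        "enable": True,
        "trans_flag": True,                     # convert device data to host NCHW format
    },
}

with open("/tmp/dump_config.json", "w") as f:
    json.dump(dump_config, f, indent=4)

# Point the framework at the configuration file before the training
# process starts, as the tutorial requires.
os.environ["MINDSPORE_DUMP_CONFIG"] = "/tmp/dump_config.json"
```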
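The patch keeps the `.npy` naming scheme `{op_type}.{op_name}.{task_id}.{stream_id}.{timestamp}.{input_output_index}.{slot}.{format}.{dtype}.npy` and the `numpy.load` workflow for reading dumped tensors. The following sketch, which assumes the files follow exactly that pattern (the dump directory and the helper name are hypothetical), splits a file name back into its documented fields and loads the tensor:

```python
import glob
import os
import numpy as np

FIELDS = (
    "op_type", "op_name", "task_id", "stream_id", "timestamp",
    "input_output_index", "slot", "format", "dtype",
)

def parse_dump_file_name(file_name):
    """Split a dumped .npy file name into the fields documented in the tutorial."""
    parts = file_name[:-len(".npy")].split(".")
    # Defensively allow the op_name segment to contain dots by anchoring the
    # seven fixed tail fields and joining the remainder back into op_name.
    head, tail = parts[:1], parts[-7:]
    op_name = ".".join(parts[1:-7])
    return dict(zip(FIELDS, head + [op_name] + tail))

# Hypothetical dump directory produced with the configuration above.
for path in glob.glob("/tmp/dump_output/**/*.npy", recursive=True):
    info = parse_dump_file_name(os.path.basename(path))
    tensor = np.load(path)
    print(info["op_name"], info["input_output_index"], info["slot"], tensor.shape)
```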
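For the `statistic.csv` file described above (one row per dumped tensor, with columns such as Op Type, Op Name, Task ID, Stream ID, Timestamp, IO, Slot, Data Size, Data Type and Shape, plus the user-configured statistics), a small standard-library sketch can be used instead of opening the file directly in Excel. The statistic column name `Max Value` used here is an assumption that depends on the configured `statistic_category`:

```python
import csv

# Hypothetical path to a statistic.csv produced by a statistics dump.
STATISTIC_CSV = "/tmp/dump_output/statistic.csv"

with open(STATISTIC_CSV, newline="") as f:
    for row in csv.DictReader(f):
        # Fixed columns documented in the tutorial; statistic columns
        # (e.g. "Max Value") depend on the configured statistic_category.
        op_name = row["Op Name"]
        io = row["IO"]
        slot = row["Slot"]
        max_value = row.get("Max Value", "")
        print(f"{op_name} {io}[{slot}] max={max_value}")
```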
功能Ascend O0/O1Ascend ms_backend CPU/GPU