From 856e8a37f9b305dd97ca188e58681e8dfc17725c Mon Sep 17 00:00:00 2001
From: louyujing
Date: Fri, 13 Oct 2023 03:48:34 +0000
Subject: [PATCH 1/4] update debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md.

Signed-off-by: louyujing
---
 debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
index e571e64d15..dbc7209940 100644
--- a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
+++ b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
@@ -103,3 +103,7 @@ compare(dump_result_param, "./output", stack_mode=True)
 
 - matmul expects two-dimensional input. When the input is not two-dimensional, it is flattened to two dimensions with a view operation before the matmul is computed. During backpropagation, backward_hook can therefore only obtain the gradient information of the data in the UnsafeViewBackward step, not that of the data in the MmBackward step, i.e. the backward gradient data of the weight.
 
 - A typical example: when the input to linear is not two-dimensional and there is no bias, output = input.matmul(weight.t()) is called, so the backward gradient data of the linear layer's weight cannot be obtained.
+
+### 13. The dtype of some APIs in the pkl file is float16, but reading the npy file for the same API shows dtype float32
+
+- When dumping data, the ptdbg tool has to move the original data to the CPU and then convert it to a numpy type. In apex/amp mixed-precision scenarios, float16 data may be converted to float32 when it is moved to the CPU. This is expected behavior; the dtype recorded in the pkl file is authoritative for the final dump data
-- 
Gitee

From 16749f1e2524bdd4b314dae414b395d232cd6b08 Mon Sep 17 00:00:00 2001
From: louyujing
Date: Fri, 13 Oct 2023 07:46:20 +0000
Subject: [PATCH 2/4] update debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md.

Signed-off-by: louyujing
---
 debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
index dbc7209940..eb18aab9cb 100644
--- a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
+++ b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
@@ -106,4 +106,4 @@ compare(dump_result_param, "./output", stack_mode=True)
 
 ### 13. The dtype of some APIs in the pkl file is float16, but reading the npy file for the same API shows dtype float32
 
-- When dumping data, the ptdbg tool has to move the original data to the CPU and then convert it to a numpy type. In apex/amp mixed-precision scenarios, float16 data may be converted to float32 when it is moved to the CPU. This is expected behavior; the dtype recorded in the pkl file is authoritative for the final dump data
+- When dumping data, the ptdbg tool has to move the original data from the NPU to the CPU and then convert it to a numpy type. The NPU-to-CPU logic is consistent with the GPU-to-CPU logic, and in both cases the dtype may change from float16 to float32. If the dtypes are inconsistent, the dtype recorded in the pkl file is authoritative for the final dump data.
-- 
Gitee

From b386dc97d6dc2502374e4101e35989e30dc50ef9 Mon Sep 17 00:00:00 2001
From: louyujing
Date: Fri, 13 Oct 2023 07:48:01 +0000
Subject: [PATCH 3/4] update debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md.

Signed-off-by: louyujing
---
 debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
index eb18aab9cb..7d74ac5d06 100644
--- a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
+++ b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
@@ -106,4 +106,4 @@ compare(dump_result_param, "./output", stack_mode=True)
 
 ### 13. The dtype of some APIs in the pkl file is float16, but reading the npy file for the same API shows dtype float32
 
-- When dumping data, the ptdbg tool has to move the original data from the NPU to the CPU and then convert it to a numpy type. The NPU-to-CPU logic is consistent with the GPU-to-CPU logic, and in both cases the dtype may change from float16 to float32. If the dtypes are inconsistent, the dtype recorded in the pkl file is authoritative for the final dump data.
+- When dumping data, the ptdbg tool has to move the original data from the NPU to the CPU and then convert it to a numpy type. The NPU-to-CPU logic is consistent with the GPU-to-CPU logic, and in both cases the dtype may change from float16 to float32. If a dtype inconsistency issue occurs, the dtype recorded in the pkl file is authoritative for the final dump data.
-- 
Gitee

From 8dac4f0cdcf7017d424a2a7e3d16669ba092ccc1 Mon Sep 17 00:00:00 2001
From: louyujing
Date: Tue, 17 Oct 2023 02:41:13 +0000
Subject: [PATCH 4/4] update debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md.

Signed-off-by: louyujing
---
 debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
index 7d74ac5d06..d77c81e699 100644
--- a/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
+++ b/debug/accuracy_tools/ptdbg_ascend/doc/FAQ.md
@@ -106,4 +106,4 @@ compare(dump_result_param, "./output", stack_mode=True)
 
 ### 13. The dtype of some APIs in the pkl file is float16, but reading the npy file for the same API shows dtype float32
 
-- When dumping data, the ptdbg tool has to move the original data from the NPU to the CPU and then convert it to a numpy type. The NPU-to-CPU logic is consistent with the GPU-to-CPU logic, and in both cases the dtype may change from float16 to float32. If a dtype inconsistency issue occurs, the dtype recorded in the pkl file is authoritative for the final dump data.
+- When dumping data, the ptdbg tool has to move the original data from the NPU to the CPU and then convert it to a numpy type. The NPU-to-CPU logic is consistent with the GPU-to-CPU logic, and in both cases the dtype may change from float16 to float32. If an issue of dtype inconsistency occurs, the dtype recorded in the pkl file is authoritative for the final dump data.
-- 
Gitee
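
Note on FAQ item 13: the rule that the pkl dtype is authoritative can be applied when reading a dump back. The following is only a minimal sketch and not part of ptdbg. The helper `load_with_recorded_dtype` is hypothetical, and the recorded dtype is passed in as a plain string because the layout of the pkl file is not shown in this patch. The float32 array written below merely simulates a dump that was widened on its way to the CPU.

```python
import io

import numpy as np


def load_with_recorded_dtype(npy_source, recorded_dtype):
    """Load an npy dump and cast it to the dtype recorded for it.

    Hypothetical helper: `recorded_dtype` stands in for the value that
    would be read from the pkl file (for example "float16").
    """
    array = np.load(npy_source)
    target = np.dtype(recorded_dtype)
    if array.dtype != target:
        # The npy file was widened during the device-to-CPU copy;
        # fall back to the dtype that was recorded at dump time.
        array = array.astype(target)
    return array


# Simulate a float16 tensor whose npy dump ended up as float32.
original = np.array([0.5, 1.25, -3.0], dtype=np.float16)
buffer = io.BytesIO()
np.save(buffer, original.astype(np.float32))
buffer.seek(0)

restored = load_with_recorded_dtype(buffer, "float16")
print(restored.dtype)
```

Casting back to float16 loses nothing here because the values started out as float16, so widening them to float32 did not add any precision.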