From 2cb020abdf5d12488ad4506e1d940b22726a15ca Mon Sep 17 00:00:00 2001
From: zwx5317131
Date: Fri, 11 Jun 2021 16:22:47 +0800
Subject: [PATCH] Add FAQ21 and FAQ22
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 ...345\236\213\344\274\227\346\231\272FAQ.md" | 30 +++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git "a/AscendPytorch\346\250\241\345\236\213\344\274\227\346\231\272FAQ.md" "b/AscendPytorch\346\250\241\345\236\213\344\274\227\346\231\272FAQ.md"
index d13444e..fc21ef8 100644
--- "a/AscendPytorch\346\250\241\345\236\213\344\274\227\346\231\272FAQ.md"
+++ "b/AscendPytorch\346\250\241\345\236\213\344\274\227\346\231\272FAQ.md"
@@ -389,6 +389,36 @@ StopIteration
 model.load_state_dict(state_dict)
 ```
 
+### FAQ21. A fill operator error is reported during model training: RuntimeError: Run:/usr1/workspace/PyTorch_Apex_Daily_c20tr5/CODE/aten/src/ATen/native/npu/utils/OpParamMaker.h:280 NPU error,NPU error code is:500002
+
+* Symptom
+
+![](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/raw/master/figures/model_faq21_0529.PNG)
+
+* Cause analysis
+
+    The fill operator in the script receives an int64 input. According to /usr/local/Ascend/ascend-toolkit/latest/opp/op_impl/built-in/ai_core/tbe/config/ascend910/aic-ascend910-ops-info.json (which can be opened with vim, for example), the Fills operator supports only float16, float32, and int32 as input types.
+
+* Solution
+
+    1) Change the input type of the fill operator to int32.
+
+
+### FAQ22. The scatter operator fails when run on the CPU: RuntimeError: index 4558486308284583594 is out of bounds for dimension 1 with size 4233.
+
+* Symptom
+
+![](https://gitee.com/wangjiangben_hw/ascend-pytorch-crowdintelligence-doc/raw/master/figures/model_faq22_0604.PNG)
+
+* Cause analysis
+
+    The index parameter of the scatter operator supports only the long type. The PyTorch documentation describes it as follows:
+    index (LongTensor) – the indices of elements to scatter, can be either empty or the same size of src. When empty, the operation returns identity
+
+* Solution
+
+    Change the type of b in the code to long.
+
 ## [2.2 FAQ for Distributed Running of NPU Models](#22-NPU模型分布式运行常见问题FAQ)
 
 ### FAQ1. Distributed model training fails with the error: host not found.
-- 
Gitee
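
Note on FAQ21 in the patch above: the dtype check against the ops-info file can be scripted instead of read by eye. The following is a minimal sketch and is not part of the patch. It assumes the JSON maps each operator name to entries such as `input0` whose `dtype` field is a comma-separated string. That layout, the inline sample data, and the helper names `supported_input_dtypes` and `is_supported` are illustrative assumptions and should be checked against the real file.

```python
import json

# Hypothetical excerpt that mirrors the ASSUMED layout of
# aic-ascend910-ops-info.json; the dtype list is the one quoted in FAQ21.
SAMPLE_OPS_INFO = json.loads("""
{
  "Fills": {
    "input0": {"name": "x", "dtype": "float16,float32,int32"},
    "output0": {"name": "y", "dtype": "float16,float32,int32"}
  }
}
""")


def supported_input_dtypes(ops_info, op_name, input_key="input0"):
    """Return the set of dtypes listed for one input of an operator.

    An unknown operator or input yields an empty set.
    """
    entry = ops_info.get(op_name, {}).get(input_key, {})
    return {d.strip() for d in entry.get("dtype", "").split(",") if d.strip()}


def is_supported(ops_info, op_name, dtype, input_key="input0"):
    """Check whether `dtype` appears in the operator's supported input dtypes."""
    return dtype in supported_input_dtypes(ops_info, op_name, input_key)


if __name__ == "__main__":
    print(sorted(supported_input_dtypes(SAMPLE_OPS_INFO, "Fills")))
    # int64 is absent from the list, which matches the failure in FAQ21.
    print(is_supported(SAMPLE_OPS_INFO, "Fills", "int64"))
```

To run the check against an installed toolkit, the sample dictionary could be replaced by the result of `json.load` on the path given in FAQ21, provided the file follows the assumed layout.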