From d01d785af0e929385eb657d650c45492ed3acfa9 Mon Sep 17 00:00:00 2001
From: liyu319
Date: Fri, 11 Jul 2025 06:38:36 +0000
Subject: [PATCH 1/2] Add quantization recommendation note
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: liyu319
---
 MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md b/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md
index 2d163004b0..9561f7f2d6 100644
--- a/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md
+++ b/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md
@@ -84,6 +84,7 @@ docker exec -it ${container_name} bash
 * Use the following commands to generate the W8A8 quantized weights
     * Note: this quantization mode is supported only on Atlas 800I A2 servers
+    * Note: the official recommendation is that, when using W8A8 quantization, quantizing float16 data yields better performance.
 ```shell
 # Set the environment variables for the CANN package
--
Gitee

From aefebd3a46bfe1f8d7703b6278f75e994d5875a0 Mon Sep 17 00:00:00 2001
From: liyu319
Date: Fri, 11 Jul 2025 06:54:12 +0000
Subject: [PATCH 2/2] Update note
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: liyu319
---
 MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md b/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md
index 9561f7f2d6..597c2367a6 100644
--- a/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md
+++ b/MindIE/LLM/DeepSeek/DeepSeek-R1-Distill-Llama-70B/README.md
@@ -84,7 +84,7 @@ docker exec -it ${container_name} bash
 * Use the following commands to generate the W8A8 quantized weights
     * Note: this quantization mode is supported only on Atlas 800I A2 servers
-    * Note: the official recommendation is that, when using W8A8 quantization, quantizing float16 data yields better performance.
+    * Note: when using W8A8 quantization, quantizing float16 data is recommended because it yields better performance. After weighing accuracy against performance, consider changing torch.dtype in the model's config.json to float16.
 ```shell
 # Set the environment variables for the CANN package
--
Gitee
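
The note added in PATCH 2/2 suggests switching the dtype entry in the model's config.json to float16 before W8A8 quantization. The edit could be sketched as below. This is a minimal illustration, not part of the patch: it assumes a Hugging Face-style config.json in which the key is spelled `torch_dtype` (the patch text writes `torch.dtype`), and the function name `set_dtype_float16` is hypothetical.

```python
import json
from pathlib import Path


def set_dtype_float16(config_path, key="torch_dtype"):
    """Set the dtype entry of a model config.json to "float16".

    `key` defaults to "torch_dtype", which is an assumption about the
    config layout; pass a different key if the file uses another name.
    Returns the previous value, or None if the key was absent.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    previous = config.get(key)
    config[key] = "float16"
    # Write the file back, leaving every other field unchanged.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return previous
```

Calling `set_dtype_float16("/path/to/weights/config.json")` (a placeholder path) returns the old value, so it can be restored if the accuracy trade-off turns out to be unacceptable.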