From 579314796711dac8d3471bcb57a70d30d3482291 Mon Sep 17 00:00:00 2001
From: kongziyi <1045916357@qq.com>
Date: Sat, 21 Jun 2025 15:58:43 +0800
Subject: [PATCH] 【master】【docs】Add explanation of performance fluctuations due to dualpipe and zerc features
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 examples/mindspore/deepseek3/README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/mindspore/deepseek3/README.md b/examples/mindspore/deepseek3/README.md
index f3e551f20..946e73c54 100644
--- a/examples/mindspore/deepseek3/README.md
+++ b/examples/mindspore/deepseek3/README.md
@@ -93,3 +93,4 @@ sh examples/mindspore/deepseek3/pretrain_deepseek3_671B_4k_ms.sh
 
 - For multi-node training, launch the pretraining script on multiple terminals at the same time.
 - If multi-node training is used and shared data storage is not configured, add the --no-shared-storage parameter to the training launch script. With this parameter set, the distributed parameters are used to decide whether non-master nodes need to load data, check the corresponding cache, and generate data.
+- In small-model scenarios, enabling dualpipe and MoE zero-redundancy communication (--moe-zerc) at the same time causes large performance fluctuations with little benefit, so enabling them together is not recommended.
-- 
Gitee
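
As a rough illustration of the notes in this hunk, the sketch below shows how the mentioned flags could appear on the launch command line. The script path comes from the hunk header; whether pretrain_deepseek3_671B_4k_ms.sh forwards extra command-line arguments to the training entry point is an assumption, and the mechanism for enabling dualpipe is not shown because it is not specified in the patch.

    # Hypothetical multi-node launch without shared storage (run on each node).
    # Assumption: the launch script forwards extra arguments to the trainer.
    sh examples/mindspore/deepseek3/pretrain_deepseek3_671B_4k_ms.sh --no-shared-storage

    # Per the added note: in small-model runs, avoid also passing --moe-zerc when
    # dualpipe is enabled, since the combination fluctuates in performance with
    # little benefit.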