From 60fc9cdc360d94b742847ea91dc8e96a03145ae1 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E7=8E=8B=E6=8C=AF=E9=82=A6?=
Date: Mon, 23 Jun 2025 10:10:17 +0800
Subject: [PATCH] add layout extra docs

---
 .../source_en/features/parallel/operator_parallel.md    | 5 +++++
 .../source_zh_cn/features/parallel/operator_parallel.md | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/docs/mindspore/source_en/features/parallel/operator_parallel.md b/docs/mindspore/source_en/features/parallel/operator_parallel.md
index 789c221db6..b01560c0f9 100644
--- a/docs/mindspore/source_en/features/parallel/operator_parallel.md
+++ b/docs/mindspore/source_en/features/parallel/operator_parallel.md
@@ -97,6 +97,11 @@ In order to express sharding as in the above scenario, functional extensions are
 
 The parameters in_strategy and out_strategy both additionally receive the new quantity type tuple(Layout) type. [Layout](https://www.mindspore.cn/docs/en/master/api_python/parallel/mindspore.parallel.Layout.html) is initialized using the device matrix, while requiring an alias for each axis of the device matrix. For example: "layout = Layout((8, 4, 4), name = ("dp", "sp", "mp"))" means that the device has 128 cards in total, which are arranged in the shape of (8, 4, 4), and aliases "dp", "sp", "mp" are given to each axis.
 
+For the specific meaning of Layout and the configuration derivation method, please refer to the following two technical documents:
+
+- [Deriving Tensor Sharding on Each Card Based on MindSpore Layout (List Method)](https://discuss.mindspore.cn/t/topic/124)
+- [Deriving Tensor Sharding on Each Card Based on MindSpore Layout (Graphical Method)](https://discuss.mindspore.cn/t/topic/125)
+
 By passing in the aliases for these axes when calling Layout, each tensor determines which axis of the device matrix each dimension is mapped to based on its shape (shape), and the corresponding number of slice shares. For example:
 
 - "dp" denotes 8 cuts within 8 devices in the highest dimension of the device layout.
diff --git a/docs/mindspore/source_zh_cn/features/parallel/operator_parallel.md b/docs/mindspore/source_zh_cn/features/parallel/operator_parallel.md
index cb42e4dd5a..9527da6945 100644
--- a/docs/mindspore/source_zh_cn/features/parallel/operator_parallel.md
+++ b/docs/mindspore/source_zh_cn/features/parallel/operator_parallel.md
@@ -107,6 +107,11 @@ paralell_net = AutoParallel(net, parallel_mode='semi_auto')
 
 入参in_strategy和out_strategy都额外接收新的数量类型——tuple(Layout)。其中[Layout](https://www.mindspore.cn/docs/zh-CN/master/api_python/parallel/mindspore.parallel.Layout.html) 通过设备矩阵进行初始化,并同时要求给设备矩阵的每个轴取一个别名。例如:"layout = Layout((8, 4, 4), name = ("dp", "sp", "mp"))"表示该设备共有128张卡,按照(8, 4, 4)的形状进行排列,并为每个轴分别取了别名"dp"、"sp"、"mp"。
 
+关于Layout的具体含义与配置推导方法,可参考如下两篇技术文档:
+
+- [基于MindSpore Layout推导各卡上的Tensor分片(列表法)](https://discuss.mindspore.cn/t/topic/124)
+- [基于MindSpore Layout推导各卡上的Tensor分片(图解法)](https://discuss.mindspore.cn/t/topic/125)
+
 在调用Layout时,通过传入这些轴的别名,每个张量根据其形状(shape)决定每个维度映射到设备矩阵的哪个轴,以及对应的切分份数。例如:
 
 - "dp"表示在设备排布的最高维度的8个设备内切分为8份;
-- 
Gitee
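
For reviewers, a minimal sketch (not part of the patch itself) of how the Layout described in the added documentation could be applied to an operator; the MatMul operator, the tensor dimension-to-alias mapping, and the variable names are illustrative assumptions, while the `(8, 4, 4)` device matrix and the `"dp"`/`"sp"`/`"mp"` aliases follow the example in the patched paragraph:

```python
from mindspore import ops
from mindspore.parallel import Layout

# Device matrix of 8 x 4 x 4 = 128 cards with axis aliases "dp", "sp", "mp",
# mirroring the example in the patched paragraph.
layout = Layout((8, 4, 4), ("dp", "sp", "mp"))

# Map each tensor dimension to a device-matrix axis by alias (assumed mapping):
# the first MatMul input is cut 8 ways on dim 0 ("dp") and 4 ways on dim 1 ("mp");
# the second input is cut 4 ways on dim 0 ("mp") and 4 ways on dim 1 ("sp"),
# so the contracting dimension is sharded consistently on both inputs.
in_strategy = (layout("dp", "mp"), layout("mp", "sp"))

matmul = ops.MatMul().shard(in_strategy=in_strategy)
```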