diff --git a/docs/mindspore/source_en/model_train/parallel/advanced_operator_parallel.md b/docs/mindspore/source_en/model_train/parallel/advanced_operator_parallel.md
index 91fde5399cfbff7c679eaad3ed326edf9e40781a..f7835a21ba5756ef75fecbb2c56e81189884cefa 100644
--- a/docs/mindspore/source_en/model_train/parallel/advanced_operator_parallel.md
+++ b/docs/mindspore/source_en/model_train/parallel/advanced_operator_parallel.md
@@ -44,6 +44,18 @@ a_strategy = layout("mp", ("sp", "dp"))
 Notice that the "[a0, a1, a2, a3]" of the tensor a is sliced twice to the "sp" and "mp" axes of the device, so that the result comes out as:
 
 ![image](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/docs/mindspore/source_zh_cn/model_train/parallel/images/advanced_operator_parallel_view1.PNG)
+The coordinate mapping behind the figure above is `a[x][y] => Rank[mp]([sp][dp])`: the coordinate x of $a$ maps to the mp axis, while y = 0, 1, 2, 3 fills the dp axis first and carries over to the next axis (sp) once dp reaches its size, like the digits of a mixed-radix number. The full correspondence is shown in the following table:
+
+| Arr | x->mp | y->(sp,dp) | => | dp | sp | mp | Rank |
+| --- | ----- | ---------- | --- | --- | --- | --- | ---- |
+| a0 | 0 | 0 | | 0 | 0 | 0 | r0 |
+| a1 | 0 | 1 | | 1 | 0 | 0 | r4 |
+| a2 | 0 | 2 | | 0 | 1 | 0 | r2 |
+| a3 | 0 | 3 | | 1 | 1 | 0 | r6 |
+| a4 | 1 | 0 | | 0 | 0 | 1 | r1 |
+| a5 | 1 | 1 | | 1 | 0 | 1 | r5 |
+| a6 | 1 | 2 | | 0 | 1 | 1 | r3 |
+| a7 | 1 | 3 | | 1 | 1 | 1 | r7 |
 
 The following is exemplified by a concrete example in which the user computes a two-dimensional matrix multiplication over 8 cards: `Y = (X * W)` , where the devices are organized according to `2 * 2 * 2`, and the cut of X coincides with the cut of the tensor a described above:
 
diff --git a/docs/mindspore/source_zh_cn/model_train/parallel/advanced_operator_parallel.md b/docs/mindspore/source_zh_cn/model_train/parallel/advanced_operator_parallel.md
index c6b9ad8d4d4e38cc415f3346130d9906723f2707..5dfda9ee79ab593c1040edba3cedfee94928fa3d 100644
--- a/docs/mindspore/source_zh_cn/model_train/parallel/advanced_operator_parallel.md
+++ b/docs/mindspore/source_zh_cn/model_train/parallel/advanced_operator_parallel.md
@@ -43,6 +43,20 @@ a_strategy = layout("mp", ("sp", "dp"))
 ![image](images/advanced_operator_parallel_view1.PNG)
+上述图标具体坐标映射原理是`a[x][y]=>Rank[mp]([sp][dp])`,$a$的坐标y=0,1,2,3从dp维开始填充,满了每一维度的进制数之后开始填充到下一维(sp维),
+具体是下面的表格对应关系:
+
+| Arr | x->mp | y->(sp,dp) | => | dp | sp | mp | Rank |
+| --- | ----- | ---------- | --- | --- | --- | --- | ---- |
+| a0 | 0 | 0 | | 0 | 0 | 0 | r0 |
+| a1 | 0 | 1 | | 1 | 0 | 0 | r4 |
+| a2 | 0 | 2 | | 0 | 1 | 0 | r2 |
+| a3 | 0 | 3 | | 1 | 1 | 0 | r6 |
+| a4 | 1 | 0 | | 0 | 0 | 1 | r1 |
+| a5 | 1 | 1 | | 1 | 0 | 1 | r5 |
+| a6 | 1 | 2 | | 0 | 1 | 1 | r3 |
+| a7 | 1 | 3 | | 1 | 1 | 1 | r7 |
+
 下面以一个具体的例子进行示例,用户在8个卡上计算二维矩阵乘:`Y = (X * W)` ,其中设备按照`2 * 2 * 2`进行组织,X的切分与上述的张量a切分一致:
 
 ```python
diff --git a/docs/sample_code/startup_method/msrun_single.sh b/docs/sample_code/startup_method/msrun_single.sh
index d7941671d6110272f155265cee2e43e779ca34a5..619feafef9f5f896c2b0a0298484563aaca63e33 100644
--- a/docs/sample_code/startup_method/msrun_single.sh
+++ b/docs/sample_code/startup_method/msrun_single.sh
@@ -18,4 +18,4 @@
 rm -rf msrun_log
 mkdir msrun_log
 echo "start training"
-msrun --worker_num=8 --local_worker_num=8 --master_port=8118 --log_dir=msrun_log --join=True --cluster_time_out=300 net.py
+msrun --worker_num=8 --local_worker_num=4 --master_port=8118 --log_dir=msrun_log --join=True --cluster_time_out=300 net.py
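Note on the mapping table added to `advanced_operator_parallel.md`: the rows can be checked with a few lines of plain Python. The sketch below is only an illustration and does not call MindSpore. It assumes the `2 * 2 * 2` device matrix orders its axes as (dp, sp, mp), so that `Rank = 4*dp + 2*sp + mp`; that formula is inferred from the table rows, not taken from the library. The names `AXIS_SIZE`, `slice_to_rank` and `mapping_table` are hypothetical.

```python
from itertools import product

# Assumption: device matrix 2 x 2 x 2 with axes ordered (dp, sp, mp).
AXIS_SIZE = {"dp": 2, "sp": 2, "mp": 2}


def slice_to_rank(x, y):
    """Map slice a[x][y] to (dp, sp, mp, rank) for layout("mp", ("sp", "dp"))."""
    mp = x  # tensor dimension 0 is split along the mp axis
    # Dimension 1 is split along (sp, dp): dp is the fastest-varying digit,
    # and sp increments once dp has run through its full size.
    sp, dp = divmod(y, AXIS_SIZE["dp"])
    # Rank in the assumed (dp, sp, mp) device matrix, i.e. 4*dp + 2*sp + mp.
    rank = (dp * AXIS_SIZE["sp"] + sp) * AXIS_SIZE["mp"] + mp
    return dp, sp, mp, rank


def mapping_table():
    """Rebuild the rows of the table: (slice name, x, y, dp, sp, mp, rank name)."""
    rows = []
    y_count = AXIS_SIZE["sp"] * AXIS_SIZE["dp"]
    for x, y in product(range(AXIS_SIZE["mp"]), range(y_count)):
        dp, sp, mp, rank = slice_to_rank(x, y)
        rows.append((f"a{x * y_count + y}", x, y, dp, sp, mp, f"r{rank}"))
    return rows


for row in mapping_table():
    print(row)
```

Under these assumptions the printed rows reproduce the table in the same order, which makes it easy to re-check the ranks if the layout or device matrix in the doc changes.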