diff --git a/README.md b/README.md index b6ac17dec7cd8f039d26ddcdc6e7712abde3a6f0..c36c978a11369392565bc317657a3aa34b28264b 100644 --- a/README.md +++ b/README.md @@ -50,7 +50,7 @@ For more information about `MindSpore` framework, please refer to [FAQ](https:// - **Q: What is Some *RANK_TBAL_FILE* which mentioned in many models?** - **A**: *RANK_TABLE_FILE* is the config file of cluster on Ascend while running distributed training. For more information, you could refer to the generator [hccl_tools](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) and [Parallel Distributed Training Example](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) + **A**: *RANK_TABLE_FILE* is the config file of cluster on Ascend while running distributed training. For more information, you could refer to the generator [hccl_tools](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) and [Parallel Distributed Training Example](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) - **Q: How to run the scripts on Windows system?** diff --git a/README_CN.md b/README_CN.md index 0428a6864295155cdf13d21d97cc237dd0b9ab19..a655cd380825e3a5f607c57a1266834322001b4f 100644 --- a/README_CN.md +++ b/README_CN.md @@ -50,11 +50,11 @@ MindSpore已获得Apache 2.0许可,请参见LICENSE文件。 - **Q: 一些模型描述中提到的*RANK_TABLE_FILE*文件,是什么?** - **A**: *RANK_TABLE_FILE*是一个Ascend环境上用于指定分布式集群信息的文件,更多信息可以参考生成工具[hccl_toos](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)和[分布式并行训练教程](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html) + **A**: *RANK_TABLE_FILE*是一个Ascend环境上用于指定分布式集群信息的文件,更多信息可以参考生成工具[hccl_tools](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)和[分布式并行训练教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html) - **Q: 如何使用多机多卡运行脚本** - **A**: 
本仓内所提供的分布式(distribute)运行启动默认为单机多卡,如需多机多卡启动需要在单机多卡的基础上进行一定程度的适配,可参考[多机多卡分布式教程](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html#%E5%A4%9A%E6%9C%BA%E5%A4%9A%E5%8D%A1) + **A**: 本仓内所提供的分布式(distribute)运行启动默认为单机多卡,如需多机多卡启动需要在单机多卡的基础上进行一定程度的适配,可参考[多机多卡分布式教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html#%E5%A4%9A%E6%9C%BA%E5%A4%9A%E5%8D%A1) - **Q: 在windows环境上要怎么运行网络脚本?** diff --git a/official/cv/CTPN/README.md b/official/cv/CTPN/README.md index 47be063e7836a0daaf09b57dfed71dd20e3e356f..e05fc84d6925016a2018ca89e27482f99ee2b80b 100644 --- a/official/cv/CTPN/README.md +++ b/official/cv/CTPN/README.md @@ -246,7 +246,7 @@ imagenet_cfg = edict({ Then you can train it with ImageNet2012. > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` > diff --git a/official/cv/CTPN/README_CN.md b/official/cv/CTPN/README_CN.md index 5484d81459f2fb50daba9b7310463555b4710356..bb04f79b028a9c13558f2004b38332e2e65efd7a 100644 --- a/official/cv/CTPN/README_CN.md +++ b/official/cv/CTPN/README_CN.md @@ -234,7 +234,7 @@ imagenet_cfg = edict({ 然后,您可以使用ImageNet2012训练它。 > 注: -> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 +> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 > > 处理器绑核操作取决于`device_num`和总处理器数。如果不希望这样做,请删除`scripts/run_distribute_train.sh`中的`taskset`操作。 > diff --git a/official/cv/DeepText/README.md b/official/cv/DeepText/README.md index 73854bf83d77abdb0c2a3c860c892a0f64d923f5..42fffda7ad8d0d166a5b89ebed0d6fc297e85d3a 100644 --- a/official/cv/DeepText/README.md +++ b/official/cv/DeepText/README.md @@ -143,7 +143,7 @@ Here we used 4 datasets for training, and 1 datasets for Evaluation. ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. 
Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` > diff --git a/official/cv/DeepText/README_CN.md b/official/cv/DeepText/README_CN.md index 2a69543b72c7c2581d96d80162b8e7a3b9b99095..604a3517a24c2ae9c9b361532c91b3e395cfe787 100644 --- a/official/cv/DeepText/README_CN.md +++ b/official/cv/DeepText/README_CN.md @@ -133,7 +133,7 @@ InceptionV4的整体网络架构如下: ``` > 注: -> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 +> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 > > 处理器绑核操作取决于`device_num`和总处理器数。如果不希望这样做,请删除`scripts/run_distribute_train.sh`中的`taskset`操作。 > diff --git a/official/cv/Inception/inceptionv4/README.md b/official/cv/Inception/inceptionv4/README.md index 
7b96795d6c8a93adf6fc71c1fbbfac711e500cee..df52f98bfd50baf3853a6db8606f55e56c6505fd 100644 --- a/official/cv/Inception/inceptionv4/README.md +++ b/official/cv/Inception/inceptionv4/README.md @@ -279,7 +279,7 @@ You can start training using python or shell scripts. The usage of shell scripts ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` diff --git a/official/cv/Inception/inceptionv4/README_CN.md b/official/cv/Inception/inceptionv4/README_CN.md index 9478adfd0d5e659874659edff36a8d8f4e1c13b9..9502b76267b3d0f45ca0818dbc4dd807314e5388 100644 --- a/official/cv/Inception/inceptionv4/README_CN.md +++ b/official/cv/Inception/inceptionv4/README_CN.md @@ -267,7 +267,7 @@ train.py和config.py中的主要涉及如下参数: ``` > 注: -> 有关RANK_TABLE_FILE,可参考[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html)。设备IP可参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大型模型,最好设置外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,可能会连接超时,因为编译时间会随着模型增大而增加。 +> 有关RANK_TABLE_FILE,可参考[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。设备IP可参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大型模型,最好设置外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,可能会连接超时,因为编译时间会随着模型增大而增加。 > > 绑核操作取决于`device_num`参数值及处理器总数。如果不需要,删除`scripts/run_distribute_train.sh`脚本中的`taskset`操作任务集即可。 diff --git a/official/cv/Inception/xception/README.md b/official/cv/Inception/xception/README.md index 28482d0281049f5727a6e6c9d345a13ba986d287..a0181e25aed0ae6c879e090f94acda0c3d040424 100644 --- a/official/cv/Inception/xception/README.md +++ b/official/cv/Inception/xception/README.md @@ -189,7 +189,7 @@ You can start training using python or shell scripts. The usage of shell scripts bash run_infer_310.sh MINDIR_PATH DATA_PATH LABEL_FILE DEVICE_ID ``` -> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
+> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). ### Launch diff --git a/official/cv/Inception/xception/README_CN.md b/official/cv/Inception/xception/README_CN.md index 02cc8207b48529923d3bca9c223bcb037eb6f60b..f36fc15c1546940f09e10247120247e3cd7b5d5d 100644 --- a/official/cv/Inception/xception/README_CN.md +++ b/official/cv/Inception/xception/README_CN.md @@ -189,7 +189,7 @@ Xception的整体网络架构如下: bash run_infer_310.sh MINDIR_PATH DATA_PATH LABEL_FILE DEVICE_ID ``` -> 注:RANK_TABLE_FILE可以参考[链接](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html),device_ip可以参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 +> 注:RANK_TABLE_FILE可以参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html),device_ip可以参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 ### 启动 diff --git a/official/cv/ResNet/README.md b/official/cv/ResNet/README.md index 680ef11aeadc685293018efbb167e88efc467205..26c078d21520df8ddc41c52d94f755f5a99e2e4b 100644 --- a/official/cv/ResNet/README.md +++ b/official/cv/ResNet/README.md @@ -480,7 +480,7 @@ bash run_eval_gpu_resnet_benchmark.sh [DATASET_PATH] [CKPT_PATH] [BATCH_SIZE](op For distributed training, a hostfile configuration needs to be created in advance. -Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/tutorials/experts/en/master/parallel/mpirun.html). +Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html). 
#### Running parameter server mode training diff --git a/official/cv/RetinaNet/README.md b/official/cv/RetinaNet/README.md index 76e6bd72c937d4a3afa1e4a52c7fde4c88b9e554..51b9f5503117bbbbe8cb90a2612c9ba51a731156 100644 --- a/official/cv/RetinaNet/README.md +++ b/official/cv/RetinaNet/README.md @@ -208,7 +208,7 @@ bash scripts/run_single_train.sh DEVICE_ID MINDRECORD_DIR CONFIG_PATH PRE_TRAINE > Note: - For details about RANK_TABLE_FILE, see [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html). For details about how to obtain device IP address, see [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). + For details about RANK_TABLE_FILE, see [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). For details about how to obtain device IP address, see [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). #### Running diff --git a/official/cv/RetinaNet/README_CN.md b/official/cv/RetinaNet/README_CN.md index 837efcf31a5f5dab3227918e53bd0f22343db158..ee093f3394469c87608f3d77552f4f905a72e799 100644 --- a/official/cv/RetinaNet/README_CN.md +++ b/official/cv/RetinaNet/README_CN.md @@ -203,7 +203,7 @@ bash scripts/run_single_train.sh DEVICE_ID MINDRECORD_DIR CONFIG_PATH PRE_TRAINE > 注意: - RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 + RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 #### 运行 diff --git a/official/cv/Unet/README_CN.md b/official/cv/Unet/README_CN.md index ecf29e4303f3cbb24d97b0f1280009f002444588..cd3aea7da87547ac103e5e1f6ca67abe87130b92 100644 --- a/official/cv/Unet/README_CN.md +++ b/official/cv/Unet/README_CN.md @@ -607,7 +607,7 @@ bash ./scripts/run_eval_onnx.sh 
[DATASET_PATH] [ONNX_MODEL] [DEVICE_TARGET] [CON **推理前需参照 [MindSpore C++推理部署指南](https://gitee.com/mindspore/models/blob/master/utils/cpp_infer/README_CN.md) 进行环境变量设置。** -如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是一个简单的操作步骤示例: +如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是一个简单的操作步骤示例: ### 继续训练预训练模型 diff --git a/official/cv/VGG/vgg16/README.md b/official/cv/VGG/vgg16/README.md index 163b06ecad3304b1fa20c8168fe476c183ac2f08..931d40a708d33c554bfb537ee34d2006b76398c8 100644 --- a/official/cv/VGG/vgg16/README.md +++ b/official/cv/VGG/vgg16/README.md @@ -530,7 +530,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... ``` -> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorials/experts/en/master/parallel/overview.html). +> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html). > **Attention** This will bind the processor cores according to the `device_num` and total processor numbers. If you don't expect to run pretraining with binding processor cores, remove the operations about `taskset` in `scripts/run_distribute_train.sh` ##### Run vgg16 on GPU diff --git a/official/cv/VGG/vgg16/README_CN.md b/official/cv/VGG/vgg16/README_CN.md index a8fb19e7530fef94606211b67da007c7f8a0c0a6..12d1fae1c6d20e22132dd14accb2a4070e0674aa 100644 --- a/official/cv/VGG/vgg16/README_CN.md +++ b/official/cv/VGG/vgg16/README_CN.md @@ -530,7 +530,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... 
``` -> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/overview.html)。 +> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/overview.html)。 > **注意** 将根据`device_num`和处理器总数绑定处理器核。如果您不希望预训练中绑定处理器内核,请在`scripts/run_distribute_train.sh`脚本中移除`taskset`相关操作。 ##### GPU处理器环境运行VGG16 diff --git a/official/cv/VGG/vgg19/README.md b/official/cv/VGG/vgg19/README.md index d0221fc199387ced5dde9f17608285b845c8f599..bd2a9724ebb1ae11b8b37a6f091bf9852be2198e 100644 --- a/official/cv/VGG/vgg19/README.md +++ b/official/cv/VGG/vgg19/README.md @@ -453,7 +453,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... ``` -> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorials/experts/en/master/parallel/overview.html). +> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html). > **Attention** This will bind the processor cores according to the `device_num` and total processor numbers. If you don't expect to run pretraining with binding processor cores, remove the operations about `taskset` in `scripts/run_distribute_train.sh` ##### Run vgg19 on GPU diff --git a/official/cv/VGG/vgg19/README_CN.md b/official/cv/VGG/vgg19/README_CN.md index 5b046bb7649b6059ffeb6fe770d04c669582ff28..7d9b1710bf2c79feccd1c1d2a2b328169d04d080 100644 --- a/official/cv/VGG/vgg19/README_CN.md +++ b/official/cv/VGG/vgg19/README_CN.md @@ -466,7 +466,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... 
``` -> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/overview.html)。 +> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/overview.html)。 > **注意** 将根据`device_num`和处理器总数绑定处理器核。如果您不希望预训练中绑定处理器内核,请在`scripts/run_distribute_train.sh`脚本中移除`taskset`相关操作。 ##### GPU处理器环境运行VGG19 diff --git a/official/cv/VIT/README_CN.md b/official/cv/VIT/README_CN.md index 601a9508a29432f56190a3715a434706a12a6cd2..120ac25c7f146541559ec2b44e79299a1a6151b3 100644 --- a/official/cv/VIT/README_CN.md +++ b/official/cv/VIT/README_CN.md @@ -451,7 +451,7 @@ python export.py --config_path=[CONFIG_PATH] ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是操作步骤示例: - Ascend处理器环境运行 diff --git a/official/nlp/Pangu_alpha/README.md b/official/nlp/Pangu_alpha/README.md index c07ba9f077837bdb80fe6aa5eef44aebaedf95aa..118a38c4229b2159483ac8d0b51c41d039dc20a2 100644 --- a/official/nlp/Pangu_alpha/README.md +++ b/official/nlp/Pangu_alpha/README.md @@ -51,7 +51,7 @@ with our parallel setting. We summarized the training tricks as following: 2. Pipeline Model Parallelism 3. Optimizer Model Parallelism -The above features can be found [here](https://www.mindspore.cn/tutorials/experts/en/master/parallel/overview.html). +The above features can be found [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html). More amazing features are still under developing. The technical report and checkpoint file can be found [here](https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-AIpha). 
@@ -157,7 +157,7 @@ bash scripts/run_distribute_train.sh /data/pangu_30_step_ba64/ /root/hccl_8p.jso The above command involves some `args` described below: - DATASET: The path to the mindrecord files's parent directory . For example: `/home/work/mindrecord/`. -- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html). It's a json file describes the `device id`, `service ip` and `rank`. +- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). It's a JSON file that describes the `device id`, `service ip` and `rank`. - RANK_SIZE: The device number. This can be your total device numbers. For example, 8, 16, 32 ... - TYPE: The param init type. The parameters will be initialized with float32. Or you can replace it with `fp16`. This will save a little memory used on the device. - MODE: The configure mode. This mode will set the `hidden size` and `layers` to make the parameter number near 2.6 billions. The other mode can be `13B` (`hidden size` 5120 and `layers` 40, which needs at least 16 cards to train.) and `200B`. @@ -206,7 +206,7 @@ bash scripts/run_distribute_train_gpu.sh RANK_SIZE HOSTFILE DATASET PER_BATCH MO ``` - RANK_SIZE: The device number. This can be your total device numbers. For example, 8, 16, 32 ... -- HOSTFILE: It's a text file describes the host ip and its devices. Please see our [tutorial](https://www.mindspore.cn/tutorials/experts/en/master/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/) for more details. +- HOSTFILE: It's a text file that describes the host ip and its devices. Please see our [tutorial](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/) for more details. - DATASET: The path to the mindrecord files's parent directory . For example: `/home/work/mindrecord/`. 
- PER_BATCH: The batch size for each data parallel-way. - MODE: Can be `1.3B` `2.6B`, `13B` and `200B`. @@ -228,7 +228,7 @@ bash scripts/run_distribute_train_moe_host_device.sh DATASET RANK_TABLE RANK_SIZ The above command involves some `args` described below: - DATASET: The path to the mindrecord files's parent directory . For example: `/home/work/mindrecord/`. -- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html). It's a json file describes the `device id`, `service ip` and `rank`. +- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). It's a JSON file that describes the `device id`, `service ip` and `rank`. - RANK_SIZE: The device number. This can be your total device numbers. For example, 8, 16, 32 ... - TYPE: The param init type. The parameters will be initialized with float32. Or you can replace it with `fp16`. This will save a little memory used on the device. - MODE: The configure mode. This mode will set the `hidden size` and `layers` to make the parameter number near 2.6 billions. The other mode can be `13B` (`hidden size` 5120 and `layers` 40, which needs at least 16 cards to train.) and `200B`. diff --git a/official/nlp/Pangu_alpha/README_CN.md b/official/nlp/Pangu_alpha/README_CN.md index dde68a147a952a50c401db1a3a024bedd8eec51a..9e272de851f76e60136fd4f97afc9bf15fbec222 100644 --- a/official/nlp/Pangu_alpha/README_CN.md +++ b/official/nlp/Pangu_alpha/README_CN.md @@ -51,7 +51,7 @@ 2. 流水线模型并行 3. 
优化器模型并行 -有关上述特性,请点击[此处](https://www.mindspore.cn/tutorials/experts/en/master/parallel/overview.html)查看详情。 +有关上述特性,请点击[此处](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html)查看详情。 更多特性敬请期待。 详细技术报告和检查点文件,可点击[此处](https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-AIpha)查看。 @@ -156,7 +156,7 @@ bash scripts/run_distribute_train.sh /data/pangu_30_step_ba64/ /root/hccl_8p.jso 上述命令涉及以下`args`: - DATASET:mindrecord文件父目录的路径。例如:`/home/work/mindrecord/`。 -- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html)查看。该.json文件描述了`device id`、`service ip`和`rank`。 +- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)查看。该.json文件描述了`device id`、`service ip`和`rank`。 - RANK_SIZE:设备编号,也可以表示设备总数。例如,8、16、32 ... - TYPE:参数初始化类型。参数使用单精度(FP32) 或半精度(FP16)初始化。可以节省设备占用内存。 - MODE:配置模式。通过设置`hidden size`和`layers`,将参数量增至26亿。还可以选择13B(`hidden size`为5120和`layers`为40,训练至少需要16卡)和200B模式。 @@ -205,7 +205,7 @@ bash scripts/run_distribute_train_gpu.sh RANK_SIZE HOSTFILE DATASET PER_BATCH MO ``` - RANK_SIZE:设备编号,也可以表示设备总数。例如,8、16、32 ... 
-- HOSTFILE:描述主机IP及其设备的文本文件。有关更多详细信息,请参见我们的[教程](https://www.mindspore.cn/tutorials/experts/en/master/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/)。 +- HOSTFILE:描述主机IP及其设备的文本文件。有关更多详细信息,请参见我们的[教程](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html)或[OpenMPI](https://www.open-mpi.org/)。 - DATASET:mindrecord文件父目录的路径。例如:`/home/work/mindrecord/`。 - PER_BATCH:每个数据并行的批处理大小, - MODE:可以是`1.3B`、`2.6B`、`13B`或`200B`。 @@ -227,7 +227,7 @@ bash scripts/run_distribute_train_moe_host_device.sh DATASET RANK_TABLE RANK_SIZ 上述命令涉及以下args: - DATASET:mindrecord文件父目录的路径。例如:`/home/work/mindrecord/`。 -- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html)查看。该.json文件描述了device id、service ip和rank。 +- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)查看。该.json文件描述了device id、service ip和rank。 - RANK_SIZE:设备编号,也可以是您的设备总数。例如,8、16、32 ... - TYPE:参数初始化类型。参数使用单精度(FP32) 或半精度(FP16)初始化。可以节省设备占用内存。 - MODE:配置模式。通过设置`hidden size`和`layers`,将参数量增至26亿。还可以选择`13B`(`hidden size`为5120和`layers`为40,训练至少需要16卡)和`200B`模式。 diff --git a/research/cv/3D_DenseNet/README.md b/research/cv/3D_DenseNet/README.md index a7d502906496627ea362c8944be3fa4967b4c284..1b8a78ffef1c42b2e7adda96dc0f05a6a42a59e8 100644 --- a/research/cv/3D_DenseNet/README.md +++ b/research/cv/3D_DenseNet/README.md @@ -222,7 +222,7 @@ Dice Coefficient (DC) for 9th subject (9 subjects for training and 1 subject for |-------------------|:-------------------:|:---------------------:|:-----:|:--------------:| |3D-SkipDenseSeg | 93.66| 90.80 | 90.65 | 91.70 | -Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) For large models like InceptionV4, it's better to export an external environment variable export HCCL_CONNECT_TIMEOUT=600 
to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. To avoid ops error,you should change the code like below: +Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) For large models like InceptionV4, it's better to export an external environment variable export HCCL_CONNECT_TIMEOUT=600 to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out since compiling time increases with the growth of model size. To avoid ops error, you should change the code like below: in train.py: diff --git a/research/cv/3D_DenseNet/README_CN.md b/research/cv/3D_DenseNet/README_CN.md index b46f106999f968aa8450aacb7ca3aca27945f16c..022157600adae3fdd198ad5a1746aedc963f26e0 100644 --- a/research/cv/3D_DenseNet/README_CN.md +++ b/research/cv/3D_DenseNet/README_CN.md @@ -212,7 +212,7 @@ bash run_eval.sh 3D-DenseSeg-20000_36.ckpt data/data_val |-------------------|:-------------------:|:---------------------:|:-----:|:--------------:| |3D-SkipDenseSeg | 93.66| 90.80 | 90.65 | 91.70 | -Notes: 分布式训练需要一个RANK_TABLE_FILE,文件的删除方式可以参考该链接[Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) ,device_ip的设置参考该链接 [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) 对于像InceptionV4这样的大模型来说, 最好导出一个外部环境变量,export HCCL_CONNECT_TIMEOUT=600,以将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增加而增加。在1.3.0版本下,3D算子可能存在一些问题,您可能需要更改context.set_auto_parallel_context的部分代码: +Notes: 分布式训练需要一个RANK_TABLE_FILE,文件的生成方式可以参考该链接[Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) ,device_ip的设置参考该链接 [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) 
对于像InceptionV4这样的大模型来说, 最好导出一个外部环境变量,export HCCL_CONNECT_TIMEOUT=600,以将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增加而增加。在1.3.0版本下,3D算子可能存在一些问题,您可能需要更改context.set_auto_parallel_context的部分代码: in train.py: diff --git a/research/cv/AlignedReID++/README_CN.md b/research/cv/AlignedReID++/README_CN.md index ff8c6ae8193366b94163e1121aec9f7b92186a4b..559a94c110e06e60cad19b6920fa0c88547dd862 100644 --- a/research/cv/AlignedReID++/README_CN.md +++ b/research/cv/AlignedReID++/README_CN.md @@ -405,7 +405,7 @@ market1501上评估AlignedReID++ ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是操作步骤示例: 在进行推理之前我们需要先导出模型,mindir可以在本地环境上导出。batch_size默认为1。 diff --git a/research/cv/C3D/README.md b/research/cv/C3D/README.md index 3c710bdfcd9da53e2c114114748f269565fb0282..6102a83e4a1c2658fcb1cad157cff7bf48aa8471 100644 --- a/research/cv/C3D/README.md +++ b/research/cv/C3D/README.md @@ -465,7 +465,7 @@ The above shell script will run distribute training in the background. You can v #### Distributed training on Ascend > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. 
+> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could time out since compiling time increases with the growth of model size. > ```text diff --git a/research/cv/C3D/README_CN.md b/research/cv/C3D/README_CN.md index 574da58dcd8e3e6ca9ea31d89d37f8d0df46317c..7678cd917f9f4aa3682d48bfe093fabf0e5a3005 100644 --- a/research/cv/C3D/README_CN.md +++ b/research/cv/C3D/README_CN.md @@ -456,7 +456,7 @@ bash run_standalone_train_gpu.sh [CONFIG_PATH] [DEVICE_ID] #### Ascend分布式训练 > 注: -> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 +> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 > ```text diff --git a/research/cv/EGnet/README_CN.md b/research/cv/EGnet/README_CN.md index fcd0d32c455702975296459a0835b6191815ade2..e486c8f6c0023c9dd3861de373b2946dfe1f87fc 100644 --- a/research/cv/EGnet/README_CN.md +++ b/research/cv/EGnet/README_CN.md @@ -363,7 +363,7 @@ bash run_standalone_train_gpu.sh bash run_distribute_train.sh 8 [RANK_TABLE_FILE] ``` -线下运行分布式训练请参照[rank table启动](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html) +线下运行分布式训练请参照[rank 
table启动](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html) - 线上modelarts分布式训练 diff --git a/research/cv/LightCNN/README.md b/research/cv/LightCNN/README.md index 409d83b0eadf8bb97046318d15867a522fc5621f..c2de4a52392a28c433d1e70c91fe87af26ba6251 100644 --- a/research/cv/LightCNN/README.md +++ b/research/cv/LightCNN/README.md @@ -139,7 +139,7 @@ reduce precision" to view the operators with reduced precision. - Generate config json file for 8-card training - [Simple tutorial](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) - For detailed configuration method, please refer to - the [rank table Startup](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html). + the [rank table Startup](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). # [Quick start](#Quickstart) diff --git a/research/cv/LightCNN/README_CN.md b/research/cv/LightCNN/README_CN.md index 94e9e5837d77dd68725b2ff99bec77f7137630a2..d363114b856317c2005ca9a9e76817a7d98d99e0 100644 --- a/research/cv/LightCNN/README_CN.md +++ b/research/cv/LightCNN/README_CN.md @@ -107,7 +107,7 @@ LightCNN适用于有大量噪声的人脸识别数据集,提出了maxout 的 - [MindSpore Python API](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.html) - 生成config json文件用于8卡训练。 - [简易教程](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) - - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html)。 + - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。 # 快速入门 diff --git a/research/cv/Unet3d/README.md b/research/cv/Unet3d/README.md index 203b77eb1a981715d71e80e766ada7824d3fc75b..4e823c20773f66f19819452b37022e415078b50a 100644 --- a/research/cv/Unet3d/README.md +++ b/research/cv/Unet3d/README.md @@ -312,7 +312,7 @@ After training, you'll get some checkpoint files under the `train_parallel_fp[32 #### Distributed training on Ascend > Notes: -> 
RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. 
> ```shell diff --git a/research/cv/Unet3d/README_CN.md b/research/cv/Unet3d/README_CN.md index f1211bd146a464efa90d2db9b2c15398a2dae50a..1c27edd3493e3d207f5e257b1dddfe5b2b5068a2 100644 --- a/research/cv/Unet3d/README_CN.md +++ b/research/cv/Unet3d/README_CN.md @@ -312,7 +312,7 @@ bash ./run_distribute_train_gpu_fp16.sh /path_prefix/LUNA16/train #### 在Ascend上进行分布式训练 > 注: -> RANK_TABLE_FILE参考[链接](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html),device_ip参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将HCCL连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增长而增加。 +> RANK_TABLE_FILE参考[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html),device_ip参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将HCCL连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增长而增加。 > ```shell diff --git a/research/cv/cnnctc/README_CN.md b/research/cv/cnnctc/README_CN.md index 54b9eb6a13d8f789e2b2c26f6774b1fc87991a58..1454105ddae5e26ea3c9745aa2d0fd65793e6361 100644 --- a/research/cv/cnnctc/README_CN.md +++ b/research/cv/cnnctc/README_CN.md @@ -261,7 +261,7 @@ bash scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_CKPT(o > 注意: - RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). + RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools).
### 训练结果 @@ -485,7 +485,7 @@ accuracy: 0.8427 ### 推理 -如果您需要在GPU、Ascend 910、Ascend 310等多个硬件平台上使用训练好的模型进行推理,请参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。以下为简单示例: +如果您需要在GPU、Ascend 910、Ascend 310等多个硬件平台上使用训练好的模型进行推理,请参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。以下为简单示例: - Ascend处理器环境运行 diff --git a/research/cv/cspdarknet53/README.md b/research/cv/cspdarknet53/README.md index f5130a3b12c2fa0937e7ed12eae786a43e52067c..5ddf567a9c0f835ce140a01a614470bf30b0bb15 100644 --- a/research/cv/cspdarknet53/README.md +++ b/research/cv/cspdarknet53/README.md @@ -206,7 +206,7 @@ bash run_distribute_train.sh [RANK_TABLE_FILE] [DATA_DIR] (option)[PATH_CHECKPOI bash run_standalone_train.sh [DEVICE_ID] [DATA_DIR] (option)[PATH_CHECKPOINT] ``` -> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV3, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV3, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. 
> > This is processor cores binding operation regarding the `device_num` and total processor numbers. If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` diff --git a/research/cv/dlinknet/README_CN.md b/research/cv/dlinknet/README_CN.md index 139e0b13d4065ea509f2bfccb67b618ce8ffba81..86a13f42ca56e48e18c9945a5d2ad00e680ac8fb 100644 --- a/research/cv/dlinknet/README_CN.md +++ b/research/cv/dlinknet/README_CN.md @@ -333,7 +333,7 @@ bash scripts/run_distribute_gpu_train.sh [DATASET] [CONFIG_PATH] [DEVICE_NUM] [C #### 推理 -如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是一个简单的操作步骤示例: +如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是一个简单的操作步骤示例: ##### Ascend 310环境运行 diff --git a/research/cv/east/README.md b/research/cv/east/README.md index f9b125a59f5d955a747f4b29a21fe9140503794c..bfe3231dbaec8b0ce39a098753f6c47c011af6b4 100644 --- a/research/cv/east/README.md +++ b/research/cv/east/README.md @@ -134,7 +134,7 @@ bash run_eval_gpu.sh [DATASET_PATH] [CKPT_PATH] [DEVICE_ID] ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` > diff --git a/research/cv/googlenet/README_CN.md b/research/cv/googlenet/README_CN.md index 1c60a9eba68e4dc3ef40bba27c718fecc0a9d327..883104b4f5284c073f554845c5bd6a2d96f9a44f 100644 --- a/research/cv/googlenet/README_CN.md +++ b/research/cv/googlenet/README_CN.md @@ -598,7 +598,7 @@ python export.py --config_path [CONFIG_PATH] ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是操作步骤示例: - Ascend处理器环境运行 diff --git a/research/cv/hardnet/README_CN.md b/research/cv/hardnet/README_CN.md index 65b6a7d2b6da55a329fff5efcc8be1b4ee162321..44f397ae05a802c12178446059551b2680088b78 100644 --- a/research/cv/hardnet/README_CN.md +++ b/research/cv/hardnet/README_CN.md @@ -449,7 +449,7 @@ bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DEVICE_ID] ### 推理 -如果您需要使用此训练模型在Ascend 910上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是操作步骤示例: +如果您需要使用此训练模型在Ascend 910上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是操作步骤示例: - Ascend处理器环境运行 @@ -486,7 +486,7 @@ bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DEVICE_ID] print("==============Acc: {} ==============".format(acc)) ``` 
-如果您需要使用此训练模型在GPU上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是操作步骤示例: - GPU处理器环境运行 diff --git a/research/cv/inception_resnet_v2/README.md b/research/cv/inception_resnet_v2/README.md index 4ef422773d3629c6fd55b5d7d3dad7a3e7c4ad1f..d7a3350677c0736077c989336d3005435cf7df95 100644 --- a/research/cv/inception_resnet_v2/README.md +++ b/research/cv/inception_resnet_v2/README.md @@ -122,7 +122,7 @@ bash scripts/run_standalone_train_ascend.sh DEVICE_ID DATA_DIR ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` diff --git a/research/cv/inception_resnet_v2/README_CN.md b/research/cv/inception_resnet_v2/README_CN.md index 70943fca3c65bd8475b0fc7048db6faa10cdc4b2..2358f9ae41b9806f20d85e70a07948e91c3b0d1b 100644 --- a/research/cv/inception_resnet_v2/README_CN.md +++ b/research/cv/inception_resnet_v2/README_CN.md @@ -133,7 +133,7 @@ bash scripts/run_distribute_train_ascend.sh RANK_TABLE_FILE DATA_DIR bash scripts/run_standalone_train_ascend.sh DEVICE_ID DATA_DIR ``` -> 注:RANK_TABLE_FILE可参考[链接]( https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html)。device_ip可以通过[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)获取 +> 注:RANK_TABLE_FILE可参考[链接]( https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。device_ip可以通过[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)获取 - GPU: diff --git a/research/cv/nas-fpn/README_CN.md b/research/cv/nas-fpn/README_CN.md index 83c05f1e579ed9edb41573b68963c53ef1d0f19e..d0a1a21fef0533f7421b3a754bc401fc79087fb6 100644 --- a/research/cv/nas-fpn/README_CN.md +++ b/research/cv/nas-fpn/README_CN.md @@ -161,7 +161,7 @@ bash scripts/run_single_train.sh DEVICE_ID MINDRECORD_DIR PRE_TRAINED(optional) ``` > 注意: -RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). +RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
#### 运行 diff --git a/research/cv/osnet/README.md b/research/cv/osnet/README.md index 2bbafb4798fd01f59b72fd6304301ff6cdda05c5..6303d8f5a1d492cb19a267724c30570926ad08e2 100644 --- a/research/cv/osnet/README.md +++ b/research/cv/osnet/README.md @@ -160,7 +160,7 @@ bash run_eval_ascend.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_train_distribute_ascend.sh` > diff --git a/research/cv/retinanet_resnet101/README.md b/research/cv/retinanet_resnet101/README.md index 67bef72b14965851f50e90e5ccc27e6e74b9ca37..5df618a2fecfe01569fd80606da8838adb6a2872 100644 --- a/research/cv/retinanet_resnet101/README.md +++ b/research/cv/retinanet_resnet101/README.md @@ -287,7 +287,7 @@ bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABL bash run_single_train.sh [DEVICE_ID] [EPOCH_SIZE] [LR] [DATASET] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) ``` -> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/tutorials/experts/en/master/parallel/rank_table.html), for details on how to get device_ip check this [link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). +> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html), for details on how to get device_ip check this [link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). - GPU diff --git a/research/cv/retinanet_resnet101/README_CN.md b/research/cv/retinanet_resnet101/README_CN.md index 9b99e85186e2a6075cc3450be2e3578a5e08eac5..c237d8cf03e2dcb4f33a739fdcfda42e4df4b045 100644 --- a/research/cv/retinanet_resnet101/README_CN.md +++ b/research/cv/retinanet_resnet101/README_CN.md @@ -292,7 +292,7 @@ bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABL bash run_single_train.sh [DEVICE_ID] [EPOCH_SIZE] [LR] [DATASET] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) ``` -> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
+> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). - GPU diff --git a/research/cv/retinanet_resnet152/README.md b/research/cv/retinanet_resnet152/README.md index c64afc48435e568d856d5bc932cc587066cd23b3..6ef8d5d6f5f7cc11d908901892ffe94705621f73 100644 --- a/research/cv/retinanet_resnet152/README.md +++ b/research/cv/retinanet_resnet152/README.md @@ -291,7 +291,7 @@ bash run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PR bash run_distribute_train.sh DEVICE_ID EPOCH_SIZE LR DATASET PRE_TRAINED(optional) PRE_TRAINED_EPOCH_SIZE(optional) ``` -> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html), +> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html),
- GPU: diff --git a/research/cv/retinanet_resnet152/README_CN.md b/research/cv/retinanet_resnet152/README_CN.md index 9120e7a59eda491a1c92517abedcf128406fbd44..8d9ae8df6c9768dd3f73aee0c029ee9de0dd4735 100644 --- a/research/cv/retinanet_resnet152/README_CN.md +++ b/research/cv/retinanet_resnet152/README_CN.md @@ -285,7 +285,7 @@ bash run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PR bash run_distribute_train.sh DEVICE_ID EPOCH_SIZE LR DATASET PRE_TRAINED(optional) PRE_TRAINED_EPOCH_SIZE(optional) ``` -> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html), +> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), > 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). - GPU: diff --git a/research/cv/sphereface/README_CN.md b/research/cv/sphereface/README_CN.md index 8946eace2f6f99b5f07e83cb93bfab53c5ec405a..8a24fde6ee56a16c668ba2d1fd43e7c3e89a742e 100644 --- a/research/cv/sphereface/README_CN.md +++ b/research/cv/sphereface/README_CN.md @@ -476,7 +476,7 @@ sphereface网络使用LFW推理得到的结果如下: ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。下面是操作步骤示例: - Ascend、GPU处理器环境运行 diff --git a/research/cv/squeezenet/README.md b/research/cv/squeezenet/README.md index 48fe41b1c0b04abfd00aac8a190c4a88d916198f..8de6a7f208afb9c4c0495288bd559c842b83a1b4 100644 --- a/research/cv/squeezenet/README.md +++ b/research/cv/squeezenet/README.md @@ -720,7 +720,7 @@ Inference result is saved in current path, you can find result like this in acc. 
### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/experts/en/master/infer/inference.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/ms_infer/overview.html). Following the steps below, this is a simple example: - Running on Ascend diff --git a/research/cv/squeezenet1_1/README.md b/research/cv/squeezenet1_1/README.md index 63cfd4d24006aa4594148c419a5400166a000500..ff006ebbfbb4c2047bd8ad57d0700e378c06e235 100644 --- a/research/cv/squeezenet1_1/README.md +++ b/research/cv/squeezenet1_1/README.md @@ -306,7 +306,7 @@ Inference result is saved in current path, you can find result like this in acc. ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/experts/en/master/infer/inference.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/ms_infer/overview.html). 
Following the steps below, this is a simple example: - Running on Ascend diff --git a/research/cv/textfusenet/README.md b/research/cv/textfusenet/README.md index 2be4f567fd65eea83d54c11c0b16a800652c8a74..4eecda4be20364391a6de192232767b2f10d0968 100755 --- a/research/cv/textfusenet/README.md +++ b/research/cv/textfusenet/README.md @@ -319,7 +319,7 @@ Usage: bash run_standalone_train.sh [PRETRAINED_MODEL] ## [Training Process](#contents) -- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/tutorials/experts/en/master/dataset/augment.html) for more information about dataset. +- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/dataset/augment.html) for more information about dataset. ### [Training](#content) diff --git a/research/cv/textfusenet/README_CN.md b/research/cv/textfusenet/README_CN.md index c3435718934bf6f297700a7c19e502ffa2492ac8..635953fada28d1fbbab27055afd023557310da48 100755 --- a/research/cv/textfusenet/README_CN.md +++ b/research/cv/textfusenet/README_CN.md @@ -328,7 +328,7 @@ Shapely==1.5.9 ## 训练过程 -- 在`config.py`中设置配置项,包括loss_scale、学习率和网络超参。单击[此处](https://www.mindspore.cn/tutorials/experts/zh-CN/master/dataset/augment.html)获取更多数据集相关信息. +- 在`config.py`中设置配置项,包括loss_scale、学习率和网络超参。单击[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/augment.html)获取更多数据集相关信息. 
### 训练 diff --git a/research/cv/tinydarknet/README_CN.md b/research/cv/tinydarknet/README_CN.md index a2f66fc416d69304ade678826c0501a84d11f44b..e537488d0402ce2be96674891c2dfa78a8fce4f4 100644 --- a/research/cv/tinydarknet/README_CN.md +++ b/research/cv/tinydarknet/README_CN.md @@ -64,7 +64,7 @@ Tiny-DarkNet是Joseph Chet Redmon等人提出的一个16层的针对于经典的 - + # [环境要求](#目录) diff --git a/research/cv/vnet/README_CN.md b/research/cv/vnet/README_CN.md index 95924d26ccbcc00dea39032773a11b4bf9e0cb78..6572a68403c40e17763baff3229960baeee5648b 100644 --- a/research/cv/vnet/README_CN.md +++ b/research/cv/vnet/README_CN.md @@ -101,7 +101,7 @@ VNet适用于医学图像分割,使用3D卷积,能够处理3D MR图像数据 - [MindSpore Python API](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.html) - 生成config json文件用于多卡训练。 - [简易教程](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) - - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/tutorials/experts/zh-CN/master/parallel/rank_table.html)。 + - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。 # 快速入门 diff --git a/research/cv/wideresnet/README.md b/research/cv/wideresnet/README.md index c16dec6d8fc10dee0cdbfbe24220739f302d9099..ec4fea43aa19e1fcd7409577196ae4d4faed09f6 100644 --- a/research/cv/wideresnet/README.md +++ b/research/cv/wideresnet/README.md @@ -208,7 +208,7 @@ bash run_standalone_train_gpu.sh [DATASET_PATH] [CONFIG_PATH] [EXPERIMENT_LABEL] For distributed training, a hostfile configuration needs to be created in advance. -Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/tutorials/experts/en/master/parallel/mpirun.html). +Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html). 
##### Evaluation while training diff --git a/research/cv/wideresnet/README_CN.md b/research/cv/wideresnet/README_CN.md index b96a63bb36ddb4f6cba7a42a83e0ea7e2c35d816..339c79ff4f83fba5d8d6f86c6fbdab47d802e7d0 100644 --- a/research/cv/wideresnet/README_CN.md +++ b/research/cv/wideresnet/README_CN.md @@ -211,7 +211,7 @@ bash run_standalone_train_gpu.sh [DATASET_PATH] [CONFIG_PATH] [EXPERIMENT_LABEL] 对于分布式培训,需要提前创建主机文件配置。 -请按照链接中的说明操作 [GPU-Multi-Host](https://www.mindspore.cn/tutorials/experts/en/master/parallel/mpirun.html). +请按照链接中的说明操作 [GPU-Multi-Host](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/mpirun.html). ## 培训时的评估 diff --git a/research/nlp/mass/README.md b/research/nlp/mass/README.md index 3364fb0dde2d97bac37b236726a7b8ca830a3b45..cb5cddaacaf2daa51c8f6dcad74ec1501de18e50 100644 --- a/research/nlp/mass/README.md +++ b/research/nlp/mass/README.md @@ -501,7 +501,7 @@ subword-nmt rouge ``` - + # Get started @@ -563,7 +563,7 @@ Get the log and output files under the path `./train_mass_*/`, and the model fil ## Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/experts/en/master/infer/inference.html). +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/ms_infer/overview.html). For inference, config the options in `default_config.yaml` firstly: - Assign the `default_config.yaml` under `data_path` node to the dataset path.
diff --git a/research/nlp/mass/README_CN.md b/research/nlp/mass/README_CN.md index c2203ddc59451f45e0a98718b05b8220388f5f3a..f0053f7673ee0e72f590befb3771f8a1b349c54e 100644 --- a/research/nlp/mass/README_CN.md +++ b/research/nlp/mass/README_CN.md @@ -505,7 +505,7 @@ subword-nmt rouge ``` - + # 快速上手 @@ -567,7 +567,7 @@ bash run_gpu.sh -t t -n 1 -i 1 ## 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/experts/zh-CN/master/infer/inference.html)。 +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/ms_infer/overview.html)。 推理时,请先配置`config.json`中的选项: - 将`default_config.yaml`节点下的`data_path`配置为数据集路径。 diff --git a/research/nlp/rotate/README_CN.md b/research/nlp/rotate/README_CN.md index cf901cd7b71448bce156c4eb431afefd1075d084..a87a4910b4d7df31468f4ae1f7c705313bd102a5 100644 --- a/research/nlp/rotate/README_CN.md +++ b/research/nlp/rotate/README_CN.md @@ -86,7 +86,7 @@ bash run_infer_310.sh [MINDIR_HEAD_PATH] [MINDIR_TAIL_PATH] [DATASET_PATH] [NEED 在裸机环境(本地有Ascend 910 AI 处理器)进行分布式训练时,需要配置当前多卡环境的组网信息文件。 请遵循一下链接中的说明创建json文件: - + - GPU处理器环境运行 diff --git a/research/recommend/ncf/README.md b/research/recommend/ncf/README.md index 26e0b0a231142d571ad043531e0a88b6506cd7e1..2bb4556b4077a33ff3e970a622e259d90eb4ca64 100644 --- a/research/recommend/ncf/README.md +++ b/research/recommend/ncf/README.md @@ -356,9 +356,9 @@ Inference result is saved in current path, you can find result like this in acc. ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/experts/en/master/infer/inference.html). 
Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/ms_infer/overview.html). Following the steps below, this is a simple example: - + ```python # Load unseen dataset for inference