From 9d7abec16f5df6179836b1c6de3075eb411f705e Mon Sep 17 00:00:00 2001 From: huan <3174348550@qq.com> Date: Thu, 17 Apr 2025 15:16:02 +0800 Subject: [PATCH] modify error links --- README.md | 2 +- README_CN.md | 4 +-- official/README.md | 23 ---------------- official/README_CN.md | 27 ++----------------- official/cv/CRNN/README.md | 4 +-- official/cv/CRNN/README_CN.md | 4 +-- official/cv/CTPN/README.md | 2 +- official/cv/CTPN/README_CN.md | 2 +- official/cv/DeepText/README.md | 2 +- official/cv/DeepText/README_CN.md | 2 +- official/cv/Inception/inceptionv4/README.md | 2 +- .../cv/Inception/inceptionv4/README_CN.md | 2 +- official/cv/Inception/xception/README.md | 2 +- official/cv/Inception/xception/README_CN.md | 2 +- .../MaskRCNN/maskrcnn_mobilenetv1/README.md | 2 +- .../maskrcnn_mobilenetv1/README_CN.md | 2 +- .../cv/MaskRCNN/maskrcnn_resnet50/README.md | 2 +- .../MaskRCNN/maskrcnn_resnet50/README_CN.md | 2 +- official/cv/ResNet/README.md | 4 +-- official/cv/ResNet/README_CN.md | 2 +- official/cv/RetinaNet/README.md | 2 +- official/cv/RetinaNet/README_CN.md | 2 +- official/cv/SSD/README.md | 2 +- official/cv/SSD/README_CN.md | 2 +- official/cv/Unet/README.md | 2 +- official/cv/Unet/README_CN.md | 2 +- official/cv/VGG/vgg16/README.md | 2 +- official/cv/VGG/vgg16/README_CN.md | 2 +- official/cv/VGG/vgg19/README.md | 2 +- official/cv/VGG/vgg19/README_CN.md | 2 +- official/cv/VIT/README.md | 2 +- official/cv/VIT/README_CN.md | 2 +- official/nlp/Pangu_alpha/README.md | 8 +++--- official/nlp/Pangu_alpha/README_CN.md | 8 +++--- official/nlp/Transformer/README.md | 2 +- official/nlp/Transformer/README_CN.md | 2 +- research/audio/speech_transformer/README.md | 2 +- research/cv/3D_DenseNet/README.md | 2 +- research/cv/3D_DenseNet/README_CN.md | 2 +- research/cv/AlignedReID++/README_CN.md | 2 +- research/cv/C3D/README.md | 2 +- research/cv/C3D/README_CN.md | 2 +- research/cv/EGnet/README_CN.md | 2 +- research/cv/LightCNN/README.md | 2 +- research/cv/LightCNN/README_CN.md | 
2 +- research/cv/Unet3d/README.md | 2 +- research/cv/Unet3d/README_CN.md | 2 +- research/cv/cnnctc/README.md | 2 +- research/cv/cnnctc/README_CN.md | 4 +-- research/cv/crnn_seq2seq_ocr/README.md | 2 +- research/cv/cspdarknet53/README.md | 2 +- research/cv/dcgan/README.md | 2 +- research/cv/dlinknet/README.md | 2 +- research/cv/dlinknet/README_CN.md | 2 +- research/cv/east/README.md | 2 +- research/cv/essay-recogination/README_CN.md | 2 +- research/cv/googlenet/README.md | 2 +- research/cv/googlenet/README_CN.md | 2 +- research/cv/hardnet/README_CN.md | 4 +-- research/cv/inception_resnet_v2/README.md | 2 +- research/cv/inception_resnet_v2/README_CN.md | 2 +- research/cv/nas-fpn/README_CN.md | 2 +- research/cv/ntsnet/README.md | 2 +- research/cv/osnet/README.md | 2 +- research/cv/predrnn++/README.md | 2 +- research/cv/retinanet_resnet101/README.md | 2 +- research/cv/retinanet_resnet101/README_CN.md | 2 +- research/cv/retinanet_resnet152/README.md | 2 +- research/cv/retinanet_resnet152/README_CN.md | 2 +- research/cv/sphereface/README.md | 2 +- research/cv/sphereface/README_CN.md | 2 +- research/cv/squeezenet/README.md | 2 +- research/cv/squeezenet1_1/README.md | 2 +- research/cv/ssd_ghostnet/README.md | 2 +- research/cv/ssd_inception_v2/README.md | 2 +- research/cv/ssd_inceptionv2/README_CN.md | 2 +- research/cv/ssd_mobilenetV2/README.md | 2 +- research/cv/ssd_mobilenetV2_FPNlite/README.md | 2 +- research/cv/ssd_resnet34/README.md | 2 +- research/cv/ssd_resnet34/README_CN.md | 2 +- research/cv/ssd_resnet50/README.md | 2 +- research/cv/ssd_resnet50/README_CN.md | 2 +- research/cv/ssd_resnet_34/README.md | 2 +- research/cv/textfusenet/README.md | 2 +- research/cv/textfusenet/README_CN.md | 2 +- research/cv/tinydarknet/README_CN.md | 2 +- research/cv/vnet/README_CN.md | 2 +- research/cv/warpctc/README.md | 2 +- research/cv/warpctc/README_CN.md | 2 +- research/cv/wideresnet/README.md | 2 +- research/cv/wideresnet/README_CN.md | 2 +- research/cv/yolov3_resnet18/README.md 
| 2 +- research/cv/yolov3_resnet18/README_CN.md | 2 +- research/nlp/cpm/README.md | 2 +- research/nlp/cpm/README_CN.md | 2 +- research/nlp/mass/README.md | 4 +-- research/nlp/mass/README_CN.md | 4 +-- research/nlp/rotate/README_CN.md | 2 +- research/recommend/ncf/README.md | 4 +-- 99 files changed, 114 insertions(+), 160 deletions(-) diff --git a/README.md b/README.md index c36c978a1..96fa51641 100644 --- a/README.md +++ b/README.md @@ -50,7 +50,7 @@ For more information about `MindSpore` framework, please refer to [FAQ](https:// - **Q: What is Some *RANK_TBAL_FILE* which mentioned in many models?** - **A**: *RANK_TABLE_FILE* is the config file of cluster on Ascend while running distributed training. For more information, you could refer to the generator [hccl_tools](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) and [Parallel Distributed Training Example](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) + **A**: *RANK_TABLE_FILE* is the config file of cluster on Ascend while running distributed training. 
For more information, you could refer to the generator [hccl_tools](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) and [Parallel Distributed Training Example](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) - **Q: How to run the scripts on Windows system?** diff --git a/README_CN.md b/README_CN.md index a655cd380..89bbff911 100644 --- a/README_CN.md +++ b/README_CN.md @@ -50,11 +50,11 @@ MindSpore已获得Apache 2.0许可,请参见LICENSE文件。 - **Q: 一些模型描述中提到的*RANK_TABLE_FILE*文件,是什么?** - **A**: *RANK_TABLE_FILE*是一个Ascend环境上用于指定分布式集群信息的文件,更多信息可以参考生成工具[hccl_toos](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)和[分布式并行训练教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html) + **A**: *RANK_TABLE_FILE*是一个Ascend环境上用于指定分布式集群信息的文件,更多信息可以参考生成工具[hccl_toos](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)和[分布式并行训练教程](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html) - **Q: 如何使用多机多卡运行脚本** - **A**: 本仓内所提供的分布式(distribute)运行启动默认为单机多卡,如需多机多卡启动需要在单机多卡的基础上进行一定程度的适配,可参考[多机多卡分布式教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html#%E5%A4%9A%E6%9C%BA%E5%A4%9A%E5%8D%A1) + **A**: 本仓内所提供的分布式(distribute)运行启动默认为单机多卡,如需多机多卡启动需要在单机多卡的基础上进行一定程度的适配,可参考[多机多卡分布式教程](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html#%E5%A4%9A%E6%9C%BA%E5%A4%9A%E5%8D%A1) - **Q: 在windows环境上要怎么运行网络脚本?** diff --git a/official/README.md b/official/README.md index cfd16e89c..ef2841e59 100644 --- a/official/README.md +++ b/official/README.md @@ -242,29 +242,6 @@ | retinaface_mobilenet_0.25 | WiderFace | 90.77/88.2/74.76 | [config](https://github.com/mindspore-lab/mindface/tree/main/mindface/detection) | [link](https://gitee.com/mindspore/models/tree/master/research/cv/retinaface) | | retinaface_r50 | WiderFace | 95.07/93.61/84.84 | [config](https://github.com/mindspore-lab/mindface/tree/main/mindface/detection) | 
[link](https://gitee.com/mindspore/models/tree/master/official/cv/RetinaFace_ResNet50) | -### NLP - -### nlp - -| model | mindformer recipe | vanilla mindspore -| :-: | :-: | :-: | -| bert_base | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/t5.md) | [link](https://gitee.com/mindspore/models/tree/master/official/nlp/Bert) | -| t5_small | [config](https://github.com/mindspore-lab/mindformers/blob/master/docs/model_cards/bert.md) | | -| gpt2_small | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/gpt2.md) | | -| gpt2_13b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/gpt2.md) | | -| gpt2_52b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/gpt2.md) | | -| pangu_alpha | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/pangualpha.md) | | -| glm_6b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/glm.md) | | -| glm_6b_lora | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/glm.md) | | -| llama_7b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| llama_13b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| llama_65b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| llama_7b_lora | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| bloom_560m | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | -| bloom_7.1b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | -| bloom_65b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | -| bloom_176b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | - ### Recommendation | model | mind_series recipe | vanilla mindspore | 
diff --git a/official/README_CN.md b/official/README_CN.md index 316965958..17037ae0d 100644 --- a/official/README_CN.md +++ b/official/README_CN.md @@ -10,7 +10,7 @@ ### 计算机视觉 -#### 图像分类(骨干类) +#### 图像分类(骨干类) | model | acc@1 | mindcv recipe | vanilla mindspore | | :-: | :-: | :-: | :-: | @@ -242,30 +242,7 @@ | retinaface_mobilenet_0.25 | WiderFace | 90.77/88.2/74.76 | [config](https://github.com/mindspore-lab/mindface/tree/main/mindface/detection) | [link](https://gitee.com/mindspore/models/tree/master/research/cv/retinaface) | | retinaface_r50 | WiderFace | 95.07/93.61/84.84 | [config](https://github.com/mindspore-lab/mindface/tree/main/mindface/detection) | [link](https://gitee.com/mindspore/models/tree/master/official/cv/RetinaFace_ResNet50) | -### 自然语言处理 - -### nlp - -| model | mindformer recipe | vanilla mindspore -| :-: | :-: | :-: | -| bert_base | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/t5.md) | [link](https://gitee.com/mindspore/models/tree/master/official/nlp/Bert) | -| t5_small | [config](https://github.com/mindspore-lab/mindformers/blob/master/docs/model_cards/bert.md) | | -| gpt2_small | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/gpt2.md) | | -| gpt2_13b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/gpt2.md) | | -| gpt2_52b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/gpt2.md) | | -| pangu_alpha | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/pangualpha.md) | | -| glm_6b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/glm.md) | | -| glm_6b_lora | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/glm.md) | | -| llama_7b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| llama_13b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| llama_65b | 
[config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| llama_7b_lora | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/llama.md) | | -| bloom_560m | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | -| bloom_7.1b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | -| bloom_65b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | -| bloom_176b | [config](https://gitee.com/mindspore/mindformers/blob/dev/docs/model_cards/bloom.md) | | - -MindSpore仅提供下载和预处理公共数据集的脚本。我们不拥有这些数据集,也不对它们的质量负责或维护。请确保您具有在数据集许可下使用该数据集的权限。在这些数据集上训练的模型仅用于非商业研究和教学目的。 +### 推荐 | model | mind_series recipe | vanilla mindspore | | :-: | :-: | :-: | diff --git a/official/cv/CRNN/README.md b/official/cv/CRNN/README.md index aebf05ab2..70491e094 100644 --- a/official/cv/CRNN/README.md +++ b/official/cv/CRNN/README.md @@ -51,7 +51,7 @@ We provide 2 versions of network using different ways to transfer the hidden siz Note that you can run the scripts based on the dataset mentioned in original paper or widely used in relevant domain/network architecture. In the following sections, we will introduce how to run the scripts using the related dataset below. 
-We use five datasets mentioned in the paper.For training, we use the synthetic dataset([MJSynth](https://www.robots.ox.ac.uk/~vgg/data/text/) and [SynthText](https://github.com/ankush-me/SynthText)) released by Jaderberg etal as the training data, which contains 8 millions training images and their corresponding ground truth words.For evaluation, we use four popular benchmarks for scene text recognition, nalely ICDAR 2003([IC03](http://www.iapr-tc11.org/mediawiki/index.php?title=ICDAR_2003_Robust_Reading_Competitions)),ICDAR2013([IC13](https://rrc.cvc.uab.es/?ch=2&com=downloads)),IIIT 5k-word([IIIT5k](https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset)),and Street View Text([SVT](http://vision.ucsd.edu/~kai/grocr/)). +We use five datasets mentioned in the paper.For training, we use the synthetic dataset([MJSynth](https://www.robots.ox.ac.uk/~vgg/data/text/) and [SynthText](https://github.com/ankush-me/SynthText)) released by Jaderberg etal as the training data, which contains 8 millions training images and their corresponding ground truth words.For evaluation, we use four popular benchmarks for scene text recognition, nalely ICDAR 2003([IC03](http://www.iapr-tc11.org/mediawiki/index.php?title=ICDAR_2003_Robust_Reading_Competitions)),ICDAR2013([IC13](https://rrc.cvc.uab.es/?ch=2&com=downloads)),IIIT 5k-word([IIIT5k](https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset)). ### [Dataset Prepare](#content) @@ -237,7 +237,7 @@ Parameters for both training and evaluation can be set in default_config.yaml. ## [Training Process](#contents) -- Set options in `config.py`, including learning rate and other network hyperparameters. Click [MindSpore dataset preparation tutorial](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `config.py`, including learning rate and other network hyperparameters. 
### [Training](#contents) diff --git a/official/cv/CRNN/README_CN.md b/official/cv/CRNN/README_CN.md index c4a9e7966..16fb2ec4e 100644 --- a/official/cv/CRNN/README_CN.md +++ b/official/cv/CRNN/README_CN.md @@ -51,7 +51,7 @@ CRNN使用vgg16结构进行特征提取,附加两层双向LSTM,最后使用C 注:可以运行原始论文中提到的数据集脚本,也可以运行在相关域/网络架构中广泛使用的脚本。下面将介绍如何使用相关数据集运行脚本。 -我们使用论文中提到的五个数据集。在训练中,使用Jederberg等人发布的合成数据集([MJSynth](https://www.robots.ox.ac.uk/~vgg/data/text/)和[SynthText](https://github.com/ankush-me/SynthText))作为训练数据,其中包含800万张训练图像及其对应的地面真值词。在评估中,使用四个流行的场景文本识别基准,即ICDAR 2003([IC03](http://www.iapr-tc11.org/mediawiki/index.php?title=ICDAR_2003_Robust_Reading_Competitions))、ICDAR2013([IC13](https://rrc.cvc.uab.es/?ch=2&com=downloads))、IIIT 5k-word([IIIT5k](https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset))和街景文本([SVT](http://vision.ucsd.edu/~kai/grocr/))。 +我们使用论文中提到的五个数据集。在训练中,使用Jederberg等人发布的合成数据集([MJSynth](https://www.robots.ox.ac.uk/~vgg/data/text/)和[SynthText](https://github.com/ankush-me/SynthText))作为训练数据,其中包含800万张训练图像及其对应的地面真值词。在评估中,使用四个流行的场景文本识别基准,即ICDAR 2003([IC03](http://www.iapr-tc11.org/mediawiki/index.php?title=ICDAR_2003_Robust_Reading_Competitions))、ICDAR2013([IC13](https://rrc.cvc.uab.es/?ch=2&com=downloads))、IIIT 5k-word([IIIT5k](https://cvit.iiit.ac.in/research/projects/cvit-projects/the-iiit-5k-word-dataset))。 ### [数据集准备](#目录) @@ -237,7 +237,7 @@ crnn ## [训练过程](#目录) -- 设置`config.py`中的选项,包括学习率和其他网络超参。有关数据集的更多信息,请参阅[MindSpore数据集准备教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/index.html)。 +- 设置`config.py`中的选项,包括学习率和其他网络超参。 ### [训练](#目录) diff --git a/official/cv/CTPN/README.md b/official/cv/CTPN/README.md index e05fc84d6..af97b3667 100644 --- a/official/cv/CTPN/README.md +++ b/official/cv/CTPN/README.md @@ -246,7 +246,7 @@ imagenet_cfg = edict({ Then you can train it with ImageNet2012. 
> Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` > diff --git a/official/cv/CTPN/README_CN.md b/official/cv/CTPN/README_CN.md index bb04f79b0..3770e39e6 100644 --- a/official/cv/CTPN/README_CN.md +++ b/official/cv/CTPN/README_CN.md @@ -234,7 +234,7 @@ imagenet_cfg = edict({ 然后,您可以使用ImageNet2012训练它。 > 注: -> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 +> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 > > 处理器绑核操作取决于`device_num`和总处理器数。如果不希望这样做,请删除`scripts/run_distribute_train.sh`中的`taskset`操作。 > diff --git a/official/cv/DeepText/README.md b/official/cv/DeepText/README.md index 42fffda7a..d69a317cf 100644 --- a/official/cv/DeepText/README.md +++ b/official/cv/DeepText/README.md @@ -143,7 +143,7 @@ Here we used 4 datasets for training, and 1 datasets for Evaluation. ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. 
+> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` > diff --git a/official/cv/DeepText/README_CN.md b/official/cv/DeepText/README_CN.md index 604a3517a..0840a9ee4 100644 --- a/official/cv/DeepText/README_CN.md +++ b/official/cv/DeepText/README_CN.md @@ -133,7 +133,7 @@ InceptionV4的整体网络架构如下: ``` > 注: -> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 +> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 > > 处理器绑核操作取决于`device_num`和总处理器数。如果不希望这样做,请删除`scripts/run_distribute_train.sh`中的`taskset`操作。 > diff --git a/official/cv/Inception/inceptionv4/README.md b/official/cv/Inception/inceptionv4/README.md index df52f98bf..3378a3eb7 100644 --- a/official/cv/Inception/inceptionv4/README.md +++ b/official/cv/Inception/inceptionv4/README.md @@ -279,7 +279,7 @@ You can start training using python or shell scripts. 
The usage of shell scripts ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` diff --git a/official/cv/Inception/inceptionv4/README_CN.md b/official/cv/Inception/inceptionv4/README_CN.md index 9502b7626..3c481fbbb 100644 --- a/official/cv/Inception/inceptionv4/README_CN.md +++ b/official/cv/Inception/inceptionv4/README_CN.md @@ -267,7 +267,7 @@ train.py和config.py中的主要涉及如下参数: ``` > 注: -> 有关RANK_TABLE_FILE,可参考[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。设备IP可参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大型模型,最好设置外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,可能会连接超时,因为编译时间会随着模型增大而增加。 +> 有关RANK_TABLE_FILE,可参考[链接](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html)。设备IP可参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大型模型,最好设置外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,可能会连接超时,因为编译时间会随着模型增大而增加。 > > 绑核操作取决于`device_num`参数值及处理器总数。如果不需要,删除`scripts/run_distribute_train.sh`脚本中的`taskset`操作任务集即可。 diff --git a/official/cv/Inception/xception/README.md b/official/cv/Inception/xception/README.md index a0181e25a..8ea9743fe 100644 --- a/official/cv/Inception/xception/README.md +++ b/official/cv/Inception/xception/README.md @@ -189,7 +189,7 @@ You can start training using python or shell scripts. The usage of shell scripts bash run_infer_310.sh MINDIR_PATH DATA_PATH LABEL_FILE DEVICE_ID ``` -> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). +> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
### Launch diff --git a/official/cv/Inception/xception/README_CN.md b/official/cv/Inception/xception/README_CN.md index f36fc15c1..005e86900 100644 --- a/official/cv/Inception/xception/README_CN.md +++ b/official/cv/Inception/xception/README_CN.md @@ -189,7 +189,7 @@ Xception的整体网络架构如下: bash run_infer_310.sh MINDIR_PATH DATA_PATH LABEL_FILE DEVICE_ID ``` -> 注:RANK_TABLE_FILE可以参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html),device_ip可以参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 +> 注:RANK_TABLE_FILE可以参考[链接](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html),device_ip可以参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 ### 启动 diff --git a/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README.md b/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README.md index 1659ce27f..0943af1a4 100644 --- a/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README.md +++ b/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README.md @@ -522,7 +522,7 @@ Usage: bash run_distribute_train_gpu.sh [DATA_PATH] [PRETRAINED_PATH] (optional) ## [Training Process](#contents) -- Set options in `default_config.yaml`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `default_config.yaml`, including loss_scale, learning rate and network hyperparameters. 
### [Training](#content) diff --git a/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README_CN.md b/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README_CN.md index 6e6869218..3b8941536 100644 --- a/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README_CN.md +++ b/official/cv/MaskRCNN/maskrcnn_mobilenetv1/README_CN.md @@ -521,7 +521,7 @@ test_batch_size": 2, # ## [训练过程](#目录) -- 在`default_config.yaml`中设置选项,包括损失缩放、学习率和网络超参。有关数据集的更多信息,请单击[此处](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset.。 +- 在`default_config.yaml`中设置选项,包括损失缩放、学习率和网络超参。 ### [训练](#目录) diff --git a/official/cv/MaskRCNN/maskrcnn_resnet50/README.md b/official/cv/MaskRCNN/maskrcnn_resnet50/README.md index e7f5bcc9c..a599ce73c 100644 --- a/official/cv/MaskRCNN/maskrcnn_resnet50/README.md +++ b/official/cv/MaskRCNN/maskrcnn_resnet50/README.md @@ -543,7 +543,7 @@ Usage: bash run_standalone_train.sh [PRETRAINED_MODEL] [DATA_PATH] ## [Training Process](#contents) -- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. ### [Training](#content) diff --git a/official/cv/MaskRCNN/maskrcnn_resnet50/README_CN.md b/official/cv/MaskRCNN/maskrcnn_resnet50/README_CN.md index 2caaa62d5..447307e01 100644 --- a/official/cv/MaskRCNN/maskrcnn_resnet50/README_CN.md +++ b/official/cv/MaskRCNN/maskrcnn_resnet50/README_CN.md @@ -525,7 +525,7 @@ bash run_eval.sh [VALIDATION_JSON_FILE] [CHECKPOINT_PATH] [DATA_PATH] ## 训练过程 -- 在`config.py`中设置配置项,包括loss_scale、学习率和网络超参。单击[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/index.html)获取更多数据集相关信息. 
+- 在`config.py`中设置配置项,包括loss_scale、学习率和网络超参。 ### 训练 diff --git a/official/cv/ResNet/README.md b/official/cv/ResNet/README.md index e207f981c..0a08a869f 100644 --- a/official/cv/ResNet/README.md +++ b/official/cv/ResNet/README.md @@ -480,7 +480,7 @@ bash run_eval_gpu_resnet_benchmark.sh [DATASET_PATH] [CKPT_PATH] [BATCH_SIZE](op For distributed training, a hostfile configuration needs to be created in advance. -Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html). +Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/tutorials/en/master/parallel/mpirun.html). #### Running parameter server mode training @@ -1484,7 +1484,7 @@ Refer to the [ModelZoo FAQ](https://gitee.com/mindspore/models#FAQ) for some com **A**: Suggested reference:https://bbs.huaweicloud.com/forum/thread-134093-1-1.html -- **Q: How to solve the memory shortage caused by accumulation operators such as ReduceMean and BiasAddGrad on 910B?** +- **Q: How to solve the memory shortage caused by accumulation operators such as ReduceMean and BiasAddGrad on Atlas A2 training series?** **A**: Suggested adding `mindspore.set_context(ascend_config={"atomic_clean_policy": 0})` in `train.py`. If the problem still hasn't been resolved, please go to the [MindSpore community](https://gitee.com/mindspore/mindspore/issues) to submit an issue. 
diff --git a/official/cv/ResNet/README_CN.md b/official/cv/ResNet/README_CN.md index 5aafb7f53..b030f3c67 100644 --- a/official/cv/ResNet/README_CN.md +++ b/official/cv/ResNet/README_CN.md @@ -1425,7 +1425,7 @@ result:{'top_1_accuracy': 0.928385416666666} prune_rate=0.45 ckpt=~/resnet50_cif **A**: 建议参考https://bbs.huaweicloud.com/forum/thread-134093-1-1.html -- **Q: 如何解决910B硬件上因ReduceMean、BiasAddGrad等累加算子导致的内存不足?** +- **Q: 如何解决Atlas A2训练系列产品上因ReduceMean、BiasAddGrad等累加算子导致的内存不足?** **A**: 建议在`train.py`中添加`mindspore.set_context(ascend_config={"atomic_clean_policy": 0})`,如果还是没有解决问题,请到[MindSpore社区](https://gitee.com/mindspore/mindspore/issues)提issue。 diff --git a/official/cv/RetinaNet/README.md b/official/cv/RetinaNet/README.md index 51b9f5503..b3a064e41 100644 --- a/official/cv/RetinaNet/README.md +++ b/official/cv/RetinaNet/README.md @@ -208,7 +208,7 @@ bash scripts/run_single_train.sh DEVICE_ID MINDRECORD_DIR CONFIG_PATH PRE_TRAINE > Note: - For details about RANK_TABLE_FILE, see [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). For details about how to obtain device IP address, see [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). + For details about RANK_TABLE_FILE, see [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html). For details about how to obtain device IP address, see [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
#### Running diff --git a/official/cv/RetinaNet/README_CN.md b/official/cv/RetinaNet/README_CN.md index ee093f339..d69a835cd 100644 --- a/official/cv/RetinaNet/README_CN.md +++ b/official/cv/RetinaNet/README_CN.md @@ -203,7 +203,7 @@ bash scripts/run_single_train.sh DEVICE_ID MINDRECORD_DIR CONFIG_PATH PRE_TRAINE > 注意: - RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 + RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。 #### 运行 diff --git a/official/cv/SSD/README.md b/official/cv/SSD/README.md index 781d7852a..e41ad5948 100644 --- a/official/cv/SSD/README.md +++ b/official/cv/SSD/README.md @@ -324,7 +324,7 @@ Then you can run everything just like on ascend. ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). 
**Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on Ascend diff --git a/official/cv/SSD/README_CN.md b/official/cv/SSD/README_CN.md index 8b97f4e9b..169ca7e8a 100644 --- a/official/cv/SSD/README_CN.md +++ b/official/cv/SSD/README_CN.md @@ -275,7 +275,7 @@ bash run_eval_gpu.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] [CONFIG_PATH] ## 训练过程 -运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** +运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** ### Ascend上训练 diff --git a/official/cv/Unet/README.md b/official/cv/Unet/README.md index f590c050b..f5fc7a3ca 100644 --- a/official/cv/Unet/README.md +++ b/official/cv/Unet/README.md @@ -617,7 +617,7 @@ Result on ONNX **Before inference, please refer to [MindSpore Inference with C++ Deployment Guide](https://gitee.com/mindspore/models/blob/master/utils/cpp_infer/README.md) to set environment variables.** If you need to use the trained model to perform inference on multiple hardware platforms, such as Ascend 910 or Ascend 310, you -can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following +can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). 
Following the steps below, this is a simple example: ### Continue Training on the Pretrained Model diff --git a/official/cv/Unet/README_CN.md b/official/cv/Unet/README_CN.md index 2c664737b..21b776db8 100644 --- a/official/cv/Unet/README_CN.md +++ b/official/cv/Unet/README_CN.md @@ -607,7 +607,7 @@ bash ./scripts/run_eval_onnx.sh [DATASET_PATH] [ONNX_MODEL] [DEVICE_TARGET] [CON **推理前需参照 [MindSpore C++推理部署指南](https://gitee.com/mindspore/models/blob/master/utils/cpp_infer/README_CN.md) 进行环境变量设置。** -如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是一个简单的操作步骤示例: +如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是一个简单的操作步骤示例: ### 继续训练预训练模型 diff --git a/official/cv/VGG/vgg16/README.md b/official/cv/VGG/vgg16/README.md index 931d40a70..063fe2f39 100644 --- a/official/cv/VGG/vgg16/README.md +++ b/official/cv/VGG/vgg16/README.md @@ -530,7 +530,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... ``` -> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html). +> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorials/en/master/parallel/overview.html). > **Attention** This will bind the processor cores according to the `device_num` and total processor numbers. If you don't expect to run pretraining with binding processor cores, remove the operations about `taskset` in `scripts/run_distribute_train.sh` ##### Run vgg16 on GPU diff --git a/official/cv/VGG/vgg16/README_CN.md b/official/cv/VGG/vgg16/README_CN.md index 12d1fae1c..132b9f13d 100644 --- a/official/cv/VGG/vgg16/README_CN.md +++ b/official/cv/VGG/vgg16/README_CN.md @@ -530,7 +530,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... 
``` -> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/overview.html)。 +> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/overview.html)。 > **注意** 将根据`device_num`和处理器总数绑定处理器核。如果您不希望预训练中绑定处理器内核,请在`scripts/run_distribute_train.sh`脚本中移除`taskset`相关操作。 ##### GPU处理器环境运行VGG16 diff --git a/official/cv/VGG/vgg19/README.md b/official/cv/VGG/vgg19/README.md index bd2a9724e..f3bc2c8e7 100644 --- a/official/cv/VGG/vgg19/README.md +++ b/official/cv/VGG/vgg19/README.md @@ -453,7 +453,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... ``` -> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html). +> About rank_table.json, you can refer to the [distributed training tutorial](https://www.mindspore.cn/tutorials/en/master/parallel/overview.html). > **Attention** This will bind the processor cores according to the `device_num` and total processor numbers. If you don't expect to run pretraining with binding processor cores, remove the operations about `taskset` in `scripts/run_distribute_train.sh` ##### Run vgg19 on GPU diff --git a/official/cv/VGG/vgg19/README_CN.md b/official/cv/VGG/vgg19/README_CN.md index 7d9b1710b..d54dfe934 100644 --- a/official/cv/VGG/vgg19/README_CN.md +++ b/official/cv/VGG/vgg19/README_CN.md @@ -466,7 +466,7 @@ train_parallel1/log:epcoh: 2 step: 97, loss is 1.7133579 ... 
``` -> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/overview.html)。 +> 关于rank_table.json,可以参考[分布式并行训练](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/overview.html)。 > **注意** 将根据`device_num`和处理器总数绑定处理器核。如果您不希望预训练中绑定处理器内核,请在`scripts/run_distribute_train.sh`脚本中移除`taskset`相关操作。 ##### GPU处理器环境运行VGG19 diff --git a/official/cv/VIT/README.md b/official/cv/VIT/README.md index 3fe19000a..e6355d369 100644 --- a/official/cv/VIT/README.md +++ b/official/cv/VIT/README.md @@ -449,7 +449,7 @@ in acc.log. ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). Following the steps below, this is a simple example: - Running on Ascend diff --git a/official/cv/VIT/README_CN.md b/official/cv/VIT/README_CN.md index 3e33d3eed..28da32b14 100644 --- a/official/cv/VIT/README_CN.md +++ b/official/cv/VIT/README_CN.md @@ -451,7 +451,7 @@ python export.py --config_path=[CONFIG_PATH] ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是操作步骤示例: - Ascend处理器环境运行 diff --git a/official/nlp/Pangu_alpha/README.md b/official/nlp/Pangu_alpha/README.md index 118a38c42..e2156c592 100644 --- a/official/nlp/Pangu_alpha/README.md +++ b/official/nlp/Pangu_alpha/README.md @@ -51,7 +51,7 @@ with our parallel setting. 
We summarized the training tricks as following: 2. Pipeline Model Parallelism 3. Optimizer Model Parallelism -The above features can be found [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html). +The above features can be found [here](https://www.mindspore.cn/tutorials/en/master/parallel/overview.html). More amazing features are still under developing. The technical report and checkpoint file can be found [here](https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-AIpha). @@ -157,7 +157,7 @@ bash scripts/run_distribute_train.sh /data/pangu_30_step_ba64/ /root/hccl_8p.jso The above command involves some `args` described below: - DATASET: The path to the mindrecord files's parent directory . For example: `/home/work/mindrecord/`. -- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). It's a json file describes the `device id`, `service ip` and `rank`. +- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html). It's a json file describes the `device id`, `service ip` and `rank`. - RANK_SIZE: The device number. This can be your total device numbers. For example, 8, 16, 32 ... - TYPE: The param init type. The parameters will be initialized with float32. Or you can replace it with `fp16`. This will save a little memory used on the device. - MODE: The configure mode. This mode will set the `hidden size` and `layers` to make the parameter number near 2.6 billions. The other mode can be `13B` (`hidden size` 5120 and `layers` 40, which needs at least 16 cards to train.) and `200B`. @@ -206,7 +206,7 @@ bash scripts/run_distribute_train_gpu.sh RANK_SIZE HOSTFILE DATASET PER_BATCH MO ``` - RANK_SIZE: The device number. This can be your total device numbers. For example, 8, 16, 32 ... -- HOSTFILE: It's a text file describes the host ip and its devices. 
Please see our [tutorial](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/) for more details. +- HOSTFILE: It's a text file describes the host ip and its devices. Please see our [tutorial](https://www.mindspore.cn/tutorials/en/master/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/) for more details. - DATASET: The path to the mindrecord files's parent directory . For example: `/home/work/mindrecord/`. - PER_BATCH: The batch size for each data parallel-way. - MODE: Can be `1.3B` `2.6B`, `13B` and `200B`. @@ -228,7 +228,7 @@ bash scripts/run_distribute_train_moe_host_device.sh DATASET RANK_TABLE RANK_SIZ The above command involves some `args` described below: - DATASET: The path to the mindrecord files's parent directory . For example: `/home/work/mindrecord/`. -- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). It's a json file describes the `device id`, `service ip` and `rank`. +- RANK_TABLE: The details of the rank table can be found [here](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html). It's a json file describes the `device id`, `service ip` and `rank`. - RANK_SIZE: The device number. This can be your total device numbers. For example, 8, 16, 32 ... - TYPE: The param init type. The parameters will be initialized with float32. Or you can replace it with `fp16`. This will save a little memory used on the device. - MODE: The configure mode. This mode will set the `hidden size` and `layers` to make the parameter number near 2.6 billions. The other mode can be `13B` (`hidden size` 5120 and `layers` 40, which needs at least 16 cards to train.) and `200B`. 
diff --git a/official/nlp/Pangu_alpha/README_CN.md b/official/nlp/Pangu_alpha/README_CN.md index 9e272de85..9307a1c00 100644 --- a/official/nlp/Pangu_alpha/README_CN.md +++ b/official/nlp/Pangu_alpha/README_CN.md @@ -51,7 +51,7 @@ 2. 流水线模型并行 3. 优化器模型并行 -有关上述特性,请点击[此处](https://www.mindspore.cn/docs/en/master/model_train/parallel/overview.html)查看详情。 +有关上述特性,请点击[此处](https://www.mindspore.cn/tutorials/en/master/parallel/overview.html)查看详情。 更多特性敬请期待。 详细技术报告和检查点文件,可点击[此处](https://git.openi.org.cn/PCL-Platform.Intelligence/PanGu-AIpha)查看。 @@ -156,7 +156,7 @@ bash scripts/run_distribute_train.sh /data/pangu_30_step_ba64/ /root/hccl_8p.jso 上述命令涉及以下`args`: - DATASET:mindrecord文件父目录的路径。例如:`/home/work/mindrecord/`。 -- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)查看。该.json文件描述了`device id`、`service ip`和`rank`。 +- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html)查看。该.json文件描述了`device id`、`service ip`和`rank`。 - RANK_SIZE:设备编号,也可以表示设备总数。例如,8、16、32 ... - TYPE:参数初始化类型。参数使用单精度(FP32) 或半精度(FP16)初始化。可以节省设备占用内存。 - MODE:配置模式。通过设置`hidden size`和`layers`,将参数量增至26亿。还可以选择13B(`hidden size`为5120和`layers`为40,训练至少需要16卡)和200B模式。 @@ -205,7 +205,7 @@ bash scripts/run_distribute_train_gpu.sh RANK_SIZE HOSTFILE DATASET PER_BATCH MO ``` - RANK_SIZE:设备编号,也可以表示设备总数。例如,8、16、32 ... 
-- HOSTFILE:描述主机IP及其设备的文本文件。有关更多详细信息,请参见我们的[教程](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/)。 +- HOSTFILE:描述主机IP及其设备的文本文件。有关更多详细信息,请参见我们的[教程](https://www.mindspore.cn/tutorials/en/master/parallel/mpirun.html) or [OpenMPI](https://www.open-mpi.org/)。 - DATASET:mindrecord文件父目录的路径。例如:`/home/work/mindrecord/`。 - PER_BATCH:每个数据并行的批处理大小, - MODE:可以是`1.3B`、`2.6B`、`13B`或`200B`。 @@ -227,7 +227,7 @@ bash scripts/run_distribute_train_moe_host_device.sh DATASET RANK_TABLE RANK_SIZ 上述命令涉及以下args: - DATASET:mindrecord文件父目录的路径。例如:`/home/work/mindrecord/`。 -- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)查看。该.json文件描述了device id、service ip和rank。 +- RANK_TABLE:rank table的详细信息,请点击[此处](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html)查看。该.json文件描述了device id、service ip和rank。 - RANK_SIZE:设备编号,也可以是您的设备总数。例如,8、16、32 ... - TYPE:参数初始化类型。参数使用单精度(FP32) 或半精度(FP16)初始化。可以节省设备占用内存。 - MODE:配置模式。通过设置`hidden size`和`layers`,将参数量增至26亿。还可以选择`13B`(`hidden size`为5120和`layers`为40,训练至少需要16卡)和`200B`模式。 diff --git a/official/nlp/Transformer/README.md b/official/nlp/Transformer/README.md index ec6a66aee..298cf0404 100644 --- a/official/nlp/Transformer/README.md +++ b/official/nlp/Transformer/README.md @@ -354,7 +354,7 @@ Parameters for learning rate: ## [Training Process](#contents) -- Set options in `default_config_large.yaml`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `default_config_large.yaml`, including loss_scale, learning rate and network hyperparameters. - Run `run_standalone_train.sh` for non-distributed training of Transformer model. 
diff --git a/official/nlp/Transformer/README_CN.md b/official/nlp/Transformer/README_CN.md
index dc9386944..150bc66f2 100644
--- a/official/nlp/Transformer/README_CN.md
+++ b/official/nlp/Transformer/README_CN.md
@@ -356,7 +356,7 @@ Parameters for learning rate:
 
 ### 训练过程
 
-- 在`default_config_large.yaml`中设置选项,包括loss_scale、学习率和网络超参数。点击[这里](https://www.mindspore.cn/docs/zh-CN/master/model_train/index.html)查看更多数据集信息。
+- 在`default_config_large.yaml`中设置选项,包括loss_scale、学习率和网络超参数。
 
 - 运行`run_standalone_train.sh`,进行Transformer模型的单卡训练。
 
diff --git a/research/audio/speech_transformer/README.md b/research/audio/speech_transformer/README.md
index bf62c4616..4cb9b6575 100644
--- a/research/audio/speech_transformer/README.md
+++ b/research/audio/speech_transformer/README.md
@@ -187,7 +187,7 @@ Dataset is preprocessed using `Kaldi` and converts kaldi binaries into Python pi
 
 ## [Training Process](#contents)
 
-- Set options in `default_config.yaml`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset.
+- Set options in `default_config.yaml`, including loss_scale, learning rate and network hyperparameters.
 
 - Run `run_standalone_train_gpu.sh` for non-distributed training of Transformer model.
 
diff --git a/research/cv/3D_DenseNet/README.md b/research/cv/3D_DenseNet/README.md index 1b8a78ffe..6ea1ba349 100644 --- a/research/cv/3D_DenseNet/README.md +++ b/research/cv/3D_DenseNet/README.md @@ -222,7 +222,7 @@ Dice Coefficient (DC) for 9th subject (9 subjects for training and 1 subject for |-------------------|:-------------------:|:---------------------:|:-----:|:--------------:| |3D-SkipDenseSeg | 93.66| 90.80 | 90.65 | 91.70 | -Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) For large models like InceptionV4, it's better to export an external environment variable export HCCL_CONNECT_TIMEOUT=600 to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. To avoid ops error,you should change the code like below: +Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) For large models like InceptionV4, it's better to export an external environment variable export HCCL_CONNECT_TIMEOUT=600 to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. 
To avoid ops error,you should change the code like below: in train.py: diff --git a/research/cv/3D_DenseNet/README_CN.md b/research/cv/3D_DenseNet/README_CN.md index 022157600..fe913c378 100644 --- a/research/cv/3D_DenseNet/README_CN.md +++ b/research/cv/3D_DenseNet/README_CN.md @@ -212,7 +212,7 @@ bash run_eval.sh 3D-DenseSeg-20000_36.ckpt data/data_val |-------------------|:-------------------:|:---------------------:|:-----:|:--------------:| |3D-SkipDenseSeg | 93.66| 90.80 | 90.65 | 91.70 | -Notes: 分布式训练需要一个RANK_TABLE_FILE,文件的删除方式可以参考该链接[Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) ,device_ip的设置参考该链接 [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) 对于像InceptionV4这样的大模型来说, 最好导出一个外部环境变量,export HCCL_CONNECT_TIMEOUT=600,以将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增加而增加。在1.3.0版本下,3D算子可能存在一些问题,您可能需要更改context.set_auto_parallel_context的部分代码: +Notes: 分布式训练需要一个RANK_TABLE_FILE,文件的删除方式可以参考该链接[Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) ,device_ip的设置参考该链接 [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) 对于像InceptionV4这样的大模型来说, 最好导出一个外部环境变量,export HCCL_CONNECT_TIMEOUT=600,以将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增加而增加。在1.3.0版本下,3D算子可能存在一些问题,您可能需要更改context.set_auto_parallel_context的部分代码: in train.py: diff --git a/research/cv/AlignedReID++/README_CN.md b/research/cv/AlignedReID++/README_CN.md index 532d291ec..7d59efd3f 100644 --- a/research/cv/AlignedReID++/README_CN.md +++ b/research/cv/AlignedReID++/README_CN.md @@ -405,7 +405,7 @@ market1501上评估AlignedReID++ ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是操作步骤示例: 在进行推理之前我们需要先导出模型,mindir可以在本地环境上导出。batch_size默认为1。 diff --git 
a/research/cv/C3D/README.md b/research/cv/C3D/README.md index 6102a83e4..9a555620e 100644 --- a/research/cv/C3D/README.md +++ b/research/cv/C3D/README.md @@ -465,7 +465,7 @@ The above shell script will run distribute training in the background. You can v #### Distributed training on Ascend > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. 
> ```text diff --git a/research/cv/C3D/README_CN.md b/research/cv/C3D/README_CN.md index 7678cd917..306fe33e5 100644 --- a/research/cv/C3D/README_CN.md +++ b/research/cv/C3D/README_CN.md @@ -456,7 +456,7 @@ bash run_standalone_train_gpu.sh [CONFIG_PATH] [DEVICE_ID] #### Ascend分布式训练 > 注: -> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 +> RANK_TABLE_FILE文件,请参考[链接](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html)。如需获取设备IP,请点击[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于InceptionV4等大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将hccl连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为随着模型增大,编译时间也会增加。 > ```text diff --git a/research/cv/EGnet/README_CN.md b/research/cv/EGnet/README_CN.md index e486c8f6c..a17fbf18c 100644 --- a/research/cv/EGnet/README_CN.md +++ b/research/cv/EGnet/README_CN.md @@ -363,7 +363,7 @@ bash run_standalone_train_gpu.sh bash run_distribute_train.sh 8 [RANK_TABLE_FILE] ``` -线下运行分布式训练请参照[rank table启动](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html) +线下运行分布式训练请参照[rank table启动](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html) - 线上modelarts分布式训练 diff --git a/research/cv/LightCNN/README.md b/research/cv/LightCNN/README.md index c2de4a523..00c9fa674 100644 --- a/research/cv/LightCNN/README.md +++ b/research/cv/LightCNN/README.md @@ -139,7 +139,7 @@ reduce precision" to view the operators with reduced precision. - Generate config json file for 8-card training - [Simple tutorial](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) - For detailed configuration method, please refer to - the [rank table Startup](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html). 
+ the [rank table Startup](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html). # [Quick start](#Quickstart) diff --git a/research/cv/LightCNN/README_CN.md b/research/cv/LightCNN/README_CN.md index d363114b8..6bdbae9a4 100644 --- a/research/cv/LightCNN/README_CN.md +++ b/research/cv/LightCNN/README_CN.md @@ -107,7 +107,7 @@ LightCNN适用于有大量噪声的人脸识别数据集,提出了maxout 的 - [MindSpore Python API](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.html) - 生成config json文件用于8卡训练。 - [简易教程](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) - - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。 + - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html)。 # 快速入门 diff --git a/research/cv/Unet3d/README.md b/research/cv/Unet3d/README.md index 4e823c207..a47ddd087 100644 --- a/research/cv/Unet3d/README.md +++ b/research/cv/Unet3d/README.md @@ -312,7 +312,7 @@ After training, you'll get some checkpoint files under the `train_parallel_fp[32 #### Distributed training on Ascend > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > ```shell diff --git a/research/cv/Unet3d/README_CN.md b/research/cv/Unet3d/README_CN.md index 1c27edd34..28a8e4bf2 100644 --- a/research/cv/Unet3d/README_CN.md +++ b/research/cv/Unet3d/README_CN.md @@ -312,7 +312,7 @@ bash ./run_distribute_train_gpu_fp16.sh /path_prefix/LUNA16/train #### 在Ascend上进行分布式训练 > 注: -> RANK_TABLE_FILE参考[链接](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html),device_ip参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将HCCL连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增长而增加。 +> RANK_TABLE_FILE参考[链接](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html),device_ip参考[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)。对于像InceptionV4这样的大模型,最好导出外部环境变量`export HCCL_CONNECT_TIMEOUT=600`,将HCCL连接检查时间从默认的120秒延长到600秒。否则,连接可能会超时,因为编译时间会随着模型大小的增长而增加。 > ```shell diff --git a/research/cv/cnnctc/README.md b/research/cv/cnnctc/README.md index 789b52833..d14752bf2 100644 --- a/research/cv/cnnctc/README.md +++ b/research/cv/cnnctc/README.md @@ -542,7 +542,7 @@ accuracy: 0.8427 ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). 
Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). Following the steps below, this is a simple example: - Running on Ascend diff --git a/research/cv/cnnctc/README_CN.md b/research/cv/cnnctc/README_CN.md index 1589dda88..0dd9c823d 100644 --- a/research/cv/cnnctc/README_CN.md +++ b/research/cv/cnnctc/README_CN.md @@ -261,7 +261,7 @@ bash scripts/run_distribute_train_ascend.sh [RANK_TABLE_FILE] [PRETRAINED_CKPT(o > 注意: - RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). + RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). ### 训练结果 @@ -485,7 +485,7 @@ accuracy: 0.8427 ### 推理 -如果您需要在GPU、Ascend 910、Ascend 310等多个硬件平台上使用训练好的模型进行推理,请参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。以下为简单示例: +如果您需要在GPU、Ascend 910、Ascend 310等多个硬件平台上使用训练好的模型进行推理,请参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。以下为简单示例: - Ascend处理器环境运行 diff --git a/research/cv/crnn_seq2seq_ocr/README.md b/research/cv/crnn_seq2seq_ocr/README.md index bfa509cf9..c21d9b1f3 100644 --- a/research/cv/crnn_seq2seq_ocr/README.md +++ b/research/cv/crnn_seq2seq_ocr/README.md @@ -229,7 +229,7 @@ Parameters for both training and evaluation can be set in config.py. ## [Training Process](#contents) -- Set options in `default_config.yaml`, including learning rate and other network hyperparameters. 
Click [MindSpore dataset preparation tutorial](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `default_config.yaml`, including learning rate and other network hyperparameters. ### [Training](#contents) diff --git a/research/cv/cspdarknet53/README.md b/research/cv/cspdarknet53/README.md index 5ddf567a9..b071fce85 100644 --- a/research/cv/cspdarknet53/README.md +++ b/research/cv/cspdarknet53/README.md @@ -206,7 +206,7 @@ bash run_distribute_train.sh [RANK_TABLE_FILE] [DATA_DIR] (option)[PATH_CHECKPOI bash run_standalone_train.sh [DEVICE_ID] [DATA_DIR] (option)[PATH_CHECKPOINT] ``` -> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV3, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> Notes: RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html), and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV3, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` diff --git a/research/cv/dcgan/README.md b/research/cv/dcgan/README.md index 88986dcd8..a3d38ff24 100644 --- a/research/cv/dcgan/README.md +++ b/research/cv/dcgan/README.md @@ -156,7 +156,7 @@ dcgan_cifar10_cfg { ## [Training Process](#contents) -- Set options in `config.py`, including learning rate, output filename and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `config.py`, including learning rate, output filename and network hyperparameters. ### [Training](#content) diff --git a/research/cv/dlinknet/README.md b/research/cv/dlinknet/README.md index 47c9969cc..565d149d7 100644 --- a/research/cv/dlinknet/README.md +++ b/research/cv/dlinknet/README.md @@ -328,7 +328,7 @@ bash scripts/run_distribute_gpu_train.sh [DATASET] [CONFIG_PATH] [DEVICE_NUM] [C #### inference If you need to use the trained model to perform inference on multiple hardware platforms, such as Ascend 910 or Ascend 310, you -can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following +can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). 
Following the steps below, this is a simple example: ##### running-on-ascend-310 diff --git a/research/cv/dlinknet/README_CN.md b/research/cv/dlinknet/README_CN.md index 3911a9e0d..1dbd95c49 100644 --- a/research/cv/dlinknet/README_CN.md +++ b/research/cv/dlinknet/README_CN.md @@ -333,7 +333,7 @@ bash scripts/run_distribute_gpu_train.sh [DATASET] [CONFIG_PATH] [DEVICE_NUM] [C #### 推理 -如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是一个简单的操作步骤示例: +如果您需要使用训练好的模型在Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是一个简单的操作步骤示例: ##### Ascend 310环境运行 diff --git a/research/cv/east/README.md b/research/cv/east/README.md index bfe3231db..494977be4 100644 --- a/research/cv/east/README.md +++ b/research/cv/east/README.md @@ -134,7 +134,7 @@ bash run_eval_gpu.sh [DATASET_PATH] [CKPT_PATH] [DEVICE_ID] ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. 
Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` > diff --git a/research/cv/essay-recogination/README_CN.md b/research/cv/essay-recogination/README_CN.md index 707c932ec..62246b266 100644 --- a/research/cv/essay-recogination/README_CN.md +++ b/research/cv/essay-recogination/README_CN.md @@ -111,7 +111,7 @@ train.valInterval = 100 #边训练边推 ## 训练过程 -- 在`parameters/hwdb.gin`中设置选项,包括学习率和网络超参数。单击[MindSpore加载数据集教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/index.html),了解更多信息。 +- 在`parameters/hwdb.gin`中设置选项,包括学习率和网络超参数。 ### 训练 diff --git a/research/cv/googlenet/README.md b/research/cv/googlenet/README.md index de0308129..843d7b5e1 100644 --- a/research/cv/googlenet/README.md +++ b/research/cv/googlenet/README.md @@ -597,7 +597,7 @@ Current batch_ Size can only be set to 1. ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). 
Following the steps below, this is a simple example: - Running on Ascend diff --git a/research/cv/googlenet/README_CN.md b/research/cv/googlenet/README_CN.md index 3c294fce2..f9f7c2ec2 100644 --- a/research/cv/googlenet/README_CN.md +++ b/research/cv/googlenet/README_CN.md @@ -598,7 +598,7 @@ python export.py --config_path [CONFIG_PATH] ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是操作步骤示例: - Ascend处理器环境运行 diff --git a/research/cv/hardnet/README_CN.md b/research/cv/hardnet/README_CN.md index 6eb181d04..9e09127a4 100644 --- a/research/cv/hardnet/README_CN.md +++ b/research/cv/hardnet/README_CN.md @@ -449,7 +449,7 @@ bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DEVICE_ID] ### 推理 -如果您需要使用此训练模型在Ascend 910上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是操作步骤示例: +如果您需要使用此训练模型在Ascend 910上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是操作步骤示例: - Ascend处理器环境运行 @@ -486,7 +486,7 @@ bash run_infer_310.sh [MINDIR_PATH] [DATASET_PATH] [DEVICE_ID] print("==============Acc: {} ==============".format(acc)) ``` -如果您需要使用此训练模型在GPU上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是操作步骤示例: - GPU处理器环境运行 diff --git a/research/cv/inception_resnet_v2/README.md b/research/cv/inception_resnet_v2/README.md index d7a335067..7c0f4cb22 100644 --- a/research/cv/inception_resnet_v2/README.md +++ b/research/cv/inception_resnet_v2/README.md @@ -122,7 +122,7 @@ bash scripts/run_standalone_train_ascend.sh DEVICE_ID DATA_DIR ``` > Notes: -> RANK_TABLE_FILE can refer 
to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_distribute_train.sh` diff --git a/research/cv/inception_resnet_v2/README_CN.md b/research/cv/inception_resnet_v2/README_CN.md index 2358f9ae4..e20b018fc 100644 --- a/research/cv/inception_resnet_v2/README_CN.md +++ b/research/cv/inception_resnet_v2/README_CN.md @@ -133,7 +133,7 @@ bash scripts/run_distribute_train_ascend.sh RANK_TABLE_FILE DATA_DIR bash scripts/run_standalone_train_ascend.sh DEVICE_ID DATA_DIR ``` -> 注:RANK_TABLE_FILE可参考[链接]( https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。device_ip可以通过[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)获取 +> 注:RANK_TABLE_FILE可参考[链接]( https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html)。device_ip可以通过[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools)获取 - GPU: diff --git a/research/cv/nas-fpn/README_CN.md b/research/cv/nas-fpn/README_CN.md index d0a1a21fe..230ab73d5 100644 --- a/research/cv/nas-fpn/README_CN.md +++ b/research/cv/nas-fpn/README_CN.md @@ -161,7 +161,7 @@ bash scripts/run_single_train.sh DEVICE_ID MINDRECORD_DIR PRE_TRAINED(optional) ``` > 注意: -RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). +RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). #### 运行 diff --git a/research/cv/ntsnet/README.md b/research/cv/ntsnet/README.md index d90b27eb9..915b75076 100644 --- a/research/cv/ntsnet/README.md +++ b/research/cv/ntsnet/README.md @@ -133,7 +133,7 @@ Usage: bash run_standalone_train_ascend.sh [DATA_URL] [TRAIN_URL] ## [Training Process](#contents) -- Set options in `config.py`, including learning rate, output filename and network hyperparameters. 
Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `config.py`, including learning rate, output filename and network hyperparameters. - Get ResNet50 pretrained model from [Mindspore Hub](https://www.mindspore.cn/resources/hub/details?MindSpore/ascend/v1.2/resnet50_v1.2_imagenet2012) ### [Training](#content) diff --git a/research/cv/osnet/README.md b/research/cv/osnet/README.md index 6303d8f5a..a5674b8d2 100644 --- a/research/cv/osnet/README.md +++ b/research/cv/osnet/README.md @@ -160,7 +160,7 @@ bash run_eval_ascend.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] ``` > Notes: -> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. +> RANK_TABLE_FILE can refer to [Link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html) , and the device_ip can be got as [Link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). For large models like InceptionV4, it's better to export an external environment variable `export HCCL_CONNECT_TIMEOUT=600` to extend hccl connection checking time from the default 120 seconds to 600 seconds. Otherwise, the connection could be timeout since compiling time increases with the growth of model size. > > This is processor cores binding operation regarding the `device_num` and total processor numbers. 
If you are not expect to do it, remove the operations `taskset` in `scripts/run_train_distribute_ascend.sh` > diff --git a/research/cv/predrnn++/README.md b/research/cv/predrnn++/README.md index bb4364c74..ff529dd96 100644 --- a/research/cv/predrnn++/README.md +++ b/research/cv/predrnn++/README.md @@ -161,7 +161,7 @@ input0_path: "" # export input path ## [Training Process](#contents) -- Set options in `config.py`, including learning rate and other network hyperparameters. Click [MindSpore dataset preparation tutorial](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `config.py`, including learning rate and other network hyperparameters. ### [Training](#contents) diff --git a/research/cv/retinanet_resnet101/README.md b/research/cv/retinanet_resnet101/README.md index 5df618a2f..95a619110 100644 --- a/research/cv/retinanet_resnet101/README.md +++ b/research/cv/retinanet_resnet101/README.md @@ -287,7 +287,7 @@ bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABL bash run_single_train.sh [DEVICE_ID] [EPOCH_SIZE] [LR] [DATASET] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) ``` -> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/docs/en/master/model_train/parallel/rank_table.html), for details on how to get device_ip check this [link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). +> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/tutorials/en/master/parallel/rank_table.html), for details on how to get device_ip check this [link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
- GPU diff --git a/research/cv/retinanet_resnet101/README_CN.md b/research/cv/retinanet_resnet101/README_CN.md index c237d8cf0..8a8340f01 100644 --- a/research/cv/retinanet_resnet101/README_CN.md +++ b/research/cv/retinanet_resnet101/README_CN.md @@ -292,7 +292,7 @@ bash run_distribute_train.sh [DEVICE_NUM] [EPOCH_SIZE] [LR] [DATASET] [RANK_TABL bash run_single_train.sh [DEVICE_ID] [EPOCH_SIZE] [LR] [DATASET] [PRE_TRAINED](optional) [PRE_TRAINED_EPOCH_SIZE](optional) ``` -> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). +> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html), 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). - GPU diff --git a/research/cv/retinanet_resnet152/README.md b/research/cv/retinanet_resnet152/README.md index 6ef8d5d6f..42960fd15 100644 --- a/research/cv/retinanet_resnet152/README.md +++ b/research/cv/retinanet_resnet152/README.md @@ -291,7 +291,7 @@ bash run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PR bash run_distribute_train.sh DEVICE_ID EPOCH_SIZE LR DATASET PRE_TRAINED(optional) PRE_TRAINED_EPOCH_SIZE(optional) ``` -> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), +> Note: RANK_TABLE_FILE related reference materials see in this [link](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html), > for details on how to get device_ip check this [link](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). 
- GPU: diff --git a/research/cv/retinanet_resnet152/README_CN.md b/research/cv/retinanet_resnet152/README_CN.md index 8d9ae8df6..13d27a27a 100644 --- a/research/cv/retinanet_resnet152/README_CN.md +++ b/research/cv/retinanet_resnet152/README_CN.md @@ -285,7 +285,7 @@ bash run_distribute_train.sh DEVICE_NUM EPOCH_SIZE LR DATASET RANK_TABLE_FILE PR bash run_distribute_train.sh DEVICE_ID EPOCH_SIZE LR DATASET PRE_TRAINED(optional) PRE_TRAINED_EPOCH_SIZE(optional) ``` -> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html), +> 注意: RANK_TABLE_FILE相关参考资料见[链接](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html), > 获取device_ip方法详见[链接](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools). - GPU: diff --git a/research/cv/sphereface/README.md b/research/cv/sphereface/README.md index 1c22df065..ff3c18549 100644 --- a/research/cv/sphereface/README.md +++ b/research/cv/sphereface/README.md @@ -474,7 +474,7 @@ The accuracy of evaluating DenseNet121 on the test dataset of ImageNet will be a ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). 
Following the steps below, this is a simple example: - Running on Ascend and GPU diff --git a/research/cv/sphereface/README_CN.md b/research/cv/sphereface/README_CN.md index 3c1a8622e..fdad403b1 100644 --- a/research/cv/sphereface/README_CN.md +++ b/research/cv/sphereface/README_CN.md @@ -476,7 +476,7 @@ sphereface网络使用LFW推理得到的结果如下: ### 推理 -如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。下面是操作步骤示例: +如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。下面是操作步骤示例: - Ascend、GPU处理器环境运行 diff --git a/research/cv/squeezenet/README.md b/research/cv/squeezenet/README.md index 19f13910e..3d58e58c5 100644 --- a/research/cv/squeezenet/README.md +++ b/research/cv/squeezenet/README.md @@ -720,7 +720,7 @@ Inference result is saved in current path, you can find result like this in acc. ### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). Following the steps below, this is a simple example: - Running on Ascend diff --git a/research/cv/squeezenet1_1/README.md b/research/cv/squeezenet1_1/README.md index 44a0111a5..b78dafd51 100644 --- a/research/cv/squeezenet1_1/README.md +++ b/research/cv/squeezenet1_1/README.md @@ -306,7 +306,7 @@ Inference result is saved in current path, you can find result like this in acc. 
### Inference -If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following the steps below, this is a simple example: +If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). Following the steps below, this is a simple example: - Running on Ascend diff --git a/research/cv/ssd_ghostnet/README.md b/research/cv/ssd_ghostnet/README.md index dbc3792f0..beb2105b1 100644 --- a/research/cv/ssd_ghostnet/README.md +++ b/research/cv/ssd_ghostnet/README.md @@ -210,7 +210,7 @@ If you want to run in modelarts, please check the official documentation of [mod ### Training on Ascend -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) files by `coco_root`(coco dataset) or `iamge_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) files by `coco_root`(coco dataset) or `iamge_dir` and `anno_path`(own dataset). 
**Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** - Distribute mode diff --git a/research/cv/ssd_inception_v2/README.md b/research/cv/ssd_inception_v2/README.md index 013934192..2e4a407a8 100644 --- a/research/cv/ssd_inception_v2/README.md +++ b/research/cv/ssd_inception_v2/README.md @@ -213,7 +213,7 @@ bash scripts/docker_start.sh ssd:20.1.0 [DATA_DIR] [MODEL_DIR] ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) files by `coco_root`(coco dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) files by `coco_root`(coco dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on GPU diff --git a/research/cv/ssd_inceptionv2/README_CN.md b/research/cv/ssd_inceptionv2/README_CN.md index 3875e7ec8..dc0753730 100644 --- a/research/cv/ssd_inceptionv2/README_CN.md +++ b/research/cv/ssd_inceptionv2/README_CN.md @@ -171,7 +171,7 @@ bash run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [MINDREC ## 训练过程 -运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** +运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** ### Ascend上训练 diff --git a/research/cv/ssd_mobilenetV2/README.md b/research/cv/ssd_mobilenetV2/README.md index 8ce3f93e1..93b0f196f 
100644 --- a/research/cv/ssd_mobilenetV2/README.md +++ b/research/cv/ssd_mobilenetV2/README.md @@ -221,7 +221,7 @@ bash scripts/run_eval_gpu.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on Ascend diff --git a/research/cv/ssd_mobilenetV2_FPNlite/README.md b/research/cv/ssd_mobilenetV2_FPNlite/README.md index 9f1becb77..e43590a9d 100644 --- a/research/cv/ssd_mobilenetV2_FPNlite/README.md +++ b/research/cv/ssd_mobilenetV2_FPNlite/README.md @@ -233,7 +233,7 @@ bash run_eval_gpu.sh [CONFIG_FILE] [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). 
**Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on Ascend diff --git a/research/cv/ssd_resnet34/README.md b/research/cv/ssd_resnet34/README.md index 968aaf479..067a5cf47 100644 --- a/research/cv/ssd_resnet34/README.md +++ b/research/cv/ssd_resnet34/README.md @@ -206,7 +206,7 @@ bash run_infer_310.sh [MINDIR_PATH] [DATA_PATH] [DEVICE_ID] ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). 
**Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on Ascend diff --git a/research/cv/ssd_resnet34/README_CN.md b/research/cv/ssd_resnet34/README_CN.md index 71d18431a..f299b3829 100644 --- a/research/cv/ssd_resnet34/README_CN.md +++ b/research/cv/ssd_resnet34/README_CN.md @@ -172,7 +172,7 @@ sh scripts/run_eval.sh [DEVICE_ID] [DATASET] [DATASET_PATH] [CHECKPOINT_PATH] [M ## 训练过程 -运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** +运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** ### Ascend上训练 diff --git a/research/cv/ssd_resnet50/README.md b/research/cv/ssd_resnet50/README.md index 1063e7961..d107d4c07 100644 --- a/research/cv/ssd_resnet50/README.md +++ b/research/cv/ssd_resnet50/README.md @@ -204,7 +204,7 @@ Then you can run everything just like on ascend. ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). 
**Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on Ascend diff --git a/research/cv/ssd_resnet50/README_CN.md b/research/cv/ssd_resnet50/README_CN.md index 9fe3222ad..816c1a774 100644 --- a/research/cv/ssd_resnet50/README_CN.md +++ b/research/cv/ssd_resnet50/README_CN.md @@ -163,7 +163,7 @@ bash run_eval.sh [DATASET] [CHECKPOINT_PATH] [DEVICE_ID] ## 训练过程 -运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** +运行`train.py`训练模型。如果`mindrecord_dir`为空,则会通过`coco_root`(coco数据集)或`image_dir`和`anno_path`(自己的数据集)生成[MindRecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html)文件。**注意,如果mindrecord_dir不为空,将使用mindrecord_dir代替原始图像。** ### Ascend上训练 diff --git a/research/cv/ssd_resnet_34/README.md b/research/cv/ssd_resnet_34/README.md index ae5c9b484..26ec67fc2 100644 --- a/research/cv/ssd_resnet_34/README.md +++ b/research/cv/ssd_resnet_34/README.md @@ -204,7 +204,7 @@ Major parameters in train.py and config.py for Multi GPU train: ### [Training Process](#contents) -To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). **Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** +To train the model, run `train.py`. If the `mindrecord_dir` is empty, it will generate [mindrecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html) files by `coco_root`(coco dataset), `voc_root`(voc dataset) or `image_dir` and `anno_path`(own dataset). 
**Note if mindrecord_dir isn't empty, it will use mindrecord_dir instead of raw images.** #### Training on GPU diff --git a/research/cv/textfusenet/README.md b/research/cv/textfusenet/README.md index 4eecda4be..03a9b23cb 100755 --- a/research/cv/textfusenet/README.md +++ b/research/cv/textfusenet/README.md @@ -319,7 +319,7 @@ Usage: bash run_standalone_train.sh [PRETRAINED_MODEL] ## [Training Process](#contents) -- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/dataset/augment.html) for more information about dataset. +- Set options in `config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/tutorials/en/master/dataset/augment.html) for more information about dataset. ### [Training](#content) diff --git a/research/cv/textfusenet/README_CN.md b/research/cv/textfusenet/README_CN.md index 635953fad..55a04213f 100755 --- a/research/cv/textfusenet/README_CN.md +++ b/research/cv/textfusenet/README_CN.md @@ -328,7 +328,7 @@ Shapely==1.5.9 ## 训练过程 -- 在`config.py`中设置配置项,包括loss_scale、学习率和网络超参。单击[此处](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/augment.html)获取更多数据集相关信息. +- 在`config.py`中设置配置项,包括loss_scale、学习率和网络超参。单击[此处](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/augment.html)获取更多数据集相关信息. 
### 训练 diff --git a/research/cv/tinydarknet/README_CN.md b/research/cv/tinydarknet/README_CN.md index e537488d0..caf39d1c4 100644 --- a/research/cv/tinydarknet/README_CN.md +++ b/research/cv/tinydarknet/README_CN.md @@ -64,7 +64,7 @@ Tiny-DarkNet是Joseph Chet Redmon等人提出的一个16层的针对于经典的 - + # [环境要求](#目录) diff --git a/research/cv/vnet/README_CN.md b/research/cv/vnet/README_CN.md index 6572a6840..6b9318021 100644 --- a/research/cv/vnet/README_CN.md +++ b/research/cv/vnet/README_CN.md @@ -101,7 +101,7 @@ VNet适用于医学图像分割,使用3D卷积,能够处理3D MR图像数据 - [MindSpore Python API](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.html) - 生成config json文件用于多卡训练。 - [简易教程](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools) - - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/docs/zh-CN/master/model_train/parallel/rank_table.html)。 + - 详细配置方法请参照[rank table启动](https://www.mindspore.cn/tutorials/zh-CN/master/parallel/rank_table.html)。 # 快速入门 diff --git a/research/cv/warpctc/README.md b/research/cv/warpctc/README.md index b9b8c0106..3afdc681e 100644 --- a/research/cv/warpctc/README.md +++ b/research/cv/warpctc/README.md @@ -254,7 +254,7 @@ save_checkpoint_path: "./checkpoint" # path to save checkpoint ### [Training Process](#contents) -- Set options in `default_config.yaml`, including learning rate and other network hyperparameters. Click [MindSpore dataset preparation tutorial](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `default_config.yaml`, including learning rate and other network hyperparameters. 
#### [Training](#contents) diff --git a/research/cv/warpctc/README_CN.md b/research/cv/warpctc/README_CN.md index eff8427d8..3992a5e62 100644 --- a/research/cv/warpctc/README_CN.md +++ b/research/cv/warpctc/README_CN.md @@ -257,7 +257,7 @@ save_checkpoint_path: "./checkpoints" # 检查点保存路径,相对于t ## 训练过程 -- 在`default_config.yaml`中设置选项,包括学习率和网络超参数。单击[MindSpore加载数据集教程](https://www.mindspore.cn/docs/zh-CN/master/model_train/index.html),了解更多信息。 +- 在`default_config.yaml`中设置选项,包括学习率和网络超参数。 ### 训练 diff --git a/research/cv/wideresnet/README.md b/research/cv/wideresnet/README.md index ec4fea43a..c147e51c2 100644 --- a/research/cv/wideresnet/README.md +++ b/research/cv/wideresnet/README.md @@ -208,7 +208,7 @@ bash run_standalone_train_gpu.sh [DATASET_PATH] [CONFIG_PATH] [EXPERIMENT_LABEL] For distributed training, a hostfile configuration needs to be created in advance. -Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html). +Please follow the instructions in the link [GPU-Multi-Host](https://www.mindspore.cn/tutorials/en/master/parallel/mpirun.html). ##### Evaluation while training diff --git a/research/cv/wideresnet/README_CN.md b/research/cv/wideresnet/README_CN.md index 339c79ff4..7182c82ea 100644 --- a/research/cv/wideresnet/README_CN.md +++ b/research/cv/wideresnet/README_CN.md @@ -211,7 +211,7 @@ bash run_standalone_train_gpu.sh [DATASET_PATH] [CONFIG_PATH] [EXPERIMENT_LABEL] 对于分布式培训,需要提前创建主机文件配置。 -请按照链接中的说明操作 [GPU-Multi-Host](https://www.mindspore.cn/docs/en/master/model_train/parallel/mpirun.html). +请按照链接中的说明操作 [GPU-Multi-Host](https://www.mindspore.cn/tutorials/en/master/parallel/mpirun.html). 
## 培训时的评估 diff --git a/research/cv/yolov3_resnet18/README.md b/research/cv/yolov3_resnet18/README.md index 69ec4aa5e..b72e14214 100644 --- a/research/cv/yolov3_resnet18/README.md +++ b/research/cv/yolov3_resnet18/README.md @@ -269,7 +269,7 @@ After installing MindSpore via the official website, you can start training and ### Training on Ascend -To train the model, run `train.py` with the dataset `image_dir`, `anno_path` and `mindrecord_dir`. If the `mindrecord_dir` is empty, it wil generate [mindrecord](https://www.mindspore.cn/docs/en/master/model_train/dataset/record.html) file by `image_dir` and `anno_path`(the absolute image path is joined by the `image_dir` and the relative path in `anno_path`). **Note if `mindrecord_dir` isn't empty, it will use `mindrecord_dir` rather than `image_dir` and `anno_path`.** +To train the model, run `train.py` with the dataset `image_dir`, `anno_path` and `mindrecord_dir`. If the `mindrecord_dir` is empty, it wil generate [mindrecord](https://www.mindspore.cn/tutorials/en/master/dataset/record.html) file by `image_dir` and `anno_path`(the absolute image path is joined by the `image_dir` and the relative path in `anno_path`). 
**Note if `mindrecord_dir` isn't empty, it will use `mindrecord_dir` rather than `image_dir` and `anno_path`.** - Stand alone mode diff --git a/research/cv/yolov3_resnet18/README_CN.md b/research/cv/yolov3_resnet18/README_CN.md index 94cc0bf33..771f86630 100644 --- a/research/cv/yolov3_resnet18/README_CN.md +++ b/research/cv/yolov3_resnet18/README_CN.md @@ -268,7 +268,7 @@ YOLOv3整体网络架构如下: ### Ascend上训练 -训练模型运行`train.py`,使用数据集`image_dir`、`anno_path`和`mindrecord_dir`。如果`mindrecord_dir`为空,则通过`image_dir`和`anno_path`(图像绝对路径由`image_dir`和`anno_path`中的相对路径连接)生成[MindRecord](https://www.mindspore.cn/docs/zh-CN/master/model_train/dataset/record.html)文件。**注意,如果`mindrecord_dir`不为空,将使用`mindrecord_dir`而不是`image_dir`和`anno_path`。** +训练模型运行`train.py`,使用数据集`image_dir`、`anno_path`和`mindrecord_dir`。如果`mindrecord_dir`为空,则通过`image_dir`和`anno_path`(图像绝对路径由`image_dir`和`anno_path`中的相对路径连接)生成[MindRecord](https://www.mindspore.cn/tutorials/zh-CN/master/dataset/record.html)文件。**注意,如果`mindrecord_dir`不为空,将使用`mindrecord_dir`而不是`image_dir`和`anno_path`。** - 单机模式 diff --git a/research/nlp/cpm/README.md b/research/nlp/cpm/README.md index ffc11e53f..90bd85bcb 100644 --- a/research/nlp/cpm/README.md +++ b/research/nlp/cpm/README.md @@ -309,7 +309,7 @@ After processing, the mindrecord file of training and reasoning is generated in ### Finetune Training Process -- Set options in `src/config.py`, including loss_scale, learning rate and network hyperparameters. Click [here](https://www.mindspore.cn/docs/en/master/model_train/index.html) for more information about dataset. +- Set options in `src/config.py`, including loss_scale, learning rate and network hyperparameters. - Run `run_distribute_train_ascend_single_machine.sh` for distributed and single machine training of CPM model. 
diff --git a/research/nlp/cpm/README_CN.md b/research/nlp/cpm/README_CN.md
index c4a0f01cd..f31d352ec 100644
--- a/research/nlp/cpm/README_CN.md
+++ b/research/nlp/cpm/README_CN.md
@@ -309,7 +309,7 @@ Parameters for dataset and network (Training/Evaluation):
 ### Finetune训练过程
-- 在`src/config.py`中设置,包括模型并行、batchsize、学习率和网络超参数。点击[这里](https://www.mindspore.cn/docs/zh-CN/master/model_train/index.html)查看更多数据集信息。
+- 在`src/config.py`中设置,包括模型并行、batchsize、学习率和网络超参数。
 - 运行`run_distribute_train_ascend_single_machine.sh`,进行CPM模型的单机8卡分布式训练。
diff --git a/research/nlp/mass/README.md b/research/nlp/mass/README.md
index 979c9bcd4..598d943ff 100644
--- a/research/nlp/mass/README.md
+++ b/research/nlp/mass/README.md
@@ -501,7 +501,7 @@ subword-nmt
 rouge
 ```
-
+
 # Get started
@@ -563,7 +563,7 @@ Get the log and output files under the path `./train_mass_*/`, and the model fil
 ## Inference
-If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html).
+If you need to use the trained model to perform inference on multiple hardware platforms, such as GPU, Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html).
 For inference, config the options in `default_config.yaml` firstly:
 - Assign the `default_config.yaml` under `data_path` node to the dataset path.
diff --git a/research/nlp/mass/README_CN.md b/research/nlp/mass/README_CN.md
index 0fd2da545..d8d954495 100644
--- a/research/nlp/mass/README_CN.md
+++ b/research/nlp/mass/README_CN.md
@@ -505,7 +505,7 @@ subword-nmt
 rouge
 ```
-
+
 # 快速上手
@@ -567,7 +567,7 @@ bash run_gpu.sh -t t -n 1 -i 1
 ## 推理
-如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/docs/zh-CN/master/model_infer/index.html)。
+如果您需要使用此训练模型在GPU、Ascend 910、Ascend 310等多个硬件平台上进行推理,可参考此[链接](https://www.mindspore.cn/tutorials/zh-CN/master/model_infer/ms_infer/llm_inference_overview.html)。
 推理时,请先配置`config.json`中的选项:
 - 将`default_config.yaml`节点下的`data_path`配置为数据集路径。
diff --git a/research/nlp/rotate/README_CN.md b/research/nlp/rotate/README_CN.md
index a87a4910b..5c92b421f 100644
--- a/research/nlp/rotate/README_CN.md
+++ b/research/nlp/rotate/README_CN.md
@@ -86,7 +86,7 @@ bash run_infer_310.sh [MINDIR_HEAD_PATH] [MINDIR_TAIL_PATH] [DATASET_PATH] [NEED
 在裸机环境(本地有Ascend 910 AI 处理器)进行分布式训练时,需要配置当前多卡环境的组网信息文件。
 请遵循一下链接中的说明创建json文件:
-
+
 - GPU处理器环境运行
diff --git a/research/recommend/ncf/README.md b/research/recommend/ncf/README.md
index 078d04083..5f1e5ac89 100644
--- a/research/recommend/ncf/README.md
+++ b/research/recommend/ncf/README.md
@@ -356,9 +356,9 @@ Inference result is saved in current path, you can find result like this in acc.
 ### Inference
-If you need to use the trained model to perform inference on multiple hardware platforms, such as Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/docs/en/master/model_infer/index.html). Following the steps below, this is a simple example:
+If you need to use the trained model to perform inference on multiple hardware platforms, such as Ascend 910 or Ascend 310, you can refer to this [Link](https://www.mindspore.cn/tutorials/en/master/model_infer/ms_infer/llm_inference_overview.html). Following the steps below, this is a simple example:
-
+
 ```python
 # Load unseen dataset for inference
--
Gitee