diff --git a/docs/mindformers/docs/source_en/feature/high_availability.md b/docs/mindformers/docs/source_en/feature/high_availability.md
index 46a9bc4a25562aa7183e3be7cbb17b486003cdd6..61ee0e391a1fc0f7af6ccde70d4953446d68a3e8 100644
--- a/docs/mindformers/docs/source_en/feature/high_availability.md
+++ b/docs/mindformers/docs/source_en/feature/high_availability.md
@@ -1,4 +1,4 @@
-# High Availability
+# Training High Availability
 
 [![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r2.7.0rc1/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/r2.7.0rc1/docs/mindformers/docs/source_en/feature/high_availability.md)
 
diff --git a/docs/mindformers/docs/source_en/feature/memory_optimization.md b/docs/mindformers/docs/source_en/feature/memory_optimization.md
index e95541687205b191e5e8888c3fa576228565b64c..cb7ba92e4acfc40ca89ef705b7a0acdda823f83d 100644
--- a/docs/mindformers/docs/source_en/feature/memory_optimization.md
+++ b/docs/mindformers/docs/source_en/feature/memory_optimization.md
@@ -1,4 +1,4 @@
-# Memory Optimization Features
+# Memory Optimization
 
 [![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r2.7.0rc1/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/r2.7.0rc1/docs/mindformers/docs/source_en/feature/memory_optimization.md)
 
diff --git a/docs/mindformers/docs/source_en/guide/pre_training.md b/docs/mindformers/docs/source_en/guide/pre_training.md
index 4348a26d4f4338cc038fd8326f0fc49a19f4961f..04353136d3682da256215371247925e2100bd1f5 100644
--- a/docs/mindformers/docs/source_en/guide/pre_training.md
+++ b/docs/mindformers/docs/source_en/guide/pre_training.md
@@ -28,7 +28,7 @@ Based on actual operations, the basic pretraining process can be divided into th
 
 ### 5. Fault Recovery
 
-   To handle unexpected interruptions during training, MindSpore Transformers includes [high availability features](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/high_availability.html) such as final-state saving and automatic recovery. It also supports [resuming training from checkpoints](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/resume_training.html), improving training stability.
+   To handle unexpected interruptions during training, MindSpore Transformers includes [training high availability](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/high_availability.html) such as final-state saving and automatic recovery. It also supports [resuming training from checkpoints](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/resume_training.html), improving training stability.
 
 ## MindSpore Transformers-based Pretraining Practice
 
diff --git a/docs/mindformers/docs/source_en/guide/supervised_fine_tuning.md b/docs/mindformers/docs/source_en/guide/supervised_fine_tuning.md
index bb330732126ed7a2be5b325e5f4443ad4cc13cdf..2b7df2041becadd3105397d5f04ff1a9d9a2a14e 100644
--- a/docs/mindformers/docs/source_en/guide/supervised_fine_tuning.md
+++ b/docs/mindformers/docs/source_en/guide/supervised_fine_tuning.md
@@ -34,7 +34,7 @@ Checkpoints are saved during training, or model weights are saved to a specified
 
 ### 6. Fault Recovery
 
-To handle exceptions such as training interruptions, MindSpore Transformers offers [high-availability features](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/high_availability.html) like last-state saving and automatic recovery, as well as [checkpoint-based resumed training](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/resume_training.html), enhancing training stability.
+To handle exceptions such as training interruptions, MindSpore Transformers offers [training high availability](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/high_availability.html) like last-state saving and automatic recovery, as well as [checkpoint-based resumed training](https://www.mindspore.cn/mindformers/docs/en/r1.6.0/feature/resume_training.html), enhancing training stability.
 
 ## Full-Parameter Fine-Tuning with MindSpore Transformers
 
diff --git a/docs/mindformers/docs/source_en/index.rst b/docs/mindformers/docs/source_en/index.rst
index ce4d2246f64865fab55972bc9b73ed6555bd59f6..04526dba6ceffb344bc34cbd25f472614e7dc7a6 100644
--- a/docs/mindformers/docs/source_en/index.rst
+++ b/docs/mindformers/docs/source_en/index.rst
@@ -84,7 +84,7 @@ MindSpore Transformers provides a wealth of features throughout the full-process
 
      Provides high-availability capabilities for the training phase of large models, including end-of-life CKPT preservation, UCE fault-tolerant recovery, and process-level rescheduling recovery (Beta feature).
 
-   - `Parallel Training `_
+   - `Distributed Parallel Training `_
 
      One-click configuration of multi-dimensional hybrid distributed parallel allows models to run efficiently in clusters up to 10,000 cards.
 
diff --git a/docs/mindformers/docs/source_zh_cn/feature/high_availability.md b/docs/mindformers/docs/source_zh_cn/feature/high_availability.md
index f65f28ef454af19192bd05182bdca3dbce04af90..d1c7da1470980d34a4995b9f7770d63b60a59f81 100644
--- a/docs/mindformers/docs/source_zh_cn/feature/high_availability.md
+++ b/docs/mindformers/docs/source_zh_cn/feature/high_availability.md
@@ -1,4 +1,4 @@
-# 高可用特性
+# 训练高可用
 
 [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r2.7.0rc1/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/r2.7.0rc1/docs/mindformers/docs/source_zh_cn/feature/high_availability.md)
 
diff --git a/docs/mindformers/docs/source_zh_cn/feature/memory_optimization.md b/docs/mindformers/docs/source_zh_cn/feature/memory_optimization.md
index f8156604787e85cb291db69304180a3df57515c2..9fb14ed13db01c6ac5f05c2a8cd97dbd17e54b55 100644
--- a/docs/mindformers/docs/source_zh_cn/feature/memory_optimization.md
+++ b/docs/mindformers/docs/source_zh_cn/feature/memory_optimization.md
@@ -1,4 +1,4 @@
-# 训练内存优化特性
+# 训练内存优化
 
 [![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/r2.7.0rc1/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/r2.7.0rc1/docs/mindformers/docs/source_zh_cn/feature/memory_optimization.md)
 
diff --git a/docs/mindformers/docs/source_zh_cn/guide/pre_training.md b/docs/mindformers/docs/source_zh_cn/guide/pre_training.md
index 9114f70630afdc58c254bfad423cd828227ee6e9..20c5d64e47bc481be82fe2594cd64c549bea5c24 100644
--- a/docs/mindformers/docs/source_zh_cn/guide/pre_training.md
+++ b/docs/mindformers/docs/source_zh_cn/guide/pre_training.md
@@ -28,7 +28,7 @@ MindSpore Transformers 提供[一键启动脚本](https://www.mindspore.cn/mindf
 
 ### 5. 故障恢复
 
-为应对训练中断等异常情况,MindSpore Transformers 具备临终保存、自动恢复等[高可用特性](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/high_availability.html),并支持[断点续训](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/resume_training.html),提升训练稳定性。
+为应对训练中断等异常情况,MindSpore Transformers 具备临终保存、自动恢复等[训练高可用](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/high_availability.html),并支持[断点续训](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/resume_training.html),提升训练稳定性。
 
 ## 基于 MindSpore Transformers 的预训练实践
 
diff --git a/docs/mindformers/docs/source_zh_cn/guide/supervised_fine_tuning.md b/docs/mindformers/docs/source_zh_cn/guide/supervised_fine_tuning.md
index 15ed443a5bfa10853970be7df6cf4898a62c080e..53f613706626985c96f67d0f59d25fa4795af645 100644
--- a/docs/mindformers/docs/source_zh_cn/guide/supervised_fine_tuning.md
+++ b/docs/mindformers/docs/source_zh_cn/guide/supervised_fine_tuning.md
@@ -34,7 +34,7 @@ MindSpore Transformers提供[一键启动脚本](https://www.mindspore.cn/mindfo
 
 ### 6. 故障恢复
 
-为应对训练中断等异常情况,MindSpore Transformers具备临终保存、自动恢复等[高可用特性](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/high_availability.html),并支持[断点续训](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/resume_training.html),提升训练稳定性。
+为应对训练中断等异常情况,MindSpore Transformers具备临终保存、自动恢复等[训练高可用](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/high_availability.html),并支持[断点续训](https://www.mindspore.cn/mindformers/docs/zh-CN/r1.6.0/feature/resume_training.html),提升训练稳定性。
 
 ## 使用MindSpore Transformers进行全参微调
 
diff --git a/docs/mindformers/docs/source_zh_cn/index.rst b/docs/mindformers/docs/source_zh_cn/index.rst
index db4ba87fe0dcec559afd526afb8b95b56d4f85bf..9ea1eded25fc444bb28ac609c0f7d223af4bbb44 100644
--- a/docs/mindformers/docs/source_zh_cn/index.rst
+++ b/docs/mindformers/docs/source_zh_cn/index.rst
@@ -48,7 +48,7 @@ MindSpore Transformers提供了统一的一键启动脚本,支持一键启动
@@ -111,7 +111,7 @@
 
      提供大模型训练阶段的高可用能力,包括临终 CKPT 保存、UCE 故障容错恢复和进程级重调度恢复功能(Beta特性)。
 
-   - `分布式训练 `_
+   - `分布式并行训练 `_
 
      一键配置多维混合分布式并行,让模型在上至万卡的集群中高效训练。