diff --git a/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md b/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md
index be1d1582ed52a93d1910512000d64d395ce76f05..2492e9c493d9f262d134e665cd540d47f55cfb7d 100644
--- a/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md
+++ b/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md
@@ -156,7 +156,7 @@ The customized `to_float` conflicts with the `amp_level` in the model. If the cu
 
 #### Customizing Initialization Parameters
 
-Generally, the high-level API encapsulated by MindSpore initializes parameters by default. Sometimes, the initialization distribution is inconsistent with the required initialization and PyTorch initialization. In this case, you need to customize initialization. [Initializing Network Arguments](https://mindspore.cn/tutorials/en/master/advanced/modules/parameter.html#initializing-network-arguments) describes a method of initializing parameters by using API attributes. This section describes a method of initializing parameters by using Cell.
+Generally, the high-level API encapsulated by MindSpore initializes parameters by default. Sometimes, the initialization distribution is inconsistent with the required initialization and PyTorch initialization. In this case, you need to customize initialization. [Initializing Network Arguments](https://mindspore.cn/tutorials/en/master/advanced/modules/initializer.html#customized-parameter-initialization) describes a method of initializing parameters by using API attributes. This section describes a method of initializing parameters by using Cell.
 
 For details about the parameters, see [Network Parameters](https://mindspore.cn/tutorials/zh-CN/master/advanced/modules/initializer.html). This section uses `Cell` as an example to describe how to obtain all parameters in `Cell` and how to initialize the parameters in `Cell`.
 
diff --git a/docs/reinforcement/docs/source_en/replaybuffer.md b/docs/reinforcement/docs/source_en/replaybuffer.md
index df489c040eedc58412039e29747299fe3d6dd493..36d22ee3c25ee466bb3799a255db28509848d7e1 100644
--- a/docs/reinforcement/docs/source_en/replaybuffer.md
+++ b/docs/reinforcement/docs/source_en/replaybuffer.md
@@ -40,6 +40,8 @@ To simulate the FIFO characteristics of a circular queue, we use two cursors to
 3. After continuing to insert a batch_size of 4, the queue is full and the count is 6.
 4. After continuing to insert a batch_size of 2, overwrite updates the old data and adds 2 to the head.
 
+![insert schematic diagram](https://gitee.com/mindspore/docs/blob/master/docs/reinforcement/docs/source_zh_cn/images/insert.png)
+
 #### 2 Search
 
 The search method accepts an index as an input, indicating the specific location of the data to be found. The output is a set of Tensor, as shown in the following figure:
@@ -54,10 +56,12 @@ The search method accepts an index as an input, indicating the specific location
 The sampling method has no input and the output is a set of Tensor with the size of the batch_size when the UniformReplayBuffer is created. This is shown in the following figure:
 Assuming that batch_size is 3, a random set of indexes will be generated in the operator, and this random set of indexes has two cases:
 
-1. Packet ordering: each index means the real data position, which needs to be remapped by cursor operation.
-2. No packet ordering: each index does not represent the real position and is obtained directly.
+1. Order preserving: each index means the real data position, which needs to be remapped by cursor operation.
+2. No order preserving: each index does not represent the real position and is obtained directly.
+
+Both approaches have a slight impact on randomness, and the default is to use no order preserving to get the best performance.
 
-Both approaches have a slight impact on randomness, and the default is to use no-packet ordering to get the best performance.
+![sample schematic diagram](https://gitee.com/mindspore/docs/blob/master/docs/reinforcement/docs/source_zh_cn/images/sample.png)
 
 ## UniformReplayBuffer Introduction of MindSpore Reinforcement Learning
 
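
Reviewer note: the insert and sample behaviour that the patched `replaybuffer.md` text describes can be sketched in plain Python. This is only an illustrative sketch under assumptions, not the MindSpore implementation: `RingBuffer`, `insert`, `get`, `sample`, and the `order_preserving` flag are hypothetical names chosen for this example. It assumes "order preserving" means a logical index is remapped through the head cursor, and "no order preserving" means the index is used directly as a physical slot.

```python
import random


class RingBuffer:
    """Hypothetical fixed-capacity FIFO buffer tracked by two cursors."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [None] * capacity
        self.head = 0   # physical slot holding the oldest item
        self.count = 0  # number of valid items currently stored

    def insert(self, batch):
        for item in batch:
            # Next write position; equals head once the buffer is full.
            tail = (self.head + self.count) % self.capacity
            self.data[tail] = item
            if self.count < self.capacity:
                self.count += 1
            else:
                # The oldest item was overwritten, so advance head.
                self.head = (self.head + 1) % self.capacity

    def get(self, index, order_preserving=True):
        if not 0 <= index < self.count:
            raise IndexError(index)
        if order_preserving:
            # Remap the logical index (0 = oldest) through the head cursor.
            return self.data[(self.head + index) % self.capacity]
        # Use the index directly as a physical slot, with no remapping.
        return self.data[index]

    def sample(self, batch_size, order_preserving=False, rng=random):
        # Draw uniform random indexes over the valid items (with replacement).
        indexes = [rng.randrange(self.count) for _ in range(batch_size)]
        return [self.get(i, order_preserving) for i in indexes]


buf = RingBuffer(6)
buf.insert([0, 1])        # count becomes 2
buf.insert([2, 3, 4, 5])  # buffer is full, count is 6
buf.insert([6, 7])        # overwrites the two oldest items, head moves by 2
```

With a capacity of 6, the three inserts mirror the numbered steps in the first hunk: the buffer fills at a count of 6, and the last insert overwrites the oldest data and adds 2 to the head. Skipping the remap saves one modular addition per sampled index, which is consistent with the performance motivation given in the patched text.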