From c57c2fa36013fbc590cb478a3d3dc880e9939d43 Mon Sep 17 00:00:00 2001
From: huanxiaoling <3174348550@qq.com>
Date: Wed, 2 Nov 2022 15:55:08 +0800
Subject: [PATCH] modify the files in docs

---
 .../model_development/model_and_loss.md           |  2 +-
 docs/reinforcement/docs/source_en/replaybuffer.md | 10 +++++++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md b/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md
index be1d1582ed..2492e9c493 100644
--- a/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md
+++ b/docs/mindspore/source_en/migration_guide/model_development/model_and_loss.md
@@ -156,7 +156,7 @@ The customized `to_float` conflicts with the `amp_level` in the model. If the cu
 #### Customizing Initialization Parameters
-Generally, the high-level API encapsulated by MindSpore initializes parameters by default. Sometimes, the initialization distribution is inconsistent with the required initialization and PyTorch initialization. In this case, you need to customize initialization. [Initializing Network Arguments](https://mindspore.cn/tutorials/en/master/advanced/modules/parameter.html#initializing-network-arguments) describes a method of initializing parameters by using API attributes. This section describes a method of initializing parameters by using Cell.
+Generally, the high-level API encapsulated by MindSpore initializes parameters by default. Sometimes, the initialization distribution is inconsistent with the required initialization and PyTorch initialization. In this case, you need to customize initialization. [Initializing Network Arguments](https://mindspore.cn/tutorials/en/master/advanced/modules/initializer.html#customized-parameter-initialization) describes a method of initializing parameters by using API attributes. This section describes a method of initializing parameters by using Cell.
 For details about the parameters, see [Network Parameters](https://mindspore.cn/tutorials/zh-CN/master/advanced/modules/initializer.html). This section uses `Cell` as an example to describe how to obtain all parameters in `Cell` and how to initialize the parameters in `Cell`.
diff --git a/docs/reinforcement/docs/source_en/replaybuffer.md b/docs/reinforcement/docs/source_en/replaybuffer.md
index df489c040e..36d22ee3c2 100644
--- a/docs/reinforcement/docs/source_en/replaybuffer.md
+++ b/docs/reinforcement/docs/source_en/replaybuffer.md
@@ -40,6 +40,8 @@ To simulate the FIFO characteristics of a circular queue, we use two cursors to
 3. After continuing to insert a batch_size of 4, the queue is full and the count is 6.
 4. After continuing to insert a batch_size of 2, overwrite updates the old data and adds 2 to the head.
+![insert schematic diagram](https://gitee.com/mindspore/docs/blob/master/docs/reinforcement/docs/source_zh_cn/images/insert.png)
+
 #### 2 Search
 The search method accepts an index as an input, indicating the specific location of the data to be found. The output is a set of Tensor, as shown in the following figure:
@@ -54,10 +56,12 @@ The search method accepts an index as an input, indicating the specific location
 The sampling method has no input and the output is a set of Tensor with the size of the batch_size when the UniformReplayBuffer is created. This is shown in the following figure:
 Assuming that batch_size is 3, a random set of indexes will be generated in the operator, and this random set of indexes has two cases:
-1. Packet ordering: each index means the real data position, which needs to be remapped by cursor operation.
-2. No packet ordering: each index does not represent the real position and is obtained directly.
+1. Order preserving: each index means the real data position, which needs to be remapped by cursor operation.
+2. No order preserving: each index does not represent the real position and is obtained directly.
+
+Both approaches have a slight impact on randomness, and the default is to use no order preserving to get the best performance.
-Both approaches have a slight impact on randomness, and the default is to use no-packet ordering to get the best performance.
+![sample schematic diagram](https://gitee.com/mindspore/docs/blob/master/docs/reinforcement/docs/source_zh_cn/images/sample.png)
 ## UniformReplayBuffer Introduction of MindSpore Reinforcement Learning
-- 
Gitee
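
For readers of this patch, the insert and sample behaviour described in `replaybuffer.md` can be sketched in plain Python. This is a minimal illustration, not the MindSpore implementation: the class `RingBuffer`, its method names, the capacity of 6, the first insert of 2 items, and sampling with replacement are all assumptions made for the example. It also assumes that "order preserving" means an index counts from the oldest element and is remapped through the head cursor, while "no order preserving" reads the physical slot directly.

```python
import random


class RingBuffer:
    """Illustrative FIFO ring buffer (hypothetical, not the UniformReplayBuffer API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = [None] * capacity
        self.head = 0   # physical slot of the oldest element
        self.count = 0  # number of valid elements

    def insert(self, items):
        for item in items:
            # Write one past the newest element, wrapping around the end.
            slot = (self.head + self.count) % self.capacity
            self.data[slot] = item
            if self.count < self.capacity:
                self.count += 1
            else:
                # Full: the oldest element was just overwritten, so advance head.
                self.head = (self.head + 1) % self.capacity

    def get(self, index, order_preserving=True):
        if order_preserving:
            # Assumed meaning: index 0 is the oldest element, remapped via head.
            return self.data[(self.head + index) % self.capacity]
        # Assumed meaning: index is used as the physical slot directly.
        return self.data[index]

    def sample(self, batch_size, order_preserving=False):
        # Assumption: indexes are drawn uniformly with replacement.
        # Raises ValueError if the buffer is empty.
        indexes = [random.randrange(self.count) for _ in range(batch_size)]
        return [self.get(i, order_preserving) for i in indexes]


buf = RingBuffer(capacity=6)
buf.insert([0, 1])        # head 0, count 2
buf.insert([2, 3, 4, 5])  # queue is full: head 0, count 6
buf.insert([6, 7])        # overwrites the two oldest items: head 2, count 6
batch = buf.sample(3)     # three items drawn without index remapping
```

When the buffer is full, every slot holds valid data, so skipping the remap changes only which index maps to which item and saves one modulo operation per index. That is consistent with the patch text naming the non-order-preserving mode as the faster default.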