diff --git a/docs/programming_guide/source_en/initializer.md b/docs/programming_guide/source_en/initializer.md
index 9d6a454772287f5ea8c376b7427c3a286857b704..1abe645a546c2d7d3b12ca0c382a199f1f24bea5 100644
--- a/docs/programming_guide/source_en/initializer.md
+++ b/docs/programming_guide/source_en/initializer.md
@@ -1,38 +1,20 @@
-# Initialization of Network Parameters
+# Network parameter initialize

-Translator: [Karlos Ma](https://gitee.com/Mavendetta985)
-
-
-
-- [Initialization of Network Parameters](#initialization-of-network-parameters)
- - [Overview](#overview)
- - [Using Encapsulation Operator to Initialize Parameters](#using-encapsulation-operator-to-initialize-parameters)
- - [Character String](#character-string)
- - [Initializer Subclass](#initializer-subclass)
- - [The Custom of the Tensor](#the-custom-of-the-tensor)
- - [Using the Initializer Method to Initialize Parameters](#using-the-initializer-method-to-initialize-parameters)
- - [The Parameter of Init is Tensor](#the-parameter-of-init-is-tensor)
- - [The Parameter of Init is Str](#the-parameter-of-init-is-str)
- - [The Parameter of Init is the Subclass of Initializer](#the-parameter-of-init-is-the-subclass-of-initializer)
- - [Application in Parameter](#application-in-parameter)
-
-
-
-
+[![image0](https://gitee.com/mindspore/docs/raw/master/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/master/docs/programming_guide/source_zh_cn/initializer.ipynb) [![image1](https://gitee.com/mindspore/docs/raw/master/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/master/programming_guide/mindspore_initializer.ipynb) [![image2](https://gitee.com/mindspore/docs/raw/master/resource/_static/logo_modelarts.png)](https://authoring-modelarts-cnnorth4.huaweicloud.com/console/lab?share-url-b64=aHR0cHM6Ly9vYnMuZHVhbHN0YWNrLmNuLW5vcnRoLTQubXlodWF3ZWljbG91ZC5jb20vbWluZHNwb3JlLXdlYnNpdGUvbm90ZWJvb2svbW9kZWxhcnRzL3Byb2dyYW1taW5nX2d1aWRlL21pbmRzcG9yZV9pbml0aWFsaXplci5pcHluYg==&imagename=MindSpore1.1.1)

 ## Overview

-MindSpore provides a weight initialization module, which allows users to initialize network parameters by encapsulated operators and initializer methods to call strings, initializer subclasses, or custom Tensors. The Initializer class is the basic data structure used for initialization in MindSpore. Its subclasses contain several different types of data distribution (Zero, One, XavierUniform, Heuniform, Henormal, Constant, Uniform, Normal, TruncatedNormal). The following two parameter initialization modes, encapsulation operator and initializer method, are introduced in detail.
+MindSpore provides weight initialized module, the user can initialize the network parameters by encapsulating operator and the initializer method to call the string、 Initializer subclass or user-defined Tensor. Initializer class is the basic data structure used for initialization in MindSpore, its subclasses contain several different types of data distribution(Zero,One,XavierUniform,HeUniform,HeNormal,Constant,Uniform,Normal,TruncatedNormal) . The following will have a detailed introduction to the two parameter initialization modes of encapsulation operator and the initializer method.

-## Using Encapsulation Operator to Initialize Parameters
+## Initialize parameters by encapsulated operator

-Mindspore provides a variety of methods of initializing parameters, and encapsulates parameter initialization functions in some operators. This section will introduce the method of initialization of parameters by operators with parameter initialization function. Taking `Conv2D` operator as an example, it will introduce the initialization of parameters in the network by strings, `Initializer` subclass and custom `Tensor`, etc. `Normal`, a subclass of `Initializer`, is used in the following code examples and can be replaced with any of the subclasses of Initializer in the code examples.
+MindSpore provides a variety of parameter initialization methods, and encapsulates the function of parameter initialization in some operators. This section will introduce the method of initializing the parameters by the operator with parameter initialization function, taking the `Conv2d` operator as an example to introduce the initialization of the parameters in the network by string, `Initializer` subclass and user-defined `Tensor`, the following code examples all take the subclass `Normal` of `Initializer` as an example. In the code examples, `Normal` can be replaced with any of the subclasses of `Initializer`.

-### Character String
+### String

-Network parameters are initialized using a string. The contents of the string need to be consistent with the name of the `Initializer` subclass. Initialization using a string will use the default parameters in the `Initializer` subclass. For example, using the string `Normal` is equivalent to using the `Initializer` subclass `Normal()`. The code sample is as follows:
+Initialize the network parameters with string. The content of the string must be consistent with the name of the `Initializer` subclass. Initializing the network parameters with string method will use the default parameters in the `Initializer` subclass, for example, using `Normal` string is equivalent to using the subclass `Normal()` of `Initializer`, the code example is as follows:

-```python
+```Python
 import numpy as np
 import mindspore.nn as nn
 from mindspore import Tensor
@@ -44,9 +26,6 @@ input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
 net = nn.Conv2d(3, 64, 3, weight_init='Normal')
 output = net(input_data)
 print(output)
-```
-
-```text
 [[[[ 3.10382620e-02 4.38603461e-02 4.38603461e-02 ... 4.38603461e-02
 4.38603461e-02 1.38719045e-02]
 [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02
@@ -78,11 +57,11 @@ print(output)
 6.74417242e-05 -2.27325838e-02]]]]
 ```

-### Initializer Subclass
+### Initializer subclass

-`Initializer` subclass is used to initialize network parameters, which is similar to the effect of using string to initialize parameters. The difference is that using string to initialize parameters uses the default parameter of the `Initializer` subclass. If you want to use the parameters in the `Initializer` subclass, the `Initializer` subclass must be used to initialize the parameters. Taking `Normal(0.2)` as an example, the code sample is as follows:
+Initialize the network parameters by using the Initializer subclass is similar to the effect of using string to initialize the parameters. The difference is that using a string to initialize the parameters is to use the default parameters of the `Initializer` subclass. If you want to use the parameters of the `Initializer` subclass, you must initialize the parameters with the way of `Initializer` subclass. Taking `Normal(0.2)` as an example, the code sample is as follows:

-```python
+```Python
 import numpy as np
 import mindspore.nn as nn
 from mindspore import Tensor
@@ -95,9 +74,6 @@ input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
 net = nn.Conv2d(3, 64, 3, weight_init=Normal(0.2))
 output = net(input_data)
 print(output)
-```
-
-```text
 [[[[ 6.2076533e-01 8.7720710e-01 8.7720710e-01 ... 8.7720710e-01
 8.7720710e-01 2.7743810e-01]
 [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01
@@ -129,11 +105,11 @@ print(output)
 1.3488382e-03 -4.5465171e-01]]]]
 ```

-### The Custom of the Tensor
+### user-defined Tensor

-In addition to the above two initialization methods, when the network wants to use data types that are not available in MindSpore, users can customize `Tensor` to initialize the parameters. The code sample is as follows:
+In addition to the above two initialization methods, when the network uses data types that are not available in MindSpore to initialize the parameters, the user can initialize the parameters by customizing the `Tensor` method. The code sample is as follows:

-```python
+```Python
 import numpy as np
 import mindspore.nn as nn
 from mindspore import Tensor
@@ -144,9 +120,6 @@ input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
 net = nn.Conv2d(3, 64, 3, weight_init=weight)
 output = net(input_data)
 print(output)
-```
-
-```text
 [[[[12. 18. 18. ... 18. 18. 12.]
 [18. 27. 27. ... 27. 27. 18.]
 [18. 27. 27. ... 27. 27. 18.]
@@ -166,22 +139,23 @@ print(output)
 [12. 18. 18. ... 18. 18. 12.]]]]
 ```

-## Using the Initializer Method to Initialize Parameters
+## Initialize the parameter by using initializer method
+
+In the above code example, how to initialize parameters in the network is given. For example, the nn layer is used to encapsulate the `Conv2d` operator in the network, and the parameter `weight_init` is passed as the data type to be initialized into the `Conv2d` operator. The operator will complete the initialization of the parameters by calling the `Parameter` class during initialization, and then calling the `initializer` method encapsulated in the `Parameter` class. However, some operators do not internally encapsulate the function of parameter initialization like `Conv2d`. For example, the weight of the `Conv3d` operator is passed as a parameter to the `Conv3d` operator. At this time, you need to manually define the weight initialization.

-In the above code sample, the method of Parameter initialization in the network is given. For example, NN layer is used to encapsulate a `Conv2D` operator in the network, and the Parameter `weight_init` is passed into a `Conv2D` operator as the data type to be initialized. The operator will be initialized by calling `Parameter` class. Then the `initializer` method encapsulated in the `Parameter` class is called to initialize the parameters. However, some operators do not encapsulate the function of parameter initialization internally like `Conv2D`. For example, the weights of `Conv3D` operators are passed to `Conv3D` operators as parameters. In this case, it is necessary to manually define the initialization of weights.
+When initializing parameters, you can use the `initializer` method to call different data types in the `Initializer` subclass to initialize the parameters, and then generate different types of data.

-When initializing a parameter, you can use the `Initializer` method to initialize the parameter by calling different data types in the `Initializer` subclasses, resulting in different types of data.
+When using the initializer for parameter initialization, the supported parameters are `init`, `shape`, and `dtype`:

-When initializer is used for parameter initialization, the parameters passed in are `init`, `shape`, `dtype`:
-
-`init`: Supported subclasses of incoming `Tensor`, `STR`, `Subclass of Initializer`.
-
-`shape`: Supported subclasses of incoming `list`, `tuple`, `int`.
-
-`dtype`: Supported subclasses of incoming `mindspore.dtype`.
+- `init`: Support for passing in `Tensor`, `str`, subclasses of `Initializer`.
+- `shape`: Support for incoming `list`, `tuple`, `int`.
+- `dtype`: Support the incoming `mindspore.dtype`.

-### The Parameter of Init is Tensor
+### the parameter of init is Tensor

-The code sample is shown below:
+The code sample is as follows:

-```python
+```Python
 import numpy as np
 from mindspore import Tensor
 from mindspore import dtype as mstype
@@ -199,8 +173,9 @@ output = conv3d(input_data, weight)
 print(output)
 ```

-```text
 The output is as follows:
+
+```Python
 [[[[[108 108 108 ... 108 108 108]
 [108 108 108 ... 108 108 108]
 [108 108 108 ... 108 108 108]
@@ -218,11 +193,11 @@ The output is as follows:
 [108 108 108 ... 108 108 108]]]]]
 ```

-### The Parameter of Init is Str
+### the parameter of init is str

 The code sample is as follows:

-```python
+```Python
 import numpy as np
 from mindspore import Tensor
 from mindspore import dtype as mstype
@@ -239,8 +214,9 @@ output = conv3d(input_data, weight)
 print(output)
 ```

-```text
 The output is as follows:
+
+```Python
 [[[[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
@@ -258,11 +234,11 @@ The output is as follows:
 [0 0 0 ... 0 0 0]]]]]
 ```

-### The Parameter of Init is the Subclass of Initializer
+### the parameter of init is subclasses of Initializer

 The code sample is as follows:

-```python
+```Python
 import numpy as np
 from mindspore import Tensor
 from mindspore import dtype as mstype
@@ -277,9 +253,6 @@ weight = initializer(Normal(0.2), shape=[32, 3, 4, 3, 3], dtype=mstype.float32)
 conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3))
 output = conv3d(input_data, weight)
 print(output)
-```
-
-```text
 [[[[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]
@@ -301,7 +274,7 @@ print(output)

 The code sample is as follows:

-```python
+```Python
 import numpy as np
 from mindspore import dtype as mstype
 from mindspore.common import set_seed
@@ -318,9 +291,6 @@ net = ops.Add()
 output = net(input_data, weight1)
 output = net(output, weight2)
 print(output)
-```
-
-```text
 [[-0.3305102 1.0412874 2.0412874 3.0412874]
 [ 4.0412874 4.9479127 5.9479127 6.9479127]
 [ 7.947912 9.063009 10.063009 11.063009 ]