diff --git a/docs/programming_guide/source_en/initializer.md b/docs/programming_guide/source_en/initializer.md index cd934f345ffd31fe2502207806817a392642f10c..6fc32f1a5049f82e263c2da851a330429f05f667 100644 --- a/docs/programming_guide/source_en/initializer.md +++ b/docs/programming_guide/source_en/initializer.md @@ -1,5 +1,314 @@ -# Initialization of Network Parameters +# Initialization of Network Parameters -No English version right now, welcome to contribute. +[![](https://gitee.com/mindspore/docs/raw/master/resource/_static/logo_source.png)](https://gitee.com/mindspore/docs/blob/master/docs/programming_guide/source_zh_cn/initializer.ipynb) [![](https://gitee.com/mindspore/docs/raw/master/resource/_static/logo_notebook.png)](https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/master/programming_guide/mindspore_initializer.ipynb) [![](https://gitee.com/mindspore/docs/raw/master/resource/_static/logo_modelarts.png)](https://authoring-modelarts-cnnorth4.huaweicloud.com/console/lab?share-url-b64=aHR0cHM6Ly9vYnMuZHVhbHN0YWNrLmNuLW5vcnRoLTQubXlodWF3ZWljbG91ZC5jb20vbWluZHNwb3JlLXdlYnNpdGUvbm90ZWJvb2svbW9kZWxhcnRzL3Byb2dyYW1taW5nX2d1aWRlL21pbmRzcG9yZV9pbml0aWFsaXplci5pcHluYg==&imagename=MindSpore1.1.1) - \ No newline at end of file + +# Overview + + +MindSpore provides a weight initialization module. Users can use encapsulation operators and initializer methods to call strings, Initializer subclasses, or custom Tensors to initialize network parameters. The Initializer class is the basic data structure used for initialization in MindSpore. Its subclasses contain several different types of data distribution (Zero, One, XavierUniform, HeUniform, HeNormal, Constant, Uniform, Normal, TruncatedNormal). The following is a detailed introduction to the two parameter initialization modes of the encapsulation operator and the initializer method. + +## Use packing operator to initialize parameters + +MindSpore provides a variety of parameter initialization methods, and encapsulates the function of parameter initialization in some operators. This section will introduce the method of initializing the parameters by the operator with parameter initialization function. Take the `Conv2d` operator as an example to introduce the initialization of the parameters in the network using strings, subclasses of `Initializer` and custom `Tensor`, etc. The following code examples all take the subclass `Normal` of `Initializer` as an example. In the code examples, `Normal` can be replaced with any of the subclasses of `Initializer`. + + +### String + + +Use string to initialize the network parameters. The content of the string needs to be consistent with the name of the `Initializer` subclass. Initialization using the string method will use the default parameters in the `Initializer` subclass. For example, using the string `Normal` is equivalent to using the subclass `Normal()` of `Initializer`. The code sample is as follows: + + +```python +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore.common import set_seed + +set_seed(1) + +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init='Normal') +output = net(input_data) +print(output) +``` +```text + [[[[ 3.10382620e-02 4.38603461e-02 4.38603461e-02 ... 4.38603461e-02 + 4.38603461e-02 1.38719045e-02] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + ... + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 9.66199022e-03 1.24104535e-02 1.24104535e-02 ... 1.24104535e-02 + 1.24104535e-02 -1.38977719e-02]] + + ... + + [[ 3.98553275e-02 -1.35465711e-03 -1.35465711e-03 ... -1.35465711e-03 + -1.35465711e-03 -1.00310734e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + ... + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 1.33139016e-02 6.74417242e-05 6.74417242e-05 ... 6.74417242e-05 + 6.74417242e-05 -2.27325838e-02]]]] +``` + +### Initializer subclass + + +Using the `Initializer` subclass to initialize the network parameters is similar to the effect of using a string to initialize the parameters. The difference is that using a string to initialize the parameters is to use the default parameters of the `Initializer` subclass. If you want to use the parameters in the `Initializer` subclass, you must use the `Initializer` subclass to initialize the parameters. Take `Normal(0.2)` as an example, the code sample is as follows: + + +```python +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore.common import set_seed +from mindspore.common.initializer import Normal + +set_seed(1) + +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init=Normal(0.2)) +output = net(input_data) +print(output) +``` +```text + [[[[ 6.2076533e-01 8.7720710e-01 8.7720710e-01 ... 8.7720710e-01 + 8.7720710e-01 2.7743810e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + ... + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 1.9323981e-01 2.4820906e-01 2.4820906e-01 ... 2.4820906e-01 + 2.4820906e-01 -2.7795550e-01]] + + ... + + [[ 7.9710668e-01 -2.7093157e-02 -2.7093157e-02 ... -2.7093157e-02 + -2.7093157e-02 -2.0062150e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + ... + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 2.6627803e-01 1.3488382e-03 1.3488382e-03 ... 1.3488382e-03 + 1.3488382e-03 -4.5465171e-01]]]] +``` + + +### Custom Tensor + + +In addition to the above two initialization methods, when the network uses data types that are not available in MindSpore to initialize the parameters, the user can initialize the parameters by customizing the `Tensor` method. The code sample is as follows: + + +```python +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore import dtype as mstype + +weight = Tensor(np.ones([64, 3, 3, 3]), dtype=mstype.float32) +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init=weight) +output = net(input_data) +print(output) +``` +```text + [[[[12. 18. 18. ... 18. 18. 12.] + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + ... + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + [12. 18. 18. ... 18. 18. 12.]] + + ... + + [[12. 18. 18. ... 18. 18. 12.] + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + ... + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + [12. 18. 18. ... 18. 18. 12.]]]] +``` + +## Use the initializer method to initialize the parameters + +In the above code sample, how to initialize the parameters in the network is given. For example, the nn layer is used to encapsulate the `Conv2d` operator in the network, and the parameter `weight_init` is passed into the `Conv2d` operator as the data type to be initialized. The operator will call the `Parameter` class during initialization, and then call the package in the `initializer` method in the `Parameter` class completes the initialization of the parameters. For example, the nn layer is used to encapsulate the `Conv2d` operator in the network, and the parameter `weight_init` is passed into the `Conv2d` operator as the data type to be initialized. The operator will call the `Parameter` class during initialization, and then call the package in the `initializer` method in the `Parameter` class to complete the initialization of the parameters. +When initializing parameters, you can use the `initializer` method to call different data types in the `Initializer` subclass to initialize the parameters, and then generate different types of data. + +When using the initializer for parameter initialization, the supported parameters are `init`, `shape`, and `dtype`: + +- `init`:Supports passing in `Tensor`, `str`, subclasses of `Initializer`. + +- `shape`:Support incoming `list`, `tuple`, `int`. + +- `dtype`:Support to import `mindspore.dtype`. + +### The init parameter is Tensor + +The code sample is as follows: + +```python +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.common.initializer import initializer +from mindspore.ops.operations import nn_ops as nps + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight_init = Tensor(np.ones([32, 3, 4, 3, 3]), dtype=mstype.float32) +weight = initializer(weight_init, shape=[32, 3, 4, 3, 3]) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) +``` + +The output is as follows: + +```text +[[[[[108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + ... + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108]] + ... + [[108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + ... + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108]]]]] +``` + +### The init parameter is str + +The code sample is as follows: + +```python +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.common.initializer import initializer +from mindspore.ops.operations import nn_ops as nps + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight = initializer('Normal', shape=[32, 3, 4, 3, 3], dtype=mstype.float32) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) +``` + +The output is as follows: + +```text +[[[[[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]]]]] +``` + +### The init parameter is a subclass of Initializer + +The code sample is as follows: + +```python +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.ops.operations import nn_ops as nps +from mindspore.common.initializer import Normal, initializer + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight = initializer(Normal(0.2), shape=[32, 3, 4, 3, 3], dtype=mstype.float32) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) +``` + +```text +[[[[[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]]]]] +``` + +### Application in Parameter + +The code sample is as follows: + +```python +import numpy as np +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.ops import operations as ops +from mindspore import Tensor, Parameter, context +from mindspore.common.initializer import Normal, initializer + +set_seed(1) + +weight1 = Parameter(initializer('Normal', [5, 4], mstype.float32), name="w1") +weight2 = Parameter(initializer(Normal(0.2), [5, 4], mstype.float32), name="w2") +input_data = Tensor(np.arange(20).reshape(5, 4), dtype=mstype.float32) +net = ops.Add() +output = net(input_data, weight1) +output = net(output, weight2) +print(output) +```