diff --git a/docs/programming_guide/source_en/initializer.md b/docs/programming_guide/source_en/initializer.md
index cd934f345ffd31fe2502207806817a392642f10c..4d1e5866d21cd61df0b23d6c2a01d667beca5a99 100644
--- a/docs/programming_guide/source_en/initializer.md
+++ b/docs/programming_guide/source_en/initializer.md
@@ -1,5 +1,308 @@
-# Initialization of Network Parameters
+# Initialization of Network Parameters
-No English version right now, welcome to contribute.
+
+## Overview
-
\ No newline at end of file
+
+MindSpore provides a weight initialization module that allows users to initialize network parameters through encapsulated operators or the `initializer` method, by passing in a string, an `Initializer` subclass, or a custom `Tensor`. The `Initializer` class is the basic data structure used for initialization in MindSpore; its subclasses implement several data distributions (`Zero`, `One`, `XavierUniform`, `HeUniform`, `HeNormal`, `Constant`, `Uniform`, `Normal`, `TruncatedNormal`). The two parameter initialization modes, encapsulated operators and the `initializer` method, are introduced in detail below.
+
+## Initializing Parameters with an Encapsulated Operator
+
+MindSpore provides a variety of parameter initialization methods and encapsulates the parameter initialization capability in some operators. This section describes how operators that carry this capability initialize parameters. Taking the `Conv2d` operator as an example, it shows how to initialize the parameters in a network with a string, an `Initializer` subclass, or a custom `Tensor`. The following code examples use the `Initializer` subclass `Normal`; `Normal` can be replaced by any other `Initializer` subclass.
+
+### String
+
+Network parameters can be initialized with a string. The string must match the name of an `Initializer` subclass.
+When a string is used, initialization falls back on the default parameters of the corresponding `Initializer` subclass. For example, using the string `'Normal'` is equivalent to using the `Initializer` subclass `Normal()`. The code sample is as follows:
+
+```python
+import numpy as np
+import mindspore.nn as nn
+from mindspore import Tensor
+from mindspore.common import set_seed
+
+set_seed(1)
+
+input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
+net = nn.Conv2d(3, 64, 3, weight_init='Normal')
+output = net(input_data)
+print(output)
+```
+
+### Initializer subclass
+
+Initializing network parameters with an `Initializer` subclass is similar to using a string; the difference is that a string always applies the subclass's default parameters. To set the parameters of an `Initializer` subclass yourself, you must pass an instance of the subclass. Take `Normal(0.2)` as an example. The code sample is as follows:
+
+```python
+import numpy as np
+import mindspore.nn as nn
+from mindspore import Tensor
+from mindspore.common import set_seed
+from mindspore.common.initializer import Normal
+
+set_seed(1)
+
+input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
+net = nn.Conv2d(3, 64, 3, weight_init=Normal(0.2))
+output = net(input_data)
+print(output)
+```
+
+The output is as follows:
+
+```text
+[[[[ 6.2076533e-01  8.7720710e-01  8.7720710e-01 ...  8.7720710e-01
+     8.7720710e-01  2.7743810e-01]
+   [ 6.5210247e-01  7.0859784e-01  7.0859784e-01 ...  7.0859784e-01
+     7.0859784e-01 -1.1080378e-01]
+   [ 6.5210247e-01  7.0859784e-01  7.0859784e-01 ...  7.0859784e-01
+     7.0859784e-01 -1.1080378e-01]
+   ...
+   [ 6.5210247e-01  7.0859784e-01  7.0859784e-01 ...  7.0859784e-01
+     7.0859784e-01 -1.1080378e-01]
+   [ 6.5210247e-01  7.0859784e-01  7.0859784e-01 ...  7.0859784e-01
+     7.0859784e-01 -1.1080378e-01]
+   [ 1.9323981e-01  2.4820906e-01  2.4820906e-01 ...  2.4820906e-01
+     2.4820906e-01 -2.7795550e-01]]
+
+  ...
+
+  [[ 7.9710668e-01 -2.7093157e-02 -2.7093157e-02 ... -2.7093157e-02
+    -2.7093157e-02 -2.0062150e-01]
+   [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01
+    -7.2153252e-01 -5.9123868e-01]
+   [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01
+    -7.2153252e-01 -5.9123868e-01]
+   ...
+   [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01
+    -7.2153252e-01 -5.9123868e-01]
+   [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01
+    -7.2153252e-01 -5.9123868e-01]
+   [ 2.6627803e-01  1.3488382e-03  1.3488382e-03 ...  1.3488382e-03
+     1.3488382e-03 -4.5465171e-01]]]]
+```
+
+### Custom Tensor
+
+In addition to the two methods above, when a network needs initialization data of a kind that MindSpore does not provide, users can initialize parameters with a custom `Tensor`. The code sample is as follows:
+
+```python
+import numpy as np
+import mindspore.nn as nn
+from mindspore import Tensor
+from mindspore import dtype as mstype
+
+weight = Tensor(np.ones([64, 3, 3, 3]), dtype=mstype.float32)
+input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32))
+net = nn.Conv2d(3, 64, 3, weight_init=weight)
+output = net(input_data)
+print(output)
+```
+
+The output is as follows:
+
+```text
+[[[[12. 18. 18. ... 18. 18. 12.]
+   [18. 27. 27. ... 27. 27. 18.]
+   [18. 27. 27. ... 27. 27. 18.]
+   ...
+   [18. 27. 27. ... 27. 27. 18.]
+   [18. 27. 27. ... 27. 27. 18.]
+   [12. 18. 18. ... 18. 18. 12.]]
+
+  ...
+
+  [[12. 18. 18. ... 18. 18. 12.]
+   [18. 27. 27. ... 27. 27. 18.]
+   [18. 27. 27. ... 27. 27. 18.]
+   ...
+   [18. 27. 27. ... 27. 27. 18.]
+   [18. 27. 27. ... 27. 27. 18.]
+   [12. 18. 18. ... 18. 18. 12.]]]]
+```
+
+## Initializing Parameters with the initializer Method
+
+The code samples above show how parameters are initialized inside a network.
+For example, the network uses the nn layer to encapsulate the `Conv2d` operator, and the argument `weight_init` is passed into the `Conv2d` operator as the data to be initialized. When the operator is constructed, it calls the `Parameter` class, which in turn calls the `initializer` method encapsulated in `Parameter` to complete the initialization. However, some operators do not encapsulate parameter initialization internally the way `Conv2d` does. For example, the weight of the `Conv3D` operator is passed into the operator as an input, and in that case the initialization of the weight has to be defined manually.
+
+For this, the `initializer` method can be used to initialize a parameter, where calling different `Initializer` subclasses produces different types of data.
+
+When using `initializer` for parameter initialization, the supported arguments are `init`, `shape`, and `dtype`:
+
+- `init`: supports passing in a `Tensor`, a `str`, or a subclass of `Initializer`.
+
+- `shape`: supports passing in a `list`, a `tuple`, or an `int`.
+
+- `dtype`: supports passing in `mindspore.dtype`.
+
+### The init parameter is a Tensor
+
+The code sample is as follows:
+
+```python
+import numpy as np
+from mindspore import Tensor
+from mindspore import dtype as mstype
+from mindspore.common import set_seed
+from mindspore.common.initializer import initializer
+from mindspore.ops.operations import nn_ops as nps
+
+set_seed(1)
+
+input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32)
+weight_init = Tensor(np.ones([32, 3, 4, 3, 3]), dtype=mstype.float32)
+weight = initializer(weight_init, shape=[32, 3, 4, 3, 3])
+conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3))
+output = conv3d(input_data, weight)
+print(output)
+```
+
+The output is as follows:
+
+```text
+[[[[[108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]
+    ...
+    [108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]]
+   ...
+   [[108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]
+    ...
+    [108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]
+    [108 108 108 ... 108 108 108]]]]]
+```
+
+### The init parameter is a str
+
+The code sample is as follows:
+
+```python
+import numpy as np
+from mindspore import Tensor
+from mindspore import dtype as mstype
+from mindspore.common import set_seed
+from mindspore.common.initializer import initializer
+from mindspore.ops.operations import nn_ops as nps
+
+set_seed(1)
+
+input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32)
+weight = initializer('Normal', shape=[32, 3, 4, 3, 3], dtype=mstype.float32)
+conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3))
+output = conv3d(input_data, weight)
+print(output)
+```
+
+The output is as follows:
+
+```text
+[[[[[0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]
+    ...
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]
+   ...
+   [[0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]
+    ...
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]]]]
+```
+
+### The init parameter is a subclass of Initializer
+
+The code sample is as follows:
+
+```python
+import numpy as np
+from mindspore import Tensor
+from mindspore import dtype as mstype
+from mindspore.common import set_seed
+from mindspore.ops.operations import nn_ops as nps
+from mindspore.common.initializer import Normal, initializer
+
+set_seed(1)
+
+input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32)
+weight = initializer(Normal(0.2), shape=[32, 3, 4, 3, 3], dtype=mstype.float32)
+conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3))
+output = conv3d(input_data, weight)
+print(output)
+```
+
+The output is as follows:
+
+```text
+[[[[[0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]
+    ...
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]
+   ...
+   [[0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]
+    ...
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]
+    [0 0 0 ... 0 0 0]]]]]
+```
+
+### Application in Parameter
+
+The `initializer` method can also be used when constructing a `Parameter` directly. The code sample is as follows:
+
+```python
+import numpy as np
+from mindspore import dtype as mstype
+from mindspore.common import set_seed
+from mindspore.ops import operations as ops
+from mindspore import Tensor, Parameter, context
+from mindspore.common.initializer import Normal, initializer
+
+set_seed(1)
+
+weight1 = Parameter(initializer('Normal', [5, 4], mstype.float32), name="w1")
+weight2 = Parameter(initializer(Normal(0.2), [5, 4], mstype.float32), name="w2")
+input_data = Tensor(np.arange(20).reshape(5, 4), dtype=mstype.float32)
+net = ops.Add()
+output = net(input_data, weight1)
+output = net(output, weight2)
+print(output)
+```
+
+The output is as follows:
+
+```text
+[[-0.3305102  1.0412874  2.0412874  3.0412874]
+ [ 4.0412874  4.9479127  5.9479127  6.9479127]
+ [ 7.947912   9.063009  10.063009  11.063009 ]
+ [12.063009  13.536987  14.536987  14.857441 ]
+ [15.751231  17.073082  17.808317  19.364822 ]]
+```
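
As a side note on what a distribution such as `XavierUniform` computes, the following NumPy sketch illustrates the standard Xavier/Glorot uniform rule: sample from `U(-bound, bound)` with `bound = gain * sqrt(6 / (fan_in + fan_out))`, where convolution kernels fold the receptive-field size into the fans. This is an illustrative sketch, not MindSpore code; the helper name `xavier_uniform` and its exact fan computation are assumptions for illustration and may differ in detail from MindSpore's implementation.

```python
import numpy as np

def xavier_uniform(shape, gain=1.0, seed=1):
    """Illustrative Xavier/Glorot uniform initializer (hypothetical helper).

    Samples from U(-bound, bound) with
    bound = gain * sqrt(6 / (fan_in + fan_out)).
    """
    # For a conv kernel (out_ch, in_ch, *kernel_dims), the fans
    # include the receptive-field size; for 2-D weights it is 1.
    receptive = int(np.prod(shape[2:])) if len(shape) > 2 else 1
    fan_in = shape[1] * receptive
    fan_out = shape[0] * receptive
    bound = gain * np.sqrt(6.0 / (fan_in + fan_out))
    rng = np.random.default_rng(seed)
    return rng.uniform(-bound, bound, size=shape).astype(np.float32)

# Same shape as the Conv2d weight used above: 64 output channels,
# 3 input channels, 3x3 kernel.
w = xavier_uniform([64, 3, 3, 3])
print(w.shape)  # (64, 3, 3, 3)
```

Every sample is guaranteed to lie strictly inside the computed bound, which keeps the initial activations' variance roughly constant across layers.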