diff --git a/docs/programming_guide/source_en/initializer.md b/docs/programming_guide/source_en/initializer.md index cd934f345ffd31fe2502207806817a392642f10c..3a1531fa40ba95959c37acfeef0d3d7fc7431d91 100644 --- a/docs/programming_guide/source_en/initializer.md +++ b/docs/programming_guide/source_en/initializer.md @@ -1,5 +1,324 @@ -# Initialization of Network Parameters - -No English version right now, welcome to contribute. - - \ No newline at end of file +# Initialization of Network Parameters + +[toc] + +## Overview + +MindSpore provides a weight initialization module. Users can initialize the network parameters by calling string, Initializer subclass, or custom Tensor and so on by encapsulate operators and Initializer methods. The Initializer class is the basic data structure used for initialization in MindSpore. Its subclass included several different types of data distributions (Zero, One, XavierUniform, HeUniform, HeNormal, Constant, Uniform, Normal, TruncatedNormal). There is a detailed introduction for the two parameter initialization modes of the encapsulation operators and the Initializer methods as follows. + +## Using the encapsulation operators to initialize the parameters + +MindSpore provides several ways to initialize the parameters and encapsulates the function of parameter initialization in some operators. This section will introduce the methods of initializing parameters by the operators which have the function of parameter initialization and take `Conv2d` as an example to introduce the methods of initializing parameters in the network by calling string, `Initializer` subclass and custom Tensor. The following code samples all take the `Normal` which is the subclass of `Initializer` as an example and the `Normal` can be replaced with any one of the subclass of `Initializer` in the code samples. + +### String + +The content of string should be consistent with the name of the `Initializer` subclass when using the string to initialize the network parameters. Using the string to initialize the network parameters will use the default parameters in the `Initializer` subclass. For example, using the string `Normal` is equivalent to using the subclass of `Initializer` `Normal()`. The code sample is as follows: + +```python +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore.common import set_seed + +set_seed(1) + +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init='Normal') +output = net(input_data) +print(output) +``` + +The output is as follows: + +```text +[[[[ 3.10382620e-02 4.38603461e-02 4.38603461e-02 ... 4.38603461e-02 + 4.38603461e-02 1.38719045e-02] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + ... + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 9.66199022e-03 1.24104535e-02 1.24104535e-02 ... 1.24104535e-02 + 1.24104535e-02 -1.38977719e-02]] + + ... + + [[ 3.98553275e-02 -1.35465711e-03 -1.35465711e-03 ... -1.35465711e-03 + -1.35465711e-03 -1.00310734e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + ... + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 1.33139016e-02 6.74417242e-05 6.74417242e-05 ... 6.74417242e-05 + 6.74417242e-05 -2.27325838e-02]]]] +``` + +### Initializer subclass + +The effect of initializing the network parameters with `Initializer` subclass is similar to using string to initialize the parameters. However, using string to initialize the parameter is to use the default parameters of the `Initializer` subclass which is different from using string to initialize the parameters. Users must use the `Initializer` subclass to initialize the parameters if using the parameters in the `Initializer` subclass. Taking `Normal(0.2)` as an example, the code sample is as follows: + +```python +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore.common import set_seed +from mindspore.common.initializer import Normal + +set_seed(1) + +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init=Normal(0.2)) +output = net(input_data) +print(output) +``` + +The output is as follows: + +```text +[[[[ 6.2076533e-01 8.7720710e-01 8.7720710e-01 ... 8.7720710e-01 + 8.7720710e-01 2.7743810e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + ... + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 1.9323981e-01 2.4820906e-01 2.4820906e-01 ... 2.4820906e-01 + 2.4820906e-01 -2.7795550e-01]] + + ... + + [[ 7.9710668e-01 -2.7093157e-02 -2.7093157e-02 ... -2.7093157e-02 + -2.7093157e-02 -2.0062150e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + ... + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 2.6627803e-01 1.3488382e-03 1.3488382e-03 ... 1.3488382e-03 + 1.3488382e-03 -4.5465171e-01]]]] +``` + +### Customized Tensor + +In addition to the above two initialization methods, users can initialize the parameters by customized `Tensor` if the network needs to initialize the parameters with data types which are not available in MindSpore. The code sample is as follows: + +```python +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore import dtype as mstype + +weight = Tensor(np.ones([64, 3, 3, 3]), dtype=mstype.float32) +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init=weight) +output = net(input_data) +print(output) +``` + +The output is as follows: + +```text +[[[[12. 18. 18. ... 18. 18. 12.] + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + ... + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + [12. 18. 18. ... 18. 18. 12.]] + + ... + + [[12. 18. 18. ... 18. 18. 12.] + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + ... + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + [12. 18. 18. ... 18. 18. 12.]]]] +``` + +## Using the Initializer methods to initialize the parameters + +In the above code samples, how to initialize the parameters in the network is given. For example, the parameter `weight_init` is passed to the `Conv2d` operator as the data type to be initialized while using nn layer to encapsulate `Conv2d` operator in the network. The operator will initialize the parameters by calling `Parameter` class and then call the `initializer` method encapsulated in the `Parameter` class. However, some operators do not internally encapsulate the function of parameter initialization like `Conv2d`. For example, the weight of the `Conv3d` operator is passed into the `Conv3d` operator as a parameter. At this time, users need to manually customize the initialization of the weight. + +Users can call different data types in the `Intializer` subclass to initialize the parameters by using `Initializer` methods and then generate different types of data when initializing the parameters. + +The supported parameters are `init`, `shape` and `dtype` when using `Initializer` to initialize the parameters. + +- `init`: `init` supports subclass of `Tensor`, `str` and `Initializer`. +- `shape`: `shape` supports `list`, `tuple` and `int`. +- `dtype`: `dtype` supports `mindspore.dtype`. + +### The init parameter is Tensor + +The code sample is as follows: + +```python +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.common.initializer import initializer +from mindspore.ops.operations import nn_ops as nps + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight_init = Tensor(np.ones([32, 3, 4, 3, 3]), dtype=mstype.float32) +weight = initializer(weight_init, shape=[32, 3, 4, 3, 3]) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) +``` + +The output is as follows: + +```text +[[[[[108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + ... + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108]] + ... + [[108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + ... + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108]]]]] +``` + +### The init parameter is str + +The code sample is as follows: + +```python +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.common.initializer import initializer +from mindspore.ops.operations import nn_ops as nps + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight = initializer('Normal', shape=[32, 3, 4, 3, 3], dtype=mstype.float32) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) +``` + +The output is as follows: + +```text +[[[[[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]]]]] +``` + +### The init parameter is Initializer subclass + +The code sample is as follows: + +```python +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.ops.operations import nn_ops as nps +from mindspore.common.initializer import Normal, initializer + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight = initializer(Normal(0.2), shape=[32, 3, 4, 3, 3], dtype=mstype.float32) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) +``` + +The output is as follows: + +```text +[[[[[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]]]]] +``` + +### Application in Parameter + +The code sample is as follows: + +```python +import numpy as np +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.ops import operations as ops +from mindspore import Tensor, Parameter, context +from mindspore.common.initializer import Normal, initializer + +set_seed(1) + +weight1 = Parameter(initializer('Normal', [5, 4], mstype.float32), name="w1") +weight2 = Parameter(initializer(Normal(0.2), [5, 4], mstype.float32), name="w2") +input_data = Tensor(np.arange(20).reshape(5, 4), dtype=mstype.float32) +net = ops.Add() +output = net(input_data, weight1) +output = net(output, weight2) +print(output) +``` + +The output is as follows: + +```text +[[-0.3305102 1.0412874 2.0412874 3.0412874] + [ 4.0412874 4.9479127 5.9479127 6.9479127] + [ 7.947912 9.063009 10.063009 11.063009 ] + [12.063009 13.536987 14.536987 14.857441 ] + [15.751231 17.073082 17.808317 19.364822 ]] +``` \ No newline at end of file