diff --git a/docs/programming_guide/source_en/initializer.md b/docs/programming_guide/source_en/initializer.md index cd934f345ffd31fe2502207806817a392642f10c..d047cfe2193bc80348e98bb689ef62438e5033b0 100644 --- a/docs/programming_guide/source_en/initializer.md +++ b/docs/programming_guide/source_en/initializer.md @@ -1,5 +1,339 @@ # Initialization of Network Parameters -No English version right now, welcome to contribute. +# Overview - \ No newline at end of file +MindSpore provides a module called weight initialization. Users can use string, subclass called Initializer or custom Tensors by encapsulating operator and the method called initializer to finish the Initialization of Network Parameters. The class called Initializer is the base data struct to be used for initialization in MindSpore, whose subclasses contain several different types of data distribution(Zero,One, XavierUniform,HeUniform,HeNormal, Constant,Uniform,Normal,TruncatedNormal). The following, we will introduce the Initialization of Network Parameters of the encapsulation operator and the method called initializer. + +# Using the encapsulation operator to initialize parameters + +MindSpore provides a lot of ways to initialize parameters, and encapsulates the function of parameter initialization in some operators. In this part, we will introduce the method of initializing the parameters by the operator with parameter initialization function. We will take Conv2d as an example to initialize the parameters in the network with strings, the subclasses called Initializer, and custom Tensors. The following code examples all take the subclass called Normal of Initializer as an example. In the code examples, the subclass called normal can be replaced with any one of the subclasses of Initializer. + +# String + +If we use string to initialize Network Parameters, the content of the string needs to be consistent with the name of the subclass called Initializer. +We will use the default parameters in the subclass called Initializer to initialize by string way. For example, if you use the string called normal and you also use the subclass called normal() of Initializer. You can see the following code examples. + +```python + +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore.common import set_seed + +set_seed(1) + +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init='Normal') +output = net(input_data) +print(output) + +``` + +```python + +[[[[ 3.10382620e-02 4.38603461e-02 4.38603461e-02 ... 4.38603461e-02 + 4.38603461e-02 1.38719045e-02] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + ... + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 3.26051228e-02 3.54298912e-02 3.54298912e-02 ... 3.54298912e-02 + 3.54298912e-02 -5.54019120e-03] + [ 9.66199022e-03 1.24104535e-02 1.24104535e-02 ... 1.24104535e-02 + 1.24104535e-02 -1.38977719e-02]] + + ... + + [[ 3.98553275e-02 -1.35465711e-03 -1.35465711e-03 ... -1.35465711e-03 + -1.35465711e-03 -1.00310734e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + ... + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 4.38403059e-03 -3.60766202e-02 -3.60766202e-02 ... -3.60766202e-02 + -3.60766202e-02 -2.95619294e-02] + [ 1.33139016e-02 6.74417242e-05 6.74417242e-05 ... 6.74417242e-05 + 6.74417242e-05 -2.27325838e-02]]]] + +``` + +# The subclass called Initializer + +Using subclasses called Initializer to initialize the Network Parameters is similar to using strings to initialize parameters. The difference is that using strings for Parameter Initialization uses the default parameters of the subclass called Initializer. If you want to use the parameters in the subclass called Initializer, you must use the subclass called Initializer to initialize the parameters. We will take Normal(0.2) as an example and you can see the following code example. + +```python + +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore.common import set_seed +from mindspore.common.initializer import Normal + +set_seed(1) + +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init=Normal(0.2)) +output = net(input_data) +print(output) + +``` + +```python + +[[[[ 6.2076533e-01 8.7720710e-01 8.7720710e-01 ... 8.7720710e-01 + 8.7720710e-01 2.7743810e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + ... + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 6.5210247e-01 7.0859784e-01 7.0859784e-01 ... 7.0859784e-01 + 7.0859784e-01 -1.1080378e-01] + [ 1.9323981e-01 2.4820906e-01 2.4820906e-01 ... 2.4820906e-01 + 2.4820906e-01 -2.7795550e-01]] + + ... + + [[ 7.9710668e-01 -2.7093157e-02 -2.7093157e-02 ... -2.7093157e-02 + -2.7093157e-02 -2.0062150e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + ... + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 8.7680638e-02 -7.2153252e-01 -7.2153252e-01 ... -7.2153252e-01 + -7.2153252e-01 -5.9123868e-01] + [ 2.6627803e-01 1.3488382e-03 1.3488382e-03 ... 1.3488382e-03 + 1.3488382e-03 -4.5465171e-01]]]] + +``` + +# Custom Tensor + +In addition to the above two initialization methods, when network uses the data types which are not available in MindSpore to initialize the parameters, the user can initialize the parameters by customizing the Tensor. We can see the following code example. + +```python + +import numpy as np +import mindspore.nn as nn +from mindspore import Tensor +from mindspore import dtype as mstype + +weight = Tensor(np.ones([64, 3, 3, 3]), dtype=mstype.float32) +input_data = Tensor(np.ones([1, 3, 16, 50], dtype=np.float32)) +net = nn.Conv2d(3, 64, 3, weight_init=weight) +output = net(input_data) +print(output) + +``` + +```python + +[[[[12. 18. 18. ... 18. 18. 12.] + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + ... + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + [12. 18. 18. ... 18. 18. 12.]] + + ... + + [[12. 18. 18. ... 18. 18. 12.] + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + ... + [18. 27. 27. ... 27. 27. 18.] + [18. 27. 27. ... 27. 27. 18.] + [12. 18. 18. ... 18. 18. 12.]]]] +``` + +# Using the initializer method to initialize the parameters + +In the above code sample,we have given the method of how to initialize the parameters in the network. For example, if we use the nn layer to encapsulate the Conv2d operator in the network, the parameter weight_init is passed to the Conv2d operator as the data type to be initialized which will call the Parameter class during initialization, and then call the initializer method encapsulated in the Parameter class to complete the Parameter Initialization. However, some operators don't encapsulate the function of Parameter Initialization like Conv2d. For example, the weight of the Conv3d operator is passed into the Conv3d operator as a parameter, and the initialization of the weight needs to be manually defined at this time.When you initialize parameters, you can use the initializer method to call different data types in the subclass called Initializer to initialize the parameters, and then generate the different types of data. When you use the initializer for Parameter Initialization, the supported parameters are init, shape, and dtype. +init: Supporting the subclasses of Tensor, str, and Initializer. +shape: Supporting the incoming list, tuple, int. +dtype: Supporting the incoming mindspore.dtype. + +## The init parameter is Tensor + +we can see the following code example: + +```python + +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.common.initializer import initializer +from mindspore.ops.operations import nn_ops as nps + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight_init = Tensor(np.ones([32, 3, 4, 3, 3]), dtype=mstype.float32) +weight = initializer(weight_init, shape=[32, 3, 4, 3, 3]) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) + +``` + +The output is + +```python + +[[[[[108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + ... + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108]] + ... + [[108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + ... + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108] + [108 108 108 ... 108 108 108]]]]] + +``` + +## The init parameter is str + +we can see the following code example: + +```python + +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.common.initializer import initializer +from mindspore.ops.operations import nn_ops as nps + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight = initializer('Normal', shape=[32, 3, 4, 3, 3], dtype=mstype.float32) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) + +``` + +The output is + +```python + +[[[[[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]]]]] + +``` + +## The init parameter is subclass called Initializer + +we can see the following code example: + +```python + +import numpy as np +from mindspore import Tensor +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.ops.operations import nn_ops as nps +from mindspore.common.initializer import Normal, initializer + +set_seed(1) + +input_data = Tensor(np.ones([16, 3, 10, 32, 32]), dtype=mstype.float32) +weight = initializer(Normal(0.2), shape=[32, 3, 4, 3, 3], dtype=mstype.float32) +conv3d = nps.Conv3D(out_channel=32, kernel_size=(4, 3, 3)) +output = conv3d(input_data, weight) +print(output) + +``` + +The output is + +```python + +[[[[[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [[0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]] + ... + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0] + [0 0 0 ... 0 0 0]]]]] + +``` + +## Application in Parameter + +we can see the following code example: + +```python + +import numpy as np +from mindspore import dtype as mstype +from mindspore.common import set_seed +from mindspore.ops import operations as ops +from mindspore import Tensor, Parameter, context +from mindspore.common.initializer import Normal, initializer + +set_seed(1) + +weight1 = Parameter(initializer('Normal', [5, 4], mstype.float32), name="w1") +weight2 = Parameter(initializer(Normal(0.2), [5, 4], mstype.float32), name="w2") +input_data = Tensor(np.arange(20).reshape(5, 4), dtype=mstype.float32) +net = ops.Add() +output = net(input_data, weight1) +output = net(output, weight2) +print(output) + +``` + +```python + +[[-0.3305102 1.0412874 2.0412874 3.0412874] + [ 4.0412874 4.9479127 5.9479127 6.9479127] + [ 7.947912 9.063009 10.063009 11.063009 ] + [12.063009 13.536987 14.536987 14.857441 ] + [15.751231 17.073082 17.808317 19.364822 ]] + +``` + + \ No newline at end of file