diff --git a/api/source_en/api/python/mindspore/mindspore.nn.layer.combined.rst b/api/source_en/api/python/mindspore/mindspore.nn.layer.combined.rst
new file mode 100644
index 0000000000000000000000000000000000000000..11201a5b6e0ca09af37a4ddb78c115bf4e4659c5
--- /dev/null
+++ b/api/source_en/api/python/mindspore/mindspore.nn.layer.combined.rst
@@ -0,0 +1,5 @@
+mindspore.nn.layer.combined
+===========================
+
+.. automodule:: mindspore.nn.layer.combined
+    :members:
diff --git a/api/source_en/index.rst b/api/source_en/index.rst
index 0b432886434cf99ee0f3dcd94a82d7b5dc2fe3c6..1ef737188e3838abad688bcd1634375e36f7ea9b 100644
--- a/api/source_en/index.rst
+++ b/api/source_en/index.rst
@@ -17,6 +17,7 @@ MindSpore API
    api/python/mindspore/mindspore.context
    api/python/mindspore/mindspore.nn
    api/python/mindspore/mindspore.nn.dynamic_lr
+   api/python/mindspore/mindspore.nn.layer.combined
    api/python/mindspore/mindspore.ops
    api/python/mindspore/mindspore.ops.composite
    api/python/mindspore/mindspore.ops.operations
diff --git a/api/source_zh_cn/api/python/mindspore/mindspore.nn.layer.combined.rst b/api/source_zh_cn/api/python/mindspore/mindspore.nn.layer.combined.rst
new file mode 100644
index 0000000000000000000000000000000000000000..11201a5b6e0ca09af37a4ddb78c115bf4e4659c5
--- /dev/null
+++ b/api/source_zh_cn/api/python/mindspore/mindspore.nn.layer.combined.rst
@@ -0,0 +1,5 @@
+mindspore.nn.layer.combined
+===========================
+
+.. automodule:: mindspore.nn.layer.combined
+    :members:
diff --git a/api/source_zh_cn/index.rst b/api/source_zh_cn/index.rst
index c703be92f6547eb9abef6a1fd8fe9f248110c815..065a579da37a65da9af6b02ddb41d3f12e62f21a 100644
--- a/api/source_zh_cn/index.rst
+++ b/api/source_zh_cn/index.rst
@@ -17,6 +17,7 @@ MindSpore API
    api/python/mindspore/mindspore.context
    api/python/mindspore/mindspore.nn
    api/python/mindspore/mindspore.nn.dynamic_lr
+   api/python/mindspore/mindspore.nn.layer.combined
    api/python/mindspore/mindspore.ops
    api/python/mindspore/mindspore.ops.composite
    api/python/mindspore/mindspore.ops.operations
diff --git a/tutorials/source_zh_cn/advanced_use/aware_quantization.md b/tutorials/source_zh_cn/advanced_use/aware_quantization.md
new file mode 100644
index 0000000000000000000000000000000000000000..e6a163cfc2766554a252055326dd03c60c6d7d37
--- /dev/null
+++ b/tutorials/source_zh_cn/advanced_use/aware_quantization.md
@@ -0,0 +1,160 @@
+# Quantization
+
+- [Quantization](#quantization)
+    - [Overview](#overview)
+    - [Introduction to Quantization](#introduction-to-quantization)
+        - [Fake Quantization Nodes](#fake-quantization-nodes)
+        - [Quantization Aware Training](#quantization-aware-training)
+    - [Quantization Aware Training Example](#quantization-aware-training-example)
+    - [References](#references)
+
+## Overview
+
+Compared with FP32, low-precision data types such as FP16, INT8, and INT4 occupy far less space, so the corresponding storage and transmission costs drop sharply. Mobile phones are a good example: to provide more personalized and intelligent services, more and more operating systems and applications integrate deep learning features, and therefore need to ship a large number of models and weight files. The original weight file of the classic AlexNet already exceeds 200 MB, and newer models keep moving toward more complex structures with more parameters, so the space savings from low-precision types are substantial. Low-bit computation is also faster: INT8 can reach a speedup of 3x or more over FP32, with a corresponding reduction in power consumption.
+
+This tutorial describes how to train a network in MindSpore with the quantization aware training algorithm.
+
+## Introduction to Quantization
+
+### Fake Quantization Nodes
+
+A fake quantization node serves two purposes: (1) it observes the distribution of the data flowing through the network, that is, the minimum and maximum of the values to be quantized; (2) it simulates the precision loss of quantizing to low-bit values, injects that loss into the network model, and passes it on to the loss function so that the optimizer can reduce it during training.
+
+MindSpore quantizes both weights and activations with the scheme described in reference [1].
+
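+To make the mechanism concrete, the following is a minimal NumPy sketch of asymmetric fake quantization (an illustration of the idea only, not MindSpore's implementation; the function name `fake_quantize` and the single-tensor min/max calibration are assumptions made for this sketch). The tensor is quantized to `num_bits` integers and immediately dequantized back to float, so the rounding error shows up in the forward pass:
+
+```python
+import numpy as np
+
+def fake_quantize(x, num_bits=8):
+    """Sketch only: quantize x to num_bits integers, then dequantize back to float."""
+    qmin, qmax = 0, 2 ** num_bits - 1
+    # (1) Observe the data distribution: the min and max of the tensor.
+    x_min, x_max = float(x.min()), float(x.max())
+    scale = max((x_max - x_min) / (qmax - qmin), 1e-8)  # avoid division by zero
+    zero_point = int(np.clip(round(qmin - x_min / scale), qmin, qmax))
+    # (2) Simulate the precision loss: scale, shift, round, clamp, then map back.
+    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
+    return ((q - zero_point) * scale).astype(x.dtype)
+
+x = np.random.randn(4, 4).astype(np.float32)
+print(np.abs(x - fake_quantize(x)).max())  # the simulated quantization error
+```
+
+In real quantization aware training the min/max statistics are typically accumulated over training steps rather than recomputed for a single tensor, but the quantize-dequantize round trip above is exactly the loss that the fake quantization nodes feed into the forward pass.
+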
+### Quantization Aware Training
+
+Quantization aware training in MindSpore is a fake quantization process: fake quantization nodes are embedded into recognizable operations to record the minimum and maximum of the data flowing through them during training. The goal is to reduce the loss of accuracy. The fake quantization nodes take part in the forward pass, so the model is trained under the quantization loss, but gradient updates must be performed in floating point, so the nodes do not take part in backpropagation.
+
+By recording the min and max of the data flowing through them and participating in forward propagation, the fake quantization nodes raise the value of the loss function; the optimizer perceives this increase and keeps reducing it through backpropagation, progressively compensating for the accuracy drop caused by the fake quantization and thereby improving the final accuracy.
+
+MindSpore's quantization aware training supports both asymmetric and symmetric quantization algorithms, and both 4-bit and 8-bit quantization schemes.
+
+## Quantization Aware Training Example
+
+Quantization aware training takes three steps:
+
+1. Import the quantization interfaces.
+2. Define the network model.
+3. Generate the quantized network automatically.
+
+The sample code is as follows:
+
+1. Import the quantization interfaces:
+
+    ```python
+    import mindspore.nn as nn
+    from mindspore.train.quant import quant as qat
+    from mindspore.nn.layer import combined
+    ```
+
+2. Define the network model:
+
+    Take the LeNet5 network as an example. The original network is defined as follows:
+
+    ```python
+    class LeNet5(nn.Cell):
+        def __init__(self, num_class=10):
+            super(LeNet5, self).__init__()
+            self.num_class = num_class
+
+            self.conv1 = nn.Conv2d(1, 6, kernel_size=5)
+            self.bn1 = nn.BatchNorm2d(6)
+            self.act1 = nn.ReLU()
+
+            self.conv2 = nn.Conv2d(6, 16, kernel_size=5)
+            self.bn2 = nn.BatchNorm2d(16)
+            self.act2 = nn.ReLU()
+
+            self.fc1 = nn.Dense(16 * 5 * 5, 120)
+            self.fc2 = nn.Dense(120, 84)
+            self.act3 = nn.ReLU()
+            self.fc3 = nn.Dense(84, self.num_class)
+            self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
+            self.flatten = nn.Flatten()
+
+        def construct(self, x):
+            x = self.conv1(x)
+            x = self.bn1(x)
+            x = self.act1(x)
+            x = self.max_pool2d(x)
+            x = self.conv2(x)
+            x = self.bn2(x)
+            x = self.act2(x)
+            x = self.max_pool2d(x)
+            x = self.flatten(x)
+            x = self.fc1(x)
+            x = self.act3(x)
+            x = self.fc2(x)
+            x = self.act3(x)
+            x = self.fc3(x)
+            return x
+    ```
+
+    To build the quantization aware network, replace the three operators `nn.Conv2d`, `nn.BatchNorm2d`, and `nn.ReLU` in the original network with the single operator `combined.Conv2d`; similarly, replace each `nn.Dense` and its following `nn.ReLU` with `combined.Dense`.
+
+    ```python
+    class LeNet5(nn.Cell):
+        def __init__(self, num_class=10):
+            super(LeNet5, self).__init__()
+            self.num_class = num_class
+
+            self.conv1 = combined.Conv2d(1, 6, kernel_size=5, batchnorm=True, activation='relu')
+            self.conv2 = combined.Conv2d(6, 16, kernel_size=5, batchnorm=True, activation='relu')
+
+            self.fc1 = combined.Dense(16 * 5 * 5, 120, activation='relu')
+            self.fc2 = combined.Dense(120, 84, activation='relu')
+            self.fc3 = combined.Dense(84, self.num_class)
+            self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)
+            self.flatten = nn.Flatten()
+
+        def construct(self, x):
+            x = self.conv1(x)
+            x = self.max_pool2d(x)
+            x = self.conv2(x)
+            x = self.max_pool2d(x)
+            x = self.flatten(x)
+            x = self.fc1(x)
+            x = self.fc2(x)
+            x = self.fc3(x)
+            return x
+    ```
+
+3. Generate the quantized network automatically:
+
+    Wrap the network model with the `convert_quant_network()` interface; this step inserts the fake quantization operators automatically.
+
+    ```python
+    net = qat.convert_quant_network(net, quant_delay=0, bn_fold=False, freeze_bn=10000, weight_bits=8, act_bits=8)
+    ```
+
+    The remaining steps (defining the loss function, optimizer, and hyperparameters, and training the network) are the same as for ordinary network training.
+
+## References
+
+[1] Jacob B, Kligys S, Chen B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2704-2713.
+
+[2] Krishnamoorthi R. Quantizing deep convolutional networks for efficient inference: A whitepaper[J]. arXiv preprint arXiv:1806.08342, 2018.
+
diff --git a/tutorials/source_zh_cn/index.rst b/tutorials/source_zh_cn/index.rst
index 69d0866d1367b71953f3e38f5fb0fae59a876d29..ec001212d50d5d093c0a57be4abe679444bc0b71 100644
--- a/tutorials/source_zh_cn/index.rst
+++ b/tutorials/source_zh_cn/index.rst
@@ -47,6 +47,7 @@
    advanced_use/distributed_training_tutorials
    advanced_use/mixed_precision
+   advanced_use/aware_quantization
 
 .. toctree::
    :glob: