From 57642dd281899c22fccf7740e064eaf12ea91619 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E5=BC=A0=E6=B4=8B=E6=B4=8B?= <584244991@qq.com>
Date: Thu, 11 Aug 2022 03:34:41 +0000
Subject: [PATCH] update
 TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow/README.md.

---
 .../README.md                                 | 331 ++++++++----------
 1 file changed, 139 insertions(+), 192 deletions(-)
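
The overview in this README describes how samples are drawn from the language model: start from the start-of-sequence token, sample a bit from the output distribution, feed that bit back in as the next input, and repeat. The loop can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the code in `run_lm.py`: `next_bit_prob` is a hypothetical stand-in for the trained LSTM, and its toy rule is invented for the example.

```python
import random


def next_bit_prob(prefix):
    """Hypothetical stand-in for the trained LSTM: P(next bit = 1 | prefix)."""
    # Toy rule (an assumption for illustration): the next bit tends to
    # repeat the previous one. An empty prefix means only the
    # start-of-sequence token has been seen.
    if not prefix:
        return 0.5
    return 0.8 if prefix[-1] == 1 else 0.2


def sample_bitstring(num_qubits, rng):
    """Draw one bitstring, one bit per timestep, feeding each sample back in."""
    bits = []  # the empty prefix plays the role of the start-of-sequence token
    for _ in range(num_qubits):
        p_one = next_bit_prob(bits)
        bits.append(1 if rng.random() < p_one else 0)
    return bits


def sample_many(num_samples, num_qubits=12, seed=0):
    """Repeat the per-bitstring loop until the requested number of samples exists."""
    rng = random.Random(seed)
    return [sample_bitstring(num_qubits, rng) for _ in range(num_samples)]


if __name__ == "__main__":
    for sample in sample_many(5):
        print("".join(str(bit) for bit in sample))
```

The default of 12 bits per sample mirrors the `num_qubits` default listed in the script parameters; swapping `next_bit_prob` for a real model's output probability leaves the loop unchanged.
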
diff --git a/TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow/README.md b/TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow/README.md
index 067f718e2..c5eea9eb1 100644
--- a/TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow/README.md
+++ b/TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow/README.md
@@ -1,180 +1,170 @@
+- [Basic Information](#基本信息.md)
+- [Overview](#概述.md)
+- [Training Environment Setup](#训练环境准备.md)
+- [Quick Start](#快速上手.md)
+- [Transfer Learning Guide](#迁移学习指导.md)
+- [Advanced Reference](#高级参考.md)
 <h2 id="基本信息.md">基本信息</h2>
 
-发布者（Publisher）：Huawei
+**发布者（Publisher）：Huawei**
 
-版本（Version）：1.1
+**应用领域（Application Domain）：Natural Language Processing**
 
-修改时间（Modified） ：2022.7.19
+**版本（Version）：1.1**
 
-大小（Size）：74M
+**修改时间（Modified） ：2022.8.11**
 
-框架（Framework）：TensorFlow 1.15.0
+**大小（Size）：104**
 
-模型格式（Model Format）：ckpt
+**框架（Framework）：TensorFlow_1.15**
 
-精度（Precision）：Mixed
+**模型格式（Model Format）：ckpt**
 
-处理器（Processor）：昇腾910
+**精度（Precision）：FP32**
 
-应用级别（Categories）：Official
+**处理器（Processor）：昇腾910**
 
+**应用级别（Categories）：Official**
 
+**描述（Description）：Training code for the quantum_sample_learning network, based on the TensorFlow framework**
 
 <h2 id="概述.md">概述</h2>
 
-Here we use byte as the unit of our model for quantum sample learning. In our case, the inputs to the language model are samples of bitstrings
-and the model is trained to predict the next bit given the bits observed so far,
-starting with a start of sequence token. We use a standard LSTM language model
-with a logistic output layer. To sample from the model, we input the start of
-sequence token, sample from the output distribution then input the result as
-the next timestep. This is repeated until the required number of samples is
-obtained. 
+## Summary<a name="section194554031510"></a>
 
-- 参考论文：[[2010.11983\] Learnability and Complexity of Quantum Samples (arxiv.org)](https://arxiv.org/abs/2010.11983)
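+Here the quantum sample learning model uses the byte as its unit. The inputs to the language model are samples of bitstrings, and the model is trained to predict the next bit given the bits observed so far, starting from a start-of-sequence token. We use a standard LSTM language model with a logistic output layer. To sample from the model, we input the start-of-sequence token, sample from the output distribution, and then feed the result back in as the next timestep. This is repeated until the required number of samples is obtained.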
 
-- 参考实现：https://github.com/google-research/google-research/tree/master/quantum_sample_learning
-
-- 适配昇腾 AI 处理器的实现：
-  
-  [TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow · Ascend/ModelZoo-TensorFlow - 码云 - 开源中国 (gitee.com)](https://gitee.com/ascend/ModelZoo-TensorFlow/tree/master/TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow)
-
-
-- 通过Git获取对应commit\_id的代码方法如下：
+- Reference paper:
   
-    ```
-    git clone {repository_url}    # 克隆仓库的代码
-    cd {repository_name}    # 切换到模型的代码仓目录
-    git checkout  {branch}    # 切换到对应分支
-    git reset --hard ｛commit_id｝     # 代码设置到对应的commit_id
-    cd ｛code_path｝    # 切换到模型代码所在路径，若仓库下只有该模型，则无需切换
-    ```
+  [Learnability and Complexity of Quantum Samples](https://arxiv.org/abs/2010.11983)
 
+- Reference implementation:
 
+  https://github.com/google-research/google-research/tree/master/quantum_sample_learning
 
-## 默认配置
-
-- 训练超参
-
-  - epoch：20
-  - batch_size：64
-  - learning_rate：0.001
-  - num_qubits：12
-  - rnn_units：256
-
+- Implementation adapted for the Ascend AI Processor:
   
+  https://gitee.com/ascend/ModelZoo-TensorFlow/tree/master/TensorFlow/contrib/nlp/quantum_sample_learning_ID2036_for_Tensorflow
 
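+- To obtain the code at a specific commit\_id with Git:
+  
+        git clone {repository_url}    # clone the repository
+        cd {repository_name}    # enter the model's repository directory
+        git checkout  {branch}    # switch to the target branch
+        git reset --hard {commit_id}     # reset the code to the target commit_id
+        cd {code_path}    # enter the directory containing the model code; not needed if the repository contains only this model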
+    
 
-<h2 id="训练环境准备.md">训练环境准备</h2>
-
-1. 硬件环境准备请参见各硬件产品文档"[驱动和固件安装升级指南]( https://support.huawei.com/enterprise/zh/category/ai-computing-platform-pid-1557196528909)"。需要在硬件设备上安装与CANN版本配套的固件与驱动。
-
-2. requirements
-
-   ```
-   python==3.6  
-   absl-py
-   cirq==0.8.0
-   numpy==1.16.4
-   scipy==1.2.1
-   tensorflow==1.15
-   
-   Ascend: 1*Ascend 910 
-   CPU: 24vCPUs 96GiB
-   ```
-
-   
-
-## 快速上手
-
-- 数据集准备
-  - 模型训练使用q12c0数据集
-
-
-
-## 模型训练
-
-- 单击“立即下载”，并选择合适的下载方式下载源码包。
-
-- 启动训练之前，首先要配置程序运行相关环境变量。
+## Default Configuration<a name="section91661242121611"></a>
 
-- 单卡训练 
+-   Training hyperparameters (single device):
+    - batch_size: 64
+    - epochs: 20
+    - learning_rate: 0.001
+    - checkpoint_dir: ./checkpoint
+    - probabilities_path: ./data/q12c0.txt
+    - eval_samples: 500000
+    - training_eval_samples: 4000
+    - train_size: 500000
 
-  1. 配置训练参数。
 
-     在`run_lm.py`中，配置checkpoint保存路径，请用户根据实际路径配置，参数如下所示：
+## Supported Features<a name="section1899153513554"></a>
 
-     ```
-     flags.DEFINE_string('checkpoint_dir', './checkpoint',
-                         'Where to save checkpoints')
-     ```
+| Feature              | Supported |
+| -------------------- | --------- |
+| Distributed training | Yes       |
+| Mixed precision      | No        |
+| Data parallelism     | Yes       |
 
-  2. 启动训练。
 
-     ```
-     python run_lm.py 
-     ```
+## Mixed-Precision Training<a name="section168064817164"></a>
 
-  3. 在`evaluate.py`中配置进行验证的checkpoint路径
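+The Ascend 910 AI Processor provides automatic mixed precision. Following a built-in optimization policy, it automatically lowers selected float32 operators across the network to float16, which improves system performance and reduces memory usage with very little loss of accuracy.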
 
-    ```
-    flags.DEFINE_string('checkpoint_dir', './checkpoint',
-                        'Where to save checkpoints')
-    ```
+## Enabling Mixed Precision<a name="section20779114113713"></a>
 
-  4. 进行验证
+In the launch script, the precision mode is selected with the `--precision_mode` parameter:
 
-    ```
-    python evaluate.py
-    ```
+```
+ ./train_full_1p.sh --help
+
+parameter explain:
+    --precision_mode         #precision mode(allow_fp32_to_fp16/force_fp16/must_keep_origin_dtype/allow_mix_precision)
+    --data_path              # dataset of training
+    --output_path            # output of training
+    --train_steps            # max_step for training
+    --train_epochs           # max_epoch for training
+    --batch_size             # batch size
+    -h/--help                show help message
+```
 
+Example of the mixed-precision setting:
 
+ ```
+    precision_mode="allow_mix_precision"
 
-## 训练结果
+ ```
 
-论文
+<h2 id="训练环境准备.md">Training Environment Setup</h2>
 
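+-  For hardware and runtime environment setup, see the [CANN Software Installation Guide](https://support.huawei.com/enterprise/zh/ascend-computing/cann-pid-251168373?category=installation-update).
+-  Run the following command to install the dependencies.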
 ```
-Linear Fidelity: 0.982864
-Logistic Fidelity: 0.979632
-theoretical_linear_xeb: 1.018109
-theoretical_logistic_xeb: 1.006301
-linear_xeb: 0.982864
-logistic_xeb: 0.979632
-kl_div: 0.021964
+pip3 install -r requirements.txt
 ```
+Note: the dependency file requirements.txt is located in the model's root directory.
 
-GPU
+<h2 id="快速上手.md">Quick Start</h2>
 
-```
-Linear Fidelity: 1.006005
-Logistic Fidelity: 1.001702
-theoretical_linear_xeb: 1.015967
-theoretical_logistic_xeb: 1.005840
-linear_xeb: 1.006005
-logistic_xeb: 1.001702
-kl_div: 0.004158
-```
+## Dataset Preparation<a name="section361114841316"></a>
 
-NPU
+1. Dataset link: https://pan.baidu.com/s/1WAl4C_EnQi4wp6l684yI9w (extraction code: unp5)
 
-```
-Linear Fidelity: 1.058425
-Logistic Fidelity: 1.029696
-theoretical_linear_xeb: 1.020069
-theoretical_logistic_xeb: 1.006884
-linear_xeb: 1.058425
-logistic_xeb: 1.029696
-kl_div: 0.004556
-```
+2. For the models trained by quantum_sample_learning and their datasets, see the reference implementation link in the overview above.
 
-+ 精度对比
 
-  |              | 论文     | GPU      | NPU      |
-  | ------------ | -------- | -------- | -------- |
-  | linear_xeb   | 0.982864 | 1.006005 | 1.058425 |
-  | logistic_xeb | 0.979632 | 1.001702 | 1.029696 |
+## Model Training<a name="section715881518135"></a>
 
-  
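+- Click "Download Now" and choose a suitable download method to obtain the source package.
+- Start training.
+
+    - Before starting training, configure the environment variables that the program needs.
+
+      For the environment variable settings, see:
+
+      [Environment variable settings for the Ascend 910 training platform](https://gitee.com/ascend/modelzoo/wikis/Ascend%20910%E8%AE%AD%E7%BB%83%E5%B9%B3%E5%8F%B0%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F%E8%AE%BE%E7%BD%AE?sort_id=3148819)
+
+    - Single-device training
+
+
+      1. Configure the training parameters.
+
+         In `run_lm.py`, set the checkpoint save path to match your actual path. The parameter is defined as follows:
+
+         ```
+         flags.DEFINE_string('checkpoint_dir', './checkpoint',
+                             'Where to save checkpoints')
+         ```
+
+      2. Start training.
+
+         ```
+         python run_lm.py
+         ```
+
+      3. In `evaluate.py`, set the path of the checkpoint to evaluate.
+
+         ```
+         flags.DEFINE_string('checkpoint_dir', './checkpoint',
+                             'Where to save checkpoints')
+         ```
+
+      4. Run the evaluation.
+
+         ```
+         python evaluate.py
+         ```
+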
 
 <h2 id="高级参考.md">高级参考</h2>
 
@@ -194,71 +184,28 @@ Quantum Sample Learning
   ├─evaluate.py     模型评估程序入口
 ```
 
-
-
-## 脚本参数
+## Script Parameters<a name="section6669162441511"></a>
 
 ```
-flags.DEFINE_string('data_url', './dataset',
-                    'Where to save datasets')
-flags.DEFINE_string('train_url', './output',
-                    'Where to save Output')
-flags.DEFINE_string('checkpoint_dir', './checkpoint',
-                    'Where to save checkpoints')
-flags.DEFINE_string('save_data', '', 'Where to generate data (optional).')
-flags.DEFINE_string('eval_sample_file', '',
-                    'A file of samples to evaluate (optional).')
-flags.DEFINE_boolean(
-    'eval_has_separator', False,
-    'Set if the numbers in the samples are separated by spaces.')
-flags.DEFINE_integer('epochs', 20, 'Number of epochs to train.')
-flags.DEFINE_integer('eval_samples', 500000,
-                     'Number of samples for evaluation.')
-flags.DEFINE_integer('training_eval_samples', 4000,
-                     'Number of samples for evaluation during training.')
-flags.DEFINE_integer('num_qubits', 12, 'Number of qubits to be learnt')
-flags.DEFINE_integer('rnn_units', 256, 'Number of RNN hidden units.')
-flags.DEFINE_integer(
-    'num_moments', -2,
-    'If > 12, then use training data generated with this number of moments.')
-flags.DEFINE_integer('batch_size', 64, 'Batch size')
-flags.DEFINE_float('learning_rate', 0.001, 'Learning rate')
-flags.DEFINE_boolean('use_adamax', False,
-                     'Use the Adamax optimizer.')
-flags.DEFINE_boolean('eval_during_training', False,
-                     'Perform eval while training.')
-flags.DEFINE_float('kl_smoothing', 1, 'The KL smoothing factor.')
-flags.DEFINE_boolean(
-    'save_test_counts', False, 'Whether to save test counts distribution.')
-flags.DEFINE_string(
-    'probabilities_path', './data/q12c0.txt',
-    'The path of the theoretical distribution')
-flags.DEFINE_string(
-    'experimental_bitstrings_path',
-    'quantum_sample_learning/data/experimental_samples_q12c0d14.txt',
-    'The path of the experiment measurements')
-flags.DEFINE_integer('train_size', 500000, 'Training set size to generate')
-flags.DEFINE_boolean('use_theoretical_distribution', True,
-                     'Use the theoretical bitstring distribution.')
-flags.DEFINE_integer(
-    'subset_parity_size', 0,
-    'size of the subset for reordering the bit strings according to the '
-    'parity defined by the bit string of length specified here')
-flags.DEFINE_boolean('random_subset', False,
-                     'Randomly choose which subset of bits to '
-                     'evaluate the subset parity on.')
-flags.DEFINE_boolean('porter_thomas', False,
-                     'Sample from Poter-Thomas distribution')
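+--checkpoint_dir                  Where to save checkpoints (model save path). Default: './checkpoint'
+--save_data                       Where to generate data (optional). Default: ''
+--eval_sample_file                A file of samples to evaluate (optional). Default: ''
+--eval_has_separator              Set if the numbers in the samples are separated by spaces. Default: False
+--epochs                          Number of epochs to train. Default: 20
+--eval_samples                    Number of samples for evaluation. Default: 500000
+--training_eval_samples           Number of samples for evaluation during training. Default: 4000
+--num_qubits                      Number of qubits to be learnt. Default: 12
+--rnn_units                       Number of RNN hidden units. Default: 256
+--batch_size                      Batch size. Default: 64
+--learning_rate                   Learning rate. Default: 0.001
+--save_test_counts                Whether to save test counts distribution. Default: False
+--probabilities_path              The path of the theoretical distribution (dataset path). Default: './data/q12c0.txt'
+--experimental_bitstrings_path    The path of the experiment measurements. Default: 'quantum_sample_learning/data/experimental_samples_q12c0d14.txt'
+--train_size                      Training set size to generate. Default: 500000
+--random_subset                   Randomly choose which subset of bits to evaluate the subset parity on. Default: False
+--porter_thomas                   Sample from Porter-Thomas distribution. Default: False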
 ```
 
+## Training Process<a name="section1589455252218"></a>
 
-
-## 下载链接
-
-### 数据集下载
-
-链接：https://pan.baidu.com/s/1WAl4C_EnQi4wp6l684yI9w  提取码：unp5
-
-### checkpoint文件
-
-链接：https://pan.baidu.com/s/1wckJSk7sNv0HvFuzJKWSdA  提取码：s4yz
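+Start single-device or multi-device training with the training commands given in the model training section. Single-device and multi-device runs use different scripts; single-device and 8-device training are supported. The model is stored under ${cur_path}/output/$ASCEND_DEVICE_ID, which holds the training log and the checkpoint files. For single-device training, for example, the loss is recorded in ${cur_path}/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log.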
\ No newline at end of file
-- 
Gitee