diff --git a/docs/federated/docs/source_en/local_differential_privacy_training_noise.md b/docs/federated/docs/source_en/local_differential_privacy_training_noise.md
index 09396fff7bda2a4c86e424afa4ea9e312ce94644..965c298e7877ac42109a759fe3dbb7b0fd1a902d 100644
--- a/docs/federated/docs/source_en/local_differential_privacy_training_noise.md
+++ b/docs/federated/docs/source_en/local_differential_privacy_training_noise.md
@@ -1,4 +1,4 @@
-# Horizontal FL-Local differential privacy perturbation training
+# Horizontal FL-Local Differential Privacy Perturbation Training
 
 <a href="https://gitee.com/mindspore/docs/blob/master/docs/federated/docs/source_en/local_differential_privacy_training_noise.md" target="_blank"><img src="https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.png"></a>
 
@@ -6,12 +6,11 @@ During federated learning, user data is used only for local device training and
 However, in the conventional federated learning framework, models are migrated to the cloud in plaintext. There is still a risk of indirect disclosure of user privacy.
 After obtaining the plaintext model uploaded by a user, the attacker can restore the user's personal training data through attacks such as reconstruction and model inversion. As a result, user privacy is disclosed.
 
-As a federated learning framework, MindSpore Federated provides secure aggregation algorithms based on local differential privacy (LDP).
-Noise addition is performed on local models before they are migrated to the cloud. On the premise of ensuring the model availability, the problem of privacy leakage in horizontal federated learning is solved.
+As a federated learning framework, MindSpore Federated provides secure aggregation algorithms based on local differential privacy (LDP). Noise is added to local models before they are migrated to the cloud. This addresses the problem of privacy leakage in horizontal federated learning while keeping the model usable.
 
 ## Principles
 
-Differential privacy is a mechanism for protecting user data privacy. **Differential privacy** is defined as follows:
+Differential privacy is a mechanism for protecting user data privacy. Differential privacy is defined as follows:
 
 $$
 Pr[\mathcal{K}(D)\in S] \le e^{\epsilon} Pr[\mathcal{K}(D') \in S]+\delta​
@@ -33,14 +32,11 @@ The MindSpore Federated client uploads the noise-added model $W_p$ to the cloud
 
 ## Usage
 
-Local differential privacy training currently only supports cross device scenarios. Enabling differential privacy training is simple. You only need to perform the following operation during the cloud service startup.
-Use `set_fl_context()` to set `encrypt_type='DP_ENCRYPT'`.
+Local differential privacy training currently supports only cross-device scenarios. To enable it, set the `encrypt_type` field to `DP_ENCRYPT` in the [yaml](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python/federated_server_yaml.md#) configuration file when starting the cloud-side service.
 
-In addition, to control the effect of privacy protection, three parameters are provided: `dp_eps`, `dp_delta`, and `dp_norm_clip`.
-They are also set through `set_fl_context()`. The valid value range of `dp_eps` and `dp_norm_clip` is greater than 0.
+In addition, to control the effect of privacy protection, three parameters are provided: `dp_eps`, `dp_delta`, and `dp_norm_clip`. They are also set through the yaml file.
 
-The value of `dp_delta` ranges between 0 and 1. Generally, the smaller the values of `dp_eps` and `dp_delta`, the better the privacy protection effect.
-However, the impact on model convergence is greater. It is recommended that `dp_delta` be set to the reciprocal of the number of clients and the value of `dp_eps` be greater than 50.
+`dp_eps` and `dp_norm_clip` must be greater than 0, and `dp_delta` must satisfy 0 < `dp_delta` < 1. In general, the smaller `dp_eps` and `dp_delta` are, the stronger the privacy protection, but the greater the impact on model convergence. It is recommended that `dp_delta` be set to the reciprocal of the number of clients and that `dp_eps` be greater than 50.
 
 `dp_norm_clip` is the adjustment coefficient of the model weight before noise is added to the model weight by the LDP mechanism. It affects the convergence of the model. The recommended value ranges from 0.5 to 2.
 
diff --git a/docs/federated/docs/source_en/local_differential_privacy_training_signds.md b/docs/federated/docs/source_en/local_differential_privacy_training_signds.md
new file mode 100644
index 0000000000000000000000000000000000000000..8f8ccbfc584738fd503d45cab862ff5352533078
--- /dev/null
+++ b/docs/federated/docs/source_en/local_differential_privacy_training_signds.md
@@ -0,0 +1,103 @@
+# Horizontal FL-Local Differential Privacy SignDS Training
+
+<a href="https://gitee.com/mindspore/docs/blob/master/docs/federated/docs/source_en/local_differential_privacy_training_signds.md" target="_blank"><img src="https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.png"></a>
+
+## Privacy Protection Background
+
+Federated learning lets clients participate in global model training without uploading their original datasets: each participant uploads only the new model obtained after local training, or the update information of the model, which breaks down data silos. This common scenario corresponds to the default scheme in the MindSpore federated learning framework, where the `encrypt_type` switch defaults to `not_encrypt` when the `server` is started. The `installation and deployment` and `application practices` sections of the federated learning tutorial both use this approach by default, which is a plain federated averaging scheme without any privacy-protecting treatment such as encryption or perturbation. For convenience, `not_encrypt` is used below to refer specifically to this default scheme.
+
+Training with the `not_encrypt` scheme is not free from privacy leakage. The Server receives the local models uploaded by the Clients and can still reconstruct user training data through certain attack methods [1], which leaks user privacy. The `not_encrypt` scheme therefore needs an additional user privacy protection mechanism.
+
+The global model `oldModel` that the Client receives in each round of federated learning is issued by the Server and does not involve user privacy. However, the local model `newModel` that each Client obtains after several epochs of local training fits its private local data, so privacy protection focuses on the weight difference between the two: `update` = `newModel` - `oldModel`.
+
+The `DP_ENCRYPT` differential privacy noise scheme already implemented in the MindSpore Federated framework protects privacy by adding Gaussian random noise to `update` in each iteration. However, as the dimensionality of the model increases, the norm of `update` grows and so does the noise, so more Clients must participate in the same round of aggregation to neutralize the noise; otherwise the convergence and accuracy of the model degrade. If the noise is set too small, convergence and accuracy approach those of the `not_encrypt` scheme, but the privacy protection is not strong enough. In addition, each Client must send the perturbed model, so the communication overhead grows with the model size. We expect Clients such as mobile phones to achieve convergence of the global model with as little communication overhead as possible.
+
+## Algorithm Flow Introduction
+
+SignDS [2] is short for Sign Dimension Select, and it operates on the `update` of each Client. As preparation, the Tensor of each layer of `update` is flattened into a one-dimensional vector and the vectors are concatenated; the dimension of the concatenated vector is denoted as $d$.
+
+The algorithm can be summarized in one sentence: **Select $h$ ($h<d$) dimensions of `update`, replace the original update value of each selected dimension with the sign value (plus or minus 1), and replace the unselected ones with 0.**
+
+Here is an example: there are 3 clients, Client 1, 2, and 3, whose `update` is a $d=8$-dimensional vector after flattening. The Server calculates the `avg` of the `update` of these 3 clients and updates the global model with that value, which completes one round of federated learning.
+
+| Client | d_1  | d_2  | d_3  | d_4  | d_5  | d_6  | d_7  |  d_8  |
+| :----: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :---: |
+|   1    | 0.4  | 0.1  | -0.2 | 0.3  | 0.5  | 0.1  | -0.2 | -0.3  |
+|   2    | 0.5  | 0.2  |  0   | 0.1  | 0.3  | 0.2  | -0.1 | -0.2  |
+|   3    | 0.3  | 0.1  | -0.1 | 0.5  | 0.2  | 0.3  |  0   |  0.1  |
+|  avg   | 0.4  | 0.13 | -0.1 | 0.3  | 0.33 | 0.2  | -0.1 | -0.13 |
+
+Dimensions of higher importance should be selected. Importance is measured by the size of the **value**, so `update` needs to be sorted. Positive and negative values of `update` represent different update directions, so in each round of federated learning the sign value of each Client takes `1` or `-1` with **probability 0.5** each. If sign=1, the $k$ largest dimensions of `update` form the `topk` set and the remaining ones form the `non-topk` set. If sign=-1, the $k$ smallest dimensions form the `topk` set.
+
+If the Server specifies `h`, the total number of selected dimensions, the Client uses this value directly; otherwise each Client calculates the optimal output dimension `h` locally.
+
+The SignDS algorithm outputs the number of dimensions (denoted as $v$) that should be selected from the `topk` set and from the `non-topk` set. In the example in the table below, a total of $h=3$ dimensions is picked from the two sets.
+
+Each Client selects dimensions uniformly at random according to the numbers output by the SignDS algorithm, and sends the dimension indices and the sign value to the Server. If the indices are output in the order of picking from `topk` first and then from `non-topk`, the index list `index` must be shuffled before sending. The following table shows the information that each Client finally transfers to the Server.
+
+| Client | index | sign |
+| :----: | :---: | :--: |
+|   1    | 1,5,8 |  1   |
+|   2    | 2,3,4 |  -1  |
+|   3    | 3,6,7 |  1   |
+
+The Server constructs a privacy-protected `update` from the dimension indices and sign value uploaded by each Client, averages all of these `update` vectors, and updates the current `oldModel`, which completes one round of federated learning.
+
+| Client |  d_1  |  d_2   |  d_3   |  d_4   |  d_5  |  d_6  |  d_7  |  d_8  |
+| :----: | :---: | :----: | :----: | :----: | :---: | :---: | :---: | :---: |
+|   1    | **1** |   0    |   0    |   0    | **1** |   0   |   0   | **1** |
+|   2    |   0   | **-1** | **-1** | **-1** |   0   |   0   |   0   |   0   |
+|   3    |   0   |   0    | **1**  |   0    |   0   | **1** | **1** |   0   |
+|  avg   |  1/3  |  -1/3  |   0    |  -1/3  |  1/3  |  1/3  |  1/3  |  1/3  |
+
+With the optimized SignDS scheme, the device-side client uploads to the cloud side only a list of dimension indices of int type and one random sign value of boolean type output by the algorithm, which significantly reduces communication overhead compared with the common scenario of uploading thousands of float values of complete model weights or gradients. From the perspective of an actual reconstruction attack, the cloud side obtains only the dimension indices and a sign value representing the direction of the gradient update, so the attack is more difficult to carry out. After receiving the dimension indices and the sign value from the device side, the cloud side has to simulate a reconstruction of the original user weights using `sign_global_lr` and the sign value: the sign value gives the update direction and `sign_global_lr` gives the step size. This approximation is where the scheme loses accuracy. The cloud side can reconstruct only a **partial** gradient update of each client, covering as many dimensions as there are uploaded indices. Because dimension selection is random, the more clients take part in aggregation, the more model weights are activated. If the reconstructed `update` values concentrate on a certain position, the real weight at that position was updated more; conversely, a position with few reconstructed updates was originally updated less. By adding the reconstructed `update` to the initial model weights of the round, the cloud side aggregates and obtains the final model of the round.
+
+## Privacy Protection Proof
+
+The differential privacy noise scheme protects privacy by adding noise so that the attacker cannot determine the original information, while the differential privacy SignDS scheme activates only part of the dimensions and replaces the original values with the sign value, which largely protects user privacy. Furthermore, the exponential mechanism of differential privacy prevents an attacker from confirming whether the activated dimensions are significant (that is, from the `topk` set) and whether the number of output dimensions from `topk` exceeds a given threshold.
+
+For any two updates $\Delta$ and $\Delta'$ of a Client, let the `topk` dimension sets be $S_{topk}$ and ${S'}_{topk}$, respectively, and let ${J}\in {\mathcal{J}}$ be any possible set of output dimensions of the algorithm. Denote by $\nu=|{S}_{topk}\cap {J}|$ and $\nu'=|{S'}_{topk}\cap {J}|$ the sizes of the intersections of ${J}$ with the two `topk` sets. The algorithm ensures that the following inequality holds:
+
+$$
+\frac{{Pr}[{J}|\Delta]}{{Pr}[{J}|\Delta']}=\frac{{Pr}[{J}|{S}_{topk}]}{{Pr}[{J}|{S'}_{topk}]}=\frac{\frac{{exp}(\frac{\epsilon}{\phi_u}\cdot u({S}_{topk},{J}))}{\sum_{{J'}\in {\mathcal{J}}}{exp}(\frac{\epsilon}{\phi_u}\cdot u({S}_{topk}, {J'}))}}{\frac{{exp}(\frac{\epsilon}{\phi_u}\cdot u({S'}_{topk}, {J}))}{\sum_{ {J'}\in {\mathcal{J}}}{exp}(\frac{\epsilon}{\phi_u}\cdot u( {S'}_{topk},{J'}))}}=\frac{\frac{{exp}(\epsilon\cdot \unicode{x1D7D9}(\nu \geq \nu_{th}))}{\sum_{\tau=0}^{\tau=\nu_{th}-1}\omega_{\tau} + \sum_{\tau=\nu_{th}}^{\tau=h}\omega_{\tau}\cdot {exp}(\epsilon)}}{\frac{ {exp}(\epsilon\cdot \unicode{x1D7D9}(\nu' \geq\nu_{th}))}{\sum_{\tau=0}^{\tau=\nu_{th}-1}\omega_{\tau}+\sum_{\tau=\nu_{th}}^{\tau=h}\omega_{\tau}\cdot {exp}(\epsilon)}}\\= \frac{{exp}(\epsilon\cdot \unicode{x1D7D9} (\nu \geq \nu_{th}))}{ {exp}(\epsilon\cdot \unicode{x1D7D9} (\nu' \geq \nu_{th}))} \leq \frac{{exp}(\epsilon\cdot 1)}{{exp}(\epsilon\cdot 0)} = {exp}(\epsilon),
+$$
+
+This proves that the algorithm satisfies local differential privacy.
+
+## Preparation
+
+To use the algorithm, first complete the training and aggregation process of a cross-device federated learning scenario successfully. [Implementing an Image Classification Application of Cross-device Federated Learning (x86)](https://www.mindspore.cn/federated/docs/en/master/image_classification_application.html) describes in detail the preparation work, such as datasets and network models, and how to simulate multiple clients participating in federated learning.
+
+## Enabling the Algorithm
+
+Local differential privacy SignDS training currently supports only cross-device federated learning scenarios. To enable it, change the following parameters in the yaml file when starting the cloud-side service. For the complete cloud-side startup script, refer to the cloud-side deployment; only the parameters relevant to this algorithm are given here. Taking the LeNet task as an example, the relevant yaml configuration is as follows:
+
+```yaml
+encrypt:
+  encrypt_type: SIGNDS
+  ...
+  signds:
+    sign_k: 0.01
+    sign_eps: 100
+    sign_thr_ratio: 0.6
+    sign_global_lr: 0.1
+    sign_dim_out: 0
+```
+
+For a detailed example, refer to [Implementing an Image Classification Application of Cross-device Federated Learning (x86)](https://www.mindspore.cn/federated/docs/en/master/image_classification_application.html). The cloud-side code defines the valid range of each parameter. If a value is outside its valid range, the Server reports an error that states the valid range. The effect of changing each parameter, described below, assumes that the other 4 parameters are kept unchanged.
+
+- `sign_k`: (0,0.25], k*inputDim>50, default=0.01. `inputDim` is the flattened length of the model or update. If the condition is not satisfied, the device side issues a warning. After `update` is sorted, the `topk` set consists of the top k (as a proportion) of it. Decreasing k means picking from more important dimensions with greater probability. Fewer dimensions are output, but they are more important, so the change in convergence cannot be determined in advance. The user needs to observe the sparsity of the model update to choose the value. When the update is quite sparse (it has many zeros), a smaller value should be used.
+- `sign_eps`: (0,100], default=100. Privacy budget, written mathematically as $\epsilon$ and abbreviated as eps. When eps decreases, the probability of picking unimportant dimensions increases. As privacy protection is strengthened, the number of output dimensions decreases, the proportion remains the same, and accuracy decreases.
+- `sign_thr_ratio`: [0.5,1], default=0.6. Lower bound on the proportion of activated dimensions that come from `topk`. Increasing it reduces the number of output dimensions but raises the proportion of output dimensions from `topk`. If the value is increased too much, more of the output must come from `topk`, which can only be satisfied by reducing the total number of output dimensions, and accuracy decreases when the number of clients is not large enough.
+- `sign_global_lr`: (0, +∞), default=1. This value multiplied by sign is used in place of `update`, so it directly affects convergence speed and accuracy. Moderately increasing it speeds up convergence, but may cause the model to oscillate and gradients to explode. If each client runs more local epochs and the learning rate used for local training is increased, the value needs to be increased accordingly. If the number of clients involved in aggregation increases, the value also needs to be increased, because during reconstruction the aggregated value is divided by the number of clients; the result stays the same only if the value is increased.
+- `sign_dim_out`: [0,50], default=0. If a non-zero value is given, the client uses it directly; increasing the value outputs more dimensions, but the proportion of dimensions from `topk` decreases. If it is 0, the client calculates the optimal number of output dimensions itself. If eps is not large enough and the value is increased, many insignificant `non-topk` dimensions are output, which harms model convergence and reduces accuracy. When eps is large enough, increasing the value allows important dimension information from more users to leave the device and improves accuracy.
+
+## LeNet Experiment Results
+
+The experiment uses the datasets of 100 clients from `3500_clients_bin` and runs 200 iterations of federated aggregation. Each client runs 20 local epochs with a device-side local training learning rate of 0.01. The SignDS parameters are `k=0.01, eps=100, ratio=0.6, lr=4, out=0`. The final accuracy over all users is 66.5%, compared with 69% for the common federated scenario without encryption. In the unencrypted scenario, the length of the data uploaded to the cloud side at the end of device-side training is 266,084, whereas the length of the data uploaded by SignDS is only 656.
+
+## References
+
+[1] Ligeng Zhu, Zhijian Liu, and Song Han. [Deep Leakage from Gradients](http://arxiv.org/pdf/1906.08935.pdf). NeurIPS, 2019.
+
+[2] Xue Jiang, Xuebing Zhou, and Jens Grossklags. "SignDS-FL: Local Differentially-Private Federated Learning with Sign-based Dimension Selection." ACM Transactions on Intelligent Systems and Technology, 2022.
\ No newline at end of file
diff --git a/docs/federated/docs/source_en/sentiment_classification_application.md b/docs/federated/docs/source_en/sentiment_classification_application.md
index 9cddf18a1151e9755200a8133b988149834ceefa..95a5b9890a300d9790826719bf3d40ba7d7e9cce 100644
--- a/docs/federated/docs/source_en/sentiment_classification_application.md
+++ b/docs/federated/docs/source_en/sentiment_classification_application.md
@@ -2,7 +2,9 @@
 
 <a href="https://gitee.com/mindspore/docs/blob/master/docs/federated/docs/source_en/sentiment_classification_application.md" target="_blank"><img src="https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.png"></a>
 
-In privacy compliance scenarios, the federated learning modeling mode based on device-cloud synergy can make full use of the advantages of device data and prevent sensitive user data from being directly reported to the cloud. When exploring the application scenarios of federated learning, we notice the input method scenario. Users attach great importance to their text privacy and intelligent functions on the input method. Therefore, federated learning is naturally applicable to the input method scenario. MindSpore Federated applies the federated language model to the emoji prediction function of the input method. The federated language model recommends emojis suitable for the current context based on the chat text data. During federated learning modeling, each emoji is defined as a sentiment label category, and each chat phrase corresponds to an emoji. MindSpore Federated defines the emoji prediction task as a federated sentiment classification task.
+With the federated learning modeling approach of cross-device collaboration, the advantages of device-side data can be fully utilized while avoiding uploading sensitive user data directly to the cloud side. Users attach great importance to the privacy of the text they enter when using input methods, and the intelligent functions of input methods are important for improving user experience. Federated learning is therefore naturally applicable to input method scenarios.
+
+MindSpore Federated has applied the Federated Language Model to the emoji image prediction feature of the input method. The Federated Language Model recommends emoji images that are appropriate for the current context based on chat text data. When modeling with federated learning, each emoji image is defined as a sentiment label category, and each chat phrase corresponds to an emoji image. MindSpore Federated defines the emoji image prediction task as a federated sentiment classification task.
 
 ## Preparations
 
@@ -40,7 +42,7 @@ The directory structures of the [dictionary](https://mindspore-website.obs.cn-no
 ```text
 mobile/models/
 ├── vocab.txt  # Dictionary
-└── vocab_map_ids.txt  # mapping file of Dictionary ID
+└── vocab_map_ids.txt  # Mapping file of Dictionary ID
 ```
 
 ## Defining the Network
@@ -49,32 +51,113 @@ The ALBERT language model[1] is used in federated learning. The ALBERT model on
 
 For details about the network definition, see [source code](https://gitee.com/mindspore/mindspore/blob/master/tests/st/fl/mobile/src/model.py).
 
-### Generating a Device Model File
+### Generating a Device-Side Model File
 
 #### Exporting a Model as a MindIR File
 
 The sample code is as follows:
 
 ```python
+import argparse
+import os
+import random
+from time import time
 import numpy as np
 import mindspore as ms
+from mindspore.nn import AdamWeightDecay
 from src.config import train_cfg, client_net_cfg
-from src.cell_wrapper import NetworkTrainCell
-
-# Build a model.
-client_network_train_cell = NetworkTrainCell(client_net_cfg)
-
-# Build input data.
-input_ids = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.seq_length), dtype=np.int32))
-attention_mask = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.seq_length), dtype=np.int32))
-token_type_ids = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.seq_length), dtype=np.int32))
-label_ids = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.num_labels), dtype=np.int32))
+from src.utils import restore_params
+from src.model import AlbertModelCLS
+from src.cell_wrapper import NetworkWithCLSLoss, NetworkTrainCell
+
+
+def parse_args():
+    """
+    parse args
+    """
+    parser = argparse.ArgumentParser(description='export task')
+    parser.add_argument('--device_target', type=str, default='GPU', choices=['Ascend', 'GPU'])
+    parser.add_argument('--device_id', type=str, default='0')
+    parser.add_argument('--init_model_path', type=str, default='none')
+    parser.add_argument('--output_dir', type=str, default='./models/mindir/')
+    parser.add_argument('--seed', type=int, default=0)
+    return parser.parse_args()
+
+
+def supervise_export(args_opt):
+    ms.set_seed(args_opt.seed), random.seed(args_opt.seed)
+    start = time()
+    # Parameter configuration
+    os.environ['CUDA_VISIBLE_DEVICES'] = args_opt.device_id
+    init_model_path = args_opt.init_model_path
+    output_dir = args_opt.output_dir
+    if not os.path.exists(output_dir):
+        os.makedirs(output_dir)
+    print('Parameters setting is done! Time cost: {}'.format(time() - start))
+    start = time()
+
+    # MindSpore configuration
+    ms.set_context(mode=ms.GRAPH_MODE, device_target=args_opt.device_target)
+    print('Context setting is done! Time cost: {}'.format(time() - start))
+    start = time()
+
+    # Build model
+    albert_model_cls = AlbertModelCLS(client_net_cfg)
+    network_with_cls_loss = NetworkWithCLSLoss(albert_model_cls)
+    network_with_cls_loss.set_train(True)
+    print('Model construction is done! Time cost: {}'.format(time() - start))
+    start = time()
+
+    # Build optimizer
+    client_params = [_ for _ in network_with_cls_loss.trainable_params()]
+    client_decay_params = list(
+        filter(train_cfg.optimizer_cfg.AdamWeightDecay.decay_filter, client_params)
+    )
+    client_other_params = list(
+        filter(lambda x: not train_cfg.optimizer_cfg.AdamWeightDecay.decay_filter(x), client_params)
+    )
+    client_group_params = [
+        {'params': client_decay_params, 'weight_decay': train_cfg.optimizer_cfg.AdamWeightDecay.weight_decay},
+        {'params': client_other_params, 'weight_decay': 0.0},
+        {'order_params': client_params}
+    ]
+    client_optimizer = AdamWeightDecay(client_group_params,
+                                       learning_rate=train_cfg.client_cfg.learning_rate,
+                                       eps=train_cfg.optimizer_cfg.AdamWeightDecay.eps)
+    client_network_train_cell = NetworkTrainCell(network_with_cls_loss, optimizer=client_optimizer)
+    print('Optimizer construction is done! Time cost: {}'.format(time() - start))
+    start = time()
+
+    # Construct data
+    input_ids = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.seq_length), np.int32))
+    attention_mask = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.seq_length), np.int32))
+    token_type_ids = ms.Tensor(np.zeros((train_cfg.batch_size, client_net_cfg.seq_length), np.int32))
+    label_ids = ms.Tensor(np.zeros((train_cfg.batch_size,), np.int32))
+    print('Client data loading is done! Time cost: {}'.format(time() - start))
+    start = time()
+
+    # Read checkpoint
+    if init_model_path != 'none':
+        init_param_dict = ms.load_checkpoint(init_model_path)
+        restore_params(client_network_train_cell, init_param_dict)
+    print('Checkpoint loading is done! Time cost: {}'.format(time() - start))
+    start = time()
+
+    # Export
+    ms.export(client_network_train_cell, input_ids, attention_mask, token_type_ids, label_ids,
+              file_name=os.path.join(output_dir, 'albert_supervise'), file_format='MINDIR')
+    print('Supervise model export process is done! Time cost: {}'.format(time() - start))
+
+
+if __name__ == '__main__':
+    total_time_start = time()
+    args = parse_args()
+    supervise_export(args)
+    print('All is done! Time cost: {}'.format(time() - total_time_start))
 
-# Export the model.
-ms.export(client_network_train_cell, input_ids, attention_mask, token_type_ids, label_ids, file_name='albert_train.mindir', file_format='MINDIR')
 ```
 
-#### Converting the MindIR File into an MS File that Can Be Used by the Federated Learning Framework on the Device
+#### Converting the MindIR File into an MS File that Can be Used by the Federated Learning Framework on the Device
 
 For details about how to generate a model file on the device, see [Implementing an Image Classification Application](https://www.mindspore.cn/federated/docs/en/master/image_classification_application.html).
 
@@ -102,17 +185,25 @@ Create a project in Android Studio and install the corresponding SDK. (After the
 
 ![New project](./images/create_android_project.png)
 
-### Building the MindSpore Lite AAR Package
+### Obtaining a Related Package
 
-- For details, see [Federated Learning Deployment](https://www.mindspore.cn/federated/docs/en/master/deploy_federated_client.html).
+1. Obtain the MindSpore Lite AAR package
 
-- Name of the generated Android AAR package:
+    For details, see [MindSpore Lite](https://www.mindspore.cn/lite/docs/en/master/use/downloads.html).
+
+   ```text
+   mindspore-lite-full-{version}.aar
+   ```
 
-  ```sh
-  mindspore-lite-full-{version}.aar
-  ```
+2. Obtain the MindSpore Federated device-side jar package
 
-- Place the AAR package in the app/libs/ directory of the Android project.
+    For details, see [On-Device Deployment](https://www.mindspore.cn/federated/docs/en/master/deploy_federated_client.html).
+
+   ```text
+   mindspore_federated/device_client/build/libs/jarAAR/mindspore-lite-java-flclient.jar
+   ```
+
+3. Place the AAR package in the app/libs/ directory of the Android project.
 
 ### Android Instance Program Structure
 
@@ -150,12 +241,10 @@ app
 
     ```java
     import android.content.Context;
-
     import java.io.File;
     import java.io.FileOutputStream;
     import java.io.InputStream;
     import java.util.logging.Logger;
-
     public class AssetCopyer {
         private static final Logger LOGGER = Logger.getLogger(AssetCopyer.class.toString());
         public static void copyAllAssets(Context context,String destination) {
@@ -206,7 +295,7 @@ app
     }
     ```
 
-2. FlJob.java: This code file is used to define training and inference tasks. For details about federated learning APIs, see [Federal Learning APIs](https://www.mindspore.cn/federated/docs/en/master/interface_description_federated_client.html).
+2. FlJob.java: This code file is used to define training and inference tasks. For details about federated learning APIs, see [Federated Learning APIs](https://www.mindspore.cn/federated/docs/en/master/interface_description_federated_client.html).
 
     ```java
     import android.annotation.SuppressLint;
@@ -311,18 +400,26 @@ app
     }
     ```
 
+    The eval_no_label.txt file above contains unlabeled data, one statement per line. A reference format is shown below; the content can be set freely by the user:
+
+    ```text
+    愿以吾辈之青春 护卫这盛世之中华🇨🇳
+    girls help girls
+    太美了，祝祖国繁荣昌盛！
+    中国人民站起来了
+    难道就我一个人觉得这个是plus版本？
+    被安利到啦！明天起来就看！早点睡觉莲莲
+    ```
+
 3. MainActivity.java: This code file is used to start federated learning training and inference tasks.
 
     ```java
     import android.os.Build;
     import android.os.Bundle;
-
     import androidx.annotation.RequiresApi;
     import androidx.appcompat.app.AppCompatActivity;
-
     import com.huawei.flAndroid.job.FlJob;
     import com.huawei.flAndroid.utils.AssetCopyer;
-
     @RequiresApi(api = Build.VERSION_CODES.P)
     public class MainActivity extends AppCompatActivity {
         private String parentPath;
@@ -336,7 +433,6 @@ app
             // Create a thread and start the federated learning training and inference tasks.
             new Thread(() -> {
                 FlJob flJob = new FlJob(parentPath);
-
                 flJob.syncJobTrain();
                 flJob.syncJobPredict();
             }).start();
@@ -441,14 +537,23 @@ app
    I/SyncFLJob: labels = [2, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 4, 4, 4, 4]
    ```
 
-## Experiment Result
+## Results
 
-The total number of federated learning iterations is 5, the number of epochs for local training on the client is 10, and the value of batchSize is 16.
+The total number of federated learning iterations is 10, the number of client-side local training epochs is 1, and the batchSize is set to 16.
 
-|        | Top 1 Accuracy| Top 5 Accuracy|
-| ------ | -------- | -------- |
-| ALBERT | 24%      | 70%      |
+```text
+<FLClient> total acc:0.44488978
+<FLClient> total acc:0.583166333
+<FLClient> total acc:0.609218437
+<FLClient> total acc:0.645290581
+<FLClient> total acc:0.667334669
+<FLClient> total acc:0.685370741
+<FLClient> total acc:0.70741483
+<FLClient> total acc:0.711422846
+<FLClient> total acc:0.719438878
+<FLClient> total acc:0.733466934
+```
 
 ## References
 
-[1] Lan Z ,  Chen M ,  Goodman S , et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations[J].  2019.
+[1] Lan Z, Chen M, Goodman S, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations[J]. 2019.
diff --git a/docs/federated/docs/source_zh_cn/local_differential_privacy_training_noise.md b/docs/federated/docs/source_zh_cn/local_differential_privacy_training_noise.md
index 77d703138797849dfd1e97142181f1be898aa27b..e9c683384500cb336d8f1784803a79ec70e90547 100644
--- a/docs/federated/docs/source_zh_cn/local_differential_privacy_training_noise.md
+++ b/docs/federated/docs/source_zh_cn/local_differential_privacy_training_noise.md
@@ -20,7 +20,7 @@ $$
 
 MindSpore Federated提供基于本地差分隐私的安全聚合算法，防止客户端上传本地模型时泄露用户隐私数据。
 
-MindSpore Federated客户端会生成一个与本地模型$W$相同维度的差分噪声矩阵$G$，然后将二者相加，得到一个满足差分隐私定义的权重$W_p$:
+MindSpore Federated客户端会生成一个与本地模型$W$相同维度的差分噪声矩阵$G$，然后将二者相加，得到一个满足差分隐私定义的权重$W_p$：
 
 $$
 W_p=W+G
diff --git a/docs/federated/docs/source_zh_cn/local_differential_privacy_training_signds.md b/docs/federated/docs/source_zh_cn/local_differential_privacy_training_signds.md
index 21d9e8d67826fed743a54b0c1e4a6e2dd7754ab6..dbd4c32f3e387712f618ef6a183923ca7b13f868 100644
--- a/docs/federated/docs/source_zh_cn/local_differential_privacy_training_signds.md
+++ b/docs/federated/docs/source_zh_cn/local_differential_privacy_training_signds.md
@@ -4,7 +4,7 @@
 
 ## 隐私保护背景
 
-联邦学习通过让参与方只上传本地训练后的新模型或更新模型的update信息，实现了client用户不上传原始数据集就能参与全局模型训练的目的，打通了数据孤岛。这种普通场景的联邦学习对应MindSpore联邦学习框架中的默认方案，启动`server`时，`encrypt_type`开关默认为`not_encrypt`，联邦学习教程中的`安装部署`与`应用实践`都默认使用这种方式），是没有任何加密扰动等保护隐私处理的普通联邦求均方案，为方便描述，下文以`not_encrypt`来特指这种默认方案。
+联邦学习通过让参与方只上传本地训练后的新模型或更新模型的update信息，实现了client用户不上传原始数据集就能参与全局模型训练的目的，打通了数据孤岛。这种普通场景的联邦学习对应MindSpore联邦学习框架中的默认方案，启动`server`时，`encrypt_type`开关默认为`not_encrypt`，联邦学习教程中的`安装部署`与`应用实践`都默认使用这种方式，是没有任何加密扰动等保护隐私处理的普通联邦求均方案，为方便描述，下文以`not_encrypt`来特指这种默认方案。
 
 这种联邦学习方案并不是毫无隐私泄漏的，使用上述`not_encrypt`方案进行训练，服务端Server收到客户端Client上传的本地训练模型，仍可通过一些攻击方法[1]重构用户训练数据，从而泄露用户隐私，所以`not_encrypt`方案需要进一步增加用户隐私保护机制。
 
@@ -31,7 +31,7 @@ SignDS[2]是Sign Dimension Select的缩写，处理对象是客户端Client的`u
 
 如果服务端Server指定总共选择的维度数量`h`，客户端Client会直接使用该值，否则各客户端Client会本地计算出最优的输出维度`h`。
 
-随后SignDS算法会输出应从`topk`集合和`non-topk`集合中选择的维度数量（记为$v$），如下表中示例，两个集合总共挑选维度h=3，
+随后SignDS算法会输出应从`topk`集合和`non-topk`集合中选择的维度数量（记为$v$），如下表中示例，两个集合总共挑选维度h=3。
 
 客户端Client按照SignDS算法输出的维度数量，均匀随机挑选维度，将维度序号和sign值发送至服务端Server即可，维度序号如果按照先从`topk`挑选，再从`non-topk`挑选的顺序输出，则需要对维度序号列表`index`进行洗牌打乱操作，下表为该算法各客户端Client最终传输至服务端Server的信息：
 
@@ -56,7 +56,7 @@ SignDS[2]是Sign Dimension Select的缩写，处理对象是客户端Client的`u
 
 差分隐私噪声方案通过加噪的方式，让攻击者无法确定原始信息，从而实现隐私保护；而差分隐私SignDS方案只激活部分维度，且用sign值代替原始值，很大程度上保护了用户隐私。进一步的，利用差分隐私指数机制让攻击者无法确认激活的维度是否是重要（来自`topk`集合），且无法确认输出维度中来自`topk`的维度数量是否超过给定阈值。
 
-对于每个客户端Client的任意两个update $\Delta$ 和 $\Delta'$  ，其`topk`维度集合分别是  $S_{topk}$ ， ${S'}_{topk}$ ，该算法任意可能的输出维度集合是 ${J}\in {\mathcal{J}} $ ，记 $\nu=|{S}_{topk}\cap {J}|$ ,  $\nu'=|{S'}_{topk}\cap {J}|$  是 ${J}$ 和`topk` 集合交集的数量，算法使得以下不等式成立：
+对于每个客户端Client的任意两个update $\Delta$ 和 $\Delta'$  ，其`topk`维度集合分别是 $S_{topk}$ ， ${S'}_{topk}$ ，该算法任意可能的输出维度集合是 ${J}\in {\mathcal{J}}$ ，记 $\nu=|{S}_{topk}\cap {J}|$ ， $\nu'=|{S'}_{topk}\cap {J}|$ 是 ${J}$ 和`topk` 集合交集的数量，算法使得以下不等式成立：
 
 $$
 \frac{{Pr}[{J}|\Delta]}{{Pr}[{J}|\Delta']}=\frac{{Pr}[{J}|{S}_{topk}]}{{Pr}[{J}|{S'}_{topk}]}=\frac{\frac{{exp}(\frac{\epsilon}{\phi_u}\cdot u({S}_{topk},{J}))}{\sum_{{J'}\in {\mathcal{J}}}{exp}(\frac{\epsilon}{\phi_u}\cdot u({S}_{topk}, {J'}))}}{\frac{{exp}(\frac{\epsilon}{\phi_u}\cdot u({S'}_{topk}, {J}))}{\sum_{ {J'}\in {\mathcal{J}}}{exp}(\frac{\epsilon}{\phi_u}\cdot u( {S'}_{topk},{J'}))}}=\frac{\frac{{exp}(\epsilon\cdot \unicode{x1D7D9}(\nu \geq \nu_{th}))}{\sum_{\tau=0}^{\tau=\nu_{th}-1}\omega_{\tau} + \sum_{\tau=\nu_{th}}^{\tau=h}\omega_{\tau}\cdot {exp}(\epsilon)}}{\frac{ {exp}(\epsilon\cdot \unicode{x1D7D9}(\nu' \geq\nu_{th}))}{\sum_{\tau=0}^{\tau=\nu_{th}-1}\omega_{\tau}+\sum_{\tau=\nu_{th}}^{\tau=h}\omega_{\tau}\cdot {exp}(\epsilon)}}\\= \frac{{exp}(\epsilon\cdot \unicode{x1D7D9} (\nu \geq \nu_{th}))}{ {exp}(\epsilon\cdot \unicode{x1D7D9} (\nu' \geq \nu_{th}))} \leq \frac{{exp}(\epsilon\cdot 1)}{{exp}(\epsilon\cdot 0)} = {exp}(\epsilon),
diff --git a/docs/federated/docs/source_zh_cn/sentiment_classification_application.md b/docs/federated/docs/source_zh_cn/sentiment_classification_application.md
index f557bfc0bd3c8e9e2bd372a749afb3bd80fa49d0..0545e672b9bbc5f4a053b0a4deebd32691de269e 100644
--- a/docs/federated/docs/source_zh_cn/sentiment_classification_application.md
+++ b/docs/federated/docs/source_zh_cn/sentiment_classification_application.md
@@ -557,4 +557,4 @@ app
 
 ## 参考文献
 
-[1] Lan Z ,  Chen M ,  Goodman S , et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations[J].  2019.
+[1] Lan Z, Chen M, Goodman S, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations[J]. 2019.
diff --git a/tutorials/application/source_en/cv/dcgan.md b/tutorials/application/source_en/cv/dcgan.md
index 384187de6cc25e1c3e605eea0f6deb92bdde9f06..c4677bcdaea2c9274e8be9f891a2a11a8c78960e 100644
--- a/tutorials/application/source_en/cv/dcgan.md
+++ b/tutorials/application/source_en/cv/dcgan.md
@@ -106,28 +106,27 @@ Define the `create_dataset_imagenet` function to process and augment data.
 
 ```python
 import numpy as np
-import mindspore as ms
 import mindspore.dataset as ds
 import mindspore.dataset.vision as vision
 
-from mindspore import nn, ops
-
 def create_dataset_imagenet(dataset_path):
     """Data loading"""
-    data_set = ds.ImageFolderDataset(dataset_path, num_parallel_workers=4, shuffle=True,
-                                     decode=True)
+    dataset = ds.ImageFolderDataset(dataset_path,
+                                    num_parallel_workers=4,
+                                    shuffle=True,
+                                    decode=True)
 
     # Data augmentation
-    transform_img = [
+    transforms = [
         vision.Resize(image_size),
         vision.CenterCrop(image_size),
         vision.HWC2CHW(),
-        lambda x: ((x / 255).astype("float32"), np.random.normal(size=(nz, 1, 1)).astype("float32"))
+        lambda x: ((x / 255).astype("float32"))
     ]
 
     # Data mapping
-    data_set = data_set.map(operations=transform_img, input_columns="image", output_columns=["image", "latent_code"], num_parallel_workers=4)
-    data_set = data_set.project(["image", "latent_code"])
+    dataset = dataset.project('image')
+    dataset = dataset.map(transforms, 'image')
 
     # Batch operation
     data_set = data_set.batch(batch_size)