From 864070bebd933ffcac3ca89f90a943145e3e3b3d Mon Sep 17 00:00:00 2001
From: huanxiaoling <3174348550@qq.com>
Date: Fri, 21 Oct 2022 15:56:56 +0800
Subject: [PATCH] update the federated en files

---
 docs/federated/docs/source_en/data_join.md    |   2 +-
 docs/federated/docs/source_en/deploy_vfl.md   |  68 +++++
 docs/federated/docs/source_en/index.rst       |   1 +
 .../docs/source_en/split_wnd_application.md   | 237 ++++++++++++++++++
 .../federated/docs/source_zh_cn/deploy_vfl.md |   6 +-
 .../source_zh_cn/split_wnd_application.md     |  14 +-
 .../analysis_and_preparation.md               |   6 +
 7 files changed, 323 insertions(+), 11 deletions(-)
 create mode 100644 docs/federated/docs/source_en/deploy_vfl.md
 create mode 100644 docs/federated/docs/source_en/split_wnd_application.md

diff --git a/docs/federated/docs/source_en/data_join.md b/docs/federated/docs/source_en/data_join.md
index d9fe2336b4..8857c27eb6 100644
--- a/docs/federated/docs/source_en/data_join.md
+++ b/docs/federated/docs/source_en/data_join.md
@@ -198,7 +198,7 @@ Follower data export results:
 ……
 ```
 
-## An Example for Advanced Experience
+## An Example for In-Depth Experience
 
 For detailed API documentation for the following code, see [Data Access Documentation](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python/data_join.rst).
 
diff --git a/docs/federated/docs/source_en/deploy_vfl.md b/docs/federated/docs/source_en/deploy_vfl.md
new file mode 100644
index 0000000000..ab8bccc56f
--- /dev/null
+++ b/docs/federated/docs/source_en/deploy_vfl.md
@@ -0,0 +1,68 @@
+# Vertical Federated Deployment
+
+This document explains how to deploy and use the vertical federated learning framework.
+
+The physical architecture of MindSpore Vertical Federated Learning (VFL) is shown in the following figure:
+
+![](./images/deploy_VFL.png)
+
+As shown above, there are two participants in the vertical federated interaction, a Leader node and a Follower node, and each participant runs processes in two roles: `FLDataWorker` and `VFLTrainer`.
+
+- FLDataWorker
+
+    The main functions of `FLDataWorker` include:
+
+    1. Dataset intersection: obtains the common user intersection of the two vertical federated participants. A private set intersection protocol is supported, which prevents the participants from obtaining ID information outside the intersection.
+    2. Training data generation: after the intersection IDs are obtained, the data features are expanded to generate the MindRecord files used for training.
+    3. Management interface: a `RESTful` interface is provided to users for cluster management.
+
+    In a federated learning task, there is only one `Scheduler`, which communicates with the `Server` via the TCP protocol.
+
+- VFLTrainer
+
+    `VFLTrainer` is the main body that executes the vertical federated training task: it performs the forward and backward computation of the sliced models, embedding tensor transfer, gradient tensor transfer, and parameter update by the optimizer. The current version supports single-machine single-card and single-machine multi-card training modes.
+
+    In the MindSpore federated learning framework, the `Server` also supports elastic scaling and disaster recovery, so that hardware resources can be provisioned dynamically without interrupting training tasks.
+
+`FLDataWorker` and `VFLTrainer` are generally deployed on the same server or in the same container.
+
+## Preparation
+
+> It is recommended to use [Anaconda](https://www.anaconda.com/) to create a virtual environment for the following operations.
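+
+For example, a dedicated environment can be created and activated as follows (a minimal sketch; the environment name and Python version are illustrative, and the Python version should be one supported by your MindSpore release):
+
+```shell
+conda create -n mindspore-vfl python=3.8 -y
+conda activate mindspore-vfl
+```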
+
+### Installing MindSpore
+
+MindSpore vertical federated learning supports deployment on x86 CPU, GPU CUDA, and Ascend hardware platforms. The latest version of MindSpore can be installed by referring to the [MindSpore Installation Guide](https://www.mindspore.cn/install).
+
+### Installing MindSpore Federated
+
+Compile and install MindSpore Federated from [source code](https://gitee.com/mindspore/federated).
+
+```shell
+git clone https://gitee.com/mindspore/federated.git -b master
+cd federated
+export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64:/usr/local/lib:/usr/local/openmpi/lib:$LD_LIBRARY_PATH
+bash build.sh
+```
+
+For `bash build.sh`, compilation can be accelerated with the `-jn` option, e.g. `-j16`, and third-party dependencies can be downloaded from Gitee instead of GitHub with the `-S on` option.
+
+Once compiled, find the MindSpore Federated whl installation package in the `build/package/` directory and install it.
+
+```shell
+pip install mindspore_federated-{version}-{python_version}-linux_{arch}.whl
+```
+
+#### Verifying Installation
+
+Execute the following command to verify the installation. The installation is successful if no error is reported when the Python module is imported.
+
+```python
+from mindspore_federated import FLServerJob
+```
+
+## Running the Example
+
+A running sample of FLDataWorker can be found in [Vertical federated learning data access](https://www.mindspore.cn/federated/docs/en/master/data_join.html).
+
+A running sample of VFLTrainer can be found in [Vertical federated learning model training - Wide&Deep Recommended Application](https://www.mindspore.cn/federated/docs/en/master/split_wnd_application.html).
diff --git a/docs/federated/docs/source_en/index.rst b/docs/federated/docs/source_en/index.rst
index bc20531c9a..54b887386d 100644
--- a/docs/federated/docs/source_en/index.rst
+++ b/docs/federated/docs/source_en/index.rst
@@ -89,6 +89,7 @@ Common Application Scenarios
    :caption: Vertical Application
 
    data_join
+   split_wnd_application
 
 .. toctree::
    :maxdepth: 1
diff --git a/docs/federated/docs/source_en/split_wnd_application.md b/docs/federated/docs/source_en/split_wnd_application.md
new file mode 100644
index 0000000000..a75fa829d9
--- /dev/null
+++ b/docs/federated/docs/source_en/split_wnd_application.md
@@ -0,0 +1,237 @@
+# Vertical Federated Learning Model Training - Wide&Deep Recommendation Application
+
+## Overview
+
+MindSpore Federated provides a vertical federated learning infrastructure component based on Split Learning.
+
+Vertical FL model training scenario: includes two stages, forward propagation and backward propagation/parameter update.
+
+Forward propagation: after the data intersection module has processed the participants' data and aligned the feature and label information, the Follower participant feeds its local feature information into the front-end network model. The feature tensor output by the front-end network model is encrypted/scrambled by the privacy security module and transmitted to the Leader participant by the communication module. The Leader participant feeds the received feature tensor into the back-end network model, and the loss value is computed by taking the predictions output by the back-end network model and the local label information as the inputs of the loss function.
+
+![](./images/vfl_forward.png)
+
+Backward propagation: the Leader participant computes the parameter gradients of the back-end network model from the loss value, trains and updates the parameters of the back-end network model, and transmits the gradient tensors associated with the feature tensor to the Follower participant by the communication module after they are encrypted/scrambled by the privacy security module. The Follower participant uses the received gradient tensors to train and update the parameters of the front-end network model.
+
+Vertical FL model inference scenario: similar to the forward propagation stage of the training scenario, except that the predictions of the back-end network model are taken directly as the output, without computing the loss value.
+
+## Network and Data
+
+This sample provides a federated learning training example for recommendation tasks, using the Wide&Deep network and the Criteo dataset. In this case, the vertical federated learning system consists of a Leader participant and a Follower participant. The Leader participant holds 20×2-dimensional feature information together with the label information, while the Follower participant holds 19×2-dimensional feature information. Each participant deploys one Wide&Deep network, and the two networks are trained collaboratively by exchanging embedding vectors and gradient vectors, without disclosing the original feature and label information.
+
+For a detailed description of the principles and characteristics of the Wide&Deep network, see [MindSpore ModelZoo - Wide&Deep - Wide&Deep Overview](https://gitee.com/mindspore/models/blob/master/official/recommend/wide_and_deep/README.md#widedeep-description) and its [research paper](https://arxiv.org/pdf/1606.07792.pdf).
+
+## Dataset Preparation
+
+This sample is trained and tested on the Criteo dataset. Before running the sample, preprocess the Criteo dataset by referring to [MindSpore ModelZoo - Wide&Deep - Quick Start](https://gitee.com/mindspore/models/blob/master/official/recommend/wide_and_deep/README.md#quick-start).
+
+1. Clone the MindSpore ModelZoo code.
+
+    ```shell
+    git clone https://gitee.com/mindspore/models.git
+    cd models/official/recommend/wide_and_deep
+    ```
+
+2. Download the dataset.
+
+    ```shell
+    mkdir -p data/origin_data && cd data/origin_data
+    wget http://go.criteo.net/criteo-research-kaggle-display-advertising-challenge-dataset.tar.gz
+    tar -zxvf criteo-research-kaggle-display-advertising-challenge-dataset.tar.gz
+    ```
+
+3. Use the following script to preprocess the data. Preprocessing may take up to an hour, and the generated MindRecord data is stored in the `data/mindrecord` path. Since preprocessing consumes a lot of memory, a server is recommended.
+
+    ```shell
+    cd ../..
+    python src/preprocess_data.py --data_path=./data/ --dense_dim=13 --slot_dim=26 --threshold=100 --train_line_count=45840617 --skip_id_convert=0
+    ```
+
+## Quick Experience
+
+This sample runs as shell scripts that launch Python programs.
+
+1. Refer to the [MindSpore website guidance](https://www.mindspore.cn/install) to install MindSpore 1.8.1 or later.
+
+2. Use the following commands to install the Python libraries that MindSpore Federated depends on.
+
+    ```shell
+    cd federated
+    python -m pip install -r requirements_test.txt
+    ```
+
+3. Copy the Criteo dataset generated by [preprocessing](#dataset-preparation) to this directory.
+
+    ```shell
+    cd tests/st/splitnn_criteo
+    cp -rf ${DATA_ROOT_PATH}/data/mindrecord/ ./
+    ```
+
+4. Run the sample start scripts.
+
+    ```shell
+    # start leader:
+    bash run_vfl_train_socket_leader.sh
+
+    # start follower:
+    bash run_vfl_train_socket_follower.sh
+    ```
+
+5. View the training log `log_local_gpu.txt`.
+
+    ```text
+    INFO:root:epoch 0 step 100/2582 wide_loss: 0.528141 deep_loss: 0.528339
+    INFO:root:epoch 0 step 200/2582 wide_loss: 0.499408 deep_loss: 0.499410
+    INFO:root:epoch 0 step 300/2582 wide_loss: 0.477544 deep_loss: 0.477882
+    INFO:root:epoch 0 step 400/2582 wide_loss: 0.474377 deep_loss: 0.476771
+    INFO:root:epoch 0 step 500/2582 wide_loss: 0.472926 deep_loss: 0.475157
+    INFO:root:epoch 0 step 600/2582 wide_loss: 0.464844 deep_loss: 0.467011
+    INFO:root:epoch 0 step 700/2582 wide_loss: 0.464496 deep_loss: 0.466615
+    INFO:root:epoch 0 step 800/2582 wide_loss: 0.466895 deep_loss: 0.468971
+    INFO:root:epoch 0 step 900/2582 wide_loss: 0.463155 deep_loss: 0.465299
+    INFO:root:epoch 0 step 1000/2582 wide_loss: 0.457914 deep_loss: 0.460132
+    INFO:root:epoch 0 step 1100/2582 wide_loss: 0.453361 deep_loss: 0.455767
+    INFO:root:epoch 0 step 1200/2582 wide_loss: 0.457566 deep_loss: 0.459997
+    INFO:root:epoch 0 step 1300/2582 wide_loss: 0.460841 deep_loss: 0.463281
+    INFO:root:epoch 0 step 1400/2582 wide_loss: 0.460973 deep_loss: 0.463365
+    INFO:root:epoch 0 step 1500/2582 wide_loss: 0.459204 deep_loss: 0.461563
+    INFO:root:epoch 0 step 1600/2582 wide_loss: 0.456771 deep_loss: 0.459200
+    INFO:root:epoch 0 step 1700/2582 wide_loss: 0.458479 deep_loss: 0.460963
+    INFO:root:epoch 0 step 1800/2582 wide_loss: 0.449609 deep_loss: 0.452122
+    INFO:root:epoch 0 step 1900/2582 wide_loss: 0.451775 deep_loss: 0.454225
+    INFO:root:epoch 0 step 2000/2582 wide_loss: 0.460343 deep_loss: 0.462826
+    INFO:root:epoch 0 step 2100/2582 wide_loss: 0.456814 deep_loss: 0.459201
+    INFO:root:epoch 0 step 2200/2582 wide_loss: 0.452091 deep_loss: 0.454555
+    INFO:root:epoch 0 step 2300/2582 wide_loss: 0.461522 deep_loss: 0.464001
+    INFO:root:epoch 0 step 2400/2582 wide_loss: 0.442355 deep_loss: 0.444790
+    INFO:root:epoch 0 step 2500/2582 wide_loss: 0.450675 deep_loss: 0.453242
+    ...
+    ```
+
+6. Stop the training processes.
+
+    ```shell
+    pid=`ps -ef|grep run_vfl_train_socket |grep -v "grep" | grep -v "finish" |awk '{print $2}'` && for id in $pid; do kill -9 $id && echo "killed $id"; done
+    ```
+
+## In-Depth Experience
+
+Before starting vertical federated learning training, users need to construct the dataset iterator and the network structure, as they do for normal deep learning training with MindSpore.
+
+### Building the Dataset
+
+A simulated process is currently used, i.e., both participants read the same data source, but during training each participant uses only part of the feature or label data, as shown in [Network and Data](#network-and-data). Later, the [Data Access](https://www.mindspore.cn/federated/docs/en/master/data_join/data_join.html) method will be used so that each participant imports its own data.
+
+```python
+from run_vfl_train_local import construct_local_dataset
+
+
+ds_train, _ = construct_local_dataset()
+train_iter = ds_train.create_dict_iterator()
+```
+
+### Building the Network
+
+Leader participant network:
+
+```python
+from network_config import config
+from wide_and_deep import LeaderNet, LeaderLossNet
+
+
+leader_base_net = LeaderNet(config)
+leader_train_net = LeaderLossNet(leader_base_net, config)
+```
+
+Follower participant network:
+
+```python
+from network_config import config
+from wide_and_deep import FollowerNet, FollowerLossNet
+
+
+follower_base_net = FollowerNet(config)
+follower_train_net = FollowerLossNet(follower_base_net, config)
+```
+
+### Vertical Federated Communication Base
+
+Before training, the communication base must be started first so that the Leader and Follower participants can form a network. Detailed API documentation can be found in the [Vertical Federated Communicator](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python_en/vertical/vertical_communicator.rst).
+
+Both parties need to import the vertical federated communicator:
+
+```python
+from mindspore_federated.startup.vertical_federated_local import VerticalFederatedCommunicator, ServerConfig
+```
+
+Leader participant communication base:
+
+```python
+http_server_config = ServerConfig(server_name='serverB', server_address='127.0.0.1:10180')
+remote_server_config = ServerConfig(server_name='serverA', server_address='127.0.0.1:10190')
+vertical_communicator = VerticalFederatedCommunicator(http_server_config=http_server_config,
+                                                      remote_server_config=remote_server_config)
+vertical_communicator.launch()
+```
+
+Follower participant communication base:
+
+```python
+http_server_config = ServerConfig(server_name='serverA', server_address='127.0.0.1:10190')
+remote_server_config = ServerConfig(server_name='serverB', server_address='127.0.0.1:10180')
+vertical_communicator = VerticalFederatedCommunicator(http_server_config=http_server_config,
+                                                      remote_server_config=remote_server_config)
+vertical_communicator.launch()
+```
+
+### Building a Vertical Federated Network
+
+Users need to wrap their constructed networks into a vertical federated network by using the classes provided by MindSpore Federated. Detailed API documentation can be found in the [Vertical Federated Training Interface](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python_en/vertical/vertical_federated_FLModel.rst).
+
+Both parties need to import the vertical federated training interface:
+
+```python
+from mindspore_federated import FLModel, FLYamlData
+```
+
+Leader participant vertical federated network:
+
+```python
+leader_yaml_data = FLYamlData(config.leader_yaml_path)
+leader_fl_model = FLModel(yaml_data=leader_yaml_data,
+                          network=leader_base_net,
+                          train_network=leader_train_net)
+```
+
+Follower participant vertical federated network:
+
+```python
+follower_yaml_data = FLYamlData(config.follower_yaml_path)
+follower_fl_model = FLModel(yaml_data=follower_yaml_data,
+                            network=follower_base_net,
+                            train_network=follower_train_net)
+```
+
+### Vertical Training
+
+For the vertical training process, refer to the [Overview](#overview).
+
+Leader participant training process:
+
+```python
+import itertools
+
+for step, item in itertools.product(range(config.epochs), train_iter):
+    # Receive the feature tensor (embedding) sent by the Follower participant
+    follower_out = vertical_communicator.receive("serverA")
+    leader_out = leader_fl_model.forward_one_step(item, follower_out)
+    grad_scale = leader_fl_model.backward_one_step(item, follower_out)
+    # Send the gradient scale associated with the received embedding back to the Follower
+    vertical_communicator.send_tensors("serverA", grad_scale)
+```
+
+Follower participant training process:
+
+```python
+import itertools
+
+for step, item in itertools.product(range(config.epochs), train_iter):
+    follower_out = follower_fl_model.forward_one_step(item)
+    # Send the local embedding output to the Leader participant
+    vertical_communicator.send_tensors("serverB", follower_out)
+    # Receive the gradient scale computed by the Leader
+    scale = vertical_communicator.receive("serverB")
+    follower_fl_model.backward_one_step(item, sens=scale)
+```
diff --git a/docs/federated/docs/source_zh_cn/deploy_vfl.md b/docs/federated/docs/source_zh_cn/deploy_vfl.md
index e4b0066309..088c68f5ce 100644
--- a/docs/federated/docs/source_zh_cn/deploy_vfl.md
+++ b/docs/federated/docs/source_zh_cn/deploy_vfl.md
@@ -8,7 +8,7 @@ MindSpore Vertical Federated Learning (VFL) 物理架构如图所示:
 
 ![](./images/deploy_VFL.png)
 
-如上图所示,在纵向联邦的交互中有两个参与方:Leader node和Follower node,每一个参与方都有两种角色的进程:`FLDataWorker`和`VFLTrainer`:
+如上图所示,在纵向联邦的交互中有两个参与方:Leader node和Follower node,每一个参与方都有两种角色的进程:`FLDataWorker`和`VFLTrainer`:
 
 - FLDataWorker
 
@@ -65,6 +65,6 @@
 
 ## 运行样例
 
-FLDataWorker的运行样例可参考[纵向联邦学习数据接入](https://www.mindspore.cn/federated/docs/zh-CN/master/data_join.html) 。
+FLDataWorker的运行样例可参考[纵向联邦学习数据接入](https://www.mindspore.cn/federated/docs/zh-CN/master/data_join.html)。
 
-VFLTrainer的运行样例可参考[纵向联邦学习模型训练 - Wide&Deep推荐应用](https://www.mindspore.cn/federated/docs/zh-CN/master/split_wnd_application.html) 。
+VFLTrainer的运行样例可参考[纵向联邦学习模型训练 - Wide&Deep推荐应用](https://www.mindspore.cn/federated/docs/zh-CN/master/split_wnd_application.html)。
diff --git a/docs/federated/docs/source_zh_cn/split_wnd_application.md b/docs/federated/docs/source_zh_cn/split_wnd_application.md
index f86da4e9e2..0fd82a0a8c 100644
--- a/docs/federated/docs/source_zh_cn/split_wnd_application.md
+++ b/docs/federated/docs/source_zh_cn/split_wnd_application.md
@@ -8,7 +8,7 @@ MindSpore Federated提供基于拆分学习(Split Learning)的纵向联邦
 
 纵向FL模型训练场景:包括前向传播和后向传播/参数更新两个阶段。
 
-前向传播:经数据求交模块处理参数方数据,配准特征信息和标签信息后,Follower参与方将本地特征信息输入前级网络模型,将前级网络模型输出的特征张量,经隐私安全模块加密/加扰后,由通信模块传输传输给Leader参与方。Leader参与方将收到的特征张量输入后级网络模型,以后级网络模型输出的预测值和本地标签信息为损失函数输入,计算损失值。
+前向传播:经数据求交模块处理参数方数据,配准特征信息和标签信息后,Follower参与方将本地特征信息输入前级网络模型,将前级网络模型输出的特征张量,经隐私安全模块加密/加扰后,由通信模块传输给Leader参与方。Leader参与方将收到的特征张量输入后级网络模型,以后级网络模型输出的预测值和本地标签信息为损失函数输入,计算损失值。
 
 ![](./images/vfl_forward.png)
 
@@ -24,11 +24,11 @@
 
 本样例以Wide&Deep网络和Criteo数据集为例,提供了面向推荐任务的联邦学习训练样例。如上图所示,本案例中,纵向联邦学习系统由Leader参与方和Follower参与方组成。其中,Leader参与方持有20×2维特征信息和标签信息,Follower参与方持有19×2维特征信息。Leader参与方和Follower参与方分别部署1组Wide&Deep网络,并通过交换embedding向量和梯度向量,在不泄露原始特征和标签信息的前提下,实现对网络模型的协同训练。
 
-Wide&Deep网络原理特性的详细介绍,可参考[MindSpore ModelZoo - Wide&Deep - Wide&Deep概述](https://gitee.com/mindspore/models/blob/master/official/recommend/wide_and_deep/README_CN.md#widedeep%E6%A6%82%E8%BF%B0) 及其[研究论文](https://arxiv.org/pdf/1606.07792.pdf) 。
+Wide&Deep网络原理特性的详细介绍,可参考[MindSpore ModelZoo - Wide&Deep - Wide&Deep概述](https://gitee.com/mindspore/models/blob/master/official/recommend/wide_and_deep/README_CN.md#widedeep%E6%A6%82%E8%BF%B0) 及其[研究论文](https://arxiv.org/pdf/1606.07792.pdf)。
 
 ## 数据集准备
 
-本样例基于Criteo数据集进行训练和测试,在运行样例前,需参考[MindSpore ModelZoo - Wide&Deep - 快速入门](https://gitee.com/mindspore/models/blob/master/official/recommend/wide_and_deep/README_CN.md#%E5%BF%AB%E9%80%9F%E5%85%A5%E9%97%A8) ,对Criteo数据集进行预处理。
+本样例基于Criteo数据集进行训练和测试,在运行样例前,需参考[MindSpore ModelZoo - Wide&Deep - 快速入门](https://gitee.com/mindspore/models/blob/master/official/recommend/wide_and_deep/README_CN.md#%E5%BF%AB%E9%80%9F%E5%85%A5%E9%97%A8),对Criteo数据集进行预处理。
 
 1. 克隆MindSpore ModelZoo代码。
 
@@ -125,7 +125,7 @@
 
 ### 构造数据集
 
-当前采用模拟流程,即两方读取数据源一样,但训练时,两方只使用部分的特征或标签数据,如[网络和数据](#网络和数据) 所示。后续将采用[数据接入](https://www.mindspore.cn/federated/docs/zh-CN/master/data_join/data_join.html) 方法两方各自导入数据。
+当前采用模拟流程,即两方读取数据源一样,但训练时,两方只使用部分的特征或标签数据,如[网络和数据](#网络和数据)所示。后续将采用[数据接入](https://www.mindspore.cn/federated/docs/zh-CN/master/data_join/data_join.html)方法两方各自导入数据。
 
 ```python
 from run_vfl_train_local import construct_local_dataset
@@ -161,7 +161,7 @@ follower_train_net = FollowerLossNet(follower_base_net, config)
 
 ### 纵向联邦通信底座
 
-在训练前首先要启动通信底座,使Leader和Follower参与方组网。详细的API文档可以参考[纵向联邦通信器](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python/vertical/vertical_communicator.rst) 。
+在训练前首先要启动通信底座,使Leader和Follower参与方组网。详细的API文档可以参考[纵向联邦通信器](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python/vertical/vertical_communicator.rst)。
 
 两方都需要导入纵向联邦通信器:
 
@@ -191,7 +191,7 @@ vertical_communicator.launch()
 
 ### 构建纵向联邦网络
 
-用户需要使用MindSpore Federated提供的类,将自己构造好的网络封装成纵向联邦网络。详细的API文档可以参考[纵向联邦训练接口](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python/vertical/vertical_federated_FLModel.rst) 。
+用户需要使用MindSpore Federated提供的类,将自己构造好的网络封装成纵向联邦网络。详细的API文档可以参考[纵向联邦训练接口](https://gitee.com/mindspore/federated/blob/master/docs/api/api_python/vertical/vertical_federated_FLModel.rst)。
 
 两方都需要导入纵向联邦训练接口:
 
@@ -219,7 +219,7 @@ follower_fl_model = FLModel(yaml_data=follower_yaml_data,
 
 ### 纵向训练
 
-纵向训练的流程可以参考[概述](#概述) 。
+纵向训练的流程可以参考[概述](#概述)。
 
 Leader参与方训练流程:
 
diff --git a/docs/mindspore/source_en/migration_guide/analysis_and_preparation.md b/docs/mindspore/source_en/migration_guide/analysis_and_preparation.md
index b78ad4f381..d8dcb6af7c 100644
--- a/docs/mindspore/source_en/migration_guide/analysis_and_preparation.md
+++ b/docs/mindspore/source_en/migration_guide/analysis_and_preparation.md
@@ -11,6 +11,8 @@ When you obtain a paper to implement migration on MindSpore, you need to find th
 3. The code is new and maintained by developers.
 4. The PyTorch reference code is preferred.
 
+If the results cannot be reproduced in the reference project or the version information is missing, check the project issues for more information.
+
 If a new paper has no reference implementation, you can refer to [Constructing MindSpore Network](https://www.mindspore.cn/docs/en/master/migration_guide/model_development/model_development.html).
 
 ## Analyzing Algorithm and Network Structure
@@ -91,6 +93,10 @@ The API missing analysis here refers to APIs in the network execution diagram, i
 
 Take the PyTorch code migration as an example. After obtaining the reference code implementation, you can filter keywords such as `torch`, `nn`, and `ops` to obtain the used APIs. If a method of another repository is invoked, you need to analyze the API manually. Then, check the [PyTorch and MindSpore API Mapping Table](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_api_mapping.html), or search the [API](https://www.mindspore.cn/docs/en/master/api_python/mindspore.ops.html) pages for the corresponding API implementation.
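+
+As a rough first pass, the used APIs can also be listed with a simple text search over the reference code. The following command is only an illustrative sketch (the repository path and the regular expression are assumptions, and dynamically constructed calls still need manual analysis):
+
+```shell
+# Count torch.* API usages in a PyTorch reference repo, most frequent first
+grep -rhoE "torch(\.\w+)+" --include="*.py" ./reference_repo | sort | uniq -c | sort -rn
+```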
+
+Generally, the training process of a network consists of forward calculation, backward gradient calculation, and parameter update. In some special scenarios, the gradient itself requires a further gradient calculation, e.g., [gradient penalty](https://arxiv.org/pdf/1704.00028.pdf); such scenarios use second-order gradient calculation. When second-order gradients are used in a network, the second-order support of the APIs requires additional analysis: the derivative chains of the network need to be analyzed by code walk-through, and every API within the second-order derivative chain must support second-order differentiation. Second-order support can be checked in the [MindSpore gradient section source code](https://gitee.com/mindspore/mindspore/tree/master/mindspore/python/mindspore/ops/_grad) by verifying whether the first-order gradient operator of an API has a corresponding bprop function definition.
+
+For example, if the second-order derivative chain of the network contains the StridedSlice slicing operation, you can look up the [backward registration code of StridedSliceGrad](https://gitee.com/mindspore/mindspore/blob/master/mindspore/python/mindspore/ops/_grad/grad_array_ops.py#L867) in the [array_ops gradient definition file](https://gitee.com/mindspore/mindspore/blob/master/mindspore/python/mindspore/ops/_grad/grad_array_ops.py). If it exists, the StridedSlice slicing operation in the current version of MindSpore supports second-order gradient calculation.
 
 For details about the mapping of other framework APIs, see the [API naming and function description](https://www.mindspore.cn/docs/en/master/api_python/mindspore.html). For APIs with the same function, MindSpore may name them differently from other frameworks, and APIs with the same name may differ from other frameworks in parameters and functions. For details, see the official description. If the corresponding API is not found, see the specific missing-API processing policy.
-- 
Gitee