diff --git a/docs/federated/docs/source_en/communication_compression.md b/docs/federated/docs/source_en/communication_compression.md
index 6d3c256dd2af03fece948067a342fbc1aa8579ef..9990e79567b52bb3116ee4d3daf46e2c63b99fff 100644
--- a/docs/federated/docs/source_en/communication_compression.md
+++ b/docs/federated/docs/source_en/communication_compression.md
@@ -1,6 +1,6 @@
# Device-Cloud Federated Learning Communication Compression
-
+
 During the device-side federated learning training process, the traffic volume affects the device-side user experience (user traffic, communication latency, number of participating FL-Clients) and is limited by cloud-side performance constraints (memory, bandwidth, CPU usage). To improve user experience and reduce performance bottlenecks, the MindSpore federated learning framework provides traffic compression for upload and download in device-cloud federated scenarios.
diff --git a/docs/federated/docs/source_en/deploy_federated_client.md b/docs/federated/docs/source_en/deploy_federated_client.md
index 5ba2d289d4489645569274ba56bb38f836c9ce28..abbf49d0f3c183e773f274687e8cedca0b28f086 100644
--- a/docs/federated/docs/source_en/deploy_federated_client.md
+++ b/docs/federated/docs/source_en/deploy_federated_client.md
@@ -2,197 +2,218 @@
-The following describes how to deploy the Federated-Client in the Android aarch and Linux x86_64 environments:
+This document describes how to compile and deploy Federated-Client.
-## Android aarch
+## Linux Compilation Guidance
-### Building a Package
+### System Environment and Third-party Dependencies
-1. Configure the build environment.
+This section describes how to complete the device-side compilation of MindSpore federated learning. Currently, compilation guidance is provided only for Linux; other systems are not supported. The following table lists the system environment and third-party dependencies required for compilation.
- Currently, only the Linux build environment is supported. For details about how to configure the Linux build environment, click [here](https://www.mindspore.cn/lite/docs/en/master/use/build.html#linux-environment-compilation).
+| Software Name | Version | Function |
+|-----------------------| ------------ | ------------ |
+| Ubuntu | 18.04.02 LTS | Operating system for compiling and running MindSpore |
+| [GCC](#installing-gcc) | 7.3.0 to 9.4.0 | C++ compiler for compiling MindSpore |
+| [git](#installing-git) | - | Source code management tool used by MindSpore |
+| [CMake](#installing-cmake) | 3.18.3 and above | Tool for compiling and building MindSpore |
+| [Gradle](#installing-gradle) | 6.6.1 | JVM-based build tool |
+| [Maven](#installing-maven) | 3.3.1 and above | Tool for managing and building Java projects |
+| [OpenJDK](#installing-openjdk) | 1.8 to 1.15 | Java development kit for building and running Java projects |
-2. Build the x86-related architecture package in the mindspore home directory.
+#### Installing GCC
- ```sh
- bash build.sh -I x86_64 -j32
- ```
+Install GCC with the following command.
- And the x86 architecture package will be generated in the path `mindspore/output/`after compiling ( please backup it to avoid auto-deletion while next compile):
+```bash
+sudo apt-get install gcc-7 -y
+```
- ```sh
- mindspore-lite-{version}-linux-x64.tar.gz
- ```
+To install a higher version of GCC, use the following command to install GCC 8.
-3. Turn on Federated-Client compile option and build the AAR package that contains aarch64 and aarch32 in the mindspore home directory.
+```bash
+sudo apt-get install gcc-8 -y
+```
- ```sh
- export MSLITE_ENABLE_FL=on
- bash build.sh -A on -j32
- ```
+Or install GCC 9.
- The Android AAR package will be generated in the path `mindspore/output/` after compiling ( please backup it to avoid auto-deletion while next compile):
+```bash
+sudo apt-get install software-properties-common -y
+sudo add-apt-repository ppa:ubuntu-toolchain-r/test
+sudo apt-get update
+sudo apt-get install gcc-9 -y
+```
- ```sh
- mindspore-lite-full-{version}.aar
- ```
+#### Installing git
-4. Since the device-side framework and the model are decoupled, we provide Android AAR package `mindspore-lite-full-{version}.aar` that does not contain model-related scripts, so users need to generate the model script corresponding to the jar package. We provide two types of model scripts for your reference ([Supervised sentiment Classification Task](https://gitee.com/mindspore/mindspore/tree/master/mindspore/lite/examples/quick_start_flclient/src/main/java/com/mindspore/flclient/demo/albert), [LeNet image classification task](https://gitee.com/mindspore/mindspore/tree/master/mindspore/lite/examples/quick_start_flclient/src/main/java/com/mindspore/flclient/demo/lenet)). Users can refer to these two types of model scripts, and generate the corresponding jar package (assuming the name is `quick_start_flclient.jar`) after customizing the model script. The jar packages corresponding to the model scripts we provide can be obtained in the following ways:
+Install git with the following command.
- After downloading the latest code on [MindSpore Open Source Warehouse](https://gitee.com/mindspore/mindspore), perform the following operations:
+```bash
+sudo apt-get install git -y
+```
- ```sh
- cd mindspore/mindspore/lite/examples/quick_start_flclient
- sh build.sh -r "mindspore-lite-{version}-linux-x64.tar.gz" # After -r, give the absolute path of the latest x86 architecture package that generate at step 2
- ```
+#### Installing CMake
+
+Install [CMake](https://cmake.org/) with the following command.
+
+```bash
+wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | sudo apt-key add -
+sudo apt-add-repository "deb https://apt.kitware.com/ubuntu/ $(lsb_release -cs) main"
+sudo apt-get install cmake -y
+```
- After running the above command, the path of the jar package generated is: `mindspore/mindspore/lite/examples/quick_start_flclient/target/quick_start_flclient.jar`.
+#### Installing Gradle
-### Running Dependencies
+Install [Gradle](https://gradle.org/releases/) with the following command.
-- [Android Studio](https://developer.android.google.cn/studio) >= 4.0
-- [Android SDK](https://developer.android.com/studio?hl=zh-cn#cmdline-tools) >= 29
+```bash
+# Download the corresponding zip package and unzip it.
+# Configure environment variables:
+export GRADLE_HOME=/path/to/gradle
+export GRADLE_USER_HOME=/path/to/gradle
+# Add the bin directory to the PATH:
+export PATH=${GRADLE_HOME}/bin:$PATH
+```
-### Building a Dependency Environment
+#### Installing Maven
-Renaming `mindspore-lite-full-{version}.aar` to `mindspore-lite-full-{version}.zip`. After the `mindspore-lite-full-{version}.zip` file is decompressed, the following directory structure is obtained:
+Install [Maven](https://archive.apache.org/dist/maven/maven-3/) with the following command.
-```text
-mindspore-lite-full-{version}
-├── jni
-│ ├── arm64-v8a
-│ │ ├── libjpeg.so # Dynamic library file for image processing
-│ │ ├── libminddata-lite.so # Dynamic library file for image processing
-│ │ ├── libmindspore-lite.so # Dynamic library on which the MindSpore Lite inference framework depends
-│ │ ├── libmindspore-lite-jni.so # JNI dynamic library on which the MindSpore Lite inference framework depends
-│ │ ├── libmindspore-lite-train.so # Dynamic library on which the MindSpore Lite training framework depends
-│ │ ├── libmindspore-lite-train-jni.so # JNI dynamic library on which the MindSpore Lite training framework depends
-│ │ └── libturbojpeg.so # Dynamic library file for image processing
-│ └── armeabi-v7a
- │ ├── libjpeg.so # Dynamic library file for image processing
-│ ├── libminddata-lite.so # Dynamic library file for image processing
-│ ├── libmindspore-lite.so # Dynamic library on which the MindSpore Lite inference framework depends
-│ ├── libmindspore-lite-jni.so # JNI dynamic library on which the MindSpore Lite inference framework depends
-│ ├── libmindspore-lite-train.so # Dynamic library on which the MindSpore Lite training framework depends
-│ ├── libmindspore-lite-train-jni.so # JNI dynamic library on which the MindSpore Lite training framework depends
-│ └── libturbojpeg.so # Dynamic library file for image processing
-├── libs
-│ ├── mindspore-lite-java-common.jar # MindSpore Lite training framework JAR package
-│ └── mindspore-lite-java-flclient.jar # Federated learning framework JAR package
-└── classes.jar # MindSpore Lite training framework JAR package
+```bash
+# Download the corresponding zip package and unzip it.
+# Configure environment variables:
+export MAVEN_HOME=/path/to/maven
+# Add the bin directory to the PATH:
+export PATH=${MAVEN_HOME}/bin:$PATH
```
-Note 1: since the federated learning jar package in the Android environment does not contain the dependent third-party open source software packages, in the Android environment, before using the AAR package, the user needs to add related dependency statements in the dependencies{} field to load the three open source software that Federated Learning depends on, and the dependencies{} field is in the app/build.gradle file under the Android project, as shown below:
+#### Installing OpenJDK
-```text
-dependencies {
+Install [OpenJDK](https://jdk.java.net/archive/) with the following command.
-// Add third-party open source software that federated learning relies on
-implementation group: 'com.squareup.okhttp3', name: 'okhttp', version: '3.14.9'
-implementation group: 'com.google.flatbuffers', name: 'flatbuffers-java', version: '2.0.0'
-implementation(group: 'org.bouncycastle',name: 'bcprov-jdk15on', version: '1.68')
-}
+```bash
+# Download the corresponding zip package and unzip it.
+# Configure environment variables:
+export JAVA_HOME=/path/to/jdk
+# Add the bin directory to the PATH:
+export PATH=${JAVA_HOME}/bin:$PATH
```
-For specific implementation, please refer to the example of `app/build.gradle` provided in the `Android project configuration dependencies` section in the document [sentiment classification application](https://www.mindspore.cn/federated/docs/en/master/sentiment_classification_application.html).
+### Verifying Installation
-Note 2: since the third-party open source software `bcprov-jdk15on` that Federated Learning relies on contains multi-version class files, in order to prevent errors in compiling high-version class files with lower version jdk, the following setting statement can be added to the `gradle.properties` file of the Android project:
+Verify that the dependencies listed in [System environment and third-party dependencies](#system-environment-and-third-party-dependencies) are installed successfully.
+
+```text
+Open a command window and enter: gcc --version
+The following output indicates a successful installation:
+    gcc version <version number>
+
+Open a command window and enter: git --version
+The following output indicates a successful installation:
+    git version <version number>
+
+Open a command window and enter: cmake --version
+The following output indicates a successful installation:
+    cmake version <version number>
+
+Open a command window and enter: gradle --version
+The following output indicates a successful installation:
+    Gradle <version number>
+
+Open a command window and enter: mvn --version
+The following output indicates a successful installation:
+    Apache Maven <version number>
+
+Open a command window and enter: java -version
+The following output indicates a successful installation:
+    openjdk version "<version number>"
-```java
-android.jetifier.blacklist=bcprov
```
-After setting up the dependencies shown above in the Android project, you only need to rely on the AAR package and the jar package corresponding to the model script `quick_start_flclient.jar` to call APIs provided by federated learning. For details about how to call and run the APIs, see the API description of federated learning.
+### Compilation Options
-## Linux x86_64
+The `cli_build.sh` script in the federated learning `device_client` directory is used to compile the federated learning device-side.
-### Building a Package
+#### Instructions for Using cli_build.sh Parameters
-1. Configure the build environment.
+| Parameter | Description | Value Range | Default Value |
+| ---- | ------------------------ | -------- | ------------ |
+| -p | Download path of external dependency packages | string | third |
+| -c | Whether to reuse previously downloaded dependency packages | on, off | on |
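For illustration, the flag handling described in the table can be sketched with a standard `getopts` loop (a hypothetical sketch; the real `cli_build.sh` may parse its options differently):

```shell
#!/bin/bash
# Defaults mirror the table above: -p sets the dependency download path, -c toggles package reuse.
PKG_PATH="third"
REUSE="on"
while getopts "p:c:" opt; do
  case "$opt" in
    p) PKG_PATH="$OPTARG" ;;
    c) REUSE="$OPTARG" ;;
    *) echo "usage: $0 [-p path] [-c on|off]" >&2 ;;
  esac
done
echo "dependency path: ${PKG_PATH}, reuse cached packages: ${REUSE}"
```

For example, `bash cli_build.sh -p third -c off` would re-download the third-party dependency packages into `third` instead of reusing cached ones.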
- Currently, only the Linux build environment is supported. For details about how to configure the Linux build environment, click [here](https://www.mindspore.cn/lite/docs/en/master/use/build.html#linux-environment-compilation).
+### Compilation Examples
-2. Build the x86-related architecture package in the mindspore home directory
+1. First, download the source code from the Gitee repository:
- ```sh
- bash build.sh -I x86_64 -j32
+ ```bash
+ git clone https://gitee.com/mindspore/federated.git ./
```
- And the x86 architecture package will be generated in the path `mindspore/output/` after compiling ( please backup it to avoid auto-deletion while next compile):
+2. Go to the mindspore_federated/device_client directory and execute the following command:
- ```sh
- mindspore-lite-{version}-linux-x64.tar.gz
+ ```bash
+ bash cli_build.sh
+ ```
+
+3. Since the device-side framework and the model are decoupled, the x86 architecture package we provide, `mindspore-lite-{version}-linux-x64.tar.gz`, does not contain model-related scripts. Users therefore need to generate the jar package corresponding to their model script. The jar package corresponding to the model scripts we provide can be obtained in the following way:
+
+ ```bash
+ cd federated/example/quick_start_flclient
+    bash build.sh -r federated/mindspore_federated/device_client/build/libs/jarX86/mindspore-lite-java-flclient.jar # After -r, give the absolute path of the jar package generated in Step 2
```
-3. Since the device-side framework and the model are decoupled, we provide x86 architecture package `mindspore-lite-{version}-linux-x64.tar.gz` that does not contain model-related scripts, so users need to generate the model script corresponding to the jar package. We provide two types of model scripts for your reference ([Supervised sentiment Classification Task](https://gitee.com/mindspore/mindspore/tree/master/mindspore/lite/examples/quick_start_flclient/src/main/java/com/mindspore/flclient/demo/albert), [LeNet image classification task](https://gitee.com/mindspore/mindspore/tree/master/mindspore/lite/examples/quick_start_flclient/src/main/java/com/mindspore/flclient/demo/lenet)). Users can refer to these two types of model scripts, and generate the corresponding jar package (assuming the name is `quick_start_flclient.jar`) after customizing the model script. The jar packages corresponding to the model scripts we provide can be obtained in the following ways:
+    After running the above command, the generated jar package is `federated/example/quick_start_flclient/target/quick_start_flclient.jar`.
+
+### Building Dependency Environment
- After downloading the latest code on [MindSpore Open Source Warehouse](https://gitee.com/mindspore/mindspore), perform the following operations:
+1. After extracting the file `federated/mindspore_federated/device_client/third/mindspore-lite-{version}-linux-x64.tar.gz`, the obtained directory structure is as follows:
```sh
- cd mindspore/mindspore/lite/examples/quick_start_flclient
- sh build.sh -r "mindspore-lite-{version}-linux-x64.tar.gz" # After -r, give the absolute path of the latest x86 architecture package that generate at step 2
+    mindspore-lite-{version}-linux-x64
+    ├── tools
+    │   ├── benchmark_train             # Tool for tuning the performance and accuracy of training models
+    │   ├── converter                   # Model converter
+    │   └── cropper                     # Library cropper
+    │       ├── cropper                 # Executable file of the library cropper
+    │       └── cropper_mapping_cpu.cfg # Configuration file required for cropping the CPU library
+    └── runtime
+        ├── include                     # Header files of the training framework
+        │   └── registry                # Header files for custom operator registration
+        ├── lib                         # Training framework libraries
+        │   ├── libminddata-lite.a      # Static library file for image processing
+        │   ├── libminddata-lite.so     # Dynamic library file for image processing
+        │   ├── libmindspore-lite-jni.so       # JNI dynamic library on which the MindSpore Lite inference framework depends
+        │   ├── libmindspore-lite-train.a      # Static library on which the MindSpore Lite training framework depends
+        │   ├── libmindspore-lite-train.so     # Dynamic library on which the MindSpore Lite training framework depends
+        │   ├── libmindspore-lite-train-jni.so # JNI dynamic library on which the MindSpore Lite training framework depends
+        │   ├── libmindspore-lite.a     # Static library on which the MindSpore Lite inference framework depends
+        │   ├── libmindspore-lite.so    # Dynamic library on which the MindSpore Lite inference framework depends
+        │   ├── mindspore-lite-java.jar          # MindSpore Lite training framework JAR package
+        │   └── mindspore-lite-java-flclient.jar # Federated learning framework JAR package
+        └── third_party
+            └── libjpeg-turbo
+                └── lib
+                    ├── libjpeg.so.62     # Dynamic library file for image processing
+                    └── libturbojpeg.so.0 # Dynamic library file for image processing
```
- After running the above command, the path of the jar package generated is: `mindspore/mindspore/lite/examples/quick_start_flclient/target/quick_start_flclient.jar`.
-
-### Running Dependencies
-
-- [Python](https://www.python.org/downloads/) >= 3.7.0
-- [OpenJDK](https://openjdk.java.net/install/) 1.8 to 1.15
-
-### Building a Dependency Environment
-
-After the `mindspore/output/mindspore-lite-{version}-linux-x64.tar.gz` file is decompressed, the following directory structure is obtained:
-
-```sh
-mindspore-lite-{version}-linux-x64
-├── tools
-│ ├── benchmark_train # Tool for commissioning the performance and accuracy of the training model
-│ ├── converter # Model conversion tool
-│ └── cropper # Library cropping tool
-│ ├── cropper # Executable file of the library cropping tool
-│ └── cropper_mapping_cpu.cfg # Configuration file required for cropping the CPU library
-└── runtime
- ├── include # Header file of the training framework
- │ └── registry # Header file for custom operator registration
- ├── lib # Training framework library
- │ ├── libminddata-lite.a # Static library file for image processing
- │ ├── libminddata-lite.so # Dynamic library file for image processing
- │ ├── libmindspore-lite-jni.so # JNI dynamic library on which the MindSpore Lite inference framework depends
- │ ├── libmindspore-lite-train.a # Static library on which the MindSpore Lite training framework depends
- │ ├── libmindspore-lite-train.so # Dynamic library on which the MindSpore Lite training framework depends
- │ ├── libmindspore-lite-train-jni.so # JNI dynamic library on which the MindSpore Lite training framework depends
- │ ├── libmindspore-lite.a # Static library on which the MindSpore Lite inference framework depends
- │ ├── libmindspore-lite.so # Dynamic library on which the MindSpore Lite inference framework depends
- │ ├── mindspore-lite-java.jar # MindSpore Lite training framework JAR package
- │ └── mindspore-lite-java-flclient.jar # Federated learning framework JAR package
- └── third_party
- └── libjpeg-turbo
- └── lib
- ├── libjpeg.so.62 # Dynamic library file for image processing
- └── libturbojpeg.so.0 # Dynamic library file for image processing
-```
+2. The x86 packages required for federated learning are as follows:
-The x86 packages required for federated learning are as follows:
-
-```sh
-libjpeg.so.62 # Dynamic library file for image processing
-libminddata-lite.so # Dynamic library file for image processing
-libmindspore-lite.so # Dynamic library on which the MindSpore Lite inference framework depends
-libmindspore-lite-jni.so # JNI dynamic library on which the MindSpore Lite inference framework depends
-libmindspore-lite-train.so # Dynamic library on which the MindSpore Lite training framework depends
-libmindspore-lite-train-jni.so # JNI dynamic library on which the MindSpore Lite training framework depends
-libturbojpeg.so.0 # Dynamic library file for image processing
-mindspore-lite-java-flclient.jar # Federated learning framework JAR package
-quick_start_flclient.jar # The jar package corresponding to the model script
-```
+ ```sh
+    libminddata-lite.so            # Dynamic library file for image processing
+    libmindspore-lite.so           # Dynamic library on which the MindSpore Lite inference framework depends
+    libmindspore-lite-jni.so       # JNI dynamic library on which the MindSpore Lite inference framework depends
+    libmindspore-lite-train.so     # Dynamic library on which the MindSpore Lite training framework depends
+    libmindspore-lite-train-jni.so # JNI dynamic library on which the MindSpore Lite training framework depends
+    libjpeg.so.62                  # Dynamic library file for image processing
+    libturbojpeg.so.0              # Dynamic library file for image processing
+ ```
-Find the seven .so files on which federated learning depends in the directories `mindspore/output/mindspore-lite-{version}-linux-x64/runtime/lib/` and `mindspore/output/mindspore-lite-{version}-linux-x64/runtime/third_party/libjpeg-turbo/lib`. Then, place these .so files in a folder, for example, `/resource/x86libs/`.
+3. Copy the seven .so files on which federated learning depends from `mindspore-lite-{version}-linux-x64/runtime/lib/` and `mindspore-lite-{version}-linux-x64/runtime/third_party/libjpeg-turbo/lib` into one folder, for example `/resource/x86libs/`. Then set the environment variable on x86 (an absolute path must be provided):
-Set environment variables in the x86 system (an absolute path must be provided):
+ ```sh
+ export LD_LIBRARY_PATH=/resource/x86libs/:$LD_LIBRARY_PATH
+ ```
+
+4. After setting up the dependency environment, you can simulate starting multiple clients in the x86 environment for federated learning by referring to the application practice tutorial [Implementing an end-cloud federation for image classification application (x86)](https://www.mindspore.cn/federated/docs/en/master/image_classification_application.html).
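As a quick sanity check before launching clients, the shared libraries in the folder can be counted (a small sketch; `/resource/x86libs/` is the example folder from step 3):

```shell
# Count files ending in .so or .so.<n> in the library folder; seven are expected per the list in step 2.
libdir="/resource/x86libs"
count=$(ls "$libdir" 2>/dev/null | grep -c -E '\.so(\.[0-9]+)?$' || true)
echo "found ${count} shared libraries in ${libdir} (expected 7)"
```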
-```sh
-export LD_LIBRARY_PATH=/resource/x86libs/:$LD_LIBRARY_PATH
-```
-After the dependency environment is set, you can simulate the startup of multiple clients in the x86 environment for federated learning. For details, click [here](https://www.mindspore.cn/federated/docs/en/master/image_classification_application.html).
diff --git a/docs/federated/docs/source_en/deploy_federated_server.md b/docs/federated/docs/source_en/deploy_federated_server.md
index b3489214bf5f902b24641275a1c6399cd4886e70..fddb1493729e993a331e04bf97609e2ec342904e 100644
--- a/docs/federated/docs/source_en/deploy_federated_server.md
+++ b/docs/federated/docs/source_en/deploy_federated_server.md
@@ -4,13 +4,11 @@
The following uses LeNet as an example to describe how to use MindSpore to deploy a federated learning cluster.
-> You can download the complete demo from [here](https://gitee.com/mindspore/mindspore/tree/master/tests/st/fl/mobile).
-
The following figure shows the physical architecture of the MindSpore Federated Learning (FL) Server cluster:

-As shown in the preceding figure, in the federated learning cloud cluster, there are two MindSpore process roles: `Federated Learning Scheduler` and `Federated Learning Server`:
+As shown in the preceding figure, in the federated learning cloud cluster, there are three MindSpore process roles: `Federated Learning Scheduler`, `Federated Learning Server`, and `Federated Learning Worker`:
- Federated Learning Scheduler
@@ -25,7 +23,11 @@ As shown in the preceding figure, in the federated learning cloud cluster, there
`Server` executes federated learning tasks, receives and parses data from devices, and provides capabilities such as secure aggregation, time-limited communication, and model storage. In a federated learning task, users can configure multiple `Servers` which communicate with each other through the TCP proprietary protocol and open HTTP ports for device-side connection.
- > In the MindSpore federated learning framework, `Server` also supports auto scaling and disaster recovery, and can dynamically schedule hardware resources without interrupting training tasks.
+ In the MindSpore federated learning framework, `Server` also supports auto scaling and disaster recovery, and can dynamically schedule hardware resources without interrupting training tasks.
+
+- Federated Learning Worker
+
+    `Worker` is an auxiliary module for executing federated learning tasks. It performs supervised retraining of the model on the `Server`, and the retrained model is then delivered back to the `Server`. A federated learning task can have more than one `Worker` (user configurable), and communication between `Worker` and `Server` is performed via the TCP protocol.
`Scheduler` and `Server` must be deployed on a server or container with a single NIC and in the same network segment. MindSpore automatically obtains the first available IP address as the `Server` IP address.
@@ -33,165 +35,175 @@ As shown in the preceding figure, in the federated learning cloud cluster, there
## Preparations
+> It is recommended to create a virtual environment with [Anaconda](https://www.anaconda.com/) for the following operations.
+
### Installing MindSpore
The MindSpore federated learning cloud cluster supports deployment on x86 CPU and GPU CUDA hardware platforms. Run commands provided by the [MindSpore Installation Guide](https://www.mindspore.cn/install) to install the latest MindSpore.
-## Defining a Model
+### Installing MindSpore Federated
-To facilitate deployment, the `Scheduler` and `Server` processes of MindSpore federated learning can reuse the training script. You can select different startup modes by referring to [Configuring Parameters](#configuring-parameters).
+Compile and install MindSpore Federated from [source code](https://gitee.com/mindspore/federated).
-## Configuring Parameters
+```shell
+git clone https://gitee.com/mindspore/federated.git -b master
+cd federated
+bash build.sh
+```
-The MindSpore federated learning task process reuses the training script. You only need to use the same script to transfer different parameters through the Python API `set_fl_context` and start different MindSpore process roles. For details about the parameter configuration, see [MindSpore API](https://www.mindspore.cn/federated/docs/en/master/federated_server.html#mindspore.set_fl_context).
+For `bash build.sh`, compilation can be accelerated with the `-jn` option, e.g. `-j16`. Third-party dependencies can be downloaded from Gitee instead of GitHub with the `-S on` option.
-After parameter configuration and before training, call the `set_fl_context` API as follows:
+After compilation, find the whl installation package of MindSpore Federated in the `build/package/` directory and install it:
-```python
-import mindspore as ms
-...
-
-enable_fl = True
-server_mode = "FEDERATED_LEARNING"
-ms_role = "MS_SERVER"
-server_num = 4
-scheduler_ip = "192.168.216.124"
-scheduler_port = 6667
-fl_server_port = 6668
-fl_name = "LeNet"
-scheduler_manage_port = 11202
-config_file_path = "./config.json"
-
-fl_ctx = {
- "enable_fl": enable_fl,
- "server_mode": server_mode,
- "ms_role": ms_role,
- "server_num": server_num,
- "scheduler_ip": scheduler_ip,
- "scheduler_port": scheduler_port,
- "fl_server_port": fl_server_port,
- "fl_name": fl_name,
- "scheduler_manage_port": scheduler_manage_port,
- "config_file_path": config_file_path
-}
-ms.set_fl_context(**fl_ctx)
-...
-
-model.train()
+```bash
+pip install mindspore_federated-{version}-{python_version}-linux_{arch}.whl
```
-In this example, the training task mode is set to `federated learning` (`FEDERATED_LEARNING`), and the training process role is `Server`. In this task, `4` `Servers` need to be started to complete the cluster networking. The IP address of the cluster `Scheduler` is `192.168.216.124`, the cluster `Scheduler` port number is `6667`, the `HTTP service port number` of federated learning is `6668` (connected by the device), the task name is `LeNet`, and the cluster `Scheduler` management port number is `11202`.
+### Verifying Installation
-> Some parameters are used by either `Scheduler` (for example, scheduler_manage_port) or `Server` (for example, fl_server_port). To facilitate deployment, transfer these parameters together to MindSpore. MindSpore reads different parameters based on process roles.
-> You are advised to import the parameter configuration through the Python `argparse` module:
+Execute the following command to verify the installation result. The installation is successful if no error is reported when importing Python modules.
```python
-import argparse
-
-parser = argparse.ArgumentParser()
-parser.add_argument("--server_mode", type=str, default="FEDERATED_LEARNING")
-parser.add_argument("--ms_role", type=str, default="MS_SERVER")
-parser.add_argument("--server_num", type=int, default=4)
-parser.add_argument("--scheduler_ip", type=str, default="192.168.216.124")
-parser.add_argument("--scheduler_port", type=int, default=6667)
-parser.add_argument("--fl_server_port", type=int, default=6668)
-parser.add_argument("--fl_name", type=str, default="LeNet")
-parser.add_argument("--scheduler_manage_port", type=int, default=11202)
-parser.add_argument("--config_file_path", type=str, default="")
-
-args, t = parser.parse_known_args()
-server_mode = args.server_mode
-ms_role = args.ms_role
-server_num = args.server_num
-scheduler_ip = args.scheduler_ip
-scheduler_port = args.scheduler_port
-fl_server_port = args.fl_server_port
-fl_name = args.fl_name
-scheduler_manage_port = args.scheduler_manage_port
-config_file_path = args.config_file_path
+from mindspore_federated import FLServerJob
```
-> Each Python script corresponds to a process. If multiple `Server` roles need to be deployed on different hosts, you can use shell commands and Python to quickly start multiple `Server` processes. You can refer to the [examples](https://gitee.com/mindspore/mindspore/tree/master/tests/st/fl/mobile).
->
-> Each `Server` process needs a unique identifier `MS_NODE_ID` which should be set by environment variable. In this tutorial, this environment variable has been set in the script [run_mobile_server.py](https://gitee.com/mindspore/mindspore/blob/master/tests/st/fl/mobile/run_mobile_server.py).
+### Installing and Starting Redis Server
-## Starting a Cluster
+Federated learning relies on [Redis Server](https://gitee.com/link?target=https%3A%2F%2Fredis.io%2F) as the default data-caching middleware. To run the federated learning service, a Redis server needs to be installed and running.
-Start the cluster by referring to the [examples](https://gitee.com/mindspore/mindspore/tree/master/tests/st/fl/mobile). An example directory structure is as follows:
-
-```text
-mobile/
-├── config.json
-├── finish_mobile.py
-├── run_mobile_sched.py
-├── run_mobile_server.py
-├── src
-│ └── model.py
-└── test_mobile_lenet.py
+Install Redis server:
+
+```bash
+sudo apt-get install redis
```
-Descriptions of the documents:
+Run the Redis server, configuring its port as 23456:
-- config.json: The config file, which is used to configure security, disaster recovery, etc.
-- finish_mobile.py: This script is used to stop the cluster.
-- run_mobile_sched.py: Launch scheduler.
-- run_mobile_server.py: Launch server.
-- model.py: The model.
-- test_mobile_lenet.py: Training script.
+```bash
+redis-server --port 23456 --save ""
+```
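Whether the server is up can be probed before starting the cluster (a minimal sketch using bash's built-in `/dev/tcp`; if the Redis CLI is installed, `redis-cli -p 23456 ping` serves the same purpose):

```shell
# Report whether a TCP port on a host is accepting connections.
check_port() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then echo "up"; else echo "down"; fi
}
check_port 127.0.0.1 23456   # prints "up" once redis-server is listening on port 23456
```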
-1. Start the `Scheduler`.
+## Starting a Cluster
- `run_mobile_sched.py` is a Python script provided for you to start `Scheduler` and supports configuration modification through passing the `argparse` parameter. Run the following command to start the `Scheduler` of the federated learning task. The TCP port number is `6667`, the HTTP service port number starts with `6668`, the number of `Server` is `4`, and the management port number of the cluster `Scheduler` is `11202`:
+1. Go to the [example path](https://gitee.com/mindspore/federated/tree/master/example/cross_device_lenet_femnist/) and enter the example directory:
- ```sh
- python run_mobile_sched.py --scheduler_ip=192.168.216.124 --scheduler_port=6667 --fl_server_port=6668 --server_num=4 --scheduler_manage_port=11202 --config_file_path=$PWD/config.json
+ ```bash
+ cd tests/st/cross_device_lenet_femnist
```
-2. Start the `Servers`.
-
- `run_mobile_server.py` is a Python script provided for you to start multiple `Servers` and supports configuration modification through passing the `argparse` parameter. Run the following command to start the `Servers` of the federated learning task. The TCP port number is `6667`, the HTTP service port number starts with `6668`, the number of `Server` is `4`, and the number of devices required for the federated learning task is `8`.
+2. Modify the yaml configuration file `default_yaml_config.yaml` according to the actual deployment. The [sample configuration for Lenet](https://gitee.com/mindspore/federated/tree/master/example/cross_device_lenet_femnist/yamls/lenet.yaml) is as follows:
+
+ ```yaml
+ fl_name: Lenet
+ fl_iteration_num: 25
+ server_mode: FEDERATED_LEARNING
+ enable_ssl: False
+
+ distributed_cache:
+ type: redis
+ address: 127.0.0.1:23456 # ip:port of redis actual machine
+ plugin_lib_path: ""
+
+ round:
+ start_fl_job_threshold: 2
+ start_fl_job_time_window: 30000
+ update_model_ratio: 1.0
+ update_model_time_window: 30000
+ global_iteration_time_window: 60000
+
+ summary:
+ metrics_file: "metrics.json"
+ failure_event_file: "event.txt"
+ continuous_failure_times: 10
+ data_rate_dir: ".."
+ participation_time_level: "5,15"
+
+ encrypt:
+ encrypt_type: NOT_ENCRYPT
+ pw_encrypt:
+ share_secrets_ratio: 1.0
+ cipher_time_window: 3000
+ reconstruct_secrets_threshold: 1
+ dp_encrypt:
+ dp_eps: 50.0
+ dp_delta: 0.01
+ dp_norm_clip: 1.0
+ signds:
+ sign_k: 0.01
+ sign_eps: 100
+ sign_thr_ratio: 0.6
+ sign_global_lr: 0.1
+ sign_dim_out: 0
+
+ compression:
+ upload_compress_type: NO_COMPRESS
+ upload_sparse_rate: 0.4
+ download_compress_type: NO_COMPRESS
+
+ ssl:
+ # when ssl_config is set
+ # for tcp/http server
+ server_cert_path: "server.p12"
+ # for tcp client
+ client_cert_path: "client.p12"
+ # common
+ ca_cert_path: "ca.crt"
+ crl_path: ""
+ cipher_list: "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-PSK-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-CCM:ECDHE-ECDSA-AES256-CCM:ECDHE-ECDSA-CHACHA20-POLY1305"
+ cert_expire_warning_time_in_day: 90
+
+ client_verify:
+ pki_verify: false
+ root_first_ca_path: ""
+ root_second_ca_path: ""
+ equip_crl_path: ""
+ replay_attack_time_diff: 600000
+
+ client:
+ http_url_prefix: ""
+ client_epoch_num: 20
+ client_batch_size: 32
+ client_learning_rate: 0.01
+ connection_num: 10000
- ```sh
- python run_mobile_server.py --scheduler_ip=192.168.216.124 --scheduler_port=6667 --fl_server_port=6668 --server_num=4 --start_fl_job_threshold=8 --config_file_path=$PWD/config.json
```
- The preceding command is equivalent to starting four `Server` processes, of which the federated learning service port numbers are `6668`, `6669`, `6670`, and `6671`. For details, see [run_mobile_server.py](https://gitee.com/mindspore/mindspore/blob/master/tests/st/fl/mobile/run_mobile_server.py).
-
- > If you only want to deploy `Scheduler` and `Server` in a standalone system, change the `scheduler_ip` to `127.0.0.1`.
+3. Prepare the model file. The startup mode is weight-based, so you need to provide the corresponding model weights.
- To distribute the `Servers` on different physical nodes, you can use the `local_server_num` parameter to specify the number of `Server` processes to be executed on **the current node**.
-
- ```sh
- #Start three `Server` processes on node 1.
- python run_mobile_server.py --scheduler_ip={ip_address_node_1} --scheduler_port=6667 --fl_server_port=6668 --server_num=4 --start_fl_job_threshold=8 --local_server_num=3 --config_file_path=$PWD/config.json
- ```
+    Obtain the Lenet model weights:
- ```sh
- #Start one `Server` process on node 2.
- python run_mobile_server.py --scheduler_ip={ip_address_node_2} --scheduler_port=6667 --fl_server_port=6668 --server_num=4 --start_fl_job_threshold=8 --local_server_num=1 --config_file_path=$PWD/config.json
+ ```bash
+ wget https://ms-release.obs.cn-north-4.myhuaweicloud.com/ms-dependencies/Lenet.ckpt
```
- The log is displayed as follows:
+4. Run the Scheduler. The management address is `127.0.0.1:11202` by default.
- ```sh
- Server started successfully.
+    ```bash
+ python run_sched.py \
+ --yaml_config="yamls/lenet.yaml" \
+ --scheduler_manage_address="10.113.216.40:18019"
```
- If the preceding information is displayed, it indicates that the startup is successful.
+5. Run the Server. By default, one Server is started, and the HTTP server address is `127.0.0.1:6666`.
- > In the preceding commands for distributed deployment, all values of `server_num` are set to 4. This is because this parameter indicates the number of global `Servers` in the cluster and should not change with the number of physical nodes. `Servers` on different nodes do not need to be aware of their own IP addresses. The cluster consistency and node discovery are scheduled by `Scheduler`.
-
-3. Stop federated learning.
+    ```bash
+ python run_server.py \
+ --yaml_config="yamls/lenet.yaml" \
+ --tcp_server_ip="10.113.216.40" \
+ --checkpoint_dir="fl_ckpt" \
+ --local_server_num=1 \
+ --http_server_address="10.113.216.40:8019"
+ ```
- Currently, `finish_mobile.py` is used to stop the federated learning server. Run the following command to stop the federated learning cluster. The value of the `scheduler_port` parameter is the same as that passed when the server is started.
+6. Stop federated learning. In the current version, the federated learning cluster runs as resident processes; execute the `finish_cloud.py` script to terminate the federated learning service. An example command is shown below, where the `redis_port` argument must match the one passed when starting redis and identifies the cluster of this `Scheduler` to stop.
- ```sh
- python finish_mobile.py --scheduler_port=6667
+    ```bash
+ python finish_cloud.py --redis_port=23456
```
- The result is as follows:
+    If the console prints the following:
- ```sh
+ ```text
killed $PID1
killed $PID2
killed $PID3
@@ -202,131 +214,98 @@ Descriptions of the documents:
killed $PID8
```
- The services are stopped successfully.
+    it indicates that the service has been terminated successfully.
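For scripted deployments, the `killed $PID` output above can also be checked programmatically. A minimal sketch (a hypothetical helper, not part of `finish_cloud.py`):

```python
def count_killed(console_output: str) -> int:
    """Count the 'killed $PID' lines printed when the cluster is stopped."""
    return sum(
        1
        for line in console_output.splitlines()
        if line.strip().startswith("killed ")
    )

sample = "killed 4321\nkilled 4322\nkilled 4323"
print(count_killed(sample))  # -> 3
```

Comparing the count against the number of launched `Server` and `Scheduler` processes confirms a clean shutdown.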
## Auto Scaling
-The MindSpore federated learning framework supports auto scaling of `Server` and provides the `RESTful` service through the `Scheduler` management port. In this way, you can dynamically schedule hardware resources without interrupting training tasks. Currently, MindSpore supports only horizontal scaling (scale-out or scale-in) and does not support vertical scaling (scale-up or scale-down). In the auto scaling scenario, the number of `Server` processes either increases or decreases according to user settings.
-
-The following describes how to control cluster scale-in and scale-out using the native RESTful APIs.
-
-1. Scale-out
-
- After the cluster is started, enter the machine where the scheduler node is deployed and send a scale-out request to `Scheduler`. Use the `curl` instruction to construct a `RESTful` scale-out request, indicating that two `Server` nodes need to be added to the cluster.
+The MindSpore federated learning framework supports auto scaling of `Server` and provides `RESTful` services externally through the `Scheduler` management port, enabling users to dynamically schedule hardware resources without interrupting training tasks.
- ```sh
- curl -i -X POST \
- -H "Content-Type:application/json" \
- -d \
- '{
- "worker_num":0,
- "server_num":2
- }' \
- 'http://127.0.0.1:11202/scaleout'
- ```
-
- Start `2` new `Server` processes and the `node_id` of the expanded `Server` cannot be the same as the `node_id` of the existing `Server`, add up the values of `server_num` to ensure that the global networking information is correct. After the scale-out, the value of `server_num` should be `6`.
+The following example describes how to control cluster scale-out and scale-in through these APIs.
- ```sh
- python run_mobile_server.py --node_id=scale_node --scheduler_ip=192.168.216.124 --scheduler_port=6667 --fl_server_port=6672 --server_num=6 --start_fl_job_threshold=8 --local_server_num=2 --config_file_path=$PWD/config.json
- ```
+### Scale-out
- This command is used to start two `Server` nodes. The port numbers of the federated learning services are `6672` and `6673`, and the total number of `Servers` is `6`.
+After the cluster starts, log in to the machine where the scheduler node is deployed and send a request to the `Scheduler` to query the cluster status and node information. The `RESTful` request can be constructed with the `curl` command:
-2. Scale-in
+```sh
+curl -k 'http://10.113.216.40:18015/state'
+```
- After the cluster is started, enter the machine where the scheduler node is deployed and send a scale-in request to `Scheduler`. Obtain the node information to perform the scale-in operation on specific nodes.
+`Scheduler` will return the query results in `json` format.
- ```sh
- curl -i -X GET \
- 'http://127.0.0.1:11202/nodes'
- ```
+```json
+{
+    "message":"Get cluster state successful.",
+    "cluster_state":"CLUSTER_READY",
+    "code":0,
+    "nodes":[
+        {"node_id": "{ip}:{port}::{timestamp}::{random}",
+        "tcp_address":"{ip}:{port}",
+        "role":"SERVER"}
+    ]
+}
+```
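Since the response is plain JSON, monitoring scripts can consume it directly. The sketch below parses a response with the schema shown above; the node values are illustrative placeholders standing in for the real `{ip}:{port}` fields:

```python
import json

# Illustrative /state response following the schema documented above.
raw = """
{
  "message": "Get cluster state successful.",
  "cluster_state": "CLUSTER_READY",
  "code": 0,
  "nodes": [
    {"node_id": "192.0.2.1:18010::1669012345::4242",
     "tcp_address": "192.0.2.1:18010",
     "role": "SERVER"}
  ]
}
"""

state = json.loads(raw)
# Collect the node ids of all SERVER-role members of the cluster.
servers = [n["node_id"] for n in state["nodes"] if n["role"] == "SERVER"]
print(state["cluster_state"], len(servers))
```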
- The `scheduler` will return the query results in the `json` format:
-
- ```json
- {
- "message": "Get nodes info successful.",
- "nodeIds": [
- {
- "alive": "true",
- "nodeId": "3",
- "rankId": "3",
- "role": "SERVER"
- },
- {
- "alive": "true",
- "nodeId": "0",
- "rankId": "0",
- "role": "SERVER"
- },
- {
- "alive": "true",
- "nodeId": "2",
- "rankId": "2",
- "role": "SERVER"
- },
- {
- "alive": "true",
- "nodeId": "1",
- "rankId": "1",
- "role": "SERVER"
- },
- {
- "alive": "true",
- "nodeId": "20",
- "rankId": "0",
- "role": "SCHEDULER"
- }
- ]
- }
- ```
+To scale out, you need to start 3 new `Server` processes and add the scale-out amount to the `local_server_num` parameter so that the global networking information stays correct; that is, after the scale-out, `local_server_num` should be 4. An example command is as follows:
- Select `Rank3` and `Rank2` for scale-in.
+```sh
+python run_server.py --yaml_config="yamls/lenet.yaml" --tcp_server_ip="10.113.216.40" --checkpoint_dir="fl_ckpt" --local_server_num=4 --http_server_address="10.113.216.40:18015"
+```
- ```sh
- curl -i -X POST \
- -H "Content-Type:application/json" \
- -d \
- '{
- "node_ids": ["2", "3"]
- }' \
- 'http://127.0.0.1:11202/scalein'
- ```
+This command starts four `Server` nodes, so the total number of `Server` nodes is 4.
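The bookkeeping above (start with 1 local server, add 3, pass 4) can be expressed as a tiny helper; this is an illustrative sketch, not part of `run_server.py`:

```python
def scaled_local_server_num(current: int, delta: int) -> int:
    """local_server_num to pass after starting `delta` new Server processes on this node."""
    return current + delta

# One Server was started initially; scale out by 3 new processes.
print(scaled_local_server_num(1, 3))  # -> 4
```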
-> - After the cluster scale-out or scale-in is successful, the training task is automatically restored. No manual intervention is required.
->
-> - You can use a cluster management tool (such as Kubernetes) to create or release `Server` resources.
->
-> - After scale-in, the process scaled in will not exit. You need to use the cluster management tool (such as Kubernetes) or command `kill -15 $PID` to control the process to exit. Please note that you need to query the cluster status from the 'scheduler' node and wait for the cluster status to be set to `CLUSTER_READY`, the reduced node can be recycled.
+### Scale-in
-## Disaster Recovery
+Scale-in can be simulated by killing a `Server` process directly with `kill -9 pid`. Then construct a `RESTful` request with the `curl` command and query the status; the result shows that one node_id is missing from the cluster, which achieves the scale-in.
-After a node in the MindSpore federated learning cluster goes offline, you can keep the cluster online without exiting the training task. After the node is restarted, you can resume the training task. Currently, MindSpore supports disaster recovery for `Server` nodes (except Server 0).
+```sh
+curl -k \
+'http://10.113.216.40:18015/state'
+```
-To enable disaster recovery, the fields below should be added to the config.json set by config_file_path:
+`Scheduler` returns the query results in `json` format.
```json
{
- "recovery": {
- "storage_type": 1,
- "storge_file_path": "config.json"
- }
+    "message":"Get cluster state successful.",
+    "cluster_state":"CLUSTER_READY",
+    "code":0,
+    "nodes":[
+        {"node_id": "{ip}:{port}::{timestamp}::{random}",
+        "tcp_address":"{ip}:{port}",
+        "role":"SERVER"},
+        {"node_id": "worker_fl_{timestamp}::{random}",
+        "tcp_address":"",
+        "role":"WORKER"},
+        {"node_id": "worker_fl_{timestamp}::{random}",
+        "tcp_address":"",
+        "role":"WORKER"}
+    ]
}
```
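Comparing the `nodes` arrays of two consecutive `/state` queries shows which node disappeared. A minimal sketch with made-up node ids:

```python
def missing_node_ids(before: list, after: list) -> set:
    """node_ids present before the kill but absent from the later /state query."""
    return {n["node_id"] for n in before} - {n["node_id"] for n in after}

nodes_before = [
    {"node_id": "server-a", "role": "SERVER"},
    {"node_id": "server-b", "role": "SERVER"},
]
nodes_after = [{"node_id": "server-a", "role": "SERVER"}]

print(missing_node_ids(nodes_before, nodes_after))  # -> {'server-b'}
```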
-- recovery: If this field is set, the disaster recovery feature is enabled.
-- storage_type: Persistent storage type. Only `1` is supported currently which represents file storage.
-- storage_file_path: The recovery file path.
+> - After the cluster scale-out or scale-in succeeds, the training tasks are automatically resumed without manual intervention.
-The node restart command is similar to the scale-out command. After the node is manually brought offline, run the following command:
+## Security
-```sh
-python run_mobile_server.py --scheduler_ip=192.168.216.124 --scheduler_port=6667 --fl_server_port=6673 --server_num=6 --start_fl_job_threshold=8 --local_server_num=1 --config_file_path=$PWD/config.json
+The MindSpore federated learning framework supports SSL security authentication of `Server`. To enable security authentication, add `enable_ssl=True` to the startup command, and add the following fields to the config.json configuration file specified by config_file_path:
+
+```json
+{
+ "server_cert_path": "server.p12",
+ "crl_path": "",
+ "client_cert_path": "client.p12",
+ "ca_cert_path": "ca.crt",
+ "cert_expire_warning_time_in_day": 90,
+ "cipher_list": "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:!aNULL:!eNULL:!EXPORT:!DES:!RC4:!3DES:!MD5:!PSK",
+ "connection_num":10000
+}
```
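The `cipher_list` field is a colon-separated list of OpenSSL-style cipher suite names, and the certificate path fields must be non-empty when SSL is enabled. A small sketch of such a check (hypothetical helper, with a shortened cipher list):

```python
ssl_config = {
    "server_cert_path": "server.p12",
    "crl_path": "",
    "client_cert_path": "client.p12",
    "ca_cert_path": "ca.crt",
    "cert_expire_warning_time_in_day": 90,
    "cipher_list": "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256",
}

# All certificate paths must be set when enable_ssl=True; crl_path may stay empty.
for key in ("server_cert_path", "client_cert_path", "ca_cert_path"):
    assert ssl_config[key], f"{key} must be set when enable_ssl=True"

# Split the colon-separated OpenSSL-style suite list into individual suites.
ciphers = ssl_config["cipher_list"].split(":")
print(len(ciphers), ciphers[0])
```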
-This command indicates that the `Server` is restarted. The federated learning service port number is `6673`.
+- server_cert_path: The path to the p12 file containing the ciphertext of the server-side certificate and key.
+- crl_path: The path to the certificate revocation list file.
+- client_cert_path: The path to the p12 file containing the ciphertext of the client-side certificate and key.
+- ca_cert_path: The path to the root certificate.
+- cipher_list: The supported SSL cipher suites.
+- cert_expire_warning_time_in_day: The number of days before certificate expiration at which to start warning.
-> MindSpore does not support disaster recovery after the auto scaling command is successfully delivered and before the scaling service is complete.
->
-> After recovery, the restarted node's `MS_NODE_ID` variable should be the same as the one which exited in exception to ensure the networking recovery.
+The key in the p12 file is stored as ciphertext, so you need to pass in the password when starting. For the specific parameters, please refer to the `client_password` and `server_password` fields of the Python API `mindspore.set_fl_context`.
diff --git a/docs/federated/docs/source_zh_cn/comunication_compression.md b/docs/federated/docs/source_zh_cn/comunication_compression.md
index 16c7275657d3628ff3228ff8193b76841244b592..f3572588f87fbbe9ca38947b7e609148f20aa4b1 100644
--- a/docs/federated/docs/source_zh_cn/comunication_compression.md
+++ b/docs/federated/docs/source_zh_cn/comunication_compression.md
@@ -1,6 +1,6 @@
# Device-Cloud Federated Learning Communication Compression
-
+
During the device-cloud federated learning training process, the traffic volume affects the device-side user experience (user traffic, communication latency, number of participating FL-Clients) and is constrained by cloud-side performance (memory, bandwidth, CPU usage). To improve user experience and reduce performance bottlenecks, the MindSpore federated learning framework provides upload and download traffic compression in device-cloud federated scenarios.
diff --git a/docs/federated/docs/source_zh_cn/deploy_federated_client.md b/docs/federated/docs/source_zh_cn/deploy_federated_client.md
index 84c3d10b551aca0a593610d7f403e8f0a11f21c3..fe1acd718498ef8890c56d5fccd247a173500db1 100644
--- a/docs/federated/docs/source_zh_cn/deploy_federated_client.md
+++ b/docs/federated/docs/source_zh_cn/deploy_federated_client.md
@@ -161,7 +161,7 @@ sudo apt-get install cmake -y
bash build.sh -r mindspore-lite-java-flclient.jar # after -r, give the absolute path of the latest x86 architecture package (generated in step 2: federated/mindspore_federated/device_client/build/libs/jarX86/mindspore-lite-java-flclient.jar)
```
-After running the above command, the path of the generated jar package is: federated/example/quick_start_flclient/target/quick_start_flclient.jar
+After running the above command, the path of the generated jar package is: federated/example/quick_start_flclient/target/quick_start_flclient.jar.
### Building the Dependency Environment
@@ -186,7 +186,7 @@ sudo apt-get install cmake -y
│   ├── libmindspore-lite-train.so # dynamic library that the MindSpore Lite training framework depends on
│   ├── libmindspore-lite-train-jni.so # jni dynamic library that the MindSpore Lite training framework depends on
│   ├── libmindspore-lite.a # static library that the MindSpore Lite inference framework depends on
- │   ├── libmindspore-lite.so # dynamic library that MindSpore Lite inference depends on
+ │   ├── libmindspore-lite.so # dynamic library that the MindSpore Lite inference framework depends on
│   ├── mindspore-lite-java.jar # MindSpore Lite training framework jar package
│   └── mindspore-lite-java-flclient.jar # federated learning framework jar package
└── third_party
diff --git a/docs/federated/docs/source_zh_cn/deploy_federated_server.md b/docs/federated/docs/source_zh_cn/deploy_federated_server.md
index 06fe8e71bccdf99f48184c99a62b57d6232cefc2..3ecf67f4d423375a6dbc3597a494bf9d2723de97 100644
--- a/docs/federated/docs/source_zh_cn/deploy_federated_server.md
+++ b/docs/federated/docs/source_zh_cn/deploy_federated_server.md
@@ -87,134 +87,134 @@ redis-server --port 23456 --save ""
1. [Example path](https://gitee.com/mindspore/federated/tree/master/example/cross_device_lenet_femnist/).
-```bash
-cd tests/st/cross_device_lenet_femnist
-```
+ ```bash
+ cd tests/st/cross_device_lenet_femnist
+ ```
2. Modify the yaml configuration file `default_yaml_config.yaml` according to the actual running needs. A [sample configuration for Lenet](https://gitee.com/mindspore/federated/tree/master/example/cross_device_lenet_femnist/yamls/lenet.yaml) is as follows.
-```yaml
-fl_name: Lenet
-fl_iteration_num: 25
-server_mode: FEDERATED_LEARNING
-enable_ssl: False
-
-distributed_cache:
- type: redis
- address: 127.0.0.1:23456 # ip:port of redis actual machine
- plugin_lib_path: ""
-
-round:
- start_fl_job_threshold: 2
- start_fl_job_time_window: 30000
- update_model_ratio: 1.0
- update_model_time_window: 30000
- global_iteration_time_window: 60000
-
-summary:
- metrics_file: "metrics.json"
- failure_event_file: "event.txt"
- continuous_failure_times: 10
- data_rate_dir: ".."
- participation_time_level: "5,15"
-
-encrypt:
- encrypt_type: NOT_ENCRYPT
- pw_encrypt:
- share_secrets_ratio: 1.0
- cipher_time_window: 3000
- reconstruct_secrets_threshold: 1
- dp_encrypt:
- dp_eps: 50.0
- dp_delta: 0.01
- dp_norm_clip: 1.0
- signds:
- sign_k: 0.01
- sign_eps: 100
- sign_thr_ratio: 0.6
- sign_global_lr: 0.1
- sign_dim_out: 0
-
-compression:
- upload_compress_type: NO_COMPRESS
- upload_sparse_rate: 0.4
- download_compress_type: NO_COMPRESS
-
-ssl:
- # when ssl_config is set
- # for tcp/http server
- server_cert_path: "server.p12"
- # for tcp client
- client_cert_path: "client.p12"
- # common
- ca_cert_path: "ca.crt"
- crl_path: ""
- cipher_list: "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-PSK-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-CCM:ECDHE-ECDSA-AES256-CCM:ECDHE-ECDSA-CHACHA20-POLY1305"
- cert_expire_warning_time_in_day: 90
-
-client_verify:
- pki_verify: false
- root_first_ca_path: ""
- root_second_ca_path: ""
- equip_crl_path: ""
- replay_attack_time_diff: 600000
-
-client:
- http_url_prefix: ""
- client_epoch_num: 20
- client_batch_size: 32
- client_learning_rate: 0.01
- connection_num: 10000
-
-```
+ ```yaml
+ fl_name: Lenet
+ fl_iteration_num: 25
+ server_mode: FEDERATED_LEARNING
+ enable_ssl: False
+
+ distributed_cache:
+ type: redis
+ address: 127.0.0.1:23456 # ip:port of redis actual machine
+ plugin_lib_path: ""
+
+ round:
+ start_fl_job_threshold: 2
+ start_fl_job_time_window: 30000
+ update_model_ratio: 1.0
+ update_model_time_window: 30000
+ global_iteration_time_window: 60000
+
+ summary:
+ metrics_file: "metrics.json"
+ failure_event_file: "event.txt"
+ continuous_failure_times: 10
+ data_rate_dir: ".."
+ participation_time_level: "5,15"
+
+ encrypt:
+ encrypt_type: NOT_ENCRYPT
+ pw_encrypt:
+ share_secrets_ratio: 1.0
+ cipher_time_window: 3000
+ reconstruct_secrets_threshold: 1
+ dp_encrypt:
+ dp_eps: 50.0
+ dp_delta: 0.01
+ dp_norm_clip: 1.0
+ signds:
+ sign_k: 0.01
+ sign_eps: 100
+ sign_thr_ratio: 0.6
+ sign_global_lr: 0.1
+ sign_dim_out: 0
+
+ compression:
+ upload_compress_type: NO_COMPRESS
+ upload_sparse_rate: 0.4
+ download_compress_type: NO_COMPRESS
+
+ ssl:
+ # when ssl_config is set
+ # for tcp/http server
+ server_cert_path: "server.p12"
+ # for tcp client
+ client_cert_path: "client.p12"
+ # common
+ ca_cert_path: "ca.crt"
+ crl_path: ""
+ cipher_list: "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-PSK-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-CCM:ECDHE-ECDSA-AES256-CCM:ECDHE-ECDSA-CHACHA20-POLY1305"
+ cert_expire_warning_time_in_day: 90
+
+ client_verify:
+ pki_verify: false
+ root_first_ca_path: ""
+ root_second_ca_path: ""
+ equip_crl_path: ""
+ replay_attack_time_diff: 600000
+
+ client:
+ http_url_prefix: ""
+ client_epoch_num: 20
+ client_batch_size: 32
+ client_learning_rate: 0.01
+ connection_num: 10000
+
+ ```
3. Prepare the model file. The startup mode is weight-based startup, which requires providing the corresponding model weights.
-Obtain the lenet model weights:
+    Obtain the lenet model weights:
-```bash
-wget https://ms-release.obs.cn-north-4.myhuaweicloud.com/ms-dependencies/Lenet.ckpt
-```
+ ```bash
+ wget https://ms-release.obs.cn-north-4.myhuaweicloud.com/ms-dependencies/Lenet.ckpt
+ ```
4. Run the Scheduler. The management address is `127.0.0.1:11202` by default.
-```python
-python run_sched.py \
---yaml_config="yamls/lenet.yaml" \
---scheduler_manage_address="10.113.216.40:18019"
-```
+    ```bash
+ python run_sched.py \
+ --yaml_config="yamls/lenet.yaml" \
+ --scheduler_manage_address="10.113.216.40:18019"
+ ```
5. Run the Server. By default, 1 Server is started, and the HTTP server address is `127.0.0.1:6666` by default.
-```python
-python run_server.py \
---yaml_config="yamls/lenet.yaml" \
---tcp_server_ip="10.113.216.40" \
---checkpoint_dir="fl_ckpt" \
---local_server_num=1 \
---http_server_address="10.113.216.40:8019"
-```
+    ```bash
+ python run_server.py \
+ --yaml_config="yamls/lenet.yaml" \
+ --tcp_server_ip="10.113.216.40" \
+ --checkpoint_dir="fl_ckpt" \
+ --local_server_num=1 \
+ --http_server_address="10.113.216.40:8019"
+ ```
-6. Stop federated learning The current version of the federated learning cluster runs as resident processes; the `finish_cloud.py` script can be executed to terminate the federated learning service. An example command is as follows, where the `redis_port` argument must be the same as when starting redis, indicating stopping the cluster corresponding to this `Scheduler`.
+6. Stop federated learning. The current version of the federated learning cluster runs as resident processes; the `finish_cloud.py` script can be executed to terminate the federated learning service. An example command is as follows, where the `redis_port` argument must be the same as when starting redis, indicating stopping the cluster corresponding to this `Scheduler`.
-```python
-python finish_cloud.py --redis_port=23456
-```
+    ```bash
+ python finish_cloud.py --redis_port=23456
+ ```
-If the console prints the following:
-
-```text
-killed $PID1
-killed $PID2
-killed $PID3
-killed $PID4
-killed $PID5
-killed $PID6
-killed $PID7
-killed $PID8
-```
+    If the console prints the following:
+
+ ```text
+ killed $PID1
+ killed $PID2
+ killed $PID3
+ killed $PID4
+ killed $PID5
+ killed $PID6
+ killed $PID7
+ killed $PID8
+ ```
-it indicates that the service is stopped successfully.
+    it indicates that the service is stopped successfully.
## Auto Scaling
@@ -230,7 +230,7 @@ MindSpore联邦学习框架支持`Server`的弹性伸缩,对外通过`Schedule
curl -k 'http://10.113.216.40:18015/state'
```
-Scheduler` will return the query results in `json` format.
+`Scheduler` will return the query results in `json` format.
```json
{
@@ -262,7 +262,7 @@ curl -k \
'http://10.113.216.40:18015/state'
```
-Scheduler` will return the query results in `json` format.
+`Scheduler` will return the query results in `json` format.
```json
{
diff --git a/docs/federated/docs/source_zh_cn/index.rst b/docs/federated/docs/source_zh_cn/index.rst
index 0a0d8ecdcd1853d76600c941fbb838355c7a4f74..ecf0e0faf846485aa63ce39088aaa7c3c3e9ed2a 100644
--- a/docs/federated/docs/source_zh_cn/index.rst
+++ b/docs/federated/docs/source_zh_cn/index.rst
@@ -96,7 +96,7 @@ MindSpore Federated是一款开源联邦学习框架,支持面向千万级无
:maxdepth: 1
:caption: Communication Compression
- comunication_compression
+ communication_compression
.. toctree::
:maxdepth: 1
diff --git a/docs/mindspore/source_en/note/api_mapping/pytorch_api_mapping.md b/docs/mindspore/source_en/note/api_mapping/pytorch_api_mapping.md
index 8280e7af3bc3f78ad30f8b2bd86c620d44f73433..db946a2399af46b62d4771eefc6c5d485a3b8abd 100644
--- a/docs/mindspore/source_en/note/api_mapping/pytorch_api_mapping.md
+++ b/docs/mindspore/source_en/note/api_mapping/pytorch_api_mapping.md
@@ -156,9 +156,9 @@ More MindSpore developers are also welcome to participate in improving the mappi
| PyTorch 1.5.0 APIs | MindSpore APIs | Description |
| ----------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
-| [torch.autograd.backward](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.backward) | [mindspore.grad](https://mindspore.cn/docs/en/master/api_python/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_diff/grad.html) |
+| [torch.autograd.backward](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.backward) | [mindspore.grad](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_diff/grad.html) |
| [torch.autograd.enable_grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.enable_grad) | [mindspore.ops.stop_gradient](https://www.mindspore.cn/tutorials/en/master/beginner/autograd.html#stop-gradient) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_diff/stop_gradient.html) |
-| [torch.autograd.grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.grad) | [mindspore.grad](https://mindspore.cn/docs/en/master/api_python/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_diff/grad.html) |
+| [torch.autograd.grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.grad) | [mindspore.grad](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_diff/grad.html) |
| [torch.autograd.no_grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.no_grad) | [mindspore.ops.stop_gradient](https://www.mindspore.cn/tutorials/en/master/beginner/autograd.html#stop-gradient) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/pytorch_diff/stop_gradient.html) |
| [torch.autograd.variable](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.variable-deprecated)| [mindspore.Parameter](https://mindspore.cn/docs/en/master/api_python/mindspore/mindspore.Parameter.html#mindspore.Parameter) | |
diff --git a/docs/mindspore/source_en/note/api_mapping/pytorch_diff/grad.md b/docs/mindspore/source_en/note/api_mapping/pytorch_diff/grad.md
index 0fe78a61bd1735843b118536007afff94e974984..842b440d200ad5b6bc4234972cb349dd5a2c5261 100644
--- a/docs/mindspore/source_en/note/api_mapping/pytorch_diff/grad.md
+++ b/docs/mindspore/source_en/note/api_mapping/pytorch_diff/grad.md
@@ -43,7 +43,7 @@ mindspore.grad(
)
```
-For more information, see [mindspore.grad](https://mindspore.cn/docs/en/master/api_python/mindspore.grad.html).
+For more information, see [mindspore.grad](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.grad.html).
## Differences
diff --git a/docs/mindspore/source_en/note/api_mapping/tensorflow_api_mapping.md b/docs/mindspore/source_en/note/api_mapping/tensorflow_api_mapping.md
index f4577372c255849210cff048f4709835c4fa5e20..01e103edae50683726e0c927d67a0da78bf1e593 100644
--- a/docs/mindspore/source_en/note/api_mapping/tensorflow_api_mapping.md
+++ b/docs/mindspore/source_en/note/api_mapping/tensorflow_api_mapping.md
@@ -14,7 +14,7 @@ More MindSpore developers are also welcome to participate in improving the mappi
| [tf.eye](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/eye) | [mindspore.ops.Eye](https://mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.Eye.html) | |
| [tf.fill](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/fill) | [mindspore.ops.Fill](https://mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.Fill.html) | |
| [tf.gather](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gather) | [mindspore.ops.Gather](https://mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.Gather.html) | |
-| [tf.gradients](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gradients) | [mindspore.grad](https://mindspore.cn/docs/en/master/api_python/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/tensorflow_diff/grad.html) |
+| [tf.gradients](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gradients) | [mindspore.grad](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/en/master/note/api_mapping/tensorflow_diff/grad.html) |
| [tf.ones_like](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/ones_like) | [mindspore.ops.OnesLike](https://mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.OnesLike.html) | |
| [tf.pad](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/pad) | [mindspore.nn.Pad](https://mindspore.cn/docs/en/master/api_python/nn/mindspore.nn.Pad.html) | |
| [tf.print](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/print) | [mindspore.ops.Print](https://mindspore.cn/docs/en/master/api_python/ops/mindspore.ops.Print.html) | |
diff --git a/docs/mindspore/source_en/note/api_mapping/tensorflow_diff/grad.md b/docs/mindspore/source_en/note/api_mapping/tensorflow_diff/grad.md
index 6322f9fc37cf196961c8bb8fef61d281a51b8721..48d5288faeb94e2988d9b2b9235e0d9b3a324e7e 100644
--- a/docs/mindspore/source_en/note/api_mapping/tensorflow_diff/grad.md
+++ b/docs/mindspore/source_en/note/api_mapping/tensorflow_diff/grad.md
@@ -31,7 +31,7 @@ mindspore.grad(
)
```
-For more information, see [mindspore.grad](https://mindspore.cn/docs/en/master/api_python/mindspore.grad.html).
+For more information, see [mindspore.grad](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.grad.html).
## Differences
diff --git a/docs/mindspore/source_zh_cn/note/api_mapping/pytorch_api_mapping.md b/docs/mindspore/source_zh_cn/note/api_mapping/pytorch_api_mapping.md
index 79dde0d3d798729763446296c4690d652113094c..157c4c29569a70621aea35744597bcb00e5d7831 100644
--- a/docs/mindspore/source_zh_cn/note/api_mapping/pytorch_api_mapping.md
+++ b/docs/mindspore/source_zh_cn/note/api_mapping/pytorch_api_mapping.md
@@ -158,7 +158,7 @@
| ----------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| [torch.autograd.backward](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.backward) | [mindspore.grad](https://mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.grad.html#mindspore.ops.grad) | [diff](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/pytorch_diff/grad.html) |
| [torch.autograd.enable_grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.enable_grad) | [mindspore.ops.stop_gradient](https://www.mindspore.cn/tutorials/zh-CN/master/beginner/autograd.html#stop-gradient) | [diff](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/pytorch_diff/stop_gradient.html) |
-| [torch.autograd.grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.grad) | [mindspore.grad](https://mindspore.cn/docs/zh-CN/master/api_python/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/pytorch_diff/grad.html) |
+| [torch.autograd.grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.grad) | [mindspore.grad](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.grad.html) | [diff](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/pytorch_diff/grad.html) |
| [torch.autograd.no_grad](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.no_grad) | [mindspore.ops.stop_gradient](https://www.mindspore.cn/tutorials/zh-CN/master/beginner/autograd.html#stop-gradient) | [diff](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/pytorch_diff/stop_gradient.html) |
| [torch.autograd.variable](https://pytorch.org/docs/1.5.0/autograd.html#torch.autograd.variable-deprecated)| [mindspore.Parameter](https://mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.Parameter.html#mindspore.Parameter) | |
diff --git a/docs/mindspore/source_zh_cn/note/api_mapping/tensorflow_api_mapping.md b/docs/mindspore/source_zh_cn/note/api_mapping/tensorflow_api_mapping.md
index 7f180d67ca48275a5eba82669476b720ead4e89b..abb079dcce9eb16ae846c75c154b6ac0da8dbbb6 100644
--- a/docs/mindspore/source_zh_cn/note/api_mapping/tensorflow_api_mapping.md
+++ b/docs/mindspore/source_zh_cn/note/api_mapping/tensorflow_api_mapping.md
@@ -17,7 +17,7 @@
| [tf.eye](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/eye) | [mindspore.ops.Eye](https://mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.Eye.html) | |
| [tf.fill](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/fill) | [mindspore.ops.Fill](https://mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.Fill.html) | |
| [tf.gather](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gather) | [mindspore.ops.Gather](https://mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.Gather.html) | |
-| [tf.gradients](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gradients) | [mindspore.grad](https://mindspore.cn/docs/zh-CN/master/api_python/mindspore.grad.html) | [Difference comparison](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/tensorflow_diff/grad.html) |
+| [tf.gradients](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/gradients) | [mindspore.grad](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.grad.html) | [Difference comparison](https://www.mindspore.cn/docs/zh-CN/master/note/api_mapping/tensorflow_diff/grad.html) |
| [tf.ones_like](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/ones_like) | [mindspore.ops.OnesLike](https://mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.OnesLike.html) | |
| [tf.pad](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/pad) | [mindspore.nn.Pad](https://mindspore.cn/docs/zh-CN/master/api_python/nn/mindspore.nn.Pad.html) | |
| [tf.print](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/print) | [mindspore.ops.Print](https://mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.Print.html) | |