diff --git a/tutorials/inference/source_en/serving_distributed_example.md b/tutorials/inference/source_en/serving_distributed_example.md
index 9394788a2a005d9b15ca07da8f7b3f61761fd6c2..4a818ac2e086e150e7839f05b7c8f84cd873d1d4 100644
--- a/tutorials/inference/source_en/serving_distributed_example.md
+++ b/tutorials/inference/source_en/serving_distributed_example.md
@@ -1,52 +1,51 @@
-# MindSpore Serving-based Distributed Inference Service Deployment
+# Deploying a Distributed Inference Service Based on MindSpore Serving
-Translator: [xiaoxiaozhang](https://gitee.com/xiaoxinniuniu)
-
-`Linux` `Ascend` `Serving` `Intermediate` `Senior`
+`Linux` `Ascend` `Serving` `Intermediate` `Advanced`
-- [MindSpore Serving-based Distributed Inference Service Deployment](#mindspore-serving-based-distributed-inference-service-deployment)
+- [Deploying a Distributed Inference Service Based on MindSpore Serving](#deploying-a-distributed-inference-service-based-on-mindspore-serving)
- [Overview](#overview)
- [Environment Preparation](#environment-preparation)
- - [Exporting a Distributed Model](#exporting-a-distributed-model)
- - [Deploying the Distributed Inference Service](#deploying-the-distributed-inference-service)
- - [Starting Master and Distributed Worker](#starting-master-and-distributed-worker)
- - [Starting Agent](#starting-agent)
- - [Executing Inference](#executing-inference)
+ - [Exporting the Distributed Model](#exporting-the-distributed-model)
+ - [Deploying the Distributed Inference Service](#deploying-the-distributed-inference-service)
+ - [Service Restriction](#service-restriction)
+ - [Launching Master and Distributed Worker](#launching-master-and-distributed-worker)
+ - [Launching Agent](#launching-agent)
+ - [Performing Inference](#performing-inference)
-
+
## Overview
-Distributed inference means that multiple cards are used in the inference phase, in order to solve the problem that too many parameters are in the very large scale neural network and the model cannot be fully loaded into a single card for inference, multi-cards can be used for distributed inference. This document describes the process of deploying the distributed inference service, which is similar to the process of deploying the [single-card inference service](https://www.mindspore.cn/tutorial/inference/en/master/serving_example.html), and these two can refer to each other.
+Distributed inference means that multiple cards are used in the inference stage. For super-large-scale neural networks whose parameters are too numerous for the model to be loaded completely onto a single card, multiple cards can be used for distributed inference. This article describes the process of deploying a distributed inference service, which is much the same as deploying a [single-card inference service](https://www.mindspore.cn/tutorial/inference/zh-CN/master/serving_example.html), so the two documents can be cross-referenced.
-The architecture of the distributed inference service shows as follows:
+The architecture of the distributed inference service is shown in the following figure:

-The master provides an interface for client access, manages distributed workers, and performs task management and distribution; Distributed workers automatically schedule agents based on model configurations to complete distributed inference; Each agent contains a slice of the distributed model, occupies a device, and loads the model to performance inference.
+The master provides an interface for client access, manages the distributed worker, and performs task management and distribution; the distributed worker automatically dispatches agents according to the model configuration to complete distributed inference; each agent contains a slice of the distributed model, occupies one device, and loads the model to perform inference.
-The preceding figure shows the scenario where rank_size is 16 and stage_size is 2. Each stage contains 8 agents and occupies 8 devices. rank_size indicates the number of devices used in inference, stage indicates a pipeline segment, and stage_size indicates the number of pipeline segments. The distributed worker sends an inference requests to the agent and obtains the inference result from the agent. Agents communicate with each other using HCCL.
+The figure above shows the scenario where rank_size is 16 and stage_size is 2; each stage contains 8 agents and occupies 8 devices. rank_size represents the number of devices used for inference, stage represents a pipeline segment, and stage_size represents the number of segments in the pipeline. The distributed worker sends inference requests to the agents and obtains the inference results from them, and the agents communicate with each other through HCCL.
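+
+As a purely illustrative sketch (not part of the tutorial, and assuming that consecutive ranks are grouped into stages as in the figure), the relationship between rank_size, stage_size, and the per-stage device count can be expressed as:
+
+```python
+# Illustration only: how 16 ranks split into 2 pipeline stages of 8 agents each.
+rank_size, stage_size = 16, 2
+ranks_per_stage = rank_size // stage_size  # 8 devices, one agent per device, in each stage
+for rank_id in range(rank_size):
+    print(f"rank {rank_id} -> stage {rank_id // ranks_per_stage}")
+```
+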
-Currently, the distributed model has the following restrictions:
+Currently, the distributed model has the following restrictions:
- The model of the first stage receives the same input data.
- The models of other stages do not receive data.
-- All models of the latter stage return the same data.
-- Only Ascend 910 inference is supported.
+- All models in the last stage return the same data.
+- Only Ascend 910 inference is supported.
-The following uses a simple distributed network MatMul as an example to demonstrate the deployment process.
+The following takes a simple distributed network, MatMul, as an example to illustrate the deployment process.
### Environment Preparation
-Before running the example, ensure that MindSpore Serving has been correctly installed. If not, install MindSpore Serving by referring to the [MindSpore Serving installation page](https://gitee.com/mindspore/serving/blob/master/README.md#installation), and configure environment variables by referring to the [MindSpore Serving environment configuration page](https://gitee.com/mindspore/serving/blob/master/README.md#configuring-environment-variables).
+Before running the example, make sure MindSpore Serving is installed correctly. If not, install MindSpore Serving by referring to the [MindSpore Serving installation page](https://gitee.com/mindspore/serving/blob/master/README_CN.md#%E5%AE%89%E8%A3%85), and configure the environment variables by referring to the [MindSpore Serving environment configuration page](https://gitee.com/mindspore/serving/blob/master/README_CN.md#%E9%85%8D%E7%BD%AE%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F).
-### Exporting a Distributed Model
+### Exporting the Distributed Model
-For details about the files required for exporting distributed models, see the [export_model directory](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_ascend.html#id4), the following files are required:
+You can refer to the [export_model directory](https://gitee.com/mindspore/serving/tree/master/example/matmul_distributed/export_model) for the files needed to export the distributed model; the following files are required:
```text
export_model
@@ -56,10 +55,10 @@ export_model
└── rank_table_8pcs.json
```
-- `net.py` contains the definition of MatMul network.
+- `net.py` contains the definition of the MatMul network.
- `distributed_inference.py` is used to configure distributed parameters.
-- `export_model.sh` creates `device` directory on the current host and exports model files corresponding to `device`.
-- `rank_table_8pcs.json` is a json file for configuring the multi-cards network. For details, see [rank_table](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_ascend.html#id4).
+- `export_model.sh` creates the `device` directories on the current host and exports the corresponding model files for each `device`.
+- `rank_table_8pcs.json` is a JSON file that configures the networking information of the current multi-card environment; see [rank_table](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_ascend.html#id4) for details.
Use [net.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/export_model/net.py) to construct a network that contains the MatMul and Neg operators.
@@ -85,7 +84,7 @@ class Net(Cell):
return x
```
-Use [distributed_inference.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/export_model/distributed_inference.py) to configure the distributed model. Refer to [Distributed inference](https://www.mindspore.cn/tutorial/inference/en/master/multi_platform_inference_ascend_910.html#id1)。
+Use [distributed_inference.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/export_model/distributed_inference.py) to configure the distributed model. Refer to [Distributed inference](https://www.mindspore.cn/tutorial/inference/zh-CN/master/multi_platform_inference_ascend_910.html#id1).
```python
import numpy as np
@@ -114,13 +113,13 @@ def create_predict_data():
return Tensor(inputs_np)
```
-Run [export_model.sh](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/export_model/export_model.sh) to export the distributed model. After the command is executed successfully, the `model` directory is created in the upper-level directory. The structure is as follows:
+Run [export_model.sh](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/export_model/export_model.sh) to export the distributed model. After successful execution, the `model` directory will be created in the upper-level directory with the following structure:
```text
model
├── device0
-│ ├── group_config.pb
-│ └── matmul.mindir
+│ ├── group_config.pb
+│ └── matmul.mindir
├── device1
├── device2
├── device3
@@ -132,16 +131,16 @@ model
Each `device` directory contains two files, `group_config.pb` and `matmul.mindir`, which represent the model group configuration file and model file respectively.
-### Deploying the Distributed Inference Service
+### Deploying the Distributed Inference Service
-For details about how to start the distributed inference service, refer to [matmul_distributed](https://gitee.com/mindspore/serving/tree/master/example/matmul_distributed), the following files are required:
+To start the distributed inference service, refer to [matmul_distributed](https://gitee.com/mindspore/serving/tree/master/example/matmul_distributed); the following files are required:
```text
matmul_distributed
├── agent.py
├── master_with_worker.py
├── matmul
-│ └── servable_config.py
+│ └── servable_config.py
├── model
└── rank_table_8pcs.json
```
@@ -149,7 +148,7 @@ matmul_distributed
- `model` is the directory for storing model files.
- `master_with_worker.py` is the script for starting services.
- `agent.py` is the script for starting agents.
-- `servable_config.py` is the [Model Configuration File](https://www.mindspore.cn/tutorial/inference/en/master/serving_model.html). It declares a distributed model with rank_size 8 and stage_size 1 through `declare_distributed_servable`, and defines a method `predict` for distributed servable.
+- `servable_config.py` is the [Model Configuration File](https://www.mindspore.cn/tutorial/inference/zh-CN/master/serving_model.html). It declares a distributed model with rank_size 8 and stage_size 1 through `declare_distributed_servable`, and defines a method `predict` for distributed servable.
The content of the model configuration file is as follows:
@@ -166,7 +165,7 @@ def predict(x):
return y
```
-#### Starting Master and Distributed Worker
+#### Launching Master and Distributed Worker
Use [master_with_worker.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/master_with_worker.py) to call `start_distributed_servable_in_master` method to deploy the co-process master and distributed workers.
@@ -193,15 +192,15 @@ if __name__ == "__main__":
```
- `servable_dir` is the directory for storing a servable.
-- `servable_name` is the name of the servable, which corresponds to a directory for storing model configuration files.
-- `rank_table_json_file` is the JSON file for configuring multi-cards network.
+- `servable_name` is the name of the servable, corresponding to a directory for storing model configuration files.
+- `rank_table_json_file` is the JSON file that configures the networking information of the multi-card environment.
- `worker_ip` is the IP address of the distributed worker.
- `worker_port` is the port of the distributed worker.
-- `wait_agents_time_in_seconds` specifies the duration of waiting for all agents to be registered, the default value 0 means it will wait forever.
+- `wait_agents_time_in_seconds` sets the time limit for waiting for all agents to register; the default value 0 means waiting indefinitely.
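+
+The `master_with_worker.py` script in the repository already performs this call; the following is only a consolidated sketch of how the parameters listed above fit together. It assumes the method is exposed from the same `mindspore_serving.worker.distributed` module that `agent.py` imports, and the address and port values are placeholders; starting the gRPC endpoint that the client later connects to is omitted.
+
+```python
+from mindspore_serving.worker import distributed
+
+if __name__ == "__main__":
+    # servable_dir/servable_name point at the matmul servable_config.py described above.
+    # worker_ip/worker_port are placeholders and must match the values passed to the agents.
+    distributed.start_distributed_servable_in_master(
+        servable_dir=".", servable_name="matmul",
+        rank_table_json_file="rank_table_8pcs.json",
+        worker_ip="127.0.0.1", worker_port=6200,
+        wait_agents_time_in_seconds=0)
+```
+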
-#### Starting Agent
+#### Launching Agent
-Use [agent.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/agent.py) to call `startup_worker_agents` method to start 8 agent processes on the current host. Agents obtain rank_tables from distributed workers so that agents can communicate with each other using HCCL.
+Use [agent.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/agent.py) to call the `startup_worker_agents` method to start 8 agent processes on the current host. The agents obtain the rank_table from the distributed worker so that they can communicate with each other through HCCL.
```python
from mindspore_serving.worker import distributed
@@ -228,13 +227,12 @@ if __name__ == '__main__':
- `worker_port` is the port of the distributed worker.
- `model_files` is a list of model file paths.
- `group_config_files` is a list of model group configuration file paths.
-- `agent_start_port` is the start port used by the agent. The default value is 7000.
+- `agent_start_port` represents the start port used by the agents. The default value is 7000.
- `agent_ip` is the IP address of an agent. The default value is None. The IP address used by the agent to communicate with the distributed worker is obtained from rank_table by default. If the IP address is unavailable, you need to set both `agent_ip` and `rank_start`.
-- `rank_start` is the start rank_id of the current server, the default value is None.
-
-### Executing Inference
+- `rank_start` is the start rank_id of the current server; the default value is None.
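+
+As with the master side, the repository's `agent.py` already contains the real code; the sketch below only illustrates how the arguments described above might be assembled, with the file paths following the `model` directory layout shown earlier and the worker address and port as placeholders.
+
+```python
+import os
+
+from mindspore_serving.worker import distributed
+
+# Paths follow the exported layout: model/device{0..7}/matmul.mindir and group_config.pb.
+model_dir = "model"
+model_files = [os.path.join(model_dir, f"device{i}", "matmul.mindir") for i in range(8)]
+group_config_files = [os.path.join(model_dir, f"device{i}", "group_config.pb") for i in range(8)]
+
+if __name__ == "__main__":
+    # worker_ip/worker_port are placeholders; they must match the distributed worker
+    # started by master_with_worker.py.
+    distributed.startup_worker_agents(worker_ip="127.0.0.1", worker_port=6200,
+                                      model_files=model_files,
+                                      group_config_files=group_config_files)
+```
+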
+### Performing Inference
-To access the inference service through gRPC, the client needs to specify the IP address and port of the gRPC server. Run [client.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/client.py) to call the `predict` method of matmul distributed model, execute inference.
+To access the inference service through gRPC, the client needs to specify the IP address and port of the gRPC server. Run [client.py](https://gitee.com/mindspore/serving/blob/master/example/matmul_distributed/client.py) to call the `predict` method of the matmul distributed model and perform inference.
```python
import numpy as np
@@ -253,7 +251,7 @@ if __name__ == '__main__':
run_matmul()
```
-The following return value indicates that the Serving distributed inference service has correctly executed the inference of MatMul net:
+The following return value is displayed after execution, indicating that the Serving distributed inference service has correctly executed inference for the MatMul network:
```text
result:
@@ -265,3 +263,4 @@ result:
[-48., -48., -48., ..., -48., -48., -48.],
[-48., -48., -48., ..., -48., -48., -48.]], dtype=float32)}]
```
+