diff --git a/docs/en/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md b/docs/en/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md
index e6f6e3f4c44cb1ac82bc04d17de0bf92326030a5..fd38118821511e314aae8f84090a6cc5fc417315 100644
--- a/docs/en/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md
+++ b/docs/en/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md
@@ -2,63 +2,65 @@
 ## Overview
 
-sysHAX is positioned as K+X heterogeneous fusion inference acceleration, mainly containing two functional components:
+sysHAX is positioned as K+X heterogeneous fusion inference acceleration and mainly consists of two functional components:
 
 - Dynamic inference scheduling
 - CPU inference acceleration
 
-**Dynamic inference scheduling**: For inference tasks, the prefill phase belongs to compute-intensive tasks, while the decode phase belongs to memory-intensive tasks. Therefore, from the perspective of computational resources, the prefill phase is suitable for execution on GPU/NPU and other hardware, while the decode phase can be executed on CPU and other hardware.
+**Dynamic inference scheduling**: In an inference task, the prefill stage is compute-intensive, while the decode stage is memory-access-intensive. From the perspective of computing resources, the prefill stage is therefore suited to hardware such as GPUs/NPUs, whereas the decode stage can run on hardware such as CPUs.
 
-**CPU inference acceleration**: Accelerates CPU inference performance through NUMA affinity, parallel optimization, operator optimization, and other methods on the CPU.
+**CPU inference acceleration**: Accelerates CPU inference through NUMA affinity, parallel optimization, and operator optimization.
 
-sysHAX consists of two delivery components:
+sysHAX includes two delivery components:
 
-![syshax-deploy](pictures/syshax-deploy.png)
+![syshax-deploy](pictures/syshax-deploy.png "syshax-deploy")
 
 The delivery components include:
 
-- sysHAX: Responsible for request processing and scheduling of prefill and decode requests
-- vllm: vllm is a large model inference service that includes both GPU/NPU and CPU during deployment, used for processing prefill and decode requests respectively. From the perspective of developer usability, vllm will be released using containerization.
+- sysHAX: responsible for request processing and for scheduling prefill and decode requests
+- vllm: a large model inference service deployed in both GPU/NPU and CPU versions, which handle prefill and decode requests respectively. For developer usability, vllm is released in containerized form.
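+
+The two delivery components cooperate as sketched below; this is an illustrative summary, and the ports match the defaults used by the deployment examples later in this guide:
+
+```shell
+# Request flow (illustrative):
+#
+#   client ──> sysHAX conductor :8010 ──> vllm GPU container :8001  (prefill)
+#                                     └─> vllm CPU container :8002  (decode)
+```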
-vllm is a **high-throughput, low-memory-usage large language model (LLM) inference and service engine** that supports **CPU computation acceleration** and provides efficient operator dispatch mechanisms, including:
+vllm is a **high-throughput, low-memory-footprint large language model (LLM) inference and serving engine** that supports **CPU computation acceleration** and provides an efficient operator dispatch mechanism, including:
 
-- Schedule: Optimizes task distribution to improve parallel computation efficiency
-- Prepare Input: Efficient data preprocessing to accelerate input construction
-- Ray framework: Utilizes distributed computing to improve inference throughput
-- Sample (model post-processing): Optimizes sampling strategies to improve generation quality
-- Framework post-processing: Integrates multiple optimization strategies to improve overall inference performance
+- Schedule: optimizes task distribution to improve parallel computing efficiency
+- Prepare Input: preprocesses data efficiently to accelerate input construction
+- Ray framework: uses distributed computing to raise inference throughput
+- Sample (model post-processing): optimizes sampling strategies to improve generation quality
+- Framework post-processing: integrates multiple optimization strategies to improve overall inference performance
 
-This engine combines **efficient computation scheduling and optimization strategies** to provide **faster, more stable, and more scalable** solutions for LLM inference.
+This engine combines **efficient computation scheduling and optimization strategies** to provide **faster, more stable, and more scalable** LLM inference.
 
 ## Environment Preparation
 
-| Server Model | Kunpeng 920 Series CPU |
-| ------------ | ---------------------- |
-| GPU | Nvidia A100 |
-| OS | openEuler 22.03 LTS and above |
-| Python | 3.9 and above |
-| Docker | 25.0.3 and above |
+| Item | Requirement |
+| ---------------- | ---------------------------------------- |
+| Server model | Kunpeng 920 series CPU |
+| GPU | NVIDIA A100 |
+| Operating system | openEuler 24.03 LTS SP1 |
+| Python | 3.9 or later |
+| Docker | 25.0.3 or later |
 
 - Docker 25.0.3 can be installed via `dnf install moby`.
+- Note: among AI accelerator cards, sysHAX currently supports only NVIDIA GPUs; Ascend NPU adaptation is in progress.
 
 ## Deployment Process
 
-First, check whether NVIDIA drivers and CUDA drivers are already installed using `nvidia-smi` and `nvcc -V`. If not, you need to install NVIDIA drivers and CUDA drivers first.
+First, check whether the NVIDIA driver and the CUDA driver are installed by running `nvidia-smi` and `nvcc -V`. If they are not, install them first.
 
 ### Install NVIDIA Container Toolkit (Container Engine Plugin)
 
-If NVIDIA Container Toolkit is already installed, you can skip this step. Otherwise, follow the installation process below:
+If the NVIDIA Container Toolkit is already installed, you can skip this step. Otherwise, install it following the process below:
 
-- Execute the `systemctl restart docker` command to restart docker, making the container engine plugin configuration in the docker config file effective.
+- Execute the `systemctl restart docker` command to restart Docker so that the container engine plugin's additions to the Docker configuration file take effect.
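+
+If the toolkit packages are installed but Docker has not yet been configured to use them, the toolkit's own helper can generate that configuration. This is a hedged sketch assuming `nvidia-ctk` (shipped with the NVIDIA Container Toolkit) is on the PATH:
+
+```shell
+# Register the NVIDIA runtime in /etc/docker/daemon.json, then restart Docker.
+nvidia-ctk runtime configure --runtime=docker
+systemctl restart docker
+
+# Verify that the nvidia runtime is now listed among Docker's runtimes.
+docker info | grep -i runtimes
+```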
 ### Container-based vllm Setup
 
 The following process deploys vllm in a GPU container.
 
 ```shell
-docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.0
+docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.1
 
 docker run --name vllm_gpu \
   --ipc="shareable" \
@@ -67,40 +69,42 @@ docker run --name vllm_gpu \
   -p 8001:8001 \
   -v /home/models:/home/models \
   -w /home/ \
-  -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.0 bash
+  -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.1 bash
 ```
 
 In the above script:
 
-- `--ipc="shareable"`: Allows the container to share IPC namespace for inter-process communication.
-- `--shm-size=64g`: Sets the container shared memory to 64G.
-- `--gpus=all`: Allows the container to use all GPU devices on the host
-- `-p 8001:8001`: Port mapping, mapping the host's port 8001 to the container's port 8001. Developers can modify this as needed.
-- `-v /home/models:/home/models`: Directory mounting, mapping the host's `/home/models` to `/home/models` inside the container for model sharing. Developers can modify the mapping directory as needed.
+- `--ipc="shareable"`: Allows the container's IPC namespace to be shared for inter-process communication.
+- `--shm-size=64g`: Sets the container's shared memory to 64 GB.
+- `--gpus=all`: Allows the container to use all GPU devices on the host.
+- `-p 8001:8001`: Maps host port 8001 to container port 8001; developers can modify this as needed.
+- `-v /home/models:/home/models`: Mounts the host's `/home/models` into the container at `/home/models` so models can be shared; developers can modify the mapped directory as needed.
 
 ```shell
 vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \
   --served-model-name=ds-32b \
   --host 0.0.0.0 \
   --port 8001 \
-  --dtype=half \
+  --dtype=auto \
   --swap_space=16 \
   --block_size=16 \
   --preemption_mode=swap \
   --max_model_len=8192 \
   --tensor-parallel-size 2 \
-  --gpu_memory_utilization=0.8
+  --gpu_memory_utilization=0.8 \
+  --enable-auto-pd-offload
 ```
 
 In the above script:
 
-- `--tensor-parallel-size 2`: Enables tensor parallelism, splitting the model across 2 GPUs. Requires at least 2 GPUs. Developers can modify this as needed.
-- `--gpu_memory_utilization=0.8`: Limits GPU memory usage to 80% to avoid service crashes due to memory exhaustion. Developers can modify this as needed.
+- `--tensor-parallel-size 2`: Enables tensor parallelism, splitting the model across 2 GPUs (at least 2 GPUs are required); developers can modify this as needed.
+- `--gpu_memory_utilization=0.8`: Limits GPU memory usage to 80% to prevent the service from crashing when GPU memory is exhausted; developers can modify this as needed.
+- `--enable-auto-pd-offload`: Enables prefill/decode (PD) offload when a swap-out is triggered.
 
 The following process deploys vllm in a CPU container.
 
 ```shell
-docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.0
+docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.1
 
 docker run --name vllm_cpu \
   --ipc container:vllm_gpu \
@@ -109,15 +113,15 @@ docker run --name vllm_cpu \
   -p 8002:8002 \
   -v /home/models:/home/models \
   -w /home/ \
-  -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.0 bash
+  -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.1 bash
 ```
 
 In the above script:
 
-- `--ipc container:vllm_gpu`: Shares the IPC (inter-process communication) namespace with the container named vllm_gpu. Allows this container to exchange data directly through shared memory, avoiding cross-container copying.
+- `--ipc container:vllm_gpu`: Shares the IPC (inter-process communication) namespace of the container named vllm_gpu, allowing this container to exchange data directly through shared memory and avoiding cross-container copies.
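+
+As an optional sanity check (standard Docker inspection, assuming the container names used above), you can confirm that the CPU container joined the GPU container's IPC namespace:
+
+```shell
+# Expected output: container:<id or name of vllm_gpu>
+docker inspect --format '{{.HostConfig.IpcMode}}' vllm_cpu
+```
+
+Then start the CPU-side inference service in the vllm_cpu container: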
 
 ```shell
-INFERENCE_OP_MODE=fused OMP_NUM_THREADS=160 CUSTOM_CPU_AFFINITY=0-159 SYSHAX_QUANTIZE=q8_0 \
+NRC=4 INFERENCE_OP_MODE=fused OMP_NUM_THREADS=160 CUSTOM_CPU_AFFINITY=0-159 SYSHAX_QUANTIZE=q4_0 \
 vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \
   --served-model-name=ds-32b \
   --host 0.0.0.0 \
@@ -125,19 +129,21 @@ vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \
   --dtype=half \
   --block_size=16 \
   --preemption_mode=swap \
-  --max_model_len=8192
+  --max_model_len=8192 \
+  --enable-auto-pd-offload
 ```
 
 In the above script:
 
-- `INFERENCE_OP_MODE=fused`: Enables CPU inference acceleration
-- `OMP_NUM_THREADS=160`: Specifies the number of CPU inference threads to start as 160. This environment variable only takes effect after specifying INFERENCE_OP_MODE=fused.
-- `CUSTOM_CPU_AFFINITY=0-159`: Specifies the CPU binding scheme, which will be explained in detail later.
-- `SYSHAX_QUANTIZE=q8_0`: Specifies the quantization scheme as q8_0. The current version supports 2 quantization schemes: `q8_0`, `q4_0`.
+- `INFERENCE_OP_MODE=fused`: Enables CPU inference acceleration.
+- `OMP_NUM_THREADS=160`: Sets the number of CPU inference threads to 160. This environment variable takes effect only when INFERENCE_OP_MODE=fused is set.
+- `CUSTOM_CPU_AFFINITY=0-159`: Specifies the CPU binding scheme, explained in detail below.
+- `SYSHAX_QUANTIZE=q4_0`: Specifies the quantization scheme as q4_0. The current version supports 2 quantization schemes: `q8_0` and `q4_0`.
+- `NRC=4`: Sets the GEMV operator blocking mode; this environment variable delivers good acceleration on Kunpeng 920 series processors.
 
-Note: The GPU container must be started first before starting the CPU container.
+Note that the GPU container must be started before the CPU container.
 
 Use lscpu to check the current machine's hardware configuration, focusing on:
 
 ```shell
 Architecture: aarch64
@@ -160,26 +166,33 @@ NUMA:
   NUMA node3 CPU(s): 120-159
 ```
 
-This machine has 160 physical cores, no SMT enabled, 4 NUMA nodes, with 40 cores on each NUMA.
+This machine has 160 physical cores with SMT disabled and 4 NUMA nodes, each with 40 cores.
 
-Use these two environment variables to set the CPU binding scheme: `OMP_NUM_THREADS=160 CUSTOM_CPU_AFFINITY=0-159`. In these two environment variables, the first one is the number of CPU inference threads to start, and the second one is the IDs of the CPUs to bind. In CPU inference acceleration, to achieve NUMA affinity, CPU binding operations are required, following these rules:
+Set the CPU binding scheme with the two environment variables `OMP_NUM_THREADS=160 CUSTOM_CPU_AFFINITY=0-159`: the first sets the number of CPU inference threads to start, and the second lists the IDs of the CPUs to bind. To achieve NUMA affinity in CPU inference acceleration, CPU binding must follow these rules:
 
 - The number of started threads must match the number of bound CPUs;
 - The number of CPUs used on each NUMA must be the same to maintain load balancing.
 
 For example, in the above script, CPUs 0-159 are bound. Among them, 0-39 belong to NUMA node 0, 40-79 belong to NUMA node 1, 80-119 belong to NUMA node 2, and 120-159 belong to NUMA node 3. Each NUMA uses 40 CPUs, ensuring load balancing across all NUMAs.
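+
+The same rules carry over to other binding layouts. As a purely hypothetical sketch on this 4-node machine (assuming `CUSTOM_CPU_AFFINITY` accepts comma-separated CPU ranges), an 80-thread configuration would take 20 CPUs from each NUMA node and keep `OMP_NUM_THREADS` equal to the number of bound CPUs:
+
+```shell
+# Hypothetical: 80 threads = 4 NUMA nodes x 20 CPUs each.
+NRC=4 INFERENCE_OP_MODE=fused OMP_NUM_THREADS=80 \
+CUSTOM_CPU_AFFINITY=0-19,40-59,80-99,120-139 SYSHAX_QUANTIZE=q4_0 \
+vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B --served-model-name=ds-32b
+```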
 ### sysHAX Installation
 
-sysHAX installation:
+sysHAX can be installed in two ways. The first is to install the RPM package via dnf; note that this method requires upgrading openEuler to 24.03 LTS SP2 or later:
 
 ```shell
 dnf install sysHAX
 ```
 
-Before starting sysHAX, some basic configuration is needed:
+Alternatively, run it directly from source:
 
 ```shell
+git clone -b v0.2.0 https://gitee.com/openeuler/sysHAX.git
+```
+
+Before starting sysHAX, some basic configuration is needed:
+
+```shell
+# When sysHAX was installed via dnf install sysHAX
 syshax init
 syshax config services.gpu.port 8001
 syshax config services.cpu.port 8002
@@ -187,15 +200,30 @@ syshax config services.conductor.port 8010
 syshax config models.default ds-32b
 ```
 
-Additionally, you can use `syshax config --help` to view all configuration commands.
+```shell
+# When sysHAX was cloned via git clone -b v0.2.0 https://gitee.com/openeuler/sysHAX.git
+python3 cli.py init
+python3 cli.py config services.gpu.port 8001
+python3 cli.py config services.cpu.port 8002
+python3 cli.py config services.conductor.port 8010
+python3 cli.py config models.default ds-32b
+```
+
+Additionally, you can use `syshax config --help` or `python3 cli.py config --help` to view all configuration commands.
 
 After configuration is complete, start the sysHAX service with the following command:
 
 ```shell
+# When sysHAX was installed via dnf install sysHAX
 syshax run
 ```
 
-When starting the sysHAX service, service connectivity testing will be performed. sysHAX complies with openAPI standards. Once the service is started, you can use APIs to call the large model service. You can test it with the following script:
+```shell
+# When sysHAX was cloned via git clone -b v0.2.0 https://gitee.com/openeuler/sysHAX.git
+python3 main.py
+```
+
+When the sysHAX service starts, a connectivity test is performed. sysHAX complies with the OpenAPI standard, so once the service is up you can call the large model service through its API. Test it with the following script:
 
 ```shell
 curl http://0.0.0.0:8010/v1/chat/completions -H "Content-Type: application/json" -d '{
@@ -203,10 +231,9 @@ curl http://0.0.0.0:8010/v1/chat/completions -H "Content-Type: application/json"
     "messages": [
         {
             "role": "user",
             "content": "Introduce openEuler."
         }
     ],
     "stream": true,
    "max_tokens": 1024
 }'
 ```
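+
+Since sysHAX complies with the OpenAPI standard, OpenAI-compatible clients should also work. As a hedged extra check (assuming sysHAX forwards the standard model-listing route), the following should return the `ds-32b` model configured above:
+
+```shell
+curl http://0.0.0.0:8010/v1/models
+```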
diff --git a/docs/en/openeuler_intelligence/mcp_agent/_toc.yaml b/docs/en/openeuler_intelligence/mcp_agent/_toc.yaml
index 7af448ed9960b64b9cf3359549c785b3ca155d09..c69c3f2d2bc0b0139e629ef3d2695bc2c2963b6a 100644
--- a/docs/en/openeuler_intelligence/mcp_agent/_toc.yaml
+++ b/docs/en/openeuler_intelligence/mcp_agent/_toc.yaml
@@ -1,6 +1,6 @@
 label: openEuler MCP Service Guide
 isManual: true
 description: openEuler Intelligent MCP
 sections:
   - label: openEuler MCP Service Guide
     href: ./mcp_guide.md
\ No newline at end of file
diff --git a/docs/en/openeuler_intelligence/mcp_agent/mcp_guide.md b/docs/en/openeuler_intelligence/mcp_agent/mcp_guide.md
index 68b35325d03ddabed51005746b9008b6fb89a0fc..2e437b1c4972880fb08afe09492c87062a4c905d 100644
--- a/docs/en/openeuler_intelligence/mcp_agent/mcp_guide.md
+++ b/docs/en/openeuler_intelligence/mcp_agent/mcp_guide.md
@@ -1,3 +1,90 @@
 # MCP Service Guide
 
-(Current content is being updated, please wait.)
+## 1. Overview
+
+The current version of openEuler intelligence provides enhanced support for MCP. The usage process consists of the following steps:
+
+1. Register MCP
+2. Install MCP
+3. Activate MCP and load configuration files
+4. Build an Agent based on the activated MCP
+5. Test the Agent
+6. Publish the Agent
+7. Use the Agent
+
+> **Note**:
+>
+> - Registering, installing, and activating MCP require administrator privileges
+> - Building, testing, publishing, and using Agents are ordinary-user operations
+> - All Agent-related operations must be based on an activated MCP
+
+## 2. Registering, Installing, and Activating MCP
+
+The following process uses an administrator account to demonstrate the complete MCP management flow:
+
+1. **Register MCP**
+   Register the MCP with the openEuler intelligence system through the "MCP Registration" button in the plugin center
+   ![mcp register button](pictures/regeister_mcp_button.png)
+
+   Clicking the button opens the registration window (the default SSE and STDIO configurations are shown below):
+   ![sse register window](pictures/sse_mcp_register.png)
+   ![stdio register window](pictures/stdio_mcp_register.png)
+
+   Taking SSE registration as an example, fill in the configuration information and click "Save"
+   ![fill in mcp configuration file](pictures/add_mcp.png)
+
+2. **Install MCP**
+
+   > **Note**: Before installing a STDIO service, you can adjust its dependency files and permissions in the `/opt/copilot/semantics/mcp/template` directory on the corresponding container or server
+
+   Click the "Install" button on the registered MCP card
+   ![mcp install button](pictures/sse_mcp_intstalling.png)
+
+3. **View MCP Tools**
+   After successful installation, click the MCP card to view the tools the service supports
+   ![mcp service tools](pictures/mcp_details.png)
+
+4. **Activate MCP**
+   Click the "Activate" button to enable the MCP service
+   ![mcp activate button](pictures/activate_mcp.png)
+
+## 3. Creating, Testing, Publishing, and Using Agent Applications
+
+The following operations can be completed by ordinary users; all of them must be based on an activated MCP:
+
+1. **Create an Agent Application**
+   Click the "Create Application" button in the application center
+   ![create agent application](pictures/create_app_button.png)
+
+2. **Configure the Agent Application**
+   After successful creation, click the application card to enter the details page, where you can modify the application's configuration
+   ![modify agent application configuration](pictures/edit_Agent_app_message.png)
+
+3. **Associate MCP**
+   Click the "Add MCP" button and select an activated MCP from the list that pops up on the left
+   ![add mcp](pictures/add_mcp_button.png)
+   ![mcp list](pictures/add_mcp.png)
+
+4. **Test the Agent Application**
+   After completing the MCP association and configuration, click the "Test" button in the lower right corner to run a functional test
+   ![test agent application](pictures/test_mcp.png)
+
+5. **Publish the Agent Application**
+   After the test passes, click the "Publish" button in the lower right corner to publish the application
+   ![publish agent application](pictures/publish_app.png)
+
+6. **Use the Agent Application**
+   The published application appears in the application market; double-click it to use it
+   ![use agent application](pictures/click_and_use.png)
+
+   Agent applications support two usage modes:
+
+   - **Automatic mode**: operations are executed automatically, without manual user confirmation
+   ![automatic use agent application](pictures/chat_with_auto_excute.png)
+
+   - **Manual mode**: risks are flagged before execution, and execution proceeds only after user confirmation
+   ![manual use agent application](pictures/chat_with_not_auto_excute.png)
+
+## 4. Summary
+
+Through the above process, users can build and use customized Agent applications based on MCP. Feel free to explore more scenarios.
diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/activate_mcp.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/activate_mcp.png
new file mode 100644
index 0000000000000000000000000000000000000000..61533013f8629ce483c3ccb074ab0c1041c430cb
Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/activate_mcp.png differ
diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/add_mcp.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/add_mcp.png
new file mode 100644
index 0000000000000000000000000000000000000000..32f0f84879dcef4193cb912878bea5d87a86dd1c
Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/add_mcp.png differ
diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/add_mcp_button.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/add_mcp_button.png
new file mode 100644
index 0000000000000000000000000000000000000000..ad6ceb728d861cf16bcc3fbe1071c1ff376e0956
Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/add_mcp_button.png differ
diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/chat_with_auto_excute.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/chat_with_auto_excute.png
new file mode 100644
index 0000000000000000000000000000000000000000..f7eb8e06eeb54e3f5c038b60bfd603953cae30f3
Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/chat_with_auto_excute.png differ
diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/chat_with_not_auto_excute.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/chat_with_not_auto_excute.png
new file mode 100644
index 0000000000000000000000000000000000000000..1bfa31b6396680e0359f2176750462ca33862594
Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/chat_with_not_auto_excute.png differ
diff --git
a/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_Agent_app_to_create.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_Agent_app_to_create.png new file mode 100644 index 0000000000000000000000000000000000000000..b1c200e6b0092c043b3cb99f3c7def4fe101ad94 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_Agent_app_to_create.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_mcp.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..4d50d4e3f6330062846d85d7bf7b66e34b1a3b98 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_mcp.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_mcp_2.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_mcp_2.png new file mode 100644 index 0000000000000000000000000000000000000000..693d6e607c2872123e1a506237f53cacebad3af8 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/choose_mcp_2.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/click_and_use.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/click_and_use.png new file mode 100644 index 0000000000000000000000000000000000000000..281e8c736baf98c29cf3991e43dfe18cb4fc89be Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/click_and_use.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/create_app_button.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/create_app_button.png new file mode 100644 index 0000000000000000000000000000000000000000..fce7d20de9e28a6e066f0cd9be16fc2dcdd350c7 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/create_app_button.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/edit_Agent_app_message.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/edit_Agent_app_message.png new file mode 100644 index 0000000000000000000000000000000000000000..eea98c7db7b4c93ea54bd83b18687159ed7e4c6b Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/edit_Agent_app_message.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/installed_sse_mcp.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/installed_sse_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..f9d98d5f7aa424e0ceef901688424538cd26b18d Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/installed_sse_mcp.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/mcp_details.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/mcp_details.png new file mode 100644 index 0000000000000000000000000000000000000000..f374a96687a4d98d65ee199be059212a61cf2692 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/mcp_details.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/publish_app.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/publish_app.png new file mode 100644 index 0000000000000000000000000000000000000000..3785369ddd009e4c0e2d08d2134dfffde5a101b6 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/publish_app.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/regeister_mcp_button.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/regeister_mcp_button.png new file mode 100644 index 
0000000000000000000000000000000000000000..f3145eb9c93123b7b7417239f71d564f8ce4558c Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/regeister_mcp_button.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_config.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_config.png new file mode 100644 index 0000000000000000000000000000000000000000..2f9fba9632ebf4da993ac62283c4cf3120a45faf Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_config.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_intstalling.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_intstalling.png new file mode 100644 index 0000000000000000000000000000000000000000..52f6ab4d5a51f3af5710c8c024e127fdff2a7e00 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_intstalling.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_register.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_register.png new file mode 100644 index 0000000000000000000000000000000000000000..8ed20b064a11c3442ec7fafcf76bbd2f6fab0a64 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/sse_mcp_register.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/stdio_mcp_register.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/stdio_mcp_register.png new file mode 100644 index 0000000000000000000000000000000000000000..620a664fd63666ab224ea3d64d90417f482cba7a Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/stdio_mcp_register.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/test_mcp.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/test_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..18143d4ea7e65f1b61b2618c06ac579ec516a018 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/test_mcp.png differ diff --git a/docs/en/openeuler_intelligence/mcp_agent/pictures/uninstalled_stdio_mcp.png b/docs/en/openeuler_intelligence/mcp_agent/pictures/uninstalled_stdio_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..101ab687f2e28593fcce5dcfc27c1d2b8866d640 Binary files /dev/null and b/docs/en/openeuler_intelligence/mcp_agent/pictures/uninstalled_stdio_mcp.png differ diff --git a/docs/zh/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md b/docs/zh/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md index 452ccd2f4885015bcbf12a0087d6409e3f4f3b8d..4faa2e431f6192f7795df9a8e468af5573383c0b 100644 --- a/docs/zh/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md +++ b/docs/zh/intelligent_foundation/syshax/deploy_guide/syshax_deployment_guide.md @@ -12,7 +12,7 @@ sysHAX功能定位为K+X异构融合推理加速,主要包含两部分功能 sysHAX共包含两部分交付件: -![syshax-deploy](pictures/syshax-deploy.png) +![syshax-deploy](pictures/syshax-deploy.png "syshax-deploy") 交付件包括: - sysHAX:负责请求的处理和prefill、decode请求的调度 @@ -30,14 +30,16 @@ vllm是一款**高吞吐、低内存占用**的**大语言模型(LLM)推理 ## 环境准备 +| KEY | VALUE | +| ---------- | ---------------------------------------- | | 服务器型号 | 鲲鹏920系列CPU | -| ---------- | ----------------------------------------- | | GPU | Nvidia A100 | -| 操作系统 | openEuler 22.03 LTS及以上 | +| 操作系统 | openEuler 24.03 LTS SP1 | | python | 3.9及以上 | | docker | 25.0.3及以上 | - docker 25.0.3可通过 `dnf install moby` 进行安装。 +- 请注意,sysHAX目前在AI加速卡侧只对NVIDIA GPU进行了适配,ASCEND NPU适配正在进行中。 
## 部署流程 @@ -56,7 +58,7 @@ vllm是一款**高吞吐、低内存占用**的**大语言模型(LLM)推理 如下流程为在GPU容器中部署vllm。 ```shell -docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.0 +docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.1 docker run --name vllm_gpu \ --ipc="shareable" \ @@ -65,7 +67,7 @@ docker run --name vllm_gpu \ -p 8001:8001 \ -v /home/models:/home/models \ -w /home/ \ - -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.0 bash + -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-gpu:0.2.1 bash ``` 在上述脚本中: @@ -81,24 +83,26 @@ vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \ --served-model-name=ds-32b \ --host 0.0.0.0 \ --port 8001 \ - --dtype=half \ + --dtype=auto \ --swap_space=16 \ --block_size=16 \ --preemption_mode=swap \ --max_model_len=8192 \ --tensor-parallel-size 2 \ - --gpu_memory_utilization=0.8 + --gpu_memory_utilization=0.8 \ + --enable-auto-pd-offload ``` 在上述脚本中: - `--tensor-parallel-size 2`:启用张量并行,将模型拆分到2张GPU上运行,需至少2张GPU,开发者可自行修改。 - `--gpu_memory_utilization=0.8`:限制显存使用率为80%,避免因为显存耗尽而导致服务崩溃,开发者可自行修改。 +- `--enable-auto-pd-offload`:启动在swap out时触发PD分离。 如下流程为在CPU容器中部署vllm。 ```shell -docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.0 +docker pull hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.1 docker run --name vllm_cpu \ --ipc container:vllm_gpu \ @@ -107,7 +111,7 @@ docker run --name vllm_cpu \ -p 8002:8002 \ -v /home/models:/home/models \ -w /home/ \ - -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.0 bash + -itd hub.oepkgs.net/neocopilot/syshax/syshax-vllm-cpu:0.2.1 bash ``` 在上述脚本中: @@ -115,7 +119,7 @@ docker run --name vllm_cpu \ - `--ipc container:vllm_gpu`共享名为vllm_gpu的容器的IPC(进程间通信)命名空间。允许此容器直接通过共享内存交换数据,避免跨容器复制。 ```shell -INFERENCE_OP_MODE=fused OMP_NUM_THREADS=160 CUSTOM_CPU_AFFINITY=0-159 SYSHAX_QUANTIZE=q8_0 \ +NRC=4 INFERENCE_OP_MODE=fused OMP_NUM_THREADS=160 CUSTOM_CPU_AFFINITY=0-159 SYSHAX_QUANTIZE=q4_0 \ vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \ --served-model-name=ds-32b \ --host 0.0.0.0 \ @@ -123,7 +127,8 @@ vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \ --dtype=half \ --block_size=16 \ --preemption_mode=swap \ - --max_model_len=8192 + --max_model_len=8192 \ + --enable-auto-pd-offload ``` 在上述脚本中: @@ -131,7 +136,8 @@ vllm serve /home/models/DeepSeek-R1-Distill-Qwen-32B \ - `INFERENCE_OP_MODE=fused`:启动CPU推理加速 - `OMP_NUM_THREADS=160`:指定CPU推理启动线程数为160,该环境变量需要在指定INFERENCE_OP_MODE=fused后才能生效 - `CUSTOM_CPU_AFFINITY=0-159`:指定CPU绑核方案,后续会详细介绍。 -- `SYSHAX_QUANTIZE=q8_0`:指定量化方案为q8_0。当前版本支持2种量化方案:`q8_0`、`q4_0`。 +- `SYSHAX_QUANTIZE=q4_0`:指定量化方案为q4_0。当前版本支持2种量化方案:`q8_0`、`q4_0`。 +- `NRC=4`:GEMV算子分块方式,该环境变量在920系列处理器上具有较好的加速效果。 需要注意的是,必须先启动GPU的容器,才能启动CPU的容器。 @@ -169,15 +175,22 @@ NUMA: ### sysHAX安装 -sysHAX安装: +sysHAX安装有两种方式,可以通过dnf安装rpm包。注意,使用该方法需要将openEuler升级至openEuler 24.03 LTS SP2及以上版本: ```shell dnf install sysHAX ``` +或者直接使用源码启动: + +```shell +git clone -b v0.2.0 https://gitee.com/openeuler/sysHAX.git +``` + 在启动sysHAX之前需要进行一些基础配置: ```shell +# 使用 dnf install sysHAX 安装sysHAX时 syshax init syshax config services.gpu.port 8001 syshax config services.cpu.port 8002 @@ -185,14 +198,29 @@ syshax config services.conductor.port 8010 syshax config models.default ds-32b ``` -此外,也可以通过 `syshax config --help` 来查看全部配置命令。 +```shell +# 使用 git clone -b v0.2.0 https://gitee.com/openeuler/sysHAX.git 时 +python3 cli.py init +python3 cli.py config services.gpu.port 8001 +python3 cli.py config services.cpu.port 8002 +python3 cli.py config services.conductor.port 8010 +python3 cli.py config models.default ds-32b +``` + +此外,也可以通过 
`syshax config --help` 或者 `python3 cli.py config --help` 来查看全部配置命令。 配置完成后,通过如下命令启动sysHAX服务: ```shell +# 使用 dnf install sysHAX 安装sysHAX时 syshax run ``` +```shell +# 使用 git clone -b v0.2.0 https://gitee.com/openeuler/sysHAX.git 时 +python3 main.py +``` + 启动sysHAX服务的时候,会进行服务连通性测试。sysHAX符合openAPI标准,待服务启动完成后,即可API来调用大模型服务。可通过如下脚本进行测试: ```shell diff --git a/docs/zh/openeuler_intelligence/mcp_agent/mcp_guide.md b/docs/zh/openeuler_intelligence/mcp_agent/mcp_guide.md index 1518c6b91c4729d8fc0b7d4234b62a6cb78df5b2..cc09e51fb66c98a1189a250e03720b30fb148f4c 100644 --- a/docs/zh/openeuler_intelligence/mcp_agent/mcp_guide.md +++ b/docs/zh/openeuler_intelligence/mcp_agent/mcp_guide.md @@ -1,3 +1,106 @@ # MCP 服务指南 -(当前内容待更新,请等待) +## 1. 概述 + +openEuler intelligence 当前版本对 MCP 的支持已得到增强,使用流程主要分为以下步骤: + +1. 注册 MCP +2. 安装 MCP +3. 激活 MCP 并载入配置文件 +4. 基于已激活的 MCP 构建 Agent +5. 测试 Agent +6. 发布 Agent +7. 使用 Agent + +> **说明**: +> +> - 注册、安装和激活 MCP 需管理员权限操作 +> - 构建、测试、发布和使用 Agent 为普通用户权限操作 +> - 所有 Agent 相关操作均需基于已激活的 MCP 进行 + +## 2. MCP 的注册、安装与激活 + +以下流程以管理员账号为例,展示 MCP 的完整管理流程: + +1. **注册 MCP** + 通过插件中心的 "MCP 注册" 按钮,将 MCP 注册到 openEuler intelligence 系统中 + + ![mcp注册按钮](pictures/regeister_mcp_button.png) + + 点击按钮后弹出注册窗口(SSE 和 STDIO 的默认配置如下): + + ![sse注册窗口](pictures/sse_mcp_register.png) + + ![stdio注册窗口](pictures/stdio_mcp_register.png) + + 以 SSE 注册为例,填写配置信息后点击"保存" + + ![填入mcp配置文件](pictures/add_mcp.png) + +2. **安装 MCP** + + > **注意**:安装 STDIO 前,可在对应容器或服务器的 `/opt/copilot/semantics/mcp/template` 目录下调整服务依赖文件及权限 + + 点击已注册的 MCP 卡片上的"安装"按钮进行安装 + + ![mcp安装按钮](pictures/sse_mcp_intstalling.png) + +3. **查看 MCP 工具** + 安装成功后,点击 MCP 卡片可查看该服务支持的工具 + + ![mcp服务工具](pictures/mcp_details.png) + +4. **激活 MCP** + 点击"激活"按钮启用 MCP 服务 + + ![mcp激活按钮](pictures/activate_mcp.png) + +## 3. Agent 应用的创建、测试、发布与使用 + +以下操作可由普通用户完成,所有操作均需基于已激活的 MCP 进行: + +1. **创建 Agent 应用** + 点击应用中心的"创建应用"按钮 + + ![创建agent应用](pictures/create_app_button.png) + +2. **配置 Agent 应用** + 创建成功后,点击应用卡片进入详情页,可修改应用配置信息 + + ![修改agent应用配置](pictures/edit_Agent_app_message.png) + +3. **关联 MCP** + 点击"添加 MCP"按钮,在左侧弹出的列表中选择已激活的 MCP 进行关联 + + ![添加mcp](pictures/add_mcp_button.png) + + ![mcp列表](pictures/add_mcp.png) + +4. **测试 Agent 应用** + 完成 MCP 关联和信息配置后,点击右下角"测试"按钮进行功能测试 + + ![测试agent应用](pictures/test_mcp.png) + +5. **发布 Agent 应用** + 测试通过后,点击右下角"发布"按钮发布应用 + + ![发布agent应用](pictures/publish_app.png) + +6. **使用 Agent 应用** + 发布后的应用将显示在应用市场中,双击即可使用 + + ![使用agent应用](pictures/click_and_use.png) + + Agent 应用有两种使用模式: + + - **自动模式**:无需用户手动确认,自动执行操作 + + ![自动使用agent应用](pictures/chat_with_auto_excute.png) + + - **手动模式**:执行前会提示风险,需用户确认后才执行 + + ![手动使用agent应用](pictures/chat_with_not_auto_excute.png) + +## 4. 
总结 + +通过上述流程,用户可基于 MCP 构建并使用自定义的 Agent 应用。欢迎体验并探索更多功能场景。 diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/activate_mcp.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/activate_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..61533013f8629ce483c3ccb074ab0c1041c430cb Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/activate_mcp.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/add_mcp.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/add_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..32f0f84879dcef4193cb912878bea5d87a86dd1c Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/add_mcp.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/add_mcp_button.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/add_mcp_button.png new file mode 100644 index 0000000000000000000000000000000000000000..ad6ceb728d861cf16bcc3fbe1071c1ff376e0956 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/add_mcp_button.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/chat_with_auto_excute.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/chat_with_auto_excute.png new file mode 100644 index 0000000000000000000000000000000000000000..f7eb8e06eeb54e3f5c038b60bfd603953cae30f3 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/chat_with_auto_excute.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/chat_with_not_auto_excute.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/chat_with_not_auto_excute.png new file mode 100644 index 0000000000000000000000000000000000000000..1bfa31b6396680e0359f2176750462ca33862594 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/chat_with_not_auto_excute.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_Agent_app_to_create.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_Agent_app_to_create.png new file mode 100644 index 0000000000000000000000000000000000000000..b1c200e6b0092c043b3cb99f3c7def4fe101ad94 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_Agent_app_to_create.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_mcp.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..4d50d4e3f6330062846d85d7bf7b66e34b1a3b98 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_mcp.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_mcp_2.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_mcp_2.png new file mode 100644 index 0000000000000000000000000000000000000000..693d6e607c2872123e1a506237f53cacebad3af8 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/choose_mcp_2.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/click_and_use.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/click_and_use.png new file mode 100644 index 0000000000000000000000000000000000000000..281e8c736baf98c29cf3991e43dfe18cb4fc89be Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/click_and_use.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/create_app_button.png 
b/docs/zh/openeuler_intelligence/mcp_agent/pictures/create_app_button.png new file mode 100644 index 0000000000000000000000000000000000000000..fce7d20de9e28a6e066f0cd9be16fc2dcdd350c7 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/create_app_button.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/edit_Agent_app_message.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/edit_Agent_app_message.png new file mode 100644 index 0000000000000000000000000000000000000000..eea98c7db7b4c93ea54bd83b18687159ed7e4c6b Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/edit_Agent_app_message.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/installed_sse_mcp.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/installed_sse_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..f9d98d5f7aa424e0ceef901688424538cd26b18d Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/installed_sse_mcp.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/mcp_details.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/mcp_details.png new file mode 100644 index 0000000000000000000000000000000000000000..f374a96687a4d98d65ee199be059212a61cf2692 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/mcp_details.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/publish_app.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/publish_app.png new file mode 100644 index 0000000000000000000000000000000000000000..3785369ddd009e4c0e2d08d2134dfffde5a101b6 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/publish_app.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/regeister_mcp_button.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/regeister_mcp_button.png new file mode 100644 index 0000000000000000000000000000000000000000..f3145eb9c93123b7b7417239f71d564f8ce4558c Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/regeister_mcp_button.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_config.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_config.png new file mode 100644 index 0000000000000000000000000000000000000000..2f9fba9632ebf4da993ac62283c4cf3120a45faf Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_config.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_intstalling.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_intstalling.png new file mode 100644 index 0000000000000000000000000000000000000000..52f6ab4d5a51f3af5710c8c024e127fdff2a7e00 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_intstalling.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_register.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_register.png new file mode 100644 index 0000000000000000000000000000000000000000..8ed20b064a11c3442ec7fafcf76bbd2f6fab0a64 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/sse_mcp_register.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/stdio_mcp_register.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/stdio_mcp_register.png new file mode 100644 index 
0000000000000000000000000000000000000000..620a664fd63666ab224ea3d64d90417f482cba7a Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/stdio_mcp_register.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/test_mcp.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/test_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..18143d4ea7e65f1b61b2618c06ac579ec516a018 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/test_mcp.png differ diff --git a/docs/zh/openeuler_intelligence/mcp_agent/pictures/uninstalled_stdio_mcp.png b/docs/zh/openeuler_intelligence/mcp_agent/pictures/uninstalled_stdio_mcp.png new file mode 100644 index 0000000000000000000000000000000000000000..101ab687f2e28593fcce5dcfc27c1d2b8866d640 Binary files /dev/null and b/docs/zh/openeuler_intelligence/mcp_agent/pictures/uninstalled_stdio_mcp.png differ