diff --git a/docs/mindspore/source_en/features/compile/graph_construction.md b/docs/mindspore/source_en/features/compile/graph_construction.md index 08ff94cf4391783986a71817773e2b93c510742e..bc30d9fe528badd17403bce9d9bae5e2ba1bc0da 100644 --- a/docs/mindspore/source_en/features/compile/graph_construction.md +++ b/docs/mindspore/source_en/features/compile/graph_construction.md @@ -1,4 +1,4 @@ -# Graph Construction +# Graph Construction (Compilation) [![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_en/features/compile/graph_construction.md) diff --git a/docs/mindspore/source_en/features/compile/graph_optimization.md b/docs/mindspore/source_en/features/compile/graph_optimization.md index 7343dc9236b61d63135228257799ad5e8740dd9a..00bb22bb1965adbb026a688f873251f0696e7e66 100644 --- a/docs/mindspore/source_en/features/compile/graph_optimization.md +++ b/docs/mindspore/source_en/features/compile/graph_optimization.md @@ -1,4 +1,4 @@ -# Graph Optimization +# Graph Optimization (Compilation) [![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_en/features/compile/graph_optimization.md) diff --git a/docs/mindspore/source_en/features/compile/multi_level_compilation.md b/docs/mindspore/source_en/features/compile/multi_level_compilation.md index 43deec2a3e063987672f8eb80b8fc200b1774100..5ca3e25b7ab0587be5b12d11a0c5f088a22be5e6 100644 --- a/docs/mindspore/source_en/features/compile/multi_level_compilation.md +++ b/docs/mindspore/source_en/features/compile/multi_level_compilation.md @@ -1,4 +1,4 @@ -# Multi-Level Compilation Architecture +# Multi-Level Compilation Introduction (Compilation) [![View Source On 
Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_en/features/compile/multi_level_compilation.md) diff --git a/docs/mindspore/source_en/features/index.rst b/docs/mindspore/source_en/features/index.rst index e1b343d2cfd052bba97411752937d2f2e24a7ed5..6552f5d328fc67c4f3c50c56efab580cec4607d0 100644 --- a/docs/mindspore/source_en/features/index.rst +++ b/docs/mindspore/source_en/features/index.rst @@ -17,6 +17,4 @@ Developer Notes runtime/memory_manager runtime/multilevel_pipeline runtime/multistream_concurrency - runtime/pluggable_backend - runtime/pluggable_device data_engine \ No newline at end of file diff --git a/docs/mindspore/source_en/features/runtime/images/pluggable_device_arch.png b/docs/mindspore/source_en/features/runtime/images/pluggable_device_arch.png deleted file mode 100644 index 2d95358c30c1c08e4c95e3dd910a6be1cd697a8f..0000000000000000000000000000000000000000 Binary files a/docs/mindspore/source_en/features/runtime/images/pluggable_device_arch.png and /dev/null differ diff --git a/docs/mindspore/source_en/features/runtime/pluggable_backend.md b/docs/mindspore/source_en/features/runtime/pluggable_backend.md deleted file mode 100644 index d44ea023c8e273d5d4b110ea6fbc5bc179b144ca..0000000000000000000000000000000000000000 --- a/docs/mindspore/source_en/features/runtime/pluggable_backend.md +++ /dev/null @@ -1,28 +0,0 @@ -# Multi-backend Access - -[![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_en/features/runtime/pluggable_backend.md) - -## Overview - -In order to meet the rapid interconnection requirements of new backends and new hardware, MindSpore supports plug-in, low-cost, and rapid interconnection of third-party backends on the basis of MindIR through an open 
architecture. A third-party backend does not need to know the data structures or implementation of the existing backends; it only needs to take MindIR as input, implement its own backend functionality, and register it as an independently loaded shared object (.so). The functionality of different backends is isolated. - -## Interface - -The backend to use can be specified via mindspore.jit(backend="xx"); see the [jit interface](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.jit.html#mindspore.jit). - -## Basic Principle - -![multi_backend](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/docs/mindspore/source_zh_cn/features/runtime/images/multi_backend.png) - -The schematic above shows how MindSpore connects to multiple backends. The core ideas are: - -1. The backend management module provides C++ external interfaces (Build and Run of BackendManager) and internal interfaces (Build and Run of the base class Backend). -2. The BackendManager external interface connects the front-end MindIR to backend functionality, decoupling the front end from the back end. -3. The base class Backend internal interface is implemented by each backend, which provides its own Build and Run functions. -4. Each backend is an independent .so that the backend management module loads and schedules dynamically. - -Given this design, adding a new backend mainly involves the following steps: - -1. Adding the new backend type to the mindspore.jit(backend="xx") interface. -2. Deriving a new backend class from the base class Backend and implementing the corresponding Build and Run functions. -3. Compiling the new backend code into a separate .so and registering it with the backend management module.
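The backend-manager design described in the removed document can be sketched in plain Python. This is a minimal, hypothetical illustration only: the real MindSpore interfaces (BackendManager, the base class Backend) are C++, and registration happens when each backend's .so is loaded, not via a Python call.

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """Base class each pluggable backend implements (illustrative stand-in)."""

    @abstractmethod
    def build(self, mindir_graph):
        """Compile a MindIR graph into an executable form."""

    @abstractmethod
    def run(self, compiled, inputs):
        """Execute the compiled graph on the given inputs."""

class BackendManager:
    """Dispatches build/run to whichever backend was registered under a name."""

    _registry = {}

    @classmethod
    def register(cls, name, backend_cls):
        # In MindSpore this registration happens when the backend .so is loaded.
        cls._registry[name] = backend_cls

    @classmethod
    def build(cls, name, mindir_graph):
        return cls._registry[name]().build(mindir_graph)

# A new backend only subclasses Backend and registers itself:
class EchoBackend(Backend):
    def build(self, mindir_graph):
        return ("compiled", mindir_graph)

    def run(self, compiled, inputs):
        return inputs

BackendManager.register("echo", EchoBackend)
```

The point of the pattern is that the front end only ever talks to BackendManager, so new backends can be added without touching existing ones.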
\ No newline at end of file diff --git a/docs/mindspore/source_en/features/runtime/pluggable_device.md b/docs/mindspore/source_en/features/runtime/pluggable_device.md deleted file mode 100644 index 1e0b04a995ee373107260cd2ca1f89b5f66a2627..0000000000000000000000000000000000000000 --- a/docs/mindspore/source_en/features/runtime/pluggable_device.md +++ /dev/null @@ -1,59 +0,0 @@ -# Third-Party Hardware Interconnection - -[![View Source On Gitee](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source_en.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_en/features/runtime/pluggable_device.md) - -MindSpore supports plug-in, standardized, low-cost and rapid interconnection of third-party chips through an open architecture: - -- Decoupling of back-end architectures to quickly support plug-in interconnection of new chips. -- Modeling of abstract hardware types and standardization of the interconnection process. -- Abstract operator encapsulation, with unified operator selection across heterogeneous chips. -- Support for third-party graph IR access, so the strengths of a chip's architecture can be fully exploited. - -The MindSpore overall architecture and the backend-related components are shown in the following figure: - -![image](./images/pluggable_device_arch.png) - -The overall MindSpore architecture consists of the following major components, which depend on one another: - -- Python API: Provides a Python-based front-end expression and programming interface that supports network construction, whole-graph execution, sub-graph execution, and single-operator execution, and calls into the C++ modules (divided into frontend, backend, MindData, and Core) through pybind11. -- MindExpression Front-End Expression: Responsible for compilation flow control and hardware-independent optimizations such as type inference, automatic differentiation, expression simplification, etc. 
-- MindData Data Components: Provides efficient data processing, loading of common datasets, and related programming interfaces, and lets users flexibly register custom processing steps with pipeline parallel optimization. -- MindIR: Contains the ANF IR data structures, together with logs, exceptions, and other data structures and algorithms shared between device and cloud. - -Interconnecting a third-party chip mainly involves the MindSpore backend, which itself consists of several components that fall into two broad categories: - -- Hardware-independent components: commonly used data structures such as MemoryManager, MemoryPool, and DeviceAddress with their related algorithms, as well as components such as GraphCompiler and GraphScheduler that drive the overall flow and provide initial processing and scheduling for graphs or single operators. -- Hardware-related components: through an abstraction of the hardware, this part exposes several interfaces that third-party chips can selectively implement to realize the operator, graph-optimization, memory-allocation, and stream-allocation logic unique to their platform, packaged as dynamic libraries that are loaded as plug-ins at run time. Third-party chips can refer to MindSpore's built-in CPU/GPU/Ascend plug-ins when interconnecting. - -To facilitate third-party hardware interconnection, MindSpore provides a hardware abstraction layer that defines a standardized hardware interconnection interface. 
The abstraction layer is called by two modules in the upper unified runtime, GraphCompiler and GraphScheduler: - -- GraphCompiler provides the default control-flow and heterogeneous graph-splitting logic and the graph optimizations of each stage, and calls the operator selection/operator compilation, memory allocation, and stream allocation provided by the abstraction layer. -- GraphScheduler transforms the compiled graph into an Actor model, adds it to the thread pool, and schedules the execution of these Actors. - -The framework also provides public data structures and algorithms, such as debugging tools, a default memory pool implementation, hundreds of common operations on ANF IR, and SOMAS, an efficient memory reuse algorithm developed by MindSpore. - -The hardware abstraction layer offers two interconnection methods, Graph mode (GraphExecutor) and Kernel mode (KernelExecutor), providing classified interfaces for DSA-architecture chips (such as NPUs and XPUs) and general-architecture chips (such as GPUs and CPUs) respectively. Chip vendors can inherit one or both abstract classes and implement them. If you interconnect through Kernel mode, you also need to implement DeviceResManager, KernelMod, DeviceAddress, and other interfaces. - -## Kernel Mode Interconnection - -Kernel mode for general-architecture chips requires the plug-in to implement the following: - -- Custom graph-splitting logic, which enables low-cost support for the control-flow, heterogeneity, and other advanced features provided by the framework; it can be an empty implementation if these features are not used. - -- Custom graph optimization, which allows splitting and fusing certain operators according to hardware characteristics, along with other custom modifications to the graph. - -- Operator selection and operator compilation. -- Memory management. 
DeviceAddress is the abstraction of memory; third-party chip vendors need to implement copying between host and device, as well as memory allocation and release. To ease this, MindSpore provides a memory pool implementation and the efficient memory reuse algorithm SOMAS in the Common component. -- Stream management. If the chip being interconnected has the concept of streams, the plug-in needs to provide stream creation and destruction; if not, it will run in single-stream mode. - -![image](../../../source_zh_cn/features/runtime/images/pluggable_device_kernel.png) - -## Graph Mode Interconnection - -If the chip vendor's software stack provides complete high-level APIs, or if a DSA-architecture chip's software stack differs from what Kernel mode expects, the vendor can interconnect through Graph mode. Graph mode treats the whole graph as one big operator (a SuperKernel) implemented by the third-party software stack, which needs to provide two functions: - -- Graph compilation. The third-party chip vendor transforms MindSpore ANF IR into its own IR graph representation and runs its own graph compilation process to bring the graph to an executable, ready state. - -- Graph execution. The third-party chip vendor interprets the MindSpore Tensor format (or converts it into a format it understands), executes the ready graph, and converts the execution results back into MindSpore Tensor format. 
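The two Graph-mode functions described in the removed document can be sketched in plain Python. This is a conceptual stand-in only: the real GraphExecutor interface is a C++ abstract class, and the translation and execution steps here are identity placeholders, not a real vendor compiler.

```python
# Hypothetical sketch of the Graph-mode (GraphExecutor) contract: the whole
# graph is one "SuperKernel" handled by the vendor's own software stack.
class ThirdPartyGraphExecutor:
    def compile_graph(self, anf_ir):
        # 1) Translate MindSpore ANF IR into the vendor's own graph IR.
        # 2) Run the vendor's graph compiler so the graph reaches a ready,
        #    executable state. Here a dict stands in for the vendor IR.
        vendor_ir = {"ops": list(anf_ir)}
        return vendor_ir

    def run_graph(self, ready_graph, inputs):
        # 3) Convert MindSpore Tensors into the vendor's format, execute the
        #    ready graph, and convert results back. Identity stand-in here.
        return [x for x in inputs]

executor = ThirdPartyGraphExecutor()
ready = executor.compile_graph(["matmul", "relu"])
outputs = executor.run_graph(ready, [1.0, 2.0])
```

A real implementation replaces the two placeholder bodies with calls into the vendor's graph compiler and runtime; the interface shape is the point.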
- -![image](../../../source_zh_cn/features/runtime/images/pluggable_device_graph.png) diff --git a/docs/mindspore/source_zh_cn/features/compile/graph_construction.ipynb b/docs/mindspore/source_zh_cn/features/compile/graph_construction.ipynb index 4d4365e03bbaa1d28ac1732c8cd44c77cad1476a..a8ed39198a1f1a458412c89e5f310d68c67c760e 100644 --- a/docs/mindspore/source_zh_cn/features/compile/graph_construction.ipynb +++ b/docs/mindspore/source_zh_cn/features/compile/graph_construction.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# 构图\n", + "# 构图(编译)\n", "\n", "[![下载Notebook](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_notebook.svg)](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/master/zh_cn/features/compile/mindspore_graph_construction.ipynb) [![下载样例代码](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_download_code.svg)](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/notebook/master/zh_cn/features/compile/mindspore_graph_construction.py) [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_zh_cn/features/compile/graph_construction.ipynb)\n" ] diff --git a/docs/mindspore/source_zh_cn/features/compile/graph_optimization.md b/docs/mindspore/source_zh_cn/features/compile/graph_optimization.md index 773c61a083044ca21813f3aa5371aabf22040b5e..e3d8f794240ffb56d96374c162acfb3da213de7f 100644 --- a/docs/mindspore/source_zh_cn/features/compile/graph_optimization.md +++ b/docs/mindspore/source_zh_cn/features/compile/graph_optimization.md @@ -1,4 +1,4 @@ -# 图优化 +# 图优化(编译) 
[![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_zh_cn/features/compile/graph_optimization.md) diff --git a/docs/mindspore/source_zh_cn/features/compile/multi_level_compilation.md b/docs/mindspore/source_zh_cn/features/compile/multi_level_compilation.md index dd640eaff985b5172c732dd6918e263907799c08..0675dac4e7bf36a161bcfd117866089765510b1e 100644 --- a/docs/mindspore/source_zh_cn/features/compile/multi_level_compilation.md +++ b/docs/mindspore/source_zh_cn/features/compile/multi_level_compilation.md @@ -1,4 +1,4 @@ -# 多级编译架构 +# 多级编译介绍(编译) [![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_zh_cn/features/compile/multi_level_compilation.md) diff --git a/docs/mindspore/source_zh_cn/features/index.rst b/docs/mindspore/source_zh_cn/features/index.rst index 520a78bb208f7c050f8958984c66cc7c26db8982..1b046a7b6c3b98702c6c36ec7397119146289c7e 100644 --- a/docs/mindspore/source_zh_cn/features/index.rst +++ b/docs/mindspore/source_zh_cn/features/index.rst @@ -17,7 +17,6 @@ Developer Notes runtime/memory_manager runtime/multilevel_pipeline runtime/multistream_concurrency - runtime/pluggable_backend - runtime/pluggable_device amp data_engine + mint diff --git a/docs/mindspore/source_zh_cn/features/mint.md b/docs/mindspore/source_zh_cn/features/mint.md index 76e6db2ac7dafd9178c822ad3efead424b445ae5..9cc38148fecc2f5326ec33333d360dd923bde23f 100644 --- a/docs/mindspore/source_zh_cn/features/mint.md +++ b/docs/mindspore/source_zh_cn/features/mint.md @@ -8,7 +8,7 @@ ### 张量创建 -以empty这个API看下主要差异点 +以empty这个API看下主要差异点: | torch.empty | mindspore.mint.empty | 说明 | |:---: | :---: | :---:| @@ -25,11 +25,11 @@ - `layout`: 创建torch tensor时,一般默认layout是stride,即dense tensor。mindspore创建tensor时,默认是dense tensor,与torch 
无差异。开发者无需设置。 - `memory_format`: tensor的内存排布,默认都是NCHW格式。torch 提供channel_last格式即NHWC,在一些场景中,这样会有性能提升,但是泛化性和兼容性需要开发者实际测试和验证。使用mindspore开发,可不设置此参数。 -- `requires_grad`: 由于框架自动微分求导机制不同,mindspore在tensor的属性中没有设置此参数。对于是否需要计算梯度,常用的parameter类提供了此参数。如果无需计算梯度,可参考[mindspore.ops.stop_gradient](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.stop_gradient.html) +- `requires_grad`: 由于框架自动微分求导机制不同,mindspore在tensor的属性中没有设置此参数。对于是否需要计算梯度,常用的parameter类提供了此参数。如果无需计算梯度,可参考[mindspore.ops.stop_gradient](https://www.mindspore.cn/docs/zh-CN/master/api_python/ops/mindspore.ops.stop_gradient.html)。 - `pin_memory`: 返回的tensor被分配到pinned memory,我们已经规划支持此功能。计划在2.7.1版本推出。 - `out`: 指定输出张量,用于原地操作和内存优化。当提供 `out` 参数时,操作结果会直接写入到指定的张量中,而不是创建新的张量。当前未规划支持此参数。 -**代码示例**: +**代码示例**: ```diff - import torch @@ -42,7 +42,8 @@ 总结:tensor相关可选参数涉及框架实现机制不同,我们也会根据开发者反馈不断完善,如tensor storage能力已规划。 ### 随机采样 -以bernoulli举例 + +以bernoulli举例: | torch.bernoulli | mindspore.mint.bernoulli | 说明 | |:---: | :---: | :---:| @@ -52,7 +53,8 @@ out参数差异参考张量创建 -**代码示例**: +**代码示例**: + ```diff - import torch + import mindspore.mint @@ -66,16 +68,17 @@ out参数差异参考张量创建 ### 数学计算 -基础计算类当前均已支持,以mul举例 +基础计算类当前均已支持,以mul举例: + | torch.mul | mindspore.mint.mul | 说明 | |:---: | :---: | :---:| | `*size` (Tensor...) | `*size` (Tensor...) 
| 必选 | | `other` | `other`| 可选 | | `out` | 无 | 可选 | -计算类ops当前不支持的参数与tensor creation是类似的,这与tensor实现机制相关。例如out +计算类ops当前不支持的参数与tensor creation是类似的,这与tensor实现机制相关。例如out: -**代码示例**: +**代码示例**: ```diff - import torch @@ -104,8 +107,7 @@ out参数差异参考张量创建 | `groups` (int) | `groups` (int) | 可选 | | `bias`(bool) | `bias`(bool) | 可选 | - -**代码示例**: +**代码示例**: ```diff - import torch @@ -127,14 +129,14 @@ dilation = (3, 1) output = model(input) ``` -包含inplace参数的,当前未全部支持,例如 +包含inplace参数的,当前未全部支持,例如: -| API | Args | -| :--- | :--- | +| API | Args | +| :--- | :--- | | torch.nn.functional_dropout2d | input, p=0.5, training=True, inplace=False | | mindspore.mint.nn.functional_dropout2d | input, p=0.5, training=True -#### torch废弃的参数,不支持,例如 +torch废弃的参数,不支持,例如: | torch.nn.MSELoss | 是否废弃 | mindspore.nn.MSELoss | 说明 | |:---: | :--- | :---: | :---:| @@ -144,7 +146,7 @@ output = model(input) ### 集群通信类 -常用all_gather/all_reduce/all_to_all等均已支持,参数也保持一致,例如 +常用all_gather/all_reduce/all_to_all等均已支持,参数也保持一致,例如: | torch.distributed.all_gather | mindspore.mint.distributed.all_gather | 说明 | |:---: | :---: | :---:| @@ -153,7 +155,6 @@ output = model(input) | `group`(ProcessGroup) | `group` (ProcessGroup) | 可选 | | `async_op` (bool) | `async_op` (bool) | 可选 | - | torch.distributed.all_reduce | mindspore.mint.distributed.all_reduce | 说明 | |:---: | :---: | :---:| | `tensor` (Tensor) | `Tensor` (Tensor) | 必选 | @@ -161,4 +162,4 @@ output = model(input) | `group`(ProcessGroup) | `group` (ProcessGroup) | 可选 | | `async_op` (bool) | `async_op` (bool) | 可选 | -更多API支持情况请查阅[mint支持列表](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.mint.html) \ No newline at end of file +更多API支持情况请查阅[mint支持列表](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore.mint.html)。 \ No newline at end of file diff --git a/docs/mindspore/source_zh_cn/features/runtime/images/multi_backend.png b/docs/mindspore/source_zh_cn/features/runtime/images/multi_backend.png deleted file mode 100644 index 
f1309c225c988487b751c1f92b434edbcf3e6d6d..0000000000000000000000000000000000000000 Binary files a/docs/mindspore/source_zh_cn/features/runtime/images/multi_backend.png and /dev/null differ diff --git a/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_arch.png b/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_arch.png deleted file mode 100644 index 390dc8a21af8d06ac4486fe46da62a6c25c8c3ff..0000000000000000000000000000000000000000 Binary files a/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_arch.png and /dev/null differ diff --git a/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_graph.png b/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_graph.png deleted file mode 100644 index c3db3f0c96806a491c3dd40be7ffe01141d9cd25..0000000000000000000000000000000000000000 Binary files a/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_graph.png and /dev/null differ diff --git a/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_kernel.png b/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_kernel.png deleted file mode 100644 index deff43ac2a99aa5944e74c21209c6dc68e41ab20..0000000000000000000000000000000000000000 Binary files a/docs/mindspore/source_zh_cn/features/runtime/images/pluggable_device_kernel.png and /dev/null differ diff --git a/docs/mindspore/source_zh_cn/features/runtime/pluggable_backend.md b/docs/mindspore/source_zh_cn/features/runtime/pluggable_backend.md deleted file mode 100644 index d7107dbf1d8100c6373dbc01c0bf8999d1c61703..0000000000000000000000000000000000000000 --- a/docs/mindspore/source_zh_cn/features/runtime/pluggable_backend.md +++ /dev/null @@ -1,28 +0,0 @@ -# 多后端接入 - 
-[![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_zh_cn/features/runtime/pluggable_backend.md) - -## 概述 - -为了满足新后端和新硬件的快速对接诉求,MindSpore通过开放式架构,在MindIR的基础上支持第三方后端插件化、低成本快速对接,第三方后端无需关注当前已有后端的数据结构和实现,只需以MindIR作为输入,实现自己的后端和功能,以独立so注册加载,不同后端之间的功能互相隔离。 - -## 接口 - -多后端实现,可通过mindspore.jit(backend="xx")指定使用的后端,详见[jit接口](https://www.mindspore.cn/docs/zh-CN/master/api_python/mindspore/mindspore.jit.html#mindspore.jit)。 - -## 基本原理 - -![multi_backend](./images/multi_backend.png) - -MindSpore多后端对接示意图如上,核心思想为: - -1. 后端管理模块提供C++对外接口(backendmanager的Build和Run)和对内接口(基类backend的Build和Run)。 -2. backendmanager对外接口,主要提供给前端MindIR对接后端功能,用于前后端解耦。 -3. 基类backend对内接口,主要提供给各个后端各自实现Build和Run功能。 -4. 各个后端功能均是独立so,用于后端管理模块动态加载调度。 - -了解了MindSpore多后端对接的核心思想后,新增后端时,主要工作如下: - -1. mindspore.jit(backend="xx")接口新增后端类型。 -2. 新后端子类继承基类backend,实现对应的Build和Run功能。 -3. 新后端代码编译成独立so,注册到后端管理模块。 diff --git a/docs/mindspore/source_zh_cn/features/runtime/pluggable_device.md b/docs/mindspore/source_zh_cn/features/runtime/pluggable_device.md deleted file mode 100644 index 0f6c3beca60ec85678b44ddedd532bd28b267450..0000000000000000000000000000000000000000 --- a/docs/mindspore/source_zh_cn/features/runtime/pluggable_device.md +++ /dev/null @@ -1,59 +0,0 @@ -# 三方硬件对接 - -[![查看源文件](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com/website-images/master/resource/_static/logo_source.svg)](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_zh_cn/features/runtime/pluggable_device.md) - -MindSpore通过开放式架构,支持第三方芯片插件化、标准化、低成本快速对接: - -- 后端架构解耦,快速支持新芯片插件化对接; -- 抽象硬件类型建模,对接流程标准化; -- 抽象算子封装,多芯片异构算子统一选择; -- 支持第三方图IR接入,充分发挥芯片架构优势。 - -MindSpore整体架构及后端相关组件如下图所示: - -![image](./images/pluggable_device_arch.png) - -MindSpore整体架构包括如下几个主要组件,它们之间存在相互的依赖关系: - -- Python API:提供了基于Python的前端表达与编程接口,支撑用户进行网络构建、整图执行、子图执行以及单算子执行,并通过pybind11接口调用到C++模块,C++模块分为前端、后端、MindData、Core等; -- 
MindExpression前端表达:负责编译流程控制和硬件无关的优化如类型推导、自动微分、表达式化简等; -- MindData数据组件:MindData提供高效的数据处理、常用数据集加载等功能和编程接口,支持用户灵活的定义处理注册和pipeline并行优化; -- MindIR:包含了ANF IR数据结构、日志、异常等端、云共用的数据结构与算法。 - -第三方芯片对接MindSpore的过程主要涉及MindSpore的后端,后端也分为多个组件,整体上分为两大类: - -- 一类与硬件无关,如MemoryManager、MemoryPool、DeviceAddress等常用数据结构及相关算法以及包括GraphCompiler、GraphScheduler在内的能够调度整个流程、具有对图或单算子的初步处理和调度能力的组件; -- 另一类与硬件相关,这部分通过对硬件的抽象,提供了多个接口,第三方芯片可以根据情况选择对接,实现硬件平台上特有的算子、图优化、内存分配、流分配等逻辑,并封装成动态库,程序运行时作为插件加载。第三方芯片对接时可以参考MindSpore默认内置的CPU/GPU/Ascend插件。 - -为了方便第三方硬件对接,在MindSpore中提供了硬件抽象层,定义了标准化的硬件对接接口,抽象层被上层统一运行时中的GraphCompiler和GraphScheduler两个模块调用: - -- GraphCompiler负责提供默认的控制流、异构图拆分逻辑,不同阶段的图优化,调用抽象层提供的算子选择/算子编译、内存分配和流分配等; -- GraphScheduler负责将编译完成的图转化为Actor模型并加入到线程池中,并执行调度这些Actor。 - -同时,在框架中也提供了公共数据结构与算法,如debug工具、默认的内存池实现、数百个对Anf IR的常见操作、由MindSpore研发高效内存复用算法SOMAS等。 - -硬件抽象层提供了Graph模式(GraphExecutor)和Kernel模式(KernelExecutor)用于两种对接方式,分别面向DSA架构(如NPU、XPU等)和通用架构的芯片(如GPU、CPU等)提供分类的对接接口。芯片厂商可以继承某种或两种抽象类并实现,根据对接方式的不同,如果对接Kernel模式还需实现DeviceResManager、KernelMod、DeviceAddress等接口。 - -## Kernel模式对接 - -通用架构Kernel模式需要在插件中实现以下几个方面的功能: - -- 自定义图拆分逻辑,可以低成本实现框架提供的控制流、异构等高级特性,如果不使用这些特性,可以空实现; - -- 自定义图优化,可以根据硬件的特性对某些算子进行拆分与融合,以及其他自定义的对图的修改; - -- 算子选择和算子编译; -- 内存管理,DeviceAddress是对内存的抽象,第三方芯片厂商需要实现Host与Device之间拷贝的功能。还需要提供内存申请、销毁的功能。为了方便第三方芯片厂商,MindSpore在Common组件中提供了一套内存池的实现和高效内存复用算法SOMAS; -- 流管理,如果待对接的芯片有流的概念,需要提供创建与销毁的功能,如果没有,则将会以单流模式运行。 - -![image](./images/pluggable_device_kernel.png) - -## Graph模式对接 - -若芯片厂商的软件栈较完整能够提供High level的API,或DSA架构芯片的软件栈与Kernel模式存在差异,可以对接Graph模式。Graph模式将整个图视为一个由第三方软件栈实现的大算子(SuperKernel),需要由第三方软件栈实现以下两个功能: - -- 图编译,第三方芯片厂商需要将MindSpore的Anf IR转换成第三方IR图表达,并执行第三方图编译流程将该图编译至可执行的就绪状态; - -- 图执行,第三方芯片厂商需要理解MindSpore的Tensor格式或将其转换成可被理解的格式,并调用执行已就绪的图,并将执行的结果转换成MindSpore的Tensor格式。 - -![image](./images/pluggable_device_graph.png)
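The Kernel-mode memory-management contract described in the removed documents (a DeviceAddress abstraction providing host/device copies plus allocation and release, backed by a memory pool) can be sketched in plain Python. All names are illustrative; the real DeviceAddress and memory pool are C++ interfaces, and the real pool also applies the SOMAS reuse algorithm.

```python
class DeviceAddress:
    """Stand-in for the C++ DeviceAddress: one block of device memory."""

    def __init__(self, size):
        self.size = size
        self._mem = bytearray(size)  # bytearray stands in for device memory

    def copy_host_to_device(self, host_bytes):
        # Vendors implement this with a real host-to-device transfer.
        self._mem[: len(host_bytes)] = host_bytes

    def copy_device_to_host(self, nbytes):
        # Vendors implement this with a real device-to-host transfer.
        return bytes(self._mem[:nbytes])

class SimpleMemoryPool:
    """Toy allocator; the real pool adds caching and SOMAS-based reuse."""

    def __init__(self):
        self.allocated = 0

    def alloc(self, size):
        self.allocated += size
        return DeviceAddress(size)

    def free(self, addr):
        self.allocated -= addr.size

pool = SimpleMemoryPool()
addr = pool.alloc(8)
addr.copy_host_to_device(b"abcd")
```

The vendor plug-in only has to fill in the two copy methods and the alloc/free pair; everything above them (pooling, reuse) is provided by the framework's Common component.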