diff --git a/docs/mindspore/source_en/faq/network_compilation.md b/docs/mindspore/source_en/faq/network_compilation.md
index 03648f0f875b5a7be043d2123497b94e0eb0152e..ea6b8e474722936f52ca12a42a390bf26f479a36 100644
--- a/docs/mindspore/source_en/faq/network_compilation.md
+++ b/docs/mindspore/source_en/faq/network_compilation.md
@@ -37,7 +37,7 @@ A: MindSpore does not support the `yield` syntax in graph mode.

 A: In the inference stage of front-end compilation, the abstract types of nodes, including `type` and `shape`, will be inferred. Common abstract types include `AbstractScalar`, `AbstractTensor`, `AbstractFunction`, `AbstractTuple`, `AbstractList`, etc. In some scenarios, such as multi-branch scenarios, the abstract types of the return values of different branches will be `join`ed to infer the abstract type of the returned result. If these abstract types do not match, or `type`/`shape` are inconsistent, the above exception will be thrown.

-When an error similar to `Type Join Failed: dtype1 = Float32, dtype2 = Float16` appears, it means that the data types are inconsistent, resulting in an exception when joining abstract. According to the provided data types and code line, the error can be quickly located. In addition, the specific abstract information and node information are provided in the error message. You can view the MindIR information through the `analyze_fail.ir` file to locate and solve the problem. For specific introduction of MindIR, please refer to [MindSpore IR (MindIR)](https://www.mindspore.cn/docs/en/master/design/all_scenarios.html#mindspore-ir-mindir). The code sample is as follows:
+When an error similar to `Type Join Failed: dtype1 = Float32, dtype2 = Float16` appears, it means that the data types are inconsistent, resulting in an exception when joining the abstracts. Based on the data types and code line provided, the error can be quickly located. In addition, the specific abstract information and node information are given in the error message. You can view the MindIR information through the `analyze_fail.ir` file to locate and solve the problem. The code sample is as follows:

 ```python
 import numpy as np
diff --git a/docs/mindspore/source_en/features/compile/graph_optimization.md b/docs/mindspore/source_en/features/compile/graph_optimization.md
index 3c5b9cfe8294157b679330c2ddab03b2e7965827..9c9687f514300854efce7b1707c1f42224f130de 100644
--- a/docs/mindspore/source_en/features/compile/graph_optimization.md
+++ b/docs/mindspore/source_en/features/compile/graph_optimization.md
@@ -2,7 +2,7 @@

 [](https://gitee.com/mindspore/docs/blob/master/docs/mindspore/source_en/features/compile/graph_optimization.md)

-Similar to traditional compilers, MindSpore also performs compilation optimization after graph construction. The main purpose of compilation optimization is to analyze and transform MindSpore's intermediate representation [MindIR](https://www.mindspore.cn/docs/en/master/design/all_scenarios.html#mindspore-ir-mindir) by static analysis techniques to achieve goals such as reducing the size of the target code, improving execution efficiency, lowering runtime resource consumption, or enhancing other performance metrics. Compilation optimization is a crucial part of the graph compilation system and plays an extremely important role in improving the performance and resource utilization of the entire neural network model. Compared with the original code that has not been optimized, compilation optimization can bring several times or even tens of times performance improvement.
+Similar to traditional compilers, MindSpore also performs compilation optimization after graph construction. The main purpose of compilation optimization is to analyze and transform MindSpore's intermediate representation MindIR through static analysis techniques, in order to reduce the size of the target code, improve execution efficiency, lower runtime resource consumption, or enhance other performance metrics. Compilation optimization is a crucial part of the graph compilation system and plays an extremely important role in improving the performance and resource utilization of the entire neural network model. Compared with unoptimized code, compilation optimization can bring a several-fold or even tens-of-fold performance improvement.

 This section mainly introduces front-end compilation optimization techniques that are independent of specific hardware. Hardware-specific back-end compilation optimization techniques are not within the scope of this discussion.
diff --git a/docs/mindspore/source_en/features/compile/multi_level_compilation.md b/docs/mindspore/source_en/features/compile/multi_level_compilation.md
index c6282d3283987a6624a08b93a5d0b6cb795611cc..43deec2a3e063987672f8eb80b8fc200b1774100 100644
--- a/docs/mindspore/source_en/features/compile/multi_level_compilation.md
+++ b/docs/mindspore/source_en/features/compile/multi_level_compilation.md
@@ -101,7 +101,7 @@ The overall architecture of graph-kernel fusion is shown in the figure below. Th

 The optimized computational graph is passed to MindSpore AKG as a subgraph for further back-end optimization and target code generation.

-
+

 By following these steps, we can obtain performance gains in two aspects:
diff --git a/docs/mindspore/source_en/features/data_engine.md b/docs/mindspore/source_en/features/data_engine.md
index ba9000f81187abfcdd973b06632ac33610a5299c..fada48b563d61b9066e17ffa3fd874abeaacfa9a 100644
--- a/docs/mindspore/source_en/features/data_engine.md
+++ b/docs/mindspore/source_en/features/data_engine.md
@@ -16,7 +16,7 @@ The core of MindSpore training data processing engine is to efficiently and flex

 Please refer to the instructions for usage: [Data Loading And Processing](https://www.mindspore.cn/docs/en/master/features/dataset/overview.html)

-
+

 MindSpore training data engine also provides efficient loading and sampling capabilities for datasets in fields such as scientific computing (electromagnetic simulation) and remote sensing large-format image processing, helping MindSpore achieve full-scenario support.

@@ -26,7 +26,7 @@ MindSpore training data engine also provides efficient loading and sampling capa

 The design of MindSpore considers the efficiency, flexibility and adaptability of data processing in different scenarios. The whole data processing subsystem is divided into the following modules:

-
+

 - API: The data processing process is represented in MindSpore in the form of a graph, called a data graph. MindSpore provides Python APIs to define data graphs externally, and implements graph optimization and graph execution internally.
 - Data Processing Pipeline: a multi-step parallel pipeline for data loading and pre-processing, which consists of the following components.
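To make the data-graph idea in the `API` bullet above concrete, here is a minimal sketch of a pipeline; the generator source, column names, and transform are invented for illustration:

```python
import numpy as np
import mindspore.dataset as ds

# Invented source: 100 samples of a (32, 32, 3) float image plus an int label.
def generator():
    for _ in range(100):
        yield np.random.rand(32, 32, 3).astype(np.float32), np.int32(0)

# Each operation below adds a node to the data graph; the pipeline
# optimizes and executes the graph internally, in parallel.
dataset = ds.GeneratorDataset(generator, column_names=["image", "label"])
dataset = dataset.map(operations=lambda img: img * 2.0, input_columns=["image"])
dataset = dataset.batch(16)

for batch in dataset.create_dict_iterator(output_numpy=True):
    print(batch["image"].shape)  # (16, 32, 32, 3)
    break
```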
diff --git a/docs/mindspore/source_en/features/overview.md b/docs/mindspore/source_en/features/overview.md
index 9a6f425b186b4958e94e6036d90394a324acd603..9d49376aa4bf3db34710b0d1a161396573228c12 100644
--- a/docs/mindspore/source_en/features/overview.md
+++ b/docs/mindspore/source_en/features/overview.md
@@ -33,13 +33,13 @@ MindSpore is a full-scenario deep learning framework designed to achieve three m

 ### Fusion of Functional and Object-Oriented Programming Paradigms

-MindSpore provides both object-oriented and function-oriented [programming paradigms](https://www.mindspore.cn/docs/en/master/design/programming_paradigm.html), both of which can be used to construct network algorithms and training processes.
+MindSpore provides both object-oriented and function-oriented programming paradigms, both of which can be used to construct network algorithms and training processes.

 Developers can derive from the nn.Cell class to define AI networks or layers with the required functionality, and assemble various defined layers through nested object calls to complete the definition of the entire AI network.

 At the same time, developers can also define a pure Python function that can be source-to-source compiled by MindSpore, and accelerate its execution through the functions or decorators provided by MindSpore. Under the requirements of MindSpore's static syntax, pure Python functions can support nested subfunctions, control logic, and even recursive function expressions. Therefore, based on this programming paradigm, developers can flexibly enable certain functional features, making it easier to express business logic.

-MindSpore implements [functional differential programming](https://www.mindspore.cn/docs/en/master/design/programming_paradigm.html#functional-differential-programming), which performs differentiation based on the call chain according to the calling relationship for function objects that can be differentiated. This automatic differentiation strategy better aligns with mathematical semantics and has an intuitive correspondence with composite functions in basic algebra. As long as the derivative formulas of basic functions are known, the derivative formula of a composite function composed of any basic functions can be derived.
+MindSpore implements functional differential programming, which differentiates differentiable function objects along their call chain. This automatic differentiation strategy better aligns with mathematical semantics and has an intuitive correspondence with composite functions in basic algebra. As long as the derivative formulas of basic functions are known, the derivative formula of a composite function composed of any basic functions can be derived.

 At the same time, based on the functional programming paradigm, MindSpore provides rich built-in higher-order functions such as vmap and shard. Like the differential function grad, these allow developers to conveniently pass a function or object as a parameter to a higher-order function. Higher-order functions, after internal compilation optimization, generate optimized versions of developers' functions, implementing features such as vectorization transformation and distributed parallel partitioning.
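As a minimal sketch of the functional paradigm described above (the cubic function `f` is invented for illustration), `mindspore.grad` and `mindspore.vmap` compose like ordinary functions:

```python
import numpy as np
import mindspore as ms
from mindspore import Tensor

def f(x):
    return x ** 3  # derivative: 3 * x ** 2

# grad transforms f into a new callable that computes df/dx.
df = ms.grad(f)
print(df(Tensor(2.0, ms.float32)))  # 12.0

# vmap vectorizes the scalar derivative over a batch axis.
batched_df = ms.vmap(df, in_axes=0)
x = Tensor(np.array([1.0, 2.0, 3.0], dtype=np.float32))
print(batched_df(x))  # [ 3. 12. 27.]
```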
@@ -57,7 +57,7 @@ MindSpore builds the graph structure of neural networks based on Python, which p

 Native Python expressions can directly enable static graph mode execution based on Python control flow keywords, achieving a higher degree of programming unification between dynamic and static graphs. At the same time, developers can flexibly control whether Python code fragments run in dynamic or static graph mode through MindSpore's interfaces. That is, local functions can be executed in static graph mode ([mindspore.jit](https://www.mindspore.cn/docs/en/master/api_python/mindspore/mindspore.jit.html)) while other functions are executed in dynamic graph mode. This allows developers to flexibly specify function fragments for static graph optimization and acceleration when interleaving them with common Python libraries and custom Python functions, without sacrificing the programming ease of interleaved execution.

-### [Distributed Parallel Computing](https://www.mindspore.cn/docs/en/master/design/distributed_training_design.html)
+### Distributed Parallel Computing

 As large model parameters continue to grow, complex and diverse distributed parallel strategies are needed to address this challenge. MindSpore has built-in multi-dimensional distributed training strategies that developers can flexibly assemble and use. Through parallel abstraction, communication operations are hidden, simplifying the complexity of parallel programming for developers.

@@ -71,7 +71,7 @@ At the same time, MindSpore also provides various parallel strategies such as pi

 Based on compilation technology, MindSpore provides rich hardware-independent optimizations such as IR fusion, algebraic simplification, constant folding, and common subexpression elimination. At the same time, it also provides various hardware optimization capabilities for different hardware such as NPU and GPU, thereby better leveraging the large-scale computational acceleration capabilities of the hardware.

-#### [Graph-Kernel Fusion](https://www.mindspore.cn/docs/en/master/design/multi_level_compilation.html#graph-kernel-fusion)
+#### [Graph-Kernel Fusion](https://www.mindspore.cn/docs/en/master/features/compile/multi_level_compilation.html#graph-kernel-fusion)

 Mainstream AI computing frameworks like MindSpore typically define operators from the perspective of developer understanding and ease of use. Each operator carries a varying amount of computation and computational complexity. However, from a hardware execution perspective, this natural division of operator computation from the developer's perspective is not efficient and cannot fully utilize hardware computational capabilities. This is mainly reflected in:

@@ -95,12 +95,12 @@ Loop sinking is an optimization based on On Device execution, aimed at further r

 Data sinking means that data is directly transmitted to the Device through channels.

-### [Unified Deployment Across All Scenarios](https://www.mindspore.cn/docs/en/master/design/all_scenarios.html)
+### Unified Deployment Across All Scenarios

 MindSpore is an AI framework that integrates training and inference, supporting both training and inference functions. At the same time, MindSpore supports various chips such as CPU, GPU, and NPU, provides unified programming interfaces, and can generate offline models that can be loaded and executed on various hardware.
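Referring back to the dynamic/static unification described in the @@ -57 hunk above, here is a minimal sketch of accelerating one local function with mindspore.jit while the surrounding code stays in dynamic graph mode; the function body and shapes are invented for illustration:

```python
import numpy as np
import mindspore as ms
from mindspore import Tensor, ops

@ms.jit  # only this function is compiled and run as a static graph
def dense_relu(x, w):
    return ops.relu(ops.matmul(x, w))

x = Tensor(np.random.rand(4, 8).astype(np.float32))
w = Tensor(np.random.rand(8, 8).astype(np.float32))

y = dense_relu(x, w)  # static-graph execution
y = y * 2             # ordinary dynamic-graph (PyNative) execution
print(y.shape)        # (4, 8)
```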
 According to actual execution environments and business requirements, MindSpore provides multiple specification versions, supporting deployment on the cloud, servers, mobile and other embedded devices, and ultra-lightweight devices such as earphones.

-### [Third-Party Hardware Integration](https://www.mindspore.cn/docs/en/master/design/pluggable_device.html)
+### [Third-Party Hardware Integration](https://www.mindspore.cn/docs/en/master/features/runtime/pluggable_device.html)

-Based on the unified MindIR, MindSpore has built an open AI architecture that supports third-party chip plugins, standardization, and low-cost rapid integration, which can connect to GPU series chips as well as various DSA chips. MindSpore provides two chip integration methods: Kernel mode and Graph mode, allowing chip manufacturers to choose the integration method according to their own characteristics.
\ No newline at end of file
+Based on the unified MindIR, MindSpore has built an open AI architecture that supports third-party chip plugins, standardization, and low-cost rapid integration, which can connect to GPU series chips as well as various DSA chips. MindSpore provides two chip integration methods: Kernel mode and Graph mode, allowing chip manufacturers to choose the integration method according to their own characteristics.
diff --git a/docs/mindspore/source_en/features/parallel/data_parallel.md b/docs/mindspore/source_en/features/parallel/data_parallel.md
index aef44a6c9b7b8c2f9b8b0fa38187711bc16a5f88..3fa7ff6c141734c14f9bcb013cb215d6cfa01735 100644
--- a/docs/mindspore/source_en/features/parallel/data_parallel.md
+++ b/docs/mindspore/source_en/features/parallel/data_parallel.md
@@ -15,7 +15,7 @@ Related interfaces are as follows:

 ## Overall Process

-
+

 1. Environmental dependencies
diff --git a/docs/mindspore/source_en/features/runtime/memory_manager.md b/docs/mindspore/source_en/features/runtime/memory_manager.md
index ce1857ee237164681de0f7d630058a91abc6b140..1162827adce30c41f68c497e4da76979301607b9 100644
--- a/docs/mindspore/source_en/features/runtime/memory_manager.md
+++ b/docs/mindspore/source_en/features/runtime/memory_manager.md
@@ -9,7 +9,7 @@ Device memory (hereinafter referred to as memory) is the most important resource

 1. The memory pool serves as the base of memory management and can effectively avoid the overhead of frequent dynamic memory allocation.
 2. The memory reuse algorithm, as the core capability of memory management, needs to deliver efficient memory reuse with minimal memory fragmentation.

-
+

 ## Interfaces

@@ -22,7 +22,7 @@ The memory management-related interfaces are detailed in [runtime interfaces](ht

 The core idea of the memory pool as the base of memory management is to pre-allocate a large block of contiguous memory, allocate directly from the pool when memory is requested, and return memory to the pool for reuse when it is released, instead of frequently calling the system's allocation and release interfaces. This reduces the overhead of frequent dynamic allocation and improves system performance. MindSpore mainly uses the BestFit memory allocation algorithm, supports dynamic expansion of memory blocks and defragmentation, and sets the initialization parameters of the memory pool through the interface [mindspore.runtime.set_memory(init_size, increase_size, max_size)](https://www.mindspore.cn/docs/en/master/api_python/runtime/mindspore.runtime.set_memory.html) to control the dynamic expansion size and maximum memory usage.
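A usage sketch of the set_memory interface referenced above; the size values are arbitrary examples:

```python
import mindspore as ms

# Arbitrary example values: start from a 2GB pool, expand in 2GB steps,
# and cap the total pool size at 8GB. Set before any device memory is used.
ms.runtime.set_memory(init_size="2GB", increase_size="2GB", max_size="8GB")
```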
-
+

 1. Slicing operation: When memory is allocated, the free areas are sorted by size, the first free area that meets the requirement is found and allocated on demand, the excess is split off, and a new free memory block is inserted.
 2. Merge operation: When memory is reclaimed, neighboring free memory blocks are merged into one large free memory block.

@@ -55,4 +55,4 @@ Dynamic memory reuse is just the opposite of static memory reuse, transferring t

 4. Reset the initial reference count from step 1.

 - Pros: Dynamic memory reuse during the graph execution phase, fully generalized, especially friendly for dynamic shape and control flow scenarios.
-- Cons: The graph execution phase is reused on demand, obtains no global information, and is prone to memory fragmentation.
\ No newline at end of file
+- Cons: Memory is reused on demand during the graph execution phase without global information, which makes it prone to memory fragmentation.
diff --git a/docs/mindspore/source_en/features/runtime/multilevel_pipeline.md b/docs/mindspore/source_en/features/runtime/multilevel_pipeline.md
index 62276109aae103df85befb55676b37b5b7f83609..235120d55c435a9349a24584098aea1a7f62f1e6 100644
--- a/docs/mindspore/source_en/features/runtime/multilevel_pipeline.md
+++ b/docs/mindspore/source_en/features/runtime/multilevel_pipeline.md
@@ -12,7 +12,7 @@ Runtime scheduling for an operator mainly includes the operations InferShape (in

 Multi-level pipelining is a key runtime performance optimization: it improves runtime scheduling efficiency by decomposing operator scheduling into tasks and dispatching them in a parallel pipeline, giving full play to multi-core CPU performance. The main flow is as follows:

-
+

 1. Task decomposition: operator scheduling is decomposed into three tasks: InferShape, Resize, and Launch.
 2. Queue creation: three queues (Infer Queue, Resize Queue, and Launch Queue) are created to take over the three tasks from step 1.
diff --git a/docs/mindspore/source_en/features/runtime/multistream_concurrency.md b/docs/mindspore/source_en/features/runtime/multistream_concurrency.md
index bc482deafd1f84b73470b226b121ad302bcbe763..b99d645b3ffac040d2061977cdcfea8d372adda0 100644
--- a/docs/mindspore/source_en/features/runtime/multistream_concurrency.md
+++ b/docs/mindspore/source_en/features/runtime/multistream_concurrency.md
@@ -10,7 +10,7 @@ During the training of large-scale deep learning models, the importance of commu

 Traditional multi-stream concurrency methods usually rely on manual configuration, which is not only cumbersome and error-prone, but also often fails to achieve optimal concurrency for complex computational graphs. MindSpore's automatic stream assignment feature identifies concurrency opportunities in the computational graph by means of an intelligent algorithm and assigns different operators to different streams for execution. This automated allocation process not only simplifies user operations, but also enables dynamic adjustment of stream allocation policies at runtime to accommodate different computing environments and resource conditions.
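As a language-level illustration of the three-stage pipeline in the multilevel_pipeline.md hunk above, here is a toy producer-consumer sketch; the stage bodies are invented stand-ins, not MindSpore's actual runtime implementation:

```python
import queue
import threading

# Toy stand-ins for the three scheduling stages of an operator.
def infer_shape(op): op["shape"] = (4, 4)
def resize(op): op["workspace"] = 64
def launch(op): print("launch", op["name"], "shape =", op["shape"])

infer_q, resize_q, launch_q = queue.Queue(), queue.Queue(), queue.Queue()

def stage(src, work, dst):
    while True:
        op = src.get()
        if op is None:  # shutdown signal: forward it downstream and exit
            if dst is not None:
                dst.put(None)
            break
        work(op)
        if dst is not None:
            dst.put(op)

threads = [
    threading.Thread(target=stage, args=(infer_q, infer_shape, resize_q)),
    threading.Thread(target=stage, args=(resize_q, resize, launch_q)),
    threading.Thread(target=stage, args=(launch_q, launch, None)),
]
for t in threads:
    t.start()

# The host thread only dispatches InferShape tasks; the three stages
# of successive operators overlap, mirroring the pipelined dispatch above.
for i in range(3):
    infer_q.put({"name": "op%d" % i})
infer_q.put(None)
for t in threads:
    t.join()
```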
-
+

 The principles are as follows:
diff --git a/docs/mindspore/source_en/index.rst b/docs/mindspore/source_en/index.rst
index b2abe11b6715e79a7738e3b5cbea31795cbfa822..b0e36e72f37598e87c6c51f6d7dd36367899600f 100644
--- a/docs/mindspore/source_en/index.rst
+++ b/docs/mindspore/source_en/index.rst
@@ -6,7 +6,6 @@ MindSpore Documentation
    :maxdepth: 1
    :hidden:

-   design/index
    features/index
    api_python/index
    faq/index
@@ -19,7 +18,7 @@ MindSpore Documentation