diff --git a/CONTRIBUTING_DOC.md b/CONTRIBUTING_DOC.md index 53a1a63281e5aed7f04333637b0d494362414829..171f94d5ab1d26bfb2ec1ba6a3241acf5f265a3b 100644 --- a/CONTRIBUTING_DOC.md +++ b/CONTRIBUTING_DOC.md @@ -25,7 +25,7 @@ The procedure for submitting the modification is the same as that for submitting - The title supports only the ATX style. The title and context must be separated by a blank line. - ``` + ```markdown # Heading 1 ## Heading 2 @@ -35,7 +35,7 @@ The procedure for submitting the modification is the same as that for submitting - If the list title and content need to be displayed in different lines, add a blank line between the title and content. Otherwise, the line breaks may not be implemented. - ``` + ```markdown - Title Content @@ -45,13 +45,13 @@ The procedure for submitting the modification is the same as that for submitting - Precautions are marked with a right angle bracket (>). - ``` + ```markdown > Precautions ``` - References should be listed at the end of the document and marked in the document. - ``` + ```markdown Add a [number] after the referenced text or image description. ## References @@ -63,12 +63,12 @@ The procedure for submitting the modification is the same as that for submitting - Comments in the sample code must comply with the following requirements: - - Comments are written in English. - - Use ```"""``` to comment out Python functions, methods, and classes. - - Use ```#``` to comment out other Python code. - - Use ```//``` to comment out C++ code. + - Comments are written in English. + - Use ```"""``` to comment out Python functions, methods, and classes. + - Use ```#``` to comment out other Python code. + - Use ```//``` to comment out C++ code. - ``` + ```markdown """ Comments on Python functions, methods, and classes """ @@ -81,7 +81,7 @@ The procedure for submitting the modification is the same as that for submitting - A blank line must be added before and after an image and an image title. 
Otherwise, the typesetting will be abnormal. For example as correctly: - ``` + ```markdown Example: ![](./xxx.png) @@ -93,7 +93,7 @@ The procedure for submitting the modification is the same as that for submitting - A blank line must be added before and after a table. Otherwise, the typesetting will be abnormal. Tables are not supported in ordered or unordered lists. For example as correctly: - ``` + ```markdown ## Title | Header1 | Header2 @@ -103,43 +103,43 @@ The procedure for submitting the modification is the same as that for submitting The following content. ``` - + - Mark the reference interface, path name, file name in the tutorial and document with "\` \`". If it's a function or method, don't use parentheses at the end. For example: - - - Reference method - - ``` + + - Reference method + + ```markdown Use the `map` method. ``` - - - Reference code - - ``` + + - Reference code + + ```markdown `batch_size`: number of data in each group. ``` - - - Reference path - - ``` + + - Reference path + + ```markdown Decompress the dataset and store it in `./MNIST_Data`. ``` - - Reference file name - - ``` + - Reference file name + + ```markdown Other dependencies is described in `requirements.txt`. ``` - In tutorials and documents, the contents that need to be replaced need additional annotation. In the body, a "*" should be added before and after the content. In the code snippet, the content should be annotated with "{}". For example: - - - In body - ``` - Need to replace your local path *your_ path*. + - In body + + ```markdown + Need to replace your local path *your_ path*. 
``` - - In code snippet + - In code snippet - ``` + ```markdown conda activate {your_env_name} - ``` \ No newline at end of file + ``` diff --git a/CONTRIBUTING_DOC_CN.md b/CONTRIBUTING_DOC_CN.md index 2ed49dbd3769e654dd699a9f26e6ad8fe058135d..b349f753b1aeb3e1ecf75ec40b2ce016ba42b15d 100644 --- a/CONTRIBUTING_DOC_CN.md +++ b/CONTRIBUTING_DOC_CN.md @@ -25,7 +25,7 @@ - 标题仅支持Atx风格,标题与上下文需用空行隔开。 - ``` + ```markdown # 一级标题 ## 二级标题 @@ -35,7 +35,7 @@ - 列表标题和内容如需换行显示,标题和内容间需增加一个空行,否则无法实现换行。 - ``` + ```markdown - 标题 内容。 @@ -45,30 +45,30 @@ - 注意事项使用“>”标识。 - ``` + ```markdown > 注意事项内容。 ``` - 参考文献需列举在文末,并在文中标注。 - - ``` + + ```markdown 引用文字或图片说明后,增加标注[编号]。 ## 参考文献 [1] 作者. [有链接的文献名](http://xxx). - + [2] 作者. 没有链接的文献名. ``` - 示例代码注释需遵循如下要求: - - 注释用英文写作; - - Python函数、方法、类的注释使用```"""```; - - Python其他代码注释使用```#```; - - C++代码注释使用```//```。 + - 注释用英文写作; + - Python函数、方法、类的注释使用```"""```; + - Python其他代码注释使用```#```; + - C++代码注释使用```//```。 - ``` + ```markdown """ Python函数、方法、类的注释 """ @@ -81,7 +81,7 @@ - 图和图标题前后需增加一个空行,否则会导致排版异常。正确举例如下: - ``` + ```markdown 如下图所示: ![](./xxx.png) @@ -93,7 +93,7 @@ - 表格前后需增加一个空行,否则会导致排版异常。有序或无序列表内不支持表格。正确举例如下: - ``` + ```markdown ## 文章标题 | 表头1 | 表头2 @@ -105,41 +105,41 @@ ``` - 教程、文档中引用接口、路径名、文件名等使用“\` \`”标注,如果是函数或方法,最后不加括号。举例如下: - - - 引用方法 - - ``` + + - 引用方法 + + ```markdown 使用映射 `map` 方法。 ``` - - - 引用代码 - - ``` + + - 引用代码 + + ```markdown `batch_size`:每组包含的数据个数。 ``` - - 引用路径 - - ``` + - 引用路径 + + ```markdown 将数据集解压存放到工作区`./MNIST_Data`路径下。 ``` - - 引用文件名 - - ``` + - 引用文件名 + + ```markdown 其他依赖项在`requirements.txt`中有详细描述。 ``` - 教程、文档中待用户替换的内容需要额外标注,在正文中,使用“*”包围需要替换内容,在代码片段中,使用“{}”包围替换内容。举例如下: - - - 正文中 - ``` + - 正文中 + + ```markdown 需要替换你的本地路径*your_path*。 ``` - - - 代码片段中 - ``` + - 代码片段中 + + ```markdown conda activate {your_env_name} ``` diff --git a/README.md b/README.md index 499d288a55662adde7dfb265ca95ac93d64dec07..b79bbd2d9b961550b7ded278871cec0fd6e518c5 100644 --- a/README.md +++ b/README.md @@ -16,20 +16,20 @@ If you have any comments or suggestions on the 
documents, submit them in Issues. ## Directory Structure Description -``` +```text docs ├───docs // Technical documents about architecture, network list, operator list, programming guide and so on. Configuration files for API generation. -│ +│ ├───install // Installation guide. -│ +│ ├───lite // Summary of all documents related to mindspore lite and their links. -│ +│ ├───resource // Resource-related documents. -│ +│ ├───tools // Automation tool. -│ +│ ├───tutorials // Tutorial-related documents. -│ +│ └───README_CN.md // Docs repository description. ``` @@ -38,21 +38,27 @@ docs MindSpore tutorials and API documents can be generated by [Sphinx](https://www.sphinx-doc.org/en/master/). The following uses the Python API document as an example to describe the procedure, and ensure that MindSpore, MindSpore Hub and MindArmour have been installed. 1. Download code of the MindSpore Docs repository. + ```shell git clone https://gitee.com/mindspore/docs.git ``` + 2. Go to the api_python directory and install the dependency items in the `requirements.txt` file. + ```shell cd docs/api_python pip install -r requirements.txt ``` + 3. Run the following command in the api_python directory to create the `build_zh_cn/html` directory that stores the generated document web page. You can open `build_zh_cn/html/index.html` to view the API document. - ``` + + ```shell make html ``` + > If you only need to generate the MindSpore API, please modify the `source_zh_cn/conf.py` file, comment the `import mindspore_hub` and `import mindarmour` statements, and then perform this step. 
## License - [Apache License 2.0](LICENSE) -- [Creative Commons License version 4.0](LICENSE-CC-BY-4.0) \ No newline at end of file +- [Creative Commons License version 4.0](LICENSE-CC-BY-4.0) diff --git a/README_CN.md b/README_CN.md index 8a4de7e8afd65198f4234dc3507f50c4b7157a0d..ac420699c9536a0188543582b30a64b2d7792f0b 100644 --- a/README_CN.md +++ b/README_CN.md @@ -16,20 +16,20 @@ ## 目录结构说明 -``` +```text docs ├───docs // 架构、网络和算子支持、编程指南等技术文档以及用于生成API的相关配置文件 -│ +│ ├───install // 安装指南 -│ +│ ├───lite // MindSpore Lite相关所有文档汇总及其链接 -│ +│ ├───resource // 资源相关文档 -│ +│ ├───tools // 自动化工具 -│ +│ ├───tutorials // 教程相关文档 -│ +│ └───README_CN.md // Docs仓说明 ``` @@ -38,19 +38,25 @@ docs MindSpore的教程和API文档均可由[Sphinx](https://www.sphinx-doc.org/en/master/)工具生成。下面以Python API文档为例介绍具体步骤,操作前需完成MindSpore、MindSpore Hub和MindArmour的安装。 1. 下载MindSpore Docs仓代码。 + ```shell git clone https://gitee.com/mindspore/docs.git ``` + 2. 进入api_python目录,安装该目录下`requirements.txt`文件中的依赖项。 + ```shell cd docs/api_python pip install -r requirements.txt ``` + 3. 
在api_python目录下执行如下命令,完成后会新建`build_zh_cn/html`目录,该目录中存放了生成后的文档网页,打开`build_zh_cn/html/index.html`即可查看API文档内容。 - ``` + + ```shell make html ``` - > 如仅需生成MindSpore API,请先修改`source_zh_cn/conf.py`文件,注释`import mindspore_hub`和`import mindarmour`语句后,再执行此步骤。 + + > 如仅需生成MindSpore API,请先修改`source_zh_cn/conf.py`文件,注释`import mindspore_hub`和`import mindarmour`语句后,再执行此步骤。 ## 版权 diff --git a/docs/api_cpp/source_en/dataset.md b/docs/api_cpp/source_en/dataset.md index 607e3eaf4f4243debe90e8d8db758a111eadb683..df6c8a3f8ecae435799c89bd390b2e33f4f086d5 100644 --- a/docs/api_cpp/source_en/dataset.md +++ b/docs/api_cpp/source_en/dataset.md @@ -1,14 +1,14 @@ # mindspore::dataset -#include <[lite_mat.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/lite_mat.h)> -#include <[image_process.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/image_process.h)> +\#include <[lite_mat.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/lite_mat.h)> +\#include <[image_process.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/image_process.h)> -## Functions of image_process.h +## Functions of image_process.h ### ResizeBilinear -``` +```cpp bool ResizeBilinear(LiteMat &src, LiteMat &dst, int dst_w, int dst_h) ``` @@ -26,7 +26,7 @@ Resize image by bilinear algorithm, currently the data type only supports uint8, ### InitFromPixel -``` +```cpp bool InitFromPixel(const unsigned char *data, LPixelType pixel_type, LDataType data_type, int w, int h, LiteMat &m) ``` @@ -46,7 +46,7 @@ Initialize LiteMat from pixel, currently the conversion supports rbgaTorgb and r ### ConvertTo -``` +```cpp bool ConvertTo(LiteMat &src, LiteMat &dst, double scale = 1.0) ``` @@ -64,7 +64,7 @@ Convert the data type, currently it supports converting the data type from uint8 ### Crop -``` +```cpp bool 
Crop(LiteMat &src, LiteMat &dst, int x, int y, int w, int h) ``` @@ -84,7 +84,7 @@ Crop image, the channel supports is 3 and 1. ### SubStractMeanNormalize -``` +```cpp bool SubStractMeanNormalize(const LiteMat &src, LiteMat &dst, const std::vector &mean, const std::vector &std) ``` @@ -102,7 +102,7 @@ Normalize image, currently the supports data type is float. ### Pad -``` +```cpp bool Pad(const LiteMat &src, LiteMat &dst, int top, int bottom, int left, int right, PaddBorderType pad_type, uint8_t fill_b_or_gray, uint8_t fill_g, uint8_t fill_r) ``` @@ -126,7 +126,7 @@ Pad image, the channel supports is 3 and 1. ### Affine -``` +```cpp void Affine(LiteMat &src, LiteMat &out_img, double M[6], std::vector dsize, UINT8_C1 borderValue) ``` @@ -140,7 +140,7 @@ Apply affine transformation for 1 channel image. - `dsize`: The size of the output image. - `borderValue`: The pixel value is used for filing after the image is captured. -``` +```cpp void Affine(LiteMat &src, LiteMat &out_img, double M[6], std::vector dsize, UINT8_C3 borderValue) ``` @@ -156,7 +156,7 @@ Apply affine transformation for 3 channel image. ### GetDefaultBoxes -``` +```cpp std::vector> GetDefaultBoxes(BoxesConfig config) ``` @@ -172,7 +172,7 @@ Get default anchor boxes for Faster R-CNN, SSD, YOLO etc. ### ConvertBoxes -``` +```cpp void ConvertBoxes(std::vector> &boxes, std::vector> &default_boxes, BoxesConfig config) ``` @@ -186,7 +186,7 @@ Convert the prediction boxes to the actual boxes with (y, x, h, w). ### ApplyNms -``` +```cpp std::vector ApplyNms(std::vector> &all_boxes, std::vector &all_scores, float thres, int max_boxes) ``` @@ -208,11 +208,11 @@ Real-size box non-maximum suppression. Class that represents a lite Mat of a Image. 
-**Constructors & Destructors** +### Constructors & Destructors -### LiteMat +#### LiteMat -``` +```cpp LiteMat() LiteMat(int width, LDataType data_type = LDataType::UINT8) @@ -224,17 +224,17 @@ LiteMat(int width, int height, int channel, LDataType data_type = LDataType::UIN Constructor of MindSpore dataset LiteMat using default value of parameters. -``` +```cpp ~LiteMat(); ``` Destructor of MindSpore dataset LiteMat. -**Public Member Functions** +### Public Member Functions -### Init +#### Init -``` +```cpp void Init(int width, LDataType data_type = LDataType::UINT8) void Init(int width, int height, LDataType data_type = LDataType::UINT8) @@ -244,9 +244,9 @@ void Init(int width, int height, int channel, LDataType data_type = LDataType::U The function to initialize the channel, width and height of the image, but the parameters are different. -### IsEmpty +#### IsEmpty -``` +```cpp bool IsEmpty() const ``` @@ -256,19 +256,19 @@ A function to determine whether the object is empty. Return True or False. -### Release +#### Release -``` +```cpp void Release() ``` A function to release memory. -**Private Member Functions** +### Private Member Functions -### AlignMalloc +#### AlignMalloc -``` +```cpp void *AlignMalloc(unsigned int size) ``` @@ -282,15 +282,15 @@ Apply for memory alignment. Return the size of a pointer. -### AlignFree +#### AlignFree -``` +```cpp void AlignFree(void *ptr) ``` A function to release pointer memory. -``` +```cpp void InitElemSize(LDataType data_type) ``` @@ -300,9 +300,9 @@ Initialize the value of elem_size_ by data_type. - `data_type`: Type of data. -### addRef +#### addRef -``` +```cpp int addRef(int *p, int value) ``` @@ -311,4 +311,4 @@ A function to count the number of times the function is referenced. - Parameters - `p`: Point to the referenced object. - - `value`: Value added when quoted. \ No newline at end of file + - `value`: Value added when quoted. 
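Reviewer note on the `ApplyNms` entry documented in `dataset.md` above: the patch only adds the `cpp` fence tag, and the doc text itself does not say what "non-maximum suppression" does with `thres` and `max_boxes`. The following is a minimal standalone sketch of the usual greedy NMS procedure, not the MindSpore Lite implementation. The names `IouSketch` and `NmsSketch` are hypothetical, and the element types (`float` boxes and scores, `int` indices) and the `{y_min, x_min, y_max, x_max}` box layout are assumptions, since the template arguments are not visible in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper, not part of MindSpore Lite: intersection-over-union of
// two boxes stored as {y_min, x_min, y_max, x_max} (an assumed layout).
float IouSketch(const std::vector<float> &a, const std::vector<float> &b) {
  const float inter_h = std::max(0.0f, std::min(a[2], b[2]) - std::max(a[0], b[0]));
  const float inter_w = std::max(0.0f, std::min(a[3], b[3]) - std::max(a[1], b[1]));
  const float inter = inter_h * inter_w;
  const float area_a = (a[2] - a[0]) * (a[3] - a[1]);
  const float area_b = (b[2] - b[0]) * (b[3] - b[1]);
  const float uni = area_a + area_b - inter;
  return uni > 0.0f ? inter / uni : 0.0f;
}

// Hypothetical stand-in that mirrors the documented parameter names of
// ApplyNms. Returns the indices of the boxes that survive suppression.
std::vector<int> NmsSketch(const std::vector<std::vector<float>> &all_boxes,
                           const std::vector<float> &all_scores, float thres, int max_boxes) {
  assert(all_boxes.size() == all_scores.size());

  // Visit boxes from the highest score to the lowest.
  std::vector<int> order(all_boxes.size());
  std::iota(order.begin(), order.end(), 0);
  std::stable_sort(order.begin(), order.end(),
                   [&](int lhs, int rhs) { return all_scores[lhs] > all_scores[rhs]; });

  std::vector<int> keep;
  for (int idx : order) {
    if (static_cast<int>(keep.size()) >= max_boxes) {
      break;  // Output is capped at max_boxes entries.
    }
    // Drop the candidate if it overlaps an already kept box by more than thres.
    bool suppressed = false;
    for (int kept : keep) {
      if (IouSketch(all_boxes[idx], all_boxes[kept]) > thres) {
        suppressed = true;
        break;
      }
    }
    if (!suppressed) {
      keep.push_back(idx);
    }
  }
  return keep;
}
```

Under these assumptions, calling `NmsSketch` on three boxes where the second heavily overlaps the first keeps only the first and third. Whether the real `ApplyNms` uses the same overlap test or box layout should be checked against `image_process.h` before any of this wording goes into the docs.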
diff --git a/docs/api_cpp/source_en/errorcode_and_metatype.md b/docs/api_cpp/source_en/errorcode_and_metatype.md index 2cebba704dbf103909e29b1939b463a96ce4a4bc..6d725cb19d1f962a8e3a9f520b4f1f66920149dc 100644 --- a/docs/api_cpp/source_en/errorcode_and_metatype.md +++ b/docs/api_cpp/source_en/errorcode_and_metatype.md @@ -28,6 +28,7 @@ Description of error code and meta type supported in MindSpore Lite. | RET_INPUT_PARAM_INVALID | -600 | Invalid input param by user. | ## MetaType + An **enum** type. | Type Definition | Value | Description | @@ -49,4 +50,3 @@ An **enum** type. |kNumberTypeFloat32| 43 | Indicating a data type of float32. | |kNumberTypeFloat64| 44 | Indicating a data type of float64.| |kNumberTypeEnd| 45 | The end of number type. | - diff --git a/docs/api_cpp/source_en/lite.md b/docs/api_cpp/source_en/lite.md index 6c6ef9d0d78ec6b61da7f39b89d59c54ce833ad4..61ee990133dc195da807e0eeb28ab1efa789328b 100644 --- a/docs/api_cpp/source_en/lite.md +++ b/docs/api_cpp/source_en/lite.md @@ -1,11 +1,10 @@ # mindspore::lite -#include <[context.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/context.h)> +\#include <[context.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/context.h)> -#include <[model.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/model.h)> - -#include <[version.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/version.h)> +\#include <[model.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/model.h)> +\#include <[version.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/version.h)> ## Allocator @@ -15,135 +14,156 @@ Allocator defines a memory pool for dynamic memory malloc and memory free. Context is defined for holding environment variables during runtime. 
-**Constructors & Destructors** +### Constructors & Destructors -``` +```cpp Context() ``` Constructor of MindSpore Lite Context using default value for parameters. -``` +```cpp ~Context() ``` + Destructor of MindSpore Lite Context. -**Public Attributes** +### Public Attributes -``` +```cpp float16_priority ``` + A **bool** value. Defaults to **false**. Prior enable float16 inference. > Enabling float16 inference may cause low precision inference,because some variables may exceed the range of float16 during forwarding. -``` +```cpp device_type ``` + A [**DeviceType**](https://www.mindspore.cn/doc/api_cpp/en/master/lite.html#devicetype) **enum** type. Defaults to **DT_CPU**. Using to specify the device. -``` +```cpp thread_num_ ``` An **int** value. Defaults to **2**. Thread number config for thread pool. -``` +```cpp allocator ``` A **pointer** pointing to [**Allocator**](https://www.mindspore.cn/doc/api_cpp/en/master/lite.html#allocator). -``` -cpu_bind_mode_ +```cpp +cpu_bind_mode_ ``` -A [**CpuBindMode**](https://www.mindspore.cn/doc/api_cpp/en/master/lite.html#cpubindmode) **enum** variable. Defaults to **MID_CPU**. +A [**CpuBindMode**](https://www.mindspore.cn/doc/api_cpp/en/master/lite.html#cpubindmode) **enum** variable. Defaults to **MID_CPU**. ## PrimitiveC + Primitive is defined as prototype of operator. ## Model + Model defines model in MindSpore Lite for managing graph. -**Destructors** +### Destructors -``` +```cpp virtual ~Model() ``` Destructor of MindSpore Lite Model. -**Public Member Functions** +### Public Member Functions -``` +```cpp void Free() ``` + Free MetaGraph in MindSpore Lite Model to reduce memory usage during inference. -``` +```cpp void Destroy() ``` + Destroy all temporary memory in MindSpore Lite Model. -**Static Public Member Functions** -``` +### Static Public Member Functions + +```cpp static Model *Import(const char *model_buf, size_t size) ``` + Static method to create a Model pointer. 
-- Parameters +- Parameters - - `model_buf`: Define the buffer read from a model file. + - `model_buf`: Define the buffer read from a model file. - `size`: variable. Define bytes number of model buffer. - Returns Pointer of MindSpore Lite Model. - + ## CpuBindMode + An **enum** type. CpuBindMode defined for holding bind cpu strategy argument. -**Attributes** +### Attributes -``` -MID_CPU = -1 +```cpp +MID_CPU = 2 ``` + Bind middle cpu first. -``` +```cpp HIGHER_CPU = 1 ``` + Bind higher cpu first. -``` +```cpp NO_BIND = 0 ``` + No bind. + ## DeviceType + An **enum** type. DeviceType defined for holding user's preferred backend. -**Attributes** -``` +### Attributes + +```cpp DT_CPU = -1 ``` + CPU device type. -``` +```cpp DT_GPU = 1 ``` + GPU device type. -``` +```cpp DT_NPU = 0 ``` + NPU device type, not supported yet. + ## Version -``` +```cpp std::string Version() ``` + Global method to get a version string. - Returns diff --git a/docs/api_cpp/source_en/session.md b/docs/api_cpp/source_en/session.md index 74216a06f9928ae45edbebf8dd72953cb6390594..80e6168574004dffd6abf3e5b01d10e7a4db6ec9 100644 --- a/docs/api_cpp/source_en/session.md +++ b/docs/api_cpp/source_en/session.md @@ -1,37 +1,42 @@ -# mindspore::session - -#include <[lite_session.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/lite_session.h)> +# mindspore::session +\#include <[lite_session.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/lite_session.h)> ## LiteSession LiteSession defines session in MindSpore Lite for compiling Model and forwarding model. -**Constructors & Destructors** +### Constructors & Destructors -``` +```cpp LiteSession() ``` + Constructor of MindSpore Lite LiteSession using default value for parameters. -``` + +```cpp ~LiteSession() ``` + Destructor of MindSpore Lite LiteSession. 
-**Public Member Functions** -``` +### Public Member Functions + +```cpp virtual void BindThread(bool if_bind) ``` + Attempt to bind or unbind threads in the thread pool to or from the specified cpu core. - Parameters - `if_bind`: Define whether to bind or unbind threads. -``` +```cpp virtual int CompileGraph(lite::Model *model) ``` -Compile MindSpore Lite model. + +Compile MindSpore Lite model. > CompileGraph should be called before RunGraph. @@ -43,18 +48,20 @@ Compile MindSpore Lite model. STATUS as an error code of compiling graph, STATUS is defined in [errorcode.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/errorcode.h). -``` +```cpp virtual std::vector GetInputs() const ``` + Get input MindSpore Lite MSTensors of model. - Returns The vector of MindSpore Lite MSTensor. -``` +```cpp mindspore::tensor::MSTensor *GetInputsByName(const std::string &name) const ``` + Get input MindSpore Lite MSTensors of model by tensor name. - Parameters @@ -64,11 +71,12 @@ Get input MindSpore Lite MSTensors of model by tensor name. - Returns MindSpore Lite MSTensor. - -``` + +```cpp virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr) ``` -Run session with callback. + +Run session with callback. > RunGraph should be called after CompileGraph. - Parameters @@ -81,9 +89,10 @@ Run session with callback. STATUS as an error code of running graph, STATUS is defined in [errorcode.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/errorcode.h). -``` +```cpp virtual std::vector GetOutputsByNodeName(const std::string &node_name) const ``` + Get output MindSpore Lite MSTensors of model by node name. - Parameters @@ -94,27 +103,30 @@ Get output MindSpore Lite MSTensors of model by node name. The vector of MindSpore Lite MSTensor. -``` +```cpp virtual std::unordered_map GetOutputs() const ``` + Get output MindSpore Lite MSTensors of model mapped by tensor name. 
- Returns The map of output tensor name and MindSpore Lite MSTensor. -``` +```cpp virtual std::vector GetOutputTensorNames() const ``` + Get name of output tensors of model compiled by this session. - Returns The vector of string as output tensor names in order. -``` +```cpp virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const ``` + Get output MindSpore Lite MSTensors of model by tensor name. - Parameters @@ -125,11 +137,12 @@ Get output MindSpore Lite MSTensors of model by tensor name. Pointer of MindSpore Lite MSTensor. -``` +```cpp virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const ``` + Get output MindSpore Lite MSTensors of model by tensor name. - + - Parameters - `tensor_name`: Define tensor name. @@ -138,10 +151,11 @@ Get output MindSpore Lite MSTensors of model by tensor name. Pointer of MindSpore Lite MSTensor. -``` +```cpp virtual int Resize(const std::vector &inputs, const std::vector> &dims) ``` + Resize inputs shape. - Parameters @@ -153,11 +167,12 @@ Resize inputs shape. STATUS as an error code of resize inputs, STATUS is defined in [errorcode.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/errorcode.h). -**Static Public Member Functions** +### Static Public Member Functions -``` +```cpp static LiteSession *CreateSession(lite::Context *context) ``` + Static method to create a LiteSession pointer. - Parameters @@ -167,9 +182,10 @@ Static method to create a LiteSession pointer. - Returns Pointer of MindSpore Lite LiteSession. + ## KernelCallBack -``` +```cpp using KernelCallBack = std::function inputs, std::vector outputs, const CallBackParam &opInfo)> ``` @@ -179,14 +195,16 @@ A function wrapper. KernelCallBack defined the function pointer for callback. A **struct**. CallBackParam defines input arguments for callback function. -**Attributes** +### Attributes -``` +```cpp name_callback_param ``` + A **string** variable. 
Node name argument. -``` +```cpp type_callback_param ``` + A **string** variable. Node type argument. diff --git a/docs/api_cpp/source_en/tensor.md b/docs/api_cpp/source_en/tensor.md index 14171fd1d8d71c43bd96e69ba6e74e6a6eae96dc..4d10116efd910f19a9b97877149204d607ef0592 100644 --- a/docs/api_cpp/source_en/tensor.md +++ b/docs/api_cpp/source_en/tensor.md @@ -1,33 +1,35 @@ # mindspore::tensor -#include <[ms_tensor.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/ms_tensor.h)> - +\#include <[ms_tensor.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/ms_tensor.h)> ## MSTensor MSTensor defined tensor in MindSpore Lite. -**Constructors & Destructors** -``` +### Constructors & Destructors + +```cpp MSTensor() ``` + Constructor of MindSpore Lite MSTensor. - Returns Instance of MindSpore Lite MSTensor. - -``` + +```cpp virtual ~MSTensor() ``` Destructor of MindSpore Lite Model. -**Public Member Functions** +### Public Member Functions -``` +```cpp virtual TypeId data_type() const ``` + Get data type of the MindSpore Lite MSTensor. > TypeId is defined in [mindspore/mindspore/core/ir/dtype/type_id.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/core/ir/dtype/type_id.h). Only number types in TypeId enum are suitable for MSTensor. @@ -36,7 +38,7 @@ Get data type of the MindSpore Lite MSTensor. MindSpore Lite TypeId of the MindSpore Lite MSTensor. -``` +```cpp virtual std::vector shape() const ``` @@ -46,7 +48,7 @@ Get shape of the MindSpore Lite MSTensor. A vector of int as the shape of the MindSpore Lite MSTensor. -``` +```cpp virtual int DimensionSize(size_t index) const ``` @@ -60,7 +62,7 @@ Get size of the dimension of the MindSpore Lite MSTensor index by the parameter Size of dimension of the MindSpore Lite MSTensor. -``` +```cpp virtual int ElementsNum() const ``` @@ -70,7 +72,7 @@ Get number of element in MSTensor. Number of element in MSTensor. 
-``` +```cpp virtual size_t Size() const ``` @@ -79,15 +81,13 @@ Get byte size of data in MSTensor. - Returns Byte size of data in MSTensor. - -``` +```cpp virtual void *MutableData() const ``` Get the pointer of data in MSTensor. - > The data pointer can be used to both write and read data in MSTensor. - Returns diff --git a/docs/api_cpp/source_zh_cn/dataset.md b/docs/api_cpp/source_zh_cn/dataset.md index 5170585149b24badd6ae3a38f40aacf336641003..1811c7d1ca8864e2bd0e0f4a1ea2dc724356fea2 100644 --- a/docs/api_cpp/source_zh_cn/dataset.md +++ b/docs/api_cpp/source_zh_cn/dataset.md @@ -1,14 +1,13 @@ # mindspore::dataset -#include <[lite_mat.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/lite_mat.h)> -#include <[image_process.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/image_process.h)> - +\#include <[lite_mat.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/lite_mat.h)> +\#include <[image_process.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/minddata/dataset/kernels/image/lite_cv/image_process.h)> ## image_process.h文件的函数 ### ResizeBilinear -``` +```cpp bool ResizeBilinear(LiteMat &src, LiteMat &dst, int dst_w, int dst_h) ``` @@ -26,7 +25,7 @@ bool ResizeBilinear(LiteMat &src, LiteMat &dst, int dst_w, int dst_h) ### InitFromPixel -``` +```cpp bool InitFromPixel(const unsigned char *data, LPixelType pixel_type, LDataType data_type, int w, int h, LiteMat &m) ``` @@ -46,7 +45,7 @@ bool InitFromPixel(const unsigned char *data, LPixelType pixel_type, LDataType d ### ConvertTo -``` +```cpp bool ConvertTo(LiteMat &src, LiteMat &dst, double scale = 1.0) ``` @@ -64,7 +63,7 @@ bool ConvertTo(LiteMat &src, LiteMat &dst, double scale = 1.0) ### Crop -``` +```cpp bool Crop(LiteMat &src, LiteMat &dst, int x, int y, int w, int h) ``` @@ -84,7 +83,7 @@ bool Crop(LiteMat &src, 
LiteMat &dst, int x, int y, int w, int h) ### SubStractMeanNormalize -``` +```cpp bool SubStractMeanNormalize(const LiteMat &src, LiteMat &dst, const std::vector &mean, const std::vector &std) ``` @@ -102,7 +101,7 @@ bool SubStractMeanNormalize(const LiteMat &src, LiteMat &dst, const std::vector< ### Pad -``` +```cpp bool Pad(const LiteMat &src, LiteMat &dst, int top, int bottom, int left, int right, PaddBorderType pad_type, uint8_t fill_b_or_gray, uint8_t fill_g, uint8_t fill_r) ``` @@ -126,7 +125,7 @@ bool Pad(const LiteMat &src, LiteMat &dst, int top, int bottom, int left, int ri ### Affine -``` +```cpp void Affine(LiteMat &src, LiteMat &out_img, double M[6], std::vector dsize, UINT8_C1 borderValue) ``` @@ -140,7 +139,7 @@ void Affine(LiteMat &src, LiteMat &out_img, double M[6], std::vector dsi - `dsize`: 输出图像的大小。 - `borderValue`: 采图之后用于填充的像素值。 -``` +```cpp void Affine(LiteMat &src, LiteMat &out_img, double M[6], std::vector dsize, UINT8_C3 borderValue) ``` @@ -156,7 +155,7 @@ void Affine(LiteMat &src, LiteMat &out_img, double M[6], std::vector dsi ### GetDefaultBoxes -``` +```cpp std::vector> GetDefaultBoxes(BoxesConfig config) ``` @@ -172,7 +171,7 @@ std::vector> GetDefaultBoxes(BoxesConfig config) ### ConvertBoxes -``` +```cpp void ConvertBoxes(std::vector> &boxes, std::vector> &default_boxes, BoxesConfig config) ``` @@ -186,7 +185,7 @@ void ConvertBoxes(std::vector> &boxes, std::vector ApplyNms(std::vector> &all_boxes, std::vector &all_scores, float thres, int max_boxes) ``` @@ -208,11 +207,11 @@ std::vector ApplyNms(std::vector> &all_boxes, std::vecto LiteMat是一个处理图像的类。 -**构造函数和析构函数** +### 构造函数和析构函数 -### LiteMat +#### LiteMat -``` +```cpp LiteMat() LiteMat(int width, LDataType data_type = LDataType::UINT8) @@ -224,17 +223,17 @@ LiteMat(int width, int height, int channel, LDataType data_type = LDataType::UIN MindSpore中dataset模块下LiteMat的构造方法,使用参数的默认值。 -``` +```cpp ~LiteMat(); ``` MindSpore dataset LiteMat的析构函数。 -**公有成员函数** +### 公有成员函数 -### Init +#### Init -``` 
+```cpp void Init(int width, LDataType data_type = LDataType::UINT8) void Init(int width, int height, LDataType data_type = LDataType::UINT8) @@ -244,9 +243,9 @@ void Init(int width, int height, int channel, LDataType data_type = LDataType::U 该函数用于初始化图像的通道,宽度和高度,参数不同。 -### IsEmpty +#### IsEmpty -``` +```cpp bool IsEmpty() const ``` @@ -256,19 +255,19 @@ bool IsEmpty() const 返回True或者False。 -### Release +#### Release -``` +```cpp void Release() ``` 释放内存的函数。 -**私有成员函数** +### 私有成员函数 -### AlignMalloc +#### AlignMalloc -``` +```cpp void *AlignMalloc(unsigned int size) ``` @@ -282,18 +281,17 @@ void *AlignMalloc(unsigned int size) 返回指针的大小。 -### AlignFree +#### AlignFree -``` +```cpp void AlignFree(void *ptr) ``` 释放指针内存大小的方法。 +#### InitElemSize -### InitElemSize - -``` +```cpp void InitElemSize(LDataType data_type) ``` @@ -303,7 +301,7 @@ void InitElemSize(LDataType data_type) - `data_type`: 数据的类型。 -``` +```cpp int addRef(int *p, int value) ``` @@ -312,4 +310,4 @@ void InitElemSize(LDataType data_type) - 参数 - `p`: 指向引用的对象。 - - `value`: 引用时所加的值。 \ No newline at end of file + - `value`: 引用时所加的值。 diff --git a/docs/api_cpp/source_zh_cn/lite.md b/docs/api_cpp/source_zh_cn/lite.md index b3aaa5909c268a82eb43b7717348816e55c501f3..978178fabc8f412a6c811e0bb117dcce887ed788 100644 --- a/docs/api_cpp/source_zh_cn/lite.md +++ b/docs/api_cpp/source_zh_cn/lite.md @@ -1,11 +1,10 @@ # mindspore::lite -#include <[context.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/context.h)> +\#include <[context.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/context.h)> -#include <[model.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/model.h)> - -#include <[version.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/version.h)> +\#include <[model.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/model.h)> +\#include 
<[version.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/version.h)> ## Allocator @@ -15,23 +14,23 @@ Allocator类定义了一个内存池,用于动态地分配和释放内存。 Context类用于保存执行中的环境变量。 -**构造函数和析构函数** +### 构造函数和析构函数 -``` +```cpp Context() ``` 用默认参数构造MindSpore Lite Context 对象。 -``` +```cpp ~Context() ``` MindSpore Lite Context 的析构函数。 -**公有属性** +### 公有属性 -``` +```cpp float16_priority ``` @@ -39,29 +38,29 @@ float16_priority > 使能float16推理可能会导致模型推理精度下降,因为在模型推理的中间过程中,有些变量可能会超出float16的数值范围。 -``` +```cpp device_type ``` [**DeviceType**](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/lite.html#devicetype)枚举类型。默认为**DT_CPU**,用于设置设备信息。 -``` +```cpp thread_num_ ``` **int** 值,默认为**2**,设置线程数。 -``` +```cpp allocator ``` 指针类型,指向内存分配器[**Allocator**](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/lite.html#allocator)的指针。 -``` -cpu_bind_mode_ +```cpp +cpu_bind_mode_ ``` -[**CpuBindMode**](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/lite.html#cpubindmode)枚举类型,默认为**MID_CPU**。 +[**CpuBindMode**](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/lite.html#cpubindmode)枚举类型,默认为**MID_CPU**。 ## PrimitiveC @@ -71,87 +70,89 @@ PrimitiveC定义为算子的原型。 Model定义了MindSpore Lite中的模型,便于计算图管理。 -**析构函数** +### 析构函数 -``` +```cpp ~Model() ``` MindSpore Lite Model的析构函数。 -**公有成员函数** +### 公有成员函数 -``` +```cpp void Destroy() ``` 释放Model内的所有过程中动态分配的内存。 -``` +```cpp void Free() ``` 释放MindSpore Lite Model中的MetaGraph,用于减小运行时的内存。 -**静态公有成员函数** +### 静态公有成员函数 -``` +```cpp static Model *Import(const char *model_buf, size_t size) ``` 创建Model指针的静态方法。 -- 参数 +- 参数 - - `model_buf`: 定义了读取模型文件的缓存区。 + - `model_buf`: 定义了读取模型文件的缓存区。 - - `size`: 定义了模型缓存区的字节数。 + - `size`: 定义了模型缓存区的字节数。 - 返回值 指向MindSpore Lite的Model的指针。 - + ## CpuBindMode + 枚举类型,设置cpu绑定策略。 -**属性** +### 属性 -``` -MID_CPU = -1 +```cpp +MID_CPU = 2 ``` 优先中等CPU绑定策略。 -``` +```cpp HIGHER_CPU = 1 ``` 优先高级CPU绑定策略。 -``` +```cpp NO_BIND = 0 ``` 不绑定。 ## DeviceType + 枚举类型,设置设备类型。 -**属性** +### 属性 -``` +```cpp DT_CPU = -1 ``` 设备为CPU。 -``` +```cpp DT_GPU = 1 ``` 设备为GPU。 
-``` +```cpp DT_NPU = 0 ``` @@ -159,11 +160,12 @@ DT_NPU = 0 ## Version -``` +```cpp std::string Version() ``` + 全局方法,用于获取版本的字符串。 - 返回值 - MindSpore Lite版本的字符串。 \ No newline at end of file + MindSpore Lite版本的字符串。 diff --git a/docs/api_cpp/source_zh_cn/session.md b/docs/api_cpp/source_zh_cn/session.md index adfc159b68d8e5e7f35abfef400f434e1719eb9f..58f3a86018f029c7256b9ab0d6e40536cebae614 100644 --- a/docs/api_cpp/source_zh_cn/session.md +++ b/docs/api_cpp/source_zh_cn/session.md @@ -1,36 +1,41 @@ # mindspore::session -#include <[lite_session.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/lite_session.h)> - +\#include <[lite_session.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/lite_session.h)> ## LiteSession LiteSession定义了MindSpore Lite中的会话,用于进行Model的编译和前向推理。 -**构造函数和析构函数** +### 构造函数和析构函数 -``` +```cpp LiteSession() ``` + MindSpore Lite LiteSession的构造函数,使用默认参数。 -``` + +```cpp ~LiteSession() ``` + MindSpore Lite LiteSession的析构函数。 -**公有成员函数** -``` +### 公有成员函数 + +```cpp virtual void BindThread(bool if_bind) ``` + 尝试将线程池中的线程绑定到指定的cpu内核,或从指定的cpu内核进行解绑。 - 参数 - `if_bind`: 定义了对线程进行绑定或解绑。 -``` +```cpp virtual int CompileGraph(lite::Model *model) ``` + 编译MindSpore Lite模型。 > CompileGraph必须在RunGraph方法之前调用。 @@ -43,18 +48,20 @@ virtual int CompileGraph(lite::Model *model) STATUS ,即编译图的错误码。STATUS在[errorcode.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/errorcode.h)中定义。 -``` +```cpp virtual std::vector GetInputs() const ``` + 获取MindSpore Lite模型的MSTensors输入。 - 返回值 MindSpore Lite MSTensor向量。 -``` +```cpp mindspore::tensor::MSTensor *GetInputsByName(const std::string &name) const ``` + 通过tensor名获取MindSpore Lite模型的MSTensors输入。 - 参数 @@ -65,9 +72,10 @@ mindspore::tensor::MSTensor *GetInputsByName(const std::string &name) const MindSpore Lite MSTensor。 -``` +```cpp virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBack &after = nullptr) ``` + 运行带有回调函数的会话。 > 
RunGraph必须在CompileGraph方法之后调用。 @@ -81,9 +89,10 @@ virtual int RunGraph(const KernelCallBack &before = nullptr, const KernelCallBac STATUS ,即编译图的错误码。STATUS在[errorcode.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/errorcode.h)中定义。 -``` +```cpp virtual std::vector GetOutputsByNodeName(const std::string &node_name) const ``` + 通过节点名获取MindSpore Lite模型的MSTensors输出。 - 参数 @@ -94,27 +103,30 @@ virtual std::vector GetOutputsByNodeName(const std::string MindSpore Lite MSTensor向量。 -``` +```cpp virtual std::unordered_map GetOutputs() const ``` + 获取与张量名相关联的MindSpore Lite模型的MSTensors输出。 - 返回值 包含输出张量名和MindSpore Lite MSTensor的容器类型变量。 -``` +```cpp virtual std::vector GetOutputTensorNames() const ``` + 获取由当前会话所编译的模型的输出张量名。 - 返回值 字符串向量,其中包含了按顺序排列的输出张量名。 -``` +```cpp virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &tensor_name) const ``` + 通过张量名获取MindSpore Lite模型的MSTensors输出。 - 参数 @@ -125,9 +137,10 @@ virtual mindspore::tensor::MSTensor *GetOutputByTensorName(const std::string &te 指向MindSpore Lite MSTensor的指针。 -``` +```cpp virtual int Resize(const std::vector &inputs, const std::vector> &dims) ``` + 调整输入的形状。 - 参数 @@ -139,11 +152,12 @@ virtual int Resize(const std::vector &inputs, const std::ve STATUS ,即编译图的错误码。STATUS在[errorcode.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/errorcode.h)中定义。 -**静态公有成员函数** +### 静态公有成员函数 -``` +```cpp static LiteSession *CreateSession(lite::Context *context) ``` + 用于创建一个LiteSession指针的静态方法。 - 参数 @@ -153,9 +167,10 @@ static LiteSession *CreateSession(lite::Context *context) - 返回值 指向MindSpore Lite LiteSession的指针。 + ## KernelCallBack -``` +```cpp using KernelCallBack = std::function inputs, std::vector outputs, const CallBackParam &opInfo)> ``` @@ -166,12 +181,14 @@ using KernelCallBack = std::function inputs 一个结构体。CallBackParam定义了回调函数的输入参数。 **属性** -``` +```cpp name_callback_param ``` + **string** 类型变量。节点名参数。 -``` +```cpp type_callback_param ``` + **string** 类型变量。节点类型参数。 
diff --git a/docs/api_cpp/source_zh_cn/tensor.md b/docs/api_cpp/source_zh_cn/tensor.md index 50e7c294b1923dbc85ff2911f42c91e47b630944..0d2e7e219a83468d84c7d83c961ebe82b0e6510a 100644 --- a/docs/api_cpp/source_zh_cn/tensor.md +++ b/docs/api_cpp/source_zh_cn/tensor.md @@ -1,32 +1,35 @@ # mindspore::tensor -#include <[ms_tensor.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/ms_tensor.h)> - +\#include <[ms_tensor.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/include/ms_tensor.h)> ## MSTensor MSTensor定义了MindSpore Lite中的张量。 -**构造函数和析构函数** -``` +### 构造函数和析构函数 + +```cpp MSTensor() ``` + MindSpore Lite MSTensor的构造函数。 - 返回值 MindSpore Lite MSTensor的实例。 - -``` + +```cpp virtual ~MSTensor() ``` + MindSpore Lite Model的析构函数。 -**公有成员函数** +### 公有成员函数 -``` +```cpp virtual TypeId data_type() const ``` + 获取MindSpore Lite MSTensor的数据类型。 > TypeId在[mindspore/mindspore/core/ir/dtype/type_id\.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/core/ir/dtype/type_id.h)中定义。只有TypeId枚举中的数字类型可用于MSTensor。 @@ -35,18 +38,20 @@ virtual TypeId data_type() const MindSpore Lite MSTensor类的MindSpore Lite TypeId。 -``` +```cpp virtual std::vector shape() const ``` + 获取MindSpore Lite MSTensor的形状。 - 返回值 一个包含MindSpore Lite MSTensor形状数值的整型向量。 -``` +```cpp virtual int DimensionSize(size_t index) const ``` + 通过参数索引获取MindSpore Lite MSTensor的维度的大小。 - 参数 @@ -57,28 +62,30 @@ virtual int DimensionSize(size_t index) const MindSpore Lite MSTensor的维度的大小。 -``` +```cpp virtual int ElementsNum() const ``` + 获取MSTensor中的元素个数。 - 返回值 MSTensor中的元素个数 -``` +```cpp virtual size_t Size() const ``` + 获取MSTensor中的数据的字节数大小。 - 返回值 MSTensor中的数据的字节数大小。 - -``` +```cpp virtual void *MutableData() const ``` + 获取MSTensor中的数据的指针。 > 该数据指针可用于对MSTensor中的数据进行读取和写入。 diff --git a/docs/api_java/source_en/class_list.md b/docs/api_java/source_en/class_list.md index befeea5d3882c22b2a7f7d48f60094a10258b790..5244b02ce5e8ac7f44cd8406fa07898ca6153ddf 100644 --- 
a/docs/api_java/source_en/class_list.md +++ b/docs/api_java/source_en/class_list.md @@ -1,3 +1,13 @@ # Class List -Java API is being translated, will be released soon. +| Package | Class Name | Description | +| ------------------------- | -------------- | ------------------------------------------------------------ | +| com.mindspore.lite.config | MSConfig | MSConfig defines for holding environment variables during runtime. | +| com.mindspore.lite.config | CpuBindMode | CpuBindMode defines the CPU binding mode. | +| com.mindspore.lite.config | DeviceType | DeviceType defines the back-end device type. | +| com.mindspore.lite | LiteSession | LiteSession defines session in MindSpore Lite for compiling Model and forwarding model. | +| com.mindspore.lite | Model | Model defines the model in MindSpore Lite for managing graph. | +| com.mindspore.lite | MSTensor | MSTensor defines the tensor in MindSpore Lite. | +| com.mindspore.lite | DataType | DataType defines the supported data types. | +| com.mindspore.lite | Version | Version is used to obtain the version information of MindSpore Lite. | + diff --git a/docs/api_java/source_en/index.rst b/docs/api_java/source_en/index.rst index f8e7fd507a9549880c0982b37bac2a5cf082feae..935aa0a5d22565b2d51fc919a8f81c00d9702b02 100644 --- a/docs/api_java/source_en/index.rst +++ b/docs/api_java/source_en/index.rst @@ -10,4 +10,8 @@ MindSpore Java API :glob: :maxdepth: 1 - class_list \ No newline at end of file + class_list + lite_session + model + msconfig + mstensor \ No newline at end of file diff --git a/docs/api_java/source_en/lite_session.md b/docs/api_java/source_en/lite_session.md new file mode 100644 index 0000000000000000000000000000000000000000..8b71869df777c442b8f9abb1f0d5f4e2d1abbb85 --- /dev/null +++ b/docs/api_java/source_en/lite_session.md @@ -0,0 +1,188 @@ +# LiteSession + +```java +import com.mindspore.lite.LiteSession; +``` + +LiteSession defines session in MindSpore Lite for compiling Model and forwarding model. 
+ +## Public Member Functions + +| function | +| ------------------------------------------------------------ | +| [boolean init(MSConfig config)](#init) | +| [void bindThread(boolean if_bind)](#bindthread) | +| [boolean compileGraph(Model model)](#compilegraph) | +| [boolean runGraph()](#rungraph) | +| [List\<MSTensor\> getInputs()](#getinputs) | +| [List\<MSTensor\> getInputsByName(String nodeName)](#getinputsbyname) | +| [List\<MSTensor\> getOutputsByNodeName(String nodeName)](#getoutputsbynodename) | +| [Map\<String, MSTensor\> getOutputMapByTensor()](#getoutputmapbytensor) | +| [List\<String\> getOutputTensorNames()](#getoutputtensornames) | +| [MSTensor getOutputByTensorName(String tensorName)](#getoutputbytensorname) | +| [boolean resize(List\<MSTensor\> inputs, int[][] dims)](#resize) | +| [void free()](#free) | + +## init + +```java +public boolean init(MSConfig config) +``` + +Initialize LiteSession. + +- Parameters + + - `config`: MSConfig class. + +- Returns + + Whether the initialization is successful. + +## bindThread + +```java +public void bindThread(boolean if_bind) +``` + +Attempt to bind or unbind threads in the thread pool to or from the specified CPU core. + +- Parameters + - `if_bind`: Define whether to bind or unbind threads. + +## compileGraph + +```java +public boolean compileGraph(Model model) +``` + +Compile MindSpore Lite model. + +- Parameters + + - `model`: Define the model to be compiled. + +- Returns + + Whether the compilation is successful. + +## runGraph + +```java +public boolean runGraph() +``` + +Run the session for inference. + +- Returns + + Whether the inference is successful. + +## getInputs + +```java +public List<MSTensor> getInputs() +``` + +Get the MSTensors input of MindSpore Lite model. + +- Returns + + The vector of MindSpore Lite MSTensor. 
+ +## getInputsByName + +```java +public List<MSTensor> getInputsByName(String nodeName) +``` + +Get the MSTensors input of MindSpore Lite model by the node name. + +- Parameters + + - `nodeName`: Define the node name.
+ +- Returns + + The vector of MindSpore Lite MSTensor. + +## getOutputsByNodeName + +```java +public List<MSTensor> getOutputsByNodeName(String nodeName) +``` + +Get the MSTensors output of MindSpore Lite model by the node name. + +- Parameters + + - `nodeName`: Define the node name. + +- Returns + + The vector of MindSpore Lite MSTensor. + +## getOutputMapByTensor + +```java +public Map<String, MSTensor> getOutputMapByTensor() +``` + +Get the MSTensors output of the MindSpore Lite model associated with the tensor name. + +- Returns + + The map of output tensor name and MindSpore Lite MSTensor. + +## getOutputTensorNames + +```java +public List<String> getOutputTensorNames() +``` + +Get the name of output tensors of the model compiled by this session. + +- Returns + + The vector of string as output tensor names in order. + +## getOutputByTensorName + +```java +public MSTensor getOutputByTensorName(String tensorName) +``` + +Get the MSTensors output of MindSpore Lite model by the tensor name. + +- Parameters + + - `tensorName`: Define the tensor name. + +- Returns + + The MindSpore Lite MSTensor. + +## resize + +```java +public boolean resize(List<MSTensor> inputs, int[][] dims) +``` + +Resize inputs shape. + +- Parameters + + - `inputs`: Model inputs. + - `dims`: Define the new inputs shape. + +- Returns + + Whether the resize is successful. + +## free + +```java +public void free() +``` + +Free LiteSession. diff --git a/docs/api_java/source_en/model.md b/docs/api_java/source_en/model.md new file mode 100644 index 0000000000000000000000000000000000000000..101750238841d2875f2ccba22fa4d7f342609e71 --- /dev/null +++ b/docs/api_java/source_en/model.md @@ -0,0 +1,64 @@ +# Model + +```java +import com.mindspore.lite.Model; +``` + +Model defines the model in MindSpore Lite for managing graph.
+ +## Public Member Functions + +| function | +| ------------------------------------------------------------ | +| [boolean loadModel(Context context, String modelName)](#loadmodel) | +| [boolean loadModel(String modelPath)](#loadmodel) | +| [void freeBuffer()](#freebuffer) | +| [void free()](#free) | + +## loadModel + +```java +public boolean loadModel(Context context, String modelName) +``` + +Load the MindSpore Lite model from Assets. + +- Parameters + + - `context`: Context in Android. + - `modelName`: Model file name. + +- Returns + + Whether the load is successful. + +```java +public boolean loadModel(String modelPath) +``` + +Load the MindSpore Lite model from path. + +- Parameters + + - `modelPath`: Model file path. + +- Returns + + Whether the load is successful. + +## freeBuffer + +```java +public void freeBuffer() +``` + +Free MetaGraph in MindSpore Lite Model to reduce memory usage during inference. + +## free + +```java +public void free() +``` + +Free all temporary memory in MindSpore Lite Model. + diff --git a/docs/api_java/source_en/msconfig.md b/docs/api_java/source_en/msconfig.md new file mode 100644 index 0000000000000000000000000000000000000000..1d0feff298d136bde848ffd8b8e5884c8cdda78f --- /dev/null +++ b/docs/api_java/source_en/msconfig.md @@ -0,0 +1,82 @@ +# MSConfig + +```java +import com.mindspore.lite.config.MSConfig; +``` + +MSConfig is defined for holding environment variables during runtime. + +## Public Member Functions + +| function | +| ------------------------------------------------------------ | +| [boolean init(int deviceType, int threadNum, int cpuBindMode)](#init) | +| [boolean init(int deviceType, int threadNum)](#init) | +| [boolean init(int deviceType)](#init) | +| [boolean init()](#init) | +| [void free()](#free) | + +## init + +```java +public boolean init(int deviceType, int threadNum, int cpuBindMode) +``` + +Initialize MSConfig. 
+ +- Parameters + + - `deviceType`: A [**DeviceType**](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java) **enum** type. + - `threadNum`: Thread number config for thread pool. + - `cpuBindMode`: A [**CpuBindMode**](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/CpuBindMode.java) **enum** variable. + +- Returns + + Whether the initialization is successful. + +```java +public boolean init(int deviceType, int threadNum) +``` + +Initialize MSConfig. `cpuBindMode` defaults to `CpuBindMode.MID_CPU`. + +- Parameters + + - `deviceType`: A [**DeviceType**](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java) **enum** type. + - `threadNum`: Thread number config for thread pool. + +- Returns + + Whether the initialization is successful. + +```java +public boolean init(int deviceType) +``` + +Initialize MSConfig. `cpuBindMode` defaults to `CpuBindMode.MID_CPU`, and `threadNum` defaults to `2`. + +- Parameters + + - `deviceType`: A [**DeviceType**](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java) **enum** type. + +- Returns + + Whether the initialization is successful. + +```java +public boolean init() +``` + +Initialize MSConfig. `deviceType` defaults to `DeviceType.DT_CPU`, `cpuBindMode` defaults to `CpuBindMode.MID_CPU`, and `threadNum` defaults to `2`. + +- Returns + + Whether the initialization is successful. + +## free + +```java +public void free() +``` + +Free all temporary memory in MindSpore Lite MSConfig.
\ No newline at end of file diff --git a/docs/api_java/source_en/mstensor.md b/docs/api_java/source_en/mstensor.md new file mode 100644 index 0000000000000000000000000000000000000000..69bdaa63418778625ef7b6937c44bed697563b8a --- /dev/null +++ b/docs/api_java/source_en/mstensor.md @@ -0,0 +1,148 @@ +# MSTensor + +```java +import com.mindspore.lite.MSTensor; +``` + +MSTensor defines the tensor in MindSpore Lite. + +## Public Member Functions + +| function | +| ------------------------------------------ | +| [int[] getShape()](#getshape) | +| [int getDataType()](#getdatatype) | +| [byte[] getByteData()](#getbytedata) | +| [float[] getFloatData()](#getfloatdata) | +| [int[] getIntData()](#getintdata) | +| [long[] getLongData()](#getlongdata) | +| [void setData(byte[] data)](#setdata) | +| [void setData(ByteBuffer data)](#setdata) | +| [long size()](#size) | +| [int elementsNum()](#elementsnum) | +| [void free()](#free) | + +## getShape + +```java +public int[] getShape() +``` + +Get the shape of the MindSpore Lite MSTensor. + +- Returns + + An array of int as the shape of the MindSpore Lite MSTensor. + +## getDataType + +```java +public int getDataType() +``` + +> DataType is defined in [com.mindspore.lite.DataType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/DataType.java). + +- Returns + + The MindSpore Lite data type of the MindSpore Lite MSTensor class. + +## getByteData + +```java +public byte[] getByteData() +``` + +Get output data of MSTensor, the data type is byte. + +- Returns + + The byte array containing all MSTensor output data. + +## getFloatData + +```java +public float[] getFloatData() +``` + +Get output data of MSTensor, the data type is float. + +- Returns + + The float array containing all MSTensor output data. + +## getIntData + +```java +public int[] getIntData() +``` + +Get output data of MSTensor, the data type is int. 
+ +- Returns + + The int array containing all MSTensor output data.
+ +## getLongData + +```java +public long[] getLongData() +``` + +Get output data of MSTensor, the data type is long. + +- Returns + + The long array containing all MSTensor output data. + +## setData + +```java +public void setData(byte[] data) +``` + +Set the input data of MSTensor. + +- Parameters + - `data`: Input data of byte[] type. + +```java +public void setData(ByteBuffer data) +``` + +Set the input data of MSTensor. + +- Parameters + - `data`: Input data of ByteBuffer type. + +## size + +```java +public long size() +``` + +Get the size of the data in MSTensor in bytes. + +- Returns + + The size of the data in MSTensor in bytes. + +## elementsNum + +```java +public int elementsNum() +``` + +Get the number of elements in MSTensor. + +- Returns + + The number of elements in MSTensor. + +## free + +```java +public void free() +``` + +Free all temporary memory in MindSpore Lite MSTensor. + diff --git a/docs/api_java/source_zh_cn/lite_session.md b/docs/api_java/source_zh_cn/lite_session.md index c008d0eab9f86e9d6b59e2b776ae645a40aa3510..9ed5bc8e9ac138a61128ba5b64dd439b44a97ba1 100644 --- a/docs/api_java/source_zh_cn/lite_session.md +++ b/docs/api_java/source_zh_cn/lite_session.md @@ -33,7 +33,7 @@ public boolean init(MSConfig config) - 参数 - - `MSConfig`: MSConfig类。 + - `MSConfig`: MSConfig类。 - 返回值 @@ -48,7 +48,7 @@ public void bindThread(boolean if_bind) 尝试将线程池中的线程绑定到指定的CPU内核,或从指定的CPU内核进行解绑。 - 参数 - - `if_bind`: 是否对线程进行绑定或解绑。 + - `if_bind`: 是否对线程进行绑定或解绑。 ## compileGraph @@ -156,7 +156,7 @@ public MSTensor getOutputByTensorName(String tensorName) - 参数 - - `tensorName`: 张量名。 + - `tensorName`: 张量名。 - 返回值 @@ -172,8 +172,8 @@ public boolean resize(List inputs, int[][] dims) - 参数 - - `inputs`: 模型对应的所有输入。 - - `dims`: 输入对应的新的shape,顺序注意要与inputs一致。 + - `inputs`: 模型对应的所有输入。 + - `dims`: 输入对应的新的shape,顺序注意要与inputs一致。 - 返回值 diff --git a/docs/api_java/source_zh_cn/model.md b/docs/api_java/source_zh_cn/model.md index 
f5230efd6a73aa22805fa7f0997ded57215ff474..c6081a8db8722e9475859a03565fa3c40dbacf39 100644 --- a/docs/api_java/source_zh_cn/model.md +++ b/docs/api_java/source_zh_cn/model.md @@ -6,7 +6,7 @@ import com.mindspore.lite.Model; Model定义了MindSpore Lite中的模型,便于计算图管理。 -# 公有成员函数 +## 公有成员函数 | function | | ------------------------------------------------------------ | @@ -25,8 +25,8 @@ public boolean loadModel(Context context, String modelName) - 参数 - - `context`: Android中的Context上下文 - - `modelName`: 模型文件名称 + - `context`: Android中的Context上下文 + - `modelName`: 模型文件名称 - 返回值 @@ -40,7 +40,7 @@ public boolean loadModel(String modelPath) - 参数 - - `modelPath`: 模型文件路径 + - `modelPath`: 模型文件路径 - 返回值 @@ -61,4 +61,3 @@ public void free() ``` 释放Model运行过程中动态分配的内存。 - diff --git a/docs/api_java/source_zh_cn/msconfig.md b/docs/api_java/source_zh_cn/msconfig.md index 7e868a60f115e4c3432e6b4b3e5aba03f45ea109..c60019b8c6f6196f8b9651a193e4f1b62021c36c 100644 --- a/docs/api_java/source_zh_cn/msconfig.md +++ b/docs/api_java/source_zh_cn/msconfig.md @@ -26,9 +26,9 @@ public boolean init(int deviceType, int threadNum, int cpuBindMode) - 参数 - - `deviceType`: 设备类型,`deviceType`在[com.mindspore.lite.config.DeviceType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java)中定义。 - - `threadNum`: 线程数 - - `cpuBindMode`: CPU绑定模式,`cpuBindMode`在[com.mindspore.lite.config.CpuBindMode](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/CpuBindMode.java)中定义。 + - `deviceType`: 设备类型,`deviceType`在[com.mindspore.lite.config.DeviceType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java)中定义。 + - `threadNum`: 线程数 + - `cpuBindMode`: 
CPU绑定模式,`cpuBindMode`在[com.mindspore.lite.config.CpuBindMode](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/CpuBindMode.java)中定义。 - 返回值 @@ -42,8 +42,8 @@ public boolean init(int deviceType, int threadNum) - 参数 - - `deviceType`: 设备类型,`deviceType`在[com.mindspore.lite.config.DeviceType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java)中定义。 - - `threadNum`: 线程数。 + - `deviceType`: 设备类型,`deviceType`在[com.mindspore.lite.config.DeviceType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java)中定义。 + - `threadNum`: 线程数。 - 返回值 @@ -57,7 +57,7 @@ public boolean init(int deviceType) - 参数 - - `deviceType`: 设备类型,`deviceType`在[com.mindspore.lite.config.DeviceType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java)中定义。 + - `deviceType`: 设备类型,`deviceType`在[com.mindspore.lite.config.DeviceType](https://gitee.com/mindspore/mindspore/blob/master/mindspore/lite/java/java/app/src/main/java/com/mindspore/lite/config/DeviceType.java)中定义。 - 返回值 @@ -79,4 +79,4 @@ public boolean init() public void free() ``` -释放MSConfig运行过程中动态分配的内存。LiteSession init之后即可释放。 \ No newline at end of file +释放MSConfig运行过程中动态分配的内存。LiteSession init之后即可释放。 diff --git a/docs/api_java/source_zh_cn/mstensor.md b/docs/api_java/source_zh_cn/mstensor.md index b6e582f38ffdf38b51d03c03be413beb2e8d2ab5..ce7db9822c35f5698e68dd4c275cadf8d1f1312e 100644 --- a/docs/api_java/source_zh_cn/mstensor.md +++ b/docs/api_java/source_zh_cn/mstensor.md @@ -103,7 +103,7 @@ public void setData(byte[] data) 设定MSTensor的输入数据。 - 参数 - - `data`: byte[]类型的输入数据。 + - `data`: byte[]类型的输入数据。 ```java public void setData(ByteBuffer data) @@ -112,7 +112,7 @@ public void setData(ByteBuffer data) 设定MSTensor的输入数据。 - 参数 - - `data`: 
ByteBuffer类型的输入数据。 + - `data`: ByteBuffer类型的输入数据。 ## size @@ -145,4 +145,3 @@ public void free() ``` 释放MSTensor运行过程中动态分配的内存。 - diff --git a/docs/faq/source_en/faq.md b/docs/faq/source_en/faq.md index 3162c71c3a99de2fb0a85d3523625e5af82fbd12..3f9e7dc728257e22f9be031d44a9a5c6c8de5e9e 100644 --- a/docs/faq/source_en/faq.md +++ b/docs/faq/source_en/faq.md @@ -98,6 +98,7 @@ A: You can write the frequently-used environment settings to `~/.bash_profile` o ### Verifying the Installation Q: After MindSpore is installed on a CPU of a PC, an error message `the pointer[session] is null` is displayed during code verification. The specific code is as follows. How do I verify whether MindSpore is successfully installed? + ```python import numpy as np from mindspore import Tensor @@ -194,7 +195,7 @@ A: The MindSpore CPU version can be installed on Windows 10. For details about t Q: What can I do if an error message `wrong shape of image` is displayed when I use a model trained by MindSpore to perform prediction on a `28 x 28` digital image with white text on a black background? -A: The MNIST gray scale image dataset is used for MindSpore training. Therefore, when the model is used, the data must be set to a `28 x 28 `gray scale image, that is, a single channel. +A: The MNIST gray scale image dataset is used for MindSpore training. Therefore, when the model is used, the data must be set to a `28 x 28` gray scale image, that is, a single channel.
@@ -291,4 +292,4 @@ A: For details about script or model migration, please visit the [MindSpore offi Q: Does MindSpore provide open-source e-commerce datasets? -A: No. Please stay tuned for updates on the [MindSpore official website](https://www.mindspore.cn/en). \ No newline at end of file +A: No. Please stay tuned for updates on the [MindSpore official website](https://www.mindspore.cn/en). diff --git a/docs/faq/source_zh_cn/faq.md b/docs/faq/source_zh_cn/faq.md index c698b8e99038a66c38a50d16837edf8ef08c297b..4637908446d259eaacb7e1cac762acfe3c340d87 100644 --- a/docs/faq/source_zh_cn/faq.md +++ b/docs/faq/source_zh_cn/faq.md @@ -103,6 +103,7 @@ A:常用的环境变量设置写入到`~/.bash_profile` 或 `~/.bashrc`中, ### 安装验证 Q:个人电脑CPU环境安装MindSpore后验证代码时报错:`the pointer[session] is null`,具体代码如下,该如何验证是否安装成功呢? + ```python import numpy as np from mindspore import Tensor @@ -121,6 +122,19 @@ A:CPU硬件平台安装MindSpore后测试是否安装成功,只需要执行命 ## 算子支持 +Q:`nn.Embedding`层与PyTorch相比缺少了`Padding`操作,有其余的算子可以实现吗? + +A:在PyTorch中`padding_idx`的作用是将embedding矩阵中`padding_idx`位置的词向量置为0,并且反向传播时不会更新`padding_idx`位置的词向量。在MindSpore中,可以手动将embedding的`padding_idx`位置对应的权重初始化为0,并且在训练时通过`mask`的操作,过滤掉`padding_idx`位置对应的`Loss`。 + +
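The `padding_idx` workaround described in this answer (zero the embedding row for the padding token, then mask the loss at padding positions) can be illustrated with a minimal pure-Python sketch. The helper names `init_embedding` and `masked_mean_loss` and the value of `PADDING_IDX` are hypothetical and are not MindSpore APIs; a real network would presumably express the same two steps with tensor operations.

```python
PADDING_IDX = 0  # assumed id of the padding token (illustrative only)


def init_embedding(vocab_size, dim, padding_idx, fill=0.1):
    """Build a toy embedding table whose padding row is all zeros."""
    table = [[fill] * dim for _ in range(vocab_size)]
    table[padding_idx] = [0.0] * dim
    return table


def masked_mean_loss(token_ids, per_token_loss, padding_idx):
    """Average the per-token loss over non-padding positions only."""
    mask = [0.0 if t == padding_idx else 1.0 for t in token_ids]
    kept = sum(mask)
    if kept == 0:
        return 0.0
    return sum(l * m for l, m in zip(per_token_loss, mask)) / kept


table = init_embedding(vocab_size=4, dim=3, padding_idx=PADDING_IDX)
# Two real tokens followed by two padding tokens: only the first two losses count.
loss = masked_mean_loss([2, 3, 0, 0], [1.0, 3.0, 5.0, 7.0], PADDING_IDX)
```

Because the padding positions contribute nothing to the loss, they also contribute no gradient, which is the effect the answer relies on to keep the padding row unchanged.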
+ +Q:Operations中`Tile`算子执行到`__infer__`时`value`值为`None`,丢失了数值是怎么回事? + +A:`Tile`算子的`multiples input`必须是一个常量(该值不能直接或间接来自于图的输入)。否则构图的时候会拿到一个`None`的数据,因为图的输入是在图执行的时候才传下去的,构图的时候拿不到图的输入数据。 +相关的资料可以看[相关文档](https://www.mindspore.cn/doc/note/zh-CN/master/constraints_on_network_construction.html)的“其他约束”。 + +
+ Q:官网的LSTM示例在Ascend上跑不通 A:目前LSTM只支持在GPU和CPU上运行,暂不支持硬件环境,您可以[点击这里](https://www.mindspore.cn/doc/note/zh-CN/master/operator_list_ms.html)查看算子支持情况。 @@ -135,6 +149,12 @@ A:这是TBE这个算子的限制,x的width必须大于kernel的width。CPU ## 网络模型 +Q:如何不将数据处理为MindRecord格式,直接进行训练呢? + +A:可以使用自定义的数据加载方式 `GeneratorDataset`,具体可以参考[数据集加载](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/dataset_loading.html)文档中的自定义数据集加载。 + +
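As a rough illustration of the custom loading approach mentioned in this answer, the sketch here defines a random-access source object with `__getitem__` and `__len__`. The class name `InMemoryDataset` and the column names are assumptions for illustration; the commented-out `GeneratorDataset` call shows how such an object would typically be passed in, and it is not executed here.

```python
class InMemoryDataset:
    """Random-access data source: implements __getitem__ and __len__."""

    def __init__(self, features, labels):
        if len(features) != len(labels):
            raise ValueError("features and labels must have the same length")
        self._features = features
        self._labels = labels

    def __getitem__(self, index):
        # Return one (feature, label) sample.
        return self._features[index], self._labels[index]

    def __len__(self):
        return len(self._features)


source = InMemoryDataset([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], [1, 1, 0])

# Assumed usage with MindSpore installed (not executed in this sketch):
# import mindspore.dataset as ds
# dataset = ds.GeneratorDataset(source, column_names=["data", "label"])
```

In practice the samples would usually be NumPy arrays rather than Python lists; plain lists are used here only to keep the sketch dependency-free.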
+ Q:MindSpore现支持直接读取哪些其他框架的模型和哪些格式呢?比如PyTorch下训练得到的pth模型可以加载到MindSpore框架下使用吗? A: MindSpore采用protobuf存储训练参数,无法直接读取其他框架的模型。对于模型文件本质保存的就是参数和对应的值,可以用其他框架的API将参数读取出来之后,拿到参数的键值对,然后再加载到MindSpore中使用。比如想用其他框架训练好的ckpt文件,可以先把参数读取出来,再调用MindSpore的`save_checkpoint`接口,就可以保存成MindSpore可以读取的ckpt文件格式了。 @@ -211,6 +231,30 @@ A:MindSpore CPU版本已经支持在Windows 10系统中安装,具体安装 ## 后端运行 +Q:MindSpore如何实现早停功能? + +A:可以自定义`callback`方法实现早停功能。 +例子:当loss降到一定数值后,停止训练。 + +```python +class EarlyStop(Callback): + def __init__(self, control_loss=1): + super(EarlyStop, self).__init__() + self._control_loss = control_loss + + def step_end(self, run_context): + cb_params = run_context.original_args() + loss = cb_params.net_outputs + if loss.asnumpy() < self._control_loss: + # Stop training + run_context._stop_requested = True + +stop_cb = EarlyStop(control_loss=1) +model.train(epoch_size, ds_train, callbacks=[stop_cb]) +``` + +
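The control flow behind the early-stop callback can be traced without MindSpore using a small stand-in. `FakeRunContext`, `EarlyStopSketch`, and `train` are hypothetical names invented for this sketch and are not MindSpore classes; they only mimic the idea that a callback inspects the loss after each step and sets a stop flag that the training loop checks.

```python
class FakeRunContext:
    """Hypothetical stand-in for a training run context (not a MindSpore class)."""

    def __init__(self):
        self.stop_requested = False
        self.loss = None


class EarlyStopSketch:
    """Request a stop once the loss drops below a threshold."""

    def __init__(self, control_loss=1.0):
        self._control_loss = control_loss

    def step_end(self, run_context):
        if run_context.loss < self._control_loss:
            run_context.stop_requested = True


def train(losses, callback):
    """Toy loop: feed one loss per step and stop when the callback asks to."""
    ctx = FakeRunContext()
    steps = 0
    for loss in losses:
        steps += 1
        ctx.loss = loss
        callback.step_end(ctx)
        if ctx.stop_requested:
            break
    return steps


# The third loss (0.9) is the first one below the threshold, so the loop ends there.
steps_run = train([2.5, 1.7, 0.9, 0.4], EarlyStopSketch(control_loss=1.0))
```

The stop flag is only checked after `step_end` returns, so the step that crosses the threshold still completes before training ends.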
+ Q:请问自己制作的黑底白字`28*28`的数字图片,使用MindSpore训练出来的模型做预测,报错提示`wrong shape of image`是怎么回事? A:首先MindSpore训练使用的灰度图MNIST数据集。所以模型使用时对数据是有要求的,需要设置为`28*28`的灰度图,就是单通道才可以。 @@ -275,6 +319,12 @@ A:MindSpore目前支持Python扩展,针对C++、Rust、Julia等语言的支 ## 特性支持 +Q:MindSpore有量化推理工具么? + +A:[MindSpore Lite](https://www.mindspore.cn/lite)支持云侧量化感知训练的量化模型的推理,MindSpore Lite converter工具提供训练后量化以及权重量化功能,且功能在持续加强完善中。 + +
+ Q:MindSpore并行模型训练的优势和特色有哪些? A:MindSpore分布式训练除了支持数据并行,还支持算子级模型并行,可以对算子输入tensor进行切分并行。在此基础上支持自动并行,用户只需要写单卡脚本,就能自动切分到多个节点并行执行。 @@ -325,4 +375,4 @@ A:关于脚本或者模型迁移,可以查询MindSpore官网中关于[网络 Q:MindSpore是否附带开源电商类数据集? -A:暂时还没有,可以持续关注[MindSpore官网](https://www.mindspore.cn)。 \ No newline at end of file +A:暂时还没有,可以持续关注[MindSpore官网](https://www.mindspore.cn)。 diff --git a/docs/note/source_en/benchmark.md b/docs/note/source_en/benchmark.md index 72779fa5d8d92d8add4344715d19cbde74b209e8..9a715e46ceeffbf7a5319b36667743a6dccabbc9 100644 --- a/docs/note/source_en/benchmark.md +++ b/docs/note/source_en/benchmark.md @@ -15,7 +15,7 @@ -This document describes the MindSpore benchmarks. +This document describes the MindSpore benchmarks. For details about the MindSpore networks, see [Model Zoo](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). ## Training Performance @@ -28,17 +28,17 @@ For details about the MindSpore networks, see [Model Zoo](https://gitee.com/mind | | | | | Ascend: 8 * Ascend 910
CPU: 192 Cores | Mixed | 256 | 16600 images/sec | 0.98 | | | | | | Ascend: 16 * Ascend 910
CPU: 384 Cores | Mixed | 256 | 32768 images/sec | 0.96 | -1. The preceding performance is obtained based on ModelArts, the HUAWEI CLOUD AI development platform. It is the average performance obtained by the Ascend 910 AI processor during the overall training process. +1. The preceding performance is obtained based on ModelArts, the HUAWEI CLOUD AI development platform. It is the average performance obtained by the Ascend 910 AI processor during the overall training process. 2. For details about other open source frameworks, see [ResNet-50 v1.5 for TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Classification/ConvNets/resnet50v1.5). ### BERT -| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | +| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | BERT-Large | Attention | zhwiki | 0.5.0-beta | Ascend: 1 * Ascend 910
CPU: 24 Cores | Mixed | 96 | 269 sentences/sec | - | | | | | | Ascend: 8 * Ascend 910
CPU: 192 Cores | Mixed | 96 | 2069 sentences/sec | 0.96 | -1. The preceding performance is obtained based on ModelArts, the HUAWEI CLOUD AI development platform. The network contains 24 hidden layers, the sequence length is 128 tokens, and the vocabulary contains 21128 tokens. +1. The preceding performance is obtained based on ModelArts, the HUAWEI CLOUD AI development platform. The network contains 24 hidden layers, the sequence length is 128 tokens, and the vocabulary contains 21128 tokens. 2. For details about other open source frameworks, see [BERT For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT). ### Wide & Deep (data parallel) @@ -46,7 +46,7 @@ For details about the MindSpore networks, see [Model Zoo](https://gitee.com/mind | Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | Wide & Deep | Recommend | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910
CPU: 24 Cores | Mixed | 16000 | 796892 samples/sec | - | -| | | | | Ascend: 8 * Ascend 910
CPU: 192 Cores | Mixed | 16000*8 | 4872849 samples/sec | 0.76 | +| | | | | Ascend: 8 \* Ascend 910
CPU: 192 Cores | Mixed | 16000*8 | 4872849 samples/sec | 0.76 | 1. The preceding performance is obtained based on Atlas 800, and the model is data parallel. 2. For details about other open source frameworks, see [Wide & Deep For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep). @@ -56,9 +56,9 @@ For details about the MindSpore networks, see [Model Zoo](https://gitee.com/mind | Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | Wide & Deep | Recommend | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910
CPU: 24 Cores | Mixed | 1000 | 68715 samples/sec | - | -| | | | | Ascend: 8 * Ascend 910
CPU: 192 Cores | Mixed | 8000*8 | 283830 samples/sec | 0.51 | -| | | | | Ascend: 16 * Ascend 910
CPU: 384 Cores | Mixed | 8000*16 | 377848 samples/sec | 0.34 | -| | | | | Ascend: 32 * Ascend 910
CPU: 768 Cores | Mixed | 8000*32 | 433423 samples/sec | 0.20 | +| | | | | Ascend: 8 \* Ascend 910
CPU: 192 Cores | Mixed | 8000*8 | 283830 samples/sec | 0.51 | +| | | | | Ascend: 16 \* Ascend 910
CPU: 384 Cores | Mixed | 8000*16 | 377848 samples/sec | 0.34 | +| | | | | Ascend: 32 \* Ascend 910
CPU: 768 Cores | Mixed | 8000*32 | 433423 samples/sec | 0.20 | 1. The preceding performance is obtained based on Atlas 800, and the model is model parallel. 2. For details about other open source frameworks, see [Wide & Deep For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep). diff --git a/docs/note/source_en/constraints_on_network_construction.md b/docs/note/source_en/constraints_on_network_construction.md index 0c0a4f2cba6697639c016d1fb2618874363372a0..5936a2cf35f1d6c530dfb7327577dde135b761e9 100644 --- a/docs/note/source_en/constraints_on_network_construction.md +++ b/docs/note/source_en/constraints_on_network_construction.md @@ -1,7 +1,7 @@ # Constraints on Network Construction Using Python `Linux` `Ascend` `GPU` `CPU` `Model Development` `Beginner` `Intermediate` `Expert` - + - [Constraints on Network Construction Using Python](#constraints-on-network-construction-using-python) @@ -28,21 +28,26 @@ ## Overview + MindSpore can compile user source code based on the Python syntax into computational graphs, and can convert common functions or instances inherited from nn.Cell into computational graphs. Currently, MindSpore does not support conversion of any Python source code into computational graphs. Therefore, there are constraints on source code compilation, including syntax constraints and network definition constraints. As MindSpore evolves, the constraints may change. ## Syntax Constraints + ### Supported Python Data Types -* Number: supports `int`, `float`, and `bool`. Complex numbers are not supported. -* String -* List: supports the append method only. Updating a list will generate a new list. -* Tuple -* Dictionary: The type of key should be String. + +- Number: supports `int`, `float`, and `bool`. Complex numbers are not supported. +- String +- List: supports the append method only. Updating a list will generate a new list. +- Tuple +- Dictionary: The type of key should be String. 
+ ### MindSpore Extended Data Type -* Tensor: Tensor variables must be defined instances. + +- Tensor: Tensor variables must be defined instances. ### Expression Types -| Operation | Description +| Operation | Description | :----------- |:-------- | Unary operator |`+`,`-`, and`not`. The operator `+` supports only scalars. | Binary operator |`+`, `-`, `*`, `/`, `%`, `**` and `//`. @@ -81,10 +86,11 @@ | `isinstance` | The usage principle is consistent with Python, but the second input parameter can only be the type defined by mindspore. ### Function Parameters -* Default parameter value: The data types `int`, `float`, `bool`, `None`, `str`, `tuple`, `list`, and `dict` are supported, whereas `Tensor` is not supported. -* Variable parameter: Functions with variable arguments is supported for training and inference. -* Key-value pair parameter: Functions with key-value pair parameters cannot be used for backward propagation on computational graphs. -* Variable key-value pair parameter: Functions with variable key-value pairs cannot be used for backward propagation on computational graphs. + +- Default parameter value: The data types `int`, `float`, `bool`, `None`, `str`, `tuple`, `list`, and `dict` are supported, whereas `Tensor` is not supported. +- Variable parameter: Functions with variable arguments is supported for training and inference. +- Key-value pair parameter: Functions with key-value pair parameters cannot be used for backward propagation on computational graphs. +- Variable key-value pair parameter: Functions with variable key-value pairs cannot be used for backward propagation on computational graphs. ### Operators @@ -101,55 +107,56 @@ ### Index operation -The index operation includes `tuple` and` Tensor`. The following focuses on the index value assignment and assignment operation of `Tensor`. The value takes` tensor_x [index] `as an example, and the assignment takes` tensor_x [index] = u` as an example for detailed description. 
Among them, tensor_x is a `Tensor`, which is sliced; index means the index, u means the assigned value, which can be` scalar` or `Tensor (size = 1)`. The index types are as follows: +The index operation includes `tuple` and `Tensor`. The following focuses on the index value operation and the assignment operation of `Tensor`. The value takes `tensor_x [index]` as an example, and the assignment takes `tensor_x [index] = u` as an example for detailed description. Among them, `tensor_x` is a `Tensor`, which is sliced; `index` means the index, and `u` means the assigned value, which can be `scalar` or `Tensor (size = 1)`. The index types are as follows: - Slice index: index is `slice` - - Value: `tensor_x[start: stop: step]`, where Slice (start: stop: step) has the same syntax as Python, and will not be repeated here. - - Assignment: `tensor_x[start: stop: step] = u`. + - Value: `tensor_x[start: stop: step]`, where Slice (start: stop: step) has the same syntax as Python, and will not be repeated here. + - Assignment: `tensor_x[start: stop: step] = u`. - Ellipsis index: index is `ellipsis` - - Value: `tensor_x [...]`. - - Assignment: `tensor_x [...] = u`. + - Value: `tensor_x [...]`. + - Assignment: `tensor_x [...] = u`. - Boolean constant index: index is `True`, index is `False` is not supported temporarily. - - Value: `tensor_x[True]`. - - Assignment: Not supported yet. + - Value: `tensor_x[True]`. + - Assignment: Not supported yet. - Tensor index: index is `Tensor` - - Value: `tensor_x [index]`, `index` must be `Tensor` of data type `int32` or `int64`, + - Value: `tensor_x [index]`, `index` must be `Tensor` of data type `int32` or `int64`, the element value range is `[0, tensor_x.shape[0])`. - - Assignment: `tensor_x [index] = U`. - - `tensor_x` data type must be one of the following: `float16`, `float32`, `int8`, `uint8`. - - `index` must be `Tensor` of data type `int32`, the element value range is `[0, tensor_x.shape [0])`.
- - `U` can be `Number`, `Tensor`, `Tuple` only containing `Number`, `Tuple` only containing `Tensor`. - - Single `Number` or every `Number` in `Tuple` must be the same type as `tensor_x`, ie + - Assignment: `tensor_x [index] = U`. + - `tensor_x` data type must be one of the following: `float16`, `float32`, `int8`, `uint8`. + - `index` must be `Tensor` of data type `int32`, the element value range is `[0, tensor_x.shape [0])`. + - `U` can be `Number`, `Tensor`, `Tuple` only containing `Number`, `Tuple` only containing `Tensor`. + - Single `Number` or every `Number` in `Tuple` must be the same type as `tensor_x`, ie When the data type of `tensor_x` is `uint8` or `int8`, the `Number` type should be `int`; When the data type of `tensor_x` is `float16` or `float32`, the `Number` type should be `float`. - - Single `Tensor` or every `Tensor in Tuple` must be consistent with the data type of `tensor_x`, + - Single `Tensor` or every `Tensor in Tuple` must be consistent with the data type of `tensor_x`, when single `Tensor`, the `shape` should be equal to or broadcast as `index.shape + tensor_x.shape [1:]`. - - `Tuple` containing `Number` must meet requirement: + - `Tuple` containing `Number` must meet requirement: `len (Tuple) = (index.shape + tensor_x.shape [1:]) [-1]`. - - `Tuple` containing `Tensor` must meet requirements: + - `Tuple` containing `Tensor` must meet requirements: the `shape` of each `Tensor` should be the same, `(len (Tuple),) + Tensor.shape` should be equal to or broadcast as `index.shape + tensor_x.shape [1:]`. - None constant index: index is `None` - - Value: `tensor_x[None]`, results are consistent with numpy. - - Assignment: Not supported yet. + - Value: `tensor_x[None]`, results are consistent with numpy. + - Assignment: Not supported yet. - tuple index: index is `tuple` - - The tuple element is a slice: - - Value: for example `tensor_x[::,: 4, 3: 0: -1]`. - - Assignment: for example `tensor_x[::,: 4, 3: 0: -1] = u`. 
- - The tuple element is Number: - - Value: for example `tensor_x[2,1]`. - - Assignment: for example `tensor_x[1,4] = u`. - - The tuple element is a mixture of slice and ellipsis: - - Value: for example `tensor_x[..., ::, 1:]`. - - Assignment: for example `tensor_x[..., ::, 1:] = u`. - - Not supported in other situations + - The tuple element is a slice: + - Value: for example `tensor_x[::,: 4, 3: 0: -1]`. + - Assignment: for example `tensor_x[::,: 4, 3: 0: -1] = u`. + - The tuple element is Number: + - Value: for example `tensor_x[2,1]`. + - Assignment: for example `tensor_x[1,4] = u`. + - The tuple element is a mixture of slice and ellipsis: + - Value: for example `tensor_x[..., ::, 1:]`. + - Assignment: for example `tensor_x[..., ::, 1:] = u`. + - Not supported in other situations The index value operation of tuple and list type, we need to focus on the index value operation of tuple or list whose element type is `nn.Cell`. This operation is currently only supported by the GPU backend in Graph mode, and its syntax format is like `layers[index](*inputs)`, the example code is as follows: + ```python class Net(nn.Cell): def __init__(self): @@ -162,60 +169,68 @@ The index value operation of tuple and list type, we need to focus on the index x = self.layers[index](x) return x ``` + The grammar has the following constraints: -* Only the index value operation of tuple or list whose element type is `nn.Cell` is supported. -* The index is a scalar `Tensor` of type `int32`, with a value range of `[-n, n)`, where `n` is the size of the tuple, and the maximum supported tuple size is 1000. -* The number, type and shape of the input data of the `Construct` function of each Cell element in the tuple are the same, and the number of data output after the `Construct` function runs, the type and shape are also the same. -* Each element in the tuple needs to be defined before the tuple is defined. 
-* This syntax does not support running branches as if, while, for and other control flow, except if the control condition of the control flow is constant. for example: - - Supported example: - ```python - class Net(nn.Cell): - def __init__(self, flag=True): - super(Net, self).__init__() - self.flag = flag - self.relu = nn.ReLU() - self.softmax = nn.Softmax() - self.layers = (self.relu, self.softmax) - def construct(self, x, index): - if self.flag: - x = self.layers[index](x) - return x - ``` - - Unsupported example: - ```python - class Net(nn.Cell): - def __init__(self): - super(Net, self).__init__() - self.relu = nn.ReLU() - self.softmax = nn.Softmax() - self.layers = (self.relu, self.softmax) +- Only the index value operation of tuple or list whose element type is `nn.Cell` is supported. +- The index is a scalar `Tensor` of type `int32`, with a value range of `[-n, n)`, where `n` is the size of the tuple, and the maximum supported tuple size is 1000. +- The number, type and shape size of the input data of the `Construct` function of each Cell element in the tuple are the same, and the number of data output after the `Construct` function runs, the type and shape size are also the same. +- Each element in the tuple needs to be defined before the tuple is defined. +- This syntax does not support running branches as if, while, for and other control flow, except if the control condition of the control flow is constant. 
for example: + - Supported example: - def construct(self, x, index, flag): - if flag: - x = self.layers[index](x) - return x - ``` + ```python + class Net(nn.Cell): + def __init__(self, flag=True): + super(Net, self).__init__() + self.flag = flag + self.relu = nn.ReLU() + self.softmax = nn.Softmax() + self.layers = (self.relu, self.softmax) + + def construct(self, x, index): + if self.flag: + x = self.layers[index](x) + return x + ``` + + - Unsupported example: + + ```python + class Net(nn.Cell): + def __init__(self): + super(Net, self).__init__() + self.relu = nn.ReLU() + self.softmax = nn.Softmax() + self.layers = (self.relu, self.softmax) + + def construct(self, x, index, flag): + if flag: + x = self.layers[index](x) + return x + ``` Tuple also support slice value operations, but do not support slice type as Tensor, support `tuple_x [start: stop: step]`, which has the same effect as Python, and will not be repeated here. ### Unsupported Syntax -Currently, the following syntax is not supported in network constructors: +Currently, the following syntax is not supported in network constructors: `raise`, `yield`, `async for`, `with`, `async with`, `assert`, `import`, and `await`. ## Network Definition Constraints ### Instance Types on the Entire Network -* Common Python function with the [@ms_function](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.html#mindspore.ms_function) decorator. -* Cell subclass inherited from [nn.Cell](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.nn.html#mindspore.nn.Cell). + +- Common Python function with the [@ms_function](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.html#mindspore.ms_function) decorator. +- Cell subclass inherited from [nn.Cell](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.nn.html#mindspore.nn.Cell). ### Network Input Type -* The training data input parameters of the entire network must be of the Tensor type. 
-* The generated ANF diagram cannot contain the following constant nodes: string constants, constants with nested tuples, and constants with nested lists. + +- The training data input parameters of the entire network must be of the Tensor type. +- The generated ANF diagram cannot contain the following constant nodes: string constants, constants with nested tuples, and constants with nested lists. ### Network Graph Optimization + During graph optimization at the ME frontend, the dataclass, dictionary, list, and key-value pair types are converted to tuple types, and the corresponding operations are converted to tuple operations. ### Network Construction Components @@ -230,33 +245,36 @@ Currently, the following syntax is not supported in network constructors: | Composite operator |[mindspore/ops/composite/*](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.ops.html). | Operator generated by constexpr |Uses the value generated by [@constexpr](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.ops.html#mindspore.ops.constexpr) to calculate operators. - ### Other Constraints + 1. Input parameters of the `construct` function on the entire network and parameters of functions modified by the `ms_function` decorator are generalized during the graph compilation and cannot be passed to operators as constant input. Therefore, in graph mode, the parameter passed to the entry network can only be `Tensor`. 
As shown in the following example: - - * The following is an example of incorrect input: + + - The following is an example of incorrect input: + ```python class ExpandDimsTest(Cell): def __init__(self): super(ExpandDimsTest, self).__init__() self.expandDims = P.ExpandDims() - + def construct(self, input_x, input_axis): return self.expandDims(input_x, input_axis) expand_dim = ExpandDimsTest() input_x = Tensor(np.random.randn(2,2,2,2).astype(np.float32)) expand_dim(input_x, 0) ``` + In the example, `ExpandDimsTest` is a single-operator network with two inputs: `input_x` and `input_axis`. The second input of the `ExpandDims` operator must be a constant. This is because `input_axis` is required when the output dimension of the `ExpandDims` operator is deduced during graph compilation. As the network parameter input, the value of `input_axis` is generalized into a variable and cannot be determined. As a result, the output dimension of the operator cannot be deduced, causing the graph compilation failure. Therefore, the input required by deduction in the graph compilation phase must be a constant. In the API, the parameters of this type of operator that require constant input will be explained, marked `const input is needed`. - - * Directly enter the needed value or a member variable in a class for the constant input of the operator in the construct function. The following is an example of correct input: + + - Directly enter the needed value or a member variable in a class for the constant input of the operator in the construct function. 
The following is an example of correct input: + ```python class ExpandDimsTest(Cell): def __init__(self, axis): super(ExpandDimsTest, self).__init__() self.expandDims = P.ExpandDims() self.axis = axis - + def construct(self, input_x): return self.expandDims(input_x, self.axis) axis = 0 @@ -264,52 +282,56 @@ Currently, the following syntax is not supported in network constructors: input_x = Tensor(np.random.randn(2,2,2,2).astype(np.float32)) expand_dim(input_x) ``` - + 2. It is not allowed to modify `non-Parameter` type data members of the network. Examples are as follows: - ``` + ```python class Net(Cell): def __init__(self): super(Net, self).__init__() self.num = 2 self.par = Parameter(Tensor(np.ones((2, 3, 4))), name="par") - + def construct(self, x, y): return x + y ``` + In the network defined above, `self.num` is not a `Parameter` and cannot be modified, but `self.par` is a `Parameter` and can be modified. 3. When an undefined class member is used in the `construct` function, it will be treated as `None` instead of throwing `AttributeError` like the Python interpreter. Examples are as follows: - - ``` + + ```python class Net(Cell): def __init__(self): super(Net, self).__init__() - + def construct(self, x): return x + self.y ``` + In the network defined above, the undefined class member `self.y` is used in `construct`, and `self.y` will be treated as `None`. 4. When using the control flow of `if-else` in the `construct` function, the data types returned by `if` and `else` or the data types of the same variable after being updated must be the same. Examples are as follows: - ``` + + ```python class NetReturn(Cell): def __init__(self): super(NetReturn, self).__init__() - + def construct(self, x, y, m, n): if x > y: return m else: return n ``` + In the network `NetReturn` defined above, the `if-else` control flow is used in `construct`, then the data type of `m` returned by the `if` branch and the data type of `n` returned by the `else` branch must be consistent. 
- - ``` + + ```python class NetAssign(Cell): def __init__(self): super(NetAssign, self).__init__() - + def construct(self, x, y, m, n): if x > y: out = m @@ -317,4 +339,5 @@ Currently, the following syntax is not supported in network constructors: out = n return out ``` + In the network `NetAssign` defined above, the `if-else` control flow is used in the `construct`, then the data types of the `out` after the update of the `if` branch and the update of the `else` branch must be consistent. diff --git a/docs/note/source_en/design/mindarmour/differential_privacy_design.md b/docs/note/source_en/design/mindarmour/differential_privacy_design.md index c57af833f17e3642d4719def21170d44e60626af..7038864e653ba412238865bae8c9e12e72f7a735 100644 --- a/docs/note/source_en/design/mindarmour/differential_privacy_design.md +++ b/docs/note/source_en/design/mindarmour/differential_privacy_design.md @@ -26,14 +26,13 @@ The Differential-Privacy module of MindArmour implements the differential privac Figure 1 shows an overall design of differential privacy training, and mainly including differential privacy noise mechanisms (DP mechanisms), a differential privacy optimizer (DP optimizer), and a privacy monitor. - ### DP Optimizer DP optimizer inherits capabilities of the MindSpore optimizer and uses the DP mechanisms to scramble and protect gradients. Currently, MindArmour provides three types of DP optimizers: constant Gaussian optimizer, adaptive Gaussian optimizer, and adaptive clipping optimizer. Each type of DP optimizer adds differential privacy protection capabilities to common optimizers such as SGD and Momentum from different perspectives. -* Constant Gaussian optimizer is a DP optimizer for non-adaptive Gaussian noise. The advantage is that the differential privacy budget ϵ can be strictly controlled. The disadvantage is that in the model training process, the noise amount added in each step is fixed. 
If the number of training steps is too large, the noise in the later phase of training makes the model convergence difficult, or even causes the performance to deteriorate greatly and the model availability to be poor. -* Adaptive Gaussian optimizer adaptively adjusts the standard deviation to adjust the Gaussian distribution noise. In the initial phase of model training, a large amount of noise is added. As the model gradually converges, the noise amount gradually decreases, and the impact of the noise on the model availability is reduced. A disadvantage of the adaptive Gaussian noise is that a differential privacy budget cannot be strictly controlled. -* Adaptive clipping optimizer is a DP optimizer that adaptively adjusts a clipping granularity. Gradient clipping is an important operation in differential privacy training. The adaptive clipping optimizer can control a ratio of gradient clipping to fluctuate within a given range and control the gradient clipping granularity during training steps. +- Constant Gaussian optimizer is a DP optimizer for non-adaptive Gaussian noise. The advantage is that the differential privacy budget ϵ can be strictly controlled. The disadvantage is that in the model training process, the noise amount added in each step is fixed. If the number of training steps is too large, the noise in the later phase of training makes the model convergence difficult, or even causes the performance to deteriorate greatly and the model availability to be poor. +- Adaptive Gaussian optimizer adaptively adjusts the standard deviation to adjust the Gaussian distribution noise. In the initial phase of model training, a large amount of noise is added. As the model gradually converges, the noise amount gradually decreases, and the impact of the noise on the model availability is reduced. A disadvantage of the adaptive Gaussian noise is that a differential privacy budget cannot be strictly controlled. 
+- Adaptive clipping optimizer is a DP optimizer that adaptively adjusts a clipping granularity. Gradient clipping is an important operation in differential privacy training. The adaptive clipping optimizer can control a ratio of gradient clipping to fluctuate within a given range and control the gradient clipping granularity during training steps. ### DP Mechanisms @@ -41,26 +40,24 @@ The noise mechanism is a basis for building a differential privacy training capa ### Monitor -Monitor provides callback functions such as Rényi differential privacy (RDP) and zero-concentrated differential privacy (ZCDP) to monitor the differential privacy budget of the model. +Monitor provides callback functions such as Rényi differential privacy (RDP) and zero-concentrated differential privacy (ZCDP) to monitor the differential privacy budget of the model. -* ZCDP[2] +- ZCDP[2] ZCDP is a loose differential privacy definition. It uses the Rényi divergence to measure the distribution difference of random functions on adjacent datasets. -* RDP[3] +- RDP[3] RDP is a more general differential privacy definition based on the Rényi divergence. It uses the Rényi divergence to measure the distribution difference between two adjacent datasets. - Compared with traditional differential privacy, ZCDP and RDP provide stricter privacy budget upper bound guarantee. - ## Code Implementation -* [mechanisms.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/mechanisms/mechanisms.py): implements the noise generation mechanism required by differential privacy training, including simple Gaussian noise, adaptive Gaussian noise, and adaptive clipping Gaussian noise. -* [optimizer.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/optimizer/optimizer.py): implements the fundamental logic of using the noise generation mechanism to add noise during backward propagation. 
-* [monitor.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/monitor/monitor.py): implements the callback function for computing the differential privacy budget. During model training, the current differential privacy budget is returned. -* [model.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/train/model.py): implements the logic of computing the loss and gradient as well as the gradient truncation logic of differential privacy training, which is the entry for users to use the differential privacy training capability. +- [mechanisms.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/mechanisms/mechanisms.py): implements the noise generation mechanism required by differential privacy training, including simple Gaussian noise, adaptive Gaussian noise, and adaptive clipping Gaussian noise. +- [optimizer.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/optimizer/optimizer.py): implements the fundamental logic of using the noise generation mechanism to add noise during backward propagation. +- [monitor.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/monitor/monitor.py): implements the callback function for computing the differential privacy budget. During model training, the current differential privacy budget is returned. +- [model.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/train/model.py): implements the logic of computing the loss and gradient as well as the gradient truncation logic of differential privacy training, which is the entry for users to use the differential privacy training capability. 
## References diff --git a/docs/note/source_en/design/mindarmour/fuzzer_design.md b/docs/note/source_en/design/mindarmour/fuzzer_design.md index dbb4feeb0a5ce1989b0bf9178892787740ff369f..34cfc563dc10d75662950114a1db2337fe5f9596 100644 --- a/docs/note/source_en/design/mindarmour/fuzzer_design.md +++ b/docs/note/source_en/design/mindarmour/fuzzer_design.md @@ -2,7 +2,6 @@ `Linux` `Ascend` `GPU` `CPU` `Data Preparation` `Model Development` `Model Training` `Model Optimization` `Enterprise` `Expert` - - [AI Model Security Testing](#ai-model-security-testing) @@ -71,4 +70,4 @@ Through multiple rounds of mutations, you can obtain a series of variant data in [1] Pei K, Cao Y, Yang J, et al. Deepxplore: Automated whitebox testing of deep learning systems[C]//Proceedings of the 26th Symposium on Operating Systems Principles. ACM, 2017: 1-18. -[2] Ma L, Juefei-Xu F, Zhang F, et al. Deepgauge: Multi-granularity testing criteria for deep learning systems[C]//Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 2018: 120-131. \ No newline at end of file +[2] Ma L, Juefei-Xu F, Zhang F, et al. Deepgauge: Multi-granularity testing criteria for deep learning systems[C]//Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering. ACM, 2018: 120-131. diff --git a/docs/note/source_en/design/mindinsight/graph_visual_design.md b/docs/note/source_en/design/mindinsight/graph_visual_design.md index a0e0c8918f537edeeb2586ca23b7bc25f310e16c..8633d64951454033d95e05a8302c4cdeb825a59d 100644 --- a/docs/note/source_en/design/mindinsight/graph_visual_design.md +++ b/docs/note/source_en/design/mindinsight/graph_visual_design.md @@ -21,9 +21,9 @@ The computational graph visualization function is mainly used in the following scenarios: - - View a data flow direction of operators and a model structure when programming a deep learning neural network. 
- - View the input and output nodes of a specified node and attributes of a queried node. - - Trace data, including data dimension and type changes when debugging a network. +- View a data flow direction of operators and a model structure when programming a deep learning neural network. +- View the input and output nodes of a specified node and attributes of a queried node. +- Trace data, including data dimension and type changes when debugging a network. ## Overall Design @@ -71,4 +71,4 @@ RESTful API is used for data interaction between the MindInsight frontend and ba #### File API Design Data interaction between MindSpore and MindInsight uses the data format defined by [Protocol Buffer](https://developers.google.cn/protocol-buffers/docs/pythontutorial). -The main entry is the [summary.proto file](https://gitee.com/mindspore/mindinsight/blob/master/mindinsight/datavisual/proto_files/mindinsight_summary.proto). A message object of a computational graph is defined as `GraphProto`. For details about `GraphProto`, see the [anf_ir.proto file](https://gitee.com/mindspore/mindinsight/blob/master/mindinsight/datavisual/proto_files/mindinsight_anf_ir.proto). \ No newline at end of file +The main entry is the [summary.proto file](https://gitee.com/mindspore/mindinsight/blob/master/mindinsight/datavisual/proto_files/mindinsight_summary.proto). A message object of a computational graph is defined as `GraphProto`. For details about `GraphProto`, see the [anf_ir.proto file](https://gitee.com/mindspore/mindinsight/blob/master/mindinsight/datavisual/proto_files/mindinsight_anf_ir.proto). 
diff --git a/docs/note/source_en/design/mindinsight/tensor_visual_design.md b/docs/note/source_en/design/mindinsight/tensor_visual_design.md index f21b670f1f37133fb8909b2ae1ed009f9f960f32..86a364148c6002864ea62d9c5b38bda03775674c 100644 --- a/docs/note/source_en/design/mindinsight/tensor_visual_design.md +++ b/docs/note/source_en/design/mindinsight/tensor_visual_design.md @@ -60,7 +60,8 @@ In tensor visualization, there are file API and RESTful API. The file API is the #### File API Design The `summary.proto` file is the main entry. TensorProto data is stored in the summary value, as shown in the following: -``` + +```cpp { message Summary { message Image { @@ -69,7 +70,7 @@ The `summary.proto` file is the main entry. TensorProto data is stored in the su required int32 width = 2; ... } - + message Histogram { message bucket{ // Counting number of values fallen in [left, left + width). @@ -78,7 +79,7 @@ The `summary.proto` file is the main entry. TensorProto data is stored in the su required double width = 2; required int64 count = 3; } - + repeated bucket buckets = 1; ... } @@ -86,7 +87,7 @@ The `summary.proto` file is the main entry. TensorProto data is stored in the su message Value { // Tag name for the data. required string tag = 1; - + // Value associated with the tag. oneof value { float scalar_value = 3; @@ -100,4 +101,5 @@ The `summary.proto` file is the main entry. TensorProto data is stored in the su repeated Value value = 1; } ``` -TensorProto is defined in the [anf_ir.proto](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/utils/anf_ir.proto) file. \ No newline at end of file + +TensorProto is defined in the [anf_ir.proto](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/utils/anf_ir.proto) file. 
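The `Histogram` message above describes each bucket by `left`, `width`, and `count`, where `count` is the number of values that fall in the half-open interval `[left, left + width)`. The following is a minimal Python sketch of that counting rule, not MindInsight's actual implementation. The helper name `bucketize`, the equal-width layout, and the choice to drop out-of-range values are assumptions made for illustration only.

```python
import math


def bucketize(values, left, width, num_buckets):
    """Count values into equal-width, half-open buckets.

    Bucket i covers [left + i * width, left + (i + 1) * width).
    Hypothetical helper: values outside all buckets are ignored (an assumption).
    """
    counts = [0] * num_buckets
    for value in values:
        index = math.floor((value - left) / width)
        if 0 <= index < num_buckets:
            counts[index] += 1
    # Mirror the field names shown in the message: left, width, count.
    return [
        {"left": left + i * width, "width": width, "count": count}
        for i, count in enumerate(counts)
    ]


buckets = bucketize([0.0, 0.5, 1.0, 1.5, 2.0], left=0.0, width=1.0, num_buckets=2)
for bucket in buckets:
    print(bucket)
```

In this sketch, `1.0` lands in the second bucket rather than the first, and `2.0` is not counted, because the right edge of each interval is exclusive.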
diff --git a/docs/note/source_en/design/mindinsight/training_visual_design.md b/docs/note/source_en/design/mindinsight/training_visual_design.md index 86451518c897bd241b0900664cecc207c2c3ce07..05cadfe220ab59d397cc4e2342d2fbf6d43325b6 100644 --- a/docs/note/source_en/design/mindinsight/training_visual_design.md +++ b/docs/note/source_en/design/mindinsight/training_visual_design.md @@ -5,16 +5,16 @@ - [Overall Design of Training Visualization](#overall-design-of-training-visualization) - - [Logical Architecture of Training Visualization](#logical-architecture-of-training-visualization) - - [Architecture of Training Information Collection](#architecture-of-training-information-collection) - - [Architecture of Training Information Analysis and Display](#architecture-of-training-information-analysis-and-display) - - [Code Organization](#code-organization) - - [Training Visualization Data Model](#training-visualization-data-model) - - [Training Information Data Flow](#training-information-data-flow) - - [Data Model](#data-model) - - [Training Job](#training-job) - - [Lineage Data](#lineage-data) - - [Training Process Data](#training-process-data) + - [Logical Architecture of Training Visualization](#logical-architecture-of-training-visualization) + - [Architecture of Training Information Collection](#architecture-of-training-information-collection) + - [Architecture of Training Information Analysis and Display](#architecture-of-training-information-analysis-and-display) + - [Code Organization](#code-organization) + - [Training Visualization Data Model](#training-visualization-data-model) + - [Training Information Data Flow](#training-information-data-flow) + - [Data Model](#data-model) + - [Training Job](#training-job) + - [Lineage Data](#lineage-data) + - [Training Process Data](#training-process-data) @@ -105,7 +105,7 @@ MindInsight uses directories to distinguish different training jobs. To distingu In MindInsight code, a training job is called a TrainJob. 
A TrainJob ID is the name of the directory where the training log is located, for example, ./train_my_lenet_1. -During a training process, a lineage data file (whose name ends with _lineage) and a training process data file (whose name ends with _MS) are generated. The lineage data mainly describes an invariant attribute of the training from a global perspective, for example, a dataset path used for training, an optimizer used for training, and user-defined lineage information. The most prominent feature of the lineage data file is that it does not change during the training process. The training process data mainly describes a change status of the training, for example, a loss value, parameter distribution, and image data sent to the model in a step. The most prominent feature of the training process data file is that each step changes. +During a training process, a lineage data file (whose name ends with `_lineage`) and a training process data file (whose name ends with `_MS`) are generated. The lineage data mainly describes an invariant attribute of the training from a global perspective, for example, a dataset path used for training, an optimizer used for training, and user-defined lineage information. The most prominent feature of the lineage data file is that it does not change during the training process. The training process data mainly describes a change status of the training, for example, a loss value, parameter distribution, and image data sent to the model in a step. The most prominent feature of the training process data file is that it changes at each step. It should be noted that the classification about whether the training information changes is not absolute. For example, the training process data file contains computational graph data, which is determined when the training starts. 
@@ -129,4 +129,4 @@ The lineage data describes the invariant attribute of a training from a global p - Data Query and Display - When displaying data, you might want to see how the data under a tag changes with the training process. Therefore, when querying data, you do not need to specify the step number. Instead, you can specify the training job, plugin name, and tag to query data of all steps under the tag. \ No newline at end of file + When displaying data, you might want to see how the data under a tag changes with the training process. Therefore, when querying data, you do not need to specify the step number. Instead, you can specify the training job, plugin name, and tag to query data of all steps under the tag. diff --git a/docs/note/source_en/design/mindspore/architecture_lite.md b/docs/note/source_en/design/mindspore/architecture_lite.md index d78fc89147c30eea967c849d5291ce61ed0d42d6..05c69cede3232a2db8a890f63606dc79cede1ba2 100644 --- a/docs/note/source_en/design/mindspore/architecture_lite.md +++ b/docs/note/source_en/design/mindspore/architecture_lite.md @@ -1,7 +1,7 @@ # Overall Architecture (Lite) `Linux` `Windows` `On Device` `Inference Application` `Intermediate` `Expert` `Contributor` - + The overall architecture of MindSpore Lite is as follows: @@ -14,8 +14,8 @@ The overall architecture of MindSpore Lite is as follows: - **Backend:** optimizes graphs based on IR, including graph high level optimization (GHLO), graph low level optimization (GLLO), and quantization. GHLO is responsible for hardware-independent optimization, such as operator fusion and constant folding. GLLO is responsible for hardware-related optimization. Quantizer supports quantization methods after training, such as weight quantization and activation value quantization. -- **Runtime:** inference runtime of intelligent devices. Sessions are responsible for session management and provide external APIs. 
The thread pool and parallel primitives are responsible for managing the thread pool used for graph execution. Memory allocation is responsible for memory overcommitment of each operator during graph execution. The operator library provides the CPU and GPU operators. +- **Runtime:** inference runtime of intelligent devices. Sessions are responsible for session management and provide external APIs. The thread pool and parallel primitives are responsible for managing the thread pool used for graph execution. Memory allocation is responsible for memory overcommitment of each operator during graph execution. The operator library provides the CPU and GPU operators. - **Micro:** runtime of IoT devices, including the model generation .c file, thread pool, memory overcommitment, and operator library. -Runtime and Micro share the underlying infrastructure layers, such as the operator library, memory allocation, thread pool, and parallel primitives. +Runtime and Micro share the underlying infrastructure layers, such as the operator library, memory allocation, thread pool, and parallel primitives. diff --git a/docs/note/source_en/design/mindspore/distributed_training_design.md b/docs/note/source_en/design/mindspore/distributed_training_design.md index 79de486296bad80e349dcc714ec16a35067c375e..cf963d8a9f819eeecf08184300edf060361f3834 100644 --- a/docs/note/source_en/design/mindspore/distributed_training_design.md +++ b/docs/note/source_en/design/mindspore/distributed_training_design.md @@ -24,7 +24,6 @@ With the rapid development of deep learning, the number of datasets and parameters are growing exponentially to improve the accuracy and generalization capability of neural networks. Parallel distributed training has become a development trend to resolve the performance bottleneck of ultra-large scale networks. MindSpore supports the mainstream distributed training paradigm and develops an automatic hybrid parallel solution. 
The following describes the design principles of several parallel training modes and provides guidance for users to perform custom development. - ## Concepts ### Collective Communication @@ -74,7 +73,6 @@ This section describes how the data parallel mode `ParallelMode.DATA_PARALLEL` w - [grad_reducer.py](https://gitee.com/mindspore/mindspore/blob/master/mindspore/nn/wrap/grad_reducer.py): This file implements the gradient aggregation process. After the input parameter `grads` is expanded by using `HyperMap`, the `AllReduce` operator is inserted. The global communication group is used. You can also perform custom development by referring to this section based on your network requirements. In MindSpore, standalone and distributed execution shares a set of network encapsulation APIs. In the `Cell`, `ParallelMode` is used to determine whether to perform gradient aggregation. For details about the network encapsulation APIs, see the `TrainOneStepCell` code implementation. - ## Automatic Parallelism As a key feature of MindSpore, automatic parallelism is used to implement hybrid parallel training that combines automatic data parallelism and model parallelism. It aims to help users express the parallel algorithm logic using standalone scripts, reduce the difficulty of distributed training, improve the algorithm R&D efficiency, and maintain the high performance of training. This section describes how the automatic parallel mode `ParallelMode.AUTO_PARALLEL` and semi-automatic parallel mode `ParallelMode.SEMI_AUTO_PARALLEL` work in MindSpore. @@ -86,19 +84,19 @@ As a key feature of MindSpore, automatic parallelism is used to implement hybrid 1. 
Distributed operator and tensor layout As shown in the preceding figure, the automatic parallel process traverses the standalone forward ANF graphs and performs shard modeling on tensors in the unit of distributed operator, indicating how the input and output tensors of an operator are distributed to each device of the cluster, that is, the tensor layout. Users do not need to know which device runs which slice of a model. The framework automatically schedules and allocates model slices. - + To obtain the tensor layout model, each operator has a shard strategy, which indicates the shard status of each input of the operator in the corresponding dimension. Generally, tensors can be sharded in any dimension as long as the value is a multiple of 2, and the even distribution principle is met. The following figure shows an example of the three-dimensional `BatchMatmul` operation. The parallel strategy consists of two tuples, indicating the sharding of `input` and `weight`, respectively. Elements in a tuple correspond to tensor dimensions one by one. `2^N` indicates the shard unit, and `1` indicates that the tuple is not sharded. If you want to express a parallel data shard strategy, that is, only data in the `batch` dimension of `input` is sharded, and data in other dimensions are not sharded, you can use `strategy=((2^N, 1, 1),(1, 1, 1))`. If you want to express a parallel model shard strategy, that is, only model in the non-`batch` dimension of `weight` is sharded, for example, only the `channel` dimension is sharded, you can use `strategy=((1, 1, 1),(1, 1, 2^N))`. If you want to express a hybrid parallel shard strategy, one of which is `strategy=((2^N, 1, 1),(1, 1, 2^N))`. ![Operator Sharding Definition](./images/operator_split.png) - - Based on the shard strategy of an operator, the framework automatically derives the distribution model of input tensors and output tensors of the operator. 
This distribution model consists of `device_matrix`, `tensor_shape`, and `tensor map`, which indicate the device matrix shape, tensor shape, and mapping between devices and tensor dimensions, respectively. Based on the tensor layout model, distributed operator determines whether to insert extra computation and communication operations in the graph to ensure that the operator computing logic is correct. + + Based on the shard strategy of an operator, the framework automatically derives the distribution model of input tensors and output tensors of the operator. This distribution model consists of `device_matrix`, `tensor_shape`, and `tensor map`, which indicate the device matrix shape, tensor shape, and mapping between devices and tensor dimensions, respectively. Based on the tensor layout model, distributed operator determines whether to insert extra computation and communication operations in the graph to ensure that the operator computing logic is correct. 2. Tensor Redistribution When the output tensor model of an operator is inconsistent with the input tensor model of the next operator, computation and communication operations need to be introduced to implement the change between tensor layouts. The automatic parallel process introduces the tensor redistribution algorithm, which can be used to derive the communication conversion operations between arbitrary tensor layouts. The following three examples represent a parallel computing process of the formula `Z=(X×W)×V`, that is, a `MatMul` operation of two two-dimensional matrices, and show how to perform conversion between different parallel modes. - + In example 1, the output of the first data parallel matrix multiplication is sharded in the row direction, and the input of the second model parallel matrix multiplication requires full tensors. The framework automatically inserts the `AllGather` operator to implement redistribution. 
- ![Tensor Redistribution](./images/tensor_redistribution1.png) In example 2, the output of parallel matrix multiplication of the first model is sharded in the column direction, and the input of parallel matrix multiplication of the second model is sharded in the row direction. The framework automatically inserts a communication operator equivalent to the `AlltoAll` operation in collective communication to implement redistribution. @@ -114,9 +112,8 @@ As a key feature of MindSpore, automatic parallelism is used to implement hybrid 3. Efficient parallel strategy search algorithm The `SEMI_AUTO_PARALLEL` semi-automatic parallel mode indicates that you manually configure the parallel strategy for operators when you are familiar with the operator sharding representation. This mode is helpful for manual optimization, with certain commissioning difficulty. You need to master the parallel principle and obtain a high-performance parallel solution based on the network structure and cluster topology. To further help users accelerate the parallel network training process, the automatic parallel mode `AUTO_PARALLEL` introduces the automatic search feature of the parallel strategy on the basis of the semi-automatic parallel mode. Automatic parallelism builds cost models based on the hardware platform, and calculates the computation cost, memory cost, and communication cost of a certain amount of data and specific operators based on different parallel strategies. Then, by using the dynamic programming algorithm or recursive programming algorithm and taking the memory upper limit of a single device as a constraint condition, a parallel strategy with optimal performance is efficiently searched out. - - Strategy search replaces manual model sharding and provides a high-performance sharding solution within a short period of time, greatly reducing the threshold for parallel training. 
+ Strategy search replaces manual model sharding and provides a high-performance sharding solution within a short period of time, greatly reducing the threshold for parallel training. 4. Convenient distributed automatic differentiation @@ -139,6 +136,5 @@ As a key feature of MindSpore, automatic parallelism is used to implement hybrid 5. Entire graph sharding - [step_auto_parallel.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/frontend/parallel/step_auto_parallel.h), and [step_parallel.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/frontend/parallel/step_parallel.h): The two files contain the core implementation of the automatic parallel process. `step_auto_parallel.h` calls the strategy search process and generates the `OperatorInfo` of the distributed operator. Then in `step_parallel.h`, processes such as operator sharding and tensor redistribution are processed to reconstruct the standalone computing graph in distributed mode. - 6. Backward propagation of communication operators - [grad_comm_ops.py](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ops/_grad/grad_comm_ops.py): This file defines the backward propagation of communication operators, such as `AllReduce` and `AllGather`. diff --git a/docs/note/source_en/design/mindspore/mindir.md b/docs/note/source_en/design/mindspore/mindir.md index 7d3ba3385c9ff0c399a09002e35742f886ee7975..59f55e31952a36ce34cce402f9a8f328a3f835b3 100644 --- a/docs/note/source_en/design/mindspore/mindir.md +++ b/docs/note/source_en/design/mindspore/mindir.md @@ -21,13 +21,16 @@ ## Overview + An intermediate representation (IR) is a representation of a program between the source and target languages, which facilitates program analysis and optimization for the compiler. Therefore, the IR design needs to consider the difficulty in converting the source language to the target language, as well as the ease-of-use and performance of program analysis and optimization. 
MindSpore IR (MindIR) is a function-style IR based on graph representation. Its core purpose is to serve automatic differential transformation. Automatic differentiation uses the transformation method based on the function-style programming framework. Therefore, IR uses the semantics close to that of the ANF function. In addition, a manner of representation based on an explicit dependency graph is used by referring to excellent designs of Sea of Nodes[1] and Thorin[2]. ## Syntax + ANF is a simple IR commonly used during functional programming. The ANF syntax is defined as follows: -``` + +```python <aexp> ::= NUMBER | STRING | VAR | BOOLEAN | PRIMOP | (lambda (VAR …) <exp>) <cexp> ::= (<aexp> <aexp> …) @@ -35,17 +38,20 @@ ANF is a simple IR commonly used during functional programming. The ANF syntax i <exp> ::= (let ([VAR <cexp>]) <exp>) | <cexp> | <aexp> ``` + Expressions in the ANF are classified into atomic expressions (aexp) and compound expressions (cexp). An atomic expression indicates a constant value, a variable, or an anonymous function. A compound expression consists of multiple atomic expressions, indicating an anonymous function call or a primitive function call. The first input expression of a compound expression is the called function, and the other input expressions are the called parameters. The syntax of MindIR is inherited from the ANF and is defined as follows: -``` + +```python <ANode> ::= <ValueNode> | <ParameterNode> <ParameterNode> ::= Parameter <ValueNode> ::= Scalar | Named | Tensor | Type | Shape - | Primitive | MetaFuncGraph | FuncGraph + | Primitive | MetaFuncGraph | FuncGraph <CNode> ::= (<AnfNode> …) <AnfNode> ::= <CNode> | <ANode> ``` + ANode in a MindIR corresponds to the atomic expression of ANF. ANode has two subclasses: ValueNode and ParameterNode. ValueNode refers to a constant node, which can carry a constant value (such as a scalar, symbol, tensor, type, and dimension), a primitive function (Primitive), a metafunction (MetaFuncGraph), or a common function (FuncGraph). In functional programming, the function definition itself is a value. 
ParameterNode refers to a parameter node, which indicates the formal parameter of a function. CNode in a MindIR corresponds to the compound expression of ANF, indicating a function call. @@ -53,7 +59,9 @@ CNode in a MindIR corresponds to the compound expression of ANF, indicating a fu During automatic differentiation of MindSpore, the gradient contribution of ParameterNode and CNode are calculated, and the final gradient of ParameterNode is returned. The gradient of ValueNode is not calculated. ## Example + The following uses a program code segment as an example to help you understand MindIR. + ```python def func(x, y): return x / y @@ -65,8 +73,10 @@ def test_f(x, y): c = b * func(a, b) return c ``` + The ANF corresponding to the Python code is as follows: -``` + +```python lambda (x, y) let a = x - 1 in let b = a + y in @@ -77,26 +87,31 @@ lambda (x, y) let c = b * %1 in c end ``` + The corresponding MindIR is [ir.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_en/design/mindspore/images/ir/ir.dot). -![](./images/ir/ir.png) +![image](./images/ir/ir.png) In a MindIR, a function graph (FuncGraph) indicates the definition of a common function. A directed acyclic graph (DAG) usually consists of ParameterNode, ValueNode, and CNode, which clearly shows the calculation process from parameters to return values. As shown in the preceding figure, the `test_f` and `func` functions in the Python code are converted into two function graphs. The `x` and `y` parameters are converted into ParameterNode in the function graphs, and each expression is converted into a CNode. The first input of CNode links to the called functions, for example, `add`, `func`, and `return` in the figure. It should be noted that these nodes are all `ValueNode` because they are considered as constant function values. Other input of CNode links to the called parameters. The parameter values can be obtained from the ParameterNode, ValueNode, and other CNode. 
In the ANF, each expression is bound as a variable by using the let expression, and the dependency on the expression output is represented by referencing the variable. In the MindIR, each expression is bound as a node, and the dependency is represented by using the directed edges between nodes. ## Saving IR + `context.set_context(save_graphs=True)` is used to save the intermediate code in each compilation phase. The intermediate code can be saved in two formats. One is the text format with the suffix `.ir`, and the other is the graphical format with the suffix `.dot`. When the network scale is small, you are advised to use the graphical format that is more intuitive. When the network scale is large, you are advised to use the text format that is more efficient. You can run the graphviz command to convert a .dot file to the picture format. For example, you can run the `dot -Tpng *.dot -o *.png` command to convert a .dot file to a .png file. ## Function-style Semantics + Compared with traditional computational graphs, MindIR can not only express data dependency between operators, but also express rich function-style semantics. + ### Higher-Order Functions + In a MindIR, a function is defined by a subgraph. However, the function itself can be transferred as the input or output of other higher-order functions. In the following simple example, the `f` function is transferred as a parameter into the `g` function. Therefore, the `g` function is a higher-order function that receives function input, and the actual call site of the `f` function is inside the `g` function. -``` +```python @ms_function def hof(x): def f(x): @@ -108,14 +123,16 @@ def hof(x): ``` The corresponding MindIR is [hof.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_en/design/mindspore/images/ir/hof.dot). 
-![](./images/ir/hof.png) +![image](./images/ir/hof.png) In the actual network training scripts, the automatic derivation generic function `GradOperation` and `Partial` and `HyperMap` that are commonly used in the optimizer are typical high-order functions. Higher-order semantics greatly improve the flexibility and simplicity of MindSpore representations. ### Control Flows + In a MindIR, control flows are expressed in the form of high-order function selection and calling. This form transforms a control flow into a data flow of higher-order functions, making the automatic differential algorithm more powerful. It not only supports automatic differentiation of data flows, but also supports automatic differentiation of control flows such as conditional jumps, loops, and recursion. The following uses a simple Fibonacci instance as an example. + ```python @ms_function def fibonacci(n): @@ -128,15 +145,16 @@ def fibonacci(n): ``` The corresponding MindIR is [cf.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_en/design/mindspore/images/ir/cf.dot). -![](./images/ir/cf.png) +![image](./images/ir/cf.png) `fibonacci` is a top-level function graph. Two function graphs at the top level are selected and called by `switch`. `✓fibonacci` is the True branch of the first `if`, and `✗fibonacci` is the False branch of the first `if`. `✓✗fibonacci` called in `✗fibonacci` is the True branch of `elif`, and `✗✗fibonacci` is the False branch of `elif`. The key is, in a MindIR, conditional jumps and recursion are represented in the form of higher-order control flows. For example, `✓✗fibonacci` and `✗fibonacci` are transferred in as parameters of the `switch` operator. `switch` selects a function as the return value based on the condition parameter. In this way, `switch` performs a binary selection operation on the input functions as common values and does not call the functions. The real function call is completed on CNode following `switch`. 
- ### Free Variables and Closures + Closure is a programming language feature that refers to the combination of code blocks and scope environment. A free variable refers to a variable in the scope environment referenced in a code block instead of a local variable. In a MindIR, a code block is represented as a function graph. The scope environment can be considered as the context where the function is called. The capture method of free variables is value copy instead of reference. A typical closure instance is as follows: + ```python @ms_function def func_outer(a, b): @@ -153,14 +171,15 @@ def ms_closure(): ``` The corresponding MindIR is [closure.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_en/design/mindspore/images/ir/closure.dot). -![](./images/ir/closure.png) +![image](./images/ir/closure.png) In the example, `a` and `b` are free variables because the variables `a` and `b` in `func_inner` are parameters defined in the referenced parent graph `func_outer`. The variable `closure` is a closure, which is the combination of the function `func_inner` and its context `func_outer(1, 2)`. Therefore, the result of `out1` is 4, which is equivalent to `1+2+1`, and the result of `out2` is 5, which is equivalent to `1+2+2`. ## References + [1] C. Click and M. Paleczny. A simple graph-based intermediate representation. SIGPLAN Not., 30:35–49, March 1995. [2] Roland Leißa, Marcel Köster, and Sebastian Hack. A graph-based higher-order intermediate representation. In Proceedings of the 13th Annual IEEE/ACM International Symposium on -Code Generation and Optimization, pages 202–212. IEEE Computer Society, 2015. \ No newline at end of file +Code Generation and Optimization, pages 202–212. IEEE Computer Society, 2015. 
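The function-style semantics described in the `mindir.md` changes above can be approximated in plain Python. The sketch below omits `@ms_function` and is only an analogy: `switch` here is a hypothetical stand-in for the MindIR `switch` operator, the branch conditions of `fibonacci` are assumed (the diff elides the function body), and Python closures capture variables by reference, whereas the text says MindIR captures free variables by value copy.

```python
from typing import Callable


def switch(cond: bool, true_branch: Callable, false_branch: Callable) -> Callable:
    """Select one of two functions as an ordinary value; neither is called here."""
    return true_branch if cond else false_branch


def fibonacci(n: int) -> int:
    """Fibonacci in which every branch is a function selected by `switch`."""
    def base_zero() -> int:
        return 0

    def not_zero() -> int:
        def base_one() -> int:
            return 1

        def recurse() -> int:
            return fibonacci(n - 1) + fibonacci(n - 2)

        # Inner selection, playing the role of the `elif` branch pair.
        return switch(n == 1, base_one, recurse)()

    # The call happens after the selection, like the CNode that follows `switch`.
    return switch(n < 1, base_zero, not_zero)()


def func_outer(a: int, b: int) -> Callable[[int], int]:
    """Return a function that refers to the free variables `a` and `b`."""
    def func_inner(c: int) -> int:
        return a + b + c

    return func_inner


closure = func_outer(1, 2)
out1 = closure(1)  # 1 + 2 + 1
out2 = closure(2)  # 1 + 2 + 2

if __name__ == "__main__":
    print(fibonacci(10), out1, out2)
```

Separating selection from invocation is the point of the design: because `switch` treats both branches as values, a control flow becomes a data flow over functions, which is what the text credits for enabling automatic differentiation of conditionals and recursion.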
diff --git a/docs/note/source_en/design/mindspore/profiler_design.md b/docs/note/source_en/design/mindspore/profiler_design.md index 9acb50afb733aa344cc00275fe58f6f03568c8a1..e8dd67c9ffb8d6b20b5c1cf6920490e80fff62aa 100644 --- a/docs/note/source_en/design/mindspore/profiler_design.md +++ b/docs/note/source_en/design/mindspore/profiler_design.md @@ -33,6 +33,7 @@ To support model development and performance debugging in MindSpore, an easy-to-use profile tool is required to intuitively display the performance information of each dimension of a network model, provide users with easy-to-use and abundant profiling functions, and help users quickly locate network performance faults. ## Profiler Architecture Design + The Profiler architecture design is introduced from the following three aspects: the overall context interaction relationship of Profiler; the internal structure of Profiler, including the module structure and module layers; the interactive calling relationship between modules. ### Context @@ -50,6 +51,7 @@ As shown in the preceding figure, the interaction between the Profiler and other 2. MindSpore Profiler parses the original data in the user script and generates the intermediate data results in the specified folder. 3. MindInsight Profiler connects to the intermediate data and provides the visualized Profiler function for users. + ### Module Structure Modules are classified into the following layers: @@ -58,8 +60,8 @@ Modules are classified into the following layers: Figure 2 Relationships between modules at different layers - Module functions are as follows: + 1. ProfilerAPI is a calling entry provided by code, including the performance collection startup API and analysis API. 2. Controller is a module at a layer lower than that of ProfilerAPI. It is called by the startup API of ProfilerAPI to start or stop the performance collection function. The original data is written to a fixed position by ada. 3. 
Parser is a module for parsing original performance data which is collected on the device and cannot be directly understood by users. Parser parses, combines, and converts the data to generate intermediate results that can be understood by users and analyzed by upper layers. @@ -67,6 +69,7 @@ Module functions are as follows: 5. RESTful is used to call the common API provided by the backend Analyser to obtain objective data and use RESTful to connect to the frontend. ### Internal Module Interaction + Users can use API or RESTful to complete the internal module interaction process. The following uses the API as an example: ![time_order_profiler.png](./images/time_order_profiler.png) @@ -82,20 +85,22 @@ The interaction process of each module is as follows: 3. Profiler API analysis API uses the Parser module to parse performance data, generates intermediate results, calls the Analyser module to analyze the results, and returns various information to users. ## Sub-Module Design + ### ProfilerAPI and Controller #### Description + ProfilerAPI provides an entry API in the training script for users to start performance collection and analyze performance data. ProfilerAPI delivers commands through Controller to control the startup of ada. #### Design + ProfilerAPI belongs to the API layer of upper-layer application and is integrated by the training script. The function is divided into two parts: - Before training, call the bottom-layer Controller API to deliver a command to start a profiling task. - After training, call the bottom-layer Controller API to deliver commands to stop the profiling task, call the Analyser and Parser APIs to parse data files and generate result data such as operator performance statistics and training trace statistics. - Controller provides an API for the upper layer, calls API of the lower-layer performance collection module, and delivers commands for starting and stopping performance collection. 
The generated original performance data includes: @@ -106,9 +111,13 @@ The generated original performance data includes: - `training_trace.46.dev.profiler_default_tag` file: stores the start and end time of each step and time of step interval, forward and backward propagation, and step tail. ### Parser + #### Description + Parser is a module for parsing original performance data which is collected on the device and cannot be directly understood by users. Parser parses, combines, and converts the data to generate intermediate results that can be understood by users and analyzed by upper layers. + #### Design + ![parser_module_profiler.png](./images/parser_module_profiler.png) Figure 4 Parser module @@ -123,6 +132,7 @@ As shown in the preceding figure, there are HWTS Parser, AI CPU Parser, Framewor ### Analyser #### Description + Analyzer is used to filter, sort, query, and page the intermediate results generated at the parsing stage. #### Design @@ -142,9 +152,10 @@ Currently, there are two types of analyzers for operator information: To hide the internal implementation of Analyser and facilitate calling, the simple factory mode is used to obtain the specified Analyser through AnalyserFactory. - ### Proposer + #### Description + Proposer is a Profiler performance optimization suggestion module. Proposer calls the Analyser module to obtain performance data, analyzes the performance data based on optimization rules, and displays optimization suggestions for users through the UI and API. #### Design @@ -172,4 +183,4 @@ Figure 7 Proposer class As shown in the preceding figure: - Proposers of various types inherit the abstract class Proposer and implement the analyze methods. -- API and CLI call the ProposerFactory to obtain the Proposer and call the Proposer.analyze function to obtain the optimization suggestions of each type of Proposer. 
\ No newline at end of file +- API and CLI call the ProposerFactory to obtain the Proposer and call the Proposer.analyze function to obtain the optimization suggestions of each type of Proposer. diff --git a/docs/note/source_en/glossary.md b/docs/note/source_en/glossary.md index 852afab1c841523f869472090a9ee827e028c047..6ec11c85d39df9af4669d847255544882ac78be4 100644 --- a/docs/note/source_en/glossary.md +++ b/docs/note/source_en/glossary.md @@ -4,7 +4,7 @@ -| Acronym and Abbreviation | Description | +| Acronym and Abbreviation | Description | | ----- | ----- | | ACL | Ascend Computer Language, for users to develop deep neural network applications, which provides the C++ API library including device management, context management, stream management, memory management, model loading and execution, operator loading and execution, media data processing, etc. | | Ascend | Name of Huawei Ascend series chips. | diff --git a/docs/note/source_en/help_seeking_path.md b/docs/note/source_en/help_seeking_path.md index ecaa964ae6d416ebda96843a8688029a20c78278..9ac8c6bb6da04e502a89729e32b1e8644c82db51 100644 --- a/docs/note/source_en/help_seeking_path.md +++ b/docs/note/source_en/help_seeking_path.md @@ -10,22 +10,20 @@ This document describes how to seek help and support when you encounter problems - Website search - - Go to the [official search page](https://www.mindspore.cn/search/en). - - When encountering a problem, search on the official website first, which is simple and efficient. - - Enter a keyword in the search box and click the search icon. The related content is displayed. - - Resolve the problem based on the search result. - + - Go to the [official search page](https://www.mindspore.cn/search/en). + - When encountering a problem, search on the official website first, which is simple and efficient. + - Enter a keyword in the search box and click the search icon. The related content is displayed. + - Resolve the problem based on the search result. 
- User group consultation - - If you cannot solve the problem using website search and want a quick consultation, get support by joining the [Slack group](https://mindspore.slack.com/join/shared_invite/zt-dgk65rli-3ex4xvS4wHX7UDmsQmfu8w#/ ) and starting a conversation with our members. - - Resolve the problem by asking experts or communicating with other users. - + - If you cannot solve the problem using website search and want a quick consultation, get support by joining the [Slack group](https://mindspore.slack.com/join/shared_invite/zt-dgk65rli-3ex4xvS4wHX7UDmsQmfu8w#/ ) and starting a conversation with our members. + - Resolve the problem by asking experts or communicating with other users. - Forum Help-Seeking - - If you want a detailed solution, start a help post on the [Ascend forum](https://forum.huawei.com/enterprise/en/forum-100504.html). - - After the post is sent, a forum moderator collects the question and contacts technical experts to answer the question. The question will be resolved within three working days. - - Resolve the problem by referring to solutions provided by technical experts. + - If you want a detailed solution, start a help post on the [Ascend forum](https://forum.huawei.com/enterprise/en/forum-100504.html). + - After the post is sent, a forum moderator collects the question and contacts technical experts to answer the question. The question will be resolved within three working days. + - Resolve the problem by referring to solutions provided by technical experts. - If the expert test result shows that the MindSpore function needs to be improved, you are advised to submit an issue in the [MindSpore repository](https://gitee.com/mindspore). Issues will be resolved in later versions. \ No newline at end of file + If the expert test result shows that the MindSpore function needs to be improved, you are advised to submit an issue in the [MindSpore repository](https://gitee.com/mindspore). 
Issues will be resolved in later versions. diff --git a/docs/note/source_en/network_list_ms.md b/docs/note/source_en/network_list_ms.md index 8cbf3ba9802d880346645ff113fe73410799aa16..169d2137ccd1f235a41572f3d4f21126de7d871e 100644 --- a/docs/note/source_en/network_list_ms.md +++ b/docs/note/source_en/network_list_ms.md @@ -13,7 +13,7 @@ ## Model Zoo -| Domain | Sub Domain | Network | Ascend(Graph) | Ascend(PyNative) | GPU(Graph) | GPU(PyNative)| CPU(Graph) +| Domain | Sub Domain | Network | Ascend(Graph) | Ascend(PyNative) | GPU(Graph) | GPU(PyNative)| CPU(Graph) |:------ |:------| :----------- |:------ |:------ |:------ |:------ |:----- |Computer Vision (CV) | Image Classification | [AlexNet](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/alexnet/src/alexnet.py) | Supported | Supported | Supported | Supported | Doing | Computer Vision (CV) | Image Classification | [GoogleNet](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/googlenet/src/googlenet.py) | Supported | Supported | Supported | Supported | Doing diff --git a/docs/note/source_en/object_detection_lite.md b/docs/note/source_en/object_detection_lite.md index 6e02865426d6957fc991d5c4968f05ccf5af1e53..10089f2a80189ff17cbe49230f7abd842db731a0 100644 --- a/docs/note/source_en/object_detection_lite.md +++ b/docs/note/source_en/object_detection_lite.md @@ -23,4 +23,3 @@ The following table shows the data of some object detection models using MindSpo | Model name | Size | mAP(IoU=0.50:0.95) | CPU 4 thread delay (ms) | |-----------------------| :----------: | :----------: | :-----------: | | [MobileNetv2-SSD](https://download.mindspore.cn/model_zoo/official/lite/ssd_mobilenetv2_lite/ssd.ms) | 16.7 | 0.22 | 25.4 | - diff --git a/docs/note/source_en/operator_list_implicit.md b/docs/note/source_en/operator_list_implicit.md index 955ea0a2feaaca4c0112f13c516de2ad0bd16e26..5eb631df80993c22b4e4aad5d8ad11d0847c6ffb 100644 --- a/docs/note/source_en/operator_list_implicit.md 
+++ b/docs/note/source_en/operator_list_implicit.md @@ -17,24 +17,26 @@ ## Implicit Type Conversion ### conversion rules -* Scalar and Tensor operations: during operation, the scalar is automatically converted to Tensor, and the data type is consistent with the Tensor data type involved in the operation; + +- Scalar and Tensor operations: during operation, the scalar is automatically converted to Tensor, and the data type is consistent with the Tensor data type involved in the operation; when Tensor is bool data type and the scalar is int or float, both the scalar and Tensor are converted to the Tensor with the data type of int32 or float32; when Tensor is int or uint data type and the scalar is float, both the scalar and Tensor are converted to the Tensor with the data type of float32. -* Tensor operation of different data types: the priority of data type is bool < uint8 < int8 < int16 < int32 < int64 < float16 < float32 - + This document describes MindSpore benchmark performance. For MindSpore network definitions, see [Model Zoo](https://gitee.com/mindspore/mindspore/tree/master/model_zoo). @@ -32,7 +32,7 @@ ### BERT -| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | +| Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | BERT-Large | Attention | zhwiki | 0.5.0-beta | Ascend: 1 * Ascend 910 
CPU:24 Cores | Mixed | 96 | 269 sentences/sec | - | | | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 96 | 2069 sentences/sec | 0.96 | @@ -45,7 +45,7 @@ | Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | Wide & Deep | Recommend | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910
CPU:24 Cores | Mixed | 16000 | 796892 samples/sec | - | -| | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 16000*8 | 4872849 samples/sec | 0.76 | +| | | | | Ascend: 8 \* Ascend 910
CPU:192 Cores | Mixed | 16000*8 | 4872849 samples/sec | 0.76 | 1. The above data was obtained on Atlas 800, with the network model running in data parallel mode. 2. For data from other open-source frameworks in the industry, see [Wide & Deep For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep). @@ -55,9 +55,9 @@ | Network | Network Type | Dataset | MindSpore Version | Resource                 | Precision | Batch Size | Throughput | Speedup | | --- | --- | --- | --- | --- | --- | --- | --- | --- | | Wide & Deep | Recommend | Criteo | 0.6.0-beta | Ascend: 1 * Ascend 910 
CPU:24 Cores | Mixed | 8000 | 68715 samples/sec | - | -| | | | | Ascend: 8 * Ascend 910
CPU:192 Cores | Mixed | 8000*8 | 283830 samples/sec | 0.51 | -| | | | | Ascend: 16 * Ascend 910
CPU:384 Cores | Mixed | 8000*16 | 377848 samples/sec | 0.34 | -| | | | | Ascend: 32 * Ascend 910
CPU:768 Cores | Mixed | 8000*32 | 433423 samples/sec | 0.20 | +| | | | | Ascend: 8 \* Ascend 910
CPU:192 Cores | Mixed | 8000*8 | 283830 samples/sec | 0.51 | +| | | | | Ascend: 16 \* Ascend 910
CPU:384 Cores | Mixed | 8000*16 | 377848 samples/sec | 0.34 | +| | | | | Ascend: 32 \* Ascend 910
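CPU:768 Cores | Mixed | 8000*32 | 433423 samples/sec | 0.20 | 1. The above data was obtained on Atlas 800, with the network model running in model parallel mode. 2. For data from other open-source frameworks in the industry, see [Wide & Deep For TensorFlow](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/Recommendation/WideAndDeep). diff --git a/docs/note/source_zh_cn/constraints_on_network_construction.md b/docs/note/source_zh_cn/constraints_on_network_construction.md index 7fa4b65d340f09c7b88b6c796dcb5b9112474ebf..0678b7d4e7ca52c6a9f387525950fb7a38a3de18 100644 --- a/docs/note/source_zh_cn/constraints_on_network_construction.md +++ b/docs/note/source_zh_cn/constraints_on_network_construction.md @@ -28,21 +28,26 @@ ## Overview + MindSpore compiles user source code into computational graphs. User source code is written in Python syntax. Currently, MindSpore can convert common functions or instances of classes inherited from nn.Cell into computational graphs, but cannot yet convert arbitrary Python source code. The supported ways of writing user source code are therefore limited, mainly by syntax constraints and network definition constraints. These constraints may change as MindSpore evolves. ## Syntax Constraints + ### Supported Python Data Types -* Number: includes `int`, `float`, and `bool`; complex numbers are not supported. -* String -* List: currently only the append method is supported; updating a List copies it and generates a new List. -* Tuple -* Dictionary: currently `key` supports only the String type + +- Number: includes `int`, `float`, and `bool`; complex numbers are not supported. +- String +- List: currently only the append method is supported; updating a List copies it and generates a new List. +- Tuple +- Dictionary: currently `key` supports only the String type + ### MindSpore Extended Data Types -* Tensor: a Tensor variable must be a defined instance. + +- Tensor: a Tensor variable must be a defined instance. ### Expression Types -| Operation | Details +| Operation | Details | :----------- |:-------- | Unary operator |`+`, `-`, `not`; the `+` operator supports only scalars. | Mathematical expression |`+`, `-`, `*`, `/`, `%`, `**`, `//` @@ -81,10 +86,11 @@ | `isinstance` | Usage is the same as in Python, but the second parameter can only be a type defined by mindspore. ### Function Parameters -* Default parameter values: `Tensor` data is currently not supported as a default value; `int`, `float`, `bool`, `None`, `str`, `tuple`, `list`, and `dict` data are supported. -* Variable parameters: inference and training of networks with variable parameters are supported. -* Key-value pair parameters: backward computation is currently not supported for functions with key-value pair parameters. -* Variable key-value pair parameters: backward computation is currently not supported for functions with variable key-value pairs. + +- Default parameter values: `Tensor` data is currently not supported as a default value; `int`, `float`, `bool`, `None`, `str`, `tuple`, `list`, and `dict` data are supported. +- Variable parameters: inference and training of networks with variable parameters are supported. +- Key-value pair parameters: backward computation is currently not supported for functions with key-value pair parameters. +- Variable key-value pair parameters: backward computation is currently not supported for functions with variable key-value pairs. ### Operators @@ -104,51 +110,52 @@ Index operations include index operations on `tuple` and `Tensor`. The following focuses on index value retrieval and assignment for `Tensor`, using `tensor_x[index]` as the value retrieval example and `tensor_x[index] = 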
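u` as the assignment example. Here, tensor_x is a `Tensor` on which slicing is performed; index is the index; and u is the assigned value, which can be a `scalar` or a `Tensor(size=1)`. The index types are as follows: - Slice index: index is a `slice` - - Value retrieval: `tensor_x[start:stop:step]`, where Slice(start:stop:step) follows the same syntax as Python and is not described further here. - - Assignment: `tensor_x[start:stop:step]=u`. + - Value retrieval: `tensor_x[start:stop:step]`, where Slice(start:stop:step) follows the same syntax as Python and is not described further here. + - Assignment: `tensor_x[start:stop:step]=u`. - Ellipsis index: index is `ellipsis` - - Value retrieval: `tensor_x[...]`. - - Assignment: `tensor_x[...]=u`. + - Value retrieval: `tensor_x[...]`. + - Assignment: `tensor_x[...]=u`. - Boolean constant index: index is `True`; index `False` is not supported yet. - - Value retrieval: `tensor_x[True]`. - - Assignment: not supported yet. + - Value retrieval: `tensor_x[True]`. + - Assignment: not supported yet. - Tensor index: index is a `Tensor` - - Value retrieval: `tensor_x[index]`, where `index` must be a `Tensor` of type `int32` or `int64`, with element values in the range `[0, tensor_x.shape[0])`. - - Assignment: `tensor_x[index]=U`. - - The data type of `tensor_x` must be one of the following: `float16`, `float32`, `int8`, `uint8`. - - `index` must be a `Tensor` of type `int32`, with element values in the range `[0, tensor_x.shape[0])`. - - `U` can be a `Number`, a `Tensor`, a `Tuple` containing only `Number`, or a `Tuple` containing only `Tensor`. - - A single `Number` and each `Number` in a `Tuple` must be of the same category as the data type of `tensor_x`, that is, + - Value retrieval: `tensor_x[index]`, where `index` must be a `Tensor` of type `int32` or `int64`, with element values in the range `[0, tensor_x.shape[0])`. + - Assignment: `tensor_x[index]=U`. + - The data type of `tensor_x` must be one of the following: `float16`, `float32`, `int8`, `uint8`. + - `index` must be a `Tensor` of type `int32`, with element values in the range `[0, tensor_x.shape[0])`. + - `U` can be a `Number`, a `Tensor`, a `Tuple` containing only `Number`, or a `Tuple` containing only `Tensor`. + - A single `Number` and each `Number` in a `Tuple` must be of the same category as the data type of `tensor_x`, that is, when the data type of `tensor_x` is `uint8` or `int8`, the `Number` type should be `int`; when the data type of `tensor_x` is `float16` or `float32`, the `Number` type should be `float`. - - A single `Tensor` and each `Tensor` in a `Tuple` must have the same data type as `tensor_x`, + - A single `Tensor` and each `Tensor` in a `Tuple` must have the same data type as `tensor_x`, for a single `Tensor`, its `shape` must equal or be broadcastable to `index.shape + tensor_x.shape[1:]`. - - A `Tuple` containing `Number` must meet the following condition: + - A `Tuple` containing `Number` must meet the following condition: `len(Tuple) = (index.shape + tensor_x.shape[1:])[-1]`. - - A `Tuple` containing `Tensor` must meet the following conditions: + - A `Tuple` containing `Tensor` must meet the following conditions: every `Tensor` has the same `shape`; `(len(Tuple),) + Tensor.shape` equals or can be broadcast to `index.shape + tensor_x.shape[1:]`. - None constant index: index is `None` - - 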
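Value retrieval: `tensor_x[None]`, with results consistent with numpy. - - Assignment: not supported yet. + - Value retrieval: `tensor_x[None]`, with results consistent with numpy. + - Assignment: not supported yet. - tuple index: index is a `tuple` - - The tuple elements are slices: - - Value retrieval: for example, `tensor_x[::, :4, 3:0:-1]`. - - Assignment: for example, `tensor_x[::, :4, 3:0:-1]=u`. - - The tuple elements are Numbers: - - Value retrieval: for example, `tensor_x[2,1]`. - - Assignment: for example, `tensor_x[1,4]=u`. - - The tuple elements are a mix of slice and ellipsis: - - Value retrieval: for example, `tensor_x[..., ::, 1:]` - - Assignment: for example, `tensor_x[..., ::, 1:]=u` - - Other cases are not supported yet + - The tuple elements are slices: + - Value retrieval: for example, `tensor_x[::, :4, 3:0:-1]`. + - Assignment: for example, `tensor_x[::, :4, 3:0:-1]=u`. + - The tuple elements are Numbers: + - Value retrieval: for example, `tensor_x[2,1]`. + - Assignment: for example, `tensor_x[1,4]=u`. + - The tuple elements are a mix of slice and ellipsis: + - Value retrieval: for example, `tensor_x[..., ::, 1:]` + - Assignment: for example, `tensor_x[..., ::, 1:]=u` + - Other cases are not supported yet For index value retrieval on tuple and list types, the case worth highlighting is a tuple or list whose elements are of type `nn.Cell`. This operation currently runs only on the GPU backend in Graph mode, and its syntax takes the form `layers[index](*inputs)`. Sample code is as follows: + ```python class Net(nn.Cell): def __init__(self): @@ -161,60 +168,68 @@ For index value retrieval on tuple and list types, the case worth highlighting is x = self.layers[index](x) return x ``` + This syntax also has the following constraints: -* Only index value retrieval on a tuple or list whose elements are of type `nn.Cell` is supported. -* The index value index is a Tensor scalar of type `int32` in the range `[-n, n)`, where `n` is the size of the tuple; the maximum supported tuple size is 1000. -* The Construct functions of all Cell elements in the tuple must take inputs of the same number, type, and shape, and must produce outputs of the same number, type, and shape. -* Each Cell element in the tuple must be defined before the tuple is defined. -* This syntax cannot be used as an execution branch of control flow such as if, while, and for, unless the control condition of the control flow is a constant. For example: - - Supported usage: - ```python - class Net(nn.Cell): - def __init__(self, flag=True): - super(Net, self).__init__() - self.flag = flag - self.relu = nn.ReLU() - self.softmax = nn.Softmax() - self.layers = (self.relu, self.softmax) - def construct(self, x, index): - if self.flag: - x = self.layers[index](x) - return x - ``` - - Unsupported usage: - ```python - class Net(nn.Cell): - def __init__(self): - super(Net, self).__init__() - self.relu = nn.ReLU() - self.softmax = nn.Softmax() - self.layers = (self.relu, self.softmax) +- Only index value retrieval on a tuple or list whose elements are of type `nn.Cell` is supported. +- The index value index is a Tensor scalar of type `int32` in the range `[-n, n)`, where `n` is the size of the tuple; the maximum supported tuple size is 1000. +- 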
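The Construct functions of all Cell elements in the tuple must take inputs of the same number, type, and shape dimensions, and must produce outputs of the same number, type, and shape dimensions. +- Each Cell element in the tuple must be defined before the tuple is defined. +- This syntax cannot be used as an execution branch of control flow such as if, while, and for, unless the control condition of the control flow is a constant. For example: + - Supported usage: - def construct(self, x, index, flag): - if flag: - x = self.layers[index](x) - return x - ``` + ```python + class Net(nn.Cell): + def __init__(self, flag=True): + super(Net, self).__init__() + self.flag = flag + self.relu = nn.ReLU() + self.softmax = nn.Softmax() + self.layers = (self.relu, self.softmax) + + def construct(self, x, index): + if self.flag: + x = self.layers[index](x) + return x + ``` + + - Unsupported usage: + + ```python + class Net(nn.Cell): + def __init__(self): + super(Net, self).__init__() + self.relu = nn.ReLU() + self.softmax = nn.Softmax() + self.layers = (self.relu, self.softmax) + + def construct(self, x, index, flag): + if flag: + x = self.layers[index](x) + return x + ``` tuple also supports slice value retrieval, but a slice of the Tensor type is not supported; `tuple_x[start:stop:step]` is supported and behaves the same as in Python, so it is not described further here. ### Unsupported Syntax -The following syntax is currently not supported in network construction functions: +The following syntax is currently not supported in network construction functions: `raise`, `yield`, `async for`, `with`, `async with`, `assert`, `import`, `await`. ## Network Definition Constraints ### Instance Types of the Entire Network -* Common Python functions with the [@ms_function](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.html#mindspore.ms_function) decorator. -* Cell subclasses inherited from [nn.Cell](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.Cell). + +- Common Python functions with the [@ms_function](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.html#mindspore.ms_function) decorator. +- Cell subclasses inherited from [nn.Cell](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.Cell). ### Network Input Types -* The training data input parameters of the entire network can only be of the Tensor type. -* The generated ANF graph cannot contain the following constant nodes: string constants, constants with nested Tuples, and constants with nested Lists. + +- The training data input parameters of the entire network can only be of the Tensor type. +- The generated ANF graph cannot contain the following constant nodes: string constants, constants with nested Tuples, and constants with nested Lists. ### Network Graph Optimization + During graph optimization in the ME frontend, DataClass types, Dictionary, List, and key-value pair operations are converted into Tuple-related operations. ### Network Construction Components @@ -229,33 +244,36 @@ tuple also supports slice value retrieval, but a slice of the Tensor type is not supported; | 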
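Composite operators |[mindspore/ops/composite/*](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html) | Operators generated by constexpr |Value computation operators generated with [@constexpr](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.constexpr). - ### Other Constraints + 1. The input parameters of the `construct` function of the entire network and the parameters of functions decorated with `ms_function` are generalized during graph compilation and cannot be passed to operators as constant inputs. Therefore, in graph mode, the parameters of the entry network can only be `Tensor`, as shown in the following example: - - * Incorrect usage: + + - Incorrect usage: + ```python class ExpandDimsTest(Cell): def __init__(self): super(ExpandDimsTest, self).__init__() self.expandDims = P.ExpandDims() - + def construct(self, input_x, input_axis): return self.expandDims(input_x, input_axis) expand_dim = ExpandDimsTest() input_x = Tensor(np.random.randn(2,2,2,2).astype(np.float32)) expand_dim(input_x, 0) ``` + In this example, `ExpandDimsTest` is a single-operator network with two inputs, `input_x` and `input_axis`. The second input of the `ExpandDims` operator must be a constant because it is needed to infer the operator's output dimensions during graph compilation. As a network parameter, `input_axis` is generalized into a variable whose value cannot be determined, so the output dimensions cannot be inferred and graph compilation fails. Therefore, any input whose value must be inferred at graph compilation should be a constant input. In the API, such parameters are documented and marked "constant input is needed". - - * The correct usage is to pass the required value or a class member variable directly as the operator's constant input in the construct function, as follows: + + - The correct usage is to pass the required value or a class member variable directly as the operator's constant input in the construct function, as follows: + ```python class ExpandDimsTest(Cell): def __init__(self, axis): super(ExpandDimsTest, self).__init__() self.expandDims = P.ExpandDims() self.axis = axis - + def construct(self, input_x): return self.expandDims(input_x, self.axis) axis = 0 @@ -266,48 +284,53 @@ tuple also supports slice value retrieval, but a slice of the Tensor type is not supported; 2. Data members of the network that are not of the `Parameter` type must not be modified. For example: - ``` + ```python class Net(Cell): def __init__(self): super(Net, self).__init__() self.num = 2 self.par = Parameter(Tensor(np.ones((2, 3, 4))), name="par") - + def construct(self, x, y): return x + y ``` + In the network defined above, `self.num` is not a `Parameter` and must not be modified, whereas `self.par` is a `Parameter` and can be modified. 3. 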
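When an undefined class member is used in the `construct` function, `AttributeError` is not raised as the Python interpreter would; the member is treated as `None` instead. For example: - ``` + + ```python class Net(Cell): def __init__(self): super(Net, self).__init__() - + def construct(self, x): return x + self.y ``` + In the network defined above, `construct` uses the undefined class member `self.y`, which is treated as `None`. - + 4. When `if-else` control flow is used in the `construct` function, the data types returned by `if` and `else`, or the data types of the same variable after it is updated in each branch, must be the same. For example: - ``` + + ```python class NetReturn(Cell): def __init__(self): super(NetReturn, self).__init__() - + def construct(self, x, y, m, n): if x > y: return m else: return n ``` + In the network `NetReturn` defined above, `construct` uses `if-else` control flow, so `m` returned by the `if` branch and `n` returned by the `else` branch must have the same data type. - - ``` + + ```python class NetAssign(Cell): def __init__(self): super(NetAssign, self).__init__() - + def construct(self, x, y, m, n): out = None if x > y: @@ -316,4 +339,5 @@ tuple also supports slice value retrieval, but a slice of the Tensor type is not supported; out = n return out ``` + In the network `NetAssign` defined above, `construct` uses `if-else` control flow, so `out` as updated in the `if` branch and `out` as updated in the `else` branch must have the same data type. diff --git a/docs/note/source_zh_cn/design/mindarmour/differential_privacy_design.md b/docs/note/source_zh_cn/design/mindarmour/differential_privacy_design.md index 608841c576e8b106b3a36e775dfaafe88ee91492..256d719bcf7dc1e789d9895d3841598691da4672 100644 --- a/docs/note/source_zh_cn/design/mindarmour/differential_privacy_design.md +++ b/docs/note/source_zh_cn/design/mindarmour/differential_privacy_design.md @@ -26,14 +26,13 @@ The Differential-Privacy module of MindArmour implements differential privacy training. Figure 1 shows the overall design of differential privacy training, which mainly consists of DP Mechanisms (differential privacy noise mechanisms), the DP Optimizer (differential privacy optimizer), and the Privacy Monitor (differential privacy monitor). - ### Differential Privacy Optimizer The differential privacy optimizer inherits the capabilities of the MindSpore optimizer and uses differential privacy noise mechanisms to perturb and protect gradients. Currently, MindArmour provides three types of differential privacy optimizers: the fixed Gaussian optimizer, the adaptive Gaussian optimizer, and the adaptive clipping optimizer. Each adds differential privacy protection to conventional optimizers such as SGD and Momentum from a different angle. -* The fixed Gaussian optimizer is a differential privacy optimizer with non-adaptive Gaussian noise. Its advantage is that the differential privacy budget ϵ can be strictly controlled. Its disadvantage is that the amount of noise added at each step is fixed during training; if the number of iterations is too large, noise in the later stages of training makes convergence difficult and can even cause a sharp drop in performance, resulting in poor model usability. -* The adaptive Gaussian optimizer adjusts the magnitude of Gaussian noise by adaptively adjusting the standard deviation. More noise is added early in training; as the model converges, the noise decreases, reducing its impact on model usability. The disadvantage of adaptive Gaussian noise is that the differential privacy budget cannot be strictly controlled. -* 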
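The adaptive clipping optimizer is a differential privacy optimizer that adaptively adjusts the clipping granularity. Gradient clipping is an important operation in differential privacy training. The adaptive clipping optimizer adaptively keeps the gradient clipping ratio within a given range, controlling the granularity of gradient clipping during iterative training. +- The fixed Gaussian optimizer is a differential privacy optimizer with non-adaptive Gaussian noise. Its advantage is that the differential privacy budget ϵ can be strictly controlled. Its disadvantage is that the amount of noise added at each step is fixed during training; if the number of iterations is too large, noise in the later stages of training makes convergence difficult and can even cause a sharp drop in performance, resulting in poor model usability. +- The adaptive Gaussian optimizer adjusts the magnitude of Gaussian noise by adaptively adjusting the standard deviation. More noise is added early in training; as the model converges, the noise decreases, reducing its impact on model usability. The disadvantage of adaptive Gaussian noise is that the differential privacy budget cannot be strictly controlled. +- The adaptive clipping optimizer is a differential privacy optimizer that adaptively adjusts the clipping granularity. Gradient clipping is an important operation in differential privacy training. The adaptive clipping optimizer adaptively keeps the gradient clipping ratio within a given range, controlling the granularity of gradient clipping during iterative training. ### Differential Privacy Noise Mechanisms @@ -41,25 +40,24 @@ The Differential-Privacy module of MindArmour implements differential privacy training. ### Monitor -Monitor provides callback functions such as RDP and ZCDP to monitor the differential privacy budget of the model. +Monitor provides callback functions such as RDP and ZCDP to monitor the differential privacy budget of the model. -* ZCDP[2] +- ZCDP[2] ZCDP, zero-concentrated differential privacy, is a relaxed definition of differential privacy that uses the Rényi divergence to measure the distribution difference of a random function on adjacent datasets. -* RDP[3] +- RDP[3] RDP, Rényi Differential Privacy, is a more general definition of differential privacy based on the Rényi divergence, which it uses to measure the distribution difference between two adjacent datasets. - -Compared with traditional differential privacy, both ZCDP and RDP provide a stricter upper-bound guarantee on the privacy budget. +Compared with traditional differential privacy, both ZCDP and RDP provide a stricter upper-bound guarantee on the privacy budget. ## Code Implementation -* [mechanisms.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/mechanisms/mechanisms.py): This file implements the noise generation mechanisms required for differential privacy training, including simple Gaussian noise, adaptive Gaussian noise, and adaptive clipping Gaussian noise. -* [optimizer.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/optimizer/optimizer.py): This file implements the fundamental logic of adding noise during backpropagation using the noise generation mechanisms. -* [monitor.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/monitor/monitor.py): This file implements the callback functions that compute the differential privacy budget; during model training, it reports the current differential privacy budget. -* [model.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/train/model.py): This file implements the logic for computing the loss and gradients. The gradient truncation logic of differential privacy training is implemented in this file, and model.py is the entry point for users to use differential privacy training. +- [mechanisms.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/mechanisms/mechanisms.py): This file implements the noise generation mechanisms required for differential privacy training, including simple Gaussian noise, adaptive Gaussian noise, and adaptive clipping Gaussian noise. +- [optimizer.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/optimizer/optimizer.py): This file implements the fundamental logic of adding noise during backpropagation using the noise generation mechanisms. +- 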
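[monitor.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/monitor/monitor.py): This file implements the callback functions that compute the differential privacy budget; during model training, it reports the current differential privacy budget. +- [model.py](https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/privacy/diff_privacy/train/model.py): This file implements the logic for computing the loss and gradients. The gradient truncation logic of differential privacy training is implemented in this file, and model.py is the entry point for users to use differential privacy training. ## References diff --git a/docs/note/source_zh_cn/design/mindarmour/fuzzer_design.md b/docs/note/source_zh_cn/design/mindarmour/fuzzer_design.md index 8f7fcbf81c368446fe19e06baa71933cc9559a7e..0d753d7bb92744cb7bc2ae96a665ac09fafec77b 100644 --- a/docs/note/source_zh_cn/design/mindarmour/fuzzer_design.md +++ b/docs/note/source_zh_cn/design/mindarmour/fuzzer_design.md @@ -2,7 +2,6 @@ `Linux` `Ascend` `GPU` `CPU` `Data Preparation` `Model Development` `Model Training` `Model Optimization` `Enterprise` `Expert` - - [AI Model Security Testing](#ai模型安全测试) - [Background](#背景) diff --git a/docs/note/source_zh_cn/design/mindinsight/graph_visual_design.md b/docs/note/source_zh_cn/design/mindinsight/graph_visual_design.md index 600da5f29c5015556c5bd03a521d589107db7783..be8a8c686bb95ba565506ded1940d8b87281ca51 100644 --- a/docs/note/source_zh_cn/design/mindinsight/graph_visual_design.md +++ b/docs/note/source_zh_cn/design/mindinsight/graph_visual_design.md @@ -21,9 +21,9 @@ The computational graph visualization feature mainly helps developers in the following scenarios. - - When writing deep learning neural network code, developers can use the computational graph to view the data flow of operators in the network and the model structure. - - The computational graph also makes it easy to view the input and output nodes of a specified node and the attributes of a searched node. - - When debugging a network, developers can easily trace data through the visualized computational graph, including changes in data dimensions and types. +- When writing deep learning neural network code, developers can use the computational graph to view the data flow of operators in the network and the model structure. +- The computational graph also makes it easy to view the input and output nodes of a specified node and the attributes of a searched node. +- When debugging a network, developers can easily trace data through the visualized computational graph, including changes in data dimensions and types. ## Overall Design diff --git a/docs/note/source_zh_cn/design/mindinsight/tensor_visual_design.md b/docs/note/source_zh_cn/design/mindinsight/tensor_visual_design.md index b8439752e6278735331057a4ca3ec254525ad732..44d4db5b12ddc5dc04e3ed2cedfb16dd69bb382d 100644 --- a/docs/note/source_zh_cn/design/mindinsight/tensor_visual_design.md +++ b/docs/note/source_zh_cn/design/mindinsight/tensor_visual_design.md @@ -60,7 +60,8 @@ Tensor visualization supports displaying 1-N dimensional Tensors as tables or histograms; for 0 #### 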
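File API Design The `summary.proto` file is the main entry point. Tensor data (TensorProto) is stored in the Value of Summary, as shown below: -``` + +```protobuf { message Summary { message Image { @@ -69,7 +70,7 @@ Tensor visualization supports displaying 1-N dimensional Tensors as tables or histograms; for 0 required int32 width = 2; ... } - + message Histogram { message bucket{ // Counting number of values fallen in [left, left + width). @@ -78,7 +79,7 @@ Tensor visualization supports displaying 1-N dimensional Tensors as tables or histograms; for 0 required double width = 2; required int64 count = 3; } - + repeated bucket buckets = 1; ... } @@ -86,7 +87,7 @@ Tensor visualization supports displaying 1-N dimensional Tensors as tables or histograms; for 0 message Value { // Tag name for the data. required string tag = 1; - + // Value associated with the tag. oneof value { float scalar_value = 3; @@ -100,4 +101,5 @@ Tensor visualization supports displaying 1-N dimensional Tensors as tables or histograms; for 0 repeated Value value = 1; } ``` -TensorProto is defined in the [anf_ir.proto](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/utils/anf_ir.proto) file. \ No newline at end of file + +TensorProto is defined in the [anf_ir.proto](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/utils/anf_ir.proto) file. diff --git a/docs/note/source_zh_cn/design/mindspore/architecture_lite.md b/docs/note/source_zh_cn/design/mindspore/architecture_lite.md index eecfd0610dad6c0809e69bc11e781727331842df..23738f4426bf18f7ac54f1907ead31a8042ed04c 100644 --- a/docs/note/source_zh_cn/design/mindspore/architecture_lite.md +++ b/docs/note/source_zh_cn/design/mindspore/architecture_lite.md @@ -2,7 +2,6 @@ `Linux` `Windows` `On Device` `Inference Application` `Intermediate` `Expert` `Contributor` - The overall architecture of the MindSpore Lite framework is shown below: @@ -15,9 +14,8 @@ The overall architecture of the MindSpore Lite framework is shown below: - **Backend:** Performs graph optimization based on IR, consisting of GHLO, GLLO, and quantization. GHLO handles hardware-independent optimizations such as operator fusion and constant folding; GLLO handles hardware-related optimizations; the Quantizer supports post-training quantization methods such as weight quantization and activation quantization. -- **Runtime:** The inference runtime for smart devices. The session manages sessions and provides external APIs; the thread pool and parallel primitives manage the thread pool used for graph execution; memory allocation handles memory reuse across operators during graph execution; and the operator library provides CPU and GPU operators. +- **Runtime:** The inference runtime for smart devices. The session manages sessions and provides external APIs; the thread pool and parallel primitives manage the thread pool used for graph execution; memory allocation handles memory reuse across operators during graph execution; and the operator library provides CPU and GPU operators. - **Micro:** The runtime for IoT devices, including generation of .c files from models, a thread pool, memory reuse, and an operator library. -Runtime and Micro share the underlying infrastructure layer, including the operator library, memory allocation, thread pool, and parallel primitives. - 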
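+Runtime and Micro share the underlying infrastructure layer, including the operator library, memory allocation, thread pool, and parallel primitives. diff --git a/docs/note/source_zh_cn/design/mindspore/distributed_training_design.md b/docs/note/source_zh_cn/design/mindspore/distributed_training_design.md index 5b67aa47b3f30bc79b0e97736a5bfbf5eeec75cb..97a9a328b99dba77ff968775ef848096a0c995fc 100644 --- a/docs/note/source_zh_cn/design/mindspore/distributed_training_design.md +++ b/docs/note/source_zh_cn/design/mindspore/distributed_training_design.md @@ -24,7 +24,6 @@ With the rapid development of deep learning, datasets and parameter counts are growing exponentially to improve the accuracy and generalization of neural networks. Distributed parallel training has become a trend for resolving the performance bottleneck of ultra-large-scale networks. MindSpore supports the mainstream distributed training paradigms and has developed an automatic hybrid parallel solution. This design document focuses on the design principles of several parallel training modes and guides users in custom development. - ## Concepts ### Collective Communication @@ -74,7 +73,6 @@ - [grad_reducer.py](https://gitee.com/mindspore/mindspore/blob/master/mindspore/nn/wrap/grad_reducer.py): This file implements gradient aggregation. The input parameter `grads` is expanded with `HyperMap` and `AllReduce` operators are inserted. The global communication group is used here; users can also follow this module for custom development according to their network requirements. In MindSpore, standalone and distributed execution share the same network encapsulation interface, and `ParallelMode` is used inside the `Cell` to determine whether gradients need to be aggregated. For the network encapsulation interface, refer to the `TrainOneStepCell` implementation. - ## Automatic Parallelism Automatic parallelism is a key feature of MindSpore. It implements hybrid parallel training that automatically combines data parallelism and model parallelism, helping users express parallel algorithm logic with standalone scripts, reducing the difficulty of distributed training, and improving algorithm development efficiency while maintaining high training performance. This section describes how the `ParallelMode.AUTO_PARALLEL` automatic parallel mode and the `ParallelMode.SEMI_AUTO_PARALLEL` semi-automatic parallel mode work in MindSpore. @@ -90,34 +88,31 @@ To obtain the tensor layout model, each operator has a shard strategy, which describes how each input of the operator is split along the corresponding dimensions. In general, any dimension of a tensor can be split as long as the split is a power of 2 and evenly distributed. Take the following figure as an example: it is a three-dimensional matrix multiplication (BatchMatMul) whose shard strategy consists of two tuples, describing how `input` and `weight` are split. The elements in each tuple correspond one-to-one to the tensor dimensions; `2^N` is the number of shards, and `1` means no split. To express a data parallel shard strategy, in which the `batch` dimension of `input` is split and the other dimensions are not, use `strategy=((2^N, 1, 1),(1, 1, 1))`. To express a model parallel shard strategy, in which a non-`batch` dimension of `weight` is split (the `channel` dimension is used as an example here) and the other dimensions are not, use `strategy=((1, 1, 1),(1, 1, 2^N))`. For a hybrid parallel shard strategy, one possible strategy is `strategy=((2^N, 1, 1),(1, 1, 2^N))`. ![Operator shard definition](./images/operator_split.png) - + Based on the shard strategy, a distributed operator defines methods for deriving the layout models of its input and output tensors. The layout model consists of `device_matrix`, `tensor_shape`, and `tensor map`, which represent the device matrix shape, the tensor shape, and the mapping between device and tensor dimensions. The distributed operator then uses the tensor layout model to determine whether extra computation or communication operations must be inserted into the graph to keep the operator logic correct. 2. 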
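Tensor layout transformation When the output tensor model of one operator is inconsistent with the input tensor model of the next, computation and communication operations must be introduced to transform between tensor layouts. The automatic parallel process introduces a tensor redistribution algorithm, which can derive the communication conversion between tensors of arbitrary layouts. The following three examples show the parallel computation of the formula `Z=(X×W)×V`, that is, two two-dimensional matrix multiplications, and illustrate how to convert between different parallel modes. In example 1, the output of the first (data parallel) matrix multiplication is split along the row direction, while the input of the second (model parallel) matrix multiplication requires the full tensor. The framework automatically inserts an `AllGather` operator to perform the layout transformation. - + ![Tensor layout transformation](./images/tensor_redistribution1.png) - + In example 2, the output of the first (model parallel) matrix multiplication is split along the column direction, while the input of the second (data parallel) matrix multiplication is split along the row direction. The framework automatically inserts a communication operator equivalent to the `AlltoAll` collective operation to perform the layout transformation. ![Tensor layout transformation](./images/tensor_redistribution2.png) - In example 3, the output of the first hybrid parallel matrix multiplication is split in the same way as the input of the second, so no redistribution is needed. However, because the related dimensions of the two inputs of the second matrix multiplication are split, an `AllReduce` operator must be inserted to ensure correctness. ![Tensor layout transformation](./images/tensor_redistribution3.png) - In summary, points 1 and 2 are the basis of automatic parallelism. Overall, this distributed representation breaks the boundary between data parallelism and model parallelism and makes hybrid parallelism easy. At the script level, users only need to build a standalone network to express the parallel algorithm logic, and the framework automatically splits the entire graph. 3. Shard strategy search algorithm When users are familiar with the shard representation of operators and manually configure shard strategies for them, this is the `SEMI_AUTO_PARALLEL` semi-automatic parallel mode. This mode helps with manual tuning but is still difficult to debug: users must understand the parallel principles and derive a high-performance parallel scheme from the network structure, cluster topology, and so on. To further accelerate parallel network training, the `AUTO_PARALLEL` automatic parallel mode builds on the semi-automatic mode and adds automatic search for parallel shard strategies. Automatic parallelism builds a cost model for the hardware platform and calculates the computation cost, memory cost, and communication cost of a given amount of data and a given operator under different shard strategies. Then, using dynamic programming or recursive programming with the memory limit of a single device as the constraint, it efficiently searches for a shard strategy with good performance. - - Strategy search replaces manual specification of model sharding and yields a high-performance sharding scheme in a short time, greatly lowering the barrier to parallel training. + Strategy search replaces manual specification of model sharding and yields a high-performance sharding scheme in a short time, greatly lowering the barrier to parallel training. 4. Distributed automatic differentiation @@ -140,7 +135,5 @@ 5. Entire graph sharding - [step_auto_parallel.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/frontend/parallel/step_auto_parallel.h), [step_parallel.h](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ccsrc/frontend/parallel/step_parallel.h): These two files contain the core implementation of the automatic parallel process. `step_auto_parallel.h` first invokes the strategy search process and generates the `OperatorInfo` of distributed operators; `step_parallel.h` then handles operator sharding and tensor redistribution to transform the standalone computational graph into a distributed one. - 6. 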
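Backward of communication operators - [grad_comm_ops.py](https://gitee.com/mindspore/mindspore/blob/master/mindspore/ops/_grad/grad_comm_ops.py): This file defines the backward operations of communication operators such as `AllReduce` and `AllGather`. - diff --git a/docs/note/source_zh_cn/design/mindspore/mindir.md b/docs/note/source_zh_cn/design/mindspore/mindir.md index e501365e3740586a898f86d30eff8866709e61e9..01fd8b8ab770c1db0a3749607f368199f41e36bc 100644 --- a/docs/note/source_zh_cn/design/mindspore/mindir.md +++ b/docs/note/source_zh_cn/design/mindspore/mindir.md @@ -20,13 +20,16 @@ ## Overview + An intermediate representation (IR) is a representation of a program between the source and target languages during compilation, which facilitates program analysis and optimization by the compiler. IR design therefore needs to consider the difficulty of converting from the source language to the target language, as well as the ease of use and performance of program analysis and optimization. MindIR is a graph-based functional IR whose core purpose is to serve automatic differentiation transformation. Automatic differentiation uses a transformation method based on the functional programming framework, so the IR adopts semantics close to ANF functional semantics. In addition, drawing on the excellent designs of Sea of Nodes[1] and Thorin[2], it adopts a representation based on an explicit dependency graph. ## Grammar Definition + ANF is a simple intermediate representation commonly used in functional programming. Its grammar is defined as follows: -``` + +```text <aexp> ::= NUMBER | STRING | VAR | BOOLEAN | PRIMOP | (lambda (VAR …) <exp>) <cexp> ::= (<aexp> <aexp> …) @@ -34,17 +37,20 @@ ANF is a simple intermediate representation commonly used in functional programming. Its grammar is defined as <exp> ::= (let ([VAR <cexp>]) <exp>) | <cexp> | <aexp> ``` + Expressions in ANF are classified into atomic expressions (aexp) and compound expressions (cexp). An atomic expression is a constant value, a variable, or an anonymous function. A compound expression is composed of multiple atomic expressions and represents a call to an anonymous function or a primitive function; the first input of the combination is the called function and the remaining inputs are the call arguments. The MindIR grammar is inherited from ANF and is defined as follows: -``` + +```text <ANode> ::= <ValueNode> | <ParameterNode> <ParameterNode> ::= Parameter <ValueNode> ::= Scalar | Named | Tensor | Type | Shape - | Primitive | MetaFuncGraph | FuncGraph + | Primitive | MetaFuncGraph | FuncGraph <CNode> ::= (<AnfNode> …) <AnfNode> ::= <CNode> | <ANode> ``` + ANode in MindIR corresponds to the atomic expression in ANF and has two subclasses: ValueNode and ParameterNode. A ValueNode is a constant node that can carry a constant value (scalar, symbol, tensor, type, dimension, and so on), or a primitive function (Primitive), a meta-function (MetaFuncGraph), or a common function (FuncGraph), because in functional programming a function definition is itself a value. A ParameterNode is a parameter node that represents a formal parameter of a function. CNode in MindIR corresponds to the compound expression in ANF and represents a function call. @@ -52,7 +58,9 @@ CNode in MindIR corresponds to the compound expression in ANF and represents a function call. During automatic differentiation in MindSpore, the gradient contributions of ParameterNode and CNode are computed and the final gradient of ParameterNode is returned; the gradient of ValueNode is not computed. ## Example + The following program is used as an example to help understand MindIR by comparison. + ```python def func(x, y): return x / y @@ -64,8 +72,10 @@ def test_f(x, y): c = b * func(a, b) return c ``` + The ANF expression corresponding to this Python code is: -``` + +```python lambda (x, y) let a = x - 1 in let b = a + y in @@ -76,26 +86,31 @@ lambda (x, y) let c = b * %1 in c end ``` + 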
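The corresponding MindIR is [ir.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_zh_cn/design/mindspore/images/ir/ir.dot): -![](./images/ir/ir.png) +![image](./images/ir/ir.png) In MindIR, a function graph (FuncGraph) represents the definition of a common function. A function graph is generally a directed acyclic graph composed of ParameterNode, ValueNode, and CNode, which clearly expresses the computation from parameters to return value. As the figure above shows, the two functions `test_f` and `func` in the Python code are converted into two function graphs. Their parameters `x` and `y` are converted into ParameterNodes of the function graphs, and each expression is converted into a CNode. The first input of a CNode links to the called function, such as `add`, `func`, and `return` in the figure. Note that these nodes are all `ValueNode`, because they are treated as constant function values. The other inputs of a CNode link to the call arguments, whose values can come from ParameterNode, ValueNode, and other CNodes. In ANF, each expression is bound to a variable by a let expression, and dependencies on the expression output are expressed by references to the variable. In MindIR, each expression is bound to a node, and dependencies are expressed by directed edges between nodes. ## How to Save IR + Use `context.set_context(save_graphs=True)` to save the intermediate code of each compilation stage. The intermediate code is saved in two formats: a text format with the `.ir` suffix and a graphical format with the `.dot` suffix. For small networks, the more intuitive graphical format is recommended; for large networks, the more efficient text format is recommended. DOT files can be converted to an image format with graphviz for viewing. For example, the command to convert dot to png is `dot -Tpng *.dot -o *.png`. ## Functional Semantics + An important feature of MindIR compared with traditional computational graphs is that it can express not only data dependencies between operators but also rich functional semantics. + ### Higher-Order Functions + In MindIR, a function is defined by a subgraph, but the function itself can be a value that is passed around as the input or output of other higher-order functions. In the following simple example, function `f` is passed as an argument to function `g`, so `g` is a higher-order function that takes a function as input, and the actual call site of `f` is inside `g`. -``` +```python @ms_function def hof(x): def f(x): @@ -108,14 +123,16 @@ def hof(x): The corresponding MindIR is [hof.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_zh_cn/design/mindspore/images/ir/hof.dot): -![](./images/ir/hof.png) +![image](./images/ir/hof.png) In actual network training scripts, the automatic differentiation functional `GradOperation` and `Partial` and `HyperMap`, which are commonly used in optimizers, are typical higher-order functions. Higher-order semantics greatly improve the flexibility and simplicity of MindSpore expressions. ### Control Flow + Control flow in MindIR is expressed as the selection and invocation of higher-order functions. This form converts control flow into the data flow of higher-order functions, making the automatic differentiation algorithm more powerful: it supports automatic differentiation not only of data flow but also of control flow such as conditional jumps, loops, and recursion. The following simple Fibonacci example illustrates this. + ```python @ms_function def fibonacci(n): @@ -129,15 +146,16 @@ def fibonacci(n): The corresponding MindIR is [cf.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_zh_cn/design/mindspore/images/ir/cf.dot): -![](./images/ir/cf.png) +![image](./images/ir/cf.png) 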
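Here `fibonacci` is the top-level function graph, in which two function graphs are selected for invocation by `switch`. `✓fibonacci` is the True branch of the first `if`, and `✗fibonacci` is its False branch. `✓✗fibonacci`, called within `✗fibonacci`, is the True branch of `elif`, and `✗✗fibonacci` is the False branch of `elif`. The key point is that in MindIR, conditional jumps and recursion are expressed as higher-order control flow. For example, `✓fibonacci` and `✗fibonacci` are passed in as arguments of the `switch` operator, and `switch` selects one of the functions as its return value according to the condition argument. `switch` therefore performs a binary selection on the input functions as ordinary values without calling them; the actual function call is made on the CNode that follows `switch`. - ### Free Variables and Closures + A closure is a programming language feature that combines a code block with its scope environment. A free variable is a variable in a code block that refers to the scope environment rather than being local. In MindIR, a code block is represented as a function graph, the scope environment can be understood as the context in which the function is called, and free variables are captured by value copy rather than by reference. A typical closure example is as follows: + ```python @ms_function def func_outer(a, b): @@ -155,14 +173,15 @@ def ms_closure(): The corresponding MindIR is [closure.dot](https://gitee.com/mindspore/docs/blob/master/docs/note/source_zh_cn/design/mindspore/images/ir/closure.dot): -![](./images/ir/closure.png) +![image](./images/ir/closure.png) In the example, `a` and `b` are free variables, because the variables `a` and `b` in `func_inner` refer to the parameters defined in its parent graph `func_outer`. The variable `closure` is a closure, combining the function `func_inner` with its context `func_outer(1, 2)`. Therefore, the result of `out1` is 4, because it is equivalent to `1+2+1`, and the result of `out2` is 5, because it is equivalent to `1+2+2`. ## References + [1] C. Click and M. Paleczny. A simple graph-based intermediate representation. SIGPLAN Not., 30:35–49, March 1995. [2] Roland Leißa, Marcel Köster, and Sebastian Hack. A graph-based higher-order intermediate representation. In Proceedings of the 13th Annual IEEE/ACM International Symposium on -Code Generation and Optimization, pages 202–212. IEEE Computer Society, 2015. \ No newline at end of file +Code Generation and Optimization, pages 202–212. IEEE Computer Society, 2015. 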
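Reviewer note on the closure example in `mindir.md` above: the hunk context shows only the `func_outer` and `ms_closure` signatures, so the following is a minimal plain-Python sketch of what the prose describes, not the file's actual code. The body of `func_inner` (`a + b + c`) is an assumption reconstructed from the stated results (`1+2+1` and `1+2+2`), and the `@ms_function` decorator is left out so the snippet runs without MindSpore.

```python
# Plain-Python sketch of the closure example discussed in mindir.md.
# Assumption: func_inner's body (a + b + c) is inferred from the prose;
# the diff context does not show it. @ms_function is omitted on purpose.


def func_outer(a, b):
    def func_inner(c):
        # a and b are free variables taken from func_outer's scope.
        return a + b + c

    return func_inner


def ms_closure():
    # closure bundles func_inner with the context func_outer(1, 2).
    closure = func_outer(1, 2)
    out1 = closure(1)  # equivalent to 1 + 2 + 1
    out2 = closure(2)  # equivalent to 1 + 2 + 2
    return out1, out2


if __name__ == "__main__":
    print(ms_closure())
```

One difference worth keeping in mind: CPython closures hold references to the enclosing variables, whereas the document states that MindIR captures free variables by value copy. The two agree here only because `a` and `b` are never reassigned.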
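diff --git a/docs/note/source_zh_cn/design/mindspore/profiler_design.md b/docs/note/source_zh_cn/design/mindspore/profiler_design.md index 9b0b965b0ae7634703ead68a728d949967424cd9..219b40dee8b3e6ac567c396e604d91adb7f260f0 100644 --- a/docs/note/source_zh_cn/design/mindspore/profiler_design.md +++ b/docs/note/source_zh_cn/design/mindspore/profiler_design.md @@ -5,23 +5,23 @@ - [Profiler Design Document](#profiler设计文档) - - [Background](#背景) - - [Profiler Framework Design](#profiler架构设计) - - [Context](#上下文) - - [Module Hierarchy](#模块层级结构) - - [Internal Module Interaction](#内部模块交互) - - [Submodule Design](#准备训练脚本) - - [ProfilerAPI and Controller](#profiler-api-controller) - - [ProfilerAPI and Controller Module Introduction](#profiler-api-controller模块介绍) - - [Analyser](#analyser) - - [Analyser Module Introduction](#analyser模块介绍) - - [Analyser Module Design](#analyser模块设计) - - [Parser](#parser) - - [Parser Module Introduction](#parser模块介绍) - - [Parser Module Design](#parser模块设计) - - [Proposer](#proposer) - - [Proposer Module Introduction](#proposer模块介绍) - - [Proposer Module Design](#proposer模块设计) + - [Background](#背景) + - [Profiler Framework Design](#profiler架构设计) + - [Context](#上下文) + - [Module Hierarchy](#模块层级结构) + - [Internal Module Interaction](#内部模块交互) + - [Submodule Design](#准备训练脚本) + - [ProfilerAPI and Controller](#profiler-api-controller) + - [ProfilerAPI and Controller Module Introduction](#profiler-api-controller模块介绍) + - [Analyser](#analyser) + - [Analyser Module Introduction](#analyser模块介绍) + - [Analyser Module Design](#analyser模块设计) + - [Parser](#parser) + - [Parser Module Introduction](#parser模块介绍) + - [Parser Module Design](#parser模块设计) + - [Proposer](#proposer) + - [Proposer Module Introduction](#proposer模块介绍) + - [Proposer Module Design](#proposer模块设计) @@ -32,6 +32,7 @@ To support performance debugging during model development in MindSpore, an easy-to-use Profile tool is needed that intuitively presents the performance information of a network model along various dimensions, offers users easy-to-use and rich performance analysis functions, and helps them quickly locate performance problems in the network. ## Profiler Architecture Design + This chapter describes the architecture design of Profiler. The first section describes its context and interactions from the perspective of Profiler as a whole, the second section looks inside Profiler and describes the module hierarchy and module division, and the third section describes the interactions and calls between modules. ### Context @@ -49,6 +50,7 @@ Profiler is part of the MindSpore debugging and tuning tools; in the overall usage process 2. The Profiler on the MindSpore side parses the raw data in the user script and generates intermediate data results in the folder specified by the user; 3. The Profiler on the MindInsight side consumes the intermediate data and provides visual Profiler functions for users. + ### Module Hierarchy The modules are layered as follows: @@ -57,8 +59,8 @@ Profiler is part of the MindSpore debugging and tuning tools; in the overall usage process Figure 2: Module hierarchy diagram - As shown in the figure above, the modules function as follows: + 1. 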
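ProfilerAPI is the entry point that the code side exposes to users; it provides an interface for starting performance collection and an interface for analysis; 2. Controller is the module below ProfilerAPI. It is called by the start interface in ProfilerAPI and controls starting and stopping the underlying performance collection; the raw data is written to a fixed location by ada; 3. Parser is the module that parses raw performance data. Because the raw performance data is collected on the device side, users cannot understand it directly; this module parses, combines and converts the information into intermediate results that users can understand and upper layers can analyze; @@ -66,6 +68,7 @@ Profiler is part of the MindSpore debugging and tuning tools; in the overall usage process 5. The common API provided by the backend Analyser is called over RESTful to obtain the target data, and a RESTful interface connects to the frontend. ### Internal Module Interaction + From the user's perspective there are two forms of use, API and RESTful. Taking the API as an example, a complete internal module interaction flow is as follows: ![time_order_profiler.png](./images/time_order_profiler.png) @@ -81,20 +84,22 @@ Profiler is part of the MindSpore debugging and tuning tools; in the overall usage process 3. The Profiler API analysis interface first uses the Parser module to parse the performance data and produce intermediate results, then calls the Analyser to analyze the intermediate results, and finally returns the various kinds of information to the user side. ## Submodule Design + ### ProfilerAPI and Controller #### ProfilerAPI and Controller Module Description + ProfilerAPI provides the entry API on the training script side; through ProfilerAPI, users start performance collection and analyze performance data. ProfilerAPI issues commands through Controller to control the startup of ada. #### ProfilerAPI and Controller Module Design + The ProfilerAPI module belongs to the upper application interface layer and is integrated by the training script. Its functions fall into two parts: - Before training, it calls the underlying Controller interface to issue a command that starts the profiling task. - After training completes, it calls the underlying Controller interface to issue a command that stops the profiling task, and then calls the Analyser and Parser module interfaces to parse the data files and generate result data such as operator performance statistics and training trace statistics. - The Controller module exposes an interface to the upper layer and calls the interface of the underlying performance collection module to issue the commands that start and stop performance collection. The raw performance data finally generated mainly includes: @@ -105,9 +110,13 @@ The Controller module exposes an interface to the upper layer and calls the interface of the underlying performance collection module - `training_trace.46.dev.profiler_default_tag` file: stores the start and end time of each step, and the time information of the iteration gap, the forward and backward passes, and the iteration tail. ### Parser + #### Parser Module Introduction + Parser is the module that parses raw performance data. Because the raw performance data is collected on the device side, users cannot understand it directly; this module parses, combines and converts the information into intermediate results that users can understand and upper layers can analyze. + #### Parser Module Design + ![parser_module_profiler.png](./images/parser_module_profiler.png) Figure 4: Parser module diagram @@ -122,6 +131,7 @@ Parser is the module that parses raw performance data; because the raw performance data is collected on the device ### Analyser #### Analyser Module Introduction + The analyser filters, sorts, queries and paginates the intermediate results generated in the parsing stage. #### Analyser Module Design @@ -141,9 +151,10 @@ Parser is the module that parses raw performance data; because the raw performance data is collected on the device To hide the internal implementation of Analyser and make it easy to call, the simple factory pattern is used, and the specified Analyser is obtained through AnalyserFactory. - ### Proposer + #### Proposer Module Introduction + Proposer is the performance optimization suggestion module of Profiler. Proposer calls the Analyser module to obtain performance data, analyzes the data against tuning rules, and outputs tuning suggestions that are shown to users through the UI and API. #### Proposer Module Design @@ -171,4 +182,4 @@ Proposer is the performance optimization suggestion module of Profiler; Proposer calls the Analyser module to obtain As shown in the module class diagram above: - Each type of Proposer inherits the abstract class Proposer and implements the analyze method; -- 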
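The API and CLI obtain a Proposer by calling the factory ProposerFactory, and call Proposer.analyze to obtain the optimization suggestions produced by each type of Proposer. \ No newline at end of file +- The API and CLI obtain a Proposer by calling the factory ProposerFactory, and call Proposer.analyze to obtain the optimization suggestions produced by each type of Proposer. diff --git a/docs/note/source_zh_cn/design/technical_white_paper.md b/docs/note/source_zh_cn/design/technical_white_paper.md index 244f94705c0fc50fe16a30c1b26fc10e50dde282..c3ec41c35159513d72d40770a3fdbe593ce3bbf9 100644 --- a/docs/note/source_zh_cn/design/technical_white_paper.md +++ b/docs/note/source_zh_cn/design/technical_white_paper.md @@ -13,11 +13,13 @@ ## Introduction + Deep learning research and applications have grown explosively in recent decades, setting off the third wave of artificial intelligence and achieving great success in image recognition, speech recognition and synthesis, autonomous driving, machine vision and other areas. This has also placed higher demands on the application of algorithms and on the frameworks they depend on. The continuous development of deep learning frameworks makes it easy to use large amounts of computing resources when training neural network models on large datasets. Deep learning is a class of machine learning algorithms that use multi-layer structures to automatically learn and extract high-level features from raw data. Extracting high-level, abstract features from raw data is usually very difficult. There are currently two mainstream kinds of deep learning framework: one builds a static graph before execution that defines all operations and the network structure, typified by TensorFlow, which improves performance during training at the cost of usability; the other uses immediately executed dynamic graph computation, typified by PyTorch. In comparison, dynamic graphs are more flexible and easier to debug, but sacrifice performance. Existing deep learning frameworks therefore struggle to deliver both easy development and efficient execution. ## Overview + As a new-generation deep learning framework, MindSpore draws on best practices from across the industry, is best matched to the computing power of Ascend processors, supports flexible deployment across device, edge and cloud scenarios, introduces a new AI programming paradigm, and lowers the barrier to AI development. MindSpore is a new deep learning computing framework that aims at three goals: easy development, efficient execution, and all-scenario coverage. To make development easy, MindSpore adopts an automatic differentiation (AD) mechanism based on source code transformation (SCT), which can represent complex combinations with control flow. Functions are converted into a functional intermediate representation (IR), and the IR constructs a computational graph that can be parsed and executed on different devices. Before execution, a variety of software-hardware co-optimization techniques are applied to the graph to improve performance and efficiency in device, edge and cloud scenarios. MindSpore supports dynamic graphs, which makes it easier to inspect the running mode. Because the AD mechanism is based on source code transformation, switching between dynamic and static graph modes is very simple. To train large models effectively on large datasets, MindSpore supports data parallel, model parallel and hybrid parallel training through advanced manual configuration strategies, which gives it great flexibility. In addition, MindSpore has an "auto parallel" capability that finds a fast parallel strategy by searching efficiently in a huge strategy space. For the specific advantages of the MindSpore framework, see the detailed introduction. -[View the technical white paper](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com:443/white_paper/MindSpore_white_paper.pdf) \ No newline at end of file +[View the technical white paper](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com:443/white_paper/MindSpore_white_paper.pdf) diff --git a/docs/note/source_zh_cn/glossary.md b/docs/note/source_zh_cn/glossary.md index 4297fd2647161670879a40c7ee2522077f4d8160..630fec2ea6cbb0bfe4cbbd913419853a056adc57 100644 --- 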
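a/docs/note/source_zh_cn/glossary.md +++ b/docs/note/source_zh_cn/glossary.md @@ -4,7 +4,7 @@ -| Term/Abbreviation | Description | +| Term/Abbreviation | Description | | ----- | ----- | | ACL | Ascend Computer Language, a C++ API library that provides device management, context management, stream management, memory management, model loading and execution, operator loading and execution, and media data processing, for users to develop deep neural network applications. | | Ascend | The family name of Huawei's Ascend series of chips. | diff --git a/docs/note/source_zh_cn/help_seeking_path.md b/docs/note/source_zh_cn/help_seeking_path.md index f3de79c3733fbf48e529bb4b65b1488d5e1fda42..ac3338260cf2cc17cbd838c5e7fc101da5021cf1 100644 --- a/docs/note/source_zh_cn/help_seeking_path.md +++ b/docs/note/source_zh_cn/help_seeking_path.md @@ -10,25 +10,23 @@ - Website search - - Go to the [official website search](https://www.mindspore.cn/search). - - When you run into a problem, website search is the recommended first step; it is simple and efficient. - - Enter keywords for the problem in the search box and click search to find content related to those keywords. - - Use the search results to solve the problem at hand. - + - Go to the [official website search](https://www.mindspore.cn/search). + - When you run into a problem, website search is the recommended first step; it is simple and efficient. + - Enter keywords for the problem in the search box and click search to find content related to those keywords. + - Use the search results to solve the problem at hand. - User group consultation - - QQ user group number: 871543426. - - If website search does not solve the problem, you can ask in the QQ user group; this method is recommended for users who want a quick consultation. - - After joining the group you can discuss with other users, and technical experts in the group also provide help and answers. - - Solve the problem at hand with the experts' answers or through discussion with other users. - + - QQ user group number: 871543426. + - If website search does not solve the problem, you can ask in the QQ user group; this method is recommended for users who want a quick consultation. + - After joining the group you can discuss with other users, and technical experts in the group also provide help and answers. + - Solve the problem at hand with the experts' answers or through discussion with other users. - Forum help - - If you want a detailed solution, you can post a help request on the [MindSpore forum](https://bbs.huaweicloud.com/forum/forum-1076-1.html) to get an answer. - - To improve the speed and quality of problem resolution, please read the [posting suggestions](https://bbs.huaweicloud.com/forum/thread-69695-1-1.html) before posting and follow the suggested format. - - After a post is published, a forum moderator will record the problem and contact technical experts to answer it; the problem will be resolved within three working days. - - Use the technical experts' solution to solve the problem at hand. + - If you want a detailed solution, you can post a help request on the [MindSpore forum](https://bbs.huaweicloud.com/forum/forum-1076-1.html) to get an answer. + - To improve the speed and quality of problem resolution, please read the [posting suggestions](https://bbs.huaweicloud.com/forum/thread-69695-1-1.html) before posting and follow the suggested format. + - After a post is published, a forum moderator will record the problem and contact technical experts to answer it; the problem will be resolved within three working days. + - Use the technical experts' solution to solve the problem at hand. - If the experts' testing confirms that a MindSpore feature needs improvement, users are encouraged to create an issue in the [MindSpore repository](https://gitee.com/mindspore); the reported problem will be fixed in a later version. \ No newline at end of file + If the experts' testing confirms that a MindSpore feature needs improvement, users are encouraged to create an issue in the [MindSpore repository](https://gitee.com/mindspore); the reported problem will be fixed in a later version. diff --git 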
a/docs/note/source_zh_cn/image_classification_lite.md b/docs/note/source_zh_cn/image_classification_lite.md index 4aa2960c32af591cbc6c277241358a80acea9a1b..10a4a449f18096e83803f77f67d0e18314cc19b5 100644 --- a/docs/note/source_zh_cn/image_classification_lite.md +++ b/docs/note/source_zh_cn/image_classification_lite.md @@ -4,7 +4,7 @@ ## Introduction to Image Classification -An image classification model predicts which objects appear in an image, identifying the list of objects in the image and their probabilities. For example, the classification results of model inference on the image below are shown in the following table: +An image classification model predicts which objects appear in an image, identifying the list of objects in the image and their probabilities. For example, the classification results of model inference on the image below are shown in the following table: ![image_classification](images/image_classification_result.png) @@ -30,4 +30,3 @@ | [Shufflenetv2](https://download.mindspore.cn/model_zoo/official/lite/shufflenetv2_lite/shufflenetv2.ms) | 8.8 | 67.74% | 87.62% | - | 8.303 | | [GoogleNet](https://download.mindspore.cn/model_zoo/official/lite/googlenet_lite/googlenet.ms) | 25.3 | 72.2% | 90.06% | - | 23.257 | | [ResNext50](https://download.mindspore.cn/model_zoo/official/lite/resnext50_lite/resnext50.ms) | 95.8 | 73.1% | 91.21% | - | 138.164 | - diff --git a/docs/note/source_zh_cn/network_list_ms.md b/docs/note/source_zh_cn/network_list_ms.md index 20b1d9107b3905545742e2ec8d84188ab3655e7a..5eca8d6bc206b3f18df676a916956db42696acad 100644 --- a/docs/note/source_zh_cn/network_list_ms.md +++ b/docs/note/source_zh_cn/network_list_ms.md @@ -1,7 +1,7 @@ # MindSpore Network Support `Linux` `Ascend` `GPU` `CPU` `Model Development` `Intermediate` `Advanced` - + - [MindSpore Network Support](#mindspore网络支持) @@ -13,7 +13,7 @@ ## Model Zoo -| Domain | Subdomain | Network | Ascend (Graph) | Ascend (PyNative) | GPU (Graph) | GPU (PyNative) | CPU (Graph) +| Domain | Subdomain | Network | Ascend (Graph) | Ascend (PyNative) | GPU (Graph) | GPU (PyNative) | CPU (Graph) |:---- |:------- |:---- |:---- |:---- |:---- |:---- |:---- |Computer Vision (CV) | Image Classification | [AlexNet](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/alexnet/src/alexnet.py) | Supported | Supported | Supported | Supported | Doing | Computer Vision (CV) | Image Classification | 
[GoogleNet](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/googlenet/src/googlenet.py) | Supported | Supported | Supported | Supported | Doing diff --git a/docs/note/source_zh_cn/object_detection_lite.md b/docs/note/source_zh_cn/object_detection_lite.md index 486d48d79d2ae8ed4212a26291862e84de425efe..b278206f80621b9496f29c2b9daaa69196b80077 100644 --- a/docs/note/source_zh_cn/object_detection_lite.md +++ b/docs/note/source_zh_cn/object_detection_lite.md @@ -23,4 +23,3 @@ | Model name | Size | mAP(IoU=0.50:0.95) | CPU 4-thread latency (ms) | |-----------------------| :----------: | :----------: | :-----------: | | [MobileNetv2-SSD](https://download.mindspore.cn/model_zoo/official/lite/ssd_mobilenetv2_lite/ssd.ms) | 16.7 | 0.22 | 25.4 | - diff --git a/docs/note/source_zh_cn/operator_list_implicit.md b/docs/note/source_zh_cn/operator_list_implicit.md index 710dd8c2267dad7ac96d7648d1cdf39047b1c441..e37729db056c97c4c9e3f0061b9ea5efa8fecb14 100644 --- a/docs/note/source_zh_cn/operator_list_implicit.md +++ b/docs/note/source_zh_cn/operator_list_implicit.md @@ -15,25 +15,28 @@ ## Implicit Type Conversion + ### Conversion Rules -* Scalar and Tensor operations: during the operation, the scalar is automatically converted to a Tensor whose data type matches that of the Tensor taking part in the operation; + +- Scalar and Tensor operations: during the operation, the scalar is automatically converted to a Tensor whose data type matches that of the Tensor taking part in the operation; when the Tensor is of type bool and the scalar is int or float, both the scalar and the Tensor are converted to Tensors of type int32 or float32; when the Tensor is of an int or uint type and the scalar is float, both the scalar and the Tensor are converted to Tensors of type float32. -* Operations between Tensors of different data types: the data type priority order is bool < uint8 < int8 < int16 < int32 < int64 < float16 < float32 < float64; +- Operations between Tensors of different data types: the data type priority order is bool < uint8 < int8 < int16 < int32 < int64 < float16 < float32 < float64; during the operation, the highest-priority data type among the participating Tensors is determined first, and the lower-priority Tensors are then converted to that data type; when Tensors of type int8 and uint8 are operated on together, both are converted to int16 Tensors. -* Data type conversion of a Parameter is not supported: if the conversion rules would require converting the data type of a Parameter defined in the network, a RuntimeError is raised. +- Data type conversion of a Parameter is not supported: if the conversion rules would require converting the data type of a Parameter defined in the network, a RuntimeError is raised. ### Data Types Involved in Conversion -* bool -* int8 -* uint8 -* int16 -* int32 -* int64 -* float16 -* float32 -* float64 + +- bool +- int8 +- uint8 +- int16 +- int32 
+- int64 +- float16 +- float32 +- float64 ### Supported Operators @@ -101,4 +104,3 @@ | [mindspore.ops.ScatterSub](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScatterSub) | [mindspore.ops.ScatterMul](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScatterMul) | [mindspore.ops.ScatterDiv](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScatterDiv) - diff --git a/docs/note/source_zh_cn/operator_list_lite.md b/docs/note/source_zh_cn/operator_list_lite.md index 04bedcfb2c9191797f9159642f492e3e7e455752..1b9695e052e7c696b3bca6126484c2342c354467 100644 --- a/docs/note/source_zh_cn/operator_list_lite.md +++ b/docs/note/source_zh_cn/operator_list_lite.md @@ -109,6 +109,6 @@ | Unsqueeze | | Supported | Supported | Supported | | | | | Unsqueeze | | Unstack | | Supported | | | | | Unstack | | | | Where | | Supported | | | | | Where | | | -| ZerosLike | | Supported | | | | | ZerosLike | | | +| ZerosLike | | Supported | | | | | ZerosLike | | | * Clip: only converting clip(0, 6) to Relu6 is supported. diff --git a/docs/note/source_zh_cn/operator_list_ms.md b/docs/note/source_zh_cn/operator_list_ms.md index f18271b91f55c22a580773285f911dfa0457b748..c4b6fb10ffccf6cfb2125cc55754eeffbbdeaaca 100644 --- a/docs/note/source_zh_cn/operator_list_ms.md +++ b/docs/note/source_zh_cn/operator_list_ms.md @@ -46,7 +46,8 @@ | [mindspore.nn.SSIM](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.SSIM) | Supported | Supported | Doing |layer/image | [mindspore.nn.PSNR](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.PSNR) | Supported |Supported | Doing |layer/image | [mindspore.nn.CentralCrop](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.CentralCrop) | Supported |Supported | Doing |layer/image -| 
[mindspore.nn.LSTM](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.LSTM) | Doing | Supported | Supported |layer/lstm +| [mindspore.nn.LSTM](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.LSTM) | Doing | Supported | Doing |layer/lstm +| [mindspore.nn.LSTMCell](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.LSTMCell) | Doing | Supported | Supported |layer/lstm | [mindspore.nn.GlobalBatchNorm](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.GlobalBatchNorm) | Supported |Doing | Doing |layer/normalization | [mindspore.nn.BatchNorm1d](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.BatchNorm1d) | Supported |Doing | Doing |layer/normalization | [mindspore.nn.BatchNorm2d](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.BatchNorm2d) | Supported | Supported | Doing |layer/normalization @@ -104,6 +105,8 @@ | [mindspore.nn.LGamma](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.LGamma) |Supported | Doing | Doing |layer/math | [mindspore.nn.ReduceLogSumExp](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.ReduceLogSumExp) |Supported | Supported | Doing |layer/math | [mindspore.nn.MSSSIM](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.MSSSIM) | Supported |Doing | Doing |layer/image +| [mindspore.nn.AvgPool1d](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.AvgPool1d) | Supported | Doing | Doing |layer/pooling +| [mindspore.nn.Unfold](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.Unfold) |Supported | Doing | Doing |layer/basic ## mindspore.ops.operations @@ -184,18 +187,18 @@ | 
[mindspore.ops.ReduceSum](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceSum) | Supported | Supported | Supported | math_ops | [mindspore.ops.ReduceAll](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceAll) | Supported | Doing | Doing | math_ops | [mindspore.ops.ReduceMax](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceMax) | Supported | Supported | Supported | math_ops -| [mindspore.ops.ReduceMin](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceMin) | Supported | Supported | Doing | math_ops -| [mindspore.ops.ReduceProd](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceProd) | Supported | Doing | Doing | math_ops -| [mindspore.ops.CumProd](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.CumProd) | Supported | Doing | Doing | math_ops +| [mindspore.ops.ReduceMin](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceMin) | Supported | Supported | Doing | math_ops +| [mindspore.ops.ReduceProd](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceProd) | Supported | Doing | Doing | math_ops +| [mindspore.ops.CumProd](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.CumProd) | Supported | Doing | Doing | math_ops | [mindspore.ops.MatMul](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.MatMul) | Supported | Supported | Supported | math_ops | [mindspore.ops.BatchMatMul](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.BatchMatMul) | Supported | Supported | Doing | math_ops | 
[mindspore.ops.CumSum](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.CumSum) | Supported | Supported| Doing | math_ops | [mindspore.ops.AddN](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.AddN) | Supported | Supported | Supported | math_ops | [mindspore.ops.Neg](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Neg) | Supported | Supported | Doing | math_ops -| [mindspore.ops.Sub](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Sub) | Supported | Supported | Supported | math_ops +| [mindspore.ops.Sub](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Sub) | Supported | Supported | Supported | math_ops | [mindspore.ops.Mul](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Mul) | Supported | Supported | Supported | math_ops -| [mindspore.ops.Square](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Square) | Supported | Supported | Supported | math_ops -| [mindspore.ops.SquareSumAll](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.SquareSumAll) | Supported | Doing | Doing | math_ops +| [mindspore.ops.Square](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Square) | Supported | Supported | Supported | math_ops +| [mindspore.ops.SquareSumAll](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.SquareSumAll) | Supported | Doing | Doing | math_ops | [mindspore.ops.Rsqrt](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Rsqrt) | Supported | Supported | Doing | math_ops | 
[mindspore.ops.Sqrt](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Sqrt) | Supported | Supported | Doing | math_ops | [mindspore.ops.Reciprocal](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Reciprocal) | Supported | Supported | Doing | math_ops @@ -203,7 +206,7 @@ | [mindspore.ops.Exp](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Exp) | Supported | Supported | Doing | math_ops | [mindspore.ops.Log](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Log) | Supported | Supported | Doing | math_ops | [mindspore.ops.Log1p](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Log1p) | Supported | Doing | Doing | math_ops -| [mindspore.ops.Minimum](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Minimum) | Supported | Supported | Doing | math_ops +| [mindspore.ops.Minimum](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Minimum) | Supported | Supported | Doing | math_ops | [mindspore.ops.Maximum](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Maximum) | Supported | Supported | Doing | math_ops | [mindspore.ops.RealDiv](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.RealDiv) | Supported | Supported | Doing | math_ops | [mindspore.ops.Div](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Div) | Supported | Supported | Doing | math_ops @@ -267,7 +270,7 @@ | [mindspore.ops.GatherV2](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.GatherV2) | Supported | Supported | Supported | array_ops | 
[mindspore.ops.Split](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Split) | Supported | Supported | Doing | array_ops | [mindspore.ops.Rank](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Rank) | Supported | Supported | Supported | array_ops -| [mindspore.ops.TruncatedNormal](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.TruncatedNormal) | Supported | Supported | Supported | array_ops +| [mindspore.ops.TruncatedNormal](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.TruncatedNormal) | Doing | Supported | Supported | array_ops | [mindspore.ops.Size](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Size) | Supported | Supported | Supported | array_ops | [mindspore.ops.Fill](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Fill) | Supported | Supported | Supported | array_ops | [mindspore.ops.OnesLike](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.OnesLike) | Supported | Supported | Doing | array_ops @@ -291,7 +294,7 @@ | [mindspore.ops.StridedSlice](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.StridedSlice) | Supported | Supported | Supported | array_ops | [mindspore.ops.Diag](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Diag) | Doing | Doing | Doing | array_ops | [mindspore.ops.DiagPart](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.DiagPart) | Doing | Doing | Doing | array_ops -| [mindspore.ops.Eye](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Eye) | Supported | Supported | Supported | array_ops +| 
[mindspore.ops.Eye](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Eye) | Supported | Supported | Supported | array_ops | [mindspore.ops.ScatterNd](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScatterNd) | Supported | Supported | Doing | array_ops | [mindspore.ops.ResizeNearestNeighbor](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ResizeNearestNeighbor) | Supported | Supported | Doing | array_ops | [mindspore.ops.GatherNd](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.GatherNd) | Supported | Supported | Doing | array_ops @@ -365,6 +368,9 @@ | [mindspore.ops.IFMR](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.IFMR) | Supported | Doing | Doing | math_ops | [mindspore.ops.DynamicShape](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.DynamicShape) | Supported | Supported | Supported | array_ops | [mindspore.ops.Unique](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.Unique) | Doing | Doing | Doing | array_ops +| [mindspore.ops.ReduceAny](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ReduceAny) | Supported | Doing | Doing | math_ops +| [mindspore.ops.SparseToDense](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.SparseToDense) | Doing | Doing | Doing | sparse_ops +| [mindspore.ops.CTCGreedyDecoder](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.CTCGreedyDecoder) | Doing | Doing | Doing | nn_ops ## mindspore.ops.functional diff --git a/docs/note/source_zh_cn/roadmap.md b/docs/note/source_zh_cn/roadmap.md index 
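a98ee9f3a1ad580bf8aa4855b1e049d43de4a23a..62c7657b6f592cb4632dbc4a31653fdd3643a19a 100644 --- a/docs/note/source_zh_cn/roadmap.md +++ b/docs/note/source_zh_cn/roadmap.md @@ -20,6 +20,7 @@ The following presents MindSpore's high-level plans for the coming year. We will keep adjusting the priorities of the plan according to user feedback. Overall, we will work to keep improving in the following areas. + 1. Provide support for more preset models. 2. Continue to fill out the APIs and operator library, improving usability and the programming experience. 3. Provide comprehensive support for Huawei Ascend AI processors, and keep optimizing performance and the software architecture. @@ -28,56 +29,63 @@ We warmly invite you to join the discussion in the user community and contribute your suggestions. ## Preset Models -* CV: classic models for scenarios such as object detection, GAN, image segmentation and pose recognition. -* NLP: RNN and Transformer-type neural networks, extending applications based on the Bert pre-trained model. -* Others: GNN, reinforcement learning, probabilistic programming, AutoML, and so on. + +- CV: classic models for scenarios such as object detection, GAN, image segmentation and pose recognition. +- NLP: RNN and Transformer-type neural networks, extending applications based on the Bert pre-trained model. +- Others: GNN, reinforcement learning, probabilistic programming, AutoML, and so on. ## Usability -* Fill out APIs of all kinds, such as operators, optimizers and loss functions -* Improve support for native Python expressions -* Support common Tensor/Math operations -* Add more scenarios where auto parallel applies, and improve the accuracy of strategy search + +- Fill out APIs of all kinds, such as operators, optimizers and loss functions +- Improve support for native Python expressions +- Support common Tensor/Math operations +- Add more scenarios where auto parallel applies, and improve the accuracy of strategy search ## Performance Optimization -* Optimize compilation time -* Low-bit mixed-precision training/inference -* Improve memory usage efficiency -* Provide more fusion optimization techniques -* Speed up PyNative execution + +- Optimize compilation time +- Low-bit mixed-precision training/inference +- Improve memory usage efficiency +- Provide more fusion optimization techniques +- Speed up PyNative execution ## Architecture Evolution -* Graph-kernel fusion optimization: use fine-grained Graph IR to express operators, forming an intermediate representation with operator boundaries and uncovering more graph-level optimization opportunities. -* Support more programming languages -* Optimize the automatic scheduling of data augmentation and the data caching mechanism for distributed training -* Keep improving MindSpore IR -* Distributed training in Parameter Server mode + +- Graph-kernel fusion optimization: use fine-grained Graph IR to express operators, forming an intermediate representation with operator boundaries and uncovering more graph-level optimization opportunities. +- Support more programming languages +- Optimize the automatic scheduling of data augmentation and the data caching mechanism for distributed training +- Keep improving MindSpore IR +- Distributed training in Parameter Server mode ## MindInsight Debugging and Tuning -* Training process observation - * Histogram - * Improved display of computational graphs/data graphs - * Integrate the performance Profiling/Debugger tools - * Support comparison across multiple training runs -* Training result lineage - * Lineage comparison of data augmentation -* Training process diagnosis - * Performance Profiling - * Graph-model-based Debugger + +- Training process observation + - Histogram + - Improved display of computational graphs/data graphs + - Integrate the performance Profiling/Debugger tools + - Support comparison across multiple training runs +- Training result lineage + - Lineage comparison of data augmentation +- Training process diagnosis + - Performance Profiling + - Graph-model-based Debugger ## MindArmour Security Enhancement Package -* Test the security of models -* Provide tools for enhancing model security -* Protect data privacy during training and inference + +- Test the security of models +- Provide tools for enhancing model security +- Protect data privacy during training and inference ## Inference Framework -* Continuous optimization of operator performance and completeness -* Support speech model inference -* Visualization of on-device models -* Micro solution: ultra-lightweight inference for embedded systems, supporting ARM Cortex-A and Cortex-M hardware -* Support on-device retraining and federated learning -* On-device auto parallel feature -* On-device MindData, including functions such as image Resize and pixel data conversion -* Work with MindSpore mixed-precision quantization training (or post-training quantization) to achieve mixed-precision inference and improve inference performance -* Support AI acceleration hardware such as Kirin NPU and MTK APU -* Support multi-model inference pipelines -* C++ graph construction interface + +- Continuous optimization of operator performance and completeness +- Support speech model inference +- Visualization of on-device models +- 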
Micro solution: ultra-lightweight inference for embedded systems, supporting ARM Cortex-A and Cortex-M hardware +- Support on-device retraining and federated learning +- On-device auto parallel feature +- On-device MindData, including functions such as image Resize and pixel data conversion +- Work with MindSpore mixed-precision quantization training (or post-training quantization) to achieve mixed-precision inference and improve inference performance +- Support AI acceleration hardware such as Kirin NPU and MTK APU +- Support multi-model inference pipelines +- C++ graph construction interface diff --git a/docs/programming_guide/source_en/api_structure.md b/docs/programming_guide/source_en/api_structure.md index 8fb4885d0125e4cb33a9a8a4f60b8111b2a936a4..c77f085b66eb17f5feb1e9c1c6fe5ddfb29bacf9 100644 --- a/docs/programming_guide/source_en/api_structure.md +++ b/docs/programming_guide/source_en/api_structure.md @@ -12,6 +12,7 @@ ## Overall Architecture + MindSpore is an all-scenario deep learning framework that aims to achieve easy development, efficient execution, and all-scenario coverage. Easy development features include API friendliness and low debugging difficulty. Efficient execution includes computing efficiency, data preprocessing efficiency, and distributed training efficiency. All-scenario coverage means that the framework supports cloud, edge, and device scenarios. The overall architecture of MindSpore consists of the Mind Expression (ME), Graph Engine (GE), and backend runtime. ME provides user-level APIs for scientific computing, building and training neural networks, and converting Python code of users into graphs. GE is a manager of operators and hardware resources, and is responsible for controlling execution of graphs received from ME. Backend runtime includes efficient running environments, such as the CPU, GPU, Ascend AI processors, and Android/iOS, on the cloud, edge, and device. For more information about the overall architecture, see [Overall Architecture](https://www.mindspore.cn/doc/note/en/master/design/mindspore/architecture.html). 
diff --git a/docs/programming_guide/source_en/dtype.md b/docs/programming_guide/source_en/dtype.md index 495c7573d02a5a9d011b9e14c87272bf030b9c76..29437cc5de0d6dfbf696f4548167d62f04f3d682 100644 --- a/docs/programming_guide/source_en/dtype.md +++ b/docs/programming_guide/source_en/dtype.md @@ -19,7 +19,8 @@ In the computation process of MindSpore, the `int` data type in Python is conver For details about the supported types, see . In the following code, the data type of MindSpore is int32. -``` + +```python from mindspore import dtype as mstype data_type = mstype.int32 @@ -28,11 +29,10 @@ print(data_type) The following information is displayed: -``` +```text Int32 ``` - ## Data Type Conversion API MindSpore provides the following APIs for conversion between NumPy data types and Python built-in data types: @@ -43,7 +43,7 @@ MindSpore provides the following APIs for conversion between NumPy data types an The following code implements the conversion between different data types and prints the converted type. -``` +```python from mindspore import dtype as mstype np_type = mstype.dtype_to_nptype(mstype.int32) @@ -57,8 +57,8 @@ print(py_type) The following information is displayed: -``` +```text Int64 -``` \ No newline at end of file +``` diff --git a/docs/programming_guide/source_en/tensor.md b/docs/programming_guide/source_en/tensor.md index 0054dc933a385e67a004ee0fe4581dc84e9266d4..c97345a6078b4f2cb9de21c215c96cc232f3ecea 100644 --- a/docs/programming_guide/source_en/tensor.md +++ b/docs/programming_guide/source_en/tensor.md @@ -19,7 +19,7 @@ Tensor is a basic data structure in the MindSpore network computing. For details Tensors of different dimensions represent different data. For example, a 0-dimensional tensor represents a scalar, a 1-dimensional tensor represents a vector, a 2-dimensional tensor represents a matrix, and a 3-dimensional tensor may represent the three channels of RGB images. 
-> All examples in this document can run in PyNative mode and do not support CPUs. +> All examples in this document can run in PyNative mode and do not support CPUs. ## Tensor Structure @@ -29,7 +29,7 @@ When `Tensor` is used as the initial value, dtype can be specified. If dtype is A code example is as follows: -``` +```python import numpy as np from mindspore import Tensor from mindspore.common import dtype as mstype @@ -46,7 +46,7 @@ print(x, "\n\n", y, "\n\n", z, "\n\n", m, "\n\n", n, "\n\n", p) The following information is displayed: -``` +```text [[1 2] [3 4]] @@ -66,12 +66,13 @@ True ### Attributes Tensor attributes include shape and data type (dtype). + - shape: a tuple - dtype: a data type of MindSpore A code example is as follows: -``` +```python import numpy as np from mindspore import Tensor from mindspore.common import dtype as mstype @@ -85,20 +86,21 @@ print(x_shape, x_dtype) The following information is displayed: -``` +```text (2, 2) Int32 ``` - + ### Methods Tensor methods include `all`, `any`, and `asnumpy`. Currently, the `all` and `any` methods support only Ascend. + - `all(axis, keep_dims)`: performs the `and` operation on a specified dimension to reduce the dimension. `axis` indicates the reduced dimension, and `keep_dims` indicates whether to retain the reduced dimension. - `any(axis, keep_dims)`: performs the `or` operation on a specified dimension to reduce the dimension. The parameter meaning is the same as that of `all`. - `asnumpy()`: converts `Tensor` to an array of NumPy. 
A code example is as follows: -``` +```python import numpy as np from mindspore import Tensor from mindspore.common import dtype as mstype @@ -113,7 +115,7 @@ print(x_all, "\n\n", x_any, "\n\n", x_array) The following information is displayed: -``` +```text False True @@ -121,4 +123,4 @@ True [[ True True] [False False]] -``` \ No newline at end of file +``` diff --git a/docs/programming_guide/source_zh_cn/api_structure.md b/docs/programming_guide/source_zh_cn/api_structure.md index 847cc1bcaea6760a9e1ca5c8ad8ac1fa350d4c62..2fd2d7b094bd81eca23360ebe868d763ac020000 100644 --- a/docs/programming_guide/source_zh_cn/api_structure.md +++ b/docs/programming_guide/source_zh_cn/api_structure.md @@ -14,6 +14,7 @@ ## 总体架构 + MindSpore是一个全场景深度学习框架,旨在实现易开发、高效执行、全场景覆盖三大目标,其中易开发表现为API友好、调试难度低,高效执行包括计算效率、数据预处理效率和分布式训练效率,全场景则指框架同时支持云、边缘以及端侧场景。 MindSpore总体架构分为前端表示层(Mind Expression,ME)、计算图引擎(Graph Engine,GE)和后端运行时三个部分。ME提供了用户级应用软件编程接口(Application Programming Interface,API),用于科学计算以及构建和训练神经网络,并将用户的Python代码转换为数据流图。GE是算子和硬件资源的管理器,负责控制从ME接收的数据流图的执行。后端运行时包含云、边、端上不同环境中的高效运行环境,例如CPU、GPU、Ascend AI处理器、 Android/iOS等。更多总体架构的相关内容请参见[总体架构](https://www.mindspore.cn/doc/note/zh-CN/master/design/mindspore/architecture.html)。 diff --git a/docs/programming_guide/source_zh_cn/augmentation.md b/docs/programming_guide/source_zh_cn/augmentation.md index a7741906d98497e6190a26024c1c6b162b620770..f19715460d5b9c756779c03de87263f3c61ca8d4 100644 --- a/docs/programming_guide/source_zh_cn/augmentation.md +++ b/docs/programming_guide/source_zh_cn/augmentation.md @@ -42,7 +42,7 @@ MindSpore目前支持的常用数据增强算子如下表所示,更多数据 | | Invert | 将图像进行反相。 | | |Compose | 将列表中的数据增强操作依次执行。 | -## c_transforms +## c_transforms 下面将简要介绍几种常用的`c_transforms`模块数据增强算子的使用方法。 @@ -51,6 +51,7 @@ MindSpore目前支持的常用数据增强算子如下表所示,更多数据 对输入图像进行在随机位置的裁剪。 **参数说明:** + - `size`:裁剪图像的尺寸。 - `padding`:填充的像素数量。 - `pad_if_needed`:原图小于裁剪尺寸时,是否需要填充。 @@ -98,7 +99,7 @@ plt.show() 输出结果如下: -``` +```text Source image Shape : (32, 32, 3) , Source label : 6 Cropped image 
Shape: (10, 10, 3) , Cropped label: 6 ------ @@ -119,6 +120,7 @@ Cropped image Shape: (10, 10, 3) , Cropped label: 9 对输入图像进行随机水平翻转。 **参数说明:** + - `prob`: 单张图片发生翻转的概率。 下面的样例首先使用随机采样器加载CIFAR-10数据集[1],然后对已加载的图片进行概率为0.8的随机水平翻转,最后输出翻转前后的图片形状及对应标签,并对图片进行了展示。 @@ -164,7 +166,7 @@ plt.show() 输出结果如下: -``` +```text Source image Shape : (32, 32, 3) , Source label : 3 Flipped image Shape: (32, 32, 3) , Flipped label: 3 ------ @@ -188,6 +190,7 @@ Flipped image Shape: (32, 32, 3) , Flipped label: 9 对输入图像进行缩放。 **参数说明:** + - `self`:缩放的目标大小。 - `interpolation`:缩放时采用的插值方式。 @@ -231,7 +234,7 @@ plt.show() 输出结果如下: -``` +```text Source image Shape : (28, 28, 1) , Source label : 5 Flipped image Shape: (101, 101, 1) , Flipped label: 5 ------ @@ -297,7 +300,7 @@ plt.show() 输出结果如下: -``` +```text Source image Shape : (32, 32, 3) , Source label : 4 Flipped image Shape: (32, 32, 3) , Flipped label: 4 ------ @@ -362,7 +365,7 @@ plt.show() 输出结果如下: -``` +```text Transformed image Shape: (3, 200, 200) , Transformed label: 3 Transformed image Shape: (3, 200, 200) , Transformed label: 0 Transformed image Shape: (3, 200, 200) , Transformed label: 3 diff --git a/docs/programming_guide/source_zh_cn/auto_parallel.md b/docs/programming_guide/source_zh_cn/auto_parallel.md index c7b9ecc28d62d49a2b818c03ac7654043c23c8ed..7bdbf8ddbc3a1b51b88c89f9377ed73f5cb21d0b 100644 --- a/docs/programming_guide/source_zh_cn/auto_parallel.md +++ b/docs/programming_guide/source_zh_cn/auto_parallel.md @@ -61,7 +61,7 @@ MindSpore的分布式并行配置通过`auto_parallel_context`来进行集中管 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(device_num=8) context.get_auto_parallel_context("device_num") @@ -74,7 +74,7 @@ context.get_auto_parallel_context("device_num") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(global_rank=0) context.get_auto_parallel_context("global_rank") @@ -87,7 +87,7 @@ 
context.get_auto_parallel_context("global_rank") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(gradients_mean=False) context.get_auto_parallel_context("gradients_mean") @@ -98,10 +98,10 @@ context.get_auto_parallel_context("gradients_mean") `parallel_mode`表示并行模式,其值为字符串类型。用户可选择的模式有: - `stand_alone`:单机模式。 -- `data_parallel`:数据并行模式。 -- `hybrid_parallel`:混合并行模式。 -- `semi_auto_parallel`:半自动并行模式,即用户可通过`shard`方法给算子配置切分策略,若不配置策略,则默认是数据并行策略。 -- `auto_parallel`:自动并行模式,即框架会自动建立代价模型,为用户选择最优的切分策略。 +- `data_parallel`:数据并行模式。 +- `hybrid_parallel`:混合并行模式。 +- `semi_auto_parallel`:半自动并行模式,即用户可通过`shard`方法给算子配置切分策略,若不配置策略,则默认是数据并行策略。 +- `auto_parallel`:自动并行模式,即框架会自动建立代价模型,为用户选择最优的切分策略。 其中`auto_parallel`和`data_parallel`在MindSpore教程中有完整样例: @@ -125,7 +125,7 @@ context.get_auto_parallel_context("parallel_mode") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(all_reduce_fusion_config=[20, 35]) context.get_auto_parallel_context("all_reduce_fusion_config") @@ -133,7 +133,6 @@ context.get_auto_parallel_context("all_reduce_fusion_config") 样例中,`all_reduce_fusion_config`的值为[20, 35],将前20个AllReduce融合成1个,第20~35个AllReduce融合成1个,剩下的AllReduce融合成1个。 - ### 自动并行配置 #### gradient_fp32_sync @@ -143,7 +142,7 @@ context.get_auto_parallel_context("all_reduce_fusion_config") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(gradient_fp32_sync=False) context.get_auto_parallel_context("gradient_fp32_sync") @@ -156,7 +155,7 @@ MindSpore提供了`dynamic_programming`和`recursive_programming`两种搜索策 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(auto_parallel_search_mode="dynamic_programming") context.get_auto_parallel_context("auto_parallel_search_mode") @@ -169,7 +168,7 @@ context.get_auto_parallel_context("auto_parallel_search_mode") 代码样例如下: ```python -from 
mindspore import context +from mindspore import context context.set_auto_parallel_context(strategy_ckpt_load_file="./") context.get_auto_parallel_context("strategy_ckpt_load_file") @@ -182,7 +181,7 @@ context.get_auto_parallel_context("strategy_ckpt_load_file") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(strategy_ckpt_save_file="./") context.get_auto_parallel_context("strategy_ckpt_save_file") @@ -195,7 +194,7 @@ context.get_auto_parallel_context("strategy_ckpt_save_file") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(full_batch=False) context.get_auto_parallel_context("full_batch") @@ -210,7 +209,7 @@ context.get_auto_parallel_context("full_batch") 代码样例如下: ```python -from mindspore import context +from mindspore import context context.set_auto_parallel_context(enable_parallel_optimizer=True) context.get_auto_parallel_context("enable_parallel_optimizer") @@ -323,4 +322,3 @@ allreduce2 = P.AllReduce().add_prim_attr("fusion", 1) 具体用例请参考MindSpore分布式并行训练教程: 。 - diff --git a/docs/programming_guide/source_zh_cn/callback.md b/docs/programming_guide/source_zh_cn/callback.md index 753aa59505143da9cbcb95ec4985cf90bc910b3a..15d1e9a391e00a18c0111030f2fd17457b39f4e4 100644 --- a/docs/programming_guide/source_zh_cn/callback.md +++ b/docs/programming_guide/source_zh_cn/callback.md @@ -12,6 +12,7 @@ ## 概述 + Callback回调函数在MindSpore中被实现为一个类,Callback机制类似于一种监控模式,可以帮助用户观察网络训练过程中各种参数的变化情况和网络内部的状态,还可以根据用户的指定,在达到特定条件后执行相应的操作,在训练过程中,Callback列表会按照定义的顺序执行Callback函数。Callback机制让用户可以及时有效地掌握网络模型的训练状态,并根据需要随时作出调整,可以极大地提升用户的开发效率。 在MindSpore中,Callback机制一般用在网络训练过程`model.train`中,用户可以通过配置不同的内置回调函数传入不同的参数,从而实现各种功能。例如,可以通过`LossMonitor`监控每一个epoch的loss变化情况,通过`ModelCheckpoint`保存网络参数和模型进行再训练或推理,通过`TimeMonitor`监控每一个epoch,每一个step的训练时间,以及提前终止训练,动态调整参数等。 @@ -37,10 +38,11 @@ Callback回调函数在MindSpore中被实现为一个类,Callback机制类似 
详细内容,请参考[LossMonitor官网教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/custom_debugging_info.html#mindsporecallback)。 - TimeMonitor - + 监控训练过程中每个epoch,每个step的运行时间。 ## MindSpore自定义回调函数 + MindSpore不但有功能强大的内置回调函数,还可以支持用户自定义回调函数。当用户有自己的特殊需求时,可以基于Callback基类,自定义满足用户自身需求的回调函数。Callback可以把训练过程中的重要信息记录下来,通过一个字典类型变量cb_params传递给Callback对象, 用户可以在各个自定义的Callback中获取到相关属性,执行自定义操作。 以下面两个场景为例,介绍自定义Callback回调函数的功能: @@ -51,4 +53,4 @@ MindSpore不但有功能强大的内置回调函数,还可以支持用户自 详细内容,请参考[自定义Callback官网教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/custom_debugging_info.html#id3)。 -根据教程,用户可以很容易实现具有其他功能的自定义回调函数,如实现在每一轮训练结束后都输出相应的详细训练信息,包括训练进度、训练轮次、训练名称、loss值等;如实现在loss或模型精度达到一定值后停止训练,用户可以设定loss或模型精度的阈值,当loss或模型精度达到该阈值后就提前终止训练等。 \ No newline at end of file +根据教程,用户可以很容易实现具有其他功能的自定义回调函数,如实现在每一轮训练结束后都输出相应的详细训练信息,包括训练进度、训练轮次、训练名称、loss值等;如实现在loss或模型精度达到一定值后停止训练,用户可以设定loss或模型精度的阈值,当loss或模型精度达到该阈值后就提前终止训练等。 diff --git a/docs/programming_guide/source_zh_cn/cell.md b/docs/programming_guide/source_zh_cn/cell.md index 0f1ac35d43b3eb380737c26debf8705e208fc512..a3409c2e3c14c5454a16ffed94b7a2c34a5089cb 100644 --- a/docs/programming_guide/source_zh_cn/cell.md +++ b/docs/programming_guide/source_zh_cn/cell.md @@ -41,7 +41,7 @@ MindSpore的`Cell`类是构建所有网络的基类,也是网络的基本单 在`construct`方法中,`x`为输入数据,`output`是经过网络结构计算后得到的计算结果。 -``` +```python import mindspore.nn as nn from mindspore.ops import operations as P from mindspore.common.parameter import Parameter @@ -70,7 +70,7 @@ class Net(nn.Cell): 代码样例如下: -``` +```python net = Net() result = net.parameters_dict() print(result.keys()) @@ -80,7 +80,8 @@ print(result['conv.weight']) 样例中的`Net`采用上文构造网络的用例,打印了网络中所有参数的名字和`conv.weight`参数的结果。 输出如下: -``` + +```text odict_keys(['conv.weight']) Parameter (name=conv.weight, value=[[[[-3.95042636e-03 1.08830128e-02 -6.51786150e-03] [ 8.66129529e-03 7.36288540e-03 -4.32638079e-03] @@ -98,7 +99,8 @@ Parameter (name=conv.weight, value=[[[[-3.95042636e-03 1.08830128e-02 -6.517861 
其中`nn.Conv2d`是MindSpore以`Cell`为基类封装好的一个卷积层,其具体内容将在“模型层”中进行介绍。 代码样例如下: -``` + +```python import mindspore.nn as nn class Net1(nn.Cell): @@ -120,7 +122,8 @@ print(names) ``` 输出如下: -``` + +```text ('', Net1< (conv): Conv2d >) @@ -136,7 +139,8 @@ print(names) 以`TrainOneStepCell`为例,其接口功能是使网络进行单步训练,需要计算网络反向,因此初始化方法里需要使用`set_grad`。 `TrainOneStepCell`部分代码如下: -``` + +```python class TrainOneStepCell(Cell): def __init__(self, network, optimizer, sens=1.0): super(TrainOneStepCell, self).__init__(auto_prefix=False) @@ -156,7 +160,8 @@ MindSpore的nn模块是Python实现的模型组件,是对低阶API的封装, 同时nn也提供了部分与`Primitive`算子同名的接口,主要作用是对`Primitive`算子进行进一步封装,为用户提供更友好的API。 重新分析上文介绍`construct`方法的用例,此用例是MindSpore的`nn.Conv2d`源码简化内容,内部会调用`P.Conv2D`。`nn.Conv2d`卷积API增加输入参数校验功能并判断是否`bias`等,是一个高级封装的模型层。 -``` + +```python import mindspore.nn as nn from mindspore.ops import operations as P from mindspore.common.parameter import Parameter @@ -259,7 +264,7 @@ MindSpore框架在`mindspore.nn`的layer层内置了丰富的接口,主要内 MindSpore的模型层在`mindspore.nn`下,使用方法如下所示: -``` +```python import mindspore.nn as nn class Net(nn.Cell): @@ -307,7 +312,7 @@ MindSpore的损失函数全部是`Cell`的子类实现,所以也支持用户 - SoftmaxCrossEntropyWithLogits 交叉熵损失函数,用于分类模型。当标签数据不是one-hot编码形式时,需要输入参数`sparse`为True。`reduction`参数默认值为none,其参数含义同`L1Loss`。 - + - CosineEmbeddingLoss `CosineEmbeddingLoss`用于衡量两个输入相似程度,用于分类模型。`margin`默认为0.0,`reduction`参数同`L1Loss`。 @@ -316,7 +321,7 @@ MindSpore的损失函数全部是`Cell`的子类实现,所以也支持用户 MindSpore的损失函数全部在mindspore.nn下,使用方法如下所示: -``` +```python import numpy as np import mindspore.nn as nn from mindspore import Tensor @@ -328,7 +333,8 @@ print(loss(input_data, target_data)) ``` 输出结果: -``` + +```text 1.5 ``` @@ -347,7 +353,8 @@ print(loss(input_data, target_data)) 以LeNet网络为例,在`__init__`方法中定义了卷积层,池化层和全连接层等结构单元,然后在`construct`方法将定义的内容连接在一起,形成一个完整LeNet的网络结构。 LeNet网络实现方式如下所示: -``` + +```python import mindspore.nn as nn class LeNet5(nn.Cell): diff --git a/docs/programming_guide/source_zh_cn/context.md b/docs/programming_guide/source_zh_cn/context.md index 
227d73ba5d251d770fa06bcf85c1b2eb6277457c..59a7e39523dff0ec866c5bf7bf735ea7eaff9c26 100644 --- a/docs/programming_guide/source_zh_cn/context.md +++ b/docs/programming_guide/source_zh_cn/context.md @@ -19,9 +19,11 @@ ## 概述 + 初始化网络之前要配置context参数,用于控制程序执行的策略。比如选择执行模式、选择执行后端、配置分布式相关参数等。按照context参数设置实现的不同功能,可以将其分为执行模式管理、硬件管理、分布式管理和维测管理等。 ## 执行模式管理 + MindSpore支持PyNative和Graph这两种运行模式: - `PYNATIVE_MODE`:动态图模式,将神经网络中的各个算子逐一下发执行,方便用户编写和调试神经网络模型。 @@ -29,20 +31,24 @@ MindSpore支持PyNative和Graph这两种运行模式: - `GRAPH_MODE`:静态图模式或者图模式,将神经网络模型编译成一整张图,然后下发执行。该模式利用图优化等技术提高运行性能,同时有助于规模部署和跨平台运行。 ### 模式选择 + 通过设置可以控制程序运行的模式,默认情况下,MindSpore处于PyNative模式。 代码样例如下: + ```python from mindspore import context context.set_context(mode=context.GRAPH_MODE) ``` ### 模式切换 + 实现两种模式之间的切换。 MindSpore处于PYNATIVE模式时,可以通过`context.set_context(mode=context.GRAPH_MODE)`切换为Graph模式;同样地,MindSpore处于Graph模式时,可以通过 `context.set_context(mode=context.PYNATIVE_MODE)`切换为PyNative模式。 代码样例如下: + ```python import numpy as np import mindspore.nn as nn @@ -60,6 +66,7 @@ conv(input_data) 上面的例子先将运行模式设置为`GRAPH_MODE`模式,然后将模式切换为`PYNATIVE_MODE`模式,实现了模式的切换。 ## 硬件管理 + 硬件管理部分主要包括`device_target`和`device_id`两个参数。 - `device_target`: 用于设置目标设备,支持Ascend、GPU和CPU,可以根据实际环境情况设置。 @@ -69,12 +76,14 @@ conv(input_data) > 在GPU和CPU上,设置`device_id`参数无效。 代码样例如下: + ```python from mindspore import context context.set_context(device_target="Ascend", device_id=6) ``` ## 分布式管理 + context中有专门用于配置并行训练参数的接口:context.set_auto_parallel_context,该接口必须在初始化网络之前调用。 - `parallel_mode`:分布式并行模式,默认为单机模式`ParallelMode.STAND_ALONE`。可选数据并行`ParallelMode.DATA_PARALLEL`及自动并行`ParallelMode.AUTO_PARALLEL`。 @@ -86,6 +95,7 @@ context中有专门用于配置并行训练参数的接口:context.set_auto_pa > `device_num`和`global_rank`建议采用默认值,框架内会调用HCCL接口获取。 代码样例如下: + ```python from mindspore import context from mindspore.context import ParallelMode @@ -95,9 +105,11 @@ context.set_auto_parallel_context(parallel_mode=ParallelMode.AUTO_PARALLEL, grad > 
分布式并行训练详细介绍可以查看[分布式并行训练](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_tutorials.html)。 ## 维测管理 + 为了方便维护和定位问题,context提供了大量维测相关的参数配置,如采集profiling数据、异步数据dump功能和print算子落盘等。 ### 采集profiling数据 + 系统支持在训练过程中采集profiling数据,然后通过profiling工具进行性能分析。当前支持采集的profiling数据包括: - `enable_profiling`:是否开启profiling功能。设置为True,表示开启profiling功能,从enable_options读取profiling的采集选项;设置为False,表示关闭profiling功能,仅采集training_trace。 @@ -105,15 +117,18 @@ context.set_auto_parallel_context(parallel_mode=ParallelMode.AUTO_PARALLEL, grad - `enable_options`:profiling采集选项,取值如下,支持采集多项数据。training_trace:采集迭代轨迹数据,即训练任务及AI软件栈的软件信息,实现对训练任务的性能分析,重点关注数据增强、前后向计算、梯度聚合更新等相关数据;task_trace:采集任务轨迹数据,即昇腾910处理器HWTS/AICore的硬件信息,分析任务开始、结束等信息;op_trace:采集单算子性能数据。格式:['op_trace','task_trace','training_trace'] 代码样例如下: + ```python from mindspore import context context.set_context(enable_profiling=True, profiling_options="training_trace") ``` ### 异步数据dump功能 + 在Ascend环境上执行训练,当训练结果和预期有偏差时,可以通过异步数据dump功能保存算子的输入输出进行调试。 代码样例如下: + ```python from mindspore import context context.set_context(save_graphs=True) @@ -122,6 +137,7 @@ context.set_context(save_graphs=True) > 详细的调试方法可以查看[异步数据Dump功能介绍](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/custom_debugging_info.html#dump)。 ### print算子落盘 + 默认情况下,MindSpore的自研print算子可以将用户输入的Tensor或字符串信息打印出来,支持多字符串输入,多Tensor输入和字符串与Tensor的混合输入,输入参数以逗号隔开。 > Print打印功能可以查看[Print算子功能介绍](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/custom_debugging_info.html#print)。 @@ -129,9 +145,10 @@ context.set_context(save_graphs=True) - `print_file_path`:可以将print算子数据保存到文件,同时关闭屏幕打印功能。如果保存的文件已经存在,则会给文件添加时间戳后缀。数据保存到文件可以解决数据量较大时屏幕打印数据丢失的问题。 代码样例如下: + ```python from mindspore import context context.set_context(print_file_path="print.pb") ``` -> context接口详细介绍可以查看[mindspore.context](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.context.html)。 \ No newline at end of file +> 
context接口详细介绍可以查看[mindspore.context](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.context.html)。 diff --git a/docs/programming_guide/source_zh_cn/dataset_conversion.md b/docs/programming_guide/source_zh_cn/dataset_conversion.md index 872b09859fc8923ce82d784ce09819bf26a9830a..0da981f4199a8fe3c24bdd9359dde0b5ac5c0935 100644 --- a/docs/programming_guide/source_zh_cn/dataset_conversion.md +++ b/docs/programming_guide/source_zh_cn/dataset_conversion.md @@ -61,19 +61,19 @@ for i in range(100): white_io = BytesIO() Image.new('RGB', (i*10, i*10), (255, 255, 255)).save(white_io, 'JPEG') image_bytes = white_io.getvalue() - sample['file_name'] = str(i) + ".jpg" - sample['label'] = i + sample['file_name'] = str(i) + ".jpg" + sample['label'] = i sample['data'] = white_io.getvalue() data.append(sample) - if i % 10 == 0: + if i % 10 == 0: writer.write_raw_data(data) data = [] -if data: +if data: writer.write_raw_data(data) -writer.commit() +writer.commit() data_set = ds.MindDataset(dataset_file=mindrecord_filename) decode_op = vision.Decode() @@ -129,18 +129,18 @@ for i in range(100): "target_eos_mask": np.array([48, 49, 50, 51], dtype=np.int64)} data.append(sample) - if i % 10 == 0: + if i % 10 == 0: writer.write_raw_data(data) data = [] -if data: +if data: writer.write_raw_data(data) writer.commit() data_set = ds.MindDataset(dataset_file=mindrecord_filename) count = 0 -for item in data_set.create_dict_iterator(): +for item in data_set.create_dict_iterator(): print("sample: {}".format(item)) count += 1 print("Got {} samples".format(count)) @@ -167,7 +167,7 @@ MindSpore提供转换常用数据集的工具类,能够将常用的数据集 1. 下载[CIFAR-10数据集](https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz)并解压,其目录结构如下所示。 - ``` + ```text └─cifar-10-batches-py ├─batches.meta ├─data_batch_1 @@ -220,7 +220,7 @@ MindSpore提供转换常用数据集的工具类,能够将常用的数据集 1. 
下载[ImageNet数据集](http://image-net.org/download),将所有图片存放在同一文件夹,用一个映射文件记录图片和标签的对应关系。映射文件包含2列,分别为各类别图片目录和标签ID,用空格隔开,映射文件示例如下: - ``` + ```text n01440760 0 n01443537 1 n01484850 2 @@ -246,7 +246,7 @@ MindSpore提供转换常用数据集的工具类,能够将常用的数据集 imagenet_transformer.transform() ``` - **参数说明:** + **参数说明:** - `IMAGENET_MAP_FILE`:ImageNet数据集标签映射文件的路径。 - `IMAGENET_IMAGE_DIR`:包含ImageNet所有图片的文件夹路径。 - `MINDRECORD_FILE`:输出的MindRecord文件路径。 @@ -277,8 +277,8 @@ import os import mindspore.dataset as ds from mindspore.mindrecord import CsvToMR -CSV_FILE_NAME = "test.csv" -MINDRECORD_FILE_NAME = "test.mindrecord" +CSV_FILE_NAME = "test.csv" +MINDRECORD_FILE_NAME = "test.mindrecord" PARTITION_NUM = 1 def generate_csv(): @@ -330,8 +330,8 @@ import mindspore.dataset.vision.c_transforms as vision from PIL import Image import tensorflow as tf -TFRECORD_FILE_NAME = "test.tfrecord" -MINDRECORD_FILE_NAME = "test.mindrecord" +TFRECORD_FILE_NAME = "test.tfrecord" +MINDRECORD_FILE_NAME = "test.mindrecord" PARTITION_NUM = 1 def generate_tfrecord(): @@ -339,7 +339,7 @@ def generate_tfrecord(): if isinstance(values, list): feature = tf.train.Feature(int64_list=tf.train.Int64List(value=list(values))) else: - feature = tf.train.Feature(int64_list=tf.train.Int64List(value=[values])) + feature = tf.train.Feature(int64_list=tf.train.Int64List(value=[values])) return feature def create_float_feature(values): @@ -352,9 +352,9 @@ def generate_tfrecord(): def create_bytes_feature(values): if isinstance(values, bytes): white_io = BytesIO() - Image.new('RGB', (10, 10), (255, 255, 255)).save(white_io, 'JPEG') + Image.new('RGB', (10, 10), (255, 255, 255)).save(white_io, 'JPEG') image_bytes = white_io.getvalue() - feature = tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])) + feature = tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])) else: feature = tf.train.Feature(bytes_list=tf.train.BytesList(value=[bytes(values, encoding='utf-8')])) return feature diff --git 
a/docs/programming_guide/source_zh_cn/dataset_loading.md b/docs/programming_guide/source_zh_cn/dataset_loading.md index 3d2e08eb1df81fbc69eb79fdb8a943a99108212b..292f5c35334342842e39f116ae383f251773285d 100644 --- a/docs/programming_guide/source_zh_cn/dataset_loading.md +++ b/docs/programming_guide/source_zh_cn/dataset_loading.md @@ -76,7 +76,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text Image shape: (32, 32, 3) , Label: 0 Image shape: (32, 32, 3) , Label: 1 Image shape: (32, 32, 3) , Label: 2 @@ -110,7 +110,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text [Segmentation]: image shape: (281, 500, 3) target shape: (281, 500, 3) @@ -152,7 +152,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text Detection: dict_keys(['bbox', 'image', 'iscrowd', 'category_id']) Stuff: dict_keys(['segmentation', 'iscrowd', 'image']) Keypoint: dict_keys(['keypoints', 'num_keypoints', 'image']) @@ -218,7 +218,7 @@ TFRecord是TensorFlow定义的一种二进制数据文件格式。 将数据集格式和特征按JSON格式写入Schema文件,示例如下: - ``` + ```json { "columns": { "image": { @@ -281,7 +281,7 @@ TFRecord是TensorFlow定义的一种二进制数据文件格式。 输出结果如下: - ``` + ```text [0.49893939 0.36348882] [0.15234002] [0.83845534 0.19721032] [0.94602561] [0.2361873 0.79506755] [0.88118559] @@ -305,7 +305,7 @@ TFRecord是TensorFlow定义的一种二进制数据文件格式。 输出结果如下: - ``` + ```text [1 2] [3 4] ``` @@ -325,7 +325,7 @@ TFRecord是TensorFlow定义的一种二进制数据文件格式。 输出结果如下: - ``` + ```text {'col1': Tensor(shape=[], dtype=Int64, value= 1), 'col2': Tensor(shape=[], dtype=Int64, value= 3)} {'col1': Tensor(shape=[], dtype=Int64, value= 2), 'col2': Tensor(shape=[], dtype=Int64, value= 4)} ``` @@ -374,7 +374,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text [0.36510558 0.45120592] [0.78888122] [0.49606035 0.07562207] [0.38068183] [0.57176158 0.28963401] [0.16271622] @@ -420,7 +420,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text [0.36510558 0.45120592] [0.78888122] [0.49606035 0.07562207] 
[0.38068183] [0.57176158 0.28963401] [0.16271622] @@ -457,7 +457,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text [0.36510558 0.45120592] [0.78888122] [0.49606035 0.07562207] [0.38068183] [0.57176158 0.28963401] [0.16271622] @@ -497,7 +497,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text [0.36510558 0.45120592] [0.78888122] [0.57176158 0.28963401] [0.16271622] [0.81585667 0.96883469] [0.77994068] diff --git a/docs/programming_guide/source_zh_cn/dtype.md b/docs/programming_guide/source_zh_cn/dtype.md index 5fc7d208e65aed905222fe79b2ca6530a04f5fdc..177cf1d89cc47884ad8f45d579749465c575fda0 100644 --- a/docs/programming_guide/source_zh_cn/dtype.md +++ b/docs/programming_guide/source_zh_cn/dtype.md @@ -9,6 +9,8 @@ +   + ## 概述 @@ -19,7 +21,8 @@ MindSpore张量支持不同的数据类型,包含`int8`、`int16`、`int32`、 详细的类型支持情况请参考。 以下代码,打印MindSpore的数据类型int32。 -``` + +```python from mindspore import dtype as mstype data_type = mstype.int32 @@ -28,11 +31,10 @@ print(data_type) 输出如下: -``` +```text Int32 ``` - ## 数据类型转换接口 MindSpore提供了以下几个接口,实现与NumPy数据类型和Python内置的数据类型间的转换。 @@ -43,7 +45,7 @@ MindSpore提供了以下几个接口,实现与NumPy数据类型和Python内置 以下代码实现了不同数据类型间的转换,并打印转换后的类型。 -``` +```python from mindspore import dtype as mstype np_type = mstype.dtype_to_nptype(mstype.int32) @@ -57,7 +59,7 @@ print(py_type) 输出如下: -``` +```text Int64 diff --git a/docs/programming_guide/source_zh_cn/network_component.md b/docs/programming_guide/source_zh_cn/network_component.md index 78376e493732c6c4e7f184da441cf8e23384ecc8..b66b27f9038e111ad72eca4ffd48d487dbf473f0 100644 --- a/docs/programming_guide/source_zh_cn/network_component.md +++ b/docs/programming_guide/source_zh_cn/network_component.md @@ -26,7 +26,7 @@ GradOperation组件用于生成输入函数的梯度,利用`get_all`、`get_by GradOperation的使用实例如下: -``` +```python import numpy as np import mindspore.nn as nn @@ -64,7 +64,7 @@ GradNetWrtX(Net())(x, y) 输出如下: -``` +```text Tensor(shape=[2, 3], dtype=Float32, [[1.4100001 1.5999999 6.6 ] [1.4100001 1.5999999 6.6 ]]) 
@@ -79,7 +79,7 @@ MindSpore涉及梯度计算的其他组件,例如`WithGradCell`和`TrainOneSte 下面通过一个实例来介绍其具体的使用, 首先需要构造一个网络,内容如下: -``` +```python import numpy as np import pytest @@ -124,7 +124,8 @@ class LeNet(nn.Cell): ``` 下面是`WithLossCell`的使用实例,分别定义好网络和损失函数,然后创建一个`WithLossCell`,传入输入数据和标签数据,`WithLossCell`内部根据网络和损失函数返回计算结果。 -``` + +```python data = Tensor(np.ones([32, 1, 32, 32]).astype(np.float32) * 0.01) label = Tensor(np.ones([32]).astype(np.int32)) net = LeNet() @@ -136,7 +137,8 @@ print(loss) ``` 输出如下: -``` + +```text +++++++++Loss+++++++++++++ 2.302585 ``` @@ -147,7 +149,7 @@ print(loss) 下面构造一个使用`TrainOneStepCell`接口进行网络训练的实例,其中`LeNet`和包名的导入代码和上个用例共用。 -``` +```python data = Tensor(np.ones([32, 1, 32, 32]).astype(np.float32) * 0.01) label = Tensor(np.ones([32]).astype(np.int32)) net = LeNet() @@ -168,7 +170,8 @@ for i in range(5): 用例中构造了优化器和一个`WithLossCell`的实例,然后传入`TrainOneStepCell`中初始化一个训练网络,用例循环五次,相当于网络训练了五次,并输出每次的loss结果,由结果可以看出每次训练后loss值在逐渐减小。 输出如下: -``` + +```text +++++++++result:0++++++++++++ 2.302585 +++++++++result:1++++++++++++ @@ -180,5 +183,6 @@ for i in range(5): +++++++++result:4++++++++++++ 2.2215357 ``` + 后续内容会介绍MindSpore使用更加高级封装的接口,即`Model`类中的`train`方法训练模型,在其内部实现中会用到 `TrainOneStepCell`和`WithLossCell` 等许多网络组件,感兴趣的读者可以查看其内部实现。 diff --git a/docs/programming_guide/source_zh_cn/operator.md b/docs/programming_guide/source_zh_cn/operator.md index 588d3a4314412f544695fe619cb26c9116bc60f1..aa0afe9f566c735324f2a89e47da2fb08e00f6a8 100644 --- a/docs/programming_guide/source_zh_cn/operator.md +++ b/docs/programming_guide/source_zh_cn/operator.md @@ -16,7 +16,6 @@ - [求三角函数](#求三角函数) - [向量运算](#向量运算) - [Squeeze](#squeeze) - - [求Sparse2Dense](#求sparse2dense) - [矩阵运算](#矩阵运算) - [矩阵乘法](#矩阵乘法) - [广播机制](#广播机制) @@ -43,10 +42,13 @@ ## 概述 + MindSpore的算子组件,可从算子使用方式和算子功能两种维度进行划分。 ## 算子使用方式 + 算子相关接口主要包括operations、functional和composite,可通过ops直接获取到这三类算子。 + - operations提供单个的Primtive算子。一个算子对应一个原语,是最小的执行对象,需要实例化之后使用。 - composite提供一些预定义的组合算子,以及复杂的涉及图变换的算子,如`GradOperation`。 - 
functional提供operations和composite实例化后的对象,简化算子的调用流程。 @@ -60,6 +62,7 @@ Primitive算子也称为算子原语,它直接封装了底层的Ascend、GPU Primitive算子接口是构建高阶接口、自动微分、网络模型等能力的基础。 代码样例如下: + ```python import numpy as np import mindspore @@ -74,7 +77,8 @@ print("output =", output) ``` 输出如下: -``` + +```text output = [ 1. 8. 64.] ``` @@ -99,7 +103,8 @@ print("output =", output) ``` 输出如下: -``` + +```text output = [ 1. 8. 64.] ``` @@ -108,6 +113,7 @@ output = [ 1. 8. 64.] composite提供了一些算子的组合,包括clip_by_value和random相关的一些算子,以及涉及图变换的函数(`GradOperation`、`HyperMap`和`Map`等)。 算子的组合可以直接像一般函数一样使用,例如使用`normal`生成一个随机分布: + ```python from mindspore.common import dtype as mstype from mindspore.ops import composite as C @@ -118,8 +124,10 @@ stddev = Tensor(1.0, mstype.float32) output = C.normal((2, 3), mean, stddev, seed=5) print("ouput =", output) ``` + 输出如下: -``` + +```text output = [[2.4911082 0.7941146 1.3117087] [0.30582333 1.772938 1.525996]] ``` @@ -129,6 +137,7 @@ output = [[2.4911082 0.7941146 1.3117087] 针对涉及图变换的函数,用户可以使用`MultitypeFuncGraph`定义一组重载的函数,根据不同类型,采用不同实现。 代码样例如下: + ```python import numpy as np from mindspore.ops.composite import MultitypeFuncGraph @@ -149,8 +158,10 @@ tensor2 = Tensor(np.array([[1.2, 2.1], [2.2, 3.2]]).astype('float32')) print('tensor', add(tensor1, tensor2)) print('scalar', add(1, 2)) ``` + 输出如下: -``` + +```text tensor [[2.4, 4.2] [4.4, 6.4]] scalar 3 @@ -183,36 +194,41 @@ scalar 3 有些标量运算符对常用的数学运算符进行了重载。并且支持类似NumPy的广播特性。 以下代码实现了对input_x作乘方数为input_y的乘方操作: + ```python -import numpy as np +import numpy as np import mindspore from mindspore import Tensor -import mindspore.ops.operations as P + input_x = mindspore.Tensor(np.array([1.0, 2.0, 4.0]), mindspore.float32) input_y = 3.0 print(input_x**input_y) ``` 输出如下: -``` + +```text [ 1. 8. 64.] 
``` #### 加法 上述代码中`input_x`和`input_y`的相加实现方式如下: + ```python print(input_x + input_y) ``` 输出如下: -``` + +```text [4.0 5.0 7.0] ``` #### Element-wise乘法 以下代码实现了Element-wise乘法示例: + ```python import numpy as np import mindspore @@ -228,13 +244,15 @@ print(res) ``` 输出如下: -``` + +```text [4. 10. 18] ``` #### 求三角函数 以下代码实现了Acos: + ```python import numpy as np import mindspore @@ -248,9 +266,11 @@ print(output) ``` 输出如下: -``` + +```text [0.7377037, 1.5307858, 1.2661037,0.97641146] ``` + ### 向量运算 向量运算符只在一个特定轴上运算,将一个向量映射到一个标量或者另外一个向量。 @@ -258,6 +278,7 @@ print(output) #### Squeeze 以下代码实现了压缩第3个通道维度为1的通道: + ```python import numpy as np import mindspore @@ -272,34 +293,12 @@ print(output) ``` 输出如下: -``` + +```text [[1. 1.] [1. 1.] [1. 1.]] ``` -#### 求Sparse2Dense - -以下代码实现了对Sparse2Dense示例: -```python -import numpy as np -import mindspore as ms -from mindspore import Tensor -import mindspore.ops.operations as P - -indices = Tensor([[0, 1], [1, 2]]) -values = Tensor([1, 2], dtype=ms.float32) -dense_shape = (3, 4) -out = P.SparseToDense()(indices, values, dense_shape) - -print(out) -``` - -输出如下: -``` -[[0, 1, 0, 0], - [0, 0, 2, 0], - [0, 0, 0, 0]] -``` ### 矩阵运算 @@ -308,6 +307,7 @@ print(out) #### 矩阵乘法 以下代码实现了input_x 和 input_y的矩阵乘法: + ```python import numpy as np import mindspore @@ -323,7 +323,8 @@ print(output) ``` 输出如下: -``` + +```text [[3. 3. 3. 3.]] ``` @@ -332,6 +333,7 @@ print(output) 广播表示输入各变量channel数目不一致时,改变他们的channel数以得到结果。 - 以下代码实现了广播机制的示例: + ```python from mindspore import Tensor from mindspore.communication import init @@ -363,6 +365,7 @@ output = net(input_) 卷积操作 以下代码实现了常见卷积操作之一的2D convolution 操作: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -376,8 +379,10 @@ res = conv2d(input, weight) print(res) ``` + 输出如下: -``` + +```text [[[[288. 288. 288. ... 288. 288. 288.] [288. 288. 288. ... 288. 288. 288.] [288. 288. 288. ... 288. 288. 288.] 
@@ -411,8 +416,10 @@ res = conv2d_backprop_input(dout, weight, F.shape(x)) print(res) ``` + 输出如下: -``` + +```text [[[[ 32. 64. 96. ... 96. 64. 32.] [ 64. 128. 192. ... 192. 128. 64.] [ 96. 192. 288. ... 288. 192. 96.] @@ -433,6 +440,7 @@ print(res) #### 激活函数 以下代码实现Softmax激活函数计算: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -447,15 +455,15 @@ print(res) ``` 输出如下: -``` + +```text [0.01165623 0.03168492 0.08612854 0.23412167 0.6364086] ``` #### LossFunction - L1Loss - 以下代码实现了L1 loss function: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -470,15 +478,15 @@ print(res) ``` 输出如下: -``` + +```text [0. 0. 0.5] ``` #### 优化算法 - SGD - 以下代码实现了SGD梯度下降算法的具体实现,输出是result: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -498,7 +506,8 @@ print(result) ``` 输出如下: -``` + +```text [0. 0. 0. 0.] ``` @@ -525,7 +534,8 @@ print(typea) ``` 输出如下: -``` + +```text Float32 ``` @@ -550,7 +560,8 @@ print(type(result)) ``` 输出结果: -``` + +```text ``` @@ -559,6 +570,7 @@ print(type(result)) 返回输入数据的形状。 以下代码实现了返回输入数据input_tensor的操作: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -572,7 +584,8 @@ print(output) ``` 输出如下: -``` + +```text [3, 2, 1] ``` @@ -581,6 +594,7 @@ print(output) 图像操作包括图像预处理操作,如图像剪切(Crop,便于得到大量训练样本)和大小变化(Reise,用于构建图像金子塔等)。 以下代码实现了Crop和Resize操作: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -613,7 +627,8 @@ print(output.asnumpy()) ``` 输出如下: -``` + +```text [[[[ 6.51672244e-01 -1.85958534e-01 5.19907832e-01] [ 1.53466597e-01 4.10562098e-01 6.26138210e-01] [ 6.62892580e-01 3.81776541e-01 4.69261825e-01] @@ -645,6 +660,7 @@ print(output.asnumpy()) 对物体所在区域方框进行编码,得到类似PCA的更精简信息,以便做后续类似特征提取,物体检测,图像恢复等任务。 以下代码实现了对anchor_box和groundtruth_box的boundingbox encode: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -659,7 +675,8 @@ print(res) ``` 输出如下: -``` + +```text [[5.0000000e-01 5.0000000e-01 
-6.5504000e+04 6.9335938e-01] [-1.0000000e+00 2.5000000e-01 0.0000000e+00 4.0551758e-01]] ``` @@ -669,6 +686,7 @@ print(res) 编码器对区域位置信息解码之后,用此算子进行解码。 以下代码实现了: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -683,7 +701,8 @@ print(res) ``` 输出如下: -``` + +```text [[4.1953125 0. 0. 5.1953125] [2.140625 0. 3.859375 60.59375]] ``` @@ -693,6 +712,7 @@ print(res) 计算预测的物体所在方框和真实物体所在方框的交集区域与并集区域的占比大小,常作为一种损失函数,用以优化模型。 以下代码实现了计算两个变量anchor_boxes和gt_boxes之间的IOU,以out输出: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -707,7 +727,8 @@ print(out) ``` 输出如下: -``` + +```text [[0. -0. 0.] [0. -0. 0.] [0. 0. 0.]] @@ -722,6 +743,7 @@ print(out) 输出Tensor变量的数值,方便用户随时随地打印想了解或者debug必需的某变量数值。 以下代码实现了输出x这一变量的值: + ```python from mindspore import nn @@ -741,6 +763,7 @@ class DebugNN(nn.Cell): 打印中间变量的梯度,是比较常用的算子,目前仅支持Pynative模式。 以下代码实现了打印中间变量(例中x,y)的梯度: + ```python from mindspore import Tensor import mindspore.ops.operations as P @@ -765,7 +788,9 @@ def backward(x, y): backward(1, 2) ``` + 输出如下: -``` + +```text (Tensor(shape=[], dtype=Float32, value=2),) ``` diff --git a/docs/programming_guide/source_zh_cn/optim.md b/docs/programming_guide/source_zh_cn/optim.md index da9807d72eb72ed5bee84f288daa4040897474c7..dc96b82e7be8de0613344c55081df6e34486676d 100644 --- a/docs/programming_guide/source_zh_cn/optim.md +++ b/docs/programming_guide/source_zh_cn/optim.md @@ -24,6 +24,7 @@ > 本文档中的所有示例,支持CPU,GPU,Ascend环境。 ## 学习率 + ### dynamic_lr `mindspore.nn.dynamic_lr`模块有以下几个类: @@ -40,7 +41,7 @@ 例如`piecewise_constant_lr`类代码样例如下: -``` +```python from mindspore.nn.dynamic_lr import piecewise_constant_lr def test_dynamic_lr(): @@ -55,7 +56,8 @@ if __name__ == '__main__': ``` 返回结果如下: -``` + +```text [0.1, 0.1, 0.05, 0.05, 0.05, 0.01, 0.01, 0.01, 0.01, 0.01] ``` @@ -73,7 +75,8 @@ if __name__ == '__main__': 它们是属于`learning_rate_schedule`的不同实现方式。 例如ExponentialDecayLR类代码样例如下: -``` + +```python from mindspore.common import dtype as mstype from mindspore 
import Tensor from mindspore.nn.learning_rate_schedule import ExponentialDecayLR @@ -93,13 +96,15 @@ if __name__ == '__main__': ``` 返回结果如下: -``` + +```text 0.094868325 ``` - ## Optimizer + ### 如何使用 + 为了使用`mindspore.nn.optim`,我们需要构建一个`Optimizer`对象。这个对象能够保持当前参数状态并基于计算得到的梯度进行参数更新。 - 构建 @@ -108,7 +113,7 @@ if __name__ == '__main__': 代码样例如下: -``` +```python from mindspore import nn optim = nn.SGD(group_params, learning_rate=0.1, weight_decay=0.0) @@ -125,7 +130,7 @@ optim = nn.Adam(group_params, learning_rate=0.1, weight_decay=0.0) 我们仍然能够传递选项作为关键字参数,在未重写这些选项的组中,它们会被用作默认值。当你只想改动一个参数组的选项,但其他参数组的选项不变时,这是非常有用的。 例如,当我们想指定每一层的学习率时,以`SGD`为例: -``` +```python from mindspore import nn optim = nn.SGD([{'params': conv_params, 'weight_decay': 0.01}, @@ -134,6 +139,7 @@ optim = nn.SGD([{'params': conv_params, 'weight_decay': 0.01}, learning_rate=0.1, weight_decay=0.0) ``` + 这段示例意味着当参数是conv_params时候,权重衰减使用的是0.01,学习率使用的是0.1;而参数是no_conv_params时候,权重衰减使用的是0.0,学习率使用的是0.01。这个学习率learning_rate=0.1会被用于所有分组里没有设置学习率的参数,权重衰减weight_decay也是如此。 ### 内置优化器 @@ -149,7 +155,7 @@ optim = nn.SGD([{'params': conv_params, 'weight_decay': 0.01}, 例如`SGD`的代码样例如下: -``` +```python from mindspore import nn from mindspore.train import Model from .optimizer import Optimizer @@ -183,4 +189,4 @@ optim = nn.SGD(group_params, learning_rate=0.1, weight_decay=0.0) loss = nn.SoftmaxCrossEntropyWithLogits() model = Model(net, loss_fn=loss, optimizer=optim) -``` \ No newline at end of file +``` diff --git a/docs/programming_guide/source_zh_cn/parameter.md b/docs/programming_guide/source_zh_cn/parameter.md index d42c81a97dabc58c07ec9f0c1bfcbb01bd029d7c..d179d6c1619d71511a0bda3a87e5ff4206df5857 100644 --- a/docs/programming_guide/source_zh_cn/parameter.md +++ b/docs/programming_guide/source_zh_cn/parameter.md @@ -18,9 +18,11 @@ `Parameter`是变量张量,代表在训练网络时,需要被更新的参数。本章主要介绍了`Parameter`的初始化以及属性和方法的使用,同时介绍了`ParameterTuple`。 ## 初始化 -``` + +```python mindspore.Parameter(default_input, name, requires_grad=True, layerwise_parallel=False) 
``` + 初始化一个`Parameter`对象,传入的数据支持`Tensor`、`Initializer`、`int`和`float`四种类型。 `Initializer`是初始化器,保存了shape和dtype信息,提供`to_tensor`方法生成存有数据的`Tensor`,可调用`initializer`接口生成`Initializer`对象。 @@ -38,7 +40,8 @@ mindspore.Parameter(default_input, name, requires_grad=True, layerwise_parallel= 有关分布式并行的相关配置,可以参考文档:。 下例通过三种不同的数据类型构造了`Parameter`,三个`Parameter`都需要更新,都不采用layerwise并行。如下: -``` + +```python import numpy as np from mindspore import Tensor, Parameter from mindspore.common import dtype as mstype @@ -53,12 +56,12 @@ print(x, "\n\n", y, "\n\n", z) 输出如下: -``` +```text Parameter (name=x, value=[[0 1 2] - [3 4 5]]) + [3 4 5]]) Parameter (name=y, value=[[[1. 1. 1.] - [1. 1. 1.]]]) + [1. 1. 1.]]]) Parameter (name=z, value=2.0) ``` @@ -84,7 +87,7 @@ Parameter (name=z, value=2.0) 下例通过`Tensor`初始化一个`Parameter`,获取了`Parameter`的相关属性。如下: -``` +```python import numpy as np from mindspore import Tensor, Parameter @@ -102,7 +105,7 @@ print("name: ", x.name, "\n", 输出如下: -``` +```text name: x sliced: False is_init: False @@ -111,10 +114,11 @@ requires_grad: True layerwise_parallel: False data: Parameter (name=x, value=[[0 1 2] - [3 4 5]]) + [3 4 5]]) ``` ## 方法 + - `init_data`:在网络采用半自动或者全自动并行策略的场景下, 当初始化`Parameter`传入的数据是`Initializer`时,可调用该接口将`Parameter`保存的数据转换为`Tensor`。 @@ -127,7 +131,7 @@ data: Parameter (name=x, value=[[0 1 2] 下例通过`Initializer`来初始化`Tensor`,调用了`Parameter`的相关方法。如下: -``` +```python import numpy as np from mindspore import Tensor, Parameter @@ -145,7 +149,7 @@ print(x.set_data(default_input=Tensor(np.arange(2*3).reshape((1, 2, 3))))) 输出如下: -``` +```text Parameter (name=x, value=[[[1. 1. 1.] [1. 1. 1.]]]) Parameter (name=x_c.x, value=[[[1. 1. 1.] @@ -158,11 +162,12 @@ Parameter (name=x, value=[[[0. 1. 2.] 
``` ## ParameterTuple + 继承于`tuple`,用于保存多个`Parameter`,通过`__new__(cls, iterable)`传入一个存放`Parameter`的迭代器进行构造,提供`clone`接口进行克隆。 下例构造了一个`ParameterTuple`对象,并进行了克隆。如下: -``` +```python import numpy as np from mindspore import Tensor, Parameter, ParameterTuple from mindspore.common import dtype as mstype @@ -179,12 +184,12 @@ print(params_copy) 输出如下: -``` +```text (Parameter (name=x, value=Tensor(shape=[2, 3], dtype=Int64, [[ 0, 1, 2], [ 3, 4, 5]])), Parameter (name=y, value=Tensor(shape=[1, 2, 3], dtype=Float32, [[[ 1.00000000e+00, 1.00000000e+00, 1.00000000e+00], - [ 1.00000000e+00, 1.00000000e+00, 1.00000000e+00]]])), Parameter (name=z, value=Tensor(shape=[], dtype=Float32, 2))) + [ 1.00000000e+00, 1.00000000e+00, 1.00000000e+00]]])), Parameter (name=z, value=Tensor(shape=[], dtype=Float32, 2))) (Parameter (name=params_copy.x, value=Tensor(shape=[2, 3], dtype=Int64, [[ 0, 1, 2], diff --git a/docs/programming_guide/source_zh_cn/pipeline.md b/docs/programming_guide/source_zh_cn/pipeline.md index ba0e282794170c3c3b70fa0c6aa9f896f1920ab4..7b16631c79e9f80bfe1950ea4f57bf75c1b230eb 100644 --- a/docs/programming_guide/source_zh_cn/pipeline.md +++ b/docs/programming_guide/source_zh_cn/pipeline.md @@ -65,7 +65,7 @@ for data in dataset1.create_dict_iterator(): 输出结果如下: -``` +```text {'data': Tensor(shape=[3], dtype=Int64, value=[0, 1, 2])} {'data': Tensor(shape=[3], dtype=Int64, value=[2, 3, 4])} {'data': Tensor(shape=[3], dtype=Int64, value=[3, 4, 5])} @@ -109,7 +109,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text {'data': Tensor(shape=[3], dtype=Int64, value=[0, 1, 2])} {'data': Tensor(shape=[3], dtype=Int64, value=[1, 2, 3])} {'data': Tensor(shape=[3], dtype=Int64, value=[2, 3, 4])} @@ -156,7 +156,7 @@ for data in dataset2.create_dict_iterator(): 输出结果如下: -``` +```text {'data': Tensor(shape=[2, 3], dtype=Int64, value=[[0, 1, 2], [1, 2, 3]])} {'data': Tensor(shape=[2, 3], dtype=Int64, value=[[2, 3, 4], [3, 4, 5]])} {'data': Tensor(shape=[1, 3], dtype=Int64, 
value=[[4, 5, 6]])} @@ -192,7 +192,7 @@ for data in dataset1.create_dict_iterator(): 输出结果如下: -``` +```text {'data': Tensor(shape=[3], dtype=Int64, value=[0, 1, 2])} {'data': Tensor(shape=[3], dtype=Int64, value=[1, 2, 3])} {'data': Tensor(shape=[3], dtype=Int64, value=[2, 3, 4])} @@ -239,7 +239,7 @@ for data in dataset3.create_dict_iterator(): 输出结果如下: -``` +```text {'data1': Tensor(shape=[3], dtype=Int64, value= [0, 1, 2]), 'data2': Tensor(shape=[2], dtype=Int64, value= [1, 2])} {'data1': Tensor(shape=[3], dtype=Int64, value= [1, 2, 3]), 'data2': Tensor(shape=[2], dtype=Int64, value= [1, 2])} {'data1': Tensor(shape=[3], dtype=Int64, value= [2, 3, 4]), 'data2': Tensor(shape=[2], dtype=Int64, value= [1, 2])} @@ -279,7 +279,7 @@ for data in dataset3.create_dict_iterator(): 输出结果如下: -``` +```text {'data1': Tensor(shape=[3], dtype=Int64, value= [0, 0, 0])} {'data1': Tensor(shape=[3], dtype=Int64, value= [0, 0, 0])} {'data1': Tensor(shape=[3], dtype=Int64, value= [1, 2, 3])} diff --git a/docs/programming_guide/source_zh_cn/probability.md b/docs/programming_guide/source_zh_cn/probability.md index 3b1073b4cb974024e0f9aec87f91c3f8f86e3b8a..8cbea642405bdaea280a947253e83e33e4f102bd 100644 --- a/docs/programming_guide/source_zh_cn/probability.md +++ b/docs/programming_guide/source_zh_cn/probability.md @@ -82,6 +82,7 @@ MindSpore深度概率编程的目标是将深度学习和贝叶斯学习结合 伯努利分布,继承自 `Distribution` 类。 属性: + - `Bernoulli.probs`:伯努利试验成功的概率。 `Distribution` 基类调用 `Bernoulli` 中私有接口以实现基类中的公有接口。`Bernoulli` 支持的公有接口为: @@ -97,6 +98,7 @@ MindSpore深度概率编程的目标是将深度学习和贝叶斯学习结合 指数分布,继承自 `Distribution` 类。 属性: + - `Exponential.rate`:率参数。 `Distribution` 基类调用 `Exponential` 私有接口以实现基类中的公有接口。`Exponential` 支持的公有接口为: @@ -112,6 +114,7 @@ MindSpore深度概率编程的目标是将深度学习和贝叶斯学习结合 几何分布,继承自 `Distribution` 类。 属性: + - `Geometric.probs`:伯努利试验成功的概率。 `Distribution` 基类调用 `Geometric` 中私有接口以实现基类中的公有接口。`Geometric` 支持的公有接口为: @@ -127,6 +130,7 @@ MindSpore深度概率编程的目标是将深度学习和贝叶斯学习结合 正态(高斯)分布,继承自 `Distribution` 类。 `Distribution` 基类调用 `Normal` 
中私有接口以实现基类中的公有接口。`Normal` 支持的公有接口为: + - `mean`,`mode`,`var`:可选择传入分布的参数均值 *mean* 和标准差 *sd* 。 - `entropy`:可选择传入分布的参数均值 *mean* 和标准差 *sd* 。 - `cross_entropy`,`kl_loss`:必须传入 *dist* ,*mean_b* 和 *sd_b* 。*dist* 为另一分布的类型的名称,目前只支持此处为 *‘Normal’* 。*mean_b* 和 *sd_b* 为分布 *b* 的均值和标准差。可选择传入分布的参数 *a* 均值 *mean_a* 和标准差 *sd_a* 。 @@ -138,6 +142,7 @@ MindSpore深度概率编程的目标是将深度学习和贝叶斯学习结合 均匀分布,继承自 `Distribution` 类。 属性: + - `Uniform.low`:最小值。 - `Uniform.high`:最大值。 @@ -162,65 +167,91 @@ import mindspore.context as context import mindspore.nn.probability.distribution as msd context.set_context(mode=context.PYNATIVE_MODE) ``` + 以 `Normal` 为例, 创建一个均值为0.0、标准差为1.0的正态分布: + ```python my_normal = msd.Normal(0.0, 1.0, dtype=mstype.float32) ``` + 计算均值: + ```python mean = my_normal.mean() print(mean) ``` + 输出为: -```python + +```text 0.0 ``` + 计算方差: + ```python var = my_normal.var() print(var) ``` + 输出为: -```python + +```text 1.0 ``` + 计算熵: + ```python entropy = my_normal.entropy() print(entropy) ``` + 输出为: -```python + +```text 1.4189385 ``` + 计算概率密度函数: + ```python value = Tensor([-0.5, 0.0, 0.5], dtype=mstype.float32) prob = my_normal.prob(value) print(prob) ``` + 输出为: -```python + +```text [0.35206532, 0.3989423, 0.35206532] ``` + 计算累积分布函数: + ```python cdf = my_normal.cdf(value) print(cdf) ``` + 输出为: -```python + +```text [0.30852754, 0.5, 0.69146246] ``` + 计算 Kullback-Leibler 散度: + ```python mean_b = Tensor(1.0, dtype=mstype.float32) sd_b = Tensor(2.0, dtype=mstype.float32) kl = my_normal.kl_loss('Normal', mean_b, sd_b) print(kl) ``` + 输出为: -```python + +```text 0.44314718 ``` @@ -229,6 +260,7 @@ print(kl) 在图模式下,`Distribution` 子类可用在网络中。 导入相关模块: + ```python import mindspore.nn as nn from mindspore import Tensor @@ -237,20 +269,24 @@ import mindspore.context as context import mindspore.nn.probability.distribution as msd context.set_context(mode=context.GRAPH_MODE) ``` + 创建网络: + ```python # 网络继承nn.Cell class Net(nn.Cell): def __init__(self): super(Net, self).__init__() self.normal = msd.Normal(0.0, 1.0, 
dtype=mstype.float32) - + def construct(self, value, mean, sd): pdf = self.normal.prob(value) kl = self.normal.kl_loss("Normal", mean, sd) return pdf, kl ``` + 调用网络: + ```python net = Net() value = Tensor([-0.5, 0.0, 0.5], dtype=mstype.float32) @@ -260,8 +296,10 @@ pdf, kl = net(value, mean, sd) print("pdf: ", pdf) print("kl: ", kl) ``` + 输出为: -```python + +```text pdf: [0.3520653, 0.39894226, 0.3520653] kl: 0.5 ``` @@ -293,6 +331,7 @@ kl: 0.5 在执行之前,我们需要导入需要的库文件包。 导入相关模块: + ```python import numpy as np import mindspore.nn as nn @@ -305,6 +344,7 @@ context.set_context(mode=context.PYNATIVE_MODE) ``` 构造一个 `TransformedDistribution` 实例,使用 `Normal` 分布作为需要变换的分布类,使用 `Exp` 作为映射变换,可以生成 `LogNormal` 分布。 + ```python normal = msd.Normal(0.0, 1.0, dtype=dtype.float32) exp = msb.Exp() @@ -313,7 +353,8 @@ print(LogNormal) ``` 输出为: -```python + +```text TransformedDistribution< (_bijector): Exp (_distribution): Normal @@ -323,6 +364,7 @@ TransformedDistribution< 可以对 `LogNormal` 进行概率分布计算。例如: 计算累积分布函数: + ```python x = np.array([2.0, 5.0, 10.0], dtype=np.float32) tx = Tensor(x, dtype=dtype.float32) @@ -331,11 +373,13 @@ print(cdf) ``` 输出为: -```python + +```text [7.55891383e-01, 9.46239710e-01, 9.89348888e-01] ``` 计算对数累积分布函数: + ```python x = np.array([2.0, 5.0, 10.0], dtype=np.float32) tx = Tensor(x, dtype=dtype.float32) @@ -344,11 +388,13 @@ print(log_cdf) ``` 输出为: -```python + +```text [-2.79857576e-01, -5.52593507e-02, -1.07082408e-02] ``` 计算生存函数: + ```python x = np.array([2.0, 5.0, 10.0], dtype=np.float32) tx = Tensor(x, dtype=dtype.float32) @@ -357,11 +403,13 @@ print(survival_function) ``` 输出为: -```python + +```text [2.44108617e-01, 5.37602901e-02, 1.06511116e-02] ``` 计算对数生存函数: + ```python x = np.array([2.0, 5.0, 10.0], dtype=np.float32) tx = Tensor(x, dtype=dtype.float32) @@ -370,11 +418,13 @@ print(log_survival) ``` 输出为: -```python + +```text [-1.41014194e+00, -2.92322016e+00, -4.54209089e+00] ``` 计算概率密度函数: + ```python x = np.array([2.0, 5.0, 10.0], dtype=np.float32) tx = 
Tensor(x, dtype=dtype.float32) @@ -383,11 +433,13 @@ print(prob) ``` 输出为: -```python + +```text [1.56874031e-01, 2.18507163e-02, 2.81590177e-03] ``` 计算对数概率密度函数: + ```python x = np.array([2.0, 5.0, 10.0], dtype=np.float32) tx = Tensor(x, dtype=dtype.float32) @@ -396,11 +448,13 @@ print(log_prob) ``` 输出为: -```python + +```text [-1.85231221e+00, -3.82352161e+00, -5.87247276e+00] ``` 调用取样函数 `sample` 抽样: + ```python shape = ((3, 2)) sample = LogNormal.sample(shape) @@ -408,13 +462,15 @@ print(sample) ``` 输出为: -```python + +```text [[7.64315844e-01, 3.01435232e-01], [1.17166102e+00, 2.60277224e+00], [7.02699006e-01, 3.91564220e-01]]) ``` 当构造 `TransformedDistribution` 映射变换的 `is_constant_jacobian = true` 时(如 `ScalarAffine`),构造的 `TransformedDistribution` 实例可以使用直接使用 `mean` 接口计算均值,例如: + ```python normal = msd.Normal(0.0, 1.0, dtype=dtype.float32) scalaraffine = msb.ScalarAffine(1.0, 2.0) @@ -422,15 +478,19 @@ trans_dist = msd.TransformedDistribution(scalaraffine, normal, dtype=dtype.float mean = trans_dist.mean() print(mean) ``` + 输出为: -```python + +```text 2.0 ``` + ### 图模式下调用TransformedDistribution实例 在图模式下,`TransformedDistribution` 类可用在网络中。 导入相关模块: + ```python import mindspore.nn as nn from mindspore import Tensor @@ -442,6 +502,7 @@ context.set_context(mode=self.GRAPH_MODE) ``` 创建网络: + ```python class Net(nn.Cell): def __init__(self, shape, dtype=dtype.float32, seed=0, name='transformed_distribution'): @@ -451,7 +512,7 @@ class Net(nn.Cell): self.normal = msd.Normal(0.0, 1.0, dtype=dtype) self.lognormal = msd.TransformedDistribution(self.exp, self.normal, dtype=dtype, seed=seed, name=name) self.shape = shape - + def construct(self, value): cdf = self.lognormal.cdf(value) sample = self.lognormal.sample(self.shape) @@ -459,6 +520,7 @@ class Net(nn.Cell): ``` 调用网络: + ```python shape = (2, 3) net = Net(shape=shape, name="LogNormal") @@ -468,8 +530,10 @@ cdf, sample = net(tx) print("cdf: ", cdf) print("sample: ", sample) ``` + 输出为: -```python + +```text cdf: [0.7558914 
0.8640314 0.9171715 0.9462397] sample: [[0.21036398 0.44932044 0.5669641 ] [1.4103683 6.724116 0.97894996]] @@ -503,6 +567,7 @@ Bijector(`mindspore.nn.probability.bijector`)是概率编程的基本组成 输入是一个 `Distribution` 类:生成一个 `TransformedDistribution` **(不可在图内调用)**。 #### 幂函数变换映射(PowerTransform) + `PowerTransform` 做如下变量替换:$Y = g(X) = {(1 + X * c)}^{1 / c}$。其接口包括: 1. 类特征函数 @@ -515,15 +580,18 @@ Bijector(`mindspore.nn.probability.bijector`)是概率编程的基本组成 - `inverse_log_jacobian`:反向映射的导数的对数,输入为 `Tensor` 。 #### 指数变换映射(Exp) + `Exp` 做如下变量替换:$Y = g(X)= exp(X)$。其接口包括: 映射函数 + - `forward`:正向映射,输入为 `Tensor` 。 - `inverse`:反向映射,输入为 `Tensor` 。 - `forward_log_jacobian`:正向映射的导数的对数,输入为 `Tensor` 。 - `inverse_log_jacobian`:反向映射的导数的对数,输入为 `Tensor` 。 #### 标量仿射变换映射(ScalarAffine) + `ScalarAffine` 做如下变量替换:Y = g(X) = a * X + b。其接口包括: 1. 类特征函数 @@ -537,6 +605,7 @@ Bijector(`mindspore.nn.probability.bijector`)是概率编程的基本组成 - `inverse_log_jacobian`:反向映射的导数的对数,输入为 `Tensor` 。 #### Softplus变换映射(Softplus) + `Softplus` 做如下变量替换:$Y = g(X) = log(1 + e ^ {kX}) / k $。其接口包括: 1. 
类特征函数 @@ -553,6 +622,7 @@ Bijector(`mindspore.nn.probability.bijector`)是概率编程的基本组成 在执行之前,我们需要导入需要的库文件包。双射类最主要的库是 `mindspore.nn.probability.bijector`,导入后我们使用 `msb` 作为库的缩写并进行调用。 导入相关模块: + ```python import numpy as np import mindspore.nn as nn @@ -566,19 +636,22 @@ context.set_context(mode=context.PYNATIVE_MODE) 下面我们以 `PowerTransform` 为例。创建一个指数为2的 `PowerTransform` 对象。 构造 `PowerTransform`: + ```python powertransform = msb.PowerTransform(power=2) print(powertransform) ``` 输出: -```python + +```text PowerTransform ``` 接下来可以使用映射函数进行运算。 调用 `forward` 方法,计算正向映射: + ```python x = np.array([2.0, 3.0, 4.0, 5.0], dtype=np.float32) tx = Tensor(x, dtype=dtype.float32) @@ -587,40 +660,47 @@ print(forward) ``` 输出为: -```python + +```text [2.23606801e+00, 2.64575124e+00, 3.00000000e+00, 3.31662488e+00] ``` 输入 `inverse` 方法,计算反向映射: + ```python inverse = powertransform.inverse(tx) print(inverse) ``` 输出为: -```python + +```text [1.50000000e+00, 4.00000048e+00, 7.50000000e+00, 1.20000010e+01] ``` 输入 `forward_log_jacobian` 方法,计算正向映射导数的对数: + ```python forward_log_jaco = powertransform.forward_log_jacobian(tx) print(forward_log_jaco) ``` 输出: -```python + +```text [-8.04718971e-01, -9.72955048e-01, -1.09861231e+00, -1.19894767e+00] ``` 输入 `inverse_log_jacobian` 方法,计算反向映射导数的对数: + ```python inverse_log_jaco = powertransform.inverse_log_jacobian(tx) print(inverse_log_jaco) ``` 输出为: -```python + +```text [6.93147182e-01 1.09861231e+00 1.38629436e+00 1.60943794e+00] ``` @@ -629,6 +709,7 @@ print(inverse_log_jaco) 在图模式下,`Bijector` 子类可用在网络中。 导入相关模块: + ```python import mindspore.nn as nn from mindspore import Tensor @@ -639,6 +720,7 @@ context.set_context(mode=context.GRAPH_MODE) ``` 创建网络: + ```python class Net(nn.Cell): def __init__(self): @@ -653,7 +735,9 @@ class Net(nn.Cell): inverse_log_jaco = self.s1.inverse_log_jacobian(value) return forward, inverse, forward_log_jaco, inverse_log_jaco ``` + 调用网络: + ```python net = Net() x = np.array([2.0, 3.0, 4.0, 5.0]).astype(np.float32) @@ -664,8 +748,10 @@ 
print("inverse: ", inverse) print("forward_log_jaco: ", forward_log_jaco) print("inverse_log_jaco: ", inverse_log_jaco) ``` + 输出为: -```python + +```text forward: [2.236068 2.6457512 3. 3.3166249] inverse: [ 1.5 4.0000005 7.5 12.000001 ] forward_log_jaco: [-0.804719 -0.97295505 -1.0986123 -1.1989477 ] @@ -723,6 +809,7 @@ encoder = Encoder() decoder = Decoder() vae = VAE(encoder, decoder, hidden_size=400, latent_size=20) ``` + ### ConditionalVAE 类似地,ConditionalVAE与VAE的使用方法比较相近,不同的是,ConditionalVAE利用了数据集的标签信息,属于有监督学习算法,其生成效果一般会比VAE好。 @@ -779,6 +866,7 @@ cvae = ConditionalVAE(encoder, decoder, hidden_size=400, latent_size=20, num_cla ```python ds_train = create_dataset(image_path, 128, 1) ``` + 接下来,需要用到infer接口进行VAE网络的变分推断。 ## 概率推断算法 @@ -796,7 +884,9 @@ vi = SVI(net_with_loss=net_with_loss, optimizer=optimizer) vae = vi.run(train_dataset=ds_train, epochs=10) trained_loss = vi.get_train_loss() ``` + 最后,得到训练好的VAE网络后,我们可以使用`vae.generate_sample`生成新样本,需要传入待生成样本的个数,及生成样本的shape,shape需要保持和原数据集中的样本shape一样;当然,我们也可以使用`vae.reconstruct_sample`重构原来数据集中的样本,来测试VAE网络的重建能力。 + ```python generated_sample = vae.generate_sample(64, IMAGE_SHAPE) for sample in ds_train.create_dict_iterator(): @@ -804,10 +894,13 @@ for sample in ds_train.create_dict_iterator(): reconstructed_sample = vae.reconstruct_sample(sample_x) print('The shape of the generated sample is ', generated_sample.shape) ``` + 我们可以看一下新生成样本的shape: -```python + +```text The shape of the generated sample is (64, 1, 32, 32) ``` + ConditionalVAE训练过程和VAE的过程类似,但需要注意的是使用训练好的ConditionalVAE网络生成新样本和重建新样本时,需要输入标签信息,例如下面生成的新样本就是64个0-7的数字。 ```python @@ -819,8 +912,10 @@ for sample in ds_train.create_dict_iterator(): reconstructed_sample = cvae.reconstruct_sample(sample_x, sample_y) print('The shape of the generated sample is ', generated_sample.shape) ``` + 查看一下新生成的样本的shape: -```python + +```text The shape of the generated sample is (64, 1, 32, 32) ``` @@ -849,8 +944,10 @@ class TransformToBNN: self.bnn_factor = bnn_factor self.bnn_loss_file = 
None ``` + 参数`trainable_bnn`是经过`TrainOneStepCell`包装的可训练DNN模型,`dnn_factor`和`bnn_factor`分别为由损失函数计算得到的网络整体损失的系数和每个贝叶斯层的KL散度的系数。 API`TransformToBNN`主要实现了两个功能: + - 功能一:转换整个模型 `transform_to_bnn_model`方法可以将整个DNN模型转换为BNN模型。其定义如下: @@ -881,8 +978,9 @@ API`TransformToBNN`主要实现了两个功能: Returns: Cell, a trainable BNN model wrapped by TrainOneStepCell. """ - + ``` + 参数`get_dense_args`指定从DNN模型的全连接层中获取哪些参数,默认值是DNN模型的全连接层和BNN的全连接层所共有的参数,参数具体的含义可以参考[API说明文档](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.Dense);`get_conv_args`指定从DNN模型的卷积层中获取哪些参数,默认值是DNN模型的卷积层和BNN的卷积层所共有的参数,参数具体的含义可以参考[API说明文档](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.Conv2d);参数`add_dense_args`和`add_conv_args`分别指定了要为BNN层指定哪些新的参数值。需要注意的是,`add_dense_args`中的参数不能与`get_dense_args`重复,`add_conv_args`和`get_conv_args`也是如此。 - 功能二:转换指定类型的层 @@ -904,8 +1002,9 @@ API`TransformToBNN`主要实现了两个功能: Returns: Cell, a trainable model wrapped by TrainOneStepCell, whose specific type of layer is transformed to the corresponding bayesian layer. 
- """ + """ ``` + 参数`dnn_layer`指定将哪个类型的DNN层转换成BNN层,`bnn_layer`指定DNN层将转换成哪个类型的BNN层,`get_args`和`add_args`分别指定从DNN层中获取哪些参数和要为BNN层的哪些参数重新赋值。 如何在MindSpore中使用API`TransformToBNN`可以参考教程[DNN一键转换成BNN](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/apply_deep_probability_programming.html#dnnbnn) @@ -918,6 +1017,7 @@ API`TransformToBNN`主要实现了两个功能: - 认知不确定性(Epistemic Uncertainty):模型自身对输入数据的估计可能因为训练不佳、训练数据不够等原因而不准确,可以通过增加训练数据等方式来缓解。 不确定性评估工具箱的接口如下: + - `model`:待评估不确定性的已训练好的模型。 - `train_dataset`:用于训练的数据集,迭代器类型。 - `task_type`:模型的类型,字符串,输入“regression”或者“classification”。 @@ -928,6 +1028,7 @@ API`TransformToBNN`主要实现了两个功能: - `save_model`:布尔类型,是否需要存储模型。 在使用前,需要先训练好模型,以LeNet5为例,使用方式如下: + ```python from mindspore.nn.probability.toolbox.uncertainty_evaluation import UncertaintyEvaluation from mindspore.train.serialization import load_checkpoint, load_param_into_net @@ -955,11 +1056,13 @@ if __name__ == '__main__': print('The shape of epistemic uncertainty is ', epistemic_uncertainty.shape) print('The shape of aleatoric uncertainty is ', aleatoric_uncertainty.shape) ``` + `eval_epistemic_uncertainty`计算的是认知不确定性,也叫模型不确定性,对于每一个样本的每个预测标签都会有一个不确定值;`eval_aleatoric_uncertainty`计算的是偶然不确定性,也叫数据不确定性,对于每一个样本都会有一个不确定值。 所以输出为: -```python +```text The shape of epistemic uncertainty is (32, 10) The shape of aleatoric uncertainty is (32,) ``` + uncertainty的值位于[0,1]之间,越大表示不确定性越高。 diff --git a/docs/programming_guide/source_zh_cn/run.md b/docs/programming_guide/source_zh_cn/run.md index 3b3138b05477a670dc46b81fddaeef1621689e26..6281328a1dff542052b7f2d093749f23ba59ecd5 100644 --- a/docs/programming_guide/source_zh_cn/run.md +++ b/docs/programming_guide/source_zh_cn/run.md @@ -15,14 +15,15 @@ ## 概述 -执行主要有三种方式:单算子、普通函数和网络训练模型。 +执行主要有三种方式:单算子、普通函数和网络训练模型。 ## 执行单算子 执行单个算子,并打印相关结果。 代码样例如下: + ```python import numpy as np import mindspore.nn as nn @@ -37,6 +38,7 @@ print(output.asnumpy()) ``` 输出如下: + ```python [[[[ 0.06022915 0.06149777 0.06149777 0.06149777 0.01145121] [ 0.06402162 0.05889071 
0.05889071 0.05889071 -0.00933781] @@ -63,12 +65,12 @@ print(output.asnumpy()) [ 0.01015155 0.00781826 0.00781826 0.00781826 -0.02884173]]]] ``` - ## 执行普通函数 将若干算子组合成一个函数,然后直接通过函数调用的方式执行这些算子,并打印相关结果,如下例所示。 代码样例如下: + ```python import numpy as np from mindspore import context, Tensor @@ -88,6 +90,7 @@ print(output.asnumpy()) ``` 输出如下: + ```python [[3. 3. 3.] [3. 3. 3.] @@ -95,14 +98,17 @@ print(output.asnumpy()) ``` ## 执行网络模型 + MindSpore的Model接口是用于训练和验证的高级接口。可以将有训练或推理功能的layers组合成一个对象,通过调用train、eval、predict接口可以分别实现训练、推理和预测功能。 用户可以根据实际需要传入网络、损失函数和优化器等初始化Model接口,还可以通过配置amp_level实现混合精度,配置metrics实现模型评估。 ### 执行训练模型 + 通过调用Model的train接口可以实现训练。 代码样例如下: + ```python import os @@ -180,7 +186,7 @@ def weight_variable(): class LeNet5(nn.Cell): - """ + """ Lenet network Args: @@ -231,6 +237,7 @@ if __name__ == "__main__": > 示例中用到的MNIST数据集的获取方法,可以参照[实现一个图片分类应用](https://www.mindspore.cn/tutorial/training/zh-CN/master/quick_start/quick_start.html)的下载数据集部分,下同。 输出如下: + ```python epoch: 1 step: 1, loss is 2.300784 epoch: 1 step: 2, loss is 2.3076947 @@ -244,23 +251,26 @@ epoch: 1 step: 1875, loss is 0.017264696 > 使用PyNative模式调试, 请参考[使用PyNative模式调试](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/debug_in_pynative_mode.html), 包括单算子、普通函数和网络训练模型的执行。 ### 执行推理模型 + 通过调用Model的eval接口可以实现推理。为了方便评估模型的好坏,可以在Model接口初始化的时候设置评估指标Metric。 Metric是用于评估模型好坏的指标。常见的主要有Accuracy、Fbeta、Precision、Recall和TopKCategoricalAccuracy等,通常情况下,一种模型指标无法全面地评估模型的好坏,一般会结合多个指标共同作用对模型进行评估。 常用的内置评估指标: + - `Accuracy`(准确率):是一个用于评估分类模型的指标。通俗来说,准确率是指我们的模型预测正确的结果所占的比例。 公式:$$Accuracy = (TP+TN)/(TP+TN+FP+FN)$$ - `Precision`(精确率):在被识别为正类别的样本中,确实为正类别的比例。公式:$$Precision = TP/(TP+FP)$$ - `Recall`(召回率):在所有正类别样本中,被正确识别为正类别的比例。 公式:$$Recall = TP/(TP+FN)$$ -- `Fbeta`(调和均值):综合考虑precision和recall的调和均值。 +- `Fbeta`(调和均值):综合考虑precision和recall的调和均值。 公式:$$F_\beta = (1 + \beta^2) \cdot \frac{precision \cdot recall}{(\beta^2 \cdot precision) + recall}$$ - `TopKCategoricalAccuracy`(多分类TopK准确率):计算TopK分类准确率。 代码样例如下: + ```python import 
os @@ -278,7 +288,7 @@ from mindspore.train.serialization import load_checkpoint, load_param_into_net class LeNet5(nn.Cell): - """ + """ Lenet network Args: @@ -375,7 +385,8 @@ if __name__ == "__main__": > `checkpoint_lenet-1_1875.ckpt`文件的保存方法,可以参考[实现一个图片分类应用](https://www.mindspore.cn/tutorial/training/zh-CN/master/quick_start/quick_start.html)的训练网络部分。 输出如下: + ```python ============== {'Accuracy': 0.96875, 'Precision': array([0.97782258, 0.99451052, 0.98031496, 0.92723881, 0.98352214, 0.97165533, 0.98726115, 0.9472196 , 0.9394551 , 0.98236515])} ============== -``` \ No newline at end of file +``` diff --git a/docs/programming_guide/source_zh_cn/sampler.md b/docs/programming_guide/source_zh_cn/sampler.md index 295805463f13c3d35b5d75ba0814fa3772e2bd74..5a546c576905679d915f623845f03c38b6123516 100644 --- a/docs/programming_guide/source_zh_cn/sampler.md +++ b/docs/programming_guide/source_zh_cn/sampler.md @@ -65,7 +65,7 @@ for data in dataset2.create_dict_iterator(): 输出结果如下: -``` +```text Image shape: (32, 32, 3) , Label: 0 Image shape: (32, 32, 3) , Label: 2 Image shape: (32, 32, 3) , Label: 6 @@ -102,7 +102,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text Image shape: (32, 32, 3) , Label: 1 Image shape: (32, 32, 3) , Label: 1 Image shape: (32, 32, 3) , Label: 0 @@ -134,7 +134,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text Image shape: (32, 32, 3) , Label: 5 Image shape: (32, 32, 3) , Label: 0 Image shape: (32, 32, 3) , Label: 3 @@ -162,7 +162,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text Image shape: (32, 32, 3) , Label: 0 Image shape: (32, 32, 3) , Label: 0 Image shape: (32, 32, 3) , Label: 1 @@ -206,7 +206,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: -``` +```text {'data': Tensor(shape=[], dtype=Int64, value= 0)} {'data': Tensor(shape=[], dtype=Int64, value= 3)} {'data': Tensor(shape=[], dtype=Int64, value= 6)} @@ -236,7 +236,7 @@ for data in dataset.create_dict_iterator(): 输出结果如下: 
-``` +```text Image shape: (32, 32, 3) , Label: 0 Image shape: (32, 32, 3) , Label: 2 Image shape: (32, 32, 3) , Label: 4 diff --git a/docs/programming_guide/source_zh_cn/security_and_privacy.md b/docs/programming_guide/source_zh_cn/security_and_privacy.md index ec46866350195062272cd61bae7f1717c52612ff..06dda165702d0b658822f9914389c75622c011cd 100644 --- a/docs/programming_guide/source_zh_cn/security_and_privacy.md +++ b/docs/programming_guide/source_zh_cn/security_and_privacy.md @@ -26,19 +26,22 @@ ## 对抗鲁棒性 ### Attack + `Attack`基类定义了对抗样本生成的使用接口,其子类实现了各种具体的生成算法,支持安全工作人员快速高效地生成对抗样本,用于攻击AI模型,以评估模型的鲁棒性。 ### Defense + `Defense`基类定义了对抗训练的使用接口,其子类实现了各种具体的对抗训练算法,增强模型的对抗鲁棒性。 ### Detector + `Detector`基类定义了对抗样本检测的使用接口,其子类实现了各种具体的检测算法,增强模型的对抗鲁棒性。 详细内容,请参考[对抗鲁棒性官网教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/improve_model_security_nad.html)。 ## 模型安全测试 -### Fuzzer +### Fuzzer `Fuzzer`类基于神经元覆盖率增益控制fuzzing流程,采用自然扰动和对抗样本生成方法作为变异策略,激活更多的神经元,从而探索不同类型的模型输出结果、错误行为,指导用户增强模型鲁棒性。 diff --git a/docs/programming_guide/source_zh_cn/tensor.md b/docs/programming_guide/source_zh_cn/tensor.md index b7a6196404b7942b7bc978c6d20c01e2b1099bb5..0ed0d4273e495fcbd8122ccad7e2291e7b8c10ea 100644 --- a/docs/programming_guide/source_zh_cn/tensor.md +++ b/docs/programming_guide/source_zh_cn/tensor.md @@ -12,6 +12,8 @@ +   + ## 概述 @@ -29,7 +31,7 @@ 代码样例如下: -``` +```python import numpy as np from mindspore import Tensor from mindspore.common import dtype as mstype @@ -46,7 +48,7 @@ print(x, "\n\n", y, "\n\n", z, "\n\n", m, "\n\n", n, "\n\n", p) 输出如下: -``` +```text [[1 2] [3 4]] @@ -66,12 +68,13 @@ True ### 属性 张量的属性包括形状(shape)和数据类型(dtype)。 + - 形状:`Tensor`的shape,是一个tuple。 - 数据类型:`Tensor`的dtype,是MindSpore的一个数据类型。 代码样例如下: -``` +```python import numpy as np from mindspore import Tensor from mindspore.common import dtype as mstype @@ -85,20 +88,21 @@ print(x_shape, x_dtype) 输出如下: -``` +```text (2, 2) Int32 ``` - + ### 方法 张量的方法包括`all`、`any`和`asnumpy`,`all`和`any`方法目前只支持Ascend。 + - 
`all(axis, keep_dims)`:在指定维度上通过`and`操作进行归约,`axis`代表归约维度,`keep_dims`表示是否保留归约后的维度。 - `any(axis, keep_dims)`:在指定维度上通过`or`操作进行归约,参数含义同`all`。 - `asnumpy()`:将`Tensor`转换为NumPy的array。 代码样例如下: -``` +```python import numpy as np from mindspore import Tensor from mindspore.common import dtype as mstype @@ -113,7 +117,7 @@ print(x_all, "\n\n", x_any, "\n\n", x_array) 输出如下: -``` +```text False True diff --git a/docs/programming_guide/source_zh_cn/tokenizer.md b/docs/programming_guide/source_zh_cn/tokenizer.md index 66c447d1c318ed1eb1fe4ce1e2636e8967f64e70..cb00c06cd11a0f47921843d41350c68078734d04 100644 --- a/docs/programming_guide/source_zh_cn/tokenizer.md +++ b/docs/programming_guide/source_zh_cn/tokenizer.md @@ -79,7 +79,7 @@ for i in dataset.create_dict_iterator(num_epochs=1, output_numpy=True): 输出结果如下: -``` +```text ------------------------before tokenization---------------------------- 床前明月光 疑是地上霜 @@ -130,7 +130,7 @@ for i in dataset.create_dict_iterator(num_epochs=1, output_numpy=True): 输出结果如下: -``` +```text ------------------------before tokenization---------------------------- 今天天气太好了我们一起去外面玩吧 ------------------------after tokenization----------------------------- @@ -167,7 +167,7 @@ for i in dataset.create_dict_iterator(num_epochs=1, output_numpy=True): 输出结果如下: -``` +```text ------------------------before tokenization---------------------------- I saw a girl with a telescope. ------------------------after tokenization----------------------------- @@ -203,7 +203,7 @@ for i in dataset.create_dict_iterator(num_epochs=1, output_numpy=True): 输出结果如下: -``` +```text ------------------------before tokenization---------------------------- Welcome to Beijing! 北京欢迎您! @@ -243,7 +243,7 @@ for i in dataset.create_dict_iterator(num_epochs=1, output_numpy=True): 输出结果如下: -``` +```text ------------------------before tokenization---------------------------- Welcome to Beijing! 北京欢迎您! 
@@ -285,7 +285,7 @@ for i in dataset.create_dict_iterator(num_epochs=1, output_numpy=True): 输出结果如下: -``` +```text ------------------------before tokenization---------------------------- my favorite diff --git a/docs/programming_guide/source_zh_cn/train.md b/docs/programming_guide/source_zh_cn/train.md index 57c313a78f21c92f95475c5f2faf47200c89208a..12182a926ca388876e06fab60090099be4e7196b 100644 --- a/docs/programming_guide/source_zh_cn/train.md +++ b/docs/programming_guide/source_zh_cn/train.md @@ -16,9 +16,11 @@ ## 概述 + MindSpore在Model_zoo也已经提供了大量的目标检测、自然语言处理等多种网络模型,供用户直接使用,但是对于某些高级用户而言可能想要自行设计网络或者自定义训练循环,下面就对自定义训练网络、自定义训练循环和边训练边推理三种场景进行介绍,另外对On device执行方式进行详细介绍。 ## 自定义训练网络 + 在自定义训练网络前,需要先了解下MindSpore的网络支持、Python源码构造网络约束和算子支持情况。 - 网络支持:当前MindSpore已经支持多种网络,按类型分为计算机视觉、自然语言处理、推荐和图神经网络,可以通过[网络支持](https://www.mindspore.cn/doc/note/zh-CN/master/network_list.html)查看具体支持的网络情况。如果现有网络无法满足用户需求,用户可以根据实际需要定义自己的网络。 @@ -30,6 +32,7 @@ MindSpore在Model_zoo也已经提供了大量的目标检测、自然语言处 > 当开发网络遇到内置算子不足以满足需求时,用户也可以参考[自定义算子](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/custom_operator_ascend.html),方便快捷地扩展昇腾AI处理器的自定义算子。 代码样例如下: + ```python import numpy as np @@ -74,15 +77,18 @@ if __name__ == "__main__": ``` 输出如下: + ```python -------loss------ [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] 
``` ## 自定义训练循环 + 用户如果不想使用MindSpore提供的Model接口,可以将模仿Model的train接口自由控制循环的迭代次数和每个epoch的step数量。 代码样例如下: + ```python import os @@ -244,6 +250,7 @@ if __name__ == "__main__": > 示例中用到的MNIST数据集的获取方法,可以参照[实现一个图片分类应用](https://www.mindspore.cn/tutorial/training/zh-CN/master/quick_start/quick_start.html)的下载数据集部分,下同。 输出如下: + ```python epoch: 1/10, losses: 2.294034719467163 epoch: 2/10, losses: 2.3150298595428467 @@ -260,9 +267,11 @@ epoch: 10/10, losses: 1.4282708168029785 > 典型的使用场景是梯度累积,详细查看[梯度累积](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/apply_gradient_accumulation.html)。 ## 边训练边推理 + 对于某些数据量较大、训练时间较长的复杂网络,为了能掌握训练的不同阶段模型精度的指标变化情况,可以通过边训练边推理的方式跟踪精度的变化情况。具体可以参考[同步训练和验证模型](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/evaluate_the_model_during_training.html)。 ## on-device执行 + 当前MindSpore支持的后端包括Ascend、GPU、CPU,所谓On Device中的Device通常指Ascend(昇腾)AI处理器。 昇腾芯片上集成了AICORE、AICPU和CPU。其中,AICORE负责大型Tensor Vector运算,AICPU负责标量运算,CPU负责逻辑控制和任务分发。 @@ -270,12 +279,14 @@ epoch: 10/10, losses: 1.4282708168029785 Host侧CPU负责将图或算子下发到昇腾芯片。昇腾芯片由于具备了运算、逻辑控制和任务分发的功能,所以不需要与Host侧的CPU进行频繁的交互,只需要将计算完的最终结果返回给Host侧,实现整图下沉到Device执行,避免Host-Device频繁交互,减小了开销。 以下是Device的主要组成结构: + - 片上32G内存:5G(parameter) + 26G(feature map) + 1G(HCCL) - 多流水线并行:6条流水线 - AICORE&带宽:32Cores、读写带宽128GBps - 通信协议:HCCS、PCIe4.0、RoCEv2 ### 计算图下沉 + 计算图整图下沉到Device上执行,减少Host-Device交互开销。可以结合循环下沉实现多个Step下沉,进一步减少Host和Device的交互次数。 循环下沉是在On Device执行的基础上的优化,目的是进一步减少Host侧和Device侧之间的交互次数。通常情况下,每个step都返回一个结果,循环下沉是控制每隔多少个step返回一次结果。 @@ -285,6 +296,7 @@ Host侧CPU负责将图或算子下发到昇腾芯片。昇腾芯片由于具备 也可以结合`train`接口的`dataset_sink_mode`和`sink_size`控制每个epoch的下沉数据量。 ### 数据下沉 + `Model`的`train`接口参数`dataset_sink_mode`可以控制数据是否下沉。`dataset_sink_mode`为True表示数据下沉,否则为非下沉。所谓下沉即数据通过通道直接传送到Device上。 dataset_sink_mode参数可以配合`sink_size`控制每个`epoch`下沉的数据量大小。当`dataset_sink_mode`设置为True,即数据下沉模式时: @@ -296,6 +308,7 @@ dataset_sink_mode参数可以配合`sink_size`控制每个`epoch`下沉的数据 下沉的总数据量由`epoch`和`sink_size`两个变量共同控制,即总数据量=`epoch`*`sink_size`。 代码样例如下: + ```python import 
os @@ -428,6 +441,7 @@ if __name__ == "__main__": `batch_size`为32的情况下,数据集的大小为1875,当`sink_size`设置为1000时,表示每个`epoch`下沉1000个batch的数据,下沉次数为`epoch`=10,下沉的总数据量为:`epoch`*`sink_size`=10000。 输出如下: + ```python epoch: 1 step: 1000, loss is 0.5399815 epoch: 2 step: 1000, loss is 0.033433747 @@ -441,4 +455,4 @@ epoch: 9 step: 1000, loss is 0.00017951085 epoch: 10 step: 1000, loss is 0.01490275 ``` -> `dataset_sink_mode`为False时,`sink_size`参数设置无效。 \ No newline at end of file +> `dataset_sink_mode`为False时,`sink_size`参数设置无效。 diff --git a/resource/release/release_list_zh_cn.md b/resource/release/release_list_zh_cn.md index 88e820a5fba2ceb533d80ce8b6fd1b9a86fcd154..8c6df5cdfc9ecf1a44441e861070c5590294d5f2 100644 --- a/resource/release/release_list_zh_cn.md +++ b/resource/release/release_list_zh_cn.md @@ -53,11 +53,13 @@ ## 1.0.0 + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | 4682be18cffdf86346bdb286ccd9e05f33be4138415dbc7db1650d029510ee44 | @@ -97,11 +99,13 @@ | 文档 | 编程指南
Python API
C++ API
FAQ
其他说明 | ## 0.7.0-beta + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | 522b80e84de1b414d3800a27d01e40f75332000e5246b24cc1aea7d9e5566ce5 | @@ -130,11 +134,13 @@ | 文档 | | | ## 0.6.0-beta + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | afea66c19beff797b99bf06bc0ed897a83fdb510d62e03663cef55a68e0f278f | @@ -154,21 +160,25 @@ | | GPU CUDA 10.1/CPU | Ubuntu-x86 | | 18f245bdff972414010c9f53de402d790cdef9a74f94ac41e5b6341e778e93b3 | ### 教程 + ### API + ### 文档 - + ## 0.5.2-beta + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | ec4bdb6c96d9ffd2d1e465bd07ac4a8a9c0633512b4fffe9217590ad1a576ea6 | @@ -188,20 +198,25 @@ | | GPU CUDA 10.1/CPU | Ubuntu-x86 | | 09aa2887b0acbe9b31d07fb8d740c0bceefd6b8751aebdddd533f752f7564efc | ### 教程 + ### API + ### 文档 + ## 0.5.0-beta + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | f20adcdb696316361e13fcd624d7188598b7248f77c7efc535cf193afc26f1c2 | @@ -221,20 +236,25 @@ | | GPU CUDA 10.1/CPU | Ubuntu-x86 | | 09aa2887b0acbe9b31d07fb8d740c0bceefd6b8751aebdddd533f752f7564efc | ### 教程 + ### API + ### 文档 + ## 0.3.0-alpha + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | 7756a50ca3af82d06eaf456db4d062fa647a8352724ef85da6569426a6393918 | @@ -255,20 +275,25 @@ | | GPU CUDA 9.2/GPU CUDA 10.1/CPU | Ubuntu-x86 | | 7a2bd6174be9e5a47e8ae6bcdd592ecdafc6e53e6f1cd5f0261fcb8337b5b337 | ### 教程 + ### API + ### 文档 + ## 0.2.0-alpha + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | aa1225665d05263b17bb7ec1d51dd4f933254c818bee126b6c5dac4513532a14 | @@ -288,20 +313,25 @@ | | GPU CUDA 9.2/GPU 
CUDA 10.1/CPU | Ubuntu-x86 | | 4146790bc73a5846e92b943dfd3febb6c62052b217eeb45b6c48aa82b51e7cc3 | ### 教程 + ### API + ### 文档 + ## 0.1.0-alpha + ### 版本说明 ### 下载地址 + | 组件 | 硬件平台 | 操作系统 | 链接 | SHA-256 | | --- | --- | --- | --- | --- | | MindSpore | Ascend910 | Ubuntu-x86 | | a76df4e96c4cb69b10580fcde2d4ef46b5d426be6d47a3d8fd379c97c3e66638 | @@ -319,12 +349,15 @@ | | GPU CUDA 9.2/GPU CUDA 10.1/CPU | Ubuntu-x86 | | 7796b6c114ee4962ce605da59a9bc47390c8910acbac318ecc0598829aad6e8c | ### 教程 + ### API + ### 文档 + ## master(unstable) diff --git a/tools/link_detection/README_CN.md b/tools/link_detection/README_CN.md index 442726a409139e21f99d432923946255903b9213..c2be9e6e409f7926daaf6e5034c5525da6b120c1 100644 --- a/tools/link_detection/README_CN.md +++ b/tools/link_detection/README_CN.md @@ -3,29 +3,32 @@ ## 简介 此工具可以检查用户指定目录里所有文件的链接,将所有链接分为三类,并且将检查结果分别写入三个文件,如下所示: + 1. 响应的状态码不是200的链接,写入`400.txt`文件中。 2. 脚本执行过程中请求出现异常的链接,写入`exception.txt`文件中。 3. 对于安装包的链接,因为请求非常耗时,所以不发请求,直接写入`slow.txt`文件中。 - ## 使用说明 该工具所依赖的操作系统为Windows操作系统,执行环境为Python环境,具体使用步骤如下所示: 1. 打开Git Bash,下载MindSpore Docs仓代码。 - ``` + + ```shell git clone https://gitee.com/mindspore/docs.git ``` + 2. 进入`tools/link_detection`目录,安装执行所需的第三方库。 - ``` + + ```shell cd tools/link_detection pip install requests ``` + 3. 在`link_detection`目录下执行如下命令,在输入需要检测目录的绝对路径后,开始进行检测,完成后会在当前目录下新建`404.txt`、`exception.txt`、`slow.txt`三个文件。 - ``` + + ```shell python link_detection.py ``` - > 检测目录的绝对路径全使用英文,并且使用Linux的绝对路径方式,例如:`/d/master/docs`。 - - + > 检测目录的绝对路径全使用英文,并且使用Linux的绝对路径方式,例如:`/d/master/docs`。 diff --git a/tools/pic_detection/README_CN.md b/tools/pic_detection/README_CN.md index c217d51929f0d07908d62695626ec68a2b465d77..a3cf658bc44bc75dede5f6d86a1f649209912092 100644 --- a/tools/pic_detection/README_CN.md +++ b/tools/pic_detection/README_CN.md @@ -4,24 +4,26 @@ 此工具可以检查用户指定目录里所有图片的使用情况,会检查出没有使用的图片,并且将没有使用的图片删除。 - ## 使用说明 该工具所依赖的操作系统为Windows操作系统,执行环境为Python环境,具体使用步骤如下所示: 1. 
打开Git Bash,下载MindSpore Docs仓代码。 - ``` + + ```shell git clone https://gitee.com/mindspore/docs.git ``` + 2. 进入`tools/pic_detection`目录。 - ``` + + ```shell cd tools/pic_detection ``` + 3. 在`pic_detection`目录下执行如下命令,在输入需要检测目录的绝对路径后,开始进行检测,最后将没有使用的图片删除。 - ``` + + ```shell python pic_detection.py ``` - > 检测目录的绝对路径全使用英文,并且使用Linux的绝对路径方式,例如:`/d/master/docs`。 - - + > 检测目录的绝对路径全使用英文,并且使用Linux的绝对路径方式,例如:`/d/master/docs`。 diff --git a/tutorials/inference/source_en/multi_platform_inference.md b/tutorials/inference/source_en/multi_platform_inference.md index ab1a6bf9cbd0619fbe3d346603a2aa2780af56b4..da8121d4e353a510b158762e8cbfafea9f039874 100644 --- a/tutorials/inference/source_en/multi_platform_inference.md +++ b/tutorials/inference/source_en/multi_platform_inference.md @@ -13,6 +13,7 @@ Models trained by MindSpore support the inference on different hardware platforms. This document describes the inference process on each platform. The inference can be performed in either of the following methods based on different principles: + - Use a checkpoint file for inference. That is, use the inference API to load data and the checkpoint file for inference in the MindSpore training environment. - Convert the checkpiont file into a common model format, such as ONNX or AIR, for inference. The inference environment does not depend on MindSpore. In this way, inference can be performed across hardware platforms as long as the platform supports ONNX or AIR inference. For example, models trained on the Ascend 910 AI processor can be inferred on the GPU or CPU. @@ -27,12 +28,8 @@ MindSpore supports the following inference scenarios based on the hardware platf | CPU | Checkpoint | The training environment dependency is the same as that of MindSpore. | | CPU | ONNX | Supports ONNX Runtime or SDK, for example, TensorRT. | -> Open Neural Network Exchange (ONNX) is an open file format designed for machine learning. It is used to store trained models. 
It enables different AI frameworks (such as PyTorch and MXNet) to store model data in the same format and interact with each other. For details, visit the ONNX official website . - -> Ascend Intermediate Representation (AIR) is an open file format defined by Huawei for machine learning and can better adapt to the Ascend AI processor. It is similar to ONNX. - -> Ascend Computer Language (ACL) provides C++ API libraries for users to develop deep neural network applications, including device management, context management, stream management, memory management, model loading and execution, operator loading and execution, and media data processing. It matches the Ascend AI processor and enables hardware running management and resource management. - -> Offline Model (OM) is supported by the Huawei Ascend AI processor. It implements preprocessing functions that can be completed without devices, such as operator scheduling optimization, weight data rearrangement and compression, and memory usage optimization. - -> NVIDIA TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime to improve the inference speed of the deep learning model on edge devices. For details, see . +> - Open Neural Network Exchange (ONNX) is an open file format designed for machine learning. It is used to store trained models. It enables different AI frameworks (such as PyTorch and MXNet) to store model data in the same format and interact with each other. For details, visit the ONNX official website . +> - Ascend Intermediate Representation (AIR) is an open file format defined by Huawei for machine learning and can better adapt to the Ascend AI processor. It is similar to ONNX. 
+> - Ascend Computer Language (ACL) provides C++ API libraries for users to develop deep neural network applications, including device management, context management, stream management, memory management, model loading and execution, operator loading and execution, and media data processing. It matches the Ascend AI processor and enables hardware running management and resource management. +> - Offline Model (OM) is supported by the Huawei Ascend AI processor. It implements preprocessing functions that can be completed without devices, such as operator scheduling optimization, weight data rearrangement and compression, and memory usage optimization. +> - NVIDIA TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime to improve the inference speed of the deep learning model on edge devices. For details, see . diff --git a/tutorials/inference/source_en/multi_platform_inference_ascend_310.md b/tutorials/inference/source_en/multi_platform_inference_ascend_310.md index 359b4d91d9c8cc3956c1ee832eeb4d077e5a7d11..ebbb52fa1a2f4e4bcca4c14ad434e5709df120bb 100644 --- a/tutorials/inference/source_en/multi_platform_inference_ascend_310.md +++ b/tutorials/inference/source_en/multi_platform_inference_ascend_310.md @@ -11,7 +11,6 @@ - ## Inference Using an ONNX or AIR File The Ascend 310 AI processor is equipped with the ACL framework and supports the OM format which needs to be converted from the model in ONNX or AIR format. 
For inference on the Ascend 310 AI processor, perform the following steps: diff --git a/tutorials/inference/source_en/multi_platform_inference_ascend_910.md b/tutorials/inference/source_en/multi_platform_inference_ascend_910.md index 05217b62497ea42bd2324062c36d593d6cc03ad4..bd536f4e30e6c078138e76bc250a49a6a3acdbcc 100644 --- a/tutorials/inference/source_en/multi_platform_inference_ascend_910.md +++ b/tutorials/inference/source_en/multi_platform_inference_ascend_910.md @@ -13,7 +13,7 @@ ## Inference Using a Checkpoint File -1. Use the `model.eval` interface for model validation. +1. Use the `model.eval` interface for model validation. 1.1 Local Storage @@ -34,12 +34,13 @@ acc = model.eval(dataset, dataset_sink_mode=args.dataset_sink_mode) print("============== {} ==============".format(acc)) ``` + In the preceding information: `model.eval` is an API for model validation. For details about the API, see . > Inference sample code: . 1.2 Remote Storage - + When the pre-trained models are saved remotely, the steps of performing inference on validation dataset are as follows: firstly determine which model to be used, then loading model and parameters using `mindspore_hub.load`, and finally performing inference on validation dataset once created. The processing method of the validation dataset is the same as that of the training dataset. ```python @@ -55,14 +56,17 @@ 1) acc = model.eval(dataset, dataset_sink_mode=args.dataset_sink_mode) print("============== {} ==============".format(acc)) - ``` + ``` + In the preceding information: - + `mindpsore_hub.load` is an API for loading model parameters. PLease check the details in . 2. Use the `model.predict` API to perform inference. + ```python model.predict(input_data) ``` + In the preceding information: `model.predict` is an API for inference. For details about the API, see . 
diff --git a/tutorials/inference/source_en/multi_platform_inference_cpu.md b/tutorials/inference/source_en/multi_platform_inference_cpu.md index 82424de89d9da00b7dac09c8c8e4495825862595..8d00afd56a67f27869dd0f68bec43c43437d8c2e 100644 --- a/tutorials/inference/source_en/multi_platform_inference_cpu.md +++ b/tutorials/inference/source_en/multi_platform_inference_cpu.md @@ -12,11 +12,12 @@ - ## Inference Using a Checkpoint File + The inference is the same as that on the Ascend 910 AI processor. ## Inference Using an ONNX File + Similar to the inference on a GPU, the following steps are required: 1. Generate a model in ONNX format on the training platform. For details, see [Export ONNX Model](https://www.mindspore.cn/tutorial/training/en/master/use/save_model.html#export-onnx-model). diff --git a/tutorials/inference/source_en/multi_platform_inference_gpu.md b/tutorials/inference/source_en/multi_platform_inference_gpu.md index d42a2ffe2548e9b4b31d4c9cbf7c38cec4439541..0c3de8af6ba83965679f63f5719233bf1b982100 100644 --- a/tutorials/inference/source_en/multi_platform_inference_gpu.md +++ b/tutorials/inference/source_en/multi_platform_inference_gpu.md @@ -12,7 +12,6 @@ - ## Inference Using a Checkpoint File The inference is the same as that on the Ascend 910 AI processor. diff --git a/tutorials/inference/source_en/serving.md b/tutorials/inference/source_en/serving.md index 589857362d7f721e002264624fcd44762477b54b..18266ebe7a82183bd9df9cc10e517abf712d2538 100644 --- a/tutorials/inference/source_en/serving.md +++ b/tutorials/inference/source_en/serving.md @@ -23,12 +23,15 @@ MindSpore Serving is a lightweight and high-performance service module that helps MindSpore developers efficiently deploy online inference services in the production environment. After completing model training using MindSpore, you can export the MindSpore model and use MindSpore Serving to create an inference service for the model. Currently, only Ascend 910 is supported. 
## Starting Serving + After MindSpore is installed using `pip`, the Serving executable program is stored in `/{your python path}/lib/python3.7/site-packages/mindspore/ms_serving`. Run the following command to start Serving: -```bash -ms_serving [--help] [--model_path=] [--model_name=] [--port=] + +```bash +ms_serving [--help] [--model_path=] [--model_name=] [--port=] [--rest_api_port=] [--device_id=] ``` + Parameters are described as follows: |Parameter|Attribute|Function|Parameter Type|Default Value|Value Range| @@ -41,69 +44,84 @@ Parameters are described as follows: |`--device_id=`|Optional|Specifies device ID to be used.|Integer|0|0 to 7| > Before running the startup command, add the path `/{your python path}/lib:/{your python path}/lib/python3.7/site-packages/mindspore/lib` to the environment variable `LD_LIBRARY_PATH`. - > port and rest_ api_port cannot be the same. + > port and rest_api_port cannot be the same. ## Application Example + The following uses a simple network as an example to describe how to use MindSpore Serving. ### Exporting Model + > Before exporting the model, you need to configure MindSpore [base environment](https://www.mindspore.cn/install/en). Use [add_model.py](https://gitee.com/mindspore/mindspore/blob/master/serving/example/export_model/add_model.py) to build a network with only the Add operator and export the MindSpore inference deployment model. -```python +```shell python add_model.py ``` + Execute the script to generate the `tensor_add.mindir` file. The input of the model is two one-dimensional tensors with shape [2,2], and the output is the sum of the two input tensors. ### Starting Serving Inference + ```bash ms_serving --model_path={model directory} --model_name=tensor_add.mindir ``` + If the server prints the `MS Serving Listening on 0.0.0.0:5500` log, the Serving has loaded the inference model. 
### Client Samples + #### Python Client Sample + > Before running the client sample, add the path `/{your python path}/lib/python3.7/site-packages/mindspore/` to the environment variable `PYTHONPATH`. Obtain [ms_client.py](https://gitee.com/mindspore/mindspore/blob/master/serving/example/python_client/ms_client.py) and start the Python client. + ```bash python ms_client.py ``` If the following information is displayed, the Serving has correctly executed the inference of the Add network. -``` + +```text ms client received: [[2. 2.] [2. 2.]] ``` #### C++ Client Sample + 1. Obtain an executable client sample program. Download the [MindSpore source code](https://gitee.com/mindspore/mindspore). You can use either of the following methods to compile and obtain the client sample program: - + When MindSpore is compiled using the source code, the Serving C++ client sample program is generated. You can find the `ms_client` executable program in the `build/mindspore/serving/example/cpp_client` directory. - + Independent compilation + - When MindSpore is compiled using the source code, the Serving C++ client sample program is generated. You can find the `ms_client` executable program in the `build/mindspore/serving/example/cpp_client` directory. + - Independent compilation Preinstall [gRPC](https://gRPC.io). Run the following command in the MindSpore source code path to compile a client sample program: + ```bash cd mindspore/serving/example/cpp_client mkdir build && cd build cmake -D GRPC_PATH={grpc_install_dir} .. make ``` + In the preceding command, `{grpc_install_dir}` indicates the gRPC installation path. Replace it with the actual gRPC installation path. 2. Start the client. Execute `ms_client` to send an inference request to the Serving. + ```bash ./ms_client --target=localhost:5500 ``` + If the following information is displayed, the Serving has correctly executed the inference of the Add network. 
- ``` + + ```text Compute [[1, 2], [3, 4]] + [[1, 2], [3, 4]] Add result is 2 4 6 8 client received: RPC OK @@ -112,69 +130,82 @@ ms client received: The client code consists of the following parts: 1. Implement the client based on MSService::Stub and create a client instance. - ``` + + ```cpp class MSClient { public: explicit MSClient(std::shared_ptr channel) : stub_(MSService::NewStub(channel)) {} private: std::unique_ptr stub_; }; - + MSClient client(grpc::CreateChannel(target_str, grpc::InsecureChannelCredentials())); - + ``` + 2. Build the request input parameter `Request`, output parameter `Reply`, and gRPC client `Context` based on the actual network input. - ``` + + ```cpp PredictRequest request; PredictReply reply; ClientContext context; - + //construct tensor Tensor data; - + //set shape TensorShape shape; shape.add_dims(4); *data.mutable_tensor_shape() = shape; - + //set type data.set_tensor_type(ms_serving::MS_FLOAT32); std::vector input_data{1, 2, 3, 4}; - + //set datas data.set_data(input_data.data(), input_data.size()); - + //add tensor to request *request.add_data() = data; *request.add_data() = data; ``` + 3. Call the gRPC API to communicate with the Serving that has been started, and obtain the return value. - ``` + + ```cpp Status status = stub_->Predict(&context, request, &reply); ``` -For details about the complete code, see [ms_client](https://gitee.com/mindspore/mindspore/blob/master/serving/example/cpp_client/ms_client.cc). +For details about the complete code, see [ms_client](https://gitee.com/mindspore/mindspore/blob/master/serving/example/cpp_client/ms_client.cc). ### REST API Client Sample + 1. Send data in the form of `data`: `data` field: flatten each input data of network model into one-dimensional data. Suppose the network model has n inputs, and the final data structure is a two-dimensional list of 1 * n. 
As in this example, flatten the model input data `[[1.0, 2.0], [3.0, 4.0]]` and `[[1.0, 2.0], [3.0, 4.0]]` to form `[[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]`. - ``` + + ```shell curl -X POST -d '{"data": [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]}' http://127.0.0.1:5501 ``` + The following return values are displayed, indicating that the serving service has correctly executed the reasoning of the add network, and the output data structure is similar to that of the input: - ``` + + ```text {"data":[[2.0,4.0,6.0,8.0]]} ``` 2. Send data in the form of `tensor`: `tensor` field: composed of each input of the network model, keeping the original shape of input. As in this example, the model input data `[[1.0, 2.0], [3.0, 4.0]]` and `[[1.0, 2.0], [3.0, 4.0]]` are combined into `[[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]`. - ``` + + ```shell curl -X POST -d '{"tensor": [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]}' http://127.0.0.1:5501 ``` + The following return values are displayed, indicating that the serving service has correctly executed the reasoning of the add network, and the output data structure is similar to that of the input: - ``` + + ```text {"tensor":[[2.0,4.0], [6.0,8.0]]} ``` - > REST APICurrently only int32 and fp32 are supported as inputs. \ No newline at end of file + + > REST APICurrently only int32 and fp32 are supported as inputs. 
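The two request bodies described in the REST API client sample above (`data` and `tensor`) can be sketched in Python as follows. This is a minimal illustration using only the standard library, not code from the MindSpore repository: the helper names (`flatten`, `build_data_payload`, `build_tensor_payload`, `build_request`) are hypothetical, and only the JSON layout and the address `http://127.0.0.1:5501` are taken from the `curl` commands in the tutorial. The request is prepared but not sent, so the sketch runs without a Serving instance.

```python
import json
from urllib import request

# Address used by the curl commands in the tutorial.
SERVING_URL = "http://127.0.0.1:5501"


def flatten(nested):
    """Recursively flatten a nested list into a one-dimensional list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def build_data_payload(inputs):
    """`data` form: every model input is flattened to one dimension."""
    return {"data": [flatten(x) for x in inputs]}


def build_tensor_payload(inputs):
    """`tensor` form: every model input keeps its original shape."""
    return {"tensor": list(inputs)}


def build_request(payload, url=SERVING_URL):
    """Prepare (but do not send) the POST request that curl would issue."""
    body = json.dumps(payload).encode("utf-8")
    return request.Request(url, data=body, method="POST")


# The two [2, 2] inputs of the Add network from the example.
inputs = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]
data_payload = build_data_payload(inputs)
tensor_payload = build_tensor_payload(inputs)

print(json.dumps(data_payload))
print(json.dumps(tensor_payload))
```

To actually send a request, pass the result of `build_request` to `urllib.request.urlopen` while Serving is running; the response format is whatever the server returns and is not modeled here.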
diff --git a/tutorials/inference/source_zh_cn/multi_platform_inference.md b/tutorials/inference/source_zh_cn/multi_platform_inference.md index 22402d31027482bbcfdf8405fc93f5d558791fb2..e5ee1902d3f18961004c2e8abe276cbaa195511f 100644 --- a/tutorials/inference/source_zh_cn/multi_platform_inference.md +++ b/tutorials/inference/source_zh_cn/multi_platform_inference.md @@ -13,6 +13,7 @@ 基于MindSpore训练后的模型,支持在不同的硬件平台上执行推理。本文介绍各平台上的推理流程。 按照原理不同,推理可以有两种方式: + - 直接使用checkpiont文件进行推理,即在MindSpore训练环境下,使用推理接口加载数据及checkpoint文件进行推理。 - 将checkpiont文件转化为通用的模型格式,如ONNX、AIR格式模型文件进行推理,推理环境不需要依赖MindSpore。这样的好处是可以跨硬件平台,只要支持ONNX/AIR推理的硬件平台即可进行推理。譬如在Ascend 910 AI处理器上训练的模型,可以在GPU/CPU上进行推理。 @@ -27,12 +28,8 @@ GPU | ONNX格式 | 支持ONNX推理的runtime/SDK,如TensorRT。 CPU | checkpoint格式 | 与MindSpore训练环境依赖一致。 CPU | ONNX格式 | 支持ONNX推理的runtime/SDK,如TensorRT。 -> ONNX,全称Open Neural Network Exchange,是一种针对机器学习所设计的开放式的文件格式,用于存储训练好的模型。它使得不同的人工智能框架(如PyTorch, MXNet)可以采用相同格式存储模型数据并交互。详细了解,请参见ONNX官网。 - -> AIR,全称Ascend Intermediate Representation,类似ONNX,是华为定义的针对机器学习所设计的开放式的文件格式,能更好地适配Ascend AI处理器。 - -> ACL,全称Ascend Computer Language,提供Device管理、Context管理、Stream管理、内存管理、模型加载与执行、算子加载与执行、媒体数据处理等C++ API库,供用户开发深度神经网络应用。它匹配Ascend AI处理器,使能硬件的运行管理、资源管理能力。 - -> OM,全称Offline Model,华为Ascend AI处理器支持的离线模型,实现算子调度的优化,权值数据重排、压缩,内存使用优化等可以脱离设备完成的预处理功能。 - -> TensorRT,NVIDIA 推出的高性能深度学习推理的SDK,包括深度推理优化器和runtime,提高深度学习模型在边缘设备上的推断速度。详细请参见。 \ No newline at end of file +> - ONNX,全称Open Neural Network Exchange,是一种针对机器学习所设计的开放式的文件格式,用于存储训练好的模型。它使得不同的人工智能框架(如PyTorch, MXNet)可以采用相同格式存储模型数据并交互。详细了解,请参见ONNX官网。 +> - AIR,全称Ascend Intermediate Representation,类似ONNX,是华为定义的针对机器学习所设计的开放式的文件格式,能更好地适配Ascend AI处理器。 +> - ACL,全称Ascend Computer Language,提供Device管理、Context管理、Stream管理、内存管理、模型加载与执行、算子加载与执行、媒体数据处理等C++ API库,供用户开发深度神经网络应用。它匹配Ascend AI处理器,使能硬件的运行管理、资源管理能力。 +> - OM,全称Offline Model,华为Ascend AI处理器支持的离线模型,实现算子调度的优化,权值数据重排、压缩,内存使用优化等可以脱离设备完成的预处理功能。 +> - TensorRT,NVIDIA 推出的高性能深度学习推理的SDK,包括深度推理优化器和runtime,提高深度学习模型在边缘设备上的推断速度。详细请参见。 diff --git 
a/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_310.md b/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_310.md index f99df969c5dad30127e8b29d51cf3a2d25587df1..29cc91251f77d70d339f825d665915277fbe44b1 100644 --- a/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_310.md +++ b/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_310.md @@ -19,4 +19,4 @@ Ascend 310 AI处理器上搭载了ACL框架,他支持OM格式,而OM格式需 2. 将ONNX/AIR格式模型文件,转化为OM格式模型,并进行推理。 - 云上(ModelArt环境),请参考[Ascend910训练和Ascend310推理的样例](https://support.huaweicloud.com/bestpractice-modelarts/modelarts_10_0026.html)完成推理操作。 - - 本地的裸机环境(对比云上环境,即本地有Ascend 310 AI 处理器),请参考Ascend 310 AI处理器配套软件包的说明文档。 \ No newline at end of file + - 本地的裸机环境(对比云上环境,即本地有Ascend 310 AI 处理器),请参考Ascend 310 AI处理器配套软件包的说明文档。 diff --git a/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_910.md b/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_910.md index 2f54f93e870967486955c914b23556447596031b..c726609e2e2e24d65d42f584f9e6ab0503512ce5 100644 --- a/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_910.md +++ b/tutorials/inference/source_zh_cn/multi_platform_inference_ascend_910.md @@ -13,11 +13,12 @@ ## 使用checkpoint格式文件推理 -1. 使用`model.eval`接口来进行模型验证。 +1. 使用`model.eval`接口来进行模型验证。 1.1 模型已保存在本地 首先构建模型,然后使用`mindspore.train.serialization`模块的`load_checkpoint`和`load_param_into_net`从本地加载模型与参数,传入验证数据集后即可进行模型推理,验证数据集的处理方式与训练数据集相同。 + ```python network = LeNet5(cfg.num_classes) net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean") @@ -33,13 +34,15 @@ acc = model.eval(dataset, dataset_sink_mode=args.dataset_sink_mode) print("============== {} ==============".format(acc)) ``` + 其中, `model.eval`为模型验证接口,对应接口说明:。 > 推理样例代码:。 1.2 使用MindSpore Hub从华为云加载模型 - + 首先构建模型,然后使用`mindspore_hub.load`从云端加载模型参数,传入验证数据集后即可进行推理,验证数据集的处理方式与训练数据集相同。 + ```python model_uid = "mindspore/ascend/0.7/googlenet_v1_cifar10" # using GoogleNet as an example. 
network = mindspore_hub.load(model_uid, num_classes=10) @@ -53,13 +56,16 @@ 1) acc = model.eval(dataset, dataset_sink_mode=args.dataset_sink_mode) print("============== {} ==============".format(acc)) - ``` + ``` + 其中, `mindspore_hub.load`为加载模型参数接口,对应接口说明:。 2. 使用`model.predict`接口来进行推理操作。 + ```python model.predict(input_data) ``` + 其中, - `model.predict`为推理接口,对应接口说明:。 \ No newline at end of file + `model.predict`为推理接口,对应接口说明:。 diff --git a/tutorials/inference/source_zh_cn/multi_platform_inference_cpu.md b/tutorials/inference/source_zh_cn/multi_platform_inference_cpu.md index 676ec679bddc18f98d0cc06537ba6e89cd1fc80a..82d7141468788164b7c18d166d19f40206d33be6 100644 --- a/tutorials/inference/source_zh_cn/multi_platform_inference_cpu.md +++ b/tutorials/inference/source_zh_cn/multi_platform_inference_cpu.md @@ -13,9 +13,11 @@ ## 使用checkpoint格式文件推理 + 与在Ascend 910 AI处理器上推理一样。 ## 使用ONNX格式文件推理 + 与在GPU上进行推理类似,需要以下几个步骤: 1. 在训练平台上生成ONNX格式模型,具体步骤请参考[导出ONNX格式文件](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/save_model.html#onnx)。 diff --git a/tutorials/inference/source_zh_cn/multi_platform_inference_gpu.md b/tutorials/inference/source_zh_cn/multi_platform_inference_gpu.md index abd1173cf5fe61301ffd0c6f221a9a87d3b7ed6d..ea96a12c1ce5e620f6c2700aa5c26088b9e8f534 100644 --- a/tutorials/inference/source_zh_cn/multi_platform_inference_gpu.md +++ b/tutorials/inference/source_zh_cn/multi_platform_inference_gpu.md @@ -20,4 +20,4 @@ 1. 在训练平台上生成ONNX格式模型,具体步骤请参考[导出ONNX格式文件](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/save_model.html#onnx)。 -2. 在GPU上进行推理,具体可以参考推理使用runtime/SDK的文档。如在Nvidia GPU上进行推理,使用常用的TensorRT,可参考[TensorRT backend for ONNX](https://github.com/onnx/onnx-tensorrt)。 \ No newline at end of file +2. 
在GPU上进行推理,具体可以参考推理使用runtime/SDK的文档。如在Nvidia GPU上进行推理,使用常用的TensorRT,可参考[TensorRT backend for ONNX](https://github.com/onnx/onnx-tensorrt)。 diff --git a/tutorials/inference/source_zh_cn/serving.md b/tutorials/inference/source_zh_cn/serving.md index 4f7f3ba689ef50e3bc0fb5315dd20b4759f5307d..9288da7d9c171dd18c2cd0b170fc57ae009cb7e2 100644 --- a/tutorials/inference/source_zh_cn/serving.md +++ b/tutorials/inference/source_zh_cn/serving.md @@ -18,18 +18,20 @@ - ## 概述 MindSpore Serving是一个轻量级、高性能的服务模块,旨在帮助MindSpore开发者在生产环境中高效部署在线推理服务。当用户使用MindSpore完成模型训练后,导出MindSpore模型,即可使用MindSpore Serving创建该模型的推理服务。当前Serving仅支持Ascend 910。 ## 启动Serving服务 + 通过pip安装MindSpore后,Serving可执行程序位于`/{your python path}/lib/python3.7/site-packages/mindspore/ms_serving`。 启动Serving服务命令如下 -```bash -ms_serving [--help] [--model_path=] [--model_name=] [--port=] + +```bash +ms_serving [--help] [--model_path=] [--model_name=] [--port=] [--rest_api_port=] [--device_id=] ``` + 参数含义如下 |参数名|属性|功能描述|参数类型|默认值|取值范围| @@ -45,21 +47,24 @@ ms_serving [--help] [--model_path=] [--model_name=] [--p > port与rest_api_port不可相同。 ## 应用示例 + 下面以一个简单的网络为例,演示MindSpore Serving如何使用。 ### 导出模型 + > 导出模型之前,需要配置MindSpore[基础环境](https://www.mindspore.cn/install)。 使用[add_model.py](https://gitee.com/mindspore/mindspore/blob/master/serving/example/export_model/add_model.py),构造一个只有Add算子的网络,并导出MindSpore推理部署模型。 -```python +```python python add_model.py ``` 执行脚本,生成`tensor_add.mindir`文件,该模型的输入为两个shape为[2,2]的二维Tensor,输出结果是两个输入Tensor之和。 ### 启动Serving推理服务 -```bash + +```bash ms_serving --model_path={model directory} --model_name=tensor_add.mindir ``` @@ -67,15 +72,19 @@ ms_serving --model_path={model directory} --model_name=tensor_add.mindir 当服务端打印日志`MS Serving RESTful start, listening on 0.0.0.0:5501`时,表示Serving REST服务已加载推理模型完毕。 ### gRPC客户端示例 + #### Python客户端示例 + > 执行客户端前,需将`/{your python path}/lib/python3.7/site-packages/mindspore`对应的路径添加到环境变量PYTHONPATH中。 
获取[ms_client.py](https://gitee.com/mindspore/mindspore/blob/master/serving/example/python_client/ms_client.py),启动Python客户端。 + ```bash python ms_client.py ``` 显示如下返回值说明Serving服务已正确执行Add网络的推理。 + ```bash ms client received: [[2. 2.] @@ -83,31 +92,37 @@ ms client received: ``` #### C++客户端示例 + 1. 获取客户端示例执行程序 首先需要下载[MindSpore源码](https://gitee.com/mindspore/mindspore)。有两种方式编译并获取客户端示例程序: - + 从源码编译MindSpore时候,将会编译产生Serving C++客户端示例程序,可在`build/mindspore/serving/example/cpp_client`目录下找到`ms_client`可执行程序。 - + 独立编译: + - 从源码编译MindSpore时候,将会编译产生Serving C++客户端示例程序,可在`build/mindspore/serving/example/cpp_client`目录下找到`ms_client`可执行程序。 + - 独立编译: 需要先预装[gRPC](https://gRPC.io)。 然后,在MindSpore源码路径中执行如下命令,编译一个客户端示例程序。 + ```bash cd mindspore/serving/example/cpp_client mkdir build && cd build cmake -D GRPC_PATH={grpc_install_dir} .. make ``` + 其中`{grpc_install_dir}`为gRPC安装时的路径,请替换为实际gRPC安装路径。 2. 启动gRPC客户端 执行ms_client,向Serving服务发送推理请求: + ```bash ./ms_client --target=localhost:5500 ``` + 显示如下返回值说明Serving服务已正确执行Add网络的推理。 - ``` + + ```text Compute [[1, 2], [3, 4]] + [[1, 2], [3, 4]] Add result is 2 4 6 8 client received: RPC OK @@ -116,75 +131,84 @@ ms client received: 客户端代码主要包含以下几个部分: 1. 基于MSService::Stub实现Client,并创建Client实例。 - ``` + + ```cpp class MSClient { public: explicit MSClient(std::shared_ptr channel) : stub_(MSService::NewStub(channel)) {} private: std::unique_ptr stub_; }; - + MSClient client(grpc::CreateChannel(target_str, grpc::InsecureChannelCredentials())); - + ``` + 2. 
根据网络的实际输入构造请求的入参Request、出参Reply和gRPC的客户端Context。 - ``` + + ```cpp PredictRequest request; PredictReply reply; ClientContext context; - + //construct tensor Tensor data; - + //set shape TensorShape shape; shape.add_dims(2); shape.add_dims(2); *data.mutable_tensor_shape() = shape; - + //set type data.set_tensor_type(ms_serving::MS_FLOAT32); std::vector input_data{1, 2, 3, 4}; - + //set datas data.set_data(input_data.data(), input_data.size()); - + //add tensor to request *request.add_data() = data; *request.add_data() = data; ``` + 3. 调用gRPC接口和已经启动的Serving服务通信,并取回返回值。 ```Status status = stub_->Predict(&context, request, &reply);``` -完整代码参考[ms_client](https://gitee.com/mindspore/mindspore/blob/master/serving/example/cpp_client/ms_client.cc)。 +完整代码参考[ms_client](https://gitee.com/mindspore/mindspore/blob/master/serving/example/cpp_client/ms_client.cc)。 ### REST API客户端示例 + 1. `data`形式发送数据: data字段:将网络模型每个输入数据展平成一维数据,假设网络模型有n个输入,最后data数据结构为1*n的二维list。 - + 如本例中,将模型输入数据`[[1.0, 2.0], [3.0, 4.0]]`和`[[1.0, 2.0], [3.0, 4.0]]`展平后组合成data形式的数据`[[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]` - - ``` + + ```bash curl -X POST -d '{"data": [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]}' http://127.0.0.1:5501 ``` - + 显示如下返回值,说明Serving服务已正确执行Add网络的推理,输出数据结构同输入类似: - ``` + + ```text {"data":[[2.0,4.0,6.0,8.0]]} ``` 2. 
`tensor`形式发送数据: tensor字段:由网络模型每个输入组合而成,保持输入的原始shape。 - + 如本例中,将模型输入数据`[[1.0, 2.0], [3.0, 4.0]]`和`[[1.0, 2.0], [3.0, 4.0]]`组合成tensor形式的数据`[[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]` - ``` + + ```bash curl -X POST -d '{"tensor": [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]}' http://127.0.0.1:5501 ``` + 显示如下返回值,说明Serving服务已正确执行Add网络的推理,输出数据结构同输入类似: - ``` + + ```text {"tensor":[[2.0,4.0], [6.0,8.0]]} ``` - > REST API当前只支持int32和fp32数据输入。 + > REST API当前只支持int32和fp32数据输入。 diff --git a/tutorials/lite/source_en/quick_start/quick_start.md b/tutorials/lite/source_en/quick_start/quick_start.md index 5af499aa7a8f4cd520dd3244091afec3136d3c6c..1d3dd029c04581606709c8e9a4ffb987806c454f 100644 --- a/tutorials/lite/source_en/quick_start/quick_start.md +++ b/tutorials/lite/source_en/quick_start/quick_start.md @@ -22,26 +22,27 @@ ## Overview It is recommended that you start from the image classification demo on the Android device to understand how to build the MindSpore Lite application project, configure dependencies, and use related APIs. - + This tutorial demonstrates the on-device deployment process based on the image classification sample program on the Android device provided by the MindSpore team. 1. Select an image classification model. 2. Convert the model into a MindSpore Lite model. 3. Use the MindSpore Lite inference model on the device. The following describes how to use the MindSpore Lite C++ APIs (Android JNIs) and MindSpore Lite image classification models to perform on-device inference, classify the content captured by a device camera, and display the most possible classification result on the application's image preview screen. - + > Click to find [Android image classification models](https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite) and [sample code](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/lite/image_classification). 
## Selecting a Model The MindSpore team provides a series of preset device models that you can use in your application. Click [here](https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite/mobilenetv2.ms) to download image classification models in MindSpore ModelZoo. -In addition, you can use the preset model to perform migration learning to implement your image classification tasks. +In addition, you can use the preset model to perform migration learning to implement your image classification tasks. ## Converting a Model After you retrain a model provided by MindSpore, export the model in the [.mindir format](https://www.mindspore.cn/tutorial/training/en/master/use/save_model.html#export-mindir-model). Use the MindSpore Lite [model conversion tool](https://www.mindspore.cn/tutorial/lite/en/master/use/converter_tool.html) to convert the .mindir model to a .ms model. Take the mobilenetv2 model as an example. Execute the following script to convert a model into a MindSpore Lite model for on-device inference. + ```bash ./converter_lite --fmk=MINDIR --modelFile=mobilenetv2.mindir --outputFile=mobilenetv2.ms ``` @@ -60,7 +61,7 @@ The following section describes how to build and execute an on-device image clas ### Building and Running -1. Load the sample source code to Android Studio and install the corresponding SDK. (After the SDK version is specified, Android Studio automatically installs the SDK.) +1. Load the sample source code to Android Studio and install the corresponding SDK. (After the SDK version is specified, Android Studio automatically installs the SDK.) ![start_home](../images/lite_quick_start_home.png) @@ -86,7 +87,6 @@ The following section describes how to build and execute an on-device image clas ![result](../images/lite_quick_start_app_result.png) - ## Detailed Description of the Sample Program This image classification sample program on the Android device includes a Java layer and a JNI layer. 
At the Java layer, the Android Camera 2 API is used to enable a camera to obtain image frames and process images. At the JNI layer, the model inference process is completed in [Runtime](https://www.mindspore.cn/tutorial/lite/en/master/use/runtime.html). @@ -95,7 +95,7 @@ This image classification sample program on the Android device includes a Java l ### Sample Program Structure -``` +```text app │ ├── src/main @@ -109,12 +109,12 @@ app │ | └── MindSporeNetnative.h # header file │ | │ ├── java # application code at the Java layer -│ │ └── com.mindspore.himindsporedemo +│ │ └── com.mindspore.himindsporedemo │ │ ├── gallery.classify # implementation related to image processing and MindSpore JNI calling │ │ │ └── ... │ │ └── widget # implementation related to camera enabling and drawing │ │ └── ... -│ │ +│ │ │ ├── res # resource files related to Android │ └── AndroidManifest.xml # Android configuration file │ @@ -135,7 +135,7 @@ Note: if the automatic download fails, please manually download the relevant lib mindspore-lite-1.0.0-minddata-arm64-cpu.tar.gz [Download link](https://ms-release.obs.cn-north-4.myhuaweicloud.com/1.0.0/lite/android_aarch64/mindspore-lite-1.0.0-minddata-arm64-cpu.tar.gz) -``` +```text android{ defaultConfig{ externalNativeBuild{ @@ -144,7 +144,7 @@ android{ } } - ndk{ + ndk{ abiFilters'armeabi-v7a', 'arm64-v8a' } } @@ -153,7 +153,7 @@ android{ Create a link to the `.so` library file in the `app/CMakeLists.txt` file: -``` +```text # ============== Set MindSpore Dependencies. ============= include_directories(${CMAKE_SOURCE_DIR}/src/main/cpp) include_directories(${CMAKE_SOURCE_DIR}/src/main/cpp/${MINDSPORELITE_VERSION}/third_party/flatbuffers/include) @@ -171,7 +171,7 @@ set_target_properties(minddata-lite PROPERTIES IMPORTED_LOCATION ${CMAKE_SOURCE_DIR}/src/main/cpp/${MINDSPORELITE_VERSION}/lib/libminddata-lite.so) # --------------- MindSpore Lite set End. -------------------- -# Link target library. +# Link target library. 
target_link_libraries( ... # --- mindspore --- @@ -193,34 +193,37 @@ mobilenetv2.ms [mobilenetv2.ms]( https://download.mindspore.cn/model_zoo/officia Call MindSpore Lite C++ APIs at the JNI layer to implement on-device inference. -The inference code process is as follows. For details about the complete code, see `src/cpp/MindSporeNetnative.cpp`. +The inference code process is as follows. For details about the complete code, see `src/cpp/MindSporeNetnative.cpp`. 1. Load the MindSpore Lite model file and build the context, session, and computational graph for inference. - Load a model file. Create and configure the context for model inference. + ```cpp // Buffer is the model data passed in by the Java layer jlong bufferLen = env->GetDirectBufferCapacity(buffer); char *modelBuffer = CreateLocalModelBuffer(env, buffer); ``` - + - Create a session. + ```cpp void **labelEnv = new void *; MSNetWork *labelNet = new MSNetWork; *labelEnv = labelNet; - + // Create context. mindspore::lite::Context *context = new mindspore::lite::Context; context->thread_num_ = num_thread; - + // Create the mindspore session. labelNet->CreateSessionMS(modelBuffer, bufferLen, "device label", context); delete(context); - + ``` - + - Load the model file and build a computational graph for inference. + ```cpp void MSNetWork::CreateSessionMS(char* modelBuffer, size_t bufferLen, std::string name, mindspore::lite::Context* ctx) { @@ -230,8 +233,8 @@ The inference code process is as follows. For details about the complete code, s int ret = session->CompileGraph(model); } ``` - -2. Convert the input image into the Tensor format of the MindSpore model. + +2. Convert the input image into the Tensor format of the MindSpore model. Convert the image data to be detected into the Tensor format of the MindSpore model. @@ -241,7 +244,7 @@ The inference code process is as follows. For details about the complete code, s // Processing such as zooming the picture size. 
matImgPreprocessed = PreProcessImageData(matImageSrc); - ImgDims inputDims; + ImgDims inputDims; inputDims.channel = matImgPreprocessed.channels(); inputDims.width = matImgPreprocessed.cols; inputDims.height = matImgPreprocessed.rows; @@ -261,7 +264,7 @@ The inference code process is as follows. For details about the complete code, s inputDims.channel * inputDims.width * inputDims.height * sizeof(float)); delete[] (dataHWC); ``` - + 3. Preprocessing the input data. ```cpp @@ -293,7 +296,7 @@ The inference code process is as follows. For details about the complete code, s } ``` -4. Perform inference on the input tensor based on the model, obtain the output tensor, and perform post-processing. +4. Perform inference on the input tensor based on the model, obtain the output tensor, and perform post-processing. - Perform graph execution and on-device inference. @@ -303,6 +306,7 @@ The inference code process is as follows. For details about the complete code, s ``` - Obtain the output data. + ```cpp auto names = mSession->GetOutputTensorNames(); std::unordered_map msOutputs; @@ -312,22 +316,23 @@ The inference code process is as follows. For details about the complete code, s } std::string retStr = ProcessRunnetResult(msOutputs, ret); ``` - + - Perform post-processing of the output data. + ```cpp std::string ProcessRunnetResult(std::unordered_map msOutputs, int runnetRet) { - + std::unordered_map::iterator iter; iter = msOutputs.begin(); - + // The mobilenetv2.ms model output just one branch. auto outputTensor = iter->second; int tensorNum = outputTensor->ElementsNum(); - + // Get a pointer to the first score. float *temp_scores = static_cast(outputTensor->MutableData()); - + float scores[RET_CATEGORY_SUM]; for (int i = 0; i < RET_CATEGORY_SUM; ++i) { if (temp_scores[i] > 0.5) { @@ -335,7 +340,7 @@ The inference code process is as follows. For details about the complete code, s } scores[i] = temp_scores[i]; } - + // Score for each category. 
// Converted to text information that needs to be displayed in the APP. std::string categoryScore = ""; @@ -347,5 +352,5 @@ The inference code process is as follows. For details about the complete code, s categoryScore += ";"; } return categoryScore; - } - ``` \ No newline at end of file + } + ``` diff --git a/tutorials/lite/source_en/use/benchmark_tool.md b/tutorials/lite/source_en/use/benchmark_tool.md index 5bf91d33c6a7603dee7a9cb74a1d9d182d81634f..60f8b6a5d80422cddc0d6c4b9036a0090a5635df 100644 --- a/tutorials/lite/source_en/use/benchmark_tool.md +++ b/tutorials/lite/source_en/use/benchmark_tool.md @@ -40,7 +40,7 @@ The main test indicator of the performance test performed by the Benchmark tool This command uses a random input, and other parameters use default values. After this command is executed, the following statistics are displayed. The statistics include the minimum duration, maximum duration, and average duration of a single inference after the tested model runs for the specified number of inference rounds. -``` +```text Model = test_benchmark.ms, numThreads = 2, MinRunTime = 72.228996 ms, MaxRuntime = 73.094002 ms, AvgRunTime = 72.556000 ms ``` @@ -50,7 +50,7 @@ Model = test_benchmark.ms, numThreads = 2, MinRunTime = 72.228996 ms, MaxRuntime This command uses a random input, sets the parameter `timeProfiling` as true, and other parameters use default values. After this command is executed, the statistics on the running time of the model at the network layer will be displayed as follows. In this case, the statistics are displayed by`opName` and `optype`. 
`opName` indicates the operator name, `optype` indicates the operator type, and `avg` indicates the average running time of the operator per single run, `percent` indicates the ratio of the operator running time to the total operator running time, `calledTimess` indicates the number of times that the operator is run, and `opTotalTime` indicates the total time that the operator is run for a specified number of times. Finally, `total time` and `kernel cost` show the average time consumed by a single inference operation of the model and the sum of the average time consumed by all operators in the model inference, respectively. -``` +```text ----------------------------------------------------------------------------------------- opName avg(ms) percent calledTimess opTotalTime conv2d_1/convolution 2.264800 0.824012 10 22.648003 @@ -98,7 +98,7 @@ The accuracy test performed by the Benchmark tool is to verify the accuracy of t This command specifies the input data and benchmark data of the tested model, specifies that the model inference program runs on the CPU, and sets the accuracy threshold to 3%. After this command is executed, the following statistics are displayed, including the single input data of the tested model, output result and average deviation rate of the output node, and average deviation rate of all nodes. 
-``` +```text InData0: 139.947 182.373 153.705 138.945 108.032 164.703 111.585 227.402 245.734 97.7776 201.89 134.868 144.851 236.027 18.1142 22.218 5.15569 212.318 198.43 221.853 ================ Comparing Output data ================ Data of node age_out : 5.94584e-08 6.3317e-08 1.94726e-07 1.91809e-07 8.39805e-08 7.66035e-08 1.69285e-07 1.46246e-07 6.03796e-07 1.77631e-07 1.54343e-07 2.04623e-07 8.89609e-07 3.63487e-06 4.86876e-06 1.23939e-05 3.09981e-05 3.37098e-05 0.000107102 0.000213932 0.000533579 0.00062465 0.00296401 0.00993984 0.038227 0.0695085 0.162854 0.123199 0.24272 0.135048 0.169159 0.0221256 0.013892 0.00502971 0.00134921 0.00135701 0.000383242 0.000163475 0.000136294 9.77864e-05 8.00793e-05 5.73874e-05 3.53858e-05 2.18535e-05 2.04467e-05 1.85286e-05 1.05075e-05 9.34751e-06 6.12732e-06 4.55476e-06 @@ -108,6 +108,7 @@ Mean bias of all nodes: 0% ``` To set specified input shapes(such as 1,32,32,1), use command as follows: + ```bash ./benchmark --modelFile=./models/test_benchmark.ms --inDataFile=./input/test_benchmark.bin --inputShapes=1,32,32,1 --device=CPU --accuracyThreshold=3 --benchmarkDataFile=./output/test_benchmark.out ``` @@ -118,11 +119,11 @@ The command used for benchmark testing based on the compiled Benchmark tool is a ```bash ./benchmark [--modelFile=] [--accuracyThreshold=] - [--benchmarkDataFile=] [--benchmarkDataType=] - [--cpuBindMode=] [--device=] [--help] - [--inDataFile=] [--loopCount=] - [--numThreads=] [--warmUpLoopCount=] - [--enableFp16=] [--timeProfiling=] + [--benchmarkDataFile=] [--benchmarkDataType=] + [--cpuBindMode=] [--device=] [--help] + [--inDataFile=] [--loopCount=] + [--numThreads=] [--warmUpLoopCount=] + [--enableFp16=] [--timeProfiling=] [--inputShapes=] ``` diff --git a/tutorials/lite/source_en/use/build.md b/tutorials/lite/source_en/use/build.md index 74b3d08a1bf96da07a791fb7e9f97ee232a500a0..b429ae30527f9fa960a6cfe51504c1da1ac46ec6 100644 --- a/tutorials/lite/source_en/use/build.md +++ 
b/tutorials/lite/source_en/use/build.md @@ -32,33 +32,33 @@ This chapter introduces how to quickly compile MindSpore Lite, which includes th - The compilation environment supports Linux x86_64 only. Ubuntu 18.04.02 LTS is recommended. - Compilation dependencies of runtime(cpp), benchmark: - - [CMake](https://cmake.org/download/) >= 3.14.1 - - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 - - [Android_NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) - - [Git](https://git-scm.com/downloads) >= 2.28.0 + - [CMake](https://cmake.org/download/) >= 3.14.1 + - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 + - [Android_NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) + - [Git](https://git-scm.com/downloads) >= 2.28.0 - Compilation dependencies of converter: - - [CMake](https://cmake.org/download/) >= 3.14.1 - - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 - - [Android_NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) - - [Git](https://git-scm.com/downloads) >= 2.28.0 - - [Autoconf](http://ftp.gnu.org/gnu/autoconf/) >= 2.69 - - [Libtool](https://www.gnu.org/software/libtool/) >= 2.4.6 - - [LibreSSL](http://www.libressl.org/) >= 3.1.3 - - [Automake](https://www.gnu.org/software/automake/) >= 1.11.6 - - [Libevent](https://libevent.org) >= 2.0 - - [M4](https://www.gnu.org/software/m4/m4.html) >= 1.4.18 - - [OpenSSL](https://www.openssl.org/) >= 1.1.1 - - [Python](https://www.python.org/) >= 3.7.5 + - [CMake](https://cmake.org/download/) >= 3.14.1 + - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 + - [Android_NDK r20b](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) + - [Git](https://git-scm.com/downloads) >= 2.28.0 + - [Autoconf](http://ftp.gnu.org/gnu/autoconf/) >= 2.69 + - [Libtool](https://www.gnu.org/software/libtool/) >= 2.4.6 + - [LibreSSL](http://www.libressl.org/) >= 3.1.3 + - 
[Automake](https://www.gnu.org/software/automake/) >= 1.11.6 + - [Libevent](https://libevent.org) >= 2.0 + - [M4](https://www.gnu.org/software/m4/m4.html) >= 1.4.18 + - [OpenSSL](https://www.openssl.org/) >= 1.1.1 + - [Python](https://www.python.org/) >= 3.7.5 - Compilation dependencies of runtime(java) - - [CMake](https://cmake.org/download/) >= 3.14.1 - - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 - - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 - - [Git](https://git-scm.com/downloads) >= 2.28.0 - - [Android_SDK](https://developer.android.com/studio/releases/platform-tools?hl=zh-cn#downloads) >= 30 - - [Gradle](https://gradle.org/releases/) >= 6.6.1 - - [JDK](https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html) >= 1.8 + - [CMake](https://cmake.org/download/) >= 3.14.1 + - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 + - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 + - [Git](https://git-scm.com/downloads) >= 2.28.0 + - [Android_SDK](https://developer.android.com/studio/releases/platform-tools?hl=zh-cn#downloads) >= 30 + - [Gradle](https://gradle.org/releases/) >= 6.6.1 + - [JDK](https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html) >= 1.8 > - To install and use `Android_NDK`, you need to configure environment variables. The command example is `export ANDROID_NDK={$NDK_PATH}/android-ndk-r20b`. > - Android SDK Tools need install Android SDK Build Tools. 
@@ -95,30 +95,37 @@ git clone https://gitee.com/mindspore/mindspore.git Then, run the following commands in the root directory of the source code to compile MindSpore Lite of different versions: - Debug version of the x86_64 architecture: + ```bash bash build.sh -I x86_64 -d ``` - Release version of the x86_64 architecture, with the number of threads set: + ```bash bash build.sh -I x86_64 -j32 ``` - Release version of the Arm 64-bit architecture in incremental compilation mode, with the number of threads set: + ```bash bash build.sh -I arm64 -i -j32 ``` - Release version of the Arm 64-bit architecture in incremental compilation mode, with the built-in GPU operator compiled: + ```bash bash build.sh -I arm64 -e gpu ``` - Compile ARM64 with image preprocessing module: + ```bash bash build.sh -I arm64 -n lite_cv ``` + - Compile MindSpore Lite AAR in incremental compilation mode: + ```bash bash build.sh -A java -i ``` @@ -128,6 +135,7 @@ Then, run the following commands in the root directory of the source code to com ### Output Description After the compilation is complete, go to the `mindspore/output` directory of the source code to view the file generated after compilation. The file is divided into three parts. + - `mindspore-lite-{version}-converter-{os}.tar.gz`: Contains model conversion tool. - `mindspore-lite-{version}-runtime-{os}-{device}.tar.gz`: Contains model inference framework, benchmarking tool and performance analysis tool. - `mindspore-lite-{version}-minddata-{os}-{device}.tar.gz`: Contains image processing library ImageProcess. 
@@ -146,13 +154,14 @@ tar -xvf mindspore-lite-{version}-runtime-{os}-{device}.tar.gz tar -xvf mindspore-lite-{version}-minddata-{os}-{device}.tar.gz unzip mindspore-lite-maven-{version}.zip ``` + #### Description of Converter's Directory Structure The conversion tool is only available under the `-I x86_64` compilation option, and the content includes the following parts: -``` +```text | -├── mindspore-lite-{version}-converter-{os} +├── mindspore-lite-{version}-converter-{os} │ └── converter # Model conversion Ttool │ └── lib # The dynamic link library that converter depends │ └── third_party # Header files and libraries of third party libraries @@ -164,9 +173,10 @@ The conversion tool is only available under the `-I x86_64` compilation option, The inference framework can be obtained under `-I x86_64`, `-I arm64` and `-I arm32` compilation options, and the content includes the following parts: - When the compilation option is `-I x86_64`: - ``` + + ```text | - ├── mindspore-lite-{version}-runtime-x86-cpu + ├── mindspore-lite-{version}-runtime-x86-cpu │ └── benchmark # Benchmarking Tool │ └── lib # Inference framework dynamic library │ ├── libmindspore-lite.so # Dynamic library of infernece framework in MindSpore Lite @@ -176,7 +186,8 @@ The inference framework can be obtained under `-I x86_64`, `-I arm64` and `-I ar ``` - When the compilation option is `-I arm64`: - ``` + + ```text | ├── mindspore-lite-{version}-runtime-arm64-cpu │ └── benchmark # Benchmarking Tool @@ -190,7 +201,8 @@ The inference framework can be obtained under `-I x86_64`, `-I arm64` and `-I ar ``` - When the compilation option is `-I arm32`: - ``` + + ```text | ├── mindspore-lite-{version}-runtime-arm32-cpu │ └── benchmark # Benchmarking Tool @@ -220,24 +232,23 @@ export LD_LIBRARY_PATH= ./output/mindspore-lite-{version}-runtime-x86-cpu/lib:${ - When the compilation option is `-A java`: - ``` + ```text | ├── mindspore-lite-maven-{version} │ └── mindspore - │ └── mindspore-lite - | └── 
{version}-SNAPSHOT - │ ├── mindspore-lite-{version}-{timestamp}-{versionCode}.aar # MindSpore Lite runtime aar + │ └── mindspore-lite + | └── {version}-SNAPSHOT + │ ├── mindspore-lite-{version}-{timestamp}-{versionCode}.aar # MindSpore Lite runtime aar ``` - #### Description of Imageprocess's Directory Structure The image processing library is only available under the `-I arm64 -n lite_cv` compilation option, and the content includes the following parts: -``` +```text | ├── mindspore-lite-{version}-minddata-{os}-{device} -│ └── benchmark # Benchmarking Tool +│ └── benchmark # Benchmarking Tool │ └── include # Head file(Only show files related to image processing) │ ├── lite_cv # Image processing library header file │ ├── image_process.h # Image processing function header file diff --git a/tutorials/lite/source_en/use/converter_tool.md b/tutorials/lite/source_en/use/converter_tool.md index 388896527d16fe9fa7eebcf42ca606745a959949..0d64d1e74022981009b66ea67a229fdeaf91cd75 100644 --- a/tutorials/lite/source_en/use/converter_tool.md +++ b/tutorials/lite/source_en/use/converter_tool.md @@ -36,9 +36,11 @@ To use the MindSpore Lite model conversion tool, you need to prepare the environ ### Example First, in the root directory of the source code, run the following command to perform compilation. For details, see `compile.md`. + ```bash bash build.sh -I x86_64 ``` + > Currently, the model conversion tool supports only the x86_64 architecture. The following describes how to use the conversion command by using several common examples. @@ -52,42 +54,51 @@ The following describes how to use the conversion command by using several commo In this example, the Caffe model is used. Therefore, the model structure and model weight files are required. Two more parameters `fmk` and `outputFile` are also required. 
The output is as follows: - ``` + + ```text CONVERTER RESULT SUCCESS:0 ``` + This indicates that the Caffe model is successfully converted into the MindSpore Lite model and the new file `lenet.ms` is generated. - + - The following uses the MindSpore, TensorFlow Lite, ONNX and perception quantization models as examples to describe how to run the conversion command. - - MindSpore model `model.mindir` + - MindSpore model `model.mindir` + ```bash ./converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model ``` - - - TensorFlow Lite model `model.tflite` + + - TensorFlow Lite model `model.tflite` + ```bash ./converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model ``` - - - ONNX model `model.onnx` + + - ONNX model `model.onnx` + ```bash ./converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model ``` - - TensorFlow Lite aware quantization model `model_quant.tflite` + - TensorFlow Lite aware quantization model `model_quant.tflite` + ```bash ./converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model --quantType=AwareTraining ``` - - - TensorFlow Lite aware quantization model `model_quant.tflite` set the input and output data type to be float + + - TensorFlow Lite aware quantization model `model_quant.tflite` set the input and output data type to be float + ```bash ./converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model --quantType=AwareTraining --inferenceType=FLOAT ``` In the preceding scenarios, the following information is displayed, indicating that the conversion is successful. In addition, the target file `model.ms` is obtained. - ``` + + ```text CONVERTER RESULT SUCCESS:0 ``` + - If fail to run the conversion command, an [errorcode](https://www.mindspore.cn/doc/api_cpp/en/master/errorcode_and_metatype.html) will be output. ### Parameter Description @@ -97,7 +108,6 @@ You can enter `./converter_lite --help` to obtain help information in real time. The following describes the parameters in detail. 
- | Parameter | Mandatory or Not | Parameter Description | Value Range | Default Value | | -------- | ------- | ----- | --- | ---- | | `--help` | No | Prints all help information. | - | - | @@ -106,7 +116,7 @@ The following describes the parameters in detail. | `--outputFile=` | Yes | Path of the output model. (If the path does not exist, a directory will be automatically created.) The suffix `.ms` can be automatically generated. | - | - | | `--weightFile=` | Yes (for Caffe models only) | Path of the weight file of the input model. | - | - | | `--quantType=` | No | Sets the quant type of the model. | PostTraining: quantization after training
AwareTraining: quantization-aware training
WeightQuant: weight-only quantization after training | - | -| `--inferenceType= `| No (supported by quantization-aware models only) | Sets the input and output data type of the converted model. If the types differ from those of the original model, the conversion tool inserts data type conversion operators at the inputs and outputs of the model so that the data types match the original model. | UINT8, FLOAT or INT8 | FLOAT | +| `--inferenceType=`| No (supported by quantization-aware models only) | Sets the input and output data type of the converted model. If the types differ from those of the original model, the conversion tool inserts data type conversion operators at the inputs and outputs of the model so that the data types match the original model. | UINT8, FLOAT or INT8 | FLOAT | | `--bitNum=` | No | Sets the number of quantization bits when quantType is set to WeightQuant. Currently only 8 bits are supported. | 8 | 8 | | `--quantWeightSize=` | No | Sets the size threshold of convolution filters when quantType is set to WeightQuant. Weight quantization is triggered when the filter size exceeds this value. | (0, +∞) | 0 | | `--quantWeightChannel=` | No | Sets the channel number threshold of convolution filters when quantType is set to WeightQuant. Weight quantization is triggered when the number of channels exceeds this value. | (0, +∞) | 16 | @@ -130,9 +140,11 @@ Reference description Linux environment model conversion tool [parameter descrip ### Example Set the log printing level to INFO. + ```bash set GLOG_v=1 ``` + > Log level: 0 is DEBUG, 1 is INFO, 2 is WARNING, 3 is ERROR. Several common examples are selected below to illustrate the use of conversion commands. @@ -146,35 +158,43 @@ Several common examples are selected below to illustrate the use of conversion c In this example, because the Caffe model is used, two input files are required: the model structure and the model weights. Together with the two mandatory parameters, the fmk type and the output path, the command can then be executed.
The result is shown as: - ``` + + ```text CONVERTER RESULT SUCCESS:0 ``` + This means that the Caffe model has been successfully converted to the MindSpore Lite model and the new file `lenet.ms` has been obtained. - + - Take MindSpore, TensorFlow Lite, ONNX model format and perceptual quantization model as examples to execute conversion commands. - - MindSpore model `model.mindir` + - MindSpore model `model.mindir` + ```bash call converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model ``` - - - TensorFlow Lite model`model.tflite` + + - TensorFlow Lite model`model.tflite` + ```bash call converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model ``` - - - ONNX model`model.onnx` + + - ONNX model`model.onnx` + ```bash call converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model ``` - - TensorFlow Lite awaring quant model `model_quant.tflite` + - TensorFlow Lite awaring quant model `model_quant.tflite` + ```bash call converter_lite --fmk=TFLITE --modelFile=model_quant.tflite --outputFile=model --quantType=AwareTraining ``` In the above cases, the following conversion success prompt is displayed, and the `model.ms` target file is obtained at the same time. - ``` + + ```text CONVERTER RESULT SUCCESS:0 - ``` + ``` + - If fail to run the conversion command, an [errorcode](https://www.mindspore.cn/doc/api_cpp/en/master/errorcode_and_metatype.html) will be output. 
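The conversion commands in this file all follow one pattern: `--fmk`, `--modelFile` and `--outputFile` are always present, `--weightFile` is added for Caffe models, and `--quantType` is optional. As a rough sketch of that pattern, the C++ helper below assembles such a command line. `BuildConverterCommand` is a hypothetical name used only for illustration; it is not part of MindSpore Lite, and it only builds the string without running the tool.

```cpp
#include <string>

// Hypothetical helper (not a MindSpore Lite API): assembles a converter_lite
// command line from the flags documented in the parameter table.
// The caller adds the platform-specific launcher ("./" on Linux, "call " on Windows).
std::string BuildConverterCommand(const std::string &fmk,
                                  const std::string &model_file,
                                  const std::string &output_file,
                                  const std::string &weight_file = "",
                                  const std::string &quant_type = "") {
  std::string cmd = "converter_lite --fmk=" + fmk + " --modelFile=" + model_file;
  // --weightFile is documented as mandatory for Caffe models only.
  if (!weight_file.empty()) {
    cmd += " --weightFile=" + weight_file;
  }
  cmd += " --outputFile=" + output_file;
  // --quantType is optional, e.g. AwareTraining.
  if (!quant_type.empty()) {
    cmd += " --quantType=" + quant_type;
  }
  return cmd;
}
```

For instance, `"call " + BuildConverterCommand("TFLITE", "model_quant.tflite", "model", "", "AwareTraining")` would reproduce the flags of the Windows quantization-aware example above, although the flag order may differ from the documented command.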
diff --git a/tutorials/lite/source_en/use/image_processing.md b/tutorials/lite/source_en/use/image_processing.md index f893ad41b3b9c9048310fb473c011e0f6962120d..06780e3461216f46370a7eb5fe924dba1751a4c5 100644 --- a/tutorials/lite/source_en/use/image_processing.md +++ b/tutorials/lite/source_en/use/image_processing.md @@ -7,15 +7,15 @@ - [Import image preprocessing function library](#import-image-preprocessing-function-library) - [Initialize the image](#initialize-the-image) - [Usage example](#usage-example) - - [Optional image preprocessing operator](#optional-image-preprocessing-operator) + - [Optional image preprocessing operator](#optional-image-preprocessing-operator) - [Resize image](#resize-image) - - [Usage example](#usage-example-1) + - [Usage example](#usage-example-1) - [Convert the image data type](#convert-the-image-data-type) - - [Usage example](#usage-example-2) + - [Usage example](#usage-example-2) - [Crop image data](#crop-image-data) - - [Usage example](#usage-example-3) + - [Usage example](#usage-example-3) - [Normalize image data](#normalize-image-data) - - [Usage example](#usage-example-4) + - [Usage example](#usage-example-4) @@ -29,7 +29,7 @@ The process is as follows: ## Import image preprocessing function library -``` +```cpp #include "lite_cv/lite_mat.h" #include "lite_cv/image_process.h" ``` @@ -38,13 +38,13 @@ The process is as follows: Here, the [InitFromPixel](https://www.mindspore.cn/doc/api_cpp/en/master/dataset.html#initfrompixel) function in the `image_process.h` file is used to initialize the image. -``` +```cpp bool InitFromPixel(const unsigned char *data, LPixelType pixel_type, LDataType data_type, int w, int h, LiteMat &m) ``` ### Usage example -``` +```cpp // Create the data object of the LiteMat object. 
LiteMat lite_mat_bgr; @@ -61,13 +61,13 @@ The image processing operators here can be used in any combination according to Here we use the [ResizeBilinear](https://www.mindspore.cn/doc/api_cpp/en/master/dataset.html#resizebilinear) function in `image_process.h` to resize the image through a bilinear algorithm. Currently, the supported data type is uint8 and the supported channel counts are 3 and 1. -``` +```cpp bool ResizeBilinear(const LiteMat &src, LiteMat &dst, int dst_w, int dst_h) ``` #### Usage example -``` +```cpp // Initialize the image data. LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -83,13 +83,13 @@ ResizeBilinear(lite_mat_bgr, lite_mat_resize, 256, 256); Here we use the [ConvertTo](https://www.mindspore.cn/doc/api_cpp/en/master/dataset.html#convertto) function in `image_process.h` to convert the image data type. Currently, only conversion from uint8 to float is supported. -``` +```cpp bool ConvertTo(const LiteMat &src, LiteMat &dst, double scale = 1.0) ``` #### Usage example -``` +```cpp // Initialize the image data. LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -105,13 +105,13 @@ ConvertTo(lite_mat_bgr, lite_mat_convert_float); Here we use the [Crop](https://www.mindspore.cn/doc/api_cpp/en/master/dataset.html#crop) function in `image_process.h` to crop the image. Currently, channels 3 and 1 are supported. -``` +```cpp bool Crop(const LiteMat &src, LiteMat &dst, int x, int y, int w, int h) ``` #### Usage example -``` +```cpp // Initialize the image data.
LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -127,13 +127,13 @@ Crop(lite_mat_bgr, lite_mat_cut, 16, 16, 224, 224); In order to eliminate the dimensional influence among the data indicators, and solve the comparability problem among the data indicators through standardization processing, here is the use of the [SubStractMeanNormalize](https://www.mindspore.cn/doc/api_cpp/en/master/dataset.html#substractmeannormalize) function in `image_process.h` to normalize the image data. -``` +```cpp bool SubStractMeanNormalize(const LiteMat &src, LiteMat &dst, const std::vector &mean, const std::vector &std) ``` #### Usage example -``` +```cpp // Initialize the image data. LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -148,4 +148,4 @@ LiteMat lite_mat_bgr_norm; // The image data is normalized by the mean value and variance of the image data. SubStractMeanNormalize(lite_mat_bgr, lite_mat_bgr_norm, means, stds); -``` \ No newline at end of file +``` diff --git a/tutorials/lite/source_en/use/post_training_quantization.md b/tutorials/lite/source_en/use/post_training_quantization.md index d5cb08563bd75b78c295a98ac02dba7a2702e16f..544255096c9d953c879f87b367124e9be558b05e 100644 --- a/tutorials/lite/source_en/use/post_training_quantization.md +++ b/tutorials/lite/source_en/use/post_training_quantization.md @@ -1,3 +1,3 @@ -# Converting to the MindSpore Lite Model (Post Training Quantization) +# Converting to the MindSpore Lite Model (Post Training Quantization) -Post training quantization is being translated, will be released soon. \ No newline at end of file +Post training quantization is being translated, will be released soon. 
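Returning to the normalization step in `image_processing.md` above: the text says the image is normalized by per-channel mean and standard deviation values, but does not spell out the arithmetic. The sketch below assumes the conventional form `(x - mean[c]) / std[c]` on interleaved (HWC) float data. `NormalizeHWC` is a hypothetical stand-alone function written for illustration; it is not the `lite_cv` implementation of `SubStractMeanNormalize`, whose exact behaviour should be checked against the API reference.

```cpp
#include <cstddef>
#include <vector>

// Illustrative only: per-channel normalization of interleaved (HWC) float pixels,
// assuming out = (in - mean[c]) / std[c]. Returns an empty vector on invalid input.
std::vector<float> NormalizeHWC(const std::vector<float> &pixels,
                                const std::vector<float> &mean,
                                const std::vector<float> &stddev) {
  const std::size_t channels = mean.size();
  if (channels == 0 || stddev.size() != channels ||
      pixels.size() % channels != 0) {
    return {};
  }
  std::vector<float> out(pixels.size());
  for (std::size_t i = 0; i < pixels.size(); ++i) {
    // In HWC layout, the channel index cycles fastest.
    const std::size_t c = i % channels;
    out[i] = (pixels[i] - mean[c]) / stddev[c];
  }
  return out;
}
```

Under this assumption, the input must already be float data, which is consistent with the tutorial calling `ConvertTo` before the normalization step.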
diff --git a/tutorials/lite/source_en/use/runtime_cpp.md b/tutorials/lite/source_en/use/runtime_cpp.md index f763ba52b2c45e4f82a02e85c60074f04dd7c463..6b91c732b66767c4a2f4a5e55caa7d79ebdfec95 100644 --- a/tutorials/lite/source_en/use/runtime_cpp.md +++ b/tutorials/lite/source_en/use/runtime_cpp.md @@ -35,7 +35,6 @@ - ## Overview @@ -47,6 +46,7 @@ The procedure for using Runtime is shown in the following figure: ![img](../images/side_infer_process.png) Its components and their functions are described as follows: + - `Model`: model used by MindSpore Lite, which instantiates the list of operator prototypes through image composition or direct network loading. - `Lite Session`: provides the graph compilation function and calls the graph executor for inference. - `Scheduler`: operator heterogeneous scheduler. It can select a proper kernel for each operator based on the heterogeneous scheduling policy, construct a kernel list, and split a graph into subgraphs. @@ -252,6 +252,7 @@ memcpy(in_data, input_buf, data_size); ``` Note: + - The data layout in the model input tensors of MindSpore Lite must be NHWC. - The model input `input_buf` is read from disks. After it is copied to model input tensors, you need to release `input_buf`. - Vectors returned by using the `GetInputs` and `GetInputsByTensorName` methods do not need to be released by users. @@ -299,6 +300,7 @@ session->BindThread(false); ### Callback Running MindSpore Lite can transfer two `KernelCallBack` function pointers to call back the inference model when calling `RunGraph`. Compared with common graph execution, callback running can obtain extra information during the running process to help developers analyze performance and fix bugs. 
The extra information includes: + - Name of the running node - Input and output tensors before inference of the current node - Input and output tensors after inference of the current node @@ -380,6 +382,7 @@ delete (model); After performing inference, MindSpore Lite can obtain the model inference result. MindSpore Lite provides the following methods to obtain the model output `MSTensor`. + 1. Use the `GetOutputsByNodeName` method to obtain vectors of the model output `MSTensor` that is connected to the model output node based on the node name. ```cpp @@ -515,11 +518,13 @@ std::string version = mindspore::lite::Version(); ``` ## Session parallel launch + MindSpore Lite supports multiple `LiteSession` parallel inferences, but does not support multiple threads calling the `RunGraph` interface of a single `LiteSession` at the same time. ### Single Session parallel launch MindSpore Lite does not support multi-threaded parallel calling of the inference interface of a single `LiteSession`, otherwise we will get the following error message: + ```cpp ERROR [mindspore/lite/src/lite_session.cc:297] RunGraph] 10 Not support multi-threading ``` @@ -531,6 +536,7 @@ MindSpore Lite supports multiple `LiteSession` in doing inference in parallel. T ### Example The following code shows how to create multiple `LiteSession` and do inference in parallel: + ```cpp #include #include "src/common/file_utils.h" diff --git a/tutorials/lite/source_zh_cn/quick_start/quick_start.md b/tutorials/lite/source_zh_cn/quick_start/quick_start.md index 2a8050f6ab1872b0359887512d845331d4737174..15deccdde6f612727f150931468a4cbe54f7c894 100644 --- a/tutorials/lite/source_zh_cn/quick_start/quick_start.md +++ b/tutorials/lite/source_zh_cn/quick_start/quick_start.md @@ -22,12 +22,13 @@ ## 概述 我们推荐你从端侧Android图像分类demo入手,了解MindSpore Lite应用工程的构建、依赖项配置以及相关API的使用。 - + 本教程基于MindSpore团队提供的Android“端侧图像分类”示例程序,演示了端侧部署的流程。 + 1. 选择图像分类模型。 2. 将模型转换成MindSpore Lite模型格式。 3. 
在端侧使用MindSpore Lite推理模型。详细说明如何在端侧利用MindSpore Lite C++ API(Android JNI)和MindSpore Lite图像分类模型完成端侧推理,实现对设备摄像头捕获的内容进行分类,并在APP图像预览界面中,显示出最可能的分类结果。 - + > 你可以在这里找到[Android图像分类模型](https://download.mindspore.cn/model_zoo/official/lite/mobilenetv2_openimage_lite)和[示例代码](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/lite/image_classification)。 ## 选择模型 @@ -41,6 +42,7 @@ MindSpore Model Zoo中图像分类模型可[在此下载](https://download.minds 如果预置模型已经满足你要求,请跳过本章节。 如果你需要对MindSpore提供的模型进行重训,重训完成后,需要将模型导出为[.mindir格式](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/save_model.html#mindir)。然后使用MindSpore Lite[模型转换工具](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/converter_tool.html)将.mindir模型转换成.ms格式。 以mobilenetv2模型为例,如下脚本将其转换为MindSpore Lite模型用于端侧推理。 + ```bash ./converter_lite --fmk=MINDIR --modelFile=mobilenetv2.mindir --outputFile=mobilenetv2.ms ``` @@ -59,7 +61,7 @@ MindSpore Model Zoo中图像分类模型可[在此下载](https://download.minds ### 构建与运行 -1. 在Android Studio中加载本示例源码,并安装相应的SDK(指定SDK版本后,由Android Studio自动安装)。 +1. 在Android Studio中加载本示例源码,并安装相应的SDK(指定SDK版本后,由Android Studio自动安装)。 ![start_home](../images/lite_quick_start_home.png) @@ -85,13 +87,10 @@ MindSpore Model Zoo中图像分类模型可[在此下载](https://download.minds ![install](../images/lite_quick_start_install.png) - - 识别结果如下图所示。 ![result](../images/lite_quick_start_app_result.png) - ## 示例程序详细说明 本端侧图像分类Android示例程序分为JAVA层和JNI层,其中,JAVA层主要通过Android Camera 2 API实现摄像头获取图像帧,以及相应的图像处理等功能;JNI层在[Runtime](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/runtime.html)中完成模型推理的过程。 @@ -100,7 +99,7 @@ MindSpore Model Zoo中图像分类模型可[在此下载](https://download.minds ### 示例程序结构 -``` +```text app ├── src/main │ ├── assets # 资源文件 @@ -119,7 +118,7 @@ app │ │ │ └── ... │ │ └── widget # 开启摄像头及绘制相关实现 │ │ └── ... 
-│ │ +│ │ │ ├── res # 存放Android相关的资源文件 │ └── AndroidManifest.xml # Android配置文件 │ @@ -146,7 +145,7 @@ Android JNI层调用MindSpore C++ API时,需要相关库文件支持。可通 mindspore-lite-1.0.0-minddata-arm64-cpu.tar.gz [下载链接](https://ms-release.obs.cn-north-4.myhuaweicloud.com/1.0.0/lite/android_aarch64/mindspore-lite-1.0.0-minddata-arm64-cpu.tar.gz) -``` +```text android{ defaultConfig{ externalNativeBuild{ @@ -155,7 +154,7 @@ android{ } } - ndk{ + ndk{ abiFilters'armeabi-v7a', 'arm64-v8a' } } @@ -164,7 +163,7 @@ android{ 在`app/CMakeLists.txt`文件中建立`.so`库文件链接,如下所示。 -``` +```text # ============== Set MindSpore Dependencies. ============= include_directories(${CMAKE_SOURCE_DIR}/src/main/cpp) include_directories(${CMAKE_SOURCE_DIR}/src/main/cpp/${MINDSPORELITE_VERSION}/third_party/flatbuffers/include) @@ -182,7 +181,7 @@ set_target_properties(minddata-lite PROPERTIES IMPORTED_LOCATION ${CMAKE_SOURCE_DIR}/src/main/cpp/${MINDSPORELITE_VERSION}/lib/libminddata-lite.so) # --------------- MindSpore Lite set End. -------------------- -# Link target library. +# Link target library. target_link_libraries( ... # --- mindspore --- @@ -202,34 +201,37 @@ target_link_libraries( 在JNI层调用MindSpore Lite C++ API实现端测推理。 -推理代码流程如下,完整代码请参见`src/cpp/MindSporeNetnative.cpp`。 +推理代码流程如下,完整代码请参见`src/cpp/MindSporeNetnative.cpp`。 1. 加载MindSpore Lite模型文件,构建上下文、会话以及用于推理的计算图。 - 加载模型文件:创建并配置用于模型推理的上下文 + ```cpp // Buffer is the model data passed in by the Java layer jlong bufferLen = env->GetDirectBufferCapacity(buffer); char *modelBuffer = CreateLocalModelBuffer(env, buffer); ``` - + - 创建会话 + ```cpp void **labelEnv = new void *; MSNetWork *labelNet = new MSNetWork; *labelEnv = labelNet; - + // Create context. mindspore::lite::Context *context = new mindspore::lite::Context; context->thread_num_ = num_thread; - + // Create the mindspore session. 
labelNet->CreateSessionMS(modelBuffer, bufferLen, context); delete (context); - + ``` - + - 加载模型文件并构建用于推理的计算图 + ```cpp void MSNetWork::CreateSessionMS(char* modelBuffer, size_t bufferLen, std::string name, mindspore::lite::Context* ctx) { @@ -239,8 +241,8 @@ target_link_libraries( int ret = session->CompileGraph(model); } ``` - -2. 将输入图片转换为传入MindSpore模型的Tensor格式。 + +2. 将输入图片转换为传入MindSpore模型的Tensor格式。 将待检测图片数据转换为输入MindSpore模型的Tensor。 @@ -250,7 +252,7 @@ target_link_libraries( // Processing such as zooming the picture size. matImgPreprocessed = PreProcessImageData(matImageSrc); - ImgDims inputDims; + ImgDims inputDims; inputDims.channel = matImgPreprocessed.channels(); inputDims.width = matImgPreprocessed.cols; inputDims.height = matImgPreprocessed.rows; @@ -270,7 +272,7 @@ target_link_libraries( inputDims.channel * inputDims.width * inputDims.height * sizeof(float)); delete[] (dataHWC); ``` - + 3. 对输入数据进行处理。 ```cpp @@ -302,7 +304,7 @@ target_link_libraries( } ``` -4. 对输入Tensor按照模型进行推理,获取输出Tensor,并进行后处理。 +4. 对输入Tensor按照模型进行推理,获取输出Tensor,并进行后处理。 - 图执行,端测推理。 @@ -312,6 +314,7 @@ target_link_libraries( ``` - 获取输出数据。 + ```cpp auto names = mSession->GetOutputTensorNames(); std::unordered_map msOutputs; @@ -321,22 +324,23 @@ target_link_libraries( } std::string retStr = ProcessRunnetResult(msOutputs, ret); ``` - + - 输出数据的后续处理。 + ```cpp std::string ProcessRunnetResult(std::unordered_map msOutputs, int runnetRet) { - + std::unordered_map::iterator iter; iter = msOutputs.begin(); - + // The mobilenetv2.ms model output just one branch. auto outputTensor = iter->second; int tensorNum = outputTensor->ElementsNum(); - + // Get a pointer to the first score. float *temp_scores = static_cast(outputTensor->MutableData()); - + float scores[RET_CATEGORY_SUM]; for (int i = 0; i < RET_CATEGORY_SUM; ++i) { if (temp_scores[i] > 0.5) { @@ -344,7 +348,7 @@ target_link_libraries( } scores[i] = temp_scores[i]; } - + // Score for each category. 
// Converted to text information that needs to be displayed in the APP. std::string categoryScore = ""; @@ -356,5 +360,5 @@ target_link_libraries( categoryScore += ";"; } return categoryScore; - } + } ``` diff --git a/tutorials/lite/source_zh_cn/use/benchmark_tool.md b/tutorials/lite/source_zh_cn/use/benchmark_tool.md index d411e48fc922562a3bef51f33063778bd6e02040..5b0c12be39e9d04994354afa179b4d0393e9457a 100644 --- a/tutorials/lite/source_zh_cn/use/benchmark_tool.md +++ b/tutorials/lite/source_zh_cn/use/benchmark_tool.md @@ -40,7 +40,7 @@ Benchmark工具进行的性能测试主要的测试指标为模型单次前向 这条命令使用随机输入,其他参数使用默认值。该命令执行后会输出如下统计信息,该信息显示了测试模型在运行指定推理轮数后所统计出的单次推理最短耗时、单次推理最长耗时和平均推理耗时。 -``` +```text Model = test_benchmark.ms, numThreads = 2, MinRunTime = 72.228996 ms, MaxRuntime = 73.094002 ms, AvgRunTime = 72.556000 ms ``` @@ -50,7 +50,7 @@ Model = test_benchmark.ms, numThreads = 2, MinRunTime = 72.228996 ms, MaxRuntime 这条命令使用随机输入,并且输出模型网络层的耗时信息,其他参数使用默认值。该命令执行后,模型网络层的耗时会输出如下统计信息,在该例中,该统计信息按照`opName`和`optype`两种划分方式分别显示,`opName`表示算子名,`optype`表示算子类别,`avg`表示该算子的平均单次运行时间,`percent`表示该算子运行耗时占所有算子运行总耗时的比例,`calledTimess`表示该算子的运行次数,`opTotalTime`表示该算子运行指定次数的总耗时。最后,`total time`和`kernel cost`分别显示了该模型单次推理的平均耗时和模型推理中所有算子的平均耗时之和。 -``` +```text ----------------------------------------------------------------------------------------- opName avg(ms) percent calledTimess opTotalTime conv2d_1/convolution 2.264800 0.824012 10 22.648003 @@ -98,7 +98,7 @@ Benchmark工具进行的精度测试主要是通过设置标杆数据来对比 这条命令指定了测试模型的输入数据、标杆数据(默认的输入及标杆数据类型均为float32),同时指定了模型推理程序在CPU上运行,并指定了准确度阈值为3%。该命令执行后会输出如下统计信息,该信息显示了测试模型的单条输入数据、输出节点的输出结果和平均偏差率以及所有节点的平均偏差率。 -``` +```text InData0: 139.947 182.373 153.705 138.945 108.032 164.703 111.585 227.402 245.734 97.7776 201.89 134.868 144.851 236.027 18.1142 22.218 5.15569 212.318 198.43 221.853 ================ Comparing Output data ================ Data of node age_out : 5.94584e-08 6.3317e-08 1.94726e-07 1.91809e-07 8.39805e-08 7.66035e-08 1.69285e-07 1.46246e-07 6.03796e-07 1.77631e-07 1.54343e-07 
2.04623e-07 8.89609e-07 3.63487e-06 4.86876e-06 1.23939e-05 3.09981e-05 3.37098e-05 0.000107102 0.000213932 0.000533579 0.00062465 0.00296401 0.00993984 0.038227 0.0695085 0.162854 0.123199 0.24272 0.135048 0.169159 0.0221256 0.013892 0.00502971 0.00134921 0.00135701 0.000383242 0.000163475 0.000136294 9.77864e-05 8.00793e-05 5.73874e-05 3.53858e-05 2.18535e-05 2.04467e-05 1.85286e-05 1.05075e-05 9.34751e-06 6.12732e-06 4.55476e-06 @@ -108,6 +108,7 @@ Mean bias of all nodes: 0% ``` 如果需要指定输入数据的维度(例如输入维度为1,32,32,1),使用如下命令: + ```bash ./benchmark --modelFile=./models/test_benchmark.ms --inDataFile=./input/test_benchmark.bin --inputShapes=1,32,32,1 --device=CPU --accuracyThreshold=3 --benchmarkDataFile=./output/test_benchmark.out ``` @@ -118,11 +119,11 @@ Mean bias of all nodes: 0% ```bash ./benchmark [--modelFile=] [--accuracyThreshold=] - [--benchmarkDataFile=] [--benchmarkDataType=] - [--cpuBindMode=] [--device=] [--help] - [--inDataFile=] [--loopCount=] - [--numThreads=] [--warmUpLoopCount=] - [--enableFp16=] [--timeProfiling=] + [--benchmarkDataFile=] [--benchmarkDataType=] + [--cpuBindMode=] [--device=] [--help] + [--inDataFile=] [--loopCount=] + [--numThreads=] [--warmUpLoopCount=] + [--enableFp16=] [--timeProfiling=] [--inputShapes=] ``` diff --git a/tutorials/lite/source_zh_cn/use/build.md b/tutorials/lite/source_zh_cn/use/build.md index eebe9a0f59def2f9d89d642f106ce79750297936..09fdccb0d0269dc6c48107529dc1c6e51be78659 100644 --- a/tutorials/lite/source_zh_cn/use/build.md +++ b/tutorials/lite/source_zh_cn/use/build.md @@ -31,31 +31,31 @@ - 系统环境:Linux x86_64,推荐使用Ubuntu 18.04.02LTS - runtime(cpp)、benchmark编译依赖 - - [CMake](https://cmake.org/download/) >= 3.14.1 - - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 - - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 - - [Git](https://git-scm.com/downloads) >= 2.28.0 + - [CMake](https://cmake.org/download/) >= 3.14.1 + - [GCC](https://gcc.gnu.org/releases.html) >= 
7.3.0 + - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 + - [Git](https://git-scm.com/downloads) >= 2.28.0 - converter编译依赖 - - [CMake](https://cmake.org/download/) >= 3.14.1 - - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 - - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 - - [Git](https://git-scm.com/downloads) >= 2.28.0 - - [Autoconf](http://ftp.gnu.org/gnu/autoconf/) >= 2.69 - - [Libtool](https://www.gnu.org/software/libtool/) >= 2.4.6 - - [LibreSSL](http://www.libressl.org/) >= 3.1.3 - - [Automake](https://www.gnu.org/software/automake/) >= 1.11.6 - - [Libevent](https://libevent.org) >= 2.0 - - [M4](https://www.gnu.org/software/m4/m4.html) >= 1.4.18 - - [OpenSSL](https://www.openssl.org/) >= 1.1.1 - - [Python](https://www.python.org/) >= 3.7.5 + - [CMake](https://cmake.org/download/) >= 3.14.1 + - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 + - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 + - [Git](https://git-scm.com/downloads) >= 2.28.0 + - [Autoconf](http://ftp.gnu.org/gnu/autoconf/) >= 2.69 + - [Libtool](https://www.gnu.org/software/libtool/) >= 2.4.6 + - [LibreSSL](http://www.libressl.org/) >= 3.1.3 + - [Automake](https://www.gnu.org/software/automake/) >= 1.11.6 + - [Libevent](https://libevent.org) >= 2.0 + - [M4](https://www.gnu.org/software/m4/m4.html) >= 1.4.18 + - [OpenSSL](https://www.openssl.org/) >= 1.1.1 + - [Python](https://www.python.org/) >= 3.7.5 - runtime(java)编译依赖 - - [CMake](https://cmake.org/download/) >= 3.14.1 - - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 - - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 - - [Git](https://git-scm.com/downloads) >= 2.28.0 - - [Android_SDK](https://developer.android.com/studio/releases/platform-tools?hl=zh-cn#downloads) >= 30 - - [Gradle](https://gradle.org/releases/) >= 6.6.1 - - 
[JDK](https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html) >= 1.8 + - [CMake](https://cmake.org/download/) >= 3.14.1 + - [GCC](https://gcc.gnu.org/releases.html) >= 7.3.0 + - [Android_NDK](https://dl.google.com/android/repository/android-ndk-r20b-linux-x86_64.zip) >= r20 + - [Git](https://git-scm.com/downloads) >= 2.28.0 + - [Android_SDK](https://developer.android.com/studio/releases/platform-tools?hl=zh-cn#downloads) >= 30 + - [Gradle](https://gradle.org/releases/) >= 6.6.1 + - [JDK](https://www.oracle.com/cn/java/technologies/javase/javase-jdk8-downloads.html) >= 1.8 > - 当安装完依赖项Android_NDK后,需配置环境变量:`export ANDROID_NDK={$NDK_PATH}/android-ndk-r20b`。 > - Android SDK组件需要安装Android SDK Build Tools。 @@ -92,31 +92,37 @@ git clone https://gitee.com/mindspore/mindspore.git 然后,在源码根目录下执行如下命令,可编译不同版本的MindSpore Lite。 - 编译x86_64架构Debug版本。 + ```bash bash build.sh -I x86_64 -d ``` - 编译x86_64架构Release版本,同时设定线程数。 + ```bash bash build.sh -I x86_64 -j32 ``` - 增量编译ARM64架构Release版本,同时设定线程数。 + ```bash bash build.sh -I arm64 -i -j32 ``` - 编译ARM64架构Release版本,同时编译内置的GPU算子。 + ```bash bash build.sh -I arm64 -e gpu ``` - 编译ARM64带图像预处理模块。 + ```bash bash build.sh -I arm64 -n lite_cv ``` - 增量编译MindSpore Lite AAR。 + ```bash bash build.sh -A java -i ``` @@ -150,9 +156,9 @@ unzip mindspore-lite-maven-{version}.zip 转换工具仅在`-I x86_64`编译选项下获得,内容包括以下几部分: -``` +```text | -├── mindspore-lite-{version}-converter-{os} +├── mindspore-lite-{version}-converter-{os} │ └── converter # 模型转换工具 │ └── lib # 转换工具依赖的动态库 │ └── third_party # 第三方库头文件和库 @@ -164,9 +170,10 @@ unzip mindspore-lite-maven-{version}.zip 推理框架可在`-I x86_64`、`-I arm64`、`-I arm32`和`-A java`编译选项下获得,内容包括以下几部分: - 当编译选项为`-I x86_64`时: - ``` + + ```text | - ├── mindspore-lite-{version}-runtime-x86-cpu + ├── mindspore-lite-{version}-runtime-x86-cpu │ └── benchmark # 基准测试工具 │ └── lib # 推理框架动态库 │ ├── libmindspore-lite.so # MindSpore Lite推理框架的动态库 @@ -176,7 +183,8 @@ unzip mindspore-lite-maven-{version}.zip ``` - 当编译选项为`-I arm64`时: - 
``` + + ```text | ├── mindspore-lite-{version}-runtime-arm64-cpu │ └── benchmark # 基准测试工具 @@ -190,7 +198,8 @@ unzip mindspore-lite-maven-{version}.zip ``` - 当编译选项为`-I arm32`时: - ``` + + ```text | ├── mindspore-lite-{version}-runtime-arm32-cpu │ └── benchmark # 基准测试工具 @@ -220,23 +229,23 @@ export LD_LIBRARY_PATH=./output/mindspore-lite-{version}-runtime-x86-cpu/lib:${L - 当编译选项为`-A java`时: - ``` + ```text | ├── mindspore-lite-maven-{version} │ └── mindspore - │ └── mindspore-lite - | └── {version}-SNAPSHOT - │ ├── mindspore-lite-{version}-{timestamp}-{versionCode}.aar # MindSpore Lite推理框架aar包 + │ └── mindspore-lite + | └── {version}-SNAPSHOT + │ ├── mindspore-lite-{version}-{timestamp}-{versionCode}.aar # MindSpore Lite推理框架aar包 ``` #### 图像处理库目录结构说明 图像处理库在`-I arm64 -n lite_cv`编译选项下获得,内容包括以下几部分: -``` +```text | ├── mindspore-lite-{version}-minddata-{os}-{device} -│ └── benchmark # 基准测试工具 +│ └── benchmark # 基准测试工具 │ └── include # 头文件(此处只展示和图像处理相关的文件) │ ├── lite_cv # 图像处理库头文件 │ ├── image_process.h # 图像处理函数头文件 diff --git a/tutorials/lite/source_zh_cn/use/converter_tool.md b/tutorials/lite/source_zh_cn/use/converter_tool.md index 157d3e7758bcf0dba4f975f13bc96b71675e387d..57c8002b60e2956aa2ffc0dffbf6a91696a18e32 100644 --- a/tutorials/lite/source_zh_cn/use/converter_tool.md +++ b/tutorials/lite/source_zh_cn/use/converter_tool.md @@ -36,9 +36,11 @@ MindSpore Lite提供离线转换模型功能的工具,支持多种类型的模 ### 使用示例 首先,在源码根目录下,输入命令进行编译,可参考`build.md`。 + ```bash bash build.sh -I x86_64 ``` + > 目前模型转换工具仅支持x86_64架构。 下面选取了几个常用示例,说明转换命令的使用方法。 @@ -52,43 +54,51 @@ bash build.sh -I x86_64 本例中,因为采用了Caffe模型,所以需要模型结构、模型权值两个输入文件。再加上其他必需的fmk类型和输出路径两个参数,即可成功执行。 结果显示为: - ``` + + ```text CONVERTER RESULT SUCCESS:0 ``` + 这表示已经成功将Caffe模型转化为MindSpore Lite模型,获得新文件`lenet.ms`。 - + - 以MindSpore、TensorFlow Lite、ONNX模型格式和感知量化模型为例,执行转换命令。 - - MindSpore模型`model.mindir` + - MindSpore模型`model.mindir` + ```bash ./converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model ``` - - - TensorFlow Lite模型`model.tflite` + 
+ - TensorFlow Lite模型`model.tflite` + ```bash ./converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model ``` - - - ONNX模型`model.onnx` + + - ONNX模型`model.onnx` + ```bash ./converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model ``` - - TensorFlow Lite感知量化模型`model_quant.tflite` + - TensorFlow Lite感知量化模型`model_quant.tflite` + ```bash ./converter_lite --fmk=TFLITE --modelFile=model_quant.tflite --outputFile=model --quantType=AwareTraining ``` - - 感知量化模型输入输出类型设置为float - + - 感知量化模型输入输出类型设置为float + ```bash ./converter_lite --fmk=TFLITE --modelFile=model_quant.tflite --outputFile=model --quantType=AwareTraining --inferenceType=FLOAT ``` + 以上几种情况下,均显示如下转换成功提示,且同时获得`model.ms`目标文件。 - - ``` + + ```text CONVERTER RESULT SUCCESS:0 ``` + - 如果转换命令执行失败,程序会返回一个[错误码](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/errorcode_and_metatype.html)。 > 训练后量化示例请参考。 @@ -107,13 +117,12 @@ MindSpore Lite模型转换工具提供了多种参数设置,用户可根据需 | `--outputFile=` | 是 | 输出模型的路径(不存在时将自动创建目录),不需加后缀,可自动生成`.ms`后缀。 | - | - | | `--weightFile=` | 转换Caffe模型时必选 | 输入模型weight文件的路径。 | - | - | | `--quantType=` | 否 | 设置模型的量化类型。 | WeightQuant:训练后量化(权重量化)
PostTraining:训练后量化(全量化)
AwareTraining:感知量化 | - | -|` --inferenceType=` | 否 | 设置感知量化模型输入输出数据类型,如果和原模型不一致则转换工具会在模型前后插转换算子,使得转换后的模型输入输出类型和inferenceType保持一致。 | UINT8、FLOAT、INT8 | FLOAT | +|`--inferenceType=` | 否 | 设置感知量化模型输入输出数据类型,如果和原模型不一致则转换工具会在模型前后插转换算子,使得转换后的模型输入输出类型和inferenceType保持一致。 | UINT8、FLOAT、INT8 | FLOAT | | `--bitNum=` | 否 | 设定训练后量化(权重量化)的比特数,目前仅支持8bit量化 | 8 | 8 | | `--quantWeightSize=` | 否 | 设定参与训练后量化(权重量化)的卷积核尺寸阈值,若卷积核尺寸大于该值,则对此权重进行量化 | (0,+∞) | 0 | | `--quantWeightChannel=` | 否 | 设定参与训练后量化(权重量化)的卷积通道数阈值,若卷积通道数大于该值,则对此权重进行量化 | (0,+∞) | 16 | | `--configFile=` | 否 | 训练后量化(全量化)校准数据集配置文件路径 | - | - | - > - 参数名和参数值之间用等号连接,中间不能有空格。 > - Caffe模型一般分为两个文件:`*.prototxt`模型结构,对应`--modelFile`参数;`*.caffemodel`模型权值,对应`--weightFile`参数。 @@ -132,9 +141,11 @@ MindSpore Lite模型转换工具提供了多种参数设置,用户可根据需 ### 使用示例 设置日志打印级别为INFO。 + ```bash set GLOG_v=1 ``` + > 日志级别:0代表DEBUG,1代表INFO,2代表WARNING,3代表ERROR。 下面选取了几个常用示例,说明转换命令的使用方法。 @@ -148,35 +159,43 @@ set GLOG_v=1 本例中,因为采用了Caffe模型,所以需要模型结构、模型权值两个输入文件。再加上其他必需的fmk类型和输出路径两个参数,即可成功执行。 结果显示为: - ``` + + ```text CONVERTER RESULT SUCCESS:0 ``` + 这表示已经成功将Caffe模型转化为MindSpore Lite模型,获得新文件`lenet.ms`。 - + - 以MindSpore、TensorFlow Lite、ONNX模型格式和感知量化模型为例,执行转换命令。 - - MindSpore模型`model.mindir` + - MindSpore模型`model.mindir` + ```bash call converter_lite --fmk=MINDIR --modelFile=model.mindir --outputFile=model ``` - - - TensorFlow Lite模型`model.tflite` + + - TensorFlow Lite模型`model.tflite` + ```bash call converter_lite --fmk=TFLITE --modelFile=model.tflite --outputFile=model ``` - - - ONNX模型`model.onnx` + + - ONNX模型`model.onnx` + ```bash call converter_lite --fmk=ONNX --modelFile=model.onnx --outputFile=model ``` - - TensorFlow Lite感知量化模型`model_quant.tflite` + - TensorFlow Lite感知量化模型`model_quant.tflite` + ```bash call converter_lite --fmk=TFLITE --modelFile=model_quant.tflite --outputFile=model --quantType=AwareTraining ``` 以上几种情况下,均显示如下转换成功提示,且同时获得`model.ms`目标文件。 - ``` + + ```text CONVERTER RESULT SUCCESS:0 - ``` + ``` + - 
如果转换命令执行失败,程序会返回一个[错误码](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/errorcode_and_metatype.html)。 diff --git a/tutorials/lite/source_zh_cn/use/image_processing.md b/tutorials/lite/source_zh_cn/use/image_processing.md index c99badd2254a8a72b23c407c3bce24c80ac75350..b328bb056ad6a6671fd56fb793741968e6e4ae26 100644 --- a/tutorials/lite/source_zh_cn/use/image_processing.md +++ b/tutorials/lite/source_zh_cn/use/image_processing.md @@ -7,15 +7,15 @@ - [导入图像预处理函数的库](#导入图像预处理函数的库) - [对图像进行初始化](#对图像进行初始化) - [使用示例](#使用示例) - - [可选的图像预处理算子](#可选的图像预处理算子) + - [可选的图像预处理算子](#可选的图像预处理算子) - [对图像进行缩放操作](#对图像进行缩放操作) - - [使用示例](#使用示例-1) + - [使用示例](#使用示例-1) - [对图像数据类型进行转换](#对图像数据类型进行转换) - - [使用示例](#使用示例-2) + - [使用示例](#使用示例-2) - [对图像数据进行裁剪](#对图像数据进行裁剪) - - [使用示例](#使用示例-3) + - [使用示例](#使用示例-3) - [对图像数据进行归一化处理](#对图像数据进行归一化处理) - - [使用示例](#使用示例-4) + - [使用示例](#使用示例-4) @@ -29,7 +29,7 @@ ## 导入图像预处理函数的库 -``` +```cpp #include "lite_cv/lite_mat.h" #include "lite_cv/image_process.h" ``` @@ -38,13 +38,13 @@ 这边使用的是`image_process.h`文件中的[InitFromPixel](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/dataset.html#initfrompixel)函数对图像进行初始化操作。 -``` +```cpp bool InitFromPixel(const unsigned char *data, LPixelType pixel_type, LDataType data_type, int w, int h, LiteMat &m) ``` ### 使用示例 -``` +```cpp // Create the data object of the LiteMat object. LiteMat lite_mat_bgr; @@ -61,13 +61,13 @@ InitFromPixel(pixel_ptr, LPixelType::RGBA2GRAY, LDataType::UINT8, rgba_mat.cols, 这边利用的是`image_process.h`中的[ResizeBilinear](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/dataset.html#resizebilinear)函数通过双线性算法调整图像大小,当前仅支持的数据类型为uint8,当前支持的通道为3和1。 -``` +```cpp bool ResizeBilinear(const LiteMat &src, LiteMat &dst, int dst_w, int dst_h) ``` #### 使用示例 -``` +```cpp // Initialize the image data. 
LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -83,13 +83,13 @@ ResizeBilinear(lite_mat_bgr, lite_mat_resize, 256, 256); 这边利用的是`image_process.h`中的[ConvertTo](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/dataset.html#convertto)函数对图像数据类型进行转换,目前支持的转换是将uint8转换为float。 -``` +```cpp bool ConvertTo(const LiteMat &src, LiteMat &dst, double scale = 1.0) ``` #### 使用示例 -``` +```cpp // Initialize the image data. LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -105,13 +105,13 @@ ConvertTo(lite_mat_bgr, lite_mat_convert_float); 这边利用的是`image_process.h`中的[Crop](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/dataset.html#crop)函数对图像进行裁剪,目前支持通道3和1。 -``` +```cpp bool Crop(const LiteMat &src, LiteMat &dst, int x, int y, int w, int h) ``` #### 使用示例 -``` +```cpp // Initialize the image data. LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -127,13 +127,13 @@ Crop(lite_mat_bgr, lite_mat_cut, 16, 16, 224, 224); 为了消除数据指标之间的量纲影响,通过标准化处理来解决数据指标之间的可比性问题,这边利用的是`image_process.h`中的[SubStractMeanNormalize](https://www.mindspore.cn/doc/api_cpp/zh-CN/master/dataset.html#substractmeannormalize)函数对图像数据进行归一化处理。 -``` +```cpp bool SubStractMeanNormalize(const LiteMat &src, LiteMat &dst, const std::vector &mean, const std::vector &std) ``` #### 使用示例 -``` +```cpp // Initialize the image data. LiteMat lite_mat_bgr; InitFromPixel(rgba_mat.data, LPixelType::RGBA2BGR, LDataType::UINT8, rgba_mat.cols, rgba_mat.rows, lite_mat_bgr); @@ -148,4 +148,4 @@ LiteMat lite_mat_bgr_norm; // The image data is normalized by the mean value and variance of the image data. 
SubStractMeanNormalize(lite_mat_bgr, lite_mat_bgr_norm, means, stds); -``` \ No newline at end of file +``` diff --git a/tutorials/lite/source_zh_cn/use/post_training_quantization.md b/tutorials/lite/source_zh_cn/use/post_training_quantization.md index a1b9875ab15093b07d5f65a9a0fcb8f224bbd7b0..1e10f8ab39a6ba4d677a7f49d2790b3f72f3f8da 100644 --- a/tutorials/lite/source_zh_cn/use/post_training_quantization.md +++ b/tutorials/lite/source_zh_cn/use/post_training_quantization.md @@ -23,6 +23,7 @@ 目前训练后量化属于alpha阶段(支持部分网络,不支持多输入模型),正在持续完善中。 MindSpore Lite训练后量化分为两类: + 1. 权重量化:单独对模型的权值进行量化; 2. 全量化:对模型的权值、激活值、bias值统一进行量化。 @@ -35,13 +36,15 @@ MindSpore Lite训练后量化分为两类: ### 参数说明 权重量化转换命令的一般形式为: -``` + +```bash ./converter_lite --fmk=ModelType --modelFile=ModelFilePath --outputFile=ConvertedModelPath --quantType=WeightQuant --bitNum=BitNumValue --quantSize=QuantizationSizeThresholdValue --convWeightQuantChannelThreshold=ConvWeightQuantChannelThresholdValue ``` + 下面对此命令的量化相关参数进行说明: | 参数 | 属性 | 功能描述 | 参数类型 | 默认值 | 取值范围 | -| -------- | ------- | ----- | ----- |----- | ----- | +| -------- | ------- | ----- | ----- |----- | ----- | | `--quantType=` | 必选 | 设置为WeightQuant,启用权重量化 | String | - | 必须设置为WeightQuant | | `--bitNum=` | 可选 | 设定权重量化的比特数,目前仅支持8bit量化 | Integer | 8 | 8 | | `--quantSize=` | 可选 | 设定参与权重量化的卷积核尺寸阈值,若卷积核尺寸大于该值,则对此权重进行量化;建议设置为500 | Integer | 0 | (0,+∞) | @@ -49,21 +52,22 @@ MindSpore Lite训练后量化分为两类: 用户可根据模型及自身需要对权重量化的参数作出调整。 - ### 使用步骤 1. 正确编译出`converter_lite`可执行文件。该部分可参考构建文档[编译MindSpore Lite](https://www.mindspore.cn/tutorial/lite/zh-CN/master/use/build.html),获得`converter_lite`工具,并配置环境变量。 2. 以TensorFlow Lite模型为例,执行权重量化模型转换命令: - ``` + + ```bash ./converter_lite --fmk=TFLITE --modelFile=Inception_v3.tflite --outputFile=Inception_v3.tflite --quantType=WeightQuant --bitNum=8 --quantSize=0 --convWeightQuantChannelThreshold=0 ``` + 3. 
上述命令执行成功后,便可得到量化后的模型`Inception_v3.tflite.ms`,量化后的模型大小通常会下降到FP32模型的1/4。 ### 部分模型精度结果 - | 模型 | 测试数据集 | FP32模型精度 | 权重量化精度 | - | -------- | ------- | ----- | ----- | - | [Inception_V3](https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v3_2018_04_27.tgz) | [ImageNet](http://image-net.org/) | 77.92% | 77.84% | + | 模型 | 测试数据集 | FP32模型精度 | 权重量化精度 | + | -------- | ------- | ----- | ----- | + | [Inception_V3](https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v3_2018_04_27.tgz) | [ImageNet](http://image-net.org/) | 77.92% | 77.84% | | [Mobilenet_V1_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz) | [ImageNet](http://image-net.org/) | 70.96% | 70.56% | > 以上所有结果均在x86环境上测得。 @@ -75,13 +79,15 @@ MindSpore Lite训练后量化分为两类: ### 参数说明 全量化转换命令的一般形式为: -``` + +```bash ./converter_lite --fmk=ModelType --modelFile=ModelFilePath --outputFile=ConvertedModelPath --quantType=PostTraining --config_file=config.cfg ``` + 下面对此命令的量化相关参数进行说明: | 参数 | 属性 | 功能描述 | 参数类型 | 默认值 | 取值范围 | -| -------- | ------- | ----- | ----- |----- | ----- | +| -------- | ------- | ----- | ----- |----- | ----- | | `--quantType=` | 必选 | 设置为PostTraining,启用全量化 | String | - | 必须设置为PostTraining | | `--config_file=` | 必选 | 校准数据集配置文件路径 | String | - | - | @@ -99,17 +105,21 @@ MindSpore Lite训练后量化分为两类: 1. 正确编译出`converter_lite`可执行文件。 2. 准备校准数据集,假设存放在`/dir/images`目录,编写配置文件`config.cfg`,内容如下: - ``` + + ```python image_path=/dir/images batch_count=100 method_x=MAX_MIN thread_num=1 ``` + 校准数据集可以选择测试数据集的子集,要求`/dir/images`目录下存放的每个文件均是预处理好的输入数据,每个文件都可以直接用于推理的输入。 3. 以MindSpore模型为例,执行全量化的模型转换命令: - ``` + + ```bash ./converter_lite --fmk=MINDIR --modelFile=lenet.mindir --outputFile=lenet_quant --quantType=PostTraining --config_file=config.cfg ``` + 4. 
上述命令执行成功后,便可得到量化后的模型`lenet_quant.ms`,通常量化后的模型大小会下降到FP32模型的1/4。 ### 部分模型精度结果 diff --git a/tutorials/lite/source_zh_cn/use/runtime_cpp.md b/tutorials/lite/source_zh_cn/use/runtime_cpp.md index 029b1c99f90ad9d38ccdbc140d3bbe2aaaa5bccd..70753fd478b55e437317d5a735e4283618231126 100644 --- a/tutorials/lite/source_zh_cn/use/runtime_cpp.md +++ b/tutorials/lite/source_zh_cn/use/runtime_cpp.md @@ -46,6 +46,7 @@ Runtime总体使用流程如下图所示: ![img](../images/side_infer_process.png) 包含的组件及功能如下所述: + - `Model`:MindSpore Lite使用的模型,通过用户构图或直接加载网络,来实例化算子原型的列表。 - `Lite Session`:提供图编译的功能,并调用图执行器进行推理。 - `Scheduler`:算子异构调度器,根据异构调度策略,为每一个算子选择合适的kernel,构造kernel list,并切分子图。 @@ -132,6 +133,7 @@ if (session2 == nullptr) { ### 使用示例 下面代码演示如何对MindSpore Lite的输入进行Resize: + ```cpp // Assume we have created a LiteSession instance named session. auto inputs = session->GetInputs(); @@ -160,6 +162,7 @@ virtual int CompileGraph(lite::Model *model) = 0; ### 使用示例 下面代码演示如何进行图编译: + ```cpp // Assume we have created a LiteSession instance named session and a Model instance named model before. // The methods of creating model and session can refer to "Import Model" and "Create Session" two sections. @@ -249,6 +252,7 @@ memcpy(in_data, input_buf, data_size); ``` 需要注意的是: + - MindSpore Lite的模型输入Tensor中的数据排布必须是NHWC。 - 模型的输入`input_buf`是用户从磁盘读取的,当拷贝给模型输入Tensor以后,用户需要自行释放`input_buf`。 - `GetInputs`和`GetInputsByTensorName`方法返回的vector不需要用户释放。 @@ -296,6 +300,7 @@ session->BindThread(false); ### 回调运行 Mindspore Lite可以在调用`RunGraph`时,传入两个`KernelCallBack`函数指针来回调推理模型,相比于一般的图执行,回调运行可以在运行过程中获取额外的信息,帮助开发者进行性能分析、Bug调试等。额外的信息包括: + - 当前运行的节点名称 - 推理当前节点前的输入输出Tensor - 推理当前节点后的输入输出Tensor @@ -377,6 +382,7 @@ delete (model); MindSpore Lite在执行完推理后,就可以获取模型的推理结果。 MindSpore Lite提供四种方法来获取模型的输出`MSTensor`。 + 1. 
使用`GetOutputsByNodeName`方法,根据模型输出节点的名称来获取模型输出`MSTensor`中连接到该节点的Tensor的vector。 ```cpp @@ -501,22 +507,26 @@ for (auto tensor_name : tensor_names) { ``` ## 获取版本号 + MindSpore Lite提供了`Version`方法可以获取版本号,包含在`include/version.h`头文件中,调用该方法可以得到版本号字符串。 ### 使用示例 下面代码演示如何获取MindSpore Lite的版本号: + ```cpp #include "include/version.h" std::string version = mindspore::lite::Version(); ``` ## Session并行 + MindSpore Lite支持多个`LiteSession`并行推理,但不支持多个线程同时调用单个`LiteSession`的`RunGraph`接口。 ### 单Session并行 MindSpore Lite不支持多线程并行执行单个`LiteSession`的推理,否则会得到以下错误信息: + ```cpp ERROR [mindspore/lite/src/lite_session.cc:297] RunGraph] 10 Not support multi-threading ``` @@ -528,6 +538,7 @@ MindSpore Lite支持多个`LiteSession`同时进行推理的场景,每个`Lite ### 使用示例 下面代码演示了如何创建多个`LiteSession`,并且并行执行推理的过程: + ```cpp #include #include "src/common/file_utils.h" diff --git a/tutorials/lite/source_zh_cn/use/runtime_java.md b/tutorials/lite/source_zh_cn/use/runtime_java.md index 2ff2a9396fe359c9ff62aad385b83744f24f0f5b..99407b051b3fe395d7671626a91ddf176629f959 100644 --- a/tutorials/lite/source_zh_cn/use/runtime_java.md +++ b/tutorials/lite/source_zh_cn/use/runtime_java.md @@ -40,14 +40,14 @@ private boolean init(Context context) { Log.e("MS_LITE", "Load Model failed"); return false; } - + // Create and init config. MSConfig msConfig = new MSConfig(); if (!msConfig.init(DeviceType.DT_CPU, 2, CpuBindMode.MID_CPU)) { Log.e("MS_LITE", "Init context failed"); return false; } - + // Create the mindspore lite session. session = new LiteSession(); if (!session.init(msConfig)) { @@ -56,14 +56,14 @@ private boolean init(Context context) { return false; } msConfig.free(); - + // Complile graph. if (!session.compileGraph(model)) { Log.e("MS_LITE", "Compile graph failed"); model.freeBuffer(); return false; } - + // Note: when use model.freeBuffer(), the model can not be complile graph again. 
model.freeBuffer(); @@ -79,7 +79,7 @@ private void DoInference(Context context) { } byte[] inData = readFileFromAssets(context, "model_inputs.bin"); inTensor.setData(inData); - + // Run graph to infer results. if (!session.runGraph()) { Log.e("MS_LITE", "Run graph failed"); @@ -97,7 +97,7 @@ private void DoInference(Context context) { return; } float[] results = output.getFloatData(); - + // Apply infer results. …… } @@ -109,4 +109,3 @@ private void free() { model.free(); } ``` - diff --git a/tutorials/notebook/README.md b/tutorials/notebook/README.md index cd78a0dd6a7d9ae6f854b071a45931355058c124..555baf04c3dc68bde90231181461c3ad2254fefe 100644 --- a/tutorials/notebook/README.md +++ b/tutorials/notebook/README.md @@ -1,6 +1,7 @@ # MindSpore Tutorial Experience ## Environment setup + ### Configuration on Windows and Linux - OS version: Windows 10, Ubuntu 16.04 or later @@ -11,17 +12,18 @@ - MindSpore download: [MindSpore official download page](https://www.mindspore.cn/versions). Windows users select the Windows-X86 version, and Linux users select the Ubuntu-X86 version. -> MindSpore [installation guide](https://www.mindspore.cn/install/) - +> MindSpore [installation guide](https://www.mindspore.cn/install/) ### Configuring Jupyter Notebook to switch conda environments (Kernel Change) - First, add the conda environment switching feature (Kernel Change) to Jupyter Notebook. Start Anaconda Prompt and enter the command: - ``` + + ```bash conda install nb_conda ``` + > It is recommended to run the above command in the base environment. After the command completes, restart Jupyter Notebook to finish adding the feature. @@ -29,19 +31,25 @@ - Then, add the conda environment to Kernel Change in Jupyter Notebook. 1. Create a conda environment: start Anaconda Prompt and enter the command: - ``` + + ```bash conda create -n {env_name} python=3.7.5 ``` + > Choose any name you like for env_name. 2. Activate the new environment by entering the command: - ``` + + ```bash conda activate {env_name} ``` + 3.
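Install ipykernel by entering the command: - ``` + + ```bash conda install -n {env_name} ipykernel ``` + > To add an existing environment, you only need to install ipykernel. After the command completes, refresh the Jupyter Notebook page, open the Kernel drop-down menu, and select Kernel Change to choose the newly added conda environment. @@ -49,20 +57,19 @@ ## Notebook description | Tutorial name | File name | Tutorial category | Description -| :----------- | :----------- | :------- |:------ +| :----------- | :----------- | :------- |:------ | Getting started with handwritten digit classification | [quick_start.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/quick_start.ipynb) | Quick start | - Walk through the whole process from dataset to model validation on the CPU platform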
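- Usage instructions for each functional module in the tutorial
- Graphical display of the dataset
- Understand the structure of LeNet5 and the role of its parameters
- Learn to use custom callback functions
- Plot of loss against training steps
- Plot of model accuracy against training steps
- Apply the model to predict and classify handwritten images | Linear fitting | [linear_regression.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/linear_regression.ipynb) | Quick start | - Understand the algorithm behind linear fitting
- Understand how linear fitting is implemented in MindSpore
- Learn to implement forward propagation and backpropagation in AI training with MindSpore
- Visualize the whole process of fitting a linear function to data. -| Loading datasets | [loading_dataset.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/loading_dataset.ipynb) | User guide | - Learn how to load datasets in MindSpore
- Show how to load commonly used datasets
- Show how to load datasets in MindRecord format
- Show how to load datasets in custom formats +| Loading datasets | [loading_dataset.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/loading_dataset.ipynb) | User guide | - Learn how to load datasets in MindSpore
- Show how to load commonly used datasets
- Show how to load datasets in MindRecord format
- Show how to load datasets in custom formats | Converting datasets to the MindSpore data format | [convert_dataset_to_mindspore_data_format.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/convert_dataset_to_mindspore_data_format/convert_dataset_to_mindspore_data_format.ipynb) | User guide | - Show how to convert the MNIST dataset to the MindSpore data format
- Show how to convert a CSV dataset to the MindSpore data format
- Show how to convert the CIFAR-10 dataset to the MindSpore data format
- Show how to convert the CIFAR-100 dataset to the MindSpore data format
- Show how to convert the ImageNet dataset to the MindSpore data format
- Show how to generate user-defined data in the MindSpore data format | Data processing and augmentation | [data_loading_enhancement.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/data_loading_enhance/data_loading_enhancement.ipynb) | User guide | - Learn the data processing and augmentation methods in MindSpore
- Demonstrate data processing and augmentation in practice
- Compare the data before and after processing
- Explain the value of data processing and augmentation -| Natural language processing application | [nlp_application.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/nlp_application.ipynb) | Application practice | - Show how MindSpore is applied to natural language processing
- Show dataset-specific preprocessing methods for natural language processing
- Show how to define the LSTM-based SentimentNet network +| Natural language processing application | [nlp_application.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/nlp_application.ipynb) | Application practice | - Show how MindSpore is applied to natural language processing
- Show dataset-specific preprocessing methods for natural language processing
- Show how to define the LSTM-based SentimentNet network | Computer vision application | [computer_vision_application.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/computer_vision_application.ipynb) | Application practice | - Learn how MindSpore convolutional neural networks are applied to computer vision
- Learn to download the CIFAR-10 dataset and set up the runtime environment
- Learn to build a convolutional neural network with ResNet-50
- Learn to build the optimizer and loss function with Momentum and SoftmaxCrossEntropyWithLogits
- Learn to tune parameters, train the model, and assess its accuracy -| Evaluating the model during training | [evaluate_the_model_during_training.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/evaluate_the_model_during_training.ipynb) | Application practice | - Understand how to run model training and validation together
- Learn how to set parameters for simultaneous training and validation
- Use plotting functions to pick the best model from the saved models +| Evaluating the model during training | [evaluate_the_model_during_training.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/evaluate_the_model_during_training.ipynb) | Application practice | - Understand how to run model training and validation together
- Learn how to set parameters for simultaneous training and validation
- Use plotting functions to pick the best model from the saved models | Optimizing data preparation performance | [optimize_the_performance_of_data_preparation.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/optimize_the_performance_of_data_preparation/optimize_the_performance_of_data_preparation.ipynb) | Application practice | - Data loading performance optimization
- Shuffle performance optimization
- Data augmentation performance optimization
- Summary of performance optimization approaches | Debugging neural network training in PyNative mode | [debugging_in_pynative_mode.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/debugging_in_pynative_mode.ipynb) | Model tuning | - Walk through how the data changes during a single training step on one sample from the dataset on the GPU platform
- Understand debugging methods in PyNative mode
- Graphical display of how image data changes during training
- Understand how to build the weight gradient computation function
- Show how the weights change within one step and display the data | Custom debugging information | [custom_debugging_info.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/custom_debugging_info.ipynb) | Model tuning | - Understand the custom debugging operators in MindSpore
- Learn to use the custom debugging operator Callback to set up timed training
- Learn to configure metrics operators to output the corresponding model accuracy information
- Learn to set log environment variables to control glog output | Lineage analysis and comparison analysis in MindInsight | [lineage_and_scalars_comparision.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mindinsight/lineage_and_scalars_comparision.ipynb) | Model tuning | - Understand how training data is collected and displayed in MindSpore
- Learn to collect data with the SummaryCollector callback
- Visualize data with MindInsight
- Understand how to use data lineage and model lineage
- Understand how to use comparison analysis -| Computational graph and data graph visualization | [calculate_and_datagraphic.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mindinsight/calculate_and_datagraphic.ipynb) | Model tuning | - Understand the new visualization features in MindSpore
- Learn to use the MindInsight visualization dashboard
- Learn how to view the information in the computational graph visualization
- Learn how to view the information shown in the data graph -| Scalar, histogram, image, and tensor visualization | [mindinsight_image_histogram_scalar_tensor.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mindinsight/mindinsight_image_histogram_scalar_tensor.ipynb) | Model tuning | - Understand the complete process of MindSpore deep learning and MindInsight visualization
- Learn to visualize scalar, histogram, image, and tensor information from training with MindInsight
- Learn to record scalar, histogram, image, and tensor information with Summary operators
- Learn how to record and visualize scalar, histogram, image, and tensor information separately -| Mixed precision | [mixed_precision.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mixed_precision.ipynb) | Performance optimization | - Understand the principles of mixed precision training
- Learn to use mixed precision training in MindSpore
- Compare the effects of single precision and mixed precision training on model training +| MindInsight training dashboard | [mindinsight_dashboard.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mindinsight/mindinsight_dashboard.ipynb) | Model tuning | - Understand the complete process of MindSpore deep learning and MindInsight visualization
- Learn to visualize scalar, histogram, image, computational graph, data graph, and tensor information from training with MindInsight
- Learn to record scalar, histogram, image, computational graph, data graph, and tensor information with Summary operators +| Mixed precision | [mixed_precision.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mixed_precision.ipynb) | Performance optimization | - Understand the principles of mixed precision training
- Learn to use mixed precision training in MindSpore
- Compare the effects of single precision and mixed precision training on model training | Model security | [model_security.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/model_security.ipynb) | AI security and privacy | - Understand the concepts and impact of security threats to AI algorithms
- Introduce the model security protections provided by MindArmour
- Learn how to simulate attacks on a trained model
- Learn to apply adversarial defenses to the attacked model diff --git a/tutorials/notebook/mindinsight/calculate_and_datagraphic.ipynb b/tutorials/notebook/mindinsight/calculate_and_datagraphic.ipynb deleted file mode 100644 index 5640befc4b5429dcabed1b4235399f7c1c80bfe4..0000000000000000000000000000000000000000 --- a/tutorials/notebook/mindinsight/calculate_and_datagraphic.ipynb +++ /dev/null @@ -1,422 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#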
计算图和数据图可视化
\n", - "\n", - "\n", - "## 计算图与数据图概述\n", - "\n", - "计算图的生成是通过将模型训练过程中的每个计算节点关联后所构成的,初体验者可以通过查看计算图,掌握整个模型的计算走向结构,数据流以及控制流的信息。对于高阶的使用人员,能够通过计算图验证计算节点的输入输出是否正确,并验证整个计算过程是否符合预期。数据图展示的是数据预处理的过程,在MindInsight可视化面板中可查看数据处理的图,能够更加直观地查看数据预处理的每一个环节,并帮助提升模型性能。\n", - "\n", - "接下来我们用一个图片分类的项目来体验计算图与数据图的生成与使用。\n", - " \n", - "## 本次体验的整体流程\n", - "1. 体验模型的数据选择使用MNIST数据集,MNIST数据集整体数据量比较小,更适合体验使用。\n", - "\n", - "2. 初始化一个网络,本次的体验使用LeNet网络。\n", - "\n", - "3. 增加可视化功能的使用,并设定只记录计算图与数据图。\n", - "\n", - "4. 加载训练数据集并进行训练,训练完成后,查看结果并保存模型文件。\n", - "\n", - "5. 启用MindInsight的可视化图界面,进行训练过程的核对。" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 数据集来源\n", - "\n", - "方法一\n", - "\n", - "从以下网址下载,并将数据包解压后放在Jupyter的工作目录下。\n", - "\n", - "- 训练数据集:{\"\",\"\"}\n", - "- 测试数据集:{\"\",\"\"}\n", - "\n", - "可执行下面代码查看Jupyter的工作目录。" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "os.getcwd()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "- 训练数据集放在----`Jupyter工作目录+\\MNIST_Data\\train\\`,此时`train`文件夹内应该包含两个文件,`train-images-idx3-ubyte`和`train-labels-idx1-ubyte` \n", - "- 测试数据集放在----`Jupyter工作目录+\\MNIST_Data\\test\\`,此时`test`文件夹内应该包含两个文件,`t10k-images-idx3-ubyte`和`t10k-labels-idx1-ubyte`" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "方法二\n", - "\n", - "直接执行以下代码,会自动进行训练集的下载与解压,但是整个过程根据网络好坏情况会需要花费几分钟时间。" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import gzip\n", - "import urllib.request\n", - "from urllib.parse import urlparse\n", - "\n", - "\n", - "def unzip_file(gzip_path):\n", - " \"\"\"\n", - " Unzip a given gzip file.\n", - "\n", - " Args:\n", - " gzip_path (str): The gzip file path\n", - " \"\"\"\n", - " open_file = open(gzip_path.replace('.gz', ''), 'wb')\n", - " gz_file = gzip.GzipFile(gzip_path)\n", - " open_file.write(gz_file.read())\n", - " 
gz_file.close()\n", - "\n", - "\n", - "def download_dataset():\n", - " \"\"\"Download the dataset from http://yann.lecun.com/exdb/mnist/.\"\"\"\n", - " print(\"******Downloading the MNIST dataset******\")\n", - " train_path = \"./MNIST_Data/train/\"\n", - " test_path = \"./MNIST_Data/test/\"\n", - " train_path_check = os.path.exists(train_path)\n", - " test_path_check = os.path.exists(test_path)\n", - " if not train_path_check and not test_path_check:\n", - " os.makedirs(train_path)\n", - " os.makedirs(test_path)\n", - " train_url = {\"http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\",\n", - " \"http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\"}\n", - " test_url = {\"http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\",\n", - " \"http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\"}\n", - "\n", - " for url in train_url:\n", - " url_parse = urlparse(url)\n", - " # split the file name from url\n", - " file_name = os.path.join(train_path,url_parse.path.split('/')[-1])\n", - " if not os.path.exists(file_name.replace('.gz', '')):\n", - " file = urllib.request.urlretrieve(url, file_name)\n", - " unzip_file(file_name)\n", - " os.remove(file_name)\n", - " \n", - " for url in test_url:\n", - " url_parse = urlparse(url)\n", - " # split the file name from url\n", - " file_name = os.path.join(test_path,url_parse.path.split('/')[-1])\n", - " if not os.path.exists(file_name.replace('.gz', '')):\n", - " file = urllib.request.urlretrieve(url, file_name)\n", - " unzip_file(file_name)\n", - " os.remove(file_name)\n", - "\n", - "download_dataset()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 数据增强\n", - "对数据集进行数据增强操作,可以提升模型精度。\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import mindspore.dataset as ds\n", - "import mindspore.dataset.vision.c_transforms as CV\n", - "import mindspore.dataset.transforms.c_transforms as C\n", - "from 
mindspore.dataset.vision import Inter\n", - "from mindspore.common import dtype as mstype\n", - "\n", - "\n", - "def create_dataset(data_path, batch_size=32, repeat_size=1,\n", - " num_parallel_workers=1):\n", - " \"\"\"\n", - " Create dataset for train or test.\n", - "\n", - " Args:\n", - " data_path (str): The absolute path of the dataset\n", - " batch_size (int): The number of data records in each group\n", - " repeat_size (int): The number of replicated data records\n", - " num_parallel_workers (int): The number of parallel workers\n", - " \"\"\"\n", - " # define dataset\n", - " mnist_ds = ds.MnistDataset(data_path)\n", - "\n", - " # define some parameters needed for data enhancement and rough justification\n", - " resize_height, resize_width = 32, 32\n", - " rescale = 1.0 / 255.0\n", - " shift = 0.0\n", - " rescale_nml = 1 / 0.3081\n", - " shift_nml = -1 * 0.1307 / 0.3081\n", - "\n", - " # according to the parameters, generate the corresponding data enhancement method\n", - " resize_op = CV.Resize((resize_height, resize_width), interpolation=Inter.LINEAR)\n", - " rescale_nml_op = CV.Rescale(rescale_nml, shift_nml)\n", - " rescale_op = CV.Rescale(rescale, shift)\n", - " hwc2chw_op = CV.HWC2CHW()\n", - " type_cast_op = C.TypeCast(mstype.int32)\n", - "\n", - " # using map method to apply operations to a dataset\n", - " mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns=\"label\", num_parallel_workers=num_parallel_workers)\n", - " mnist_ds = mnist_ds.map(operations=resize_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - " mnist_ds = mnist_ds.map(operations=rescale_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - " mnist_ds = mnist_ds.map(operations=rescale_nml_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - " mnist_ds = mnist_ds.map(operations=hwc2chw_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - " \n", - " # process the generated 
dataset\n", - " buffer_size = 10000\n", - " mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size) # 10000 as in LeNet train script\n", - " mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)\n", - " mnist_ds = mnist_ds.repeat(repeat_size)\n", - "\n", - " return mnist_ds" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 可视化操作流程\n", - "\n", - "1. 准备训练脚本,在训练脚本中指定计算图的超参数信息,使用`Summary`保存到日志中,接着再运行训练脚本。\n", - "\n", - "2. 启动MindInsight,启动成功后,就可以通过访问命令执行后显示的地址,查看可视化界面。\n", - "\n", - "3. 访问可视化地址成功后,就可以对图界面进行查询等操作。" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 初始化网络\n", - "\n", - "1. 导入构建网络所使用的模块。\n", - "\n", - "2. 构建初始化参数的函数。\n", - "\n", - "3. 创建网络,在网络中设置参数。" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import mindspore.nn as nn\n", - "from mindspore.common.initializer import TruncatedNormal\n", - "\n", - "\n", - "def conv(in_channels, out_channels, kernel_size, stride=1, padding=0):\n", - " \"\"\"weight initial for conv layer\"\"\"\n", - " weight = weight_variable()\n", - " return nn.Conv2d(in_channels, out_channels,\n", - " kernel_size=kernel_size, stride=stride, padding=padding,\n", - " weight_init=weight, has_bias=False, pad_mode=\"valid\")\n", - "\n", - "\n", - "def fc_with_initialize(input_channels, out_channels):\n", - " \"\"\"weight initial for fc layer\"\"\"\n", - " weight = weight_variable()\n", - " bias = weight_variable()\n", - " return nn.Dense(input_channels, out_channels, weight, bias)\n", - "\n", - "\n", - "def weight_variable():\n", - " \"\"\"weight initial\"\"\"\n", - " return TruncatedNormal(0.02)\n", - "\n", - "\n", - "class LeNet5(nn.Cell):\n", - " \n", - " def __init__(self, num_class=10, channel=1):\n", - " super(LeNet5, self).__init__()\n", - " self.num_class = num_class\n", - " self.conv1 = conv(channel, 6, 5)\n", - " self.conv2 = conv(6, 16, 5)\n", - " self.fc1 = fc_with_initialize(16 * 5 * 5, 120)\n", - " 
self.fc2 = fc_with_initialize(120, 84)\n", - " self.fc3 = fc_with_initialize(84, self.num_class)\n", - " self.relu = nn.ReLU()\n", - " self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2)\n", - " self.flatten = nn.Flatten()\n", - "\n", - " def construct(self, x):\n", - " x = self.conv1(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv2(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.flatten(x)\n", - " x = self.fc1(x)\n", - " x = self.relu(x)\n", - " x = self.fc2(x)\n", - " x = self.relu(x)\n", - " x = self.fc3(x)\n", - " return x" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### 执行训练\n", - "\n", - "1. 导入所需的代码包,并示例化训练网络。\n", - "2. 通过MindSpore提供的 `SummaryCollector` 接口,实现收集计算图和数据图。在实例化 `SummaryCollector` 时,在 `collect_specified_data` 参数中,通过设置 `collect_graph` 指定收集计算图,设置 `collect_dataset_graph` 指定收集数据图。\n", - "\n", - "更多 `SummaryCollector` 的用法,请点击[API文档](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.train.html#mindspore.train.callback.SummaryCollector)查看。\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import mindspore.nn as nn\n", - "from mindspore import context\n", - "from mindspore.train import Model\n", - "from mindspore.nn.metrics import Accuracy\n", - "from mindspore.train.callback import LossMonitor, SummaryCollector\n", - "\n", - "if __name__ == \"__main__\":\n", - " device_target = \"CPU\"\n", - " \n", - " context.set_context(mode=context.GRAPH_MODE, device_target=device_target)\n", - " download_dataset()\n", - " ds_train = create_dataset(data_path=\"./MNIST_Data/train/\")\n", - "\n", - " network = LeNet5()\n", - " net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", - " net_opt = nn.Momentum(network.trainable_params(), learning_rate=0.01, momentum=0.9)\n", - " model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": 
Accuracy()})\n", - "\n", - " specified={'collect_graph': True, 'collect_dataset_graph': True}\n", - " summary_collector = SummaryCollector(summary_dir='./summary_dir', collect_specified_data=specified, collect_freq=1, keep_default_action=False)\n", - " \n", - " print(\"============== Starting Training ==============\")\n", - " model.train(epoch=2, train_dataset=ds_train, callbacks=[LossMonitor(), summary_collector], dataset_sink_mode=False)\n", - "\n", - " print(\"============== Starting Testing ==============\")\n", - " ds_eval = create_dataset(\"./MNIST_Data/test/\")\n", - " acc = model.eval(ds_eval, dataset_sink_mode=False)\n", - " print(\"============== {} ==============\".format(acc))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 启动MindInsight\n", - "- 启动MindInsigh服务命令:`mindinsigh start --summary-base-dir=/path/ --port=8080`;\n", - "- 执行完服务命令后,访问给出的地址,查看MindInsigh可视化结果。\n", - "\n", - "> 其中 /path/ 为 `SummaryCollector` 中参数 `summary_dir` 所指定的目录。\n", - "\n", - "![title](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_map.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 计算图信息\n", - "- 文本选择框:输入计算图对应的路径及文件名,显示相应的计算图,便于查找文件。\n", - "- 搜索框:可以对整体计算图的节点信息进行搜索,输入完整的节点名称,回车执行搜索,如果有该名称节点,就会呈现出来,便于查找节点。\n", - "- 缩略图:展示整体计算图的缩略情况,在面板左边查看详细图结构时,在缩略图处会有定位,显示当前查看的位置在整体计算图中的定位,实时呈现部分与整体的关系。\n", - "- 节点信息:显示当前所查看节点的信息,包括名称、类型、属性、输入和输出。便于在训练结束后,核对计算正确性时查看。\n", - "- 图例:图例中包括命名空间、聚合节点、虚拟节点、算子节点、常量节点,通过不同图形来区分。\n", - "\n", - "![title](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/cast_map.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 数据图展示\n", - "\n", - "数据图展示了数据增强中对数据进行操作的流程。\n", - "\n", - "1. 首先是从加载数据集 `mnist_ds = ds.MnistDataset(data_path)` 开始,对应数据图中 `MnistDataset`。\n", - "\n", - "2. 
下面代码为上面的 `create_dataset` 函数中作数据预处理与数据增强的相关操作。可以从数据图中清晰地看到数据处理的流程。通过查看数据图,可以帮助分析是否存在不恰当的数据处理流程。\n", - "\n", - "```\n", - "mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns=\"label\", num_parallel_workers=num_parallel_workers)\n", - "mnist_ds = mnist_ds.map(operations=resize_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - "mnist_ds = mnist_ds.map(operations=rescale_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - "mnist_ds = mnist_ds.map(operations=rescale_nml_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - "mnist_ds = mnist_ds.map(operations=hwc2chw_op, input_columns=\"image\", num_parallel_workers=num_parallel_workers)\n", - "\n", - "mnist_ds = mnist_ds.shuffle(buffer_size=buffer_size) # 10000 as in LeNet train script\n", - "mnist_ds = mnist_ds.batch(batch_size, drop_remainder=True)\n", - "mnist_ds = mnist_ds.repeat(repeat_size)\n", - "```\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![title](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/data_map.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 关闭MindInsight\n", - "\n", - "- 查看完成后,在命令行中可执行此命令 `mindinsight stop --port=8080`,关闭MindInsight。" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.6" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/tutorials/notebook/mindinsight/images/caculate_graph.png b/tutorials/notebook/mindinsight/images/caculate_graph.png new file mode 100644 index 0000000000000000000000000000000000000000..460e56104705421002cf5d6112a84322b1ace391 Binary files 
/dev/null and b/tutorials/notebook/mindinsight/images/caculate_graph.png differ diff --git a/tutorials/notebook/mindinsight/images/cast_map.png b/tutorials/notebook/mindinsight/images/cast_map.png deleted file mode 100644 index 4713ffb03f6e03c02c45d41b63fcc6bdd7d8b2ae..0000000000000000000000000000000000000000 Binary files a/tutorials/notebook/mindinsight/images/cast_map.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/images/data_function.png b/tutorials/notebook/mindinsight/images/data_function.png new file mode 100644 index 0000000000000000000000000000000000000000..123d4cefd29b84cf261ccab8c8a62f63b06aea43 Binary files /dev/null and b/tutorials/notebook/mindinsight/images/data_function.png differ diff --git a/tutorials/notebook/mindinsight/images/data_map.png b/tutorials/notebook/mindinsight/images/data_map.png deleted file mode 100644 index 841346ff6256bdc81c09eb9e8f7a32a364d838b3..0000000000000000000000000000000000000000 Binary files a/tutorials/notebook/mindinsight/images/data_map.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/images/graph_sidebar.png b/tutorials/notebook/mindinsight/images/graph_sidebar.png new file mode 100644 index 0000000000000000000000000000000000000000..4b9b6097aa62fdf426d6fc62ee1dd55f8086aeb9 Binary files /dev/null and b/tutorials/notebook/mindinsight/images/graph_sidebar.png differ diff --git a/tutorials/notebook/mindinsight/images/histogram_only.png b/tutorials/notebook/mindinsight/images/histogram_only.png deleted file mode 100644 index 70e59386856dd370782e037425e56efbe1ebf3b9..0000000000000000000000000000000000000000 Binary files a/tutorials/notebook/mindinsight/images/histogram_only.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/images/histogram_only_all.png b/tutorials/notebook/mindinsight/images/histogram_only_all.png deleted file mode 100644 index c62873dd5e6ca4cf8e06b9ee642a8bd7e6d5ac85..0000000000000000000000000000000000000000 Binary files 
a/tutorials/notebook/mindinsight/images/histogram_only_all.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/images/image_only.png b/tutorials/notebook/mindinsight/images/image_only.png deleted file mode 100644 index fdbc96981316fc3c8b977c24d3a93474f4df1cbe..0000000000000000000000000000000000000000 Binary files a/tutorials/notebook/mindinsight/images/image_only.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/images/image_panel.png b/tutorials/notebook/mindinsight/images/image_panel.png index 64963ff989d6fd743166bd8630a3feade2fa01cb..451a3fea4ecfabb224f0f1cc90ca5e5d617d9bf1 100644 Binary files a/tutorials/notebook/mindinsight/images/image_panel.png and b/tutorials/notebook/mindinsight/images/image_panel.png differ diff --git a/tutorials/notebook/mindinsight/images/mindinsight_panel.png b/tutorials/notebook/mindinsight/images/mindinsight_panel.png index 8eb80073b47556ea1759bc44b3b02b0e8f5ed022..d93921429b05195b38b491040e9abcc99fb4994d 100644 Binary files a/tutorials/notebook/mindinsight/images/mindinsight_panel.png and b/tutorials/notebook/mindinsight/images/mindinsight_panel.png differ diff --git a/tutorials/notebook/mindinsight/images/mindinsight_panel2.png b/tutorials/notebook/mindinsight/images/mindinsight_panel2.png index 7d4d858c3102b790ee14b5e706bfeb3cd6c10062..355d4ce219bb2e2766eef80928838f4a11976c46 100644 Binary files a/tutorials/notebook/mindinsight/images/mindinsight_panel2.png and b/tutorials/notebook/mindinsight/images/mindinsight_panel2.png differ diff --git a/tutorials/notebook/mindinsight/images/multi_scalars.png b/tutorials/notebook/mindinsight/images/multi_scalars.png deleted file mode 100644 index af8e1d26d94fdf5240b3f76ae070661be745c909..0000000000000000000000000000000000000000 Binary files a/tutorials/notebook/mindinsight/images/multi_scalars.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/images/scalar_panel.png b/tutorials/notebook/mindinsight/images/scalar_panel.png index 
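36d8e5a7206c71c1fc8697c465bdeee923122f84..55f49fb639f126fa823f84cd4b0e2945f85c86f9 100644 Binary files a/tutorials/notebook/mindinsight/images/scalar_panel.png and b/tutorials/notebook/mindinsight/images/scalar_panel.png differ diff --git a/tutorials/notebook/mindinsight/images/tensor.png b/tutorials/notebook/mindinsight/images/tensor.png index 26346a185960ad391ec86476e7fb8823aab289f9..581e5895ffa77280cfb48245f38709107bf8c7cb 100644 Binary files a/tutorials/notebook/mindinsight/images/tensor.png and b/tutorials/notebook/mindinsight/images/tensor.png differ diff --git a/tutorials/notebook/mindinsight/images/tensor_func.png b/tutorials/notebook/mindinsight/images/tensor_func.png index 3db1f6c55e56edd88633168296f8cbc954cd0fff..df549a8e628bdc8d2ffef9dd814e5a8cf0f3ba34 100644 Binary files a/tutorials/notebook/mindinsight/images/tensor_func.png and b/tutorials/notebook/mindinsight/images/tensor_func.png differ diff --git a/tutorials/notebook/mindinsight/images/tensor_only.png b/tutorials/notebook/mindinsight/images/tensor_only.png deleted file mode 100644 index bdd8290c573341b31b11ccc7792178f3296274d0..0000000000000000000000000000000000000000 Binary files a/tutorials/notebook/mindinsight/images/tensor_only.png and /dev/null differ diff --git a/tutorials/notebook/mindinsight/mindinsight_dashboard.ipynb b/tutorials/notebook/mindinsight/mindinsight_dashboard.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..d080d205fd157e7ebd778440ebbbd88afac39507 --- /dev/null +++ b/tutorials/notebook/mindinsight/mindinsight_dashboard.ipynb @@ -0,0 +1,846 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# MindInsight Training Dashboard\n", + "\n", + "MindSpore can record scalars, images, parameter distribution histograms, tensors, computational graphs, and data graphs from the training process into summary log files, which can then be viewed in the visual interface provided by MindInsight.\n", + "\n", + "- By viewing how specific scalar values change over the training steps, such as the loss and accuracy of each iteration, users can track the neural network throughout training and tell whether the model is overfitting or has been trained for too long. Comparing these metrics across different training runs helps debug and improve the model.\n", + "\n", + "- By viewing the image data from training, users can inspect the dataset images used at each step.\n", + "\n", +
"- The parameter distribution histogram presents the trend of a Tensor as histograms, so users can view how the weights, biases, and gradient parameters change at each training step.\n", + "\n", + "- Tensor visualization helps users directly view the Tensor values at a given training step, including weights, gradients, and activations.\n", + "\n", + "- The computational graph is built by linking every compute node in the model training process. By viewing it, users can grasp the overall computation structure of the model as well as its data flow and control flow. Advanced users can use it to verify that the inputs and outputs of compute nodes are correct and that the whole computation behaves as expected.\n", + "\n", + "- The data graph shows the data preprocessing pipeline. Viewing it in the MindInsight visualization panel gives a more direct look at each preprocessing stage and helps improve model performance.\n", + "\n", + "The rest of this notebook walks through the process.\n", + "\n", + "## Overall process\n", + "\n", + "1. Download the CIFAR-10 dataset in binary format.\n", + "2. Preprocess the data.\n", + "3. Define the AlexNet network and use summary operators in it to record data.\n", + "4. Train the network, using `SummaryCollector` to record the loss scalar, weight gradients, computational graph, and data graph parameters. Start the MindInsight service at the same time to watch the loss, parameter histograms, input images, tensors, computational graph, and data graph change in real time.\n", + "5. After training, view the loss scalar, histogram, image, tensor, computational graph, and data graph information recorded in the MindInsight dashboard.\n", + "6. Review the related notes and shut down the MindInsight service." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Preparation\n", + "\n", + "### Downloading the dataset\n", + "\n", + "This walkthrough uses the CIFAR-10 dataset in binary format; the download address is the `ds_url` value in the code below.\n", + "\n", + "The CIFAR-10 binary dataset contains 60000 32x32 color images in 10 classes, 6000 images per class, with 50000 training images and 10000 test images. The dataset is divided into 5 training batches and 1 test batch, each with 10000 images. The test batch contains exactly 1000 randomly selected images from each class. The training batches contain the remaining images in random order, so an individual training batch may contain more images from one class than another. Taken together, the training batches contain exactly 5000 images from each class.\n", + "\n", + "Run the code below to download the CIFAR-10 binary dataset into the current working directory. If the dataset has already been downloaded, it is not downloaded again." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "********Checking DataSets Path.*********\n", + "*****Downloading CIFAR-10 DataSets.*****\n", + "*********data_batch_1.bin is ok*********\n", + "*********data_batch_2.bin is ok*********\n", + "*********data_batch_3.bin is ok*********\n", + "*********data_batch_4.bin is ok*********\n", + "*********data_batch_5.bin is ok*********\n", + "**********test_batch.bin is ok**********\n", + "*Downloaded CIFAR-10 DataSets Already.**\n" + ] + } + ], + "source": [ + "import os, shutil\n", + "import urllib.request\n", + "from urllib.parse import urlparse\n", + "\n", + "\n", + "def callbackfunc(blocknum, blocksize, totalsize):\n", + " percent =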
100.0 * blocknum * blocksize / totalsize\n", + " if percent > 100:\n", + " percent = 100\n", + " print(\"downloaded {:.1f}\".format(percent), end=\"\\r\")\n", + "\n", + "def _download_dataset():\n", + " ds_url = \"https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz\"\n", + " file_base_name = urlparse(ds_url).path.split(\"/\")[-1]\n", + " file_name = os.path.join(\"./datasets\", file_base_name)\n", + " if not os.path.exists(file_name):\n", + " urllib.request.urlretrieve(ds_url, file_name, callbackfunc)\n", + " print(\"{:*^40}\".format(\"DataSets Downloaded\"))\n", + " shutil.unpack_archive(file_name, extract_dir=\"./datasets/cifar-10-binary\")\n", + "\n", + "def _copy_dataset(ds_part, dest_path):\n", + " data_source_path = \"./datasets/cifar-10-binary/cifar-10-batches-bin\"\n", + " ds_part_source_path = os.path.join(data_source_path, ds_part)\n", + " if not os.path.exists(ds_part_source_path):\n", + " _download_dataset()\n", + " shutil.copy(ds_part_source_path, dest_path)\n", + "\n", + "def download_cifar10_dataset():\n", + " ds_base_path = \"./datasets/cifar10\"\n", + " train_path = os.path.join(ds_base_path, \"train\")\n", + " test_path = os.path.join(ds_base_path, \"test\")\n", + " print(\"{:*^40}\".format(\"Checking DataSets Path.\"))\n", + " if not os.path.exists(train_path) and not os.path.exists(test_path):\n", + " os.makedirs(train_path)\n", + " os.makedirs(test_path)\n", + " print(\"{:*^40}\".format(\"Downloading CIFAR-10 DataSets.\"))\n", + " for i in range(1, 6):\n", + " train_part = \"data_batch_{}.bin\".format(i)\n", + " if not os.path.exists(os.path.join(train_path, train_part)):\n", + " _copy_dataset(train_part, train_path)\n", + " pops = train_part + \" is ok\"\n", + " print(\"{:*^40}\".format(pops))\n", + " test_part = \"test_batch.bin\"\n", + " if not os.path.exists(os.path.join(test_path, test_part)):\n", + " _copy_dataset(test_part, test_path)\n", + " print(\"{:*^40}\".format(test_part+\" is ok\"))\n", + " 
print(\"{:*^40}\".format(\"Downloaded CIFAR-10 DataSets Already.\"))\n", + "\n", + "download_cifar10_dataset()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "After the download, the prepared CIFAR-10 dataset directory (`datasets/cifar10`) has the following structure.\n", + "\n", + "```shell\n", + " $ tree datasets/cifar10\n", + " datasets/cifar10\n", + " ├── test\n", + " │   └── test_batch.bin\n", + " └── train\n", + " ├── data_batch_1.bin\n", + " ├── data_batch_2.bin\n", + " ├── data_batch_3.bin\n", + " ├── data_batch_4.bin\n", + " └── data_batch_5.bin\n", + "\n", + "```\n", + "\n", + "Where:\n", + "- `test_batch.bin` is the test dataset file.\n", + "- `data_batch_1.bin` is the training dataset file for batch 1.\n", + "- `data_batch_2.bin` is the training dataset file for batch 2.\n", + "- `data_batch_3.bin` is the training dataset file for batch 3.\n", + "- `data_batch_4.bin` is the training dataset file for batch 4.\n", + "- `data_batch_5.bin` is the training dataset file for batch 5.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Data processing\n", + "\n", + "A good dataset can effectively improve training accuracy and efficiency. Before the dataset is loaded, some processing is applied to make the data more usable and more random. The code below defines the function `create_dataset_cifar10` to perform the data processing and creates the training dataset (`ds_train`) and the test dataset (`ds_eval`).\n" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import mindspore.dataset as ds\n", + "import mindspore.dataset.transforms.c_transforms as C\n", + "import mindspore.dataset.vision.c_transforms as CV\n", + "from mindspore.common import dtype as mstype\n", + "\n", + "\n", + "def create_dataset_cifar10(data_path, batch_size=32, repeat_size=1, status=\"train\"):\n", + " \"\"\"\n", + " create dataset for train or test\n", + " \"\"\"\n", + " cifar_ds = ds.Cifar10Dataset(data_path)\n", + " rescale = 1.0 / 255.0\n", + " shift = 0.0\n", + "\n", + " resize_op = CV.Resize(size=(227, 227))\n", + " rescale_op = CV.Rescale(rescale, shift)\n", + " normalize_op = CV.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))\n", + " if status == \"train\":\n", + " random_crop_op = CV.RandomCrop([32, 32], [4, 4, 4, 4])\n", + " random_horizontal_op =
CV.RandomHorizontalFlip()\n", + " channel_swap_op = CV.HWC2CHW()\n", + " typecast_op = C.TypeCast(mstype.int32)\n", + " cifar_ds = cifar_ds.map(operations=typecast_op, input_columns=\"label\")\n", + " if status == \"train\":\n", + " cifar_ds = cifar_ds.map(operations=random_crop_op, input_columns=\"image\")\n", + " cifar_ds = cifar_ds.map(operations=random_horizontal_op, input_columns=\"image\")\n", + " cifar_ds = cifar_ds.map(operations=resize_op, input_columns=\"image\")\n", + " cifar_ds = cifar_ds.map(operations=rescale_op, input_columns=\"image\")\n", + " cifar_ds = cifar_ds.map(operations=normalize_op, input_columns=\"image\")\n", + " cifar_ds = cifar_ds.map(operations=channel_swap_op, input_columns=\"image\")\n", + "\n", + " cifar_ds = cifar_ds.shuffle(buffer_size=1000)\n", + " cifar_ds = cifar_ds.batch(batch_size, drop_remainder=True)\n", + " cifar_ds = cifar_ds.repeat(repeat_size)\n", + " return cifar_ds\n", + "\n", + "ds_train = create_dataset_cifar10(data_path=\"./datasets/cifar10/train\")\n", + "ds_eval = create_dataset_cifar10(\"./datasets/cifar10/test\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Sampling Dataset Images\n", + "\n", + "Run the following code to take the 32 images of the first `batch` in the training dataset `ds_train` created in the previous step and display them with their class names." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The 32 images with label of the first batch in ds_train are showed below:\n" + ] + }, + { + "data": { + "image/png": "\n", + "text/plain": [ + "
" + ] + }, + "metadata": { + "needs_background": "light" + }, + "output_type": "display_data" + } + ], + "source": [ + "from matplotlib import pyplot as plt\n", + "import numpy as np\n", + "\n", + "label_list = [\"airplane\", \"automobile\", \"bird\", \"cat\", \"deer\", \"dog\", \"rog\", \"horse\", \"ship\", \"truck\"]\n", + "print(\"The 32 images with label of the first batch in ds_train are showed below:\")\n", + "ds_iterator = ds_train.create_dict_iterator()\n", + "ds_iterator.get_next()\n", + "batch_1 = ds_iterator.get_next()\n", + "batch_image = batch_1[\"image\"].asnumpy()\n", + "batch_label = batch_1[\"label\"].asnumpy()\n", + "%matplotlib inline\n", + "plt.figure(dpi=144)\n", + "for i,image in enumerate(batch_image):\n", + " plt.subplot(4, 8, i+1)\n", + " plt.subplots_adjust(wspace=0.2, hspace=0.2)\n", + " image = image/np.amax(image)\n", + " image = np.clip(image, 0, 1)\n", + " image = np.transpose(image,(1,2,0))\n", + " plt.imshow(image)\n", + " num = batch_label[i]\n", + " plt.title(f\"image {i+1}\\n{label_list[num]}\", y=-0.65, fontdict={\"fontsize\":8})\n", + " plt.axis('off') \n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 使用Summary算子记录数据\n", + "\n", + "在进行训练之前,需定义神经网络模型,本流程采用AlexNet网络。\n", + "\n", + "MindSpore提供了两种方法进行记录数据,分别为:\n", + "\n", + "- 通过Summary算子记录数据。\n", + "- 通过 `SummaryCollector` 这个callback进行记录。\n", + "\n", + "下面为在AlexNet网络中使用Summary算子记录输入图像和张量数据的配置方法。\n", + "\n", + "- 使用 `ImageSummary` 记录输入图像数据。\n", + "\n", + " 1. 在 `__init__` 方法中初始化 `ImageSummary`。\n", + " \n", + " ```python\n", + " # Init ImageSummary\n", + " self.image_summary = P.ImageSummary()\n", + " ```\n", + " \n", + " 2. 在 `construct` 方法中使用 `ImageSummary` 算子记录输入图像。其中 \"Image\" 为该数据的名称,MindInsight在展示时,会将该名称展示出来以方便识别是哪个数据。\n", + " \n", + " ```python\n", + " # Record image by Summary operator\n", + " self.image_summary(\"Image\", x)\n", + " ```\n", + " \n", + "- 使用 `TensorSummary` 记录张量数据。\n", + "\n", + " 1. 
Initialize `TensorSummary` in the `__init__` method.\n", + " \n", + " ```python\n", + " # Init TensorSummary\n", + " self.tensor_summary = P.TensorSummary()\n", + " ```\n", + " \n", + " 2. Use the `TensorSummary` operator in the `construct` method to record tensor data. \"Tensor\" is the name of the data.\n", + " \n", + " ```python\n", + " # Record tensor by Summary operator\n", + " self.tensor_summary(\"Tensor\", x)\n", + " ```\n", + "\n", + "Summary operators currently supported:\n", + "\n", + "- [ScalarSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScalarSummary): records scalar data\n", + "- [TensorSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.TensorSummary): records tensor data\n", + "- [ImageSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ImageSummary): records image data\n", + "- [HistogramSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.HistogramSummary): converts tensor data into histogram data and records it\n", + "\n", + "The following code defines the AlexNet network structure." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "import mindspore.nn as nn\n", + "from mindspore.common.initializer import TruncatedNormal\n", + "from mindspore.ops import operations as P\n", + "\n", + "def conv(in_channels, out_channels, kernel_size, stride=1, padding=0, pad_mode=\"valid\"):\n", + " weight = weight_variable()\n", + " return nn.Conv2d(in_channels, out_channels,\n", + " kernel_size=kernel_size, stride=stride, padding=padding,\n", + " weight_init=weight, has_bias=False, pad_mode=pad_mode)\n", + "\n", + "def fc_with_initialize(input_channels, out_channels):\n", + " weight = weight_variable()\n", + " bias = weight_variable()\n", + " return nn.Dense(input_channels, out_channels, weight, bias)\n", + "\n", + "def weight_variable():\n", + " return TruncatedNormal(0.02)\n", + "\n", + "\n", + "class AlexNet(nn.Cell):\n", + " \"\"\"\n", + " Alexnet\n", + " \"\"\"\n", + " def __init__(self, 
num_classes=10, channel=3):\n", + " super(AlexNet, self).__init__()\n", + " self.conv1 = conv(channel, 96, 11, stride=4)\n", + " self.conv2 = conv(96, 256, 5, pad_mode=\"same\")\n", + " self.conv3 = conv(256, 384, 3, pad_mode=\"same\")\n", + " self.conv4 = conv(384, 384, 3, pad_mode=\"same\")\n", + " self.conv5 = conv(384, 256, 3, pad_mode=\"same\")\n", + " self.relu = nn.ReLU()\n", + " self.max_pool2d = P.MaxPool(ksize=3, strides=2)\n", + " self.flatten = nn.Flatten()\n", + " self.fc1 = fc_with_initialize(6*6*256, 4096)\n", + " self.fc2 = fc_with_initialize(4096, 4096)\n", + " self.fc3 = fc_with_initialize(4096, num_classes)\n", + " # Init TensorSummary\n", + " self.tensor_summary = P.TensorSummary()\n", + " # Init ImageSummary\n", + " self.image_summary = P.ImageSummary()\n", + "\n", + " def construct(self, x):\n", + " # Record image by Summary operator\n", + " self.image_summary(\"Image\", x)\n", + " x = self.conv1(x)\n", + " # Record tensor by Summary operator\n", + " self.tensor_summary(\"Tensor\", x)\n", + " x = self.relu(x)\n", + " x = self.max_pool2d(x)\n", + " x = self.conv2(x)\n", + " x = self.relu(x)\n", + " x = self.max_pool2d(x)\n", + " x = self.conv3(x)\n", + " x = self.relu(x)\n", + " x = self.conv4(x)\n", + " x = self.relu(x)\n", + " x = self.conv5(x)\n", + " x = self.relu(x)\n", + " x = self.max_pool2d(x)\n", + " x = self.flatten(x)\n", + " x = self.fc1(x)\n", + " x = self.relu(x)\n", + " x = self.fc2(x)\n", + " x = self.relu(x)\n", + " x = self.fc3(x)\n", + " return x" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Recording Data with `SummaryCollector`\n", + "\n", + "The following shows how to use `SummaryCollector` to record scalar and histogram information.\n", + "\n", + "Through its `Callback` mechanism, MindSpore provides a `Callback` named `SummaryCollector` that makes it quick and easy to collect information such as loss values, parameter weights, and gradients (for detailed usage, see [mindspore.train.callback.SummaryCollector](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.train.html#mindspore.train.callback.SummaryCollector) in the API documentation). `SummaryCollector` is used as follows:\n", + 
"`SummaryCollector` 提供 `collect_specified_data` 参数,允许用户自定义想要收集的数据。\n", + "\n", + "下面的代码展示通过 `SummaryCollector` 收集损失值以及卷积层的参数值,参数值在MindInsight中以直方图展示。\n", + "\n", + "\n", + "\n", + "\n", + "```python\n", + "specified={\"collect_metric\": True, \"histogram_regular\": \"^conv1.*|^conv2.*\",\"collect_graph\": True, \"collect_dataset_graph\": True}\n", + "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_01\", \n", + " collect_specified_data=specified, \n", + " collect_freq=1, \n", + " keep_default_action=False, \n", + " collect_tensor_freq=200)\n", + "```\n", + "\n", + "- `summary_dir`:指定日志保存的路径。\n", + "- `collect_specified_data`:指定需要记录的信息。\n", + "- `collect_freq`:指定使用`SummaryCollector`记录数据的频率。\n", + "- `keep_default_action`:指定是否除记录除指定信息外的其他数据信息。\n", + "- `collect_tensor_freq`:指定记录张量信息的频率。\n", + "- `\"collect_metric\"`为记录损失值标量信息。\n", + "- `\"histogram_regular\"`为记录`conv1`层和`conv2`层直方图信息。\n", + "- `\"collect_graph\"`为记录计算图信息。\n", + "- `\"collect_dataset_graph\"`为记录数据图信息。\n", + "\n", + "  程序运行过程中将在本地`8080`端口自动启动MindInsight服务并自动遍历读取当前notebook目录下`summary_dir`子目录下所有日志文件、解析进行可视化展示。" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 导入模块" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import mindspore.nn as nn\n", + "from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor\n", + "from mindspore.train import Model\n", + "from mindspore.nn.metrics import Accuracy\n", + "from mindspore.train.callback import SummaryCollector\n", + "from mindspore.train.serialization import load_checkpoint, load_param_into_net\n", + "from mindspore import Tensor\n", + "from mindspore import context\n", + "\n", + "device_target = \"GPU\"\n", + "context.set_context(mode=context.GRAPH_MODE, device_target=device_target)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 定义学习率\n", + "\n", + 
"以下一段代码定义学习率。" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "\n", + "\n", + "def get_lr(current_step, lr_max, total_epochs, steps_per_epoch):\n", + " \"\"\"\n", + " generate learning rate array\n", + "\n", + " Args:\n", + " current_step(int): current steps of the training\n", + " lr_max(float): max learning rate\n", + " total_epochs(int): total epoch of training\n", + " steps_per_epoch(int): steps of one epoch\n", + "\n", + " Returns:\n", + " np.array, learning rate array\n", + " \"\"\"\n", + " lr_each_step = []\n", + " total_steps = steps_per_epoch * total_epochs\n", + " decay_epoch_index = [0.8 * total_steps]\n", + " for i in range(total_steps):\n", + " if i < decay_epoch_index[0]:\n", + " lr = lr_max\n", + " else:\n", + " lr = lr_max * 0.1\n", + " lr_each_step.append(lr)\n", + " lr_each_step = np.array(lr_each_step).astype(np.float32)\n", + " learning_rate = lr_each_step[current_step:]\n", + "\n", + " return learning_rate\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 执行训练" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "============== Starting Training ==============\n", + "epoch: 1 step: 1, loss is 2.3037791\n", + "epoch: 1 step: 2, loss is 2.3127236\n", + "epoch: 1 step: 3, loss is 2.3156757\n", + "epoch: 1 step: 4, loss is 2.2910595\n", + "epoch: 1 step: 5, loss is 2.3042145\n", + "epoch: 1 step: 6, loss is 2.3150084\n", + "epoch: 1 step: 7, loss is 2.2808924\n", + "epoch: 1 step: 8, loss is 2.3073373\n", + "epoch: 1 step: 9, loss is 2.308782\n", + "epoch: 1 step: 10, loss is 2.2957213\n", + "\n", + "...\n", + "\n", + "epoch: 10 step: 1550, loss is 0.54039395\n", + "epoch: 10 step: 1551, loss is 0.25690028\n", + "epoch: 10 step: 1552, loss is 0.26572403\n", + "epoch: 10 step: 1553, loss is 0.4429163\n", + "epoch: 10 step: 
1554, loss is 0.25716054\n", + "epoch: 10 step: 1555, loss is 0.38538748\n", + "epoch: 10 step: 1556, loss is 0.12103356\n", + "epoch: 10 step: 1557, loss is 0.16565521\n", + "epoch: 10 step: 1558, loss is 0.4364005\n", + "epoch: 10 step: 1559, loss is 0.428179\n", + "epoch: 10 step: 1560, loss is 0.42687342\n", + "epoch: 10 step: 1561, loss is 0.6419081\n", + "epoch: 10 step: 1562, loss is 0.5843237\n", + "Epoch time: 115283.798, per step time: 73.805\n", + "============== Starting Testing ==============\n", + "============== {'Accuracy': 0.8302283653846154} ==============\n" + ] + } + ], + "source": [ + "network = AlexNet(num_classes=10)\n", + "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", + "lr = Tensor(get_lr(0, 0.002, 10, ds_train.get_dataset_size()))\n", + "net_opt = nn.Momentum(network.trainable_params(), learning_rate=lr, momentum=0.9)\n", + "time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())\n", + "config_ck = CheckpointConfig(save_checkpoint_steps=1562, keep_checkpoint_max=10)\n", + "ckpoint_cb = ModelCheckpoint(directory=\"./models/ckpt/mindinsight_dashboard\", prefix=\"checkpoint_alexnet\", config=config_ck)\n", + "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n", + "\n", + "summary_base_dir = \"./summary_dir\"\n", + "os.system(f\"mindinsight start --summary-base-dir {summary_base_dir} --port=8080\")\n", + "\n", + "# Init a SummaryCollector callback instance, and use it in model.train or model.eval\n", + "specified = {\"collect_metric\": True, \"histogram_regular\": \"^conv1.*|^conv2.*\", \"collect_graph\": True, \"collect_dataset_graph\": True}\n", + "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_01\", collect_specified_data=specified, collect_freq=1, keep_default_action=False, collect_tensor_freq=200)\n", + "\n", + "print(\"============== Starting Training ==============\")\n", + "model.train(epoch=10, train_dataset=ds_train, callbacks=[time_cb, 
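ckpoint_cb, LossMonitor(), summary_collector], dataset_sink_mode=True)\n", + "\n", + "print(\"============== Starting Testing ==============\")\n", + "param_dict = load_checkpoint(\"./models/ckpt/mindinsight_dashboard/checkpoint_alexnet-10_1562.ckpt\")\n", + "load_param_into_net(network, param_dict)\n", + "acc = model.eval(ds_eval, callbacks=summary_collector, dataset_sink_mode=True)\n", + "print(\"============== {} ==============\".format(acc))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## MindInsight Dashboard\n", + "\n", + "Open `127.0.0.1:8080` in a local browser to enter the visualization panel.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_panel.png)\n", + "\n", + "The panel shown above lists the `summary_01` log directory. Click **Training Dashboard** to open the training data panel shown below. It displays scalar, histogram, image, and tensor information and refreshes in real time as training and testing proceed, so changes in the training parameters can be followed as they happen.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_panel2.png)\n", + "\n", + "### Scalar Visualization\n", + "\n", + "Scalar visualization shows how scalars change during training. Click to open the training scalar panel, which records the loss value of each iteration. The figure below shows the loss trend.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/scalar_panel.png)\n", + "\n", + "The figure above shows how the loss value changes while the neural network is trained. The horizontal axis is the training step and the vertical axis is the loss value.\n", + "\n", + "The buttons in the upper right corner of the figure are, from left to right: full screen, switch Y-axis scale, enable/disable box selection, step back, and restore.\n", + "\n", + "- Full screen: displays the scalar curve in full screen; click again to restore it.\n", + "- Switch Y-axis scale: applies a logarithmic transformation to the Y axis.\n", + "- Enable/disable box selection: lets you select a region of the figure and zoom in on it; you can select again within an already enlarged figure.\n", + "- Step back: undoes the zoom operations one at a time after the same region has been selected and enlarged repeatedly.\n", + "- Restore: returns the figure to its original state after multiple box selections.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/scalar_select.png)\n", + "\n", + "The figure above shows the function area of scalar visualization, which lets you view scalar information by tag, horizontal-axis dimension, and smoothness.\n", + "\n", + "- Tag selection: supports selecting multiple tags; check the tags you need to view the corresponding scalar information.\n", + "- Horizontal axis: choose \"Step\", \"Relative Time\", or \"Absolute Time\" as the horizontal axis of the scalar curve.\n", + "- Smoothness: adjust the smoothness to smooth the scalar curve.\n", + "- Scalar synthesis: select two scalar curves and merge them into one figure, to compare them or to view the combined figure.\n", + " 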
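The function area of scalar synthesis is similar to that of scalar visualization. The difference is that scalar synthesis allows at most two tags to be selected at a time, whose curves are then merged and displayed.\n", + "\n", + "### Histogram Visualization\n", + "\n", + "\n", + "Histogram visualization displays user-specified tensors as histograms. Click to open the histogram panel, which records, as histograms, the parameter distribution of each layer during the iterations.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram_panel.png)\n", + "\n", + "The figure below shows the parameter distribution of the `conv1` layer. Click the upper right corner of the figure to enlarge it.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram.png)\n", + "\n", + "The figure below shows the histogram function area.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram_func.png)\n", + "\n", + "The histogram function area shown above includes:\n", + "\n", + "- Tag selection: supports selecting multiple tags; check the tags you need to view the corresponding histograms.\n", + "- Vertical axis: choose Step, Relative Time, or Absolute Time as the data shown on the vertical axis of the histogram.\n", + "- Viewing angle: choose Front or Top. Front views the histogram head-on, so the data of different steps overlap. Top views the histogram area from above at a 45-degree angle, which reveals the differences between steps.\n", + "\n", + "### Image Visualization\n", + "\n", + "Image visualization displays user-specified images. Click the data sampling panel to view the images processed at each step.\n", + "\n", + "The figure below shows the image information recorded in `summary_01`.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/image_panel.png)\n", + "\n", + "Drag the \"Step\" slider in the figure above to view the images of different steps.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/image_function.png)\n", + "\n", + "The figure above shows the function area of image visualization, which lets you view images by tag, brightness, and contrast.\n", + "\n", + "- Tag: supports selecting multiple tags; check the tags you need to view the corresponding images.\n", + "- Brightness adjustment: adjusts the brightness of all displayed images.\n", + "- Contrast adjustment: adjusts the contrast of all displayed images.\n", + "\n", + "### Tensor Visualization\n", + "\n", + "Tensor visualization displays tensors as tables and histograms.\n", + "\n", + "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/tensor_func.png)\n", + "\n", + "The figure above shows the function area of tensor visualization, which includes:\n", + "\n", + "- Tag selection: supports selecting multiple tags; check the tags you need to view the corresponding table data or histograms.\n", + "- View: choose Table or Histogram to display tensor data. The histogram view additionally offers the vertical axis and viewing angle options.\n", + "- Vertical axis: choose Step, Relative Time, or Absolute Time as the data shown on the vertical axis of the histogram.\n", + "- Viewing angle: choose Front or Top. Front views the histogram head-on, so the data of different steps overlap. Top views the histogram area from above at a 45-degree angle, which reveals the differences between steps.\n", + "\n", + 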
"![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/tensor.png)\n", + "\n", + "上图中将用户所记录的张量以表格的形式展示,包含以下功能:\n", + "\n", + "- 点击表格右边小方框按钮,可以将表格放大。\n", + "- 表格中白色方框显示当前展示的是哪个维度下的张量数据,其中冒号\":\"表示当前维度的所有值,可以在方框输入对应的索引或者:后按Enter键或者点击后边的打勾按钮来查询特定维度的张量数据。 假设某维度是32,则其索引范围是-32到31。注意:可以查询0维到2维的张量数据,不支持查询超过两维的张量数据,即不能设置超过两个冒号\":\"的查询条件。\n", + "- 拖拽表格下方的空心圆圈可以查询特定步骤的张量数据。\n", + "\n", + "### 计算图可视化\n", + "\n", + "点击计算图可视化用于展示计算图的图结构,数据流以及控制流的走向,支持展示summary日志文件与通过`context`的`save_graphs`参数导出的`pb`文件。\n", + "\n", + "![graph.png](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/caculate_graph.png)\n", + "\n", + "上展示了计算图的网络结构。如图中所展示的,在展示区中,选中其中一个算子(图中圈红算子),可以看到该算子有两个输入和一个输出(实线代表算子的数据流走向)。\n", + "\n", + "![graph_sidebar.png](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/graph_sidebar.png)\n", + "\n", + "上图展示了计算图可视化的功能区,包含以下内容:\n", + "\n", + "- 文件选择框:可以选择查看不同文件的计算图。\n", + "- 搜索框:可以对节点进行搜索,输入节点名称点击回车,即可展示该节点。\n", + "- 缩略图:展示整个网络图结构的缩略图,在查看超大图结构时,方便查看当前浏览的区域。\n", + "- 节点信息:展示选中的节点的基本信息,包括节点的名称、属性、输入节点、输出节点等信息。\n", + "- 图例:展示的是计算图中各个图标的含义。\n", + "\n", + "### 数据图可视化\n", + "\n", + "数据图可视化用于展示单次模型训练的数据处理和数据增强信息。\n", + "\n", + "![data_function.png](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/data_function.png)\n", + "\n", + "上图展示的数据图功能区包含以下内容:\n", + "\n", + "- 图例:展示数据溯源图中各个图标的含义。\n", + "- 数据处理流水线:展示训练所使用的数据处理流水线,可以选择图中的单个节点查看详细信息。\n", + "- 节点信息:展示选中的节点的基本信息,包括使用的数据处理和增强算子的名称、参数等。\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 单独记录损失值标量\n", + "\n", + "\n", + "为了降低性能开销和日志文件大小,可以单独记录关心的数据。单独记录标量、参数分布直方图、计算图或数据图信息,可以通过配置`specified`参数为相应的值来单独记录。单独记录图像或张量信息,可以在AlexNet网络的`construct`方法中使用`ImageSummary`算子或`TensorSummary`算子来单独记录。\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 关闭MindInsight服务\n", + "\n", + "在终端命令行中执行以下代码关闭MindInsight服务。\n", + "\n", + "```shell\n", + "mindinsight stop --port 
8080\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Notes and Specifications\n", + "1. To limit the time spent listing summary directories, MindInsight discovers at most 999 summary directories.\n", + "2. Multiple `SummaryRecord` instances cannot be used at the same time (`SummaryCollector` uses `SummaryRecord` internally).\n", + "\n", + " If two or more `SummaryCollector` instances are used in the callback list of `model.train` or `model.eval`, this counts as using `SummaryRecord` simultaneously, and data recording fails.\n", + "\n", + " If a custom callback uses `SummaryRecord`, it cannot be used together with `SummaryCollector`.\n", + "\n", + " Correct code:\n", + " ```\n", + " ...\n", + " summary_collector = SummaryCollector('./summary_dir')\n", + " model.train(2, train_dataset, callbacks=[summary_collector])\n", + "\n", + " ...\n", + " model.eval(dataset, callbacks=[summary_collector])\n", + " ```\n", + "\n", + " Incorrect code:\n", + " ```\n", + " ...\n", + " summary_collector1 = SummaryCollector('./summary_dir1')\n", + " summary_collector2 = SummaryCollector('./summary_dir2')\n", + " model.train(2, train_dataset, callbacks=[summary_collector1, summary_collector2])\n", + " ```\n", + "\n", + " Incorrect code:\n", + " ```\n", + " ...\n", + " # Note: the 'ConfusionMatrixCallback' is user-defined, and it uses SummaryRecord to record data.\n", + " confusion_callback = ConfusionMatrixCallback('./summary_dir1')\n", + " summary_collector = SummaryCollector('./summary_dir2')\n", + " model.train(2, train_dataset, callbacks=[confusion_callback, summary_collector])\n", + " ```\n", + "3. Each summary log directory should hold the data of only one training run. If one directory contains summary data from several runs, MindInsight overlays them when visualizing, and the result may not match what you expect.\n", + "4. `SummaryCollector` and `SummaryRecord` currently do not support multi-GPU runs.\n", + "5. MindSpore currently supports exporting the computational graph after operator fusion only on the Ascend 910 AI processor.\n", + "6. When Summary operators are used to collect data during training, the `HistogramSummary` operator affects performance, so use it as sparingly as possible.\n", + "7. 
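To limit memory usage, MindInsight restricts the number of tags and the number of steps:\n", + " - Each training dashboard holds at most 300 tags. The total number of scalar, image, computational graph, parameter distribution (histogram), and tensor tags must not exceed 300. In particular, each training dashboard has at most 10 computational graph tags and 6 tensor tags. When the actual number of tags exceeds this limit, the 300 most recently processed tags are kept, following MindInsight's processing order.\n", + " - Each scalar tag in a training dashboard holds data for at most 1000 steps. When the actual number of steps exceeds this limit, the data is randomly sampled to meet it.\n", + " - Each image tag in a training dashboard holds data for at most 10 steps. When the actual number of steps exceeds this limit, the data is randomly sampled to meet it.\n", + " - Each parameter distribution (histogram) tag in a training dashboard holds data for at most 50 steps. When the actual number of steps exceeds this limit, the data is randomly sampled to meet it.\n", + " - Each tensor tag in a training dashboard holds data for at most 20 steps. When the actual number of steps exceeds this limit, the data is randomly sampled to meet it.\n", + "8. Because `TensorSummary` records complete tensor data, the data volume is usually large. To limit memory usage and for performance reasons, MindInsight restricts the tensor size and the number of values returned to the front end for display:\n", + " - MindInsight can load tensors containing at most 10 million values.\n", + " - After a tensor is loaded, the table view of tensor visualization can show at most 100,000 values. If the query for the selected dimensions returns more values than this, they cannot be displayed.\n", + "\n", + "9. Because tensor visualization (`TensorSummary`) records raw tensor data, it needs a large amount of storage. Check that the system has enough storage space before using `TensorSummary` and during training.\n", + "\n", + " The storage used by tensor visualization can be reduced as follows:\n", + "\n", + " 1) Avoid recording large tensors with `TensorSummary`.\n", + "\n", + " 2) Reduce the number of `TensorSummary` operators in the network.\n", + "\n", + " After you finish using the feature, clean up training logs that are no longer needed to free disk space.\n", + "\n", + " Note: the space used by `TensorSummary` can be estimated as follows:\n", + "\n", + " `Size of one TensorSummary record = number of values in the tensor * 4 bytes`. Suppose the tensor recorded with `TensorSummary` has shape `32 * 1 * 256 * 256`; one `TensorSummary` record then needs about `32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8 MiB`. By default `TensorSummary` records data for 20 steps, so these 20 records need about `20 * 8 MiB = 160 MiB`. Note that, because of overhead such as data structures, the storage actually used is slightly more than 160 MiB.\n", + "10. 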
When `TensorSummary` is used, the training log files are large because complete tensor data is recorded, so MindInsight needs more time to parse them. Please wait patiently." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Summary\n", + "\n", + "This walkthrough covers a complete MindSpore deep learning workflow with MindInsight visualization: downloading and preprocessing the dataset; building the network, loss function, and optimizer; creating the model and running training and evaluation; and starting the MindInsight service to visualize the training process. Based on it, readers can build and train their own network models, record the data they care about with `SummaryCollector` and Summary operators, view that data in the MindInsight dashboard, and adjust parameters according to the displayed results to improve training accuracy.\n", + "\n", + "This completes the walkthrough of scalar, histogram, image, and tensor visualization. It has shown how MindSpore runs training, how MindInsight visualizes scalars, histograms, images, tensors, computational graphs, and dataset graphs, and how to use `SummaryCollector` to record these kinds of data during training." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/tutorials/notebook/mindinsight/mindinsight_image_histogram_scalar_tensor.ipynb b/tutorials/notebook/mindinsight/mindinsight_image_histogram_scalar_tensor.ipynb deleted file mode 100644 index 862c259b04a435aff6077c9ba975bffa5e091705..0000000000000000000000000000000000000000 --- a/tutorials/notebook/mindinsight/mindinsight_image_histogram_scalar_tensor.ipynb +++ /dev/null @@ -1,1285 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# 标量、直方图、图像和张量可视化\n", - "\n", - "可以通过MindSpore提供的接口将训练过程中的标量、图像和张量记录到summary日志文件中,并通过MindInsight提供的可视化界面进行查看。\n", - "\n", - "接下来是本次流程的体验过程。\n", - "\n", - "## 整体流程\n", - "\n", - "1. 下载CIFAR-10二进制格式数据集。\n", - "2. 对数据进行预处理。\n", - "3. 定义AlexNet网络,在网络中使用summary算子记录数据。\n", - "4. 训练网络,使用 `SummaryCollector` 记录损失值标量、权重梯度等参数。同时启动MindInsight服务,实时查看损失值、参数直方图、输入图像和张量的变化。\n", - "5. 完成训练后,查看MindInsight看板中记录到的损失值标量、直方图、图像信息、张量信息。\n", - "6. 分别单独记录损失值标量、直方图、图像信息和张量信息并查看可视化结果,查看损失值标量对比信息。\n", - "7. 
相关注意事项,关闭MindInsight服务。" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 准备环节\n", - "\n", - "### 下载数据集\n", - "\n", - "本次流程使用CIFAR-10二进制格式数据集,下载地址为:。\n", - "\n", - "CIFAR-10二进制格式数据集包含10个类别的60000个32x32彩色图像。每个类别6000个图像,包含50000张训练图像和10000张测试图像。数据集分为5个训练批次和1个测试批次,每个批次具有10000张图像。测试批次包含每个类别中1000个随机选择的图像,训练批次按随机顺序包含剩余图像(某个训练批次包含的一类图像可能比另一类更多)。其中,每个训练批次精确地包含对应每个类别的5000张图像。\n", - "\n", - "执行下面一段代码下载CIFAR-10二进制格式数据集到当前工作目录,如果已经下载过数据集,则不重复下载。" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "*Checking DataSets Path.*\n", - "Downloading CIFAR-10 DataSets.\n", - "data_batch_1.bin is ok\n", - "data_batch_2.bin is ok\n", - "data_batch_3.bin is ok\n", - "data_batch_4.bin is ok\n", - "data_batch_5.bin is ok\n", - "Downloaded CIFAR-10 DataSets Already.\n" - ] - } - ], - "source": [ - "import os, shutil\n", - "import urllib.request\n", - "from urllib.parse import urlparse\n", - "\n", - "\n", - "def callbackfunc(blocknum, blocksize, totalsize):\n", - " percent = 100.0 * blocknum * blocksize / totalsize\n", - " if percent > 100:\n", - " percent = 100\n", - " print(\"downloaded {:.1f}\".format(percent), end=\"\\r\")\n", - "\n", - "def _download_dataset():\n", - " ds_url = \"https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz\"\n", - " file_base_name = urlparse(ds_url).path.split(\"/\")[-1]\n", - " file_name = os.path.join(\"./DataSets\", file_base_name)\n", - " if not os.path.exists(file_name):\n", - " urllib.request.urlretrieve(ds_url, file_name, callbackfunc)\n", - " print(\"{:*^25}\".format(\"DataSets Downloaded\"))\n", - " shutil.unpack_archive(file_name, extract_dir=\"./DataSets/cifar-10-binary\")\n", - "\n", - "def _copy_dataset(ds_part, dest_path):\n", - " data_source_path = \"./DataSets/cifar-10-binary/cifar-10-batches-bin\"\n", - " ds_part_source_path = os.path.join(data_source_path, ds_part)\n", - " if not 
os.path.exists(ds_part_source_path):\n", - " _download_dataset()\n", - " shutil.copy(ds_part_source_path, dest_path)\n", - "\n", - "def download_cifar10_dataset():\n", - " ds_base_path = \"./DataSets/cifar-10-batches-bin\"\n", - " train_path = os.path.join(ds_base_path, \"train\")\n", - " test_path = os.path.join(ds_base_path, \"test\")\n", - " print(\"{:*^25}\".format(\"Checking DataSets Path.\"))\n", - " if not os.path.exists(train_path) and not os.path.exists(train_path):\n", - " os.makedirs(train_path)\n", - " os.makedirs(test_path)\n", - " print(\"{:*^25}\".format(\"Downloading CIFAR-10 DataSets.\"))\n", - " for i in range(1, 6):\n", - " train_part = \"data_batch_{}.bin\".format(i)\n", - " if not os.path.exists(os.path.join(train_path, train_part)):\n", - " _copy_dataset(train_part, train_path)\n", - " pops = train_part + \" is ok\"\n", - " print(\"{:*^20}\".format(pops))\n", - " test_part = \"test_batch.bin\"\n", - " if not os.path.exists(os.path.join(test_path, test_part)):\n", - " _copy_dataset(test_part, test_path)\n", - " print(\"{:*^20}\".format(test_part+\" is ok\"))\n", - " print(\"{:*^25}\".format(\"Downloaded CIFAR-10 DataSets Already.\"))\n", - "\n", - "download_cifar10_dataset()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "下载数据集后,CIFAR-10数据集目录(`DataSets`)结构如下所示。\n", - "\n", - "```shell\n", - " $ tree DataSets\n", - " DataSets\n", - " └── cifar-10-batches-bin\n", - " ├── test\n", - " │   └── test_batch.bin\n", - " └── train\n", - " ├── data_batch_1.bin\n", - " ├── data_batch_2.bin\n", - " ├── data_batch_3.bin\n", - " ├── data_batch_4.bin\n", - " └── data_batch_5.bin\n", - "\n", - "```\n", - "\n", - "其中:\n", - "- `test_batch.bin`文件为测试数据集文件。\n", - "- `data_batch_1.bin`文件为第1批次训练数据集文件。\n", - "- `data_batch_2.bin`文件为第2批次训练数据集文件。\n", - "- `data_batch_3.bin`文件为第3批次训练数据集文件。\n", - "- `data_batch_4.bin`文件为第4批次训练数据集文件。\n", - "- `data_batch_5.bin`文件为第5批次训练数据集文件。\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - 
"source": [ - "## 数据处理\n", - "\n", - "好的数据集可以有效提高训练精度和效率,在加载数据集前,会进行一些处理,增加数据的可用性和随机性。下面一段代码定义函数`create_dataset_cifar10`来进行数据处理操作,并创建训练数据集(`ds_train`)和测试数据集(`ds_eval`)。\n" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "import mindspore.dataset as ds\n", - "import mindspore.dataset.transforms.c_transforms as C\n", - "import mindspore.dataset.vision.c_transforms as CV\n", - "from mindspore.common import dtype as mstype\n", - "\n", - "\n", - "def create_dataset_cifar10(data_path, batch_size=32, repeat_size=1, status=\"train\"):\n", - " \"\"\"\n", - " create dataset for train or test\n", - " \"\"\"\n", - " cifar_ds = ds.Cifar10Dataset(data_path)\n", - " rescale = 1.0 / 255.0\n", - " shift = 0.0\n", - "\n", - " resize_op = CV.Resize(size=(227, 227))\n", - " rescale_op = CV.Rescale(rescale, shift)\n", - " normalize_op = CV.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))\n", - " if status == \"train\":\n", - " random_crop_op = CV.RandomCrop([32, 32], [4, 4, 4, 4])\n", - " random_horizontal_op = CV.RandomHorizontalFlip()\n", - " channel_swap_op = CV.HWC2CHW()\n", - " typecast_op = C.TypeCast(mstype.int32)\n", - " cifar_ds = cifar_ds.map(operations=typecast_op, input_columns=\"label\")\n", - " if status == \"train\":\n", - " cifar_ds = cifar_ds.map(operations=random_crop_op, input_columns=\"image\")\n", - " cifar_ds = cifar_ds.map(operations=random_horizontal_op, input_columns=\"image\")\n", - " cifar_ds = cifar_ds.map(operations=resize_op, input_columns=\"image\")\n", - " cifar_ds = cifar_ds.map(operations=rescale_op, input_columns=\"image\")\n", - " cifar_ds = cifar_ds.map(operations=normalize_op, input_columns=\"image\")\n", - " cifar_ds = cifar_ds.map(operations=channel_swap_op, input_columns=\"image\")\n", - "\n", - " cifar_ds = cifar_ds.shuffle(buffer_size=1000)\n", - " cifar_ds = cifar_ds.batch(batch_size, drop_remainder=True)\n", - " cifar_ds = cifar_ds.repeat(repeat_size)\n", - " return 
cifar_ds\n", - "\n", - "ds_train = create_dataset_cifar10(data_path=\"./DataSets/cifar-10-batches-bin/train\")\n", - "ds_eval = create_dataset_cifar10(\"./DataSets/cifar-10-batches-bin/test\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 抽取数据集图像\n", - "\n", - "执行以下一段代码,抽取上步创建好的训练数据集`ds_train`中第一个`batch`的32张图像以及对应的类别名称进行展示。" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "scrolled": true - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "The 32 images with label of the first batch in ds_train are showed below:\n" - ] - }, - { - "data": { - "image/png": "\n", - "text/plain": [ - "
" - ] - }, - "metadata": { - "needs_background": "light" - }, - "output_type": "display_data" - } - ], - "source": [ - "from matplotlib import pyplot as plt\n", - "import numpy as np\n", - "\n", - "label_list = [\"airplane\", \"automobile\", \"bird\", \"cat\", \"deer\", \"dog\", \"rog\", \"horse\", \"ship\", \"truck\"]\n", - "print(\"The 32 images with label of the first batch in ds_train are showed below:\")\n", - "ds_iterator = ds_train.create_dict_iterator()\n", - "ds_iterator.get_next()\n", - "batch_1 = ds_iterator.get_next()\n", - "batch_image = batch_1[\"image\"].asnumpy()\n", - "batch_label = batch_1[\"label\"].asnumpy()\n", - "%matplotlib inline\n", - "plt.figure(dpi=144)\n", - "for i,image in enumerate(batch_image):\n", - " plt.subplot(4, 8, i+1)\n", - " plt.subplots_adjust(wspace=0.2, hspace=0.2)\n", - " image = image/np.amax(image)\n", - " image = np.clip(image, 0, 1)\n", - " image = np.transpose(image,(1,2,0))\n", - " plt.imshow(image)\n", - " num = batch_label[i]\n", - " plt.title(f\"image {i+1}\\n{label_list[num]}\", y=-0.65, fontdict={\"fontsize\":8})\n", - " plt.axis('off') \n", - "plt.show()" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 使用Summary算子记录数据\n", - "\n", - "在进行训练之前,需定义神经网络模型,本流程采用AlexNet网络,以下一段代码中定义AlexNet网络结构。\n", - "\n", - "MindSpore提供了两种方法进行记录数据,分别为:\n", - "- 通过Summary算子记录数据\n", - "- 通过 `SummaryCollector` 这个callback进行记录\n", - "\n", - "下面展示在AlexNet网络中使用Summary算子记录输入图像和张量数据。\n", - "\n", - "- 使用 `ImageSummary` 记录输入图像数据。\n", - "\n", - " 1. 在 `__init__` 方法中初始化 `ImageSummary`。\n", - " \n", - " ```python\n", - " # Init ImageSummary\n", - " self.image_summary = P.ImageSummary()\n", - " ```\n", - " \n", - " 2. 在 `construct` 方法中使用 `ImageSummary` 算子记录输入图像。其中 \"Image\" 为该数据的名称,MindInsight在展示时,会将该名称展示出来以方便识别是哪个数据。\n", - " \n", - " ```python\n", - " # Record image by Summary operator\n", - " self.image_summary(\"Image\", x)\n", - " ```\n", - " \n", - "- 使用 `TensorSummary` 记录张量数据。\n", - "\n", - " 1. 
Initialize `TensorSummary` in the `__init__` method.\n", - " \n", - " ```python\n", - " # Init TensorSummary\n", - " self.tensor_summary = P.TensorSummary()\n", - " ```\n", - " \n", - " 2. Use the `TensorSummary` operator in the `construct` method to record tensor data. Here \"Tensor\" is the name of the data.\n", - " \n", - " ```python\n", - " # Record tensor by Summary operator\n", - " self.tensor_summary(\"Tensor\", x)\n", - " ```\n", - "\n", - "Summary operators currently supported:\n", - "\n", - "- [ScalarSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScalarSummary): records scalar data\n", - "- [TensorSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.TensorSummary): records tensor data\n", - "- [ImageSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ImageSummary): records image data\n", - "- [HistogramSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.HistogramSummary): converts tensor data to histogram data and records it" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "import mindspore.nn as nn\n", - "from mindspore.common.initializer import TruncatedNormal\n", - "from mindspore.ops import operations as P\n", - "\n", - "def conv(in_channels, out_channels, kernel_size, stride=1, padding=0, pad_mode=\"valid\"):\n", - " weight = weight_variable()\n", - " return nn.Conv2d(in_channels, out_channels,\n", - " kernel_size=kernel_size, stride=stride, padding=padding,\n", - " weight_init=weight, has_bias=False, pad_mode=pad_mode)\n", - "\n", - "def fc_with_initialize(input_channels, out_channels):\n", - " weight = weight_variable()\n", - " bias = weight_variable()\n", - " return nn.Dense(input_channels, out_channels, weight, bias)\n", - "\n", - "def weight_variable():\n", - " return TruncatedNormal(0.02)\n", - "\n", - "\n", - "class AlexNet(nn.Cell):\n", - " \"\"\"\n", - " Alexnet\n", - " \"\"\"\n", - " def __init__(self, num_classes=10, channel=3):\n", - " 
super(AlexNet, self).__init__()\n", - " self.conv1 = conv(channel, 96, 11, stride=4)\n", - " self.conv2 = conv(96, 256, 5, pad_mode=\"same\")\n", - " self.conv3 = conv(256, 384, 3, pad_mode=\"same\")\n", - " self.conv4 = conv(384, 384, 3, pad_mode=\"same\")\n", - " self.conv5 = conv(384, 256, 3, pad_mode=\"same\")\n", - " self.relu = nn.ReLU()\n", - " self.max_pool2d = P.MaxPool(ksize=3, strides=2)\n", - " self.flatten = nn.Flatten()\n", - " self.fc1 = fc_with_initialize(6*6*256, 4096)\n", - " self.fc2 = fc_with_initialize(4096, 4096)\n", - " self.fc3 = fc_with_initialize(4096, num_classes)\n", - " # Init TensorSummary\n", - " self.tensor_summary = P.TensorSummary()\n", - " # Init ImageSummary\n", - " self.image_summary = P.ImageSummary()\n", - "\n", - " def construct(self, x):\n", - " # Record image by Summary operator\n", - " self.image_summary(\"Image\", x)\n", - " x = self.conv1(x)\n", - " # Record tensor by Summary operator\n", - " self.tensor_summary(\"Tensor\", x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv2(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv3(x)\n", - " x = self.relu(x)\n", - " x = self.conv4(x)\n", - " x = self.relu(x)\n", - " x = self.conv5(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.flatten(x)\n", - " x = self.fc1(x)\n", - " x = self.relu(x)\n", - " x = self.fc2(x)\n", - " x = self.relu(x)\n", - " x = self.fc3(x)\n", - " return x" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Recording data with `SummaryCollector`\n", - "\n", - "The following shows how to use `SummaryCollector` to record scalar and histogram information.\n", - "\n", - "Through its `Callback` mechanism, MindSpore provides `SummaryCollector`, a `Callback` that quickly and easily collects information such as loss values, parameter weights, and gradients (for detailed usage, see [mindspore.train.callback.SummaryCollector](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.train.html#mindspore.train.callback.SummaryCollector) in the API documentation). `SummaryCollector` is used as follows: \n", - "\n", - "`SummaryCollector` provides the 
`collect_specified_data` parameter, which lets you customize the data to collect.\n", - "\n", - "The code below uses `SummaryCollector` to collect the loss value and the parameter values of the convolutional layers; the parameter values are displayed as histograms in MindInsight.\n", - "\n", - "\n", - "\n", - "\n", - "```python\n", - "specified={\"collect_metric\": True, \"histogram_regular\": \"^conv1.*|^conv2.*\"}\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_01\", \n", - " collect_specified_data=specified, \n", - " collect_freq=1, \n", - " keep_default_action=False, \n", - " collect_tensor_freq=200)\n", - "```\n", - "\n", - "- `summary_dir`: the path where logs are saved.\n", - "- `collect_specified_data`: the information to record.\n", - "- `collect_freq`: how often `SummaryCollector` records data.\n", - "- `keep_default_action`: whether to also record data other than the specified information.\n", - "- `collect_tensor_freq`: how often tensor information is recorded.\n", - "- `\"collect_metric\"` records the loss value as scalar information.\n", - "- `\"histogram_regular\"` records histogram information for the `conv1` and `conv2` layers.\n", - "\n", - "While the program runs, the MindInsight service is started automatically on local port `8080`. It traverses and reads all log files under the `summary_dir` subdirectory of the current notebook directory, parses them, and visualizes them." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Import modules" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import mindspore.nn as nn\n", - "from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor\n", - "from mindspore.train import Model\n", - "from mindspore.nn.metrics import Accuracy\n", - "from mindspore.train.callback import SummaryCollector\n", - "from mindspore.train.serialization import load_checkpoint, load_param_into_net\n", - "from mindspore import Tensor\n", - "from mindspore import context\n", - "\n", - "device_target = \"GPU\"\n", - "context.set_context(mode=context.GRAPH_MODE, device_target=device_target)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Define the learning rate\n", - "\n", - "The following code defines the learning rate." - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "\n", - "\n", - "def 
get_lr(current_step, lr_max, total_epochs, steps_per_epoch):\n", - " \"\"\"\n", - " generate learning rate array\n", - "\n", - " Args:\n", - " current_step(int): current step of the training\n", - " lr_max(float): max learning rate\n", - " total_epochs(int): total epochs of training\n", - " steps_per_epoch(int): steps of one epoch\n", - "\n", - " Returns:\n", - " np.array, learning rate array\n", - " \"\"\"\n", - " lr_each_step = []\n", - " total_steps = steps_per_epoch * total_epochs\n", - " decay_epoch_index = [0.8 * total_steps]\n", - " for i in range(total_steps):\n", - " if i < decay_epoch_index[0]:\n", - " lr = lr_max\n", - " else:\n", - " lr = lr_max * 0.1\n", - " lr_each_step.append(lr)\n", - " lr_each_step = np.array(lr_each_step).astype(np.float32)\n", - " learning_rate = lr_each_step[current_step:]\n", - "\n", - " return learning_rate\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Run training" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "============== Starting Training ==============\n", - "epoch: 1 step: 1, loss is 2.3056953\n", - "epoch: 1 step: 2, loss is 2.3169184\n", - "epoch: 1 step: 3, loss is 2.2732773\n", - "epoch: 1 step: 4, loss is 2.3223817\n", - "epoch: 1 step: 5, loss is 2.300379\n", - "epoch: 1 step: 6, loss is 2.2816362\n", - "epoch: 1 step: 7, loss is 2.3317387\n", - "epoch: 1 step: 8, loss is 2.2595024\n", - "epoch: 1 step: 9, loss is 2.3138928\n", - "epoch: 1 step: 10, loss is 2.294712\n", - "\n", - "...\n", - "\n", - "epoch: 10 step: 1553, loss is 0.23232733\n", - "epoch: 10 step: 1554, loss is 0.35622978\n", - "epoch: 10 step: 1555, loss is 0.24221122\n", - "epoch: 10 step: 1556, loss is 0.2082262\n", - "epoch: 10 step: 1557, loss is 0.29972154\n", - "epoch: 10 step: 1558, loss is 0.32628897\n", - "epoch: 10 step: 1559, loss is 0.44762093\n", - "epoch: 10 step: 1560, loss is 0.4621265\n", - 
"epoch: 10 step: 1561, loss is 0.13807176\n", - "epoch: 10 step: 1562, loss is 0.40322578\n", - "Epoch time: 242827.643, per step time: 155.459\n", - "============== Starting Testing ==============\n", - "============== {'Accuracy': 0.8299278846153846} ==============\n" - ] - } - ], - "source": [ - "\n", - "network = AlexNet(num_classes=10)\n", - "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", - "lr = Tensor(get_lr(0, 0.002, 10, ds_train.get_dataset_size()))\n", - "net_opt = nn.Momentum(network.trainable_params(), learning_rate=lr, momentum=0.9)\n", - "time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())\n", - "config_ck = CheckpointConfig(save_checkpoint_steps=1562, keep_checkpoint_max=10)\n", - "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_alexnet\", config=config_ck)\n", - "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n", - "\n", - "summary_base_dir = \"./summary_dir\"\n", - "os.system(f\"mindinsight start --summary-base-dir {summary_base_dir} --port=8080\")\n", - "\n", - "# Init a SummaryCollector callback instance, and use it in model.train or model.eval\n", - "specified = {\"collect_metric\": True, \"histogram_regular\": \"^conv1.*|^conv2.*\"}\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_01\", collect_specified_data=specified, collect_freq=1, keep_default_action=False, collect_tensor_freq=200)\n", - "\n", - "print(\"============== Starting Training ==============\")\n", - "model.train(epoch=10, train_dataset=ds_train, callbacks=[time_cb, ckpoint_cb, LossMonitor(), summary_collector], dataset_sink_mode=True)\n", - "\n", - "print(\"============== Starting Testing ==============\")\n", - "param_dict = load_checkpoint(\"checkpoint_alexnet-10_1562.ckpt\")\n", - "load_param_into_net(network, param_dict)\n", - "acc = model.eval(ds_eval, callbacks=summary_collector, dataset_sink_mode=True)\n", - "print(\"============== {} ==============\".format(acc))" - 
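] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## MindInsight dashboard\n", - "\n", - "Open `127.0.0.1:8080` in a local browser to enter the visualization panel.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_panel.png)\n", - "\n", - "The panel shown above lists the `summary_01` log directory. Click **Training Dashboard** to open the training data panel shown below. It displays scalar, histogram, image, and tensor information, refreshes as training and testing proceed, and shows parameter changes during training in real time.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/mindinsight_panel2.png)\n", - "\n", - "### Scalar visualization\n", - "\n", - "Scalar visualization shows how scalars change during training. Click to open the scalar panel, which records the loss value during iterative computation; the figure below shows the loss trend.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/scalar_panel.png)\n", - "\n", - "The figure above shows how the loss value changes while the neural network trains. The horizontal axis is the training step and the vertical axis is the loss value.\n", - "\n", - "The buttons in the upper right corner of the figure are, from left to right: full screen, switch Y-axis scale, enable/disable box selection, step back, and restore.\n", - "\n", - "- Full screen displays the scalar curve in full screen; click again to restore it.\n", - "- Switch Y-axis scale converts the Y-axis to a logarithmic scale.\n", - "- Enable/disable box selection lets you select a region of the figure and zoom in on it; you can box-select again on an already zoomed figure.\n", - "- Step back undoes zoom operations one at a time after you have repeatedly box-selected and zoomed the same region.\n", - "- Restore returns the figure to its original state after multiple box selections.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/scalar_select.png)\n", - "\n", - "The figure above shows the function area of scalar visualization, which lets you view scalar information by tag, horizontal-axis dimension, and smoothness.\n", - "\n", - "- Tag selection: select multiple tags; check the desired tags to view the corresponding scalar information.\n", - "- Horizontal axis: choose any one of \"Step\", \"Relative time\", and \"Absolute time\" as the horizontal axis of the scalar curve.\n", - "- Smoothness: adjust the smoothness to smooth the scalar curve.\n", - "- Scalar synthesis: select two scalar curves and merge them into one chart, which makes it easy to compare them or view the merged result.\n", - " The function area of scalar synthesis is similar to that of scalar visualization. The difference is that scalar synthesis allows at most two tags to be selected at the same time; their curves are merged and displayed.\n", - "\n", - "### Histogram visualization\n", - "\n", - "\n", - "Histograms display user-specified tensors in histogram form. Click to open the histogram panel, which records the parameter distribution of all layers during iteration as histograms.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram_panel.png)\n", - "\n", - "The figure below shows the parameter distribution of the `conv1` layer. Click the upper right corner of the figure to enlarge it.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram.png)\n", - "\n", - 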
"下图为直方图功能区。\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram_func.png)\n", - "\n", - "上图展示直方图的功能区,包含以下内容:\n", - "\n", - "- 标签选择:提供了对所有标签进行多项选择的功能,用户可以通过勾选所需的标签,查看对应的直方图。\n", - "- 纵轴:可以选择步骤、相对时间、绝对时间中的任意一项,来作为直方图纵轴显示的数据。\n", - "- 视角:可以选择正视和俯视中的一种。正视是指从正面的角度查看直方图,此时不同步骤之间的数据会覆盖在一起。俯视是指偏移以45度角俯视直方图区域,这时可以呈现不同步骤之间数据的差异。\n", - "\n", - "### 图像可视化\n", - "\n", - "图像可视化用于展示用户所指定的图片。点击图像展示面板,展示了每个一步进行处理的图像信息。\n", - "\n", - "下图为展示`summary_01`记录的图像信息。\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/image_panel.png)\n", - "\n", - "通过滑动上图中的\"步骤\"滑条,查看不同步骤的图片。\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/image_function.png)\n", - "\n", - "上图展示图像可视化的功能区,提供了选择查看不同标签,不同亮度和不同对比度来查看图片信息。\n", - "\n", - "- 标签:提供了对所有标签进行多项选择的功能,用户可以通过勾选所需的标签,查看对应的图片信息。\n", - "- 亮度调整:可以调整所展示的所有图片亮度。\n", - "- 对比度调整:可以调整所展示的所有图片对比度。\n", - "\n", - "### 张量可视化\n", - "\n", - "张量可视化用于将张量以表格以及直方图的形式进行展示。\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/tensor_func.png)\n", - "\n", - "上图展示了张量可视化的功能区,包含以下内容:\n", - "\n", - "- 标签选择:提供了对所有标签进行多项选择的功能,用户可以通过勾选所需的标签,查看对应的表格数据或者直方图。\n", - "- 视图:可以选择表格或者直方图来展示tensor数据。在直方图视图下存在纵轴和视角的功能选择。\n", - "- 纵轴:可以选择步骤、相对时间、绝对时间中的任意一项,来作为直方图纵轴显示的数据。\n", - "- 视角:可以选择正视和俯视中的一种。正视是指从正面的角度查看直方图,此时不同步骤之间的数据会覆盖在一起。俯视是指 偏移以45度角俯视直方图区域,这时可以呈现不同步骤之间数据的差异。\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/tensor.png)\n", - "\n", - "上图中将用户所记录的张量以表格的形式展示,包含以下功能:\n", - "\n", - "- 点击表格右边小方框按钮,可以将表格放大。\n", - "- 表格中白色方框显示当前展示的是哪个维度下的张量数据,其中冒号\":\"表示当前维度的所有值,可以在方框输入对应的索引或者:后按Enter键或者点击后边的打勾按钮来查询特定维度的张量数据。 假设某维度是32,则其索引范围是-32到31。注意:可以查询0维到2维的张量数据,不支持查询超过两维的张量数据,即不能设置超过两个冒号\":\"的查询条件。\n", - "- 拖拽表格下方的空心圆圈可以查询特定步骤的张量数据。\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 单独记录数据\n", - "\n", - 
"以上流程为整体展示Summary算子能记录到的所有类型数据,也可以单独记录关心的数据,以降低性能开销和日志文件大小。\n", - "\n", - "### 单独记录损失值标量\n", - "\n", - "1. 配置`specified`参数为:\n", - "\n", - "```python\n", - "specified={\"collect_metric\": True}\n", - "```\n", - "\n", - "2. 配置`summary_collector`为:\n", - "\n", - "```python\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_loss_only\", \n", - " collect_specified_data=specified, \n", - " collect_freq=1, \n", - " keep_default_action=False)\n", - "```\n", - "\n", - "  运行以下一段代码,单独记录损失值标量信息。" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "============== Starting Training ==============\n", - "epoch: 1 step: 1, loss is 2.316685\n", - "epoch: 1 step: 2, loss is 2.3051994\n", - "epoch: 1 step: 3, loss is 2.2948198\n", - "epoch: 1 step: 4, loss is 2.3207984\n", - "epoch: 1 step: 5, loss is 2.3364246\n", - "epoch: 1 step: 6, loss is 2.2956452\n", - "epoch: 1 step: 7, loss is 2.2634928\n", - "epoch: 1 step: 8, loss is 2.3085115\n", - "epoch: 1 step: 9, loss is 2.254295\n", - "epoch: 1 step: 10, loss is 2.3339896\n", - "\n", - "...\n", - "\n", - "epoch: 10 step: 1556, loss is 0.40271574\n", - "epoch: 10 step: 1557, loss is 0.5172653\n", - "epoch: 10 step: 1558, loss is 0.3401278\n", - "epoch: 10 step: 1559, loss is 0.4081525\n", - "epoch: 10 step: 1560, loss is 0.31565452\n", - "epoch: 10 step: 1561, loss is 0.41298962\n", - "epoch: 10 step: 1562, loss is 0.24210417\n", - "Epoch time: 54385.175, per step time: 34.818\n", - "============== Starting Testing ==============\n", - "============== {'Accuracy': 0.8254206730769231} ==============\n" - ] - } - ], - "source": [ - "class AlexNet(nn.Cell):\n", - " \"\"\"\n", - " Alexnet\n", - " \"\"\"\n", - " def __init__(self, num_classes=10, channel=3):\n", - " super(AlexNet, self).__init__()\n", - " self.conv1 = conv(channel, 96, 11, stride=4)\n", - " self.conv2 = conv(96, 256, 5, 
pad_mode=\"same\")\n", - " self.conv3 = conv(256, 384, 3, pad_mode=\"same\")\n", - " self.conv4 = conv(384, 384, 3, pad_mode=\"same\")\n", - " self.conv5 = conv(384, 256, 3, pad_mode=\"same\")\n", - " self.relu = nn.ReLU()\n", - " self.max_pool2d = P.MaxPool(ksize=3, strides=2)\n", - " self.flatten = nn.Flatten()\n", - " self.fc1 = fc_with_initialize(6*6*256, 4096)\n", - " self.fc2 = fc_with_initialize(4096, 4096)\n", - " self.fc3 = fc_with_initialize(4096, num_classes)\n", - "\n", - " def construct(self, x):\n", - " x = self.conv1(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv2(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv3(x)\n", - " x = self.relu(x)\n", - " x = self.conv4(x)\n", - " x = self.relu(x)\n", - " x = self.conv5(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.flatten(x)\n", - " x = self.fc1(x)\n", - " x = self.relu(x)\n", - " x = self.fc2(x)\n", - " x = self.relu(x)\n", - " x = self.fc3(x)\n", - " return x\n", - "\n", - "lr = Tensor(get_lr(0, 0.002, 10, ds_train.get_dataset_size()))\n", - "network = AlexNet(num_classes=10)\n", - "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", - "net_opt = nn.Momentum(network.trainable_params(), learning_rate=lr, momentum=0.9)\n", - "time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())\n", - "config_ck = CheckpointConfig(save_checkpoint_steps=1562, keep_checkpoint_max=10)\n", - "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_alexnet\", config=config_ck)\n", - "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n", - "\n", - "# Init a SummaryCollector callback instance, and use it in model.train or model.eval\n", - "specified = {\"collect_metric\": True}\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_loss_only\", collect_specified_data=specified, collect_freq=1, keep_default_action=False)\n", - "\n", - 
"print(\"============== Starting Training ==============\")\n", - "model.train(epoch=10, train_dataset=ds_train, callbacks=[time_cb, ckpoint_cb, LossMonitor(), summary_collector], dataset_sink_mode=True)\n", - "\n", - "print(\"============== Starting Testing ==============\")\n", - "param_dict = load_checkpoint(\"checkpoint_alexnet_1-10_1562.ckpt\")\n", - "load_param_into_net(network, param_dict)\n", - "acc = model.eval(ds_eval, callbacks=summary_collector, dataset_sink_mode=True)\n", - "print(\"============== {} ==============\".format(acc))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "此时点击打开MindInsight**训练列表**看板中的`./summary_loss_only`目录,如下图所示,可以看到只记录有损失值标量信息。\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/loss_scalar_only.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### 单独记录参数分布直方图\n", - "\n", - "1. 配置`specified`参数为只记录`conv1`层直方图信息:\n", - "\n", - "```python\n", - "specified = {\"histogram_regular\": \"^conv1.*\"}\n", - "```\n", - "2. 
Configure `summary_collector` as:\n", - "\n", - "```python\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_histogram_only\", \n", - " collect_specified_data=specified, \n", - " collect_freq=1,\n", - " keep_default_action=False)\n", - "```\n", - "\n", - "Run the following code to record the parameter histogram of the `conv1` layer (to reduce memory usage and training time, `epoch` is set to 1 in the remaining training runs)." - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "============== Starting Training ==============\n", - "epoch: 1 step: 1, loss is 2.2941067\n", - "epoch: 1 step: 2, loss is 2.2770133\n", - "epoch: 1 step: 3, loss is 2.2869372\n", - "epoch: 1 step: 4, loss is 2.261474\n", - "epoch: 1 step: 5, loss is 2.3582025\n", - "epoch: 1 step: 6, loss is 2.290377\n", - "epoch: 1 step: 7, loss is 2.2761602\n", - "epoch: 1 step: 8, loss is 2.3452077\n", - "epoch: 1 step: 9, loss is 2.2613692\n", - "epoch: 1 step: 10, loss is 2.3617961\n", - "\n", - "...\n", - "\n", - "epoch: 1 step: 1552, loss is 0.9691028\n", - "epoch: 1 step: 1553, loss is 1.1841048\n", - "epoch: 1 step: 1554, loss is 1.3479778\n", - "epoch: 1 step: 1555, loss is 1.2386065\n", - "epoch: 1 step: 1556, loss is 1.0223479\n", - "epoch: 1 step: 1557, loss is 1.1582826\n", - "epoch: 1 step: 1558, loss is 0.87887794\n", - "epoch: 1 step: 1559, loss is 0.956085\n", - "epoch: 1 step: 1560, loss is 1.3973256\n", - "epoch: 1 step: 1561, loss is 1.234511\n", - "epoch: 1 step: 1562, loss is 1.0787828\n", - "Epoch time: 62971.025, per step time: 40.314\n", - "============== Starting Testing ==============\n", - "============== {'Accuracy': 0.5687099358974359} ==============\n" - ] - } - ], - "source": [ - "lr = Tensor(get_lr(0, 0.002, 1, ds_train.get_dataset_size()))\n", - "network = AlexNet(num_classes=10)\n", - "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", - "net_opt = nn.Momentum(network.trainable_params(), learning_rate=lr, momentum=0.9)\n", 
- "time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())\n", - "config_ck = CheckpointConfig(save_checkpoint_steps=1562, keep_checkpoint_max=10)\n", - "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_alexnet\", config=config_ck)\n", - "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n", - "\n", - "# Init a SummaryCollector callback instance, and use it in model.train or model.eval\n", - "specified = {\"histogram_regular\": \"^conv1.*\"}\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_histogram_only\", collect_specified_data=specified, collect_freq=1, keep_default_action=False)\n", - "\n", - "print(\"============== Starting Training ==============\")\n", - "model.train(epoch=1, train_dataset=ds_train, callbacks=[time_cb, ckpoint_cb, LossMonitor(), summary_collector], dataset_sink_mode=True)\n", - "\n", - "print(\"============== Starting Testing ==============\")\n", - "param_dict = load_checkpoint(\"checkpoint_alexnet_2-1_1562.ckpt\")\n", - "load_param_into_net(network, param_dict)\n", - "acc = model.eval(ds_eval, callbacks=summary_collector, dataset_sink_mode=True)\n", - "print(\"============== {} ==============\".format(acc))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now open the `./summary_histogram_only` directory in the MindInsight **Training List** dashboard.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram_only.png)\n", - "\n", - "As shown above, the MindInsight panel displays only histogram information.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/histogram_only_all.png)\n", - "\n", - "Click to enter the histogram panel. As shown above, only the histogram of the `conv1` layer is displayed." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Recording only tensor data\n", - "\n", - "1. Initialize `TensorSummary` in the `__init__` method of AlexNet.\n", - "2. Use the `TensorSummary` operator in the `construct` method of AlexNet to record tensor data.\n", - "3. 
Configure `summary_collector` as:\n", - "\n", - "```python\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_tensor_only\", \n", - " collect_specified_data=None, \n", - " collect_freq=1,\n", - " keep_default_action=False, \n", - " collect_tensor_freq=50)\n", - "```\n", - "\n", - "Run the following code to record only tensor data.\n" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "============== Starting Training ==============\n", - "epoch: 1 step: 1, loss is 2.3178232\n", - "epoch: 1 step: 2, loss is 2.3172731\n", - "epoch: 1 step: 3, loss is 2.3256557\n", - "epoch: 1 step: 4, loss is 2.3034613\n", - "epoch: 1 step: 5, loss is 2.318819\n", - "epoch: 1 step: 6, loss is 2.2775433\n", - "epoch: 1 step: 7, loss is 2.322216\n", - "epoch: 1 step: 8, loss is 2.2980762\n", - "epoch: 1 step: 9, loss is 2.3208668\n", - "epoch: 1 step: 10, loss is 2.3162236\n", - "\n", - "...\n", - "\n", - "epoch: 1 step: 1552, loss is 1.5239879\n", - "epoch: 1 step: 1553, loss is 1.3195564\n", - "epoch: 1 step: 1554, loss is 1.2827079\n", - "epoch: 1 step: 1555, loss is 1.0843871\n", - "epoch: 1 step: 1556, loss is 1.2715582\n", - "epoch: 1 step: 1557, loss is 1.4982214\n", - "epoch: 1 step: 1558, loss is 1.0394028\n", - "epoch: 1 step: 1559, loss is 1.0470619\n", - "epoch: 1 step: 1560, loss is 1.1495018\n", - "epoch: 1 step: 1561, loss is 1.0332686\n", - "epoch: 1 step: 1562, loss is 1.1649165\n", - "Epoch time: 113988.939, per step time: 72.976\n", - "============== Starting Testing ==============\n", - "============== {'Accuracy': 0.5647035256410257} ==============\n" - ] - } - ], - "source": [ - "class AlexNet(nn.Cell):\n", - " \"\"\"\n", - " Alexnet\n", - " \"\"\"\n", - " def __init__(self, num_classes=10, channel=3):\n", - " super(AlexNet, self).__init__()\n", - " self.conv1 = conv(channel, 96, 11, stride=4)\n", - " self.conv2 = conv(96, 256, 5, pad_mode=\"same\")\n", - " self.conv3 = 
conv(256, 384, 3, pad_mode=\"same\")\n", - " self.conv4 = conv(384, 384, 3, pad_mode=\"same\")\n", - " self.conv5 = conv(384, 256, 3, pad_mode=\"same\")\n", - " self.relu = nn.ReLU()\n", - " self.max_pool2d = P.MaxPool(ksize=3, strides=2)\n", - " self.flatten = nn.Flatten()\n", - " self.fc1 = fc_with_initialize(6*6*256, 4096)\n", - " self.fc2 = fc_with_initialize(4096, 4096)\n", - " self.fc3 = fc_with_initialize(4096, num_classes)\n", - " # Init TensorSummary\n", - " self.tensor_summary = P.TensorSummary()\n", - "\n", - " def construct(self, x):\n", - " x = self.conv1(x)\n", - " # Record tensor by Summary operator\n", - " self.tensor_summary(\"Tensor\", x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv2(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv3(x)\n", - " x = self.relu(x)\n", - " x = self.conv4(x)\n", - " x = self.relu(x)\n", - " x = self.conv5(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.flatten(x)\n", - " x = self.fc1(x)\n", - " x = self.relu(x)\n", - " x = self.fc2(x)\n", - " x = self.relu(x)\n", - " x = self.fc3(x)\n", - " return x\n", - "\n", - "lr = Tensor(get_lr(0, 0.002, 1, ds_train.get_dataset_size()))\n", - "network = AlexNet(num_classes=10)\n", - "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", - "net_opt = nn.Momentum(network.trainable_params(), learning_rate=lr, momentum=0.9)\n", - "time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())\n", - "config_ck = CheckpointConfig(save_checkpoint_steps=1562, keep_checkpoint_max=10)\n", - "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_alexnet\", config=config_ck)\n", - "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n", - "\n", - "# Init a SummaryCollector callback instance, and use it in model.train or model.eval\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_tensor_only\", collect_specified_data=None, 
collect_freq=1, keep_default_action=False, collect_tensor_freq=50)\n", - "\n", - "print(\"============== Starting Training ==============\")\n", - "model.train(epoch=1, train_dataset=ds_train, callbacks=[time_cb, ckpoint_cb, LossMonitor(), summary_collector], dataset_sink_mode=True)\n", - "\n", - "print(\"============== Starting Testing ==============\")\n", - "param_dict = load_checkpoint(\"checkpoint_alexnet_3-1_1562.ckpt\")\n", - "load_param_into_net(network, param_dict)\n", - "acc = model.eval(ds_eval, callbacks=summary_collector, dataset_sink_mode=True)\n", - "print(\"============== {} ==============\".format(acc))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now open the `./summary_tensor_only` directory in the MindInsight **Training List** dashboard. As shown below, only tensor information has been recorded.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/tensor_only.png)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Recording only images\n", - "\n", - "1. Initialize `ImageSummary` in the `__init__` method of AlexNet.\n", - "2. Use the `ImageSummary` operator in the `construct` method of AlexNet to record the input images.\n", - "3. 
Configure `summary_collector` as:\n", - "\n", - " ```python\n", - " summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_image_only\", \n", - " collect_specified_data=None, \n", - " collect_freq=1, \n", - " keep_default_action=False)\n", - " ```\n", - "\n", - "Run the following code to record only image data." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "============== Starting Training ==============\n", - "epoch: 1 step: 1, loss is 2.3218322\n", - "epoch: 1 step: 2, loss is 2.3194063\n", - "epoch: 1 step: 3, loss is 2.3358874\n", - "epoch: 1 step: 4, loss is 2.3217764\n", - "epoch: 1 step: 5, loss is 2.3040292\n", - "epoch: 1 step: 6, loss is 2.2832437\n", - "epoch: 1 step: 7, loss is 2.31645\n", - "epoch: 1 step: 8, loss is 2.3330472\n", - "epoch: 1 step: 9, loss is 2.3014827\n", - "epoch: 1 step: 10, loss is 2.310499\n", - "\n", - "...\n", - "\n", - "epoch: 1 step: 1552, loss is 1.293778\n", - "epoch: 1 step: 1553, loss is 1.385756\n", - "epoch: 1 step: 1554, loss is 1.2190051\n", - "epoch: 1 step: 1555, loss is 1.2079114\n", - "epoch: 1 step: 1556, loss is 1.0067248\n", - "epoch: 1 step: 1557, loss is 1.2732614\n", - "epoch: 1 step: 1558, loss is 1.0397598\n", - "epoch: 1 step: 1559, loss is 1.1968248\n", - "epoch: 1 step: 1560, loss is 1.2161392\n", - "epoch: 1 step: 1561, loss is 1.135769\n", - "epoch: 1 step: 1562, loss is 1.4535972\n", - "Epoch time: 152835.351, per step time: 97.846\n", - "============== Starting Testing ==============\n", - "============== {'Accuracy': 0.5597956730769231} ==============\n" - ] - } - ], - "source": [ - "class AlexNet(nn.Cell):\n", - " \"\"\"\n", - " Alexnet\n", - " \"\"\"\n", - " def __init__(self, num_classes=10, channel=3):\n", - " super(AlexNet, self).__init__()\n", - " self.conv1 = conv(channel, 96, 11, stride=4)\n", - " self.conv2 = conv(96, 256, 5, pad_mode=\"same\")\n", - " self.conv3 = conv(256, 384, 3, 
pad_mode=\"same\")\n", - " self.conv4 = conv(384, 384, 3, pad_mode=\"same\")\n", - " self.conv5 = conv(384, 256, 3, pad_mode=\"same\")\n", - " self.relu = nn.ReLU()\n", - " self.max_pool2d = P.MaxPool(ksize=3, strides=2)\n", - " self.flatten = nn.Flatten()\n", - " self.fc1 = fc_with_initialize(6*6*256, 4096)\n", - " self.fc2 = fc_with_initialize(4096, 4096)\n", - " self.fc3 = fc_with_initialize(4096, num_classes)\n", - " # Init ImageSummary\n", - " self.image_summary = P.ImageSummary()\n", - "\n", - " def construct(self, x):\n", - " # Record image by Summary operator\n", - " self.image_summary(\"Image\", x)\n", - " x = self.conv1(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv2(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv3(x)\n", - " x = self.relu(x)\n", - " x = self.conv4(x)\n", - " x = self.relu(x)\n", - " x = self.conv5(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.flatten(x)\n", - " x = self.fc1(x)\n", - " x = self.relu(x)\n", - " x = self.fc2(x)\n", - " x = self.relu(x)\n", - " x = self.fc3(x)\n", - " return x\n", - "\n", - "lr = Tensor(get_lr(0, 0.002, 1, ds_train.get_dataset_size()))\n", - "network = AlexNet(num_classes=10)\n", - "net_loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction=\"mean\")\n", - "net_opt = nn.Momentum(network.trainable_params(), learning_rate=lr, momentum=0.9)\n", - "time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())\n", - "config_ck = CheckpointConfig(save_checkpoint_steps=1562, keep_checkpoint_max=10)\n", - "ckpoint_cb = ModelCheckpoint(prefix=\"checkpoint_alexnet\", config=config_ck)\n", - "model = Model(network, net_loss, net_opt, metrics={\"Accuracy\": Accuracy()})\n", - "\n", - "# Init a SummaryCollector callback instance, and use it in model.train or model.eval\n", - "summary_collector = SummaryCollector(summary_dir=\"./summary_dir/summary_image_only\", collect_specified_data=None, collect_freq=1, 
keep_default_action=False)\n", - "\n", - "print(\"============== Starting Training ==============\")\n", - "model.train(epoch=1, train_dataset=ds_train, callbacks=[time_cb, ckpoint_cb, LossMonitor(), summary_collector], dataset_sink_mode=True)\n", - "\n", - "print(\"============== Starting Testing ==============\")\n", - "param_dict = load_checkpoint(\"checkpoint_alexnet_4-1_1562.ckpt\")\n", - "load_param_into_net(network, param_dict)\n", - "acc = model.eval(ds_eval, callbacks=summary_collector, dataset_sink_mode=True)\n", - "print(\"============== {} ==============\".format(acc))" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/image_only.png)\n", - "\n", - "As shown above, the MindInsight panel displays only the input images." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Comparison dashboard\n", - "\n", - "The comparison dashboard compares data across multiple training runs.\n", - "\n", - "Click **Comparison Dashboard** in MindInsight to open it and compare the scalar data collected from multiple (different) training runs.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/multi_scalars.png)\n", - "\n", - "The figure above compares the scalar curves of `summary_01` (red curve) and `summary_loss_only` (blue curve). The horizontal axis is the training step and the vertical axis is the scalar value.\n", - "\n", - "![](https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/mindinsight/images/multi_scalars_select.png)\n", - "\n", - "The figure above shows the function area of the comparison dashboard, which lets you compare scalars by training run or tag, horizontal-axis dimension, and smoothness.\n", - "\n", - "- Training: select multiple training runs; check the desired runs or filter them by keyword.\n", - "- Tags: select multiple tags; check the desired tags to view the corresponding scalar information.\n", - "- Horizontal axis: choose any one of \"Step\", \"Relative time\", and \"Absolute time\" as the horizontal axis of the scalar curve.\n", - "- Smoothness: adjust the smoothness to smooth the scalar curve." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Stopping the MindInsight service\n", - "\n", - "Run the following command in a terminal to stop the MindInsight service.\n", - "\n", - "```shell\n", - "mindinsight stop --port 8080\n", - "```" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Notes and specifications\n", - "- When collecting data with Summary operators during training, the `HistogramSummary` operator affects performance, so use it as sparingly as possible.\n", - 
"- 不能同时使用多个 `SummaryRecord` 实例 (`SummaryCollector` 中使用了 `SummaryRecord`)。\n", - "- 为了控制列出summary文件目录的用时,MindInsight最多支持发现999个summary文件目录。\n", - "- 出于性能上的考虑,MindInsight对比看板使用缓存机制加载训练的标量曲线数据,并进行以下限制:\n", - " - 对比看板只支持在缓存中的训练进行比较标量曲线对比。\n", - " - 缓存最多保留最新(按修改时间排列)的15个训练。\n", - " - 用户最多同时对比5个训练的标量曲线。\n", - "- 为了控制内存占用,MindInsight对标签(tag)数目和步骤(step)数目进行了限制:\n", - " - 每个训练看板的最大标签数量为300个标签。标量标签、图片标签、计算图标签、参数分布图(直方图)标签、张量标签的数量总和不得超过300个。特别地,每个训练看板最多有10个计算图标签、6个张量标签。当实际标签数量超过这一限制时,将依照MindInsight的处理顺序,保留最近处理的300个标签。\n", - " - 每个训练看板的每个标量标签最多有1000个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。\n", - " - 每个训练看板的每个图片标签最多有10个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。\n", - " - 每个训练看板的每个参数分布图(直方图)标签最多有50个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。\n", - " - 每个训练看板的每个张量标签最多有20个步骤的数据。当实际步骤的数目超过这一限制时,将对数据进行随机采样,以满足这一限制。\n", - "- 由于`TensorSummary`会记录完整Tensor数据,数据量通常会比较大,为了控制内存占用和出于性能上的考虑,MindInsight对Tensor的大小以及返回前端展示的数值个数进行以下限制:\n", - " - MindInsight最大支持加载含有1千万个数值的Tensor。\n", - " - Tensor加载后,在张量可视的表格视图下,最大支持查看10万个数值,如果所选择的维度查询得到的数值超过这一限制,则无法显示。\n", - "- 由于张量可视(`TensorSummary`)会记录原始张量数据,需要的存储空间较大。使用`TensorSummary`前和训练过程中请注意检查系统存储空间充足。 通过以下方法可以降低张量可视功能的存储空间占用:\\\n", - "  1)避免使用`TensorSummary`记录较大的Tensor。\\\n", - "  2)减少网络中`TensorSummary`算子的使用个数。\n", - "- 功能使用完毕后,请及时清理不再需要的训练日志,以释放磁盘空间。\n", - "- 备注:估算`TensorSummary`空间使用量的方法如下:\n", - " - 一个`TensorSummary`数据的大小 = Tensor中的数值个数 * 4 bytes。假设使用`TensorSummary`记录的Tensor大小为32 * 1 * 256 * 256,则一个`TensorSummary`数据大约需要32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8MiB。`TensorSummary`默认会记录20个步骤的数据,则记录这20组数据需要的空间约为20 * 8 MiB = 160MiB。需要注意的是,由于数据结构等因素的开销,实际使用的存储空间会略大于160MiB。\n", - " - 当使用`TensorSummary`时,由于记录完整Tensor数据,训练日志文件较大,MindInsight需要更多时间解析训练日志文件,请耐心等待。" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## 总结\n", - "\n", - 
"本次体验流程为完整的MindSpore深度学习及MindInsight可视化展示的过程,包括了下载数据集及预处理过程,构建网络、损失函数和优化器过程,生成模型并进行训练、验证的过程,以及启动MindInsight服务进行训练过程可视化展示。读者可以基于本次体验流程构建自己的网络模型进行训练,并使用`SummaryCollector`以及Summary算子记录关心的数据,然后在MindInsight服务看板中进行可视化展示,根据MindInsight服务中展示的结果调整相应的参数以提高训练精度。\n", - "\n", - "以上便完成了标量、直方图、图像和张量可视化的体验,我们通过本次体验全面了解了MindSpore执行训练的过程和MindInsight在标量、直方图、图像和张量可视化的应用,理解了如何使用`SummaryColletor`记录训练过程中的标量、直方图、图像和张量数据。" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.7.5" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/tutorials/notebook/programming_guide/dtype.ipynb b/tutorials/notebook/programming_guide/dtype.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..3a4d47dcde38c12b1135570924e2bd0c2f623c31 --- /dev/null +++ b/tutorials/notebook/programming_guide/dtype.ipynb @@ -0,0 +1,111 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# dtype" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 概述\n", + "\n", + "MindSpore张量支持不同的数据类型,包含`int8`、`int16`、`int32`、`int64`、`uint8`、`uint16`、`uint32`、`uint64`、`float16`、`float32`、`float64`、`bool_`,与NumPy的数据类型一一对应。\n", + "\n", + "在MindSpore的运算处理流程中,Python中的`int`数会被转换为定义的`int64`类型,`float`数会被转换为定义的`float32`类型。\n", + "\n", + "详细的类型支持情况请参考https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.html#mindspore.dtype。\n", + "\n", + "以下代码,打印MindSpore的数据类型int32。" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Int32\n" + ] + } + ], + "source": [ + "from mindspore import dtype as mstype\n", + "\n", + "data_type = 
mstype.int32\n", + "print(data_type)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Data Type Conversion APIs\n", + "\n", + "MindSpore provides the following APIs for converting to and from NumPy data types and Python built-in data types.\n", + "\n", + " * `dtype_to_nptype`: converts a MindSpore data type to the corresponding NumPy data type.\n", + "\n", + " * `dtype_to_pytype`: converts a MindSpore data type to the corresponding Python built-in data type.\n", + "\n", + " * `pytype_to_dtype`: converts a Python built-in data type to the corresponding MindSpore data type.\n", + "\n", + "The following code converts between different data types and prints the resulting types." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Int64\n", + "\n" + ] + } + ], + "source": [ + "from mindspore import dtype as mstype\n", + "\n", + "np_type = mstype.dtype_to_nptype(mstype.int32)\n", + "ms_type = mstype.pytype_to_dtype(int)\n", + "py_type = mstype.dtype_to_pytype(mstype.float64)\n", + "\n", + "print(np_type)\n", + "print(ms_type)\n", + "print(py_type)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/tutorials/notebook/programming_guide/tensor.ipynb b/tutorials/notebook/programming_guide/tensor.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..fc4e13fb3dfae27a4f0996aefcf31a15fd7be21f --- /dev/null +++ b/tutorials/notebook/programming_guide/tensor.ipynb @@ -0,0 +1,149 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Tensor" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Overview\n", + "\n", + "The tensor (Tensor) is the basic data structure in MindSpore network computation. For the data types in a tensor, see [dtype](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/dtype.html).\n", +
"\n", + "不同维度的张量分别表示不同的数据,0维张量表示标量,1维张量表示向量,2维张量表示矩阵,3维张量可以表示彩色图像的RGB三通道等等。" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "> 本文中的所有示例,支持在PyNative模式下运行,暂不支持CPU。" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 张量构造\n", + "\n", + "构造张量时,支持传入`Tensor`、`float`、`int`、`bool`、`tuple`、`list`和`NumPy.array`类型。\n", + "\n", + "`Tensor`作为初始值时,可指定dtype,如果没有指定dtype,`int`、`float`、`bool`分别对应`int32`、`float32`、`bool_`,`tuple`和`list`生成的1维`Tensor`数据类型与`tuple`和`list`里存放数据的类型相对应。\n", + "\n", + "代码样例如下:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[[1 2]\n", + " [3 4]] \n", + "\n", + " 1 \n", + "\n", + " 2 \n", + "\n", + " True \n", + "\n", + " [1 2 3] \n", + "\n", + " [4. 5. 6.]\n" + ] + } + ], + "source": [ + "import numpy as np\n", + "from mindspore import Tensor\n", + "from mindspore.common import dtype as mstype\n", + "\n", + "x = Tensor(np.array([[1, 2], [3, 4]]), mstype.int32)\n", + "y = Tensor(1.0, mstype.int32)\n", + "z = Tensor(2, mstype.int32)\n", + "m = Tensor(True, mstype.bool_)\n", + "n = Tensor((1, 2, 3), mstype.int16)\n", + "p = Tensor([4.0, 5.0, 6.0], mstype.float64)\n", + "\n", + "print(x, \"\\n\\n\", y, \"\\n\\n\", z, \"\\n\\n\", m, \"\\n\\n\", n, \"\\n\\n\", p)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 张量的属性" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### 属性\n", + "\n", + "张量的属性包括形状(shape)和数据类型(dtype)。\n", + "\n", + " * 形状:`Tensor`的shape,是一个tuple。\n", + "\n", + " * 数据类型:`Tensor`的dtype,是MindSpore的一个数据类型。\n", + "\n", + "代码样例如下:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(2, 2) Int32\n" + ] + } + ], + "source": [ + "import numpy as np\n", + "from mindspore import Tensor\n", + "from mindspore.common import dtype as 
mstype\n", + "\n", + "x = Tensor(np.array([[1, 2], [3, 4]]), mstype.int32)\n", + "x_shape = x.shape\n", + "x_dtype = x.dtype\n", + "\n", + "print(x_shape, x_dtype)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.5" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/tutorials/notebook/quick_start.ipynb b/tutorials/notebook/quick_start.ipynb index 6110b8aeda958ec86cfa7d0ae890dd0eff96da72..4180ec9e87ab99a8afaa73920ee704548dc9b1a8 100644 --- a/tutorials/notebook/quick_start.ipynb +++ b/tutorials/notebook/quick_start.ipynb @@ -525,7 +525,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "To define a neural network with MindSpore, inherit from `mindspore.nn.cell.Cell`. `Cell` is the base class of all neural networks (such as `Conv2d`).\n", + "To define a neural network with MindSpore, inherit from `mindspore.nn.Cell`. `Cell` is the base class of all neural networks (such as `Conv2d`).\n", "\n", "The layers of the neural network must be defined in advance in the `__init__` method, and the forward construction of the network is then completed by defining the `construct` method. Following the LeNet5 network structure, the layers are defined as follows:" ] diff --git a/tutorials/training/source_en/advanced_use/apply_gradient_accumulation.md b/tutorials/training/source_en/advanced_use/apply_gradient_accumulation.md index 76012d6196c00d1419108b0b58017d913c96fd6d..0ee7673c662d38680a970f955bee0e119104c609 100644 --- a/tutorials/training/source_en/advanced_use/apply_gradient_accumulation.md +++ b/tutorials/training/source_en/advanced_use/apply_gradient_accumulation.md @@ -36,6 +36,7 @@ The ultimate objective is to achieve the same effect as training with N x mini-b The MNIST dataset is used as an example to describe how to customize a simple model to implement gradient accumulation. ### Importing Library Files + The following are the required public modules and MindSpore modules and library files.
```python @@ -65,7 +66,9 @@ Use the `MnistDataset` API provided by the dataset of MindSpore to load the MNIS The following uses the LeNet network as an example. You can also use other networks, such as ResNet-50 and BERT. The code is imported from [lenet.py]() in the lenet directory of model_zoo. ### Defining the Training Model + The training process is divided into three parts: forward and backward training, parameter update, and accumulated gradient clearance. + - `TrainForwardBackward` calculates the loss and gradient, and uses grad_sum to implement gradient accumulation. - `TrainOptim` updates parameters. - `TrainClear` clears the gradient accumulation variable grad_sum. @@ -134,8 +137,8 @@ class TrainClear(Cell): ``` ### Defining the Training Process -Each mini-batch calculates the loss and gradient through forward and backward training, and uses mini_steps to control the accumulated times before each parameter update. After the number of accumulation times is reached, the parameter is updated and the accumulated gradient variable is cleared. +Each mini-batch calculates the loss and gradient through forward and backward training, and uses mini_steps to control the accumulated times before each parameter update. After the number of accumulation times is reached, the parameter is updated and the accumulated gradient variable is cleared. ```python class GradientAccumulation: @@ -202,6 +205,7 @@ class GradientAccumulation: ``` ### Training and Saving the Model + Call the network, optimizer, and loss function, and then customize the `train_process` API of `GradientAccumulation` to train the model. ```python @@ -226,18 +230,20 @@ if __name__ == "__main__": ``` ## Experiment Result + After 10 epochs, the accuracy on the test set is about 96.31%. -**Training Execution** +**Training Execution:** + 1. Run the training code and view the running result. 
- ```shell - $ python train.py --data_path=./MNIST_Data + ```bash + python train.py --data_path=./MNIST_Data ``` The output is as follows. The loss value decreases during training. - ```shell + ```text epoch: 1 step: 27 loss is 0.3660637 epoch: 1 step: 28 loss is 0.25238192 ... @@ -252,17 +258,17 @@ After 10 epochs, the accuracy on the test set is about 96.31%. The model file `gradient_accumulation.ckpt` is saved during training. -**Model Validation** +**Model Validation:** Use the saved checkpoint file to load the validation dataset through [eval.py]() in the lenet directory of model_zoo. -```shell -$ python eval.py --data_path=./MNIST_Data --ckpt_path=./gradient_accumulation.ckpt --device_target=GPU +```bash +python eval.py --data_path=./MNIST_Data --ckpt_path=./gradient_accumulation.ckpt --device_target=GPU ``` The output is as follows. The accuracy of the validation dataset is about 96.31%, which is the same as the result when the value of batch_size is 32. -```shell +```text ============== Starting Testing ============== ============== {'Accuracy': 0.9631730769230769} ============== -``` \ No newline at end of file +``` diff --git a/tutorials/training/source_en/advanced_use/apply_host_device_training.md b/tutorials/training/source_en/advanced_use/apply_host_device_training.md index 1dfbfa88e44a3345d497c0ddcacc7934b447ffd4..e1f3d69be7c5858bbf527f60f5dfbcb2a7e3d7f7 100644 --- a/tutorials/training/source_en/advanced_use/apply_host_device_training.md +++ b/tutorials/training/source_en/advanced_use/apply_host_device_training.md @@ -18,13 +18,14 @@ In deep learning, one usually has to deal with the huge model problem, in which the total size of parameters in the model is beyond the device memory capacity. To efficiently train a huge model, one solution is to employ homogenous accelerators (*e.g.*, Ascend 910 AI Accelerator and GPU) for distributed training. 
When the size of a model is hundreds of GBs or several TBs, the number of accelerators required is beyond what most users can access, which makes this solution impractical. One alternative is Host+Device hybrid training. This solution, which simultaneously leverages the huge memory of hosts and the fast computation of accelerators, is a promising and -efficient method for addressing the huge model problem. +efficient method for addressing the huge model problem. In MindSpore, users can easily implement hybrid training by configuring trainable parameters and necessary operators to run on hosts, and other operators to run on accelerators. This tutorial introduces how to train [Wide&Deep](https://gitee.com/mindspore/mindspore/tree/master/model_zoo/official/recommend/wide_and_deep) in the Host+Ascend 910 AI Accelerator mode. + ## Preliminaries -1. Prepare the model. The Wide&Deep code can be found at: , in which `train_and_eval_auto_parallel.py` is the main training script, +1. Prepare the model. The Wide&Deep code can be found at: , in which `train_and_eval_auto_parallel.py` is the main training script, the `src/` directory contains the model definition, data processing and configuration files, and the `script/` directory contains the launch scripts for different modes. 2. Prepare the dataset. The dataset can be found at: . Use the script `src/preprocess_data.py` to transform the dataset into MindRecord format. @@ -50,19 +51,23 @@ This tutorial introduces how to train [Wide&Deep](https://gitee.com/mindspore/mi ## Configuring for Hybrid Training 1. Configure the flag of hybrid training. In the function `argparse_init` of file `src/config.py`, change the default value of `host_device_mix` to `1`; change `self.host_device_mix` in function `__init__` of `class WideDeepConfig` to `1`: + ```python self.host_device_mix = 1 ``` 2. Check the placement of necessary operators and optimizers.
In class `WideDeepModel` of file `src/wide_and_deep.py`, check the placement of `EmbeddingLookup` is at host: + ```python self.deep_embeddinglookup = nn.EmbeddingLookup() self.wide_embeddinglookup = nn.EmbeddingLookup() ``` + In `class TrainStepWrap(nn.Cell)` of file `src/wide_and_deep.py`, check two optimizer are also at host: + ```python - self.optimizer_w.sparse_opt.add_prim_attr("primitive_target", "CPU") - self.optimizer_d.sparse_opt.add_prim_attr("primitive_target", "CPU") + self.optimizer_w.target = "CPU" + self.optimizer_d.target = "CPU" ``` ## Training the Model @@ -73,7 +78,7 @@ and `RANK_TABLE_FILE` is the path of the above `rank_table_1p_0.json` file. The running log is in the directory of `device_0`, where `loss.log` contains every loss value of every step in the epoch. Here is an example: -``` +```text epoch: 1 step: 1, wide_loss is 0.6873926, deep_loss is 0.8878349 epoch: 1 step: 2, wide_loss is 0.6442529, deep_loss is 0.8342661 epoch: 1 step: 3, wide_loss is 0.6227323, deep_loss is 0.80273706 @@ -90,7 +95,7 @@ epoch: 1 step: 10, wide_loss is 0.566089, deep_loss is 0.6884129 `test_deep0.log` contains the runtime log (This needs to adjust the log level to INFO, and add the `-p on` option when compiling MindSpore). Search `EmbeddingLookup` in `test_deep0.log`, the following can be found: -``` +```text [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:34.928.275 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/network-VirtualDatasetCellTriple/_backbone-NetWithLossClass/network-WideDeepModel/EmbeddingLookup-op297 costs 3066 us. [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:34.943.896 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/network-VirtualDatasetCellTriple/_backbone-NetWithLossClass/network-WideDeepModel/EmbeddingLookup-op298 costs 15521 us. ``` @@ -99,7 +104,7 @@ showing the running time of `EmbeddingLookup` on the host. 
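
Rather than searching the log by hand, the per-kernel host costs can be pulled out with a short script. The following is only a sketch: the regular expression is inferred from the two sample lines above, not from a documented log format, and `host_kernel_costs` is a hypothetical helper name, not part of MindSpore.

```python
import re

# Assumed line shape, taken from the sample log lines above:
#   "... cpu kernel: <scope>/<OpName>-op<N> costs <N> us."
KERNEL_COST = re.compile(
    r"cpu kernel: (?P<scope>\S+)/(?P<op>\w+)-op(?P<idx>\d+) costs (?P<us>\d+) us"
)


def host_kernel_costs(lines, op_name):
    """Return (operator instance, cost in microseconds) for host kernels named op_name."""
    costs = []
    for line in lines:
        match = KERNEL_COST.search(line)
        if match and match.group("op") == op_name:
            instance = f"{match.group('op')}-op{match.group('idx')}"
            costs.append((instance, int(match.group("us"))))
    return costs


# The two sample lines shown above, used here instead of reading test_deep0.log.
sample = [
    "[INFO] DEVICE(109904,python3.7):2020-06-27-12:42:34.928.275 "
    "[mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: "
    "Default/network-VirtualDatasetCellTriple/_backbone-NetWithLossClass/"
    "network-WideDeepModel/EmbeddingLookup-op297 costs 3066 us.",
    "[INFO] DEVICE(109904,python3.7):2020-06-27-12:42:34.943.896 "
    "[mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: "
    "Default/network-VirtualDatasetCellTriple/_backbone-NetWithLossClass/"
    "network-WideDeepModel/EmbeddingLookup-op298 costs 15521 us.",
]
embedding_costs = host_kernel_costs(sample, "EmbeddingLookup")
print(embedding_costs)
```

To run it against a real log, pass an open file object (for example `open("test_deep0.log")`) as `lines`; the same helper should work for other operator names as long as their log lines follow the assumed shape.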
Search `FusedSparseFtrl` and `FusedSparseLazyAdam` in `test_deep0.log`, the following can be found: -``` +```text [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:35.422.963 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/optimizer_w-FTRL/FusedSparseFtrl-op299 costs 54492 us. [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:35.565.953 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/optimizer_d-LazyAdam/FusedSparseLazyAdam-op300 costs 142865 us. ``` diff --git a/tutorials/training/source_en/advanced_use/apply_parameter_server_training.md b/tutorials/training/source_en/advanced_use/apply_parameter_server_training.md index 50acc0f46e5625a44b1dc590e3bbb66a78cc97db..1161cf1fd005946a0f4b796e39d393c7ed9bace3 100644 --- a/tutorials/training/source_en/advanced_use/apply_parameter_server_training.md +++ b/tutorials/training/source_en/advanced_use/apply_parameter_server_training.md @@ -1,4 +1,4 @@ -# Training with Parameter Server +# Training with Parameter Server `Linux` `Ascend` `GPU` `Model Training` `Intermediate` `Expert` @@ -17,20 +17,21 @@ ## Overview + A parameter server is a widely used architecture in distributed training. Compared with the synchronous AllReduce training method, a parameter server has better flexibility, scalability, and node failover capabilities. Specifically, the parameter server supports both synchronous and asynchronous SGD training algorithms. In terms of scalability, model computing and update are separately deployed in the worker and server processes, so that resources of the worker and server can be independently scaled out and in horizontally. In addition, in an environment of a large-scale data center, various failures often occur in a computing device, a network, and a storage device, and consequently some nodes are abnormal. However, in an architecture of a parameter server, such a failure can be relatively easily handled without affecting a training job. 
In the parameter server implementation of MindSpore, the open-source [ps-lite](https://github.com/dmlc/ps-lite) is used as the basic architecture. Based on the remote communication capability provided by the [ps-lite](https://github.com/dmlc/ps-lite) and abstract Push/Pull primitives, the distributed training algorithm of the synchronous SGD is implemented. In addition, with the high-performance collective communication library in Ascend and GPU(HCCL and NCCL), MindSpore also provides the hybrid training mode of parameter server and AllReduce. Some weights can be stored and updated through the parameter server, and other weights are still trained through the AllReduce algorithm. The ps-lite architecture consists of three independent components: server, worker, and scheduler. Their functions are as follows: -- Server: saves model weights and backward computation gradients, and updates the model using gradients pushed by workers. +- Server: saves model weights and backward computation gradients, and updates the model using gradients pushed by workers. - Worker: performs forward and backward computation on the network. The gradient value for backward computation is uploaded to a server through the `Push` API, and the model updated by the server is downloaded to the worker through the `Pull` API. - Scheduler: establishes the communication relationship between the server and worker. - ## Preparations + The following describes how to use parameter server to train LeNet on Ascend 910: ### Training Script Preparation @@ -41,17 +42,18 @@ Learn how to train a LeNet using the [MNIST dataset](http://yann.lecun.com/exdb/ 1. First of all, Use `mindspore.context.set_ps_context(enable_ps=True)` to enable Parameter Server training mode. -- This method should be called before `mindspore.communication.management.init()`. 
-- If you don't call this method, the [Environment Variable Setting](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/apply_parameter_server_training.html#environment-variable-setting) below will not take effect. -- Use `mindspore.context.reset_ps_context()` to disable Parameter Server training mode. + - This method should be called before `mindspore.communication.management.init()`. + - If you don't call this method, the [Environment Variable Setting](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/apply_parameter_server_training.html#environment-variable-setting) below will not take effect. + - Use `mindspore.context.reset_ps_context()` to disable Parameter Server training mode. 2. In this training mode, you can use either of the following methods to control whether the training parameters are updated by the Parameter Server: -- Use `mindspore.nn.Cell.set_param_ps()` to set all weight recursions of `nn.Cell`. -- Use `mindspore.common.Parameter.set_param_ps()` to set the weight. -- The size of the weight which is updated by Parameter Server should not exceed INT_MAX(2^31 - 1) bytes. + - Use `mindspore.nn.Cell.set_param_ps()` to set all weight recursions of `nn.Cell`. + - Use `mindspore.common.Parameter.set_param_ps()` to set the weight. + - The size of the weight which is updated by Parameter Server should not exceed INT_MAX(2^31 - 1) bytes. 3. On the basis of the [original training script](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/lenet/train.py), set all LeNet model weights to be trained on the parameter server: + ```python context.set_ps_context(enable_ps=True) network = LeNet5(cfg.num_classes) @@ -62,7 +64,7 @@ network.set_param_ps() MindSpore reads environment variables to control parameter server training. 
The environment variables include the following options (all scripts of `MS_SCHED_HOST` and `MS_SCHED_PORT` must be consistent): -``` +```bash export PS_VERBOSE=1 # Print ps-lite log export MS_SERVER_NUM=1 # Server number export MS_WORKER_NUM=1 # Worker number @@ -78,6 +80,7 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre Provide the shell scripts corresponding to the worker, server, and scheduler roles to start training: `Scheduler.sh`: + ```bash #!/bin/bash export PS_VERBOSE=1 @@ -90,6 +93,7 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre ``` `Server.sh`: + ```bash #!/bin/bash export PS_VERBOSE=1 @@ -102,6 +106,7 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre ``` `Worker.sh`: + ```bash #!/bin/bash export PS_VERBOSE=1 @@ -114,26 +119,31 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre ``` Run the following commands separately: + ```bash sh Scheduler.sh > scheduler.log 2>&1 & sh Server.sh > server.log 2>&1 & sh Worker.sh > worker.log 2>&1 & ``` + Start training. 2. Viewing result Run the following command to view the communication logs between the server and worker in the `scheduler.log` file: - ``` + + ```text Bind to role=scheduler, id=1, ip=XXX.XXX.XXX.XXX, port=XXXX Assign rank=8 to node role=server, ip=XXX.XXX.XXX.XXX, port=XXXX Assign rank=9 to node role=worker, ip=XXX.XXX.XXX.XXX, port=XXXX the scheduler is connected to 1 workers and 1 servers ``` + The preceding information indicates that the communication between the server, worker, and scheduler is established successfully. 
Check the training result in the `worker.log` file: - ``` + + ```text epoch: 1 step: 1, loss is 2.302287 epoch: 1 step: 2, loss is 2.304071 epoch: 1 step: 3, loss is 2.308778 diff --git a/tutorials/training/source_en/advanced_use/apply_quantization_aware_training.md b/tutorials/training/source_en/advanced_use/apply_quantization_aware_training.md index 6868003cbb8b6e815ac79df7d45bbd2b39294ceb..1e18ce380d6296ca8f10dadfebd6bc89180158fd 100644 --- a/tutorials/training/source_en/advanced_use/apply_quantization_aware_training.md +++ b/tutorials/training/source_en/advanced_use/apply_quantization_aware_training.md @@ -39,6 +39,7 @@ Currently, there are two types of quantization solutions in the industry: quanti ### Fake Quantization Node A fake quantization node is a node inserted during quantization aware training, and is used to search for network data distribution and feed back a lost accuracy. The specific functions are as follows: + - Find the distribution of network data, that is, find the maximum and minimum values of the parameters to be quantized. - Simulate the accuracy loss of low-bit quantization, apply the loss to the network model, and transfer the loss to the loss function, so that the optimizer optimizes the loss value during training. 
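
The two functions of a fake quantization node listed above can be illustrated with a small NumPy sketch. This is not MindSpore's implementation: `fake_quantize` and `num_bits` are hypothetical names, and a simple asymmetric min/max scheme is assumed purely for illustration.

```python
import numpy as np


def fake_quantize(x, num_bits=8):
    """Quantize then dequantize x, so the result carries low-bit rounding error."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Function 1: observe the data distribution (here just min and max,
    # widened to include zero so that zero stays exactly representable).
    x_min = min(float(x.min()), 0.0)
    x_max = max(float(x.max()), 0.0)
    if x_max == x_min:
        return x.copy()  # all-zero input: nothing to quantize
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    # Function 2: simulate the accuracy loss of low-bit quantization by
    # rounding to the integer grid, clipping, and mapping back to floats.
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return ((q - zero_point) * scale).astype(x.dtype)


weights = np.linspace(-1.0, 1.0, 11, dtype=np.float32)
simulated = fake_quantize(weights, num_bits=8)
max_error = float(np.abs(simulated - weights).max())
print(max_error)
```

During quantization aware training the network would compute with values like `simulated` in the forward pass, so the rounding error reaches the loss function and the optimizer can compensate for it; how gradients pass through the rounding step is outside the scope of this sketch.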
@@ -99,7 +100,7 @@ class LeNet5(nn.Cell): Tensor, output tensor Examples: >>> LeNet(num_class=10, num_channel=1) - + """ def __init__(self, num_class=10, num_channel=1): super(LeNet5, self).__init__() @@ -129,10 +130,10 @@ class LeNet5(nn.Cell): def __init__(self, num_class=10): super(LeNet5, self).__init__() self.num_class = num_class - + self.conv1 = nn.Conv2dBnAct(1, 6, kernel_size=5, activation='relu') self.conv2 = nn.Conv2dBnAct(6, 16, kernel_size=5, activation='relu') - + self.fc1 = nn.DenseBnAct(16 * 5 * 5, 120, activation='relu') self.fc2 = nn.DenseBnAct(120, 84, activation='relu') self.fc3 = nn.DenseBnAct(84, self.num_class) @@ -164,17 +165,17 @@ net = quant.convert_quant_network(network, quant_delay=900, bn_fold=False, per_c The preceding describes the quantization aware training from scratch. A more common case is that an existing model file needs to be converted to a quantization model. The model file and training script obtained through common network model training are available for quantization aware training. To use a checkpoint file for retraining, perform the following steps: - 1. Process data and load datasets. - 2. Define an original unquantative network. - 3. Train the original network to generate a unquantative model. - 4. Define a fusion network. - 5. Define an optimizer and loss function. - 6. Generate a quantative network based on the fusion network. - 7. Load a model file and retrain the model. Load the unquantative model file generated in step 3 and retrain the quantative model based on the quantative network to generate a quantative model. For details, see . + 1. Process data and load datasets. + 2. Define an original unquantative network. + 3. Train the original network to generate a unquantative model. + 4. Define a fusion network. + 5. Define an optimizer and loss function. + 6. Generate a quantative network based on the fusion network. + 7. Load a model file and retrain the model. 
Load the unquantative model file generated in step 3 and retrain the quantative model based on the quantative network to generate a quantative model. For details, see . ### Inference -The inference using a quantization model is the same the common model inference. The inference can be performed by directly using the checkpoint file or converting the checkpoint file into a common model format (such as AIR or MINDIR). +The inference using a quantization model is the same the common model inference. The inference can be performed by directly using the checkpoint file or converting the checkpoint file into a common model format (such as AIR or MINDIR). For details, see . @@ -183,7 +184,7 @@ For details, see - [Converting Dataset to MindRecord](#converting-dataset-to-mindrecord) - - [Overview](#overview) - - [Basic Concepts](#basic-concepts) - - [Converting Dataset to MindRecord](#converting-dataset-to-mindrecord-1) - - [Loading MindRecord Dataset](#loading-mindrecord-dataset) + - [Overview](#overview) + - [Basic Concepts](#basic-concepts) + - [Converting Dataset to MindRecord](#converting-dataset-to-mindrecord-1) + - [Loading MindRecord Dataset](#loading-mindrecord-dataset) @@ -19,6 +19,7 @@ Users can convert non-standard datasets and common datasets into the MindSpore data format, MindRecord, so that they can be easily loaded to MindSpore for training. In addition, the performance of MindSpore in some scenarios is optimized, which delivers better user experience when you use datasets in the MindSpore data format. The MindSpore data format has the following features: + 1. Unified storage and access of user data are implemented, simplifying training data loading. 2. Data is aggregated for storage, which can be efficiently read, managed and moved. 3. Data encoding and decoding are efficient and transparent to users. @@ -96,7 +97,7 @@ The following tutorial demonstrates how to convert image data and its annotation 5. 
Create a `FileWriter` object, transfer the file name and number of slices, add the schema and index, call the `write_raw_data` API to write data, and call the `commit` API to generate a local data file. - ```python + ```python writer = FileWriter(file_name="test.mindrecord", shard_num=4) writer.add_schema(cv_schema_json, "test_schema") writer.add_index(indexes) @@ -141,7 +142,7 @@ The following tutorial briefly demonstrates how to load the MindRecord dataset u The output is as follows: - ``` + ```text sample: {'data': array([175, 175, 85, 60, 184, 124, 54, 189, 125, 193, 153, 91, 234, 106, 43, 143, 132, 211, 204, 160, 44, 105, 187, 185, 45, 205, 122, 236, 112, 123, 84, 177, 219], dtype=uint8), 'file_name': array(b'3.jpg', dtype='|S5'), 'label': array(99, dtype=int32)} diff --git a/tutorials/training/source_en/advanced_use/custom_debugging_info.md b/tutorials/training/source_en/advanced_use/custom_debugging_info.md index 201968f5804065cb24db3f6f2fdb7232505a0848..96331030f21b7dc2215ece5a3fd5e03bcc936731 100644 --- a/tutorials/training/source_en/advanced_use/custom_debugging_info.md +++ b/tutorials/training/source_en/advanced_use/custom_debugging_info.md @@ -24,7 +24,7 @@ This section describes how to use the customized capabilities provided by MindSpore, such as `callback`, `metrics`, `Print` operators and log printing, to help you quickly debug the training network. -## Introduction to Callback +## Introduction to Callback Here, callback is not a function but a class. You can use callback to observe the internal status and related information of the network during training or perform specific actions in a specific period. For example, you can monitor the loss, save model parameters, dynamically adjust parameters, and terminate training tasks in advance. @@ -39,7 +39,7 @@ MindSpore provides the callback capabilities to allow users to insert customized Usage: Transfer the callback object in the `model.train` method. 
The callback object can be a list, for example: ```python -ckpt_cb = ModelCheckpoint() +ckpt_cb = ModelCheckpoint() loss_cb = LossMonitor() summary_cb = SummaryCollector(summary_dir='./summary_dir') model.train(epoch, dataset, callbacks=[ckpt_cb, loss_cb, summary_cb]) @@ -58,7 +58,7 @@ The callback base class is defined as follows: ```python class Callback(): - """Callback base class""" + """Callback base class""" def begin(self, run_context): """Called once before the network executing.""" pass @@ -68,11 +68,11 @@ class Callback(): pass def epoch_end(self, run_context): - """Called after each epoch finished.""" + """Called after each epoch finished.""" pass def step_begin(self, run_context): - """Called before each epoch beginning.""" + """Called before each epoch beginning.""" pass def step_end(self, run_context): @@ -129,7 +129,7 @@ Here are two examples to further explain the usage of custom Callback. The output is as follows: - ``` + ```text epoch: 20 step: 32 loss: 2.298344373703003 ``` @@ -221,12 +221,16 @@ print('Accuracy is ', accuracy) ``` The output is as follows: -``` + +```text Accuracy is 0.6667 ``` + ## MindSpore Print Operator -MindSpore-developed `Print` operator is used to print the tensors or character strings input by users. Multiple strings, multiple tensors, and a combination of tensors and strings are supported, which are separated by comma (,). -The method of using the MindSpore `Print` operator is the same as using other operators. You need to assert MindSpore `Print` operator in `__init__` and invoke it using `construct`. The following is an example. + +MindSpore-developed `Print` operator is used to print the tensors or character strings input by users. Multiple strings, multiple tensors, and a combination of tensors and strings are supported, which are separated by comma (,). +The method of using the MindSpore `Print` operator is the same as using other operators. 
You need to assert MindSpore `Print` operator in `__init__` and invoke it using `construct`. The following is an example. + ```python import numpy as np from mindspore import Tensor @@ -250,8 +254,10 @@ y = Tensor(np.ones([2, 2]).astype(np.int32)) net = PrintDemo() output = net(x, y) ``` + The output is as follows: -``` + +```text print Tensor x and Tensor y: Tensor shape:[[const vector][2, 1]]Int32 val:[[1] @@ -313,7 +319,7 @@ The input and output of the operator can be saved for debugging through the data You can set `context.set_context(reserve_class_name_in_scope=False)` in your training script to avoid dump failure because of file name is too long. 4. Parse the Dump file. - + Call `numpy.fromfile` to parse dump data file. ### Asynchronous Dump @@ -321,6 +327,7 @@ The input and output of the operator can be saved for debugging through the data 1. Create dump json file:`data_dump.json`. The name and location of the JSON file can be customized. + ```json { "common_dump_settings": { @@ -369,30 +376,31 @@ The input and output of the operator can be saved for debugging through the data ``` ## Log-related Environment Variables and Configurations + MindSpore uses glog to output logs. The following environment variables are commonly used: - `GLOG_v` - - The environment variable specifies the log level. + + The environment variable specifies the log level. The default value is 2, indicating the WARNING level. The values are as follows: 0: DEBUG; 1: INFO; 2: WARNING; 3: ERROR. -- `GLOG_logtostderr` +- `GLOG_logtostderr` The environment variable specifies the log output mode. When `GLOG_logtostderr` is set to 1, logs are output to the screen. If the value is set to 0, logs are output to a file. The default value is 1. - `GLOG_log_dir` - - The environment variable specifies the log output path. - If `GLOG_logtostderr` is set to 0, value of this variable must be specified. 
- If `GLOG_log_dir` is specified and the value of `GLOG_logtostderr` is 1, logs are output to the screen but not to a file. + + The environment variable specifies the log output path. + If `GLOG_logtostderr` is set to 0, value of this variable must be specified. + If `GLOG_log_dir` is specified and the value of `GLOG_logtostderr` is 1, logs are output to the screen but not to a file. Logs of C++ and Python will be output to different files. The file name of C++ log complies with the naming rule of `GLOG` log file. Here, the name is `mindspore.MachineName.UserName.log.LogLevel.Timestamp`. The file name of Python log is `mindspore.log`. -- `MS_SUBMODULE_LOG_v` +- `MS_SUBMODULE_LOG_v` The environment variable specifies log levels of C++ sub modules of MindSpore. - The environment variable is assigned as: `MS_SUBMODULE_LOG_v="{SubModule1:LogLevel1,SubModule2:LogLevel2,...}"`. - The specified sub module log level will overwrite the global log level. The meaning of sub module log level is the same as `GLOG_v`, the sub modules of MindSpore are categorized by source directory is shown in the below table. + The environment variable is assigned as: `MS_SUBMODULE_LOG_v="{SubModule1:LogLevel1,SubModule2:LogLevel2,...}"`. + The specified sub module log level will overwrite the global log level. The meaning of sub module log level is the same as `GLOG_v`, the sub modules of MindSpore are categorized by source directory is shown in the below table. E.g. when set `GLOG_v=1 MS_SUBMODULE_LOG_v="{PARSER:2,ANALYZER:2}"` then log levels of `PARSER` and `ANALYZER` are WARNING, other modules' log levels are INFO. Sub modules of MindSpore grouped by source directory: @@ -424,4 +432,4 @@ Sub modules of MindSpore grouped by source directory: | mindspore/core/gvar | COMMON | | mindspore/core/ | CORE | -> The glog does not support log rotate. To control the disk space occupied by log files, use the log file management tool provided by the operating system, such as: logrotate of Linux. 
\ No newline at end of file +> The glog does not support log rotate. To control the disk space occupied by log files, use the log file management tool provided by the operating system, such as: logrotate of Linux. diff --git a/tutorials/training/source_en/advanced_use/custom_operator_ascend.md b/tutorials/training/source_en/advanced_use/custom_operator_ascend.md index c205cff59a5ed61e6cf2b5d6d2be5bea9b178034..ec197f4a58d0310adbe4321378fb7ed5d1c0d7a1 100644 --- a/tutorials/training/source_en/advanced_use/custom_operator_ascend.md +++ b/tutorials/training/source_en/advanced_use/custom_operator_ascend.md @@ -25,6 +25,7 @@ When built-in operators cannot meet requirements during network development, you To add a custom operator, you need to register the operator primitive, implement the operator, and register the operator information. The related concepts are as follows: + - Operator primitive: defines the frontend API prototype of an operator on the network. It is the basic unit for forming a network model and includes the operator name, attribute (optional), input and output names, output shape inference method, and output dtype inference method. - Operator implementation: describes the implementation of the internal computation logic for an operator through the DSL API provided by the Tensor Boost Engine (TBE). The TBE supports the development of custom operators based on the Ascend AI chip. You can apply for Open Beta Tests (OBTs) by visiting . - Operator information: describes basic information about a TBE operator, such as the operator name and supported input and output types. It is the basis for the backend to select and map operators. @@ -38,6 +39,7 @@ This section takes a Square operator as an example to describe how to customize The primitive of an operator is a subclass inherited from `PrimitiveWithInfer`. The type name of the subclass is the operator name. 
The definition of the custom operator primitive is the same as that of the built-in operator primitive. + - The attribute is defined by the input parameter of the constructor function `__init__`. The operator in this test case has no attribute. Therefore, `__init__` has only one input parameter. For details about test cases in which operators have attributes, see [custom add3](https://gitee.com/mindspore/mindspore/blob/master/tests/st/ops/custom_ops_tbe/cus_add3.py) in the MindSpore source code. - The input and output names are defined by the `init_prim_io_names` function. - The shape inference method of the output tensor is defined in the `infer_shape` function, and the dtype inference method of the output tensor is defined in the `infer_dtype` function. @@ -75,10 +77,12 @@ To compile an operator implementation, you need to compile a computable function The computable function of an operator is mainly used to encapsulate the computation logic of the operator for the main function to call. The computation logic is implemented by calling the combined API of the TBE. The entry function of an operator describes the internal process of compiling the operator. The process is as follows: + 1. Prepare placeholders to be input. A placeholder will return a tensor object that represents a group of input data. 2. Call the computable function. The computable function uses the API provided by the TBE to describe the computation logic of the operator. 3. Call the scheduling module. The model tiles the operator data based on the scheduling description and specifies the data transfer process to ensure optimal hardware execution. By default, the automatic scheduling module (`auto_schedule`) can be used. 4. Call `cce_build_code` to compile and generate an operator binary file. 
+ > The input parameters of the entry function require the input information of each operator, output information of each operator, operator attributes (optional), and `kernel_name` (name of the generated operator binary file). The input and output information is encapsulated in dictionaries, including the input and output shape and dtype when the operator is called on the network. For details about TBE operator development, visit the [TBE website](https://support.huaweicloud.com/odevg-A800_3000_3010/atlaste_10_0063.html). For details about how to debug and optimize the TBE operator, visit the [Mind Studio website](https://support.huaweicloud.com/usermanual-mindstudioc73/atlasmindstudio_02_0043.html). @@ -88,12 +92,12 @@ For details about TBE operator development, visit the [TBE website](https://supp The operator information is key for the backend to select the operator implementation and guides the backend to insert appropriate type and format conversion operators. It uses the `TBERegOp` API for definition and uses the `op_info_register` decorator to bind the operator information to the entry function of the operator implementation. When the .py operator implementation file is imported, the `op_info_register` decorator registers the operator information to the operator information library at the backend. For details about how to use the operator information, see comments for the member method of `TBERegOp`. > The numbers and sequences of the input and output information defined in the operator information must be the same as those in the parameters of the entry function of the operator implementation and those listed in the operator primitive. - +> > If an operator has attributes, use `attr` to describe the attribute information in the operator information. The attribute names must be the same as those in the operator primitive definition. ### Example -The following takes the TBE implementation `square_impl.py` of the `Square` operator as an example. 
`square_compute` is a computable function of the operator implementation. It describes the computation logic of `x * x` by calling the API provided by `te.lang.cce`. `cus_square_op_info ` is the operator information, which is defined by `TBERegOp`. For the specific field meaning of the operator information, visit the [TBE website](https://support.huaweicloud.com/odevg-A800_3000_3010/atlaste_10_0096.html). +The following takes the TBE implementation `square_impl.py` of the `Square` operator as an example. `square_compute` is a computable function of the operator implementation. It describes the computation logic of `x * x` by calling the API provided by `te.lang.cce`. `cus_square_op_info` is the operator information, which is defined by `TBERegOp`. For the specific field meaning of the operator information, visit the [TBE website](https://support.huaweicloud.com/odevg-A800_3000_3010/atlaste_10_0096.html). Note the following parameters when setting `TBERegOp`: @@ -128,7 +132,7 @@ cus_square_op_info = TBERegOp("CusSquare") \ .output(0, "y", False, "required", "all") \ .dtype_format(DataType.F32_Default, DataType.F32_Default) \ .dtype_format(DataType.F16_Default, DataType.F16_Default) \ - .get_op_info() + .get_op_info() # Binding kernel info with the kernel implementation. @op_info_register(cus_square_op_info) @@ -185,17 +189,20 @@ def test_net(): ``` Execute the test case. -``` + +```bash pytest -s tests/st/ops/custom_ops_tbe/test_square.py::test_net ``` The execution result is as follows: -``` + +```text x: [1. 4. 9.] output: [1. 16. 81.] ``` ## Defining the bprop Function for an Operator + If an operator needs to support automatic differentiation, the bprop function needs to be defined in the primitive of the operator. In the bprop function, you need to describe the backward computation logic that uses the forward input, forward output, and output gradients to obtain the input gradients. 
The backward computation logic can be composed of built-in operators or custom backward operators. Note the following points when defining the bprop function: @@ -204,6 +211,7 @@ Note the following points when defining the bprop function: - The return value of the bprop function is tuples consisting of input gradients. The sequence of elements in a tuple is the same as that of the forward input parameters. Even if there is only one input gradient, the return value must be a tuple. For example, the `CusSquare` primitive after the bprop function is added is as follows: + ```python class CusSquare(PrimitiveWithInfer): @prim_attr_register @@ -228,6 +236,7 @@ class CusSquare(PrimitiveWithInfer): ``` Define backward cases in the `test_square.py` file. + ```python from mindspore.ops import composite as C def test_grad_net(): @@ -241,12 +250,14 @@ def test_grad_net(): ``` Execute the test case. -``` + +```bash pytest -s tests/st/ops/custom_ops_tbe/test_square.py::test_grad_net ``` The execution result is as follows: -``` + +```text x: [1. 4. 9.] dx: [2. 8. 18.] ``` diff --git a/tutorials/training/source_en/advanced_use/cv_resnet50.md b/tutorials/training/source_en/advanced_use/cv_resnet50.md index 6c59dcbbcf0ee5f11745416c9221d0c8ee879a03..a4b80602073d479d1cd86b6f228b10a9468a4ee0 100644 --- a/tutorials/training/source_en/advanced_use/cv_resnet50.md +++ b/tutorials/training/source_en/advanced_use/cv_resnet50.md @@ -26,8 +26,8 @@ Computer vision is one of the most widely researched and mature technology field This chapter describes how to apply MindSpore to computer vision scenarios based on image classification tasks. - ## Image Classification + Image classification is one of the most basic computer vision applications and belongs to the supervised learning category. For example, determine the class of a digital image, such as cat, dog, airplane, or car. 
The function is as follows: ```python @@ -42,7 +42,6 @@ MindSpore presets a typical CNN, developer can visit [model_zoo](https://gitee.c MindSpore supports the following image classification networks: LeNet, AlexNet, and ResNet. - ## Task Description and Preparation ![cifar10](images/cifar10.jpg) @@ -54,6 +53,7 @@ The CIFAR-10 dataset contains 10 classes of 60,000 images. Each class contains 6 Generally, a training indicator of image classification is accuracy, that is, a ratio of the quantity of accurately predicted examples to the total quantity of predicted examples. Next, let's use MindSpore to solve the image classification task. The overall process is as follows: + 1. Download the CIFAR-10 dataset. 2. Load and preprocess data. 3. Define a convolutional neural network. In this example, the ResNet-50 network is used. @@ -66,6 +66,7 @@ Next, let's use MindSpore to solve the image classification task. The overall pr The key parts of the task process code are explained below. ### Downloading the CIFAR-10 Dataset + CIFAR-10 dataset download address: [the website of Cifar-10 Dataset](https://www.cs.toronto.edu/~kriz/cifar.html). In this example, the data is in binary format. In the Linux environment, run the following command to download the dataset: ```shell @@ -78,7 +79,6 @@ Run the following command to decompress the dataset: tar -zvxf cifar-10-binary.tar.gz ``` - ### Data Preloading and Preprocessing 1. Load the dataset. @@ -86,10 +86,8 @@ tar -zvxf cifar-10-binary.tar.gz Data can be loaded through the built-in dataset format `Cifar10Dataset` API. > `Cifar10Dataset`: The read type is random read. The built-in CIFAR-10 dataset contains images and labels. The default image format is uint8, and the default label data format is uint32. For details, see the description of the `Cifar10Dataset` API. 
- The data loading code is as follows, where `data_home` indicates the data storage location: - ```python cifar_ds = ds.Cifar10Dataset(data_home) ``` @@ -138,7 +136,6 @@ tar -zvxf cifar-10-binary.tar.gz cifar_ds = cifar_ds.repeat(repeat_num) ``` - ### Defining the CNN CNN is a standard algorithm for image classification tasks. CNN uses a layered structure to perform feature extraction on an image, and is formed by stacking a series of network layers, such as a convolutional layer, a pooling layer, and an activation layer. @@ -153,10 +150,8 @@ network = resnet50(class_num=10) For more information about ResNet, see [ResNet Paper](https://arxiv.org/abs/1512.03385). - ### Defining the Loss Function and Optimizer - A loss function and an optimizer need to be defined. The loss function is a training objective of the deep learning, and is also referred to as an objective function. The loss function indicates the distance between a logit of a neural network and a label, and is scalar data. Common loss functions include mean square error, L2 loss, Hinge loss, and cross entropy. Cross entropy is usually used for image classification. @@ -173,7 +168,6 @@ ls = SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean") opt = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), 0.01, 0.9) ``` - ### Calling the High-level `Model` API To Train and Save the Model File After data preprocessing, network definition, and loss function and optimizer definition are complete, model training can be performed. Model training involves two iterations: multi-round iteration (`epoch`) of datasets and single-step iteration based on the batch size of datasets. The single-step iteration refers to extracting data from a dataset by `batch`, inputting the data to a network to calculate a loss function, and then calculating and updating a gradient of training parameters by using an optimizer. 
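The two nested iterations described above (epochs over the dataset, single steps over batches) can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use the MindSpore `Model.train` API, and the helper names `batches` and `train`, the one-parameter model `y = w * x`, and the toy data are hypothetical. The learning rate 0.01 and momentum 0.9 mirror the `Momentum` arguments shown in the tutorial.

```python
# Illustrative sketch only: plain Python, not the MindSpore `Model.train` API.
# The model (y = w * x), the data, and the helper names are assumptions.

def batches(samples, batch_size):
    """Yield consecutive mini-batches from a list."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


def train(samples, epochs=200, batch_size=2, lr=0.01, momentum=0.9):
    """Fit y = w * x with mean squared error and momentum updates."""
    w = 0.0         # the single trainable parameter
    velocity = 0.0  # accumulated update used by the momentum rule
    history = []    # mean loss per epoch
    for _ in range(epochs):                         # multi-round iteration (epoch)
        epoch_loss, steps = 0.0, 0
        for batch in batches(samples, batch_size):  # single-step iteration (one batch)
            # Forward pass: mean squared error on this batch.
            loss = sum((w * x - y) ** 2 for x, y in batch) / len(batch)
            # Backward pass: d(loss)/dw on the same batch.
            grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
            # Optimizer step: momentum update of the parameter.
            velocity = momentum * velocity - lr * grad
            w += velocity
            epoch_loss += loss
            steps += 1
        history.append(epoch_loss / steps)
    return w, history


data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]  # targets follow y = 3x
weight, losses = train(data)
print("learned w:", round(weight, 3))
print("first and last epoch loss:", losses[0], losses[-1])
```

In the tutorial itself, the high-level `Model` API is intended to take care of both loops; the sketch only makes explicit which work belongs to an epoch and which to a step.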
@@ -210,8 +204,6 @@ res = model.eval(eval_dataset) print("result: ", res) ``` - - ## References -[1] https://www.cs.toronto.edu/~kriz/cifar.html +[1] <https://www.cs.toronto.edu/~kriz/cifar.html> diff --git a/tutorials/training/source_en/advanced_use/cv_resnet50_second_order_optimizer.md b/tutorials/training/source_en/advanced_use/cv_resnet50_second_order_optimizer.md index b91878138c5d3f4ffd388becda17a9a9ab4de34c..8f2919d643678cb02f2ba20c2d6f23669446945c 100644 --- a/tutorials/training/source_en/advanced_use/cv_resnet50_second_order_optimizer.md +++ b/tutorials/training/source_en/advanced_use/cv_resnet50_second_order_optimizer.md @@ -37,7 +37,6 @@ Common optimization algorithms are classified into the first-order and the secon Based on the existing natural gradient algorithm, MindSpore development team uses optimized acceleration methods such as approximation and sharding for the FIM, greatly reducing the computation complexity of the inverse matrix and developing the available second-order optimizer THOR. With eight Ascend 910 AI processors, THOR can complete the training of ResNet-50 v1.5 network and ImageNet dataset within 72 minutes, which is nearly twice the speed of SGD+Momentum. - This tutorial describes how to use the second-order optimizer THOR provided by MindSpore to train the ResNet-50 v1.5 network and ImageNet dataset on Ascend 910 and GPU.
> Download address of the complete code example: @@ -47,12 +46,12 @@ Directory Structure of Code Examples ```shell ├── resnet_thor ├── README.md - ├── scripts + ├── scripts ├── run_distribute_train.sh # launch distributed training for Ascend 910 └── run_eval.sh # launch inference for Ascend 910 ├── run_distribute_train_gpu.sh # launch distributed training for GPU └── run_eval_gpu.sh # launch inference for GPU - ├── src + ├── src ├── crossentropy.py # CrossEntropy loss function ├── config.py # parameter configuration ├── dataset_helper.py # dataset helper for minddata dataset @@ -61,20 +60,19 @@ Directory Structure of Code Examples ├── resnet_thor.py # resnet50_thor backone ├── thor.py # thor optimizer ├── thor_layer.py # thor layer - └── dataset.py # data preprocessing + └── dataset.py # data preprocessing ├── eval.py # infer script └── train.py # train script - ``` The overall execution process is as follows: + 1. Prepare the ImageNet dataset and process the required dataset. 2. Define the ResNet-50 network. 3. Define the loss function and the optimizer THOR. 4. Load the dataset and perform training. After the training is complete, check the result and save the model file. 5. Load the saved model for inference. - ## Preparation Ensure that MindSpore has been correctly installed. If not, install it by referring to [Install](https://www.mindspore.cn/install/en). @@ -85,7 +83,7 @@ Download the complete ImageNet2012 dataset, decompress the dataset, and save it The directory structure is as follows: -``` +```text └─ImageNet2012 ├─ilsvrc │ n03676483 @@ -97,19 +95,22 @@ The directory structure is as follows: │ n02504013 │ n07871810 │ ...... 
- ``` + ### Configuring Distributed Environment Variables + #### Ascend 910 + For details about how to configure the distributed environment variables of Ascend 910 AI processors, see [Parallel Distributed Training (Ascend)](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_ascend.html#configuring-distributed-environment-variables). #### GPU -For details about how to configure the distributed environment of GPUs, see [Parallel Distributed Training (GPU)](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_gpu.html#configuring-distributed-environment-variables). +For details about how to configure the distributed environment of GPUs, see [Parallel Distributed Training (GPU)](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_gpu.html#configuring-distributed-environment-variables). ## Loading the Dataset During distributed training, load the dataset in parallel mode and process it through the data argumentation API provided by MindSpore. The `src/dataset.py` script in the source code is for loading and processing the dataset. + ```python import os import mindspore.common.dtype as mstype @@ -165,17 +166,18 @@ def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32, target=" > MindSpore supports multiple data processing and augmentation operations, which are usually combined. For details, see [Data Processing](https://www.mindspore.cn/tutorial/training/en/master/use/data_preparation.html). - ## Defining the Network + Use the ResNet-50 v1.5 network model as an example. Define the [ResNet-50 network](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/resnet/src/resnet.py), and replace the `Conv2d` and `Dense` operators with the operators customized by the second-order optimizer. 
The defined network model stores in the `src/resnet_thor.py` script in the source code, and the customized operators `Conv2d_thor` and `Dense_thor` store in the `src/thor_layer.py` script. -- Use `Conv2d_thor` to replace `Conv2d` in the original network model. -- Use `Dense_thor` to replace `Dense` in the original network model. +- Use `Conv2d_thor` to replace `Conv2d` in the original network model. +- Use `Dense_thor` to replace `Dense` in the original network model. > The `Conv2d_thor` and `Dense_thor` operators customized by THOR are used to save the second-order matrix information in model training. The backbone of the newly defined network is the same as that of the original network model. After the network is built, call the defined ResNet-50 in the `__main__` function. + ```python ... from src.resnet_thor import resnet50 @@ -188,15 +190,14 @@ if __name__ == "__main__": ... ``` - ## Defining the Loss Function and Optimizer THOR - ### Defining the Loss Function Loss functions supported by MindSpore include `SoftmaxCrossEntropyWithLogits`, `L1Loss`, and `MSELoss`. The `SoftmaxCrossEntropyWithLogits` loss function is required by THOR. The implementation procedure of the loss function is in the `src/crossentropy.py` script. A common trick in deep network model training, label smoothing, is used to improve the model tolerance to error label classification by smoothing real labels, thereby improving the model generalization capability. + ```python class CrossEntropy(_Loss): """CrossEntropy""" @@ -214,6 +215,7 @@ class CrossEntropy(_Loss): loss = self.mean(loss, 0) return loss ``` + Call the defined loss function in the `__main__` function. 
```python @@ -236,6 +238,7 @@ The parameter update formula of THOR is as follows: $$ \theta^{t+1} = \theta^t + \alpha F^{-1}\nabla E$$ The meanings of parameters in the formula are as follows: + - $\theta$: trainable parameters on the network - $t$: number of training steps - $\alpha$: learning rate, which is the parameter update value per step @@ -296,7 +299,6 @@ if __name__ == "__main__": Use the `model.train` API provided by MindSpore to easily train the network. THOR reduces the computation workload and improves the computation speed by reducing the frequency of updating the second-order matrix. Therefore, the Model_Thor class is redefined to inherit the Model class provided by MindSpore. The parameter for controlling the frequency of updating the second-order matrix is added to the Model_Thor class. You can adjust this parameter to optimize the overall performance. - ```python ... from mindspore.train.loss_scale_manager import FixedLossScaleManager @@ -316,15 +318,21 @@ if __name__ == "__main__": ``` ### Running the Script + After the training script is defined, call the shell script in the `scripts` directory to start the distributed training process. + #### Ascend 910 + Currently, MindSpore distributed execution on Ascend uses the single-device single-process running mode. That is, one process runs on a device, and the number of total processes is the same as the number of devices that are being used. For device 0, the corresponding process is executed in the foreground. For other devices, the corresponding processes are executed in the background. Create a directory named `train_parallel`+`device_id` for each process to store log information, operator compilation information, and training checkpoint files. The following takes the distributed training script for eight devices as an example to describe how to run the script: Run the script. 
-``` + +```bash sh run_distribute_train.sh [RANK_TABLE_FILE] [DATASET_PATH] [DEVICE_NUM] ``` + Variables `RANK_TABLE_FILE`, `DATASET_PATH`, and `DEVICE_NUM` need to be transferred to the script. The meanings of variables are as follows: + - `RANK_TABLE_FILE`: path for storing the networking information file - `DATASET_PATH`: training dataset path - `DEVICE_NUM`: the actual number of running devices. @@ -361,17 +369,22 @@ In the preceding information, `*.ckpt` indicates the saved model parameter file. The name of a checkpoint file is in the following format: *Network name*-*Number of epochs*_*Number of steps*.ckpt. #### GPU + On the GPU hardware platform, MindSpore uses `mpirun` of OpenMPI to perform distributed training. The process creates a directory named `train_parallel` to store log information and training checkpoint files. The following takes the distributed training script for eight devices as an example to describe how to run the script: -``` + +```bash sh run_distribute_train_gpu.sh [DATASET_PATH] [DEVICE_NUM] ``` + Variables `DATASET_PATH` and `DEVICE_NUM` need to be transferred to the script. The meanings of variables are as follows: + - `DATASET_PATH`: training dataset path - `DEVICE_NUM`: the actual number of running devices During GPU-based training, the `DEVICE_ID` environment variable is not required. Therefore, you do not need to call `int(os.getenv('DEVICE_ID'))` in the main training script to obtain the device ID or transfer `device_id` to `context`. You need to set `device_target` to `GPU` and call `init()` to enable the NCCL. The following is an example of loss values output during training: + ```bash ... epoch: 1 step: 5004, loss is 4.2546034 @@ -391,7 +404,7 @@ The following is an example of model files saved after training: ├─ckpt_0 ├─resnet-1_5004.ckpt ├─resnet-2_5004.ckpt - │ ...... + │ ...... ├─resnet-36_5004.ckpt │ ...... ...... 
@@ -435,40 +448,54 @@ if __name__ == "__main__": # define model model = Model(net, loss_fn=loss, metrics={'top_1_accuracy', 'top_5_accuracy'}) - + # eval model res = model.eval(dataset) print("result:", res, "ckpt=", args_opt.checkpoint_path) ``` ### Inference + After the inference network is defined, the shell script in the `scripts` directory is called for inference. + #### Ascend 910 + On the Ascend 910 hardware platform, run the following inference command: -``` + +```bash sh run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] ``` + Variables `DATASET_PATH` and `CHECKPOINT_PATH` need to be transferred to the script. The meanings of variables are as follows: + - `DATASET_PATH`: inference dataset path - `CHECKPOINT_PATH`: path for storing the checkpoint file Currently, a single device (device 0 by default) is used for inference. The inference result is as follows: -``` + +```text result: {'top_5_accuracy': 0.9295574583866837, 'top_1_accuracy': 0.761443661971831} ckpt=train_parallel0/resnet-42_5004.ckpt ``` + - `top_5_accuracy`: For an input image, if the labels whose prediction probability ranks top 5 contain actual labels, the classification is correct. -- `top_1_accuracy`: For an input image, if the label with the highest prediction probability is the same as the actual label, the classification is correct. +- `top_1_accuracy`: For an input image, if the label with the highest prediction probability is the same as the actual label, the +classification is correct. + #### GPU On the GPU hardware platform, run the following inference command: -``` + +```bash sh run_eval_gpu.sh [DATASET_PATH] [CHECKPOINT_PATH] ``` + Variables `DATASET_PATH` and `CHECKPOINT_PATH` need to be transferred to the script. 
The meanings of variables are as follows: + - `DATASET_PATH`: inference dataset path - `CHECKPOINT_PATH`: path for storing the checkpoint file The inference result is as follows: -``` + +```text result: {'top_5_accuracy': 0.9287972151088348, 'top_1_accuracy': 0.7597031049935979} ckpt=train_parallel/resnet-36_5004.ckpt -``` \ No newline at end of file +``` diff --git a/tutorials/training/source_en/advanced_use/dashboard.md b/tutorials/training/source_en/advanced_use/dashboard.md index 739e117a1637cd12eb7d0ab8ba144f8fb5e1b8e6..c2709a946a74f5da0b17ba53e5d9d8c2720ca100 100644 --- a/tutorials/training/source_en/advanced_use/dashboard.md +++ b/tutorials/training/source_en/advanced_use/dashboard.md @@ -170,7 +170,6 @@ Figure 13 shows tensors recorded by a user in a form of a histogram. Click the u ## Notices - 1. Currently MindSpore supports recording computational graph after operator fusion for Ascend 910 AI processor only. 2. When using the Summary operator to collect data in training, 'HistogramSummary' operator will affect performance, so please use as few as possible. @@ -196,6 +195,6 @@ Figure 13 shows tensors recorded by a user in a form of a histogram. Click the u Remarks: The method of estimating the space usage of `TensorSummary` is as follows: - The size of a `TensorSummary` data = the number of values in the tensor * 4 bytes. Assuming that the size of the tensor recorded by `TensorSummary` is 32 * 1 * 256 * 256, then a `TensorSummary` data needs about 32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8MiB. `TensorSummary` will record data of 20 steps by default. Then the required space when recording these 20 sets of data is about 20 * 8 MiB = 160MiB. It should be noted that due to the overhead of data structure and other factors, the actual storage space used will be slightly larger than 160MiB. + The size of a `TensorSummary` data = the number of values in the tensor \* 4 bytes. 
Assuming that the size of the tensor recorded by `TensorSummary` is 32 \* 1 \* 256 \* 256, then a `TensorSummary` data needs about 32 \* 1 \* 256 \* 256 \* 4 bytes = 8,388,608 bytes = 8MiB. `TensorSummary` will record data of 20 steps by default. Then the required space when recording these 20 sets of data is about 20 * 8 MiB = 160MiB. It should be noted that due to the overhead of data structure and other factors, the actual storage space used will be slightly larger than 160MiB. -6. The training log file is large when using `TensorSummary` because the complete tensor data is recorded. MindInsight needs more time to parse the training log file, please be patient. \ No newline at end of file +6. The training log file is large when using `TensorSummary` because the complete tensor data is recorded. MindInsight needs more time to parse the training log file, please be patient. diff --git a/tutorials/training/source_en/advanced_use/debug_in_pynative_mode.md b/tutorials/training/source_en/advanced_use/debug_in_pynative_mode.md index c705b21fc86e377ced59c9f47769ecbf40ec25fc..063c565e79ac0d3664edfc4821f043ff112e664c 100644 --- a/tutorials/training/source_en/advanced_use/debug_in_pynative_mode.md +++ b/tutorials/training/source_en/advanced_use/debug_in_pynative_mode.md @@ -26,7 +26,7 @@ By default, MindSpore is in PyNative mode. You can switch it to the graph mode b In PyNative mode, single operators, common functions, network inference, and separated gradient calculation can be executed. The following describes the usage and precautions. -> In PyNative mode, operators are executed asynchronously on the device to improve performance. Therefore, when an error occurs during operator excution, the error information may be displayed after the program is executed. +> In PyNative mode, operators are executed asynchronously on the device to improve performance. 
Therefore, when an error occurs during operator execution, the error information may be displayed after the program is executed. ## Executing a Single Operator @@ -73,12 +73,12 @@ Output: [ 0.05016355 0.03958241 0.03958241 0.03958241 0.03443141]]]] ``` - ## Executing a Common Function Combine multiple operators into a function, call the function to execute the operators, and output the result, as shown in the following example: -**Example Code** +**Example Code:** + ```python import numpy as np from mindspore import context, Tensor @@ -97,9 +97,9 @@ output = tensor_add_func(x, y) print(output.asnumpy()) ``` -**Output** +**Output:** -```python +```text [[3. 3. 3.] [3. 3. 3.] [3. 3. 3.]] @@ -107,7 +107,6 @@ print(output.asnumpy()) > Parallel execution and summary are not supported in PyNative mode, so parallel and summary related operators cannot be used. - ### Improving PyNative Performance MindSpore provides the Staging function to improve the execution speed of inference tasks in PyNative mode. This function compiles Python functions or Python class methods into computational graphs in PyNative mode and improves the execution speed by using graph optimization technologies, as shown in the following example: @@ -140,9 +139,10 @@ tensor_add = P.TensorAdd() res = tensor_add(x, z) # PyNative mode print(res.asnumpy()) ``` -**Output** -```python +**Output:** + +```text [[3. 3. 3. 3.] [3. 3. 3. 3.] [3. 3. 3. 3.] 
@@ -153,7 +153,7 @@ In the preceding code, the `ms_function` decorator is added before `construct` o It should be noted that, in a function to which the `ms_function` decorator is added, if an operator (such as `pooling` or `tensor_add`) that does not need parameter training is included, the operator can be directly called in the decorated function, as shown in the following example: -**Example Code** +**Example Code:** ```python import numpy as np @@ -176,9 +176,10 @@ y = Tensor(np.ones([4, 4]).astype(np.float32)) z = tensor_add_fn(x, y) print(z.asnumpy()) ``` -**Output** -```shell +**Output:** + +```text [[2. 2. 2. 2.] [2. 2. 2. 2.] [2. 2. 2. 2.] @@ -187,7 +188,7 @@ print(z.asnumpy()) If the decorated function contains operators (such as `Convolution` and `BatchNorm`) that require parameter training, these operators must be instantiated before the decorated function is called, as shown in the following example: -**Example Code** +**Example Code:** ```python import numpy as np @@ -209,9 +210,9 @@ z = conv_fn(Tensor(input_data)) print(z.asnumpy()) ``` -**Output** +**Output:** -```shell +```text [[[[ 0.10377571 -0.0182163 -0.05221086] [ 0.1428334 -0.01216263 0.03171652] [-0.00673915 -0.01216291 0.02872104]] @@ -245,12 +246,11 @@ print(z.asnumpy()) [ 0.0377498 -0.06117418 0.00546303]]]] ``` - ## Debugging Network Train Model In PyNative mode, the gradient can be calculated separately. As shown in the following example, `GradOperation` is used to calculate all input gradients of the function or the network. Note that the inputs have to be Tensor. 
-**Example Code** +**Example Code:** ```python from mindspore.ops import composite as C @@ -267,15 +267,15 @@ def mainf(x, y): print(mainf(Tensor(1, mstype.int32), Tensor(2, mstype.int32))) ``` -**Output** +**Output:** -```python +```text (2, 1) ``` During network training, obtain the gradient, call the optimizer to optimize parameters (the breakpoint cannot be set during the reverse gradient calculation), and calculate the loss values. Then, network training is implemented in PyNative mode. -**Complete LeNet Sample Code** +**Complete LeNet Sample Code:** ```python import numpy as np @@ -312,7 +312,7 @@ class LeNet5(nn.Cell): Lenet network Args: num_class (int): Num classes. Default: 10. - + Returns: Tensor, output tensor @@ -346,8 +346,8 @@ class LeNet5(nn.Cell): x = self.relu(x) x = self.fc3(x) return x - - + + class GradWrap(nn.Cell): """ GradWrap definition """ def __init__(self, network): @@ -376,9 +376,9 @@ loss = loss_output.asnumpy() print(loss) ``` -**Output** +**Output:** -```python +```text 2.3050091 ``` diff --git a/tutorials/training/source_en/advanced_use/debugger.md b/tutorials/training/source_en/advanced_use/debugger.md index 13528eb1f156e387bd8f4ba04b5be4269c3010b6..2000412df467cb5e61a5acbee08a2e32ffbb2a65 100644 --- a/tutorials/training/source_en/advanced_use/debugger.md +++ b/tutorials/training/source_en/advanced_use/debugger.md @@ -29,19 +29,19 @@ In `Graph Mode` training, the computation results of intermediate nodes in the c - Visualize the computational graph on the UI and analyze the output of the graph node; - Set a conditional breakpoint to monitor training exceptions (such as INF), if the condition is met, users can track the cause of the bug when an exception occurs; -- Visualize and analyze the change of parameters, such as weights. +- Visualize and analyze the change of parameters, such as weights. 
## Operation Process - Launch MindInsight in debugger mode, and set Debugger environment variables for the training; - At the beginning of the training, set conditional breakpoints; -- Analyze the training progress on MindInsight Debugger UI. +- Analyze the training progress on MindInsight Debugger UI. ## Debugger Environment Preparation At first, install MindInsight and launch it in debugger mode. MindSpore will send training information to MindInsight Debugger Server in debugger mode, users can analyze the information on MindInsight UI. -The command to launch MindInsight in debugger mode is as follows: +The command to launch MindInsight in debugger mode is as follows: ```shell mindinsight start --port {PORT} --enable-debugger True --debugger-port {DEBUGGER_PORT} @@ -67,7 +67,7 @@ Besides, do not use dataset sink mode (Set the parameter `dataset_sink_mode` in ## Debugger UI Introduction -After the Debugger environment preparation, users can run the training script. +After the Debugger environment preparation, users can run the training script. Before the execution of the computational graph, the MindInsight Debugger UI will show the information of the optimized computational graph. The following are the Debugger UI components. @@ -103,22 +103,22 @@ Figure 2: The Graph Node Details When choosing one node on the graph, the details of this node will be displayed at the bottom. The `Tensor Value Overview` area will show the input nodes and the outputs of this node. The `Type`, `Shape` and `Value` of the `Tensor` can also be viewed. -For GPU environment, after selecting an executable node on the graph, right-click to select `Continue to` on this node, -which means running the training script to the selected node within one step. +For GPU environment, after selecting an executable node on the graph, right-click to select `Continue to` on this node, +which means running the training script to the selected node within one step. 
After left-click `Continue to`, the training script will be executed and paused after running to this node. ![debugger_tensor_value](./images/debugger_tensor_value.png) Figure 3: `Tensor` Value Visualization -Some outputs of the node contain too many dimensions. +Some outputs of the node contain too many dimensions. For these `Tensors`, users can click the `View` link and visualize the `Tensor` in the new panel, which is shown in Figure 3. ![debugger_tensor_compare](./images/debugger_tensor_compare.png) Figure 4: Previous Step Value Compare For Parameter Nodes -In addition, the output of the parameter nodes can be compared with their output in the previous step. +In addition, the output of the parameter nodes can be compared with their output in the previous step. Click the `Compare with Previous Step` button to enter the comparison interface, as shown in Figure 4. ### Conditional Breakpoint @@ -127,13 +127,14 @@ Click the `Compare with Previous Step` button to enter the comparison interface, Figure 5: Set Conditional Breakpoint (Watch Point) -In order to monitor the training and find out the bugs, users can set conditional breakpoints (called `Watch Point List` on UI) to analyze the outputs of the +In order to monitor the training and find out the bugs, users can set conditional breakpoints (called `Watch Point List` on UI) to analyze the outputs of the specified nodes automatically. Figure 5 displays how to set a `Watch Point`: + - At first, click the `+` button on the upper right corner, and then choose a watch condition; - Select the nodes to be watched in the `Node List`, tick the boxes in the front of the chosen nodes; - Click the `OK` button to add this `Watch Point`. -The outputs of the watched nodes will be checked by the corresponding conditions. Once the condition is satisfied, the training will pause, and users can analyze +The outputs of the watched nodes will be checked by the corresponding conditions. 
Once the condition is satisfied, the training will pause, and users can analyze the triggered `Watch Point List` on the Debugger UI. ![debugger_watch_point_hit](./images/debugger_watch_point_hit.png) @@ -146,7 +147,7 @@ Users can further trace the reason of the bug by analyzing the node details. ### Training Control -At the bottom of the watchpoint setting panel is the training control panel, which shows the training control functions of the debugger, +At the bottom of the watchpoint setting panel is the training control panel, which shows the training control functions of the debugger, with four buttons: `CONTINUE`, `PAUSE`, `TERMINATE` and `OK`: - `OK` stands for executing the training for several steps, the number of the `step` can be specified in the above bar. @@ -160,20 +161,20 @@ The training will be paused until the `Watch Point List` is triggered, or the nu 1. Prepare the debugger environment, and open the MindInsight Debugger UI. ![debugger_waiting](./images/debugger_waiting.png) - + Figure 7: Debugger Start and Waiting for the Training - + The Debugger server is launched and waiting for the training to connect. 2. Run the training script, after a while, the computational graph will be displayed on Debugger UI, as shown in Figure 1. 3. Set conditional breakpoints for the training, as shown in Figure 5. - + In Figure 5, the conditions are selected, and some nodes are watched, which means whether there is any output meeting the conditions in the training process of these nodes. After setting the conditional breakpoint, users can set steps in the control panel and click `OK` or `CONTINUE` to continue training. 4. The conditional breakpoints are triggered, as shown in Figure 6. - + When the conditional breakpoints are triggered, users can analyze the corresponding node details to find out the reason of the bug. 
## Notices diff --git a/tutorials/training/source_en/advanced_use/distributed_training_ascend.md b/tutorials/training/source_en/advanced_use/distributed_training_ascend.md index 6e3bc78be0a939fe5cdfc471c43fb461a4bfedcf..8bf810bc6ddf3fedaa8344091edc28d58bbd3706 100644 --- a/tutorials/training/source_en/advanced_use/distributed_training_ascend.md +++ b/tutorials/training/source_en/advanced_use/distributed_training_ascend.md @@ -12,6 +12,8 @@ - [Calling the Collective Communication Library](#calling-the-collective-communication-library) - [Loading the Dataset in Data Parallel Mode](#loading-the-dataset-in-data-parallel-mode) - [Defining the Network](#defining-the-network) + - [Hybrid Parallel Mode](#hybrid-parallel-mode) + - [Semi Auto Parallel Mode](#semi-auto-parallel-mode) - [Defining the Loss Function and Optimizer](#defining-the-loss-function-and-optimizer) - [Defining the Loss Function](#defining-the-loss-function) - [Defining the Optimizer](#defining-the-optimizer) @@ -32,6 +34,8 @@ This tutorial describes how to train the ResNet-50 network in data parallel and automatic parallel modes on MindSpore based on the Ascend 910 AI processor. > Download address of the complete sample code: +Besides, we describe the usages of hybrid parallel and semi-auto parallel modes in the sections [Defining the Network](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_ascend.html#defining-the-network) and [Distributed Training Model Parameters Saving and Loading](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/distributed_training_ascend.html#distributed-training-model-parameters-saving-and-loading). + ## Preparations ### Downloading the Dataset @@ -70,6 +74,7 @@ The following uses the Ascend 910 AI processor as an example. The JSON configura "status": "completed" } ``` + The following parameters need to be modified based on the actual training environment: - `server_count`: number of hosts. 
@@ -78,13 +83,14 @@ The following parameters need to be modified based on the actual training enviro - `device_ip`: IP address of the integrated NIC. You can run the `cat /etc/hccn.conf` command on the current host. The key value of `address_x` is the IP address of the NIC. - `rank_id`: logical sequence number of a device, which starts from 0. - ### Calling the Collective Communication Library The Huawei Collective Communication Library (HCCL) is used for the communication of MindSpore parallel distributed training and can be found in the Ascend 310 AI processor software package. In addition, `mindspore.communication.management` encapsulates the collective communication API provided by the HCCL to help users configure distributed information. > HCCL implements multi-device multi-node communication based on the Ascend AI processor. The common restrictions on using the distributed service are as follows. For details, see the HCCL documentation. +> > - In a single-node system, a cluster of 1, 2, 4, or 8 devices is supported. In a multi-node system, a cluster of 8 x N devices is supported. > - Each host has four devices numbered 0 to 3 and four devices numbered 4 to 7 deployed on two different networks. During training of 2 or 4 devices, the devices must be connected and clusters cannot be created across networks. +> - When we create a multi-node system, all nodes should use one same exchanger. > - The server hardware architecture and operating system require the symmetrical multi-processing (SMP) mode. The sample code for calling the HCCL as follows: @@ -97,10 +103,11 @@ from mindspore.communication.management import init if __name__ == "__main__": context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=int(os.environ["DEVICE_ID"])) init() - ... + ... ``` In the preceding code: + - `mode=context.GRAPH_MODE`: sets the running mode to graph mode for distributed training. (The PyNative mode does not support parallel running.) 
- `device_id`: physical sequence number of a device, that is, the actual sequence number of the device on the corresponding host. - `init`: enables HCCL communication and completes the distributed training initialization. @@ -109,7 +116,6 @@ In the preceding code: During distributed training, data is imported in data parallel mode. The following takes the CIFAR-10 dataset as an example to describe how to import the CIFAR-10 dataset in data parallel mode. `data_path` indicates the dataset path, which is also the path of the `cifar-10-batches-bin` folder. - ```python import mindspore.common.dtype as mstype import mindspore.dataset as ds @@ -122,12 +128,12 @@ def create_dataset(data_path, repeat_num=1, batch_size=32, rank_id=0, rank_size= resize_width = 224 rescale = 1.0 / 255.0 shift = 0.0 - + # get rank_id and rank_size rank_id = get_rank() rank_size = get_group_size() data_set = ds.Cifar10Dataset(data_path, num_shards=rank_size, shard_id=rank_id) - + # define map operations random_crop_op = vision.RandomCrop((32, 32), (4, 4, 4, 4)) random_horizontal_op = vision.RandomHorizontalFlip() @@ -155,13 +161,76 @@ def create_dataset(data_path, repeat_num=1, batch_size=32, rank_id=0, rank_size= return data_set ``` + Different from the single-node system, the multi-node system needs to transfer the `num_shards` and `shard_id` parameters to the dataset API. The two parameters correspond to the number of devices and logical sequence numbers of devices, respectively. You are advised to obtain the parameters through the HCCL API. + - `get_rank`: obtains the ID of the current device in the cluster. - `get_group_size`: obtains the number of devices. ## Defining the Network -In data parallel and automatic parallel modes, the network definition method is the same as that in a single-node system. The reference code is as follows: +In data parallel and automatic parallel modes, the network definition method is the same as that in a single-node system. 
The reference code of ResNet is as follows: + +This section focuses on how to define a network in hybrid parallel or semi-auto parallel mode. + +### Hybrid Parallel Mode + +Hybrid parallel mode builds on data parallel mode by adding the `layerwise_parallel` setting for a `parameter`. A `parameter` with this setting is stored and computed as a sliced tensor, and its gradients are not aggregated. In this mode, MindSpore does not automatically infer the computation and communication of parallel operators. To keep the calculation logic consistent, users must work out the extra operations manually and insert them into the network. This parallel mode is therefore suited to users with a deep understanding of parallel theory. + +In the following example, `self.weight` is specified as `layerwise_parallel`, that is, `self.weight` and the output of `MatMul` are sliced on the second dimension. Performing `ReduceSum` on the second dimension then yields only the partial result for the local slice on each device, so `AllReduce.Sum` is required to accumulate the results across all devices. For more information about the parallel theory, refer to the [design doc](https://www.mindspore.cn/doc/note/en/master/design/mindspore/distributed_training_design.html). 
+ +```python +import numpy as np +from mindspore import Tensor, Parameter +import mindspore.ops as ops +import mindspore.common.dtype as mstype +import mindspore.nn as nn + +class HybridParallelNet(nn.Cell): + def __init__(self): + super(HybridParallelNet, self).__init__() + # initialize the weight which is sliced at the second dimension + weight_init = np.random.rand(512, 128 // 2).astype(np.float32) + self.weight = Parameter(Tensor(weight_init), name="weight", layerwise_parallel=True) + self.fc = ops.MatMul() + self.reduce = ops.ReduceSum() + self.allreduce = ops.AllReduce(op='sum') + + def construct(self, x): + x = self.fc(x, self.weight) + x = self.reduce(x, -1) + x = self.allreduce(x) + return x +``` + +### Semi Auto Parallel Mode + +Compared with the auto parallel mode, semi auto parallel mode supports manual configuration of shard strategies for network tuning. For the definition of shard strategies, refer to this [design doc](https://www.mindspore.cn/doc/note/en/master/design/mindspore/distributed_training_design.html). + +Note that operators without shard strategies are regarded as data parallel. If a parameter is used by multiple operators, each operator's shard strategy for this parameter must be consistent; otherwise, an error is reported. + +For the `HybridParallelNet` example above, the script in semi auto parallel mode is as follows. The shard strategy of `MatMul` is `((1, 1), (1, 2))`, which means `self.weight` is sliced at the second dimension. 
+ +```python +import numpy as np +from mindspore import Tensor, Parameter +import mindspore.ops as ops +import mindspore.common.dtype as mstype +import mindspore.nn as nn + +class SemiAutoParallelNet(nn.Cell): + def __init__(self): + super(SemiAutoParallelNet, self).__init__() + # initialize full tensor weight + weight_init = np.random.rand(512, 128).astype(np.float32) + self.weight = Parameter(Tensor(weight_init), name="weight") + # set shard strategy + self.fc = ops.MatMul().shard(((1, 1), (1, 2))) + self.reduce = ops.ReduceSum() + + def construct(self, x): + x = self.fc(x, self.weight) + x = self.reduce(x, -1) + return x +``` ## Defining the Loss Function and Optimizer @@ -195,7 +264,7 @@ class SoftmaxCrossEntropyExpand(nn.Cell): self.sparse = sparse self.max = P.ReduceMax(keep_dims=True) self.sub = P.Sub() - + def construct(self, logit, label): logit_max = self.max(logit, -1) exp = self.exp(self.sub(logit, logit_max)) @@ -252,11 +321,14 @@ def test_train_cifar(epoch_size=10): model = Model(net, loss_fn=loss, optimizer=opt) model.train(epoch_size, dataset, callbacks=[loss_cb], dataset_sink_mode=True) ``` + In the preceding code: + - `dataset_sink_mode=True`: uses the dataset sink mode. That is, the training computing is sunk to the hardware platform for execution. - `LossMonitor`: returns the loss value through the callback function to monitor the loss function. ## Running the Script + After the script required for training is edited, run the corresponding command to call the script. Currently, MindSpore distributed execution uses the single-device single-process running mode. That is, one process runs on each device, and the number of total processes is the same as the number of devices that are being used. For device 0, the corresponding process is executed in the foreground. For other devices, the corresponding processes are executed in the background. You need to create a directory for each process to store log information and operator compilation information. 
The following takes the distributed training script for eight devices as an example to describe how to run the script: @@ -318,6 +390,7 @@ cd ../ The variables `DATA_PATH` and `RANK_SIZE` need to be transferred to the script, which indicate the path of the dataset and the number of devices, respectively. The necessary environment variables are as follows: + - `RANK_TABLE_FILE`: path for storing the networking information file. - `DEVICE_ID`: actual sequence number of the current device on the corresponding host. - `RANK_ID`: logical sequence number of the current device. @@ -327,7 +400,7 @@ The running time is about 5 minutes, which is mainly occupied by operator compil Log files are saved in the `device` directory. The `env.log` file records environment variable information. The `train.log` file records the loss function information. The following is an example: -``` +```text epoch: 1 step: 156, loss is 2.0084016 epoch: 2 step: 156, loss is 1.6407638 epoch: 3 step: 156, loss is 1.6164391 @@ -417,7 +490,7 @@ strategy = ((1, 1), (1, 8)) net = DataParallelNet(strategy=strategy) # reset parallel mode context.reset_auto_parallel_context() -# set parallel mode, data parallel mode is selected for training and model saving. If you want to choose auto parallel +# set parallel mode, data parallel mode is selected for training and model saving. If you want to choose auto parallel # mode, you can simply change the value of parallel_mode parameter to ParallelMode.AUTO_PARALLEL. context.set_auto_parallel_context(parallel_mode=ParallelMode.DATA_PARALLEL, device_num=8) ``` @@ -478,10 +551,10 @@ strategy = ((1, 1), (1, 8)) net = SemiAutoParallelNet(strategy=strategy, strategy2=strategy) # reset parallel mode context.reset_auto_parallel_context() -# set parallel mode, data parallel mode is selected for training and model saving. If you want to choose auto parallel +# set parallel mode, data parallel mode is selected for training and model saving. 
If you want to choose auto parallel # mode, you can simply change the value of parallel_mode parameter to ParallelMode.AUTO_PARALLEL. context.set_auto_parallel_context(parallel_mode=ParallelMode.SEMI_AUTO_PARALLEL, - strategy_ckpt_save_file='./rank_{}_ckpt/strategy.txt'.format(get_rank)) + strategy_ckpt_save_file='./rank_{}_ckpt/strategy.txt'.format(get_rank())) ``` Then set the checkpoint saving policy, optimizer and loss function as required. The code is as follows: @@ -510,12 +583,14 @@ For the three parallel training modes described above, the checkpoint file is sa Only by changing the code that sets the checkpoint saving policy, the checkpoint file of each card can be saved on itself. The specific changes are as follows: Change the checkpoint configuration policy from: + ```python # config checkpoint ckpt_config = CheckpointConfig(keep_checkpoint_max=1) ``` to: + ```python # config checkpoint ckpt_config = CheckpointConfig(keep_checkpoint_max=1, integrated_save=False) @@ -525,4 +600,4 @@ It should be noted that if users chooses this checkpoint saving policy, users ne ### Hybrid Parallel Mode -For model parameter saving and loading in Hybrid Parallel Mode, please refer to [Saving and Loading Model Parameters in the Hybrid Parallel Scenario](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/save_load_model_hybrid_parallel.html). \ No newline at end of file +For model parameter saving and loading in Hybrid Parallel Mode, please refer to [Saving and Loading Model Parameters in the Hybrid Parallel Scenario](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/save_load_model_hybrid_parallel.html). 
diff --git a/tutorials/training/source_en/advanced_use/distributed_training_gpu.md b/tutorials/training/source_en/advanced_use/distributed_training_gpu.md index 49bd34a74af32c9d84cf1e34271d147b8b64cc02..69d9e0fcc20d247b11686a06f0b5ca6cc08f1860 100644 --- a/tutorials/training/source_en/advanced_use/distributed_training_gpu.md +++ b/tutorials/training/source_en/advanced_use/distributed_training_gpu.md @@ -1,6 +1,6 @@ # Distributed Parallel Training (GPU) -`Linux` `GPU` `Model Training` `Intermediate` `Expert` +`Linux` `GPU` `Model Training` `Intermediate` `Expert` @@ -70,7 +70,7 @@ from mindspore.communication.management import init if __name__ == "__main__": context.set_context(mode=context.GRAPH_MODE, device_target="GPU") init("nccl") - ... + ... ``` In the preceding information, @@ -110,7 +110,7 @@ mpirun -n 8 pytest -s -v ./resnet50_distributed_training.py > train.log 2>&1 & The script requires the variable `DATA_PATH`, which indicates the path of the dataset. In addition, you need to modify the `resnet50_distributed_training.py` file. Since the `DEVICE_ID` environment variable does not need to be set on the GPU, you do not need to call `int(os.getenv('DEVICE_ID'))` in the script to obtain the physical sequence number of the device, and `context` does not require `device_id`. You need to set `device_target` to `GPU` and call `init("nccl")` to enable the NCCL. The log file is saved in the device directory, and the loss result is saved in train.log. The output loss values of the grep command are as follows: -``` +```text epoch: 1 step: 1, loss is 2.3025854 epoch: 1 step: 1, loss is 2.3025854 epoch: 1 step: 1, loss is 2.3025854 @@ -124,6 +124,7 @@ epoch: 1 step: 1, loss is 2.3025854 ## Running the Multi-Host Script If multiple hosts are involved in the training, you need to set the multi-host configuration in the `mpirun` command. You can use the `-H` option in the `mpirun` command. 
For example, `mpirun -n 16 -H DEVICE1_IP:8,DEVICE2_IP:8 python hello.py` indicates that eight processes are started on the host whose IP addresses are DEVICE1_IP and DEVICE2_IP, respectively. Alternatively, you can create a hostfile similar to the following and transfer its path to the `--hostfile` option of `mpirun`. Each line in the hostfile is in the format of `[hostname] slots=[slotnum]`, where hostname can be an IP address or a host name. + ```bash DEVICE1 slots=8 DEVICE2 slots=8 diff --git a/tutorials/training/source_en/advanced_use/enable_graph_kernel_fusion.md b/tutorials/training/source_en/advanced_use/enable_graph_kernel_fusion.md index 6ef3b5c3751ac89d1197764c4091b03902fd4408..c18de80d685d415230262d0183e2f18dae2c3320 100644 --- a/tutorials/training/source_en/advanced_use/enable_graph_kernel_fusion.md +++ b/tutorials/training/source_en/advanced_use/enable_graph_kernel_fusion.md @@ -100,7 +100,6 @@ context.set_context(enable_graph_kernel=True) 2. `BERT-large` training network Take the training model of the `BERT-large` network as an example. For details about the dataset and training script, see . You only need to modify the `context` parameter. - ## Effect Evaluation diff --git a/tutorials/training/source_en/advanced_use/enable_mixed_precision.md b/tutorials/training/source_en/advanced_use/enable_mixed_precision.md index 8a031c231e3b9b324ff7bf2e209411daf9d27c13..1ca25936cbe2ae6b44983a1c75cf52d7878955d5 100644 --- a/tutorials/training/source_en/advanced_use/enable_mixed_precision.md +++ b/tutorials/training/source_en/advanced_use/enable_mixed_precision.md @@ -16,7 +16,7 @@ ## Overview -The mixed precision training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. 
+The mixed precision training method accelerates the deep learning neural network training process by using both the single-precision and half-precision data formats, and maintains the network precision achieved by the single-precision training at the same time. Mixed precision training can accelerate the computation process, reduce memory usage, and enable a larger model or batch size to be trained on specific hardware. For FP16 operators, if the input data type is FP32, the backend of MindSpore will automatically handle it with reduced precision. Users could check the reduced-precision operators by enabling INFO log and then searching 'reduce precision'. @@ -42,6 +42,7 @@ This document describes the computation process by using examples of automatic a To use the automatic mixed precision, you need to invoke the corresponding API, which takes the network to be trained and the optimizer as the input. This API converts the operators of the entire network into FP16 operators (except the `BatchNorm` and Loss operators). You can use automatic mixed precision through API `amp` or API `Model`. The procedure of using automatic mixed precision by API `amp` is as follows: + 1. Introduce the MindSpore mixed precision API `amp`. 2. Define the network. This step is the same as the common network definition. (You do not need to manually configure the precision of any specific operator.) @@ -93,6 +94,7 @@ output = train_network(predict, label) ``` The procedure of using automatic mixed precision by API `Model` is as follows: + 1. Introduce the MindSpore model API `Model`. 2. Define the network. This step is the same as the common network definition. (You do not need to manually configure the precision of any specific operator.) @@ -169,7 +171,8 @@ model.train(epoch=10, train_dataset=ds_train) MindSpore also supports manual mixed precision. It is assumed that only one dense layer in the network needs to be calculated by using FP32, and other layers are calculated by using FP16. 
The mixed precision is configured in the granularity of cell. The default format of a cell is FP32. The following is the procedure for implementing manual mixed precision: -1. Define the network. This step is similar to step 2 in the automatic mixed precision. + +1. Define the network. This step is similar to step 2 in the automatic mixed precision. 2. Configure the mixed precision. Use `net.to_float(mstype.float16)` to set all operators of the cell and its sub-cells to FP16. Then, configure the dense to FP32. diff --git a/tutorials/training/source_en/advanced_use/evaluate_the_model_during_training.md b/tutorials/training/source_en/advanced_use/evaluate_the_model_during_training.md index d1bae7be1a799c11a10b2db113ca0c53d78fcb56..2b36b20bb6df7864bfda5c4dba086b4263f23d78 100644 --- a/tutorials/training/source_en/advanced_use/evaluate_the_model_during_training.md +++ b/tutorials/training/source_en/advanced_use/evaluate_the_model_during_training.md @@ -20,6 +20,7 @@ For a complex network, epoch training usually needs to be performed for dozens or even hundreds of times. Before training, it is difficult to know when a model can achieve required accuracy in epoch training. Therefore, the accuracy of the model is usually validated at a fixed epoch interval in training and the corresponding model is saved. After the training is completed, you can quickly select the optimal model by viewing the change of the corresponding model accuracy. This section uses this method and takes the LeNet network as an example. The procedure is as follows: + 1. Define the callback function EvalCallBack to implement synchronous training and validation. 2. Define a training network and execute it. 3. Draw a line chart based on the model accuracy under different epochs and select the optimal model. 
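The three-step procedure above can be sketched without any framework, as a rough illustration of the control flow only. This is not the tutorial's `EvalCallBack` implementation: `train_with_periodic_eval`, `best_epoch`, `train_one_epoch`, and `evaluate` are hypothetical names introduced here, and the accuracy values in the demo are made-up placeholders. The `eval_per_epoch` interval and the `{"epoch": [], "acc": []}` record follow the naming used in this tutorial.

```python
from typing import Callable, Dict, List, Tuple


def train_with_periodic_eval(
    num_epochs: int,
    eval_per_epoch: int,
    train_one_epoch: Callable[[int], None],
    evaluate: Callable[[], float],
) -> Dict[str, List]:
    """Train for num_epochs and validate every eval_per_epoch epochs.

    train_one_epoch and evaluate are hypothetical stand-ins for the real
    training step and the validation run.
    """
    epoch_per_eval: Dict[str, List] = {"epoch": [], "acc": []}
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(epoch)
        # Validate only at the fixed epoch interval and record the result.
        if epoch % eval_per_epoch == 0:
            epoch_per_eval["epoch"].append(epoch)
            epoch_per_eval["acc"].append(evaluate())
    return epoch_per_eval


def best_epoch(epoch_per_eval: Dict[str, List]) -> Tuple[int, float]:
    """Return (epoch, accuracy) of the highest recorded accuracy."""
    accs = epoch_per_eval["acc"]
    if not accs:
        raise ValueError("no evaluation results recorded")
    idx = max(range(len(accs)), key=accs.__getitem__)
    return epoch_per_eval["epoch"][idx], accs[idx]


if __name__ == "__main__":
    # Placeholder accuracies, not measured results.
    fake_acc = iter([0.90, 0.95, 0.97, 0.96, 0.94])
    history = train_with_periodic_eval(
        num_epochs=10,
        eval_per_epoch=2,
        train_one_epoch=lambda epoch: None,
        evaluate=lambda: next(fake_acc),
    )
    print(history)
    print(best_epoch(history))
```

In the real workflow, the recorded epochs map to the checkpoint files saved at the same interval, so the epoch returned by a helper like `best_epoch` identifies which saved model to keep.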
@@ -52,7 +53,7 @@ class EvalCallBack(Callback): self.eval_dataset = eval_dataset self.eval_per_epoch = eval_per_epoch self.epoch_per_eval = epoch_per_eval - + def epoch_end(self, run_context): cb_param = run_context.original_args() cur_epoch = cb_param.cur_epoch_num @@ -90,51 +91,52 @@ if __name__ == "__main__": eval_per_epoch = 2 ... ... - + # need to calculate how many steps are in each epoch, in this example, 1875 steps per epoch. config_ck = CheckpointConfig(save_checkpoint_steps=eval_per_epoch*1875, keep_checkpoint_max=15) ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet",directory=ckpt_save_dir, config=config_ck) model = Model(network, net_loss, net_opt, metrics={"Accuracy": Accuracy()}) - + epoch_per_eval = {"epoch": [], "acc": []} eval_cb = EvalCallBack(model, eval_data, eval_per_epoch, epoch_per_eval) - + model.train(epoch_size, train_data, callbacks=[ckpoint_cb, LossMonitor(375), eval_cb], dataset_sink_mode=True) ``` The output is as follows: - epoch: 1 step: 375, loss is 2.298612 - epoch: 1 step: 750, loss is 2.075152 - epoch: 1 step: 1125, loss is 0.39205977 - epoch: 1 step: 1500, loss is 0.12368304 - epoch: 1 step: 1875, loss is 0.20988345 - epoch: 2 step: 375, loss is 0.20582482 - epoch: 2 step: 750, loss is 0.029070046 - epoch: 2 step: 1125, loss is 0.041760832 - epoch: 2 step: 1500, loss is 0.067035824 - epoch: 2 step: 1875, loss is 0.0050643035 - {'Accuracy': 0.9763621794871795} - - ... ... 
- - epoch: 9 step: 375, loss is 0.021227183 - epoch: 9 step: 750, loss is 0.005586236 - epoch: 9 step: 1125, loss is 0.029125651 - epoch: 9 step: 1500, loss is 0.00045874066 - epoch: 9 step: 1875, loss is 0.023556218 - epoch: 10 step: 375, loss is 0.0005807788 - epoch: 10 step: 750, loss is 0.02574059 - epoch: 10 step: 1125, loss is 0.108463734 - epoch: 10 step: 1500, loss is 0.01950589 - epoch: 10 step: 1875, loss is 0.10563098 - {'Accuracy': 0.979667467948718} - +```text +epoch: 1 step: 375, loss is 2.298612 +epoch: 1 step: 750, loss is 2.075152 +epoch: 1 step: 1125, loss is 0.39205977 +epoch: 1 step: 1500, loss is 0.12368304 +epoch: 1 step: 1875, loss is 0.20988345 +epoch: 2 step: 375, loss is 0.20582482 +epoch: 2 step: 750, loss is 0.029070046 +epoch: 2 step: 1125, loss is 0.041760832 +epoch: 2 step: 1500, loss is 0.067035824 +epoch: 2 step: 1875, loss is 0.0050643035 +{'Accuracy': 0.9763621794871795} + +... ... + +epoch: 9 step: 375, loss is 0.021227183 +epoch: 9 step: 750, loss is 0.005586236 +epoch: 9 step: 1125, loss is 0.029125651 +epoch: 9 step: 1500, loss is 0.00045874066 +epoch: 9 step: 1875, loss is 0.023556218 +epoch: 10 step: 375, loss is 0.0005807788 +epoch: 10 step: 750, loss is 0.02574059 +epoch: 10 step: 1125, loss is 0.108463734 +epoch: 10 step: 1500, loss is 0.01950589 +epoch: 10 step: 1875, loss is 0.10563098 +{'Accuracy': 0.979667467948718} +``` Find the `lenet_ckpt` folder in the same directory. The folder contains five models and data related to a calculation graph. The structure is as follows: -``` +```text lenet_ckpt ├── checkpoint_lenet-10_1875.ckpt ├── checkpoint_lenet-2_1875.ckpt @@ -148,7 +150,6 @@ lenet_ckpt Define the drawing function `eval_show`, load `epoch_per_eval` to `eval_show`, and draw the model accuracy variation chart based on different `epoch`. 
- ```python import matplotlib.pyplot as plt @@ -166,7 +167,6 @@ The output is as follows: ![png](./images/evaluate_the_model_during_training.png) - You can easily select the optimal model based on the preceding figure. ## Summary diff --git a/tutorials/training/source_en/advanced_use/improve_model_security_nad.md b/tutorials/training/source_en/advanced_use/improve_model_security_nad.md index 7af1a43349250261213fdd0b6419b76da6910a1a..05e021edaaec801236970eeed05a4101c313c89d 100644 --- a/tutorials/training/source_en/advanced_use/improve_model_security_nad.md +++ b/tutorials/training/source_en/advanced_use/improve_model_security_nad.md @@ -33,6 +33,7 @@ This section describes how to use MindArmour in adversarial attack and defense b > The current sample is for CPU, GPU and Ascend 910 AI processor. You can find the complete executable sample code at > +> > - `mnist_attack_fgsm.py`: contains attack code. > - `mnist_defense_nad.py`: contains defense code. @@ -133,18 +134,18 @@ The LeNet model is used as an example. You can also create and train your own mo return nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, weight_init=weight, has_bias=False, pad_mode="valid") - - + + def fc_with_initialize(input_channels, out_channels): weight = weight_variable() bias = weight_variable() return nn.Dense(input_channels, out_channels, weight, bias) - - + + def weight_variable(): return TruncatedNormal(0.02) - - + + class LeNet5(nn.Cell): """ Lenet network @@ -159,7 +160,7 @@ The LeNet model is used as an example. You can also create and train your own mo self.relu = nn.ReLU() self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2) self.flatten = nn.Flatten() - + def construct(self, x): x = self.conv1(x) x = self.relu(x) @@ -191,7 +192,7 @@ The LeNet model is used as an example. 
You can also create and train your own mo model = Model(net, loss, opt, metrics=None) model.train(10, ds_train, callbacks=[LossMonitor()], dataset_sink_mode=False) - + # get test data ds_test = generate_mnist_dataset(os.path.join(mnist_path, "test"), batch_size=batch_size, repeat_size=1, @@ -218,16 +219,16 @@ The LeNet model is used as an example. You can also create and train your own mo logits = net(Tensor(batch_inputs)).asnumpy() test_logits.append(logits) test_logits = np.concatenate(test_logits) - + tmp = np.argmax(test_logits, axis=1) == np.argmax(test_labels, axis=1) accuracy = np.mean(tmp) LOGGER.info(TAG, 'prediction accuracy before attacking is : %s', accuracy) - + ``` The classification accuracy reaches 98%. - - ```python + + ```python prediction accuracy before attacking is : 0.9895833333333334 ``` @@ -274,7 +275,7 @@ LOGGER.info(TAG, 'The average structural similarity between original ' The attack results are as follows: -``` +```text prediction accuracy after attacking is : 0.052083 mis-classification rate of adversaries is : 0.947917 The average confidence of adversarial class is : 0.803375 @@ -351,7 +352,7 @@ LOGGER.info(TAG, 'The average confidence of true class is : %s', ### Defense Effect -``` +```text accuracy of TEST data on defensed model is : 0.974259 accuracy of adv data on defensed model is : 0.856370 defense mis-classification rate of adversaries is : 0.143629 @@ -359,5 +360,4 @@ The average confidence of adversarial class is : 0.616670 The average confidence of true class is : 0.177374 ``` -After NAD is used to defend against adversarial examples, the model's misclassification ratio of adversarial examples decreases from 95% to 14%, effectively defending against adversarial examples. In addition, the classification accuracy of the model for the original test dataset reaches 97%. 
- +After NAD is used to defend against adversarial examples, the model's misclassification ratio of adversarial examples decreases from 95% to 14%, effectively defending against adversarial examples. In addition, the classification accuracy of the model for the original test dataset reaches 97%. diff --git a/tutorials/training/source_en/advanced_use/lineage_and_scalars_comparision.md b/tutorials/training/source_en/advanced_use/lineage_and_scalars_comparision.md index 68cfe7d1a65fd5bc186e5a2f4cbb721ec3135886..73337706653defa72ead334ac953b702dcbb4d48 100644 --- a/tutorials/training/source_en/advanced_use/lineage_and_scalars_comparision.md +++ b/tutorials/training/source_en/advanced_use/lineage_and_scalars_comparision.md @@ -105,6 +105,7 @@ Figure 9 shows the scalars comparision function area, which allows you to view s ## Notices To ensure performance, MindInsight implements scalars comparision with the cache mechanism and the following restrictions: -- The scalars comparision supports only for trainings in cache. + +- The scalars comparision supports only for trainings in cache. - The maximum of 15 latest trainings (sorted by modification time) can be retained in the cache. - The maximum of 5 trainings can be selected for scalars comparision at the same time. 
diff --git a/tutorials/training/source_en/advanced_use/migrate_3rd_scripts.md b/tutorials/training/source_en/advanced_use/migrate_3rd_scripts.md index f84d41e41347b26b210a10513bd7336704afec58..140c2c5683b11fa06481d36c3f98a709e8b6e222 100644 --- a/tutorials/training/source_en/advanced_use/migrate_3rd_scripts.md +++ b/tutorials/training/source_en/advanced_use/migrate_3rd_scripts.md @@ -275,4 +275,3 @@ Models trained on the Ascend 910 AI processor can be used for inference on diffe ## Examples - [Model Zoo](https://gitee.com/mindspore/mindspore/tree/master/model_zoo) - diff --git a/tutorials/training/source_en/advanced_use/migrate_3rd_scripts_mindconverter.md b/tutorials/training/source_en/advanced_use/migrate_3rd_scripts_mindconverter.md index ce21f4c2521c29c14f0778cdcd25cb35f5a20f26..1bf17792d50b319f44f60dd7fcc1e46a43f3b6a5 100644 --- a/tutorials/training/source_en/advanced_use/migrate_3rd_scripts_mindconverter.md +++ b/tutorials/training/source_en/advanced_use/migrate_3rd_scripts_mindconverter.md @@ -22,12 +22,10 @@ MindConverter is a migration tool to transform the model scripts from PyTorch to Mindspore. Users can migrate their PyTorch models to Mindspore rapidly with minor changes according to the conversion report. - ## Installation Mindconverter is a submodule in MindInsight. Please follow the [Guide](https://www.mindspore.cn/install/en) here to install MindInsight. - ## Usage MindConverter currently only provides command-line interface. Here is the manual page. @@ -71,7 +69,7 @@ optional arguments: > The AST mode will be enabled, if both `--in_file` and `--model_file` are specified. -For the Graph mode, `--shape` is mandatory. +For the Graph mode, `--shape` is mandatory. For the AST mode, `--shape` is ignored. @@ -80,11 +78,8 @@ For the AST mode, `--shape` is ignored. Please note that your original PyTorch project is included in the module search path (PYTHONPATH). Use the python interpreter and test your module can be successfully loaded by `import` command. 
Use `--project_path` instead if your project is not in the PYTHONPATH to ensure MindConverter can load it. > Assume the project is located at `/home/user/project/model_training`, users can use this command to add the project to `PYTHONPATH` : `export PYTHONPATH=/home/user/project/model_training:$PYTHONPATH` - > MindConverter needs the original PyTorch scripts because of the reverse serialization. - - ## Scenario MindConverter provides two modes for different migration demands. @@ -96,13 +91,12 @@ The AST mode is recommended for the first demand. It parses and analyzes PyTorch For the second demand, the Graph mode is recommended. As the computational graph is a standard descriptive language, it is not affected by user's coding style. This mode may have more operators converted as long as these operators are supported by MindConverter. -Some typical image classification networks such as ResNet and VGG have been tested for the Graph mode. Note that: +Some typical image classification networks such as ResNet and VGG have been tested for the Graph mode. Note that: > 1. Currently, the Graph mode does not support models with multiple inputs. Only models with a single input and single output are supported. > 2. The Dropout operator will be lost after conversion because the inference mode is used to load the PyTorch model. Manually re-implement is necessary. > 3. The Graph-based mode will be continuously developed and optimized with further updates. - ## Example ### AST-Based Conversion @@ -123,8 +117,8 @@ line : [UnConvert] 'operator' didn't convert. ... For non-transformed operators, the original code keeps. Please manually migrate them. [Click here](https://www.mindspore.cn/doc/note/en/master/index.html#operator_api) for more information about operator mapping. - Here is an example of the conversion report: + ```text [Start Convert] [Insert] 'import mindspore.ops.operations as P' is inserted to the converted file. 
@@ -137,7 +131,6 @@ Here is an example of the conversion report: For non-transformed operators, suggestions are provided in the report. For instance, MindConverter suggests that replace `torch.nn.AdaptiveAvgPool2d` with `mindspore.ops.operations.ReduceMean`. - ### Graph-Based Conversion Assume the PyTorch model (.pth file) is located at `/home/user/model.pth`, with input shape (3, 224, 224) and the original PyTorch script is at `/home/user/project/model_training`. Output the transformed MindSpore script to `/home/user/output`, with the conversion report to `/home/user/output/report`. Use the following command: @@ -194,12 +187,9 @@ class Classifier(nn.Cell): ``` -> `--output` and `--report` are optional. MindConverter creates an `output` folder under the current working directory, and outputs generated scripts and conversion reports to it. - +> `--output` and `--report` are optional. MindConverter creates an `output` folder under the current working directory, and outputs generated scripts and conversion reports to it. ## Caution 1. PyTorch is not an explicitly stated dependency library in MindInsight. The Graph conversion requires the consistent PyTorch version as the model is trained. (MindConverter recommends PyTorch 1.4.0 or 1.6.0) -2. This script conversion tool relies on operators which supported by MindConverter and MindSpore. Unsupported operators may not successfully mapped to MindSpore operators. You can manually edit, or implement the mapping based on MindConverter, and contribute to our MindInsight repository. We appreciate your support for the MindSpore community. - - +2. This script conversion tool relies on operators which supported by MindConverter and MindSpore. Unsupported operators may not successfully mapped to MindSpore operators. You can manually edit, or implement the mapping based on MindConverter, and contribute to our MindInsight repository. We appreciate your support for the MindSpore community. 
diff --git a/tutorials/training/source_en/advanced_use/nlp_sentimentnet.md b/tutorials/training/source_en/advanced_use/nlp_sentimentnet.md index a697aa0c93ef8afc4350b9ed2db97b02df142f4f..e3dba0158c68eca3b0c1db799acfeef63825ae8d 100644 --- a/tutorials/training/source_en/advanced_use/nlp_sentimentnet.md +++ b/tutorials/training/source_en/advanced_use/nlp_sentimentnet.md @@ -47,6 +47,7 @@ Vertical polarity word = General polarity word + Domain-specific polarity word According to the text processing granularity, sentiment analysis can be divided into word, phrase, sentence, paragraph, and chapter levels. A sentiment analysis at paragraph level is used as an example. The input is a paragraph, and the output is information about whether the movie review is positive or negative. ## Preparation and Design + ### Downloading the Dataset The IMDb movie review dataset is used as experimental data. @@ -54,15 +55,17 @@ The IMDb movie review dataset is used as experimental data. The following are cases of negative and positive reviews. -| Review | Label | +| Review | Label | |---|---| | "Quitting" may be as much about exiting a pre-ordained identity as about drug withdrawal. As a rural guy coming to Beijing, class and success must have struck this young artist face on as an appeal to separate from his roots and far surpass his peasant parents' acting success. Troubles arise, however, when the new man is too new, when it demands too big a departure from family, history, nature, and personal identity. The ensuing splits, and confusion between the imaginary and the real and the dissonance between the ordinary and the heroic are the stuff of a gut check on the one hand or a complete escape from self on the other. | Negative | | This movie is amazing because the fact that the real people portray themselves and their real life experience and do such a good job it's like they're almost living the past over again. 
Jia Hongsheng plays himself an actor who quit everything except music and drugs struggling with depression and searching for the meaning of life while being angry at everyone especially the people who care for him most. | Positive | Download the GloVe file and add the following line at the beginning of the file, which means that a total of 400,000 words are read, and each word is represented by a word vector of 300 latitudes. -``` + +```text 400000 300 ``` + GloVe file download address: ### Determining Evaluation Criteria @@ -79,16 +82,17 @@ F1 score = (2 x Precision x Recall)/(Precision + Recall) In the IMDb dataset, the number of positive and negative samples does not vary greatly. Accuracy can be used as the evaluation criterion of the classification system. - ### Determining the Network and Process Currently, MindSpore GPU and CPU supports SentimentNet network based on the long short-term memory (LSTM) network for NLP. + 1. Load the dataset in use and process data if necessary. 2. Use the SentimentNet network based on LSTM to train data and generate a model. Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used for processing and predicting an important event with a long interval and delay in a time sequence. For details, refer to online documentation. 3. After the model is obtained, use the validation dataset to check the accuracy of model. > The current sample is for the Ascend 910 AI processor. You can find the complete executable sample code at +> > - `src/config.py`: some configurations of the network, including the batch size and number of training epochs. > - `src/dataset.py`: dataset related definition, including converted MindRecord file and preprocessed data. > - `src/imdb.py`: the utility class for parsing IMDb dataset. @@ -97,8 +101,11 @@ Currently, MindSpore GPU and CPU supports SentimentNet network based on the long > - `eval.py`: the evaluation script. 
## Implementation + ### Importing Library Files + The following are the required public modules and MindSpore modules and library files. + ```python import argparse import os @@ -118,6 +125,7 @@ from mindspore.train.serialization import load_param_into_net, load_checkpoint ### Configuring Environment Information 1. The `parser` module is used to transfer necessary information for running, such as storage paths of the dataset and the GloVe file. In this way, the frequently changed configurations can be entered during runtime, which is more flexible. + ```python parser = argparse.ArgumentParser(description='MindSpore LSTM Example') parser.add_argument('--preprocess', type=str, default='false', choices=['true', 'false'], @@ -138,13 +146,14 @@ from mindspore.train.serialization import load_param_into_net, load_checkpoint ``` 2. Before implementing code, configure necessary information, including the environment information, execution mode, backend information, and hardware information. - + ```python context.set_context( mode=context.GRAPH_MODE, save_graphs=False, device_target=args.device_target) ``` + For details about the API configuration, see the `context.set_context`. ### Preprocessing the Dataset @@ -156,15 +165,14 @@ if args.preprocess == "true": print("============== Starting Data Pre-processing ==============") convert_to_mindrecord(cfg.embed_size, args.aclimdb_path, args.preprocess_path, args.glove_path) ``` -> After successful conversion, `mindrecord` files are generated under the directory `preprocess_path`. Usually, this operation does not need to be performed every time if the dataset is unchanged. +> After successful conversion, `mindrecord` files are generated under the directory `preprocess_path`. Usually, this operation does not need to be performed every time if the dataset is unchanged. > For `convert_to_mindrecord`, you can find the complete definition at: - > It consists of two steps: +> >1. 
Process the text dataset, including encoding, word segmentation, alignment, and processing the original GloVe data to adapt to the network structure. >2. Convert the dataset format to the MindRecord format. - ### Defining the Network ```python @@ -178,11 +186,13 @@ network = SentimentNet(vocab_size=embedding_table.shape[0], weight=Tensor(embedding_table), batch_size=cfg.batch_size) ``` + > For `SentimentNet`, you can find the complete definition at: ### Pre-Training The parameter `pre_trained` specifies the preloading CheckPoint file for pre-training, which is empty by default + ```python if args.pre_trained: load_param_into_net(network, load_checkpoint(args.pre_trained)) @@ -217,6 +227,7 @@ else: model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb]) print("============== Training Success ==============") ``` + > For `lstm_create_dataset`, you can find the complete definition at: ### Validating the Model @@ -238,12 +249,15 @@ print("============== {} ==============".format(acc)) ``` ## Experimental Result + After 20 epochs, the accuracy on the test set is about 84.19%. -**Training Execution** +**Training Execution:** + 1. Run the training code and view the running result. + ```shell - $ python train.py --preprocess=true --ckpt_path=./ --device_target=GPU + python train.py --preprocess=true --ckpt_path=./ --device_target=GPU ``` As shown in the following output, the loss value decreases gradually with the training process and reaches about 0.2855. @@ -263,11 +277,11 @@ After 20 epochs, the accuracy on the test set is about 84.19%. ``` 2. Check the saved CheckPoint files. - + CheckPoint files (model files) are saved during the training. You can view all saved files in the file path. ```shell - $ ls ./*.ckpt + ls ./*.ckpt ``` The output is as follows: @@ -276,12 +290,12 @@ After 20 epochs, the accuracy on the test set is about 84.19%. 
lstm-11_390.ckpt lstm-12_390.ckpt lstm-13_390.ckpt lstm-14_390.ckpt lstm-15_390.ckpt lstm-16_390.ckpt lstm-17_390.ckpt lstm-18_390.ckpt lstm-19_390.ckpt lstm-20_390.ckpt ``` -**Model Validation** +**Model Validation:** Use the last saved CheckPoint file to load and validate the dataset. ```shell -$ python eval.py --ckpt_path=./lstm-20_390.ckpt --device_target=GPU +python eval.py --ckpt_path=./lstm-20_390.ckpt --device_target=GPU ``` As shown in the following output, the sentiment analysis accuracy of the text is about 84.19%, which is basically satisfactory. @@ -290,4 +304,3 @@ As shown in the following output, the sentiment analysis accuracy of the text is ============== Starting Testing ============== ============== {'acc': 0.8419471153846154} ============== ``` - diff --git a/tutorials/training/source_en/advanced_use/optimize_data_processing.md b/tutorials/training/source_en/advanced_use/optimize_data_processing.md index dd973909e001655eb9970b725301c802249a9de1..3feeee91b9c5a466464151eef53c12c5bdaa06ce 100644 --- a/tutorials/training/source_en/advanced_use/optimize_data_processing.md +++ b/tutorials/training/source_en/advanced_use/optimize_data_processing.md @@ -53,7 +53,7 @@ import numpy as np The directory structure is as follows: -``` +```text dataset/Cifar10Data ├── cifar-10-batches-bin │   ├── batches.meta.txt @@ -76,6 +76,7 @@ dataset/Cifar10Data ``` In the preceding information: + - The `cifar-10-batches-bin` directory is the directory for storing the CIFAR-10 dataset in binary format. - The `cifar-10-batches-py` directory is the directory for storing the CIFAR-10 dataset in Python file format. @@ -93,6 +94,7 @@ MindSpore provides multiple data loading methods, including common dataset loadi ![title](./images/data_loading_performance_scheme.png) Suggestions on data loading performance optimization are as follows: + - Built-in loading operators are preferred for supported dataset formats. 
For details, see [Built-in Loading Operators](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.dataset.html). If the performance cannot meet the requirements, use the multi-thread concurrency solution. For details, see [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#multi-thread-optimization-solution). - For a dataset format that is not supported, convert the format to the MindSpore data format and then use the `MindDataset` class to load the dataset. If the performance cannot meet the requirements, use the multi-thread concurrency solution, for details, see [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#multi-thread-optimization-solution). - For dataset formats that are not supported, the user-defined `GeneratorDataset` class is preferred for implementing fast algorithm verification. If the performance cannot meet the requirements, the multi-process concurrency solution can be used. For details, see [Multi-process Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#multi-process-optimization-solution). 
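For the last suggestion, the user-defined source is typically just a small Python object. The sketch below shows one possible shape, a random-access class exposing `__getitem__` and `__len__`; `ToySource` and its contents are hypothetical. The hand-off to `GeneratorDataset` appears only as a comment because it is not executed here, and the column names are assumptions; real sources would normally return NumPy arrays rather than plain lists.

```python
class ToySource:
    """Random-access source: one (data, label) pair per index."""

    def __init__(self, size):
        self._size = size

    def __getitem__(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)  # lets plain iteration stop cleanly
        return [index, index + 1], index % 2

    def __len__(self):
        return self._size


source = ToySource(5)
print(len(source), source[0], list(source)[-1])

# Hypothetical hand-off, not executed in this sketch:
#   import mindspore.dataset as ds
#   dataset = ds.GeneratorDataset(source, column_names=["data", "label"])
```

Keeping the source this small makes it quick to verify an algorithm end to end before investing in a format conversion.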
@@ -113,7 +115,8 @@ Based on the preceding suggestions of data loading performance optimization, the ``` The output is as follows: - ``` + + ```text {'image': Tensor(shape=[32, 32, 3], dtype=UInt8, value= [[[235, 235, 235], [230, 230, 230], @@ -148,7 +151,7 @@ Based on the preceding suggestions of data loading performance optimization, the The output is as follows: - ``` + ```text {'data': Tensor(shape=[1431], dtype=UInt8, value= [255, 216, 255, ..., 63, 255, 217]), 'id': Tensor(shape=[], dtype=Int64, value= 30474), 'label': Tensor(shape=[], dtype=Int64, value= 2)} @@ -169,7 +172,7 @@ Based on the preceding suggestions of data loading performance optimization, the The output is as follows: - ``` + ```text {'data': Tensor(shape=[1], dtype=Int64, value= [0])} ``` @@ -182,6 +185,7 @@ The shuffle operation is used to shuffle ordered datasets or repeated datasets. ![title](./images/shuffle_performance_scheme.png) Suggestions on shuffle performance optimization are as follows: + - Use the `shuffle` parameter of built-in loading operators to shuffle data. - If the `shuffle` function is used and the performance still cannot meet the requirements, adjust the value of the `buffer_size` parameter to improve the performance. @@ -202,7 +206,7 @@ Based on the preceding shuffle performance optimization suggestions, the `shuffl The output is as follows: - ``` + ```text {'image': Tensor(shape=[32, 32, 3], dtype=UInt8, value= [[[235, 235, 235], [230, 230, 230], @@ -237,7 +241,7 @@ Based on the preceding shuffle performance optimization suggestions, the `shuffl The output is as follows: - ``` + ```text before shuffle: [0 1 2 3 4] [1 2 3 4 5] @@ -255,6 +259,7 @@ Based on the preceding shuffle performance optimization suggestions, the `shuffl ## Optimizing the Data Augmentation Performance During image classification training, especially when the dataset is small, users can use data augmentation to preprocess images to enrich the dataset. 
MindSpore provides multiple data augmentation methods, including: + - Use the built-in C operator (`c_transforms` module) to perform data augmentation. - Use the built-in Python operator (`py_transforms` module) to perform data augmentation. - Users can define Python functions as needed to perform data augmentation. @@ -271,6 +276,7 @@ The performance varies according to the underlying implementation methods. ![title](./images/data_enhancement_performance_scheme.png) Suggestions on data augmentation performance optimization are as follows: + - The `c_transforms` module is preferentially used to perform data augmentation for its highest performance. If the performance cannot meet the requirements, refer to [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#multi-thread-optimization-solution), [Compose Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#compose-optimization-solution), or [Operator Fusion Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#operator-fusion-optimization-solution). 
- If the `py_transforms` module is used to perform data augmentation and the performance still cannot meet the requirements, refer to [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#multi-thread-optimization-solution), [Multi-process Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#multi-process-optimization-solution), [Compose Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#compose-optimization-solution), or [Operator Fusion Optimization Solution](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/optimize_data_processing.html#operator-fusion-optimization-solution). - The `c_transforms` module maintains buffer management in C++, and the `py_transforms` module maintains buffer management in Python. Because of the performance cost of switching between Python and C++, it is advised not to use different operator types together. @@ -324,7 +330,7 @@ Based on the preceding suggestions of data augmentation performance optimization The output is as follows: - ``` + ```text before map: [0 1 2 3 4] [1 2 3 4 5] @@ -392,6 +398,7 @@ Data processing is performed on the host. Therefore, configurations of the host ### Multi-thread Optimization Solution During the data pipeline process, the number of threads for related operators can be set to improve the concurrency and performance. For example: + - During data loading, the `num_parallel_workers` parameter in the built-in data loading class is used to set the number of threads. - During data augmentation, the `num_parallel_workers` parameter in the `map` function is used to set the number of threads. - During batch processing, the `num_parallel_workers` parameter in the `batch` function is used to set the number of threads. 
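The effect of a per-operator worker count can be illustrated with the standard library alone. The sketch below is not MindSpore code: `parallel_map` and `slow_augment` are hypothetical names, and `num_parallel_workers` is borrowed from the text only as a parameter name. It assumes the per-sample work mostly waits (for example on I/O), which is the case where threads help most.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_augment(sample):
    """Hypothetical per-sample operation that mostly waits (e.g. on I/O)."""
    time.sleep(0.01)
    return sample * 2


def parallel_map(func, samples, num_parallel_workers=1):
    """Apply `func` to every sample with a pool of threads, preserving order."""
    with ThreadPoolExecutor(max_workers=num_parallel_workers) as pool:
        return list(pool.map(func, samples))


samples = list(range(16))
for workers in (1, 4):
    start = time.perf_counter()
    result = parallel_map(slow_augment, samples, num_parallel_workers=workers)
    elapsed = time.perf_counter() - start
    print(f"workers={workers}: {elapsed:.3f}s, first results {result[:3]}")
```

The results are identical for any worker count; only the elapsed time changes. For CPU-bound functions written in pure Python, threads are limited by the interpreter lock, which is one reason a multi-process mode is useful for Python-implemented operators.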
@@ -401,6 +408,7 @@ For details, see [Built-in Loading Operators](https://www.mindspore.cn/doc/api_p ### Multi-process Optimization Solution During data processing, operators implemented by Python support the multi-process mode. For example: + - By default, the `GeneratorDataset` class is in multi-process mode. The `num_parallel_workers` parameter indicates the number of enabled processes. The default value is 1. For details, see [GeneratorDataset](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.dataset.html#mindspore.dataset.GeneratorDataset). - If the user-defined Python function or the `py_transforms` module is used to perform data augmentation and the `python_multiprocessing` parameter of the `map` function is set to True, the `num_parallel_workers` parameter indicates the number of processes and the default value of the `python_multiprocessing` parameter is False. In this case, the `num_parallel_workers` parameter indicates the number of threads. For details, see [Built-in Loading Operators](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.dataset.html). diff --git a/tutorials/training/source_en/advanced_use/performance_profiling.md b/tutorials/training/source_en/advanced_use/performance_profiling.md index f18d59a6e706c5c76e3a0ecad9af5efaddba9f58..e69d6daf16666a88d42fa67838648261cbf45b5b 100644 --- a/tutorials/training/source_en/advanced_use/performance_profiling.md +++ b/tutorials/training/source_en/advanced_use/performance_profiling.md @@ -22,6 +22,7 @@ ## Overview + Performance data like operator's execution time is recorded in files and can be viewed on the web page, this can help users optimize the performance of neural networks. 
## Operation Process @@ -52,10 +53,10 @@ from mindspore import Model, nn, context def test_profiler(): # Init context env context.set_context(mode=context.GRAPH_MODE, device_target='Ascend', device_id=int(os.environ["DEVICE_ID"])) - + # Init Profiler profiler = Profiler() - + # Init hyperparameter epoch = 2 # Init network and Model @@ -67,17 +68,15 @@ def test_profiler(): train_ds = create_mindrecord_dataset_for_training() # Model Train model.train(epoch, train_ds) - + # Profiler end profiler.analyse() ``` - ## Launch MindInsight The MindInsight launch command can refer to [MindInsight Commands](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/mindinsight_commands.html). - ### Performance Analysis Users can access the Performance Profiler by selecting a specific training from the training list, and click the performance profiling link. @@ -87,6 +86,7 @@ Users can access the Performance Profiler by selecting a specific training from Figure 1: Overall Performance Figure 1 displays the overall performance of the training, including the overall data of Step Trace, Operator Performance, MindData Performance and Timeline. The data shown in these components include: + - Step Trace: It will divide the training steps into several stages and collect execution time for each stage. The overall performance page will show the step trace graph. - Operator Performance: It will collect the execution time of operators and operator types. The overall performance page will show the pie graph for different operator types. - MindData Performance: It will analyse the performance of the data input stages. The overall performance page will show the number of steps that may be the bottleneck for these stages. @@ -103,14 +103,14 @@ Step Gap (The time between the end of one step and the computation of next step) Figure 2: Step Trace Analysis -Figure 2 displays the Step Trace page. The Step Trace detail will show the start/finish time for each stage. 
By default, it shows the average time for all the steps. Users can also choose a specific step to see its step trace statistics. The graphs at the bottom of the page show how the execution time of Step Gap, Forward/Backward Propagation and Step Tail (The time between the end of Backward Propagation and the end of Parameter Update) changes across steps, which helps to decide whether the performance of some stages can be optimized. +Figure 2 displays the Step Trace page. The Step Trace detail will show the start/finish time for each stage. By default, it shows the average time for all the steps. Users can also choose a specific step to see its step trace statistics. The graphs at the bottom of the page show how the execution time of Step Gap, Forward/Backward Propagation and Step Tail (The time between the end of Backward Propagation and the end of Parameter Update) changes across steps, which helps to decide whether the performance of some stages can be optimized. In order to divide the stages, the Step Trace Component needs to figure out the forward propagation start operator and the backward propagation end operator. MindSpore will automatically figure out the two operators to reduce the profiler configuration work. The first operator after `get_next` will be selected as the forward start operator and the operator before the last all reduce will be selected as the backward end operator. **However, the Profiler does not guarantee that the automatically selected operators will meet the user's expectation in all cases.** Users can set the two operators manually as follows: + - Set environment variable `FP_POINT` to configure the forward start operator, for example, `export FP_POINT=fp32_vars/conv2d/BatchNorm`. - Set environment variable `BP_POINT` to configure the backward end operator, for example, `export BP_POINT=loss_scale/gradients/AddN_70`.
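
The two environment variables above can also be set from inside the training script. The following is a minimal sketch, assuming that the profiler reads `FP_POINT` and `BP_POINT` from the process environment when it is initialized (the text above only shows the `export` form). The helper `step_trace_points` is hypothetical and only echoes the configured values.

```python
import os

# Operator names copied from the examples above; replace them with
# operators from your own network.
os.environ["FP_POINT"] = "fp32_vars/conv2d/BatchNorm"
os.environ["BP_POINT"] = "loss_scale/gradients/AddN_70"


def step_trace_points():
    """Return the (forward start, backward end) operators currently configured."""
    return os.environ.get("FP_POINT"), os.environ.get("BP_POINT")


# Assumption: this must run before the Profiler object is created,
# so that the values are visible when profiling starts.
print(step_trace_points())
```

Setting the variables in the script keeps the configuration next to the network definition. Exporting them in the shell, as shown in the list above, remains the documented way.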
- #### Operator Performance Analysis The operator performance analysis component is used to display the execution time of the operators during MindSpore run. @@ -120,7 +120,8 @@ The operator performance analysis component is used to display the execution tim Figure 3: Statistics for Operator Types Figure 3 displays the statistics for the operator types, including: -- Choose pie or bar graph to show the proportion time occupied by each operator type. The time of one operator type is calculated by accumulating the execution time of operators belonging to this type. + +- Choose pie or bar graph to show the proportion time occupied by each operator type. The time of one operator type is calculated by accumulating the execution time of operators belonging to this type. - Display top 20 operator types with the longest execution time, show the proportion and execution time (ms) of each operator type. ![op_statistics.png](./images/op_statistics.PNG) @@ -128,6 +129,7 @@ Figure 3 displays the statistics for the operator types, including: Figure 4: Statistics for Operators Figure 4 displays the statistics table for the operators, including: + - Choose All: Display statistics for the operators, including operator name, type, execution time, full scope time, information, etc. The table will be sorted by execution time by default. - Choose Type: Display statistics for the operator types, including operator type name, execution time, execution frequency and proportion of total time. Users can click on each line, querying for all the operators belonging to this type. - Search: There is a search box on the right, which can support fuzzy search for operators/operator types. @@ -135,7 +137,7 @@ Figure 4 displays the statistics table for the operators, including: #### MindData Performance Analysis The MindData performance analysis component is used to analyse the execution of data input pipeline for the training. 
The data input pipeline can be divided into three stages: -the data process pipeline, data transfer from host to device and data fetch on device. The component will analyse the performance of each stage in detail and display the results. +the data process pipeline, data transfer from host to device and data fetch on device. The component will analyse the performance of each stage in detail and display the results. ![minddata_profile.png](./images/minddata_profile.png) @@ -144,38 +146,44 @@ Figure 5: MindData Performance Analysis Figure 5 displays the page of MindData performance analysis component. It consists of two tabs: The step gap and the data process. The step gap page is used to analyse whether there is performance bottleneck in the three stages. We can get our conclusion from the data queue graphs: + - The data queue size stands for the queue length when the training fetches data from the queue on the device. If the data queue size is 0, the training will wait until there is data in the queue; If the data queue size is above 0, the training can get data very quickly, and it means MindData is not the bottleneck for this training step. - The host queue size can be used to infer the speed of data process and data transfer. If the host queue size is 0, it means we need to speed up the data process stage. -- If the size of the host queue is always large and the size of the data queue is continuously small, there may be a performance bottleneck in data transfer. +- If the size of the host queue is always large and the size of the data queue is continuously small, there may be a performance bottleneck in data transfer. ![data_op_profile.png](./images/data_op_profile.png) Figure 6: Data Process Pipeline Analysis Figure 6 displays the page of data process pipeline analysis. The data queues are used to exchange data between the MindData operators. 
The data size of the queues reflects the data consumption speed of the operators, and can be used to infer the bottleneck operator. The queue usage percentage stands for the average data size in the queue divided by the maximum queue size. The higher the usage percentage, the more data is accumulated in the queue. The graph at the bottom of the page shows the MindData pipeline operators with the data queues. The user can click one queue to see how the data size changes over time, and the operators connected to the queue. The data process pipeline can be analysed as follows: + - When the input queue usage percentage of one operator is high, and the output queue usage percentage is low, the operator may be the bottleneck. - For the leftmost operator, if the usage percentage of all the queues on the right is low, the operator may be the bottleneck. -- For the rightmost operator, if the usage percentage of all the queues on the left is high, the operator may be the bottleneck. +- For the rightmost operator, if the usage percentage of all the queues on the left is high, the operator may be the bottleneck. To optimize the performance of MindData operators, there are some suggestions: + - If the Dataset Operator is the bottleneck, try to increase the `num_parallel_workers`. - If a GeneratorOp type operator is the bottleneck, try to increase the `num_parallel_workers` and replace the operator with `MindRecordDataset`. - If a MapOp type operator is the bottleneck, try to increase the `num_parallel_workers`. If it is a Python operator, try to optimize the training script. -- If a BatchOp type operator is the bottleneck, try to adjust the size of `prefetch_size`. +- If a BatchOp type operator is the bottleneck, try to adjust the size of `prefetch_size`. #### Timeline Analysis -The Timeline component can display: +The Timeline component can display: + - The device on which the operators (AICore/AICPU operators) are executed.
- The MindSpore stream split strategy for this neural network. - The execution sequence and execution time of the operator on the device. Users can get the most detailed information from the Timeline: + - From the High level, users can analyse whether the stream split strategy can be optimized and whether the step tail is too long. - From the Low level, users can analyse the execution time for all the operators, etc. Users can click the download button on the overall performance page to view Timeline details. The Timeline data file (json format) will be stored on local machine, and can be displayed by tools. We suggest to use `chrome://tracing` or [Perfetto](https://ui.perfetto.dev/#!viewer) to visualize the Timeline. + - Chrome tracing: Click "load" on the upper left to load the file. - Perfetto: Click "Open trace file" on the left to load the file. @@ -184,8 +192,9 @@ Users can click the download button on the overall performance page to view Time Figure 7: Timeline Analysis The Timeline consists of the following parts: + - Device and Stream List: It will show the stream list on each device. Each stream consists of a series of tasks. One rectangle stands for one task, and the area stands for the execution time of the task. -- The Operator Information: When we click one task, the corresponding operator of this task will be shown at the bottom. +- The Operator Information: When we click one task, the corresponding operator of this task will be shown at the bottom. W/A/S/D can be applied to zoom in and out of the Timeline graph. 
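
Besides the visual tools, the downloaded Timeline file can be inspected with a short script. The following is a minimal sketch, assuming that the file follows the Chrome Trace Event Format (which is likely because it loads in `chrome://tracing`) and that events carry `name` and `dur` fields. These field names are assumptions, and `load_timeline` and `summarize_timeline` are hypothetical helpers, not MindInsight APIs.

```python
import json
from collections import defaultdict


def load_timeline(path):
    """Load trace events from a Timeline JSON file."""
    with open(path) as f:
        data = json.load(f)
    # Chrome trace files are either a bare list of events or an object
    # that holds the list under "traceEvents".
    return data["traceEvents"] if isinstance(data, dict) else data


def summarize_timeline(events, top_n=5):
    """Sum the duration per event name and return the top_n longest entries."""
    totals = defaultdict(float)
    for event in events:
        # Only events that carry a duration are counted.
        if isinstance(event, dict) and "dur" in event:
            totals[event.get("name", "<unnamed>")] += float(event["dur"])
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Hand-written sample events standing in for a real Timeline file.
sample = [
    {"name": "Conv2D", "ts": 0, "dur": 120.0},
    {"name": "MatMul", "ts": 130, "dur": 80.0},
    {"name": "Conv2D", "ts": 220, "dur": 100.0},
]
print(summarize_timeline(sample))
```

For a real file, pass the result of `load_timeline("<path to the downloaded file>")` to `summarize_timeline`. The summary gives a quick list of the tasks that dominate the execution time before opening the full graph.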
diff --git a/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md b/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md index e989af720650ac32483bcdf43237c693dd4d3be2..be91784497ba1cea230ecc92ea2a165cabab3cfc 100644 --- a/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md +++ b/tutorials/training/source_en/advanced_use/performance_profiling_gpu.md @@ -18,6 +18,7 @@ ## Overview + Performance data like operators' execution time is recorded in files and can be viewed on the web page, this can help the user optimize the performance of neural networks. ## Operation Process @@ -25,9 +26,8 @@ Performance data like operators' execution time is recorded in files and can be > The GPU operation process is the same as that in the Ascend chip. > > - > By default, common users do not have the permission to access the NVIDIA GPU performance counters on the target device. -> If common users need to use the profiler performance statistics capability in the training script, configure the permission by referring to the following description: +> If common users need to use the profiler performance statistics capability in the training script, configure the permission by referring to the following description: > > @@ -48,7 +48,7 @@ class StopAtStep(Callback): self.start_step = start_step self.stop_step = stop_step self.already_analysed = False - + def step_begin(self, run_context): cb_params = run_context.original_args() step_num = cb_params.cur_step_num @@ -61,7 +61,7 @@ class StopAtStep(Callback): if step_num == self.stop_step and not self.already_analysed: self.profiler.analyse() self.already_analysed = True - + def end(self, run_context): if not self.already_analysed: self.profiler.analyse() @@ -73,7 +73,6 @@ The code above is just an example. 
Users should implement callback by themselves The MindInsight launch command can refer to [MindInsight Commands](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/mindinsight_commands.html). - ### Performance Analysis Users can access the Performance Profiler by selecting a specific training from the training list, and click the performance profiling link. And the Performance Profiler only supports operation analysis and Timeline Analysis now, the others modules will be published soon. @@ -83,6 +82,7 @@ Users can access the Performance Profiler by selecting a specific training from Figure 1: Overall Performance Figure 1 displays the overall performance of the training, including the overall data of Step Trace, Operator Performance, MindData Performance and Timeline: + - Operator Performance: It will collect the average execution time of operators and operator types. The overall performance page will show the pie graph for different operator types. - Timeline: It will collect execution time for operations and CUDA activity. The tasks will be shown on the time axis. The overall performance page will show the statistics for tasks. @@ -98,7 +98,7 @@ Figure 2: Statistics for Operator Types Figure 2 displays the statistics for the operator types, including: -- Choose a pie or a bar graph to show the proportion time occupied by each operator type. The time of one operator type is calculated by accumulating the execution time of operators belong to this type. +- Choose a pie or a bar graph to show the proportion time occupied by each operator type. The time of one operator type is calculated by accumulating the execution time of operators belong to this type. - Display top 20 operator types with the longest average execution time, show the proportion of total time and average execution time (ms) of each operator type. 
The bottom half of Figure 2 displays the statistics table for the operators' details, including: @@ -123,4 +123,4 @@ The usage is almost the same as that in Ascend. The difference is GPU Timeline d > The usage is described as follows: > -> \ No newline at end of file +> diff --git a/tutorials/training/source_en/advanced_use/protect_user_privacy_with_differential_privacy.md b/tutorials/training/source_en/advanced_use/protect_user_privacy_with_differential_privacy.md index 299f44d107e0c1c8f60353e725418e1aab0ecd9d..5653313d5046bb17377adb961635d4cd89991ad7 100644 --- a/tutorials/training/source_en/advanced_use/protect_user_privacy_with_differential_privacy.md +++ b/tutorials/training/source_en/advanced_use/protect_user_privacy_with_differential_privacy.md @@ -22,7 +22,7 @@ Differential privacy is a mechanism for protecting user data privacy. What is privacy? Privacy refers to the attributes of individual users. Common attributes shared by a group of users may not be considered as privacy. For example, if we say "smoking people have a higher probability of getting lung cancer", it does not disclose privacy. However, if we say "Zhang San smokes and gets lung cancer", it discloses the privacy of Zhang San. Assume that there are 100 patients in a hospital and 10 of them have lung cancer. If the information of any 99 patients are known, we can infer whether the remaining one has lung cancer. This behavior of stealing privacy is called differential attack. Differential privacy is a method for preventing differential attacks. By adding noise, the query results of two datasets with only one different record are nearly indistinguishable. In the above example, after differential privacy is used, the statistic information of the 100 patients achieved by the attacker is almost the same as that of the 99 patients. Therefore, the attacker can hardly infer the information of the remaining one patient. 
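
The hospital example can be sketched in a few lines. The following is a minimal illustration that uses the Laplace mechanism on a counting query, which is a textbook technique and not the Gaussian-mechanism optimizer that MindArmour implements. The function `noisy_count` and the value of `epsilon` are assumptions made for illustration only.

```python
import random


def noisy_count(records, epsilon, rng):
    """Return the number of positive records plus Laplace noise of scale 1/epsilon."""
    true_count = sum(records)
    # The difference of two exponential samples with rate epsilon follows a
    # Laplace distribution with scale 1/epsilon. A counting query changes by
    # at most 1 when one record is added or removed, so this scale is enough.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


rng = random.Random(0)
all_patients = [1] * 10 + [0] * 90   # 100 patients, 10 of them with lung cancer
without_one = all_patients[1:]       # the same data without one lung cancer patient

# The two noisy answers overlap heavily, so comparing them reveals little
# about the patient who was left out.
print(noisy_count(all_patients, epsilon=0.5, rng=rng))
print(noisy_count(without_one, epsilon=0.5, rng=rng))
```

A smaller `epsilon` adds more noise and hides the single record better, at the cost of a less accurate count.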
-**Differential privacy in machine learning** +**Differential privacy in machine learning:** Machine learning algorithms usually update model parameters and learn data features based on a large amount of data. Ideally, these models can learn the common features of a class of entities and achieve good generalization, such as "smoking patients are more likely to get lung cancer" rather than models with individual features, such as "Zhang San is a smoker who gets lung cancer." However, machine learning algorithms do not distinguish between general and individual features. The published machine learning models, especially the deep neural networks, may unintentionally memorize and expose the features of individual entities in training data. This can be exploited by malicious attackers to reveal Zhang San's privacy information from the published model. Therefore, it is necessary to use differential privacy to protect machine learning models from privacy leakage. @@ -32,14 +32,14 @@ $Pr[\mathcal{K}(D)\in S] \le e^{\epsilon} Pr[\mathcal{K}(D') \in S]+\delta$ For datasets $D$ and $D'$ that differ on only one record, the probability of obtaining the same result from $\mathcal{K}(D)$ and $\mathcal{K}(D')$ by using a randomized algorithm $\mathcal{K}$ must meet the preceding formula. $\epsilon$ indicates the differential privacy budget and $\delta$ indicates the perturbation. The smaller the values of $\epsilon$ and $\delta$, the closer the data distribution output by $\mathcal{K}$ on $D$ and $D'$. -**Differential privacy measurement** +**Differential privacy measurement:** Differential privacy can be measured using $\epsilon$ and $\delta$. - $\epsilon$: specifies the upper limit of the output probability that can be changed when a record is added to or deleted from the dataset. We usually hope that $\epsilon$ is a small constant. A smaller value indicates stricter differential privacy conditions. - $\delta$: limits the probability of arbitrary model behavior change. 
Generally, this parameter is set to a small constant. You are advised to set this parameter to a value less than the reciprocal of the size of a training dataset. -**Differential privacy implemented by MindArmour** +**Differential privacy implemented by MindArmour:** MindArmour differential privacy module Differential-Privacy implements the differential privacy optimizer. Currently, SGD, Momentum, and Adam are supported. They are differential privacy optimizers based on the Gaussian mechanism. The Gaussian noise mechanism supports both a non-adaptive policy and an adaptive policy. The non-adaptive policy uses a fixed noise parameter for each step while the adaptive policy changes the noise parameter over time or iteration steps. An advantage of using the non-adaptive Gaussian noise is that a differential privacy budget $\epsilon$ can be strictly controlled. However, a disadvantage is that in a model training process, the noise amount added in each step is fixed. In the later training stage, large noise makes the model convergence difficult, and even causes the performance to decrease greatly and the model usability to be poor. Adaptive noise can solve this problem. In the initial model training stage, the amount of added noise is large. As the model converges, the amount of noise decreases gradually, and the impact of noise on model availability decreases. The disadvantage is that the differential privacy budget cannot be strictly controlled. Under the same initial value, the $\epsilon$ of the adaptive differential privacy is greater than that of the non-adaptive differential privacy. Rényi differential privacy (RDP) [2] is also provided to monitor differential privacy budgets. @@ -84,7 +84,7 @@ TAG = 'Lenet5_train' ### Configuring Parameters 1. Set the running environment, dataset path, model training parameters, checkpoint storage parameters, and differential privacy parameters. Replace 'data_path' with your data path. For more configurations, see .
- + ```python cfg = edict({ 'num_classes': 10, # the number of classes of model's output @@ -322,28 +322,29 @@ ds_train = generate_mnist_dataset(os.path.join(cfg.data_path, "train"), acc = model.eval(ds_eval, dataset_sink_mode=False) LOGGER.info(TAG, "============== Accuracy: %s ==============", acc) ``` - + 4. Run the following command. - + Execute the script: - + ```bash python lenet5_dp.py ``` - + In the preceding command, replace `lenet5_dp.py` with the name of your script. - + 5. Display the result. The accuracy of the LeNet model without differential privacy is 99%, and the accuracy of the LeNet model with Gaussian noise and adaptive clip differential privacy is mostly more than 95%. - ``` + + ```text ============== Starting Training ============== ... ============== Starting Testing ============== ... ============== Accuracy: 0.9698 ============== ``` - + ### References [1] C. Dwork and J. Lei. Differential privacy and robust statistics. In STOC, pages 371–380. ACM, 2009. @@ -351,6 +352,3 @@ ds_train = generate_mnist_dataset(os.path.join(cfg.data_path, "train"), [2] Ilya Mironov. Rényi differential privacy. In IEEE Computer Security Foundations Symposium, 2017. [3] Abadi, M. e. a., 2016. *Deep learning with differential privacy.* s.l.:Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. 
- - - diff --git a/tutorials/training/source_en/advanced_use/save_load_model_hybrid_parallel.md b/tutorials/training/source_en/advanced_use/save_load_model_hybrid_parallel.md index fe7b6d8c7ba28449f73dc473d6e2bd1fc6d49f5f..aa75b23de65b09a999c685bdfdc94563ab8f2762 100644 --- a/tutorials/training/source_en/advanced_use/save_load_model_hybrid_parallel.md +++ b/tutorials/training/source_en/advanced_use/save_load_model_hybrid_parallel.md @@ -90,11 +90,11 @@ Finally, save the updated parameter list to a file through the API provided by M Define the network, call the `load_checkpoint` and `load_param_into_net` APIs to import the checkpoint files to the network in rank id order, and then call `parameters_and_names` API to obtain all parameters in this network. -``` -net = Net() +```python +net = Net() opt = Momentum(learning_rate=0.01, momentum=0.9, params=net.get_parameters()) net = TrainOneStepCell(net, opt) -param_dicts = [] +param_dicts = [] for i in range(rank_size): file_name = os.path.join("./node"+str(i), "CKP_1-4_32.ckpt") # checkpoint file name of current node param_dict = load_checkpoint(file_name) @@ -115,7 +115,7 @@ In the preceding information: Call the `build_searched_strategy` API to obtain the slice strategy of model. -``` +```python strategy = build_searched_strategy("./strategy_train.ckpt") ``` @@ -131,7 +131,7 @@ The parameter name is model\_parallel\_weight and the dividing strategy is to pe 1. Obtain the data value on all nodes for model parallel parameters. - ``` + ```python sliced_parameters = [] for i in range(4): parameter = param_dicts[i].get("model_parallel_weight") @@ -142,32 +142,32 @@ The parameter name is model\_parallel\_weight and the dividing strategy is to pe 2. Call the `merge_sliced_parameter` API to merge the sliced parameters. 
+ ```python + merged_parameter = merge_sliced_parameter(sliced_parameters, strategy) ``` - merged_parameter = merge_sliced_parameter(sliced_parameters, strategy) - ``` - + > If there are multiple model parallel parameters, repeat steps 1 to 2 to process them one by one. ### Saving the Data and Generating a New Checkpoint File 1. Convert `param_dict` to `param_list`. - ``` + ```python param_list = [] for (key, value) in param_dict.items(): each_param = {} each_param["name"] = key if isinstance(value.data, Tensor): - param_data = value.data + param_data = value.data else: - param_data = Tensor(value.data) + param_data = Tensor(value.data) each_param["data"] = param_data param_list.append(each_param) ``` 2. Call the `save_checkpoint` API to write the parameter data to a file and generate a new checkpoint file. - ``` + ```python save_checkpoint(param_list, "./CKP-Integrated_1-4_32.ckpt") ``` @@ -186,7 +186,7 @@ If you need to load the integrated and saved checkpoint file to multi-device tra Call the `load_checkpoint` API to load model parameter data from the checkpoint file. -``` +```python param_dict = load_checkpoint("./CKP-Integrated_1-4_32.ckpt") ``` @@ -205,7 +205,7 @@ The following uses a specific model parameter as an example. The parameter name In the following code example, data is divided into two slices in dimension 0. - ``` + ```python new_param = parameter_dict["model_parallel_weight"] slice_list = np.split(new_param.data.asnumpy(), 2, axis=0) new_param_moments = parameter_dict["moments.model_parallel_weight"] @@ -214,8 +214,10 @@ The following uses a specific model parameter as an example.
The parameter name Data after dividing: - slice_list[0] --- [1, 2, 3, 4] Corresponding to device0 - slice_list[1] --- [5, 6, 7, 8] Corresponding to device1 + ```text + slice_list[0] --- [1, 2, 3, 4] Corresponding to device0 + slice_list[1] --- [5, 6, 7, 8] Corresponding to device1 + ``` Similar to slice\_list, slice\_moments\_list is divided into two tensors with the shape of \[1, 4]. @@ -223,7 +225,7 @@ The following uses a specific model parameter as an example. The parameter name Obtain rank\_id of the current node and load data based on rank\_id. - ``` + ```python rank = get_rank() tensor_slice = Tensor(slice_list[rank]) tensor_slice_moments = Tensor(slice_moments_list[rank]) @@ -233,7 +235,7 @@ The following uses a specific model parameter as an example. The parameter name 3. Modify values of model parameters. - ``` + ```python new_param.set_data(tensor_slice, True) new_param_moments.set_data(tensor_slice_moments, True) ``` @@ -244,8 +246,8 @@ The following uses a specific model parameter as an example. The parameter name Call the `load_param_into_net` API to load the model parameter data to the network. -``` -net = Net() +```python +net = Net() opt = Momentum(learning_rate=0.01, momentum=0.9, params=parallel_net.get_parameters()) load_param_into_net(net, param_dict) load_param_into_net(opt, param_dict) @@ -273,40 +275,38 @@ User process: 1. 
Run the following script to integrate the checkpoint files: - - - ``` + ```python python ./integrate_checkpoint.py "Name of the checkpoint file to be integrated" "Path and name of the checkpoint file generated after integration" "Path and name of the strategy file" "Number of nodes" ``` integrate\_checkpoint.py: - ``` + ```python import numpy as np import os import mindspore.nn as nn from mindspore import Tensor, Parameter from mindspore.ops import operations as P from mindspore.train.serialization import save_checkpoint, load_checkpoint, build_searched_strategy, merge_sliced_parameter - + class Net(nn.Cell): def __init__(self,weight_init): super(Net, self).__init__() self.weight = Parameter(Tensor(weight_init), "model_parallel_weight", layerwise_parallel=True) self.fc = P.MatMul(transpose_b=True) - + def construct(self, x): x = self.fc(x, self.weight1) return x - + def integrate_ckpt_file(old_ckpt_file, new_ckpt_file, strategy_file, rank_size): weight = np.ones([2, 8]).astype(np.float32) net = Net(weight) opt = Momentum(learning_rate=0.01, momentum=0.9, params=net.get_parameters()) net = TrainOneStepCell(net, opt) - + # load CheckPoint into net in rank id order - param_dicts = [] + param_dicts = [] for i in range(rank_size): file_name = os.path.join("./node"+str(i), old_ckpt_file) param_dict = load_checkpoint(file_name) @@ -315,21 +315,21 @@ User process: for _, param in net.parameters_and_names(): param_dict[param.name] = param param_dicts.append(param_dict) - + strategy = build_searched_strategy(strategy_file) param_dict = {} - + for paramname in ["model_parallel_weight", "moments.model_parallel_weight"]: # get layer wise model parallel parameter sliced_parameters = [] for i in range(rank_size): parameter = param_dicts[i].get(paramname) sliced_parameters.append(parameter) - + # merge the parallel parameters of the model - merged_parameter = merge_sliced_parameter(sliced_parameters, strategy) + merged_parameter = merge_sliced_parameter(sliced_parameters, 
strategy) param_dict[paramname] = merged_parameter - + # convert param_dict to list type data param_list = [] for (key, value) in param_dict.items(): @@ -339,14 +339,14 @@ User process: param_data = value.data else: param_data = Tensor(value.data) - each_param["data"] = param_data - param_list.append(each_param) - + each_param["data"] = param_data + param_list.append(each_param) + # call the API to generate a new CheckPoint file save_checkpoint(param_list, new_ckpt_file) - + return - + if __name__ == "__main__": try: old_ckpt_file = sys.argv[1] @@ -363,10 +363,10 @@ User process: Before the script is executed, the parameter values in the checkpoint files are as follows: - ``` + ```text device0: name is model_parallel_weight - value is + value is [[0.87537426 1.0448935 0.86736983 0.8836905 0.77354026 0.69588304 0.9183654 0.7792076] [0.87224025 0.8726848 0.771446 0.81967723 0.88974726 0.7988162 0.72919345 0.7677011]] name is learning_rate @@ -380,7 +380,7 @@ User process: device1: name is model_parallel_weight - value is + value is [[0.9210751 0.9050457 0.9827775 0.920396 0.9240526 0.9750359 1.0275179 1.0819869] [0.73605865 0.84631145 0.9746683 0.9386582 0.82902765 0.83565056 0.9702136 1.0514659]] name is learning_rate @@ -390,11 +390,11 @@ User process: name is moments.model_weight value is [[0.2417504 0.28193963 0.06713893 0.21510397 0.23380603 0.11424308 0.0218009 -0.11969765] - [0.45955992 0.22664294 0.01990281 0.0731914 0.27125207 0.27298513 -0.01716102 -0.15327111]] + [0.45955992 0.22664294 0.01990281 0.0731914 0.27125207 0.27298513 -0.01716102 -0.15327111]] device2: name is model_parallel_weight - value is + value is [[1.0108461 0.8689414 0.91719437 0.8805056 0.7994629 0.8999671 0.7585804 1.0287056 ] [0.90653455 0.60146594 0.7206475 0.8306303 0.8364681 0.89625114 0.7354735 0.8447268]] name is learning_rate @@ -402,7 +402,7 @@ User process: name is momentum value is [0.9] name is moments.model_weight - value is + value is [[0.03440702 0.41419312 0.24817684 
0.30765256 0.48516113 0.24904746 0.57791173 0.00955463] [0.13458519 0.6690533 0.49259356 0.28319967 0.25951773 0.16777472 0.45696738 0.24933104]] @@ -416,16 +416,16 @@ User process: name is momentum value is [0.9] name is moments.model_parallel_weight - value is + value is [[0.14152306 0.5040985 0.24455397 0.10907605 0.11319532 0.19538902 0.01208619 0.40430856] [-0.7773164 -0.47611716 -0.6041424 -0.6144473 -0.2651842 -0.31909415 -0.4510405 -0.12860501]] ``` After the script is executed, the parameter values in the checkpoint files are as follows: - ``` + ```text name is model_parallel_weight - value is + value is [[1.1138763 1.0962057 1.3516843 1.0812817 1.1579804 1.1078343 1.0906502 1.3207073] [0.916671 1.0781671 1.0368758 0.9680898 1.1735439 1.0628364 0.9960786 1.0135143] [0.8828271 0.7963984 0.90675324 0.9830291 0.89010954 0.897052 0.7890109 0.89784735] @@ -439,7 +439,7 @@ User process: name is momentum value is [0.9] name is moments.model_parallel_weight - value is + value is [[0.2567724 -0.07485991 0.282002 0.2456022 0.454939 0.619168 0.18964815 0.45714882] [0.25946522 0.24344791 0.45677605 0.3611395 0.23378398 0.41439137 0.5312468 0.4696194 ] [0.2417504 0.28193963 0.06713893 0.21510397 0.23380603 0.11424308 0.0218009 -0.11969765] @@ -453,7 +453,7 @@ User process: 2. Execute stage 2 training and load the checkpoint file before training. The training code needs to be supplemented based on the site requirements. - ``` + ```python import numpy as np import os import mindspore.nn as nn @@ -497,7 +497,7 @@ User process: load_param_into_net(net, param_dict) opt = Momentum(learning_rate=0.01, momentum=0.9, params=parallel_net.get_parameters()) load_param_into_net(opt, param_dict) - # train code + # train code ... 
if __name__ == "__main__": @@ -506,7 +506,7 @@ User process: label = np.random.random((4, 4)).astype(np.float32) train_mindspore_impl_fc(input, label, weight1) ``` - + In the preceding information: - `mode=context.GRAPH_MODE`: sets the running mode to graph mode for distributed training. (The PyNative mode does not support parallel running.) @@ -515,10 +515,10 @@ User process: Parameter values after loading: - ``` + ```text device0: name is model_parallel_weight - value is + value is [[0.87537426 1.0448935 0.86736983 0.8836905 0.77354026 0.69588304 0.9183654 0.7792076] [0.87224025 0.8726848 0.771446 0.81967723 0.88974726 0.7988162 0.72919345 0.7677011] [0.8828271 0.7963984 0.90675324 0.9830291 0.89010954 0.897052 0.7890109 0.89784735] @@ -536,7 +536,7 @@ User process: device1: name is model_parallel_weight - value is + value is [[1.0053468 0.98402303 0.99762845 0.97587246 1.0259694 1.0055295 0.99420834 0.9496847] [1.0851002 1.0295962 1.0999886 1.0958165 0.9765328 1.146529 1.0970603 1.1388365] [0.7147005 0.9168278 0.80178416 0.6258351 0.8413766 0.5909515 0.696347 0.71359116] @@ -550,5 +550,5 @@ User process: [[0.03440702 0.41419312 0.24817684 0.30765256 0.48516113 0.24904746 0.57791173 0.00955463] [0.13458519 0.6690533 0.49259356 0.28319967 0.25951773 0.16777472 0.45696738 0.24933104] [0.14152306 0.5040985 0.24455397 0.10907605 0.11319532 0.19538902 0.01208619 0.40430856] - [-0.7773164 -0.47611716 -0.6041424 -0.6144473 -0.2651842 -0.31909415 -0.4510405 -0.12860501]] + [-0.7773164 -0.47611716 -0.6041424 -0.6144473 -0.2651842 -0.31909415 -0.4510405 -0.12860501]] ``` diff --git a/tutorials/training/source_en/advanced_use/summary_record.md b/tutorials/training/source_en/advanced_use/summary_record.md index 133317016f590f38bf63463de75dad948ba0a2c6..eb3760d6759e2d292cd0185be2d3062b9719b0c3 100644 --- a/tutorials/training/source_en/advanced_use/summary_record.md +++ b/tutorials/training/source_en/advanced_use/summary_record.md @@ -41,6 +41,7 @@ The `Callback` mechanism in 
MindSpore provides a quick and easy way to collect c When you write a training script, you just instantiate the `SummaryCollector` and apply it to either `model.train` or `model.eval`. You can automatically collect some common summary data. The detailed usage of `SummaryCollector` can refer to the `API` document `mindspore.train.callback.SummaryCollector`. The sample code is as follows: + ```python import mindspore import mindspore.nn as nn @@ -126,9 +127,10 @@ model.eval(ds_eval, callbacks=[summary_collector]) ### Method two: Custom collection of network data with summary operators and SummaryCollector -In addition to providing the `SummaryCollector` that automatically collects some summary data, MindSpore provides summary operators that enable customized collection of other data on the network, such as the input of each convolutional layer, or the loss value in the loss function, etc. +In addition to providing the `SummaryCollector` that automatically collects some summary data, MindSpore provides summary operators that enable customized collection of other data on the network, such as the input of each convolutional layer, or the loss value in the loss function, etc. The following summary operators are currently supported: + - [ScalarSummary](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.ops.html#mindspore.ops.ScalarSummary): Record a scalar data. - [TensorSummary](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.ops.html#mindspore.ops.TensorSummary): Record a tensor data. - [ImageSummary](https://www.mindspore.cn/doc/api_python/en/master/mindspore/mindspore.ops.html#mindspore.ops.ImageSummary): Record a image data. @@ -193,7 +195,7 @@ class MyOptimizer(Optimizer): self.histogram_summary(self.weight_names[0], self.paramters[0]) # Record gradient self.histogram_summary(self.weight_names[0] + ".gradient", grads[0]) - + ...... 
@@ -252,18 +254,18 @@ The detailed usage of `SummaryRecord` can refer to the `API` document `mindspore The sample code is as follows: -``` +```python from mindspore.train.callback import Callback from mindspore.train.summary import SummaryRecord class ConfusionMatrixCallback(Callback): def __init__(self, summary_dir): self._summary_dir = summary_dir - + def __enter__(self): # init you summary record in here, when the train script run, it will be inited before training self.summary_record = SummaryRecord(summary_dir) - + def __exit__(self, *exc_args): # Note: you must close the summary record, it will release the process pool resource # else your training script will not exit from training. @@ -274,7 +276,7 @@ class ConfusionMatrixCallback(Callback): cb_params = run_context.run_context.original_args() # create a confusion matric image, and record it to summary file - confusion_martrix = create_confusion_matrix(cb_params) + confusion_martrix = create_confusion_matrix(cb_params) self.summary_record.add_value('image', 'confusion_matrix', confusion_matric) self.summary_record.record(cb_params.cur_step) @@ -291,24 +293,28 @@ the `save_graphs` option of `context.set_context` in the training script is set In the saved files, `ms_output_after_hwopt.pb` is the computational graph after operator fusion, which can be viewed on the web page. ## Run MindInsight + After completing the data collection in the tutorial above, you can start MindInsight to visualize the collected data. When start MindInsight, you need to specify the summary log file directory with the `--summary-base-dir` parameter. The specified summary log file directory can be the output directory of a training or the parent directory of the output directory of multiple training. 
The output directory structure for a training is as follows -``` + +```text └─summary_dir events.out.events.summary.1596869898.hostname_MS events.out.events.summary.1596869898.hostname_lineage ``` Execute command: + ```Bash mindinsight start --summary-base-dir ./summary_dir ``` The output directory structure of multiple training is as follows: -``` + +```text └─summary ├─summary_dir1 │ events.out.events.summary.1596869898.hostname_MS @@ -320,6 +326,7 @@ The output directory structure of multiple training is as follows: ``` Execute command: + ```Bash mindinsight start --summary-base-dir ./summary ``` @@ -327,6 +334,7 @@ mindinsight start --summary-base-dir ./summary After successful startup, the visual page can be viewed by visiting the `http://127.0.0.1:8080` address through the browser. Stop MindInsight command: + ```Bash mindinsight stop ``` @@ -339,12 +347,13 @@ For more parameter Settings, see the [MindInsight related commands](https://www. 2. Multiple `SummaryRecord` instances can not be used at the same time. (`SummaryRecord` is used in `SummaryCollector`) - If you use two or more instances of `SummaryCollector` in the callback list of 'model.train' or 'model.eval', it is seen as using multiple `SummaryRecord` instances at the same time, and it will cause recoding data failure. + If you use two or more instances of `SummaryCollector` in the callback list of 'model.train' or 'model.eval', it is seen as using multiple `SummaryRecord` instances at the same time, and it will cause recoding data failure. If the customized callback uses `SummaryRecord`, it can not be used with `SummaryCollector` at the same time. Correct code: - ``` + + ```python ... summary_collector = SummaryCollector('./summary_dir') model.train(2, train_dataset, callbacks=[summary_collector]) @@ -353,7 +362,8 @@ For more parameter Settings, see the [MindInsight related commands](https://www. ``` Wrong code: - ``` + + ```python ... 
summary_collector1 = SummaryCollector('./summary_dir1') summary_collector2 = SummaryCollector('./summary_dir2') @@ -361,7 +371,8 @@ For more parameter Settings, see the [MindInsight related commands](https://www. ``` Wrong code: - ``` + + ```python ... # Note: the 'ConfusionMatrixCallback' is user-defined, and it uses SummaryRecord to record data. confusion_callback = ConfusionMatrixCallback('./summary_dir1') @@ -371,4 +382,4 @@ For more parameter Settings, see the [MindInsight related commands](https://www. 3. In each Summary log file directory, only one training data should be placed. If a summary log directory contains summary data from multiple training, MindInsight will overlay the summary data from these training when visualizing the data, which may not be consistent with the expected visualizations. -4. Currently, `SummaryCollector` and `SummaryRecord` do not support scenarios with GPU multi-card running. \ No newline at end of file +4. Currently, `SummaryCollector` and `SummaryRecord` do not support scenarios with GPU multi-card running. diff --git a/tutorials/training/source_en/advanced_use/test_model_security_fuzzing.md b/tutorials/training/source_en/advanced_use/test_model_security_fuzzing.md index ae637cdf6d71fff01473574b34e21f70cc708836..8d61b0abdbaceb7871c764641783ca65e5a93f3a 100644 --- a/tutorials/training/source_en/advanced_use/test_model_security_fuzzing.md +++ b/tutorials/training/source_en/advanced_use/test_model_security_fuzzing.md @@ -75,7 +75,7 @@ For details about the API configuration, see the `context.set_context`. images = data[0].asnumpy().astype(np.float32) train_images.append(images) train_images = np.concatenate(train_images, axis=0) - + # get test data data_list = "../common/dataset/MNIST/test" batch_size = 32 @@ -101,7 +101,7 @@ For details about the API configuration, see the `context.set_context`. The data mutation method must include the method based on the image pixel value changes. 
- The first two image transform methods support user-defined configuration parameters and randomly generated parameters by algorithms. For user-defined configuration parameters see the class methods corresponding to https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/fuzz_testing/image_transform.py. For randomly generated parameters by algorithms you can set method's params to `'auto_param': [True]`. The mutation parameters are randomly generated within the recommended range. + The first two image transform methods support user-defined configuration parameters and randomly generated parameters by algorithms. For user-defined configuration parameters see the class methods corresponding to <https://gitee.com/mindspore/mindarmour/blob/master/mindarmour/fuzz_testing/image_transform.py>. For randomly generated parameters by algorithms you can set method's params to `'auto_param': [True]`. The mutation parameters are randomly generated within the recommended range. For details about how to set parameters based on the attack defense method, see the corresponding attack method class. @@ -144,7 +144,7 @@ For details about the API configuration, see the `context.set_context`. # make initial seeds initial_seeds = [] for img, label in zip(test_images, test_labels): - initial_seeds.append([img, label]) + initial_seeds.append([img, label]) initial_seeds = initial_seeds[:100] ``` @@ -174,7 +174,7 @@ For details about the API configuration, see the `context.set_context`. 6. Experiment results. - The results of fuzz testing contains five aspect data: + The results of fuzz testing contains five aspect data: - fuzz_samples: mutated samples in fuzz testing. - true_labels: the ground truth labels of fuzz_samples. @@ -188,8 +188,8 @@ For details about the API configuration, see the `context.set_context`.
```python if metrics: - for key in metrics: - LOGGER.info(TAG, key + ': %s', metrics[key]) + for key in metrics: + LOGGER.info(TAG, key + ': %s', metrics[key]) ``` The fuzz testing result is as follows: diff --git a/tutorials/training/source_en/quick_start/linear_regression.md b/tutorials/training/source_en/quick_start/linear_regression.md index d437a48ee9811ff373afb270101f620a9a325d5e..94dec03fb35e0e9f493722ef72efe8597a2d77bb 100644 --- a/tutorials/training/source_en/quick_start/linear_regression.md +++ b/tutorials/training/source_en/quick_start/linear_regression.md @@ -29,7 +29,6 @@ Author: [Yi Yang](https://github.com/helloyesterday)    Edit    - ## Overview Regression algorithms usually use a series of properties to predict a value, and the predicted values are consecutive. For example, the price of a house is predicted based on some given feature data of the house, such as area and the number of bedrooms; or future temperature conditions are predicted by using the temperature change data and satellite cloud images in the last week. If the actual price of the house is CNY5 million, and the value predicted through regression analysis is CNY4.99 million, the regression analysis is considered accurate. For machine learning problems, common regression analysis includes linear regression, polynomial regression, and logistic regression. This example describes the linear regression algorithms and how to use MindSpore to perform linear regression AI training. @@ -50,7 +49,6 @@ Complete MindSpore running configuration. Third-party support package: `matplotlib`. If this package is not installed, run the `pip install matplotlib` command to install it first. - ```python from mindspore import context @@ -67,7 +65,6 @@ context.set_context(mode=context.GRAPH_MODE, device_target="CPU") `get_data` is used to generate training and test datasets. Since linear data is fitted, the required training datasets should be randomly distributed around the objective function. 
Assume that the objective function to be fitted is $f(x)=2x+3$. $f(x)=2x+3+noise$ is used to generate training datasets, and `noise` is a random value that complies with standard normal distribution rules. - ```python import numpy as np @@ -81,7 +78,6 @@ def get_data(num, w=2.0, b=3.0): Use `get_data` to generate 50 groups of test data and visualize them. - ```python import matplotlib.pyplot as plt @@ -98,10 +94,8 @@ plt.show() The output is as follows: - ![png](./images/linear_regression_eval_datasets.png) - In the preceding figure, the green line indicates the objective function, and the red points indicate the verification data `eval_data`. ### Defining the Data Argumentation Function @@ -112,7 +106,6 @@ Use the MindSpore data conversion function `GeneratorDataset` to convert the dat - `batch`: combines `batch_size` pieces of data into a batch. - `repeat`: multiplies the number of datasets. - ```python from mindspore import dataset as ds @@ -125,13 +118,12 @@ def create_dataset(num_data, batch_size=16, repeat_size=1): Use the dataset argumentation function to generate training data and view the training data format. 
- ```python num_data = 1600 batch_size = 16 repeat_size = 1 -ds_train = create_dataset(num_data, batch_size=batch_size, repeat_size=repeat_size) +ds_train = create_dataset(num_data, batch_size=batch_size, repeat_size=repeat_size) print("The dataset size of ds_train:", ds_train.get_dataset_size()) dict_datasets = ds_train.create_dict_iterator().get_next() @@ -142,11 +134,12 @@ print("The y label value shape:", dict_datasets["label"].shape) The output is as follows: - The dataset size of ds_train: 100 - dict_keys(['data', 'label']) - The x label value shape: (16, 1) - The y label value shape: (16, 1) - +```text +The dataset size of ds_train: 100 +dict_keys(['data', 'label']) +The x label value shape: (16, 1) +The y label value shape: (16, 1) +``` Use the defined `create_dataset` to perform argumentation on the generated 1600 data records and set them into 100 datasets with the shape of 16 x 1. @@ -158,7 +151,6 @@ $$f(x)=wx+b\tag{1}$$ Use the Normal operator to randomly initialize the weights $w$ and $b$. - ```python from mindspore.common.initializer import Normal from mindspore import nn @@ -175,7 +167,6 @@ class LinearNet(nn.Cell): Call the network to view the initialized model parameters. - ```python net = LinearNet() model_params = net.trainable_params() @@ -184,18 +175,18 @@ print(model_params) The output is as follows: - [Parameter (name=fc.weight, value=Tensor(shape=[1, 1], dtype=Float32, - [[-7.35660456e-003]])), Parameter (name=fc.bias, value=Tensor(shape=[1], dtype=Float32, [-7.35660456e-003]))] - +```text +[Parameter (name=fc.weight, value=Tensor(shape=[1, 1], dtype=Float32, +[[-7.35660456e-003]])), Parameter (name=fc.bias, value=Tensor(shape=[1], dtype=Float32, [-7.35660456e-003]))] +``` After initializing the network model, visualize the initialized network function and training dataset to understand the model function before fitting. 
- ```python from mindspore import Tensor x_model_label = np.array([-10, 10, 0.1]) -y_model_label = (x_model_label * Tensor(model_params[0]).asnumpy()[0][0] + +y_model_label = (x_model_label * Tensor(model_params[0]).asnumpy()[0][0] + Tensor(model_params[1]).asnumpy()[0]) plt.scatter(x_eval_label, y_eval_label, color="red", s=5) @@ -206,10 +197,8 @@ plt.show() The output is as follows: - ![png](./images/model_net_and_eval_datasets.png) - As shown in the preceding figure, the initialized model function in blue differs greatly from the objective function in green. ## Defining and Associating the Forward and Backward Propagation Networks @@ -237,7 +226,6 @@ A forward propagation network consists of two parts: The following method is used in MindSpore: - ```python net = LinearNet() net_loss = nn.loss.MSELoss() @@ -258,7 +246,6 @@ Parameters in formula 3 are described as follows: After all weight values in the function are updated, transfer the values to the model function. This process is the backward propagation. To implement this process, the optimizer function in MindSpore is required. - ```python opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9) ``` @@ -267,7 +254,6 @@ opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9) After forward propagation and backward propagation are defined, call the `Model` function in MindSpore to associate the previously defined networks, loss functions, and optimizer function to form a complete computing network. - ```python from mindspore.train import Model @@ -280,7 +266,6 @@ model = Model(net, net_loss, opt) To make the entire training process easier to understand, the test data, objective function, and model network of the training process need to be visualized. The following defines a visualization function which is called after each training step to display a fitting process of the model network. 
- ```python import matplotlib.pyplot as plt import time @@ -293,7 +278,7 @@ def plot_model_and_datasets(net, eval_data): x1, y1 = zip(*eval_data) x_target = x y_target = x_target * 2 + 3 - + plt.axis([-11, 11, -20, 25]) plt.scatter(x1, y1, color="red", s=5) plt.plot(x, y, color="blue") @@ -306,7 +291,6 @@ def plot_model_and_datasets(net, eval_data): MindSpore provides tools to customize the model training process. The following calls the visualization function in `step_end` to display the fitting process. For more information, see [Customized Debugging Information](https://www.mindspore.cn/tutorial/training/en/master/advanced_use/custom_debugging_info.html#callback). - ```python from IPython import display from mindspore.train.callback import Callback @@ -315,7 +299,7 @@ class ImageShowCallback(Callback): def __init__(self, net, eval_data): self.net = net self.eval_data = eval_data - + def step_end(self, run_context): plot_model_and_datasets(self.net, self.eval_data) display.clear_output(wait=True) @@ -330,7 +314,6 @@ After the preceding process is complete, use the training parameter `ds_train` t - `callbacks`: Required callback function during training. - `dataset_sink_model`: Dataset offload mode, which supports the Ascend and GPU computing platforms. In this example, this parameter is set to False for the CPU computing platform. - ```python from mindspore.train.callback import LossMonitor @@ -345,13 +328,12 @@ print(net.trainable_params()[0], "\n%s" % net.trainable_params()[1]) The output is as follows: - ![gif](./images/linear_regression.gif) - - Parameter (name=fc.weight, value=[[2.0065749]]) - Parameter (name=fc.bias, value=[3.0089042]) - +```text +Parameter (name=fc.weight, value=[[2.0065749]]) +Parameter (name=fc.bias, value=[3.0089042]) +``` After the training is complete, the weight parameters of the final model are printed. The value of weight is close to 2.0 and the value of bias is close to 3.0. As a result, the model training meets the expectation. 
diff --git a/tutorials/training/source_en/quick_start/quick_start.md b/tutorials/training/source_en/quick_start/quick_start.md index b7bad45c0f0ef93ab96835c001f95dce81dcb6c3..f9f2dc93c54cfb2295951458a49ed12ab786fa11 100644 --- a/tutorials/training/source_en/quick_start/quick_start.md +++ b/tutorials/training/source_en/quick_start/quick_start.md @@ -32,6 +32,7 @@ This document uses a practice example to demonstrate the basic functions of MindSpore. For common users, it takes 20 to 30 minutes to complete the practice. During the practice, a simple image classification function is implemented. The overall process is as follows: + 1. Process the required dataset. The MNIST dataset is used in this example. 2. Define a network. The LeNet network is used in this example. 3. Define the loss function and optimizer. @@ -39,7 +40,7 @@ During the practice, a simple image classification function is implemented. The 5. Load the saved model for inference. 6. Validate the model, load the test dataset and trained model, and validate the result accuracy. -> You can find the complete executable sample code at . +> You can find the complete executable sample code at . This is a simple and basic workflow. For applying to other advanced and complex applications, extend this basic process as appropriate. @@ -61,7 +62,7 @@ Download the files, decompress them, and store them in the workspace directories The directory structure is as follows: -``` +```text └─MNIST_Data ├─test │ t10k-images.idx3-ubyte @@ -71,6 +72,7 @@ The directory structure is as follows: train-images.idx3-ubyte train-labels.idx1-ubyte ``` + > For ease of use, we added the function of automatically downloading datasets in the sample script. ### Importing Python Libraries and Modules @@ -78,8 +80,7 @@ The directory structure is as follows: Before start, you need to import Python libraries. Currently, the `os` libraries are required. For ease of understanding, other required libraries will not be described here. 
- - + ```python import os ``` @@ -156,7 +157,7 @@ def create_dataset(data_path, batch_size=32, repeat_size=1, rescale_op = CV.Rescale(rescale, shift) # rescale images hwc2chw_op = CV.HWC2CHW() # change shape from (height, width, channel) to (channel, height, width) to fit network. type_cast_op = C.TypeCast(mstype.int32) # change data type of label to int32 to fit network - + # apply map operations on images mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers) mnist_ds = mnist_ds.map(operations=resize_op, input_columns="image", num_parallel_workers=num_parallel_workers) @@ -182,7 +183,6 @@ Perform the shuffle and batch operations, and then perform the repeat operation > MindSpore supports multiple data processing and augmentation operations, which are usually used in combined. For details, see section [Data Processing](https://www.mindspore.cn/tutorial/training/en/master/use/data_preparation.html) in the MindSpore Tutorials. - ## Defining the Network The LeNet network is relatively simple. In addition to the input layer, the LeNet network has seven layers, including two convolutional layers, two down-sample layers (pooling layers), and three full connection layers. Each layer contains different numbers of training parameters, as shown in the following figure: @@ -195,7 +195,7 @@ You can initialize the full connection layers and convolutional layers by `Norma MindSpore supports multiple parameter initialization methods, such as `TruncatedNormal`, `Normal`, and `Uniform`, default value is `Normal`. For details, see the description of the `mindspore.common.initializer` module in the MindSpore API. -To use MindSpore for neural network definition, inherit `mindspore.nn.cell.Cell`. `Cell` is the base class of all neural networks (such as `Conv2d`). +To use MindSpore for neural network definition, inherit `mindspore.nn.Cell`. `Cell` is the base class of all neural networks (such as `Conv2d`). 
Define each layer of a neural network in the `__init__` method in advance, and then define the `construct` method to complete the forward construction of the neural network. According to the structure of the LeNet network, define the network layers as follows: @@ -237,7 +237,7 @@ class LeNet5(nn.Cell): Before definition, this section briefly describes concepts of loss function and optimizer. - Loss function: It is also called objective function and is used to measure the difference between a predicted value and an actual value. Deep learning reduces the value of the loss function by continuous iteration. Defining a good loss function can effectively improve the model performance. -- Optimizer: It is used to minimize the loss function, improving the model during training. +- Optimizer: It is used to minimize the loss function, improving the model during training. After the loss function is defined, the weight-related gradient of the loss function can be obtained. The gradient is used to indicate the weight optimization direction for the optimizer, improving model performance. @@ -291,9 +291,9 @@ from mindspore.train.callback import ModelCheckpoint, CheckpointConfig if __name__ == "__main__": ... # set parameters of check point - config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=10) + config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=10) # apply parameters of check point - ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet", config=config_ck) + ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet", config=config_ck) ... ``` @@ -318,24 +318,27 @@ def train_net(args, model, epoch_size, mnist_path, repeat_size, ckpoint_cb, sink if __name__ == "__main__": ... - - epoch_size = 1 + + epoch_size = 1 mnist_path = "./MNIST_Data" repeat_size = 1 model = Model(network, net_loss, net_opt, metrics={"Accuracy": Accuracy()}) train_net(args, model, epoch_size, mnist_path, repeat_size, ckpoint_cb, dataset_sink_mode) ... 
``` -In the preceding information: + +In the preceding information: In the `train_net` method, we loaded the training dataset, `MNIST path` is MNIST dataset path. ## Running and Viewing the Result Run the script using the following command: -``` + +```bash python lenet.py --device_target=CPU ``` -In the preceding information: + +In the preceding information: `Lenet. Py`: the script file you wrote. `--device_target CPU`: Specify the hardware platform.The parameters are 'CPU', 'GPU' or 'Ascend'. @@ -375,7 +378,6 @@ In the preceding information: After obtaining the model file, we verify the generalization ability of the model. - ```python from mindspore.train.serialization import load_checkpoint, load_param_into_net @@ -396,23 +398,24 @@ if __name__ == "__main__": test_net(network, model, mnist_path) ``` -In the preceding information: +In the preceding information: `load_checkpoint`: This API is used to load the CheckPoint model parameter file and return a parameter dictionary. `checkpoint_lenet-3_1404.ckpt`: name of the saved CheckPoint model file. `load_param_into_net`: This API is used to load parameters to the network. - Run the script using the following command: -``` + +```bash python lenet.py --device_target=CPU ``` -In the preceding information: + +In the preceding information: `Lenet. Py`: the script file you wrote. `--device_target CPU`: Specify the hardware platform.The parameters are 'CPU', 'GPU' or 'Ascend'. 
After executing the command, the result is displayed as follows: -``` +```text ============== Starting Testing ============== ============== Accuracy:{'Accuracy': 0.9663477564102564} ============== ``` diff --git a/tutorials/training/source_en/quick_start/quick_video.md b/tutorials/training/source_en/quick_start/quick_video.md index 05fdbb0749055b1e273d0761fe35aba4a89956b1..4c8b795bfccbd080b9d91d3647b177a89ab20347 100644 --- a/tutorials/training/source_en/quick_start/quick_video.md +++ b/tutorials/training/source_en/quick_start/quick_video.md @@ -108,7 +108,6 @@ Provides video tutorials from installation to try-on, helping you quickly use Mi - ## MindSpore Experience @@ -209,6 +208,14 @@ Provides video tutorials from installation to try-on, helping you quickly use Mi + + + + +## Training Process Visualization-MindInsight + + + @@ -371,4 +402,4 @@ Provides video tutorials from installation to try-on, helping you quickly use Mi - \ No newline at end of file + diff --git a/tutorials/training/source_en/quick_start/quick_video/loading_the_dataset_and_converting_data_format.md b/tutorials/training/source_en/quick_start/quick_video/loading_the_dataset_and_converting_data_format.md index 6e6d6c115eacc4cde4713f3ec63b4d9d9b3f7b92..cdd21d37f314ae52f230eeef6c16721190184ffa 100644 --- a/tutorials/training/source_en/quick_start/quick_video/loading_the_dataset_and_converting_data_format.md +++ b/tutorials/training/source_en/quick_start/quick_video/loading_the_dataset_and_converting_data_format.md @@ -5,4 +5,3 @@ - \ No newline at end of file diff --git a/tutorials/training/source_en/quick_start/quick_video/mindInsight_dashboard.md b/tutorials/training/source_en/quick_start/quick_video/mindInsight_dashboard.md index 52b39f1841978bb0b2129b1fcd87a10c76bdc60b..8a159643e6b0de231dcc83f8835930d6a9586cf9 100644 --- a/tutorials/training/source_en/quick_start/quick_video/mindInsight_dashboard.md +++ b/tutorials/training/source_en/quick_start/quick_video/mindInsight_dashboard.md @@ -8,4 
+8,4 @@ **Install now**: -**See more**: \ No newline at end of file +**See more**: diff --git a/tutorials/training/source_en/quick_start/quick_video/mindInsight_installation_and_common_commands.md b/tutorials/training/source_en/quick_start/quick_video/mindInsight_installation_and_common_commands.md index cca2d82e2a0859c8b15a87a4678c1ef3ff08fb6c..a0dae43ea1fb50de88e82e82ad6fac19028aca14 100644 --- a/tutorials/training/source_en/quick_start/quick_video/mindInsight_installation_and_common_commands.md +++ b/tutorials/training/source_en/quick_start/quick_video/mindInsight_installation_and_common_commands.md @@ -8,4 +8,4 @@ **Install now**: -**More commands**: \ No newline at end of file +**More commands**: diff --git a/tutorials/training/source_en/quick_start/quick_video/mindInsight_lineage_and_scalars_comparision.md b/tutorials/training/source_en/quick_start/quick_video/mindInsight_lineage_and_scalars_comparision.md index 4960609f2998245eb4af5c1abb547c45c33c9e8f..5b762e715851344fe9d288882196614060bad2d9 100644 --- a/tutorials/training/source_en/quick_start/quick_video/mindInsight_lineage_and_scalars_comparision.md +++ b/tutorials/training/source_en/quick_start/quick_video/mindInsight_lineage_and_scalars_comparision.md @@ -6,4 +6,4 @@ -**See more**: \ No newline at end of file +**See more**: diff --git a/tutorials/training/source_en/quick_start/quick_video/mindInsight_performance_profiling.md b/tutorials/training/source_en/quick_start/quick_video/mindInsight_performance_profiling.md new file mode 100644 index 0000000000000000000000000000000000000000..953ca8844386844edb24faefb57ce3d69d89e43a --- /dev/null +++ b/tutorials/training/source_en/quick_start/quick_video/mindInsight_performance_profiling.md @@ -0,0 +1,13 @@ +# MindInsight Performance Profiling + +[comment]: <> (This document contains Hands-on Tutorial Series. Gitee does not support display. 
Please check tutorials on the official website) + + + +**See more**: + + + + \ No newline at end of file diff --git a/tutorials/training/source_en/quick_start/quick_video/quick_start_video.md b/tutorials/training/source_en/quick_start/quick_video/quick_start_video.md index 8af0e535a933d34b8ed7d176ecbf57980d029013..914e35c6a520ebb06589aaf09c87011537a2628c 100644 --- a/tutorials/training/source_en/quick_start/quick_video/quick_start_video.md +++ b/tutorials/training/source_en/quick_start/quick_video/quick_start_video.md @@ -8,4 +8,4 @@ **View code**: -**View the full tutorial**: \ No newline at end of file +**View the full tutorial**: diff --git a/tutorials/training/source_en/quick_start/quick_video/saving_and_loading_model_parameters.md b/tutorials/training/source_en/quick_start/quick_video/saving_and_loading_model_parameters.md index 2fc01870d5cc721dd56d5da841c3564f56ca11c1..78a12a04c0f41419e4afd4d8641be8c5004cb135 100644 --- a/tutorials/training/source_en/quick_start/quick_video/saving_and_loading_model_parameters.md +++ b/tutorials/training/source_en/quick_start/quick_video/saving_and_loading_model_parameters.md @@ -6,4 +6,4 @@ -**View the full tutorial**: \ No newline at end of file +**View the full tutorial**: diff --git a/tutorials/training/source_en/use/load_dataset_image.md b/tutorials/training/source_en/use/load_dataset_image.md index ee57af47a4b2344d9cfc338158ee391ef7a7ec1a..67ea6b6352ccc4650c0069519fae3fcde270b5e7 100644 --- a/tutorials/training/source_en/use/load_dataset_image.md +++ b/tutorials/training/source_en/use/load_dataset_image.md @@ -28,7 +28,7 @@ This tutorial uses the MNIST dataset [1] as an example to demonstrate how to loa 1. Download and decompress the training [Image](http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz) and [Label](http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz) of the MNIST dataset to `./MNIST` directory. The directory structure is as follows. 
- ``` + ```text └─MNIST ├─train-images.idx3-ubyte └─train-labels.idx1-ubyte diff --git a/tutorials/training/source_en/use/load_dataset_text.md b/tutorials/training/source_en/use/load_dataset_text.md index e63808dbeda5034934dfcce8ea0d4ada6fd6b7d8..e4e9c34cdc20a1e91092794357d21f7ca658ca25 100644 --- a/tutorials/training/source_en/use/load_dataset_text.md +++ b/tutorials/training/source_en/use/load_dataset_text.md @@ -27,7 +27,7 @@ This tutorial briefly demonstrates how to load and process text data using MindS 1. Prepare the following text data. - ``` + ```text Welcome to Beijing! 北京欢迎您! 我喜欢English! @@ -35,7 +35,7 @@ This tutorial briefly demonstrates how to load and process text data using MindS 2. Create the `tokenizer.txt` file, copy the text data to the file, and save the file under `./test` directory. The directory structure is as follow. - ``` + ```text └─test └─tokenizer.txt ``` @@ -69,7 +69,7 @@ The following tutorial demonstrates loading datasets using the `TextFileDataset` The output without tokenization: - ``` + ```text Welcome to Beijing! 北京欢迎您! 我喜欢English! 
@@ -99,7 +99,7 @@ The following tutorial demonstrates how to perform data processing such as `Slid The output is as follows: - ``` + ```text ['大', '家', '早', '上', '好'] ``` @@ -118,7 +118,7 @@ The following tutorial demonstrates how to perform data processing such as `Slid The output is as follows: - ``` + ```text [['大', '家'], ['家', '早'], ['早', '上'], @@ -145,7 +145,7 @@ The following tutorial demonstrates how to perform data processing such as `Slid The output is as follows: - ``` + ```text c a d @@ -178,7 +178,7 @@ The following tutorial demonstrates how to use the `WhitespaceTokenizer` to toke The output after tokenization is as follows: - ``` + ```text ['Welcome', 'to', 'Beijing!'] ['北京欢迎您!'] ['我喜欢English!'] diff --git a/tutorials/training/source_en/use/load_model_for_inference_and_transfer.md b/tutorials/training/source_en/use/load_model_for_inference_and_transfer.md index d402b51700d83d2c95d8d7fcc0cbe57062becb22..8e6e891492d0a263b87a3b19835aea474c9d95db 100644 --- a/tutorials/training/source_en/use/load_model_for_inference_and_transfer.md +++ b/tutorials/training/source_en/use/load_model_for_inference_and_transfer.md @@ -1,4 +1,4 @@ -# Loading a Model for Inference and Transfer Learning +# Loading a Model for Inference and Transfer Learning `Linux` `Ascend` `GPU` `CPU` `Model Loading` `Beginner` `Intermediate` `Expert` @@ -50,6 +50,7 @@ The `eval` method validates the accuracy of the trained model. In the retraining and fine-tuning scenarios for task interruption, you can load network parameters and optimizer parameters to the model. The sample code is as follows: + ```python # return a parameter dict for model param_dict = load_checkpoint("resnet50-2_32.ckpt") @@ -105,11 +106,11 @@ The `load_checkpoint` method returns a parameter dictionary and then the `load_p ### For Transfer Training -When loading a model with `mindspore_hub.load` API, we can add an extra argument to load the feature extraction part of the model only. 
So we can easily add new layers to perform transfer learning. This feature can be found in the related model page when an extra argument (e.g., include_top) has been integrated into the model construction by the model developer. The value of `include_top` is True or False, indicating whether to keep the top layer in the fully-connected network. +When loading a model with `mindspore_hub.load` API, we can add an extra argument to load the feature extraction part of the model only. So we can easily add new layers to perform transfer learning. This feature can be found in the related model page when an extra argument (e.g., include_top) has been integrated into the model construction by the model developer. The value of `include_top` is True or False, indicating whether to keep the top layer in the fully-connected network. -We use GoogleNet as example to illustrate how to load a model trained on ImageNet dataset and then perform transfer learning (re-training) on specific sub-task dataset. The main steps are listed below: +We use GoogleNet as example to illustrate how to load a model trained on ImageNet dataset and then perform transfer learning (re-training) on specific sub-task dataset. The main steps are listed below: -1. Search the model of interest on [MindSpore Hub Website](https://www.mindspore.cn/resources/hub/) and get the related `url`. +1. Search the model of interest on [MindSpore Hub Website](https://www.mindspore.cn/resources/hub/) and get the related `url`. 2. Load the model from MindSpore Hub using the `url`. Note that the parameter `include_top` is provided by the model developer. 
@@ -142,7 +143,7 @@ We use GoogleNet as example to illustrate how to load a model trained on ImageNe super(ReduceMeanFlatten, self).__init__() self.mean = P.ReduceMean(keep_dims=True) self.flatten = nn.Flatten() - + def construct(self, x): x = self.mean(x, (2, 3)) x = self.flatten(x) @@ -197,7 +198,7 @@ We use GoogleNet as example to illustrate how to load a model trained on ImageNe data, label = items data = mindspore.Tensor(data) label = mindspore.Tensor(label) - + loss = train_net(data, label) print(f"epoch: {epoch}/{epoch_size}, loss: {loss}") # Save the ckpt file for each epoch. @@ -218,7 +219,7 @@ We use GoogleNet as example to illustrate how to load a model trained on ImageNe classification_layer = nn.Dense(last_channel, num_classes) classification_layer.set_train(False) softmax = nn.Softmax() - network = nn.SequentialCell([network, reducemean_flatten, + network = nn.SequentialCell([network, reducemean_flatten, classification_layer, softmax]) # Load a pre-trained ckpt file. @@ -237,4 +238,4 @@ We use GoogleNet as example to illustrate how to load a model trained on ImageNe res = model.eval(eval_dataset) print("result:", res, "ckpt=", ckpt_path) - ``` \ No newline at end of file + ``` diff --git a/tutorials/training/source_en/use/publish_model.md b/tutorials/training/source_en/use/publish_model.md index 2d208ae971327c20bda76fd21d1e6545ec501525..fc0e39991d8292a0861d16430ae1cbc77b265d51 100644 --- a/tutorials/training/source_en/use/publish_model.md +++ b/tutorials/training/source_en/use/publish_model.md @@ -14,15 +14,15 @@ ## Overview -[MindSpore Hub](https://www.mindspore.cn/resources/hub/) is a platform for storing pre-trained models provided by MindSpore or third-party developers. It provides application developers with simple model loading and fine-tuning APIs, which enables the users to perform inference or fine-tuning based on the pre-trained models and thus deploy to their own applications. 
Users can also submit their pre-trained models into MindSpore Hub following the specific steps. Thus other users can download and use the published models. +[MindSpore Hub](https://www.mindspore.cn/resources/hub/) is a platform for storing pre-trained models provided by MindSpore or third-party developers. It provides application developers with simple model loading and fine-tuning APIs, which enables the users to perform inference or fine-tuning based on the pre-trained models and thus deploy to their own applications. Users can also submit their pre-trained models into MindSpore Hub following the specific steps. Thus other users can download and use the published models. This tutorial uses GoogleNet as an example to describe how to submit models for model developers who are interested in publishing models into MindSpore Hub. ## How to publish models -You can publish models to MindSpore Hub via PR in [hub](https://gitee.com/mindspore/hub) repo. Here we use GoogleNet as an example to list the steps of model submission to MindSpore Hub. +You can publish models to MindSpore Hub via PR in [hub](https://gitee.com/mindspore/hub) repo. Here we use GoogleNet as an example to list the steps of model submission to MindSpore Hub. -1. Host your pre-trained model in a storage location where we are able to access. +1. Host your pre-trained model in a storage location where we are able to access. 2. Add a model generation python file called `mindspore_hub_conf.py` in your own repo using this [template](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/googlenet/mindspore_hub_conf.py). 
The location of the `mindspore_hub_conf.py` file is shown below: @@ -47,11 +47,11 @@ You can publish models to MindSpore Hub via PR in [hub](https://gitee.com/mindsp | ├── gpu | ├── 0.7 | ├── ascend - | ├── 0.7 + | ├── 0.7 | ├── googlenet_v1_cifar10.md │   ├── tools | ├── md_validator.py - | └── md_validator.py + | └── md_validator.py ``` Note that it is required to fill in the `{model_name}_{model_version}_{dataset}.md` template by providing `file-format`、`asset-link` and `asset-sha256` below, which refers to the model file format, model storage location from step 1 and model hash value, respectively. @@ -60,7 +60,7 @@ You can publish models to MindSpore Hub via PR in [hub](https://gitee.com/mindsp file-format: ckpt asset-link: https://download.mindspore.cn/model_zoo/official/cv/googlenet/goolenet_ascend_0.2.0_cifar10_official_classification_20200713/googlenet.ckpt asset-sha256: 114e5acc31dad444fa8ed2aafa02ca34734419f602b9299f3b53013dfc71b0f7 - ``` + ``` The MindSpore Hub supports multiple model file formats including: - [MindSpore CKPT](https://www.mindspore.cn/tutorial/training/en/master/use/save_model.html#checkpoint-configuration-policies) @@ -81,6 +81,6 @@ You can publish models to MindSpore Hub via PR in [hub](https://gitee.com/mindsp python md_validator.py ../assets/mindspore/ascend/0.7/googlenet_v1_cifar10.md ``` -5. Create a PR in `mindspore/hub` repo. See our [Contributor Wiki](https://gitee.com/mindspore/mindspore/blob/master/CONTRIBUTING.md#) for more information about creating a PR. +5. Create a PR in `mindspore/hub` repo. See our [Contributor Wiki](https://gitee.com/mindspore/mindspore/blob/master/CONTRIBUTING.md#) for more information about creating a PR. -Once your PR is merged into master branch here, your model will show up in [MindSpore Hub Website](https://www.mindspore.cn/resources/hub) within 24 hours. Please refer to [README](https://gitee.com/mindspore/hub/blob/master/mshub_res/README.md#) for more information about model submission. 
+Once your PR is merged into master branch here, your model will show up in [MindSpore Hub Website](https://www.mindspore.cn/resources/hub) within 24 hours. Please refer to [README](https://gitee.com/mindspore/hub/blob/master/mshub_res/README.md#) for more information about model submission. diff --git a/tutorials/training/source_en/use/save_model.md b/tutorials/training/source_en/use/save_model.md index d21e33e2e0ce98b402c65e0494d44990da5189ba..98b4938598d87bdd39d394827194e953efd13b25 100644 --- a/tutorials/training/source_en/use/save_model.md +++ b/tutorials/training/source_en/use/save_model.md @@ -35,6 +35,7 @@ During model training, use the callback mechanism to transfer the object of the You can use the `CheckpointConfig` object to set the CheckPoint saving policies. The saved parameters are classified into network parameters and optimizer parameters. `ModelCheckpoint` provides default configuration policies for users to quickly get started. The following describes the usage: + ```python from mindspore.train.callback import ModelCheckpoint ckpoint_cb = ModelCheckpoint() @@ -61,7 +62,7 @@ Create a `ModelCheckpoint` object and transfer it to the model.train method. The Generated CheckPoint files are as follows: -``` +```text resnet50-graph.meta # Generate compiled computation graph. resnet50-1_32.ckpt # The file name extension is .ckpt. resnet50-2_32.ckpt # The file name format contains the epoch and step correspond to the saved parameters. @@ -100,6 +101,7 @@ When you have a CheckPoint file, if you want to do inference on device, you need If you want to do inference on the device, then you need to generate corresponding MINDIR models based on the network and CheckPoint. Currently we support the export of MINDIR models for inference based on graph mode, which don't contain control flow. 
Taking the export of MINDIR model as an example to illustrate the implementation of model export, the code is as follows: + ```python from mindspore.train.serialization import export import numpy as np @@ -133,7 +135,7 @@ input = np.random.uniform(0.0, 1.0, size=[32, 3, 224, 224]).astype(np.float32) export(resnet, Tensor(input), file_name='resnet50-2_32.air', file_format='AIR') ``` -Before using the `export` interface, you need to import` mindspore.train.serialization`. +Before using the `export` interface, you need to import`mindspore.train.serialization`. The `input` parameter is used to specify the input shape and the data type of the exported model. diff --git a/tutorials/training/source_zh_cn/advanced_use/apply_deep_probability_programming.md b/tutorials/training/source_zh_cn/advanced_use/apply_deep_probability_programming.md index da0cbc368bf90e5b97ac2f204a6c6a7ee6dee537..9c03363c1ad8e6894b2e83f3745fe5259f9dbbbf 100644 --- a/tutorials/training/source_zh_cn/advanced_use/apply_deep_probability_programming.md +++ b/tutorials/training/source_zh_cn/advanced_use/apply_deep_probability_programming.md @@ -1,4 +1,5 @@ # Deep Probabilistic Programming + `Ascend` `GPU` `Whole Process` `Beginner` `Intermediate` `Expert` @@ -29,16 +30,20 @@ ## Overview + Deep learning models have strong fitting capability, while Bayesian theory offers good interpretability. MindSpore Deep Probabilistic Programming (MDP) combines deep learning with Bayesian learning: by treating network weights as distributions and introducing latent-space distributions, it can sample from those distributions during forward propagation, which introduces uncertainty and thereby improves the robustness and interpretability of the model. MDP not only includes a general-purpose, professional probabilistic programming language suited to "professional" users, but also supports probabilistic programming that follows the same logic as developing deep learning models, so that beginners can get started easily. In addition, it provides a toolbox for deep probabilistic learning that extends Bayesian application functions. This chapter describes in detail how deep probabilistic programming is applied in MindSpore. Before you start, make sure that MindSpore 0.7.0-beta or later has been installed correctly. The chapter covers the following: + 1. Describe how to use the [bnn_layers module](https://gitee.com/mindspore/mindspore/tree/master/mindspore/nn/probability/bnn_layers) to implement a Bayesian neural network (BNN); 2. Describe how to use the [variational module](https://gitee.com/mindspore/mindspore/tree/master/mindspore/nn/probability/infer/variational) and the [dpn module](https://gitee.com/mindspore/mindspore/tree/master/mindspore/nn/probability/dpn) to implement a variational autoencoder (VAE); 3.
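Describe how to use the [transforms module](https://gitee.com/mindspore/mindspore/tree/master/mindspore/nn/probability/transforms) to convert a DNN (Deep Neural Network) into a BNN with one click; 4. Describe how to use the [toolbox module](https://gitee.com/mindspore/mindspore/blob/master/mindspore/nn/probability/toolbox/uncertainty_evaluation.py) to implement uncertainty estimation. ## Using a Bayesian Neural Network + A Bayesian neural network is a basic model composed of a probabilistic model and a neural network; its weights are no longer fixed values but distributions. This example describes how to use the bnn_layers module in MDP to implement a Bayesian neural network and use it for a simple image classification task. The overall process is as follows: + 1. Process the MNIST dataset; 2. Define the Bayesian LeNet network; 3. Define the loss function and optimizer; @@ -47,12 +52,14 @@ > This example targets the GPU or Ascend 910 AI processor platform. You can download the complete sample code here: ### Processing the Dataset + This example uses the MNIST dataset. The data processing is the same as in [Implementing an Image Classification Application](https://www.mindspore.cn/tutorial/training/zh-CN/master/quick_start/quick_start.html) in the tutorial. ### Defining the Bayesian Neural Network + This example uses Bayesian LeNet. A Bayesian neural network is built with the bnn_layers module in the same way as an ordinary neural network. Note that `bnn_layers` and ordinary neural network layers can be combined with each other. -``` +```python import mindspore.nn as nn from mindspore.nn.probability import bnn_layers import mindspore.ops.operations as P @@ -98,7 +105,9 @@ class BNNLeNet5(nn.Cell): x = self.fc3(x) return x ``` + ### Defining the Loss Function and Optimizer + Next, define the loss function (Loss) and the optimizer (Optimizer). The loss function is the training objective of deep learning, also called the objective function; it can be understood as the distance between the output of the neural network (Logits) and the labels (Labels), and is a scalar. Common loss functions include mean squared error, L2 loss, Hinge loss, and cross entropy. Image classification applications usually use cross-entropy loss (CrossEntropy). @@ -107,7 +116,7 @@ class BNNLeNet5(nn.Cell): The sample code for defining the loss function and optimizer in MindSpore is as follows: -``` +```python # loss function definition criterion = SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean") @@ -116,9 +125,10 @@ optimizer = AdamWeightDecay(params=network.trainable_params(), learning_rate=0.0 ``` ### Training the Network + The training process of a Bayesian neural network is basically the same as that of a DNN. The only difference is that `WithLossCell` is replaced with `WithBNNLossCell`, which is suited to BNNs. In addition to the `backbone` and `loss_fn` parameters, `WithBNNLossCell` adds the `dnn_factor` and `bnn_factor` parameters. `dnn_factor` is the coefficient of the overall network loss computed by the loss function, and `bnn_factor` is the coefficient of the KL divergence of each Bayesian layer. These two parameters balance the overall network loss against the KL divergence of the Bayesian layers, preventing an excessively large KL divergence from overwhelming the overall network loss. -``` +```python net_with_loss = bnn_layers.WithBNNLossCell(network, criterion, dnn_factor=60000, bnn_factor=0.000001) train_bnn_network = TrainOneStepCell(net_with_loss, optimizer) train_bnn_network.set_train() @@ -136,9 +146,10 @@ for i in range(epoch):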
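print('Epoch: {} \tTraining Loss: {:.4f} \tTraining Accuracy: {:.4f} \tvalidation Accuracy: {:.4f}'. format(i, train_loss, train_acc, valid_acc)) ``` + The sample code for `train_model` and `validate_model` in MindSpore is as follows: -``` +```python def train_model(train_net, net, dataset): accs = [] loss_sum = 0 @@ -172,17 +183,22 @@ def validate_model(net, dataset): ``` ## Using a Variational Autoencoder + The following describes how to use the variational module and the dpn module in MDP to implement a variational autoencoder. A variational autoencoder is a classic deep probabilistic model that applies variational inference to learn representations of latent variables. With this model, you can not only compress the input data but also generate new images of the same type. The overall process of this example is as follows: + 1. Define the variational autoencoder; 2. Define the loss function and optimizer; 3. Process the data; 4. Train the network; 5. Generate new samples or reconstruct input samples. + > This example targets the GPU or Ascend 910 AI processor platform. You can download the complete sample code here: + ### Defining the Variational Autoencoder + Building a variational autoencoder with the dpn module is particularly simple: you only need to define a custom encoder and decoder (DNN models) and call the `VAE` interface. -``` +```python class Encoder(nn.Cell): def __init__(self): super(Encoder, self).__init__() @@ -218,11 +234,13 @@ encoder = Encoder() decoder = Decoder() vae = VAE(encoder, decoder, hidden_size=400, latent_size=20) ``` + ### Defining the Loss Function and Optimizer + Next, define the loss function (Loss) and the optimizer (Optimizer). The loss function used in this example is `ELBO`, a loss function dedicated to variational inference; the optimizer used is `Adam`. The sample code for defining the loss function and optimizer in MindSpore is as follows: -``` +```python # loss function definition net_loss = ELBO(latent_prior='Normal', output_prior='Normal') @@ -231,24 +249,30 @@ optimizer = nn.Adam(params=vae.trainable_params(), learning_rate=0.001) net_with_loss = nn.WithLossCell(vae, net_loss) ``` + ### Processing the Data + This example uses the MNIST dataset. The data processing is the same as in [Implementing an Image Classification Application](https://www.mindspore.cn/tutorial/training/zh-CN/master/quick_start/quick_start.html) in the tutorial. ### Training the Network + Use the `SVI` interface in the variational module to train the VAE network. -``` +```python from mindspore.nn.probability.infer import SVI vi = SVI(net_with_loss=net_with_loss, optimizer=optimizer) vae = vi.run(train_dataset=ds_train, epochs=10) trained_loss = vi.get_train_loss() ``` + `vi.run` returns the trained network, and `vi.get_train_loss` returns the loss after training. + ### Generating New Samples or Reconstructing Input Samples + With the trained VAE network, we can generate new samples or reconstruct input samples. -``` +```python IMAGE_SHAPE = (-1, 1, 32, 32) generated_sample = vae.generate_sample(64, IMAGE_SHAPE) for sample in ds_train.create_dict_iterator(): @@ -257,16 +281,21 @@ for sample in ds_train.create_dict_iterator(): ``` ##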
Converting a DNN to a BNN with One Click + For DNN researchers who are not familiar with Bayesian models, MDP provides the high-level API `TransformToBNN`, which converts a DNN model into a BNN model with one click. The generality of the API has so far been verified on models such as LeNet, ResNet, MobileNet, and VGG. This example describes how to use the `TransformToBNN` API in the transforms module to convert a DNN into a BNN with one click. The overall process is as follows: + 1. Define the DNN model; 2. Define the loss function and optimizer; 3. Function 1: convert the entire model; 4. Function 2: convert layers of a specified type. + > This example targets the GPU or Ascend 910 AI processor platform. You can download the complete sample code here: + ### Defining the DNN Model + The DNN model used in this example is LeNet. -``` +```python from mindspore.common.initializer import TruncatedNormal import mindspore.nn as nn import mindspore.ops.operations as P @@ -332,9 +361,10 @@ class LeNet5(nn.Cell): x = self.fc3(x) return x ``` + The network structure of LeNet is as follows: -``` +```text LeNet5 (conv1) Conv2dinput_channels=1, output_channels=6, kernel_size=(5, 5),stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, has_bias=False (conv2) Conv2dinput_channels=6, output_channels=16, kernel_size=(5, 5),stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, has_bias=False @@ -347,9 +377,10 @@ LeNet5 ``` ### Defining the Loss Function and Optimizer + Next, define the loss function (Loss) and the optimizer (Optimizer). This example uses cross-entropy loss as the loss function and `Adam` as the optimizer. -``` +```python network = LeNet5() # loss function definition @@ -361,10 +392,12 @@ optimizer = AdamWeightDecay(params=network.trainable_params(), learning_rate=0.0 net_with_loss = WithLossCell(network, criterion) train_network = TrainOneStepCell(net_with_loss, optimizer) ``` + ### Instantiating TransformToBNN + The `__init__` function of `TransformToBNN` is defined as follows: -``` +```python class TransformToBNN: def __init__(self, trainable_dnn, dnn_factor=1, bnn_factor=1): net_with_loss = trainable_dnn.network @@ -375,18 +408,21 @@ class TransformToBNN: self.bnn_factor = bnn_factor self.bnn_loss_file = None ``` + The `trainable_dnn` parameter is a trainable DNN model wrapped by `TrainOneStepCell`; `dnn_factor` and `bnn_factor` are, respectively, the coefficient of the overall network loss computed by the loss function and the coefficient of the KL divergence of each Bayesian layer. The code for instantiating `TransformToBNN` in MindSpore is as follows: -``` +```python from mindspore.nn.probability import transforms bnn_transformer = transforms.TransformToBNN(train_network, 60000, 0.000001) ``` + ### Function 1: Converting the Entire Model + The `transform_to_bnn_model` method converts an entire DNN model into a BNN model. It is defined as follows: -``` +```python def transform_to_bnn_model(self,
get_dense_args=lambda dp: {"in_channels": dp.in_channels, "has_bias": dp.has_bias, "out_channels": dp.out_channels, "activation": dp.activation}, @@ -413,90 +449,94 @@ bnn_transformer = transforms.TransformToBNN(train_network, 60000, 0.000001) Cell, a trainable BNN model wrapped by TrainOneStepCell. """ ``` + The `get_dense_args` parameter specifies which arguments to take from the fully connected layers of the DNN model, and `get_conv_args` specifies which arguments to take from its convolutional layers. The `add_dense_args` and `add_conv_args` parameters specify which new argument values to set for the BNN layers. Note that the arguments in `add_dense_args` must not duplicate those in `get_dense_args`, and the same applies to `add_conv_args` and `get_conv_args`. The code for converting the entire DNN model into a BNN model in MindSpore is as follows: -``` +```python train_bnn_network = bnn_transformer.transform_to_bnn_model() ``` + The structure of the entire model after conversion is as follows: -``` +```text LeNet5 (conv1) ConvReparam in_channels=1, out_channels=6, kernel_size=(5, 5), stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, weight_mean=Parameter (name=conv1.weight_posterior.mean), weight_std=Parameter (name=conv1.weight_posterior.untransformed_std), has_bias=False (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (conv2) ConvReparam in_channels=6, out_channels=16, kernel_size=(5, 5), stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, weight_mean=Parameter (name=conv2.weight_posterior.mean), weight_std=Parameter (name=conv2.weight_posterior.untransformed_std), has_bias=False (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (fc1) DenseReparam in_channels=400, out_channels=120, weight_mean=Parameter (name=fc1.weight_posterior.mean), weight_std=Parameter (name=fc1.weight_posterior.untransformed_std), has_bias=True, bias_mean=Parameter (name=fc1.bias_posterior.mean), bias_std=Parameter (name=fc1.bias_posterior.untransformed_std) (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal)
Normalbatch_shape = None - + (bias_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (bias_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (fc2) DenseReparam in_channels=120, out_channels=84, weight_mean=Parameter (name=fc2.weight_posterior.mean), weight_std=Parameter (name=fc2.weight_posterior.untransformed_std), has_bias=True, bias_mean=Parameter (name=fc2.bias_posterior.mean), bias_std=Parameter (name=fc2.bias_posterior.untransformed_std) (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - + (bias_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (bias_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (fc3) DenseReparam in_channels=84, out_channels=10, weight_mean=Parameter (name=fc3.weight_posterior.mean), weight_std=Parameter (name=fc3.weight_posterior.untransformed_std), has_bias=True, bias_mean=Parameter (name=fc3.bias_posterior.mean), bias_std=Parameter (name=fc3.bias_posterior.untransformed_std) (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - + (bias_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (bias_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (relu) ReLU (max_pool2d) MaxPool2dkernel_size=2, stride=2, pad_mode=VALID (flatten) Flatten ``` + As shown, all convolutional layers and fully connected layers in the LeNet network have been converted into the corresponding Bayesian layers. ### Function 2: Converting Layers of a Specified Type + The `transform_to_bnn_layer` method converts layers of a specified type (nn.Dense or nn.Conv2d) in a DNN model into the corresponding Bayesian layers. It is defined as follows: -``` +```python def transform_to_bnn_layer(self, dnn_layer, bnn_layer, get_args=None, add_args=None): r""" Transform a specific type of layers in DNN model to corresponding BNN layer.
@@ -513,16 +553,18 @@ LeNet5 Cell, a trainable model wrapped by TrainOneStepCell, whose specific type of layer is transformed to the corresponding bayesian layer. """ ``` + The `dnn_layer` parameter specifies which type of DNN layer to convert into a BNN layer, and `bnn_layer` specifies which type of BNN layer the DNN layer is converted into. `get_args` and `add_args` specify, respectively, which arguments to take from the DNN layer and which arguments of the BNN layer to reassign. The code for converting the Dense layers of a DNN model into the corresponding Bayesian layer `DenseReparam` in MindSpore is as follows: -``` +```python train_bnn_network = bnn_transformer.transform_to_bnn_layer(nn.Dense, bnn_layers.DenseReparam) ``` + The network structure after conversion is as follows: -``` +```text LeNet5 (conv1) Conv2dinput_channels=1, output_channels=6, kernel_size=(5, 5),stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, has_bias=False (conv2) Conv2dinput_channels=6, output_channels=16, kernel_size=(5, 5),stride=(1, 1), pad_mode=valid, padding=0, dilation=(1, 1), group=1, has_bias=False @@ -530,55 +572,58 @@ LeNet5 in_channels=400, out_channels=120, weight_mean=Parameter (name=fc1.weight_posterior.mean), weight_std=Parameter (name=fc1.weight_posterior.untransformed_std), has_bias=True, bias_mean=Parameter (name=fc1.bias_posterior.mean), bias_std=Parameter (name=fc1.bias_posterior.untransformed_std) (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - + (bias_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (bias_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (fc2) DenseReparam in_channels=120, out_channels=84, weight_mean=Parameter (name=fc2.weight_posterior.mean), weight_std=Parameter (name=fc2.weight_posterior.untransformed_std), has_bias=True, bias_mean=Parameter (name=fc2.bias_posterior.mean), bias_std=Parameter (name=fc2.bias_posterior.untransformed_std) (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - + (bias_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (bias_posterior)
NormalPosterior (normal) Normalbatch_shape = None - - + + (fc3) DenseReparam in_channels=84, out_channels=10, weight_mean=Parameter (name=fc3.weight_posterior.mean), weight_std=Parameter (name=fc3.weight_posterior.untransformed_std), has_bias=True, bias_mean=Parameter (name=fc3.bias_posterior.mean), bias_std=Parameter (name=fc3.bias_posterior.untransformed_std) (weight_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (weight_posterior) NormalPosterior (normal) Normalbatch_shape = None - + (bias_prior) NormalPrior (normal) Normalmean = 0.0, standard deviation = 0.1 - + (bias_posterior) NormalPosterior (normal) Normalbatch_shape = None - - + + (relu) ReLU (max_pool2d) MaxPool2dkernel_size=2, stride=2, pad_mode=VALID (flatten) Flatten ``` + As shown, the convolutional layers in the LeNet network remain unchanged, and the fully connected layers have become the corresponding Bayesian layer `DenseReparam`. ## Using the Uncertainty Estimation Toolbox + One advantage of Bayesian neural networks is that they can provide uncertainty. MDP offers an upper-layer toolbox for uncertainty estimation, which users can easily use to compute uncertainty. Uncertainty indicates how unsure a deep learning model is about its prediction. At present, most deep learning algorithms can only give a prediction and cannot judge how reliable it is. There are two main types of uncertainty: aleatoric uncertainty and epistemic uncertainty. + - Aleatoric uncertainty: describes the inherent noise in the data, that is, unavoidable error; it cannot be reduced by collecting more samples. - Epistemic uncertainty: the model's own estimate of the input data may be inaccurate because of poor training, insufficient training data, and so on; it can be alleviated by adding more training data, among other means. @@ -587,7 +632,7 @@ LeNet5 Taking a classification task as an example, the model used here is LeNet and the dataset is MNIST; the data processing is the same as in [Implementing an Image Classification Application](https://www.mindspore.cn/tutorial/training/zh-CN/master/quick_start/quick_start.html) in the tutorial. To evaluate the uncertainty of test examples, use the toolbox as follows: -``` +```python from mindspore.nn.probability.toolbox.uncertainty_evaluation import UncertaintyEvaluation from mindspore.train.serialization import load_checkpoint, load_param_into_net @@ -610,4 +655,3 @@ for eval_data in ds_eval.create_dict_iterator(): epistemic_uncertainty = evaluation.eval_epistemic_uncertainty(eval_data) aleatoric_uncertainty = evaluation.eval_aleatoric_uncertainty(eval_data) ``` - diff --git a/tutorials/training/source_zh_cn/advanced_use/apply_gradient_accumulation.md b/tutorials/training/source_zh_cn/advanced_use/apply_gradient_accumulation.md index
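96eccdf5381618c39603cbdb1e265adff5831238..38897c1dd87c1a4b9ab515fbc673a15108fd0324 100644 --- a/tutorials/training/source_zh_cn/advanced_use/apply_gradient_accumulation.md +++ b/tutorials/training/source_zh_cn/advanced_use/apply_gradient_accumulation.md @@ -36,6 +36,7 @@ Using MNIST as the demonstration dataset, a simple custom model is used to implement gradient accumulation. ### Importing Required Libraries + The following are the common modules and the MindSpore modules and libraries that we need. ```python @@ -65,7 +66,9 @@ from model_zoo.official.cv.lenet.src.lenet import LeNet5 LeNet is used as the example here; other networks such as ResNet-50 and BERT can also be used. This code is imported from [lenet.py]() in the `lenet` directory of `model_zoo`. ### Defining the Training Model + The training process is split into three parts: forward and backward training, parameter update, and clearing of accumulated gradients: + - `TrainForwardBackward` computes the loss and gradients, and uses grad_sum to accumulate gradients. - `TrainOptim` updates the parameters. - `TrainClear` resets the gradient accumulation variable grad_sum to zero. @@ -134,6 +137,7 @@ class TrainClear(Cell): ``` ### Defining the Training Process + Each mini-batch computes the loss and gradients through forward and backward training, and mini_steps controls the number of accumulations before each parameter update. Once the accumulation count is reached, the parameters are updated and the accumulated gradient variable is reset to zero. @@ -202,6 +206,7 @@ class GradientAccumulation: ``` ### Training and Saving the Model + Call the network, optimizer, and loss function, and then use the `train_process` interface of the custom `GradientAccumulation` class to train the model. ```python @@ -226,13 +231,15 @@ if __name__ == "__main__": ``` ## Experiment Results + After 10 epochs, the accuracy on the test set is about 96.31%. -**Start training** +**Start training:** + 1. Run the training code and view the results. ```shell - $ python train.py --data_path=./MNIST_Data + python train.py --data_path=./MNIST_Data ``` The output is as follows. The loss value decreases gradually as training proceeds: @@ -247,17 +254,17 @@ if __name__ == "__main__": epoch: 10 step: 448 loss is 0.06443884 epoch: 10 step: 449 loss is 0.0067842817 ``` - + 2.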
View the saved CheckPoint file. During training, the CheckPoint file `gradient_accumulation.ckpt`, that is, the model file, is saved. -**Validate the model** +**Validate the model:** Use [eval.py]() in the `lenet` directory of `model_zoo` to load the validation dataset and validate the model with the saved CheckPoint file. ```shell -$ python eval.py --data_path=./MNIST_Data --ckpt_path=./gradient_accumulation.ckpt --device_target=GPU +python eval.py --data_path=./MNIST_Data --ckpt_path=./gradient_accumulation.ckpt --device_target=GPU ``` The output is as follows. On the validation dataset, the accuracy is about 96.31%, consistent with the validation result when batch_size is 32. diff --git a/tutorials/training/source_zh_cn/advanced_use/apply_host_device_training.md b/tutorials/training/source_zh_cn/advanced_use/apply_host_device_training.md index 6e8b3ebc390223daee20e7496b853b21a27cf2d2..c171c55efb9ab4d016ebbeecb634bc52c4bfafc2 100644 --- a/tutorials/training/source_zh_cn/advanced_use/apply_host_device_training.md +++ b/tutorials/training/source_zh_cn/advanced_use/apply_host_device_training.md @@ -47,19 +47,23 @@ ## Configuring Hybrid Execution 1. Configure the hybrid training flag. In the `src/config.py` file, set the default value of `host_device_mix` in the `argparse_init` function to `1`, and set `self.host_device_mix` in the `__init__` function of the `WideDeepConfig` class to `1`: + ```python self.host_device_mix = 1 ``` 2.
Check where the required operators and optimizers are executed. In the `WideDeepModel` class of `src/wide_and_deep.py`, check that `EmbeddingLookup` is executed on the host: + ```python self.deep_embeddinglookup = nn.EmbeddingLookup() self.wide_embeddinglookup = nn.EmbeddingLookup() ``` + In `class TrainStepWrap(nn.Cell)` of the `src/wide_and_deep.py` file, check the attributes that make the two optimizers execute on the host. + ```python - self.optimizer_w.sparse_opt.add_prim_attr("primitive_target", "CPU") - self.optimizer_d.sparse_opt.add_prim_attr("primitive_target", "CPU") + self.optimizer_w.target = "CPU" + self.optimizer_d.target = "CPU" ``` ## Training the Model @@ -69,7 +73,7 @@ The run logs are saved in the `device_0` directory. `loss.log` stores multiple loss values within an epoch, similar to the following: -``` +```text epoch: 1 step: 1, wide_loss is 0.6873926, deep_loss is 0.8878349 epoch: 1 step: 2, wide_loss is 0.6442529, deep_loss is 0.8342661 epoch: 1 step: 3, wide_loss is 0.6227323, deep_loss is 0.80273706 @@ -84,7 +88,7 @@ epoch: 1 step: 10, wide_loss is 0.566089, deep_loss is 0.6884129 `test_deep0.log` stores the detailed runtime log output by the pytest process (the log level must be set to INFO, and the -p on option must be added when MindSpore is compiled). Search for the keyword `EmbeddingLookup` to find the following information: -``` +```text [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:34.928.275 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/network-VirtualDatasetCellTriple/_backbone-NetWithLossClass/network-WideDeepModel/EmbeddingLookup-op297 costs 3066 us. [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:34.943.896 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/network-VirtualDatasetCellTriple/_backbone-NetWithLossClass/network-WideDeepModel/EmbeddingLookup-op298 costs 15521 us. ``` @@ -92,7 +96,7 @@ epoch: 1 step: 10, wide_loss is 0.566089, deep_loss is 0.6884129 This shows the execution time of `EmbeddingLookup` on the host. Then search `test_deep0.log` for the keywords `FusedSparseFtrl` and `FusedSparseLazyAdam` to find the following information: -``` +```text [INFO] DEVICE(109904,python3.7):2020-06-27-12:42:35.422.963 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/optimizer_w-FTRL/FusedSparseFtrl-op299 costs 54492 us.
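[INFO] DEVICE(109904,python3.7):2020-06-27-12:42:35.565.953 [mindspore/ccsrc/device/cpu/cpu_kernel_runtime.cc:324] Run] cpu kernel: Default/optimizer_d-LazyAdam/FusedSparseLazyAdam-op300 costs 142865 us. ``` diff --git a/tutorials/training/source_zh_cn/advanced_use/apply_parameter_server_training.md b/tutorials/training/source_zh_cn/advanced_use/apply_parameter_server_training.md index 6fab9ad461d8518db4f4f7beda17402b60408996..e586363322df7a9e32aadc504fff8234587bb636 100644 --- a/tutorials/training/source_zh_cn/advanced_use/apply_parameter_server_training.md +++ b/tutorials/training/source_zh_cn/advanced_use/apply_parameter_server_training.md @@ -17,6 +17,7 @@ ## Overview + Parameter Server is an architecture widely used in distributed training. Compared with the synchronous AllReduce training method, Parameter Server offers better flexibility, scalability, and node failover capability. Specifically, the parameter server supports both synchronous SGD and asynchronous SGD training algorithms. In terms of scalability, model computation and model update are deployed in two types of processes, Worker and Server, so that Worker and Server resources can be scaled horizontally and independently. In addition, in large-scale data centers, computing devices, networks, and storage often suffer various failures that make some nodes abnormal; under the parameter server architecture, such failures can be handled fairly easily without affecting the jobs being trained. The parameter server implementation in MindSpore uses the open-source [ps-lite](https://github.com/dmlc/ps-lite) as its basic architecture. Based on the remote communication capability and the abstract Push/Pull primitives it provides, MindSpore implements a distributed training algorithm for synchronous SGD. In addition, combined with the high-performance collective communication libraries on Ascend and GPU (HCCL and NCCL), MindSpore also provides a hybrid training mode of Parameter Server and AllReduce, in which some weights are stored and updated through the parameter server while the remaining weights are still trained with the AllReduce algorithm. @@ -30,6 +31,7 @@ Parameter Server is an architecture widely used in distributed - Scheduler: establishes the communication relationship between Server and Worker. ## Preparations + The following uses training LeNet with Parameter Server on Ascend 910 as an example: ### Preparing the Training Script @@ -51,6 +53,7 @@ Parameter Server is an architecture widely used in distributed - The size of a single weight set to be updated through Parameter Server must not exceed INT_MAX (2^31 - 1) bytes. 3.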
在[原训练脚本](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/lenet/train.py)基础上,设置LeNet模型所有权重通过Parameter Server训练: + ```python context.set_ps_context(enable_ps=True) network = LeNet5(cfg.num_classes) @@ -61,7 +64,7 @@ Parameter Server(参数服务器)是分布式训练中一种广泛使用的架 MindSpore通过读取环境变量,控制Parameter Server训练,环境变量包括以下选项(其中`MS_SCHED_HOST`及`MS_SCHED_PORT`所有脚本需保持一致): -``` +```text export PS_VERBOSE=1 # Print ps-lite log export MS_SERVER_NUM=1 # Server number export MS_WORKER_NUM=1 # Worker number @@ -90,6 +93,7 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre ``` `Server.sh`: + ```bash #!/bin/bash export PS_VERBOSE=1 @@ -102,6 +106,7 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre ``` `Worker.sh`: + ```bash #!/bin/bash export PS_VERBOSE=1 @@ -114,26 +119,31 @@ export MS_ROLE=MS_SCHED # The role of this process: MS_SCHED repre ``` 最后分别执行: + ```bash sh Scheduler.sh > scheduler.log 2>&1 & sh Server.sh > server.log 2>&1 & sh Worker.sh > worker.log 2>&1 & ``` + 启动训练 2. 
查看结果 查看`scheduler.log`中Server与Worker通信日志: - ``` + + ```text Bind to role=scheduler, id=1, ip=XXX.XXX.XXX.XXX, port=XXXX Assign rank=8 to node role=server, ip=XXX.XXX.XXX.XXX, port=XXXX Assign rank=9 to node role=worker, ip=XXX.XXX.XXX.XXX, port=XXXX the scheduler is connected to 1 workers and 1 servers ``` + 说明Server、Worker与Scheduler通信建立成功。 查看`worker.log`中训练结果: - ``` + + ```text epoch: 1 step: 1, loss is 2.302287 epoch: 1 step: 2, loss is 2.304071 epoch: 1 step: 3, loss is 2.308778 diff --git a/tutorials/training/source_zh_cn/advanced_use/apply_quantization_aware_training.md b/tutorials/training/source_zh_cn/advanced_use/apply_quantization_aware_training.md index d9a6c653a7a4c2b292b2be8213658408aae6c4fe..14183e2221312cf8e652117fa5fe2e93f06afc2e 100644 --- a/tutorials/training/source_zh_cn/advanced_use/apply_quantization_aware_training.md +++ b/tutorials/training/source_zh_cn/advanced_use/apply_quantization_aware_training.md @@ -39,6 +39,7 @@ ### 伪量化节点 伪量化节点,是指感知量化训练中插入的节点,用以寻找网络数据分布,并反馈损失精度,具体作用如下: + - 找到网络数据的分布,即找到待量化参数的最大值和最小值; - 模拟量化为低比特时的精度损失,把该损失作用到网络模型中,传递给损失函数,让优化器在训练过程中对该损失值进行优化。 @@ -59,12 +60,12 @@ MindSpore的感知量化训练是在训练基础上,使用低精度数据替 感知量化训练模型与一般训练步骤一致,在定义网络和最后生成模型阶段后,需要进行额外的操作,完整流程如下: -1. 数据处理加载数据集。 -2. 定义原始非量化网络。 -3. 定义融合网络。在完成定义原始非量化网络后,替换指定的算子,完成融合网络的定义。 -4. 定义优化器和损失函数。 -5. 转化量化网络。基于融合网络,使用转化接口在融合网络中插入伪量化节点,生成量化网络。 -6. 进行量化训练。基于量化网络训练,生成量化模型。 +1. 数据处理加载数据集。 +2. 定义原始非量化网络。 +3. 定义融合网络。在完成定义原始非量化网络后,替换指定的算子,完成融合网络的定义。 +4. 定义优化器和损失函数。 +5. 转化量化网络。基于融合网络,使用转化接口在融合网络中插入伪量化节点,生成量化网络。 +6. 
进行量化训练。基于量化网络训练,生成量化模型。 在上面流程中,第3、5、6步是感知量化训练区别普通训练需要额外进行的步骤。 @@ -99,7 +100,7 @@ class LeNet5(nn.Cell): Tensor, output tensor Examples: >>> LeNet(num_class=10, num_channel=1) - + """ def __init__(self, num_class=10, num_channel=1): super(LeNet5, self).__init__() @@ -129,10 +130,10 @@ class LeNet5(nn.Cell): def __init__(self, num_class=10): super(LeNet5, self).__init__() self.num_class = num_class - + self.conv1 = nn.Conv2dBnAct(1, 6, kernel_size=5, activation='relu') self.conv2 = nn.Conv2dBnAct(6, 16, kernel_size=5, activation='relu') - + self.fc1 = nn.DenseBnAct(16 * 5 * 5, 120, activation='relu') self.fc2 = nn.DenseBnAct(120, 84, activation='relu') self.fc3 = nn.DenseBnAct(84, self.num_class) @@ -164,13 +165,13 @@ net = quant.convert_quant_network(network, quant_delay=900, bn_fold=False, per_c 上面介绍了从零开始进行感知量化训练。更常见情况是已有一个模型文件,希望生成量化模型,这时已有正常网络模型训练得到的模型文件及训练脚本,进行感知量化训练。这里使用checkpoint文件重新训练的功能,详细步骤为: - 1. 数据处理加载数据集。 - 2. 定义原始非量化网络。 - 3. 训练原始网络生成非量化模型。 - 4. 定义融合网络。 - 5. 定义优化器和损失函数。 - 6. 基于融合网络转化生成量化网络。 - 7. 加载模型文件重训。加载已有非量化模型文件,基于量化网络重新训练生成量化模型。详细模型重载训练,请参见。 + 1. 数据处理加载数据集。 + 2. 定义原始非量化网络。 + 3. 训练原始网络生成非量化模型。 + 4. 定义融合网络。 + 5. 定义优化器和损失函数。 + 6. 基于融合网络转化生成量化网络。 + 7. 加载模型文件重训。加载已有非量化模型文件,基于量化网络重新训练生成量化模型。详细模型重载训练,请参见。 ### 进行推理 @@ -180,11 +181,11 @@ net = quant.convert_quant_network(network, quant_delay=900, bn_fold=False, per_c - 使用感知量化训练后得到的checkpoint文件进行推理: - 1. 加载量化模型。 - 2. 推理。 + 1. 加载量化模型。 + 2. 推理。 - 转化为ONNX等通用格式进行推理(暂不支持,开发完善后补充)。 - + ## 参考文献 [1] Jacob B, Kligys S, Chen B, et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2704-2713. 
diff --git a/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md b/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md index b025f48e7066029a21b553d09b1a91f663ab9efc..1e60f89a8cf434850c5b5b95327bcc05324cb2f2 100644 --- a/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md +++ b/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md @@ -16,9 +16,10 @@ ## 概述 -用户可以将非标准的数据集和常用的数据集转换为MindSpore数据格式,即MindRecord,从而方便地加载到MindSpore中进行训练。同时,MindSpore在部分场景做了性能优化,使用MindSpore数据格式可以获得更好的性能。 +用户可以将非标准的数据集和常用的数据集转换为MindSpore数据格式,即MindRecord,从而方便地加载到MindSpore中进行训练。同时,MindSpore在部分场景做了性能优化,使用MindSpore数据格式可以获得更好的性能。 MindSpore数据格式具备的特征如下: + 1. 实现多变的用户数据统一存储、访问,训练数据读取更简便; 2. 数据聚合存储,高效读取,且方便管理、移动; 3. 高效数据编解码操作,对用户透明、无感知; @@ -96,7 +97,7 @@ MindSpore数据格式的目标是归一化用户的数据集,并进一步通 5. 创建`FileWriter`对象,传入文件名及分片数量,然后添加Schema文件及索引,调用`write_raw_data`接口写入数据,最后调用`commit`接口生成本地数据文件。 - ```python + ```python writer = FileWriter(file_name="test.mindrecord", shard_num=4) writer.add_schema(cv_schema_json, "test_schema") writer.add_index(indexes) @@ -141,7 +142,7 @@ MindSpore数据格式的目标是归一化用户的数据集,并进一步通 输出结果如下: - ``` + ```text sample: {'data': array([175, 175, 85, 60, 184, 124, 54, 189, 125, 193, 153, 91, 234, 106, 43, 143, 132, 211, 204, 160, 44, 105, 187, 185, 45, 205, 122, 236, 112, 123, 84, 177, 219], dtype=uint8), 'file_name': array(b'3.jpg', dtype='|S5'), 'label': array(99, dtype=int32)} diff --git a/tutorials/training/source_zh_cn/advanced_use/custom_debugging_info.md b/tutorials/training/source_zh_cn/advanced_use/custom_debugging_info.md index 0d933a30b9f5d3699e9e7f0ce515c232a0901a98..9a9584f1767e620302058e3743eb3cb1febdc27c 100644 --- a/tutorials/training/source_zh_cn/advanced_use/custom_debugging_info.md +++ b/tutorials/training/source_zh_cn/advanced_use/custom_debugging_info.md @@ -41,7 +41,7 @@ MindSpore提供`Callback`能力,支持用户在训练/推理的特定阶段, 使用方法:在`model.train`方法中传入`Callback`对象,它可以是一个`Callback`列表,例: ```python -ckpt_cb = ModelCheckpoint() +ckpt_cb = ModelCheckpoint() 
loss_cb = LossMonitor() summary_cb = SummaryCollector(summary_dir='./summary_dir') model.train(epoch, dataset, callbacks=[ckpt_cb, loss_cb, summary_cb]) @@ -60,7 +60,7 @@ model.train(epoch, dataset, callbacks=[ckpt_cb, loss_cb, summary_cb]) ```python class Callback(): - """Callback base class""" + """Callback base class""" def begin(self, run_context): """Called once before the network executing.""" pass @@ -70,11 +70,11 @@ class Callback(): pass def epoch_end(self, run_context): - """Called after each epoch finished.""" + """Called after each epoch finished.""" pass def step_begin(self, run_context): - """Called before each epoch beginning.""" + """Called before each epoch beginning.""" pass def step_end(self, run_context): @@ -131,7 +131,7 @@ class Callback(): 输出: - ``` + ```text epoch: 20 step: 32 loss: 2.298344373703003 ``` @@ -172,7 +172,6 @@ class Callback(): 具体实现逻辑为:定义一个`Callback`对象,初始化对象接收`model`对象和`ds_eval`(验证数据集)。在`step_end`阶段验证模型的精度,当精度为当前最高时,手动触发保存checkpoint方法,保存当前的参数。 - ## MindSpore metrics功能介绍 当训练结束后,可以使用metrics评估训练结果的好坏。 @@ -224,13 +223,17 @@ print('Accuracy is ', accuracy) ``` 输出: -``` + +```text Accuracy is 0.6667 ``` + ## Print算子功能介绍 + MindSpore的自研`Print`算子可以将用户输入的Tensor或字符串信息打印出来,支持多字符串输入,多Tensor输入和字符串与Tensor的混合输入,输入参数以逗号隔开。 `Print`算子使用方法与其他算子相同,在网络中的`__init__`声明算子并在`construct`进行调用,具体使用实例及输出结果如下: + ```python import numpy as np from mindspore import Tensor @@ -254,8 +257,10 @@ y = Tensor(np.ones([2, 2]).astype(np.int32)) net = PrintDemo() output = net(x, y) ``` + 输出: -``` + +```text print Tensor x and Tensor y: Tensor shape:[[const vector][2, 1]]Int32 val:[[1] @@ -315,7 +320,7 @@ val:[[1 1] 3. 执行用例Dump数据。 可以在训练脚本中设置`context.set_context(reserve_class_name_in_scope=False)`,避免Dump文件名称过长导致Dump数据文件生成失败。 4. 
解析Dump数据。 - + 通过`numpy.fromfile`读取Dump数据文件即可解析。 ### 异步Dump功能介绍 @@ -372,6 +377,7 @@ val:[[1 1] ``` ## 日志相关的环境变量和配置 + MindSpore采用glog来输出日志,常用的几个环境变量如下: - `GLOG_v` @@ -379,13 +385,13 @@ MindSpore采用glog来输出日志,常用的几个环境变量如下: 该环境变量控制日志的级别。 该环境变量默认值为2,即WARNING级别,对应关系如下:0-DEBUG、1-INFO、2-WARNING、3-ERROR。 -- `GLOG_logtostderr` +- `GLOG_logtostderr` 该环境变量控制日志的输出方式。 该环境变量的值设置为1时,日志输出到屏幕;值设置为0时,日志输出到文件。默认值为1。 -- `GLOG_log_dir` - +- `GLOG_log_dir` + 该环境变量指定日志输出的路径。 若`GLOG_logtostderr`的值为0,则必须设置此变量。 若指定了`GLOG_log_dir`且`GLOG_logtostderr`的值为1时,则日志输出到屏幕,不输出到文件。 @@ -428,6 +434,3 @@ MindSpore子模块按照目录划分如下: | mindspore/core/ | CORE | > glog不支持日志文件的绕接,如果需要控制日志文件对磁盘空间的占用,可选用操作系统提供的日志文件管理工具,例如:Linux的logrotate。 - - - diff --git a/tutorials/training/source_zh_cn/advanced_use/custom_operator_ascend.md b/tutorials/training/source_zh_cn/advanced_use/custom_operator_ascend.md index ed1a76fb97058cb30837566de9128166f1405011..6bd704a066c9206985ab3888079438e0bbc7ea67 100644 --- a/tutorials/training/source_zh_cn/advanced_use/custom_operator_ascend.md +++ b/tutorials/training/source_zh_cn/advanced_use/custom_operator_ascend.md @@ -25,6 +25,7 @@ 添加一个自定义算子,需要完成算子原语注册、算子实现、算子信息注册三部分工作。 其中: + - 算子原语:定义了算子在网络中的前端接口原型,也是组成网络模型的基础单元,主要包括算子的名称、属性(可选)、输入输出名称、输出shape推理方法、输出dtype推理方法等信息。 - 算子实现:通过TBE(Tensor Boost Engine)提供的特性语言接口,描述算子内部计算逻辑的实现。TBE提供了开发昇腾AI芯片自定义算子的能力。你可以在页面申请公测。 - 算子信息:描述TBE算子的基本信息,如算子名称、支持的输入输出类型等。它是后端做算子选择和映射时的依据。 @@ -38,6 +39,7 @@ 每个算子的原语是一个继承于`PrimitiveWithInfer`的子类,其类型名称即是算子名称。 自定义算子原语与内置算子原语的接口定义完全一致: + - 属性由构造函数`__init__`的入参定义。本用例的算子没有属性,因此`__init__`没有额外的入参。带属性的用例可参考MindSpore源码中的[custom add3](https://gitee.com/mindspore/mindspore/blob/master/tests/st/ops/custom_ops_tbe/cus_add3.py)用例。 - 输入输出的名称通过`init_prim_io_names`函数定义。 - 输出Tensor的shape推理方法在`infer_shape`函数中定义,输出Tensor的dtype推理方法在`infer_dtype`函数中定义。 @@ -75,10 +77,12 @@ class CusSquare(PrimitiveWithInfer): 算子的计算函数主要用来封装算子的计算逻辑供主函数调用,其内部通过调用TBE的API接口组合实现算子的计算逻辑。 算子的入口函数描述了编译算子的内部过程,一般分为如下几步: + 1. 
准备输入的placeholder,placeholder是一个占位符,返回一个Tensor对象,表示一组输入数据。 2. 调用计算函数,计算函数使用TBE提供的API接口描述了算子内部的计算逻辑。 3. 调用Schedule调度模块,调度模块对算子中的数据按照调度模块的调度描述进行切分,同时指定好数据的搬运流程,确保在硬件上的执行达到最优。默认可以采用自动调度模块(`auto_schedule`)。 4. 调用`cce_build_code`编译生成算子二进制。 + > 入口函数的输入参数有特殊要求,需要依次为:算子每个输入的信息、算子每个输出的信息、算子属性(可选)和`kernel_name`(生成算子二进制的名称)。输入和输出的信息用字典封装传入,其中包含该算子在网络中被调用时传入的实际输入和输出的shape和dtype。 更多关于使用TBE开发算子的内容请参考[TBE文档](https://support.huaweicloud.com/odevg-A800_3000_3010/atlaste_10_0063.html),关于TBE算子的调试和性能优化请参考[MindStudio文档](https://support.huaweicloud.com/usermanual-mindstudioc73/atlasmindstudio_02_0043.html)。 @@ -92,7 +96,7 @@ class CusSquare(PrimitiveWithInfer): ### 示例 -下面以`Square`算子的TBE实现`square_impl.py`为例进行介绍。`square_compute`是算子实现的计算函数,通过调用`te.lang.cce`提供的API描述了`x * x`的计算逻辑。`cus_square_op_info `是算子信息,通过`TBERegOp`来定义。 +下面以`Square`算子的TBE实现`square_impl.py`为例进行介绍。`square_compute`是算子实现的计算函数,通过调用`te.lang.cce`提供的API描述了`x * x`的计算逻辑。`cus_square_op_info`是算子信息,通过`TBERegOp`来定义。 `TBERegOp`的设置需要注意以下几点: @@ -128,7 +132,7 @@ cus_square_op_info = TBERegOp("CusSquare") \ .output(0, "y", False, "required", "all") \ .dtype_format(DataType.F32_Default, DataType.F32_Default) \ .dtype_format(DataType.F16_Default, DataType.F16_Default) \ - .get_op_info() + .get_op_info() # Binding kernel info with the kernel implementation. @op_info_register(cus_square_op_info) @@ -185,17 +189,20 @@ def test_net(): ``` 执行用例: -``` + +```bash pytest -s tests/st/ops/custom_ops_tbe/test_square.py::test_net ``` 执行结果: -``` + +```text x: [1. 4. 9.] output: [1. 16. 81.] ``` ## 定义算子反向传播函数 + 如果算子要支持自动微分,需要在其原语中定义其反向传播函数(bprop)。你需要在bprop中描述利用正向输入、正向输出和输出梯度得到输入梯度的反向计算逻辑。反向计算逻辑可以使用内置算子或自定义反向算子构成。 定义算子反向传播函数时需注意以下几点: @@ -204,6 +211,7 @@ output: [1. 16. 81.] 
- bprop函数的返回值形式约定为输入梯度组成的元组,元组中元素的顺序与正向输入参数顺序一致。即使只有一个输入梯度,返回值也要求是元组的形式。 例如,增加bprop后的`CusSquare`原语为: + ```python class CusSquare(PrimitiveWithInfer): @prim_attr_register @@ -228,6 +236,7 @@ class CusSquare(PrimitiveWithInfer): ``` 在`test_square.py`文件中定义反向用例。 + ```python from mindspore.ops import composite as C def test_grad_net(): @@ -241,12 +250,14 @@ def test_grad_net(): ``` 执行用例: -``` + +```bash pytest -s tests/st/ops/custom_ops_tbe/test_square.py::test_grad_net ``` 执行结果: -``` + +```text x: [1. 4. 9.] dx: [2. 8. 18.] ``` diff --git a/tutorials/training/source_zh_cn/advanced_use/cv_mobilenetv2_fine_tune.md b/tutorials/training/source_zh_cn/advanced_use/cv_mobilenetv2_fine_tune.md index e206515f68a906e939db574b88f0748ce9c3cf05..1743089f80e081542ecf6a5061668d84fbeb0e52 100644 --- a/tutorials/training/source_zh_cn/advanced_use/cv_mobilenetv2_fine_tune.md +++ b/tutorials/training/source_zh_cn/advanced_use/cv_mobilenetv2_fine_tune.md @@ -293,26 +293,26 @@ Windows系统输出信息到交互式命令行,Linux系统环境下运行`run_ - 开始增量训练 - - 使用样例1:通过Python文件调用1个GPU处理器。 + - 使用样例1:通过Python文件调用1个GPU处理器。 - ```bash - # Windows or Linux with Python - python train.py --platform GPU --dataset_path [TRAIN_DATASET_PATH] --pretrain_ckpt ./pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt --freeze_layer backbone - ``` + ```bash + # Windows or Linux with Python + python train.py --platform GPU --dataset_path [TRAIN_DATASET_PATH] --pretrain_ckpt ./pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt --freeze_layer backbone + ``` - - 使用样例2:通过Shell脚本调用1个GPU处理器,设备ID为`“0”`。 + - 使用样例2:通过Shell脚本调用1个GPU处理器,设备ID为`“0”`。 - ```bash - # Linux with Shell - sh run_train.sh GPU 1 0 [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt backbone - ``` + ```bash + # Linux with Shell + sh run_train.sh GPU 1 0 [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt backbone + ``` - - 使用样例3:通过Shell脚本调用8个GPU处理器,设备ID为`“0,1,2,3,4,5,6,7”`。 + - 使用样例3:通过Shell脚本调用8个GPU处理器,设备ID为`“0,1,2,3,4,5,6,7”`。 - ```bash - # Linux with Shell - 
sh run_train.sh GPU 8 0,1,2,3,4,5,6,7 [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt backbone - ``` + ```bash + # Linux with Shell + sh run_train.sh GPU 8 0,1,2,3,4,5,6,7 [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt backbone + ``` ### Ascend加载训练 @@ -322,68 +322,68 @@ Windows系统输出信息到交互式命令行,Linux系统环境下运行`run_ - 开始增量训练 - - 使用样例1:通过Python文件调用1个Ascend处理器。 + - 使用样例1:通过Python文件调用1个Ascend处理器。 - ```bash - # Windows or Linux with Python - python train.py --platform Ascend --dataset_path [TRAIN_DATASET_PATH] --pretrain_ckpt ./pretrain_checkpoint mobilenetv2_ascend.ckpt --freeze_layer backbone - ``` + ```bash + # Windows or Linux with Python + python train.py --platform Ascend --dataset_path [TRAIN_DATASET_PATH] --pretrain_ckpt ./pretrain_checkpoint mobilenetv2_ascend.ckpt --freeze_layer backbone + ``` - - 使用样例2:通过Shell脚本调用1个Ascend AI处理器,设备ID为“0”。 + - 使用样例2:通过Shell脚本调用1个Ascend AI处理器,设备ID为“0”。 - ```bash - # Linux with Shell - sh run_train.sh Ascend 1 0 ~/rank_table.json [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_ascend.ckpt backbone - ``` + ```bash + # Linux with Shell + sh run_train.sh Ascend 1 0 ~/rank_table.json [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_ascend.ckpt backbone + ``` - - 使用样例3:通过Shell脚本调用8个Ascend AI处理器,设备ID为”0,1,2,3,4,5,6,7“。 + - 使用样例3:通过Shell脚本调用8个Ascend AI处理器,设备ID为”0,1,2,3,4,5,6,7“。 - ```bash - # Linux with Shell - sh run_train.sh Ascend 8 0,1,2,3,4,5,6,7 ~/rank_table.json [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_ascend.ckpt backbone - ``` + ```bash + # Linux with Shell + sh run_train.sh Ascend 8 0,1,2,3,4,5,6,7 ~/rank_table.json [TRAIN_DATASET_PATH] ../pretrain_checkpoint/mobilenetv2_ascend.ckpt backbone + ``` ### 微调训练结果 - 查看运行结果。 - - 运行Python文件时在交互式命令行中查看打印信息,`Linux`上运行Shell脚本运行后使用`cat ./train/rank0/log0.log`中查看打印信息,输出结果如下: + - 运行Python文件时在交互式命令行中查看打印信息,`Linux`上运行Shell脚本运行后使用`cat ./train/rank0/log0.log`中查看打印信息,输出结果如下: - ```bash - train args: 
Namespace(dataset_path='./dataset/train', platform='CPU', \ - pretrain_ckpt='./pretrain_checkpoint/mobilenetv2_cpu_gpu.ckpt', freeze_layer='backbone') - cfg: {'num_classes': 26, 'image_height': 224, 'image_width': 224, 'batch_size': 150, \ - 'epoch_size': 200, 'warmup_epochs': 0, 'lr_max': 0.03, 'lr_end': 0.03, 'momentum': 0.9, \ - 'weight_decay': 4e-05, 'label_smooth': 0.1, 'loss_scale': 1024, 'save_checkpoint': True, \ - 'save_checkpoint_epochs': 1, 'keep_checkpoint_max': 20, 'save_checkpoint_path': './', \ - 'platform': 'CPU'} - Processing batch: 16: 100%|███████████████████████████████████████████ █████████████████████| 16/16 [00:00 本例面向Ascend 910 AI处理器硬件平台,你可以在这里下载完整的样例代码: 下面对任务流程中各个环节及代码关键片段进行解释说明。 - ## 下载CIFAR-10数据集 + 先从[CIFAR-10数据集官网](https://www.cs.toronto.edu/~kriz/cifar.html)上下载CIFAR-10数据集。本例中采用binary格式的数据,Linux环境可以通过下面的命令下载: ```shell @@ -81,7 +80,6 @@ wget https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz tar -zvxf cifar-10-binary.tar.gz ``` - ## 数据预加载和预处理 1. 加载数据集 @@ -89,10 +87,8 @@ tar -zvxf cifar-10-binary.tar.gz 数据加载可以通过内置数据集格式`Cifar10Dataset`接口完成。 > `Cifar10Dataset`,读取类型为随机读取,内置CIFAR-10数据集,包含图像和标签,图像格式默认为uint8,标签数据格式默认为uint32。更多说明请查看API中`Cifar10Dataset`接口说明。 - 数据加载代码如下,其中`data_home`为数据存储位置: - ```python cifar_ds = ds.Cifar10Dataset(data_home) ``` @@ -141,7 +137,6 @@ tar -zvxf cifar-10-binary.tar.gz cifar_ds = cifar_ds.repeat(repeat_num) ``` - ## 定义卷积神经网络 卷积神经网络已经是图像分类任务的标准算法了。卷积神经网络采用分层的结构对图片进行特征提取,由一系列的网络层堆叠而成,比如卷积层、池化层、激活层等等。 @@ -156,7 +151,6 @@ network = resnet50(class_num=10) 更多ResNet的介绍请参考:[ResNet论文](https://arxiv.org/abs/1512.03385) - ## 定义损失函数和优化器 接下来需要定义损失函数(Loss)和优化器(Optimizer)。损失函数是深度学习的训练目标,也叫目标函数,可以理解为神经网络的输出(Logits)和标签(Labels)之间的距离,是一个标量数据。 @@ -175,7 +169,6 @@ ls = SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean") opt = Momentum(filter(lambda x: x.requires_grad, net.get_parameters()), 0.01, 0.9) ``` - ## 调用`Model`高阶API进行训练和保存模型文件 
完成数据预处理、网络定义、损失函数和优化器定义之后,就可以进行模型训练了。模型训练包含两层迭代,数据集的多轮迭代(`epoch`)和一轮数据集内按分组(`batch`)大小进行的单步迭代。其中,单步迭代指的是按分组从数据集中抽取数据,输入到网络中计算得到损失函数,然后通过优化器计算和更新训练参数的梯度。 @@ -214,4 +207,4 @@ print("result: ", res) ## 参考文献 -[1] https://www.cs.toronto.edu/~kriz/cifar.html +[1] diff --git a/tutorials/training/source_zh_cn/advanced_use/cv_resnet50_second_order_optimizer.md b/tutorials/training/source_zh_cn/advanced_use/cv_resnet50_second_order_optimizer.md index 855cb086c5438b1700d3d4ac40d917310bbe8ebc..3c0c523802b016c1035d638c1b06cfdf1b441b4f 100644 --- a/tutorials/training/source_zh_cn/advanced_use/cv_resnet50_second_order_optimizer.md +++ b/tutorials/training/source_zh_cn/advanced_use/cv_resnet50_second_order_optimizer.md @@ -37,7 +37,6 @@ MindSpore开发团队在现有的自然梯度算法的基础上,对FIM矩阵采用近似、切分等优化加速手段,极大的降低了逆矩阵的计算复杂度,开发出了可用的二阶优化器THOR。使用8块Ascend 910 AI处理器,THOR可以在72min内完成ResNet50-v1.5网络和ImageNet数据集的训练,相比于SGD+Momentum速度提升了近一倍。 - 本篇教程将主要介绍如何在Ascend 910 以及GPU上,使用MindSpore提供的二阶优化器THOR训练ResNet50-v1.5网络和ImageNet数据集。 > 你可以在这里下载完整的示例代码: 。 @@ -47,12 +46,12 @@ MindSpore开发团队在现有的自然梯度算法的基础上,对FIM矩阵 ```shell ├── resnet_thor ├── README.md - ├── scripts + ├── scripts ├── run_distribute_train.sh # launch distributed training for Ascend 910 └── run_eval.sh # launch inference for Ascend 910 ├── run_distribute_train_gpu.sh # launch distributed training for GPU └── run_eval_gpu.sh # launch inference for GPU - ├── src + ├── src ├── crossentropy.py # CrossEntropy loss function ├── config.py # parameter configuration ├── dataset_helper.py # dataset helper for minddata dataset @@ -61,20 +60,20 @@ MindSpore开发团队在现有的自然梯度算法的基础上,对FIM矩阵 ├── resnet_thor.py # resnet50_thor backone ├── thor.py # thor optimizer ├── thor_layer.py # thor layer - └── dataset.py # data preprocessing + └── dataset.py # data preprocessing ├── eval.py # infer script └── train.py # train script - + ``` 整体执行流程如下: + 1. 准备ImageNet数据集,处理需要的数据集; 2. 定义ResNet50网络; 3. 定义损失函数和THOR优化器; 4. 加载数据集并进行训练,训练完成后,查看结果及保存模型文件; 5. 
加载保存的模型,进行推理。 - ## 准备环节 实践前,确保已经正确安装MindSpore。如果没有,可以通过[MindSpore安装页面](https://www.mindspore.cn/install)安装MindSpore。 @@ -85,7 +84,7 @@ MindSpore开发团队在现有的自然梯度算法的基础上,对FIM矩阵 目录结构如下: -``` +```text └─ImageNet2012 ├─ilsvrc │ n03676483 @@ -99,17 +98,21 @@ MindSpore开发团队在现有的自然梯度算法的基础上,对FIM矩阵 │ ...... ``` + ### 配置分布式环境变量 + #### Ascend 910 + Ascend 910 AI处理器的分布式环境变量配置参考[分布式并行训练 (Ascend)](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_ascend.html#id4)。 #### GPU -GPU的分布式环境配置参考[分布式并行训练 (GPU)](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_gpu.html#id4)。 +GPU的分布式环境配置参考[分布式并行训练 (GPU)](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_gpu.html#id4)。 ## 加载处理数据集 分布式训练时,通过并行的方式加载数据集,同时通过MindSpore提供的数据增强接口对数据集进行处理。加载处理数据集的脚本在源码的`src/dataset.py`脚本中。 + ```python import os import mindspore.common.dtype as mstype @@ -165,17 +168,18 @@ def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32, target=" > MindSpore支持进行多种数据处理和增强的操作,各种操作往往组合使用,具体可以参考[数据处理](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/pipeline.html)和[数据增强](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/augmentation.html)章节。 - ## 定义网络 + 本示例中使用的网络模型为ResNet50-v1.5,先定义[ResNet50网络](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/resnet/src/resnet.py),然后使用二阶优化器自定义的算子替换`Conv2d`和 和`Dense`算子。定义好的网络模型在在源码`src/resnet_thor.py`脚本中,自定义的算子`Conv2d_thor`和`Dense_thor`在`src/thor_layer.py`脚本中。 -- 使用`Conv2d_thor`替换原网络模型中的`Conv2d` -- 使用`Dense_thor`替换原网络模型中的`Dense` +- 使用`Conv2d_thor`替换原网络模型中的`Conv2d` +- 使用`Dense_thor`替换原网络模型中的`Dense` > 使用THOR自定义的算子`Conv2d_thor`和`Dense_thor`是为了保存模型训练中的二阶矩阵信息,新定义的网络与原网络模型的backbone一致。 网络构建完成以后,在`__main__`函数中调用定义好的ResNet50: + ```python ... from src.resnet_thor import resnet50 @@ -188,15 +192,14 @@ if __name__ == "__main__": ... 
``` - ## 定义损失函数及THOR优化器 - ### 定义损失函数 MindSpore支持的损失函数有`SoftmaxCrossEntropyWithLogits`、`L1Loss`、`MSELoss`等。THOR优化器需要使用`SoftmaxCrossEntropyWithLogits`损失函数。 损失函数的实现步骤在`src/crossentropy.py`脚本中。这里使用了深度网络模型训练中的一个常用trick:label smoothing,通过对真实标签做平滑处理,提高模型对分类错误标签的容忍度,从而可以增加模型的泛化能力。 + ```python class CrossEntropy(_Loss): """CrossEntropy""" @@ -214,6 +217,7 @@ class CrossEntropy(_Loss): loss = self.mean(loss, 0) return loss ``` + 在`__main__`函数中调用定义好的损失函数: ```python @@ -236,6 +240,7 @@ THOR优化器的参数更新公式如下: $$ \theta^{t+1} = \theta^t + \alpha F^{-1}\nabla E$$ 参数更新公式中各参数的含义如下: + - $\theta$:网络中的可训参数; - $t$:迭代次数; - $\alpha$:学习率值,参数的更新步长; @@ -296,7 +301,6 @@ if __name__ == "__main__": 通过MindSpore提供的`model.train`接口可以方便地进行网络的训练。THOR优化器通过降低二阶矩阵更新频率,来减少计算量,提升计算速度,故重新定义一个Model_Thor类,继承MindSpore提供的Model类。在Model_Thor类中增加二阶矩阵更新频率控制参数,用户可以通过调整该参数,优化整体的性能。 - ```python ... from mindspore.train.loss_scale_manager import FixedLossScaleManager @@ -316,15 +320,21 @@ if __name__ == "__main__": ``` ### 运行脚本 + 训练脚本定义完成之后,调`scripts`目录下的shell脚本,启动分布式训练进程。 + #### Ascend 910 + 目前MindSpore分布式在Ascend上执行采用单卡单进程运行方式,即每张卡上运行1个进程,进程数量与使用的卡的数量一致。其中,0卡在前台执行,其他卡放在后台执行。每个进程创建1个目录,目录名称为`train_parallel`+ `device_id`,用来保存日志信息,算子编译信息以及训练的checkpoint文件。下面以使用8张卡的分布式训练脚本为例,演示如何运行脚本: 使用以下命令运行脚本: -``` + +```bash sh run_distribute_train.sh [RANK_TABLE_FILE] [DATASET_PATH] [DEVICE_NUM] ``` + 脚本需要传入变量`RANK_TABLE_FILE`、`DATASET_PATH`和`DEVICE_NUM`,其中: + - `RANK_TABLE_FILE`:组网信息文件的路径。 - `DATASET_PATH`:训练数据集路径。 - `DEVICE_NUM`:实际的运行卡数。 @@ -361,17 +371,22 @@ epoch: 42 step: 5004, loss is 1.6453942 `*.ckpt`:指保存的模型参数文件。checkpoint文件名称具体含义:*网络名称*-*epoch数*_*step数*.ckpt。 #### GPU + 在GPU硬件平台上,MindSpore采用OpenMPI的`mpirun`进行分布式训练,进程创建1个目录,目录名称为`train_parallel`,用来保存日志信息和训练的checkpoint文件。下面以使用8张卡的分布式训练脚本为例,演示如何运行脚本: -``` + +```bash sh run_distribute_train_gpu.sh [DATASET_PATH] [DEVICE_NUM] ``` + 脚本需要传入变量`DATASET_PATH`和`DEVICE_NUM`,其中: + - `DATASET_PATH`:训练数据集路径。 - `DEVICE_NUM`:实际的运行卡数。 
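
The checkpoint naming convention described above (network name, epoch count and step count, as in `resnet-42_5004.ckpt`) can be parsed with a few lines of plain Python. This is a minimal sketch, not part of the tutorial scripts: the helper names `parse_checkpoint_name` and `latest_checkpoint` are hypothetical, and it assumes every checkpoint follows the `<network>-<epoch>_<step>.ckpt` pattern exactly.

```python
import re
from typing import Iterable, NamedTuple, Optional


class CheckpointInfo(NamedTuple):
    """Fields encoded in a checkpoint file name."""
    network: str
    epoch: int
    step: int


# Assumed pattern: "<network>-<epoch>_<step>.ckpt", e.g. "resnet-42_5004.ckpt".
_CKPT_PATTERN = re.compile(r"^(?P<network>.+)-(?P<epoch>\d+)_(?P<step>\d+)\.ckpt$")


def parse_checkpoint_name(file_name: str) -> CheckpointInfo:
    """Split a checkpoint file name into network name, epoch and step."""
    match = _CKPT_PATTERN.match(file_name)
    if match is None:
        raise ValueError(f"not a recognised checkpoint name: {file_name!r}")
    return CheckpointInfo(match["network"], int(match["epoch"]), int(match["step"]))


def latest_checkpoint(file_names: Iterable[str]) -> Optional[str]:
    """Return the name with the highest (epoch, step); skip names that do not match."""
    candidates = []
    for name in file_names:
        try:
            info = parse_checkpoint_name(name)
        except ValueError:
            continue  # ignore logs and other files in the same directory
        candidates.append(((info.epoch, info.step), name))
    if not candidates:
        return None
    return max(candidates)[1]
```

A helper like this could be used to pick the `[CHECKPOINT_PATH]` argument for the evaluation scripts, for example by passing it the file names listed from a training output directory. The epoch and step are compared as integers, so `resnet-36_5004.ckpt` is ranked above `resnet-2_5004.ckpt`, which a plain string sort would not guarantee.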
在GPU训练时,无需设置`DEVICE_ID`环境变量,因此在主训练脚本中不需要调用`int(os.getenv('DEVICE_ID'))`来获取卡的物理序号,同时`context`中也无需传入`device_id`。我们需要将device_target设置为GPU,并需要调用`init()`来使能NCCL。 训练过程中loss打印示例如下: + ```bash ... epoch: 1 step: 5004, loss is 4.2546034 @@ -391,7 +406,7 @@ epoch: 36 step: 5004, loss is 1.645802 ├─ckpt_0 ├─resnet-1_5004.ckpt ├─resnet-2_5004.ckpt - │ ...... + │ ...... ├─resnet-36_5004.ckpt │ ...... ...... @@ -436,40 +451,53 @@ if __name__ == "__main__": # define model model = Model(net, loss_fn=loss, metrics={'top_1_accuracy', 'top_5_accuracy'}) - + # eval model res = model.eval(dataset) print("result:", res, "ckpt=", args_opt.checkpoint_path) ``` ### 执行推理 + 推理网络定义完成之后,调用`scripts`目录下的shell脚本,进行推理。 + #### Ascend 910 + 在Ascend 910硬件平台上,推理的执行命令如下: -``` + +```bash sh run_eval.sh [DATASET_PATH] [CHECKPOINT_PATH] ``` + 脚本需要传入变量`DATASET_PATH`和`CHECKPOINT_PATH`,其中: + - `DATASET_PATH`:推理数据集路径。 - `CHECKPOINT_PATH`:保存的checkpoint路径。 目前推理使用的是单卡(默认device 0)进行推理,推理的结果如下: -``` + +```text result: {'top_5_accuracy': 0.9295574583866837, 'top_1_accuracy': 0.761443661971831} ckpt=train_parallel0/resnet-42_5004.ckpt ``` + - `top_5_accuracy`:对于一个输入图片,如果预测概率排名前五的标签中包含真实标签,即认为分类正确; - `top_1_accuracy`:对于一个输入图片,如果预测概率最大的标签与真实标签相同,即认为分类正确。 + #### GPU 在GPU硬件平台上,推理的执行命令如下: -``` + +```bash sh run_eval_gpu.sh [DATASET_PATH] [CHECKPOINT_PATH] ``` + 脚本需要传入变量`DATASET_PATH`和`CHECKPOINT_PATH`,其中: + - `DATASET_PATH`:推理数据集路径。 - `CHECKPOINT_PATH`:保存的checkpoint路径。 推理的结果如下: -``` + +```text result: {'top_5_accuracy': 0.9287972151088348, 'top_1_accuracy': 0.7597031049935979} ckpt=train_parallel/resnet-36_5004.ckpt ``` diff --git a/tutorials/training/source_zh_cn/advanced_use/dashboard.md b/tutorials/training/source_zh_cn/advanced_use/dashboard.md index bd5f41918a33ddf1c34d048b91e949d428cbc2d0..1779b5a0161fd0cdfa9de16155acff8d83357587 100644 --- a/tutorials/training/source_zh_cn/advanced_use/dashboard.md +++ b/tutorials/training/source_zh_cn/advanced_use/dashboard.md @@ -17,7 +17,7 @@    - + ## 概述 @@ -196,4 +196,4 @@ 
备注:估算`TensorSummary`空间使用量的方法如下: 一个`TensorSummary数据的大小 = Tensor中的数值个数 * 4 bytes`。假设使用`TensorSummary`记录的Tensor大小为`32 * 1 * 256 * 256`,则一个`TensorSummary`数据大约需要`32 * 1 * 256 * 256 * 4 bytes = 8,388,608 bytes = 8MiB`。`TensorSummary`默认会记录20个步骤的数据,则记录这20组数据需要的空间约为`20 * 8 MiB = 160MiB`。需要注意的是,由于数据结构等因素的开销,实际使用的存储空间会略大于160MiB。 -6. 当使用`TensorSummary`时,由于记录完整Tensor数据,训练日志文件较大,MindInsight需要更多时间解析训练日志文件,请耐心等待。 \ No newline at end of file +6. 当使用`TensorSummary`时,由于记录完整Tensor数据,训练日志文件较大,MindInsight需要更多时间解析训练日志文件,请耐心等待。 diff --git a/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md b/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md index 3879fbfa69d4265dcd5551119b61b78a0b0dc6f6..80e4b8a6e815fd0fc344fd580fc8bbab76525570 100644 --- a/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md +++ b/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md @@ -75,12 +75,12 @@ print(output.asnumpy()) [ 0.05016355 0.03958241 0.03958241 0.03958241 0.03443141]]]] ``` - ## 执行普通函数 将若干算子组合成一个函数,然后直接通过函数调用的方式执行这些算子,并打印相关结果,如下例所示。 -**示例代码** +**示例代码:** + ```python import numpy as np from mindspore import context, Tensor @@ -99,7 +99,7 @@ output = tensor_add_func(x, y) print(output.asnumpy()) ``` -**输出** +**输出:** ```python [[3. 3. 3.] @@ -109,7 +109,6 @@ print(output.asnumpy()) > PyNative不支持并行执行和summary功能,图模式的并行和summary相关算子不能使用。 - ### 提升PyNative性能 为了提高PyNative模式下的前向计算任务执行速度,MindSpore提供了Staging功能,该功能可以在PyNative模式下将Python函数或者Python类的方法编译成计算图,通过图优化等技术提高运行速度,如下例所示。 @@ -142,7 +141,8 @@ tensor_add = P.TensorAdd() res = tensor_add(x, z) # PyNative mode print(res.asnumpy()) ``` -**输出** + +**输出:** ```python [[3. 3. 3. 3.] 
@@ -155,7 +155,7 @@ print(res.asnumpy()) 需要说明的是,加装了`ms_function`装饰器的函数中,如果包含不需要进行参数训练的算子(如`pooling`、`tensor_add`等算子),则这些算子可以在被装饰的函数中直接调用,如下例所示。 -**示例代码** +**示例代码:** ```python import numpy as np @@ -178,7 +178,8 @@ y = Tensor(np.ones([4, 4]).astype(np.float32)) z = tensor_add_fn(x, y) print(z.asnumpy()) ``` -**输出** + +**输出:** ```shell [[2. 2. 2. 2.] @@ -189,7 +190,7 @@ print(z.asnumpy()) 如果被装饰的函数中包含了需要进行参数训练的算子(如`Convolution`、`BatchNorm`等算子),则这些算子必须在被装饰等函数之外完成实例化操作,如下例所示。 -**示例代码** +**示例代码:** ```python import numpy as np @@ -211,7 +212,7 @@ z = conv_fn(Tensor(input_data)) print(z.asnumpy()) ``` -**输出** +**输出:** ```shell [[[[ 0.10377571 -0.0182163 -0.05221086] @@ -247,12 +248,11 @@ print(z.asnumpy()) [ 0.0377498 -0.06117418 0.00546303]]]] ``` - ## 调试网络训练模型 PyNative模式下,还可以支持单独求梯度的操作。如下例所示,可通过`GradOperation`求该函数或者网络所有的输入梯度。需要注意,输入类型仅支持Tensor。 -**示例代码** +**示例代码:** ```python from mindspore.ops import composite as C @@ -269,7 +269,7 @@ def mainf(x, y): print(mainf(Tensor(1, mstype.int32), Tensor(2, mstype.int32))) ``` -**输出** +**输出:** ```python (2, 1) @@ -277,7 +277,7 @@ print(mainf(Tensor(1, mstype.int32), Tensor(2, mstype.int32))) 在进行网络训练时,求得梯度然后调用优化器对参数进行优化(暂不支持在反向计算梯度的过程中设置断点),然后再利用前向计算loss,从而实现在PyNative模式下进行网络训练。 -**完整LeNet示例代码** +**完整LeNet示例代码:** ```python import numpy as np @@ -314,7 +314,7 @@ class LeNet5(nn.Cell): Lenet network Args: num_class (int): Num classes. Default: 10. 
- + Returns: Tensor, output tensor @@ -348,8 +348,8 @@ class LeNet5(nn.Cell): x = self.relu(x) x = self.fc3(x) return x - - + + class GradWrap(nn.Cell): """ GradWrap definition """ def __init__(self, network): @@ -378,7 +378,7 @@ loss = loss_output.asnumpy() print(loss) ``` -**输出** +**输出:** ```python 2.3050091 diff --git a/tutorials/training/source_zh_cn/advanced_use/debugger.md b/tutorials/training/source_zh_cn/advanced_use/debugger.md index 35214c624c4f8598917cce8bbfdb274d52f5cdcd..6652fe3ca6af1a5033fa28962f9e301a7c85725a 100644 --- a/tutorials/training/source_zh_cn/advanced_use/debugger.md +++ b/tutorials/training/source_zh_cn/advanced_use/debugger.md @@ -22,6 +22,7 @@ ## 概述 + MindSpore调试器是为图模式训练提供的调试工具,可以用来查看并分析计算图节点的中间结果。 在MindSpore图模式的训练过程中,用户无法从Python层获取到计算图中间节点的结果,使得训练调试变得很困难。使用MindSpore调试器,用户可以: @@ -37,6 +38,7 @@ MindSpore调试器是为图模式训练提供的调试工具,可以用来查 - 在MindInsight调试器界面分析训练执行情况。 ## 调试器环境准备 + 开始训练前,请先安装MindInsight,并以调试模式启动。调试模式下,MindSpore会将训练信息发送给MindInsight调试服务,用户可在MindInsight调试器界面进行查看和分析。 MindInsight调试服务启动命令: @@ -72,6 +74,7 @@ mindinsight start --port {PORT} --enable-debugger True --debugger-port {DEBUGGER 图1: 调试器初始页面 ### 计算图 + 调试器将优化后的最终执行图展示在UI的中上位置,用户可以双击打开图上的方框 (代表一个`scope`) 将计算图进一步展开,查看`scope`中的节点信息。 面板的最上方展示了`训练端地址`(训练脚本所在进程的地址和端口),训练使用的`卡号`, 训练的`当前轮次`等元信息。 @@ -119,11 +122,12 @@ mindinsight start --port {PORT} --enable-debugger True --debugger-port {DEBUGGER 图6: 查看触发的条件断点 -图6展示了条件断点触发后的展示页面,该页面和`节点列表`所在位置相同。触发的节点以及监控条件会按照节点的执行序排列,用户点击某一行,会在计算图中跳转到对应节点,可以进一步查看节点信息分析INF等异常结果出现的原因。 +图6展示了条件断点触发后的展示页面,该页面和`节点列表`所在位置相同。触发的节点以及监控条件会按照节点的执行序排列,用户点击某一行,会在计算图中跳转到对应节点,可以进一步查看节点信息分析INF等异常结果出现的原因。 ### 训练控制 监测点设置面板的下方是训练控制面板,该面板展示了调试器的训练控制功能,有`继续`、`暂停`、`结束`、`确定`四个按钮。 + - `确定`代表训练向前执行若干个`轮次`,需要用户在上方的输入框内指定执行的`轮次`数目,直到条件断点触发、或`轮次`执行完毕后暂停; - `继续`代表训练一直执行,直到条件断点触发后暂停、或运行至训练结束; - `暂停`代表训练暂停; @@ -134,20 +138,20 @@ mindinsight start --port {PORT} --enable-debugger True --debugger-port {DEBUGGER 1. 
在调试器环境准备完成后,打开调试器界面,如下图所示: ![debugger_waiting](./images/debugger_waiting.png) - + 图7: 调试器等待训练连接 - + 此时,调试器处于等待训练启动和连接的状态。 2. 运行训练脚本,稍后可以看到计算图显示在调试器界面,见图1。 3. 设置条件断点,见图5。 - + 图5中,选中检测条件,并勾选了部分节点,代表监控这些节点在计算过程是否存在满足监控条件的输出。 设置完条件断点后,可以在控制面板选择设置轮次点击`确定`或者`继续`继续训练。 4. 条件断点触发,见图6。 - + 条件断点触发后,用户查看对应的节点信息,找出异常原因后修改脚本,解掉bug。 ## 注意事项 diff --git a/tutorials/training/source_zh_cn/advanced_use/distributed_training_ascend.md b/tutorials/training/source_zh_cn/advanced_use/distributed_training_ascend.md index f0752c2b060cd84a1ab23a0f7200ffe672686f1e..d9368cc9c7f7bd9e832d7dd46549e173bfde0e85 100644 --- a/tutorials/training/source_zh_cn/advanced_use/distributed_training_ascend.md +++ b/tutorials/training/source_zh_cn/advanced_use/distributed_training_ascend.md @@ -12,6 +12,8 @@ - [调用集合通信库](#调用集合通信库) - [数据并行模式加载数据集](#数据并行模式加载数据集) - [定义网络](#定义网络) + - [手动混合并行模式](#手动混合并行模式) + - [半自动并行模式](#半自动并行模式) - [定义损失函数及优化器](#定义损失函数及优化器) - [定义损失函数](#定义损失函数) - [定义优化器](#定义优化器) @@ -34,6 +36,8 @@ > > +此外在[定义网络](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_ascend.html#id7)和[分布式训练模型参数保存和加载](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_ascend.html#id13)小节中我们针对手动混合并行模式和半自动并行模式的使用做了特殊说明。 + ## 准备环节 ### 下载数据集 @@ -81,13 +85,14 @@ - `device_ip`表示集成网卡的IP地址,可以在当前机器执行指令`cat /etc/hccn.conf`,`address_x`的键值就是网卡IP地址。 - `rank_id`表示卡逻辑序号,固定从0开始编号。 - ### 调用集合通信库 MindSpore分布式并行训练的通信使用了华为集合通信库`Huawei Collective Communication Library`(以下简称HCCL),可以在Ascend AI处理器配套的软件包中找到。同时`mindspore.communication.management`中封装了HCCL提供的集合通信接口,方便用户配置分布式信息。 > HCCL实现了基于Ascend AI处理器的多机多卡通信,有一些使用限制,我们列出使用分布式服务常见的,详细的可以查看HCCL对应的使用文档。 +> > - 单机场景下支持1、2、4、8卡设备集群,多机场景下支持8*n卡设备集群。 > - 每台机器的0-3卡和4-7卡各为1个组网,2卡和4卡训练时卡必须相连且不支持跨组网创建集群。 +> - 组建多机集群时需要保证各台机器使用同一交换机。 > - 服务器硬件架构及操作系统需要是SMP(Symmetrical Multi-Processing,对称多处理器)处理模式。 下面是调用集合通信库样例代码: @@ -100,10 +105,11 @@ from mindspore.communication.management import init if __name__ == "__main__": 
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend", device_id=int(os.environ["DEVICE_ID"])) init() - ... + ... ``` 其中, + - `mode=context.GRAPH_MODE`:使用分布式训练需要指定运行模式为图模式(PyNative模式不支持并行)。 - `device_id`:卡的物理序号,即卡所在机器中的实际序号。 - `init`:使能HCCL通信,并完成分布式训练初始化操作。 @@ -112,7 +118,6 @@ if __name__ == "__main__": 分布式训练时,数据是以数据并行的方式导入的。下面我们以CIFAR-10数据集为例,介绍以数据并行方式导入CIFAR-10数据集的方法,`data_path`是指数据集的路径,即`cifar-10-batches-bin`文件夹的路径。 - ```python import mindspore.common.dtype as mstype import mindspore.dataset as ds @@ -158,13 +163,76 @@ def create_dataset(data_path, repeat_num=1, batch_size=32, rank_id=0, rank_size= return data_set ``` + 其中,与单机不同的是,在数据集接口需要传入`num_shards`和`shard_id`参数,分别对应卡的数量和逻辑序号,建议通过HCCL接口获取: + - `get_rank`:获取当前设备在集群中的ID。 - `get_group_size`:获取集群数量。 ## 定义网络 -数据并行及自动并行模式下,网络定义方式与单机一致。代码请参考: +数据并行及自动并行模式下,网络定义方式与单机写法一致,可以参考[ResNet网络样例脚本](https://gitee.com/mindspore/docs/blob/master/tutorials/tutorial_code/resnet/resnet.py)。 + +本章节重点介绍手动混合并行和半自动并行模式的网络定义方法。 + +### 手动混合并行模式 + +手动混合并行模式在数据并行模式的基础上,对`parameter`增加了模型并行`layerwise_parallel`配置,包含此配置的`parameter`将以切片的形式保存并参与计算,在优化器计算时不会进行梯度累加。在该模式下,框架不会自动插入并行算子前后需要的计算和通信操作,为了保证计算逻辑的正确性,用户需要手动推导并写在网络结构中,适合对并行原理深入了解的用户使用。 + +以下面的代码为例,将`self.weight`指定为模型并行配置,即`self.weight`和`MatMul`的输出在第二维`channel`上存在切分。这时再在第二维上进行`ReduceSum`得到的仅是单卡累加结果,还需要引入`AllReduce.Sum`通信操作对每卡的结果做加和。关于并行算子的推导原理可以参考这篇[设计文档](https://www.mindspore.cn/doc/note/zh-CN/master/design/mindspore/distributed_training_design.html#id10)。 + +```python +from mindspore import Parameter, Tensor +import mindspore.ops as ops +import numpy as np +import mindspore.nn as nn + +class HybridParallelNet(nn.Cell): + def __init__(self): + super(HybridParallelNet, self).__init__() + # initialize the weight which is sliced at the second dimension + weight_init = np.random.rand(512, 128 // 2).astype(np.float32) + self.weight = Parameter(Tensor(weight_init), name="weight", layerwise_parallel=True) + self.fc = ops.MatMul() + self.reduce = ops.ReduceSum() + 
self.allreduce = ops.AllReduce(op='sum') + + def construct(self, x): + x = self.fc(x, self.weight) + x = self.reduce(x, -1) + x = self.allreduce(x) + return x +``` + +### 半自动并行模式 + +半自动并行模式相较于自动并行模式支持用户手动配置并行策略进行调优。关于算子并行策略的定义可以参考这篇[设计文档](https://www.mindspore.cn/doc/note/zh-CN/master/design/mindspore/distributed_training_design.html#id10)。 + +用户在使用半自动并行模式时,需要注意,未配置策略的算子默认以数据并行方式执行,如果某个`parameter`被多个算子使用,则每个算子对这个`parameter`的切分策略需要保持一致,否则将报错。 + +以前述的`HybridParallelNet`为例,在半自动并行模式下的脚本代码如下,`MatMul`的切分策略为`((1, 1), (1, 2))`,指定`self.weight`在第二维度上被切分两份。 + +```python +from mindspore import Parameter, Tensor +import mindspore.ops as ops +import numpy as np +import mindspore.nn as nn + +class SemiAutoParallelNet(nn.Cell): + def __init__(self): + super(SemiAutoParallelNet, self).__init__() + # initialize full tensor weight + weight_init = np.random.rand(512, 128).astype(np.float32) + self.weight = Parameter(Tensor(weight_init), name="weight") + # set shard strategy: a tuple with one per-dimension split tuple for each input + self.fc = ops.MatMul().shard(((1, 1), (1, 2))) + self.reduce = ops.ReduceSum() + + def construct(self, x): + x = self.fc(x, self.weight) + x = self.reduce(x, -1) + return x +``` ## 定义损失函数及优化器 @@ -255,7 +323,9 @@ def test_train_cifar(epoch_size=10): model = Model(net, loss_fn=loss, optimizer=opt) model.train(epoch_size, dataset, callbacks=[loss_cb], dataset_sink_mode=True) ``` + 其中, + - `dataset_sink_mode=True`:表示采用数据集的下沉模式,即训练的计算下沉到硬件平台中执行。 - `LossMonitor`:能够通过回调函数返回Loss值,用于监控损失函数。 @@ -322,6 +392,7 @@ cd ../ 脚本需要传入变量`DATA_PATH`和`RANK_SIZE`,分别表示数据集的路径和卡的数量。 其中必要的环境变量有, + - `RANK_TABLE_FILE`:组网信息文件的路径。 - `DEVICE_ID`:当前卡在机器上的实际序号。 - `RANK_ID`:当前卡的逻辑序号。 @@ -331,7 +402,7 @@ cd ../ 日志文件保存在`device`目录下,`env.log`中记录了环境变量的相关信息,关于Loss部分结果保存在`train.log`中,示例如下: -``` +```text epoch: 1 step: 156, loss is 2.0084016 epoch: 2 step: 156, loss is 1.6407638 epoch: 3 step: 156, loss is 1.6164391 @@ -485,7 +556,7 @@ context.reset_auto_parallel_context() # set parallel mode, data parallel mode is selected for training and model
saving. If you want to choose auto parallel # mode, you can simply change the value of parallel_mode parameter to ParallelMode.AUTO_PARALLEL. context.set_auto_parallel_context(parallel_mode=ParallelMode.SEMI_AUTO_PARALLEL, - strategy_ckpt_save_file='./rank_{}_ckpt/strategy.txt'.format(get_rank)) + strategy_ckpt_save_file='./rank_{}_ckpt/strategy.txt'.format(get_rank())) ``` 然后根据需要设置checkpoint保存策略,以及设置优化器和损失函数等,代码如下: @@ -514,12 +585,14 @@ context.reset_auto_parallel_context() 只需要改动设置checkpoint保存策略的代码,将`CheckpointConfig`中的`integrated_save`参数设置为False,便可实现每张卡上只保存本卡的checkpoint文件,具体改动如下: 将checkpoint配置策略由 + ```python # config checkpoint ckpt_config = CheckpointConfig(keep_checkpoint_max=1) ``` 改为 + ```python # config checkpoint ckpt_config = CheckpointConfig(keep_checkpoint_max=1, integrated_save=False) diff --git a/tutorials/training/source_zh_cn/advanced_use/distributed_training_gpu.md b/tutorials/training/source_zh_cn/advanced_use/distributed_training_gpu.md index c1d5cb03a3080e84b9e9ed82142831349a38f266..0b2f4a1eeb38e37f31340aa0239c343f2691a878 100644 --- a/tutorials/training/source_zh_cn/advanced_use/distributed_training_gpu.md +++ b/tutorials/training/source_zh_cn/advanced_use/distributed_training_gpu.md @@ -70,7 +70,7 @@ from mindspore.communication.management import init if __name__ == "__main__": context.set_context(mode=context.GRAPH_MODE, device_target="GPU") init("nccl") - ... + ... 
``` 其中, @@ -110,7 +110,7 @@ mpirun -n 8 pytest -s -v ./resnet50_distributed_training.py > train.log 2>&1 & 脚本需要传入变量`DATA_PATH`,表示数据集的路径。此外,我们需要修改下`resnet50_distributed_training.py`文件,由于在GPU上,我们无需设置`DEVICE_ID`环境变量,因此,在脚本中不需要调用`int(os.getenv('DEVICE_ID'))`来获取卡的物理序号,同时`context`中也无需传入`device_id`。我们需要将`device_target`设置为`GPU`,并调用`init("nccl")`来使能NCCL。日志文件保存到device目录下,关于Loss部分结果保存在train.log中。将loss值grep出来后,示例如下: -``` +```text epoch: 1 step: 1, loss is 2.3025854 epoch: 1 step: 1, loss is 2.3025854 epoch: 1 step: 1, loss is 2.3025854 @@ -124,6 +124,7 @@ epoch: 1 step: 1, loss is 2.3025854 ## 运行多机脚本 若训练涉及多机,则需要额外在`mpirun`命令中设置多机配置。你可以直接在`mpirun`命令中用`-H`选项进行设置,比如`mpirun -n 16 -H DEVICE1_IP:8,DEVICE2_IP:8 python hello.py`,表示在ip为DEVICE1_IP和DEVICE2_IP的机器上分别起8个进程运行程序;或者也可以构造一个如下这样的hostfile文件,并将其路径传给`mpirun`的`--hostfile`的选项。hostfile文件每一行格式为`[hostname] slots=[slotnum]`,hostname可以是ip或者主机名。 + ```bash DEVICE1 slots=8 DEVICE2 slots=8 @@ -145,4 +146,4 @@ echo "start training" mpirun -n 16 --hostfile $HOSTFILE -x DATA_PATH=$DATA_PATH -x PATH -mca pml ob1 pytest -s -v ./resnet50_distributed_training.py > train.log 2>&1 & ``` -在GPU上进行分布式训练时,模型参数的保存和加载可参考[分布式训练模型参数保存和加载](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_ascend.html#id12) \ No newline at end of file +在GPU上进行分布式训练时,模型参数的保存和加载可参考[分布式训练模型参数保存和加载](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/distributed_training_ascend.html#id12) diff --git a/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md b/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md index 7181afe3cca19a2a293bd0a09e3db02bd5f8b3bb..9f52c1ac9681f82f7e9f641a7ed955f8d9941bd9 100644 --- a/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md +++ b/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md @@ -42,6 +42,7 @@ MindSpore混合精度典型的计算流程如下图所示: 
使用自动混合精度,需要调用相应的接口,将待训练网络和优化器作为输入传进去;该接口会将整张网络的算子转换成FP16算子(除`BatchNorm`算子和Loss涉及到的算子外)。可以使用`amp`接口和`Model`接口两种方式实现混合精度。 使用`amp`接口具体的实现步骤为: + 1. 引入MindSpore的混合精度的接口`amp`; 2. 定义网络:该步骤和普通的网络定义没有区别(无需手动配置某个算子的精度); @@ -93,6 +94,7 @@ output = train_network(predict, label) ``` 使用`Model`接口具体的实现步骤为: + 1. 引入MindSpore的模型训练接口`Model`; 2. 定义网络:该步骤和普通的网络定义没有区别(无需手动配置某个算子的精度); @@ -169,6 +171,7 @@ model.train(epoch=10, train_dataset=ds_train) MindSpore还支持手动混合精度。假定在网络中只有一个Dense Layer要用FP32计算,其他Layer都用FP16计算。混合精度配置以Cell为粒度,Cell默认是FP32类型。 以下是一个手动混合精度的实现步骤: + 1. 定义网络:该步骤与自动混合精度中的步骤2类似; 2. 配置混合精度:通过`net.to_float(mstype.float16)`,把该Cell及其子Cell中所有的算子都配置成FP16;然后,将模型中的dense算子手动配置成FP32; @@ -220,4 +223,4 @@ train_network.set_train() # Run training output = train_network(predict, label) -``` \ No newline at end of file +``` diff --git a/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md b/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md index ce8172c6236837449633b315a8a125d63bd06c4f..ca8e2c515002d1092f59bd004215b516406a5193 100644 --- a/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md +++ b/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md @@ -22,6 +22,7 @@ 在面对复杂网络时,往往需要进行几十甚至几百次的epoch训练。在训练之前,很难掌握在训练到第几个epoch时,模型的精度能达到满足要求的程度,所以经常会采用一边训练的同时,在相隔固定epoch的位置对模型进行精度验证,并保存相应的模型,等训练完毕后,通过查看对应模型精度的变化就能迅速地挑选出相对最优的模型,本文将采用这种方法,以LeNet网络为样本,进行示例。 流程如下: + 1. 定义回调函数EvalCallBack,实现同步进行训练和验证。 2. 定义训练网络并执行。 3. 将不同epoch下的模型精度绘制出折线图并挑选最优模型。 @@ -54,7 +55,7 @@ class EvalCallBack(Callback): self.eval_dataset = eval_dataset self.eval_per_epoch = eval_per_epoch self.epoch_per_eval = epoch_per_eval - + def epoch_end(self, run_context): cb_param = run_context.original_args() cur_epoch = cb_param.cur_epoch_num @@ -92,21 +93,21 @@ if __name__ == "__main__": eval_per_epoch = 2 ... ... 
- + # need to calculate how many steps are in each epoch,in this example, 1875 steps per epoch config_ck = CheckpointConfig(save_checkpoint_steps=eval_per_epoch*1875, keep_checkpoint_max=15) ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet",directory=ckpt_save_dir, config=config_ck) model = Model(network, net_loss, net_opt, metrics={"Accuracy": Accuracy()}) - + epoch_per_eval = {"epoch": [], "acc": []} eval_cb = EvalCallBack(model, eval_data, eval_per_epoch, epoch_per_eval) - + model.train(epoch_size, train_data, callbacks=[ckpoint_cb, LossMonitor(375), eval_cb], dataset_sink_mode=True) ``` 输出结果: - + ```text epoch: 1 step: 375, loss is 2.298612 epoch: 1 step: 750, loss is 2.075152 epoch: 1 step: 1125, loss is 0.39205977 @@ -118,9 +119,7 @@ if __name__ == "__main__": epoch: 2 step: 1500, loss is 0.067035824 epoch: 2 step: 1875, loss is 0.0050643035 {'Accuracy': 0.9763621794871795} - ... ... - epoch: 9 step: 375, loss is 0.021227183 epoch: 9 step: 750, loss is 0.005586236 epoch: 9 step: 1125, loss is 0.029125651 @@ -133,10 +132,9 @@ if __name__ == "__main__": epoch: 10 step: 1875, loss is 0.10563098 {'Accuracy': 0.979667467948718} - 在同一目录找到`lenet_ckpt`文件夹,文件夹中保存了5个模型,和一个计算图相关数据,其结构如下: -``` +```text lenet_ckpt ├── checkpoint_lenet-10_1875.ckpt ├── checkpoint_lenet-2_1875.ckpt @@ -150,7 +148,6 @@ lenet_ckpt 定义绘图函数`eval_show`,将`epoch_per_eval`载入到`eval_show`中,绘制出不同`epoch`下模型的验证精度折线图。 - ```python import matplotlib.pyplot as plt @@ -168,7 +165,6 @@ eval_show(epoch_per_eval) ![png](./images/evaluate_the_model_during_training.png) - 从上图可以一目了然地挑选出需要的最优模型。 ## 总结 diff --git a/tutorials/training/source_zh_cn/advanced_use/improve_model_security_nad.md b/tutorials/training/source_zh_cn/advanced_use/improve_model_security_nad.md index 68020090ef4af4cd0e311deacc06c1ae3873479f..ed4b2d5b3f6f738334bb5d8e4e94a11aafae9e35 100644 --- a/tutorials/training/source_zh_cn/advanced_use/improve_model_security_nad.md +++ 
b/tutorials/training/source_zh_cn/advanced_use/improve_model_security_nad.md @@ -25,6 +25,7 @@ 本教程介绍MindArmour提供的模型安全防护手段,引导您快速使用MindArmour,为您的AI模型提供一定的安全防护能力。 AI算法设计之初普遍未考虑相关的安全威胁,使得AI算法的判断结果容易被恶意攻击者影响,导致AI系统判断失准。攻击者在原始样本处加入人类不易察觉的微小扰动,导致深度学习模型误判,称为对抗样本攻击。MindArmour模型安全提供对抗样本生成、对抗样本检测、模型防御、攻防效果评估等功能,为AI模型安全研究和AI应用安全提供重要支撑。 + - 对抗样本生成模块支持安全工程师快速高效地生成对抗样本,用于攻击AI模型。 - 对抗样本检测、防御模块支持用户检测过滤对抗样本、增强AI模型对于对抗样本的鲁棒性。 - 评估模块提供多种指标全面评估对抗样本攻防性能。 @@ -32,6 +33,7 @@ AI算法设计之初普遍未考虑相关的安全威胁,使得AI算法的判 这里通过图像分类任务上的对抗性攻防,以攻击算法FGSM和防御算法NAD为例,介绍MindArmour在对抗攻防上的使用方法。 > 本例面向CPU、GPU、Ascend 910 AI处理器,你可以在这里下载完整的样例代码: +> > - `mnist_attack_fgsm.py`:包含攻击代码。 > - `mnist_defense_nad.py`:包含防御代码。 @@ -132,18 +134,18 @@ def generate_mnist_dataset(data_path, batch_size=32, repeat_size=1, return nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=padding, weight_init=weight, has_bias=False, pad_mode="valid") - - + + def fc_with_initialize(input_channels, out_channels): weight = weight_variable() bias = weight_variable() return nn.Dense(input_channels, out_channels, weight, bias) - - + + def weight_variable(): return TruncatedNormal(0.02) - - + + class LeNet5(nn.Cell): """ Lenet network @@ -158,7 +160,7 @@ def generate_mnist_dataset(data_path, batch_size=32, repeat_size=1, self.relu = nn.ReLU() self.max_pool2d = nn.MaxPool2d(kernel_size=2, stride=2) self.flatten = nn.Flatten() - + def construct(self, x): x = self.conv1(x) x = self.relu(x) @@ -190,7 +192,7 @@ def generate_mnist_dataset(data_path, batch_size=32, repeat_size=1, model = Model(net, loss, opt, metrics=None) model.train(10, ds_train, callbacks=[LossMonitor()], dataset_sink_mode=False) - + # 2. get test data ds_test = generate_mnist_dataset(os.path.join(mnist_path, "test"), batch_size=batch_size, repeat_size=1, @@ -203,7 +205,7 @@ def generate_mnist_dataset(data_path, batch_size=32, repeat_size=1, test_inputs = np.concatenate(inputs) test_labels = np.concatenate(labels) ``` - + 3. 
测试模型。 ```python @@ -217,15 +219,15 @@ def generate_mnist_dataset(data_path, batch_size=32, repeat_size=1, logits = net(Tensor(batch_inputs)).asnumpy() test_logits.append(logits) test_logits = np.concatenate(test_logits) - + tmp = np.argmax(test_logits, axis=1) == np.argmax(test_labels, axis=1) accuracy = np.mean(tmp) LOGGER.info(TAG, 'prediction accuracy before attacking is : %s', accuracy) ``` - + 测试结果中分类精度达到了98%。 - - ```python + + ```python prediction accuracy before attacking is : 0.9895833333333334 ``` @@ -272,7 +274,7 @@ LOGGER.info(TAG, 'The average structural similarity between original ' 攻击结果如下: -``` +```text prediction accuracy after attacking is : 0.052083 mis-classification rate of adversaries is : 0.947917 The average confidence of adversarial class is : 0.803375 @@ -349,7 +351,7 @@ LOGGER.info(TAG, 'The average confidence of true class is : %s', ### 防御效果 -``` +```text accuracy of TEST data on defensed model is : 0.974259 accuracy of adv data on defensed model is : 0.856370 defense mis-classification rate of adversaries is : 0.143629 @@ -358,4 +360,3 @@ The average confidence of true class is : 0.177374 ``` 使用NAD进行对抗样本防御后,模型对于对抗样本的误分类率从95%降至14%,模型有效地防御了对抗样本。同时,模型对于原来测试数据集的分类精度达97%。 - diff --git a/tutorials/training/source_zh_cn/advanced_use/lineage_and_scalars_comparision.md b/tutorials/training/source_zh_cn/advanced_use/lineage_and_scalars_comparision.md index f7a55d1c793bd401e67156859c4dce0536293100..555ef0fee8d2c087de64f6852395e3c3ba70bdda 100644 --- a/tutorials/training/source_zh_cn/advanced_use/lineage_and_scalars_comparision.md +++ b/tutorials/training/source_zh_cn/advanced_use/lineage_and_scalars_comparision.md @@ -106,6 +106,7 @@ MindInsight中的模型溯源、数据溯源和对比看板同训练看板一样 ## 注意事项 出于性能上的考虑,MindInsight对比看板使用缓存机制加载训练的标量曲线数据,并进行以下限制: + - 对比看板只支持在缓存中的训练进行比较标量曲线对比。 - 缓存最多保留最新(按修改时间排列)的15个训练。 - 用户最多同时对比5个训练的标量曲线。 diff --git a/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts.md 
b/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts.md index 7b94469d212649f5dc183f295a6618dfa192328a..d382670fc2dfbb67b8bd3a51e107b0a20df5a9da 100644 --- a/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts.md +++ b/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts.md @@ -36,10 +36,12 @@ 以ResNet-50为例,[Conv](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.Conv2d)和[BatchNorm](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.nn.html#mindspore.nn.BatchNorm2d)是其中最主要的两个算子,它们已在MindSpore支持的算子列表中。 如果发现没有对应算子,建议: + - 使用其他算子替换:分析算子实现公式,审视是否可以采用MindSpore现有算子叠加达到预期目标。 - 临时替代方案:比如不支持某个Loss,是否可以替换为同类已支持的Loss算子;又比如当前的网络结构,是否可以替换为其他同类主流网络等。 如果发现支持的算子存在功能不全,建议: + - 非必要功能:可删除。 - 必要功能:寻找替代方案。 @@ -68,7 +70,7 @@ MindSpore与TensorFlow、PyTorch在网络结构组织方式上,存在一定差 2. 加载数据集和预处理。 使用MindSpore构造你需要使用的数据集。目前MindSpore已支持常见数据集,你可以通过原始格式、`MindRecord`、`TFRecord`等多种接口调用,同时还支持数据处理以及数据增强等相关功能,具体用法可参考[准备数据教程](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/data_preparation.html)。 - + 本例中加载了Cifar-10数据集,可同时支持单卡和多卡的场景。 ```python @@ -78,7 +80,7 @@ MindSpore与TensorFlow、PyTorch在网络结构组织方式上,存在一定差 ds = de.Cifar10Dataset(dataset_path, num_parallel_workers=4, shuffle=True, num_shards=device_num, shard_id=rank_id) ``` - + 然后对数据进行了数据增强、数据清洗和批处理等操作。代码详见。 3. 构建网络。 @@ -234,13 +236,13 @@ MindSpore与TensorFlow、PyTorch在网络结构组织方式上,存在一定差 ``` 如果希望使用`Model`内置的评估方法,则可以使用[metrics](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/custom_debugging_info.html#mindspore-metrics)属性设置希望使用的评估方法。 - + ```python model = Model(net, loss_fn=loss, optimizer=opt, loss_scale_manager=loss_scale, metrics={'acc'}) ``` 类似于TensorFlow的`estimator.train`,可以通过调用`model.train`接口来进行训练。CheckPoint和中间结果打印等功能,可通过`Callback`的方式定义到`model.train`接口上。 - + ```python time_cb = TimeMonitor(data_size=step_size) loss_cb = LossMonitor() @@ -256,6 +258,7 @@ MindSpore与TensorFlow、PyTorch在网络结构组织方式上,存在一定差 #### 精度调试 精度调优过程建议如下两点: + 1. 
单卡精度验证时,建议先采用小数据集进行训练。验证达标后,多卡精度验证时,再采用全量数据集。这样可以帮助提升调试效率。 2. 首先删减脚本中的不必要技巧(如优化器中的增强配置、动态Loss Scale等),验证达标后,在此基础上逐个叠加新增功能,待当前新增功能确认正常后,再叠加下一个功能。这样可以帮助快速定位问题。 diff --git a/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts_mindconverter.md b/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts_mindconverter.md index 4128c54a2c594fac6eb46e8964326cdc8538bfe4..6cf61899c7b2a90c36839a3f4a5c132432f41f12 100644 --- a/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts_mindconverter.md +++ b/tutorials/training/source_zh_cn/advanced_use/migrate_3rd_scripts_mindconverter.md @@ -22,14 +22,10 @@ MindConverter是一款将PyTorch模型脚本转换至MindSpore的脚本迁移工具。结合转换报告的提示信息,用户对转换后脚本进行微小改动,即可快速将PyTorch模型脚本迁移至MindSpore。 - - ## 安装 此工具为MindInsight的子模块,安装MindInsight后,即可使用MindConverter,MindInsight安装请参考该[安装文档](https://www.mindspore.cn/install/)。 - - ## 用法 MindConverter提供命令行(Command-line interface, CLI)的使用方式,命令如下。 @@ -79,15 +75,13 @@ optional arguments: 另外,当使用基于图结构的脚本生成方案时,请确保原PyTorch项目已在Python包搜索路径中,可通过CLI进入Python交互式命令行,通过import的方式判断是否已满足;若未加入,可通过`--project_path`命令手动将项目路径传入,以确保MindConverter可引用到原PyTorch脚本。 - > 假设用户项目目录为`/home/user/project/model_training`,用户可通过如下命令手动项目添加至包搜索路径中:`export PYTHONPATH=/home/user/project/model_training:$PYTHONPATH` - > 此处MindConverter需要引用原PyTorch脚本,是因为PyTorch模型反向序列化过程中会引用原脚本。 - ## 使用场景 MindConverter提供两种技术方案,以应对不同脚本迁移场景: + 1. 用户希望迁移后脚本保持原有PyTorch脚本结构(包括变量、函数、类命名等与原脚本保持一致); 2. 用户希望迁移后脚本保持较高的转换率,尽量少的修改、甚至不需要修改,即可实现迁移后模型脚本的执行。 @@ -101,7 +95,6 @@ MindConverter提供两种技术方案,以应对不同脚本迁移场景: > 2. 基于图结构的脚本生成方案,由于要基于推理模式加载PyTorch模型,会导致转换后网络中Dropout算子丢失,需要用户手动补齐; > 3. 基于图结构的脚本生成方案持续优化中。 - ## 使用示例 ### 基于AST的脚本转换示例 @@ -121,6 +114,7 @@ line x:y: [UnConvert] 'operator' didn't convert. ... ``` 转换报告示例如下所示: + ```text [Start Convert] [Insert] 'import mindspore.ops.operations as P' is inserted to the converted file. @@ -133,7 +127,6 @@ line x:y: [UnConvert] 'operator' didn't convert. ... 
对于部分未成功转换的算子,报告中会提供修改建议,如`line 157:23`,MindConverter建议将`torch.nn.AdaptiveAvgPool2d`替换为`mindspore.ops.operations.ReduceMean`。 - ### 基于图结构的脚本生成示例 若用户已将PyTorch模型保存为.pth格式,假设模型绝对路径为`/home/user/model.pth`,该模型期望的输入样本shape为(3, 224, 224),原PyTorch脚本位于`/home/user/project/model_training`,希望将脚本输出至`/home/user/output`,转换报告输出至`/home/user/output/report`,则脚本生成命令为: @@ -147,10 +140,8 @@ mindconverter --model_file /home/user/model.pth --shape 3,224,224 \ 执行该命令,MindSpore代码文件、转换报告生成至相应目录。 - 基于图结构的脚本生成方案产生的转换报告格式与AST方案相同。然而,由于基于图结构方案属于生成式方法,转换过程中未参考原PyTorch脚本,因此生成的转换报告中涉及的代码行、列号均指生成后脚本。 - 另外对于未成功转换的算子,在代码中会相应的标识该节点输入、输出Tensor的shape(以`input_shape`, `output_shape`标识),便于用户手动修改。以Reshape算子为例(暂不支持Reshape),将生成如下代码: ```python @@ -194,7 +185,6 @@ class Classifier(nn.Cell): ``` - > 其中`--output`与`--report`参数可省略,若省略,该命令将在当前工作目录(Working directory)下自动创建`output`目录,将生成的脚本、转换报告输出至该目录。 ## 注意事项 diff --git a/tutorials/training/source_zh_cn/advanced_use/nlp_bert_poetry.md b/tutorials/training/source_zh_cn/advanced_use/nlp_bert_poetry.md index d595dbef535f308296582fc1edec0628d35d278b..27cbdb6297c7c7091272562aaa64d79554952cac 100644 --- a/tutorials/training/source_zh_cn/advanced_use/nlp_bert_poetry.md +++ b/tutorials/training/source_zh_cn/advanced_use/nlp_bert_poetry.md @@ -87,7 +87,7 @@ BERT采用了Encoder结构,`attention_mask`为全1的向量,即每个token 样例代码可[点击下载](https://mindspore-website.obs.cn-north-4.myhuaweicloud.com:443/DemoCode/bert_poetry_c.rar),可直接运行体验实现写诗效果,代码结构如下: -``` +```text └─bert_poetry ├── src ├── bert_for_pre_training.py # 封装BERT-Base正反向网络类 @@ -107,7 +107,7 @@ BERT采用了Encoder结构,`attention_mask`为全1的向量,即每个token ├── poetry_client.py # 客户端代码 ├── ms_service_pb2_grpc.py # 定义了grpc相关函数供bert_flask.py使用 └── ms_service_pb2.py # 定义了protocol buffer相关函数供bert_flask.py使用 - + ``` ## 实现步骤 @@ -118,7 +118,6 @@ BERT采用了Encoder结构,`attention_mask`为全1的向量,即每个token ### 数据准备 - 数据集为43030首诗词:可[下载](https://github.com/AaronJny/DeepLearningExamples/tree/master/keras-bert-poetry-generator)其中的`poetry.txt`。 
BERT-Base模型的预训练ckpt:可在[MindSpore官网](http://download.mindspore.cn/model_zoo/official/nlp/bert/bert_base_ascend_0.5.0_cn-wiki_official_nlp_20200720.tar.gz)下载。 @@ -127,7 +126,7 @@ BERT-Base模型的预训练ckpt:可在[MindSpore官网](http://download.mindsp 在`src/finetune_config.py`中修改`pre_training_ckpt`路径,加载预训练的ckpt,修改`batch_size`为bs,修改`dataset_path`为存放诗词的路径,默认的`BertConfig`为Base模型。 -``` +```python 'dataset_path': '/your/path/to/poetry.txt', 'batch_size': bs, 'pre_training_ckpt': '/your/path/to/pre_training_ckpt', @@ -135,7 +134,7 @@ BERT-Base模型的预训练ckpt:可在[MindSpore官网](http://download.mindsp 执行训练指令 -``` +```bash python poetry.py ``` @@ -145,19 +144,20 @@ python poetry.py `generate_random_poetry`函数实现随机生成和续写诗句的功能,如果入参`s`为空则代表随机生成,`s`不为空则为续写诗句。 -``` +```python output = generate_random_poetry(poetrymodel, s='') #随机生成 output = generate_random_poetry(poetrymodel, s='天下为公') #续写诗句 ``` `generate_hidden`函数实现生成藏头诗的功能,入参`head`为隐藏的头部语句。 -``` + +```python output = generate_hidden(poetrymodel, head="人工智能") #藏头诗 ``` 执行推理指令 -``` +```bash python poetry.py --train=False --ckpt_path=/your/ckpt/path ``` @@ -165,7 +165,7 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 随机生成: -``` +```text 大堤柳暗, 春深树根。 东望一望, @@ -178,7 +178,7 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 续写 【天下为公】: -``` +```text 天下为公少, 唯君北向西。 远山无路见, @@ -191,7 +191,7 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 藏头诗 【人工智能】: -``` +```text 人君离别难堪望, 工部张机自少年。 智士不知身没处, @@ -206,7 +206,7 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 在使用Serving部署服务前,需要导出模型文件,在`poetry.py`中提供了`export_net`函数负责导出MINDIR模型,执行命令: - ``` + ```bash python poetry.py --export=True --ckpt_path=/your/ckpt/path ``` @@ -216,7 +216,7 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 在服务器侧启动Serving服务,并加载导出的MINDIR文件`poetry.pb`。 - ``` + ```bash cd serving ./ms_serving --model_path=/path/to/your/MINDIR_file --model_name=your_mindir.pb ``` @@ -225,7 +225,7 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 
预处理及后处理通过Flask框架来快速实现,在服务器侧运行`bert_flask.py`文件,启动Flask服务。 - ``` + ```bash python bert_flask.py ``` @@ -235,36 +235,38 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 可用电脑作为客户端,修改`poetry_client.py`中的url请求地址为推理服务启动的服务器IP,并确保端口与服务端`bert_flask.py`中的端口一致,例如: - ``` + ```python url = 'http://10.155.170.71:8080/' ``` 运行`poetry_client.py`文件 - ``` + ```bash python poetry_client.py ``` 此时在客户端输入指令,即可在远端服务器进行推理,返回生成的诗句。 - ``` + ```text 选择模式:0-随机生成,1:续写,2:藏头诗 0 ``` - ``` + + ```text 一朵黄花叶, 千竿绿树枝。 含香待夏晚, 澹浩长风时。 ``` - ``` + ```text 选择模式:0-随机生成,1:续写,2:藏头诗 1 输入首句诗 明月 ``` - ``` + + ```text 明月照三峡, 长空一片云。 秋风与雨过, @@ -275,13 +277,14 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 何道逐风君。 ``` - ``` + ```text 选择模式:0-随机生成,1:续写,2:藏头诗 2 输入藏头诗 人工智能 ``` - ``` + + ```text 人生事太远, 工部与神期。 智者岂无识, @@ -290,10 +293,8 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path 细读鉴赏一下,平仄、押韵、意味均有体现,AI诗人已然成形。 - > 友情提醒,修改其他类型数据集,也可以完成其他简单的生成类任务,如对春联,简单聊天机器人等,用户可尝试体验实现。 - ## 参考文献 [1] [BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) @@ -301,4 +302,3 @@ python poetry.py --train=False --ckpt_path=/your/ckpt/path [2] [https://github.com/AaronJny/DeepLearningExamples/](https://github.com/AaronJny/DeepLearningExamples/) [3] [https://github.com/bojone/bert4keras](https://github.com/bojone/bert4keras) - diff --git a/tutorials/training/source_zh_cn/advanced_use/nlp_sentimentnet.md b/tutorials/training/source_zh_cn/advanced_use/nlp_sentimentnet.md index 9dcc7157aebbdbc6ca221656b57322f84525e722..e73bbcaf3b629a66a603977dd014998a5292b9f5 100644 --- a/tutorials/training/source_zh_cn/advanced_use/nlp_sentimentnet.md +++ b/tutorials/training/source_zh_cn/advanced_use/nlp_sentimentnet.md @@ -48,6 +48,7 @@ $垂直极性词 = 通用极性词 + 领域特有极性词$ 按照处理文本的粒度不同,情感分析可分为词语级、短语级、句子级、段落级以及篇章级等几个研究层次。这里以“段落级”为例,输入为一个段落,输出为影评是正面还是负面的信息。 ## 准备及设计 + ### 下载数据集 采用IMDb影评数据集作为实验数据。 @@ -55,15 +56,17 @@ $垂直极性词 = 通用极性词 + 领域特有极性词$ 
以下是负面影评(Negative)和正面影评(Positive)的案例。 -| Review | Label | +| Review | Label | |---|---| | "Quitting" may be as much about exiting a pre-ordained identity as about drug withdrawal. As a rural guy coming to Beijing, class and success must have struck this young artist face on as an appeal to separate from his roots and far surpass his peasant parents' acting success. Troubles arise, however, when the new man is too new, when it demands too big a departure from family, history, nature, and personal identity. The ensuing splits, and confusion between the imaginary and the real and the dissonance between the ordinary and the heroic are the stuff of a gut check on the one hand or a complete escape from self on the other. | Negative | | This movie is amazing because the fact that the real people portray themselves and their real life experience and do such a good job it's like they're almost living the past over again. Jia Hongsheng plays himself an actor who quit everything except music and drugs struggling with depression and searching for the meaning of life while being angry at everyone especially the people who care for him most. | Positive | 同时,我们要下载GloVe文件,并在文件开头处添加新的一行,意思是总共读取400000个单词,每个单词用300维度的词向量表示。 -``` + +```text 400000 300 ``` + GloVe文件下载地址:。 ### 确定评价标准 @@ -74,22 +77,23 @@ $精度(Accuracy)= 分类正确的样本数目 / 总样本数目$ $精准度(Precision)= 真阳性样本数目 / 所有预测类别为阳性的样本数目$ -$召回率(Recall)= 真阳性样本数目 / 所有真实类别为阳性的样本数目$ +$召回率(Recall)= 真阳性样本数目 / 所有真实类别为阳性的样本数目$ -$F1分数 = (2 * Precision * Recall) / (Precision + Recall)$ +$F1分数 = (2 \* Precision \* Recall) / (Precision + Recall)$ 在IMDb这个数据集中,正负样本数差别不大,可以简单地用精度(accuracy)作为分类器的衡量标准。 - ### 确定网络及流程 我们使用基于LSTM构建的SentimentNet网络进行自然语言处理。 + 1. 加载使用的数据集,并进行必要的数据处理。 2. 使用基于LSTM构建的SentimentNet网络训练数据,生成模型。 > LSTM(Long short-term memory,长短期记忆)网络是一种时间循环神经网络,适合于处理和预测时间序列中间隔和延迟非常长的重要事件。具体介绍可参考网上资料,在此不再赘述。 3. 
得到模型之后,使用验证数据集,查看模型精度情况。 > 本例面向GPU或CPU硬件平台,你可以在这里下载完整的样例代码: +> > - `src/config.py`:网络中的一些配置,包括`batch size`、进行几次epoch训练等。 > - `src/dataset.py`:数据集相关,包括转换成MindRecord文件,数据预处理等。 > - `src/imdb.py`: 解析IMDb数据集的工具。 @@ -98,8 +102,11 @@ $F1分数 = (2 * Precision * Recall) / (Precision + Recall)$ > - `eval.py`:模型的推理脚本。 ## 实现阶段 + ### 导入需要的库文件 + 下列是我们所需要的公共模块及MindSpore的模块及库文件。 + ```python import argparse import os @@ -119,6 +126,7 @@ from mindspore.train.serialization import load_param_into_net, load_checkpoint ### 配置环境信息 1. 使用`parser`模块,传入运行必要的信息,如数据集存放路径,GloVe存放路径,这样的好处是,对于经常变化的配置,可以在运行代码时输入,使用更加灵活。 + ```python parser = argparse.ArgumentParser(description='MindSpore LSTM Example') parser.add_argument('--preprocess', type=str, default='false', choices=['true', 'false'], @@ -139,12 +147,14 @@ from mindspore.train.serialization import load_param_into_net, load_checkpoint ``` 2. 实现代码前,需要配置必要的信息,包括环境信息、执行的模式、后端信息及硬件信息。 + ```python context.set_context( mode=context.GRAPH_MODE, save_graphs=False, device_target=args.device_target) ``` + 详细的接口配置信息,请参见`context.set_context`接口说明。 ### 预处理数据集 @@ -156,15 +166,14 @@ if args.preprocess == "true": print("============== Starting Data Pre-processing ==============") convert_to_mindrecord(cfg.embed_size, args.aclimdb_path, args.preprocess_path, args.glove_path) ``` -> 转换成功后会在`preprocess_path`路径下生成`mindrecord`文件; 通常该操作在数据集不变的情况下,无需每次训练都执行。 +> 转换成功后会在`preprocess_path`路径下生成`mindrecord`文件; 通常该操作在数据集不变的情况下,无需每次训练都执行。 > `convert_to_mindrecord`函数的具体实现请参考 - > 其中包含两大步骤: +> > 1. 解析文本数据集,包括编码、分词、对齐、处理GloVe原始数据,使之能够适应网络结构。 > 2. 
转换并保存为MindRecord格式数据集。 - ### 定义网络 ```python @@ -178,11 +187,13 @@ network = SentimentNet(vocab_size=embedding_table.shape[0], weight=Tensor(embedding_table), batch_size=cfg.batch_size) ``` + > `SentimentNet`网络结构的具体实现请参考 ### 预训练模型 通过参数`pre_trained`指定预加载CheckPoint文件来进行预训练,默认该参数为空。 + ```python if args.pre_trained: load_param_into_net(network, load_checkpoint(args.pre_trained)) @@ -217,6 +228,7 @@ else: model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb]) print("============== Training Success ==============") ``` + > `lstm_create_dataset`函数的具体实现请参考 ### 模型验证 @@ -238,12 +250,15 @@ print("============== {} ==============".format(acc)) ``` ## 实验结果 + 在经历了20轮epoch之后,在测试集上的精度约为84.19%。 -**执行训练** +### 执行训练 + 1. 运行训练代码,查看运行结果。 + ```shell - $ python train.py --preprocess=true --ckpt_path=./ --device_target=GPU + python train.py --preprocess=true --ckpt_path=./ --device_target=GPU ``` 输出如下,可以看到loss值随着训练逐步降低,最后达到0.2855左右: @@ -261,13 +276,13 @@ print("============== {} ==============".format(acc)) epoch: 20 step: 389, loss is 0.1354 epoch: 20 step: 390, loss is 0.2855 ``` - + 2. 
查看保存的CheckPoint文件。 - + 训练过程中保存了CheckPoint文件,即模型文件,我们可以查看文件保存的路径下的所有保存文件。 ```shell - $ ls ./*.ckpt + ls ./*.ckpt ``` 输出如下: @@ -276,12 +291,12 @@ print("============== {} ==============".format(acc)) lstm-11_390.ckpt lstm-12_390.ckpt lstm-13_390.ckpt lstm-14_390.ckpt lstm-15_390.ckpt lstm-16_390.ckpt lstm-17_390.ckpt lstm-18_390.ckpt lstm-19_390.ckpt lstm-20_390.ckpt ``` -**验证模型** +### 验证模型 使用最后保存的CheckPoint文件,加载验证数据集,进行验证。 ```shell -$ python eval.py --ckpt_path=./lstm-20_390.ckpt --device_target=GPU +python eval.py --ckpt_path=./lstm-20_390.ckpt --device_target=GPU ``` 输出如下,可以看到使用验证的数据集,对文本的情感分析正确率在84.19%左右,达到一个基本满意的结果。 @@ -290,6 +305,3 @@ $ python eval.py --ckpt_path=./lstm-20_390.ckpt --device_target=GPU ============== Starting Testing ============== ============== {'acc': 0.8419471153846154} ============== ``` - - - diff --git a/tutorials/training/source_zh_cn/advanced_use/optimize_data_processing.md b/tutorials/training/source_zh_cn/advanced_use/optimize_data_processing.md index 30c79d1cbc7d233fe3cc48513a7d55da416b1bae..256a17af7d87826fc63aa65c5a718db3c0065277 100644 --- a/tutorials/training/source_zh_cn/advanced_use/optimize_data_processing.md +++ b/tutorials/training/source_zh_cn/advanced_use/optimize_data_processing.md @@ -54,7 +54,7 @@ import numpy as np 目录结构如下所示: -``` +```text dataset/Cifar10Data ├── cifar-10-batches-bin │   ├── batches.meta.txt @@ -77,6 +77,7 @@ dataset/Cifar10Data ``` 其中: + - `cifar-10-batches-bin`目录为CIFAR-10二进制格式数据集目录。 - `cifar-10-batches-py`目录为CIFAR-10 Python文件格式数据集目录。 @@ -94,6 +95,7 @@ MindSpore为用户提供了多种数据加载方式,其中包括常用数据 ![title](./images/data_loading_performance_scheme.png) 数据加载性能优化建议如下: + - 已经支持的数据集格式优选内置加载算子,具体内容请参考[内置加载算子](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.dataset.html),如果性能仍无法满足需求,则可采取多线程并发方案,请参考本文[多线程优化方案](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id16)。 - 
不支持的数据集格式,优选转换为MindSpore数据格式后再使用`MindDataset`类进行加载,具体内容请参考[MindSpore数据格式转换](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/dataset_conversion.html),如果性能仍无法满足需求,则可采取多线程并发方案,请参考本文[多线程优化方案](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id16)。 - 不支持的数据集格式,算法快速验证场景,优选用户自定义`GeneratorDataset`类实现,如果性能仍无法满足需求,则可采取多进程并发方案,请参考本文[多进程优化方案](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id17)。 @@ -115,7 +117,7 @@ MindSpore为用户提供了多种数据加载方式,其中包括常用数据 输出: - ``` + ```text {'image': Tensor(shape=[32, 32, 3], dtype=UInt8, value= [[[235, 235, 235], [230, 230, 230], @@ -150,7 +152,7 @@ MindSpore为用户提供了多种数据加载方式,其中包括常用数据 输出: - ``` + ```text {'data': Tensor(shape=[1431], dtype=UInt8, value= [255, 216, 255, ..., 63, 255, 217]), 'id': Tensor(shape=[], dtype=Int64, value= 30474), 'label': Tensor(shape=[], dtype=Int64, value= 2)} @@ -171,7 +173,7 @@ MindSpore为用户提供了多种数据加载方式,其中包括常用数据 输出: - ``` + ```text {'data': Tensor(shape=[1], dtype=Int64, value= [0])} ``` @@ -184,6 +186,7 @@ shuffle操作主要是对有序的数据集或者进行过repeat的数据集进 ![title](./images/shuffle_performance_scheme.png) shuffle性能优化建议如下: + - 直接使用内置加载算子的`shuffle`参数进行数据的混洗。 - 如果使用的是`shuffle`函数,当性能仍无法满足需求,可通过调整`buffer_size`参数的值来优化提升性能。 @@ -204,7 +207,7 @@ shuffle性能优化建议如下: 输出: - ``` + ```text {'image': Tensor(shape=[32, 32, 3], dtype=UInt8, value= [[[254, 254, 254], [255, 255, 254], @@ -239,7 +242,7 @@ shuffle性能优化建议如下: 输出: - ``` + ```text before shuffle: [0 1 2 3 4] [1 2 3 4 5] @@ -257,6 +260,7 @@ shuffle性能优化建议如下: ## 数据增强性能优化 在图片分类的训练中,尤其是当数据集比较小的时候,用户可以使用数据增强的方式来预处理图片,从而丰富数据集。MindSpore为用户提供了多种数据增强的方式,其中包括: + - 使用内置C算子(`c_transforms`模块)进行数据增强。 - 使用内置Python算子(`py_transforms`模块)进行数据增强。 - 用户可根据自己的需求,自定义Python函数进行数据增强。 @@ -273,6 +277,7 @@ shuffle性能优化建议如下: ![title](./images/data_enhancement_performance_scheme.png) 数据增强性能优化建议如下: + - 
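Prefer the `c_transforms` module for data augmentation because it offers the highest performance. If performance still falls short, consider the [multi-thread optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id16), the [Compose optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#compose), or the [operator fusion optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id18).
 - If you use the `py_transforms` module for data augmentation and performance still falls short, consider the [multi-thread optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id16), the [multi-process optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id17), the [Compose optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#compose), or the [operator fusion optimization solution](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/optimize_data_processing.html#id18).
 - The `c_transforms` module manages its buffers in C++, whereas the `py_transforms` module manages its buffers in Python. Because switching between Python and C++ has a performance cost, avoid mixing operators from the two modules.
@@ -326,7 +331,7 @@ shuffle性能优化建议如下:
 
     Output:
 
-    ```
+    ```text
     before map:
     [0 1 2 3 4]
     [1 2 3 4 5]
@@ -394,6 +399,7 @@ shuffle性能优化建议如下:
 ### Multi-thread Optimization Solution
 
 In the data pipeline, the relevant operators generally expose a parameter that sets the number of threads, which raises processing concurrency and improves performance. For example:
+
 - During data loading, the built-in data loading classes have a `num_parallel_workers` parameter that sets the number of threads.
 - During data augmentation, the `map` function has a `num_parallel_workers` parameter that sets the number of threads.
 - During batching, the `batch` function has a `num_parallel_workers` parameter that sets the number of threads.
@@ -403,6 +409,7 @@ shuffle性能优化建议如下:
 ### Multi-process Optimization Solution
 
 Operators implemented in Python for data processing all support multi-process mode. For example:
+
 - The `GeneratorDataset` class uses multi-process mode by default. Its `num_parallel_workers` parameter specifies the number of processes to start and defaults to 1. For details, see [GeneratorDataset](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.dataset.html#mindspore.dataset.GeneratorDataset).
 - When you use a custom Python function or the `py_transforms` module for data augmentation, `num_parallel_workers` specifies the number of processes if the `python_multiprocessing` parameter of the `map` function is set to True. `python_multiprocessing` defaults to False, in which case `num_parallel_workers` specifies the number of threads. For details, see [Built-in Loading Operators](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.dataset.html).
diff --git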
a/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md b/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md index cea56ee54da1bb4d6785807ca53909d321ec57cc..c603d4164123779e9ac35e60180b58e372808da6 100644 --- a/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md +++ b/tutorials/training/source_zh_cn/advanced_use/performance_profiling.md @@ -22,6 +22,7 @@ ## 概述 + 将训练过程中的算子耗时等信息记录到文件中,通过可视化界面供用户查看分析,帮助用户更高效地调试神经网络性能。 ## 操作流程 @@ -31,11 +32,13 @@ - 在训练列表找到对应训练,点击性能分析,即可在页面中查看训练性能数据。 ## 环境准备 + 在使用性能分析工具之前,要确保后台工具进程(ada)正确启动,要求用户使用HwHiAiUser用户组的用户或root启动ada进程,并使用同用户跑训练脚本,启动命令为:`/usr/local/Ascend/driver/tools/ada`。 ## 准备训练脚本 为了收集神经网络的性能数据,需要在训练脚本中添加MindSpore Profiler相关接口。 + - `set_context`之后,初始化网络和HCCL之前,需要初始化MindSpore `Profiler`对象。 > Profiler支持的参数可以参考: @@ -54,10 +57,10 @@ from mindspore import Model, nn, context def test_profiler(): # Init context env context.set_context(mode=context.GRAPH_MODE, device_target='Ascend', device_id=int(os.environ["DEVICE_ID"])) - + # Init Profiler profiler = Profiler() - + # Init hyperparameter epoch = 2 # Init network and Model @@ -69,7 +72,7 @@ def test_profiler(): train_ds = create_mindrecord_dataset_for_training() # Model Train model.train(epoch, train_ds) - + # Profiler end profiler.analyse() ``` @@ -78,7 +81,6 @@ def test_profiler(): 启动命令请参考[MindInsight相关命令](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/mindinsight_commands.html)。 - ### 性能分析 用户从训练列表中选择指定的训练,点击性能调试,可以查看该次训练的性能数据。 @@ -88,6 +90,7 @@ def test_profiler(): 图1:性能数据总览 图1展示了性能数据总览页面,包含了迭代轨迹(Step Trace)、算子性能、MindData性能和Timeline等组件的数据总体呈现。各组件展示的数据如下: + - 迭代轨迹:将训练step划分为几个阶段,统计每个阶段的耗时,按时间线进行展示;总览页展示了迭代轨迹图。 - 算子性能:统计单算子以及各算子类型的执行时间,进行排序展示;总览页中展示了各算子类型时间占比的饼状图。 - MindData性能:统计训练数据准备各阶段的性能情况;总览页中展示了各阶段性能可能存在瓶颈的step数目。 @@ -108,6 +111,7 @@ def test_profiler(): 迭代轨迹在做阶段划分时,需要识别前向计算开始的算子和反向计算结束的算子。为了降低用户使用Profiler的门槛,MindSpore会对这两个算子做自动识别,方法为: 前向计算开始的算子指定为`get_next`算子之后连接的第一个算子,反向计算结束的算子指定为最后一次all 
reduce之前连接的算子。**Profiler不保证在所有情况下自动识别的结果和用户的预期一致,用户可以根据网络的特点自行调整**,调整方法如下: + - 设置`FP_POINT`环境变量指定前向计算开始的算子,如`export FP_POINT=fp32_vars/conv2d/BatchNorm`。 - 设置`BP_POINT`环境变量指定反向计算结束的算子,如`export BP_POINT=loss_scale/gradients/AddN_70`。 @@ -120,6 +124,7 @@ def test_profiler(): 图3:算子类别统计分析 图3展示了按算子类别进行统计分析的结果,包含以下内容: + - 可以选择饼图/柱状图展示各算子类别的时间占比,每个算子类别的执行时间会统计属于该类别的算子执行时间总和。 - 统计前20个占比时间最长的算子类别,展示其时间所占的百分比以及具体的执行时间(毫秒)。 @@ -128,6 +133,7 @@ def test_profiler(): 图4:算子统计分析 图4展示了算子性能统计表,包含以下内容: + - 选择全部:按单个算子的统计结果进行排序展示,展示维度包括算子名称、算子类型、算子执行时间、算子全scope名称、算子信息等;默认按算子执行时间排序。 - 选择分类:按算子类别的统计结果进行排序展示,展示维度包括算子分类名称、算子类别执行时间、执行频次、占总时间的比例等。点击每个算子类别,可以进一步查看该类别下所有单个算子的统计信息。 - 搜索:在右侧搜索框中输入字符串,支持对算子名称/类别进行模糊搜索。 @@ -143,6 +149,7 @@ def test_profiler(): 图5展示了MindData性能分析页面,包含迭代间隙和数据处理两个TAB页面。 迭代间隙TAB页主要用来分析数据准备三个阶段是否存在性能瓶颈,数据队列图是分析判断的重要依据: + - 数据队列Size代表Device侧从队列取数据时队列的长度,如果数据队列Size为0,则训练会一直等待,直到队列中有数据才会开始某个step的训练;如果数据队列Size大于0,则训练可以快速取到数据,MindData不是该step的瓶颈所在。 - 主机队列Size可以推断出数据处理和发送速度,如果主机队列Size为0,表示数据处理速度慢而数据发送速度快,需要加快数据处理。 - 如果主机队列Size一直较大,而数据队列的Size持续很小,则数据发送有可能存在性能瓶颈。 @@ -154,20 +161,22 @@ def test_profiler(): 图6展示了数据处理TAB页面,可以对数据处理pipeline做进一步分析。不同的数据算子之间使用队列进行数据交换,队列的长度可以反映出算子处理数据的快慢,进而推断出pipeline中的瓶颈算子所在。 算子队列的平均使用率代表队列中已有数据Size除以队列最大数据Size的平均值,使用率越高说明队列中数据积累越多。算子队列关系展示了数据处理pipeline中的算子以及它们之间的连接情况,点击某个队列可以在下方查看该队列中数据Size随着时间的变化曲线,以及与数据队列连接的算子信息等。对数据处理pipeline的分析有如下建议: + - 当算子左边连接的Queue使用率都比较高,右边连接的Queue使用率比较低,该算子可能是性能瓶颈。 - 对于最左侧的算子,如果其右边所有Queue的使用率都比较低,该算子可能是性能瓶颈。 - 对于最右侧的算子,如果其左边所有Queue的使用率都比较高,该算子可能是性能瓶颈。 对于不同的类型的MindData算子,有如下优化建议: + - 如果Dataset算子是性能瓶颈,建议增加`num_parallel_workers`。 - 如果GeneratorOp类型的算子是性能瓶颈,建议增加`num_parallel_workers`,并尝试将其替换为`MindRecordDataset`。 - 如果MapOp类型的算子是性能瓶颈,建议增加`num_parallel_workers`,如果该算子为Python算子,可以尝试优化脚本。 - 如果BatchOp类型的算子是性能瓶颈,建议调整`prefetch_size`的大小。 - #### Timeline分析 Timeline组件可以展示: + - 算子分配到哪个设备(AICPU、AICore等)执行。 - MindSpore对该网络的流切分策略。 - 算子在Device上的执行序列和执行时长。 @@ -175,6 +184,7 @@ Timeline组件可以展示: 通过分析Timeline,用户可以对训练过程进行细粒度分析:从High 
Level层面,可以分析流切分方法是否合理、迭代间隙和拖尾时间是否过长等;从Low Level层面,可以分析算子执行时间等。 用户可以点击总览页面Timeline部分的下载按钮,将Timeline数据文件 (json格式) 保存至本地,再通过工具查看Timeline的详细信息。推荐使用 `chrome://tracing` 或者 [Perfetto](https://ui.perfetto.dev/#!viewer) 做Timeline展示。 + - Chrome tracing:点击左上角"load"加载文件。 - Perfetto:点击左侧"Open trace file"加载文件。 @@ -183,12 +193,12 @@ Timeline组件可以展示: 图7:Timeline分析 Timeline主要包含如下几个部分: + - Device及其stream list:包含Device上的stream列表,每个stream由task执行序列组成,一个task是其中的一个小方块,大小代表执行时间长短。 - 算子信息:选中某个task后,可以显示该task对应算子的信息,包括名称、type等。 可以使用W/A/S/D来放大、缩小地查看Timeline图信息。 - ## 规格 - 为了控制性能测试时生成数据的大小,大型网络建议性能调试的step数目限制在10以内。 diff --git a/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md b/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md index 6b98ea73a57f15127956b710ac956487978eff51..5ebda9c4bb1285988444a4ae066d03004a80aa2e 100644 --- a/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md +++ b/tutorials/training/source_zh_cn/advanced_use/performance_profiling_gpu.md @@ -18,6 +18,7 @@ ## 概述 + 将训练过程中的算子耗时等信息记录到文件中,通过可视化界面供用户查看分析,帮助用户更高效地调试神经网络性能。 ## 操作流程 @@ -25,7 +26,6 @@ > 操作流程可以参考Ascend 910上profiler的操作: > > - > 普通用户默认情况下无权访问目标设备上的NVIDIA GPU性能计数器。如果普通用户需要在训练脚本中使用profiler性能统计能力,则需参考以下网址的说明进行权限配置。 > > @@ -33,6 +33,7 @@ ## 准备训练脚本 为了收集神经网络的性能数据,需要在训练脚本中添加MindSpore Profiler相关接口。 + - `set_context`之后,需要初始化MindSpore `Profiler`对象,GPU场景下初始化Profiler对象时只有output_path参数有效。 - 在训练结束后,调用`Profiler.analyse`停止性能数据收集并生成性能分析结果。 @@ -49,7 +50,7 @@ class StopAtStep(Callback): self.start_step = start_step self.stop_step = stop_step self.already_analysed = False - + def step_begin(self, run_context): cb_params = run_context.original_args() step_num = cb_params.cur_step_num @@ -62,7 +63,7 @@ class StopAtStep(Callback): if step_num == self.stop_step and not self.already_analysed: self.profiler.analyse() self.already_analysed = True - + def end(self, run_context): if not self.already_analysed: self.profiler.analyse() @@ -74,7 +75,6 @@ class StopAtStep(Callback): 
启动命令请参考[MindInsight相关命令](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/mindinsight_commands.html)。 - ### 性能分析 用户从训练列表中选择指定的训练,点击性能调试,可以查看该次训练的性能数据(目前GPU场景只支持算子耗时排名统计和Timeline功能,其他功能敬请期待)。 @@ -84,6 +84,7 @@ class StopAtStep(Callback): 图1:性能数据总览 图1展示了性能数据总览页面,包含了迭代轨迹(Step Trace)、算子性能、MindData性能和Timeline等组件的数据总体呈现: + - 算子性能:统计单算子以及各算子类型的执行时间,进行排序展示;总览页中展示了各算子类型平均执行时间占比的饼状图。 - Timeline:统计了算子以及CUDA activity,在时间轴排列展示;总览页展示了Timeline中执行情况汇总。 @@ -98,10 +99,12 @@ class StopAtStep(Callback): 图2:算子类别统计分析 图2展示了按算子类别进行统计分析的结果,包含以下内容: + - 可以选择饼图/柱状图展示各算子类别的时间占比,每个算子类别的执行时间会统计属于该类别的算子执行时间总和以及平均执行时间。 - 统计前20个平均执行时间最长的算子类别。 图2下半部分展示了算子性能统计表,包含以下内容: + - 选择全部:按单个算子的统计结果进行排序展示,展示维度包括算子位置(Device/Host)、算子类型、算子执行时间、算子全名等;默认按算子平均执行时间排序。 - 选择分类:按算子类别的统计结果进行排序展示,展示维度包括算子分类名称、算子类别执行时间、执行频次、执行总时间的比例、平均执行时间。点击每个算子类别,可以进一步查看该类别下所有单个算子的统计信息。 - 搜索:在右侧搜索框中输入字符串,支持对算子名称/类别进行模糊搜索。 @@ -120,7 +123,6 @@ class StopAtStep(Callback): GPU场景下,Timeline分析的使用方法和Ascend场景相同,不同之处是,GPU Timeline展示的是算子信息和CUDA activity的信息。使用方法参考: -> 样例代码与Ascend使用方式一致可以参考: +> 样例代码与Ascend使用方式一致可以参考: > > - diff --git a/tutorials/training/source_zh_cn/advanced_use/protect_user_privacy_with_differential_privacy.md b/tutorials/training/source_zh_cn/advanced_use/protect_user_privacy_with_differential_privacy.md index 827919e43ed88e6925f422ba908b3bde2ebb9e9e..d936ee4d962520ee2413ee20fcc9ab40c2d2092f 100644 --- a/tutorials/training/source_zh_cn/advanced_use/protect_user_privacy_with_differential_privacy.md +++ b/tutorials/training/source_zh_cn/advanced_use/protect_user_privacy_with_differential_privacy.md @@ -22,7 +22,7 @@ 差分隐私是一种保护用户数据隐私的机制。什么是隐私,隐私指的是单个用户的某些属性,一群用户的某一些属性可以不看做隐私。例如:“抽烟的人有更高的几率会得肺癌”,这个不泄露隐私,但是“张三抽烟,得了肺癌”,这个就泄露了张三的隐私。如果我们知道A医院,今天就诊的100个病人,其中有10个肺癌,并且我们知道了其中99个人的患病信息,就可以推测剩下一个人是否患有肺癌。这种窃取隐私的行为叫做差分攻击。差分隐私是防止差分攻击的方法,通过添加噪声,使得差别只有一条记录的两个数据集,通过模型推理获得相同结果的概率非常接近。也就是说,用了差分隐私后,攻击者知道的100个人的患病信息和99个人的患病信息几乎是一样的,从而无法推测出剩下1个人的患病情况。 -**机器学习中的差分隐私** +### 机器学习中的差分隐私 
机器学习算法一般是用大量数据并更新模型参数,学习数据特征。在理想情况下,这些算法学习到一些泛化性较好的模型,例如“吸烟患者更容易得肺癌”,而不是特定的个体特征,例如“张三是个吸烟者,患有肺癌”。然而,机器学习算法并不会区分通用特征还是个体特征。当我们用机器学习来完成某个重要的任务,例如肺癌诊断,发布的机器学习模型,可能在无意中透露训练集中的个体特征,恶意攻击者可能从发布的模型获得关于张三的隐私信息,因此使用差分隐私技术来保护机器学习模型是十分必要的。 @@ -32,14 +32,14 @@ $Pr[\mathcal{K}(D)\in S] \le e^{\epsilon} Pr[\mathcal{K}(D') \in S]+\delta$ 对于两个差别只有一条记录的数据集$D, D'$,通过随机算法$\mathcal{K}$,输出为结果集合$S$子集的概率满足上面公式,$\epsilon$为差分隐私预算,$\delta$ 为扰动,$\epsilon, \delta$越小,$\mathcal{K}$在$D, D'$上输出的数据分布越接近。 -**差分隐私的度量** +### 差分隐私的度量 差分隐私可以用$\epsilon, \delta$ 度量。 - $\epsilon$:数据集中增加或者减少一条记录,引起的输出概率可以改变的上限。我们通常希望$\epsilon$是一个较小的常数,值越小表示差分隐私条件越严格。 - $\delta$:用于限制模型行为任意改变的概率,通常设置为一个小的常数,推荐设置小于训练数据集大小的倒数。 -**MindArmour实现的差分隐私** +### MindArmour实现的差分隐私 MindArmour的差分隐私模块Differential-Privacy,实现了差分隐私优化器。目前支持基于高斯机制的差分隐私SGD、Momentum、Adam优化器。其中,高斯噪声机制支持固定标准差的非自适应高斯噪声和随着时间或者迭代步数变化而变化的自适应高斯噪声,使用非自适应高斯噪声的优势在于可以严格控制差分隐私预算$\epsilon$,缺点是在模型训练过程中,每个Step添加的噪声量固定,在训练后期,较大的噪声使得模型收敛困难,甚至导致性能大幅下跌,模型可用性差。自适应噪声很好的解决了这个问题,在模型训练初期,添加的噪声量较大,随着模型逐渐收敛,噪声量逐渐减小,噪声对于模型可用性的影响减小。自适应噪声的缺点是不能严格控制差分隐私预算,在同样的初始值下,自适应差分隐私的$\epsilon$比非自适应的大。同时还提供RDP(R’enyi differential privacy)[2]用于监测差分隐私预算。 @@ -336,7 +336,8 @@ ds_train = generate_mnist_dataset(os.path.join(cfg.data_path, "train"), 5. 结果展示。 不加差分隐私的LeNet模型精度稳定在99%,加了Gaussian噪声,自适应Clip的差分隐私LeNet模型收敛,精度稳定在95%左右。 - ``` + + ```text ============== Starting Training ============== ... ============== Starting Testing ============== diff --git a/tutorials/training/source_zh_cn/advanced_use/save_load_model_hybrid_parallel.md b/tutorials/training/source_zh_cn/advanced_use/save_load_model_hybrid_parallel.md index b72b90fcf2949f4401ca5a5f9cd7126f1a081d41..40bca9703031ef6f25d470fff5602f5cb26655d3 100644 --- a/tutorials/training/source_zh_cn/advanced_use/save_load_model_hybrid_parallel.md +++ b/tutorials/training/source_zh_cn/advanced_use/save_load_model_hybrid_parallel.md @@ -72,9 +72,6 @@ MindSpore模型并行场景下,每个实例进程只保存有本节点对应 4. 
Perform stage 2 training.
 
-
-
-
 ## Integrating the Saved CheckPoint Files
 
 ### Overall Process
@@ -85,18 +82,19 @@ MindSpore模型并行场景下,每个实例进程只保存有本节点对应
 
 Finally, save the updated parameter list to a file through the API provided by MindSpore to generate a new CheckPoint file. This corresponds to Step 4 in the following figure.
 
-![img](./images/checkpoint_integration_process.jpg) 
+![img](./images/checkpoint_integration_process.jpg)
 
 ### Preparations
 
 #### Importing the CheckPoint Files in Logical Order
 
 Define the network, call the `load_checkpoint` and `load_param_into_net` interfaces to import the CheckPoint files into the network in logical order, and then call the `parameters_and_names` interface to obtain all parameter data in the network.
-```
-net = Net() 
+
+```python
+net = Net()
 opt = Momentum(learning_rate=0.01, momentum=0.9, params=net.get_parameters())
 net = TrainOneStepCell(net, opt)
-param_dicts = [] 
+param_dicts = []
 for i in range(rank_size):
     file_name = os.path.join("./node"+str(i), "CKP_1-4_32.ckpt") # checkpoint file name of current node
     param_dict = load_checkpoint(file_name)
@@ -116,7 +114,8 @@ for i in range(rank_size):
 #### Obtaining the Model Parameter Slicing Strategy
 
 Call the `build_searched_strategy` interface to obtain the slicing strategy of each model parameter.
-```
+
+```python
 strategy = build_searched_strategy("./strategy_train.cpkt")
 ```
 
@@ -130,45 +129,48 @@ strategy = build_searched_strategy("./strategy_train.cpkt")
 
 The parameter name is "model_parallel_weight", and the slicing logic targets a 4-device scenario.
 
-1. For parameters involved in model parallelism, obtain the parameter data on all nodes. 
+1. For parameters involved in model parallelism, obtain the parameter data on all nodes.
 
-    ```
+    ```python
     sliced_parameters = []
     for i in range(4):
         parameter = param_dicts[i].get("model_parallel_weight")
         sliced_parameters.append(parameter)
     ```
+
     > To keep the parameter update speed unchanged, the parameters saved in the optimizer, such as "moments.model_parallel_weight", must be merged in the same way.
 
 2. Call the `merge_sliced_parameter` interface to merge the parameters.
 
-    ```
-    merged_parameter = merge_sliced_parameter(sliced_parameters, strategy) 
+    ```python
+    merged_parameter = merge_sliced_parameter(sliced_parameters, strategy)
     ```
 
     > If there are multiple model parallel parameters, repeat steps 1 and 2 for each of them.
 
 ### Saving the Data and Generating a New CheckPoint File
 
-1. Convert `param_dict` to list data. 
+1.
将`param_dict`转换为list类型数据。 - ``` + ```python param_list = [] for (key, value) in param_dict.items(): - each_param = {} - each_param["name"] = key - if isinstance(value.data, Tensor): - param_data = value.data - else: - param_data = Tensor(value.data) - each_param["data"] = param_data + each_param = {} + each_param["name"] = key + if isinstance(value.data, Tensor): + param_data = value.data + else: + param_data = Tensor(value.data) + each_param["data"] = param_data param_list.append(each_param) ``` 2. 调用`save_checkpoint`接口,将参数数据写入文件,生成新的CheckPoint文件。 - ``` + + ```python save_checkpoint(param_list, “./CKP-Integrated_1-4_32.ckpt”) ``` + 其中, - `save_checkpoint`: 通过该接口将网络模型参数信息存入文件。 - `CKP-Integrated_1-4_32.ckpt`: 新生成的CheckPoint模型参数文件名称。 @@ -185,7 +187,7 @@ strategy = build_searched_strategy("./strategy_train.cpkt") 调用`load_checkpoint`接口,从CheckPoint文件中加载模型参数数据。 -``` +```python param_dict = load_checkpoint("./CKP-Integrated_1-4_32.ckpt") ``` @@ -204,7 +206,8 @@ param_dict = load_checkpoint("./CKP-Integrated_1-4_32.ckpt") 1. 对模型参数数据做切分。 如下代码示例,在维度0上,将数据切分为两个切片。 - ``` + + ```python new_param = parameter_dict[“model_parallel_weight”] slice_list = np.split(new_param.data.asnumpy(), 2, axis=0) new_param_moments = parameter_dict[“moments.model_parallel_weight”] @@ -213,24 +216,28 @@ param_dict = load_checkpoint("./CKP-Integrated_1-4_32.ckpt") 切分后的数据情况: - slice_list[0] --- [1, 2, 3, 4] 对应device0 - slice_list[1] --- [5, 6, 7, 8] 对应device1 + ```text + slice_list[0] --- [1, 2, 3, 4] 对应device0 + slice_list[1] --- [5, 6, 7, 8] 对应device1 + ``` 与`slice_list`类似,`slice_moments_list` 也被切分为两个shape为[1, 4]的Tensor。 -2. 在每个节点分别加载对应的数据切片。 +2. 在每个节点分别加载对应的数据切片。 获取本节点的rank_id,根据rank_id加载数据。 - ``` + + ```python rank = get_rank() tensor_slice = Tensor(slice_list[rank]) tensor_slice_moments = Tensor(slice_moments_list[rank]) ``` - - `get_rank`:获取当前设备在集群中的ID。 -3. 修改模型参数数据值。 + - `get_rank`:获取当前设备在集群中的ID。 - ``` +3. 
修改模型参数数据值。 + + ```python new_param.set_data(tensor_slice, True) new_param_moments.set_data(tensor_slice_moments, True) ``` @@ -240,8 +247,9 @@ param_dict = load_checkpoint("./CKP-Integrated_1-4_32.ckpt") ### 步骤3:将修改后的参数数据加载到网络中 调用`load_param_into_net`接口,将模型参数数据加载到网络中。 -``` -net = Net() + +```python +net = Net() opt = Momentum(learning_rate=0.01, momentum=0.9, params=parallel_net.get_parameters()) load_param_into_net(net, param_dict) load_param_into_net(opt, param_dict) @@ -266,43 +274,44 @@ load_param_into_net(opt, param_dict) > > 本文档附上对CheckPoint文件做合并处理以及分布式训练前加载CheckPoint文件的示例代码,仅作为参考,实际请参考具体情况实现。 -### 示例代码 +### 示例代码 1. 执行脚本对CheckPoint文件做合并处理。 - 脚本执行命令: - ``` + 脚本执行命令: + + ```bash python ./integrate_checkpoint.py "待合并的CheckPoint文件名称" "合并生成的CheckPoint文件路径&名称" "策略文件路径&名称" "节点数" ``` integrate_checkpoint.py: - ``` + ```python import numpy as np import os import mindspore.nn as nn from mindspore import Tensor, Parameter from mindspore.ops import operations as P from mindspore.train.serialization import save_checkpoint, load_checkpoint, build_searched_strategy, merge_sliced_parameter - + class Net(nn.Cell): def __init__(self,weight_init): super(Net, self).__init__() self.weight = Parameter(Tensor(weight_init), "model_parallel_weight", layerwise_parallel=True) self.fc = P.MatMul(transpose_b=True) - + def construct(self, x): x = self.fc(x, self.weight1) return x - + def integrate_ckpt_file(old_ckpt_file, new_ckpt_file, strategy_file, rank_size): weight = np.ones([2, 8]).astype(np.float32) net = Net(weight) opt = Momentum(learning_rate=0.01, momentum=0.9, params=net.get_parameters()) net = TrainOneStepCell(net, opt) - + # load CheckPoint into net in rank id order - param_dicts = [] + param_dicts = [] for i in range(rank_size): file_name = os.path.join("./node"+str(i), old_ckpt_file) param_dict = load_checkpoint(file_name) @@ -311,21 +320,21 @@ load_param_into_net(opt, param_dict) for _, param in net.parameters_and_names(): param_dict[param.name] = param 
             param_dicts.append(param_dict)
-    
+
         strategy = build_searched_strategy(strategy_file)
         param_dict = {}
-    
+
         for paramname in ["model_parallel_weight", "moments.model_parallel_weight"]:
             # get layer wise model parallel parameter
             sliced_parameters = []
             for i in range(rank_size):
                 parameter = param_dicts[i].get(paramname)
                 sliced_parameters.append(parameter)
-    
+
             # merge the parallel parameters of the model
-            merged_parameter = merge_sliced_parameter(sliced_parameters, strategy) 
+            merged_parameter = merge_sliced_parameter(sliced_parameters, strategy)
             param_dict[paramname] = merged_parameter
-    
+
         # convert param_dict to list type data
         param_list = []
         for (key, value) in param_dict.items():
@@ -335,14 +344,14 @@ load_param_into_net(opt, param_dict)
                 param_data = value.data
             else:
                 param_data = Tensor(value.data)
-            each_param["data"] = param_data 
-            param_list.append(each_param) 
-    
+            each_param["data"] = param_data
+            param_list.append(each_param)
+
         # call the API to generate a new CheckPoint file
         save_checkpoint(param_list, new_ckpt_file)
-    
+
         return
-    
+
     if __name__ == "__main__":
         try:
             old_ckpt_file = sys.argv[1]
@@ -354,14 +363,15 @@ load_param_into_net(opt, param_dict)
             print("Fail to integrate checkpoint file)
             sys.exit(-1)
     ```
-    
+
     Execution result:
 
     Parameter values in the CheckPoint files before the script is executed:
-    ```
+
+    ```text
     device0:
        name is model_parallel_weight
-       value is 
+       value is
        [[0.87537426 1.0448935 0.86736983 0.8836905 0.77354026 0.69588304 0.9183654 0.7792076]
         [0.87224025 0.8726848 0.771446 0.81967723 0.88974726 0.7988162 0.72919345 0.7677011]]
        name is learning_rate
@@ -372,10 +382,10 @@ load_param_into_net(opt, param_dict)
        value is
        [[0.2567724 -0.07485991 0.282002 0.2456022 0.454939 0.619168 0.18964815 0.45714882]
         [0.25946522 0.24344791 0.45677605 0.3611395 0.23378398 0.41439137 0.5312468 0.4696194]]
-    
+
     device1:
        name is model_parallel_weight
-       value is 
+       value is
        [[0.9210751 0.9050457 0.9827775 0.920396 0.9240526 0.9750359 1.0275179 1.0819869]
         [0.73605865 0.84631145 0.9746683 0.9386582 0.82902765 0.83565056 0.9702136
1.0514659]]
        name is learning_rate
@@ -385,11 +395,11 @@ load_param_into_net(opt, param_dict)
        name is moments.model_weight
        value is
        [[0.2417504 0.28193963 0.06713893 0.21510397 0.23380603 0.11424308 0.0218009 -0.11969765]
-        [0.45955992 0.22664294 0.01990281 0.0731914 0.27125207 0.27298513 -0.01716102 -0.15327111]] 
-    
+        [0.45955992 0.22664294 0.01990281 0.0731914 0.27125207 0.27298513 -0.01716102 -0.15327111]]
+
     device2:
        name is model_parallel_weight
-       value is 
+       value is
        [[1.0108461 0.8689414 0.91719437 0.8805056 0.7994629 0.8999671 0.7585804 1.0287056 ]
         [0.90653455 0.60146594 0.7206475 0.8306303 0.8364681 0.89625114 0.7354735 0.8447268]]
        name is learning_rate
@@ -397,10 +407,10 @@ load_param_into_net(opt, param_dict)
        name is momentum
        value is [0.9]
        name is moments.model_weight
-       value is 
+       value is
        [[0.03440702 0.41419312 0.24817684 0.30765256 0.48516113 0.24904746 0.57791173 0.00955463]
         [0.13458519 0.6690533 0.49259356 0.28319967 0.25951773 0.16777472 0.45696738 0.24933104]]
-    
+
     device3:
        name is model_parallel_weight
        value is
@@ -411,16 +421,16 @@ load_param_into_net(opt, param_dict)
        name is momentum
        value is [0.9]
        name is moments.model_parallel_weight
-       value is 
+       value is
        [[0.14152306 0.5040985 0.24455397 0.10907605 0.11319532 0.19538902 0.01208619 0.40430856]
         [-0.7773164 -0.47611716 -0.6041424 -0.6144473 -0.2651842 -0.31909415 -0.4510405 -0.12860501]]
     ```
 
     Parameter values in the CheckPoint file after the script is executed:
 
-    ```
+    ```text
     name is model_parallel_weight
-    value is 
+    value is
     [[1.1138763 1.0962057 1.3516843 1.0812817 1.1579804 1.1078343 1.0906502 1.3207073]
      [0.916671 1.0781671 1.0368758 0.9680898 1.1735439 1.0628364 0.9960786 1.0135143]
      [0.8828271 0.7963984 0.90675324 0.9830291 0.89010954 0.897052 0.7890109 0.89784735]
@@ -434,7 +444,7 @@ load_param_into_net(opt, param_dict)
     name is momentum
     value is [0.9]
     name is moments.model_parallel_weight
-    value is 
+    value is
     [[0.2567724 -0.07485991 0.282002 0.2456022 0.454939 0.619168 0.18964815 0.45714882]
      [0.25946522 0.24344791 0.45677605 0.3611395
0.23378398 0.41439137 0.5312468 0.4696194 ]
      [0.2417504 0.28193963 0.06713893 0.21510397 0.23380603 0.11424308 0.0218009 -0.11969765]
@@ -446,10 +456,9 @@ load_param_into_net(opt, param_dict)
       -0.12860501]]
     ```
 
-
 2. Perform stage 2 training and load the CheckPoint file before training. The training code itself needs to be supplemented according to the actual situation.
 
-    ```
+    ```python
     import numpy as np
     import os
     import mindspore.nn as nn
@@ -458,24 +467,24 @@ load_param_into_net(opt, param_dict)
     from mindspore import Tensor, Parameter
     from mindspore.ops import operations as P
     from mindspore.train.serialization import load_checkpoint, load_param_into_net
-    
+
     from mindspore.communication.management import init
     devid = int(os.getenv('DEVICE_ID'))
     context.set_context(mode=context.GRAPH_MODE,device_target='Ascend',save_graphs=True, device_id=devid)
     init()
-    
+
     class Net(nn.Cell):
         def __init__(self,weight_init):
             super(Net, self).__init__()
             self.weight = Parameter(Tensor(weight_init), "model_parallel_weight", layerwise_parallel=True)
             self.fc = P.MatMul(transpose_b=True)
-    
+
         def construct(self, x):
             x = self.fc(x, self.weight1)
             return x
     def train_mindspore_impl_fc(input, label, ckpt_file):
         param_dict = load_checkpoint(ckpt_file)
-    
+
         for paramname in ["model_parallel_weight", "moments.model_parallel_weight"]:
             # get layer wise model parallel parameter
             new_param = parameter_dict[paramname]
@@ -486,23 +495,23 @@ load_param_into_net(opt, param_dict)
             tensor_slice = Tensor(slice_list[rank])
             # modify model parameter data values
             new_param.set_data(tensor_slice, True)
-    
+
         # load the modified parameter data into the network
         weight = np.ones([4, 8]).astype(np.float32)
         net = Net(weight)
         load_param_into_net(net, param_dict)
         opt = Momentum(learning_rate=0.01, momentum=0.9, params=parallel_net.get_parameters())
         load_param_into_net(opt, param_dict)
-        # train code 
+        # train code
         ...
-    
+
     if __name__ == "__main__":
         input = np.random.random((4, 8)).astype(np.float32)
         print("mean = ", np.mean(input,axis=1, keepdims=True))
         label = np.random.random((4, 4)).astype(np.float32)
         train_mindspore_impl_fc(input, label, weight1)
     ```
-    
+
     Where:
 
     - `mode=context.GRAPH_MODE`: Distributed training requires the running mode to be set to graph mode (PyNative mode does not support parallelism).
@@ -511,10 +520,10 @@ load_param_into_net(opt, param_dict)
 
     Parameter values after loading:
 
-    ```
+    ```text
     device0:
        name is model_parallel_weight
-       value is 
+       value is
        [[0.87537426 1.0448935 0.86736983 0.8836905 0.77354026 0.69588304 0.9183654 0.7792076]
         [0.87224025 0.8726848 0.771446 0.81967723 0.88974726 0.7988162 0.72919345 0.7677011]
         [0.8828271 0.7963984 0.90675324 0.9830291 0.89010954 0.897052 0.7890109 0.89784735]
@@ -532,7 +541,7 @@ load_param_into_net(opt, param_dict)
 
     device1:
        name is model_parallel_weight
-       value is 
+       value is
        [[1.0053468 0.98402303 0.99762845 0.97587246 1.0259694 1.0055295 0.99420834 0.9496847]
         [1.0851002 1.0295962 1.0999886 1.0958165 0.9765328 1.146529 1.0970603 1.1388365]
         [0.7147005 0.9168278 0.80178416 0.6258351 0.8413766 0.5909515 0.696347 0.71359116]
@@ -546,5 +555,5 @@ load_param_into_net(opt, param_dict)
        [[0.03440702 0.41419312 0.24817684 0.30765256 0.48516113 0.24904746 0.57791173 0.00955463]
         [0.13458519 0.6690533 0.49259356 0.28319967 0.25951773 0.16777472 0.45696738 0.24933104]
         [0.14152306 0.5040985 0.24455397 0.10907605 0.11319532 0.19538902 0.01208619 0.40430856]
-        [-0.7773164 -0.47611716 -0.6041424 -0.6144473 -0.2651842 -0.31909415 -0.4510405 -0.12860501]] 
+        [-0.7773164 -0.47611716 -0.6041424 -0.6144473 -0.2651842 -0.31909415 -0.4510405 -0.12860501]]
     ```
diff --git a/tutorials/training/source_zh_cn/advanced_use/summary_record.md b/tutorials/training/source_zh_cn/advanced_use/summary_record.md
index 40d9a9fde98731b0c709c73e744684137d12afec..dd3da7678f7e6f154d2127ae436d02a5a6937991 100644
--- a/tutorials/training/source_zh_cn/advanced_use/summary_record.md
+++ b/tutorials/training/source_zh_cn/advanced_use/summary_record.md
@@ -17,7 +17,7 @@
   - + ## 概述 @@ -43,6 +43,7 @@ MindSpore目前支持三种方式将数据记录到summary日志文件中。 即可自动收集一些常见信息。`SummaryCollector` 详细的用法可以参考 `API` 文档中 `mindspore.train.callback.SummaryCollector`。 样例代码如下: + ```python import mindspore import mindspore.nn as nn @@ -131,6 +132,7 @@ model.eval(ds_eval, callbacks=[summary_collector]) MindSpore除了提供 `SummaryCollector` 能够自动收集一些常见数据,还提供了Summary算子,支持在网络中自定义收集其他的数据,比如每一个卷积层的输入,或在损失函数中的损失值等。 当前支持的Summary算子: + - [ScalarSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ScalarSummary):记录标量数据 - [TensorSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.TensorSummary):记录张量数据 - [ImageSummary](https://www.mindspore.cn/doc/api_python/zh-CN/master/mindspore/mindspore.ops.html#mindspore.ops.ImageSummary):记录图片数据 @@ -254,18 +256,18 @@ MindSpore支持自定义Callback, 并允许在自定义Callback中将数据记 样例代码如下: -``` +```python from mindspore.train.callback import Callback from mindspore.train.summary import SummaryRecord class ConfusionMatrixCallback(Callback): def __init__(self, summary_dir): self._summary_dir = summary_dir - + def __enter__(self): # init you summary record in here, when the train script run, it will be inited before training self.summary_record = SummaryRecord(summary_dir) - + def __exit__(self, *exc_args): # Note: you must close the summary record, it will release the process pool resource # else your training script will not exit from training. 
@@ -276,7 +278,7 @@ class ConfusionMatrixCallback(Callback): cb_params = run_context.run_context.original_args() # create a confusion matric image, and record it to summary file - confusion_martrix = create_confusion_matrix(cb_params) + confusion_martrix = create_confusion_matrix(cb_params) self.summary_record.add_value('image', 'confusion_matrix', confusion_matric) self.summary_record.record(cb_params.cur_step) @@ -288,31 +290,34 @@ model.train(cnn_network, train_dataset=train_ds, callbacks=[confusion_martrix]) ``` 上面的三种方式,支持记录计算图, 损失值等多种数据。除此以外,MindSpore还支持保存训练中其他阶段的计算图,通过 -将训练脚本中 `context.set_context` 的 `save_graphs` 选项设置为 `True`, 可以记录其他阶段的计算图,其中包括算子融合后的计算图。 +将训练脚本中 `context.set_context` 的 `save_graphs` 选项设置为 `True`, 可以记录其他阶段的计算图,其中包括算子融合后的计算图。 在保存的文件中,`ms_output_after_hwopt.pb` 即为算子融合后的计算图,可以使用可视化页面对其进行查看。 ## 运行MindInsight + 按照上面教程完成数据收集后,启动MindInsight,即可可视化收集到的数据。启动MindInsight时, 需要通过 `--summary-base-dir` 参数指定summary日志文件目录。 其中指定的summary日志文件目录可以是一次训练的输出目录,也可以是多次训练输出目录的父目录。 - 一次训练的输出目录结构如下: -``` + +```text └─summary_dir events.out.events.summary.1596869898.hostname_MS events.out.events.summary.1596869898.hostname_lineage ``` 启动命令: + ```Bash mindinsight start --summary-base-dir ./summary_dir ``` 多次训练的输出目录结构如下: -``` + +```text └─summary ├─summary_dir1 │ events.out.events.summary.1596869898.hostname_MS @@ -324,6 +329,7 @@ mindinsight start --summary-base-dir ./summary_dir ``` 启动命令: + ```Bash mindinsight start --summary-base-dir ./summary ``` @@ -331,13 +337,13 @@ mindinsight start --summary-base-dir ./summary 启动成功后,通过浏览器访问 `http://127.0.0.1:8080` 地址,即可查看可视化页面。 停止MindInsight命令: + ```Bash mindinsight stop ``` 更多参数设置,请点击查看[MindInsight相关命令](https://www.mindspore.cn/tutorial/training/zh-CN/master/advanced_use/mindinsight_commands.html)页面。 - ## 注意事项 1. 为了控制列出summary文件目录的用时,MindInsight最多支持发现999个summary文件目录。 @@ -349,7 +355,8 @@ mindinsight stop 自定义callback中如果使用 `SummaryRecord`,则其不能和 `SummaryCollector` 同时使用。 正确代码: - ``` + + ```python ... 
summary_collector = SummaryCollector('./summary_dir') model.train(2, train_dataset, callbacks=[summary_collector]) @@ -359,7 +366,8 @@ mindinsight stop ``` 错误代码: - ``` + + ```python ... summary_collector1 = SummaryCollector('./summary_dir1') summary_collector2 = SummaryCollector('./summary_dir2') @@ -367,7 +375,8 @@ mindinsight stop ``` 错误代码: - ``` + + ```python ... # Note: the 'ConfusionMatrixCallback' is user-defined, and it uses SummaryRecord to record data. confusion_callback = ConfusionMatrixCallback('./summary_dir1') @@ -377,4 +386,4 @@ mindinsight stop 3. 每个summary日志文件目录中,应该只放置一次训练的数据。一个summary日志目录中如果存放了多次训练的summary数据,MindInsight在可视化数据时会将这些训练的summary数据进行叠加展示,可能会与预期可视化效果不相符。 -4. 当前 `SummaryCollector` 和 `SummaryRecord` 不支持GPU多卡运行的场景。 \ No newline at end of file +4. 当前 `SummaryCollector` 和 `SummaryRecord` 不支持GPU多卡运行的场景。 diff --git a/tutorials/training/source_zh_cn/advanced_use/test_model_security_fuzzing.md b/tutorials/training/source_zh_cn/advanced_use/test_model_security_fuzzing.md index dde3f397c448a934218d4e57f322fb574b96898d..148399f8fda9f614f9499852ceb35b0ae870d884 100644 --- a/tutorials/training/source_zh_cn/advanced_use/test_model_security_fuzzing.md +++ b/tutorials/training/source_zh_cn/advanced_use/test_model_security_fuzzing.md @@ -10,7 +10,7 @@ - [导入需要的库文件](#导入需要的库文件) - [参数配置](#参数配置) - [运用Fuzz Testing](#运用fuzz-testing) - +    @@ -75,7 +75,7 @@ context.set_context(mode=context.GRAPH_MODE, device_target="Ascend") images = data[0].asnumpy().astype(np.float32) train_images.append(images) train_images = np.concatenate(train_images, axis=0) - + # get test data data_list = "../common/dataset/MNIST/test" batch_size = 32 @@ -105,7 +105,7 @@ context.set_context(mode=context.GRAPH_MODE, device_target="Ascend") 中对应的类方法。算法随机选择参数,则`params`设置为`'auto_param': [True]`,参数将在推荐范围内随机生成。 基于对抗攻击方法的参数配置请参考对应的攻击方法类。 - + 下面时变异方法及其参数配置的一个例子: ```python @@ -174,12 +174,12 @@ context.set_context(mode=context.GRAPH_MODE, device_target="Ascend") ``` 6. 
实验结果。 - + fuzzing的返回结果中包含了5个数据:fuzz生成的样本fuzz_samples、生成样本的真实标签true_labels、被测模型对于生成样本的预测值fuzz_preds、 生成样本使用的变异方法fuzz_strategies、fuzz testing的评估报告metrics_report。用户可使用这些返回结果进一步的分析模型的鲁棒性。这里只展开metrics_report,查看fuzz testing后的各个评估指标。 ```python if metrics: - for key in metrics: + for key in metrics: LOGGER.info(TAG, key + ': %s', metrics[key]) ``` @@ -199,4 +199,4 @@ context.set_context(mode=context.GRAPH_MODE, device_target="Ascend") ​ Fuzz生成的变异图片: - ![fuzz_res](./images/fuzz_res.png) \ No newline at end of file + ![fuzz_res](./images/fuzz_res.png) diff --git a/tutorials/training/source_zh_cn/advanced_use/test_model_security_membership_inference.md b/tutorials/training/source_zh_cn/advanced_use/test_model_security_membership_inference.md index a5057865f58984a0b5665591d4195d6846e36bf5..559e1faf214329efc226a82f76906650d7a70243 100644 --- a/tutorials/training/source_zh_cn/advanced_use/test_model_security_membership_inference.md +++ b/tutorials/training/source_zh_cn/advanced_use/test_model_security_membership_inference.md @@ -10,7 +10,6 @@ - [建立模型](#建立模型) - [运用MembershipInference进行隐私安全评估](#运用membershipinference进行隐私安全评估) - [参考文献](#参考文献) -    @@ -30,7 +29,9 @@ ## 实现阶段 ### 导入需要的库文件 + #### 引入相关包 + 下面是我们需要的公共模块、MindSpore相关模块和MembershipInference特性模块,以及配置日志标签和日志等级。 ```python @@ -57,9 +58,11 @@ LOGGER = LogUtil.get_instance() TAG = "MembershipInference_test" LOGGER.set_level("INFO") ``` + ### 加载数据集 这里采用的是CIFAR-100数据集,您也可以采用自己的数据集,但要保证传入的数据仅有两项属性"image"和"label"。 + ```python # Generate CIFAR-100 data. def vgg_create_dataset100(data_home, image_size, batch_size, rank_id=0, rank_size=1, repeat_num=1, @@ -111,9 +114,11 @@ def vgg_create_dataset100(data_home, image_size, batch_size, rank_id=0, rank_siz return data_set ``` + ### 建立模型 这里以VGG16模型为例,您也可以替换为自己的模型。 + ```python def _make_layer(base, args, batch_norm): """Make stage network of VGG.""" @@ -178,10 +183,11 @@ def vgg16(num_classes=1000, args=None, phase="train"): ``` ### 运用MembershipInference进行隐私安全评估 + 1. 
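Build the VGG16 model and load the parameter file.
-
+
     The pretrained VGG16 parameters are loaded directly here; you can also train the network above yourself.
-
+
     ```python
     ...
     # load parameter
@@ -195,8 +201,8 @@ def vgg16(num_classes=1000, args=None, phase="train"):
     args.padding = 0
     args.pad_mode = "same"
     args.weight_decay = 5e-4
-    args.loss_scale = 1.0
-
+    args.loss_scale = 1.0
+
     # Load the pretrained model.
     net = vgg16(num_classes=100, args=args)
     loss = nn.SoftmaxCrossEntropyWithLogits(sparse=True)
@@ -205,7 +211,7 @@ def vgg16(num_classes=1000, args=None, phase="train"):
     load_param_into_net(net, load_checkpoint(args.pre_trained))
     model = Model(network=net, loss_fn=loss, optimizer=opt)
     ```
-
+
 2. Load the CIFAR-100 dataset and split it 8:2 into the training set and test set for the membership inference model.
     ```python
@@ -221,9 +227,9 @@ def vgg16(num_classes=1000, args=None, phase="train"):
     ```
 3. Configure the inference parameters and evaluation parameters
-
+
     Set the methods and parameters used for membership inference. The inference methods currently supported are KNN, LR, MLPClassifier, and RandomForestClassifier. The inference parameters are given as a list, with each method represented by a dictionary whose keys are "method" and "params".
-
+
     ```python
     config = [
         {
@@ -232,7 +238,7 @@ def vgg16(num_classes=1000, args=None, phase="train"):
                 "C": np.logspace(-4, 2, 10)
             }
         },
-        {
+        {
             "method": "knn",
             "params": {
                 "n_neighbors": [3, 5, 7]
@@ -258,13 +264,13 @@ def vgg16(num_classes=1000, args=None, phase="train"):
         }
     ]
     ```
-
+
     By convention, samples labeled as belonging to the training dataset are the positive class, and samples labeled as belonging to the test set are the negative class. Set the evaluation metrics. Three metrics are currently supported:
-    * Accuracy (accuracy): the proportion of correctly inferred samples among all samples.
-    * Precision (precision): the proportion of correctly inferred positive samples among all samples inferred as positive.
-    * Recall (recall): the proportion of correctly inferred positive samples among all positive samples.
+    - Accuracy (accuracy): the proportion of correctly inferred samples among all samples.
+    - Precision (precision): the proportion of correctly inferred positive samples among all samples inferred as positive.
+    - Recall (recall): the proportion of correctly inferred positive samples among all positive samples.
     When the number of samples is large enough, if all of the above metrics are greater than 0.6, we consider the target model to be at risk of privacy leakage.
-
+
     ```python
     metrics = ["precision", "accuracy", "recall"]
     ```
@@ -273,11 +279,11 @@ def vgg16(num_classes=1000, args=None, phase="train"):
     ```python
     inference = MembershipInference(model) # Get inference model.
-
+
     inference.train(train_train, train_test, config) # Train inference model.
     msg = "Membership inference model training completed."
     LOGGER.info(TAG, msg)
-
+
     result = inference.eval(eval_train, eval_test, metrics) # Eval metrics.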
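     count = len(config)
     for i in range(count):
@@ -286,16 +292,16 @@ def vgg16(num_classes=1000, args=None, phase="train"):
 5. Experiment results.
     Run the following command to start membership inference training and evaluation:
-
-    ```
+
+    ```bash
     python example_vgg_cifar.py --data_path ./cifar-100-binary/ --pre_trained ./VGG16-100_781.ckpt
     ```
     The membership inference metrics are shown below; each value is rounded to four decimal places.
     Take the first row as an example: when lr (logistic regression classification) is used for membership inference, the inference accuracy is 0.7132, the inference precision is 0.6596, and the recall of positive samples is 0.8810, which means that lr has a 71.32% probability of correctly telling whether a data sample belongs to the target model's training dataset. For a binary classification task, these metrics indicate that membership inference is effective, that is, the model is at risk of privacy leakage.
-
-    ```
+
+    ```text
     Method: lr, {'recall': 0.8810,'precision': 0.6596,'accuracy': 0.7132}
     Method: knn, {'recall': 0.7082,'precision': 0.5613,'accuracy': 0.5774}
     Method: mlp, {'recall': 0.6729,'precision': 0.6462,'accuracy': 0.6522}
@@ -303,4 +309,5 @@ def vgg16(num_classes=1000, args=None, phase="train"):
     ```
 ## References
+
 [1] [Shokri R , Stronati M , Song C , et al. Membership Inference Attacks against Machine Learning Models[J].](https://arxiv.org/abs/1610.05820v2)
diff --git a/tutorials/training/source_zh_cn/advanced_use/use_on_the_cloud.md b/tutorials/training/source_zh_cn/advanced_use/use_on_the_cloud.md
index 47680705b2a44f4f25d103ab25612414410df8e7..bf3dfd79ce8be8343042b7b90500edf752e212df 100644
--- a/tutorials/training/source_zh_cn/advanced_use/use_on_the_cloud.md
+++ b/tutorials/training/source_zh_cn/advanced_use/use_on_the_cloud.md
@@ -53,7 +53,7 @@ ModelArts uses Object Storage Service (OBS) for
 2. Create your own OBS bucket (for example, ms-dataset), create a data directory in the bucket (for example, cifar-10), and upload the CIFAR-10 data to the data directory in the following structure.
-    ```
+    ```text
     └─Object Storage/ms-dataset/cifar-10
         ├─train
         │      data_batch_1.bin
@@ -73,7 +73,7 @@ ModelArts uses Object Storage Service (OBS) for
 To make it easier to create training jobs later, first create the training output directory and the log output directory. The directory structure created in this example is as follows:
-```
+```text
 └─Object Storage/resnet50-train
     ├─resnet50_cifar10_train
     │      dataset.py
@@ -87,7 +87,7 @@ ModelArts uses Object Storage Service (OBS) for
 The scripts provided in the "Preparing the Execution Script" section can run directly on ModelArts; skip this section if you only want to quickly try training ResNet-50 on CIFAR-10. To run a custom MindSpore script or more MindSpore sample code on ModelArts, adapt the MindSpore code as described in this section.
-### Adapting Script Parameters
+### Adapting Script Parameters
 1. 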
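Scripts that run on ModelArts must configure `data_url` and `train_url`, which correspond to the data storage path (an OBS path) and the training output path (an OBS path), respectively.
@@ -125,13 +125,14 @@ MindSpore does not yet provide an interface for direct access to OBS data, so MoXing must be used
     ```
 ### Adapting to 8-Device Training Jobs
+
 To run the script in an `8*Ascend` environment, adapt the dataset creation code and the local data path, and configure the distributed strategy. By reading the `DEVICE_ID` and `RANK_SIZE` environment variables, users can build a training script that works for both the `1*Ascend` and `8*Ascend` specifications.
 1. Adapt the local path.
     ```python
     import os
-
+
     device_num = int(os.getenv('RANK_SIZE'))
     device_id = int(os.getenv('DEVICE_ID'))
     # define local data path
@@ -311,7 +312,6 @@ ModelArts Tutorial
-
 ## Overview
 Regression algorithms usually use a set of attributes to predict a value, and the predicted value is continuous. Examples include predicting a house price from features such as floor area and number of bedrooms, or predicting future temperatures from the past week's temperature changes and satellite cloud images. If the actual price of a house is 5 million yuan and regression analysis predicts 4.99 million yuan, the regression analysis is considered fairly good. Common regression analyses in machine learning include linear regression, polynomial regression, and logistic regression. This example introduces the linear regression algorithm and walks through linear regression training with MindSpore.
@@ -47,7 +46,6 @@
 Set the MindSpore running configuration
-
 ```python
 from mindspore import context
@@ -66,7 +64,6 @@ context.set_context(mode=context.GRAPH_MODE, device_target="CPU")
 `get_data` generates the training and test datasets. Because linear data is being fitted, assume the target function to fit is $f(x)=2x+3$. The training dataset should then be randomly distributed around the function, so it is generated as $f(x)=2x+3+noise$, where `noise` is a random value that follows the standard normal distribution.
-
 ```python
 import numpy as np
@@ -80,7 +77,6 @@ def get_data(num, w=2.0, b=3.0):
 Use `get_data` to generate 50 groups of test data and visualize them.
-
 ```python
 import matplotlib.pyplot as plt
@@ -97,10 +93,8 @@ plt.show()
 Output:
-
 ![png](./images/linear_regression_eval_datasets.png)
-
 In the figure above, the green line is the target function and the red points are the validation data `eval_data`.
 ### Defining the Data Augmentation Function
@@ -111,7 +105,6 @@ plt.show()
 - `batch`: combines `batch_size` pieces of data into one batch.
 - `repeat`: multiplies the size of the dataset.
-
 ```python
 from mindspore import dataset as ds
@@ -124,13 +117,12 @@ def create_dataset(num_data, batch_size=16, repeat_size=1):
 Use the dataset augmentation function to generate training data and check the format of the training data.
-
 ```python
 num_data = 1600
 batch_size = 16
 repeat_size = 1
-ds_train = create_dataset(num_data, batch_size=batch_size, repeat_size=repeat_size)
+ds_train = create_dataset(num_data, batch_size=batch_size, repeat_size=repeat_size)
 print("The dataset size of ds_train:", ds_train.get_dataset_size())
 dict_datasets = ds_train.create_dict_iterator().get_next()
@@ -141,11 +133,12 @@ print("The y label value shape:", dict_datasets["label"].shape)
 Output:
-    The dataset size 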
of ds_train: 100
-    dict_keys(['data', 'label'])
-    The x label value shape: (16, 1)
-    The y label value shape: (16, 1)
-
+```text
+The dataset size of ds_train: 100
+dict_keys(['data', 'label'])
+The x label value shape: (16, 1)
+The y label value shape: (16, 1)
+```
 The `create_dataset` function defined above turns the 1600 generated data points into a dataset of 100 groups, each with shape 16x1.
@@ -157,7 +150,6 @@ $$f(x)=wx+b\tag{1}$$
 and use the Normal operator to randomly initialize the weights $w$ and $b$.
-
 ```python
 from mindspore.common.initializer import Normal
 from mindspore import nn
@@ -174,7 +166,6 @@ class LinearNet(nn.Cell):
 Call the network to view the initialized model parameters.
-
 ```python
 net = LinearNet()
 model_params = net.trainable_params()
@@ -183,18 +174,18 @@ print(model_params)
 Output:
-    [Parameter (name=fc.weight, value=Tensor(shape=[1, 1], dtype=Float32,
-    [[-7.35660456e-003]])), Parameter (name=fc.bias, value=Tensor(shape=[1], dtype=Float32, [-7.35660456e-003]))]
-
+```text
+[Parameter (name=fc.weight, value=Tensor(shape=[1, 1], dtype=Float32,
+[[-7.35660456e-003]])), Parameter (name=fc.bias, value=Tensor(shape=[1], dtype=Float32, [-7.35660456e-003]))]
+```
 After initializing the network model, visualize the initialized network function together with the training dataset to see what the model function looks like before fitting.
-
 ```python
 from mindspore import Tensor
 x_model_label = np.array([-10, 10, 0.1])
-y_model_label = (x_model_label * Tensor(model_params[0]).asnumpy()[0][0] +
+y_model_label = (x_model_label * Tensor(model_params[0]).asnumpy()[0][0] +
                  Tensor(model_params[1]).asnumpy()[0])
 plt.scatter(x_eval_label, y_eval_label, color="red", s=5)
@@ -205,10 +196,8 @@ plt.show()
 Output:
-
 ![png](./images/model_net_and_eval_datasets.png)
-
 As the figure above shows, the initialized model function (blue line) still differs considerably from the target function (green line).
 ## Defining and Associating the Forward and Backward Propagation Networks
@@ -236,7 +225,6 @@ $$J(w)=\frac{1}{2m}\sum_{i=1}^m(h(x_i)-y^{(i)})^2\tag{2}$$
 This is implemented in MindSpore as follows.
-
 ```python
 net = LinearNet()
 net_loss = nn.loss.MSELoss()
@@ -257,7 +245,6 @@ $$w_{t}=w_{t-1}-\alpha\frac{\partial{J(w_{t-1})}}{\partial{w}}\tag{3}$$
 After all the weight values in the function have been updated, the values are passed into the model function. This process is backward propagation, and implementing it requires an optimizer function in MindSpore, as follows:
-
 ```python
 opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)
 ```
@@ -266,7 +253,6 
@@ opt = nn.Momentum(net.trainable_params(), learning_rate=0.005, momentum=0.9)
 After forward and backward propagation are defined, call the `Model` function in MindSpore to associate the network, loss function, and optimizer function defined above into a complete computing network.
-
 ```python
 from mindspore.train import Model
@@ -279,7 +265,6 @@ model = Model(net, net_loss, opt)
 To make the whole training process easier to understand, the test data, target function, and model network need to be visualized during training. A visualization function is defined here; it is called at the end of each training step to show how the model network fits the data.
-
 ```python
 import matplotlib.pyplot as plt
 import time
@@ -292,7 +277,7 @@ def plot_model_and_datasets(net, eval_data):
     x1, y1 = zip(*eval_data)
     x_target = x
     y_target = x_target * 2 + 3
-
+
     plt.axis([-11, 11, -20, 25])
     plt.scatter(x1, y1, color="red", s=5)
     plt.plot(x, y, color="blue")
@@ -305,7 +290,6 @@ def plot_model_and_datasets(net, eval_data):
 MindSpore provides tools for custom control of the model training process. Here the visualization function is called in `step_end` to show the fitting process. For more usage, see the [official website description]().
-
 ```python
 from IPython import display
 from mindspore.train.callback import Callback
@@ -314,7 +298,7 @@ class ImageShowCallback(Callback):
     def __init__(self, net, eval_data):
         self.net = net
         self.eval_data = eval_data
-
+
     def step_end(self, run_context):
         plot_model_and_datasets(self.net, self.eval_data)
         display.clear_output(wait=True)
@@ -329,7 +313,6 @@ class ImageShowCallback(Callback):
 - `callbacks`: callback functions to be called during training.
 - `dataset_sink_model`: dataset sink mode, supported on the Ascend and GPU computing platforms; this example runs on the CPU platform, so it is set to False.
-
 ```python
 from mindspore.train.callback import LossMonitor
@@ -344,13 +327,12 @@ print(net.trainable_params()[0], "\n%s" % net.trainable_params()[1])
 Output:
-
 ![gif](./images/linear_regression.gif)
-
-    Parameter (name=fc.weight, value=[[2.0065749]])
-    Parameter (name=fc.bias, value=[3.0089042])
-
+```text
+Parameter (name=fc.weight, value=[[2.0065749]])
+Parameter (name=fc.bias, value=[3.0089042])
+```
 After training completes, the final model's weight parameters are printed: weight is close to 2.0 and bias is close to 3.0, so the trained model matches expectations.
diff --git a/tutorials/training/source_zh_cn/quick_start/quick_start.md b/tutorials/training/source_zh_cn/quick_start/quick_start.md
index b9d211561fbd82befa9f3ac7bee738773bd76008..10cbb10ec939b5054e5295f72b35e902ad860e5d 100644
--- 
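a/tutorials/training/source_zh_cn/quick_start/quick_start.md
+++ b/tutorials/training/source_zh_cn/quick_start/quick_start.md
@@ -36,6 +36,7 @@
 The following practical example walks you through the basic functions of MindSpore. For most users, completing the whole example takes 20 to 30 minutes.
 This example implements a simple image classification function. The overall process is as follows:
+
 1. Process the required dataset; the MNIST dataset is used here.
 2. Define a network; the LeNet network is used here.
 3. Define the loss function and optimizer.
@@ -45,7 +46,6 @@
 > You can find the complete runnable sample code here: .
-
 This is a simple, basic application process; more advanced and complex applications can be built by extending it.
 ## Preparations
@@ -66,7 +66,7 @@
 The directory structure is as follows:
-```
+```text
 └─MNIST_Data
     ├─test
     │      t10k-images.idx3-ubyte
@@ -76,6 +76,7 @@
             train-images.idx3-ubyte
             train-labels.idx1-ubyte
 ```
+
 > To make the example easier to use, the sample script includes a function that downloads the dataset automatically.
 ### Importing Python Libraries and Modules
@@ -83,8 +84,7 @@
 Before use, import the required Python libraries.
 Currently the `os` library is used. For ease of understanding, other required libraries are introduced when they are used.
-
-
+
 ```python
 import os
 ```
@@ -161,7 +161,7 @@ def create_dataset(data_path, batch_size=32, repeat_size=1,
     rescale_op = CV.Rescale(rescale, shift) # rescale images
     hwc2chw_op = CV.HWC2CHW() # change shape from (height, width, channel) to (channel, height, width) to fit network.
     type_cast_op = C.TypeCast(mstype.int32) # change data type of label to int32 to fit network
-
+
     # apply map operations on images
     mnist_ds = mnist_ds.map(operations=type_cast_op, input_columns="label", num_parallel_workers=num_parallel_workers)
     mnist_ds = mnist_ds.map(operations=resize_op, input_columns="image", num_parallel_workers=num_parallel_workers)
@@ -187,7 +187,6 @@ def create_dataset(data_path, batch_size=32, repeat_size=1,
 > MindSpore supports many data processing and augmentation operations, which are often used in combination. For details, see the [Data Processing](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/pipeline.html) and [Data Augmentation](https://www.mindspore.cn/doc/programming_guide/zh-CN/master/augmentation.html) sections.
-
 ## Defining the Network
 We choose the relatively simple LeNet network. Excluding the input layer, LeNet has seven layers: two convolutional layers, two downsampling (pooling) layers, and three fully connected layers. Each layer contains a different number of training parameters, as shown in the following figure:
@@ -196,11 +195,11 @@ def create_dataset(data_path, batch_size=32, repeat_size=1,
 > LeNet is not described in more detail here; to learn more about the LeNet network, see .
-We use `Normal` to initialize the parameters of the fully connected layers and convolutional layers.
+We use `Normal` to initialize the parameters of the fully connected layers and convolutional layers.
 MindSpore supports several parameter initialization methods, such as `TruncatedNormal`, `Normal`, and `Uniform`; `Normal` is used by default. For details, see the MindSpore 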
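API description of the `mindspore.common.initializer` module.
-To define a neural network with MindSpore, inherit from `mindspore.nn.cell.Cell`. `Cell` is the base class of all neural networks (such as `Conv2d`).
+To define a neural network with MindSpore, inherit from `mindspore.nn.Cell`. `Cell` is the base class of all neural networks (such as `Conv2d`).
 Each layer of the neural network must be defined in advance in the `__init__` method, and the forward construction of the network is then completed by defining the `construct` method. Following the LeNet network structure, the layers are defined as follows:
@@ -242,7 +241,7 @@ class LeNet5(nn.Cell):
 Before defining them, here is a brief introduction to the concepts of loss function and optimizer.
 - Loss function: also called the objective function, it measures how much the predicted value differs from the actual value. Deep learning reduces the value of the loss function through continuous iteration. A well-defined loss function can effectively improve model performance.
-- Optimizer: minimizes the loss function, thereby improving the model during training.
+- Optimizer: minimizes the loss function, thereby improving the model during training.
 Once the loss function is defined, the gradient of the loss function with respect to the weights can be obtained. The gradient tells the optimizer the direction in which to optimize the weights, so as to improve model performance.
@@ -296,9 +295,9 @@ from mindspore.train.callback import ModelCheckpoint, CheckpointConfig
 if __name__ == "__main__":
     ...
     # set parameters of check point
-    config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=10)
+    config_ck = CheckpointConfig(save_checkpoint_steps=1875, keep_checkpoint_max=10)
     # apply parameters of check point
-    ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet", config=config_ck)
+    ckpoint_cb = ModelCheckpoint(prefix="checkpoint_lenet", config=config_ck)
     ...
 ```
@@ -307,7 +306,6 @@ if __name__ == "__main__":
 The `model.train` interface provided by MindSpore makes it easy to train the network. `LossMonitor` monitors changes in the `loss` value during training.
 Here `epoch_size` is set to 1, so the dataset is trained for one epoch.
-
 ```python
 from mindspore.nn.metrics import Accuracy
 from mindspore.train.callback import LossMonitor
@@ -324,23 +322,26 @@ def train_net(args, model, epoch_size, mnist_path, repeat_size, ckpoint_cb, sink
 if __name__ == "__main__":
     ...
-
-    epoch_size = 1
+
+    epoch_size = 1
     mnist_path = "./MNIST_Data"
     repeat_size = 1
     model = Model(network, net_loss, net_opt, metrics={"Accuracy": Accuracy()})
     train_net(args, model, epoch_size, mnist_path, repeat_size, ckpoint_cb, dataset_sink_mode)
     ...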
 ```
+
 Where,
 In the `train_net` method, we load the training dataset downloaded earlier; `mnist_path` is the path of the MNIST dataset.
 ## Running the Script and Viewing the Result
 Run the script with the following command:
-```
+
+```bash
 python lenet.py --device_target=CPU
 ```
+
 Where,
 `lenet.py`: the script file you wrote by following the tutorial.
 `--device_target CPU`: specifies the hardware platform; the value is `CPU`, `GPU`, or `Ascend`, depending on the hardware platform you actually run on.
@@ -402,23 +403,24 @@ if __name__ == "__main__":
     test_net(network, model, mnist_path)
 ```
-Where,
+Where,
 `load_checkpoint`: loads the CheckPoint model parameter file and returns a parameter dictionary.
 `checkpoint_lenet-1_1875.ckpt`: name of the CheckPoint model file saved earlier.
 `load_param_into_net`: loads the parameters into the network.
-
 Run your script with the run command.
+
 ```bash
 python lenet.py --device_target=CPU
 ```
+
 Where,
 `lenet.py`: the script file you wrote by following the tutorial.
 `--device_target CPU`: specifies the hardware platform; the value is `CPU`, `GPU`, or `Ascend`, depending on the hardware platform you actually run on.
 An example of the output is as follows:
-```
+```text
 ...
 ============== Starting Testing ==============
 ============== Accuracy:{'Accuracy': 0.9663477564102564} ==============
diff --git a/tutorials/training/source_zh_cn/quick_start/quick_video.md b/tutorials/training/source_zh_cn/quick_start/quick_video.md
index 75c15f0c82a95af47410f087654a6f07d399f1fc..8da94f547ad1a63eae72fb82bfc64acf82cf357d 100644
--- a/tutorials/training/source_zh_cn/quick_start/quick_video.md
+++ b/tutorials/training/source_zh_cn/quick_start/quick_video.md
@@ -108,7 +108,6 @@
-
 ## Experiencing MindSpore
@@ -209,11 +208,34 @@
+
-
 ## Using the Visualization Component MindInsight
@@ -426,4 +448,4 @@
-
\ No newline at end of file
+
diff --git a/tutorials/training/source_zh_cn/quick_start/quick_video/inference.md b/tutorials/training/source_zh_cn/quick_start/quick_video/inference.md
new file mode 100644
index 0000000000000000000000000000000000000000..9bccf474586b17c133300dcdeaf7d57f51e7da83
--- /dev/null
+++ b/tutorials/training/source_zh_cn/quick_start/quick_video/inference.md
@@ -0,0 +1,9 @@
+# Multi-Platform Inference
+
+[comment]: <> (This document contains the hands-on video series, which Gitee cannot display. View it in the corresponding tutorial on the official website.)
+
+
+
+**More content**:
\ No newline at end of file
diff --git a/tutorials/training/source_zh_cn/use/load_model_for_inference_and_transfer.md b/tutorials/training/source_zh_cn/use/load_model_for_inference_and_transfer.md
index 
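a5e79747edb17264a5d862dbbd9bd80dbd28d4c1..c38c10191788bdaf2249932da565fc3c48832789 100644
--- a/tutorials/training/source_zh_cn/use/load_model_for_inference_and_transfer.md
+++ b/tutorials/training/source_zh_cn/use/load_model_for_inference_and_transfer.md
@@ -48,6 +48,7 @@ acc = model.eval(dataset_eval)
 For retraining after an interrupted job and for fine-tuning, you can load network parameters and optimizer parameters into the model.
 Sample code:
+
 ```python
 # return a parameter dict for model
 param_dict = load_checkpoint("resnet50-2_32.ckpt")
@@ -103,7 +104,7 @@ model.train(epoch, dataset)
 ### For Transfer Learning
-When loading a model with `mindspore_hub.load`, you can pass an extra argument to load only the feature extraction part of the neural network, which makes it easy to add new layers afterward for transfer learning. *When the model developer adds an extra parameter (for example, `include_top`) to the model construction, this feature is described on the model's details page. `include_top` takes True or False and indicates whether the top-layer fully connected network is kept.*
+When loading a model with `mindspore_hub.load`, you can pass an extra argument to load only the feature extraction part of the neural network, which makes it easy to add new layers afterward for transfer learning. *When the model developer adds an extra parameter (for example, `include_top`) to the model construction, this feature is described on the model's details page. `include_top` takes True or False and indicates whether the top-layer fully connected network is kept.*
 The following uses GoogleNet as an example to show how to load a model pretrained on ImageNet and perform transfer learning (retraining) on the dataset of a specific subtask. The main steps are as follows:
@@ -140,7 +141,7 @@ model.train(epoch, dataset)
             super(ReduceMeanFlatten, self).__init__()
             self.mean = P.ReduceMean(keep_dims=True)
             self.flatten = nn.Flatten()
-
+
         def construct(self, x):
             x = self.mean(x, (2, 3))
             x = self.flatten(x)
@@ -180,10 +181,10 @@ model.train(epoch, dataset)
     optim = Momentum(filter(lambda x: x.requires_grad, loss_net.get_parameters()), Tensor(lr), 0.9, 4e-5)
     train_net = nn.TrainOneStepCell(loss_net, optim)
     ```
-
+
 5. Build the dataset and start retraining.
-    As shown below, the dataset for the fine-tuning task is a garbage classification dataset stored in `/ssd/data/garbage/train`.
+    As shown below, the dataset for the fine-tuning task is a garbage classification dataset stored in `/ssd/data/garbage/train`.
     ```python
     dataset = create_dataset("/ssd/data/garbage/train",
@@ -197,7 +198,7 @@ model.train(epoch, dataset)
             data, label = items
             data = mindspore.Tensor(data)
             label = mindspore.Tensor(label)
-
+
             loss = train_net(data, label)
             print(f"epoch: {epoch}/{epoch_size}, loss: {loss}")
         # Save the ckpt file for each epoch.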
@@ -218,7 +219,7 @@ model.train(epoch, dataset)
     classification_layer = nn.Dense(last_channel, num_classes)
     classification_layer.set_train(False)
     softmax = nn.Softmax()
-    network = nn.SequentialCell([network, reducemean_flatten,
+    network = nn.SequentialCell([network, reducemean_flatten,
                                  classification_layer, softmax])
     # Load a pre-trained ckpt file.
@@ -237,4 +238,4 @@ model.train(epoch, dataset)
     res = model.eval(eval_dataset)
     print("result:", res, "ckpt=", ckpt_path)
-    ```
\ No newline at end of file
+    ```
diff --git a/tutorials/training/source_zh_cn/use/publish_model.md b/tutorials/training/source_zh_cn/use/publish_model.md
index 9d3aadc66768f33a1b3b376d2a8fc22f93887019..51778851255c74789c6ae2cd24063268fa1c8757 100644
--- a/tutorials/training/source_zh_cn/use/publish_model.md
+++ b/tutorials/training/source_zh_cn/use/publish_model.md
@@ -24,9 +24,9 @@
 1. Host your pretrained model in an accessible storage location.
-2. Following the [template](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/googlenet/mindspore_hub_conf.py), add the model generation file `mindspore_hub_conf.py` to your own code repository. The file is placed as follows:
+2. Following the [template](https://gitee.com/mindspore/mindspore/blob/master/model_zoo/official/cv/googlenet/mindspore_hub_conf.py), add the model generation file `mindspore_hub_conf.py` to your own code repository. The file is placed as follows:
-    ```shell
+    ```bash
     googlenet
     ├── src
     │   ├── googlenet.py
@@ -39,7 +39,7 @@
 3. 
Following the [template](https://gitee.com/mindspore/hub/blob/master/mshub_res/assets/mindspore/ascend/0.7/googlenet_v1_cifar10.md#), create the `{model_name}_{model_version}_{dataset}.md` file in the `hub/mshub_res/assets/mindspore/ascend/0.7` folder, where `ascend` is the hardware platform the model runs on and `0.7` is the MindSpore version. The directory structure of `hub/mshub_res` is as follows:
-    ```shell
+    ```bash
     hub
     ├── mshub_res
     │   ├── assets
@@ -47,19 +47,20 @@
     |           ├── gpu
     |               ├── 0.7
     |           ├── ascend
-    |               ├── 0.7
+    |               ├── 0.7
     |                   ├── googlenet_v1_cifar10.md
     │   ├── tools
     |       ├── md_validator.py
-    |       └── md_validator.py
+    |       └── md_validator.py
     ```
+
     Note that the `{model_name}_{model_version}_{dataset}.md` file must be supplemented with the `file-format`, `asset-link`, and `asset-sha256` information shown below, which indicate the model file format, the model storage location (obtained in step 1), and the model hash value, respectively.
-    ```shell
+    ```bash
     file-format: ckpt
     asset-link: https://download.mindspore.cn/model_zoo/official/cv/googlenet/goolenet_ascend_0.2.0_cifar10_official_classification_20200713/googlenet.ckpt
     asset-sha256: 114e5acc31dad444fa8ed2aafa02ca34734419f602b9299f3b53013dfc71b0f7
-    ```
+    ```
     The model file formats supported by MindSpore Hub are:
     - [MindSpore CKPT](https://www.mindspore.cn/tutorial/training/zh-CN/master/use/save_model.html#checkpoint)
diff --git a/tutorials/training/source_zh_cn/use/save_model.md b/tutorials/training/source_zh_cn/use/save_model.md
index 0a2312e85106badd0e5be989acee0612aab899c1..0e5e9a3c1105bab12594792cfa03950b6f514d39 100644
--- a/tutorials/training/source_zh_cn/use/save_model.md
+++ b/tutorials/training/source_zh_cn/use/save_model.md
@@ -34,6 +34,7 @@
 The `CheckpointConfig` object sets the CheckPoint saving policy. The saved parameters are divided into network parameters and optimizer parameters.
 `ModelCheckpoint` provides a default configuration policy to help users get started quickly. The usage is as follows:
+
 ```python
 from mindspore.train.callback import ModelCheckpoint
 ckpoint_cb = ModelCheckpoint()
@@ -60,7 +61,7 @@ model.train(epoch_num, dataset, callbacks=ckpoint_cb)
 The generated CheckPoint files are as follows:
-```
+```text
 resnet50-graph.meta # compiled computation graph
 resnet50-1_32.ckpt # the CheckPoint file extension is '.ckpt'
 resnet50-2_32.ckpt # the file name indicates the epoch and step at which the parameters were saved
diff --git a/tutorials/tutorial_code/evaluate_the_model_during_training/README.md 
b/tutorials/tutorial_code/evaluate_the_model_during_training/README.md
index 19b9474a559963f0ba47134d2da9cf118dc24321..acd60348ea8a95aed80e0e36f20186b294fdc158 100644
--- a/tutorials/tutorial_code/evaluate_the_model_during_training/README.md
+++ b/tutorials/tutorial_code/evaluate_the_model_during_training/README.md
@@ -1,7 +1,9 @@
-Dataset used: [MNIST](http://yann.lecun.com/exdb/mnist/)
+# README
+
+Dataset used: [MNIST](http://yann.lecun.com/exdb/mnist/)
 After downloading, place the files in the following structure:
-```
+```text
 ├─evaluate_the_model_during_training.py
 │
 └─MNIST_Data
@@ -14,4 +16,4 @@
             train-labels.idx1-ubyte
 ```
-Run with the command `python evaluate_the_model_during_training.py >train.log 2>&1 &` (this takes a while, about 3 minutes). The results are recorded in the `log.txt` file.
\ No newline at end of file
+Run with the command `python evaluate_the_model_during_training.py >train.log 2>&1 &` (this takes a while, about 3 minutes). The results are recorded in the `log.txt` file.
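
Reviewer note: the README hunk above asks for the MNIST files to be placed under `MNIST_Data` before the script is started in the background. The sketch below is one possible way to check that layout first; it is not part of this change. It assumes the four standard MNIST file names, split into `test` and `train` subfolders as in the quick start tree, and the helper name `missing_mnist_files` is hypothetical.

```python
from pathlib import Path

# Assumption: the four standard MNIST file names, split into test/ and
# train/ under MNIST_Data. Adjust the mapping if your copy is laid out differently.
EXPECTED_FILES = {
    "test": ["t10k-images.idx3-ubyte", "t10k-labels.idx1-ubyte"],
    "train": ["train-images.idx3-ubyte", "train-labels.idx1-ubyte"],
}


def missing_mnist_files(root="."):
    """Return the expected MNIST files that are absent under root/MNIST_Data."""
    data_dir = Path(root) / "MNIST_Data"
    missing = []
    for subdir, names in EXPECTED_FILES.items():
        for name in names:
            path = data_dir / subdir / name
            # Only regular files count; a directory with the same name does not.
            if not path.is_file():
                missing.append(str(path))
    return missing


if __name__ == "__main__":
    absent = missing_mnist_files()
    if absent:
        print("Missing files:")
        for path in absent:
            print("  " + path)
    else:
        print("MNIST_Data layout looks complete.")
```

Running a check like this from the directory that holds `evaluate_the_model_during_training.py` before launching the background job can surface a misplaced file immediately, rather than after reading the redirected log.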