From 4ea0e76a59b26ab72b22746efd4ae01d0fc346d5 Mon Sep 17 00:00:00 2001
From: Xiao Tianci
Date: Wed, 23 Sep 2020 10:11:51 +0800
Subject: [PATCH] fix en links

---
 tutorials/source_en/advanced_use/dataset_conversion.md |  2 +-
 .../optimize_the_performance_of_data_preparation.md    | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tutorials/source_en/advanced_use/dataset_conversion.md b/tutorials/source_en/advanced_use/dataset_conversion.md
index e76a8cba88..b784c2a73e 100644
--- a/tutorials/source_en/advanced_use/dataset_conversion.md
+++ b/tutorials/source_en/advanced_use/dataset_conversion.md
@@ -7,7 +7,7 @@
 - [Convert Dataset to MindRecord](#convert-dataset-to-mindrecord)
     - [Overview](#overview)
     - [Basic Concepts](#basic-concepts)
-    - [Convert Dataset to MindRecord](#convert-dataset-to-mindrecord)
+    - [Convert Dataset to MindRecord](#convert-dataset-to-mindrecord-1)
     - [Load MindRecord Dataset](#load-mindrecord-dataset)

diff --git a/tutorials/source_en/advanced_use/optimize_the_performance_of_data_preparation.md b/tutorials/source_en/advanced_use/optimize_the_performance_of_data_preparation.md
index 5e4b3ce2f1..81a6da7ec8 100644
--- a/tutorials/source_en/advanced_use/optimize_the_performance_of_data_preparation.md
+++ b/tutorials/source_en/advanced_use/optimize_the_performance_of_data_preparation.md
@@ -93,7 +93,7 @@ MindSpore provides multiple data loading methods, including common dataset loadi
 Suggestions on data loading performance optimization are as follows:
 
 - Built-in loading operators are preferred for supported dataset formats. For details, see [Built-in Loading Operators](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.html). If the performance cannot meet the requirements, use the multi-thread concurrency solution. For details, see [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/en/master/advanced_use/optimize_the_performance_of_data_preparation.html#multi-thread-optimization-solution).
-- For a dataset format that is not supported, convert the format to MindSpore data format and then use the `MindDataset` class to load the dataset. For details, see [Convert Dataset to MindRecord](https://www.mindspore.cn/api/en/master/programming_guide/dataset_conversion.html). If the performance cannot meet the requirements, use the multi-thread concurrency solution, for details, see [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/en/master/advanced_use/optimize_the_performance_of_data_preparation.html#multi-thread-optimization-solution).
+- For a dataset format that is not supported, convert the format to MindSpore data format and then use the `MindDataset` class to load the dataset. For details, see [MindSpore Data Format Conversion](https://www.mindspore.cn/api/en/master/programming_guide/dataset_conversion.html). If the performance cannot meet the requirements, use the multi-thread concurrency solution, for details, see [Multi-thread Optimization Solution](https://www.mindspore.cn/tutorial/en/master/advanced_use/optimize_the_performance_of_data_preparation.html#multi-thread-optimization-solution).
 - For dataset formats that are not supported, the user-defined `GeneratorDataset` class is preferred for implementing fast algorithm verification. If the performance cannot meet the requirements, the multi-process concurrency solution can be used. For details, see [Multi-process Optimization Solution](https://www.mindspore.cn/tutorial/en/master/advanced_use/optimize_the_performance_of_data_preparation.html#multi-process-optimization-solution).
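As a quick sketch of the `GeneratorDataset` path recommended above (the generator function and column names here are hypothetical, and the actual `mindspore.dataset` call appears only as a comment):

```python
import numpy as np

# A user-defined source: any iterable yielding tuples of NumPy arrays,
# one entry per output column.
def image_label_source():
    for i in range(4):
        image = np.full((2, 2), i, dtype=np.uint8)  # stand-in "image"
        label = np.array(i % 2, dtype=np.int32)
        yield image, label

# With MindSpore installed, the source would be wrapped roughly as:
#   import mindspore.dataset as ds
#   dataset = ds.GeneratorDataset(image_label_source,
#                                 column_names=["image", "label"],
#                                 num_parallel_workers=4)

if __name__ == "__main__":
    samples = list(image_label_source())
    print(len(samples))          # 4
    print(samples[0][0].shape)   # (2, 2)
```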
 
 ### Code Example
@@ -172,7 +172,7 @@ Based on the preceding suggestions of data loading performance optimization, the
 
 ## Optimizing the Shuffle Performance
 
-The shuffle operation is used to shuffle ordered datasets or repeated datasets. MindSpore provides the `shuffle` function for users. A larger value of `buffer_size` indicates a higher shuffling degree, consuming more time and computing resources. This API allows users to shuffle the data at any time during the entire pipeline process. For details, see [Shuffle Processing](https://www.mindspore.cn/api/en/master/programming_guide/pipeline.html#shuffle). However, because the underlying implementation methods are different, the performance of this method is not as good as that of setting the `shuffle` parameter to directly shuffle data by referring to the [Built-in Loading Operators](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.html).
+The shuffle operation is used to shuffle ordered datasets or repeated datasets. MindSpore provides the `shuffle` function for users. A larger value of `buffer_size` indicates a higher shuffling degree, consuming more time and computing resources. This API allows users to shuffle the data at any time during the entire pipeline process. For details, see [Shuffle](https://www.mindspore.cn/api/en/master/programming_guide/pipeline.html#shuffle). However, because the underlying implementation methods are different, the performance of this method is not as good as that of setting the `shuffle` parameter to directly shuffle data by referring to the [Built-in Loading Operators](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.html).
 
 ### Performance Optimization Solution
@@ -256,7 +256,7 @@ During image classification training, especially when the dataset is small, user
 
 - Use the built-in Python operator (`py_transforms` module) to perform data augmentation.
 - Users can define Python functions as needed to perform data augmentation.
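For the last option above, user-defined Python functions, a minimal sketch of what such a function can look like. The flip operation and data layout are illustrative only, not taken from the tutorial:

```python
def horizontal_flip(image):
    """Toy augmentation: reverse each row of a 2-D image given as a
    list of rows. In a real pipeline, a function like this would be
    passed to map (e.g. operations=[horizontal_flip])."""
    return [row[::-1] for row in image]

if __name__ == "__main__":
    img = [[1, 2, 3],
           [4, 5, 6]]
    print(horizontal_flip(img))  # [[3, 2, 1], [6, 5, 4]]
```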
 
-For details, see [Data Augmentation](https://www.mindspore.cn/api/en/master/programming_guide/augmentation.html). The performance varies according to the underlying implementation methods.
+For details, see [Augmentation](https://www.mindspore.cn/api/en/master/programming_guide/augmentation.html). The performance varies according to the underlying implementation methods.
 
 | Module | Underlying API | Description |
 | :----: | :----: | :----: |
@@ -394,7 +394,7 @@ For details, see [Built-in Loading Operators](https://www.mindspore.cn/api/en/ma
 ### Multi-process Optimization Solution
 
 During data processing, operators implemented by Python support the multi-process mode. For example:
-- By default, the `GeneratorDataset` class is in multi-process mode. The `num_parallel_workers` parameter indicates the number of enabled processes. The default value is 1. For details, see [Generator Dataset](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.html#mindspore.dataset.GeneratorDataset)
+- By default, the `GeneratorDataset` class is in multi-process mode. The `num_parallel_workers` parameter indicates the number of enabled processes. The default value is 1. For details, see [GeneratorDataset](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.html#mindspore.dataset.GeneratorDataset)
 - If the user-defined Python function or the `py_transforms` module is used to perform data augmentation and the `python_multiprocessing` parameter of the `map` function is set to True, the `num_parallel_workers` parameter indicates the number of processes and the default value of the `python_multiprocessing` parameter is False. In this case, the `num_parallel_workers` parameter indicates the number of threads. For details, see [Built-in Loading Operators](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.html).
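A rough sketch of the `num_parallel_workers` idea described above. For portability the sketch uses a thread pool, matching the default `python_multiprocessing=False` case; MindSpore switches to worker processes when `python_multiprocessing=True`. All function names here are hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def augment(sample):
    # Stand-in for a user-defined Python augmentation function
    # (the kind passed to map); here it just scales the sample.
    return sample * 2

def parallel_map(fn, samples, num_parallel_workers=4):
    """Apply `fn` across `samples` with a pool of workers, loosely
    mirroring map(..., num_parallel_workers=N). pool.map preserves
    the input order, so results line up with the samples."""
    with ThreadPoolExecutor(max_workers=num_parallel_workers) as pool:
        return list(pool.map(fn, samples))

if __name__ == "__main__":
    print(parallel_map(augment, [1, 2, 3, 4]))  # [2, 4, 6, 8]
```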
 
 ### Compose Optimization Solution
@@ -405,7 +405,7 @@ Map operators can receive the Tensor operator list and apply all these operators
 
 ### Operator Fusion Optimization Solution
 
-Some fusion operators are provided to aggregate the functions of two or more operators into one operator. For details, see [Data Augmentation Operators](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.vision.html). Compared with the pipelines of their components, such fusion operators provide better performance. As shown in the figure:
+Some fusion operators are provided to aggregate the functions of two or more operators into one operator. For details, see [Augmentation Operators](https://www.mindspore.cn/api/en/master/api/python/mindspore/mindspore.dataset.vision.html). Compared with the pipelines of their components, such fusion operators provide better performance. As shown in the figure:
 
 ![title](./images/operator_fusion.png)
-- 
Gitee
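Returning to the compose optimization above, a minimal sketch of folding an operator list into a single pass over each sample, the way one map call can receive a whole operator list. The operator names are stand-ins rather than real MindSpore operators:

```python
def compose(operations):
    """Fold a list of per-sample operators into one operator, so the
    whole list runs in a single map(...) call instead of one
    map(...) per operator."""
    def fused(sample):
        for op in operations:
            sample = op(sample)
        return sample
    return fused

if __name__ == "__main__":
    resize = lambda x: x + 1    # stand-ins for real tensor operators
    rescale = lambda x: x * 10
    op = compose([resize, rescale])
    print(op(2))  # 30
```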