diff --git a/tutorials/source_en/dataset/augment.md b/tutorials/source_en/dataset/augment.md index 42517ae51224f017d81797785ab8a511971fbd74..a6466073b1ebf8597485c9adb3002b678b8526e7 100644 --- a/tutorials/source_en/dataset/augment.md +++ b/tutorials/source_en/dataset/augment.md @@ -122,7 +122,7 @@ The following demonstrates the use of automatic data augmentation based on callb step_num = 0 for ep_num in range(epochs): for data in itr: - print("epcoh: {}, step:{}, data :{}".format(ep_num, step_num, data)) + print("epoch: {}, step:{}, data :{}".format(ep_num, step_num, data)) step_num += 1 dataset.sync_update(condition_name="policy", data={'ep_num': ep_num, 'step_num': step_num}) @@ -131,21 +131,21 @@ The following demonstrates the use of automatic data augmentation based on callb The output is as follows: ```text - epcoh: 0, step:0, data :[Tensor(shape=[], dtype=Int64, value= 1)] - epcoh: 0, step:1, data :[Tensor(shape=[], dtype=Int64, value= 2)] - epcoh: 0, step:2, data :[Tensor(shape=[], dtype=Int64, value= 3)] - epcoh: 1, step:3, data :[Tensor(shape=[], dtype=Int64, value= 1)] - epcoh: 1, step:4, data :[Tensor(shape=[], dtype=Int64, value= 5)] - epcoh: 1, step:5, data :[Tensor(shape=[], dtype=Int64, value= 7)] - epcoh: 2, step:6, data :[Tensor(shape=[], dtype=Int64, value= 6)] - epcoh: 2, step:7, data :[Tensor(shape=[], dtype=Int64, value= 50)] - epcoh: 2, step:8, data :[Tensor(shape=[], dtype=Int64, value= 66)] - epcoh: 3, step:9, data :[Tensor(shape=[], dtype=Int64, value= 81)] - epcoh: 3, step:10, data :[Tensor(shape=[], dtype=Int64, value= 1001)] - epcoh: 3, step:11, data :[Tensor(shape=[], dtype=Int64, value= 1333)] - epcoh: 4, step:12, data :[Tensor(shape=[], dtype=Int64, value= 1728)] - epcoh: 4, step:13, data :[Tensor(shape=[], dtype=Int64, value= 28562)] - epcoh: 4, step:14, data :[Tensor(shape=[], dtype=Int64, value= 38418)] + epoch: 0, step:0, data :[Tensor(shape=[], dtype=Int64, value= 1)] + epoch: 0, step:1, data :[Tensor(shape=[], dtype=Int64, value= 2)] + epoch: 0, step:2, data :[Tensor(shape=[], dtype=Int64, value= 3)] + epoch: 1, step:3, data :[Tensor(shape=[], dtype=Int64, value= 1)] + epoch: 1, step:4, data :[Tensor(shape=[], dtype=Int64, value= 5)] + epoch: 1, step:5, data :[Tensor(shape=[], dtype=Int64, value= 7)] + epoch: 2, step:6, data :[Tensor(shape=[], dtype=Int64, value= 6)] + epoch: 2, step:7, data :[Tensor(shape=[], dtype=Int64, value= 50)] + epoch: 2, step:8, data :[Tensor(shape=[], dtype=Int64, value= 66)] + epoch: 3, step:9, data :[Tensor(shape=[], dtype=Int64, value= 81)] + epoch: 3, step:10, data :[Tensor(shape=[], dtype=Int64, value= 1001)] + epoch: 3, step:11, data :[Tensor(shape=[], dtype=Int64, value= 1333)] + epoch: 4, step:12, data :[Tensor(shape=[], dtype=Int64, value= 1728)] + epoch: 4, step:13, data :[Tensor(shape=[], dtype=Int64, value= 28562)] + epoch: 4, step:14, data :[Tensor(shape=[], dtype=Int64, value= 38418)] ``` ## ImageNet Automatic Data Augmentation diff --git a/tutorials/source_en/dataset/cache.md b/tutorials/source_en/dataset/cache.md index b0a4382487f61bd177478a8cec4dad384860dac5..c9beb1a4b8efedfef2be750543f2611d934da64d 100644 --- a/tutorials/source_en/dataset/cache.md +++ b/tutorials/source_en/dataset/cache.md @@ -16,9 +16,9 @@ Currently, the cache service supports only single-node cache. That is, the ![cache on leaf pipeline](./images/cache_dataset.png) -- Cache the data processed by argumentation. +- Cache the data processed by augmentation. - You can also use the cache in the `map` operation. 
The data processed by argumentation (such as image cropping or resizing) is directly cached, avoiding repeated data argumentation operations and reducing unnecessary computations.
+  You can also use the cache in the `map` operation. The data processed by augmentation (such as image cropping or resizing) is directly cached, avoiding repeated data augmentation operations and reducing unnecessary computations.
 
   ![cache on map pipeline](./images/cache_processed_data.png)
 
@@ -144,13 +144,13 @@ Note:
 
 - The use of `size`:
 
-    - `size=0` indicates that the memory space used by the cache is not limited manually, but automically controlled by the cache server according to system's total memory resources, and cache server's memory usage would be limited to within 80% of the total system memory.
+    - `size=0` indicates that the memory space used by the cache is not limited manually, but automatically controlled by the cache server according to the system's total memory resources, and the cache server's memory usage is limited to within 80% of the total system memory.
 
     - Users can also manually set `size` to a proper value based on the idle memory of the machine. Note that before setting the `size` parameter, make sure to check the available memory of the system and the size of the dataset to be loaded. If the memory space occupied by the dataset-cache-server or the space of the dataset to be loaded exceeds the available memory of the system, it may cause problems such as machine downtime/restart, automatic shutdown of dataset-cache-server, and failure of training process execution.
 
 - The use of `spilling=True`:
 
-    - `spilling=True` indicates that the remaining data is written to disks when the memory space is insufficient. Therefore, ensure that you have the writing permission and the sufficient disk space on the configured disk path is to store the cache data that spills to the disk. Note that if no spilling path is set when cache server starts, setting `spilling=True` will raise an error when calling the API.
+    - `spilling=True` indicates that the remaining data is written to disk when the memory space is insufficient. Therefore, ensure that you have write permission on the configured disk path and that it has sufficient space to store the cache data that spills to the disk. Note that if no spilling path is set when the cache server starts, setting `spilling=True` will raise an error when calling the API.
 
     - `spilling=False` indicates that no data is written once the configured memory space is used up on the cache server.
 
@@ -158,7 +158,7 @@ Note:
 
 ### 4. Insert a Cache Instance
 
-Currently, the cache service can be used to cache both original datasets and datasets processed by argumentation. The following examples show the processing of the two types of data separately.
+Currently, the cache service can be used to cache both original datasets and datasets processed by augmentation. The following examples show the processing of the two types of data separately.
 
 Note that both examples need to create a cache instance according to the method in step 3, and pass in the created `test_cache` as `cache` parameters in the dataset load or map operation.
 
@@ -234,7 +234,7 @@ Listing sessions for server on port 50052
      780643335   2044459912         4         n/a       3226         4
 ```
 
-#### Caching the Data Processed by Argumentation
+#### Caching the Data Processed by Augmentation
 
 Cache data after data enhancement processing `transforms`.
@@ -610,5 +610,5 @@ However, we may **not benefit from cache** in the following scenarios: - Currently, dataset classes such as `GeneratorDataset`, `PaddedDataset`, and `NumpySlicesDataset` do not support cache. `GeneratorDataset`, `PaddedDataset`, and `NumpySlicesDataset` belong to `GeneratorOp`, so their error message is displayed as "There is currently no support for GeneratorOp under cache." - Data processed by `batch`, `concat`, `filter`, `repeat`, `skip`, `split`, `take`, and `zip` does not support cache. -- Data processed by random data argumentation operations (such as `RandomCrop`) does not support cache. +- Data processed by random data augmentation operations (such as `RandomCrop`) does not support cache. - The same cache instance cannot be nested in different locations of the same pipeline. diff --git a/tutorials/source_zh_cn/dataset/augment.ipynb b/tutorials/source_zh_cn/dataset/augment.ipynb index c788494aa8199819d49a5beb8510ff104f42e4ee..5c2413a11009ba1468a71f790bd9196dcf18bb92 100644 --- a/tutorials/source_zh_cn/dataset/augment.ipynb +++ b/tutorials/source_zh_cn/dataset/augment.ipynb @@ -187,21 +187,21 @@ "name": "stdout", "output_type": "stream", "text": [ - "epcoh: 0, step:0, data :[Tensor(shape=[], dtype=Int64, value= 1)]\n", - "epcoh: 0, step:1, data :[Tensor(shape=[], dtype=Int64, value= 2)]\n", - "epcoh: 0, step:2, data :[Tensor(shape=[], dtype=Int64, value= 3)]\n", - "epcoh: 1, step:3, data :[Tensor(shape=[], dtype=Int64, value= 1)]\n", - "epcoh: 1, step:4, data :[Tensor(shape=[], dtype=Int64, value= 5)]\n", - "epcoh: 1, step:5, data :[Tensor(shape=[], dtype=Int64, value= 7)]\n", - "epcoh: 2, step:6, data :[Tensor(shape=[], dtype=Int64, value= 6)]\n", - "epcoh: 2, step:7, data :[Tensor(shape=[], dtype=Int64, value= 50)]\n", - "epcoh: 2, step:8, data :[Tensor(shape=[], dtype=Int64, value= 66)]\n", - "epcoh: 3, step:9, data :[Tensor(shape=[], dtype=Int64, value= 81)]\n", - "epcoh: 3, step:10, data :[Tensor(shape=[], dtype=Int64, value= 1001)]\n", - "epcoh: 3, step:11, data :[Tensor(shape=[], dtype=Int64, value= 1333)]\n", - "epcoh: 4, step:12, data :[Tensor(shape=[], dtype=Int64, value= 1728)]\n", - "epcoh: 4, step:13, data :[Tensor(shape=[], dtype=Int64, value= 28562)]\n", - "epcoh: 4, step:14, data :[Tensor(shape=[], dtype=Int64, value= 38418)]\n" + "epoch: 0, step:0, data :[Tensor(shape=[], dtype=Int64, value= 1)]\n", + "epoch: 0, step:1, data :[Tensor(shape=[], dtype=Int64, value= 2)]\n", + "epoch: 0, step:2, data :[Tensor(shape=[], dtype=Int64, value= 3)]\n", + "epoch: 1, step:3, data :[Tensor(shape=[], dtype=Int64, value= 1)]\n", + "epoch: 1, step:4, data :[Tensor(shape=[], dtype=Int64, value= 5)]\n", + "epoch: 1, step:5, data :[Tensor(shape=[], dtype=Int64, value= 7)]\n", + "epoch: 2, step:6, data :[Tensor(shape=[], dtype=Int64, value= 6)]\n", + "epoch: 2, step:7, data :[Tensor(shape=[], dtype=Int64, value= 50)]\n", + "epoch: 2, step:8, data :[Tensor(shape=[], dtype=Int64, value= 66)]\n", + "epoch: 3, step:9, data :[Tensor(shape=[], dtype=Int64, value= 81)]\n", + "epoch: 3, step:10, data :[Tensor(shape=[], dtype=Int64, value= 1001)]\n", + "epoch: 3, step:11, data :[Tensor(shape=[], dtype=Int64, value= 1333)]\n", + "epoch: 4, step:12, data :[Tensor(shape=[], dtype=Int64, value= 1728)]\n", + "epoch: 4, step:13, data :[Tensor(shape=[], dtype=Int64, value= 28562)]\n", + "epoch: 4, step:14, data :[Tensor(shape=[], dtype=Int64, value= 38418)]\n" ] } ], @@ -212,7 +212,7 @@ "step_num = 0\n", "for ep_num in range(epochs):\n", " for data in itr:\n", - " print(\"epcoh: {}, 
step:{}, data :{}\".format(ep_num, step_num, data))\n", + " print(\"epoch: {}, step:{}, data :{}\".format(ep_num, step_num, data))\n", " step_num += 1\n", " dataset.sync_update(condition_name=\"policy\",\n", " data={'ep_num': ep_num, 'step_num': step_num})"
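
Reviewer note: the `print` line corrected above belongs to the tutorial's callback-based augmentation example. For context, here is a minimal sketch of that `sync_wait`/`sync_update` pattern; the `Augment` class and its toy `preprocess` formula are reconstructed to match the printed output in the diff, so treat it as an approximation of the tutorial code rather than a verbatim copy.

```python
import numpy as np
import mindspore.dataset as ds

class Augment:
    """Toy augmentation policy driven by training progress."""
    def __init__(self):
        self.ep_num = 0
        self.step_num = 0

    def preprocess(self, input_):
        # Transform strength grows with epoch/step, matching the values
        # printed in the diff (e.g. 1 + 12**3 - 1 = 1728 in epoch 4).
        return np.array(input_ + self.step_num ** self.ep_num - 1)

    def update(self, data):
        # Invoked by the pipeline when sync_update supplies new progress.
        self.ep_num = data['ep_num']
        self.step_num = data['step_num']

aug = Augment()
dataset = ds.NumpySlicesDataset([1, 2, 3], shuffle=False)
# Block after every batch until sync_update releases the pipeline.
dataset = dataset.sync_wait(condition_name="policy", num_batch=1,
                            callback=aug.update)
dataset = dataset.map(operations=[aug.preprocess])

epochs = 5
itr = dataset.create_tuple_iterator(num_epochs=epochs)
step_num = 0
for ep_num in range(epochs):
    for data in itr:
        print("epoch: {}, step:{}, data :{}".format(ep_num, step_num, data))
        step_num += 1
        dataset.sync_update(condition_name="policy",
                            data={'ep_num': ep_num, 'step_num': step_num})
```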
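
Similarly, the `DatasetCache` passages touched in cache.md reduce to the following sketch of caching augmented data in a `map` operation. It assumes a cache server is already running; the `session_id` value, the `./images` directory, and the unified `mindspore.dataset.vision` import style are placeholders/assumptions, not part of the patch.

```python
import mindspore.dataset as ds
import mindspore.dataset.vision as vision

# Placeholder session_id: use a session created on your running cache
# server. size=0 lets the server manage memory automatically (capped at
# 80% of total system memory, per the note patched above).
test_cache = ds.DatasetCache(session_id=1456416665, size=0, spilling=False)

# Hypothetical image folder with an ImageFolderDataset-compatible layout.
dataset = ds.ImageFolderDataset(dataset_dir="./images", num_samples=4)

# Cache the augmented tensors so Decode/Resize run only once; random ops
# such as RandomCrop cannot be cached, per the limitation listed above.
dataset = dataset.map(operations=[vision.Decode(), vision.Resize((224, 224))],
                      input_columns=["image"], cache=test_cache)
```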