diff --git a/tutorials/source_en/dataset/augment.md b/tutorials/source_en/dataset/augment.md
index b6b3b8bd0560f30ab6035ca9be266cc4b394f641..8b3e3fd8921910bb640782e8982da6dc978d9444 100644
--- a/tutorials/source_en/dataset/augment.md
+++ b/tutorials/source_en/dataset/augment.md
@@ -122,7 +122,7 @@ The following demonstrates the use of automatic data augmentation based on callb
 step_num = 0
 for ep_num in range(epochs):
     for data in itr:
-        print("epcoh: {}, step:{}, data :{}".format(ep_num, step_num, data))
+        print("epoch: {}, step:{}, data :{}".format(ep_num, step_num, data))
         step_num += 1
     dataset.sync_update(condition_name="policy",
                         data={'ep_num': ep_num, 'step_num': step_num})
@@ -131,21 +131,21 @@ The following demonstrates the use of automatic data augmentation based on callb
 The output is as follows:

 ```text
-    epcoh: 0, step:0, data :[Tensor(shape=[], dtype=Int64, value= 1)]
-    epcoh: 0, step:1, data :[Tensor(shape=[], dtype=Int64, value= 2)]
-    epcoh: 0, step:2, data :[Tensor(shape=[], dtype=Int64, value= 3)]
-    epcoh: 1, step:3, data :[Tensor(shape=[], dtype=Int64, value= 1)]
-    epcoh: 1, step:4, data :[Tensor(shape=[], dtype=Int64, value= 5)]
-    epcoh: 1, step:5, data :[Tensor(shape=[], dtype=Int64, value= 7)]
-    epcoh: 2, step:6, data :[Tensor(shape=[], dtype=Int64, value= 6)]
-    epcoh: 2, step:7, data :[Tensor(shape=[], dtype=Int64, value= 50)]
-    epcoh: 2, step:8, data :[Tensor(shape=[], dtype=Int64, value= 66)]
-    epcoh: 3, step:9, data :[Tensor(shape=[], dtype=Int64, value= 81)]
-    epcoh: 3, step:10, data :[Tensor(shape=[], dtype=Int64, value= 1001)]
-    epcoh: 3, step:11, data :[Tensor(shape=[], dtype=Int64, value= 1333)]
-    epcoh: 4, step:12, data :[Tensor(shape=[], dtype=Int64, value= 1728)]
-    epcoh: 4, step:13, data :[Tensor(shape=[], dtype=Int64, value= 28562)]
-    epcoh: 4, step:14, data :[Tensor(shape=[], dtype=Int64, value= 38418)]
+    epoch: 0, step:0, data :[Tensor(shape=[], dtype=Int64, value= 1)]
+    epoch: 0, step:1, data :[Tensor(shape=[], dtype=Int64, value= 2)]
+    epoch: 0, step:2, data :[Tensor(shape=[], dtype=Int64, value= 3)]
+    epoch: 1, step:3, data :[Tensor(shape=[], dtype=Int64, value= 1)]
+    epoch: 1, step:4, data :[Tensor(shape=[], dtype=Int64, value= 5)]
+    epoch: 1, step:5, data :[Tensor(shape=[], dtype=Int64, value= 7)]
+    epoch: 2, step:6, data :[Tensor(shape=[], dtype=Int64, value= 6)]
+    epoch: 2, step:7, data :[Tensor(shape=[], dtype=Int64, value= 50)]
+    epoch: 2, step:8, data :[Tensor(shape=[], dtype=Int64, value= 66)]
+    epoch: 3, step:9, data :[Tensor(shape=[], dtype=Int64, value= 81)]
+    epoch: 3, step:10, data :[Tensor(shape=[], dtype=Int64, value= 1001)]
+    epoch: 3, step:11, data :[Tensor(shape=[], dtype=Int64, value= 1333)]
+    epoch: 4, step:12, data :[Tensor(shape=[], dtype=Int64, value= 1728)]
+    epoch: 4, step:13, data :[Tensor(shape=[], dtype=Int64, value= 28562)]
+    epoch: 4, step:14, data :[Tensor(shape=[], dtype=Int64, value= 38418)]
 ```

 ## ImageNet Automatic Data Augmentation
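For context, the loop in the hunks above is the consumer side of MindSpore's `sync_wait`/`sync_update` callback mechanism, which lets the training loop push updated policy values back into the augmentation pipeline once per epoch. Below is a minimal runnable sketch of the complete pattern; it assumes the tutorial's surrounding setup, and the `Augment` class with its `preprocess` arithmetic is an illustrative stand-in rather than a fixed API:

```python
import numpy as np
import mindspore.dataset as ds

class Augment:
    """Augmentation callback whose parameters are refreshed from the training loop."""
    def __init__(self):
        self.ep_num = 0
        self.step_num = 0

    def preprocess(self, input_):
        # Transform each sample using the most recently synced policy values.
        return (np.array(input_ + self.step_num ** self.ep_num - 1),)

    def update(self, data):
        # Invoked by sync_update() with the dict supplied from the loop below.
        self.ep_num = data['ep_num']
        self.step_num = data['step_num']

aug = Augment()
dataset = ds.NumpySlicesDataset([1, 2, 3], shuffle=False)
# Block the pipeline at this point until sync_update() delivers fresh policy data.
dataset = dataset.sync_wait(condition_name="policy", callback=aug.update)
dataset = dataset.map(operations=[aug.preprocess])

epochs = 5
itr = dataset.create_tuple_iterator(num_epochs=epochs)
step_num = 0
for ep_num in range(epochs):
    for data in itr:
        print("epoch: {}, step:{}, data :{}".format(ep_num, step_num, data))
        step_num += 1
    dataset.sync_update(condition_name="policy",
                        data={'ep_num': ep_num, 'step_num': step_num})
```

`sync_wait` and `sync_update` are paired through the `condition_name`, so the callback always sees the latest epoch and step counters before the next epoch's samples are produced.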
diff --git a/tutorials/source_en/dataset/cache.md b/tutorials/source_en/dataset/cache.md
index 4220aff9a6418073baef5ec44c9613cfe87c21b0..01c05337cca56eebcea558b878cd0c53e1e483b4 100644
--- a/tutorials/source_en/dataset/cache.md
+++ b/tutorials/source_en/dataset/cache.md
@@ -16,9 +16,9 @@ Currently, the cache service supports only single-node cache. That is, the

     ![cache on leaf pipeline](./images/cache_dataset.png)

-- Cache the data processed by argumentation.
+- Cache the data processed by augmentation.

-    You can also use the cache in the `map` operation. The data processed by argumentation (such as image cropping or resizing) is directly cached, avoiding repeated data argumentation operations and reducing unnecessary computations.
+    You can also use the cache in the `map` operation. The data processed by augmentation (such as image cropping or resizing) is directly cached, avoiding repeated data augmentation operations and reducing unnecessary computations.

     ![cache on map pipeline](./images/cache_processed_data.png)

@@ -144,13 +144,13 @@ Note:

 - The use of `size`:

-    - `size=0` indicates that the memory space used by the cache is not limited manually, but automically controlled by the cache server according to system's total memory resources, and cache server's memory usage would be limited to within 80% of the total system memory.
+    - `size=0` indicates that the memory space used by the cache is not limited manually but is automatically controlled by the cache server according to the system's total memory resources, and the cache server's memory usage is limited to within 80% of the total system memory.

     - Users can also manually set `size` to a proper value based on the idle memory of the machine. Note that before setting the `size` parameter, make sure to check the available memory of the system and the size of the dataset to be loaded. If the memory space occupied by the dataset-cache-server or the space of the dataset to be loaded exceeds the available memory of the system, it may cause problems such as machine downtime/restart, automatic shutdown of dataset-cache-server, and failure of training process execution.

 - The use of `spilling=True`:

-    - `spilling=True` indicates that the remaining data is written to disks when the memory space is insufficient. Therefore, ensure that you have the writing permission and the sufficient disk space on the configured disk path is to store the cache data that spills to the disk. Note that if no spilling path is set when cache server starts, setting `spilling=True` will raise an error when calling the API.
+    - `spilling=True` indicates that the remaining data is written to disk when the memory space is insufficient. Therefore, ensure that you have write permission and sufficient disk space on the configured disk path to store the cache data that spills to the disk. Note that if no spilling path is set when the cache server starts, setting `spilling=True` will raise an error when calling the API.

     - `spilling=False` indicates that no data is written once the configured memory space is used up on the cache server.

@@ -158,7 +158,7 @@ Note:

 ### 4. Insert a Cache Instance

-Currently, the cache service can be used to cache both original datasets and datasets processed by argumentation. The following examples show the processing of the two types of data separately.
+Currently, the cache service can be used to cache both original datasets and datasets processed by augmentation. The following examples show the processing of the two types of data separately.

 Note that both examples need to create a cache instance according to the method in step 3, and pass in the created `test_cache` as `cache` parameters in the dataset load or map operation.

@@ -234,7 +234,7 @@ Listing sessions for server on port 50052
      780643335   2044459912          4         n/a          3226          4
 ```

-#### Caching the Data Processed by Argumentation
+#### Caching the Data Processed by Augmentation

 Cache data after data enhancement processing `transforms`.
@@ -610,5 +610,5 @@ However, we may **not benefit from cache** in the following scenarios:

 - Currently, dataset classes such as `GeneratorDataset`, `PaddedDataset`, and `NumpySlicesDataset` do not support cache. `GeneratorDataset`, `PaddedDataset`, and `NumpySlicesDataset` belong to `GeneratorOp`, so their error message is displayed as "There is currently no support for GeneratorOp under cache."
 - Data processed by `batch`, `concat`, `filter`, `repeat`, `skip`, `split`, `take`, and `zip` does not support cache.
-- Data processed by random data argumentation operations (such as `RandomCrop`) does not support cache.
+- Data processed by random data augmentation operations (such as `RandomCrop`) does not support cache.
 - The same cache instance cannot be nested in different locations of the same pipeline.
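The cache.md passages above revolve around `mindspore.dataset.DatasetCache` and its `size` and `spilling` parameters. A minimal sketch of creating a cache instance and attaching it at the load stage follows; it assumes a cache server is already running (started with `dataset-cache --start`), the session id is a placeholder for one generated with `dataset-cache -g`, and the dataset path is likewise a placeholder:

```python
import mindspore.dataset as ds

# size=0 defers memory budgeting to the cache server (capped near 80% of
# system memory); spilling=False means nothing spills to disk when memory
# runs out.
test_cache = ds.DatasetCache(session_id=1456416665, size=0, spilling=False)

# Cache at the leaf of the pipeline: raw records are cached as they are
# loaded, so later epochs skip the disk reads entirely.
dataset = ds.ImageFolderDataset(dataset_dir="/path/to/image_folder_dataset",
                                num_samples=4, cache=test_cache)
```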
step:{}, data :{}\".format(ep_num, step_num, data))\n", + " print(\"epoch: {}, step:{}, data :{}\".format(ep_num, step_num, data))\n", " step_num += 1\n", " dataset.sync_update(condition_name=\"policy\",\n", " data={'ep_num': ep_num, 'step_num': step_num})"