From 528d746f07cc4937d17b99b27363cb0d69fe2a9c Mon Sep 17 00:00:00 2001 From: lvmingfu Date: Tue, 24 Nov 2020 17:31:44 +0800 Subject: [PATCH] modify notebook files in tutorials for master --- tutorials/notebook/README.md | 1 + ...spore_convert_dataset_to_mindrecord.ipynb} | 54 ++++-- ... mindspore_data_loading_enhancement.ipynb} | 55 +++++- ...indspore_debugging_in_pynative_mode.ipynb} | 116 ++++++++----- ..._evaluate_the_model_during_training.ipynb} | 81 ++++++--- ....ipynb => mindspore_mixed_precision.ipynb} | 160 ++++++++++-------- .../advanced_use/convert_dataset.md | 2 +- .../advanced_use/debug_in_pynative_mode.md | 2 +- .../advanced_use/enable_mixed_precision.md | 4 +- .../evaluate_the_model_during_training.md | 2 +- 10 files changed, 308 insertions(+), 169 deletions(-) rename tutorials/notebook/convert_dataset_to_mindrecord/{convert_dataset_to_mindrecord.ipynb => mindspore_convert_dataset_to_mindrecord.ipynb} (91%) rename tutorials/notebook/data_loading_enhance/{data_loading_enhancement.ipynb => mindspore_data_loading_enhancement.ipynb} (99%) rename tutorials/notebook/{debugging_in_pynative_mode.ipynb => mindspore_debugging_in_pynative_mode.ipynb} (99%) rename tutorials/notebook/{evaluate_the_model_during_training.ipynb => mindspore_evaluate_the_model_during_training.ipynb} (96%) rename tutorials/notebook/{mixed_precision.ipynb => mindspore_mixed_precision.ipynb} (97%) diff --git a/tutorials/notebook/README.md b/tutorials/notebook/README.md index 53ab5ca549..dff7cfcdd7 100644 --- a/tutorials/notebook/README.md +++ b/tutorials/notebook/README.md @@ -78,3 +78,4 @@ | 优化训练性能 | 混合精度 | [mixed_precision.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/mixed_precision.ipynb) | - 了解混合精度训练的原理
- 学习在MindSpore中使用混合精度训练
- 对比单精度训练和混合精度训练对模型训练的影响 | 模型安全和隐私 | 模型安全 | [model_security.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/model_security.ipynb) | - 了解AI算法的安全威胁的概念和影响<br/>
- 介绍MindArmour提供的模型安全防护手段
- 学习如何模拟攻击训练模型
- 学习针对被攻击模型进行对抗性防御 +> \ No newline at end of file diff --git a/tutorials/notebook/convert_dataset_to_mindrecord/convert_dataset_to_mindrecord.ipynb b/tutorials/notebook/convert_dataset_to_mindrecord/mindspore_convert_dataset_to_mindrecord.ipynb similarity index 91% rename from tutorials/notebook/convert_dataset_to_mindrecord/convert_dataset_to_mindrecord.ipynb rename to tutorials/notebook/convert_dataset_to_mindrecord/mindspore_convert_dataset_to_mindrecord.ipynb index 592b296d2a..2667dc5582 100644 --- a/tutorials/notebook/convert_dataset_to_mindrecord/convert_dataset_to_mindrecord.ipynb +++ b/tutorials/notebook/convert_dataset_to_mindrecord/mindspore_convert_dataset_to_mindrecord.ipynb @@ -96,7 +96,35 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "- 在jupyter工作目录下创建`./datasets/convert_dataset_to_mindrecord/datas_to_mindrecord`目录,本次体验将所有的转换数据集都放在该目录下。" + "下载需要处理的图片数据`tansform.jpg`。" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "!wget https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/convert_dataset_to_mindrecord/datasets/convert_dataset_to_mindrecord/images/transform.jpg" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "创建文件夹`./datasets/convert_dataset_to_mindrecord/datas_to_mindrecord/`用于存放本次体验中所有的转换数据集。 \n", + "创建文件夹`./datasets/convert_dataset_to_mindrecord/images/`用于存放下载下来的图片数据。" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir -p ./datasets/convert_dataset_to_mindrecord/datas_to_mindrecord/\n", + "!mkdir -p ./datasets/convert_dataset_to_mindrecord/images/\n", + "!mv -f ./transform.jpg ./datasets/convert_dataset_to_mindrecord/images/" ] }, { @@ -119,7 +147,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -147,7 +175,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 4, "metadata": {}, "outputs": [ { @@ -156,7 +184,7 @@ "0" ] }, - "execution_count": 2, + "execution_count": 4, "metadata": {}, "output_type": "execute_result" } @@ -175,7 +203,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -197,7 +225,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 6, "metadata": {}, "outputs": [ { @@ -206,7 +234,7 @@ "MSRStatus.SUCCESS" ] }, - "execution_count": 4, + "execution_count": 6, "metadata": {}, "output_type": "execute_result" } @@ -229,7 +257,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 7, "metadata": {}, "outputs": [ { @@ -238,7 +266,7 @@ "MSRStatus.SUCCESS" ] }, - "execution_count": 5, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } @@ -280,7 +308,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 8, "metadata": {}, "outputs": [ { @@ -289,7 +317,7 @@ "MSRStatus.SUCCESS" ] }, - "execution_count": 6, + "execution_count": 8, "metadata": {}, "output_type": "execute_result" } @@ -323,7 +351,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 9, "metadata": {}, "outputs": [], "source": [ @@ -339,7 +367,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 10, "metadata": {}, "outputs": [ { diff --git a/tutorials/notebook/data_loading_enhance/data_loading_enhancement.ipynb b/tutorials/notebook/data_loading_enhance/mindspore_data_loading_enhancement.ipynb similarity index 99% rename from 
tutorials/notebook/data_loading_enhance/data_loading_enhancement.ipynb rename to tutorials/notebook/data_loading_enhance/mindspore_data_loading_enhancement.ipynb index e7d144150e..3566917f6e 100644 --- a/tutorials/notebook/data_loading_enhance/data_loading_enhancement.ipynb +++ b/tutorials/notebook/data_loading_enhance/mindspore_data_loading_enhancement.ipynb @@ -437,9 +437,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "2. 使用一类图片当作数据,体验操作。在一个数据量比较大的图片数据集中,例如数据集名称叫`images`,它的存储方式是在`images`文件夹下,有不同子类别的文件夹,一个子类别文件夹中的图片属于同一类。所以我们本次体验所使用的图片放置方法,就需要创建`enhance_images`文件夹,接着在`enhance_images`下建一个名为`sample`的子类别文件夹,将图片放在`sample`文件夹中即可。如果有更多类别图片,可以在`enhance_images`下创建对应的子类别文件夹,将图片放入即可。\n", - "\n", - " 增强体验使用的数据位置在中,使用过程中可以在此路径下找到图片数据,并参照本次体验中图片放置的位置来新建文件夹。" + "2. 使用一类图片当作数据,体验操作。在一个数据量比较大的图片数据集中,例如数据集名称叫`images`,它的存储方式是在`images`文件夹下,有不同子类别的文件夹,一个子类别文件夹中的图片属于同一类。所以我们本次体验所使用的图片放置方法,就需要创建`enhance_images`文件夹,接着在`enhance_images`下建一个名为`sample`的子类别文件夹,将图片放在`sample`文件夹中即可。如果有更多类别图片,可以在`enhance_images`下创建对应的子类别文件夹,将图片放入即可。" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "增强体验使用的图片数据下载,执行如下命令:" ] }, { @@ -447,6 +452,40 @@ "execution_count": 11, "metadata": {}, "outputs": [], + "source": [ + "!wget https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/data_loading_enhance/enhance_images/sample/sample1.png\n", + "!wget https://gitee.com/mindspore/docs/raw/master/tutorials/notebook/data_loading_enhance/enhance_images/sample/sample2.png" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "生成图片数据的保存路径`./enhance_images/sample/`,并将下载下来的图片数据`sample1.png`和`sample2.png`移入其中。" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir -p ./enhance_images/sample/\n", + "!mv sample1.png sample2.png -t ./enhance_images/sample/" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "使用过程中可以在此路径下找到图片数据,并参照本次体验中图片放置的位置来新建文件夹。" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], "source": [ "DATA_DIR = \"./enhance_images\"\n", "ds1 = ds.ImageFolderDataset(DATA_DIR, decode=True)" @@ -461,7 +500,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 14, "metadata": {}, "outputs": [], "source": [ @@ -471,7 +510,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 15, "metadata": {}, "outputs": [], "source": [ @@ -488,7 +527,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 16, "metadata": {}, "outputs": [ { @@ -533,7 +572,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 17, "metadata": {}, "outputs": [], "source": [ @@ -550,7 +589,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 18, "metadata": {}, "outputs": [ { diff --git a/tutorials/notebook/debugging_in_pynative_mode.ipynb b/tutorials/notebook/mindspore_debugging_in_pynative_mode.ipynb similarity index 99% rename from tutorials/notebook/debugging_in_pynative_mode.ipynb rename to tutorials/notebook/mindspore_debugging_in_pynative_mode.ipynb index 2d25ba6b5f..0bdd0bf7f9 100644 --- a/tutorials/notebook/debugging_in_pynative_mode.ipynb +++ b/tutorials/notebook/mindspore_debugging_in_pynative_mode.ipynb @@ -55,21 +55,56 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "这里我们需要将MNIST数据集中随机取出一张图片,并增强成适合LeNet网络的数据格式(如何处理请参考[quick_start.ipynb](https://gitee.com/mindspore/docs/blob/master/tutorials/notebook/quick_start.ipynb)),训练数据集下载地址:{\"\", \"\"} 。\n", - "
数据集放在----Jupyter工作目录+\\MNIST_Data\\train\\,如下图结构:" + "下载并解压数据集数据。" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "!wget https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/datasets/MNIST_Data.zip\n", + "!unzip MNIST_Data.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "将解压后的数据集移动到指定位置。" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir -p ./datasets/\n", + "!mv -f ./MNIST_Data/ ./datasets/" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "执行完成上述命令后,数据存放形式应如下:" ] }, { "cell_type": "raw", "metadata": {}, "source": [ - "MNIST\n", - "├── test\n", - "│   ├── t10k-images-idx3-ubyte\n", - "│   └── t10k-labels-idx1-ubyte\n", - "└── train\n", - " ├── train-images-idx3-ubyte\n", - " └── train-labels-idx1-ubyte\n" + "datasets\n", + "│\n", + "└── MNIST_Data\n", + " ├── test\n", + " │   ├── t10k-images-idx3-ubyte\n", + " │   └── t10k-labels-idx1-ubyte\n", + " └── train\n", + " ├── train-images-idx3-ubyte\n", + " └── train-labels-idx1-ubyte \n" ] }, { @@ -88,7 +123,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -157,7 +192,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 4, "metadata": {}, "outputs": [ { @@ -219,7 +254,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -246,11 +281,25 @@ }, { "cell_type": "code", - "execution_count": 4, - "metadata": { - "scrolled": false - }, - "outputs": [], + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "LeNet5<\n", + " (conv1): Conv2d\n", + " (conv2): Conv2d\n", + " (fc1): Dense\n", + " (fc2): Dense\n", + " (fc3): Dense\n", + " (relu): ReLU<>\n", + " (max_pool2d): MaxPool2d\n", + " >\n" + ] + } + ], "source": [ "import mindspore.nn as nn\n", "import mindspore.ops as ops\n", @@ -318,31 +367,8 @@ " x = self.relu(x)\n", " x = self.fc3(x)\n", " self.switch -= 1\n", - " return x" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "LeNet5<\n", - " (conv1): Conv2d\n", - " (conv2): Conv2d\n", - " (fc1): Dense\n", - " (fc2): Dense\n", - " (fc3): Dense\n", - " (relu): ReLU<>\n", - " (max_pool2d): MaxPool2d\n", - " >\n" - ] - } - ], - "source": [ + " return x\n", + "\n", "print(LeNet5())" ] }, @@ -356,7 +382,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 7, "metadata": {}, "outputs": [], "source": [ @@ -388,7 +414,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 8, "metadata": { "scrolled": false }, @@ -523,7 +549,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 9, "metadata": { "scrolled": false }, diff --git a/tutorials/notebook/evaluate_the_model_during_training.ipynb b/tutorials/notebook/mindspore_evaluate_the_model_during_training.ipynb similarity index 96% rename from tutorials/notebook/evaluate_the_model_during_training.ipynb rename to tutorials/notebook/mindspore_evaluate_the_model_during_training.ipynb index 5d71a8ce78..e80be1c4f6 100644 --- a/tutorials/notebook/evaluate_the_model_during_training.ipynb +++ b/tutorials/notebook/mindspore_evaluate_the_model_during_training.ipynb @@ -51,25 +51,56 @@ "cell_type": 
"markdown", "metadata": {}, "source": [ - "训练数据集下载地址:{\"http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz \", \"http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz \"}。\n", - "\n", - "测试数据集:{\"\", \"\"}\n", - "
数据集放在----*Jupyter工作目录+\\datasets\\MNIST_Data\\*,如下图结构:" + "下载并解压数据集数据。" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "!wget https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/datasets/MNIST_Data.zip\n", + "!unzip MNIST_Data.zip" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "```\n", - "MNIST\n", - "├── test\n", - "│   ├── t10k-images-idx3-ubyte\n", - "│   └── t10k-labels-idx1-ubyte\n", - "└── train\n", - " ├── train-images-idx3-ubyte\n", - " └── train-labels-idx1-ubyte \n", - "```" + "将解压后的数据集移动到指定位置。" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir -p ./datasets/\n", + "!mv -f ./MNIST_Data/ ./datasets/" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "执行完成上述命令后,数据存放形式应如下:" + ] + }, + { + "cell_type": "raw", + "metadata": {}, + "source": [ + "datasets\n", + "│\n", + "└── MNIST_Data\n", + " ├── test\n", + " │   ├── t10k-images-idx3-ubyte\n", + " │   └── t10k-labels-idx1-ubyte\n", + " └── train\n", + " ├── train-images-idx3-ubyte\n", + " └── train-labels-idx1-ubyte " ] }, { @@ -88,7 +119,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -141,7 +172,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -182,17 +213,11 @@ "\n", " # use the preceding operators to construct networks\n", " def construct(self, x):\n", - " x = self.conv1(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", - " x = self.conv2(x)\n", - " x = self.relu(x)\n", - " x = self.max_pool2d(x)\n", + " x = self.max_pool2d(self.relu(self.conv1(x)))\n", + " x = self.max_pool2d(self.relu(self.conv2(x)))\n", " x = self.flatten(x)\n", - " x = self.fc1(x)\n", - " x = self.relu(x)\n", - " x = self.fc2(x)\n", - " x = self.relu(x)\n", + " x = self.relu(self.fc1(x))\n", + " x = self.relu(self.fc2(x))\n", " x = self.fc3(x)\n", " return x" ] @@ -226,7 +251,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -282,7 +307,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 6, "metadata": { "scrolled": true }, @@ -421,7 +446,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 6, "metadata": {}, "outputs": [ { diff --git a/tutorials/notebook/mixed_precision.ipynb b/tutorials/notebook/mindspore_mixed_precision.ipynb similarity index 97% rename from tutorials/notebook/mixed_precision.ipynb rename to tutorials/notebook/mindspore_mixed_precision.ipynb index 972ae34d83..b866ca7255 100644 --- a/tutorials/notebook/mixed_precision.ipynb +++ b/tutorials/notebook/mindspore_mixed_precision.ipynb @@ -56,7 +56,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/training/source_zh_cn/advanced_use/images/mix_precision.jpg)" + "![image](https://gitee.com/mindspore/docs/raw/master/tutorials/training/source_zh_cn/advanced_use/images/mix_precision.PNG)" ] }, { @@ -90,31 +90,58 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "数据集下载地址:。\n", - "\n", - "数据集下载后,解压至jupyter的工作路径+/datasets/cifar10,由于测试数据集和训练数据集在一个文件夹中,需要你分开两个文件夹存放,存放形式如下。" + "下载并解压数据集数据。" ] }, { - "cell_type": "raw", + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "!wget 
https://obs.dualstack.cn-north-4.myhuaweicloud.com/mindspore-website/notebook/datasets/cifar10.zip\n", + "!unzip cifar10.zip" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "将解压后的数据集移动到指定位置。" + ] + }, + { + "cell_type": "code", + "execution_count": 2, "metadata": {}, + "outputs": [], "source": [ - "cifar10\n", - "├── test\n", - "│   └── test_batch.bin\n", - "└── train\n", - " ├── data_batch_1.bin\n", - " ├── data_batch_2.bin\n", - " ├── data_batch_3.bin\n", - " ├── data_batch_4.bin\n", - " └── data_batch_5.bin\n" + "!mkdir -p ./datasets/\n", + "!mv -f ./cifar10/ ./datasets/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "如果放置正确,可以在开启的jupyter的首页网址+`/tree/datasets/cifar10`,找到`test`和`train`文件夹。" + "执行完成上述命令后,数据存放形式应如下:" + ] + }, + { + "cell_type": "raw", + "metadata": {}, + "source": [ + "datasets\n", + "│\n", + "└── cifar10\n", + " ├── test\n", + " │   └── test_batch.bin\n", + " └── train\n", + " ├── data_batch_1.bin\n", + " ├── data_batch_2.bin\n", + " ├── data_batch_3.bin\n", + " ├── data_batch_4.bin\n", + " └── data_batch_5.bin" ] }, { @@ -133,7 +160,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 3, "metadata": {}, "outputs": [ { @@ -187,60 +214,16 @@ "### 定义数据增强函数" ] }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [], - "source": [ - "import os\n", - "import mindspore.common.dtype as mstype\n", - "import mindspore.dataset.engine as de\n", - "import mindspore.dataset.vision.c_transforms as C\n", - "import mindspore.dataset.transforms.c_transforms as C2\n", - "\n", - "def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32, target=\"GPU\"):\n", - " \n", - " ds = de.Cifar10Dataset(dataset_path, num_parallel_workers=8, shuffle=True)\n", - " \n", - " # define map operations\n", - " trans = []\n", - " if do_train:\n", - " trans += [\n", - " C.RandomCrop((32, 32), (4, 4, 4, 4)),\n", - " C.RandomHorizontalFlip(prob=0.5)\n", - " ]\n", - "\n", - " trans += [\n", - " C.Resize((224, 224)),\n", - " C.Rescale(1.0 / 255.0, 0.0),\n", - " C.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),\n", - " C.HWC2CHW()\n", - " ]\n", - "\n", - " type_cast_op = C2.TypeCast(mstype.int32)\n", - "\n", - " ds = ds.map(operations=type_cast_op, input_columns=\"label\", num_parallel_workers=8)\n", - " ds = ds.map(operations=trans, input_columns=\"image\", num_parallel_workers=8)\n", - "\n", - " # apply batch operations\n", - " ds = ds.batch(batch_size, drop_remainder=True)\n", - " # apply dataset repeat operation\n", - " ds = ds.repeat(repeat_num)\n", - "\n", - " return ds" - ] - }, { "cell_type": "markdown", "metadata": {}, "source": [ - "定义完成数据集增强函数后,我们来看一下,数据集增强后的效果是如何的:" + "定义数据集增强函数,并使用该函数对原始数据集进行增强操作,取出其中一个batch中的一张图片数据进行可视化,查看数据集增强后的效果是如何的。" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 4, "metadata": {}, "outputs": [ { @@ -278,6 +261,43 @@ } ], "source": [ + "import os\n", + "import mindspore.common.dtype as mstype\n", + "import mindspore.dataset.engine as de\n", + "import mindspore.dataset.vision.c_transforms as C\n", + "import mindspore.dataset.transforms.c_transforms as C2\n", + "\n", + "def create_dataset(dataset_path, do_train, repeat_num=1, batch_size=32, target=\"GPU\"):\n", + " \n", + " ds = de.Cifar10Dataset(dataset_path, num_parallel_workers=8, shuffle=True)\n", + " \n", + " # define map operations\n", + " trans = []\n", + " if do_train:\n", + " trans += [\n", + " C.RandomCrop((32, 32), (4, 4, 4, 4)),\n", + " 
C.RandomHorizontalFlip(prob=0.5)\n", + " ]\n", + "\n", + " trans += [\n", + " C.Resize((224, 224)),\n", + " C.Rescale(1.0 / 255.0, 0.0),\n", + " C.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010]),\n", + " C.HWC2CHW()\n", + " ]\n", + "\n", + " type_cast_op = C2.TypeCast(mstype.int32)\n", + "\n", + " ds = ds.map(operations=type_cast_op, input_columns=\"label\", num_parallel_workers=8)\n", + " ds = ds.map(operations=trans, input_columns=\"image\", num_parallel_workers=8)\n", + "\n", + " # apply batch operations\n", + " ds = ds.batch(batch_size, drop_remainder=True)\n", + " # apply dataset repeat operation\n", + " ds = ds.repeat(repeat_num)\n", + "\n", + " return ds\n", + "\n", "ds = create_dataset(train_path, do_train=True, repeat_num=1, batch_size=32, target=\"GPU\")\n", "print(\"the cifar dataset size is:\", ds.get_dataset_size())\n", "dict1 = ds.create_dict_iterator()\n", @@ -293,7 +313,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "cifar10通过数据增强后的,变成了一共有1562个batch,张量为(32,3,224,224)的数据集。" + "cifar10通过数据增强后的,变成了一共有1562个batch,图片张量为(32,3,224,224)的数据集。" ] }, { @@ -312,7 +332,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 5, "metadata": {}, "outputs": [], "source": [ @@ -399,7 +419,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 6, "metadata": {}, "outputs": [], "source": [ @@ -440,7 +460,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 7, "metadata": {}, "outputs": [], "source": [ @@ -651,7 +671,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 8, "metadata": {}, "outputs": [], "source": [ @@ -702,7 +722,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 9, "metadata": { "scrolled": true }, @@ -897,7 +917,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 10, "metadata": {}, "outputs": [ { @@ -990,4 +1010,4 @@ }, "nbformat": 4, "nbformat_minor": 4 -} \ No newline at end of file +} diff --git a/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md b/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md index 21c0729e1c..7510892bac 100644 --- a/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md +++ b/tutorials/training/source_zh_cn/advanced_use/convert_dataset.md @@ -13,7 +13,7 @@    - + ## 概述 diff --git a/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md b/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md index 0784052004..ea7e46aa27 100644 --- a/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md +++ b/tutorials/training/source_zh_cn/advanced_use/debug_in_pynative_mode.md @@ -15,7 +15,7 @@    - + ## 概述 diff --git a/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md b/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md index 5df7ce1f91..146f48b881 100644 --- a/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md +++ b/tutorials/training/source_zh_cn/advanced_use/enable_mixed_precision.md @@ -14,7 +14,7 @@    - + ## 概述 @@ -229,4 +229,4 @@ output = train_network(predict, label) ## 约束 -使用混合精度时,只能由自动微分功能生成反向网络,不能由用户自定义生成反向网络,否则可能会导致MindSpore产生数据格式不匹配的异常信息。 \ No newline at end of file +使用混合精度时,只能由自动微分功能生成反向网络,不能由用户自定义生成反向网络,否则可能会导致MindSpore产生数据格式不匹配的异常信息。 diff --git a/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md b/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md index 4a3fcd4b59..8d15166811 100644 
--- a/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md +++ b/tutorials/training/source_zh_cn/advanced_use/evaluate_the_model_during_training.md @@ -15,7 +15,7 @@    - + ## 概述 -- Gitee
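补充说明:与上文重命名后的 `mindspore_convert_dataset_to_mindrecord.ipynb` 中下载 `transform.jpg`、创建目录并写入 MindRecord 的流程相对应,下面给出一个最小示意代码(仅为示意性写法:假设 MindSpore 已安装、notebook 中 `mkdir` 单元创建的目录已存在;其中的 schema 字段名、`label` 取值和输出文件名均为示例假设,并非 notebook 原文)。

```python
# 示意代码:将一张已下载的图片和一个标签写入 MindRecord 文件
# (字段名、label 取值、输出文件名均为示例假设)
from mindspore.mindrecord import FileWriter

mindrecord_file = "./datasets/convert_dataset_to_mindrecord/datas_to_mindrecord/demo.mindrecord"

# define the schema: file name, integer label and raw image bytes
schema = {"file_name": {"type": "string"},
          "label": {"type": "int32"},
          "data": {"type": "bytes"}}

writer = FileWriter(file_name=mindrecord_file, shard_num=1)
writer.add_schema(schema, "demo schema")

# read the downloaded image as raw bytes
with open("./datasets/convert_dataset_to_mindrecord/images/transform.jpg", "rb") as f:
    img_bytes = f.read()

data = [{"file_name": "transform.jpg", "label": 0, "data": img_bytes}]
writer.write_raw_data(data)   # returns MSRStatus.SUCCESS on success
writer.commit()               # flush data and generate the index file
```

写入成功时 `add_schema`、`write_raw_data` 与 `commit` 均返回 `MSRStatus.SUCCESS`,与 notebook 中对应单元的输出一致。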