From 23687fc7aa44ffe89f1f66539849566fc25ba681 Mon Sep 17 00:00:00 2001 From: harry-zzh Date: Fri, 17 Jun 2022 09:04:31 +0000 Subject: [PATCH 1/4] =?UTF-8?q?=E6=96=B0=E5=BB=BA=20MUNIT=5FID0953=5Ffor?= =?UTF-8?q?=5FTensorFlow?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/.keep | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/.keep diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/.keep b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/.keep new file mode 100644 index 000000000..e69de29bb -- Gitee From c276bf7dde09683b98acfb57acdf7c4a386787ad Mon Sep 17 00:00:00 2001 From: harry-zzh Date: Fri, 17 Jun 2022 09:07:45 +0000 Subject: [PATCH 2/4] add files --- .../cv/MUNIT_ID0953_for_TensorFlow/LICENSE | 284 +++++++ .../cv/MUNIT_ID0953_for_TensorFlow/MUNIT.py | 698 ++++++++++++++++++ .../cv/MUNIT_ID0953_for_TensorFlow/README.md | 188 +++++ .../cv/MUNIT_ID0953_for_TensorFlow/main.py | 213 ++++++ .../modelarts_entry_acc.py | 63 ++ .../modelarts_entry_perf.py | 63 ++ .../modelzoo_level.txt | 3 + .../cv/MUNIT_ID0953_for_TensorFlow/ops.py | 244 ++++++ .../cv/MUNIT_ID0953_for_TensorFlow/utils.py | 146 ++++ 9 files changed, 1902 insertions(+) create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/LICENSE create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/MUNIT.py create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/README.md create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/main.py create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_acc.py create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_perf.py create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelzoo_level.txt create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/ops.py create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/utils.py diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/LICENSE b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/LICENSE new file mode 100644 index 000000000..5ea8a5f7b --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/LICENSE @@ -0,0 +1,284 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + +------------------ +Files: third_party/compute_library/... + +MIT License + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. + +------------------ +Files: ACKNOWLEDGEMENTS +LICENSE + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this + list of conditions and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND + ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE + DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR + ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES + (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; + LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND + ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS + SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +------------------ +Files: third_party/hexagon + +Copyright (c) 2016-2019, The Linux Foundation. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted (subject to the limitations in the +disclaimer below) provided that the following conditions are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + * Redistributions in binary form must reproduce the above + copyright notice, this list of conditions and the following + disclaimer in the documentation and/or other materials provided + with the distribution. + + * Neither the name of The Linux Foundation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + +NO EXPRESS OR IMPLIED LICENSES TO ANY PARTY'S PATENT RIGHTS ARE +GRANTED BY THIS LICENSE. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT +HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. +IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR +ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE +GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS +INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER +IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR +OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN +IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
\ No newline at end of file diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/MUNIT.py b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/MUNIT.py new file mode 100644 index 000000000..3564f9a79 --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/MUNIT.py @@ -0,0 +1,698 @@ +# +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +from npu_bridge.npu_init import * +from ops import * +from utils import * +from glob import glob +import time +from tensorflow.contrib.data import batch_and_drop_remainder + +class MUNIT(object) : + def __init__(self, sess, args): + self.model_name = 'MUNIT' + self.sess = sess + self.checkpoint_dir = args.checkpoint_dir + self.result_dir = args.result_dir + self.log_dir = args.log_dir + self.sample_dir = args.sample_dir + #self.dataset_name = args.dataset + self.dataset_name = args.data_path + self.augment_flag = args.augment_flag + + self.epoch = args.epoch + self.iteration = args.iteration + + self.gan_type = args.gan_type + + self.batch_size = args.batch_size + self.print_freq = args.print_freq + self.save_freq = args.save_freq + self.num_style = args.num_style # for test + self.guide_img = args.guide_img + self.direction = args.direction + + self.img_h = args.img_h + self.img_w = args.img_w + self.img_ch = args.img_ch + + self.init_lr = args.lr + self.ch = args.ch + + """ Weight """ + self.gan_w = args.gan_w + self.recon_x_w = args.recon_x_w + self.recon_s_w = args.recon_s_w + self.recon_c_w = args.recon_c_w + self.recon_x_cyc_w = args.recon_x_cyc_w + + """ Generator """ + self.n_res = args.n_res + self.mlp_dim = pow(2, args.n_sample) * args.ch # default : 256 + + self.n_downsample = args.n_sample + self.n_upsample = args.n_sample + self.style_dim = args.style_dim + + """ Discriminator """ + self.n_dis = args.n_dis + self.n_scale = args.n_scale + + self.sample_dir = os.path.join(args.sample_dir, self.model_dir) + check_folder(self.sample_dir) + + #self.trainA_dataset = glob('./dataset/{}/*.*'.format(self.dataset_name + '/trainA')) + #self.trainB_dataset = glob('./dataset/{}/*.*'.format(self.dataset_name + '/trainB')) + self.trainA_dataset = glob('{}/*.*'.format(self.dataset_name + '/trainA')) + self.trainB_dataset = glob('{}/*.*'.format(self.dataset_name + '/trainB')) + self.dataset_num = 
max(len(self.trainA_dataset), len(self.trainB_dataset)) + + # lossScale相关 + self.bert_loss_scale = args.bert_loss_scale + self.mmgr = {} + + print("##### Information #####") + print("# gan type : ", self.gan_type) + print("# dataset : ", self.dataset_name) + print("# max dataset number : ", self.dataset_num) + print("# batch_size : ", self.batch_size) + print("# epoch : ", self.epoch) + print("# iteration per epoch : ", self.iteration) + print("# style in test phase : ", self.num_style) + + print() + + print("##### Generator #####") + print("# residual blocks : ", self.n_res) + print("# Style dimension : ", self.style_dim) + print("# MLP dimension : ", self.mlp_dim) + print("# Down sample : ", self.n_downsample) + print("# Up sample : ", self.n_upsample) + + print() + + print("##### Discriminator #####") + print("# Discriminator layer : ", self.n_dis) + print("# Multi-scale Dis : ", self.n_scale) + + ################################################################################## + # Encoder and Decoders + ################################################################################## + + def Style_Encoder(self, x, reuse=False, scope='style_encoder'): + # IN removes the original feature mean and variance that represent important style information + channel = self.ch + with tf.variable_scope(scope, reuse=reuse) : + x = conv(x, channel, kernel=7, stride=1, pad=3, pad_type='reflect', scope='conv_0') + x = relu(x) + + for i in range(2) : + x = conv(x, channel*2, kernel=4, stride=2, pad=1, pad_type='reflect', scope='conv_'+str(i+1)) + x = relu(x) + + channel = channel * 2 + + for i in range(2) : + x = conv(x, channel, kernel=4, stride=2, pad=1, pad_type='reflect', scope='down_conv_'+str(i)) + x = relu(x) + + x = adaptive_avg_pooling(x) # global average pooling + x = conv(x, self.style_dim, kernel=1, stride=1, scope='SE_logit') + + return x + + def Content_Encoder(self, x, reuse=False, scope='content_encoder'): + channel = self.ch + with tf.variable_scope(scope, reuse=reuse) : + x = conv(x, channel, kernel=7, stride=1, pad=3, pad_type='reflect', scope='conv_0') + x = instance_norm(x, scope='ins_0') + x = relu(x) + + for i in range(self.n_downsample) : + x = conv(x, channel*2, kernel=4, stride=2, pad=1, pad_type='reflect', scope='conv_'+str(i+1)) + x = instance_norm(x, scope='ins_'+str(i+1)) + x = relu(x) + + channel = channel * 2 + + for i in range(self.n_res) : + x = resblock(x, channel, scope='resblock_'+str(i)) + + return x + + def generator(self, contents, style, reuse=False, scope="decoder"): + channel = self.mlp_dim + with tf.variable_scope(scope, reuse=reuse) : + mu, var = self.MLP(style) + x = contents + + for i in range(self.n_res) : + idx = 2 * i + x = adaptive_resblock(x, channel, mu[idx], var[idx], mu[idx + 1], var[idx + 1], scope='adaptive_resblock_'+str(i)) + + for i in range(self.n_upsample) : + # # IN removes the original feature mean and variance that represent important style information + x = up_sample(x, scale_factor=2) + x = conv(x, channel//2, kernel=5, stride=1, pad=2, pad_type='reflect', scope='conv_'+str(i)) + x = layer_norm(x, scope='layer_norm_'+str(i)) + x = relu(x) + + channel = channel // 2 + + x = conv(x, channels=self.img_ch, kernel=7, stride=1, pad=3, pad_type='reflect', scope='G_logit') + x = tanh(x) + + return x + + def MLP(self, style, scope='MLP'): + channel = self.mlp_dim + with tf.variable_scope(scope) : + x = style + + for i in range(2): + x = fully_connected(x, channel, scope='FC_' + str(i)) + x = relu(x) + + mu_list = [] + var_list = [] + + for i in 
range(self.n_res * 2): + mu = fully_connected(x, channel, scope='FC_mu_' + str(i)) + var = fully_connected(x, channel, scope='FC_var_' + str(i)) + + mu = tf.reshape(mu, shape=[-1, 1, 1, channel]) + var = tf.reshape(var, shape=[-1, 1, 1, channel]) + + mu_list.append(mu) + var_list.append(var) + + return mu_list, var_list + + ################################################################################## + # Discriminator + ################################################################################## + + def discriminator(self, x_init, reuse=False, scope="discriminator"): + D_logit = [] + with tf.variable_scope(scope, reuse=reuse) : + for scale in range(self.n_scale) : + channel = self.ch + x = conv(x_init, channel, kernel=4, stride=2, pad=1, pad_type='reflect', scope='ms_' + str(scale) + 'conv_0') + x = lrelu(x, 0.2) + + for i in range(1, self.n_dis): + x = conv(x, channel * 2, kernel=4, stride=2, pad=1, pad_type='reflect', scope='ms_' + str(scale) +'conv_' + str(i)) + x = lrelu(x, 0.2) + + channel = channel * 2 + + x = conv(x, channels=1, kernel=1, stride=1, scope='ms_' + str(scale) + 'D_logit') + D_logit.append(x) + + x_init = down_sample(x_init) + + return D_logit + + ################################################################################## + # Model + ################################################################################## + + def Encoder_A(self, x_A, reuse=False): + style_A = self.Style_Encoder(x_A, reuse=reuse, scope='style_encoder_A') + content_A = self.Content_Encoder(x_A, reuse=reuse, scope='content_encoder_A') + + return content_A, style_A + + def Encoder_B(self, x_B, reuse=False): + style_B = self.Style_Encoder(x_B, reuse=reuse, scope='style_encoder_B') + content_B = self.Content_Encoder(x_B, reuse=reuse, scope='content_encoder_B') + + return content_B, style_B + + def Decoder_A(self, content_B, style_A, reuse=False): + x_ba = self.generator(contents=content_B, style=style_A, reuse=reuse, scope='decoder_A') + + return x_ba + + def Decoder_B(self, content_A, style_B, reuse=False): + x_ab = self.generator(contents=content_A, style=style_B, reuse=reuse, scope='decoder_B') + + return x_ab + + def discriminate_real(self, x_A, x_B): + real_A_logit = self.discriminator(x_A, scope="discriminator_A") + real_B_logit = self.discriminator(x_B, scope="discriminator_B") + + return real_A_logit, real_B_logit + + def discriminate_fake(self, x_ba, x_ab): + fake_A_logit = self.discriminator(x_ba, reuse=True, scope="discriminator_A") + fake_B_logit = self.discriminator(x_ab, reuse=True, scope="discriminator_B") + + return fake_A_logit, fake_B_logit + + def build_model(self): + self.lr = tf.placeholder(tf.float32, name='learning_rate') + + """ Input Image""" + Image_Data_Class = ImageData(self.img_h, self.img_w, self.img_ch, self.augment_flag) + + trainA = tf.data.Dataset.from_tensor_slices(self.trainA_dataset) + trainB = tf.data.Dataset.from_tensor_slices(self.trainB_dataset) + + trainA = trainA.prefetch(self.batch_size).shuffle(self.dataset_num).map(Image_Data_Class.image_processing, num_parallel_calls=8).apply(batch_and_drop_remainder(self.batch_size)).repeat() + trainB = trainB.prefetch(self.batch_size).shuffle(self.dataset_num).map(Image_Data_Class.image_processing, num_parallel_calls=8).apply(batch_and_drop_remainder(self.batch_size)).repeat() + + trainA_iterator = trainA.make_one_shot_iterator() + trainB_iterator = trainB.make_one_shot_iterator() + + + self.domain_A = trainA_iterator.get_next() + self.domain_B = trainB_iterator.get_next() + + + """ Define Encoder, 
Generator, Discriminator """ + self.style_a = tf.placeholder(tf.float32, shape=[self.batch_size, 1, 1, self.style_dim], name='style_a') + self.style_b = tf.placeholder(tf.float32, shape=[self.batch_size, 1, 1, self.style_dim], name='style_b') + + # encode + content_a, style_a_prime = self.Encoder_A(self.domain_A) + content_b, style_b_prime = self.Encoder_B(self.domain_B) + + # decode (within domain) + x_aa = self.Decoder_A(content_B=content_a, style_A=style_a_prime) + x_bb = self.Decoder_B(content_A=content_b, style_B=style_b_prime) + + # decode (cross domain) + x_ba = self.Decoder_A(content_B=content_b, style_A=self.style_a, reuse=True) + x_ab = self.Decoder_B(content_A=content_a, style_B=self.style_b, reuse=True) + + # encode again + content_b_, style_a_ = self.Encoder_A(x_ba, reuse=True) + content_a_, style_b_ = self.Encoder_B(x_ab, reuse=True) + + # decode again (if needed) + if self.recon_x_cyc_w > 0 : + x_aba = self.Decoder_A(content_B=content_a_, style_A=style_a_prime, reuse=True) + x_bab = self.Decoder_B(content_A=content_b_, style_B=style_b_prime, reuse=True) + + cyc_recon_A = L1_loss(x_aba, self.domain_A) + cyc_recon_B = L1_loss(x_bab, self.domain_B) + + else : + cyc_recon_A = 0.0 + cyc_recon_B = 0.0 + + real_A_logit, real_B_logit = self.discriminate_real(self.domain_A, self.domain_B) + fake_A_logit, fake_B_logit = self.discriminate_fake(x_ba, x_ab) + + """ Define Loss """ + G_ad_loss_a = generator_loss(self.gan_type, fake_A_logit) + G_ad_loss_b = generator_loss(self.gan_type, fake_B_logit) + + D_ad_loss_a = discriminator_loss(self.gan_type, real_A_logit, fake_A_logit) + D_ad_loss_b = discriminator_loss(self.gan_type, real_B_logit, fake_B_logit) + + recon_A = L1_loss(x_aa, self.domain_A) # reconstruction + recon_B = L1_loss(x_bb, self.domain_B) # reconstruction + + # The style reconstruction loss encourages + # diverse outputs given different style codes + recon_style_A = L1_loss(style_a_, self.style_a) + recon_style_B = L1_loss(style_b_, self.style_b) + + # The content reconstruction loss encourages + # the translated image to preserve semantic content of the input image + recon_content_A = L1_loss(content_a_, content_a) + recon_content_B = L1_loss(content_b_, content_b) + + + Generator_A_loss = self.gan_w * G_ad_loss_a + \ + self.recon_x_w * recon_A + \ + self.recon_s_w * recon_style_A + \ + self.recon_c_w * recon_content_A + \ + self.recon_x_cyc_w * cyc_recon_A + + Generator_B_loss = self.gan_w * G_ad_loss_b + \ + self.recon_x_w * recon_B + \ + self.recon_s_w * recon_style_B + \ + self.recon_c_w * recon_content_B + \ + self.recon_x_cyc_w * cyc_recon_B + + Discriminator_A_loss = self.gan_w * D_ad_loss_a + Discriminator_B_loss = self.gan_w * D_ad_loss_b + + self.Generator_loss = Generator_A_loss + Generator_B_loss + regularization_loss('encoder') + regularization_loss('decoder') + self.Discriminator_loss = Discriminator_A_loss + Discriminator_B_loss + regularization_loss('discriminator') + + """ Training """ + t_vars = tf.trainable_variables() + G_vars = [var for var in t_vars if 'decoder' in var.name or 'encoder' in var.name] + D_vars = [var for var in t_vars if 'discriminator' in var.name] + + + # self.G_optim = tf.train.AdamOptimizer(self.lr, beta1=0.5, beta2=0.999).minimize(self.Generator_loss, var_list=G_vars) + # self.D_optim = tf.train.AdamOptimizer(self.lr, beta1=0.5, beta2=0.999).minimize(self.Discriminator_loss, var_list=D_vars) + + # 开启loss_scale + self.G_optim = tf.train.AdamOptimizer(self.lr, beta1=0.5, beta2=0.999) + self.D_optim = tf.train.AdamOptimizer(self.lr, 
beta1=0.5, beta2=0.999) + # 开启loss_scale + self.G_optim = self.open_loss_scale(self.G_optim, 'G') + self.D_optim = self.open_loss_scale(self.D_optim, 'D') + + self.G_optim = self.G_optim.minimize(self.Generator_loss, var_list=G_vars) + self.D_optim = self.D_optim.minimize(self.Discriminator_loss, var_list=D_vars) + + + + """" Summary """ + self.all_G_loss = tf.summary.scalar("Generator_loss", self.Generator_loss) + self.all_D_loss = tf.summary.scalar("Discriminator_loss", self.Discriminator_loss) + self.G_A_loss = tf.summary.scalar("G_A_loss", Generator_A_loss) + self.G_B_loss = tf.summary.scalar("G_B_loss", Generator_B_loss) + self.D_A_loss = tf.summary.scalar("D_A_loss", Discriminator_A_loss) + self.D_B_loss = tf.summary.scalar("D_B_loss", Discriminator_B_loss) + + self.G_loss = tf.summary.merge([self.G_A_loss, self.G_B_loss, self.all_G_loss]) + self.D_loss = tf.summary.merge([self.D_A_loss, self.D_B_loss, self.all_D_loss]) + + """ Image """ + self.fake_A = x_ba + self.fake_B = x_ab + + self.real_A = self.domain_A + self.real_B = self.domain_B + + """ Test """ + self.test_image = tf.placeholder(tf.float32, [1, self.img_h, self.img_w, self.img_ch], name='test_image') + self.test_style = tf.placeholder(tf.float32, [1, 1, 1, self.style_dim], name='test_style') + + test_content_a, _ = self.Encoder_A(self.test_image, reuse=True) + test_content_b, _ = self.Encoder_B(self.test_image, reuse=True) + + self.test_fake_A = self.Decoder_A(content_B=test_content_b, style_A=self.test_style, reuse=True) + self.test_fake_B = self.Decoder_B(content_A=test_content_a, style_B=self.test_style, reuse=True) + + """ Guided Image Translation """ + self.content_image = tf.placeholder(tf.float32, [1, self.img_h, self.img_w, self.img_ch], name='content_image') + self.style_image = tf.placeholder(tf.float32, [1, self.img_h, self.img_w, self.img_ch], name='guide_style_image') + + if self.direction == 'a2b' : + guide_content_A, guide_style_A = self.Encoder_A(self.content_image, reuse=True) + guide_content_B, guide_style_B = self.Encoder_B(self.style_image, reuse=True) + + else : + guide_content_B, guide_style_B = self.Encoder_B(self.content_image, reuse=True) + guide_content_A, guide_style_A = self.Encoder_A(self.style_image, reuse=True) + + self.guide_fake_A = self.Decoder_A(content_B=guide_content_B, style_A=guide_style_A, reuse=True) + self.guide_fake_B = self.Decoder_B(content_A=guide_content_A, style_B=guide_style_B, reuse=True) + + def train(self): + # initialize all variables + tf.global_variables_initializer().run() + + # saver to save model + self.saver = tf.train.Saver() + + # summary writer + self.writer = tf.summary.FileWriter(self.log_dir + '/' + self.model_dir, self.sess.graph) + + # restore check-point if it exits + could_load, checkpoint_counter = self.load(self.checkpoint_dir) + if could_load: + start_epoch = (int)(checkpoint_counter / self.iteration) + start_batch_id = checkpoint_counter - start_epoch * self.iteration + counter = checkpoint_counter + print(" [*] Load SUCCESS") + else: + start_epoch = 0 + start_batch_id = 0 + counter = 1 + print(" [!] 
Load failed...") + + # loop for epoch + start_time = time.time() + for epoch in range(start_epoch, self.epoch): + + lr = self.init_lr * pow(0.5, epoch) + + for idx in range(start_batch_id, self.iteration): + style_a = np.random.normal(loc=0.0, scale=1.0, size=[self.batch_size, 1, 1, self.style_dim]) + style_b = np.random.normal(loc=0.0, scale=1.0, size=[self.batch_size, 1, 1, self.style_dim]) + + train_feed_dict = { + self.style_a : style_a, + self.style_b : style_b, + self.lr : lr + } + + # Update D + # _, d_loss, summary_str = \ + # self.sess.run([self.D_optim, self.Discriminator_loss, self.D_loss], feed_dict = train_feed_dict) + + # 保存scale value,并打印到日志里,来观察整网的溢出情况 + _, d_loss, summary_str, d_scale = \ + self.sess.run([self.D_optim, self.Discriminator_loss, self.D_loss, self.mmgr['D'].get_loss_scale()], + feed_dict=train_feed_dict) + + self.writer.add_summary(summary_str, counter) + + # Update G + #batch_A_images, batch_B_images, fake_A, fake_B, _, g_loss, summary_str = self.sess.run([self.real_A, self.real_B, self.fake_A, self.fake_B, self.G_optim, self.Generator_loss, self.G_loss], feed_dict = train_feed_dict) + batch_A_images, batch_B_images, fake_A, fake_B, _, g_loss, summary_str, g_scale = self.sess.run( + [self.real_A, self.real_B, self.fake_A, self.fake_B, self.G_optim, self.Generator_loss, + self.G_loss, self.mmgr['G'].get_loss_scale()], feed_dict=train_feed_dict) + self.writer.add_summary(summary_str, counter) + + # display training status + counter += 1 + print("Epoch: [%2d] [%6d/%6d] time: %4.4f d_loss: %.8f, g_loss: %.8f, d_scale: %d, g_scale: %d\n" \ + % (epoch, idx, self.iteration, time.time() - start_time, d_loss, g_loss, d_scale, g_scale)) + # print("Epoch: [%2d] [%6d/%6d] time: %4.4f d_loss: %.8f, g_loss: %.8f\n" \ + # % (epoch, idx, self.iteration, time.time() - start_time, d_loss, g_loss)) + + if np.mod(idx+1, self.print_freq) == 0 : + save_images(batch_A_images, [self.batch_size, 1], + '{}/real_A_{:02d}_{:06d}.jpg'.format(self.sample_dir, epoch, idx+1)) + # save_images(batch_B_images, [self.batch_size, 1], + # './{}/real_B_{}_{:02d}_{:06d}.jpg'.format(self.sample_dir, gpu_id, epoch, idx+1)) + + # save_images(fake_A, [self.batch_size, 1], + # './{}/fake_A_{}_{:02d}_{:06d}.jpg'.format(self.sample_dir, gpu_id, epoch, idx+1)) + save_images(fake_B, [self.batch_size, 1], + '{}/fake_B_{:02d}_{:06d}.jpg'.format(self.sample_dir, epoch, idx+1)) + + if np.mod(idx+1, self.save_freq) == 0 : + self.save(self.checkpoint_dir, counter) + + # After an epoch, start_batch_id is set to zero + # non-zero value is only for the first epoch after loading pre-trained model + start_batch_id = 0 + + # save model for final step + self.save(self.checkpoint_dir, counter) + + + @property + def model_dir(self): + return "{}_{}".format(self.model_name, self.gan_type) + #return "{}_{}_{}".format(self.model_name, self.dataset_name, self.gan_type) + + def save(self, checkpoint_dir, step): + checkpoint_dir = os.path.join(checkpoint_dir, self.model_dir) + + if not os.path.exists(checkpoint_dir): + os.makedirs(checkpoint_dir) + + self.saver.save(self.sess, os.path.join(checkpoint_dir, self.model_name + '.model'), global_step=step) + + def load(self, checkpoint_dir): + import re + print(" [*] Reading checkpoints...") + checkpoint_dir = os.path.join(checkpoint_dir, self.model_dir) + + ckpt = tf.train.get_checkpoint_state(checkpoint_dir) + if ckpt and ckpt.model_checkpoint_path: + ckpt_name = os.path.basename(ckpt.model_checkpoint_path) + self.saver.restore(self.sess, os.path.join(checkpoint_dir, ckpt_name)) 
+            counter = int(next(re.finditer("(\d+)(?!.*\d)", ckpt_name)).group(0))
+            print(" [*] Success to read {}".format(ckpt_name))
+            return True, counter
+        else:
+            print(" [*] Failed to find a checkpoint")
+            return False, 0
+
+    def test(self):
+        tf.global_variables_initializer().run()
+
+        #test_A_files = glob('{}/*.*'.format(self.dataset_name + '/testA'))
+        #test_B_files = glob('{}/*.*'.format(self.dataset_name + '/testB'))
+        test_A_files = self.trainA_dataset
+        test_B_files = self.trainB_dataset
+
+        self.saver = tf.train.Saver()
+        could_load, checkpoint_counter = self.load(self.checkpoint_dir)
+        self.result_dir = os.path.join(self.result_dir, self.model_dir)
+        check_folder(self.result_dir)
+
+        if could_load :
+            print(" [*] Load SUCCESS")
+        else :
+            print(" [!] Load failed...")
+
+        # write html for visual comparison
+        index_path = os.path.join(self.result_dir, 'index.html')
+        index = open(index_path, 'w')
+        index.write("<html><body><table><tr>")
+        index.write("<th>name</th><th>input</th><th>output</th></tr>")
+
+        cnt = 0
+        for sample_file in test_A_files : # A -> B
+            print('Processing A image: ' + sample_file)
+            sample_image = np.asarray(load_test_data(sample_file, size_h=self.img_h, size_w=self.img_w))
+            file_name = os.path.basename(sample_file).split(".")[0]
+            file_extension = os.path.basename(sample_file).split(".")[1]
+
+            for i in range(self.num_style) :
+                test_style = np.random.normal(loc=0.0, scale=1.0, size=[1, 1, 1, self.style_dim])
+                image_path = os.path.join(self.result_dir, '{}_style{}.{}'.format(file_name, i, file_extension))
+
+                fake_img = self.sess.run(self.test_fake_B, feed_dict = {self.test_image : sample_image, self.test_style : test_style})
+                save_images(fake_img, [1, 1], image_path)
+
+                index.write("<td>%s</td>" % os.path.basename(image_path))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (sample_file if os.path.isabs(sample_file) else (
+                    '../..' + os.path.sep + sample_file), self.img_w, self.img_h))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (image_path if os.path.isabs(image_path) else (
+                    '../..' + os.path.sep + image_path), self.img_w, self.img_h))
+                index.write("</tr>")
+                cnt += 1
+                if cnt == 1: break
+
+        cnt = 0
+        for sample_file in test_B_files : # B -> A
+            print('Processing B image: ' + sample_file)
+            sample_image = np.asarray(load_test_data(sample_file, size_h=self.img_h, size_w=self.img_w))
+            file_name = os.path.basename(sample_file).split(".")[0]
+            file_extension = os.path.basename(sample_file).split(".")[1]
+
+            for i in range(self.num_style):
+                test_style = np.random.normal(loc=0.0, scale=1.0, size=[1, 1, 1, self.style_dim])
+                image_path = os.path.join(self.result_dir, '{}_style{}.{}'.format(file_name, i, file_extension))
+
+                fake_img = self.sess.run(self.test_fake_A, feed_dict={self.test_image: sample_image, self.test_style: test_style})
+                save_images(fake_img, [1, 1], image_path)
+
+                index.write("<td>%s</td>" % os.path.basename(image_path))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (sample_file if os.path.isabs(sample_file) else (
+                    '../..' + os.path.sep + sample_file), self.img_w, self.img_h))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (image_path if os.path.isabs(image_path) else (
+                    '../..' + os.path.sep + image_path), self.img_w, self.img_h))
+                index.write("</tr>")
+                cnt += 1
+                if cnt == 1: break
+        index.close()
+
+    def style_guide_test(self):
+        tf.global_variables_initializer().run()
+        test_A_files = glob('./dataset/{}/*.*'.format(self.dataset_name + '/testA'))
+        test_B_files = glob('./dataset/{}/*.*'.format(self.dataset_name + '/testB'))
+
+        style_file = np.asarray(load_test_data(self.guide_img, size_h=self.img_h, size_w=self.img_w))
+
+        self.saver = tf.train.Saver()
+        could_load, checkpoint_counter = self.load(self.checkpoint_dir)
+        self.result_dir = os.path.join(self.result_dir, self.model_dir, 'guide')
+        check_folder(self.result_dir)
+
+        if could_load:
+            print(" [*] Load SUCCESS")
+        else:
+            print(" [!] Load failed...")
+
+        # write html for visual comparison
+        index_path = os.path.join(self.result_dir, 'index.html')
+        index = open(index_path, 'w')
+        index.write("<html><body><table><tr>")
+        index.write("<th>name</th><th>input</th><th>output</th></tr>")
+
+        if self.direction == 'a2b' :
+            for sample_file in test_A_files: # A -> B
+                print('Processing A image: ' + sample_file)
+                sample_image = np.asarray(load_test_data(sample_file, size_h=self.img_h, size_w=self.img_w))
+                image_path = os.path.join(self.result_dir, '{}'.format(os.path.basename(sample_file)))
+
+                fake_img = self.sess.run(self.guide_fake_B, feed_dict={self.content_image: sample_image, self.style_image : style_file})
+                save_images(fake_img, [1, 1], image_path)
+
+                index.write("<td>%s</td>" % os.path.basename(image_path))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (sample_file if os.path.isabs(sample_file) else (
+                    '../../..' + os.path.sep + sample_file), self.img_w, self.img_h))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (image_path if os.path.isabs(image_path) else (
+                    '../../..' + os.path.sep + image_path), self.img_w, self.img_h))
+                index.write("</tr>")
+
+        else :
+            for sample_file in test_B_files: # B -> A
+                print('Processing B image: ' + sample_file)
+                sample_image = np.asarray(load_test_data(sample_file, size_h=self.img_h, size_w=self.img_w))
+                image_path = os.path.join(self.result_dir, '{}'.format(os.path.basename(sample_file)))
+
+                fake_img = self.sess.run(self.guide_fake_A, feed_dict={self.content_image: sample_image, self.style_image : style_file})
+                save_images(fake_img, [1, 1], image_path)
+
+                index.write("<td>%s</td>" % os.path.basename(image_path))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (sample_file if os.path.isabs(sample_file) else (
+                    '../../..' + os.path.sep + sample_file), self.img_w, self.img_h))
+                index.write("<td><img src='%s' width='%d' height='%d'></td>" % (image_path if os.path.isabs(image_path) else (
+                    '../../..' + os.path.sep + image_path), self.img_w, self.img_h))
+                index.write("</tr>")
+        index.close()
+
+    def open_loss_scale(self, opt, key):
+        opt_tmp = opt
+        if self.bert_loss_scale == 0:
+            # loss_scale_manager = ExponentialUpdateLossScaleManager(init_loss_scale=2 ** 32, incr_every_n_steps=1000,
+            #                                                        decr_every_n_nan_or_inf=2, decr_ratio=0.5)
+            loss_scale_manager = ExponentialUpdateLossScaleManager(init_loss_scale = 2 ** 10, incr_every_n_steps = 100, decr_every_n_nan_or_inf = 2, decr_ratio = 0.8)
+            print("lossScale type: exponential")
+        elif self.bert_loss_scale >= 1:
+            loss_scale_manager = FixedLossScaleManager(loss_scale=self.bert_loss_scale)
+        else:
+            raise ValueError("Invalid loss scale: %d" % self.bert_loss_scale)
+        self.mmgr[key] = loss_scale_manager
+        # If more than one device is used, wrap the optimizer for distributed training
+        # if ops_adapter.size() > 1:
+        #     opt_tmp = NPUDistributedOptimizer(opt_tmp)
+        #     opt = NPULossScaleOptimizer(opt_tmp, loss_scale_manager, is_distributed=True)
+        # else:
+        opt = NPULossScaleOptimizer(opt_tmp, loss_scale_manager)
+        return opt
diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/README.md b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/README.md
new file mode 100644
index 000000000..5fe0ce00f
--- /dev/null
+++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/README.md
@@ -0,0 +1,188 @@
+- [基本信息](#基本信息.md)
+- [概述](#概述.md)
+- [训练环境准备](#训练环境准备.md)
+- [快速上手](#快速上手.md)
+- [训练结果](#训练结果.md)
+- [高级参考](#高级参考.md)
+

+<h2 id="基本信息.md">基本信息</h2>
+ +**发布者(Publisher):Huawei** + +**应用领域(Application Domain):Computer Vision** + +**版本(Version):1.2** + +**修改时间(Modified) :2022.06.12** + +**大小(Size):104KB** + +**框架(Framework):TensorFlow 1.15.0** + +**模型格式(Model Format):ckpt** + +**精度(Precision):Mixed** + +**处理器(Processor):昇腾910** + +**应用级别(Categories):Official** + +**描述(Description):基于TensorFlow框架的图像迁移算法** + +

+<h2 id="概述.md">概述</h2>
+Munit是2018年提出的多模态无监督图像转换框架,可以从给定的源域图像生成不同风格的目标域图像输出。 + + +- 参考论文: + + https://arxiv.org/abs/1804.04732 + +- 参考实现: + + https://github.com/taki0112/MUNIT-Tensorflow + +- 适配昇腾 AI 处理器的实现: + + https://gitee.com/harry-zzh/modelzoo/edit/master/contrib/Tensorflow/MUNIT_ID0953_for_TensorFlow/ + + + +- 通过Git获取对应commit\_id的代码方法如下: + + ``` + git clone {repository_url} # 克隆仓库的代码 + cd {repository_name} # 切换到模型的代码仓目录 + git checkout {branch} # 切换到对应分支 + git reset --hard {commit_id} # 代码设置到对应的commit_id + cd {code_path} # 切换到模型代码所在路径,若仓库下只有该模型,则无需切换 + ``` + +## 默认配置 + +- 训练超参 + + - Batch size: 1 + - Train epoch: 1 + - Train step: 100000 + + +## 支持特性 + +| 特性列表 | 是否支持 | +|-------|------| +| 分布式训练 | 否 | +| 混合精度 | 否 | +| 并行数据 | 否 | + + +
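+为便于理解上文的"多模态"输出:测试阶段会为每张输入图像采样 num_style 个服从标准正态分布的风格码,不同的风格码经解码器即得到不同风格的目标域图像(与 MUNIT.py 中 test() 的采样方式一致)。下面给出一个仅作示意的采样片段,参数均取默认值:
+
+```
+import numpy as np
+
+style_dim = 8   # 风格码长度,对应 --style_dim 默认值
+num_style = 3   # 测试阶段每张图像采样的风格数,对应 --num_style 默认值
+
+# 与 MUNIT.py test() 一致:从标准正态分布采样,形状为 [1, 1, 1, style_dim]
+styles = [np.random.normal(loc=0.0, scale=1.0, size=[1, 1, 1, style_dim])
+          for _ in range(num_style)]
+print([s.shape for s in styles])
+```
+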

+<h2 id="训练环境准备.md">训练环境准备</h2>
+ +1. 硬件环境准备请参见各硬件产品文档"[驱动和固件安装升级指南]( https://support.huawei.com/enterprise/zh/category/ai-computing-platform-pid-1557196528909)"。需要在硬件设备上安装与CANN版本配套的固件与驱动。 +2. 宿主机上需要安装Docker并登录[Ascend Hub中心](https://ascendhub.huawei.com/#/detail?name=ascend-tensorflow-arm)获取镜像。 + + 当前模型支持的镜像列表如[表1](#zh-cn_topic_0000001074498056_table1519011227314)所示。 + + **表 1** 镜像列表 + + +
+    | 镜像名称 | 镜像版本 | 配套CANN版本 |
+    | ---- | ---- | ---- |
+    | [ascend-tensorflow-arm](https://ascendhub.huawei.com/#/detail?name=ascend-tensorflow-arm) | 20.2.0 | 20.2 |
+

+<h2 id="快速上手.md">快速上手</h2>
+ +- 数据集准备 +1. 模型训练使用edges2shoes数据集,数据集请用户自行获取。 + +## 模型训练 + +- 单击“立即下载”,并选择合适的下载方式下载源码包。 + +- 启动训练之前,首先要配置程序运行相关环境变量。 + + 环境变量配置信息参见: + + [Ascend 910训练平台环境变量设置](https://gitee.com/ascend/modelzoo/wikis/Ascend%20910%E8%AE%AD%E7%BB%83%E5%B9%B3%E5%8F%B0%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F%E8%AE%BE%E7%BD%AE?sort_id=3148819) + +- 单卡训练 + + 1. 配置训练参数。 + + 首先在脚本test/train_full_1p.sh中,配置batch_size、steps、epochs、data_path等参数,请用户根据实际路径配置data_path,或者在启动训练的命令行中以参数形式下发。 + + ``` + batch_size=1 + train_steps=100000 + epochs=1 + data_path="./dataset/edges2shoes/train" + ``` + + 2. 启动训练。 + + 启动单卡训练 (脚本为MUNIT_ID0953_for_TensorFlow/test/train_full_1p.sh) + + ``` + bash train_full_1p.sh + ``` + +
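+训练前可用下面的片段确认 data_path 下存在 trainA、trainB 两个子目录且均包含图像(MUNIT.py 即按该目录结构读取两个域的数据)。该片段仅为示意,路径请按实际情况修改:
+
+```
+from glob import glob
+
+data_path = "./dataset/edges2shoes/train"   # 示例路径,请改为实际的 data_path
+
+# 与 MUNIT.py __init__ 中相同的读取方式
+trainA = glob('{}/*.*'.format(data_path + '/trainA'))
+trainB = glob('{}/*.*'.format(data_path + '/trainB'))
+print("trainA: %d, trainB: %d" % (len(trainA), len(trainB)))
+assert len(trainA) > 0 and len(trainB) > 0, "数据集目录不完整"
+```
+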

+<h2 id="训练结果.md">训练结果</h2>
+ +- 精度结果比对 + +取训练最后1000个steps的loss,计算平均值,进行结果比对。 + +|精度指标项|GPU实测|NPU实测| +|---|---|---| +|d_loss|2.619421507950002|2.7996314894200007| +|g_loss|4.192780654629998|4.389258856830003| + + +
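+上表中的均值可按如下方式从训练日志统计得到:训练时每个 step 会打印一行形如 "d_loss: ..., g_loss: ..." 的记录(见 MUNIT.py 的 train()),取最后 1000 个 step 求平均即可。以下片段仅为示意,日志路径为假设值,请按实际输出路径修改:
+
+```
+import re
+
+log_file = "./test/output/0/train.log"   # 假设的日志路径,请按实际情况修改
+
+# 匹配训练日志中每个 step 打印的 d_loss / g_loss
+pattern = re.compile(r"d_loss: ([\d.]+), g_loss: ([\d.]+)")
+with open(log_file) as f:
+    pairs = [(float(m.group(1)), float(m.group(2)))
+             for m in (pattern.search(line) for line in f) if m]
+
+last = pairs[-1000:]   # 取最后 1000 个 step 的 loss
+d_mean = sum(d for d, _ in last) / len(last)
+g_mean = sum(g for _, g in last) / len(last)
+print("d_loss mean: %f, g_loss mean: %f" % (d_mean, g_mean))
+```
+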

+<h2 id="高级参考.md">高级参考</h2>
+ +## 脚本和示例代码 + +``` +├── MUNIT.py //网络训练与测试代码 +├── main.py //主函数设置代码 +├── ops.py //基础模块代码 +├── utils.py //工具函数代码 +├── README.md //代码说明文档 +├── test +│ ├──train_performance_1p.sh //单卡训练验证性能启动脚本 +│ ├──train_full_1p.sh //单卡全量训练启动脚本 + +``` + +## 脚本参数 + +``` +--data_path 数据集路径,默认:./dataset/edges2shoes/train +--phase 运行模式,默认:train +--batch_size 每个NPU的batch size,默认:1 +--learing_rate 初始学习率,默认:0.001 +--iteration 每个epcoh训练步数,默认:100000 +--epoch 训练epcoh数量,默认:1 +--result 结果输出路径,默认:./test/output/${ASCEND_DEVICE_ID} +``` + +## 训练过程 + +1. 通过“模型训练”中的训练指令启动单卡卡训练。 + +2. 参考脚本的模型存储路径为./test/output/${ASCEND_DEVICE_ID}/checkpoint/MUNIT_lsgan。 + + diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/main.py b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/main.py new file mode 100644 index 000000000..e36a801de --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/main.py @@ -0,0 +1,213 @@ +# +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# +from npu_bridge.npu_init import * +import tensorflow as tf +from MUNIT import MUNIT +import argparse +from utils import * +# from help_modelarts import modelarts_result2obs +import precision_tool.tf_config as npu_tf_config + +"""parsing and configuration""" +def parse_args(): + desc = "Tensorflow implementation of MUNIT" + parser = argparse.ArgumentParser(description=desc) + parser.add_argument('--code_dir', type=str, default='code', help='code_dir') + parser.add_argument('--phase', type=str, default='train', help='train or test or guide') + parser.add_argument('--dataset', type=str, default='summer2winter', help='dataset_name') + parser.add_argument('--data_path', type=str, default='summer2winter', help='dataset_name') + parser.add_argument('--augment_flag', type=bool, default=False, help='Image augmentation use or not') + parser.add_argument('--obs_dir', type=str, default='./output/', help='obs_dir') + + parser.add_argument('--epoch', type=int, default=10, help='The number of epochs to run') + parser.add_argument('--iteration', type=int, default=100000, help='The number of training iterations') + parser.add_argument('--batch_size', type=int, default=1, help='The batch size') + parser.add_argument('--print_freq', type=int, default=1000, help='The number of image_print_freq') + parser.add_argument('--save_freq', type=int, default=1000, help='The number of ckpt_save_freq') + parser.add_argument('--num_style', type=int, default=3, help='number of styles to sample') + parser.add_argument('--direction', type=str, default='a2b', help='direction of style guided image translation') + parser.add_argument('--guide_img', type=str, default='guide.jpg', help='Style guided image translation') + + parser.add_argument('--gan_type', type=str, default='lsgan', help='GAN loss type [gan / lsgan]') + + parser.add_argument('--lr', type=float, default=0.0001, help='The learning rate') + parser.add_argument('--gan_w', type=float, default=1.0, help='weight of adversarial loss') + parser.add_argument('--recon_x_w', type=float, default=10.0, help='weight of image reconstruction loss') + parser.add_argument('--recon_s_w', type=float, default=1.0, help='weight of style reconstruction loss') + parser.add_argument('--recon_c_w', type=float, default=1.0, help='weight of content reconstruction loss') + parser.add_argument('--recon_x_cyc_w', type=float, default=0.0, help='weight of explicit style augmented cycle consistency loss') + + parser.add_argument('--ch', type=int, default=64, help='base channel number per layer') + parser.add_argument('--style_dim', type=int, default=8, help='length of style code') + parser.add_argument('--n_sample', type=int, default=2, help='number of sampling layers in content encoder') + parser.add_argument('--n_res', type=int, default=4, help='number of residual blocks in content encoder/decoder') + + parser.add_argument('--n_dis', type=int, default=4, help='number of discriminator layer') + parser.add_argument('--n_scale', type=int, default=3, help='number of scales') + + parser.add_argument('--img_h', type=int, default=256, help='The size of image hegiht') + parser.add_argument('--img_w', type=int, default=256, help='The size of image width') + parser.add_argument('--img_ch', type=int, default=3, help='The size of image channel') + + parser.add_argument('--result', type=str, default='results', + help='Directory name to save the results') + parser.add_argument('--checkpoint_dir', type=str, default='checkpoint', + help='Directory name to save the checkpoints') + 
parser.add_argument('--result_dir', type=str, default='results', + help='Directory name to save the generated images') + parser.add_argument('--log_dir', type=str, default='logs', + help='Directory name to save training logs') + parser.add_argument('--sample_dir', type=str, default='samples', + help='Directory name to save the samples on training') + + # parser.add_argument('--use_fp16', type=bool, default=True) + parser.add_argument('--bert_loss_scale', type=int, default=0) + + + return check_args(parser.parse_args()) + +"""checking arguments""" +def check_args(args): + # --checkpoint_dir + args.checkpoint_dir = os.path.join(args.result, args.checkpoint_dir) + check_folder(args.checkpoint_dir) + + # --result_dir + args.result_dir = os.path.join(args.result, args.result_dir) + check_folder(args.result_dir) + + # --result_dir + args.log_dir = os.path.join(args.result, args.log_dir) + check_folder(args.log_dir) + + # --dump_dir + args.dump_dir = os.path.join(args.result, "dump") + check_folder(args.dump_dir) + + # --sample_dir + args.sample_dir = os.path.join(args.result, args.sample_dir) + check_folder(args.sample_dir) + + # --epoch + try: + assert args.epoch >= 1 + except: + print('number of epochs must be larger than or equal to one') + + # --batch_size + try: + assert args.batch_size >= 1 + except: + print('batch size must be larger than or equal to one') + return args + +"""main""" +def main(): + # parse arguments + args = parse_args() + if args is None: + exit() + + ############################## npu modify ######################### + config = tf.ConfigProto(allow_soft_placement=True) + custom_op = config.graph_options.rewrite_options.custom_optimizers.add() + custom_op.name = "NpuOptimizer" + custom_op.parameter_map["use_off_line"].b = True + # # 混合精度 + custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("allow_fp32_to_fp16") + #custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("force_fp32") + # 算子黑名单 + #custom_op.parameter_map["modify_mixlist"].s = tf.compat.as_bytes("./ops_info.json") + # custom_op.parameter_map["modify_mixlist"].s = tf.compat.as_bytes(os.path.join(args.code_dir, "ops_info.json")) + # print(os.path.isfile(os.path.join(args.code_dir, "ops_info.json"))) + # print(os.path.join(args.code_dir, "ops_info.json")) + + # 判断是否溢出 + # # dump_path:dump数据存放路径,该参数指定的目录需要在启动训练的环境上(容器或Host侧)提前创建且确保安装时配置的运行用户具有读写权限 + # custom_op.parameter_map["dump_path"].s = tf.compat.as_bytes(args.dump_dir) + # # enable_dump_debug:是否开启溢出检测功能 + # custom_op.parameter_map["enable_dump_debug"].b = True + # # dump_debug_mode:溢出检测模式,取值:all/aicore_overflow/atomic_overflow + # custom_op.parameter_map["dump_debug_mode"].s = tf.compat.as_bytes("all") + # custom_op = npu_tf_config.update_custom_op(custom_op, action='overflow') + + # # 关闭全部融合规则 + # config = npu_tf_config.session_dump_config(config, action='fusion_off') + + config.graph_options.rewrite_options.remapping = RewriterConfig.OFF #off remap + config = npu_config_proto(config_proto=config) + + + + # if args.use_fp16 and (args.bert_loss_scale not in [None, -1]): + # opt_tmp = custom_op + # if args.bert_loss_scale == 0: + # loss_scale_manager = ExponentialUpdateLossScaleManager(init_loss_scale=2 ** 32, incr_every_n_steps=1000, + # decr_every_n_nan_or_inf=2, decr_ratio=0.5) + # elif args.bert_loss_scale >= 1: + # loss_scale_manager = FixedLossScaleManager(loss_scale=args.bert_loss_scale) + # else: + # raise ValueError("Invalid loss scale: %d" % args.bert_loss_scale) + # # device数是否大于1,如果大于1,进行分布式训练 + # # if ops_adapter.size() 
> 1: + # # opt_tmp = NPUDistributedOptimizer(opt_tmp) + # # custom_op = NPULossScaleOptimizer(opt_tmp, loss_scale_manager, is_distributed=True) + # # else: + # custom_op = NPULossScaleOptimizer(opt_tmp, loss_scale_manager) + + # open session + with tf.Session(config=config) as sess: + gan = MUNIT(sess, args) + ############################## npu modify ######################### + + # build graph + gan.build_model() + + # show network architecture + show_all_variables() + + if args.phase == 'train' : + # launch the graph in a session + gan.train() + print(" [*] Training finished!") + + if args.phase == 'test' : + gan.test() + print(" [*] Test finished!") + + if args.phase == 'guide' : + gan.style_guide_test() + print(" [*] Guide finished!") + + #modelarts_result2obs(args) + +if __name__ == '__main__': + main() + diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_acc.py b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_acc.py new file mode 100644 index 000000000..e316635ee --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_acc.py @@ -0,0 +1,63 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import argparse +import sys + +# 解析输入参数data_url +parser = argparse.ArgumentParser() +parser.add_argument("--data_url", type=str, default="/home/ma-user/modelarts/inputs/data_url_0") +parser.add_argument("--train_url", type=str, default="/home/ma-user/modelarts/outputs/train_url_0/") +config = parser.parse_args() + +print("[CANN-Modelzoo] code_dir path is [%s]" % (sys.path[0])) +code_dir = sys.path[0] +os.chdir(code_dir) +print("[CANN-Modelzoo] work_dir path is [%s]" % (os.getcwd())) + +print("[CANN-Modelzoo] before train - list my run files:") +os.system("ls -al /usr/local/Ascend/ascend-toolkit/") + +print("[CANN-Modelzoo] before train - list my dataset files:") +os.system("ls -al %s" % config.data_url) + +print("[CANN-Modelzoo] start run train shell") +# 设置sh文件格式为linux可执行 +os.system("dos2unix ./test/*") + +# 执行train_full_1p.sh或者train_performance_1p.sh,需要用户自己指定 +# full和performance的差异,performance只需要执行很少的step,控制在15分钟以内,主要关注性能FPS +os.system("bash ./test/train_full_1p.sh --data_path=%s --output_path=%s " % (config.data_url, config.train_url)) + +print("[CANN-Modelzoo] finish run train shell") + +# 将当前执行目录所有文件拷贝到obs的output进行备份 +print("[CANN-Modelzoo] after train - list my output files:") +os.system("cp -r %s %s " % (code_dir, config.train_url)) +os.system("ls -al %s" % config.train_url) \ No newline at end of file diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_perf.py b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_perf.py new file mode 100644 index 000000000..d4b6e5535 --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelarts_entry_perf.py @@ -0,0 +1,63 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
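+#
+# ModelArts performance entry point: same flow as modelarts_entry_acc.py, except
+# that it runs ./test/train_performance_1p.sh, which executes only a small number
+# of steps (kept under roughly 15 minutes) and is mainly used to measure FPS.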
+ +import os +import argparse +import sys + +# 解析输入参数data_url +parser = argparse.ArgumentParser() +parser.add_argument("--data_url", type=str, default="/home/ma-user/modelarts/inputs/data_url_0") +parser.add_argument("--train_url", type=str, default="/home/ma-user/modelarts/outputs/train_url_0/") +config = parser.parse_args() + +print("[CANN-Modelzoo] code_dir path is [%s]" % (sys.path[0])) +code_dir = sys.path[0] +os.chdir(code_dir) +print("[CANN-Modelzoo] work_dir path is [%s]" % (os.getcwd())) + +print("[CANN-Modelzoo] before train - list my run files:") +os.system("ls -al /usr/local/Ascend/ascend-toolkit/") + +print("[CANN-Modelzoo] before train - list my dataset files:") +os.system("ls -al %s" % config.data_url) + +print("[CANN-Modelzoo] start run train shell") +# 设置sh文件格式为linux可执行 +os.system("dos2unix ./test/*") + +# 执行train_full_1p.sh或者train_performance_1p.sh,需要用户自己指定 +# full和performance的差异,performance只需要执行很少的step,控制在15分钟以内,主要关注性能FPS +os.system("bash ./test/train_performance_1p.sh --data_path=%s --output_path=%s " % (config.data_url, config.train_url)) + +print("[CANN-Modelzoo] finish run train shell") + +# 将当前执行目录所有文件拷贝到obs的output进行备份 +print("[CANN-Modelzoo] after train - list my output files:") +os.system("cp -r %s %s " % (code_dir, config.train_url)) +os.system("ls -al %s" % config.train_url) \ No newline at end of file diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelzoo_level.txt b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelzoo_level.txt new file mode 100644 index 000000000..a17c8f95f --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/modelzoo_level.txt @@ -0,0 +1,3 @@ +FuncStatus:OK +PerfStatus:NOK +PrecisionStatus:OK \ No newline at end of file diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/ops.py b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/ops.py new file mode 100644 index 000000000..efaafd2bb --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/ops.py @@ -0,0 +1,244 @@ +# +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
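+#
+# Building blocks used by MUNIT.py: weight initializers, padded convolutions,
+# fully connected layers, (adaptive) residual blocks, up/down sampling and
+# global average pooling, activations, instance/layer/adaptive instance
+# normalization, and the LSGAN / vanilla GAN losses plus L1 and regularization losses.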
+# +from npu_bridge.npu_init import * +import tensorflow as tf +import tensorflow.contrib as tf_contrib +from utils import pytorch_kaiming_weight_factor + +factor, mode, uniform = pytorch_kaiming_weight_factor(a=0.0, uniform=False) +weight_init = tf_contrib.layers.variance_scaling_initializer(factor=factor, mode=mode, uniform=uniform) +weight_regularizer = tf_contrib.layers.l2_regularizer(scale=0.0001) + +################################################################################## +# Layer +################################################################################## + +def conv(x, channels, kernel=4, stride=2, pad=0, pad_type='zero', use_bias=True, scope='conv'): + with tf.variable_scope(scope): + if scope.__contains__("discriminator") : + weight_init = tf.random_normal_initializer(mean=0.0, stddev=0.02) + else : + weight_init = tf_contrib.layers.variance_scaling_initializer() + + if pad > 0: + h = x.get_shape().as_list()[1] + if h % stride == 0: + pad = pad * 2 + else: + pad = max(kernel - (h % stride), 0) + + pad_top = pad // 2 + pad_bottom = pad - pad_top + pad_left = pad // 2 + pad_right = pad - pad_left + + if pad_type == 'zero': + x = tf.pad(x, [[0, 0], [pad_top, pad_bottom], [pad_left, pad_right], [0, 0]]) + if pad_type == 'reflect': + x = tf.pad(x, [[0, 0], [pad_top, pad_bottom], [pad_left, pad_right], [0, 0]], mode='REFLECT') + + x = tf.layers.conv2d(inputs=x, filters=channels, + kernel_size=kernel, kernel_initializer=weight_init, + kernel_regularizer=weight_regularizer, + strides=stride, use_bias=use_bias) + + return x + +def fully_connected(x, units, use_bias=True, scope='fully_connected'): + with tf.variable_scope(scope): + x = flatten(x) + x = tf.layers.dense(x, units=units, kernel_initializer=weight_init, + kernel_regularizer=weight_regularizer, + use_bias=use_bias) + + return x + +def flatten(x) : + return tf.layers.flatten(x) + +################################################################################## +# Residual-block +################################################################################## + +def resblock(x_init, channels, use_bias=True, scope='resblock'): + with tf.variable_scope(scope): + with tf.variable_scope('res1'): + x = conv(x_init, channels, kernel=3, stride=1, pad=1, pad_type='reflect', use_bias=use_bias) + x = instance_norm(x) + x = relu(x) + + with tf.variable_scope('res2'): + x = conv(x, channels, kernel=3, stride=1, pad=1, pad_type='reflect', use_bias=use_bias) + x = instance_norm(x) + + return x + x_init + +def adaptive_resblock(x_init, channels, gamma1, beta1, gamma2, beta2, use_bias=True, scope='adaptive_resblock') : + with tf.variable_scope(scope): + with tf.variable_scope('res1'): + x = conv(x_init, channels, kernel=3, stride=1, pad=1, pad_type='reflect', use_bias=use_bias) + x = adaptive_instance_norm(x, gamma1, beta1) + x = relu(x) + + with tf.variable_scope('res2'): + x = conv(x, channels, kernel=3, stride=1, pad=1, pad_type='reflect', use_bias=use_bias) + x = adaptive_instance_norm(x, gamma2, beta2) + + return x + x_init + +################################################################################## +# Sampling +################################################################################## + +def down_sample(x) : + return tf.layers.average_pooling2d(x, pool_size=3, strides=2, padding='SAME') + +def up_sample(x, scale_factor=2): + _, h, w, _ = x.get_shape().as_list() + new_size = [h * scale_factor, w * scale_factor] + return tf.image.resize_nearest_neighbor(x, size=new_size) + +def adaptive_avg_pooling(x): + # 
global average pooling + gap = tf.reduce_mean(x, axis=[1, 2], keep_dims=True) + + return gap + +################################################################################## +# Activation function +################################################################################## + +def lrelu(x, alpha=0.01): + # pytorch alpha is 0.01 + return tf.nn.leaky_relu(x, alpha) + + +def relu(x): + return tf.nn.relu(x) + + +def tanh(x): + return tf.tanh(x) + +################################################################################## +# Normalization function +################################################################################## + +def adaptive_instance_norm(content, gamma, beta, epsilon=1e-5): + # gamma, beta = style_mean, style_std from MLP + + c_mean, c_var = tf.nn.moments(content, axes=[1, 2], keep_dims=True) + c_std = tf.sqrt(c_var + epsilon) + + return gamma * ((content - c_mean) / c_std) + beta + + +def instance_norm(x, scope='instance_norm'): + return tf_contrib.layers.instance_norm(x, + epsilon=1e-05, + center=True, scale=True, + scope=scope) + +def layer_norm(x, scope='layer_norm') : + return tf_contrib.layers.layer_norm(x, + center=True, scale=True, + scope=scope) + +################################################################################## +# Loss function +################################################################################## + +""" + +Author use LSGAN +For LSGAN, multiply each of G and D by 0.5. +However, MUNIT authors did not do this. + +""" + +def discriminator_loss(type, real, fake): + n_scale = len(real) + loss = [] + + real_loss = 0 + fake_loss = 0 + + for i in range(n_scale) : + if type == 'lsgan' : + real_loss = tf.reduce_mean(tf.squared_difference(real[i], 1.0)) + fake_loss = tf.reduce_mean(tf.square(fake[i])) + + if type == 'gan' : + real_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.ones_like(real[i]), logits=real[i])) + fake_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.zeros_like(fake[i]), logits=fake[i])) + + loss.append(real_loss + fake_loss) + + return sum(loss) + + +def generator_loss(type, fake): + n_scale = len(fake) + loss = [] + + fake_loss = 0 + + for i in range(n_scale) : + if type == 'lsgan' : + fake_loss = tf.reduce_mean(tf.squared_difference(fake[i], 1.0)) + + if type == 'gan' : + fake_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.ones_like(fake[i]), logits=fake[i])) + + loss.append(fake_loss) + + + return sum(loss) + + +def L1_loss(x, y): + loss = tf.reduce_mean(tf.abs(x - y)) + + return loss + +def regularization_loss(scope_name) : + """ + If you want to use "Regularization" + g_loss += regularization_loss('generator') + d_loss += regularization_loss('discriminator') + """ + collection_regularization = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES) + + loss = [] + for item in collection_regularization : + if scope_name in item.name : + loss.append(item) + + return tf.reduce_sum(loss) diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/utils.py b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/utils.py new file mode 100644 index 000000000..e4db60aa1 --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/utils.py @@ -0,0 +1,146 @@ +# +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# +from npu_bridge.npu_init import * +import tensorflow as tf +from tensorflow.contrib import slim +from scipy import misc +import os, random +import numpy as np +import imageio +from skimage.transform import resize + +# https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/ +# https://people.eecs.berkeley.edu/~tinghuiz/projects/pix2pix/datasets/ + +class ImageData: + + def __init__(self, img_h, img_w, channels, augment_flag=False): + self.img_h = img_h + self.img_w = img_w + self.channels = channels + self.augment_flag = augment_flag + + def image_processing(self, filename): + x = tf.read_file(filename) + x_decode = tf.image.decode_jpeg(x, channels=self.channels) + img = tf.image.resize_images(x_decode, [self.img_h, self.img_w]) + img = tf.cast(img, tf.float32) / 127.5 - 1 + + if self.augment_flag : + augment_size_h = self.img_h + (30 if self.img_h == 256 else 15) + augment_size_w = self.img_w + (30 if self.img_w == 256 else 15) + p = random.random() + if p > 0.5: + img = augmentation(img, augment_size_h, augment_size_w) + + return img + + +def load_test_data(image_path, size_h=256, size_w=256): + #img = misc.imread(image_path, mode='RGB') + img = imageio.imread(image_path, pilmode= 'RGB') + #img = misc.imresize(img, [size_h, size_w]) + img = resize(img, output_shape=(size_h, size_w)) + img = np.expand_dims(img, axis=0) + img = preprocessing(img) + + return img + +def preprocessing(x): + x = x/127.5 - 1 # -1 ~ 1 + return x + +def augmentation(image, aug_img_h, aug_img_w): + seed = random.randint(0, 2 ** 31 - 1) + ori_image_shape = tf.shape(image) + image = tf.image.random_flip_left_right(image, seed=seed) + image = tf.image.resize_images(image, [aug_img_h, aug_img_w]) + image = tf.random_crop(image, ori_image_shape, seed=seed) + return image + +def save_images(images, size, image_path): + return imsave(inverse_transform(images), size, image_path) + +def inverse_transform(images): + return (images+1.) 
/ 2 + +def imsave(images, size, path): + return imageio.imwrite(path, merge(images, size)) + +def merge(images, size): + h, w = images.shape[1], images.shape[2] + img = np.zeros((h * size[0], w * size[1], 3)) + for idx, image in enumerate(images): + i = idx % size[1] + j = idx // size[1] + img[h*j:h*(j+1), w*i:w*(i+1), :] = image + + return img + +def show_all_variables(): + model_vars = tf.trainable_variables() + slim.model_analyzer.analyze_vars(model_vars, print_info=True) + +def check_folder(log_dir): + if not os.path.exists(log_dir): + os.makedirs(log_dir) + return log_dir + +def pytorch_xavier_weight_factor(gain=0.02, uniform=False) : + + if uniform : + factor = gain * gain + mode = 'FAN_AVG' + else : + factor = (gain * gain) / 1.3 + mode = 'FAN_AVG' + + return factor, mode, uniform + +def pytorch_kaiming_weight_factor(a=0.0, activation_function='relu', uniform=False) : + + if activation_function == 'relu' : + gain = np.sqrt(2.0) + elif activation_function == 'leaky_relu' : + gain = np.sqrt(2.0 / (1 + a ** 2)) + elif activation_function =='tanh' : + gain = 5.0 / 3 + else : + gain = 1.0 + + if uniform : + factor = gain * gain + mode = 'FAN_IN' + else : + factor = (gain * gain) / 1.3 + mode = 'FAN_IN' + + return factor, mode, uniform + -- Gitee From 3f3eecdc4b898555eb4cee32f21ff8cd8e4e719e Mon Sep 17 00:00:00 2001 From: harry-zzh Date: Fri, 17 Jun 2022 09:08:04 +0000 Subject: [PATCH 3/4] =?UTF-8?q?=E6=96=B0=E5=BB=BA=20test?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/.keep | 0 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/.keep diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/.keep b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/.keep new file mode 100644 index 000000000..e69de29bb -- Gitee From 5accd4c29c4233533cb7a8b70a8c0ee054f95279 Mon Sep 17 00:00:00 2001 From: harry-zzh Date: Fri, 17 Jun 2022 09:08:36 +0000 Subject: [PATCH 4/4] add two files --- .../test/train_full_1p.sh | 225 ++++++++++++++++++ .../test/train_performance_1p.sh | 207 ++++++++++++++++ 2 files changed, 432 insertions(+) create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_full_1p.sh create mode 100644 TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_performance_1p.sh diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_full_1p.sh b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_full_1p.sh new file mode 100644 index 000000000..b7a5aef00 --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_full_1p.sh @@ -0,0 +1,225 @@ +#!/bin/bash + +########################################################## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +########################################################## +# shell脚本所在路径 +cur_path=`echo $(cd $(dirname $0);pwd)` + +# 判断当前shell是否是performance +perf_flag=`echo $0 | grep performance | wc -l` + +# 当前执行网络的名称 +Network=`echo $(cd $(dirname $0);pwd) | awk -F"/" '{print $(NF-1)}'` + +export RANK_SIZE=1 +export RANK_ID=0 +export JOB_ID=10087 + +# 路径参数初始化 +data_path="" +output_path="" + +# 帮助信息,不需要修改 +if [[ $1 == --help || $1 == -h ]];then + echo"usage:./train_performance_1P.sh " + echo " " + echo "parameter explain: + --data_path # dataset of training + --output_path # output of training + 
--train_steps # max_step for training + --train_epochs # max_epoch for training + --batch_size # batch size + -h/--help show help message + " + exit 1 +fi + +# 参数校验,不需要修改 +for para in $* +do + if [[ $para == --data_path* ]];then + data_path=`echo ${para#*=}` + elif [[ $para == --output_path* ]];then + output_path=`echo ${para#*=}` + elif [[ $para == --train_steps* ]];then + train_steps=`echo ${para#*=}` + elif [[ $para == --train_epochs* ]];then + train_epochs=`echo ${para#*=}` + elif [[ $para == --batch_size* ]];then + batch_size=`echo ${para#*=}` + fi +done + +# 校验是否传入data_path,不需要修改 +# data_path="./dataset/edges2shoes/train" +if [[ $data_path == "" ]];then + echo "[Error] para \"data_path\" must be config" + exit 1 +fi + +# 校验是否传入output_path,不需要修改 +if [[ $output_path == "" ]];then + output_path="./test/output/${ASCEND_DEVICE_ID}" +fi + +# 设置打屏日志文件名,请保留,文件名为${print_log} +print_log="./test/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log" +modelarts_flag=`cat /etc/passwd |grep ma-user` +if [ x"${modelarts_flag}" != x ]; +then + echo "running with modelarts_flag..." + print_log_name=`ls /home/ma-user/modelarts/log/ | grep proc-rank` + print_log="/home/ma-user/modelarts/log/${print_log_name}" +fi +echo "### get your log here : ${print_log}" + +CaseName="" +function get_casename() +{ + if [ x"${perf_flag}" = x1 ]; + then + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'perf' + else + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'acc' + fi +} + +# 跳转到code目录 +cd ${cur_path}/../ +rm -rf ./test/output/${ASCEND_DEVICE_ID} +mkdir -p ./test/output/${ASCEND_DEVICE_ID} + +# 训练开始时间记录,不需要修改 +start_time=$(date +%s) +########################################################## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +########################################################## + +#========================================================= +#========================================================= +#========训练执行命令,需要根据您的网络进行修改============== +#========================================================= +#========================================================= +# 基础参数,需要模型审视修改 +# 您的训练数据集在${data_path}路径下,请直接使用这个变量获取 +# 您的训练输出目录在${output_path}路径下,请直接使用这个变量获取 +# 您的其他基础参数,可以自定义增加,但是batch_size请保留,并且设置正确的值 + +# batch_size=64 + +# if [ x"${modelarts_flag}" != x ]; +# then +# python3.7 ./LeNet.py --data_path=${data_path} --output_path=${output_path} +# else +# python3.7 ./LeNet.py --data_path=${data_path} --output_path=${output_path} 1>${print_log} 2>&1 +# fi + +# # 性能相关数据计算 +# StepTime=`grep "sec/step :" ${print_log} | tail -n 10 | awk '{print $NF}' | awk '{sum+=$1} END {print sum/NR}'` +# FPS=`awk 'BEGIN{printf "%.2f\n", '${batch_size}'/'${StepTime}'}'` + +# # 精度相关数据计算 +# train_accuracy=`grep "Final Accuracy accuracy" ${print_log} | awk '{print $NF}'` +# # 提取所有loss打印信息 +# grep "loss :" ${print_log} | awk -F ":" '{print $4}' | awk -F "-" '{print $1}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +train_epochs=1 +train_steps=100000 +batch_size=1 +epoch=1 + +if [ x"${modelarts_flag}" != x ]; +then + ASCEND_VISIBLE_DEVICES=0 python3.7 ./main.py --data_path=${data_path} --phase train \ + --epoch ${epoch} \ + --iteration ${train_steps} \ + --result=${output_path} \ + --batch_size ${batch_size} +else + ASCEND_VISIBLE_DEVICES=0 python3.7 ./main.py --data_path=${data_path} --phase train \ + --epoch ${epoch} \ + --iteration ${train_steps} \ + --result=${output_path} \ + --batch_size ${batch_size} 
1>${print_log} 2>&1 +fi + +# 性能相关数据计算 +#StepTime=`grep "sec/step :" ${print_log} | tail -n 10 | awk '{print $NF}' | awk '{sum+=$1} END {print sum/NR}'` +step0=`grep time ${print_log} | awk -F"time: " '{print $2}' | awk -F" " 'END{print $1}'` +step1=`grep time ${print_log} | awk -F"time: " '{print $2}' | awk -F" " '{print $1}' | tail -2 | head -1` +StepTime=`awk 'BEGIN{printf "%.4f",('${step0}'-'${step1}')}'` +FPS=`awk 'BEGIN{printf "%.2f\n", '${batch_size}'/'${StepTime}'}'` + +# # 精度相关数据计算 +# train_accuracy=`grep "Final Accuracy accuracy" ${print_log} | awk '{print $NF}'` +# # 提取所有loss打印信息 +# grep "loss :" ${print_log} | awk -F ":" '{print $4}' | awk -F "-" '{print $1}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +# 精度相关数据计算 +# train_accuracy=`grep "Final Accuracy accuracy" ${print_log} | awk '{print $NF}'` +train_accuracy="None" +# 提取所有loss打印信息 +#grep "loss :" ${print_log} | awk -F ":" '{print $4}' | awk -F "-" '{print $1}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +grep d_loss ${print_log} | awk -F"d_loss: " '{print $2}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +grep g_loss ${print_log} | awk -F"g_loss: " '{print $2}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt + + +########################################################### +#########后面的所有内容请不要修改########################### +#########后面的所有内容请不要修改########################### +#########后面的所有内容请不要修改########################### +########################################################### + +# 判断本次执行是否正确使用Ascend NPU +tf_flag=`echo ${Network} | grep TensorFlow | wc -l` +use_npu_flag=`grep "The model has been compiled on the Ascend AI processor" ${print_log} | wc -l` +if [ x"${use_npu_flag}" == x0 -a x"${tf_flag}" == x1 ]; +then + echo "------------------ ERROR NOTICE START ------------------" + echo "ERROR, your task haven't used Ascend NPU, please check your npu Migration." + echo "------------------ ERROR NOTICE END------------------" +else + echo "------------------ INFO NOTICE START------------------" + echo "INFO, your task have used Ascend NPU, please check your result." 
+ echo "------------------ INFO NOTICE END------------------" +fi + +# 获取最终的casename,请保留,case文件名为${CaseName} +get_casename + +# 重命名loss文件 +if [ -f ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt ]; +then + mv ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt ./test/output/${ASCEND_DEVICE_ID}/${CaseName}_loss.txt +fi + +# 训练端到端耗时 +end_time=$(date +%s) +e2e_time=$(( $end_time - $start_time )) + +echo "------------------ Final result ------------------" +# 输出性能FPS/单step耗时/端到端耗时 +echo "Final Performance images/sec : $FPS" +echo "Final Performance sec/step : $StepTime" +echo "E2E Training Duration sec : $e2e_time" + +# 输出训练精度 +echo "Final Train Accuracy : ${train_accuracy}" + +# 最后一个迭代loss值,不需要修改 +ActualLoss=(`awk 'END {print $NF}' $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}_loss.txt`) + +#关键信息打印到${CaseName}.log中,不需要修改 +echo "Network = ${Network}" > $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "RankSize = ${RANK_SIZE}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "BatchSize = ${batch_size}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "DeviceType = `uname -m`" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "CaseName = ${CaseName}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualFPS = ${FPS}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "TrainingTime = ${StepTime}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualLoss = ${ActualLoss}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "E2ETrainingTime = ${e2e_time}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "TrainAccuracy = ${train_accuracy}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log \ No newline at end of file diff --git a/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_performance_1p.sh b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_performance_1p.sh new file mode 100644 index 000000000..24e28e539 --- /dev/null +++ b/TensorFlow/contrib/cv/MUNIT_ID0953_for_TensorFlow/test/train_performance_1p.sh @@ -0,0 +1,207 @@ +#!/bin/bash + +########################################################## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +########################################################## +# shell脚本所在路径 +cur_path=`echo $(cd $(dirname $0);pwd)` + +output_path="./edges2shoes_npu_allow_fp32_to_fp16" +# 判断当前shell是否是performance +perf_flag=`echo $0 | grep performance | wc -l` + +# 当前执行网络的名称 +Network=`echo $(cd $(dirname $0);pwd) | awk -F"/" '{print $(NF-1)}'` + +export RANK_SIZE=1 +export RANK_ID=0 +export JOB_ID=10087 + +# 路径参数初始化 +data_path="" +output_path="" + +# 帮助信息,不需要修改 +if [[ $1 == --help || $1 == -h ]];then + echo"usage:./train_performance_1P.sh " + echo " " + echo "parameter explain: + --data_path # dataset of training + --output_path # output of training + --train_steps # max_step for training + --train_epochs # max_epoch for training + --batch_size # batch size + -h/--help show help message + " + exit 1 +fi + +# 参数校验,不需要修改 +for para in $* +do + if [[ $para == --data_path* ]];then + data_path=`echo ${para#*=}` + elif [[ $para == --output_path* ]];then + output_path=`echo ${para#*=}` + elif [[ $para == --train_steps* ]];then + train_steps=`echo ${para#*=}` + elif [[ $para == --train_epochs* ]];then + train_epochs=`echo ${para#*=}` + elif [[ $para == --batch_size* ]];then + batch_size=`echo ${para#*=}` + fi +done + +# 校验是否传入data_path,不需要修改 +# 
data_path="./dataset/edges2shoes/train" +if [[ $data_path == "" ]];then + echo "[Error] para \"data_path\" must be config" + exit 1 +fi + +# 校验是否传入output_path,不需要修改 +if [[ $output_path == "" ]];then + output_path="./test/output/${ASCEND_DEVICE_ID}" +fi + +# 设置打屏日志文件名,请保留,文件名为${print_log} +print_log="./test/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log" +modelarts_flag=`cat /etc/passwd |grep ma-user` +if [ x"${modelarts_flag}" != x ]; +then + echo "running with modelarts..." + print_log_name=`ls /home/ma-user/modelarts/log/ | grep proc-rank` + print_log="/home/ma-user/modelarts/log/${print_log_name}" +fi +echo "### get your log here : ${print_log}" + +CaseName="" +function get_casename() +{ + if [ x"${perf_flag}" = x1 ]; + then + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'perf' + else + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'acc' + fi +} + +# 跳转到code目录 +cd ${cur_path}/../ +rm -rf ./test/output/${ASCEND_DEVICE_ID} +mkdir -p ./test/output/${ASCEND_DEVICE_ID} + +# 训练开始时间记录,不需要修改 +start_time=$(date +%s) +########################################################## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +#########第3行 至 100行,请一定不要、不要、不要修改########## +########################################################## + +#========================================================= +#========================================================= +#========训练执行命令,需要根据您的网络进行修改============== +#========================================================= +#========================================================= +# 基础参数,需要模型审视修改 +# 您的训练数据集在${data_path}路径下,请直接使用这个变量获取 +# 您的训练输出目录在${output_path}路径下,请直接使用这个变量获取 +# 您的其他基础参数,可以自定义增加,但是batch_size请保留,并且设置正确的值 +train_epochs=1 +train_steps=100000 +batch_size=1 +epoch=1 + +if [ x"${modelarts_flag}" != x ]; +then + python3.7 ./main.py --data_path=${data_path} --phase train \ + --epoch ${epoch} \ + --iteration ${train_steps} \ + --result=${output_path} \ + --batch_size ${batch_size} +else + python3.7 ./main.py --data_path=${data_path} --phase train \ + --epoch ${epoch} \ + --iteration ${train_steps} \ + --result=${output_path} \ + --batch_size ${batch_size} 1>${print_log} 2>&1 +fi + +# 性能相关数据计算 +#StepTime=`grep "sec/step :" ${print_log} | tail -n 10 | awk '{print $NF}' | awk '{sum+=$1} END {print sum/NR}'` +step0=`grep time ${print_log} | awk -F"time: " '{print $2}' | awk -F" " 'END{print $1}'` +step1=`grep time ${print_log} | awk -F"time: " '{print $2}' | awk -F" " '{print $1}' | tail -2 | head -1` +StepTime=`awk 'BEGIN{printf "%.4f",('${step0}'-'${step1}')}'` +FPS=`awk 'BEGIN{printf "%.2f\n", '${batch_size}'/'${StepTime}'}'` + +# # 精度相关数据计算 +# train_accuracy=`grep "Final Accuracy accuracy" ${print_log} | awk '{print $NF}'` +# # 提取所有loss打印信息 +# grep "loss :" ${print_log} | awk -F ":" '{print $4}' | awk -F "-" '{print $1}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +# 精度相关数据计算 +# train_accuracy=`grep "Final Accuracy accuracy" ${print_log} | awk '{print $NF}'` +train_accuracy="None" +# 提取所有loss打印信息 +#grep "loss :" ${print_log} | awk -F ":" '{print $4}' | awk -F "-" '{print $1}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +grep d_loss ${print_log} | awk -F"d_loss: " '{print $2}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +grep g_loss ${print_log} | awk -F"g_loss: " '{print $2}' > ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt + + +########################################################### +#########后面的所有内容请不要修改########################### 
+#########后面的所有内容请不要修改########################### +#########后面的所有内容请不要修改########################### +########################################################### + +# 判断本次执行是否正确使用Ascend NPU +tf_flag=`echo ${Network} | grep TensorFlow | wc -l` +use_npu_flag=`grep "The model has been compiled on the Ascend AI processor" ${print_log} | wc -l` +if [ x"${use_npu_flag}" == x0 -a x"${tf_flag}" == x1 ]; +then + echo "------------------ ERROR NOTICE START ------------------" + echo "ERROR, your task haven't used Ascend NPU, please check your npu Migration." + echo "------------------ ERROR NOTICE END------------------" +else + echo "------------------ INFO NOTICE START------------------" + echo "INFO, your task have used Ascend NPU, please check your result." + echo "------------------ INFO NOTICE END------------------" +fi + +# 获取最终的casename,请保留,case文件名为${CaseName} +get_casename + +# 重命名loss文件 +if [ -f ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt ]; +then + mv ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt ./test/output/${ASCEND_DEVICE_ID}/${CaseName}_loss.txt +fi + +# 训练端到端耗时 +end_time=$(date +%s) +e2e_time=$(( $end_time - $start_time )) + +echo "------------------ Final result ------------------" +# 输出性能FPS/单step耗时/端到端耗时 +echo "Final Performance images/sec : $FPS" +echo "Final Performance sec/step : $StepTime" +echo "E2E Training Duration sec : $e2e_time" + +# 输出训练精度 +echo "Final Train Accuracy : ${train_accuracy}" + +# 最后一个迭代loss值,不需要修改 +ActualLoss=(`awk 'END {print $NF}' $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}_loss.txt`) + +#关键信息打印到${CaseName}.log中,不需要修改 +echo "Network = ${Network}" > $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "RankSize = ${RANK_SIZE}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "BatchSize = ${batch_size}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "DeviceType = `uname -m`" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "CaseName = ${CaseName}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualFPS = ${FPS}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "TrainingTime = ${StepTime}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualLoss = ${ActualLoss}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "E2ETrainingTime = ${e2e_time}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log \ No newline at end of file -- Gitee
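Note on running the model outside ModelArts: the two entry scripts above end up invoking main.py through the test/*.sh wrappers. A minimal sketch, assuming main.py defines exactly the flags used by test/train_performance_1p.sh (--data_path, --phase, --epoch, --iteration, --result, --batch_size) and that the dataset follows the edges2shoes layout referenced in the scripts:

    python3.7 ./main.py --data_path=./dataset/edges2shoes/train --phase train \
        --epoch 1 --iteration 100000 --result=./test/output/0 --batch_size 1

For the reported performance figures, both test scripts take the difference between the last two "time:" values printed in the training log as the per-step time and report FPS as batch_size divided by that step time; with batch_size=1 and a 0.8 s step, that gives 1.25 images/sec.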