diff --git a/TensorFlow/built-in/cv/image_classification/ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow/README.md b/TensorFlow/built-in/cv/image_classification/ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow/README.md
index 60b523c76bb85b8092d2bf67efb41acd8defe483..241a2b48e4a57a016815e267322588b9300d98e0 100644
--- a/TensorFlow/built-in/cv/image_classification/ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow/README.md
+++ b/TensorFlow/built-in/cv/image_classification/ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow/README.md
@@ -2,7 +2,6 @@
 - [Overview](#概述.md)
 - [Training Environment Setup](#训练环境准备.md)
 - [Quick Start](#快速上手.md)
-- [Transfer Learning Guide](#迁移学习指导.md)
 - [Advanced Reference](#高级参考.md)

Basic Information

@@ -30,19 +29,17 @@

Overview

-	Currently, neural network architecture design is mostly guided by the *indirect* metric of computational complexity, i.e. FLOPs. However, *direct* metrics such as speed also depend on other factors, for example memory access cost and platform characteristics. This work therefore proposes evaluating direct metrics on the target platform rather than considering FLOPs alone. Based on a series of controlled experiments, this work derives several practical *guidelines* for efficient network design. Accordingly, a new architecture called *ShuffleNet V2* is proposed. Comprehensive ablation experiments verify that the model achieves a state-of-the-art speed/accuracy trade-off.
-- Reference paper:
+	Currently, neural network architecture design is mostly guided by the *indirect* metric of computational complexity, i.e. FLOPs. However, *direct* metrics such as speed also depend on other factors, for example memory access cost and platform characteristics. This work therefore proposes evaluating direct metrics on the target platform rather than considering FLOPs alone. Based on a series of controlled experiments, this work derives several practical *guidelines* for efficient network design. ShuffleNet V1 introduces the channel shuffle operation, which lets the network make free use of grouped convolutions for acceleration.
-  [ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design](https://arxiv.org/abs/1807.11164)
+- Reference paper:
+  https://arxiv.org/pdf/1707.01083.pdf
 - Reference implementation:
   https://github.com/weiSupreme/shufflenetv2-tensorflow
 - Implementation adapted to the Ascend AI Processor:
-
-  https://gitee.com/ascend/modelzoo/tree/master/built-in/TensorFlow/Official/cv/image_classification/ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow
+  https://gitee.com/ascend/ModelZoo-TensorFlow/tree/master/TensorFlow/built-in/cv/image_classification/ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow
 - Obtain the code at the corresponding commit_id via Git as follows:

@@ -154,121 +151,16 @@ run_config = NPURunConfig( model_dir=flags_obj.model_dir,
 [Environment variable setup for the Ascend 910 training platform](https://gitee.com/ascend/modelzoo/wikis/Ascend%20910%E8%AE%AD%E7%BB%83%E5%B9%B3%E5%8F%B0%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F%E8%AE%BE%E7%BD%AE?sort_id=3148819)
-- Single-card training
-
-  1. Configure the training parameters.
-
-     In the script scripts/train_1p.sh, set the training dataset path according to the actual path, for example:
-
-     `--datadir=/opt/npu/slimImagenet`
-
-  2. Run the training command (script: scripts/run_1p.sh).
-
-     `bash run_1p.sh`
-
-- 8-card training
-
-  1. Configure the training parameters.
-
-     In the script scripts/train_8p.sh, set the training dataset path according to the actual path, for example:
-
-     `--datadir=/opt/npu/slimImagenet`
-
-  2. Run the training command (script: scripts/run_8p.sh).
-
-- Evaluation.
-
-  1. For testing, modify the launch parameters in the script scripts/test.sh: set mode to evaluate and set eval_dir to the path containing the checkpoint files, according to the actual path, as follows:
-
-     ```
-     --mode=evaluate
-     --datadir=/opt/npu/slimImagenet
-     ```
-
-  2. Test command (script: scripts/test.sh):
-
-     `bash test.sh`
-
-
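The added overview line above credits ShuffleNet V1's channel shuffle operation with making heavy use of grouped convolutions practical: after a grouped convolution, channels are interleaved across groups so information can flow between them. As a hedged illustration (a minimal numpy sketch, not code from this repository), the operation is a reshape, transpose, and reshape on the channel axis:

```python
import numpy as np

def channel_shuffle(x, groups):
    """Shuffle channels of an NHWC feature map across groups (ShuffleNet V1 style)."""
    n, h, w, c = x.shape
    assert c % groups == 0, "channel count must be divisible by the group count"
    x = x.reshape(n, h, w, groups, c // groups)  # split channels into groups
    x = x.transpose(0, 1, 2, 4, 3)               # swap group and intra-group axes
    return x.reshape(n, h, w, c)                 # flatten back: channels interleaved

# six channels in three groups: [0 1 | 2 3 | 4 5] -> [0 2 4 1 3 5]
x = np.arange(6, dtype=np.float32).reshape(1, 1, 1, 6)
print(channel_shuffle(x, 3).ravel().tolist())  # [0.0, 2.0, 4.0, 1.0, 3.0, 5.0]
```

Applying the shuffle twice with the same group count is not generally the identity; it is its own inverse only when the transpose pattern is symmetric, which is why the network applies it once between grouped convolutions.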

Transfer Learning Guide

-
-- Dataset preparation.
-
-  Dataset requirements:
-
-  1. Obtain the data.
-
-     To use your own dataset, place it under the directory given by the script parameter data_dir. The dataset paths used in the reference code are:
-
-     - Training set: /opt/npu/slimImagenet
-     - Test set: /opt/npu/slimImagenet
-
-     The training and test datasets are distinguished by "train" and "validation" in the file names.
-
-     The dataset may also be placed in another directory; in that case, change the script parameter data_dir accordingly.
-
-  2. The dataset must be accurately annotated with class labels.
-
-  3. Each class should account for roughly the same proportion of the dataset.
-
-  4. Generate the TFRecord files used for train/eval by referring to the tfrecord script.
-
-  5. Dataset file structure: create the TFRecord files yourself, containing a training set and a validation set, for example:
-
-     ```
-     |--|imagenet_tfrecord
-     |    train-00000-of-01024
-     |    train-00001-of-01024
-     |    train-00002-of-01024
-     |    ...
-     |    validation-00000-of-00128
-     |    validation-00001-of-00128
-     |    ...
-     ```
-
-- Model modification.
-
-  1. Change the number of classification classes. To classify your own dataset with, for example, 10 classes, change depth=1000 to depth=10 in vgg16/model.py:
-
-     `labels_one_hot = tf.one_hot(labels, depth=1000)`
-
-  2. In vgg16/vgg.py, change 1000 to 10:
-
-     ```
-     #fc8
-     x = tf.layers.dense(x, 1000, activation=None, use_bias=True, kernel_initializer=tf.keras.initializers.RandomNormal(stddev=0.01))
-     ```
-
-- Loading a pretrained model.
-
-  1. Modify the configuration parameters: add the following arguments to train.py.
-
-     ```
-     parser.add_argument('--restore_path', default='/code/model.ckpt-100', help="""restore path""")  # path of the pretrained ckpt
-     parser.add_argument('--restore_exclude', default=['dense_2'], help="""restore_exclude""")  # do not load the FC-layer weights of the pretrained network
-     ```
-
-  2. Modify the model loading: add the following lines to vgg16/model.py.
-
-     ```
-     assert (mode == tf.estimator.ModeKeys.TRAIN)
-     # restore ckpt for finetune
-     variables_to_restore = tf.contrib.slim.get_variables_to_restore(exclude=self.args.restore_exclude)
-     tf.train.init_from_checkpoint(self.args.restore_path, {v.name.split(':')[0]: v for v in variables_to_restore})
-     ```
+```
+  Taking ./data as the data directory, for example:
+  cd test
+  bash train_full_1p.sh --data_path=../data              (full training)
+  bash train_performance_1p.sh --data_path=../data       (functionality/performance test)
-- Model training.
+```
-
-  Refer to the training steps in "Model training".
-
-- Model evaluation.
-
-  Refer to the training steps in "Model training".
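The removed transfer-learning notes above change `tf.one_hot(labels, depth=1000)` to `depth=10` when retargeting the classifier to a 10-class dataset. What that `depth` parameter controls can be sketched framework-free in numpy (an illustrative stand-in, not this repository's code):

```python
import numpy as np

def one_hot(labels, depth):
    """Framework-free equivalent of tf.one_hot(labels, depth) for integer labels."""
    labels = np.asarray(labels)
    out = np.zeros((labels.shape[0], depth), dtype=np.float32)
    out[np.arange(labels.shape[0]), labels] = 1.0
    return out

# with depth=10 (a 10-class dataset) each label becomes a length-10 row vector
y = one_hot([3, 0], depth=10)
print(y.shape)  # (2, 10)
```

The `depth` must match the output width of the final dense layer, which is why the guide changes both `model.py` and the `fc8` layer together.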

Advanced Reference

@@ -276,26 +168,32 @@ run_config = NPURunConfig( model_dir=flags_obj.model_dir,
 ```
-├── train.py                  // network training and test code
-├── README.md                 // documentation
-├── vgg16
-│   ├── vgg.py                // network construction
-│   ├── create_session.py     // session parameter configuration
-│   ├── data_loader.py        // data loading
-│   ├── layers.py             // accuracy computation
-│   ├── logger.py             // logging output
-│   ├── model.py              // model estimator
-│   ├── train_helper.py       // ckpt sorting
-│   ├── hyper_param.py        // learning-rate schedule configuration
-│   ├── trainer.py            // trainer configuration
-│   ├── preprocessing.py      // data preprocessing
-├── scripts
-│   ├── run_1p.sh             // single-card launch script
-│   ├── train_1p.sh           // single-card execution script
-│   ├── run_8p.sh             // 8-card launch script
-│   ├── train_8p.sh           // 8-card execution script
-│   ├── test.sh               // inference script
-│   ├── 8p.json               // multi-card configuration file
+ShuffleNetV1-1.0x-group3_ID2129_for_TensorFlow/
+├── architecture.py
+├── architecture_object_detection.py
+├── deploy.py
+├── deploy_train.py
+├── inference_with_trained_model.py
+├── input_pipeline.py
+├── input_pipeline_color.py
+├── model.py
+├── model_d.py
+├── modifyClassNum.py
+├── modelzoo_level.txt
+├── predict.py
+├── requirements.txt
+├── resnet_model_fn.py
+├── resnet_v1.py
+├── shufflenet.py
+├── shufflenet_model_fn.py
+├── train.py
+├── trainWithImages.py
+├── train_shufflenet.py
+├── vis.py
+├── README.md
+└── test
+    ├── train_full_1p.sh
+    └── train_performance_1p.sh
 ```

@@ -303,67 +201,10 @@ run_config = NPURunConfig( model_dir=flags_obj.model_dir,
 ```
---rank_size              Number of NPUs to use, default: 1
---mode                   Run mode, default: train_and_evaluate; options: train, evaluate; see the note in this section
---max_train_steps        Number of training steps, default: 100
---iterations_per_loop    Number of device-side iterations per loop on the NPU, default: 10
---max_epochs             Number of training epochs, recommended together with train_and_evaluate mode, default: 150
---epochs_between_evals   Interval between training and evaluation in train_and_evaluate mode, default: 5
---data_dir               Dataset path, default: path/data
---eval_dir               Path of the checkpoint files for inference, default: path/eval
---dtype                  Input data type of the network, default: tf.float32
---use_nesterov           Whether to use Nesterov momentum, default: True
---label_smoothing        Label-smoothing coefficient, default: 0.1
---weight_decay           Weight decay, default: 0.0001
---batch_size             Batch size per NPU, default: 32
---lr                     Initial learning rate, default: 0.01
---T_max                  T_max of the cosine_annealing learning-rate schedule, default: 150
---momentum               Momentum, default: 0.9
---display_every          Print interval, default: 1
---log_name               Log file name, default: vgg16.log
---log_dir                Path for saving ckpt files, default: ./model_1p
-```
-
-Note: the default mode is train_and_evaluate, which runs one evaluation every epochs_between_evals training epochs, for max_epochs epochs in total. Optional modes: train, which trains for max_train_steps steps; evaluate, which evaluates the ckpt files under the eval_dir directory.
-
-## Training process
+--data_dir    Dataset path
-
-1. Start single-card or multi-card training with the training commands in "Model training". Single-card and 8-card training are supported by running the corresponding scripts.
-2. The reference scripts store the model under results/1p or results/8p; the training log contains information such as:
-```
-2020-06-20 12:08:46.650335: I tf_adapter/kernels/geop_npu.cc:714] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp41_0[ 298635878us]
-2020-06-20 12:08:46.651767: I tf_adapter/kernels/geop_npu.cc:511] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp33_0, num_inputs:0, num_outputs:1
-2020-06-20 12:08:46.651882: I tf_adapter/kernels/geop_npu.cc:419] [GEOP] tf session directc244d6ef05380c63, graph id: 6 no need to rebuild
-2020-06-20 12:08:46.651903: I tf_adapter/kernels/geop_npu.cc:722] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp33_0 ,tf session: directc244d6ef05380c63 ,graph id: 6
-2020-06-20 12:08:46.652148: I tf_adapter/kernels/geop_npu.cc:735] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp33_0, ret_status:success ,tf session: directc244d6ef05380c63 ,graph id: 6 [0 ms]
-2020-06-20 12:08:46.654145: I tf_adapter/kernels/geop_npu.cc:64] BuildOutputTensorInfo, num_outputs:1
-2020-06-20 12:08:46.654179: I tf_adapter/kernels/geop_npu.cc:93] BuildOutputTensorInfo, output index:0, total_bytes:8, shape:, tensor_ptr:281471054129792, output281471051824928
-2020-06-20 12:08:46.654223: I tf_adapter/kernels/geop_npu.cc:714] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp33_0[ 2321us]
-step: 35028 epoch: 7.0 FPS: 4289.5 loss: 3.773 total_loss: 4.477 lr:0.00996
-2020-06-20 12:08:46.655903: I tf_adapter/kernels/geop_npu.cc:511] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp33_0, num_inputs:0, num_outputs:1
-2020-06-20 12:08:46.655975: I tf_adapter/kernels/geop_npu.cc:419] [GEOP] tf session directc244d6ef05380c63, graph id: 6 no need to rebuild
-2020-06-20 12:08:46.655993: I tf_adapter/kernels/geop_npu.cc:722] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp33_0 ,tf session: directc244d6ef05380c63 ,graph id: 6
-2020-06-20 12:08:46.656226: I tf_adapter/kernels/geop_npu.cc:735] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp33_0, ret_status:success ,tf session: directc244d6ef05380c63 ,graph id: 6 [0 ms]
-2020-06-20 12:08:46.657765: I tf_adapter/kernels/geop_npu.cc:64] BuildOutputTensorInfo, num_outputs:1
-```
-
-## Inference/validation process
-
-1. Start testing with the test command in "Model training".
-2. Currently, inference can only be run on checkpoints trained by this project.
-3. The inference-script parameter eval_dir may be set to the folder containing the checkpoints; all .ckpt files under that path will then be validated.
-4. After testing, the top1 accuracy and top5 accuracy on the validation set are printed, as shown below.
-
-```
-2020-06-20 12:31:56.960517: I tf_adapter/kernels/geop_npu.cc:354] [GEOP] GE Remove Graph success. tf session: direct3a0fa9fc2797f845 , graph id: 5
-2020-06-20 12:31:56.960537: I tf_adapter/util/session_manager.cc:50] find ge session connect with tf session direct3a0fa9fc2797f845
-2020-06-20 12:31:57.201046: I tf_adapter/util/session_manager.cc:55] destory ge session connect with tf session direct3a0fa9fc2797f845 success.
-2020-06-20 12:31:57.529579: I tf_adapter/kernels/geop_npu.cc:395] [GEOP] Close TsdClient.
-2020-06-20 12:31:57.724877: I tf_adapter/kernels/geop_npu.cc:400] [GEOP] Close TsdClient success.
-2020-06-20 12:31:57.724914: I tf_adapter/kernels/geop_npu.cc:375] [GEOP] GeOp Finalize success, tf session: direct3a0fa9fc2797f845, graph_id_: 5
-step   epoch  top1    top5   loss  checkpoint_time(UTC)
-25020  1.0    36.212  63.23  3.85  2020-06-20 11:58:14
-30024  1.0    40.609  67.52  3.71  2020-06-20 12:03:45
-35028  1.0    43.494  70.31  3.57  2020-06-20 12:08:50
-40032  1.0    45.985  72.55  3.40  2020-06-20 12:13:55
-45036  2.0    48.612  75.00  3.20  2020-06-20 12:18:59
-Finished evaluation
-```
diff --git a/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/README.md b/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/README.md
index d36bdab4a5f9761ec65d72439db5a18f075d778d..ddd7c08a6eb2bd979f766a9aec6c2e8bbbcf5aec 100644
--- a/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/README.md
+++ b/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/README.md
@@ -36,16 +36,13 @@
 OSMN uses modulator modules to quickly adapt the segmentation network to a specific object, without performing hundreds of gradient-descent iterations and without adjusting all of the parameters. Video object segmentation hinges on two key cues: visual appearance and continuous motion in space. To exploit both visual and spatial information, a visual modulator and a spatial modulator are combined, which learn, from the first frame's annotation and the object's spatial location respectively, how to adjust the main segmentation network.
 - Reference paper:
-
   https://openaccess.thecvf.com/content_cvpr_2018/papers/Yang_Efficient_Video_Object_CVPR_2018_paper.pdf
 - Reference implementation:
-
   https://github.com/linjieyangsc/video_seg
-- Implementation adapted to the Ascend AI Processor:
-
-  https://gitee.com/ascend/modelzoo/tree/master/built-in/TensorFlow/Research/cv/image_segmentation/OSMN_ID1103_for_TensorFlow
+- Implementation adapted to the Ascend AI Processor:
+  https://gitee.com/ascend/ModelZoo-TensorFlow/tree/master/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow

@@ -76,7 +73,7 @@
 ## Install dependencies

-See: requirements.txt
+pip3 install -r requirements.txt

diff --git a/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/requirements.txt b/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/requirements.txt
index 2a1bb5b6a9e33ec905dd851ebf0c61430228468a..99dd505369eb8e35452aaa7be4bc00a920628917 100644
--- a/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/requirements.txt
+++ b/TensorFlow/built-in/cv/image_segmentation/OSMN_ID1103_for_TensorFlow/requirements.txt
@@ -1,4 +1,3 @@
-Python 2.7
-Tensorflow r1.0 or higher (pip install tensorflow-gpu) along with standard dependencies
-Densecrf by Philipp Krähenbühl and Vladlen Koltun
-Other python dependencies: PIL (Pillow version), numpy, scipy
\ No newline at end of file
+Pillow==7.2.0
+numpy==1.19.5
+scipy==1.2.1
\ No newline at end of file
diff --git a/TensorFlow/built-in/nlp/ALBERT-lcqmc-ZH_ID1461_for_TensorFlow/README.md b/TensorFlow/built-in/nlp/ALBERT-lcqmc-ZH_ID1461_for_TensorFlow/README.md
index 31c8db595cbba563f00371b24cc521217cb5e25c..4ce59aa216938d7b9876b58f8a82b6fbf194458c 100644
--- a/TensorFlow/built-in/nlp/ALBERT-lcqmc-ZH_ID1461_for_TensorFlow/README.md
+++ b/TensorFlow/built-in/nlp/ALBERT-lcqmc-ZH_ID1461_for_TensorFlow/README.md
@@ -2,7 +2,6 @@
 - [Overview](#概述.md)
 - [Training Environment Setup](#训练环境准备.md)
 - [Quick Start](#快速上手.md)
-- [Transfer Learning Guide](#迁移学习指导.md)
 - [Advanced Reference](#高级参考.md)
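The OSMN overview in the hunk above describes adapting the main segmentation network with a visual modulator and a spatial modulator instead of full fine-tuning. A minimal numpy sketch of that scale-and-shift idea follows; the tensor shapes and the sources of `gamma`/`beta` are illustrative assumptions, not OSMN's actual layers:

```python
import numpy as np

def modulate(features, gamma, beta):
    """Scale-and-shift modulation of an NHWC feature map.

    gamma: (C,) channel-wise scale, e.g. produced by a visual modulator
           from the first frame's annotated object.
    beta:  (N, H, W, 1) location-dependent shift, e.g. produced by a
           spatial modulator from the object's position prior.
    """
    return features * gamma + beta  # numpy broadcasting applies both

feats = np.ones((1, 2, 2, 3), dtype=np.float32)
gamma = np.array([0.0, 1.0, 2.0], dtype=np.float32)   # suppress / keep / boost channels
beta = np.zeros((1, 2, 2, 1), dtype=np.float32)
print(modulate(feats, gamma, beta)[0, 0, 0].tolist())  # [0.0, 1.0, 2.0]
```

Because only the small modulator outputs change per object, adaptation amounts to one forward pass rather than hundreds of gradient-descent steps, which is the speed argument made in the overview.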

Basic Information

**发布者(Publisher):Huawei**

@@ -31,22 +30,17 @@
 Increasing the model size when pretraining natural language representations often improves performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer training times. To address these problems, we present two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT. Comprehensive empirical evidence shows that the proposed methods lead to models that scale much better than the original BERT. We also use a self-supervised loss that focuses on modeling inter-sentence coherence, and show that it consistently helps downstream tasks with multi-sentence inputs. As a result, our best model establishes new state-of-the-art results on the GLUE, RACE, and SQuAD benchmarks while having fewer parameters than BERT-large.
 - Reference paper:
-
   https://paperswithcode.com/paper/albert-a-lite-bert-for-self-supervised
-- Reference implementation:
-
+- Reference implementation:
   https://github.com/brightmart/albert_zh
 - Implementation adapted to the Ascend AI Processor:
-
-  https://gitee.com/ascend/modelzoo/tree/master/built-in/TensorFlow/Official/nlp/ALBERT-lcqmc-ZH_ID1461_for_TensorFlow
+  https://gitee.com/ascend/ModelZoo-TensorFlow/tree/master/TensorFlow/built-in/nlp/ALBERT-lcqmc-ZH_ID1461_for_TensorFlow
 - Obtain the code at the corresponding commit_id via Git as follows:
-
-
 ```
 git clone {repository_url}        # clone the repository
 cd {repository_name}              # enter the code directory of the model
@@ -109,7 +103,8 @@ run_config = NPURunConfig(
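One of the two parameter-reduction techniques the abstract above refers to is ALBERT's factorized embedding parameterization: instead of tying the token-embedding size to the hidden size (V×H parameters), the embedding is factorized into V×E + E×H with E ≪ H. A quick back-of-the-envelope sketch (the vocabulary and dimension values below are illustrative assumptions, not this model's configuration):

```python
def embedding_params(vocab_size, hidden_size, embedding_size=None):
    """Token-embedding parameter count: tied V*H (BERT-style) versus the
    factorized V*E + E*H used by ALBERT when E << H."""
    if embedding_size is None:
        return vocab_size * hidden_size
    return vocab_size * embedding_size + embedding_size * hidden_size

tied = embedding_params(30000, 768)             # 23,040,000 parameters
factorized = embedding_params(30000, 768, 128)  # 3,938,304 parameters
print(tied, factorized)
```

The other technique, cross-layer parameter sharing, reduces the transformer-stack parameters in a similar spirit; together they are what lets ALBERT scale "much better" than the original BERT.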

Training Environment Setup

-1. For hardware environment setup, see the "[Driver and Firmware Installation and Upgrade Guide](https://support.huawei.com/enterprise/zh/category/ai-computing-platform-pid-1557196528909)" of each hardware product. Firmware and drivers matching the CANN version must be installed on the hardware device._
+1. For hardware environment setup, see the "[Driver and Firmware Installation and Upgrade Guide](https://support.huawei.com/enterprise/zh/category/ai-computing-platform-pid-1557196528909)" of each hardware product. Firmware and drivers matching the CANN version must be installed on the hardware device.
+
 2. Docker must be installed on the host, and the image must be obtained after logging in to the [Ascend Hub](https://ascendhub.huawei.com/#/detail?name=ascend-tensorflow-arm).
    The images supported by this model are listed in [Table 1](#zh-cn_topic_0000001074498056_table1519011227314).

@@ -135,6 +130,10 @@ run_config = NPURunConfig(

+3. Install dependencies
+
+pip3 install -r requirements.txt
+

Quick Start

## Dataset preparation

@@ -319,64 +318,4 @@ run_config = NPURunConfig(
 --use_fp16    Whether to use fp16, default is True.
 ```
-## Training process
-
-During training, the training script saves a checkpoint every 2500 training steps; the results are stored in the model_dir directory.
-
-The training script also prints the current loss value once per step, to make it easy to check loss convergence, for example:
-
-```
-INFO:tensorflow:words/sec: 43.81k
-I0928 11:44:46.900814 281473395838992 wps_hook.py:60] words/sec: 43.81k
-INFO:tensorflow:loss = 10.8089695, step = 36 (0.219 sec)
-I0928 11:44:46.901059 281473395838992 basic_session_run_hooks.py:260] loss = 10.8089695, step = 36 (0.219 sec)
-2020-09-28 11:44:46.901399: I tf_adapter/kernels/geop_npu.cc:393] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp9_0, num_inputs:0, num_outputs:1
-2020-09-28 11:44:46.901443: I tf_adapter/kernels/geop_npu.cc:258] [GEOP] tf session directb287e87e429467f3, graph id: 31 no need to rebuild
-2020-09-28 11:44:46.901453: I tf_adapter/kernels/geop_npu.cc:602] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp9_0 ,tf session: directb287e87e429467f3 ,graph id: 31
-2020-09-28 11:44:46.901727: I tf_adapter/kernels/geop_npu.cc:615] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp9_0, ret_status:success ,tf session: directb287e87e429467f3 ,graph id: 31 [0 ms]
-2020-09-28 11:44:46.904749: I tf_adapter/kernels/geop_npu.cc:76] BuildOutputTensorInfo, num_outputs:1
-2020-09-28 11:44:46.904783: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:0, total_bytes:8, shape:, tensor_ptr:281462830504320, output281453257618064
-2020-09-28 11:44:46.904805: I tf_adapter/kernels/geop_npu.cc:595] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp9_0[ 3352us]
-INFO:tensorflow:global_step...36
-I0928 11:44:46.904968 281473395838992 npu_hook.py:114] global_step...36
-2020-09-28 11:44:46.906018: I tf_adapter/kernels/geop_npu.cc:393] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp21_0, num_inputs:10, num_outputs:8
-2020-09-28 11:44:46.906123: I tf_adapter/kernels/geop_npu.cc:258] [GEOP] tf session directb287e87e429467f3, graph id: 71 no need to rebuild
-2020-09-28 11:44:46.906304: I tf_adapter/kernels/geop_npu.cc:602] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp21_0 ,tf session: directb287e87e429467f3 ,graph id: 71
-2020-09-28 11:44:46.906606: I tf_adapter/kernels/geop_npu.cc:615] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp21_0, ret_status:success ,tf session: directb287e87e429467f3 ,graph id: 71 [0 ms]
-2020-09-28 11:44:47.100919: I tf_adapter/kernels/geop_npu.cc:76] BuildOutputTensorInfo, num_outputs:8
-2020-09-28 11:44:47.100972: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:0, total_bytes:4, shape:, tensor_ptr:281454044286272, output281454046025984
-2020-09-28 11:44:47.100988: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:1, total_bytes:4, shape:, tensor_ptr:281454044286400, output281454043978208
-2020-09-28 11:44:47.100996: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:2, total_bytes:4, shape:, tensor_ptr:281454044286592, output281454046026240
-2020-09-28 11:44:47.101005: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:3, total_bytes:4, shape:, tensor_ptr:281454045161664, output281454045699456
-2020-09-28 11:44:47.101013: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:4, total_bytes:4, shape:, tensor_ptr:281454045161856, output281454045182336
-2020-09-28 11:44:47.101020: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:5, total_bytes:4, shape:, tensor_ptr:281454045161984, output281454045182208
-2020-09-28 11:44:47.101028: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:6, total_bytes:8, shape:, tensor_ptr:281454045266560, output281454043539264
-2020-09-28 11:44:47.101036: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:7, total_bytes:8, shape:, tensor_ptr:281454045266752, output281454043539552
-2020-09-28 11:44:47.101045: I tf_adapter/kernels/geop_npu.cc:595] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp21_0[ 194908us]
-2020-09-28 11:44:47.102274: I tf_adapter/kernels/geop_npu.cc:393] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp9_0, num_inputs:0, num_outputs:1
-2020-09-28 11:44:47.102332: I tf_adapter/kernels/geop_npu.cc:258] [GEOP] tf session directb287e87e429467f3, graph id: 31 no need to rebuild
-2020-09-28 11:44:47.102343: I tf_adapter/kernels/geop_npu.cc:602] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp9_0 ,tf session: directb287e87e429467f3 ,graph id: 31
-2020-09-28 11:44:47.102605: I tf_adapter/kernels/geop_npu.cc:615] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp9_0, ret_status:success ,tf session: directb287e87e429467f3 ,graph id: 31 [0 ms]
-2020-09-28 11:44:47.105650: I tf_adapter/kernels/geop_npu.cc:76] BuildOutputTensorInfo, num_outputs:1
-2020-09-28 11:44:47.105681: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:0, total_bytes:8, shape:, tensor_ptr:281462830504512, output281453176492864
-2020-09-28 11:44:47.105695: I tf_adapter/kernels/geop_npu.cc:595] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp9_0[ 3351us]
-INFO:tensorflow:global_step/sec: 4.64619
-I0928 11:44:47.106539 281473395838992 basic_session_run_hooks.py:692] global_step/sec: 4.64619
-2020-09-28 11:44:47.107284: I tf_adapter/kernels/geop_npu.cc:393] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp9_0, num_inputs:0, num_outputs:1
-2020-09-28 11:44:47.107373: I tf_adapter/kernels/geop_npu.cc:258] [GEOP] tf session directb287e87e429467f3, graph id: 31 no need to rebuild
-2020-09-28 11:44:47.107391: I tf_adapter/kernels/geop_npu.cc:602] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp9_0 ,tf session: directb287e87e429467f3 ,graph id: 31
-2020-09-28 11:44:47.107602: I tf_adapter/kernels/geop_npu.cc:615] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp9_0, ret_status:success ,tf session: directb287e87e429467f3 ,graph id: 31 [0 ms]
-2020-09-28 11:44:47.110194: I tf_adapter/kernels/geop_npu.cc:76] BuildOutputTensorInfo, num_outputs:1
-2020-09-28 11:44:47.110217: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:0, total_bytes:8, shape:, tensor_ptr:281462830977792, output281453202209616
-2020-09-28 11:44:47.110232: I tf_adapter/kernels/geop_npu.cc:595] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp9_0[ 2841us]
-2020-09-28 11:44:47.111610: I tf_adapter/kernels/geop_npu.cc:393] [GEOP] Begin GeOp::ComputeAsync, kernel_name:GeOp19_0, num_inputs:0, num_outputs:1
-2020-09-28 11:44:47.111668: I tf_adapter/kernels/geop_npu.cc:258] [GEOP] tf session directb287e87e429467f3, graph id: 61 no need to rebuild
-2020-09-28 11:44:47.111685: I tf_adapter/kernels/geop_npu.cc:602] [GEOP] Call ge session RunGraphAsync, kernel_name:GeOp19_0 ,tf session: directb287e87e429467f3 ,graph id: 61
-2020-09-28 11:44:47.111890: I tf_adapter/kernels/geop_npu.cc:615] [GEOP] End GeOp::ComputeAsync, kernel_name:GeOp19_0, ret_status:success ,tf session: directb287e87e429467f3 ,graph id: 61 [0 ms]
-2020-09-28 11:44:47.114397: I tf_adapter/kernels/geop_npu.cc:76] BuildOutputTensorInfo, num_outputs:1
-2020-09-28 11:44:47.114428: I tf_adapter/kernels/geop_npu.cc:103] BuildOutputTensorInfo, output index:0, total_bytes:8, shape:, tensor_ptr:281463707345344, output281453333853216
-2020-09-28 11:44:47.114442: I tf_adapter/kernels/geop_npu.cc:595] [GEOP] RunGraphAsync callback, status:0, kernel_name:GeOp19_0[ 2756us]
-```
-
-## Inference/validation process
-
-Start single-card or multi-card testing with the test commands in "Quick Start". The single-card and multi-card configurations are the same as for training.
-
-BLEU = 28.74, 59.5/34.3/22.2/15.0 (BP=1.000, ratio=1.029, hyp_len=66369, ref_len=64504)
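The removed BLEU line can be cross-checked from its own components: corpus BLEU is the brevity penalty times the geometric mean of the four modified n-gram precisions. A small sketch of the standard formula (this is the textbook definition, not this repository's evaluation code):

```python
import math

def bleu_from_precisions(precisions, hyp_len, ref_len):
    """BLEU = BP * exp(mean(log p_n)); BP = 1 when the hypothesis is not
    shorter than the reference, otherwise exp(1 - ref_len / hyp_len)."""
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    log_avg = sum(math.log(p / 100.0) for p in precisions) / len(precisions)
    return 100.0 * bp * math.exp(log_avg)

# the reported 59.5/34.3/22.2/15.0 with hyp_len=66369, ref_len=64504
score = bleu_from_precisions([59.5, 34.3, 22.2, 15.0], 66369, 64504)
print(round(score, 1))  # 28.7, matching the reported BLEU = 28.74 up to rounding
```

Since ratio = 66369/64504 ≈ 1.029 > 1, the brevity penalty is 1.000, exactly as reported in the log line.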