diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/LICENSE" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/LICENSE" new file mode 100644 index 0000000000000000000000000000000000000000..ea754b00e424d7d35f371c971001a3b865de0535 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/LICENSE" @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2019 Li Chen + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/Network.png" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/Network.png" new file mode 100644 index 0000000000000000000000000000000000000000..91666b9e308b4ba03f57d0777d508f462d57133f Binary files /dev/null and "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/Network.png" differ diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/README.md" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/README.md" new file mode 100644 index 0000000000000000000000000000000000000000..9bcdabab731e8f3dcb686b3afeb8ee5a87b9f694 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/README.md" @@ -0,0 +1,345 @@ +- [基本信息](#基本信息.md) +- [概述](#概述.md) +- [训练环境准备](#训练环境准备.md) +- [快速上手](#快速上手.md) +- [迁移学习指导](#迁移学习指导.md) +- [高级参考](#高级参考.md) +

## Basic Information

**Publisher: Huawei**

**Application Domain: CV**

**Version: 1.1**

**Modified: 2021.12.14**

**Size: 249M**

**Framework: TensorFlow 1.15.0**

**Model Format: ckpt**

**Precision: Mixed**

**Processor: Ascend 910**

**Categories: Benchmark**

**Description: a TensorFlow network that generates multiple feasible 3D pose hypotheses with a multimodal mixture density network**

## Overview

- GMH-MDN: a network that generates multiple feasible 3D pose hypotheses with a multimodal mixture density network

- Reference paper:

  ```
  https://arxiv.org/pdf/1904.05547.pdf
  ```

- Reference implementation:

  ```
  https://github.com/chaneyddtt/Generating-Multiple-Hypotheses-for-3D-Human-Pose-Estimation-with-Mixture-Density-Network
  ```

- Implementation adapted for the Ascend AI processor:
  ```
  https://gitee.com/ascend/modelzoo/tree/master/built-in/TensorFlow/Benchmark/cv/image_classification/Shufflenet_ID0645_for_TensorFlow
  branch=master
  commit_id= 477b07a1e95a35885b3a9a569b1c8ccb9ad5d7af
  ```

- To fetch the code at the corresponding commit_id via Git:

  ```
  git clone {repository_url}    # clone the repository
  cd {repository_name}          # enter the model's code repository
  git checkout {branch}         # switch to the corresponding branch
  git reset --hard {commit_id}  # reset the code to the corresponding commit_id
  cd {code_path}                # enter the model code path; not needed if the repository contains only this model
  ```

## Default configuration
- Network settings
  - Initial learning rate 0.001, with exponential decay applied to learning_rate
  - Optimizer: Adam
  - Learning rate decay steps decay_steps: 100000
  - Learning rate decay factor decay_rate: 0.96
  - Single-card batch size: 64
  - Total epochs: 200
  - Dropout: 0.5

- Training hyperparameters (single card):
  - Batch size: 64
  - LR scheduler: exponential decay
  - Learning rate (LR): 0.001
  - Train epochs: 200
  - Dropout: 0.5
  - linear_size: 1024 (number of units in each layer)


## Supported features

| Feature              | Supported |
| -------------------- | --------- |
| Distributed training | No        |
| Mixed precision      | Yes       |
| Data parallelism     | No        |


## Mixed precision training

The Ascend 910 AI processor provides automatic mixed precision: following a built-in optimization strategy, it automatically lowers selected float32 operators in the network to float16, improving performance and reducing memory usage with very little loss of precision.

## Enabling mixed precision
Example code:

```
config = tf.ConfigProto()
custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
custom_op.name = "NpuOptimizer"
custom_op.parameter_map["use_off_line"].b = True
custom_op.parameter_map["precision_mode"].s = tf.compat.as_bytes("allow_mix_precision")
custom_op.parameter_map["modify_mixlist"].s = tf.compat.as_bytes("/home/test/ops_info.json")
config.graph_options.rewrite_options.remapping = RewriterConfig.OFF
config.graph_options.rewrite_options.memory_optimization = RewriterConfig.OFF
with tf.Session(config=config) as sess:
  print(sess.run(cost))
```
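The learning-rate schedule listed under "Default configuration" corresponds to TensorFlow's exponential decay driven by the global step. Below is a minimal sketch with the hyperparameters from the table above; the actual wiring lives in src/mix_den_model.py.

```
import tensorflow as tf  # TensorFlow 1.15

global_step = tf.Variable(0, trainable=False, name="global_step")
# lr = 0.001 * 0.96 ** (global_step / 100000)
learning_rate = tf.train.exponential_decay(0.001, global_step,
                                           decay_steps=100000, decay_rate=0.96)
optimizer = tf.train.AdamOptimizer(learning_rate)
```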

## Training Environment Setup

1. For hardware environment setup, see the "[Driver and Firmware Installation and Upgrade Guide](https://support.huawei.com/enterprise/zh/category/ai-computing-platform-pid-1557196528909)" for your hardware product. The firmware and driver matching your CANN version must be installed on the device.
2. Install Docker on the host and log in to the [Ascend Hub](https://ascendhub.huawei.com/#/detail?name=ascend-tensorflow-arm) to obtain the container image.

   The images supported by this model are listed in Table 1.

   **Table 1** Image list

   | Image name | Image version | Compatible CANN version |
   | ---------- | ------------- | ----------------------- |
   | [ascend-tensorflow-arm](https://ascendhub.huawei.com/#/detail?name=ascend-tensorflow-arm) | 20.2.0 | 20.2 |

## Quick Start

### Dataset preparation

1. Pre-training uses the [Human3.6M] dataset, which users must request themselves. Because approval can be slow, the data can also be downloaded [here](https://github.com/MendyD/human36m).

2. After downloading, place the dataset under the model directory and point the training script at its path (a quick layout check is sketched at the end of this section).

### Model training
- Download the training scripts.

- Start training.

  1. Before launching training, configure the runtime environment variables.

     See [Ascend 910 training platform environment variable setup](https://gitee.com/ascend/modelzoo/wikis/Ascend%20910%E8%AE%AD%E7%BB%83%E5%B9%B3%E5%8F%B0%E7%8E%AF%E5%A2%83%E5%8F%98%E9%87%8F%E8%AE%BE%E7%BD%AE?sort_id=3148819) for details.

  2. Single-card training

     2.1 Set the single-card training parameters (script: ./GMH—MDN_ID1225_for_TensorFlow/test/train_full_1p.sh), for example as below. Make sure data_dir points to your dataset, and adjust batch_size and epochs as needed.

         data_dir="../data/h36m/"
         batch_size=64
         epochs=200

     2.2 Single-card training command (script: ./GMH—MDN_ID1225_for_TensorFlow/test/train_performance_1p.sh):

         bash train_performance_1p.sh --train_dir
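Before launching the shell script, it can be worth checking that data_dir is laid out the way the loaders expect. Below is a minimal sketch using the repository's own modules (run from the src directory; the path and the chosen action are illustrative):

```
import cameras      # src/cameras.py
import data_utils   # src/data_utils.py

DATA_DIR = "../data/h36m/"  # same value as data_dir in test/train_full_1p.sh

# Camera parameters for the training and test subjects
rcams = cameras.load_cameras(bpath=DATA_DIR + "cameras.h5")

# Load the 3d ground-truth poses for one action to verify the S*/MyPoses layout
actions = data_utils.define_actions("Walking")
train_3d = data_utils.load_data(DATA_DIR, data_utils.TRAIN_SUBJECTS, actions, dim=3)
print("loaded %d cameras and %d 3d sequences" % (len(rcams), len(train_3d)))
```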

## Testing

- Download the pretrained model

  [Download link](https://drive.google.com/open?id=1ndJyuVL-7fbhw-G654m5U8tHogcQIftT)
- Configure the parameters

  1. Modify the launch parameters of the script (test/train_full_1p.sh) and set test to True, for example:

         data_dir="../data/h36m/"
         batch_size=64
         epochs=200

  2. Add the checkpoint path, set according to the actual location of your checkpoints. Either the pretrained model or a model you trained yourself can be used:

         checkpoint_dir=../Models/mdm_5_prior/

- Run the test command (the error metric behind the reported per-action numbers is sketched below)

  1. After the files above have been modified, run the test command:

         bash test/train_performance.sh
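The per-action numbers that the test run writes to the log (the ==mm== column shown under "Advanced Reference") are joint position errors in millimetres, reported after the output normalization is undone. The sketch below illustrates how such an error can be computed with data_utils.unNormalizeData; the array names are illustrative and the actual evaluation loop lives in src/predict_3dpose_mdm.py.

```
import numpy as np
import data_utils

def mean_joint_error_mm(pred_norm, gt_norm, data_mean_3d, data_std_3d, dim_to_ignore_3d):
    # Undo the per-dimension normalization so both poses are expressed in millimetres
    pred = data_utils.unNormalizeData(pred_norm, data_mean_3d, data_std_3d, dim_to_ignore_3d)
    gt = data_utils.unNormalizeData(gt_norm, data_mean_3d, data_std_3d, dim_to_ignore_3d)

    # Euclidean distance per joint, averaged over joints and examples
    n_joints = pred.shape[1] // 3
    diff = (pred - gt).reshape(-1, n_joints, 3)
    return np.mean(np.sqrt(np.sum(diff ** 2, axis=2)))
```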

## Transfer Learning Guide

- Dataset preparation.

  1. Obtain the data.
     See the dataset preparation steps under "Quick Start".

  2. The data directory is laid out as follows:

         human36m/
         ├── h36m/
             ├── cameras.h5
             ├── S1/
             ├── S11/
             ├── S5/
             ├── S6/
             ├── S7/
             ├── S8/
             ├── S9/
             └── logging.conf

- Modify the training script.

  1. Load the pretrained model.
     Set the **load_dir** path and the **load** parameter, where **load** is the numeric part of the checkpoint file name, e.g. 4874200 for checkpoint-4874200.index (a sketch of the corresponding restore call follows this section).

- Model training.

  See the "Quick Start" section.

- Model evaluation.

  Follow the training steps under "Model training".
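The load flag maps to a tf.train.Saver checkpoint named checkpoint-&lt;step&gt; inside load_dir. A minimal sketch of that restore logic is shown below, assuming the saver created by LinearModel in src/mix_den_model.py (model.saver); the real loading is done by src/predict_3dpose_mdm.py.

```
import os
import tensorflow as tf

def restore_or_init(sess, saver, load_dir, load):
    """Mirror of the --load_dir/--load flags: restore checkpoint-<load>, or start fresh."""
    if load > 0:
        # e.g. ../Models/mdm_5_prior/checkpoint-4874200 for checkpoint-4874200.index
        saver.restore(sess, os.path.join(load_dir, "checkpoint-{0}".format(load)))
    else:
        sess.run(tf.global_variables_initializer())
```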

## Advanced Reference

+ +### 脚本和示例代码 + + ├── README.md //说明文档 + ├── requirements.txt //依赖 + ├── LICENSE + ├── Models + ├── experiments + ├── src_npu_20211208155957 + │ ├── cameras.py + │ ├── data_utils.py + │ ├── logging.conf + │ ├── mix_den_model.py + │ ├── predict_3dpose_mdm.py + │ ├── procrustes.py + │ └── viz.py + + +### 脚本参数 + +``` +--learning_rate Learning rate default:0.001 +--dropout Dropout keep probability 1 means no dropout default:0.5 +--batch_size batch size to use during training default:64 +--epochs How many epochs we should train for default:200 +--camera_frame Convert 3d poses to camera coordinates default:TRUE +--max_norm Apply maxnorm constraint to the weights default:TRUE +--batch_norm Use batch_normalization default:TRUE +# Data loading +--predict_14 predict 14 joints default:FALSE +--use_sh Use 2d pose predictions from StackedHourglass default:TRUE +--action The action to train on 'All' means all the actions default:All +# Architecture +--linear_size Size of each model layer default:1024 +--num_layers Number of layers in the model default:2 +--residual Whether to add a residual connection every 2 layers default:TRUE +# Evaluation +--procrustes Apply procrustes analysis at test time default:FALSE +--evaluateActionWise The dataset to use either h36m or heva default:TRUE +# Directories +--cameras_path Directory to load camera parameters default:/data/h36m/cameras.h5 +--data_dir Data directory default: /data/h36m/ +--train_dir Training directory default:/experiments/test_git/ +--load_dir Specify the directory to load trained model default:/Models/mdm_5_prior/ +# Train or load +--sample Set to True for sampling default:FALSE +--test Set to True for sampling default:FALSE +--use_cpu Whether to use the CPU default:FALSE +--load Try to load a previous checkpoint default:0 +--miss_num Specify how many missing joints default:1 +``` + +## 训练过程 + +通过“模型训练”中的训练指令启动单卡训练,通过运行脚本训练。 +将训练脚本(test/train_full_1p.sh)中的data_dir设置为训练数据集的路径。具体的流程参见“模型训练”的示例。 +模型存储路径为{train_dir},包括训练的log以及checkpoints文件。以单卡训练为例,loss信息在文件.{train_dir}/log/log.txt中,示例如下。 + +``` +Epoch: 1 +Global step: 48742 +Learning rate: 9.80e-04 +Train loss avg: 10.5440 +============================= +2021-12-10 08:44:00,731 [INFO] root - ===Action=== ==mm== +2021-12-10 08:44:14,404 [INFO] root - Directions 67.16 +2021-12-10 08:44:41,598 [INFO] root - Discussion 69.08 +2021-12-10 08:44:58,180 [INFO] root - Eating 64.17 +2021-12-10 08:45:11,033 [INFO] root - Greeting 70.90 +2021-12-10 08:45:34,378 [INFO] root - Phoning 84.17 +2021-12-10 08:45:46,680 [INFO] root - Photo 86.36 +2021-12-10 08:45:57,517 [INFO] root - Posing 63.92 +2021-12-10 08:46:05,577 [INFO] root - Purchases 68.64 +2021-12-10 08:46:22,047 [INFO] root - Sitting 82.77 +2021-12-10 08:46:35,970 [INFO] root - SittingDown 107.23 +2021-12-10 08:46:59,066 [INFO] root - Smoking 75.12 +2021-12-10 08:47:14,754 [INFO] root - Waiting 71.51 +2021-12-10 08:47:26,528 [INFO] root - WalkDog 78.11 +2021-12-10 08:47:38,442 [INFO] root - Walking 59.05 +2021-12-10 08:47:49,315 [INFO] root - WalkTogether 63.24 +2021-12-10 08:47:49,323 [INFO] root - Average 74.09 +2021-12-10 08:47:49,325 [INFO] root - =================== +``` + + +### 推理/验证过程 + +#### 推理验证 + +在200 epoch训练执行完成后,请参见“模型训练”中的测试流程,需要修改脚本启动参数(脚本位于test/train_performance.sh)将test设置为True,修改load_dir的路径以及load参数,其中load_dir 为模型ckpt目录,load为ckpt 文件 checkpoint-4874200.index 中的数字部分 4874200,然后执行脚本。 + +`bash train_full_1p.sh --test=True` + +该脚本会自动执行验证流程,验证结果若想输出至文档描述文件,则需修改启动脚本参数,否则输出至默认log文件(./experiments/test_git/log/log.txt)中。 + +``` +2021-12-10 07:29:31,061 
[INFO] root - Logs will be written to ../experiments/test_git/log +2021-12-10 07:32:14,597 [INFO] root - ===Action=== ==mm== +2021-12-10 07:32:33,258 [INFO] root - Directions 50.76 +2021-12-10 07:32:59,096 [INFO] root - Discussion 61.78 +2021-12-10 07:33:14,707 [INFO] root - Eating 56.20 +2021-12-10 07:33:26,797 [INFO] root - Greeting 60.24 +2021-12-10 07:33:49,975 [INFO] root - Phoning 78.02 +2021-12-10 07:34:02,201 [INFO] root - Photo 74.15 +2021-12-10 07:34:13,259 [INFO] root - Posing 52.02 +2021-12-10 07:34:21,237 [INFO] root - Purchases 67.17 +2021-12-10 07:34:37,670 [INFO] root - Sitting 78.90 +2021-12-10 07:34:50,829 [INFO] root - SittingDown 101.50 +2021-12-10 07:35:13,391 [INFO] root - Smoking 66.54 +2021-12-10 07:35:28,320 [INFO] root - Waiting 60.78 +2021-12-10 07:35:39,677 [INFO] root - WalkDog 68.80 +2021-12-10 07:35:51,568 [INFO] root - Walking 52.74 +2021-12-10 07:36:02,067 [INFO] root - WalkTogether 57.69 +2021-12-10 07:36:02,660 [INFO] root - Average 65.82 +2021-12-10 07:36:02,671 [INFO] root - =================== +``` + \ No newline at end of file diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/README_ori.md" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/README_ori.md" new file mode 100644 index 0000000000000000000000000000000000000000..6f0ca3c6033eca7263c895feb5ac26de744f4b42 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/README_ori.md" @@ -0,0 +1,73 @@ +# Generating-Multiple-Hypotheses-for-3D-Human-Pose-Estimation-with-Mixture-Density-Network + +**About** + +This is the source code for the paper + +Chen Li, Gim Hee Lee. Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network. In CVPR2019. + +We argue that 3D human pose estimation from a monocular/2D-joint input is an iverse problem where multiple solutions can exist. +![Problem illustration](problem_illustration.png) + +We use a two-stage approach to generate multiple 3D pose hypotheses. The 2D joints are firstly detected from the input images in the first stage, followed by a feature extractor and hypotheses generator to generate 3D pose hypotheses. + +![Network architecture](Network.png) + +For more details, please refer to our paper on [arXiv](https://arxiv.org/pdf/1904.05547.pdf). + +**Bibtex:** +``` +@InProceedings{Li_2019_CVPR, +author = {Li, Chen and Lee, Gim Hee}, +title = {Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network}, +booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, +month = {June}, +year = {2019} +} +``` + +**Dependencies** +1. h5py--to read data +2. Tensorflow 1.8 + +**Train** + +Get this code: +``` +git clone https://github.com/chaneyddtt/Generating-Multiple-Hypotheses-for-3D-Human-Pose-Estimation-with-Mixture-Density-Network.git +``` +Download the 2D detections of [Human3.6 dataset](https://github.com/una-dinosauria/3d-pose-baseline). + +Run: +``` +python predict_3dpose_mdm.py --train_dir +``` +You can also change the arguments during training. For example,you can train with one or two missing joint(s) randomly selected from the limbs by run: +``` +python predict_3dpose_mdm.py --miss_num +``` +You can also change other arguments in the predict_3dpose_mdm.py in a similar way. 
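The missing-joint corruption behind --miss_num is implemented in data_utils.generage_missing_data, which zeroes the 2d coordinates of randomly chosen limb joints before the poses are fed to the network. A small illustration on dummy inputs (the data here is random, standing in for real stacked-hourglass detections; shapes follow the 16-joint, 32-dimensional input used by the model):

```
import numpy as np
import data_utils

# Fake batch of 4 normalized 2d poses (16 joints x 2 coordinates = 32 dims)
enc_in = np.random.randn(4, 32).astype(np.float32)

# Zero out one randomly selected limb joint per example, as --miss_num 1 does
enc_in_missing = data_utils.generage_missing_data(enc_in.copy(), 1)
print(np.sum(enc_in_missing == 0.0, axis=1))  # two zeroed dims (x and y) per row
```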
+ + **Test** + +Down our [pretrained model](https://drive.google.com/open?id=1ndJyuVL-7fbhw-G654m5U8tHogcQIftT) + +To test our pretrained model, run: +``` +python predict_3dpose_mdm.py --test True --load 4338038 --load_dir ../Models/mdm_5_prior/ (model with the dirichlet conjucate prior) +``` +or run: +``` +python predict_3dpose_mdm.py --test True --load 4679232 --load_dir ../Models/mdm_5/ (model without the dirichlet conjucate prior) +``` +**Visualize** + +To visualze all the five 3D pose hypotheses generated by our model, run: +``` +python predict_3dpose_mdm.py --sample True --load 4338038 --load_dir ../Models/mdm_5_prior/ +``` + + +**Acknowledgments** + +The pre-processed human3.6 dataset and the feature extractor of our model was ported or adapted from the code by [@una-dinosauria](https://github.com/una-dinosauria/3d-pose-baseline). diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelarts_entry_acc.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelarts_entry_acc.py" new file mode 100644 index 0000000000000000000000000000000000000000..ee3060aab202f6ef6c354b4bfd7be193b7ae82f9 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelarts_entry_acc.py" @@ -0,0 +1,63 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
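# Overview (descriptive comments): this ModelArts entry script parses the
# data_url/train_url paths passed in by ModelArts, switches into the code
# directory, converts the shell scripts under ./test to unix line endings,
# launches test/train_full_1p.sh with --data_dir/--train_dir, and finally
# copies the working directory back to the output path for backup.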
+ +import os +import argparse +import sys + +# 解析输入参数data_url +parser = argparse.ArgumentParser() +parser.add_argument("--data_url", type=str, default="/home/ma-user/modelarts/inputs/data_url_0") +parser.add_argument("--train_url", type=str, default="/home/ma-user/modelarts/outputs/train_url_0/") +config = parser.parse_args() + +print("[CANN-Modelzoo] code_dir path is [%s]" % (sys.path[0])) +code_dir = sys.path[0] +os.chdir(code_dir) +print("[CANN-Modelzoo] work_dir path is [%s]" % (os.getcwd())) + +print("[CANN-Modelzoo] before train - list my run files:") +os.system("ls -al /usr/local/Ascend/ascend-toolkit/") + +print("[CANN-Modelzoo] before train - list my dataset files:") +os.system("ls -al %s" % config.data_url) + +print("[CANN-Modelzoo] start run train shell") +# 设置sh文件格式为linux可执行 +os.system("dos2unix ./test/*") + +# 执行train_full_1p.sh或者train_performance_1p.sh,需要用户自己指定 +# full和performance的差异,performance只需要执行很少的step,控制在15分钟以内,主要关注性能FPS +os.system("bash ./test/train_full_1p.sh --data_dir=%s --train_dir=%s " % (config.data_url, config.train_url)) + +print("[CANN-Modelzoo] finish run train shell") + +# 将当前执行目录所有文件拷贝到obs的output进行备份 +print("[CANN-Modelzoo] after train - list my output files:") +os.system("cp -r %s %s " % (code_dir, config.train_url)) +os.system("ls -al %s" % config.train_url) diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelarts_entry_perf.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelarts_entry_perf.py" new file mode 100644 index 0000000000000000000000000000000000000000..f64068bb904b18b3ed68d458e5cda3b926b15f51 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelarts_entry_perf.py" @@ -0,0 +1,64 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import os +import argparse +import sys + +# 解析输入参数data_url +parser = argparse.ArgumentParser() +parser.add_argument("--data_url", type=str, default="/home/ma-user/modelarts/inputs/data_url_0") +parser.add_argument("--train_url", type=str, default="/home/ma-user/modelarts/outputs/train_url_0/") +config = parser.parse_args() + +print("[CANN-Modelzoo] code_dir path is [%s]" % (sys.path[0])) +code_dir = sys.path[0] +os.chdir(code_dir) +print("[CANN-Modelzoo] work_dir path is [%s]" % (os.getcwd())) + +print("[CANN-Modelzoo] before train - list my run files:") +os.system("ls -al /usr/local/Ascend/ascend-toolkit/") + +print("[CANN-Modelzoo] before train - list my dataset files:") +os.system("ls -al %s" % config.data_url) + +print("[CANN-Modelzoo] start run train shell") +# 设置sh文件格式为linux可执行 +os.system("dos2unix ./test/*") + +# 执行train_full_1p.sh或者train_performance_1p.sh,需要用户自己指定 +# full和performance的差异,performance只需要执行很少的step,控制在15分钟以内,主要关注性能FPS +print("-----",config.data_url,"----",config.train_url) +os.system("bash ./test/train_performance_1p.sh --data_dir=%s --train_dir=%s " % (config.data_url, config.train_url)) + +print("[CANN-Modelzoo] finish run train shell") + +# 将当前执行目录所有文件拷贝到obs的output进行备份 +print("[CANN-Modelzoo] after train - list my output files:") +os.system("cp -r %s %s " % (code_dir, config.train_url)) +os.system("ls -al %s" % config.train_url) diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelzoo_level.txt" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelzoo_level.txt" new file mode 100644 index 0000000000000000000000000000000000000000..7eeb8d729d7fb2dd94b91dcf79f8eabd5cfc5b77 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/modelzoo_level.txt" @@ -0,0 +1,3 @@ +FuncStatus:OK +PerfStatus:OK +PrecisionStatus:OK diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/problem_illustration.png" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/problem_illustration.png" new file mode 100644 index 0000000000000000000000000000000000000000..62b35ce90da94d1135516e9a86a92932e223255f Binary files /dev/null and "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/problem_illustration.png" differ diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/requirements.txt" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/requirements.txt" new file mode 100644 index 0000000000000000000000000000000000000000..91f9d6fb793e30b725e8c7cc40a49d08e9ff5eba --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/requirements.txt" @@ -0,0 +1,2 @@ +h5py +tensorflow==1.15.0 \ No newline at end of file diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/cameras.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/cameras.py" new file mode 100644 index 0000000000000000000000000000000000000000..60a2608ca0e4641457c9f28534027cea4219a7a8 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/cameras.py" @@ -0,0 +1,166 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Utilities to deal with the cameras of human3.6m""" + +from __future__ import division + +import h5py +import numpy as np +import matplotlib.pyplot as plt +import matplotlib.image as mpimg +import data_utils +import viz + +def project_point_radial( P, R, T, f, c, k, p ): + """ + Project points from 3d to 2d using camera parameters + including radial and tangential distortion + + Args + P: Nx3 points in world coordinates + R: 3x3 Camera rotation matrix + T: 3x1 Camera translation parameters + f: (scalar) Camera focal length + c: 2x1 Camera center + k: 3x1 Camera radial distortion coefficients + p: 2x1 Camera tangential distortion coefficients + Returns + Proj: Nx2 points in pixel space + D: 1xN depth of each point in camera space + radial: 1xN radial distortion per point + tan: 1xN tangential distortion per point + r2: 1xN squared radius of the projected points before distortion + """ + + # P is a matrix of 3-dimensional points + assert len(P.shape) == 2 + assert P.shape[1] == 3 + + N = P.shape[0] + X = R.dot( P.T - T ) # rotate and translate + XX = X[:2,:] / X[2,:] + r2 = XX[0,:]**2 + XX[1,:]**2 + + radial = 1 + np.einsum( 'ij,ij->j', np.tile(k,(1, N)), np.array([r2, r2**2, r2**3]) ); + tan = p[0]*XX[1,:] + p[1]*XX[0,:] + + XXX = XX * np.tile(radial+tan,(2,1)) + np.outer(np.array([p[1], p[0]]).reshape(-1), r2 ) + + Proj = (f * XXX) + c + Proj = Proj.T + + D = X[2,] + + return Proj, D, radial, tan, r2 + +def world_to_camera_frame(P, R, T): + """ + Convert points from world to camera coordinates + + Args + P: Nx3 3d points in world coordinates + R: 3x3 Camera rotation matrix + T: 3x1 Camera translation parameters + Returns + X_cam: Nx3 3d points in camera coordinates + """ + + assert len(P.shape) == 2 + assert P.shape[1] == 3 + + X_cam = R.dot( P.T - T ) # rotate and translate + + return X_cam.T + +def camera_to_world_frame(P, R, T): + """Inverse of world_to_camera_frame + + Args + P: Nx3 points in camera coordinates + R: 3x3 Camera rotation matrix + T: 3x1 Camera translation parameters + Returns + X_cam: Nx3 points in world coordinates + """ + + assert len(P.shape) == 2 + assert P.shape[1] == 3 + + X_cam = R.T.dot( P.T ) + T # rotate and translate + + return X_cam.T + +def load_camera_params( hf, path ): + """Load h36m camera parameters + + Args + hf: hdf5 open file with h36m cameras data + path: path or key inside hf to the camera we are interested in + Returns + R: 3x3 Camera rotation matrix + T: 3x1 Camera 
translation parameters + f: (scalar) Camera focal length + c: 2x1 Camera center + k: 3x1 Camera radial distortion coefficients + p: 2x1 Camera tangential distortion coefficients + name: String with camera id + """ + + R = hf[ path.format('R') ][:] + R = R.T + + T = hf[ path.format('T') ][:] + f = hf[ path.format('f') ][:] + c = hf[ path.format('c') ][:] + k = hf[ path.format('k') ][:] + p = hf[ path.format('p') ][:] + + name = hf[ path.format('Name') ][:] + name = "".join( [chr(item) for item in name] ) + + return R, T, f, c, k, p, name + +def load_cameras( bpath='cameras.h5', subjects=[1,5,6,7,8,9,11] ): + """Loads the cameras of h36m + + Args + bpath: path to hdf5 file with h36m camera data + subjects: List of ints representing the subject IDs for which cameras are requested + Returns + rcams: dictionary of 4 tuples per subject ID containing its camera parameters for the 4 h36m cams + """ + rcams = {} + + with h5py.File(bpath,'r') as hf: + for s in subjects: + for c in range(4): # There are 4 cameras in human3.6m + rcams[(s, c+1)] = load_camera_params(hf, 'subject%d/camera%d/{0}' % (s,c+1) ) + + return rcams + diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/data_utils.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/data_utils.py" new file mode 100644 index 0000000000000000000000000000000000000000..76c7f39f43259e81ef9d79ffdf356c82577b542e --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/data_utils.py" @@ -0,0 +1,757 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Utility functions for dealing with human3.6m data.""" + +from __future__ import division + +import os +import numpy as np +import cameras +import h5py +import glob +import copy +import random + + +# Human3.6m IDs for training and testing +TRAIN_SUBJECTS = [1,5,6,7,8] +TEST_SUBJECTS = [9,11] + +# Joints in H3.6M -- data has 32 joints, but only 17 that move; these are the indices. 
+H36M_NAMES = ['']*32 +H36M_NAMES[0] = 'Hip' +H36M_NAMES[1] = 'RHip' +H36M_NAMES[2] = 'RKnee' +H36M_NAMES[3] = 'RFoot' +H36M_NAMES[6] = 'LHip' +H36M_NAMES[7] = 'LKnee' +H36M_NAMES[8] = 'LFoot' +H36M_NAMES[12] = 'Spine' +H36M_NAMES[13] = 'Thorax' +H36M_NAMES[14] = 'Neck/Nose' +H36M_NAMES[15] = 'Head' +H36M_NAMES[17] = 'LShoulder' +H36M_NAMES[18] = 'LElbow' +H36M_NAMES[19] = 'LWrist' +H36M_NAMES[25] = 'RShoulder' +H36M_NAMES[26] = 'RElbow' +H36M_NAMES[27] = 'RWrist' + +# Stacked Hourglass produces 16 joints. These are the names. +SH_NAMES = ['']*16 +SH_NAMES[0] = 'RFoot' +SH_NAMES[1] = 'RKnee' +SH_NAMES[2] = 'RHip' +SH_NAMES[3] = 'LHip' +SH_NAMES[4] = 'LKnee' +SH_NAMES[5] = 'LFoot' +SH_NAMES[6] = 'Hip' +SH_NAMES[7] = 'Spine' +SH_NAMES[8] = 'Thorax' +SH_NAMES[9] = 'Head' +SH_NAMES[10] = 'RWrist' +SH_NAMES[11] = 'RElbow' +SH_NAMES[12] = 'RShoulder' +SH_NAMES[13] = 'LShoulder' +SH_NAMES[14] = 'LElbow' +SH_NAMES[15] = 'LWrist' + +def load_data( bpath, subjects, actions, dim=3 ): + """ + Loads 2d ground truth from disk, and puts it in an easy-to-acess dictionary + + Args + bpath: String. Path where to load the data from + subjects: List of integers. Subjects whose data will be loaded + actions: List of strings. The actions to load + dim: Integer={2,3}. Load 2 or 3-dimensional data + Returns: + data: Dictionary with keys k=(subject, action, seqname) + values v=(nx(32*2) matrix of 2d ground truth) + There will be 2 entries per subject/action if loading 3d data + There will be 8 entries per subject/action if loading 2d data + """ + + if not dim in [2,3]: + raise(ValueError, 'dim must be 2 or 3') + + data = {} + + for subj in subjects: + for action in actions: + + print('Reading subject {0}, action {1}'.format(subj, action)) + + dpath = os.path.join( bpath, 'S{0}'.format(subj), 'MyPoses/{0}D_positions'.format(dim), '{0}*.h5'.format(action) ) + print( dpath ) + + fnames = glob.glob( dpath ) + + loaded_seqs = 0 + for fname in fnames: + seqname = os.path.basename( fname ) + + # This rule makes sure SittingDown is not loaded when Sitting is requested + if action == "Sitting" and seqname.startswith( "SittingDown" ): + continue + + # This rule makes sure that WalkDog and WalkTogeter are not loaded when + # Walking is requested. + if seqname.startswith( action ): + print( fname ) + loaded_seqs = loaded_seqs + 1 + + with h5py.File( fname, 'r' ) as h5f: + poses = h5f['{0}D_positions'.format(dim)][:] + + poses = poses.T + data[ (subj, action, seqname) ] = poses + + if dim == 2: + assert loaded_seqs == 8, "Expecting 8 sequences, found {0} instead".format( loaded_seqs ) + else: + assert loaded_seqs == 2, "Expecting 2 sequences, found {0} instead".format( loaded_seqs ) + + return data + + +def load_stacked_hourglass(data_dir, subjects, actions): + """ + Load 2d detections from disk, and put it in an easy-to-acess dictionary. + + Args + data_dir: string. Directory where to load the data from, + subjects: list of integers. Subjects whose data will be loaded. + actions: list of strings. The actions to load. + Returns + data: dictionary with keys k=(subject, action, seqname) + values v=(nx(32*2) matrix of 2d stacked hourglass detections) + There will be 2 entries per subject/action if loading 3d data + There will be 8 entries per subject/action if loading 2d data + """ + # Permutation that goes from SH detections to H36M ordering. 
+ SH_TO_GT_PERM = np.array([SH_NAMES.index( h ) for h in H36M_NAMES if h != '' and h in SH_NAMES]) + assert np.all( SH_TO_GT_PERM == np.array([6,2,1,0,3,4,5,7,8,9,13,14,15,12,11,10]) ) + + data = {} + + for subj in subjects: + for action in actions: + + print('Reading subject {0}, action {1}'.format(subj, action)) + + dpath = os.path.join( data_dir, 'S{0}'.format(subj), 'StackedHourglass/{0}*.h5'.format(action) ) + print( dpath ) + + fnames = glob.glob( dpath ) + + loaded_seqs = 0 + for fname in fnames: + seqname = os.path.basename( fname ) + seqname = seqname.replace('_',' ') + + # This rule makes sure SittingDown is not loaded when Sitting is requested + if action == "Sitting" and seqname.startswith( "SittingDown" ): + continue + + # This rule makes sure that WalkDog and WalkTogeter are not loaded when + # Walking is requested. + if seqname.startswith( action ): + print( fname ) + loaded_seqs = loaded_seqs + 1 + + # Load the poses from the .h5 file + with h5py.File( fname, 'r' ) as h5f: + poses = h5f['poses'][:] + + # Permute the loaded data to make it compatible with H36M + poses = poses[:,SH_TO_GT_PERM,:] + + # Reshape into n x (32*2) matrix + poses = np.reshape(poses,[poses.shape[0], -1]) + poses_final = np.zeros([poses.shape[0], len(H36M_NAMES)*2]) + + dim_to_use_x = np.where(np.array([x != '' and x != 'Neck/Nose' for x in H36M_NAMES]))[0] * 2 + dim_to_use_y = dim_to_use_x+1 + + dim_to_use = np.zeros(len(SH_NAMES)*2,dtype=np.int32) + dim_to_use[0::2] = dim_to_use_x + dim_to_use[1::2] = dim_to_use_y + poses_final[:,dim_to_use] = poses + seqname = seqname+'-sh' + data[ (subj, action, seqname) ] = poses_final + + # Make sure we loaded 8 sequences + if (subj == 11 and action == 'Directions'): # <-- this video is damaged + assert loaded_seqs == 7, "Expecting 7 sequences, found {0} instead. S:{1} {2}".format(loaded_seqs, subj, action ) + else: + assert loaded_seqs == 8, "Expecting 8 sequences, found {0} instead. S:{1} {2}".format(loaded_seqs, subj, action ) + + return data + + +def normalization_stats(complete_data, dim, predict_14=False ): + """ + Computes normalization statistics: mean and stdev, dimensions used and ignored + + Args + complete_data: nxd np array with poses + dim. integer={2,3} dimensionality of the data + predict_14. boolean. 
Whether to use only 14 joints + Returns + data_mean: np vector with the mean of the data + data_std: np vector with the standard deviation of the data + dimensions_to_ignore: list of dimensions not used in the model + dimensions_to_use: list of dimensions used in the model + """ + if not dim in [2,3]: + raise(ValueError, 'dim must be 2 or 3') + + data_mean = np.mean(complete_data, axis=0) + data_std = np.std(complete_data, axis=0) + + # Encodes which 17 (or 14) 2d-3d pairs we are predicting + dimensions_to_ignore = [] + if dim == 2: + dimensions_to_use = np.where(np.array([x != '' and x != 'Neck/Nose' for x in H36M_NAMES]))[0] + dimensions_to_use = np.sort( np.hstack( (dimensions_to_use*2, dimensions_to_use*2+1))) + dimensions_to_ignore = np.delete( np.arange(len(H36M_NAMES)*2), dimensions_to_use ) + else: # dim == 3 + dimensions_to_use = np.where(np.array([x != '' for x in H36M_NAMES]))[0] + dimensions_to_use = np.delete( dimensions_to_use, [0,7,9] if predict_14 else 0 ) + + dimensions_to_use = np.sort( np.hstack( (dimensions_to_use*3, + dimensions_to_use*3+1, + dimensions_to_use*3+2))) + dimensions_to_ignore = np.delete( np.arange(len(H36M_NAMES)*3), dimensions_to_use ) + + return data_mean, data_std, dimensions_to_ignore, dimensions_to_use + + +def transform_world_to_camera(poses_set, cams, ncams=4 ): + """ + Project 3d poses from world coordinate to camera coordinate system + Args + poses_set: dictionary with 3d poses + cams: dictionary with cameras + ncams: number of cameras per subject + Return: + t3d_camera: dictionary with 3d poses in camera coordinate + """ + t3d_camera = {} + for t3dk in sorted( poses_set.keys() ): + + subj, action, seqname = t3dk + t3d_world = poses_set[ t3dk ] + + for c in range( ncams ): + R, T, f, c, k, p, name = cams[ (subj, c+1) ] + camera_coord = cameras.world_to_camera_frame( np.reshape(t3d_world, [-1, 3]), R, T) + camera_coord = np.reshape( camera_coord, [-1, len(H36M_NAMES)*3] ) + + sname = seqname[:-3]+"."+name+".h5" # e.g.: Waiting 1.58860488.h5 + t3d_camera[ (subj, action, sname) ] = camera_coord + + return t3d_camera + + +def normalize_data(data, data_mean, data_std, dim_to_use ): + """ + Normalizes a dictionary of poses + + Args + data: dictionary where values are + data_mean: np vector with the mean of the data + data_std: np vector with the standard deviation of the data + dim_to_use: list of dimensions to keep in the data + Returns + data_out: dictionary with same keys as data, but values have been normalized + """ + data_out = {} + + for key in data.keys(): + data[ key ] = data[ key ][ :, dim_to_use ] + mu = data_mean[dim_to_use] + stddev = data_std[dim_to_use] + data_out[ key ] = np.divide( (data[key] - mu), stddev ) + + return data_out + + +def unNormalizeData(normalized_data, data_mean, data_std, dimensions_to_ignore): + """ + Un-normalizes a matrix whose mean has been substracted and that has been divided by + standard deviation. 
Some dimensions might also be missing + + Args + normalized_data: nxd matrix to unnormalize + data_mean: np vector with the mean of the data + data_std: np vector with the standard deviation of the data + dimensions_to_ignore: list of dimensions that were removed from the original data + Returns + orig_data: the input normalized_data, but unnormalized + """ + T = normalized_data.shape[0] # Batch size + D = data_mean.shape[0] # Dimensionality + + orig_data = np.zeros((T, D), dtype=np.float32) + dimensions_to_use = np.array([dim for dim in range(D) + if dim not in dimensions_to_ignore]) + + orig_data[:, dimensions_to_use] = normalized_data + + # Multiply times stdev and add the mean + stdMat = data_std.reshape((1, D)) + stdMat = np.repeat(stdMat, T, axis=0) + meanMat = data_mean.reshape((1, D)) + meanMat = np.repeat(meanMat, T, axis=0) + orig_data = np.multiply(orig_data, stdMat) + meanMat + return orig_data + + +def define_actions( action ): + """ + Given an action string, returns a list of corresponding actions. + + Args + action: String. either "all" or one of the h36m actions + Returns + actions: List of strings. Actions to use. + Raises + ValueError: if the action is not a valid action in Human 3.6M + """ + actions = ["Directions","Discussion","Eating","Greeting", + "Phoning","Photo","Posing","Purchases", + "Sitting","SittingDown","Smoking","Waiting", + "WalkDog","Walking","WalkTogether"] + + if action == "All" or action == "all": + return actions + + if not action in actions: + raise( ValueError, "Unrecognized action: %s" % action ) + + return [action] + + +def project_to_cameras( poses_set, cams, ncams=4 ): + """ + Project 3d poses using camera parameters + + Args + poses_set: dictionary with 3d poses + cams: dictionary with camera parameters + ncams: number of cameras per subject + Returns + t2d: dictionary with 2d poses + """ + t2d = {} + + for t3dk in sorted( poses_set.keys() ): + subj, a, seqname = t3dk + t3d = poses_set[ t3dk ] + + for cam in range( ncams ): + R, T, f, c, k, p, name = cams[ (subj, cam+1) ] + pts2d, _, _, _, _ = cameras.project_point_radial( np.reshape(t3d, [-1, 3]), R, T, f, c, k, p ) + + pts2d = np.reshape( pts2d, [-1, len(H36M_NAMES)*2] ) + sname = seqname[:-3]+"."+name+".h5" # e.g.: Waiting 1.58860488.h5 + t2d[ (subj, a, sname) ] = pts2d + + return t2d + + +def read_2d_predictions( actions, data_dir ): + """ + Loads 2d data from precomputed Stacked Hourglass detections + + Args + actions: list of strings. Actions to load + data_dir: string. 
Directory where the data can be loaded from + Returns + train_set: dictionary with loaded 2d stacked hourglass detections for training + test_set: dictionary with loaded 2d stacked hourglass detections for testing + data_mean: vector with the mean of the 2d training data + data_std: vector with the standard deviation of the 2d training data + dim_to_ignore: list with the dimensions to not predict + dim_to_use: list with the dimensions to predict + """ + + train_set = load_stacked_hourglass( data_dir, TRAIN_SUBJECTS, actions) + test_set = load_stacked_hourglass( data_dir, TEST_SUBJECTS, actions) + + complete_train = copy.deepcopy( np.vstack( train_set.values() )) + data_mean, data_std, dim_to_ignore, dim_to_use = normalization_stats( complete_train, dim=2 ) + + train_set = normalize_data( train_set, data_mean, data_std, dim_to_use ) + test_set = normalize_data( test_set, data_mean, data_std, dim_to_use ) + + return train_set, test_set, data_mean, data_std, dim_to_ignore, dim_to_use + + +def create_2d_data( actions, data_dir, rcams ): + """ + Creates 2d poses by projecting 3d poses with the corresponding camera + parameters. Also normalizes the 2d poses + + Args + actions: list of strings. Actions to load + data_dir: string. Directory where the data can be loaded from + rcams: dictionary with camera parameters + Returns + train_set: dictionary with projected 2d poses for training + test_set: dictionary with projected 2d poses for testing + data_mean: vector with the mean of the 2d training data + data_std: vector with the standard deviation of the 2d training data + dim_to_ignore: list with the dimensions to not predict + dim_to_use: list with the dimensions to predict + """ + + # Load 3d data + train_set = load_data( data_dir, TRAIN_SUBJECTS, actions, dim=3 ) + test_set = load_data( data_dir, TEST_SUBJECTS, actions, dim=3 ) + + train_set = project_to_cameras( train_set, rcams ) + test_set = project_to_cameras( test_set, rcams ) + + + # Compute normalization statistics. + complete_train = copy.deepcopy( np.vstack( train_set.values() )) + data_mean, data_std, dim_to_ignore, dim_to_use = normalization_stats( complete_train, dim=2 ) + + # Divide every dimension independently + train_set = normalize_data( train_set, data_mean, data_std, dim_to_use ) + test_set = normalize_data( test_set, data_mean, data_std, dim_to_use ) + + return train_set, test_set, data_mean, data_std, dim_to_ignore, dim_to_use + + +def read_3d_data( actions, data_dir, camera_frame, rcams, predict_14=False ): + """ + Loads 3d poses, zero-centres and normalizes them + + Args + actions: list of strings. Actions to load + data_dir: string. Directory where the data can be loaded from + camera_frame: boolean. Whether to convert the data to camera coordinates + rcams: dictionary with camera parameters + predict_14: boolean. 
Whether to predict only 14 joints + Returns + train_set: dictionary with loaded 3d poses for training + test_set: dictionary with loaded 3d poses for testing + data_mean: vector with the mean of the 3d training data + data_std: vector with the standard deviation of the 3d training data + dim_to_ignore: list with the dimensions to not predict + dim_to_use: list with the dimensions to predict + train_root_positions: dictionary with the 3d positions of the root in train + test_root_positions: dictionary with the 3d positions of the root in test + """ + # Load 3d data + train_set = load_data( data_dir, TRAIN_SUBJECTS, actions, dim=3 ) + test_set = load_data( data_dir, TEST_SUBJECTS, actions, dim=3 ) + + + if camera_frame: + train_set = transform_world_to_camera( train_set, rcams ) + test_set = transform_world_to_camera( test_set, rcams ) + + # Apply 3d post-processing (centering around root) + train_set, train_root_positions = postprocess_3d( train_set ) + test_set, test_root_positions = postprocess_3d( test_set ) + + # Compute normalization statistics + complete_train = copy.deepcopy( np.vstack( train_set.values() )) + data_mean, data_std, dim_to_ignore, dim_to_use = normalization_stats( complete_train, dim=3, predict_14=predict_14 ) + + # Divide every dimension independently + train_set = normalize_data( train_set, data_mean, data_std, dim_to_use ) + test_set = normalize_data( test_set, data_mean, data_std, dim_to_use ) + + return train_set, test_set, data_mean, data_std, dim_to_ignore, dim_to_use, train_root_positions, test_root_positions + + +def postprocess_3d( poses_set ): + """ + Center 3d points around root + + Args + poses_set: dictionary with 3d data + Returns + poses_set: dictionary with 3d data centred around root (center hip) joint + root_positions: dictionary with the original 3d position of each pose + """ + root_positions = {} + for k in poses_set.keys(): + # Keep track of the global position + root_positions[k] = copy.deepcopy(poses_set[k][:,:3]) + + # Remove the root from the 3d position + poses = poses_set[k] + poses = poses - np.tile( poses[:,:3], [1, len(H36M_NAMES)] ) + poses_set[k] = poses + + return poses_set, root_positions + + + +def create_2d_mpii_test(dataset, Debug=False): + + ''' + Create 2d pose data as the input of the stage two. 
+ For mpii dataset, we use the output of the hourglass network + For mpi dataset, we use the 2d joints provided by the dataset + Args: + dataset: spicify which dataset to use, either 'mpi' or 'mpii' + + ''' + + mpii_to_human36 = np.array([6, 2, 1, 0, 3, 4, 5, 7, 8, 9, 13, 14, 15, 12, 11, 10]) + + if dataset == 'mpi': + input_file = '../data/mpi/annotVal_outdoor.h5' # mpi dataset has three scenario, green background, normal indoor and outdoor + annot_train = getData(input_file) + joints = annot_train['annot_2d'] + + else: + input_file = '../data/mpii/mpii_preds.h5' + annot_train = getData(input_file) + joints = annot_train['part'] + + joints = joints[:, mpii_to_human36, :] # only use the correspoinding joints + + dim_to_use_x = np.where(np.array([x != '' and x != 'Neck/Nose' for x in H36M_NAMES]))[0] * 2 + dim_to_use_y = dim_to_use_x + 1 + + dim_to_use = np.zeros(len(SH_NAMES) * 2, dtype=np.int32) + dim_to_use[0::2] = dim_to_use_x + dim_to_use[1::2] = dim_to_use_y + + dimensions_to_ignore = np.delete(np.arange(len(H36M_NAMES) * 2), dim_to_use) + poses = np.reshape(joints, [joints.shape[0], -1]) + + poses_final = np.zeros([poses.shape[0], len(H36M_NAMES) * 2]) + poses_final[:, dim_to_use] = poses + + print('{} left from {} after filter'.format(poses_final.shape[0], poses.shape[0])) + complete_train = copy.deepcopy(poses_final) + test_set_2d, data_mean, data_std = normalize_data_mpii(complete_train, dim_to_use) + + # if Debug: + # + # data = unNormalizeData(train_set, data_mean, data_std, dimensions_to_ignore) + # for i in range(data.shape[0]): + # pose = data[i, dim_to_use].reshape(16,2) + # human36_to_mpii = np.argsort(mpii_to_human36) + # pose_mpii = pose[human36_to_mpii] + # name = names[i][:13] + # imgpath = '/home/lichen/pose_estimation/images_2d/' + # img = cv2.imread(os.path.join(imgpath, name)) + # c = (255, 0, 0) + # + # for j in range(pose_mpii.shape[0]): + # cv2.circle(img, (int(pose_mpii[j, 0]), int(pose_mpii[j, 1])), 3, c, -1) + # cv2.imshow('img', img) + # cv2.waitKey() + + return test_set_2d, data_mean, data_std, dim_to_use + +def get_all_batches_mpii(data, batch_size): + ''' + + data: all data to use + batch_size: batch size for model + return: data in batch + ''' + + data = np.array(data) + n = data.shape[0] + n_extra = np.int32(n % batch_size) + n_batches = np.int32(n // batch_size ) + if n_extra>0: + encoder_inputs = np.split(data[:-n_extra, :],n_batches ) + else: + encoder_inputs = np.split(data, n_batches) + return encoder_inputs + + +def normalize_data_mpii(data, dim_to_use): + """ + Normalize the 2d pose data + Args + data: + dim_to_use: list of dimensions to keep in the 2d pose data + Returns + data_out: normalized data, mean and standard deviation + """ + + + data_mean = np.mean(data, axis=0) + data_std = np.std(data,axis=0) + data_out = np.divide( (data[:, dim_to_use] - data_mean[dim_to_use]), data_std[dim_to_use] ) + + return data_out, data_mean, data_std + +def unnormalize_data_mpii(normalized_data, data_mean, data_std, dimensions_to_use): + + ''' + Unnormalize the 2d pose data + ''' + + T = normalized_data.shape[0] # Batch size + D = data_mean.shape[0] # Dimensionality + + orig_data = np.zeros((T, D), dtype=np.float32) + orig_data[:, dimensions_to_use] = normalized_data + + # Multiply times stdev and add the mean + stdMat = data_std.reshape((1, D)) + stdMat = np.repeat(stdMat, T, axis=0) + meanMat = data_mean.reshape((1, D)) + meanMat = np.repeat(meanMat, T, axis=0) + orig_data = np.multiply(orig_data, stdMat) + meanMat + + return orig_data + +def 
h36_to_mpii(pose): + h36_to_mpii_permu = np.array([3, 2, 1, 4, 5, 6, 0, 7, 8, 9, 15, 14, 13, 10, 11, 12]) # joint indexes for mpii dataset and h36 are different + pose = np.reshape(pose, [pose.shape[0], len(SH_NAMES), -1]) + pose = pose[:, h36_to_mpii_permu] + + return pose + +def create_3d_mpi_test(): + + ''' + Create 3d pose data for mpi data set + ''' + + mpii_to_human36 = np.array([6, 2, 1, 0, 3, 4, 5, 7, 8, 9, 13, 14, 15, 12, 11, 10]) # to make the joint index in consistence + input_file = '../data/mpi/annotVal_outdoor.h5' + data = h5py.File(input_file, 'r') + joints_3d = data['annot_3d'].value + img_name = data['annot_image'].value + + poses = joints_3d[:, mpii_to_human36, :] # mpi does not have the annotation for joint 'neck', approximate by the average value of throat and head + pose_neck = (poses[:, 8, :] + poses[:, 10, :])/2 + poses_17 = np.insert(poses, 9 , pose_neck, axis= 1) + poses_mpi_to_h36 = copy.deepcopy( poses_17) + + poses_17 = np.reshape(poses_17, [poses_17.shape[0], -1]) + poses_final = np.zeros([poses.shape[0], len(H36M_NAMES) * 3]) + + + dim_to_use_x = np.where(np.array([x != '' for x in H36M_NAMES]))[0] * 3 + dim_to_use_y = dim_to_use_x + 1 + dim_to_use_z = dim_to_use_x + 2 + + dim_to_use = np.zeros(17 * 3, dtype=np.int32) + dim_to_use[0::3] = dim_to_use_x + dim_to_use[1::3] = dim_to_use_y + dim_to_use[2::3] = dim_to_use_z + poses_final[:, dim_to_use] = poses_17 + + test_set, test_root_positions = postprocess_3d_mpi(poses_final) + complete_test = copy.deepcopy(np.vstack(test_set)) + data_mean, data_std, dim_to_ignore, dim_to_use_ = normalization_stats(complete_test, dim=3) + + # Divide every dimension independently + test_set = normalize_data_mpi(test_set, data_mean, data_std, dim_to_use_) + return test_set, data_mean, data_std, dim_to_ignore, dim_to_use_, test_root_positions, poses_mpi_to_h36, img_name + +def postprocess_3d_mpi(pose_3d): + ''' + + process the 3d pose data with respect to the root joint + We regress the relative rather than the absolute coordinates, + ''' + + root_position = copy.deepcopy(pose_3d[:, :3]) + + pose_3d_root = pose_3d - np.tile(root_position,[1, len(H36M_NAMES)]) + + return pose_3d_root, root_position + +def normalize_data_mpi(data, mean, std, dim_to_use): + + ''' + Normalize the 3d pose data in mpi dataset + ''' + + data = data[:, dim_to_use] + mean = mean[dim_to_use] + std = std[dim_to_use] + data_out = np.divide((data-mean), std+0.0000001) + + return data_out + +def getData(tmpFile): + ''' + Read data from .h5 file + ''' + data = h5py.File(tmpFile, 'r') + d = {} + for k, v in data.items(): + d[k] = np.asarray(data[k]) + data.close() + return d + + +def generage_missing_data(enc_in,mis_number): + ''' + + enc_in: input 2d pose data + mis_number: the number of missing joints + return: 2d pose with missing joints randomly selected from the limbs + ''' + + joints_missing = [2, 3, 5, 6, 11, 12, 14, 15] # only delete joints from limbs + for i in range(enc_in.shape[0]): + if mis_number == 1: + missing_index = random.randint(0, 7) + missing_dim = np.array([joints_missing[missing_index]*2, joints_missing[missing_index]*2+1]) + else: + missing_index = random.sample(range(8), 2) + missing_dim = np.array([joints_missing[missing_index[0]] * 2, joints_missing[missing_index[0]] * 2 + 1, + joints_missing[missing_index[1]] * 2, joints_missing[missing_index[1]] * 2 + 1]) + + enc_in[i, missing_dim] = 0.0 # get missing joints by setting the corresponding value to 0 + + return enc_in + + + + + + + + + + + diff --git 
"a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/logging.conf" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/logging.conf" new file mode 100644 index 0000000000000000000000000000000000000000..837f760c0fbd1cafc7e51635b276016cf1ef426e --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/logging.conf" @@ -0,0 +1,28 @@ +[loggers] +keys=root,simpleExample + +[handlers] +keys=consoleHandler + +[formatters] +keys=simpleFormatter + +[logger_root] +level=DEBUG +handlers=consoleHandler + +[logger_simpleExample] +level=DEBUG +handlers=consoleHandler +qualname=simpleExample +propagate=0 + +[handler_consoleHandler] +class=StreamHandler +level=DEBUG +formatter=simpleFormatter +args=(sys.stdout,) + +[formatter_simpleFormatter] +format=%(asctime)s [%(levelname)s] %(name)s - %(message)s +datefmt= \ No newline at end of file diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/mix_den_model.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/mix_den_model.py" new file mode 100644 index 0000000000000000000000000000000000000000..1405f57634cda3b6762ac8b790202c9cd7be9bb7 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/mix_den_model.py" @@ -0,0 +1,430 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Simple model to regress 3d human poses from 2d joint locations""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +from tensorflow.python.ops import variable_scope as vs + +import os +import numpy as np +from six.moves import xrange # pylint: disable=redefined-builtin +import tensorflow as tf +import data_utils +import cameras as cam +from npu_bridge.npu_init import * + +def kaiming(shape, dtype, partition_info=None): + """Kaiming initialization as described in https://arxiv.org/pdf/1502.01852.pdf + + Args + shape: dimensions of the tf array to initialize + dtype: data type of the array + partition_info: (Optional) info about how the variable is partitioned. + See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/init_ops.py#L26 + Needed to be used as an initializer. 
+ Returns + Tensorflow array with initial weights + """ + return(tf.truncated_normal(shape, dtype=dtype)*tf.sqrt(2/float(shape[0]))) + +class LinearModel(object): + """ A simple Linear+RELU model """ + + def __init__(self, + linear_size, + num_layers, + residual, + batch_norm, + max_norm, + batch_size, + learning_rate, + summaries_dir, + predict_14=False, + dtype=tf.float32): + """Creates the linear + relu model + + Args + linear_size: integer. number of units in each layer of the model + num_layers: integer. number of bilinear blocks in the model + residual: boolean. Whether to add residual connections + batch_norm: boolean. Whether to use batch normalization + max_norm: boolean. Whether to clip weights to a norm of 1 + batch_size: integer. The size of the batches used during training + learning_rate: float. Learning rate to start with + summaries_dir: String. Directory where to log progress + predict_14: boolean. Whether to predict 14 instead of 17 joints + dtype: the data type to use to store internal variables + """ + + # There are in total 17 joints in H3.6M and 16 in MPII (and therefore in stacked + # hourglass detections). We settled with 16 joints in 2d just to make models + # compatible (e.g. you can train on ground truth 2d and test on SH detections). + # This does not seem to have an effect on prediction performance. + self.HUMAN_2D_SIZE = 16 * 2 + + # In 3d all the predictions are zero-centered around the root (hip) joint, so + # we actually predict only 16 joints. The error is still computed over 17 joints, + # because if one uses, e.g. Procrustes alignment, there is still error in the + # hip to account for! + # There is also an option to predict only 14 joints, which makes our results + # directly comparable to those in https://arxiv.org/pdf/1611.09010.pdf + self.HUMAN_3D_SIZE = 14 * 3 if predict_14 else 16 * 3 + + self.input_size = self.HUMAN_2D_SIZE + self.output_size = self.HUMAN_3D_SIZE + + self.isTraining = tf.placeholder(tf.bool,name="isTrainingflag") + self.dropout_keep_prob = tf.placeholder(tf.float32, name="dropout_keep_prob") + + # Summary writers for train and test runs + self.train_writer = tf.summary.FileWriter( os.path.join(summaries_dir, 'train' )) + self.test_writer = tf.summary.FileWriter( os.path.join(summaries_dir, 'test' )) + + self.linear_size = linear_size + self.batch_size = batch_size + self.learning_rate = tf.Variable( float(learning_rate), trainable=False, dtype=dtype, name="learning_rate") + self.global_step = tf.Variable(0, trainable=False, name="global_step") + decay_steps = 100000 # empirical + decay_rate = 0.96 # empirical + self.learning_rate = tf.train.exponential_decay(self.learning_rate, self.global_step, decay_steps, decay_rate) + self.num_models = 5 # specify the number of gaussian kernels in the mixture model + + + # === Transform the inputs === + with vs.variable_scope("inputs"): + + # === fix the batch size in order to introdoce uncertainty into loss ===# + + enc_in = tf.placeholder(dtype, shape=[None, self.input_size], name="enc_in") + dec_out = tf.placeholder(dtype, shape=[None, self.output_size], name="dec_out") + + + self.encoder_inputs = enc_in + self.decoder_outputs = dec_out + + # === Create the linear + relu combos === + with vs.variable_scope( "linear_model" ): + + # === First layer, brings dimensionality up to linear_size === + w1 = tf.get_variable( name="w1", initializer=kaiming, shape=[self.HUMAN_2D_SIZE, linear_size], dtype=dtype ) + b1 = tf.get_variable( name="b1", initializer=kaiming, shape=[linear_size], dtype=dtype ) + 
w1 = tf.clip_by_norm(w1,1) if max_norm else w1 + y3 = tf.matmul( enc_in, w1 ) + b1 + + if batch_norm: + y3 = tf.layers.batch_normalization(y3,training=self.isTraining, name="batch_normalization") + y3 = tf.nn.relu( y3 ) + y3 = tf.nn.dropout( y3, self.dropout_keep_prob ) + + # === Create multiple bi-linear layers === + for idx in range( num_layers ): + y3 = self.two_linear( y3, linear_size, residual, self.dropout_keep_prob, max_norm, batch_norm, dtype, idx ) + + + + # === Last linear layer has HUMAN_3D_SIZE in output === + w4 = tf.get_variable( name="w4", initializer=kaiming, shape=[linear_size, self.HUMAN_3D_SIZE*self.num_models], dtype=dtype ) + b4 = tf.get_variable( name="b4", initializer=kaiming, shape=[self.HUMAN_3D_SIZE*self.num_models], dtype=dtype ) + w4 = tf.clip_by_norm(w4,1) if max_norm else w4 + y_mu = tf.matmul(y3, w4) + b4 + + + w5 = tf.get_variable( name="w5", initializer=kaiming, shape=[linear_size, self.num_models], dtype=dtype ) + b5 = tf.get_variable( name="b5", initializer=kaiming, shape=[self.num_models], dtype=dtype ) + w5 = tf.clip_by_norm(w5,1) if max_norm else w5 + y_sigma = tf.matmul(y3, w5) + b5 + y_sigma = tf.nn.elu(y_sigma)+1 + + w6 = tf.get_variable( name="w6", initializer=kaiming, shape=[linear_size, self.num_models], dtype=dtype ) + b6 = tf.get_variable( name="b6", initializer=kaiming, shape=[self.num_models], dtype=dtype ) + y_alpha = tf.matmul(y3, w6) + b6 + y_alpha = tf.nn.softmax(y_alpha, dim=1) + + # === End linear model === + + components = tf.concat([y_mu, y_sigma, y_alpha], axis=1) + self.outputs = components + + # add dirichlet conjucate prior to the mixing coefficents + prior = tf.constant([2.0, 2.0, 2.0, 2.0, 2.0], dtype=tf.float32) + loss_prior = Dirichlet_loss(components, self.HUMAN_3D_SIZE, self.num_models, prior) + + with vs.variable_scope('loss'): + + loss_gaussion = mean_log_Gaussian_like(dec_out, components, self.HUMAN_3D_SIZE, self.num_models) # Mixture density network based on gaussian kernel + self.loss = loss_gaussion + loss_prior + + tf.summary.scalar('loss', self.loss, collections=['train', 'test']) + self.loss_summary = tf.summary.merge_all('train') + + + + + # To keep track of the loss in mm + self.err_mm = tf.placeholder( tf.float32, name="error_mm" ) + self.err_mm_summary = tf.summary.scalar( "loss/error_mm", self.err_mm ) + + # Gradients and update operation for training the model. + opt = tf.train.AdamOptimizer( self.learning_rate ) + update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) + + with tf.control_dependencies(update_ops): + + # Update all the trainable parameters + gradients = opt.compute_gradients(self.loss) + self.gradients = [[] if i==None else i for i in gradients] + self.updates = opt.apply_gradients(gradients, global_step=self.global_step) + + # Keep track of the learning rate + self.learning_rate_summary = tf.summary.scalar('learning_rate/learning_rate', self.learning_rate) + + # To save the model + self.saver = tf.train.Saver( tf.global_variables(), max_to_keep=None ) + + + def two_linear( self, xin, linear_size, residual, dropout_keep_prob, max_norm, batch_norm, dtype, idx ): + """ + Make a bi-linear block with optional residual connection + + Args + xin: the batch that enters the block + linear_size: integer. The size of the linear units + residual: boolean. Whether to add a residual connection + dropout_keep_prob: float [0,1]. Probability of dropping something out + max_norm: boolean. Whether to clip weights to 1-norm + batch_norm: boolean. Whether to do batch normalization + dtype: type of the weigths. 
Usually tf.float32 + idx: integer. Number of layer (for naming/scoping) + Returns + y: the batch after it leaves the block + """ + + with vs.variable_scope( "two_linear_"+str(idx) ) as scope: + + input_size = int(xin.get_shape()[1]) + + # Linear 1 + w2 = tf.get_variable( name="w2_"+str(idx), initializer=kaiming, shape=[input_size, linear_size], dtype=dtype) + b2 = tf.get_variable( name="b2_"+str(idx), initializer=kaiming, shape=[linear_size], dtype=dtype) + w2 = tf.clip_by_norm(w2,1) if max_norm else w2 + y = tf.matmul(xin, w2) + b2 + if batch_norm: + y = tf.layers.batch_normalization(y,training=self.isTraining,name="batch_normalization1"+str(idx)) + + y = tf.nn.relu( y ) + y = tf.nn.dropout( y, dropout_keep_prob ) + + # Linear 2 + w3 = tf.get_variable( name="w3_"+str(idx), initializer=kaiming, shape=[linear_size, linear_size], dtype=dtype) + b3 = tf.get_variable( name="b3_"+str(idx), initializer=kaiming, shape=[linear_size], dtype=dtype) + w3 = tf.clip_by_norm(w3,1) if max_norm else w3 + y = tf.matmul(y, w3) + b3 + + if batch_norm: + y = tf.layers.batch_normalization(y,training=self.isTraining,name="batch_normalization2"+str(idx)) + + y = tf.nn.relu( y ) + y = tf.nn.dropout( y, dropout_keep_prob ) + + # Residual every 2 blocks + y = (xin + y) if residual else y + + return y + + def step(self, session, encoder_inputs, decoder_outputs, dropout_keep_prob, isTraining=True): + """Run a step of the model feeding the given inputs. + + Args + session: tensorflow session to use + encoder_inputs: list of numpy vectors to feed as encoder inputs + decoder_outputs: list of numpy vectors that are the expected decoder outputs + dropout_keep_prob: (0,1] dropout keep probability + isTraining: whether to do the backward step or only forward + + Returns + if isTraining is True, a 4-tuple + loss: the computed loss of this batch + loss_summary: tf summary of this batch loss, to log on tensorboard + learning_rate_summary: tf summary of learnign rate to log on tensorboard + outputs: predicted 3d poses + if isTraining is False, a 3-tuple + (loss, loss_summary, outputs) same as above + """ + + input_feed = {self.encoder_inputs: encoder_inputs, + self.decoder_outputs: decoder_outputs, + self.isTraining: isTraining, + self.dropout_keep_prob: dropout_keep_prob} + + # Output feed: depends on whether we do a backward step or not. + if isTraining: + output_feed = [self.updates, # Update Op that does SGD + self.loss, + self.loss_summary, + self.learning_rate_summary, + self.outputs] + + outputs = session.run( output_feed, input_feed ) + return outputs[1], outputs[2], outputs[3], outputs[4] + + else: + output_feed = [self.loss, # Loss for this batch. + self.loss_summary, + self.outputs] + + outputs = session.run(output_feed, input_feed) + return outputs[0], outputs[1], outputs[2] # No gradient norm + + def get_all_batches(self, data_x, data_y, camera_frame, training=True): + """ + Obtain a list of all the batches, randomly permutted + Args + data_x: dictionary with 2d inputs + data_y: dictionary with 3d expected outputs + camera_frame: whether the 3d data is in camera coordinates + training: True if this is a training batch. False otherwise. 
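+      Note: frames from all videos are concatenated into one array. When training
+      is True they are randomly shuffled; the trailing n % batch_size frames are
+      then dropped so that every batch holds exactly batch_size examples, and
+      `repre` records how many frames of each video were loaded (the last key is
+      reduced by the number of dropped frames).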
+
+    Returns
+      encoder_inputs: list of 2d batches
+      decoder_outputs: list of 3d batches
+    """
+
+    # Figure out how many frames we have
+    n = 0
+    repre = {}
+
+    for key2d in sorted(data_x.keys()):
+        n2d, _ = data_x[key2d].shape
+        n = n + n2d
+        repre[key2d] = n2d
+
+    encoder_inputs = np.zeros((n, self.HUMAN_2D_SIZE), dtype=float)
+    decoder_outputs = np.zeros((n, self.HUMAN_3D_SIZE), dtype=float)
+
+    # Put all the data into big arrays
+    idx = 0
+    for key2d in sorted(data_x.keys()):
+        (subj, b, fname) = key2d
+        # keys should be the same if 3d is in camera coordinates
+        key3d = key2d if (camera_frame) else (subj, b, '{0}.h5'.format(fname.split('.')[0]))
+        key3d = (subj, b, fname[:-3]) if fname.endswith('-sh') and camera_frame else key3d
+
+        n2d, _ = data_x[key2d].shape
+        encoder_inputs[idx:idx + n2d, :] = data_x[key2d]
+        decoder_outputs[idx:idx + n2d, :] = data_y[key3d]
+        idx = idx + n2d
+
+    if training:
+        # Randomly permute everything
+        idx = np.random.permutation(n)
+        encoder_inputs = encoder_inputs[idx, :]
+        decoder_outputs = decoder_outputs[idx, :]
+
+    # Make the number of examples a multiple of the batch size
+    n_extra = n % self.batch_size
+    if n_extra > 0:  # Otherwise examples are already a multiple of batch size
+        encoder_inputs = encoder_inputs[:-n_extra, :]
+        decoder_outputs = decoder_outputs[:-n_extra, :]
+
+    n_batches = n // self.batch_size
+    encoder_inputs = np.split(encoder_inputs, n_batches)
+    decoder_outputs = np.split(decoder_outputs, n_batches)
+    # Track how many frames of the last video are actually used after truncation
+    repre[sorted(data_x.keys())[-1]] = repre[sorted(data_x.keys())[-1]] - n_extra
+
+    return encoder_inputs, decoder_outputs, repre
+
+
+def mean_log_Gaussian_like(y_true, parameters, c, m):
+    """Negative mean log-likelihood of y_true under the Gaussian mixture density
+    y_true: ground truth 3d pose
+    parameters: output of the hypotheses generator, which contains the means,
+                variances and mixture coefficients of the mixture model
+    c: dimension of 3d pose
+    m: number of kernels
+    """
+    components = tf.reshape(parameters, [-1, c + 2, m])
+    mu = components[:, :c, :]
+    sigma = components[:, c, :]
+    sigma = tf.clip_by_value(sigma, 1e-15, 1e15)
+    alpha = components[:, c + 1, :]
+    alpha = tf.clip_by_value(alpha, 1e-8, 1.)
+
+    exponent = tf.log(alpha) - 0.5 * c * tf.log(2 * np.pi) \
+               - c * tf.log(sigma) \
+               - tf.reduce_sum((tf.expand_dims(y_true, 2) - mu) ** 2, axis=1) / (2.0 * (sigma) ** 2.0)
+
+    log_gauss = log_sum_exp(exponent, axis=1)
+    res = - tf.reduce_mean(log_gauss)
+    return res
+
+
+def Dirichlet_loss(parameters, c, m, prior):
+    '''
+    Add a Dirichlet conjugate prior over the mixture coefficients to the loss,
+    to prevent all of the data from being fitted by a single kernel.
+    '''
+    components = tf.reshape(parameters, [-1, c + 2, m])
+    alpha = components[:, c + 1, :]
+    alpha = tf.clip_by_value(alpha, 1e-8, 1.)
+
+    loss = tf.reduce_sum((prior - 1.0) * tf.log(alpha), axis=1)
+    res = -tf.reduce_mean(loss)
+    return res
+
+
+def log_sum_exp(x, axis=None):
+    """Log-sum-exp trick implementation"""
+    x_max = tf.reduce_max(x, axis=axis, keep_dims=True)
+    return tf.log(tf.reduce_sum(tf.exp(x - x_max),
+                                axis=axis, keep_dims=True)) + x_max
+
+
+def mean_log_LaPlace_like(y_true, parameters, c, m):
+    """Negative mean log-likelihood of y_true under a Laplace mixture density
+    parameters refer to mean_log_Gaussian_like
+    """
+    components = tf.reshape(parameters, [-1, c + 2, m])
+    mu = components[:, :c, :]
+    sigma = components[:, c, :]
+    sigma = tf.clip_by_value(sigma, 1e-15, 1e15)
+    alpha = components[:, c + 1, :]
+    alpha = tf.clip_by_value(alpha, 1e-8, 1.)
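+    # Per-sample mixture log-likelihood with Laplace kernels, computed below as
+    #   log p(y) = logsumexp_m[ log(alpha_m) - c*log(2*sigma_m) - ||y - mu_m||_1 / sigma_m ]
+    # using the numerically stable log_sum_exp above (e.g. two equal scores of 1000
+    # give 1000 + log(2) ~= 1000.69 instead of overflowing inside exp()).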
+ + exponent = tf.log(alpha) - c * tf.log(2.0 * sigma) \ + - tf.reduce_sum(tf.abs(tf.expand_dims(y_true, 2) - mu), axis=1) / (sigma) + + log_gauss, _ = log_sum_exp(exponent, axis=1) + res = - tf.reduce_mean(log_gauss) + return res diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/predict_3dpose_mdm.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/predict_3dpose_mdm.py" new file mode 100644 index 0000000000000000000000000000000000000000..482cbda0285f7d987c7fbde770bea44a2c78c3ab --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/predict_3dpose_mdm.py" @@ -0,0 +1,678 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +"""Predicting 3d poses from 2d joints""" + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function +from npu_bridge.npu_init import * + +import math +import os +import random +import sys +import time +import h5py +import copy + +import matplotlib.pyplot as plt +import matplotlib.gridspec as gridspec +import numpy as np +from six.moves import xrange # pylint: disable=redefined-builtin +import tensorflow as tf +import procrustes + +import viz +import cameras +import data_utils +import mix_den_model +import logging, logging.config + + +tf.app.flags.DEFINE_float("learning_rate", 1e-3, "Learning rate") +tf.app.flags.DEFINE_float("dropout", 0.5, "Dropout keep probability. 1 means no dropout") +tf.app.flags.DEFINE_integer("batch_size", 64,"batch size to use during training") +tf.app.flags.DEFINE_integer("epochs", 200, "How many epochs we should train for") +tf.app.flags.DEFINE_boolean("camera_frame", True, "Convert 3d poses to camera coordinates") +tf.app.flags.DEFINE_boolean("max_norm", True, "Apply maxnorm constraint to the weights") +tf.app.flags.DEFINE_boolean("batch_norm", True, "Use batch_normalization") + +# Data loading +tf.app.flags.DEFINE_boolean("predict_14", False, "predict 14 joints") +tf.app.flags.DEFINE_boolean("use_sh", True, "Use 2d pose predictions from StackedHourglass") +tf.app.flags.DEFINE_string("action","All", "The action to train on. 
'All' means all the actions") + +# Architecture +tf.app.flags.DEFINE_integer("linear_size", 1024, "Size of each model layer.") +tf.app.flags.DEFINE_integer("num_layers", 2, "Number of layers in the model.") +tf.app.flags.DEFINE_boolean("residual", True, "Whether to add a residual connection every 2 layers") + +# Evaluation +tf.app.flags.DEFINE_boolean("procrustes", False, "Apply procrustes analysis at test time") +tf.app.flags.DEFINE_boolean("evaluateActionWise",True, "The dataset to use either h36m or heva") + +# Directories +tf.app.flags.DEFINE_string("cameras_path","../data/h36m/cameras.h5","Directory to load camera parameters") +tf.app.flags.DEFINE_string("data_dir", "../data/h36m/", "Data directory") +tf.app.flags.DEFINE_string("train_dir", "../experiments/test_git/", "Training directory.") +tf.app.flags.DEFINE_string("load_dir", "../Models/mdm_5_prior/", "Specify the directory to load trained model") + +# Train or load +tf.app.flags.DEFINE_boolean("sample", False, "Set to True for sampling.") +tf.app.flags.DEFINE_boolean("test", False, "Set to True for sampling.") +tf.app.flags.DEFINE_boolean("use_cpu", False, "Whether to use the CPU") +tf.app.flags.DEFINE_integer("load", 0, "Try to load a previous checkpoint.") +tf.app.flags.DEFINE_integer("miss_num", 1, "Specify how many missing joints.") + +### 4679232 for mdm_5 +### 4338038 for mdm prior + +# Misc +tf.app.flags.DEFINE_boolean("use_fp16", False, "Train using fp16 instead of fp32.") + +FLAGS = tf.app.flags.FLAGS + +def make_dir_if_not_exist(path): + try: + os.makedirs(path) + except OSError: + if not os.path.isdir(path): + raise + +train_dir = FLAGS.train_dir +load_dir = FLAGS.load_dir +summaries_dir = os.path.join( train_dir, "summary" ) +logdir = os.path.join(train_dir,"log") +os.system('mkdir -p {}'.format(summaries_dir)) +make_dir_if_not_exist(logdir) + +logging.config.fileConfig('./logging.conf') +logger = logging.getLogger() +fileHandler = logging.FileHandler("{0}/log.txt".format(logdir)) +logFormatter = logging.Formatter("%(asctime)s [%(levelname)s] %(name)s - %(message)s") +fileHandler.setFormatter(logFormatter) +logger.addHandler(fileHandler) +logger.info("Logs will be written to %s" % logdir) + + + + + + + +def create_model( session, actions, batch_size ): + """ + Create model and initialize it or load its parameters in a session + + Args + session: tensorflow session + actions: list of string. Actions to train/test on + batch_size: integer. Number of examples in each batch + Returns + model: The created (or loaded) model + Raises + ValueError if asked to load a model, but the checkpoint specified by + FLAGS.load cannot be found. 
+ """ + + model = mix_den_model.LinearModel( + FLAGS.linear_size, + FLAGS.num_layers, + FLAGS.residual, + FLAGS.batch_norm, + FLAGS.max_norm, + batch_size, + FLAGS.learning_rate, + summaries_dir, + FLAGS.predict_14, + dtype=tf.float16 if FLAGS.use_fp16 else tf.float32) + + if FLAGS.load <= 0: + # Create a new model from scratch + print("Creating model with fresh parameters.") + session.run( tf.global_variables_initializer() ) + return model + + # Load a previously saved model + ckpt = tf.train.get_checkpoint_state( load_dir, latest_filename="checkpoint") + print( "train_dir", load_dir ) + + if ckpt and ckpt.model_checkpoint_path: + # Check if the specific checkpoint exists + if FLAGS.load > 0: + if os.path.isfile(os.path.join(load_dir,"checkpoint-{0}.index".format(FLAGS.load))): + ckpt_name = os.path.join( os.path.join(load_dir,"checkpoint-{0}".format(FLAGS.load)) ) + else: + raise ValueError("Asked to load checkpoint {0}, but it does not seem to exist".format(FLAGS.load)) + else: + ckpt_name = os.path.basename( ckpt.model_checkpoint_path ) + + print("Loading model {0}".format( ckpt_name )) + model.saver.restore( session, ckpt_name ) + return model + else: + print("Could not find checkpoint. Aborting.") + raise( ValueError, "Checkpoint {0} does not seem to exist".format( ckpt.model_checkpoint_path ) ) + + +def train(): + """Train a linear model for 3d pose estimation""" + + actions = data_utils.define_actions( FLAGS.action ) + + # Load camera parameters + SUBJECT_IDS = [1,5,6,7,8,9,11] + rcams = cameras.load_cameras(FLAGS.cameras_path, SUBJECT_IDS) + + # Load 3d data and load (or create) 2d projections + train_set_3d, test_set_3d, data_mean_3d, data_std_3d, dim_to_ignore_3d, dim_to_use_3d, train_root_positions, test_root_positions = data_utils.read_3d_data( + actions, FLAGS.data_dir, FLAGS.camera_frame, rcams, FLAGS.predict_14 ) + + # Read stacked hourglass 2D predictions if use_sh, otherwise use groundtruth 2D projections + if FLAGS.use_sh: + train_set_2d, test_set_2d, data_mean_2d, data_std_2d, dim_to_ignore_2d, dim_to_use_2d = data_utils.read_2d_predictions(actions, FLAGS.data_dir) + else: + train_set_2d, test_set_2d, data_mean_2d, data_std_2d, dim_to_ignore_2d, dim_to_use_2d = data_utils.create_2d_data( actions, FLAGS.data_dir, rcams ) + + + # Avoid using the GPU if requested + device_count = {"GPU": 0} if FLAGS.use_cpu else {"GPU": 1} + with tf.Session(config=npu_config_proto(config_proto=tf.ConfigProto( + device_count=device_count, + allow_soft_placement=True ))) as sess: + + # === Create the model === + print("Creating %d bi-layers of %d units." % (FLAGS.num_layers, FLAGS.linear_size)) + model = create_model( sess, actions, FLAGS.batch_size ) + model.train_writer.add_graph( sess.graph ) + + + #=== This is the training loop === + step_time, loss, val_loss = 0.0, 0.0, 0.0 + current_step = 0 if FLAGS.load <= 0 else FLAGS.load + 1 + + + current_epoch = 0 + log_every_n_batches = 100 + + + for epoch in xrange( FLAGS.epochs ): + current_epoch = current_epoch + 1 + + # === Load training batches for one epoch === + encoder_inputs, decoder_outputs, _ = model.get_all_batches( train_set_2d, train_set_3d, FLAGS.camera_frame, training=True ) + nbatches = len( encoder_inputs ) + start_time, loss = time.time(), 0. + + # === Loop through all the training batches === + for i in range( nbatches): + + if (i+1) % log_every_n_batches == 0: + # Print progress every log_every_n_batches batches + print("Working on epoch {0}, batch {1} / {2}... 
".format( current_epoch, i+1, nbatches), end="" ) + + enc_in, dec_out = encoder_inputs[i], decoder_outputs[i] + # enc_in = data_utils.generage_missing_data(enc_in, FLAGS.miss_num) + step_loss, loss_summary, lr_summary, comp = model.step( sess, enc_in, dec_out, FLAGS.dropout, isTraining=True ) + + + if (i+1) % log_every_n_batches == 0: + + # Log and print progress every log_every_n_batches batches + + model.train_writer.add_summary( loss_summary, current_step ) + model.train_writer.add_summary( lr_summary, current_step ) + step_time = (time.time() - start_time) + start_time = time.time() + print("done in {0:.2f} ms".format( 1000*step_time / log_every_n_batches ) ) + + loss += step_loss + current_step += 1 + # === end looping through training batches === + + loss = loss / nbatches + + logger.info("=============================\n" + "Epoch: %d\n" + "Global step: %d\n" + "Learning rate: %.2e\n" + "Train loss avg: %.4f\n" + "=============================" % (epoch, model.global_step.eval(), + model.learning_rate.eval(), loss) ) + # === End training for an epoch === + + # === Testing after this epoch === + + if FLAGS.evaluateActionWise: + + logger.info("{0:=^12} {1:=^6}".format("Action", "mm")) # line of 30 equal signs + + cum_err = 0 # select the mixture model which has mininum error + for action in actions: + + + # Get 2d and 3d testing data for this action + action_test_set_2d = get_action_subset( test_set_2d, action ) + action_test_set_3d = get_action_subset( test_set_3d, action ) + encoder_inputs, decoder_outputs, repro_info = model.get_all_batches( action_test_set_2d, action_test_set_3d, FLAGS.camera_frame, training=False) + + act_err, step_time, loss = evaluate_batches( sess, model, + data_mean_3d, data_std_3d, dim_to_use_3d, dim_to_ignore_3d, + data_mean_2d, data_std_2d, dim_to_use_2d, dim_to_ignore_2d, + current_step, encoder_inputs, decoder_outputs) + + cum_err = cum_err + act_err + logger.info('{0:<12} {1:>6.2f}'.format(action, act_err)) + + summaries = sess.run( model.err_mm_summary, {model.err_mm: float(cum_err/float(len(actions)))} ) + model.test_writer.add_summary( summaries, current_step ) + + logger.info('{0:<12} {1:>6.2f}'.format("Average", cum_err/float(len(actions)))) + + logger.info('{0:=^19}'.format('')) + + # Save the model + print( "Saving the model... ", end="" ) + start_time = time.time() + if cum_err/float(len(actions))<60.66: + model.saver.save(sess, os.path.join(train_dir, 'checkpoint'), global_step=current_step) + print( "done in {0:.2f} ms".format(1000*(time.time() - start_time)) ) + + # Reset global time and loss + step_time, loss = 0, 0 + + sys.stdout.flush() + + +def get_action_subset( poses_set, action ): + """ + Given a preloaded dictionary of poses, load the subset of a particular action + + Args + poses_set: dictionary with keys k=(subject, action, seqname), + values v=(nxd matrix of poses) + action: string. The action that we want to filter out + Returns + poses_subset: dictionary with same structure as poses_set, but only with the + specified action. + """ + return {k:v for k, v in poses_set.items() if k[1] == action} + + +def evaluate_batches( sess, model, + data_mean_3d, data_std_3d, dim_to_use_3d, dim_to_ignore_3d, + data_mean_2d, data_std_2d, dim_to_use_2d, dim_to_ignore_2d, + current_step, encoder_inputs, decoder_outputs, current_epoch=0 ): + """ + Generic method that evaluates performance of a list of batches. + May be used to evaluate all actions or a single action. 
+ + Args + sess + model + data_mean_3d + data_std_3d + dim_to_use_3d + dim_to_ignore_3d + data_mean_2d + data_std_2d + dim_to_use_2d + dim_to_ignore_2d + current_step + encoder_inputs + decoder_outputs + current_epoch + Returns + + total_err + joint_err + step_time + loss + """ + + n_joints = 17 if not(FLAGS.predict_14) else 14 + nbatches = len( encoder_inputs ) + + + # Loop through test examples + all_dists, start_time, loss = [], time.time(), 0. + log_every_n_batches = 100 + all_poses_3d = [] + all_enc_in =[] + + for i in range(nbatches): + + if current_epoch > 0 and (i+1) % log_every_n_batches == 0: + print("Working on test epoch {0}, batch {1} / {2}".format( current_epoch, i+1, nbatches) ) + + enc_in, dec_out = encoder_inputs[i], decoder_outputs[i] + # enc_in = data_utils.generage_missing_data(enc_in, FLAGS.miss_num) + dp = 1.0 # dropout keep probability is always 1 at test time + step_loss, loss_summary, out_all_components_ori = model.step( sess, enc_in, dec_out, dp, isTraining=False ) + loss += step_loss + + out_all_components = np.reshape(out_all_components_ori,[-1, model.HUMAN_3D_SIZE+2, model.num_models]) + out_mean = out_all_components[:, : model.HUMAN_3D_SIZE, :] + + + # denormalize + enc_in = data_utils.unNormalizeData( enc_in, data_mean_2d, data_std_2d, dim_to_ignore_2d ) + enc_in_ = copy.deepcopy(enc_in) + all_enc_in.append(enc_in_) + dec_out = data_utils.unNormalizeData( dec_out, data_mean_3d, data_std_3d, dim_to_ignore_3d ) + pose_3d = np.zeros((enc_in.shape[0],96, out_mean.shape[-1])) + + for j in range(out_mean.shape[-1]): + pose_3d[:, :, j] = data_utils.unNormalizeData( out_mean[:, :, j], data_mean_3d, data_std_3d, dim_to_ignore_3d ) + + pose_3d_ = copy.deepcopy(pose_3d) + all_poses_3d.append(pose_3d_) + + # Keep only the relevant dimensions + dtu3d = np.hstack( (np.arange(3), dim_to_use_3d) ) if not(FLAGS.predict_14) else dim_to_use_3d + + dec_out = dec_out[:, dtu3d] + pose_3d = pose_3d[:, dtu3d,:] + + assert dec_out.shape[0] == FLAGS.batch_size + assert pose_3d.shape[0] == FLAGS.batch_size + + if FLAGS.procrustes: + # Apply per-frame procrustes alignment if asked to do so + for j in range(FLAGS.batch_size): + for k in range(model.num_models): + gt = np.reshape(dec_out[j,:],[-1,3]) + out = np.reshape(pose_3d[j,:, k],[-1,3]) + _, Z, T, b, c = procrustes.compute_similarity_transform(gt,out,compute_optimal_scale=True) + out = (b*out.dot(T))+c + + pose_3d[j, :, k] = np.reshape(out,[-1,17*3] ) if not(FLAGS.predict_14) else np.reshape(pose_3d[j,:, k],[-1,14*3] ) + + # Compute Euclidean distance error per joint + sqerr = (pose_3d - np.expand_dims(dec_out,axis=2))**2 # Squared error between prediction and expected output + dists = np.zeros((sqerr.shape[0], n_joints, sqerr.shape[2])) # Array with L2 error per joint in mm + + for m in range(dists.shape[-1]): + dist_idx = 0 + for k in np.arange(0, n_joints*3, 3): + # Sum across X,Y, and Z dimenstions to obtain L2 distance + dists[:,dist_idx, m] = np.sqrt( np.sum( sqerr[:, k:k+3,m], axis=1 )) + + dist_idx = dist_idx + 1 + + all_dists.append(dists) + assert sqerr.shape[0] == FLAGS.batch_size + + step_time = (time.time() - start_time) / nbatches + loss = loss / nbatches + + all_dists = np.vstack( all_dists ) + aver_minerr = np.mean(np.min(np.sum( all_dists, axis=1),axis=1))/n_joints + + return aver_minerr, step_time, loss + + +def test(): + + actions = data_utils.define_actions( FLAGS.action ) + + # Load camera parameters + SUBJECT_IDS = [1,5,6,7,8,9,11] + rcams = cameras.load_cameras(FLAGS.cameras_path, SUBJECT_IDS) + + # Load 3d 
data and load (or create) 2d projections + train_set_3d, test_set_3d, data_mean_3d, data_std_3d, dim_to_ignore_3d, dim_to_use_3d, train_root_positions, test_root_positions = data_utils.read_3d_data( + actions, FLAGS.data_dir, FLAGS.camera_frame, rcams, FLAGS.predict_14 ) + + # Read stacked hourglass 2D predictions if use_sh, otherwise use groundtruth 2D projections + if FLAGS.use_sh: + train_set_2d, test_set_2d, data_mean_2d, data_std_2d, dim_to_ignore_2d, dim_to_use_2d = data_utils.read_2d_predictions(actions, FLAGS.data_dir) + else: + train_set_2d, test_set_2d, data_mean_2d, data_std_2d, dim_to_ignore_2d, dim_to_use_2d = data_utils.create_2d_data( actions, FLAGS.data_dir, rcams ) + + + # Avoid using the GPU if requested + device_count = {"GPU": 0} if FLAGS.use_cpu else {"GPU": 1} + with tf.Session(config=npu_config_proto(config_proto=tf.ConfigProto( + device_count=device_count, + allow_soft_placement=True ))) as sess: + + # === Create the model === + print("Creating %d bi-layers of %d units." % (FLAGS.num_layers, FLAGS.linear_size)) + model = create_model( sess, actions, FLAGS.batch_size ) + model.train_writer.add_graph( sess.graph ) + + current_step = 0 if FLAGS.load <= 0 else FLAGS.load + 1 + + if FLAGS.evaluateActionWise: + + logger.info("{0:=^12} {1:=^6}".format("Action", "mm")) # line of 30 equal signs + + cum_err = 0 # select the mixture model which has mininum error + for action in actions: + + + # Get 2d and 3d testing data for this action + action_test_set_2d = get_action_subset( test_set_2d, action ) + action_test_set_3d = get_action_subset( test_set_3d, action ) + encoder_inputs, decoder_outputs, repro_info = model.get_all_batches( action_test_set_2d, action_test_set_3d, FLAGS.camera_frame, training=False) + + act_err, step_time, loss = evaluate_batches( sess, model, + data_mean_3d, data_std_3d, dim_to_use_3d, dim_to_ignore_3d, + data_mean_2d, data_std_2d, dim_to_use_2d, dim_to_ignore_2d, + current_step, encoder_inputs, decoder_outputs) + + cum_err = cum_err + act_err + logger.info('{0:<12} {1:>6.2f}'.format(action, act_err)) + + summaries = sess.run( model.err_mm_summary, {model.err_mm: float(cum_err/float(len(actions)))} ) + model.test_writer.add_summary( summaries, current_step ) + + logger.info('{0:<12} {1:>6.2f}'.format("Average", cum_err/float(len(actions)))) + + logger.info('{0:=^19}'.format('')) + + +def sample(): + + """Get samples from a model and visualize them""" + path = '{}/samples_sh'.format(FLAGS.train_dir) + if not os.path.exists(path): + os.makedirs(path) + actions = data_utils.define_actions( FLAGS.action ) + + # Load camera parameters + SUBJECT_IDS = [1,5,6,7,8,9,11] + rcams = cameras.load_cameras(FLAGS.cameras_path, SUBJECT_IDS) + n_joints = 17 if not (FLAGS.predict_14) else 14 + + # Load 3d data and load (or create) 2d projections + train_set_3d, test_set_3d, data_mean_3d, data_std_3d, dim_to_ignore_3d, dim_to_use_3d, train_root_positions, test_root_positions = data_utils.read_3d_data( + actions, FLAGS.data_dir, FLAGS.camera_frame, rcams, FLAGS.predict_14 ) + + if FLAGS.use_sh: + train_set_2d, test_set_2d, data_mean_2d, data_std_2d, dim_to_ignore_2d, dim_to_use_2d = data_utils.read_2d_predictions(actions, FLAGS.data_dir) + else: + train_set_2d, test_set_2d, data_mean_2d, data_std_2d, dim_to_ignore_2d, dim_to_use_2d, _ = data_utils.create_2d_data( actions, FLAGS.data_dir, rcams ) + + device_count = {"GPU": 0} if FLAGS.use_cpu else {"GPU": 1} + with tf.Session(config=npu_config_proto(config_proto=tf.ConfigProto( device_count = device_count ))) as sess: + # 
=== Create the model === + + batch_size = 128 + model = create_model(sess, actions, batch_size) + print("Model loaded") + + + for key2d in test_set_2d.keys(): + + (subj, b, fname) = key2d + + # choose SittingDown action to visualize + if b == 'SittingDown': + print( "Subject: {}, action: {}, fname: {}".format(subj, b, fname) ) + + # keys should be the same if 3d is in camera coordinates + key3d = key2d if FLAGS.camera_frame else (subj, b, '{0}.h5'.format(fname.split('.')[0])) + key3d = (subj, b, fname[:-3]) if (fname.endswith('-sh')) and FLAGS.camera_frame else key3d + + enc_in = test_set_2d[ key2d ] + n2d, _ = enc_in.shape + dec_out = test_set_3d[ key3d ] + n3d, _ = dec_out.shape + assert n2d == n3d + + # Split into about-same-size batches + + enc_in = np.array_split( enc_in, n2d // batch_size ) + dec_out = np.array_split( dec_out, n3d // batch_size ) + + # store all pose hypotheses in a list + pose_3d_mdm = [[], [], [], [], []] + + for bidx in range( len(enc_in) ): + + # Dropout probability 0 (keep probability 1) for sampling + dp = 1.0 + loss, _, out_all_components = model.step(sess, enc_in[bidx], dec_out[bidx], dp, isTraining=False) + + # denormalize the input 2d pose, ground truth 3d pose as well as 3d pose hypotheses from mdm + out_all_components = np.reshape(out_all_components, [-1, model.HUMAN_3D_SIZE + 2, model.num_models]) + out_mean = out_all_components[:, : model.HUMAN_3D_SIZE, :] + + enc_in[bidx] = data_utils.unNormalizeData( enc_in[bidx], data_mean_2d, data_std_2d, dim_to_ignore_2d ) + dec_out[bidx] = data_utils.unNormalizeData( dec_out[bidx], data_mean_3d, data_std_3d, dim_to_ignore_3d ) + poses3d = np.zeros((out_mean.shape[0], 96, out_mean.shape[-1])) + for j in range(out_mean.shape[-1]): + poses3d[:, :, j] = data_utils.unNormalizeData( out_mean[:, :, j], data_mean_3d, data_std_3d, dim_to_ignore_3d ) + + # extract the 17 joints + dtu3d = np.hstack((np.arange(3), dim_to_use_3d)) if not (FLAGS.predict_14) else dim_to_use_3d + dec_out_17 = dec_out[bidx][: , dtu3d] + pose_3d_17 = poses3d[:, dtu3d, :] + sqerr = (pose_3d_17 - np.expand_dims(dec_out_17, axis=2)) ** 2 + dists = np.zeros((sqerr.shape[0], n_joints, sqerr.shape[2])) + for m in range(dists.shape[-1]): + dist_idx = 0 + for k in np.arange(0, n_joints * 3, 3): + dists[:, dist_idx, m] = np.sqrt(np.sum(sqerr[:, k:k + 3, m], axis=1)) + dist_idx = dist_idx + 1 + + [pose_3d_mdm[i].append(poses3d[:, :, i]) for i in range(poses3d.shape[-1])] + + # Put all the poses together + enc_in, dec_out= map(np.vstack,[enc_in, dec_out]) + for i in range(poses3d.shape[-1]): + pose_3d_mdm[i] = np.vstack(pose_3d_mdm[i]) + + # Convert back to world coordinates + if FLAGS.camera_frame: + N_CAMERAS = 4 + N_JOINTS_H36M = 32 + + # Add global position back + dec_out = dec_out + np.tile( test_root_positions[ key3d ], [1,N_JOINTS_H36M] ) + for i in range(poses3d.shape[-1]): + pose_3d_mdm[i] = pose_3d_mdm[i] + np.tile(test_root_positions[key3d], [1, N_JOINTS_H36M]) + + + # Load the appropriate camera + subj, action, sname = key3d + + cname = sname.split('.')[1] # <-- camera name + scams = {(subj,c+1): rcams[(subj,c+1)] for c in range(N_CAMERAS)} # cams of this subject + scam_idx = [scams[(subj,c+1)][-1] for c in range(N_CAMERAS)].index( cname ) # index of camera used + the_cam = scams[(subj, scam_idx+1)] # <-- the camera used + R, T, f, c, k, p, name = the_cam + assert name == cname + + def cam2world_centered(data_3d_camframe): + data_3d_worldframe = cameras.camera_to_world_frame(data_3d_camframe.reshape((-1, 3)), R, T) + data_3d_worldframe = 
data_3d_worldframe.reshape((-1, N_JOINTS_H36M*3)) + # subtract root translation + return data_3d_worldframe - np.tile( data_3d_worldframe[:,:3], (1,N_JOINTS_H36M) ) + + # Apply inverse rotation and translation + dec_out = cam2world_centered(dec_out) + for i in range(poses3d.shape[-1]): + pose_3d_mdm[i] = cam2world_centered(pose_3d_mdm[i]) + + # sample some results to visualize + np.random.seed(42) + idx = np.random.permutation(enc_in.shape[0]) + enc_in, dec_out = enc_in[idx, :], dec_out[idx,:] + for i in range(poses3d.shape[-1]): + pose_3d_mdm[i] = pose_3d_mdm[i][idx, :] + + exidx = 1 + nsamples = 20 + + for i in np.arange(nsamples): + fig = plt.figure(figsize=(20, 5)) + + subplot_idx = 1 + gs1 = gridspec.GridSpec(1, 7) # 5 rows, 9 columns + gs1.update(wspace=-0.00, hspace=0.05) # set the spacing between axes. + plt.axis('off') + + # Plot 2d pose + ax1 = plt.subplot(gs1[subplot_idx - 1]) + p2d = enc_in[exidx, :] + viz.show2Dpose(p2d, ax1) + ax1.invert_yaxis() + + # Plot 3d gt + ax2 = plt.subplot(gs1[subplot_idx], projection='3d') + p3d = dec_out[exidx, :] + viz.show3Dpose(p3d, ax2) + + # Plot 3d pose hypotheses + + for i in range(poses3d.shape[-1]): + ax3 = plt.subplot(gs1[subplot_idx + i + 1], projection='3d') + p3d = pose_3d_mdm[i][exidx] + viz.show3Dpose(p3d, ax3, lcolor="#9b59b6", rcolor="#2ecc71") + # plt.show() + plt.savefig('{}/sample_{}_{}_{}_{}.png'.format(path, subj, action, scam_idx, exidx)) + plt.close(fig) + exidx = exidx + 1 + + +def main(_): + if FLAGS.sample: + sample() + elif FLAGS.test: + test() + else: + train() + +if __name__ == "__main__": + + tf.app.run() diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/procrustes.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/procrustes.py" new file mode 100644 index 0000000000000000000000000000000000000000..31eb2ef7b2d725b52b572462121fd1188c093f69 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/procrustes.py" @@ -0,0 +1,91 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +def compute_similarity_transform(X, Y, compute_optimal_scale=False): + """ + A port of MATLAB's `procrustes` function to Numpy. 
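+  In other words, it finds the rotation T, scale b and translation c that best align
+  Y to X in the least-squares sense, so that Z = b * Y.dot(T) + c approximates X.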
+ Adapted from http://stackoverflow.com/a/18927641/1884420 + + Args + X: array NxM of targets, with N number of points and M point dimensionality + Y: array NxM of inputs + compute_optimal_scale: whether we compute optimal scale or force it to be 1 + + Returns: + d: squared error after transformation + Z: transformed Y + T: computed rotation + b: scaling + c: translation + """ + import numpy as np + + muX = X.mean(0) + muY = Y.mean(0) + + X0 = X - muX + Y0 = Y - muY + + ssX = (X0**2.).sum() + ssY = (Y0**2.).sum() + + # centred Frobenius norm + normX = np.sqrt(ssX) + normY = np.sqrt(ssY) + + # scale to equal (unit) norm + X0 = X0 / normX + Y0 = Y0 / normY + + # optimum rotation matrix of Y + A = np.dot(X0.T, Y0) + U,s,Vt = np.linalg.svd(A,full_matrices=False) + V = Vt.T + T = np.dot(V, U.T) + + # Make sure we have a rotation + detT = np.linalg.det(T) + V[:,-1] *= np.sign( detT ) + s[-1] *= np.sign( detT ) + T = np.dot(V, U.T) + + traceTA = s.sum() + + if compute_optimal_scale: # Compute optimum scaling of Y. + b = traceTA * normX / normY + d = 1 - traceTA**2 + Z = normX*traceTA*np.dot(Y0, T) + muX + else: # If no scaling allowed + b = 1 + d = 1 + ssY/ssX - 2 * traceTA * normY / normX + Z = normY*np.dot(Y0, T) + muX + + c = muX - b*np.dot(muY, T) + + return d, Z, T, b, c + diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/viz.py" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/viz.py" new file mode 100644 index 0000000000000000000000000000000000000000..a383d07fc26a80b973cbbdca0a012917fb8c37ff --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/src/viz.py" @@ -0,0 +1,200 @@ +# Copyright 2017 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================ +# Copyright 2021 Huawei Technologies Co., Ltd +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Functions to visualize human poses""" + +import matplotlib.pyplot as plt +import data_utils +import numpy as np +import h5py +import os +from mpl_toolkits.mplot3d import Axes3D + +def show3Dpose(channels, ax, lcolor="#3498db", rcolor="#e74c3c", add_labels=False): # blue, orange + """ + Visualize the ground truth 3d skeleton + + Args + channels: 96x1 vector. The pose to plot. + ax: matplotlib 3d axis to draw on + lcolor: color for left part of the body + rcolor: color for right part of the body + add_labels: whether to add coordinate labels + Returns + Nothing. 
Draws on ax. + """ + + assert channels.size == len(data_utils.H36M_NAMES)*3, "channels should have 96 entries, it has %d instead" % channels.size + vals = np.reshape( channels, (len(data_utils.H36M_NAMES), -1) ) + + I = np.array([1,2,3,1,7,8,1, 13,14,15,14,18,19,14,26,27])-1 # start points + J = np.array([2,3,4,7,8,9,13,14,15,16,18,19,20,26,27,28])-1 # end points + LR = np.array([1,1,1,0,0,0,0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) + + # Make connection matrix + for i in np.arange( len(I) ): + x, y, z = [np.array( [vals[I[i], j], vals[J[i], j]] ) for j in range(3)] + # ax.plot(x, y, z, lw=3, marker = 'o', markersize = 5, c=lcolor if LR[i] else rcolor, markeredgecolor = lcolor) + ax.plot(x, y, z, lw=2, c=lcolor if LR[i] else rcolor) + + RADIUS = 750 # space around the subject + xroot, yroot, zroot = vals[0,0], vals[0,1], vals[0,2] + ax.set_xlim3d([-RADIUS+xroot, RADIUS+xroot]) + ax.set_zlim3d([-RADIUS+zroot, RADIUS+zroot]) + ax.set_ylim3d([-RADIUS+yroot, RADIUS+yroot]) + + # ax.set_xlim3d([np.min(vals[:, 0]), np.max(vals[:, 0])]) + # ax.set_zlim3d([np.min(vals[:, 2]), np.max(vals[:, 2])]) + # ax.set_ylim3d([np.min(vals[:, 1]), np.max(vals[:, 1])]) + + if add_labels: + ax.set_xlabel("x") + ax.set_ylabel("y") + ax.set_zlabel("z") + + # Get rid of the ticks and tick labels + + ax.set_xticks([]) + ax.set_yticks([]) + ax.set_zticks([]) + + ax.get_xaxis().set_ticklabels([]) + ax.get_yaxis().set_ticklabels([]) + ax.set_zticklabels([]) + ax.set_aspect('equal') + + # Get rid of the panes (actually, make them white) + white = (1.0, 1.0, 1.0, 0.0) + ax.w_xaxis.set_pane_color(white) + # ax.w_zaxis.set_pane_color(white) + ax.w_yaxis.set_pane_color(white) + # # Keep z pane + # + # # Get rid of the lines in 3d + ax.w_xaxis.line.set_color(white) + ax.w_yaxis.line.set_color(white) + ax.w_zaxis.line.set_color(white) + + + ax.view_init(azim=129, elev=10) + + + +def show2Dpose(channels, ax, lcolor="#3498db", rcolor="#e74c3c", add_labels=False): + """ + Visualize a 2d skeleton with 32 joints + + Args + channels: 64x1 vector. The pose to plot. + ax: matplotlib axis to draw on + lcolor: color for left part of the body + rcolor: color for right part of the body + add_labels: whether to add coordinate labels + Returns + Nothing. Draws on ax. 
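+
+  Example (illustrative; `pose_2d` stands for any 64-entry pose vector):
+    fig, ax = plt.subplots()
+    show2Dpose( pose_2d, ax )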
+ """ + + assert channels.size == len(data_utils.H36M_NAMES)*2, "channels should have 64 entries, it has %d instead" % channels.size + vals = np.reshape( channels, (len(data_utils.H36M_NAMES), -1) ) + + I = np.array([1,2,3,1,7,8,1, 13,14,14,18,19,14,26,27])-1 # start points + J = np.array([2,3,4,7,8,9,13,14,16,18,19,20,26,27,28])-1 # end points + LR = np.array([1,1,1,0,0,0,0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) + + # Make connection matrix + for i in np.arange( len(I) ): + x, y = [np.array( [vals[I[i], j], vals[J[i], j]] ) for j in range(2)] + ax.plot(x, y, lw=2, c=lcolor if LR[i] else rcolor) + + # Get rid of the ticks + ax.set_xticks([]) + ax.set_yticks([]) + + # Get rid of tick labels + ax.get_xaxis().set_ticklabels([]) + ax.get_yaxis().set_ticklabels([]) + + RADIUS = 300 # space around the subject + xroot, yroot = vals[0,0], vals[0,1] + ax.set_xlim([-RADIUS+xroot, RADIUS+xroot]) + ax.set_ylim([-RADIUS+yroot, RADIUS+yroot]) + if add_labels: + ax.set_xlabel("x") + ax.set_ylabel("z") + + ax.set_aspect('equal') + + + + +def show2Dpose_mdm(channels, ax, lcolor="#3498db", rcolor="#e74c3c", add_labels=False): + """ + Visualize 2d reprojections of all 3d pose hypotheses in one fig in order to show the similarity between them + + Args + channels: 64 * 5, 2d reprojections of all 3d pose hypotheses + ax: matplotlib axis to draw on + lcolor: color for left part of the body + rcolor: color for right part of the body. Note that we do not really use lcolor and rcolor in this function. + In stead, we define a color for each hypotheses to show the overlap between them. + add_labels: whether to add coordinate labels + Returns + Nothing. Draws on ax. + """ + + + + + I = np.array([1,2,3,1,7,8,1, 13,14,14,18,19,14,26,27])-1 # start points + J = np.array([2,3,4,7,8,9,13,14,16,18,19,20,26,27,28])-1 # end points + LR = np.array([1,1,1,0,0,0,0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) + colors = ['#FF8000', '#4169E1', '#308014', '#000080', '#FF83FA'] # color used for 2d reprejection from each 3d pose hypotheses + for m in range(channels.shape[-1]): + vals = np.reshape(channels[:,m], [len(data_utils.H36M_NAMES), -1]) + for i in np.arange( len(I) ): + x, y = [np.array( [vals[I[i], j], vals[J[i], j]] ) for j in range(2)] + ax.plot(x, y, lw=2, c=colors[m]) + + # Get rid of the ticks + ax.set_xticks([]) + ax.set_yticks([]) + + # Get rid of tick labels + ax.get_xaxis().set_ticklabels([]) + ax.get_yaxis().set_ticklabels([]) + + RADIUS = 300 # space around the subject + xroot, yroot = vals[0,0], vals[0,1] + ax.set_xlim([-RADIUS+xroot, RADIUS+xroot]) + ax.set_ylim([-RADIUS+yroot, RADIUS+yroot]) + if add_labels: + ax.set_xlabel("x") + ax.set_ylabel("z") + + ax.set_aspect('equal') + diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/env.sh" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/env.sh" new file mode 100644 index 0000000000000000000000000000000000000000..1193ce4826e1553b109047f49d13032ec89220c7 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/env.sh" @@ -0,0 +1,14 @@ +#!/bin/bash +cur_path=`pwd`/../ +export install_path=/usr/local/Ascend +export LD_LIBRARY_PATH=/usr/local/Ascend/driver/lib64/common/:/usr/local/Ascend/driver/lib64/driver:$LD_LIBRARY_PATH # 仅容器训练场景配置 +export PATH=${install_path}/fwkacllib/ccec_compiler/bin:${install_path}/fwkacllib/bin:$PATH +export LD_LIBRARY_PATH=${install_path}/fwkacllib/lib64:$LD_LIBRARY_PATH +export PYTHONPATH=${install_path}/fwkacllib/python/site-packages:$PYTHONPATH +export 
PYTHONPATH=/usr/local/python3.7.5/lib/python3.7/site-packages:${install_path}/tfplugin/python/site-packages:$PYTHONPATH +export ASCEND_OPP_PATH=${install_path}/opp +export ASCEND_AICPU_PATH=${install_path} +export PYTHONPATH=$cur_path/models/research:$cur_path/models/research/slim:$PYTHONPATH +export JOB_ID=10087 +export ASCEND_GLOBAL_LOG_LEVEL=3 +export ASCEND_DEVICE_ID=0 \ No newline at end of file diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/train_full_1p.sh" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/train_full_1p.sh" new file mode 100644 index 0000000000000000000000000000000000000000..edc4d2c4e296426b934821040e2ce023c8291837 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/train_full_1p.sh" @@ -0,0 +1,185 @@ +#!/bin/bash + +# shell脚本所在路径 +cur_path=`echo $(cd $(dirname $0);pwd)` + +# 判断当前shell是否是performance +perf_flag=`echo $0 | grep performance | wc -l` +# 当前执行网络的名称 +Network="GMH—MDN_ID1225_for_TensorFlow" +#失败用例打屏 +export ASCEND_SLOG_PRINT_TO_STDOUT=1 +#基础参数,需要模型审视修改 +#batch Size +batch_size=64 +#当前是否为测试,默认为False,即训练模式 +test="False" +#网络名称,同目录名称 +#Device数量,单卡默认为1 +RankSize=1 +#训练epoch,可选 +epochs=1 +#学习率 +learning_rate='1e-3' +#参数配置 +cameras_path="" +data_dir="" +train_dir="" +load_dir="" +load=0 + +# 帮助信息,不需要修改 +if [[ $1 == --help || $1 == --h ]];then + echo "usage:./train_performance_1p.sh " + + echo "" + echo "parameter explain: + --test Set to True for sampling + --learning_rate Learning rate + --batch_size batch size to use during training + --epochs How many epochs we should train for + --cameras_path Directory to load camera parameters + --data_dir Data directory + --train_dir Training directory + --load_dir Specify the directory to load trained model + --load Try to load a previous checkpoint + -h/--help Show help message + " + exit 1 +fi + +# 参数校验,不需要修改 +for para in $* +do + if [[ $para == --data_dir* ]];then + data_dir=`echo ${para#*=}` + elif [[ $para == --batch_size* ]];then + batch_size=`echo ${para#*=}` + elif [[ $para == --epochs* ]];then + epochs=`echo ${para#*=}` + elif [[ $para == --train_dir* ]];then + train_dir=`echo ${para#*=}` + elif [[ $para == --load_dir* ]];then + load_dir=`echo ${para#*=}` + elif [[ $para == --cameras_path* ]];then + cameras_path=`echo ${para#*=}` + elif [[ $para == --load* ]];then + load=`echo ${para#*=}` + elif [[ $para == --learning_rate* ]];then + learning_rate=`echo ${para#*=}` + elif [[ $para == --test* ]];then + test=`echo ${para#*=}` + fi +done + +# 校验是否传入data_dir,不需要修改 +if [[ $data_dir == "" ]];then + echo "[Error] para \"data_dir\" must be config" + exit 1 +fi + +# 校验是否传入train_dir,不需要修改 +if [[ $train_dir == "" ]];then + echo "[Error] para \"train_dir\" must be config" + exit 1 +fi + +# 设置打屏日志文件名,请保留,文件名为${print_log} +print_log="./test/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log" +modelarts_flag=${MODELARTS_MODEL_PATH} +if [ x"${modelarts_flag}" != x ]; +then + echo "running without etp..." 
+ print_log_name=`ls /home/ma-user/modelarts/log/ | grep proc-rank` + print_log="/home/ma-user/modelarts/log/${print_log_name}" +fi +echo "### get your log here : ${print_log}" + +CaseName="" +function get_casename() +{ + if [ x"${perf_flag}" = x1 ]; + then + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'perf' + else + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'acc' + fi +} + +# 跳转到code目录 +cd ${cur_path}/../ +rm -rf ./test/output/${ASCEND_DEVICE_ID} +mkdir -p ./test/output/${ASCEND_DEVICE_ID} +touch ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +cd ${cur_path}/../src + +start=$(date +%s) +python3 ./predict_3dpose_mdm.py \ +--data_dir ${data_dir} \ +--train_dir ${train_dir} \ +--test ${test} \ +--load_dir ${load_dir} \ +--cameras_path ${cameras_path} \ +--load ${load} +--batch_size ${batch_size} \ +--epochs ${epochs} \ +--learning_rate ${learning_rate} > ${print_log} +wait +end=$(date +%s) +e2e_time=$(( $end - $start )) + +#输出性能FPS,需要模型审视修改 +StepTime=`grep "done in" ${print_log} | tail -n 10|awk '{print $3}' | awk '{sum+=$1} END {print sum/NR/1000}'` +#打印,不需要修改 +FPS=`awk 'BEGIN{printf("%.2f\n", '${batch_size}' /'${StepTime}')}'` + +#输出训练精度,需要模型审视修改 +train_accuracy=`grep "root - Average" ${print_log} | awk 'END {print $7}'` + +# 提取所有loss打印信息 +grep "Train loss avg:" ${print_log} | awk '{print $4}' > $cur_path/output/${ASCEND_DEVICE_ID}/my_output_loss.txt + +# 判断本次执行是否正确使用Ascend NPU +use_npu_flag=`grep "tf_adapter" ${print_log} | wc -l` +if [ x"${use_npu_flag}" == x0 ]; +then + echo "------------------ ERROR NOTICE START ------------------" + echo "ERROR, your task haven't used Ascend NPU, please check your npu Migration." + echo "------------------ ERROR NOTICE END------------------" +else + echo "------------------ INFO NOTICE START------------------" + echo "INFO, your task have used Ascend NPU, please check your result." 
+ echo "------------------ INFO NOTICE END------------------" +fi + +# 获取最终的casename,请保留,case文件名为${CaseName} +get_casename + +# 重命名loss文件 +if [ -f $cur_path/output/${ASCEND_DEVICE_ID}/my_output_loss.txt ]; +then + mv $cur_path/output/${ASCEND_DEVICE_ID}/my_output_loss.txt $cur_path/output/${ASCEND_DEVICE_ID}/${CaseName}_loss.txt +fi + +echo "------------------ Final result ------------------" +# 输出性能FPS/单step耗时/端到端耗时 +echo "Final Performance images/sec : $FPS" +echo "Final Performance sec/step : $StepTime" +echo "E2E Training Duration sec : $e2e_time" + +# 输出训练精度 +echo "Final Train Accuracy : ${train_accuracy}" + +# 最后一个迭代loss值,不需要修改 +ActualLoss=(`awk 'END {print $NF}' $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}_loss.txt`) + +#关键信息打印到${CaseName}.log中,不需要修改 +echo "Network = ${Network}" > $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "RankSize = ${RANK_SIZE}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "BatchSize = ${batch_size}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "DeviceType = `uname -m`" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "CaseName = ${CaseName}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualFPS = ${FPS}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "TrainingTime = ${StepTime}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualLoss = ${ActualLoss}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "E2ETrainingTime = ${e2e_time}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log \ No newline at end of file diff --git "a/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/train_performance_1p.sh" "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/train_performance_1p.sh" new file mode 100644 index 0000000000000000000000000000000000000000..edc4d2c4e296426b934821040e2ce023c8291837 --- /dev/null +++ "b/TensorFlow/contrib/cv/GMH\342\200\224MDN_ID1225_for_TensorFlow/test/train_performance_1p.sh" @@ -0,0 +1,185 @@ +#!/bin/bash + +# shell脚本所在路径 +cur_path=`echo $(cd $(dirname $0);pwd)` + +# 判断当前shell是否是performance +perf_flag=`echo $0 | grep performance | wc -l` +# 当前执行网络的名称 +Network="GMH—MDN_ID1225_for_TensorFlow" +#失败用例打屏 +export ASCEND_SLOG_PRINT_TO_STDOUT=1 +#基础参数,需要模型审视修改 +#batch Size +batch_size=64 +#当前是否为测试,默认为False,即训练模式 +test="False" +#网络名称,同目录名称 +#Device数量,单卡默认为1 +RankSize=1 +#训练epoch,可选 +epochs=1 +#学习率 +learning_rate='1e-3' +#参数配置 +cameras_path="" +data_dir="" +train_dir="" +load_dir="" +load=0 + +# 帮助信息,不需要修改 +if [[ $1 == --help || $1 == --h ]];then + echo "usage:./train_performance_1p.sh " + + echo "" + echo "parameter explain: + --test Set to True for sampling + --learning_rate Learning rate + --batch_size batch size to use during training + --epochs How many epochs we should train for + --cameras_path Directory to load camera parameters + --data_dir Data directory + --train_dir Training directory + --load_dir Specify the directory to load trained model + --load Try to load a previous checkpoint + -h/--help Show help message + " + exit 1 +fi + +# 参数校验,不需要修改 +for para in $* +do + if [[ $para == --data_dir* ]];then + data_dir=`echo ${para#*=}` + elif [[ $para == --batch_size* ]];then + batch_size=`echo ${para#*=}` + elif [[ $para == --epochs* ]];then + epochs=`echo ${para#*=}` + elif [[ $para == --train_dir* ]];then + train_dir=`echo ${para#*=}` + elif [[ $para == --load_dir* ]];then + load_dir=`echo ${para#*=}` + elif [[ $para == --cameras_path* ]];then + cameras_path=`echo ${para#*=}` + elif [[ $para == 
--load* ]];then + load=`echo ${para#*=}` + elif [[ $para == --learning_rate* ]];then + learning_rate=`echo ${para#*=}` + elif [[ $para == --test* ]];then + test=`echo ${para#*=}` + fi +done + +# 校验是否传入data_dir,不需要修改 +if [[ $data_dir == "" ]];then + echo "[Error] para \"data_dir\" must be config" + exit 1 +fi + +# 校验是否传入train_dir,不需要修改 +if [[ $train_dir == "" ]];then + echo "[Error] para \"train_dir\" must be config" + exit 1 +fi + +# 设置打屏日志文件名,请保留,文件名为${print_log} +print_log="./test/output/${ASCEND_DEVICE_ID}/train_${ASCEND_DEVICE_ID}.log" +modelarts_flag=${MODELARTS_MODEL_PATH} +if [ x"${modelarts_flag}" != x ]; +then + echo "running without etp..." + print_log_name=`ls /home/ma-user/modelarts/log/ | grep proc-rank` + print_log="/home/ma-user/modelarts/log/${print_log_name}" +fi +echo "### get your log here : ${print_log}" + +CaseName="" +function get_casename() +{ + if [ x"${perf_flag}" = x1 ]; + then + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'perf' + else + CaseName=${Network}_bs${batch_size}_${RANK_SIZE}'p'_'acc' + fi +} + +# 跳转到code目录 +cd ${cur_path}/../ +rm -rf ./test/output/${ASCEND_DEVICE_ID} +mkdir -p ./test/output/${ASCEND_DEVICE_ID} +touch ./test/output/${ASCEND_DEVICE_ID}/my_output_loss.txt +cd ${cur_path}/../src + +start=$(date +%s) +python3 ./predict_3dpose_mdm.py \ +--data_dir ${data_dir} \ +--train_dir ${train_dir} \ +--test ${test} \ +--load_dir ${load_dir} \ +--cameras_path ${cameras_path} \ +--load ${load} +--batch_size ${batch_size} \ +--epochs ${epochs} \ +--learning_rate ${learning_rate} > ${print_log} +wait +end=$(date +%s) +e2e_time=$(( $end - $start )) + +#输出性能FPS,需要模型审视修改 +StepTime=`grep "done in" ${print_log} | tail -n 10|awk '{print $3}' | awk '{sum+=$1} END {print sum/NR/1000}'` +#打印,不需要修改 +FPS=`awk 'BEGIN{printf("%.2f\n", '${batch_size}' /'${StepTime}')}'` + +#输出训练精度,需要模型审视修改 +train_accuracy=`grep "root - Average" ${print_log} | awk 'END {print $7}'` + +# 提取所有loss打印信息 +grep "Train loss avg:" ${print_log} | awk '{print $4}' > $cur_path/output/${ASCEND_DEVICE_ID}/my_output_loss.txt + +# 判断本次执行是否正确使用Ascend NPU +use_npu_flag=`grep "tf_adapter" ${print_log} | wc -l` +if [ x"${use_npu_flag}" == x0 ]; +then + echo "------------------ ERROR NOTICE START ------------------" + echo "ERROR, your task haven't used Ascend NPU, please check your npu Migration." + echo "------------------ ERROR NOTICE END------------------" +else + echo "------------------ INFO NOTICE START------------------" + echo "INFO, your task have used Ascend NPU, please check your result." 
+ echo "------------------ INFO NOTICE END------------------" +fi + +# 获取最终的casename,请保留,case文件名为${CaseName} +get_casename + +# 重命名loss文件 +if [ -f $cur_path/output/${ASCEND_DEVICE_ID}/my_output_loss.txt ]; +then + mv $cur_path/output/${ASCEND_DEVICE_ID}/my_output_loss.txt $cur_path/output/${ASCEND_DEVICE_ID}/${CaseName}_loss.txt +fi + +echo "------------------ Final result ------------------" +# 输出性能FPS/单step耗时/端到端耗时 +echo "Final Performance images/sec : $FPS" +echo "Final Performance sec/step : $StepTime" +echo "E2E Training Duration sec : $e2e_time" + +# 输出训练精度 +echo "Final Train Accuracy : ${train_accuracy}" + +# 最后一个迭代loss值,不需要修改 +ActualLoss=(`awk 'END {print $NF}' $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}_loss.txt`) + +#关键信息打印到${CaseName}.log中,不需要修改 +echo "Network = ${Network}" > $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "RankSize = ${RANK_SIZE}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "BatchSize = ${batch_size}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "DeviceType = `uname -m`" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "CaseName = ${CaseName}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualFPS = ${FPS}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "TrainingTime = ${StepTime}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "ActualLoss = ${ActualLoss}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log +echo "E2ETrainingTime = ${e2e_time}" >> $cur_path/output/$ASCEND_DEVICE_ID/${CaseName}.log \ No newline at end of file