diff --git a/operator/op_using/aclopExecuteV2/Add/README.md b/operator/op_using/aclopExecuteV2/Add/README.md new file mode 100644 index 0000000000000000000000000000000000000000..a1533e0d34969884893129a0f60fd9a8d68bec2f --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/README.md @@ -0,0 +1,179 @@ +# Single-Operator Verification Through the aclopExecuteV2 Interface: Add + +## Overview + +This sample verifies the function of the [custom operator Add](../../1_custom_op/doc/Add_EN.md) by converting the custom operator file into a single-operator offline model file and loading the file using AscendCL for execution. + +Note: The generation of a single-operator model file depends only on the operator code implementation file, operator prototype definition, and operator information library, but does not depend on the operator adaptation plugin. + +## Directory Structure + +``` +├── inc // Header file directory +│ ├── common.h // Common method class declaration file, used to read binary files +│ ├── operator_desc.h // Operator description declaration file, including the operator inputs and outputs, operator type, input description, and output description +│ ├── op_runner.h // Operator execution information declaration file, including the numbers and sizes of operator inputs and outputs +├── run // Directory of files required for single-operator execution +│ ├── out // Directory of executables required for single-operator execution +│ └── test_data // Directory of test data files +│ ├── config +│ └── acl.json // File for AscendCL initialization, which must not be modified +│ └── add_op.json // Operator description file, used to construct a single-operator model file +│ ├── data +│ └── generate_data.py // Script for generating test data +├── src +│ ├── CMakeLists.txt // Build script +│ ├── common.cpp // Common function file, used to read binary files +│ ├── main.cpp // File containing the operator configuration, used to build the single-operator Add into an OM file
and load the OM file for execution. To verify other single-operators, modify the configuration based on this file. +│ ├── operator_desc.cpp // File used to construct the input and output descriptions of the operator +│ ├── op_runner.cpp // Function implementation file for building and running a single-operator +``` + +## Environment Requirements + +- OS and architecture: CentOS x86\_64, CentOS AArch64, Ubuntu 18.04 x86\_64 +- Python version and dependency library: Python 3.7.*x* (3.7.0 to 3.7.11) and Python 3.8.*x* (3.8.0 to 3.8.11). +- Ascend AI Software Stack deployed + +## Environment Variables + +- Configuring Environment Variables in the Development Environment + + 1. The CANN portfolio provides a process-level environment variable setting script to automatically set environment variables. The following commands are used as examples: + + ``` + . ${HOME}/Ascend/ascend-toolkit/set_env.sh + ``` + + Replace **$HOME/Ascend** with the actual Ascend-CANN-Toolkit installation path. + + 2. Operator building requires a Python installation. The following takes Python 3.7.5 as an example. Run the following commands as a running user to set the environment variables related to Python 3.7.5. + + ``` + # Set the Python 3.7.5 library path. + export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH + # If multiple Python 3 versions exist in the user environment, specify Python 3.7.5. + export PATH=/usr/local/python3.7.5/bin:$PATH + ``` + + Replace the Python 3.7.5 installation path as required. You can also write the preceding commands to the ~/.bashrc file and run the source ~/.bashrc command to make the modification take effect immediately. + + 3. In the development environment, set environment variables and configure the header search path and library search path on which the build of the AscendCL single-operator verification program depends.
+ + + After setting the following environment variables, the compilation script will find the header file that the compilation depends on according to the directory "{DDK_PATH}/runtime/include/acl", and the library file that the compilation depends on according to the directory pointed to by the {NPU_HOST_LIB} environment variable. Please replace **$HOME/Ascend** with the actual component installation path. + + - If the operating system architecture of the development environment is the same as that of the running environment, run the following command: + + ``` + export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest + export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub + ``` + +- Configuring Environment Variables in the Running Environment + + - If Ascend-CANN-Toolkit is installed in the running environment, set the environment variable as follows: + + ``` + . ${HOME}/Ascend/ascend-toolkit/set_env.sh + ``` + + Replace **$HOME/Ascend** with the actual component installation path. + +## Build and Run + +1. Generate the single-operator offline model file of the Add operator. + 1. Log in to the development environment as a running user \(for example, **HwHiAiUser**\) and go to the **Add/run/out** directory of the sample project. + 2. Run the following command in the **out** directory to generate a single-operator model file: + + **atc --singleop=test\_data/config/add\_op.json --soc\_version=*Ascend310* --output=op\_models** + + Specifically, + + - **singleop**: operator description file \(.json\). + - **soc\_version**: Ascend AI Processor version. Replace it with the actual version. + + - **--output=op\_models**: indicates that the generated model file is stored in the **op\_models** folder in the current directory. + + After the model conversion is successful, the following files are generated: + + The single-operator model file **0_Add_3_2_8_4_3_2_8_4_3_2_8_4.om** is generated in the **op\_models** subdirectory of the current directory. 
The file is named in the format of "No.+opType+input description \(dataType\_format\_shape\)+output description". + + View the enumerated values of **dataType** and **format** in the **include/graph/types.h** file. The enumerated values start from 0 and increase in steps of 1. + + **Note:** During model conversion, operators in the custom OPP are preferentially searched to match the operators in the model file. + + +2. Generate test data. + + Go to the **run/out/test\_data/data** directory of the sample project and run the following command: + + **python3.7.5 generate\_data.py** + + Two data files **input\_0.bin** and **input\_1.bin** with shape \(8, 4\) and data type int32 are generated in the current directory for verifying the Add operator. + +3. Build the sample project to generate an executable for single-operator verification. + 1. Go to the **Add** directory of the sample project and run the following command in this directory to create a directory for storing the generated executable, for example, **build/intermediates/host**. + + **mkdir -p build/intermediates/host** + + 2. Go to the **build/intermediates/host** directory and run the **cmake** command. + + Replace **../../../src** with the actual directory of **CMakeLists.txt**. + + Set **DCMAKE_SKIP_RPATH** to **TRUE** so that **rpath** (the path specified by **NPU_HOST_LIB**) is not added to the executable generated by the build. The executable then automatically searches for dynamic libraries in the paths included in **LD_LIBRARY_PATH**. + + - If the operating system architecture of the development environment is the same as that of the running environment, run the following command: + + **cd build/intermediates/host** + + **cmake ../../../src -DCMAKE\_CXX\_COMPILER=g++ -DCMAKE\_SKIP\_RPATH=TRUE** + + + + 3. Run the following command to generate an executable: + + **make** + + The executable **execute\_add\_op** is generated in the **run/out** directory of the project. + + +4. 
Execute the single-operator verification file on the host of the hardware device. + 1. As a running user \(for example, **HwHiAiUser**\), copy the **out** folder in the **Add/run/** directory of the sample project in the development environment to any directory in the operating environment \(host\), for example, **/home/HwHiAiUser/HIAI\_PROJECTS/run\_add/**. + + **Note**: If your development environment is the host of the hardware device, skip this step. + + 2. Execute the **execute\_add\_op** file in the operating environment to verify the single-operator model file. + + Run the following commands in **/home/HwHiAiUser/HIAI\_PROJECTS/run\_add/out**: + + **chmod +x execute\_add\_op** + + **./execute\_add\_op** + + The following information is displayed. \(Note: The data file is randomly generated by the data generation script, so the displayed data may be different.\) + + ``` + [INFO] Input[0]: + 88 34 4 79 40 48 20 60 89 16 26 63 54 33 48 76 + 64 27 20 93 92 90 20 60 15 60 0 49 96 0 41 97 + [INFO] Set input[1] from test_data/data/input_1.bin success. + [INFO] Input[1]: + 22 54 79 39 13 94 38 11 68 15 60 11 37 73 0 68 + 91 16 10 34 53 75 56 96 95 10 30 38 37 4 37 5 + [INFO] Copy input[0] success + [INFO] Copy input[1] success + [INFO] Create stream success + [INFO] Execute Add success + [INFO] Synchronize stream success + [INFO] Copy output[0] success + [INFO] Output[0]: + 110 88 83 118 53 142 58 71 157 31 86 74 91 106 48 144 + 155 43 30 127 145 165 76 156 110 70 30 87 133 4 78 102 + [INFO] Write output[0] success. output file = result_files/output_0.bin + [INFO] Run op success + ``` + + As shown above, the output result is the sum of input 1 and input 2. The Add operator has passed the verification. + + **result\_files/output\_0.bin**: result binary file. 
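The sum check described above can also be scripted instead of read off the screen. The following is a minimal sketch (the helper names are illustrative, not part of the sample; the file paths and the \(8, 4\) int32 layout are taken from **add_op.json** and **generate_data.py**, and the script would be run from the **run/out** directory after **execute_add_op** has produced **result_files/output_0.bin**):

```python
import struct

def read_int32_bin(path, count):
    """Read `count` raw little-endian int32 values from a sample .bin file."""
    with open(path, "rb") as f:
        return list(struct.unpack("<%di" % count, f.read(4 * count)))

def check_add(a, b, out):
    """True if every output element equals the sum of the matching input elements."""
    return all(x + y == z for x, y, z in zip(a, b, out))

# From run/out, after ./execute_add_op has written result_files/output_0.bin:
#   n = 8 * 4                                        # shape (8, 4) from add_op.json
#   a = read_int32_bin("test_data/data/input_0.bin", n)
#   b = read_int32_bin("test_data/data/input_1.bin", n)
#   out = read_int32_bin("result_files/output_0.bin", n)
#   print(check_add(a, b, out))                      # expect True
```

Because the .bin files hold raw int32 values with no header, a flat element-wise comparison is sufficient; the shape only matters for display.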
diff --git a/operator/op_using/aclopExecuteV2/Add/README_CN.md b/operator/op_using/aclopExecuteV2/Add/README_CN.md new file mode 100644 index 0000000000000000000000000000000000000000..2c948577e83c3915ca467f5cfc94e45dae95740d --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/README_CN.md @@ -0,0 +1,178 @@ + + +# 通过aclopExecuteV2接口进行Add单算子调用运行验证 + +## 功能描述 + +该样例通过aclopExecuteV2接口进行Add单算子调用,对[自定义算子Add](../../1_custom_op/doc/Add_CN.md)进行功能验证:先将自定义算子转换为单算子离线模型文件,然后通过AscendCL加载单算子模型文件运行。 + +说明:单算子模型文件的生成只依赖算子代码实现文件、算子原型定义、算子信息库,不依赖算子适配插件。 + +## 目录结构 + +``` +├── inc //头文件目录 +│ ├── common.h // 声明公共方法类,用于读取二进制文件 +│ ├── operator_desc.h //算子描述声明文件,包含算子输入/输出,算子类型以及输入描述与输出描述 +│ ├── op_runner.h //算子运行相关信息声明文件,包含算子输入/输出个数,输入/输出大小等 +├── run // 单算子执行需要的文件存放目录 +│ ├── out // 单算子执行需要的可执行文件存放目录 +│ └── test_data // 测试数据存放目录 +│ ├── config +│ └── acl.json //用于进行acl初始化,请勿修改此文件 +│ └── add_op.json // 算子描述文件,用于构造单算子模型文件 +│ ├── data +│ └── generate_data.py // 生成测试数据的脚本 +├── src +│ ├── CMakeLists.txt // 编译规则文件 +│ ├── common.cpp // 公共函数,读取二进制文件函数的实现文件 +│ ├── main.cpp // 将单算子编译为om文件并加载om文件执行,此文件中包含算子的相关配置,若验证其他单算子,基于此文件修改 +│ ├── operator_desc.cpp // 构造算子的输入与输出描述 +│ ├── op_runner.cpp // 单算子编译与运行函数实现文件 +``` + +## 环境要求 + +- 操作系统及架构:CentOS x86\_64、CentOS aarch64、Ubuntu 18.04 x86\_64 +- python及依赖的库:Python3.7.*x*(3.7.0 ~ 3.7.11)、Python3.8.*x*(3.8.0 ~ 3.8.11) +- 已完成昇腾AI软件栈的部署。 + +## 配置环境变量 + +- 开发环境上环境变量配置 + + 1. CANN-Toolkit包提供进程级环境变量配置脚本,供用户在进程中引用,以自动完成CANN基础环境变量的配置,配置示例如下所示 + + ``` + . ${HOME}/Ascend/ascend-toolkit/set_env.sh + ``` + + “$HOME/Ascend”请替换为“Ascend-cann-toolkit”包的实际安装路径。 + + 2. 算子编译依赖Python,以Python3.7.5为例,请以运行用户执行如下命令设置Python3.7.5的相关环境变量。 + + ``` + #用于设置python3.7.5库文件路径 + export LD_LIBRARY_PATH=/usr/local/python3.7.5/lib:$LD_LIBRARY_PATH + #如果用户环境存在多个python3版本,则指定使用python3.7.5版本 + export PATH=/usr/local/python3.7.5/bin:$PATH + ``` + + Python3.7.5安装路径请根据实际情况进行替换,您也可以将以上命令写入~/.bashrc文件中,然后执行source ~/.bashrc命令使其立即生效。 + + 3. 
开发环境上,设置环境变量,配置AscendCL单算子验证程序编译依赖的头文件与库文件路径。 + + 设置以下环境变量后,编译脚本会根据“{DDK_PATH}环境变量值/runtime/include/acl”目录查找编译依赖的头文件,根据{NPU_HOST_LIB}环境变量指向的目录查找编译依赖的库文件。“$HOME/Ascend”请替换为“Ascend-cann-toolkit”包的实际安装路径。 + + - 当开发环境与运行环境的操作系统架构相同时,配置示例如下所示: + + ``` + export DDK_PATH=$HOME/Ascend/ascend-toolkit/latest + export NPU_HOST_LIB=$DDK_PATH/runtime/lib64/stub + ``` + +- 运行环境上环境变量配置 + + - 运行环境上安装的“Ascend-cann-toolkit”包,环境变量设置如下: + + ``` + . ${HOME}/Ascend/ascend-toolkit/set_env.sh + ``` + +## 编译运行 + +1. 生成Add算子的单算子离线模型文件。 + 1. 以运行用户(例如HwHiAiUser)登录开发环境,并进入样例工程的“**Add/run/out**“目录。 + 2. 在out目录下执行如下命令,生成单算子模型文件。 + + **atc --singleop=test\_data/config/add\_op.json --soc\_version=*Ascend910* --output=op\_models** + + 其中: + + - singleop:算子描述的json文件。 + - soc\_version:昇腾AI处理器的型号,此处以Ascend910为例,请根据实际情况替换。 + + - --output=op\_models:代表生成的模型文件存储在当前目录下的op\_models文件夹下。 + + 模型转换成功后,会生成如下文件: + + 在当前目录的op\_models目录下生成单算子的模型文件**0_Add_3_2_8_4_3_2_8_4_3_2_8_4.om**,命名规范为:序号+opType + 输入的描述\(dataType\_format\_shape\)+输出的描述。 + + dataType以及format对应枚举值请从CANN软件所在目录下的“include/graph/types.h”文件中查看,枚举值从0开始依次递增。 + + **说明:**模型转换时,会优先去查找自定义算子库去匹配模型文件中的算子。 + + +2. 生成测试数据。 + + 进入样例工程目录的run/out/test\_data/data目录下,执行如下命令: + + **python3.7.5 generate\_data.py** + + 会在当前目录下生成两个shape为\(8, 4\),数据类型为int32的数据文件input\_0.bin与input\_1.bin,用于进行Add算子的验证。 + +3. 编译样例工程,生成单算子验证可执行文件。 + 1. 切换到样例工程根目录**Add**,然后在样例工程根目录下执行如下命令创建目录用于存放编译文件,例如,创建的目录为“build/intermediates/host“。 + + **mkdir -p build/intermediates/host** + + 2. 切换到“build/intermediates/host”目录,执行cmake命令生成编译文件。 + + “../../../src“表示CMakeLists.txt文件所在的目录,请根据实际目录层级修改。 + + DCMAKE_SKIP_RPATH需设置为TRUE,代表不会将rpath信息(即NPU_HOST_LIB配置的路径)添加到编译生成的可执行文件中去,可执行文件运行时会自动搜索实际设置的LD_LIBRARY_PATH中的动态链接库。 + + - 开发环境与运行环境操作系统架构相同,执行如下命令编译。 + + **cd build/intermediates/host** + + **cmake ../../../src -DCMAKE\_CXX\_COMPILER=g++ -DCMAKE\_SKIP\_RPATH=TRUE** + + + 3. 执行如下命令,生成可执行文件。 + + **make** + + 会在工程目录的“run/out“目录下生成可执行文件**execute\_add\_op**。 + + +4. 在硬件设备的Host侧执行单算子验证文件。 + 1. 
以运行用户(例如HwHiAiUser)拷贝开发环境中样例工程**Add/run/**目录下的out文件夹到运行环境任一目录,例如上传到/home/HwHiAiUser/HIAI\_PROJECTS/run\_add/目录下。 + + **说明:**若您的开发环境即为运行环境,此拷贝操作可跳过。 + + 2. 在运行环境中执行execute\_add\_op文件,验证单算子模型文件。 + + 在/home/HwHiAiUser/HIAI\_PROJECTS/run\_add/out目录下执行如下命令: + + **chmod +x execute\_add\_op** + + **./execute\_add\_op** + + 会有如下屏显信息(注意:由于数据生成脚本生成的数据文件是随机的,屏显显示的数据会有不同): + + ``` +[INFO] Input[0]: + 88 34 4 79 40 48 20 60 89 16 26 63 54 33 48 76 + 64 27 20 93 92 90 20 60 15 60 0 49 96 0 41 97 +[INFO] Set input[1] from test_data/data/input_1.bin success. +[INFO] Input[1]: + 22 54 79 39 13 94 38 11 68 15 60 11 37 73 0 68 + 91 16 10 34 53 75 56 96 95 10 30 38 37 4 37 5 +[INFO] Copy input[0] success +[INFO] Copy input[1] success +[INFO] Create stream success +[INFO] Execute Add success +[INFO] Synchronize stream success +[INFO] Copy output[0] success +[INFO] Output[0]: + 110 88 83 118 53 142 58 71 157 31 86 74 91 106 48 144 + 155 43 30 127 145 165 76 156 110 70 30 87 133 4 78 102 +[INFO] Write output[0] success. output file = result_files/output_0.bin +[INFO] Run op success + ``` + + 可见输出结果=输入数据1+输入数据2,Add算子验证结果正确。 + + result\_files/output\_0.bin:输出数据的二进制文件。 + diff --git a/operator/op_using/aclopExecuteV2/Add/inc/common.h b/operator/op_using/aclopExecuteV2/Add/inc/common.h new file mode 100644 index 0000000000000000000000000000000000000000..854c5931c321a8b6b9b81114b16189a78f5ac7b3 --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/inc/common.h @@ -0,0 +1,45 @@ +/** +* @file common.h +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +*/ +#ifndef COMMON_H +#define COMMON_H + +#include <cstdio> +#include <cstdint> +#include <iostream> +#include <string> +#include <vector> + +#include "acl/acl.h" + +#define SUCCESS 0 +#define FAILED 1 + +#define INFO_LOG(fmt, args...) 
fprintf(stdout, "[INFO] " fmt "\n", ##args) +#define WARN_LOG(fmt, args...) fprintf(stdout, "[WARN] " fmt "\n", ##args) +#define ERROR_LOG(fmt, args...) fprintf(stderr, "[ERROR] " fmt "\n", ##args) + +/** + * @brief Read data from file + * @param [in] filePath: file path + * @param [out] fileSize: file size + * @return read result + */ +bool ReadFile(const std::string &filePath, size_t &fileSize, void *buffer, size_t bufferSize); + +/** + * @brief Write data to file + * @param [in] filePath: file path + * @param [in] buffer: data to write to file + * @param [in] size: size to write + * @return write result + */ +bool WriteFile(const std::string &filePath, const void *buffer, size_t size); + +#endif // COMMON_H diff --git a/operator/op_using/aclopExecuteV2/Add/inc/op_runner.h b/operator/op_using/aclopExecuteV2/Add/inc/op_runner.h new file mode 100644 index 0000000000000000000000000000000000000000..fbbce1cea15c70a6257dc2dd3461557b8ce1926a --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/inc/op_runner.h @@ -0,0 +1,157 @@ +/** +* @file op_runner.h +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+*/ +#ifndef OP_RUNNER_H +#define OP_RUNNER_H + +#include "acl/acl.h" +#include "common.h" +#include "operator_desc.h" + +class OpRunner { +public: + /** + * @brief Constructor + * @param [in] opDesc: op description + */ + explicit OpRunner(OperatorDesc *opDesc); + + /** + * @brief Destructor + */ + virtual ~OpRunner(); + + /** + * @brief Init op runner + */ + bool Init(); + + /** + * @brief Get number of inputs + * @return number of inputs + */ + const size_t NumInputs(); + + /** + * @brief Get number of outputs + * @return number of outputs + */ + const size_t NumOutputs(); + + /** + * @brief Get input size by index + * @param [in] index: input index + * @return size of the input + */ + const size_t GetInputSize(size_t index) const; + + /** + * @brief Get output size by index + * @param [in] index: output index + * @return size of the output + */ + size_t GetOutputSize(size_t index) const; + + /** + * @brief Get input element count by index + * @param [in] index: input index + * @return element count of the input + */ + size_t GetInputElementCount(size_t index) const; + + /** + * @brief Get output element count by index + * @param [in] index: output index + * @return element count of the output + */ + size_t GetOutputElementCount(size_t index) const; + + /** + * @brief Get input shape by index + * @param [in] index: input index + * @return shape of the input + */ + std::vector<int64_t> GetInputShape(size_t index) const; + + /** + * @brief Get output shape by index + * @param [in] index: output index + * @return shape of the output + */ + std::vector<int64_t> GetOutputShape(size_t index) const; + + /** + * @brief Get input buffer(host memory) by index + * @tparam T: data type + * @param [in] index: input index + * @return host address of the input + */ + template<typename T> + T *GetInputBuffer(size_t index) + { + if (index >= numInputs_) { + ERROR_LOG("index out of range. 
index = %zu, numInputs = %zu", index, numInputs_); + return nullptr; + } + return reinterpret_cast<T *>(hostInputs_[index]); + } + + /** + * @brief Get output buffer(host memory) by index + * @tparam T: data type + * @param [in] index: output index + * @return host address of the output + */ + template<typename T> + const T *GetOutputBuffer(size_t index) + { + if (index >= numOutputs_) { + ERROR_LOG("index out of range. index = %zu, numOutputs = %zu", index, numOutputs_); + return nullptr; + } + + return reinterpret_cast<const T *>(hostOutputs_[index]); + } + + /** + * @brief Print readable input by index + * @param [in] index: input index + * @param [in] elementsPerRow: number of elements per row + */ + void PrintInput(size_t index, size_t elementsPerRow = 16); + + /** + * @brief Print readable output by index + * @param [in] index: output index + * @param [in] elementsPerRow: number of elements per row + */ + void PrintOutput(size_t index, size_t elementsPerRow = 16); + + /** + * @brief Run op + * @return run result + */ + bool RunOp(); + +private: + size_t numInputs_; + size_t numOutputs_; + + std::vector<aclDataBuffer *> inputBuffers_; + std::vector<aclDataBuffer *> outputBuffers_; + + std::vector<void *> devInputs_; + std::vector<void *> devOutputs_; + + std::vector<void *> hostInputs_; + std::vector<void *> hostOutputs_; + OperatorDesc *opDesc_; +}; + +#endif // OP_RUNNER_H diff --git a/operator/op_using/aclopExecuteV2/Add/inc/operator_desc.h b/operator/op_using/aclopExecuteV2/Add/inc/operator_desc.h new file mode 100644 index 0000000000000000000000000000000000000000..5dbc1cee1946d0afb3d5bd4704eb6ce66f03a14e --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/inc/operator_desc.h @@ -0,0 +1,56 @@ +/** +* @file operator_desc.h +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+*/ +#ifndef OPERATOR_DESC_H +#define OPERATOR_DESC_H + +#include <string> +#include <vector> + +#include "acl/acl.h" + +struct OperatorDesc { + /** + * Constructor + * @param [in] opType: op type + */ + explicit OperatorDesc(std::string opType); + + /** + * Destructor + */ + virtual ~OperatorDesc(); + + /** + * Add an input tensor description + * @param [in] dataType: data type + * @param [in] numDims: number of dims + * @param [in] dims: dims + * @param [in] format: format + * @return OperatorDesc + */ + OperatorDesc &AddInputTensorDesc(aclDataType dataType, int numDims, const int64_t *dims, aclFormat format); + + /** + * Add an output tensor description + * @param [in] dataType: data type + * @param [in] numDims: number of dims + * @param [in] dims: dims + * @param [in] format: format + * @return OperatorDesc + */ + OperatorDesc &AddOutputTensorDesc(aclDataType dataType, int numDims, const int64_t *dims, aclFormat format); + + std::string opType; + std::vector<aclTensorDesc *> inputDesc; + std::vector<aclTensorDesc *> outputDesc; + aclopAttr *opAttr; +}; + +#endif // OPERATOR_DESC_H diff --git a/operator/op_using/aclopExecuteV2/Add/run/out/test_data/config/acl.json b/operator/op_using/aclopExecuteV2/Add/run/out/test_data/config/acl.json new file mode 100644 index 0000000000000000000000000000000000000000..9e26dfeeb6e641a33dae4961196235bdb965b21b --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/run/out/test_data/config/acl.json @@ -0,0 +1 @@ +{} \ No newline at end of file diff --git a/operator/op_using/aclopExecuteV2/Add/run/out/test_data/config/add_op.json b/operator/op_using/aclopExecuteV2/Add/run/out/test_data/config/add_op.json new file mode 100644 index 0000000000000000000000000000000000000000..2141cb78e306729ece4e71f05e672846e66de38a --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/run/out/test_data/config/add_op.json @@ -0,0 +1,24 @@ +[ + { + "op": "Add", + "input_desc": [ + { + "format": "ND", + "shape": [8, 4], + "type": "int32" + }, + { + "format": "ND", + "shape": [8, 4], + "type": 
"int32" + } + ], + "output_desc": [ + { + "format": "ND", + "shape": [8, 4], + "type": "int32" + } + ] + } +] diff --git a/operator/op_using/aclopExecuteV2/Add/run/out/test_data/data/generate_data.py b/operator/op_using/aclopExecuteV2/Add/run/out/test_data/data/generate_data.py new file mode 100644 index 0000000000000000000000000000000000000000..896c87e2f3b9454ca89c3dff789135eb50874aef --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/run/out/test_data/data/generate_data.py @@ -0,0 +1,16 @@ +""" +* @file generate_data.py +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +""" +import numpy as np + +a = np.random.randint(100, size=(8, 4)).astype(np.int32) +b = np.random.randint(100, size=(8, 4)).astype(np.int32) + +a.tofile('input_0.bin') +b.tofile('input_1.bin') diff --git a/operator/op_using/aclopExecuteV2/Add/src/CMakeLists.txt b/operator/op_using/aclopExecuteV2/Add/src/CMakeLists.txt new file mode 100644 index 0000000000000000000000000000000000000000..760ff9eddf333deb7b55cd96f7e640d56e8913a8 --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/src/CMakeLists.txt @@ -0,0 +1,55 @@ +# Copyright (c) Huawei Technologies Co., Ltd. 2020. All rights reserved. 
+ +# CMake lowest version requirement +cmake_minimum_required(VERSION 3.5.1) + +# project information +project(acl_execute_add) + +# Compile options +add_compile_options(-std=c++11) + +set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "../../../run/out") +set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "../../outputs") + +set(INC_PATH $ENV{DDK_PATH}) + +if (NOT DEFINED ENV{DDK_PATH}) + set(INC_PATH "/usr/local/Ascend") + message(STATUS "set default INC_PATH: ${INC_PATH}") +else () + message(STATUS "env INC_PATH: ${INC_PATH}") +endif() + +set(LIB_PATH $ENV{NPU_HOST_LIB}) + +# Dynamic libraries in the stub directory can only be used for compilation +if (NOT DEFINED ENV{NPU_HOST_LIB}) + set(LIB_PATH "/usr/local/Ascend/runtime/lib64/stub/") + message(STATUS "set default LIB_PATH: ${LIB_PATH}") +else () + message(STATUS "env LIB_PATH: ${LIB_PATH}") +endif() + +# Header path +include_directories( + ${INC_PATH}/runtime/include/ + ../inc +) + +# add host lib path +link_directories( + ${LIB_PATH} +) + +add_executable(execute_add_op + operator_desc.cpp + op_runner.cpp + main.cpp + common.cpp) + +target_link_libraries(execute_add_op + ascendcl + stdc++) + +install(TARGETS execute_add_op DESTINATION ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}) diff --git a/operator/op_using/aclopExecuteV2/Add/src/common.cpp b/operator/op_using/aclopExecuteV2/Add/src/common.cpp new file mode 100644 index 0000000000000000000000000000000000000000..298c509ebbc2504870f926344f84aab41f1eb03a --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/src/common.cpp @@ -0,0 +1,77 @@ +/** +* @file common.cpp +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
+*/ +#include "common.h" + +#include <sys/stat.h> +#include <fstream> +#include <fcntl.h> +#include <unistd.h> + +bool ReadFile(const std::string &filePath, size_t &fileSize, void *buffer, size_t bufferSize) +{ + struct stat sBuf; + int fileStatus = stat(filePath.data(), &sBuf); + if (fileStatus == -1) { + ERROR_LOG("failed to get file"); + return false; + } + if (S_ISREG(sBuf.st_mode) == 0) { + ERROR_LOG("%s is not a file, please enter a file", filePath.c_str()); + return false; + } + + std::ifstream file; + file.open(filePath, std::ios::binary); + if (!file.is_open()) { + ERROR_LOG("Open file failed. path = %s", filePath.c_str()); + return false; + } + + std::filebuf *buf = file.rdbuf(); + size_t size = buf->pubseekoff(0, std::ios::end, std::ios::in); + if (size == 0) { + ERROR_LOG("file size is 0"); + file.close(); + return false; + } + if (size > bufferSize) { + ERROR_LOG("file size is larger than buffer size"); + file.close(); + return false; + } + buf->pubseekpos(0, std::ios::in); + buf->sgetn(static_cast<char *>(buffer), size); + fileSize = size; + file.close(); + return true; +} + +bool WriteFile(const std::string &filePath, const void *buffer, size_t size) +{ + if (buffer == nullptr) { + ERROR_LOG("Write file failed. buffer is nullptr"); + return false; + } + + int fd = open(filePath.c_str(), O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR); + if (fd < 0) { + ERROR_LOG("Open file failed. path = %s", filePath.c_str()); + return false; + } + + auto writeSize = write(fd, buffer, size); + (void) close(fd); + if (writeSize < 0 || static_cast<size_t>(writeSize) != size) { + ERROR_LOG("Write file Failed."); + return false; + } + + return true; +} diff --git a/operator/op_using/aclopExecuteV2/Add/src/main.cpp b/operator/op_using/aclopExecuteV2/Add/src/main.cpp new file mode 100644 index 0000000000000000000000000000000000000000..65cb4c01dd55b1c5c4e993be3a8437774ff6520a --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/src/main.cpp @@ -0,0 +1,190 @@ +/** +* @file main.cpp +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. 
+* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +*/ +#include <cstdint> +#include <iostream> +#include <string> +#include <unistd.h> +#include <sys/stat.h> + +#include "acl/acl.h" +#include "op_runner.h" + +#include "common.h" + +int deviceId = 0; + +OperatorDesc CreateOpDesc() +{ + // define operator + std::vector<int64_t> shape{8, 4}; + std::string opType = "Add"; + aclDataType dataType = ACL_INT32; + aclFormat format = ACL_FORMAT_ND; + OperatorDesc opDesc(opType); + opDesc.AddInputTensorDesc(dataType, shape.size(), shape.data(), format); + opDesc.AddInputTensorDesc(dataType, shape.size(), shape.data(), format); + opDesc.AddOutputTensorDesc(dataType, shape.size(), shape.data(), format); + // Return the opDesc object, which describes the operator's inputs and outputs + return opDesc; +} + +bool SetInputData(OpRunner &runner) { + // Iterate over each input reported by the OpRunner + for (size_t i = 0; i < runner.NumInputs(); ++i) { + size_t fileSize = 0; + // Data file for the i-th input + std::string filePath = "test_data/data/input_" + std::to_string(i) + ".bin"; + // Read the file into the input's host memory + bool result = ReadFile(filePath, fileSize, + runner.GetInputBuffer<void>(i), runner.GetInputSize(i)); + if (!result) { + ERROR_LOG("Read input[%zu] failed", i); + return false; + } + + INFO_LOG("Set input[%zu] from %s success.", i, filePath.c_str()); + INFO_LOG("Input[%zu]:", i); + runner.PrintInput(i); + } + + return true; +} + +bool ProcessOutputData(OpRunner &runner) { + // Iterate over each output reported by the OpRunner + for (size_t i = 0; i < runner.NumOutputs(); ++i) { + INFO_LOG("Output[%zu]:", i); + // Print each output + runner.PrintOutput(i); + + // Write the output data to a file + std::string filePath = "result_files/output_" + std::to_string(i) + ".bin"; + if (!WriteFile(filePath, runner.GetOutputBuffer<void>(i), runner.GetOutputSize(i))) { + ERROR_LOG("Write output[%zu] failed.", i); + return false; + } + + INFO_LOG("Write output[%zu] success. 
output file = %s", i, filePath.c_str()); + } + return true; +} + +void DestoryResource() { + bool flag = false; + if (aclrtResetDevice(deviceId) != ACL_SUCCESS) { + ERROR_LOG("Reset device %d failed", deviceId); + flag = true; + } + if (aclFinalize() != ACL_SUCCESS) { + ERROR_LOG("Finalize acl failed"); + flag = true; + } + if (flag) { + ERROR_LOG("Destroy resource failed"); + } else { + INFO_LOG("Destroy resource success"); + } +} + +bool InitResource() { + // Create the output directory + std::string output = "./result_files"; + if (access(output.c_str(), 0) == -1) { + int ret = mkdir(output.c_str(), 0700); + if (ret == 0) { + INFO_LOG("Make output directory successfully"); + } else { + ERROR_LOG("Make output directory fail"); + return false; + } + } + + // Initialize AscendCL + if (aclInit("test_data/config/acl.json") != ACL_SUCCESS) { + ERROR_LOG("acl init failed"); + return false; + } + + // Set the device + if (aclrtSetDevice(deviceId) != ACL_SUCCESS) { + ERROR_LOG("Set device failed. deviceId is %d", deviceId); + (void)aclFinalize(); + return false; + } + INFO_LOG("Set device[%d] success", deviceId); + + // Get the run mode of the software stack + aclrtRunMode runMode; + if (aclrtGetRunMode(&runMode) != ACL_SUCCESS) { + ERROR_LOG("Get run mode failed"); + DestoryResource(); + return false; + } + extern bool g_isDevice; // defined in op_runner.cpp + g_isDevice = (runMode == ACL_DEVICE); + + // Specify the single-op model directory + if (aclopSetModelDir("op_models") != ACL_SUCCESS) { + ERROR_LOG("Load single op model failed"); + DestoryResource(); + return false; + } + + return true; +} + +bool RunAddOp() { + // Create the operator description + OperatorDesc opDesc = CreateOpDesc(); + + // Create an OpRunner from opDesc + OpRunner opRunner(&opDesc); + // Initialize the runner + if (!opRunner.Init()) { + ERROR_LOG("Init OpRunner failed"); + return false; + } + + // Fill the input data; Init only allocated the memory + if (!SetInputData(opRunner)) { + ERROR_LOG("Set input data failed"); + return false; + } + + // Run the operator + if (!opRunner.RunOp()) { + ERROR_LOG("Run op failed"); + return false; + } + + // Print and save the output data + if (!ProcessOutputData(opRunner)) { + ERROR_LOG("Process 
output data failed"); + return false; + } + + INFO_LOG("Run op success"); + return true; +} + +int main() { + // Initialize resources + if (!InitResource()) + { + ERROR_LOG("Init resource failed"); + return FAILED; + } + INFO_LOG("Init resource success"); + + // Execute the operator + if (!RunAddOp()) { + DestoryResource(); + return FAILED; + } + + // Release resources before exiting + DestoryResource(); + return SUCCESS; +} \ No newline at end of file diff --git a/operator/op_using/aclopExecuteV2/Add/src/op_runner.cpp b/operator/op_using/aclopExecuteV2/Add/src/op_runner.cpp new file mode 100644 index 0000000000000000000000000000000000000000..b2182a38c928bbe619beef13ad97326cac1cf403 --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/src/op_runner.cpp @@ -0,0 +1,390 @@ +/** +* @file op_runner.cpp +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +*/ +#include "op_runner.h" + +#include <limits> + +#include "common.h" + +using namespace std; + +bool g_isDevice = false; + +OpRunner::OpRunner(OperatorDesc *opDesc) : opDesc_(opDesc) +{ + // Record the numbers of inputs and outputs from the operator description built in main + numInputs_ = opDesc->inputDesc.size(); + numOutputs_ = opDesc->outputDesc.size(); +} + +OpRunner::~OpRunner() +{ + for (size_t i = 0; i < numInputs_; ++i) { + (void)aclDestroyDataBuffer(inputBuffers_[i]); + (void)aclrtFree(devInputs_[i]); + if (g_isDevice) { + (void)aclrtFree(hostInputs_[i]); + } else { + (void)aclrtFreeHost(hostInputs_[i]); + } + } + + for (size_t i = 0; i < numOutputs_; ++i) { + (void)aclDestroyDataBuffer(outputBuffers_[i]); + (void)aclrtFree(devOutputs_[i]); + if (g_isDevice) { + (void)aclrtFree(hostOutputs_[i]); + } else { + (void)aclrtFreeHost(hostOutputs_[i]); + } + } +} + +bool OpRunner::Init() +{ + // For each input + for (size_t i = 0; i < numInputs_; ++i) { + // Size of memory required by this input + auto size = GetInputSize(i); + void *devMem = nullptr; + // 
调用aclrtMalloc在device上申请devMem内存 + if (aclrtMalloc(&devMem, size, ACL_MEM_MALLOC_NORMAL_ONLY) != ACL_SUCCESS) { + ERROR_LOG("Malloc device memory for input[%zu] failed", i); + return false; + } + // 把devMem内存加载到devInputs_向量中 + devInputs_.emplace_back(devMem); + // 用devMem创建DataBuffer,把它加到DataBuffer的向量里 + inputBuffers_.emplace_back(aclCreateDataBuffer(devMem, size)); + + void *hostMem = nullptr; + if (g_isDevice) { + if (aclrtMalloc(&hostMem, size, ACL_MEM_MALLOC_NORMAL_ONLY) != ACL_SUCCESS) { + ERROR_LOG("Malloc device memory for input[%zu] failed", i); + return false; + } + } else { + // 调用aclrtMalloc在host上申请一块同样大小的hostMem内存 + if (aclrtMallocHost(&hostMem, size) != ACL_SUCCESS) { + ERROR_LOG("Malloc device memory for input[%zu] failed", i); + return false; + } + } + if (hostMem == nullptr) { + ERROR_LOG("Malloc memory for input[%zu] failed", i); + return false; + } + // 把hostMem内存加载到hostInputs_向量中 + hostInputs_.emplace_back(hostMem); + } + + // 遍历每一个输出 + for (size_t i = 0; i < numOutputs_; ++i) { + auto size = GetOutputSize(i); + void *devMem = nullptr; + if (aclrtMalloc(&devMem, size, ACL_MEM_MALLOC_NORMAL_ONLY) != ACL_SUCCESS) { + ERROR_LOG("Malloc device memory for output[%zu] failed", i); + return false; + } + devOutputs_.emplace_back(devMem); + outputBuffers_.emplace_back(aclCreateDataBuffer(devMem, size)); + + void *hostOutput = nullptr; + if (g_isDevice) { + if (aclrtMalloc(&hostOutput, size, ACL_MEM_MALLOC_NORMAL_ONLY) != ACL_SUCCESS) { + ERROR_LOG("Malloc device memory for output[%zu] failed", i); + return false; + } + } else { + if (aclrtMallocHost(&hostOutput, size) != ACL_SUCCESS) { + ERROR_LOG("Malloc device memory for output[%zu] failed", i); + return false; + } + } + if (hostOutput == nullptr) { + ERROR_LOG("Malloc host memory for output[%zu] failed", i); + return false; + } + hostOutputs_.emplace_back(hostOutput); + } + + return true; +} + +const size_t OpRunner::NumInputs() +{ + return numInputs_; +} + +const size_t OpRunner::NumOutputs() +{ + 
return numOutputs_; +} + +const size_t OpRunner::GetInputSize(size_t index) const +{ + if (index >= numInputs_) { + ERROR_LOG("index out of range. index = %zu, numInputs = %zu", index, numInputs_); + return 0; + } + + return aclGetTensorDescSize(opDesc_->inputDesc[index]); +} + +std::vector OpRunner::GetInputShape(size_t index) const +{ + std::vector ret; + if (index >= numInputs_) { + ERROR_LOG("index out of range. index = %zu, numInputs = %zu", index, numInputs_); + return ret; + } + + auto desc = opDesc_->inputDesc[index]; + for (size_t i = 0; i < aclGetTensorDescNumDims(desc); ++i) { + int64_t dimSize; + if (aclGetTensorDescDimV2(desc, i, &dimSize) != ACL_SUCCESS) { + ERROR_LOG("get dims from tensor desc failed. dims index = %zu", i); + ret.clear(); + return ret; + } + ret.emplace_back(dimSize); + } + + return ret; +} + +std::vector OpRunner::GetOutputShape(size_t index) const +{ + std::vector ret; + if (index >= opDesc_->outputDesc.size()) { + ERROR_LOG("index out of range. index = %zu, numOutputs = %zu", index, numOutputs_); + return ret; + } + + auto desc = opDesc_->outputDesc[index]; + for (size_t i = 0; i < aclGetTensorDescNumDims(desc); ++i) { + int64_t dimSize; + if (aclGetTensorDescDimV2(desc, i, &dimSize) != ACL_SUCCESS) { + ERROR_LOG("get dims from tensor desc failed. dims index = %zu", i); + ret.clear(); + return ret; + } + ret.emplace_back(dimSize); + } + return ret; +} + +size_t OpRunner::GetInputElementCount(size_t index) const +{ + if (index >= opDesc_->inputDesc.size()) { + ERROR_LOG("index out of range. index = %zu, numInputs = %zu", index, numInputs_); + return 0; + } + + return aclGetTensorDescElementCount(opDesc_->inputDesc[index]); +} + +size_t OpRunner::GetOutputElementCount(size_t index) const +{ + if (index >= opDesc_->outputDesc.size()) { + ERROR_LOG("index out of range. 
index = %zu, numOutputs = %zu", index, numOutputs_); + return 0; + } + + return aclGetTensorDescElementCount(opDesc_->outputDesc[index]); +} + +size_t OpRunner::GetOutputSize(size_t index) const +{ + if (index >= opDesc_->outputDesc.size()) { + ERROR_LOG("index out of range. index = %zu, numOutputs = %zu", index, numOutputs_); + return 0; + } + + return aclGetTensorDescSize(opDesc_->outputDesc[index]); +} + +bool OpRunner::RunOp() +{ + // 遍历每一个输入 + for (size_t i = 0; i < numInputs_; ++i) { + // 取输入的大小 + auto size = GetInputSize(i); + aclrtMemcpyKind kind = ACL_MEMCPY_HOST_TO_DEVICE; + if (g_isDevice) { + kind = ACL_MEMCPY_DEVICE_TO_DEVICE; + } + // 把数据从host拷贝到device上 + if (aclrtMemcpy(devInputs_[i], size, hostInputs_[i], size, kind) != ACL_SUCCESS) { + ERROR_LOG("Copy input[%zu] failed", i); + return false; + } + INFO_LOG("Copy input[%zu] success", i); + } + + // aclopExecuteV2是异步执行,需要创建一个stream + aclrtStream stream = nullptr; + if (aclrtCreateStream(&stream) != ACL_SUCCESS) { + ERROR_LOG("Create stream failed"); + return false; + } + INFO_LOG("Create stream success"); + + // 调用aclopExecuteV2,并往里面传入算子的名称,输入个数,算子的描述,算子的输入数据,输出个数,输出描述,输出数据,算子的额外属性信息,stream + auto ret = aclopExecuteV2(opDesc_->opType.c_str(), + numInputs_, + opDesc_->inputDesc.data(), + inputBuffers_.data(), + numOutputs_, + opDesc_->outputDesc.data(), + outputBuffers_.data(), + opDesc_->opAttr, + stream); + // ret返回的不是计算结果,返回的是任务有没有成功的被下发到stream里去,所以下面的操作是检查任务有没有成功下发 + if (ret == ACL_ERROR_OP_TYPE_NOT_MATCH || ret == ACL_ERROR_OP_INPUT_NOT_MATCH || + ret == ACL_ERROR_OP_OUTPUT_NOT_MATCH || ret == ACL_ERROR_OP_ATTR_NOT_MATCH) { + ERROR_LOG("[%s] op with the given description is not compiled. Please run atc first, errorCode is %d", + opDesc_->opType.c_str(), static_cast(ret)); + (void)aclrtDestroyStream(stream); + return false; + } else if (ret != ACL_SUCCESS) { + (void)aclrtDestroyStream(stream); + ERROR_LOG("Execute %s failed. 
errorCode is %d", opDesc_->opType.c_str(), static_cast(ret)); + return false; + } + INFO_LOG("Execute %s success", opDesc_->opType.c_str()); + + // 等待stream中任务执行完毕,如果任务失败销毁资源 + if (aclrtSynchronizeStream(stream) != ACL_SUCCESS) { + ERROR_LOG("Synchronize stream failed"); + (void)aclrtDestroyStream(stream); + return false; + } + INFO_LOG("Synchronize stream success"); + + // 任务成功后,遍历输出 + for (size_t i = 0; i < numOutputs_; ++i) { + auto size = GetOutputSize(i); + aclrtMemcpyKind kind = ACL_MEMCPY_DEVICE_TO_HOST; + if (g_isDevice) { + kind = ACL_MEMCPY_DEVICE_TO_DEVICE; + } + // 把输出数据从device侧回传到host侧,等待使用 + if (aclrtMemcpy(hostOutputs_[i], size, devOutputs_[i], size, kind) != ACL_SUCCESS) { + INFO_LOG("Copy output[%zu] success", i); + (void)aclrtDestroyStream(stream); + return false; + } + INFO_LOG("Copy output[%zu] success", i); + } + + // 数据回传后,销毁stream + (void)aclrtDestroyStream(stream); + return true; +} + + +template +void DoPrintData(const T *data, size_t count, size_t elementsPerRow) +{ + for (size_t i = 0; i < count; ++i) { + std::cout << std::setw(10) << data[i]; + if (elementsPerRow == 0) { + ERROR_LOG("elementsPerRow should not be 0."); + } + if (i % elementsPerRow == elementsPerRow - 1) { + std::cout << std::endl; + } + } +} + +void DoPrintFp16Data(const aclFloat16 *data, size_t count, size_t elementsPerRow) +{ + for (size_t i = 0; i < count; ++i) { + std::cout << std::setw(10) << std::setprecision(4) << aclFloat16ToFloat(data[i]); + if (elementsPerRow == 0) { + ERROR_LOG("elementsPerRow should not be 0."); + } + if (i % elementsPerRow == elementsPerRow - 1) { + std::cout << std::endl; + } + } +} + +void PrintData(const void *data, size_t count, aclDataType dataType, size_t elementsPerRow) +{ + if (data == nullptr) { + ERROR_LOG("Print data failed. 
data is nullptr"); + return; + } + + switch (dataType) { + case ACL_BOOL: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_INT8: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_UINT8: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_INT16: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_UINT16: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_INT32: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_UINT32: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_INT64: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_UINT64: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_FLOAT16: + DoPrintFp16Data(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_FLOAT: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + case ACL_DOUBLE: + DoPrintData(reinterpret_cast(data), count, elementsPerRow); + break; + default: + ERROR_LOG("Unsupported type: %d", dataType); + } +} + +void OpRunner::PrintInput(size_t index, size_t numElementsPerRow) +{ + if (index >= numInputs_) { + ERROR_LOG("index out of range. index = %zu, numOutputs = %zu", index, numInputs_); + return; + } + + auto desc = opDesc_->inputDesc[index]; + PrintData(hostInputs_[index], GetInputElementCount(index), aclGetTensorDescType(desc), numElementsPerRow); +} + +void OpRunner::PrintOutput(size_t index, size_t numElementsPerRow) +{ + if (index >= numOutputs_) { + ERROR_LOG("index out of range. 
index = %zu, numOutputs = %zu", index, numOutputs_); + return; + } + + auto desc = opDesc_->outputDesc[index]; + PrintData(hostOutputs_[index], GetOutputElementCount(index), aclGetTensorDescType(desc), numElementsPerRow); +} diff --git a/operator/op_using/aclopExecuteV2/Add/src/operator_desc.cpp b/operator/op_using/aclopExecuteV2/Add/src/operator_desc.cpp new file mode 100644 index 0000000000000000000000000000000000000000..edb87657596186085f290414bce24f755c9914a3 --- /dev/null +++ b/operator/op_using/aclopExecuteV2/Add/src/operator_desc.cpp @@ -0,0 +1,60 @@ +/** +* @file operator_desc.cpp +* +* Copyright (C) 2020. Huawei Technologies Co., Ltd. All rights reserved. +* +* This program is distributed in the hope that it will be useful, +* but WITHOUT ANY WARRANTY; without even the implied warranty of +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. +*/ +#include "common.h" +#include "operator_desc.h" + +using namespace std; + +OperatorDesc::OperatorDesc(std::string opType) : opType(std::move(opType)) +{ + opAttr = aclopCreateAttr(); +} + +OperatorDesc::~OperatorDesc() +{ + for (auto *desc : inputDesc) { + aclDestroyTensorDesc(desc); + } + + for (auto *desc : outputDesc) { + aclDestroyTensorDesc(desc); + } + + aclopDestroyAttr(opAttr); +} + +OperatorDesc &OperatorDesc::AddInputTensorDesc(aclDataType dataType, + int numDims, + const int64_t *dims, + aclFormat format) +{ + aclTensorDesc *desc = aclCreateTensorDesc(dataType, numDims, dims, format); + if (desc == nullptr) { + ERROR_LOG("create tensor failed"); + return *this; + } + inputDesc.emplace_back(desc); + return *this; +} + +OperatorDesc &OperatorDesc::AddOutputTensorDesc(aclDataType dataType, + int numDims, + const int64_t *dims, + aclFormat format) +{ + aclTensorDesc *desc = aclCreateTensorDesc(dataType, numDims, dims, format); + if (desc == nullptr) { + ERROR_LOG("create tensor failed"); + return *this; + } + + outputDesc.emplace_back(desc); + return *this; +}