From af726a697e4f42535ea2401b8aaa1812da205251 Mon Sep 17 00:00:00 2001 From: ZijunYin Date: Fri, 22 Oct 2021 17:30:59 +0800 Subject: [PATCH] update en docs. --- .../ONNX Operator List/ONNX Operator List.md | 1492 ++++++++--------- .../PyTorch API Support.md | 48 +- .../PyTorch Installation Guide.md | 76 +- ...etwork Model Porting and Training Guide.md | 824 ++++----- .../PyTorch Online Inference Guide.md | 68 +- .../PyTorch Operator Development Guide.md | 248 +-- .../PyTorch Operator Support.md | 8 +- docs/en/RELEASENOTE/RELEASENOTE.md | 80 +- 8 files changed, 1422 insertions(+), 1422 deletions(-) diff --git a/docs/en/ONNX Operator List/ONNX Operator List.md b/docs/en/ONNX Operator List/ONNX Operator List.md index aa756a18d2..4b5f3dee36 100644 --- a/docs/en/ONNX Operator List/ONNX Operator List.md +++ b/docs/en/ONNX Operator List/ONNX Operator List.md @@ -1,160 +1,160 @@ # ONNX Operator List -- [Abs](#abs.md) -- [Acos](#acos.md) -- [Acosh](#acosh.md) -- [AdaptiveAvgPool2D](#adaptiveavgpool2d.md) -- [AdaptiveMaxPool2D](#adaptivemaxpool2d.md) -- [Add](#add.md) -- [Addcmul](#addcmul.md) -- [AffineGrid](#affinegrid.md) -- [And](#and.md) -- [Argmax](#argmax.md) -- [Argmin](#argmin.md) -- [AscendRequantS16](#ascendrequants16.md) -- [AscendRequant](#ascendrequant.md) -- [AscendQuant](#ascendquant.md) -- [AscendDequantS16](#ascenddequants16.md) -- [AscendDequant](#ascenddequant.md) -- [AscendAntiQuant](#ascendantiquant.md) -- [Asin](#asin.md) -- [Asinh](#asinh.md) -- [Atan](#atan.md) -- [Atanh](#atanh.md) -- [AveragePool](#averagepool.md) -- [BatchNormalization](#batchnormalization.md) -- [BatchMatMul](#batchmatmul.md) -- [BatchMultiClassNMS](#batchmulticlassnms.md) -- [BitShift](#bitshift.md) -- [Cast](#cast.md) -- [Ceil](#ceil.md) -- [Celu](#celu.md) -- [Concat](#concat.md) -- [Clip](#clip.md) -- [ConvTranspose](#convtranspose.md) -- [Cumsum](#cumsum.md) -- [Conv](#conv.md) -- [Compress](#compress.md) -- [Constant](#constant.md) -- [ConstantOfShape](#constantofshape.md) -- [Cos](#cos.md) -- [Cosh](#cosh.md) -- [DeformableConv2D](#deformableconv2d.md) -- [Det](#det.md) -- [DepthToSpace](#depthtospace.md) -- [Div](#div.md) -- [Dropout](#dropout.md) -- [Elu](#elu.md) -- [EmbeddingBag](#embeddingbag.md) -- [Equal](#equal.md) -- [Erf](#erf.md) -- [Exp](#exp.md) -- [Expand](#expand.md) -- [EyeLike](#eyelike.md) -- [Flatten](#flatten.md) -- [Floor](#floor.md) -- [Gather](#gather.md) -- [GatherND](#gathernd.md) -- [GatherElements](#gatherelements.md) -- [Gemm](#gemm.md) -- [GlobalAveragePool](#globalaveragepool.md) -- [GlobalLpPool](#globallppool.md) -- [GlobalMaxPool](#globalmaxpool.md) -- [Greater](#greater.md) -- [GreaterOrEqual](#greaterorequal.md) -- [HardSigmoid](#hardsigmoid.md) -- [hardmax](#hardmax.md) -- [HardSwish](#hardswish.md) -- [Identity](#identity.md) -- [If](#if.md) -- [InstanceNormalization](#instancenormalization.md) -- [Less](#less.md) -- [LeakyRelu](#leakyrelu.md) -- [LessOrEqual](#lessorequal.md) -- [Log](#log.md) -- [LogSoftMax](#logsoftmax.md) -- [LpNormalization](#lpnormalization.md) -- [LpPool](#lppool.md) -- [LRN](#lrn.md) -- [LSTM](#lstm.md) -- [MatMul](#matmul.md) -- [Max](#max.md) -- [MaxPool](#maxpool.md) -- [MaxRoiPool](#maxroipool.md) -- [MaxUnpool](#maxunpool.md) -- [Mean](#mean.md) -- [MeanVarianceNormalization](#meanvariancenormalization.md) -- [Min](#min.md) -- [Mod](#mod.md) -- [Mul](#mul.md) -- [Multinomial](#multinomial.md) -- [Neg](#neg.md) -- [NonMaxSuppression](#nonmaxsuppression.md) -- [NonZero](#nonzero.md) -- [Not](#not.md) -- [OneHot](#onehot.md) -- 
[Or](#or.md)
-- [RandomNormalLike](#randomnormallike.md)
-- [RandomUniformLike](#randomuniformlike.md)
-- [RandomUniform](#randomuniform.md)
-- [Range](#range.md)
-- [Reciprocal](#reciprocal.md)
-- [ReduceL1](#reducel1.md)
-- [ReduceL2](#reducel2.md)
-- [ReduceLogSum](#reducelogsum.md)
-- [ReduceLogSumExp](#reducelogsumexp.md)
-- [ReduceMin](#reducemin.md)
-- [ReduceMean](#reducemean.md)
-- [ReduceProd](#reduceprod.md)
-- [ReduceSumSquare](#reducesumsquare.md)
-- [Resize](#resize.md)
-- [Relu](#relu.md)
-- [ReduceSum](#reducesum.md)
-- [ReduceMax](#reducemax.md)
-- [Reshape](#reshape.md)
-- [ReverseSequence](#reversesequence.md)
-- [RoiExtractor](#roiextractor.md)
-- [RoiAlign](#roialign.md)
-- [Round](#round.md)
-- [PRelu](#prelu.md)
-- [Scatter](#scatter.md)
-- [ScatterElements](#scatterelements.md)
-- [ScatterND](#scatternd.md)
-- [Shrink](#shrink.md)
-- [Selu](#selu.md)
-- [Shape](#shape.md)
-- [Sigmoid](#sigmoid.md)
-- [Slice](#slice.md)
-- [Softmax](#softmax.md)
-- [Softsign](#softsign.md)
-- [Softplus](#softplus.md)
-- [SpaceToDepth](#spacetodepth.md)
-- [Split](#split.md)
-- [Sqrt](#sqrt.md)
-- [Squeeze](#squeeze.md)
-- [Sub](#sub.md)
-- [Sign](#sign.md)
-- [Sin](#sin.md)
-- [Sinh](#sinh.md)
-- [Size](#size.md)
-- [Sum](#sum.md)
-- [Tanh](#tanh.md)
-- [TfIdfVectorizer](#tfidfvectorizer.md)
-- [Tile](#tile.md)
-- [ThresholdedRelu](#thresholdedrelu.md)
-- [TopK](#topk.md)
-- [Transpose](#transpose.md)
-- [Pad](#pad.md)
-- [Pow](#pow.md)
-- [Unsqueeze](#unsqueeze.md)
-- [Xor](#xor.md)
-- [Where](#where.md)
-

Abs

- -## Description +- [Abs](#absmd) +- [Acos](#acosmd) +- [Acosh](#acoshmd) +- [AdaptiveAvgPool2D](#adaptiveavgpool2dmd) +- [AdaptiveMaxPool2D](#adaptivemaxpool2dmd) +- [Add](#addmd) +- [Addcmul](#addcmulmd) +- [AffineGrid](#affinegridmd) +- [And](#andmd) +- [Argmax](#argmaxmd) +- [Argmin](#argminmd) +- [AscendRequantS16](#ascendrequants16md) +- [AscendRequant](#ascendrequantmd) +- [AscendQuant](#ascendquantmd) +- [AscendDequantS16](#ascenddequants16md) +- [AscendDequant](#ascenddequantmd) +- [AscendAntiQuant](#ascendantiquantmd) +- [Asin](#asinmd) +- [Asinh](#asinhmd) +- [Atan](#atanmd) +- [Atanh](#atanhmd) +- [AveragePool](#averagepoolmd) +- [BatchNormalization](#batchnormalizationmd) +- [BatchMatMul](#batchmatmulmd) +- [BatchMultiClassNMS](#batchmulticlassnmsmd) +- [BitShift](#bitshiftmd) +- [Cast](#castmd) +- [Ceil](#ceilmd) +- [Celu](#celumd) +- [Concat](#concatmd) +- [Clip](#clipmd) +- [ConvTranspose](#convtransposemd) +- [Cumsum](#cumsummd) +- [Conv](#convmd) +- [Compress](#compressmd) +- [Constant](#constantmd) +- [ConstantOfShape](#constantofshapemd) +- [Cos](#cosmd) +- [Cosh](#coshmd) +- [DeformableConv2D](#deformableconv2dmd) +- [Det](#detmd) +- [DepthToSpace](#depthtospacemd) +- [Div](#divmd) +- [Dropout](#dropoutmd) +- [Elu](#elumd) +- [EmbeddingBag](#embeddingbagmd) +- [Equal](#equalmd) +- [Erf](#erfmd) +- [Exp](#expmd) +- [Expand](#expandmd) +- [EyeLike](#eyelikemd) +- [Flatten](#flattenmd) +- [Floor](#floormd) +- [Gather](#gathermd) +- [GatherND](#gatherndmd) +- [GatherElements](#gatherelementsmd) +- [Gemm](#gemmmd) +- [GlobalAveragePool](#globalaveragepoolmd) +- [GlobalLpPool](#globallppoolmd) +- [GlobalMaxPool](#globalmaxpoolmd) +- [Greater](#greatermd) +- [GreaterOrEqual](#greaterorequalmd) +- [HardSigmoid](#hardsigmoidmd) +- [hardmax](#hardmaxmd) +- [HardSwish](#hardswishmd) +- [Identity](#identitymd) +- [If](#ifmd) +- [InstanceNormalization](#instancenormalizationmd) +- [Less](#lessmd) +- [LeakyRelu](#leakyrelumd) +- [LessOrEqual](#lessorequalmd) +- [Log](#logmd) +- [LogSoftMax](#logsoftmaxmd) +- [LpNormalization](#lpnormalizationmd) +- [LpPool](#lppoolmd) +- [LRN](#lrnmd) +- [LSTM](#lstmmd) +- [MatMul](#matmulmd) +- [Max](#maxmd) +- [MaxPool](#maxpoolmd) +- [MaxRoiPool](#maxroipoolmd) +- [MaxUnpool](#maxunpoolmd) +- [Mean](#meanmd) +- [MeanVarianceNormalization](#meanvariancenormalizationmd) +- [Min](#minmd) +- [Mod](#modmd) +- [Mul](#mulmd) +- [Multinomial](#multinomialmd) +- [Neg](#negmd) +- [NonMaxSuppression](#nonmaxsuppressionmd) +- [NonZero](#nonzeromd) +- [Not](#notmd) +- [OneHot](#onehotmd) +- [Or](#ormd) +- [RandomNormalLike](#randomnormallikemd) +- [RandomUniformLike](#randomuniformlikemd) +- [RandomUniform](#randomuniformmd) +- [Range](#rangemd) +- [Reciprocal](#reciprocalmd) +- [ReduceL1](#reducel1md) +- [ReduceL2](#reducel2md) +- [ReduceLogSum](#reducelogsummd) +- [ReduceLogSumExp](#reducelogsumexpmd) +- [ReduceMin](#reduceminmd) +- [ReduceMean](#reducemeanmd) +- [ReduceProd](#reduceprodmd) +- [ReduceSumSquare](#reducesumsquaremd) +- [Resize](#resizemd) +- [Relu](#relumd) +- [ReduceSum](#reducesummd) +- [ReduceMax](#reducemaxmd) +- [Reshape](#reshapemd) +- [ReverseSequence](#reversesequencemd) +- [RoiExtractor](#roiextractormd) +- [RoiAlign](#roialignmd) +- [Round](#roundmd) +- [PRelu](#prelumd) +- [Scatter](#scattermd) +- [ScatterElements](#scatterelementsmd) +- [ScatterND](#scatterndmd) +- [Shrink](#shrinkmd) +- [Selu](#selumd) +- [Shape](#shapemd) +- [Sigmoid](#sigmoidmd) +- [Slice](#slicemd) +- [Softmax](#softmaxmd) +- [Softsign](#softsignmd) +- 
[Softplus](#softplusmd)
+- [SpaceToDepth](#spacetodepthmd)
+- [Split](#splitmd)
+- [Sqrt](#sqrtmd)
+- [Squeeze](#squeezemd)
+- [Sub](#submd)
+- [Sign](#signmd)
+- [Sin](#sinmd)
+- [Sinh](#sinhmd)
+- [Size](#sizemd)
+- [Sum](#summd)
+- [Tanh](#tanhmd)
+- [TfIdfVectorizer](#tfidfvectorizermd)
+- [Tile](#tilemd)
+- [ThresholdedRelu](#thresholdedrelumd)
+- [TopK](#topkmd)
+- [Transpose](#transposemd)
+- [Pad](#padmd)
+- [Pow](#powmd)
+- [Unsqueeze](#unsqueezemd)
+- [Xor](#xormd)
+- [Where](#wheremd)
+

Abs

+ +### Description Computes the absolute value of a tensor. -## Parameters +### Parameters \[Inputs\] @@ -168,17 +168,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Acos

+

Acos

-## Description +### Description Computes acos of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -192,17 +192,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Acosh

+

Acosh

-## Description +### Description Computes inverse hyperbolic cosine of x element-wise. -## Parameters +### Parameters \[Inputs\] @@ -216,17 +216,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

AdaptiveAvgPool2D

+

AdaptiveAvgPool2D

-## Description +### Description Applies a 2D adaptive avg pooling over the input. -## Parameters +### Parameters \[Inputs\] @@ -246,17 +246,17 @@ One output y: tensor of the identical data type as x. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AdaptiveMaxPool2D

+

AdaptiveMaxPool2D

-## Description +### Description Applies a 2D adaptive max pooling over the input. -## Parameters +### Parameters \[Inputs\] @@ -278,17 +278,17 @@ y: tensor of the identical data type as x. argmax: tensor of type int32 or int64. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

Add

+

Add

-## Description +### Description Adds inputs element-wise. -## Parameters +### Parameters \[Inputs\] @@ -302,17 +302,17 @@ B: tensor of the identical data type as A. C: tensor of the identical data type as A. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Addcmul

+

Addcmul

-## Description +### Description Performs element-wise computation: \(x1 \* x2\) \* value + input\_data -## Parameters +### Parameters \[Inputs\] @@ -332,17 +332,17 @@ One output y: tensor of the identical data type as the inputs. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AffineGrid

+

AffineGrid

-## Description +### Description Generates a sampling grid with given matrices. -## Parameters +### Parameters \[Inputs\] @@ -364,17 +364,17 @@ One output y: tensor of type int. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

And

+

-## Description
+### Description

Returns the tensor resulting from performing the logical AND operation element-wise on the input tensors.

-## Parameters
+### Parameters

\[Inputs\]

Two inputs

x1: tensor of type bool.

x2: tensor of type bool.

\[Outputs\]

One output

y: tensor of the identical data type and shape as input x.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-

Argmax

+

Argmax

-## Description +### Description Returns the indices of the maximum elements along the provided axis. -## Parameters +### Parameters \[Inputs\] @@ -424,17 +424,17 @@ keep\_dim: \(optional\) either 1 \(default\) or 0. The operator does not support inputs of type float32 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Argmin

+

Argmin

-## Description +### Description Returns the indices of the minimum values along an axis. -## Parameters +### Parameters \[Inputs\] @@ -456,17 +456,17 @@ axis: int. Must be in the range \[–r, r – 1\], where r indicates the rank of The operator does not support inputs of type float32 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

AscendRequantS16

+

AscendRequantS16

-## Description +### Description Performs requantization. -## Parameters +### Parameters \[Inputs\] @@ -494,17 +494,17 @@ y0: tensor of type int8. y1: tensor of type int16. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AscendRequant

+

AscendRequant

-## Description +### Description Performs requantization. -## Parameters +### Parameters \[Inputs\] @@ -526,17 +526,17 @@ One output y: tensor of type int8. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AscendQuant

+

AscendQuant

-## Description +### Description Performs quantization. -## Parameters +### Parameters \[Inputs\] @@ -562,17 +562,17 @@ One output y: tensor of type int8. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AscendDequantS16

+

AscendDequantS16

-## Description +### Description Performs dequantization. -## Parameters +### Parameters \[Inputs\] @@ -596,17 +596,17 @@ One output y: tensor of type int16. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AscendDequant

+

AscendDequant

-## Description +### Description Performs dequantization. -## Parameters +### Parameters \[Inputs\] @@ -630,17 +630,17 @@ One output y: tensor of type float16 or float. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

AscendAntiQuant

+

AscendAntiQuant

-## Description +### Description Performs dequantization. -## Parameters +### Parameters \[Inputs\] @@ -664,17 +664,17 @@ One output y: tensor of type float16 or float. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

Asin

+

-## Description
+### Description

Computes the trigonometric inverse sine of the input element-wise.

-## Parameters
+### Parameters

\[Inputs\]

One input

x: tensor of type float16, float32, or double.

\[Outputs\]

One output

y: tensor. Has the identical data type and shape as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-

Asinh

+

Asinh

-## Description +### Description Computes inverse hyperbolic sine of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -710,17 +710,17 @@ x: tensor of type float16, float32, or double. y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Atan

+

-## Description
+### Description

Computes the trigonometric inverse tangent of the input element-wise.

-## Parameters
+### Parameters

\[Inputs\]

One input

x: tensor of type float16, float32, or double.

\[Outputs\]

One output

y: tensor. Has the identical data type and shape as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-

Atanh

+

Atanh

-## Description +### Description Computes inverse hyperbolic tangent of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -758,17 +758,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

AveragePool

+

AveragePool

-## Description +### Description Performs average pooling. -## Parameters +### Parameters \[Inputs\] @@ -824,17 +824,17 @@ The operator does not support inputs of type float32 when the atc command-line o Beware that both the SAME\_UPPER and SAME\_LOWER values of auto\_pad are functionally the same as the SAME argument of built-in TBE operators. The attribute configuration may lead to accuracy drop as the SAME argument is position-insensitive. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

BatchNormalization

+

BatchNormalization

-## Description +### Description Normalizes the inputs. -## Parameters +### Parameters \[Inputs\] @@ -870,17 +870,17 @@ epsilon: \(optional\) float32, added to var to avoid dividing by zero. Defaults momentum: float32, not supported currently. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

BatchMatMul

+

BatchMatMul

-## Description +### Description Multiplies slices of two tensors in batches. -## Parameters +### Parameters \[Inputs\] @@ -904,17 +904,17 @@ One output y: tensor of type float16, float, or int32. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

BatchMultiClassNMS

+

BatchMultiClassNMS

-## Description +### Description Applies non-maximum suppression \(NMS\) on input boxes and input scores. -## Parameters +### Parameters \[Inputs\] @@ -956,17 +956,17 @@ nmsed\_classes: tensor of type float16 nmsed\_num: tensor of type float16 -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

BitShift

+

-## Description
+### Description

Performs element-wise bit shifting.

-## Parameters
+### Parameters

\[Inputs\]

Two inputs

x: tensor of type uint8, uint16, uint32, or uint64.

y: tensor of the identical data type as x.

\[Outputs\]

One output

z: tensor of the identical data type as x.

\[Attributes\]

direction: \(required\) string, indicating the direction of moving bits. Either RIGHT or LEFT.

\[Restrictions\]

When direction="LEFT", the inputs must not be of type UINT16, UINT32, or UINT64.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v11/v12/v13

-

Cast

+

Cast

-## Description +### Description Casts a tensor to a new type. -## Parameters +### Parameters \[Inputs\] @@ -1014,17 +1014,17 @@ y: tensor of the data type specified by the attribute. Must be one of the follow to: \(required\) int, the destination type. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Ceil

+

Ceil

-## Description +### Description Returns the ceiling of the input, element-wise. -## Parameters +### Parameters \[Inputs\] @@ -1038,19 +1038,19 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Celu

+

Celu

-## Description +### Description Continuously Differentiable Exponential Linear Units \(CELUs\): performs the linear unit element-wise on the input tensor X using formula: max\(0,x\) + min\(0,alpha \* \(exp\(x/alpha\) – 1\)\) -## Parameters +### Parameters \[Inputs\] @@ -1064,17 +1064,17 @@ Y: tensor of type float. alpha: float. Defaults to 1.0. -## ONNX Opset Support +### ONNX Opset Support Opset v12/v13 -

Concat

+

Concat

-## Description +### Description Concatenates multiple inputs. -## Parameters +### Parameters \[Inputs\] @@ -1088,17 +1088,17 @@ concat\_result: tensor of the identical data type as inputs. axis: the axis along which to concatenate — may be negative to index from the end. Must be in the range \[–r, r – 1\], where, r = rank\(inputs\). -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Clip

+

Clip

-## Description +### Description Clips tensor values to a specified min and max. -## Parameters +### Parameters \[Inputs\] @@ -1116,17 +1116,17 @@ One output Y: output tensor with clipped input elements. Has the identical shape and data type as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ConvTranspose

+

ConvTranspose

-## Description +### Description Computes transposed convolution. -## Parameters +### Parameters \[Inputs\] @@ -1174,17 +1174,17 @@ The operator does not support inputs of type float32 or float64 when the atc com The auto\_pad attribute must not be SAME\_UPPER or SAME\_LOWER. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Cumsum

+

Cumsum

-## Description +### Description Performs cumulative sum of the input elements along the given axis. -## Parameters +### Parameters \[Inputs\] @@ -1206,17 +1206,17 @@ exclusive: int. Whether to return exclusive sum in which the top element is not reverse: int. Whether to perform the sums in reverse direction. Defaults to 0. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Conv

+

Conv

-## Description +### Description Computes convolution. -## Parameters +### Parameters \[Inputs\] @@ -1254,17 +1254,17 @@ The operator is not supported if the output Y meets: W = 1, H ! = 1 The operator does not support inputs of type float32 or float64 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Compress

+

Compress

-## Description +### Description Slices data based on the specified axis. -## Parameters +### Parameters \[Inputs\] @@ -1284,17 +1284,17 @@ output: tensor of the same type as the input \(Optional\) axis: int, axis for slicing. If no axis is specified, the input tensor is flattened before slicing. The value range is \[-r, r-1\]. **r** indicates the dimensions of the input tensor. -## ONNX Opset Support +### ONNX Opset Support Opset v9//v11/v12/v13 -

Constant

+

Constant

-## Description +### Description Creates a constant tensor. -## Parameters +### Parameters \[Inputs\] @@ -1314,17 +1314,17 @@ value: the value for the elements of the output tensor. sparse\_value: not supported -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ConstantOfShape

+

ConstantOfShape

-## Description +### Description Generates a tensor with given value and shape. -## Parameters +### Parameters \[Inputs\] @@ -1342,17 +1342,17 @@ value: the value and data type of the output elements. x: 1 <= len\(shape\) <= 8 -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Cos

+

Cos

-## Description +### Description Computes cos of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -1366,17 +1366,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Cosh

+

Cosh

-## Description +### Description Computes hyperbolic cosine of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -1390,17 +1390,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

DeformableConv2D

+

DeformableConv2D

-## Description +### Description Deformable convolution -## Parameters +### Parameters \[Inputs\] @@ -1442,17 +1442,17 @@ For the weight tensor, expected range of both the W and H dimensions are \[1, 63 The operator does not support inputs of type float32 or float64 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

Det

+

Det

-## Description +### Description Calculates determinant of a square matrix or batches of square matrices. -## Parameters +### Parameters \[Inputs\] @@ -1466,17 +1466,17 @@ One output y: tensor of the identical data type and shape as input x. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

DepthToSpace

+

DepthToSpace

-## Description +### Description Rearranges \(permutes\) data from depth into blocks of spatial data. -## Parameters +### Parameters \[Inputs\] @@ -1496,17 +1496,17 @@ blocksize: \(required\) int, blocks to be moved. mode: string, either DCR \(default\) for depth-column-row order re-arrangement or CRD for column-row-depth order arrangement. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Div

+

Div

-## Description +### Description Performs element-wise division. -## Parameters +### Parameters \[Inputs\] @@ -1526,17 +1526,17 @@ y: tensor of the identical data type as the inputs. The output has the identical data type as the inputs. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Dropout

+

Dropout

-## Description +### Description Copies or masks the input tensor. -## Parameters +### Parameters \[Inputs\] @@ -1556,17 +1556,17 @@ output: tensor mask: tensor -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Elu

+

Elu

-## Description +### Description Computes the exponential linear function. -## Parameters +### Parameters \[Inputs\] @@ -1584,17 +1584,17 @@ y: tensor of the same data type and shape as input x. alpha: float, indicating the coefficient. Defaults to 1.0. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

EmbeddingBag

+

EmbeddingBag

-## Description +### Description Computes sums, means, or maxes of bags of embeddings. -## Parameters +### Parameters \[Inputs\] @@ -1626,17 +1626,17 @@ One output y: tensor of type float32. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

Equal

+

Equal

-## Description +### Description Returns the truth value of \(X1 == X2\) element-wise. -## Parameters +### Parameters \[Inputs\] @@ -1656,17 +1656,17 @@ y: tensor of type bool. X1 and X2 have the same format and data type. The following data types are supported: bool, uint8, int8, int16, int32, int64, float16, float32, and double. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Erf

+

Erf

-## Description +### Description Computes the Gauss error function of x element-wise. -## Parameters +### Parameters \[Inputs\] @@ -1680,17 +1680,17 @@ One output y: tensor. Has the identical data type and format as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Exp

+

Exp

-## Description +### Description Computes exponential of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -1704,17 +1704,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Expand

+

Expand

-## Description +### Description Broadcasts the input tensor following the given shape and the broadcast rule. -## Parameters +### Parameters \[Inputs\] @@ -1734,17 +1734,17 @@ y: tensor of the identical data type and shape as input x. The model's inputs need to be changed from placeholders to constants. You can use ONNX Simplifier to simplify your model. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

EyeLike

+

EyeLike

-## Description +### Description Generate a 2D tensor \(matrix\) with ones on the diagonal and zeros everywhere else. -## Parameters +### Parameters \[Inputs\] @@ -1768,17 +1768,17 @@ k: int, specifying the index of the diagonal to be populated with ones. Defaults k must be 0. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Flatten

+

Flatten

-## Description +### Description Flattens the input. -## Parameters +### Parameters \[Inputs\] @@ -1792,17 +1792,17 @@ input: ND tensor. Must be one of the following data types: int8, uint8, int16, u axis: int. Must be positive. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Floor

+

Floor

-## Description +### Description Returns element-wise largest integer not greater than x. -## Parameters +### Parameters \[Inputs\] @@ -1816,17 +1816,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Gather

+

Gather

-## Description +### Description Gathers slices from the input according to indices. -## Parameters +### Parameters \[Inputs\] @@ -1850,17 +1850,17 @@ axis: int, the axis in x1 to gather indices from. Must be in the range \[–r, r indices must not be negative. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

GatherND

+

GatherND

-## Description +### Description Gathers slices of data into an output tensor. -## Parameters +### Parameters \[Inputs\] @@ -1884,17 +1884,17 @@ batch\_dims: int, the number of batch dimensions. Defaults to 0. The operator does not support inputs of type double when the atc command-line option --precision\_mode is set to must\_keep\_origin\_dtype. -## ONNX Opset Support +### ONNX Opset Support Opset v11/v12/v13 -

GatherElements

+

GatherElements

-## Description +### Description Produces an output by indexing into the input tensor at index positions. -## Parameters +### Parameters \[Inputs\] @@ -1914,17 +1914,17 @@ output: tensor with the same shape as indices. axis: int, the axis to gather on. Defaults to 0. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Gemm

+

Gemm

-## Description +### Description General matrix multiplication -## Parameters +### Parameters \[Inputs\] @@ -1952,17 +1952,17 @@ beta: float, not supported currently. Opset V8, V9, and V10 versions do not support inputs of type float32 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

GlobalAveragePool

+

GlobalAveragePool

-## Description +### Description Performs global average pooling. -## Parameters +### Parameters \[Inputs\] @@ -1972,17 +1972,17 @@ X: tensor of type float16 or float32, in NCHW format. Y: pooled tensor in NCHW format. Has the same data type as X. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

GlobalLpPool

+

GlobalLpPool

-## Description +### Description Performs global norm pooling. -## Parameters +### Parameters \[Inputs\] @@ -1998,17 +1998,17 @@ One output y: tensor of the same data type as input x. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

GlobalMaxPool

+

GlobalMaxPool

-## Description +### Description Performs global max pooling. -## Parameters +### Parameters \[Inputs\] @@ -2022,17 +2022,17 @@ One output output: pooled tensor -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Greater

+

Greater

-## Description +### Description Returns the truth value of \(x1 \> x2\) element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2048,17 +2048,17 @@ One output y: tensor of type bool. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

GreaterOrEqual

+

GreaterOrEqual

-## Description +### Description Returns the truth value of \(x1 \>= x2\) element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2074,17 +2074,17 @@ One output y: tensor of type bool. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v12 -

HardSigmoid

+

HardSigmoid

-## Description +### Description Takes one input data \(tensor\) and produces one output data \(tensor\) where the HardSigmoid function, y = max\(0, min\(1, alpha \* x + beta\)\), is applied to the tensor element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2104,17 +2104,17 @@ alpha: float. Defaults to 0.2. beta: float. Defaults to 0.2. -## ONNX Opset Support +### ONNX Opset Support Opset v1/v6/v8/v9/v10/v11/v12/v13 -

hardmax

+

hardmax

-## Description +### Description Computes the hardmax values for the given input: Hardmax\(element in input, axis\) = 1 if the element is the first maximum value along the specified axis, 0 otherwise. -## Parameters +### Parameters \[Inputs\] @@ -2136,17 +2136,17 @@ axis: int. The dimension Hardmax will be performed on. Defaults to –1. In the atc command line, the --precision\_mode option must be set to allow\_fp32\_to\_fp16. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

HardSwish

+

-## Description
+### Description

Applies the HardSwish function: **y = x \* max\(0, min\(1, alpha \* x + beta\)\)**, where **alpha** is **1/6** and **beta** is **0.5**.

-## Parameters
+### Parameters

\[Inputs\]

One input

x: tensor of type float16 or float32.

\[Outputs\]

One output

y: tensor of type float16 or float32.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v14

-

Identity

+

Identity

-## Description +### Description Identity operator -## Parameters +### Parameters \[Inputs\] @@ -2184,17 +2184,17 @@ One output y: tensor of the identical data type and shape as input x. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

If

+

If

-## Description +### Description If conditional -## Parameters +### Parameters \[Inputs\] @@ -2214,17 +2214,17 @@ One or more outputs y: tensor or list of tensors -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

InstanceNormalization

+

InstanceNormalization

-## Description +### Description Computes a tensor by using the formula: y = scale \* \(x – mean\) / sqrt\(variance + epsilon\) + B, where mean and variance are computed per instance per channel. -## Parameters +### Parameters \[Inputs\] @@ -2246,17 +2246,17 @@ y: tensor of the identical data type and shape as input x. epsilon: float. The epsilon value to use to avoid division by zero. Defaults to 1e – 05. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Less

+

Less

-## Description +### Description Returns the truth value of \(x1 < x2\) element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2272,17 +2272,17 @@ One output y: tensor of type bool. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

LeakyRelu

+

LeakyRelu

-## Description +### Description Computes the Leaky ReLU activation function. -## Parameters +### Parameters \[Inputs\] @@ -2300,17 +2300,17 @@ y: tensor. Has the identical data type and shape as the input. alpha: float, the leakage coefficient. Defaults to 0.01. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

LessOrEqual

+

LessOrEqual

-## Description +### Description Returns the truth value of \(x <= y\) element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2326,17 +2326,17 @@ One output y: tensor of type bool, with the same shape as the input x. -## ONNX Opset Support +### ONNX Opset Support Opset v12/v13 -

Log

+

Log

-## Description +### Description Computes natural logarithm of x element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2350,17 +2350,17 @@ One output y: tensor of the identical data type as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

LogSoftMax

+

LogSoftMax

-## Description +### Description Computes log softmax activations. -## Parameters +### Parameters \[Inputs\] @@ -2378,17 +2378,17 @@ y: tensor. Has the identical data type and shape as the input. axis: int. Must be in the range \[–r, r – 1\], where r indicates the rank of the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

LpNormalization

+

LpNormalization

-## Description +### Description Given a matrix, applies Lp-normalization along the provided axis. -## Parameters +### Parameters \[Inputs\] @@ -2412,17 +2412,17 @@ p: int. Defaults to **2**. Beware that both the **SAME\_UPPER** and **SAME\_LOWER** values of auto\_pad are functionally the same as the SAME argument of built-in TBE operators. The attribute configuration may lead to an accuracy drop as the SAME argument is position-insensitive. -## ONNX Opset Support +### ONNX Opset Support Opset v1/v8/v9/v10/v11/v12/v13 -

LpPool

+

LpPool

-## Description +### Description Performs Lp norm pooling. -## Parameters +### Parameters \[Inputs\] @@ -2448,17 +2448,17 @@ pads: int list. strides: int list. -## ONNX Opset Support +### ONNX Opset Support Opset v11/v12/v13 -

LRN

+

LRN

-## Description +### Description Performs local response normalization. -## Parameters +### Parameters \[Inputs\] @@ -2482,17 +2482,17 @@ bias: float. size: int, the number of channels to sum over. Must be odd. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

LSTM

+

LSTM

-## Description +### Description Computes a one-layer LSTM. This operator is usually supported via some custom implementation such as CuDNN. -## Parameters +### Parameters \[3–8 Inputs\] @@ -2538,17 +2538,17 @@ input\_forget: int. Defaults to 0. layout: int. Defaults to 0. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

MatMul

+

MatMul

-## Description +### Description Multiplies two matrices. -## Parameters +### Parameters \[Inputs\] @@ -2568,17 +2568,17 @@ y: 2D tensor of type float16. Only 1D to 6D inputs are supported. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Max

+

Max

-## Description +### Description Computes element-wise max of each of the input tensors. -## Parameters +### Parameters \[Inputs\] @@ -2592,17 +2592,17 @@ One output max: tensor with the same type and shape as the input x \(broadcast shape\) -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

MaxPool

+

MaxPool

-## Description +### Description Performs max pooling. -## Parameters +### Parameters \[Inputs\] @@ -2655,17 +2655,17 @@ The operator does not support inputs of type float32 when the atc command-line o pads and auto\_pad are mutually exclusive. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

MaxRoiPool

+

MaxRoiPool

-## Description +### Description Consumes an input tensor X and region of interests \(RoIs\) to apply max pooling across each RoI, to produce output 4-D tensor of shape \(num\_rois, channels, pooled\_shape\[0\], pooled\_shape\[1\]\). -## Parameters +### Parameters \[Inputs\] @@ -2687,17 +2687,17 @@ spatial\_scale: float. Defaults to 1.0. The operator does not support inputs of type float32 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/13 -

MaxUnpool

+

MaxUnpool

-## Description +### Description Indicates the reverse of the MaxPool operation. -## Parameters +### Parameters \[Inputs\] @@ -2719,17 +2719,17 @@ pads: int list, pad on each axis. strides: int list, stride on each axis. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v11/v12/v13 -

Mean

+

Mean

-## Description +### Description Computes element-wise mean of each of the input tensors \(with NumPy-style broadcasting support\). All inputs and outputs must have the same data type. This operator supports multi-directional \(NumPy-style\) broadcasting. -## Parameters +### Parameters \[Inputs\] One or more inputs \(1–∞\) @@ -2739,17 +2739,17 @@ data\_0: tensor of type float16, float, double, or bfloat16. mean: tensor of type float16, float, double, or bfloat16. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

MeanVarianceNormalization

+

MeanVarianceNormalization

-## Description +### Description Performs mean variance normalization on the input tensor X using formula: \(X – EX\)/sqrt\(E\(X – EX\)^2\) -## Parameters +### Parameters \[Inputs\] @@ -2763,17 +2763,17 @@ Y: tensor of type float16, float, or bfloat16. axes: list of ints. Defaults to \['0', '2', '3'\]. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Min

+

Min

-## Description +### Description Returns the minimum of the input tensors. -## Parameters +### Parameters \[Inputs\] @@ -2787,17 +2787,17 @@ One output y: output tensor -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Mod

+

Mod

-## Description +### Description Performs element-wise binary modulus \(with NumPy-style broadcasting support\). The sign of the remainder is the same as that of the divisor. -## Parameters +### Parameters \[Inputs\] @@ -2817,17 +2817,17 @@ fmod: int. Defaults to 0. fmod must not be 0 if the inputs are of type float. -## ONNX Opset Support +### ONNX Opset Support Opset v10/v11/v12/v13 -

Mul

+

-## Description
+### Description

Performs element-wise multiplication of two tensors.

-## Parameters
+### Parameters

\[Inputs\]

Two inputs

A: tensor of type float16, float32, uint8, int8, int16, or int32.

B: tensor of type float16, float32, uint8, int8, int16, or int32.

\[Outputs\]

One output

C: tensor of the identical data type as the input tensor.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-

Multinomial

+

Multinomial

-## Description +### Description Generates a tensor of samples from a multinomial distribution according to the probabilities of each of the possible outcomes. -## Parameters +### Parameters \[Inputs\] @@ -2871,17 +2871,17 @@ sample\_size: int. Number of times to sample. Defaults to 1. seed: float. Seed to the random generator. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Neg

+

Neg

-## Description +### Description Computes numerical negative value element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2895,17 +2895,17 @@ One output y: tensor of the identical data type as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

NonMaxSuppression

+

NonMaxSuppression

-## Description +### Description Filters out boxes that have high intersection-over-union \(IOU\) overlap with previously selected boxes. Bounding boxes with score less than score\_threshold are removed. Bounding box format is indicated by the center\_point\_box attribute. Note that this algorithm is agnostic to where the origin is in the coordinate system and more generally is invariant to orthogonal transformations and translations of the coordinate system; thus translating or reflections of the coordinate system result in the same boxes being selected by the algorithm. The selected\_indices output is a set of integers indexing into the input collection of bounding boxes representing the selected boxes. The bounding box coordinates corresponding to the selected indices can then be obtained using the Gather or GatherND operation. -## Parameters +### Parameters \[2–5 Inputs\] @@ -2927,17 +2927,17 @@ selected\_indices: tensor of type int64 center\_point\_box: int. Defaults to 0. -## ONNX Opset Support +### ONNX Opset Support Opset v10/v11/v12/v13 -

NonZero

+

NonZero

-## Description +### Description Returns the indices of the elements that are non-zero \(in row-major order\). -## Parameters +### Parameters \[Inputs\] @@ -2951,17 +2951,17 @@ One output y: tensor of type int64. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Not

+

Not

-## Description +### Description Returns the negation of the input tensor element-wise. -## Parameters +### Parameters \[Inputs\] @@ -2975,17 +2975,17 @@ One output y: tensor of type bool. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

OneHot

+

OneHot

-## Description +### Description Produces a one-hot tensor based on inputs. -## Parameters +### Parameters \[Inputs\] @@ -3013,17 +3013,17 @@ y: tensor of the identical data type as the values input. axis must not be less than –1. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10/v11/v12/v13 -

Or

+

-## Description
+### Description

Returns the tensor resulting from performing the logical OR operation element-wise on the input tensors.

-## Parameters
+### Parameters

\[Inputs\]

Two inputs

x1: tensor of type bool.

x2: tensor of type bool.

\[Outputs\]

One output

y: tensor of type bool.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-

RandomNormalLike

+

RandomNormalLike

-## Description +### Description Generates a tensor with random values drawn from a normal distribution. The shape of the output tensor is copied from the shape of the input tensor. -## Parameters +### Parameters \[Inputs\] @@ -3073,17 +3073,17 @@ scale: float. The standard deviation of the normal distribution. Defaults to 1.0 seed: float. Seed to the random generator. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

RandomUniformLike

+

RandomUniformLike

-## Description +### Description Generates a tensor with random values drawn from a uniform distribution. The shape of the output tensor is copied from the shape of the input tensor. -## Parameters +### Parameters \[Inputs\] @@ -3107,17 +3107,17 @@ low: float. Lower boundary of the uniform distribution. Defaults to 0.0. seed: float. Seed to the random generator. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

RandomUniform

+

RandomUniform

-## Description +### Description Generates a tensor with random values drawn from a uniform distribution. -## Parameters +### Parameters \[Attributes\] @@ -3139,17 +3139,17 @@ One output y: tensor of the data type specified by the dtype attribute. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Range

+

Range

-## Description +### Description Generate a tensor containing a sequence of numbers. -## Parameters +### Parameters \[Inputs\] @@ -3167,17 +3167,17 @@ One output y: tensor of the identical data type as input x. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Reciprocal

+

Reciprocal

-## Description +### Description Computes the reciprocal of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -3191,17 +3191,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceL1

+

ReduceL1

-## Description +### Description Computes the L1 norm of the input tensor's elements along the provided axes. The resulted tensor has the same rank as the input if keepdim is set to 1. If keepdim is set to 0, then the result tensor has the reduced dimension pruned. The above behavior is similar to NumPy, with the exception that NumPy defaults keepdim to False instead of True. -## Parameters +### Parameters \[Inputs\] @@ -3217,17 +3217,17 @@ axes: list of ints. keepdims: int. Defaults to 1. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceL2

+

ReduceL2

-## Description +### Description Computes the L2 norm of the input tensor's elements along the provided axes. The resulted tensor has the same rank as the input if keepdim is set to 1. If keepdim is set to 0, then the result tensor has the reduced dimension pruned. The above behavior is similar to NumPy, with the exception that NumPy defaults keepdim to False instead of True. -## Parameters +### Parameters \[Inputs\] @@ -3243,17 +3243,17 @@ axes: list of ints. keepdims: int. Defaults to 1. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceLogSum

+

ReduceLogSum

-## Description +### Description Computes the sum of elements across dimensions of a tensor in log representations. -## Parameters +### Parameters \[Inputs\] @@ -3273,17 +3273,17 @@ axes: int list. Must be in the range \[–r, r – 1\], where **r** indicates keepdims: int. Defaults to **1**, meaning that the reduced dimensions with length 1 are retained. -## ONNX Opset Support +### ONNX Opset Support Opset v11/v13 -

ReduceLogSumExp

+

ReduceLogSumExp

-## Description +### Description Reduces a dimension of a tensor by calculating exponential for all elements in the dimension and calculates logarithm of the sum. -## Parameters +### Parameters \[Inputs\] @@ -3303,17 +3303,17 @@ axes: tensor of type int32 or int64. Must be in the range \[–r, r – 1\], whe keepdims: int, indicating whether to reduce the dimensions. The default value is **1**, indicating that the dimensions are reduced. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceMin

+

ReduceMin

-## Description +### Description Computes the minimum of elements across dimensions of a tensor. -## Parameters +### Parameters \[Inputs\] @@ -3333,17 +3333,17 @@ axes: int list. Must be in the range \[–r, r – 1\], where **r** indicates keepdims: int. Defaults to 1, meaning that the reduced dimensions with length 1 are retained. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceMean

+

ReduceMean

-## Description +### Description Computes the mean of elements across dimensions of a tensor. -## Parameters +### Parameters \[Inputs\] @@ -3363,17 +3363,17 @@ axes: 1D list of ints, the dimensions to reduce. Must be in the range \[–r, r keepdims: int. Defaults to 1, meaning that the reduced dimensions with length 1 are retained. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceProd

+

ReduceProd

-## Description +### Description Computes the product of the input tensor's elements along the provided axes. The resulted tensor has the same rank as the input if keepdim is set to 1. If keepdim is set to 0, then the result tensor has the reduced dimension pruned. -## Parameters +### Parameters \[Inputs\] @@ -3389,17 +3389,17 @@ axes: list of ints. keepdims: int. Defaults to 1. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceSumSquare

+

ReduceSumSquare

-## Description +### Description Computes the sum square of the input tensor's elements along the provided axes. The resulted tensor has the same rank as the input if keepdim is set to 1. If keepdim is set to 0, then the result tensor has the reduced dimension pruned. The above behavior is similar to NumPy, with the exception that NumPy defaults keepdim to False instead of True. -## Parameters +### Parameters \[Inputs\] @@ -3415,17 +3415,17 @@ axes: list of ints. keepdims: int. Defaults to 1. -## ONNX Opset Support +### ONNX Opset Support Opset v1/v8/v9/v10/v11/v12/v13 -

Resize

+

Resize

-## Description +### Description Resizes the input tensor. -## Parameters +### Parameters \[Inputs\] @@ -3461,17 +3461,17 @@ nearest\_mode: string. Defaults to round\_prefer\_floor. Currently, only the nearest and linear interpolation modes are supported to process images. In addition, the model's two inputs \(scales and sizes\) need to be changed from placeholders to constants. You can use ONNX Simplifier to simplify your model. -## ONNX Opset Support +### ONNX Opset Support Opset v10/v11/v12 -

Relu

+

Relu

-## Description +### Description Applies the rectified linear unit activation function. -## Parameters +### Parameters \[Inputs\] @@ -3481,17 +3481,17 @@ X: input tensor of type float32, int32, uint8, int16, int8, uint16, float16, or Y: tensor of the identical data type as X. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceSum

+

ReduceSum

-## Description +### Description Computes the sum of the input tensor's element along the provided axes. -## Parameters +### Parameters \[Inputs\] @@ -3511,17 +3511,17 @@ axes: 1D list of ints, the dimensions to reduce. Must be in the range \[–r, r keepdims: int. Defaults to 1, meaning that the reduced dimensions with length 1 are retained. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReduceMax

+

ReduceMax

-## Description +### Description Computes the maximum of elements across dimensions of a tensor. -## Parameters +### Parameters \[Inputs\] @@ -3541,17 +3541,17 @@ axes: list of ints. Must be in the range \[–r, r – 1\], where r indicates th keepdims: int. Defaults to 1, meaning that the reduced dimensions with length 1 are retained. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Reshape

+

Reshape

-## Description +### Description Reshapes the input. -## Parameters +### Parameters \[Inputs\] @@ -3565,17 +3565,17 @@ shape: tensor of type int64, for the shape of the output tensor. reshaped: tensor -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

ReverseSequence

+

ReverseSequence

-## Description +### Description Reverses batch of sequences having different lengths. -## Parameters +### Parameters \[Inputs\] @@ -3597,17 +3597,17 @@ batch\_axis: int. Specifies the batch axis. Defaults to 1. time\_axis: int. Specifies the time axis. Defaults to 1. -## ONNX Opset Support +### ONNX Opset Support Opset v10/v11/v12/v13 -

RoiExtractor

+

RoiExtractor

-## Description +### Description Obtains the ROI feature matrix from the feature mapping list. -## Parameters +### Parameters \[Inputs\] @@ -3643,17 +3643,17 @@ One output y: tensor of type float32 or float16. -## ONNX Opset Support +### ONNX Opset Support No ONNX support for this custom operator -

RoiAlign

+

RoiAlign

-## Description +### Description Performs ROI align operation. -## Parameters +### Parameters \[Inputs\] @@ -3689,17 +3689,17 @@ batch\_indices must be of type int32 instead of int64. The operator does not support inputs of type float32 or float64 when the atc command-line option **--precision\_mode** is set to **must\_keep\_origin\_dtype**. -## ONNX Opset Support +### ONNX Opset Support Opset v10/v11/v12/v13 -

Round

+

Round

-## Description +### Description Rounds the values of a tensor to the nearest integer, element-wise. -## Parameters +### Parameters \[Inputs\] @@ -3713,17 +3713,17 @@ One output y: tensor. Has the identical data type and shape as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

PRelu

+

PRelu

-## Description +### Description Computes Parametric Rectified Linear Unit. -## Parameters +### Parameters \[Inputs\] @@ -3743,17 +3743,17 @@ y: tensor of the identical data type and shape as input x. slope must be 1D. When input x is 1D, the dimension value of slope must be 1. When input x is not 1D, the dimension value of slope can be 1 or shape\[1\] of input x. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Scatter

+

Scatter

-## Description +### Description Returns the result by updating the values of the input data to values specified by updates at specific index positions specified by indices. -## Parameters +### Parameters \[Inputs\] @@ -3775,17 +3775,17 @@ y: tensor of the identical data type and shape as input x. axis: int, specifying which axis to scatter on. Defaults to 0. -## ONNX Opset Support +### ONNX Opset Support Opset v9/v10 -

ScatterElements

+

ScatterElements

-## Description +### Description Returns the result by updating the values of the input data to values specified by updates at specific index positions specified by indices. -## Parameters +### Parameters \[Inputs\] @@ -3807,17 +3807,17 @@ y: tensor of the identical data type and shape as input x. axis: int, specifying which axis to scatter on. Defaults to 0. -## ONNX Opset Support +### ONNX Opset Support Opset v11/v12/v13 -

ScatterND

+

ScatterND

-## Description +### Description Creates a copy of the input data, and then updates its values to those specified by updates at specific index positions specified by indices. -## Parameters +### Parameters \[Inputs\] @@ -3835,17 +3835,17 @@ One output y: tensor of the identical data type and shape as input x. -## ONNX Opset Support +### ONNX Opset Support Opset v11 -

Shrink

+

-## Description
+### Description

Takes one input tensor and outputs one tensor. The formula of this operator is: if x < –lambd, y = x + bias; if x \> lambd, y = x – bias; otherwise, y = 0.

-## Parameters
+### Parameters

\[Inputs\]

One input

x: tensor of type float16 or float32.

\[Outputs\]

One output

y: tensor of the identical data type and shape as input x.

\[Attributes\]

bias: float. Defaults to 0.0.

lambd: float. Defaults to 0.5.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v9/v10/v11/v12/v13

-

Selu

+

Selu

-## Description +### Description Produces a tensor where the scaled exponential linear unit function: y = gamma \* \(alpha \* e^x – alpha\) for x <= 0, y = gamma \* x for x \> 0, is applied to the input tensor element-wise. -## Parameters +### Parameters \[Inputs\] @@ -3895,17 +3895,17 @@ One output y: tensor of the identical data type as the input. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Shape

+

Shape

-## Description +### Description Returns a tensor containing the shape of the input tensor. -## Parameters +### Parameters \[Inputs\] @@ -3917,17 +3917,17 @@ x: tensor y: int64 tensor containing the shape of the input tensor. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Sigmoid

+

Sigmoid

-## Description +### Description Computes sigmoid of the input element-wise. -## Parameters +### Parameters \[Inputs\] @@ -3941,17 +3941,17 @@ One output y: tensor of the identical data type as input x. -## ONNX Opset Support +### ONNX Opset Support Opset v8/v9/v10/v11/v12/v13 -

Slice

+

Slice

-## Description
+### Description

Extracts a slice from a tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -3975,17 +3975,17 @@

y: tensor of the identical data type as input x.

x: must have a rank greater than 1.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="softmax.md">Softmax</h2>
+<h2 id="softmaxmd">Softmax</h2>
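The axis attribute documented below selects the dimension that is normalized; a numerically stable reference sketch in NumPy:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))  # stability shift
    return e / np.sum(e, axis=axis, keepdims=True)

x = np.array([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]])
print(softmax(x, axis=-1).sum(axis=-1))  # [1. 1.]: rows normalized
print(softmax(x, axis=0).sum(axis=0))    # [1. 1. 1.]: columns normalized
```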
-## Description
+### Description

Computes softmax activations.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4003,17 +4003,17 @@

y: tensor. Has the identical data type and shape as the input x.

axis: \(optional\) int, the dimension softmax would be performed on. Defaults to -1. Must be in the range \[-len\(x.shape\), len\(x.shape\) - 1\].

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="softsign.md">Softsign</h2>
+<h2 id="softsignmd">Softsign</h2>
-## Description
+### Description

Computes softsign: \(x/\(1+|x|\)\)

-## Parameters
+### Parameters

\[Inputs\]
@@ -4027,17 +4027,17 @@

One output

y: tensor. Has the identical data type and shape as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="softplus.md">Softplus</h2>
+<h2 id="softplusmd">Softplus</h2>
-## Description
+### Description

Computes softplus.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4057,17 +4057,17 @@

Only the float16 and float32 data types are supported.

The output has the identical data type as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="spacetodepth.md">SpaceToDepth</h2>
+<h2 id="spacetodepthmd">SpaceToDepth</h2>
-## Description
+### Description

Rearranges blocks of spatial data into depth. More specifically, this operator outputs a copy of the input tensor where values from the height and width dimensions are moved to the depth dimension.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4081,17 +4081,17 @@

output: tensor. Must be one of the following data types: uint8, uint16, uint32,

blocksize: int

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="split.md">Split</h2>
+<h2 id="splitmd">Split</h2>
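The constraint documented below (the split sizes must add up to the dimension being split) can be seen with NumPy's equivalent:

```python
import numpy as np

x = np.arange(12).reshape(2, 6)
# Split axis 1 (size 6) into chunks of 2 and 4; 2 + 4 must equal 6.
a, b = np.split(x, [2], axis=1)
print(a.shape, b.shape)  # (2, 2) (2, 4)
```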
-## Description
+### Description

Splits the input tensor into a list of sub-tensors.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4119,17 +4119,17 @@

The sum of all split values must be equal to the size of the dimension of x specified by axis.

axis ∈ \[-len\(x.shape\), len\(x.shape\) - 1\]

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="sqrt.md">Sqrt</h2>
+<h2 id="sqrtmd">Sqrt</h2>
-## Description
+### Description

Computes element-wise square root of the input tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4149,17 +4149,17 @@

The output has the identical shape and dtype as the input. The supported data types are float16 and float32.

NaN is returned if x is less than 0.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="squeeze.md">Squeeze</h2>
+<h2 id="squeezemd">Squeeze</h2>
-## Description
+### Description

Removes dimensions of size 1 from the shape of a tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4175,17 +4175,17 @@

y: tensor of the identical data type as the input.

axes: 1D list of int32s or int64s, indicating the dimensions to squeeze. Negative value means counting dimensions from the back. Accepted range is \[-r, r - 1\] where r = rank\(x\).

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="sub.md">Sub</h2>
+<h2 id="submd">Sub</h2>
-## Description
+### Description

Performs element-wise subtraction.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4205,17 +4205,17 @@

y: tensor of the identical data type as the input.

The output has the identical shape and dtype as the input. The supported data types are int32, float16, and float32.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="sign.md">Sign</h2>
+<h2 id="signmd">Sign</h2>
-## Description
+### Description

Computes the sign of the input tensor element-wise.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4229,17 +4229,17 @@

One output

y: tensor of the identical data type and shape as input x.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="sin.md">Sin</h2>
+<h2 id="sinmd">Sin</h2>
-## Description
+### Description

Computes sine of the input element-wise.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4253,17 +4253,17 @@

One output

y: tensor. Has the identical data type and shape as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="sinh.md">Sinh</h2>
+<h2 id="sinhmd">Sinh</h2>
-## Description
+### Description

Computes hyperbolic sine of the input element-wise.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4277,17 +4277,17 @@

One output

y: tensor. Has the identical data type and shape as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="size.md">Size</h2>
+<h2 id="sizemd">Size</h2>
-## Description
+### Description

Outputs the number of elements in the input tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4301,17 +4301,17 @@

One output

y: scalar of type int64

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="sum.md">Sum</h2>
+<h2 id="summd">Sum</h2>
-## Description
+### Description

Computes element-wise sum of each of the input tensors.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4325,17 +4325,17 @@

One output

y: tensor of the identical data type and shape as input x.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="tanh.md">Tanh</h2>
+<h2 id="tanhmd">Tanh</h2>
-## Description
+### Description

Computes hyperbolic tangent of the input element-wise.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4349,17 +4349,17 @@

One output

y: tensor of the identical data type as the input.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="tfidfvectorizer.md">TfIdfVectorizer</h2>
+<h2 id="tfidfvectorizermd">TfIdfVectorizer</h2>
-## Description
+### Description

Extracts n-grams from the input sequence and saves them as a vector.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4393,17 +4393,17 @@

pool\_strings: list of strings. Has the same meaning as pool\_int64s.

weights: list of floats. Stores the weight of each n-gram in pool.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v9/v10/v11/v12/v13

-<h2 id="tile.md">Tile</h2>
+<h2 id="tilemd">Tile</h2>
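The shape rule documented below, output\_dim\[i\] = input\_dim\[i\] \* repeats\[i\], in one line of NumPy:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
print(np.tile(x, (2, 3)).shape)  # (4, 6): each dimension multiplied by repeats
```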
-## Description
+### Description

Constructs a tensor by tiling a given tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4419,17 +4419,17 @@

One output

y: tensor of the identical type and dimension as the input. output\_dim\[i\] = input\_dim\[i\] \* repeats\[i\]

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="thresholdedrelu.md">ThresholdedRelu</h2>
+<h2 id="thresholdedrelumd">ThresholdedRelu</h2>
-## Description
+### Description

When x \> alpha, y = x; otherwise, y = 0.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4447,17 +4447,17 @@

y: tensor of the identical data type and shape as input x.

alpha: float, indicating the threshold. Defaults to 1.0.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v10/v11/v12/v13

-<h2 id="topk.md">TopK</h2>
+<h2 id="topkmd">TopK</h2>
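A minimal NumPy sketch of the behavior documented below with largest=1 and sorted=1 (the helper name is ours):

```python
import numpy as np

def topk(x, k, axis=-1):
    idx = np.argsort(-x, axis=axis)           # descending order of values
    idx = np.take(idx, range(k), axis=axis)   # keep the first k positions
    return np.take_along_axis(x, idx, axis=axis), idx

x = np.array([[1.0, 4.0, 2.0, 3.0]])
values, indices = topk(x, k=2)
print(values)   # [[4. 3.]]
print(indices)  # [[1 3]]
```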
-## Description
+### Description

Retrieves the top-K largest or smallest elements along a specified axis.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4483,17 +4483,17 @@

largest: int. Whether to return the top-K largest or smallest elements. Defaults to 1.

sorted: int. Whether to return the elements in sorted order. Defaults to 1.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="transpose.md">Transpose</h2>
+<h2 id="transposemd">Transpose</h2>
-## Description
+### Description

Transposes the input.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4507,17 +4507,17 @@

transposed: tensor after transposition.

perm: \(required\) list of integers, for the dimension sequence of data.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="pad.md">Pad</h2>
+<h2 id="padmd">Pad</h2>
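The pads input documented below lists the leading pad sizes for every axis followed by the trailing ones; a constant-mode sketch with NumPy (constant value 0, matching the restriction noted below):

```python
import numpy as np

def onnx_pad(x, pads, constant_value=0):
    rank = x.ndim
    # pads = [begin_0, ..., begin_{r-1}, end_0, ..., end_{r-1}]
    pad_width = [(pads[i], pads[i + rank]) for i in range(rank)]
    return np.pad(x, pad_width, mode="constant", constant_values=constant_value)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
print(onnx_pad(x, pads=[0, 1, 0, 1]))
# [[0. 1. 2. 0.]
#  [0. 3. 4. 0.]]
```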
-## Description
+### Description

Pads a tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4543,17 +4543,17 @@

mode: str type. The following modes are supported: constant, reflect, and edge.

If the value of mode is **constant**, the value of **constant\_value** can only be **0**.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v11

-<h2 id="pow.md">Pow</h2>
+<h2 id="powmd">Pow</h2>
-## Description
+### Description

Computes x1 to the x2th power.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4569,17 +4569,17 @@

One output

y: tensor of the identical data type as input x1.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="unsqueeze.md">Unsqueeze</h2>
+<h2 id="unsqueezemd">Unsqueeze</h2>
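The axes attribute documented below inserts size-1 dimensions at the given positions; NumPy's expand_dims shows the effect:

```python
import numpy as np

x = np.ones((3, 4))
print(np.expand_dims(x, axis=(0, 2)).shape)  # (1, 3, 1, 4)
```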
-## Description
+### Description

Inserts single-dimensional entries to the shape of an input tensor.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4597,17 +4597,17 @@

y: tensor of the identical data type as input x.

axes: list of integers indicating the dimensions to be inserted. Accepted range is \[-input\_rank, input\_rank\] \(inclusive\), where input\_rank = rank\(x\).

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12

-<h2 id="xor.md">Xor</h2>
+<h2 id="xormd">Xor</h2>
-## Description
+### Description

Computes the element-wise logical XOR of the given input tensors.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4621,17 +4621,17 @@

b: tensor of type bool.

c: tensor of type bool.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

-<h2 id="where.md">Where</h2>
+<h2 id="wheremd">Where</h2>
-## Description
+### Description

Returns elements chosen from x or y depending on condition.

-## Parameters
+### Parameters

\[Inputs\]
@@ -4647,7 +4647,7 @@

y: tensor of the identical data type as x. Elements from which to choose when condition is false.

Tensor of the identical data type as input x.

-## ONNX Opset Support
+### ONNX Opset Support

Opset v8/v9/v10/v11/v12/v13

diff --git a/docs/en/PyTorch API Support/PyTorch API Support.md b/docs/en/PyTorch API Support/PyTorch API Support.md
index 17f1f0425d..907fa0f5ff 100644
--- a/docs/en/PyTorch API Support/PyTorch API Support.md
+++ b/docs/en/PyTorch API Support/PyTorch API Support.md
@@ -1,17 +1,17 @@
 # PyTorch API Support
-- [Tensors](#tensors.md)
-- [Generators](#generators.md)
-- [Random sampling](#random-sampling.md)
-- [Serialization](#serialization.md)
-- [Math operations](#math-operations.md)
-- [Utilities](#utilities.md)
-- [Other](#other.md)
-- [torch.Tensor](#torch-tensor.md)
-- [Layers \(torch.nn\)](#layers-(torch-nn).md)
-- [Functions\(torch.nn.functional\)](#functions(torch-nn-functional).md)
-- [torch.distributed](#torch-distributed.md)
-- [NPU and CUDA Function Alignment](#npu-and-cuda-function-alignment.md)
-<h2 id="tensors.md">Tensors</h2>
+- [Tensors](#tensorsmd)
+- [Generators](#generatorsmd)
+- [Random sampling](#random-samplingmd)
+- [Serialization](#serializationmd)
+- [Math operations](#math-operationsmd)
+- [Utilities](#utilitiesmd)
+- [Other](#othermd)
+- [torch.Tensor](#torch-tensormd)
+- [Layers \(torch.nn\)](#layers-torch-nnmd)
+- [Functions\(torch.nn.functional\)](#functionstorch-nn-functionalmd)
+- [torch.distributed](#torch-distributedmd)
+- [NPU and CUDA Function Alignment](#npu-and-cuda-function-alignmentmd)
+<h2 id="tensorsmd">Tensors</h2>

@@ -361,7 +361,7 @@
-<h2 id="generators.md">Generators</h2>
+<h2 id="generatorsmd">Generators</h2>

@@ -424,7 +424,7 @@
-<h2 id="random-sampling.md">Random sampling</h2>
+<h2 id="random-samplingmd">Random sampling</h2>

@@ -641,7 +641,7 @@
-<h2 id="serialization.md">Serialization</h2>
+<h2 id="serializationmd">Serialization</h2>

@@ -669,7 +669,7 @@
-<h2 id="math-operations.md">Math operations</h2>
+<h2 id="math-operationsmd">Math operations</h2>

@@ -1873,7 +1873,7 @@
-<h2 id="utilities.md">Utilities</h2>
+<h2 id="utilitiesmd">Utilities</h2>

@@ -1915,7 +1915,7 @@
-<h2 id="other.md">Other</h2>
+<h2 id="othermd">Other</h2>

@@ -1978,7 +1978,7 @@
-<h2 id="torch-tensor.md">torch.Tensor</h2>
+<h2 id="torch-tensormd">torch.Tensor</h2>

@@ -4484,7 +4484,7 @@
-<h2 id="layers-(torch-nn).md">Layers \(torch.nn\)</h2>
+<h2 id="layers-torch-nnmd">Layers (torch.nn)</h2>

@@ -6710,7 +6710,7 @@
-<h2 id="functions(torch-nn-functional).md">Functions\(torch.nn.functional\)</h2>
+<h2 id="functionstorch-nn-functionalmd">Functions(torch.nn.functional)</h2>

@@ -7417,7 +7417,7 @@
-<h2 id="torch-distributed.md">torch.distributed</h2>
+<h2 id="torch-distributedmd">torch.distributed</h2>

@@ -7641,7 +7641,7 @@
-<h2 id="npu-and-cuda-function-alignment.md">NPU and CUDA Function Alignment</h2>
+<h2 id="npu-and-cuda-function-alignmentmd">NPU and CUDA Function Alignment</h2>

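As a hedged illustration of the alignment this section tabulates, the Ascend-adapted build mirrors the torch.cuda namespace under torch.npu; a sketch that assumes such an environment is available:

```python
import torch

if torch.npu.is_available():        # counterpart of torch.cuda.is_available()
    torch.npu.set_device("npu:0")   # counterpart of torch.cuda.set_device()
    x = torch.randn(2, 3).npu()     # counterpart of Tensor.cuda()
    print(x.device)                 # npu:0
```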
diff --git a/docs/en/PyTorch Installation Guide/PyTorch Installation Guide.md b/docs/en/PyTorch Installation Guide/PyTorch Installation Guide.md
index fdc1d09ae8..28542efee4 100644
--- a/docs/en/PyTorch Installation Guide/PyTorch Installation Guide.md
+++ b/docs/en/PyTorch Installation Guide/PyTorch Installation Guide.md
@@ -1,15 +1,15 @@
 # PyTorch Installation Guide
-- [Overview](#overview.md)
-- [Manual Build and Installation](#manual-build-and-installation.md)
-    - [Prerequisites](#prerequisites.md)
-    - [Installing the PyTorch Framework](#installing-the-pytorch-framework.md)
-    - [Configuring Environment Variables](#configuring-environment-variables.md)
-    - [Installing the Mixed Precision Module](#installing-the-mixed-precision-module.md)
-- [References](#references.md)
-    - [Installing CMake](#installing-cmake.md)
-    - [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0.md)
-    - [What Do I Do If "torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed?](#what-do-i-do-if-torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installed.md)
-<h2 id="overview.md">Overview</h2>
+- [Overview](#overviewmd)
+- [Manual Build and Installation](#manual-build-and-installationmd)
+    - [Prerequisites](#prerequisitesmd)
+    - [Installing the PyTorch Framework](#installing-the-pytorch-frameworkmd)
+    - [Configuring Environment Variables](#configuring-environment-variablesmd)
+    - [Installing the Mixed Precision Module](#installing-the-mixed-precision-modulemd)
+- [References](#referencesmd)
+    - [Installing CMake](#installing-cmakemd)
+    - [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0md)
+    - [What Do I Do If "torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed?](#what-do-i-do-if-torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installedmd)
+<h2 id="overviewmd">Overview</h2>

When setting up the environment for PyTorch model development and running, you can manually build and install the modules adapted to the PyTorch framework on a server.
@@ -18,24 +18,24 @@ When setting up the environment for PyTorch model development and running, you c

![](figures/210926103326800.png)

-<h2 id="manual-build-and-installation.md">Manual Build and Installation</h2>
+<h2 id="manual-build-and-installationmd">Manual Build and Installation</h2>

-- **[Prerequisites](#prerequisites.md)**
+- **[Prerequisites](#prerequisitesmd)**

-- **[Installing the PyTorch Framework](#installing-the-pytorch-framework.md)**
+- **[Installing the PyTorch Framework](#installing-the-pytorch-frameworkmd)**

-- **[Configuring Environment Variables](#configuring-environment-variables.md)**
+- **[Configuring Environment Variables](#configuring-environment-variablesmd)**

-- **[Installing the Mixed Precision Module](#installing-the-mixed-precision-module.md)**
+- **[Installing the Mixed Precision Module](#installing-the-mixed-precision-modulemd)**

-<h3 id="prerequisites.md">Prerequisites</h3>
+<h3 id="prerequisitesmd">Prerequisites</h3>

-## Prerequisites
+#### Prerequisites

- The development or operating environment of CANN has been installed. For details, see the _CANN Software Installation Guide_.
-- CMake 3.12.0 or later has been installed. For details about how to install CMake, see [Installing CMake](#installing-cmake.md).
-- GCC 7.3.0 or later has been installed. For details about how to install and use GCC 7.3.0, see [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0.md).
+- CMake 3.12.0 or later has been installed. For details about how to install CMake, see [Installing CMake](#installing-cmakemd).
+- GCC 7.3.0 or later has been installed. For details about how to install and use GCC 7.3.0, see [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0md).
- Python 3.7.5 or 3.8 has been installed.
- The Patch and Git tools have been installed in the environment. To install the tools for Ubuntu and CentOS, run the following commands:
    - Ubuntu
@@ -54,9 +54,9 @@

-<h3 id="installing-the-pytorch-framework.md">Installing the PyTorch Framework</h3>
+<h3 id="installing-the-pytorch-frameworkmd">Installing the PyTorch Framework</h3>

-## Installation Process
+#### Installation Process

1. Log in to the server as the **root** user or a non-root user.
2. Run the following commands in sequence to install the PyTorch dependencies.
@@ -162,7 +162,7 @@

    >**pip3 list | grep torch**

-<h3 id="configuring-environment-variables.md">Configuring Environment Variables</h3>
+<h3 id="configuring-environment-variablesmd">Configuring Environment Variables</h3>

After the software packages are installed, configure environment variables to use Ascend PyTorch. For details about the environment variables, see [Table 1](#en-us_topic_0000001152616261_table42017516135).
@@ -211,7 +211,7 @@ After the software packages are installed, configure environment variables to us

LD_LIBRARY_PATH

Dynamic library search path. Set this variable based on the preceding example.

-If you need to upgrade GCC in OSs such as CentOS, Debian, and BC-Linux, add ${install_path}/lib64 to the LD_LIBRARY_PATH variable of the dynamic library search path. Replace {install_path} with the GCC installation path. For details, see 5.
+If you need to upgrade GCC in OSs such as CentOS, Debian, and BC-Linux, add ${install_path}/lib64 to the LD_LIBRARY_PATH variable of the dynamic library search path. Replace {install_path} with the GCC installation path. For details, see 5.

PYTHONPATH

@@ -314,14 +314,14 @@ After the software packages are installed, configure environment variables to us
-<h3 id="installing-the-mixed-precision-module.md">Installing the Mixed Precision Module</h3>
+<h3 id="installing-the-mixed-precision-modulemd">Installing the Mixed Precision Module</h3>

-## Prerequisites
+#### Prerequisites

1. Ensure that the PyTorch framework adapted to Ascend AI Processors in the operating environment can be used properly.
-2. Before building and installing Apex, you have configured the environment variables on which the build depends. See [Configuring Environment Variables](#configuring-environment-variables.md).
+2. Before building and installing Apex, you have configured the environment variables on which the build depends. See [Configuring Environment Variables](#configuring-environment-variablesmd).

-## Installation Process
+#### Installation Process

1. Log in to the server as the **root** user or a non-root user.
2. Obtain the Apex source code.
@@ -408,16 +408,16 @@

    >**pip3 list | grep apex**

-<h2 id="references.md">References</h2>
+<h2 id="referencesmd">References</h2>

-- **[Installing CMake](#installing-cmake.md)**
+- **[Installing CMake](#installing-cmakemd)**

-- **[How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0.md)**
+- **[How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0md)**

-- **[What Do I Do If "torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed?](#what-do-i-do-if-torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installed.md)**
+- **[What Do I Do If "torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed?](#what-do-i-do-if-torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installedmd)**

-<h3 id="installing-cmake.md">Installing CMake</h3>
+<h3 id="installing-cmakemd">Installing CMake</h3>

Procedure for upgrading CMake to 3.12.1
@@ -456,7 +456,7 @@

If the message "cmake version 3.12.1" is displayed, the installation is successful.

-<h3 id="how-do-i-install-gcc-7-3-0.md">How Do I Install GCC 7.3.0?</h3>
+<h3 id="how-do-i-install-gcc-7-3-0md">How Do I Install GCC 7.3.0?</h3>

Perform the following steps as the **root** user.
@@ -537,19 +537,19 @@

    >Skip this step if you do not need to use the compilation environment with GCC upgraded.

-<h3 id="what-do-i-do-if-torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installed.md">What Do I Do If "torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed?</h3>
+<h3 id="what-do-i-do-if-torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installedmd">What Do I Do If "torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed?</h3>

-## Symptom +#### Symptom During the installation of **torch-**_\*_**.whl**, the message "ERROR: torchvision 0.6.0 has requirement torch==1.5.0, but you'll have torch 1.5.0a0+1977093 which is incompatible" " is displayed. ![](figures/en-us_image_0000001190081735.png) -## Possible Causes +#### Possible Causes When the PyTorch is installed, the version check is automatically triggered. The version of the torchvision installed in the environment is 0.6.0. During the check, it is found that the version of the **torch-**_\*_**.whl** is inconsistent with the required version 1.5.0. As a result, an error message is displayed, but the installation is successful. -## Solution +#### Solution This problem has no impact on the actual result, and no action is required. diff --git a/docs/en/PyTorch Network Model Porting and Training Guide/PyTorch Network Model Porting and Training Guide.md b/docs/en/PyTorch Network Model Porting and Training Guide/PyTorch Network Model Porting and Training Guide.md index 9221b4d884..d250cd9234 100644 --- a/docs/en/PyTorch Network Model Porting and Training Guide/PyTorch Network Model Porting and Training Guide.md +++ b/docs/en/PyTorch Network Model Porting and Training Guide/PyTorch Network Model Porting and Training Guide.md @@ -1,102 +1,102 @@ # PyTorch Network Model Porting and Training Guide -- [Overview](#overview.md) -- [Restrictions and Limitations](#restrictions-and-limitations.md) -- [Porting Process](#porting-process.md) -- [Model Porting Evaluation](#model-porting-evaluation.md) -- [Environment Setup](#environment-setup.md) -- [Model Porting](#model-porting.md) - - [Tool-Facilitated](#tool-facilitated.md) - - [Introduction](#introduction.md) - - [Instructions](#instructions.md) - - [Result Analysis](#result-analysis.md) - - [Manual](#manual.md) - - [Single-Device Training Model Porting](#single-device-training-model-porting.md) - - [Multi-Device Training Model Porting](#multi-device-training-model-porting.md) - - [PyTorch-related API Replacement](#pytorch-related-api-replacement.md) - - [Mixed Precision](#mixed-precision.md) -- [Model Training](#model-training.md) -- [Performance Analysis and Optimization](#performance-analysis-and-optimization.md) - - [Prerequisites](#prerequisites.md) - - [Commissioning Process](#commissioning-process.md) - - [Overall Guideline](#overall-guideline.md) - - [Training Data Collection](#training-data-collection.md) - - [Host-side Performance Optimization](#host-side-performance-optimization.md) - - [Overview](#overview-0.md) - - [Changing the CPU Performance Mode \(x86 Server\)](#changing-the-cpu-performance-mode-(x86-server).md) - - [Changing the CPU Performance Mode \(ARM Server\)](#changing-the-cpu-performance-mode-(arm-server).md) - - [Installing the High-Performance Pillow Library \(x86 Server\)](#installing-the-high-performance-pillow-library-(x86-server).md) - - [\(Optional\) Installing the OpenCV Library of the Specified Version](#(optional)-installing-the-opencv-library-of-the-specified-version.md) - - [Training Performance Optimization](#training-performance-optimization.md) - - [Affinity Library](#affinity-library.md) - - [Source](#source.md) - - [Functions](#functions.md) -- [Precision Commissioning](#precision-commissioning.md) - - [Prerequisites](#prerequisites-1.md) - - [Commissioning Process](#commissioning-process-2.md) - - [Overall Guideline](#overall-guideline-3.md) - - [Precision Tuning Methods](#precision-tuning-methods.md) - - [Single-Operator Overflow/Underflow 
Detection](#single-operator-overflow-underflow-detection.md) - - [Network-wide Commissioning ](#network-wide-commissioning.md) -- [Model Saving and Conversion](#model-saving-and-conversion.md) - - [Introduction](#introduction-4.md) - - [Saving a Model](#saving-a-model.md) - - [Exporting an ONNX Model](#exporting-an-onnx-model.md) -- [Samples](#samples.md) - - [ResNet-50 Model Porting](#resnet-50-model-porting.md) - - [Obtaining Samples](#obtaining-samples.md) - - [Porting the Training Script](#porting-the-training-script.md) - - [Single-Device Training Modification](#single-device-training-modification.md) - - [Distributed Training Modification](#distributed-training-modification.md) - - [Script Execution](#script-execution.md) - - [ShuffleNet Model Optimization](#shufflenet-model-optimization.md) - - [Obtaining Samples](#obtaining-samples-5.md) - - [Model Evaluation](#model-evaluation.md) - - [Porting the Network](#porting-the-network.md) - - [Commissioning the Network](#commissioning-the-network.md) -- [References](#references.md) - - [Single-Operator Sample Building](#single-operator-sample-building.md) - - [Single-Operator Dump Method](#single-operator-dump-method.md) - - [Common Environment Variables](#common-environment-variables.md) - - [dump op Method](#dump-op-method.md) - - [Compilation Option Settings](#compilation-option-settings.md) - - [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0.md) - - [HDF5 Compilation and Installation](#hdf5-compilation-and-installation.md) -- [FAQs](#faqs.md) - - [FAQs About Software Installation](#faqs-about-software-installation.md) - - [pip3.7 install Pillow==5.3.0 Installation Failed](#pip3-7-install-pillow-5-3-0-installation-failed.md) - - [FAQs About Model and Operator Running](#faqs-about-model-and-operator-running.md) - - [What Do I Do If the Error Message "RuntimeError: ExchangeDevice:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-runtimeerror-exchangedevice-is-displayed-during-model-or-operator.md) - - [What Do I Do If the Error Message "Error in atexit.\_run\_exitfuncs:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-error-in-atexit-_run_exitfuncs-is-displayed-during-model-or-operat.md) - - [What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): HelpACLExecute:" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what()-he.md) - - [What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): 0 INTERNAL ASSERT" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what()-0.md) - - [What Do I Do If the Error Message "ImportError: libhccl.so." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-importerror-libhccl-so-is-displayed-during-model-running.md) - - [What Do I Do If the Error Message "RuntimeError: Initialize." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-runtimeerror-initialize-is-displayed-during-model-running.md) - - [What Do I Do If the Error Message "TVM/te/cce error." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-tvm-te-cce-error-is-displayed-during-model-running.md) - - [What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." 
Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-running.md) - - [What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-running-6.md) - - [What Do I Do If the Error Message "HelpACLExecute." Is Displayed After Multi-Task Delivery Is Disabled \(export TASK\_QUEUE\_ENABLE=0\) During Model Running?](#what-do-i-do-if-the-error-message-helpaclexecute-is-displayed-after-multi-task-delivery-is-disabled.md) - - [What Do I Do If the Error Message "55056 GetInputConstDataOut: ErrorNo: -1\(failed\)" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-55056-getinputconstdataout-errorno--1(failed)-is-displayed-during.md) - - [FAQs About Model Commissioning](#faqs-about-model-commissioning.md) - - [What Do I Do If the Error Message "RuntimeError: malloc:/..../pytorch/c10/npu/NPUCachingAllocator.cpp:293 NPU error, error code is 500000." Is Displayed During Model Commissioning?](#what-do-i-do-if-the-error-message-runtimeerror-malloc-pytorch-c10-npu-npucachingallocator-cpp-293-np.md) - - [What Do I Do If the Error Message "RuntimeError: Could not run 'aten::trunc.out' with arguments from the 'NPUTensorId' backend." Is Displayed During Model Commissioning](#what-do-i-do-if-the-error-message-runtimeerror-could-not-run-aten-trunc-out-with-arguments-from-the.md) - - [What Do I Do If the MaxPoolGradWithArgmaxV1 and max Operators Report Errors During Model Commissioning?](#what-do-i-do-if-the-maxpoolgradwithargmaxv1-and-max-operators-report-errors-during-model-commissioni.md) - - [What Do I Do If the Error Message "ModuleNotFoundError: No module named 'torch.\_C'" Is Displayed When torch Is Called?](#what-do-i-do-if-the-error-message-modulenotfounderror-no-module-named-torch-_c-is-displayed-when-tor.md) - - [FAQs About Other Operations](#faqs-about-other-operations.md) - - [What Do I Do If an Error Is Reported During CUDA Stream Synchronization?](#what-do-i-do-if-an-error-is-reported-during-cuda-stream-synchronization.md) - - [What Do I Do If aicpu\_kernels/libpt\_kernels.so Does Not Exist?](#what-do-i-do-if-aicpu_kernels-libpt_kernels-so-does-not-exist.md) - - [What Do I Do If the Python Process Is Residual When the npu-smi info Command Is Used to View Video Memory?](#what-do-i-do-if-the-python-process-is-residual-when-the-npu-smi-info-command-is-used-to-view-video-m.md) - - [What Do I Do If the Error Message "match op inputs failed"Is Displayed When the Dynamic Shape Is Used?](#what-do-i-do-if-the-error-message-match-op-inputs-failed-is-displayed-when-the-dynamic-shape-is-used.md) - - [What Do I Do If the Error Message "Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported" Is Displayed?](#what-do-i-do-if-the-error-message-op-type-sigmoidcrossentropywithlogitsv2-of-ops-kernel-aicoreengine.md) - - [What Do I Do If a Hook Failure Occurs?](#what-do-i-do-if-a-hook-failure-occurs.md) - - [What Do I Do If the Error Message "load state\_dict error." Is Displayed When the Weight Is Loaded?](#what-do-i-do-if-the-error-message-load-state_dict-error-is-displayed-when-the-weight-is-loaded.md) - - [FAQs About Distributed Model Training](#faqs-about-distributed-model-training.md) - - [What Do I Do If the Error Message "host not found." 
Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-host-not-found-is-displayed-during-distributed-model-training.md) - - [What Do I Do If the Error Message "RuntimeError: connect\(\) timed out." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-runtimeerror-connect()-timed-out-is-displayed-during-distributed-m.md) -
<h2 id="overview.md">Overview</h2>
+- [Overview](#overviewmd) +- [Restrictions and Limitations](#restrictions-and-limitationsmd) +- [Porting Process](#porting-processmd) +- [Model Porting Evaluation](#model-porting-evaluationmd) +- [Environment Setup](#environment-setupmd) +- [Model Porting](#model-portingmd) + - [Tool-Facilitated](#tool-facilitatedmd) + - [Introduction](#introductionmd) + - [Instructions](#instructionsmd) + - [Result Analysis](#result-analysismd) + - [Manual](#manualmd) + - [Single-Device Training Model Porting](#single-device-training-model-portingmd) + - [Multi-Device Training Model Porting](#multi-device-training-model-portingmd) + - [PyTorch-related API Replacement](#pytorch-related-api-replacementmd) + - [Mixed Precision](#mixed-precisionmd) +- [Model Training](#model-trainingmd) +- [Performance Analysis and Optimization](#performance-analysis-and-optimizationmd) + - [Prerequisites](#prerequisitesmd) + - [Commissioning Process](#commissioning-processmd) + - [Overall Guideline](#overall-guidelinemd) + - [Training Data Collection](#training-data-collectionmd) + - [Host-side Performance Optimization](#host-side-performance-optimizationmd) + - [Overview](#overview-0md) + - [Changing the CPU Performance Mode \(x86 Server\)](#changing-the-cpu-performance-mode-x86-servermd) + - [Changing the CPU Performance Mode \(ARM Server\)](#changing-the-cpu-performance-mode-arm-servermd) + - [Installing the High-Performance Pillow Library \(x86 Server\)](#installing-the-high-performance-pillow-library-x86-servermd) + - [\(Optional\) Installing the OpenCV Library of the Specified Version](#optional-installing-the-opencv-library-of-the-specified-versionmd) + - [Training Performance Optimization](#training-performance-optimizationmd) + - [Affinity Library](#affinity-librarymd) + - [Source](#sourcemd) + - [Functions](#functionsmd) +- [Precision Commissioning](#precision-commissioningmd) + - [Prerequisites](#prerequisites-1md) + - [Commissioning Process](#commissioning-process-2md) + - [Overall Guideline](#overall-guideline-3md) + - [Precision Tuning Methods](#precision-tuning-methodsmd) + - [Single-Operator Overflow/Underflow Detection](#single-operator-overflow-underflow-detectionmd) + - [Network-wide Commissioning ](#network-wide-commissioningmd) +- [Model Saving and Conversion](#model-saving-and-conversionmd) + - [Introduction](#introduction-4md) + - [Saving a Model](#saving-a-modelmd) + - [Exporting an ONNX Model](#exporting-an-onnx-modelmd) +- [Samples](#samplesmd) + - [ResNet-50 Model Porting](#resnet-50-model-portingmd) + - [Obtaining Samples](#obtaining-samplesmd) + - [Porting the Training Script](#porting-the-training-scriptmd) + - [Single-Device Training Modification](#single-device-training-modificationmd) + - [Distributed Training Modification](#distributed-training-modificationmd) + - [Script Execution](#script-executionmd) + - [ShuffleNet Model Optimization](#shufflenet-model-optimizationmd) + - [Obtaining Samples](#obtaining-samples-5md) + - [Model Evaluation](#model-evaluationmd) + - [Porting the Network](#porting-the-networkmd) + - [Commissioning the Network](#commissioning-the-networkmd) +- [References](#referencesmd) + - [Single-Operator Sample Building](#single-operator-sample-buildingmd) + - [Single-Operator Dump Method](#single-operator-dump-methodmd) + - [Common Environment Variables](#common-environment-variablesmd) + - [dump op Method](#dump-op-methodmd) + - [Compilation Option Settings](#compilation-option-settingsmd) + - [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0md) + - [HDF5 
Compilation and Installation](#hdf5-compilation-and-installationmd) +- [FAQs](#faqsmd) + - [FAQs About Software Installation](#faqs-about-software-installationmd) + - [pip3.7 install Pillow==5.3.0 Installation Failed](#pip3-7-install-pillow-5-3-0-installation-failedmd) + - [FAQs About Model and Operator Running](#faqs-about-model-and-operator-runningmd) + - [What Do I Do If the Error Message "RuntimeError: ExchangeDevice:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-runtimeerror-exchangedevice-is-displayed-during-model-or-operatormd) + - [What Do I Do If the Error Message "Error in atexit.\_run\_exitfuncs:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-error-in-atexit-_run_exitfuncs-is-displayed-during-model-or-operatmd) + - [What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): HelpACLExecute:" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what-hemd) + - [What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): 0 INTERNAL ASSERT" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what-0md) + - [What Do I Do If the Error Message "ImportError: libhccl.so." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-importerror-libhccl-so-is-displayed-during-model-runningmd) + - [What Do I Do If the Error Message "RuntimeError: Initialize." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-runtimeerror-initialize-is-displayed-during-model-runningmd) + - [What Do I Do If the Error Message "TVM/te/cce error." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-tvm-te-cce-error-is-displayed-during-model-runningmd) + - [What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-runningmd) + - [What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-running-6md) + - [What Do I Do If the Error Message "HelpACLExecute." Is Displayed After Multi-Task Delivery Is Disabled \(export TASK\_QUEUE\_ENABLE=0\) During Model Running?](#what-do-i-do-if-the-error-message-helpaclexecute-is-displayed-after-multi-task-delivery-is-disabledmd) + - [What Do I Do If the Error Message "55056 GetInputConstDataOut: ErrorNo: -1\(failed\)" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-55056-getinputconstdataout-errorno--1failed-is-displayed-duringmd) + - [FAQs About Model Commissioning](#faqs-about-model-commissioningmd) + - [What Do I Do If the Error Message "RuntimeError: malloc:/..../pytorch/c10/npu/NPUCachingAllocator.cpp:293 NPU error, error code is 500000." Is Displayed During Model Commissioning?](#what-do-i-do-if-the-error-message-runtimeerror-malloc-pytorch-c10-npu-npucachingallocator-cpp-293-npmd) + - [What Do I Do If the Error Message "RuntimeError: Could not run 'aten::trunc.out' with arguments from the 'NPUTensorId' backend." 
Is Displayed During Model Commissioning](#what-do-i-do-if-the-error-message-runtimeerror-could-not-run-aten-trunc-out-with-arguments-from-themd) + - [What Do I Do If the MaxPoolGradWithArgmaxV1 and max Operators Report Errors During Model Commissioning?](#what-do-i-do-if-the-maxpoolgradwithargmaxv1-and-max-operators-report-errors-during-model-commissionimd) + - [What Do I Do If the Error Message "ModuleNotFoundError: No module named 'torch.\_C'" Is Displayed When torch Is Called?](#what-do-i-do-if-the-error-message-modulenotfounderror-no-module-named-torch-_c-is-displayed-when-tormd) + - [FAQs About Other Operations](#faqs-about-other-operationsmd) + - [What Do I Do If an Error Is Reported During CUDA Stream Synchronization?](#what-do-i-do-if-an-error-is-reported-during-cuda-stream-synchronizationmd) + - [What Do I Do If aicpu\_kernels/libpt\_kernels.so Does Not Exist?](#what-do-i-do-if-aicpu_kernels-libpt_kernels-so-does-not-existmd) + - [What Do I Do If the Python Process Is Residual When the npu-smi info Command Is Used to View Video Memory?](#what-do-i-do-if-the-python-process-is-residual-when-the-npu-smi-info-command-is-used-to-view-video-mmd) + - [What Do I Do If the Error Message "match op inputs failed"Is Displayed When the Dynamic Shape Is Used?](#what-do-i-do-if-the-error-message-match-op-inputs-failed-is-displayed-when-the-dynamic-shape-is-usedmd) + - [What Do I Do If the Error Message "Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported" Is Displayed?](#what-do-i-do-if-the-error-message-op-type-sigmoidcrossentropywithlogitsv2-of-ops-kernel-aicoreenginemd) + - [What Do I Do If a Hook Failure Occurs?](#what-do-i-do-if-a-hook-failure-occursmd) + - [What Do I Do If the Error Message "load state\_dict error." Is Displayed When the Weight Is Loaded?](#what-do-i-do-if-the-error-message-load-state_dict-error-is-displayed-when-the-weight-is-loadedmd) + - [FAQs About Distributed Model Training](#faqs-about-distributed-model-trainingmd) + - [What Do I Do If the Error Message "host not found." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-host-not-found-is-displayed-during-distributed-model-trainingmd) + - [What Do I Do If the Error Message "RuntimeError: connect\(\) timed out." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-runtimeerror-connect-timed-out-is-displayed-during-distributed-mmd) +
<h2 id="overviewmd">Overview</h2>
Currently, the solution of adapting to the Ascend AI Processor is an online solution.

-## Solution Features and Advantages
+### Solution Features and Advantages

The acceleration of the Ascend AI Processor is implemented by calling various operators \(OP-based\). That is, the AscendCL is used to call one or more D affinity operators to replace the original GPU-based implementation. [Figure 1](#fig2267112413239) shows the logical model of the implementation.
@@ -113,7 +113,7 @@ Currently, the main reasons for selecting the online adaptation solution are as

4. It has good scalability. During the streamlining process, only the development and implementation of related compute operators are involved for new network types or structures. Framework operators, reverse graph building, and implementation mechanisms can be reused.
5. The usage and style are the same as those of the GPU-based implementation. During online adaptation, you only need to specify the device as the Ascend AI Processor in Python and device operations to develop, train, and debug the network in PyTorch using the Ascend AI Processor. You do not need to pay attention to the underlying details of the Ascend AI Processor. In this way, you can minimize the modification and complete porting with low costs.

-<h2 id="restrictions-and-limitations.md">Restrictions and Limitations</h2>
+<h2 id="restrictions-and-limitationsmd">Restrictions and Limitations</h2>

- In the **infershape** phase, operators do not support unknown shape inference.
- Only the float16 operator can be used for cube computing.
@@ -127,7 +127,7 @@

- Only the int8, int32, float16, and float32 data types are supported.

-<h2 id="porting-process.md">Porting Process</h2>
+<h2 id="porting-processmd">Porting Process</h2>

Model porting refers to moving models that have been implemented in the open-source community to an Ascend AI Processor. [Figure 1](#fig759451810422) shows the model porting process.
@@ -145,12 +145,12 @@

Model selection

-For details, see Model Selection.
+For details, see Model Selection.

Model porting evaluation

-For details, see Model Porting Evaluation.
+For details, see Model Porting Evaluation.

Operator development

@@ -160,17 +160,17 @@ Model porting refers to moving models that have been implemented in the open-sou

Environment setup

-For details, see Environment Setup.
+For details, see Environment Setup.

Model porting

-For details, see Model Porting.
+For details, see Model Porting.

Model training

-For details, see Model Training.
+For details, see Model Training.

Error analysis

@@ -180,17 +180,17 @@ Model porting refers to moving models that have been implemented in the open-sou

Performance analysis and optimization

-For details, see Performance Optimization and Analysis.
+For details, see Performance Optimization and Analysis.

Precision commissioning

-For details, see Precision Commissioning.
+For details, see Precision Commissioning.

Model saving and conversion

-For details, see Model Saving and Conversion and "ATC Tool Instructions" in the CANN Auxiliary Development Tool User Guide.
+For details, see Model Saving and Conversion and "ATC Tool Instructions" in the CANN Auxiliary Development Tool User Guide.

Application software development

@@ -200,48 +200,48 @@ Model porting refers to moving models that have been implemented in the open-sou

FAQs

-Describes how to prepare the environment, port models, commission models, and resolve other common problems. For details, see FAQs.
+Describes how to prepare the environment, port models, commission models, and resolve other common problems. For details, see FAQs.

-<h2 id="model-porting-evaluation.md">Model Porting Evaluation</h2>
+<h2 id="model-porting-evaluationmd">Model Porting Evaluation</h2>

1. When selecting models, select authoritative PyTorch models as benchmarks, including but not limited to PyTorch \([example](https://github.com/pytorch/examples/tree/master/imagenet)/[vision](https://github.com/pytorch/vision)\), facebookresearch \([Detectron](https://github.com/facebookresearch/Detectron)/[detectron2](https://github.com/facebookresearch/detectron2)\), and open-mmlab \([mmdetection](https://github.com/open-mmlab/mmdetection)/[mmpose](https://github.com/open-mmlab/mmpose)\).
-2. Check the operator adaptation. Before porting the original model and training script to an Ascend AI Processor, train the original model and training script on the CPU, obtain the operator information by using the dump op method, and compare the operator information with that in the _PyTorch Operator Support_ to check whether the operator is supported. For details about the dump op method, see [dump op Method](#dump-op-method.md). If an operator is not supported, develop the operator. For details, see the _PyTorch Operator Development Guide_.
+2. Check the operator adaptation. Before porting the original model and training script to an Ascend AI Processor, train the original model and training script on the CPU, obtain the operator information by using the dump op method, and compare the operator information with that in the _PyTorch Operator Support_ to check whether the operator is supported. For details about the dump op method, see [dump op Method](#dump-op-methodmd). If an operator is not supported, develop the operator. For details, see the _PyTorch Operator Development Guide_.

    >![](public_sys-resources/icon-note.gif) **NOTE:**
    >You can also port the model and training script to the Ascend AI Processor for training to view the error information. For details about how to port the model and training script, see the following sections. Generally, a message is displayed, indicating that an operator \(the first operator that is not supported\) cannot run in the backend of the Ascend AI Processor.

-<h2 id="environment-setup.md">Environment Setup</h2>
+<h2 id="environment-setupmd">Environment Setup</h2>

Refer to the _PyTorch Installation Guide_ to install PyTorch and the mixed precision module, and configure required environment variables.

-<h2 id="model-porting.md">Model Porting</h2>
+<h2 id="model-portingmd">Model Porting</h2>

-- **[Tool-Facilitated](#tool-facilitated.md)**
+- **[Tool-Facilitated](#tool-facilitatedmd)**

-- **[Manual](#manual.md)**
+- **[Manual](#manualmd)**

-- **[Mixed Precision](#mixed-precision.md)**
+- **[Mixed Precision](#mixed-precisionmd)**

-<h3 id="tool-facilitated.md">Tool-Facilitated</h3>
+<h3 id="tool-facilitatedmd">Tool-Facilitated</h3>

The Ascend platform provides a script conversion tool to enable you to port training scripts to Ascend AI Processors using commands. The following will provide the details. In addition to using commands, you can also use the PyTorch GPU2Ascend function integrated in MindStudio to port scripts. For details, see the _MindStudio User Guide_.

-- **[Introduction](#introduction.md)**
+- **[Introduction](#introductionmd)**

-- **[Instructions](#instructions.md)**
+- **[Instructions](#instructionsmd)**

-- **[Result Analysis](#result-analysis.md)**
+- **[Result Analysis](#result-analysismd)**

-<h4 id="introduction.md">Introduction</h4>
+<h4 id="introductionmd">Introduction</h4>

-## Overview
+##### Overview

Ascend NPU is an up-and-comer in the AI computing field, but most training and online inference scripts are based on GPUs. Due to the architecture differences between NPUs and GPUs, GPU-based training and online inference scripts cannot be directly used on NPUs. The script conversion tool provides an automated method for converting GPU-based scripts into NPU-based scripts, reducing the learning cost and workload of manual script migration, thereby improving the migration efficiency.
@@ -538,17 +538,17 @@

-## System Requirement
+##### System Requirement

msFmkTransplt runs on Ubuntu 18.04, CentOS 7.6, and EulerOS 2.8 only.

-## Environment Setup
+##### Environment Setup

Set up the development environment by referring to the _CANN Software Installation Guide_.

-<h4 id="instructions.md">Instructions</h4>
+<h4 id="instructionsmd">Instructions</h4>

-## Command-line Options
+##### Command-line Options

**Table 1** Command-line options
@@ -596,7 +596,7 @@

-## Customizing a Rule File
+##### Customizing a Rule File

An example of a custom conversion rule is as follows:
@@ -689,7 +689,7 @@

-## Performing Conversion
+##### Performing Conversion

1. Go to the directory of the script conversion tool msFmkTransplt.

3. Find the converted script in the specified output path.

-<h4 id="result-analysis.md">Result Analysis</h4>
+<h4 id="result-analysismd">Result Analysis</h4>

You can view the result files in the output path when the script is converted.
@@ -716,16 +716,16 @@

```
│   ├── unsupported_op.xlsx        // File of the unsupported operator list
```

-<h3 id="manual.md">Manual</h3>
+<h3 id="manualmd">Manual</h3>

-- **[Single-Device Training Model Porting](#single-device-training-model-porting.md)**
+- **[Single-Device Training Model Porting](#single-device-training-model-portingmd)**

-- **[Multi-Device Training Model Porting](#multi-device-training-model-porting.md)**
+- **[Multi-Device Training Model Porting](#multi-device-training-model-portingmd)**

-- **[PyTorch-related API Replacement](#pytorch-related-api-replacement.md)**
+- **[PyTorch-related API Replacement](#pytorch-related-api-replacementmd)**

-<h4 id="single-device-training-model-porting.md">Single-Device Training Model Porting</h4>
+<h4 id="single-device-training-model-portingmd">Single-Device Training Model Porting</h4>

The advantage of the online adaptation is that the training on the Ascend AI Processor is consistent with the usage of the GPU. During online adaptation, **you only need to specify the device as the Ascend AI Processor in Python and device operations** to develop, train, and debug the network in PyTorch using the Ascend AI Processor. For single-device model training, main changes for porting are as follows:
@@ -755,9 +755,9 @@ The code ported to the Ascend AI Processor is as follows:

    target = target.to(CALCULATE_DEVICE)
    ```

-For details, see [Single-Device Training Modification](#single-device-training-modification.md).
+For details, see [Single-Device Training Modification](#single-device-training-modificationmd).

-<h4 id="multi-device-training-model-porting.md">Multi-Device Training Model Porting</h4>
+<h4 id="multi-device-training-model-portingmd">Multi-Device Training Model Porting</h4>

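A condensed sketch of the initialization described below; the backend must be "hccl" on Ascend devices, and the address, port, and rank values here are placeholders:

```python
import os
import torch
import torch.distributed as dist

os.environ["MASTER_ADDR"] = "127.0.0.1"   # placeholder rendezvous address
os.environ["MASTER_PORT"] = "29688"       # placeholder port

def setup(rank, world_size):
    dist.init_process_group(backend="hccl", world_size=world_size, rank=rank)
    torch.npu.set_device(rank)
    model = torch.nn.Linear(8, 2).npu()
    # Wrap the model only after the process group is initialized.
    return torch.nn.parallel.DistributedDataParallel(model, device_ids=[rank])
```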
To port a multi-device training model, **you need to specify the device as the Ascend AI Processor in Python and device operations**. In addition, you can perform distributed training using PyTorch **DistributedDataParallel**, that is, run **init\_process\_group** during model initialization, and then initialize the model into a **DistributedDataParallel** model. Note that the **backend** must be set to **hccl** and the initialization mode must be shielded when **init\_process\_group** is executed.
@@ -781,9 +781,9 @@ def main():

        lr_scheduler)
    ```

-For details, see [Distributed Training Modification](#distributed-training-modification.md).
+For details, see [Distributed Training Modification](#distributed-training-modificationmd).

-<h4 id="pytorch-related-api-replacement.md">PyTorch-related API Replacement</h4>
+<h4 id="pytorch-related-api-replacementmd">PyTorch-related API Replacement</h4>

1. To enable the Ascend AI Processor to use the capabilities of the PyTorch framework, the native PyTorch framework needs to be adapted at the device layer. The APIs related to the CPU and CUDA need to be replaced for external presentation. During network porting, some device-related APIs need to be replaced with the APIs related to the Ascend AI Processor. [Table 1](#table1922064517344) lists the supported device-related APIs.
@@ -982,9 +982,9 @@ For more APIs, see the _PyTorch API Support_.

-<h3 id="mixed-precision.md">Mixed Precision</h3>
+<h3 id="mixed-precisionmd">Mixed Precision</h3>

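A condensed, hedged sketch of the integration steps described below, assuming an Ascend environment with the adapted Apex installed (the opt_level and loss_scale values are examples):

```python
import torch
import torch.nn as nn
from apex import amp  # Ascend-adapted Apex

device = "npu:0"
torch.npu.set_device(device)
model = nn.Linear(8, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# combine_grad=True enables the fused-gradient path described below.
model, optimizer = amp.initialize(
    model, optimizer, opt_level="O2", loss_scale=128.0, combine_grad=True)

inputs = torch.randn(4, 8).to(device)
target = torch.randint(0, 2, (4,)).to(device)
loss = nn.CrossEntropyLoss()(model(inputs), target)
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()
optimizer.step()
```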
-## Overview
+#### Overview

Based on the architecture features of the NPU chip, mixed precision training is involved, that is, the scenario where the float16 and float32 data types are used together. Replacing float32 with float16 has the following advantages:
@@ -999,7 +999,7 @@ In addition to the preceding advantages, the mixed precision module Apex adapted

- During mixed precision calculation, Apex calculates the grad of the model. You can enable combine\_grad to accelerate these operations. Set the **combine\_grad** parameter of the amp.initialize\(\) interface to **True**.
- After the adaptation, Apex optimizes optimizers, such as adadelta, adam, sgd, and lamb to adapt them to Ascend AI Processors. As a result, the obtained NPU-based fusion optimizers are consistent with the native algorithms, but the calculation speed is faster. You only need to replace the original optimizer with **apex.optimizers.\*** \(**\*** indicates the optimizer name, for example, **NpuFusedSGD**\).

-## Supported Features
+#### Supported Features

[Table 1](#table10717173813332) describes the functions and optimization of the mixed precision module.
@@ -1039,7 +1039,7 @@ In addition to the preceding advantages, the mixed precision module Apex adapted

>- In the current version, Apex is implemented using Python and does not support AscendCL or CUDA optimization.
>- Ascend AI devices do not support the original FusedLayerNorm interface module of Apex. If the original model script file uses the FusedLayerNorm interface module, you need to replace the script header file **from apex.normalization import FusedLayerNorm** with **from torch.nn import LayerNorm**.

-## Integrating Mixed Precision Module Into the PyTorch Model
+#### Integrating Mixed Precision Module Into the PyTorch Model

1. To use the mixed precision module Apex, you need to import the amp from the Apex library as follows:
@@ -1073,40 +1073,40 @@ In addition to the preceding advantages, the mixed precision module Apex adapted

    ```

-<h2 id="model-training.md">Model Training</h2>
+<h2 id="model-trainingmd">Model Training</h2>

-After the training scripts are ported, set environment variables by following the instructions in [Environment Variable Configuration](#en-us_topic_0000001144082004.md) and run the **python3** _xxx_ command to train a model. For details, see [Script Execution](#script-execution.md).
+After the training scripts are ported, set environment variables by following the instructions in [Environment Variable Configuration](#en-us_topic_0000001144082004md) and run the **python3** _xxx_ command to train a model. For details, see [Script Execution](#script-executionmd).

>![](public_sys-resources/icon-note.gif) **NOTE:**
>When running the **python3** _xxx_ command, create a soft link between Python 3 and the installation path of Python that matches the current PyTorch version.

-<h2 id="performance-analysis-and-optimization.md">Performance Analysis and Optimization</h2>
+<h2 id="performance-analysis-and-optimizationmd">Performance Analysis and Optimization</h2>

-- **[Prerequisites](#prerequisites.md)**
+- **[Prerequisites](#prerequisitesmd)**

-- **[Commissioning Process](#commissioning-process.md)**
+- **[Commissioning Process](#commissioning-processmd)**

-- **[Affinity Library](#affinity-library.md)**
+- **[Affinity Library](#affinity-librarymd)**

-<h3 id="prerequisites.md">Prerequisites</h3>
+<h3 id="prerequisitesmd">Prerequisites</h3>

-1. Modify the open-source code to ensure that the model can run properly, including data preprocessing, forward propagation, loss calculation, mixed precision, back propagation, and parameter update. For details, see [Samples](#samples.md).
+1. Modify the open-source code to ensure that the model can run properly, including data preprocessing, forward propagation, loss calculation, mixed precision, back propagation, and parameter update. For details, see [Samples](#samplesmd).
2. During model porting, check whether the model can run properly and whether the existing operators can meet the requirements. If no operator meets the requirements, develop an adapted operator. For details, see the _PyTorch Operator Development Guide_.
3. Prioritize the single-device function, and then enable the multi-device function.

-<h3 id="commissioning-process.md">Commissioning Process</h3>
+<h3 id="commissioning-processmd">Commissioning Process</h3>

-- **[Overall Guideline](#overall-guideline.md)**
+- **[Overall Guideline](#overall-guidelinemd)**

-- **[Training Data Collection](#training-data-collection.md)**
+- **[Training Data Collection](#training-data-collectionmd)**

-- **[Host-side Performance Optimization](#host-side-performance-optimization.md)**
+- **[Host-side Performance Optimization](#host-side-performance-optimizationmd)**

-- **[Training Performance Optimization](#training-performance-optimization.md)**
+- **[Training Performance Optimization](#training-performance-optimizationmd)**

-<h4 id="overall-guideline.md">Overall Guideline</h4>
+<h4 id="overall-guidelinemd">Overall Guideline</h4>

1. Check whether the throughput meets the expected requirements based on the training execution result.
2. If the throughput does not meet requirements, you need to find out the causes of the performance bottleneck. Possible causes are as follows:
@@ -1117,9 +1117,9 @@

3. Analyze the preceding causes of performance bottlenecks and optimize the performance.

-<h4 id="training-data-collection.md">Training Data Collection</h4>
+<h4 id="training-data-collectionmd">Training Data Collection</h4>

-## Profile Data Collection
+##### Profile Data Collection

During model training, if the throughput does not meet requirements, you can collect profile data generated during the training process to analyze which step and which operator cause the performance consumption. The profile data is collected at the PyTorch layer \(PyTorch API data\) and CANN layer \(TBE operator data\).
@@ -1170,7 +1170,7 @@ Select a collection mode based on the site requirements and perform the following

-## Obtaining Operator Information \(OP\_INFO\)
+##### Obtaining Operator Information \(OP\_INFO\)

The network model is executed as an operator \(OP\). The OPInfo log can be used to obtain the operator and its attributes during the actual execution. Obtain the information by running the **get\_ascend\_op\_info.py** script.
@@ -1237,20 +1237,20 @@ The network model is executed as an operator \(OP\). The OPInfo log can be used

6. Analyze the extra tasks in TaskInfo, especially transdata.

-<h4 id="host-side-performance-optimization.md">Host-side Performance Optimization</h4>
+<h4 id="host-side-performance-optimizationmd">Host-side Performance Optimization</h4>

-- **[Overview](#overview-0.md)**
+- **[Overview](#overview-0md)**

-- **[Changing the CPU Performance Mode \(x86 Server\)](#changing-the-cpu-performance-mode-(x86-server).md)**
+- **[Changing the CPU Performance Mode \(x86 Server\)](#changing-the-cpu-performance-mode-(x86-server)md)**

-- **[Changing the CPU Performance Mode \(ARM Server\)](#changing-the-cpu-performance-mode-(arm-server).md)**
+- **[Changing the CPU Performance Mode \(ARM Server\)](#changing-the-cpu-performance-mode-(arm-server)md)**

-- **[Installing the High-Performance Pillow Library \(x86 Server\)](#installing-the-high-performance-pillow-library-(x86-server).md)**
+- **[Installing the High-Performance Pillow Library \(x86 Server\)](#installing-the-high-performance-pillow-library-(x86-server)md)**

-- **[\(Optional\) Installing the OpenCV Library of the Specified Version](#(optional)-installing-the-opencv-library-of-the-specified-version.md)**
+- **[\(Optional\) Installing the OpenCV Library of the Specified Version](#(optional)-installing-the-opencv-library-of-the-specified-versionmd)**

-<h5 id="overview-0.md">Overview</h5>
+<h5 id="overview-0md">Overview</h5>
During PyTorch model porting and training, the number of images recognized within one second \(FPS\) for some network models is low and the performance does not meet the requirements. You can perform the following optimization on the server to improve the training performance: @@ -1258,9 +1258,9 @@ During PyTorch model porting and training, the number of images recognized withi - Install the high-performance Pillow library. - \(Optional\) Install the OpenCV library of the specified version. -

Changing the CPU Performance Mode \(x86 Server\)

+
Changing the CPU Performance Mode (x86 Server)
-## Setting the Power Policy to High Performance +###### Setting the Power Policy to High Performance To improve network performance, you need to set the power policy to high performance in the BIOS settings of the x86 server. The detailed operations are as follows: @@ -1287,7 +1287,7 @@ To improve network performance, you need to set the power policy to high perform 6. Press **F10** to save the settings and reboot the server. -## Setting the CPU Mode to Performance +###### Setting the CPU Mode to Performance Perform the following steps as the **root** user: @@ -1365,9 +1365,9 @@ Perform the following steps as the **root** user: 4. Perform [Step 1](#li158435131344) again to check whether the current CPU mode is set to performance. -

Changing the CPU Performance Mode \(ARM Server\)

+
Changing the CPU Performance Mode (ARM Server)
-## Setting the Power Policy to High Performance +###### Setting the Power Policy to High Performance Some models that have demanding requirements on the CPUs on the host, for example, the object detection model, require complex image pre-processing. Enabling the high-performance mode of the power supply can improve performance and stability. To improve network performance, you need to set the power policy to high performance in the BIOS settings of the ARM server. The detailed operations are as follows: @@ -1394,7 +1394,7 @@ Some models that have demanding requirements on the CPUs on the host, for exampl 6. Press **F10** to save the settings and reboot the server. -

Installing the High-Performance Pillow Library \(x86 Server\)

+
Installing the High-Performance Pillow Library (x86 Server)
1. Run the following command to install the dependencies for the high-performance pillow library: @@ -1432,7 +1432,7 @@ Some models that have demanding requirements on the CPUs on the host, for exampl >``` -3. Modify the TorchVision code to solve the problem that the pillow-simd does not contain the **PILLOW\_VERSION** field. For details about how to install TorchVision, see [How to Obtain](#obtaining-samples.md). +3. Modify the TorchVision code to solve the problem that the pillow-simd does not contain the **PILLOW\_VERSION** field. For details about how to install TorchVision, see [How to Obtain](#obtaining-samplesmd). Modify the code in line 5 of **/usr/local/python3._x.x_/lib/python3._x_/site-packages/torchvision/transforms/functional.py** as follows: @@ -1445,36 +1445,36 @@ Some models that have demanding requirements on the CPUs on the host, for exampl ``` -
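The replacement code above depends on the torchvision version; the following is a minimal sketch of the idea, assuming the affected line originally imports **PILLOW\_VERSION** from PIL \(pillow-simd follows newer Pillow in dropping that constant\):

```
# Assumed original line 5 of torchvision/transforms/functional.py:
#   from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
# pillow-simd no longer exports PILLOW_VERSION, so fall back to __version__:
try:
    from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
except ImportError:
    from PIL import Image, ImageOps, ImageEnhance
    from PIL import __version__ as PILLOW_VERSION
```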

\(Optional\) Installing the OpenCV Library of the Specified Version

+
(Optional) Installing the OpenCV Library of the Specified Version
If the model depends on OpenCV, you are advised to install OpenCV 3.4.10 to ensure training performance. 1. Source code: [Link](https://opencv.org/releases/) 2. Installation guide: [Link](https://docs.opencv.org/3.4.10/d7/d9f/tutorial_linux_install.html) -

Training Performance Optimization

+

Training Performance Optimization

-## Operator Bottleneck Optimization +##### Operator Bottleneck Optimization -1. Obtain the profile data during training. For details, see [Profile Data Collection](#training-data-collection.md). +1. Obtain the profile data during training. For details, see [Profile Data Collection](#training-data-collectionmd). 2. Analyze the profile data to obtain the time-consuming operator. -3. See [Single-Operator Sample Building](#single-operator-sample-building.md) to build the single-operator sample of the time-consuming operator, and compare the execution time of a single-operator sample on the CPU and GPU. If the performance is insufficient, use either of the following methods to solve the problem: +3. See [Single-Operator Sample Building](#single-operator-sample-buildingmd) to build the single-operator sample of the time-consuming operator, and compare the execution time of a single-operator sample on the CPU and GPU. If the performance is insufficient, use either of the following methods to solve the problem: - Workaround: Use other efficient operators with the same semantics. - Solution: Improve the operator performance. -## Copy Bottleneck Optimization +##### Copy Bottleneck Optimization -1. Obtain the profile data during training. For details, see [Profile Data Collection](#training-data-collection.md). +1. Obtain the profile data during training. For details, see [Profile Data Collection](#training-data-collectionmd). 2. Analyze the Profile data to obtain the execution time of **D2DCopywithStreamSynchronize**, **PTCopy**, or **format\_contiguous** in the entire network. 3. If the execution takes a long time, use either of the following methods to solve the problem: - Method 1 \(workaround\): Replace view operators with compute operators. In PyTorch, view operators cause conversion from non-contiguous tensors to contiguous tensors. The optimization idea is to replace view operators with compute operators. Common view operators include view, permute, and transpose operators. For more view operators, go to [https://pytorch.org/docs/stable/tensor\_view.html](https://pytorch.org/docs/stable/tensor_view.html). - Method 2 \(solution\): Accelerate the operation of converting non-contiguous tensors to contiguous tensors. -## Framework Bottleneck Optimization +##### Framework Bottleneck Optimization -1. Obtain the operator information \(OP\_INFO\) during the training. For details, see [Obtaining Operator Information \(OP\_INFO\)](#training-data-collection.md). +1. Obtain the operator information \(OP\_INFO\) during the training. For details, see [Obtaining Operator Information \(OP\_INFO\)](#training-data-collectionmd). 2. Analyze the specifications and calling relationship of operators in OP\_INFO to check whether redundant operators are inserted. Pay special attention to check whether transdata is proper. 3. Solution: Specify the initialization format of some operators to eliminate cast operators. 4. In **pytorch/torch/nn/modules/module.py**, specify the operator initialization format in **cast\_weight**, as shown in the following figure. @@ -1487,28 +1487,28 @@ If the model depends on OpenCV, you are advised to install OpenCV 3.4.10 to ensu - For the linear operator, weight can be set to NZ format, for example, line 409. -## Compilation Bottleneck Optimization +##### Compilation Bottleneck Optimization -1. Obtain the operator information \(OP\_INFO\) during the training. For details, see [Obtaining Operator Information \(OP\_INFO\)](#training-data-collection.md). +1. 
Obtain the operator information \(OP\_INFO\) during the training. For details, see [Obtaining Operator Information \(OP\_INFO\)](#training-data-collectionmd). 2. View the INFO log and check the keyword **aclopCompile::aclOp** after the first step. If **Match op inputs/type failed** or **To compile op** is displayed, the operator is dynamically compiled and needs to be optimized. 3. Use either of the following methods to solve the problem: - Workaround: Based on the understanding of model semantics and related APIs, replace dynamic shape with static shape. - Solution: Reduce compilation or do not compile the operator. - - For details about how to optimize the operator compilation configuration, see [Compilation Option Settings](#compilation-option-settings.md). + - For details about how to optimize the operator compilation configuration, see [Compilation Option Settings](#compilation-option-settingsmd). -
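To make the copy bottleneck above concrete, a small device-independent illustration of how a view operator yields a non-contiguous tensor whose materialization costs a copy \(shapes are illustrative\):

```
import torch

x = torch.randn(2, 3, 4)
y = x.transpose(1, 2)      # view operator: only strides change, no data moves
print(y.is_contiguous())   # False: layout no longer matches the strides
z = y.contiguous()         # materializes a contiguous copy (the copy cost)
print(z.is_contiguous())   # True
```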

Affinity Library

+

Affinity Library

-- **[Source](#source.md)** +- **[Source](#sourcemd)** -- **[Functions](#functions.md)** +- **[Functions](#functionsmd)** -

Source

+

Source

The common network structures and functions in the public models are optimized to greatly improve computing performance. In addition, the network structures and functions are integrated into the PyTorch framework to facilitate model performance optimization. -

Functions

+

Functions

Function

@@ -1553,30 +1553,30 @@ The common network structures and functions in the public models are optimized t >![](public_sys-resources/icon-note.gif) **NOTE:** >The optimization content will be enhanced and updated with the version. Use the content in the corresponding path of the actual PyTorch version. -

Precision Commissioning

+

Precision Commissioning

-- **[Prerequisites](#prerequisites-1.md)** +- **[Prerequisites](#prerequisites-1md)** -- **[Commissioning Process](#commissioning-process-2.md)** +- **[Commissioning Process](#commissioning-process-2md)** -

Prerequisites

+

Prerequisites

Run a certain number of epochs \(20% of the total number of epochs is recommended\) with the same semantics and hyperparameters to align the precision and loss with the corresponding level of the GPU. After the alignment is complete, align the final precision.

-

Commissioning Process

+

Commissioning Process

-- **[Overall Guideline](#overall-guideline-3.md)** +- **[Overall Guideline](#overall-guideline-3md)** -- **[Precision Tuning Methods](#precision-tuning-methods.md)** +- **[Precision Tuning Methods](#precision-tuning-methodsmd)** -

Overall Guideline

+

Overall Guideline

To locate the precision problem, you need to find out the step in which the problem occurs. The following aspects are involved:

1. Model network calculation error
   - Locating method: Add a hook to the network to determine which part is suspected. Then build a [single-operator sample](#single-operator-sample-buildingmd) to narrow down the error range. This can confirm that the operator calculation is incorrect in the current network. Compare the result with the CPU or GPU result to verify the problem.

   - Workaround: Use other operators with the same semantics.

@@ -1605,22 +1605,22 @@

-
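As a sketch of the hook-based narrowing described above \(the stand-in network and the NaN/Inf criterion are illustrative; substitute the check that matches the suspected error\):

```
import torch
import torch.nn as nn

def make_hook(name):
    def hook(module, inputs, output):
        # Flag non-finite activations as evidence of a miscalculating layer.
        if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
            print(f"suspect layer: {name} ({module.__class__.__name__})")
    return hook

model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
for name, module in model.named_modules():
    if not list(module.children()):          # register on leaf modules only
        module.register_forward_hook(make_hook(name))

model(torch.randn(8, 16))
```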

Precision Tuning Methods

+

Precision Tuning Methods

General model precision problems are as follows: the training loss does not converge or the precision does not meet requirements due to operator overflow/underflow, or the network-wide training precision does not meet requirements. You can perform single-operator overflow/underflow detection and network-wide commissioning to resolve these problems.

-- **[Single-Operator Overflow/Underflow Detection](#single-operator-overflow-underflow-detectionmd)**

-- **[Network-wide Commissioning](#network-wide-commissioningmd)**

-

Single-Operator Overflow/Underflow Detection

+
Single-Operator Overflow/Underflow Detection
With this function, you can check whether an operator overflows/underflows and collect data of overflowed/underflowed operators, helping developers quickly locate and solve operator precision problems.

-## Restrictions
+###### Restrictions

- Install the HDF5 tool to support the operator dump function. For details about how to install the tool, see [HDF5 Compilation and Installation](#hdf5-compilation-and-installationmd).
- This function provides IR-level operator overflow/underflow detection only for the AI Core \(not Atomic\).
- Add the **USE\_DUMP=1** field to the **build.sh** file of the PyTorch source code.

@@ -1633,7 +1633,7 @@

- When using the single-operator overflow/underflow detection function, do not enable the dynamic loss scale mode of apex and the tensor fusion function at the same time.

-## Collecting Data of Overflowed/Underflowed Operators
+###### Collecting Data of Overflowed/Underflowed Operators

```
# check_overflow is the overflow/underflow detection control switch.
# dump_path is the path for storing the dump file.
with torch.utils.dumper(check_overflow=check_overflow, dump_path=dump_path, load_file_path='') as dump:
    # Code snippet for model calculation
```

During model running, if an operator overflows/underflows, the name of the corresponding IR is printed.

-## Viewing Dump Data
+###### Viewing Dump Data

If dump data is collected during training, an .h5 file of the dump data is generated in the **\{dump\_path\}** directory. You can go to the directory to view the dump data.

-## Solution
+###### Solution

Send the screenshots of operator overflow/underflow and the collected .h5 file to Huawei R&D engineers as the attachment of an issue.

-
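A concrete version of the collection snippet above, with stand-in values filled in \(**torch.utils.dumper** is the Ascend-adapted interface quoted in this section, not stock PyTorch, and the model objects are placeholders\):

```
import torch
import torch.nn as nn

model = nn.Linear(16, 4)                 # placeholder for the real network
images = torch.randn(8, 16)
check_overflow = True                    # overflow/underflow detection switch
dump_path = "./overflow_dump"            # where the .h5 dump file is written

with torch.utils.dumper(check_overflow=check_overflow,
                        dump_path=dump_path, load_file_path=''):
    loss = model(images).sum()           # code snippet for model calculation
    loss.backward()
```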

Network-wide Commissioning

+
Network-wide Commissioning
You can also commission the network model precision by analyzing the entire network. @@ -1714,16 +1714,16 @@ You can also commission the network model precision by analyzing the entire netw ``` -
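A minimal sketch of one network-wide check, comparing CPU and NPU executions under a fixed seed \(the "npu" device string assumes the Ascend-adapted build; the tolerances and stand-in network are illustrative\):

```
import torch
import torch.nn as nn

torch.manual_seed(0)                     # fix randomness for comparability
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
x = torch.randn(8, 16)

cpu_out = model(x)
npu_out = model.to("npu:0")(x.to("npu:0")).cpu()   # assumes an "npu" device

print(torch.allclose(cpu_out, npu_out, rtol=1e-3, atol=1e-4))
print((cpu_out - npu_out).abs().max())   # largest element-wise deviation
```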

Model Saving and Conversion

+

Model Saving and Conversion

-- **[Introduction](#introduction-4.md)** +- **[Introduction](#introduction-4md)** -- **[Saving a Model](#saving-a-model.md)** +- **[Saving a Model](#saving-a-modelmd)** -- **[Exporting an ONNX Model](#exporting-an-onnx-model.md)** +- **[Exporting an ONNX Model](#exporting-an-onnx-modelmd)** -

Introduction

+

Introduction

After the model training is complete, save the model file and export the ONNX model by using the APIs provided by PyTorch. Then use the ATC tool to convert the model into an .om file that adapts to the Ascend AI Processor for offline inference. @@ -1735,7 +1735,7 @@ For details about how to build an offline inference application, see the _CANN ![](figures/en-us_image_0000001144082132.png) -

Saving a Model

+

Saving a Model

During PyTorch training, **torch.save\(\)** is used to save checkpoint files. Model files are saved in either of the following two formats, depending on how they will be used:

@@ -1808,13 +1808,13 @@ During PyTorch training, **torch.save\(\)** is used to save checkpoint files.

>![](public_sys-resources/icon-notice.gif) **NOTICE:**
>Generally, an operator is processed in different ways in the training graph and inference graph \(for example, BatchNorm and dropout operators\), and the input formats are also different. Therefore, before inference or ONNX model exporting, **model.eval\(\)** must be called to set the dropout and batch normalization layers to the inference mode.

-
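A minimal sketch of the two formats \(the file names, epoch value, and the use of a torchvision model are illustrative\):

```
import torch
import torchvision.models as models

model = models.resnet50()

# Format 1: weights only (.pth/.pt), suitable for inference and ONNX export.
torch.save(model.state_dict(), "resnet50_weights.pth")

# Format 2: checkpoint (.pth.tar), keeps the state needed to resume training.
torch.save({
    "epoch": 10,
    "arch": "resnet50",
    "state_dict": model.state_dict(),
    # "optimizer": optimizer.state_dict(),  # usually saved as well
}, "checkpoint.pth.tar")
```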

Exporting an ONNX Model

+

Exporting an ONNX Model

-## Introduction +#### Introduction The deployment policy of the Ascend AI Processor for PyTorch models is implemented based on the ONNX module that is supported by PyTorch. ONNX is a mainstream model format in the industry and is widely used for model sharing and deployment. This section describes how to export a checkpoint file as an ONNX model by using the **torch.onnx.export\(\)** API. -## Using the .pth or .pt File to Export the ONNX Model +#### Using the .pth or .pt File to Export the ONNX Model The saved .pth or .pt file can be restored by building a model using PyTorch and then loading the weight. Then you can export the ONNX model. The following is an example. @@ -1856,7 +1856,7 @@ if __name__ == "__main__": >- The model in the sample script comes from the definition in the torchvision module. You need to specify a model when using your own model. >- The constructed input and output must correspond to the input and output during training. Otherwise, the inference cannot be performed properly. -## Using the .pth.tar File to Export the ONNX Model +#### Using the .pth.tar File to Export the ONNX Model Before exporting the ONNX model using the .pth.tar file, you need to check the saved information. Sometimes, the saved node name may be different from the node name in the model definition. For example, a prefix and suffix may be added. During the conversion, you can modify the node name. The following is an example of the conversion. @@ -1895,25 +1895,25 @@ if __name__ == "__main__": convert() ``` -

Samples

+

Samples

-- **[ResNet-50 Model Porting](#resnet-50-model-porting.md)** +- **[ResNet-50 Model Porting](#resnet-50-model-portingmd)** -- **[ShuffleNet Model Optimization](#shufflenet-model-optimization.md)** +- **[ShuffleNet Model Optimization](#shufflenet-model-optimizationmd)** -

ResNet-50 Model Porting

+

ResNet-50 Model Porting

-- **[Obtaining Samples](#obtaining-samples.md)** +- **[Obtaining Samples](#obtaining-samplesmd)** -- **[Porting the Training Script](#porting-the-training-script.md)** +- **[Porting the Training Script](#porting-the-training-scriptmd)** -- **[Script Execution](#script-execution.md)** +- **[Script Execution](#script-executionmd)** -

Obtaining Samples

+

Obtaining Samples

-## How to Obtain
+##### How to Obtain

1. This sample is adapted from the ImageNet dataset training model provided on the PyTorch official website and has been ported and reconstructed for the Ascend 910 AI Processor. The sample can be obtained from [https://github.com/pytorch/examples/tree/master/imagenet](https://github.com/pytorch/examples/tree/master/imagenet).
2. This sample depends on torchvision. Therefore, you need to install the torchvision dependency. If you install it as a non-root user, add **--user** to the end of the command.

@@ -1941,7 +1941,7 @@ if __name__ == "__main__":

   >![](public_sys-resources/icon-note.gif) **NOTE:**
   >ResNet-50 is a model built in PyTorch. For more built-in models, visit the [PyTorch official website](https://pytorch.org/).

   2. During script execution, set **arch** to **resnet50**. This method is used in the sample. For details, see [Script Execution](#script-executionmd).

      ```
      --arch resnet50
      ```

-## Directory Structure
+##### Directory Structure

The structure of major directories and files is as follows:

```
├──main.py
```

-

Porting the Training Script

+

Porting the Training Script

-- **[Single-Device Training Modification](#single-device-training-modification.md)** +- **[Single-Device Training Modification](#single-device-training-modificationmd)** -- **[Distributed Training Modification](#distributed-training-modification.md)** +- **[Distributed Training Modification](#distributed-training-modificationmd)** -

Single-Device Training Modification

+
Single-Device Training Modification
1. Add the import statements to **main.py** to support model training on the Ascend 910 AI Processor based on the PyTorch framework.

   ```

   ```

@@ -2107,7 +2107,7 @@ The structure of major directories and files is as follows:

   ```

-
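The elided modifications follow one pattern: pick an NPU device, then move the model and data to it. A sketch under the assumption that the Ascend-adapted build exposes the **torch.npu** module and "npu" device strings \(the stand-in network and shapes are illustrative\):

```
import torch
import torch.nn as nn
# Assumption: the Ascend-adapted build provides torch.npu and "npu" devices.

CALCULATE_DEVICE = "npu:0"
torch.npu.set_device(CALCULATE_DEVICE)

model = nn.Linear(16, 4).to(CALCULATE_DEVICE)    # stand-in for the network
images = torch.randn(8, 16).to(CALCULATE_DEVICE, non_blocking=True)
output = model(images)
```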

Distributed Training Modification

+
Distributed Training Modification
1. Add the import statements to **main.py** to support mixed-precision model training on the Ascend 910 AI Processor based on the PyTorch framework.

   ```

   ```

@@ -2458,17 +2458,17 @@ The structure of major directories and files is as follows:

   ```

-
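The distributed changes likewise reduce to initializing an HCCL process group and wrapping the model; a sketch assuming the Ascend build registers the "hccl" backend \(stock PyTorch would use "nccl"\) and that the usual MASTER\_ADDR/MASTER\_PORT environment variables are set by the launcher:

```
import torch
import torch.distributed as dist
import torch.nn as nn

# Assumption: "hccl" is the Ascend collective backend; rank and world_size
# normally come from the launcher rather than being hard-coded.
dist.init_process_group(backend="hccl", init_method="env://",
                        world_size=1, rank=0)
local_device = "npu:0"
torch.npu.set_device(local_device)

model = nn.Linear(16, 4).to(local_device)
model = nn.parallel.DistributedDataParallel(model, device_ids=[0])
```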

Script Execution

+

Script Execution

-## Preparing a Dataset +##### Preparing a Dataset Prepare a dataset and upload it to a directory in the operating environment, for example, **/home/data/resnet50/imagenet**. -## Configuring Environment Variables +##### Configuring Environment Variables -For details, see [Environment Variable Configuration](#en-us_topic_0000001144082004.md). +For details, see [Environment Variable Configuration](#en-us_topic_0000001144082004md). -## Command +##### Command Example: @@ -2509,20 +2509,20 @@ python3 main.py /home/data/resnet50/imagenet --addr='1.1.1.1' \ # >![](public_sys-resources/icon-note.gif) **NOTE:** >**dist-backend** must be set to **hccl** to support distributed training on the Ascend AI device. -

ShuffleNet Model Optimization

+

ShuffleNet Model Optimization

-- **[Obtaining Samples](#obtaining-samples-5.md)** +- **[Obtaining Samples](#obtaining-samples-5md)** -- **[Model Evaluation](#model-evaluation.md)** +- **[Model Evaluation](#model-evaluationmd)** -- **[Porting the Network](#porting-the-network.md)** +- **[Porting the Network](#porting-the-networkmd)** -- **[Commissioning the Network](#commissioning-the-network.md)** +- **[Commissioning the Network](#commissioning-the-networkmd)** -

Obtaining Samples

+

Obtaining Samples

-## How to Obtain
+##### How to Obtain

1. This sample is adapted from the ImageNet dataset training model provided on the PyTorch official website and has been ported and reconstructed for the Ascend 910 AI Processor. The sample can be obtained from [https://github.com/pytorch/examples/tree/master/imagenet](https://github.com/pytorch/examples/tree/master/imagenet).
2. For details about the ShuffleNet model, see [ShuffleNet V2](https://pytorch.org/hub/pytorch_vision_shufflenet_v2/) on the PyTorch official website. Set the **arch** parameter to **shufflenet\_v2\_x1\_0** during script execution.

@@ -2535,7 +2535,7 @@ python3 main.py /home/data/resnet50/imagenet --addr='1.1.1.1' \   #

   >![](public_sys-resources/icon-note.gif) **NOTE:**
   >ShuffleNet is a model built in PyTorch. For more built-in models, visit the [PyTorch official website](https://pytorch.org/).

-## Directory Structure
+##### Directory Structure

The structure of major directories and files is as follows:

```
├──main.py
```

-

Model Evaluation

+

Model Evaluation

Model evaluation focuses on operator adaptation. Use the dump op method to obtain the ShuffleNet operator information and compare the information with that in the _PyTorch Operator Support_. If an operator is not supported, in simple scenarios, you can replace the operator with a similar operator or place the operator on the CPU to avoid this problem. In complex scenarios, operator development is required. For details, see the _PyTorch Operator Development Guide_. -

Porting the Network

+

Porting the Network

-For details about how to port the training scripts, see [Single-Device Training Modification](#single-device-training-modification.md) and [Distributed Training Modification](#distributed-training-modification.md). During the script execution, select the **--arch shufflenet\_v2\_x1\_0** parameter. +For details about how to port the training scripts, see [Single-Device Training Modification](#single-device-training-modificationmd) and [Distributed Training Modification](#distributed-training-modificationmd). During the script execution, select the **--arch shufflenet\_v2\_x1\_0** parameter. -

Commissioning the Network

+

Commissioning the Network

For details about how to commission the network, see [Commissioning Process](#commissioning-processmd). After checking, it is found that operators consume too much time when ShuffleNet runs. The following provides the time consumption data and solutions.

-## Forward check
+##### Forward check

The forward check record table is as follows:

@@ -2622,13 +2622,13 @@ The forward check record table is as follows:

The details are as follows:

-- The native **torch.transpose\(x, 1, 2\).contiguous\(\)** uses the view operator transpose, which produces non-contiguous tensors. As described in the [copy bottleneck optimization](#training-performance-optimizationmd), **channel\_shuffle\_index\_select** is used to replace the framework operator with a compute operator of the same semantics, reducing the time consumption.
- ShuffleNet V2 contains a large number of chunk operations, and chunk operations are framework operators in PyTorch. As a result, a tensor is split into several non-contiguous tensors of the same length. The operation of converting non-contiguous tensors to contiguous tensors takes a long time. Therefore, the compute operator is used to eliminate non-contiguous tensors. For details, see the copy bottleneck described in the [copy bottleneck optimization](#training-performance-optimizationmd).
- During operator adaptation, the output format is specified as the input format by default. However, Concat does not support the 5HD format whose C dimension is not an integral multiple of 16, so it converts the format into 4D for processing. In addition, Concat is followed by the GatherV2 operator, which supports only the 4D format. Therefore, the data format conversion process is 5HD \> 4D \> Concat \> 5HD \> 4D \> GatherV2 \> 5HD. The solution is to modify the Concat output format. When the output format is not an integer multiple of 16, the specified output format is 4D. After the optimization, the data format conversion process is 5HD \> 4D \> Concat \> GatherV2 \> 5HD. For details about the method for ShuffleNet, see line 121 in **pytorch/aten/src/ATen/native/npu/CatKernelNpu.cpp**. 
-- Set the weight initialization format to avoid repeated transdata during calculation, for example, the framework bottleneck described in the [copy bottleneck optimization](#training-performance-optimizationmd).
- The output format of the DWCONV weight is rectified to avoid the unnecessary conversion from 5HD to 4D.

-## Entire Network Check
+##### Entire Network Check

The record table of the entire network check is as follows:

@@ -2815,7 +2815,7 @@ The details are as follows:

15. After using the GatherV3 operator optimized for the ShuffleNet V2 scenario, the overall performance can be further improved.

-## Python Optimization Details
+##### Python Optimization Details

The optimization on the Python side improves the affinity of the network for the NPU by modifying equivalent semantics. The current operations of converting non-contiguous tensors to contiguous tensors can be a performance bottleneck. The **channel\_shuffle** operation in ShuffleNet V2 involves such conversion operations after permute, causing poor performance of the entire network. The performance of the entire network can be greatly improved by modifying the equivalent semantics of the **channel\_shuffle** operation and combining it with the concat operation. The torchvision version is used. For details, go to [open source link](https://github.com/pytorch/vision/blob/master/torchvision/models/shufflenetv2.py).

@@ -3050,26 +3050,26 @@ for group in [2, 4, 8]:

```

-
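The elided code implements the idea with **index\_select**; a self-contained sketch of the equivalent semantics, checked against the view/transpose formulation \(the helper below is an illustrative reconstruction, not the exact repository code\):

```
import torch

def channel_shuffle_index_select(x, groups=2):
    # Same semantics as view + transpose channel shuffle, but expressed with
    # index_select so no non-contiguous intermediate tensor is produced.
    n, c, h, w = x.shape
    index = torch.arange(c).view(groups, c // groups).t().reshape(-1)
    return x.index_select(1, index)

x = torch.randn(2, 8, 4, 4)
ref = x.view(2, 2, 4, 4, 4).transpose(1, 2).contiguous().view(2, 8, 4, 4)
print(torch.equal(channel_shuffle_index_select(x, groups=2), ref))  # True
```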

References

+

References

-- **[Single-Operator Sample Building](#single-operator-sample-building.md)** +- **[Single-Operator Sample Building](#single-operator-sample-buildingmd)** -- **[Single-Operator Dump Method](#single-operator-dump-method.md)** +- **[Single-Operator Dump Method](#single-operator-dump-methodmd)** -- **[Common Environment Variables](#common-environment-variables.md)** +- **[Common Environment Variables](#common-environment-variablesmd)** -- **[dump op Method](#dump-op-method.md)** +- **[dump op Method](#dump-op-methodmd)** -- **[Compilation Option Settings](#compilation-option-settings.md)** +- **[Compilation Option Settings](#compilation-option-settingsmd)** -- **[How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0.md)** +- **[How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0md)** -- **[HDF5 Compilation and Installation](#hdf5-compilation-and-installation.md)** +- **[HDF5 Compilation and Installation](#hdf5-compilation-and-installationmd)** -

Single-Operator Sample Building

+

Single-Operator Sample Building

-When a problem occurs in a model, it is costly to reproduce the problem in the entire network. You can build a single-operator sample to reproduce the precision or performance problem to locate and solve the problem. A single-operator sample can be built in either of the following ways: For details about single-operator dump methods, see [Single-Operator Dump Method](#single-operator-dump-method.md).
+When a problem occurs in a model, it is costly to reproduce the problem in the entire network. You can build a single-operator sample to reproduce the precision or performance problem, and then locate and solve it. A single-operator sample can be built in either of the following ways. For details about single-operator dump methods, see [Single-Operator Dump Method](#single-operator-dump-methodmd).

1. Build a single-operator sample test case. You can directly call the operator to reproduce the error scenario.

@@ -3163,9 +3163,9 @@

   ```

-
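For way 1, a minimal sketch of a direct single-operator test case \(the operator, shape, and tolerances are illustrative; "npu" assumes the Ascend-adapted build\):

```
import torch

# Reproduce the suspect operator with the exact shape/dtype seen in the net.
x = torch.randn(16, 64, 7, 7)
cpu_out = torch.max_pool2d(x, kernel_size=3, stride=2)

dev = "npu:0"                              # assumption: Ascend device string
dev_out = torch.max_pool2d(x.to(dev), kernel_size=3, stride=2).cpu()

print(torch.allclose(cpu_out, dev_out, rtol=1e-3, atol=1e-4))
```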

Single-Operator Dump Method

+

Single-Operator Dump Method

-## Collecting Dump Data +#### Collecting Dump Data Currently, the PyTorch adapted to Ascend AI Processors uses the init\_dump\(\), set\_dump\(\), and finalize\_dump\(\) interfaces in **torch.npu** to collect operator dump data. The init\_dump\(\) interface initializes the dump configuration, invokes the set\_dump\(\) interface to import the configuration file to configure dump parameters, and invokes the finalize\_dump interface to end the dump. The following uses the add\_ operator as an example to describe how to collect dump data. @@ -3229,7 +3229,7 @@ The fields of **dump.json** are described as follows.
-## Viewing Overflowed Data
+#### Viewing Overflowed Data

The collected dump data is generated in the _\{dump\_path\}_**/**_\{time\}_**/**_\{deviceid\}_**/**_\{model\_id\}_**/**_\{data\_index\}_ directory, for example, **/home/HwHiAiUser/output/20200808163566/0/0**.

The fields in the dump data path and file are described as follows:

@@ -3242,7 +3242,7 @@ The fields in the dump data path and file are described as follows:

- **_model\_id_**: subgraph ID
- A dump file is named as: _\{op\_type\}_._\{op\_name\}_._\{taskid\}_._\{stream\_id\}_._\{timestamp\}_. Any period \(.\), slash \(/\), backslash \(\\\), or space in the _op\_type_ or _op\_name_ field is replaced by an underscore \(\_\).

-## Parse the dump file of an overflow operator.
+#### Parsing the Dump File of an Overflow Operator

1. Upload the **_\{op\_type\}.\{op\_name\}.\{taskid\}.\{stream\_id\}.\{timestamp\}_** file to the environment with CANN installed.
2. Go to the path where the parsing script is stored. Assume that the installation directory of the CANN is **/home/HwHiAiUser/Ascend**.

@@ -3271,7 +3271,7 @@

The dimension and **Dtype** information no longer exist in the .txt file. For details, visit the NumPy website.

-
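To recap the collection flow of this section in one place, a sketch using the interfaces named above \(these **torch.npu** dump interfaces exist only in the Ascend-adapted build, and passing the configuration file path to **set\_dump** is an assumption based on the description\):

```
import torch

torch.npu.set_device("npu:0")
torch.npu.init_dump()                             # initialize dump configuration
torch.npu.set_dump("/home/HwHiAiUser/dump.json")  # import the dump settings

x = torch.randn(2, 3).to("npu:0")
x.add_(1.0)                                       # the add_ operator from the example

torch.npu.finalize_dump()                         # end the dump
```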

Common Environment Variables

+

Common Environment Variables

1. Enables the task delivery in multi-thread mode. When this function is enabled, the training performance of the entire network is improved in most cases. @@ -3312,7 +3312,7 @@ The fields in the dump data path and file are described as follows: **export HCCL\_WHITELIST\_DISABLE=1** -

dump op Method

+

dump op Method

1. Use the profile API to reconstruct the loss calculation and optimization process of the original code training script and print the operator information. The following is a code example. @@ -3327,7 +3327,7 @@ The fields in the dump data path and file are described as follows: 2. Train the reconstructed training script on the CPU. The related operator information is displayed. -
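A sketch of step 1, with a stand-in network in place of the original training objects \(printing the profiler object is one simple way to list the operators invoked\):

```
import torch
import torch.nn as nn

model = nn.Linear(16, 4)                  # stand-in for the original model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

with torch.autograd.profiler.profile() as prof:
    loss = model(torch.randn(8, 16)).sum()   # reconstructed loss calculation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print(prof)                               # lists the operators that ran
```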

Compilation Option Settings

+

Compilation Option Settings

Configure the attributes of an operator during compilation to improve performance, which is implemented by ACL APIs. The usage and explanation are as follows: @@ -3368,7 +3368,7 @@ ACL_OP_COMPILER_CACHE_MODE: Configures the disk cache mode for operator compilat ACL_OP_COMPILER_CACHE_DIR: Configures the disk cache directory for operator compilation. This compilation option must be used together with ACL_OP_COMPILER_CACHE_MODE. ``` -

How Do I Install GCC 7.3.0?

+

How Do I Install GCC 7.3.0?

Perform the following steps as the **root** user. @@ -3450,7 +3450,7 @@ Perform the following steps as the **root** user. >Skip this step if you do not need to use the compilation environment with GCC upgraded. -

HDF5 Compilation and Installation

+

HDF5 Compilation and Installation

Perform the following steps as the **root** user. @@ -3487,35 +3487,35 @@ Perform the following steps as the **root** user. ``` -

FAQs

+

FAQs

-- **[FAQs About Software Installation](#faqs-about-software-installation.md)** +- **[FAQs About Software Installation](#faqs-about-software-installationmd)** -- **[FAQs About Model and Operator Running](#faqs-about-model-and-operator-running.md)** +- **[FAQs About Model and Operator Running](#faqs-about-model-and-operator-runningmd)** -- **[FAQs About Model Commissioning](#faqs-about-model-commissioning.md)** +- **[FAQs About Model Commissioning](#faqs-about-model-commissioningmd)** -- **[FAQs About Other Operations](#faqs-about-other-operations.md)** +- **[FAQs About Other Operations](#faqs-about-other-operationsmd)** -- **[FAQs About Distributed Model Training](#faqs-about-distributed-model-training.md)** +- **[FAQs About Distributed Model Training](#faqs-about-distributed-model-trainingmd)** -

FAQs About Software Installation

+

FAQs About Software Installation

-- **[pip3.7 install Pillow==5.3.0 Installation Failed](#pip3-7-install-pillow-5-3-0-installation-failed.md)** +- **[pip3.7 install Pillow==5.3.0 Installation Failed](#pip3-7-install-pillow-5-3-0-installation-failedmd)** -

pip3.7 install Pillow==5.3.0 Installation Failed

+

pip3.7 install Pillow==5.3.0 Installation Failed

-## Symptom +##### Symptom **pip3.7 install pillow==5.3.0** installation failed. -## Possible Causes +##### Possible Causes Necessary dependencies are missing, such as libjpeg, python-devel, zlib-devel, and libjpeg-turbo-devel. -## Solutions +##### Solutions Run the following commands to install the dependencies: @@ -3528,79 +3528,79 @@ Run the following commands to install the dependencies: **apt-get install libjpeg python-devel zlib-devel libjpeg-turbo-devel** -

FAQs About Model and Operator Running

+

FAQs About Model and Operator Running

-- **[What Do I Do If the Error Message "RuntimeError: ExchangeDevice:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-runtimeerror-exchangedevice-is-displayed-during-model-or-operator.md)** +- **[What Do I Do If the Error Message "RuntimeError: ExchangeDevice:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-runtimeerror-exchangedevice-is-displayed-during-model-or-operatormd)** -- **[What Do I Do If the Error Message "Error in atexit.\_run\_exitfuncs:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-error-in-atexit-_run_exitfuncs-is-displayed-during-model-or-operat.md)** +- **[What Do I Do If the Error Message "Error in atexit.\_run\_exitfuncs:" Is Displayed During Model or Operator Running?](#what-do-i-do-if-the-error-message-error-in-atexit-_run_exitfuncs-is-displayed-during-model-or-operatmd)** -- **[What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): HelpACLExecute:" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what()-he.md)** +- **[What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): HelpACLExecute:" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what-hemd)** -- **[What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): 0 INTERNAL ASSERT" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what()-0.md)** +- **[What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): 0 INTERNAL ASSERT" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-terminate-called-after-throwing-an-instance-of-c10-error-what-0md)** -- **[What Do I Do If the Error Message "ImportError: libhccl.so." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-importerror-libhccl-so-is-displayed-during-model-running.md)** +- **[What Do I Do If the Error Message "ImportError: libhccl.so." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-importerror-libhccl-so-is-displayed-during-model-runningmd)** -- **[What Do I Do If the Error Message "RuntimeError: Initialize." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-runtimeerror-initialize-is-displayed-during-model-running.md)** +- **[What Do I Do If the Error Message "RuntimeError: Initialize." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-runtimeerror-initialize-is-displayed-during-model-runningmd)** -- **[What Do I Do If the Error Message "TVM/te/cce error." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-tvm-te-cce-error-is-displayed-during-model-running.md)** +- **[What Do I Do If the Error Message "TVM/te/cce error." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-tvm-te-cce-error-is-displayed-during-model-runningmd)** -- **[What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-running.md)** +- **[What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." 
Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-runningmd)** -- **[What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-running-6.md)** +- **[What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-memcopysync-drvmemcpy-failed-is-displayed-during-model-running-6md)** -- **[What Do I Do If the Error Message "HelpACLExecute." Is Displayed After Multi-Task Delivery Is Disabled \(export TASK\_QUEUE\_ENABLE=0\) During Model Running?](#what-do-i-do-if-the-error-message-helpaclexecute-is-displayed-after-multi-task-delivery-is-disabled.md)** +- **[What Do I Do If the Error Message "HelpACLExecute." Is Displayed After Multi-Task Delivery Is Disabled \(export TASK\_QUEUE\_ENABLE=0\) During Model Running?](#what-do-i-do-if-the-error-message-helpaclexecute-is-displayed-after-multi-task-delivery-is-disabledmd)** -- **[What Do I Do If the Error Message "55056 GetInputConstDataOut: ErrorNo: -1\(failed\)" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-55056-getinputconstdataout-errorno--1(failed)-is-displayed-during.md)** +- **[What Do I Do If the Error Message "55056 GetInputConstDataOut: ErrorNo: -1\(failed\)" Is Displayed During Model Running?](#what-do-i-do-if-the-error-message-55056-getinputconstdataout-errorno--1failed-is-displayed-duringmd)** -

What Do I Do If the Error Message "RuntimeError: ExchangeDevice:" Is Displayed During Model or Operator Running?

+

What Do I Do If the Error Message "RuntimeError: ExchangeDevice:" Is Displayed During Model or Operator Running?

-## Symptom
+##### Symptom

![](figures/faq1.png)

-## Possible Causes
+##### Possible Causes

Currently, only one NPU device can be called in a thread. When different NPU devices are switched, the preceding error occurs.

-## Solution
+##### Solution

In the code, ensure that the device names are consistent whenever **torch.npu.set\_device\(device\)**, **tensor.to\(device\)**, or **model.to\(device\)** is called in the same thread. For multiple threads \(such as multi-device training\), each thread can call only a fixed NPU device.

-
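A minimal sketch of the consistent usage \(the "npu" device string assumes the Ascend-adapted build\):

```
import torch
import torch.nn as nn

device = "npu:0"                  # one fixed device for this thread
torch.npu.set_device(device)

model = nn.Linear(4, 4).to(device)
x = torch.randn(2, 4).to(device)  # same device name everywhere
y = model(x)
```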

What Do I Do If the Error Message "Error in atexit.\_run\_exitfuncs:" Is Displayed During Model or Operator Running?

+

What Do I Do If the Error Message "Error in atexit.\_run\_exitfuncs:" Is Displayed During Model or Operator Running?

-## Symptom +##### Symptom ![](figures/faq2.png) -## Possible Causes +##### Possible Causes If no NPU device is specified by **torch.npu.device\(id\)** during torch initialization, device 0 is used by default. If another NPU device is directly used, for example, a tensor is created on device 1, the preceding error occurs during running. -## Solution +##### Solution Before calling an NPU device, specify the NPU device by using **torch.npu.set\_device\(device\)**. -

What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): HelpACLExecute:" Is Displayed During Model Running?

+

What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what(): HelpACLExecute:" Is Displayed During Model Running?

-## Symptom +##### Symptom ![](figures/faq3.png) -## Possible Causes +##### Possible Causes Currently, the HelpACLExecute error cannot be directly located. In this case, an error is reported when the task is delivered. This is because the multi-thread delivery of the task is enabled \(**export TASK\_QUEUE\_ENABLE=1**\), and the error information is encapsulated at the upper layer. As a result, more detailed error logs cannot be obtained. -## Solution +##### Solution You can resolve this exception by using either of the following methods: - Check the host error log information. The default log path is **/var/log/npu/slog/host-0/**. Search for the log file whose name is prefixed with **host-0** based on the time identifier, open the log file, and search for error information using keyword **ERROR**. - Disable multi-thread delivery \(**export TASK\_QUEUE\_ENABLE=0**\) and run the code again. Generally, you can locate the fault based on the error information reported by the terminal. -

What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what\(\): 0 INTERNAL ASSERT" Is Displayed During Model Running?

+

What Do I Do If the Error Message "terminate called after throwing an instance of 'c10::Error' what(): 0 INTERNAL ASSERT" Is Displayed During Model Running?

-## Symptom
+##### Symptom

```
import torch
```

The following error message is displayed after code execution.

![](figures/en-us_image_0000001208897433.png)

-## Possible Causes
+##### Possible Causes

After the backward operation is performed, the **set\_device\(\)** method is used to manually set the device. As a result, an error is reported. During the backward operation, if the device is not set, the program automatically initializes the device to **0** by default. That is, **set\_device\("npu:0"\)** is executed. Currently, the device cannot be switched for calculation. If the device is manually set by using the **set\_device\(\)** method afterwards, this error may occur.

-## Solution
+##### Solution

Before performing the backward operation, use the **set\_device\(\)** method to manually set the device. The modification is as follows:

```
if __name__ == "__main__":
    device = torch.npu.set_device("npu:0")
    test_npu()
```

-

What Do I Do If the Error Message "ImportError: libhccl.so." Is Displayed During Model Running?

+

What Do I Do If the Error Message "ImportError: libhccl.so." Is Displayed During Model Running?

-## Symptom
+##### Symptom

![](figures/faq7.png)

-## Possible Causes
+##### Possible Causes

Currently, the released PyTorch installation package uses the NPU and HCCL functions by default. Therefore, you need to add the path of the HCCL module to the environment variables when calling the PyTorch installation package. The error message "can not find libhccl.so" indicates that the HCCL library file cannot be found.

-## Solution
+##### Solution

Add the path of the HCCL module to the environment variables. Generally, the path of the HCCL library file is **.../fwkacllib/python/site-packages/hccl** in the installation package.

-

What Do I Do If the Error Message "RuntimeError: Initialize." Is Displayed During Model Running?

+

What Do I Do If the Error Message "RuntimeError: Initialize." Is Displayed During Model Running?

-## Symptom +##### Symptom ![](figures/faq9.png) -## Possible Causes +##### Possible Causes According to the error information, it is preliminarily determined that an error occurs during the initialization of the NPU device. The error information in the host log is as follows: @@ -3670,7 +3670,7 @@ According to the error information, it is preliminarily determined that an error The log information indicates that an error is reported when the system starts the NPU device. -## Solution +##### Solution To solve the problem, perform the following steps: @@ -3694,25 +3694,25 @@ To solve the problem, perform the following steps: 4. Contact Huawei technical support personnel. -

What Do I Do If the Error Message "TVM/te/cce error." Is Displayed During Model Running?

+

What Do I Do If the Error Message "TVM/te/cce error." Is Displayed During Model Running?

-## Symptom +##### Symptom ![](figures/faq10.png) -## Possible Causes +##### Possible Causes Calling an NPU operator in PyTorch strongly depends on the TE, CCE, and TVM components. The PyTorch, CANN/NNAE, and TE versions must be the same. After CANN/NNAE is updated, components such as TE are not automatically updated. When their versions do not match, this error is reported. -## Solution +##### Solution Update the versions of components such as TE. The **te-\*.whl** and **topi-\*.whl** installation packages need to be updated. In the **lib64** subdirectory of the CANN or NNAE installation directory \(the installation user is the **root** user and the default installation directory is **/usr/local/Ascend/ascend-toolkit/latest/lib64**\), update the installation packages: The **topi-0.4.0-py3-none-any.whl** and **te-0.4.0-py3-none-any.whl** installation packages exist in the directory. Run the **pip3 install --upgrade topi-0.4.0-py3-none-any.whl** and **pip install --upgrade te-0.4.0-py3-none-any.whl** commands, respectively. ![](figures/faq10-1.png) -

What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?

+

What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?

-## Symptom +##### Symptom Scripts: @@ -3761,7 +3761,7 @@ Log message: [ERROR] RUNTIME(12731,python3.7):2021-02-02-22:23:56.475.717 [../../../../../../runtime/feature/src/api_c.cc:224]12828 rtKernelLaunch:ErrCode=207001, desc=[module new memory error], InnerCode=0x70a0002 ``` -## Possible Causes +##### Possible Causes The shell error message does not match the log message. @@ -3769,7 +3769,7 @@ The shell error message indicates that the error occurs on the AI CPU during syn The possible cause is that the AI CPU operator is executed asynchronously. As a result, the error information is delayed. -## Solution +##### Solution Perform the following steps to locate the fault based on the actual error information: @@ -3779,9 +3779,9 @@ Perform the following steps to locate the fault based on the actual error inform 4. Print the shape, dtype, and npu\_format of all stack parameters. Construct a single-operator case to reproduce the problem. The cause is that the data types of the input parameters for subtraction are different. As a result, the data types of the a-b and b-a results are different, and an error is reported in the stack operator. 5. Convert the data types of the stack input parameters to the same one to temporarily avoid the problem. -

What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?

+

What Do I Do If the Error Message "MemCopySync:drvMemcpy failed." Is Displayed During Model Running?

-## Symptom +##### Symptom Script: @@ -3830,7 +3830,7 @@ Log message: [ERROR] RUNTIME(12731,python3.7):2021-02-02-22:23:56.475.717 [../../../../../../runtime/feature/src/api_c.cc:224]12828 rtKernelLaunch:ErrCode=207001, desc=[module new memory error], InnerCode=0x70a0002 ``` -## Possible Causes +##### Possible Causes The shell error message does not match the log message. @@ -3838,7 +3838,7 @@ The shell error message indicates that the error occurs on the AI CPU during syn The possible cause is that the AI CPU operator is executed asynchronously. As a result, the error information is delayed. -## Solution +##### Solution Perform the following steps to locate the fault based on the actual error information: @@ -3848,17 +3848,17 @@ Perform the following steps to locate the fault based on the actual error inform 4. Print the shape, dtype, and npu\_format of all stack parameters. Construct a single-operator case to reproduce the problem. The cause is that the data types of the input parameters for subtraction are different. As a result, the data types of the a-b and b-a results are different, and an error is reported in the stack operator. 5. Convert the data types of the stack input parameters to the same one to temporarily avoid the problem. -

What Do I Do If the Error Message "HelpACLExecute." Is Displayed After Multi-Task Delivery Is Disabled \(export TASK\_QUEUE\_ENABLE=0\) During Model Running?

+

What Do I Do If the Error Message "HelpACLExecute." Is Displayed After Multi-Task Delivery Is Disabled (export TASK\_QUEUE\_ENABLE=0) During Model Running?

-## Symptom
+##### Symptom

![](figures/faq8.png)

-## Possible Causes
+##### Possible Causes

The PyTorch operator runs on the NPU and calls the optimized operators at the bottom layer through the AscendCL API. When the error message "HelpACLExecute." is reported at the upper layer, the error reporting and logging are still being improved. As a result, when errors occur in some operators, the detailed error information fails to be obtained.

-## Solution
+##### Solution

View the host log to determine the operator and location where the error is reported. The default log path is **/var/log/npu/slog/host-0**. Search for the **ERROR** field in the log file of the corresponding time to find the error information. For the preceding error, the **ERROR** field in the log is as follows:

![](figures/faq8-1.png)

The error information in the log indicates that the error operator is topKD and the error cause is "The number of attrs in op desc and op store does not match."

Locate the topKD operator in the model code and check whether the operator can be replaced by another operator. If the operator can be replaced, use the replacement solution and report the operator error information to Huawei engineers. If the operator cannot be replaced, contact Huawei technical support.

-

What Do I Do If the Error Message "55056 GetInputConstDataOut: ErrorNo: -1\(failed\)" Is Displayed During Model Running?

+

What Do I Do If the Error Message "55056 GetInputConstDataOut: ErrorNo: -1(failed)" Is Displayed During Model Running?

-## Symptom +##### Symptom During model training, the following error information may be displayed in the host training log \(directory: **/root/ascend/log/plog/**\): ![](figures/20210720-102720(welinkpc).png) -## Possible Causes +##### Possible Causes A public API is called. -## Solution +##### Solution The error information does not affect the training function and performance and can be ignored. -

FAQs About Model Commissioning

+

FAQs About Model Commissioning

-- **[What Do I Do If the Error Message "RuntimeError: malloc:/..../pytorch/c10/npu/NPUCachingAllocator.cpp:293 NPU error, error code is 500000." Is Displayed During Model Commissioning?](#what-do-i-do-if-the-error-message-runtimeerror-malloc-pytorch-c10-npu-npucachingallocator-cpp-293-np.md)** +- **[What Do I Do If the Error Message "RuntimeError: malloc:/..../pytorch/c10/npu/NPUCachingAllocator.cpp:293 NPU error, error code is 500000." Is Displayed During Model Commissioning?](#what-do-i-do-if-the-error-message-runtimeerror-malloc-pytorch-c10-npu-npucachingallocator-cpp-293-npmd)** -- **[What Do I Do If the Error Message "RuntimeError: Could not run 'aten::trunc.out' with arguments from the 'NPUTensorId' backend." Is Displayed During Model Commissioning](#what-do-i-do-if-the-error-message-runtimeerror-could-not-run-aten-trunc-out-with-arguments-from-the.md)** +- **[What Do I Do If the Error Message "RuntimeError: Could not run 'aten::trunc.out' with arguments from the 'NPUTensorId' backend." Is Displayed During Model Commissioning](#what-do-i-do-if-the-error-message-runtimeerror-could-not-run-aten-trunc-out-with-arguments-from-themd)** -- **[What Do I Do If the MaxPoolGradWithArgmaxV1 and max Operators Report Errors During Model Commissioning?](#what-do-i-do-if-the-maxpoolgradwithargmaxv1-and-max-operators-report-errors-during-model-commissioni.md)** +- **[What Do I Do If the MaxPoolGradWithArgmaxV1 and max Operators Report Errors During Model Commissioning?](#what-do-i-do-if-the-maxpoolgradwithargmaxv1-and-max-operators-report-errors-during-model-commissionimd)** -- **[What Do I Do If the Error Message "ModuleNotFoundError: No module named 'torch.\_C'" Is Displayed When torch Is Called?](#what-do-i-do-if-the-error-message-modulenotfounderror-no-module-named-torch-_c-is-displayed-when-tor.md)** +- **[What Do I Do If the Error Message "ModuleNotFoundError: No module named 'torch.\_C'" Is Displayed When torch Is Called?](#what-do-i-do-if-the-error-message-modulenotfounderror-no-module-named-torch-_c-is-displayed-when-tormd)** -

What Do I Do If the Error Message "RuntimeError: malloc:/..../pytorch/c10/npu/NPUCachingAllocator.cpp:293 NPU error, error code is 500000." Is Displayed During Model Commissioning?

+

What Do I Do If the Error Message "RuntimeError: malloc:/..../pytorch/c10/npu/NPUCachingAllocator.cpp:293 NPU error, error code is 500000." Is Displayed During Model Commissioning?

-## Symptom +##### Symptom ![](figures/faq4.png) -## Possible Causes +##### Possible Causes For the malloc error in **NPUCachingAllocator**, the possible cause is that the required video memory is larger than the available video memory on the NPU. -## Solution +##### Solution During model commissioning, you can decrease the value of the **batch size** parameter to reduce the size of the occupied video memory on the NPU. -

What Do I Do If the Error Message "RuntimeError: Could not run 'aten::trunc.out' with arguments from the 'NPUTensorId' backend." Is Displayed During Model Commissioning

+

What Do I Do If the Error Message "RuntimeError: Could not run 'aten::trunc.out' with arguments from the 'NPUTensorId' backend." Is Displayed During Model Commissioning

-## Symptom
+##### Symptom

![](figures/faq5.png)

-## Possible Causes
+##### Possible Causes

Currently, the NPU supports only some PyTorch operators. The preceding error is reported when operators that are not supported are used. The operators are being developed. For details about the supported operators, see [PyTorch Native Operators](https://support.huaweicloud.com/intl/en-us/opl-pytorch/atlasptol_09_0001.html).

-## Solution
+##### Solution

During model commissioning, replace the unsupported operator with another operator that has the same semantics, or run it on the CPU. If no replacement is feasible, develop an adapted operator. For details, see the _PyTorch Operator Development Guide_.

-

What Do I Do If the MaxPoolGradWithArgmaxV1 and max Operators Report Errors During Model Commissioning?

+

What Do I Do If the MaxPoolGradWithArgmaxV1 and max Operators Report Errors During Model Commissioning?

-## Symptom
+##### Symptom

![](figures/faq6.png)

![](figures/faq6-1.png)

-## Possible Causes
+##### Possible Causes

During model building, operators receive a wide variety of input parameters. For some operators \(such as MaxPoolGradWithArgmaxV1 and max\) with specific parameters, an error is reported during calculation or the parameter combination is not supported. You can locate the operators based on the error information.

-## Solution
+##### Solution

Locate the operators based on the error information and perform the following steps:

@@ -3949,50 +3949,50 @@ Locate the operators based on the error information and perform the following st

In the preceding figure, the error information indicates that the MaxPoolGradWithArgmaxV1 and max operators report the error. MaxPoolGradWithArgmaxV1 reports the error during backward propagation, so construct a backward scenario. The max operator reports the error during forward propagation, so construct a forward scenario. Both scenarios are sketched below.

-If an operator error is reported in the model, you are advised to build a single-operator test case and determine the error scenario and cause. If a single-operator case cannot be built in a single operator, you need to construct a context-based single-operator scenario. For details about how to build a test case, see [Single-Operator Sample Building](#single-operator-sample-building.md).
+If an operator error is reported in the model, you are advised to build a single-operator test case to determine the error scenario and cause. If the error cannot be reproduced with a single operator alone, construct a context-based single-operator scenario. For details about how to build a test case, see [Single-Operator Sample Building](#single-operator-sample-buildingmd).

-
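For orientation, minimal sketches of the two scenarios mentioned above; the shapes and dtypes are illustrative, not taken from the failing model:

```
import torch

# Forward scenario for the max operator:
x = torch.randn(2, 3).npu()
out = torch.max(x)

# Backward scenario: per the error above, the gradient of max_pool2d is the
# call that involves MaxPoolGradWithArgmaxV1 on the NPU.
y = torch.randn(1, 1, 4, 4, requires_grad=True)
pooled = torch.nn.functional.max_pool2d(y.npu(), kernel_size=2)
pooled.sum().backward()
```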

What Do I Do If the Error Message "ModuleNotFoundError: No module named 'torch.\_C'" Is Displayed When torch Is Called?

+

What Do I Do If the Error Message "ModuleNotFoundError: No module named 'torch.\_C'" Is Displayed When torch Is Called?

-## Symptom
+##### Symptom

![](figures/faq11.png)

-## Possible Causes
+##### Possible Causes

In the preceding figure, the error path is **.../code/pytorch/torch/\_\_init\_\_.py**, while the current working directory is **.../code/pytorch**. When the **import torch** command is executed, Python searches the current directory first and therefore imports the **torch** source folder instead of the torch package installed in the system directory. As a result, the error is reported.

-## Solution
+##### Solution

Switch to another directory to run the script so that the installed torch package is imported.

-
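A quick way to confirm which torch is being imported is to print its path from a directory outside the source tree:

```
import torch

# Should point to the installed package (for example, .../site-packages/torch/__init__.py),
# not to a torch/ source folder under the current working directory.
print(torch.__file__)
```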

FAQs About Other Operations

+

FAQs About Other Operations

-- **[What Do I Do If an Error Is Reported During CUDA Stream Synchronization?](#what-do-i-do-if-an-error-is-reported-during-cuda-stream-synchronization.md)**
+- **[What Do I Do If an Error Is Reported During CUDA Stream Synchronization?](#what-do-i-do-if-an-error-is-reported-during-cuda-stream-synchronizationmd)**

-- **[What Do I Do If aicpu\_kernels/libpt\_kernels.so Does Not Exist?](#what-do-i-do-if-aicpu_kernels-libpt_kernels-so-does-not-exist.md)**
+- **[What Do I Do If aicpu\_kernels/libpt\_kernels.so Does Not Exist?](#what-do-i-do-if-aicpu_kernels-libpt_kernels-so-does-not-existmd)**

-- **[What Do I Do If the Python Process Is Residual When the npu-smi info Command Is Used to View Video Memory?](#what-do-i-do-if-the-python-process-is-residual-when-the-npu-smi-info-command-is-used-to-view-video-m.md)**
+- **[What Do I Do If the Python Process Is Residual When the npu-smi info Command Is Used to View Video Memory?](#what-do-i-do-if-the-python-process-is-residual-when-the-npu-smi-info-command-is-used-to-view-video-mmd)**

-- **[What Do I Do If the Error Message "match op inputs failed"Is Displayed When the Dynamic Shape Is Used?](#what-do-i-do-if-the-error-message-match-op-inputs-failed-is-displayed-when-the-dynamic-shape-is-used.md)**
+- **[What Do I Do If the Error Message "match op inputs failed" Is Displayed When the Dynamic Shape Is Used?](#what-do-i-do-if-the-error-message-match-op-inputs-failed-is-displayed-when-the-dynamic-shape-is-usedmd)**

-- **[What Do I Do If the Error Message "Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported" Is Displayed?](#what-do-i-do-if-the-error-message-op-type-sigmoidcrossentropywithlogitsv2-of-ops-kernel-aicoreengine.md)**
+- **[What Do I Do If the Error Message "Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported" Is Displayed?](#what-do-i-do-if-the-error-message-op-type-sigmoidcrossentropywithlogitsv2-of-ops-kernel-aicoreenginemd)**

-- **[What Do I Do If a Hook Failure Occurs?](#what-do-i-do-if-a-hook-failure-occurs.md)**
+- **[What Do I Do If a Hook Failure Occurs?](#what-do-i-do-if-a-hook-failure-occursmd)**

-- **[What Do I Do If the Error Message "load state\_dict error." Is Displayed When the Weight Is Loaded?](#what-do-i-do-if-the-error-message-load-state_dict-error-is-displayed-when-the-weight-is-loaded.md)**
+- **[What Do I Do If the Error Message "load state\_dict error." Is Displayed When the Weight Is Loaded?](#what-do-i-do-if-the-error-message-load-state_dict-error-is-displayed-when-the-weight-is-loadedmd)**

-

What Do I Do If an Error Is Reported During CUDA Stream Synchronization?

+

What Do I Do If an Error Is Reported During CUDA Stream Synchronization?

-## Symptom
+##### Symptom

![](figures/model_faq11_20210728.jpg)

-## Possible Causes
+##### Possible Causes

The script performs stream synchronization on a CUDA stream, which is not available on the NPU; NPU stream synchronization is not used.

-## Solution
+##### Solution

Use NPU stream synchronization instead.

@@ -4001,17 +4001,17 @@

```
stream = torch.npu.current_stream()
stream.synchronize()
```

-

What Do I Do If aicpu\_kernels/libpt\_kernels.so Does Not Exist?

+

What Do I Do If aicpu\_kernels/libpt\_kernels.so Does Not Exist?

-## Symptom
+##### Symptom

![](figures/faq13.png)

-## Possible Causes
+##### Possible Causes

The AI CPU package is not imported because its installation path is not configured.

-## Solution
+##### Solution

Import the AI CPU package by configuring its installation path. \(The following assumes that the CANN software package was installed as the **root** user in the default installation path.\)

@@ -4019,17 +4019,17 @@ Import the AI CPU. \(The following describes how to install the CANN software pa

```
export ASCEND_AICPU_PATH=/usr/local/Ascend/ascend-toolkit/latest
```

-

What Do I Do If the Python Process Is Residual When the npu-smi info Command Is Used to View Video Memory?

+

What Do I Do If the Python Process Is Residual When the npu-smi info Command Is Used to View Video Memory?

-## Symptom
+##### Symptom

![](figures/faq14.png)

-## Possible Causes
+##### Possible Causes

A Python process did not exit properly and remains resident, still occupying video memory on the device.

-## Solution
+##### Solution

Kill the residual Python process.

@@ -4037,40 +4037,40 @@ Kill the Python process.

```
pkill -9 python
```

-

What Do I Do If the Error Message "match op inputs failed"Is Displayed When the Dynamic Shape Is Used?

+

What Do I Do If the Error Message "match op inputs failed" Is Displayed When the Dynamic Shape Is Used?

-## Symptom
+##### Symptom

![](figures/faq15.png)

-## Possible Causes
+##### Possible Causes

The operator compiled for **PTIndexPut** does not match the input shape. Because the related log entries start with **acl\_dynamic\_shape\_op**, the error can be attributed to the dynamic shape.

-## Solution
+##### Solution

**PTIndexPut** corresponds to **tensor\[indices\] = value**. Locate this pattern in the code and change the dynamic shape to a fixed shape, for example as sketched below.

-
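For example, when the indices come from a boolean mask whose number of true elements changes between steps, the masked assignment can often be rewritten with **torch.where**, which keeps the output shape fixed. A minimal sketch with illustrative names:

```
import torch

x = torch.randn(8).npu()
mask = x > 0.5                # the number of True elements varies -> dynamic shape
# x[mask] = 1.0               # dynamic-shape form that may trigger the error
x = torch.where(mask, torch.ones_like(x), x)  # fixed-shape equivalent
```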

What Do I Do If the Error Message "Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported" Is Displayed?

+

What Do I Do If the Error Message "Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported" Is Displayed?

-## Symptom
+##### Symptom

```
[ERROR] GE(24836,python3.7):2021-01-27-18:27:51.562.111 [../../../../../../graphengine/ge/engine_manager/dnnengine_manager.cc:266]25155 GetDNNEngineName: ErrorNo: 1343242282(assign engine failed) GetDNNEngineName:Op type SigmoidCrossEntropyWithLogitsV2 of ops kernel AIcoreEngine is unsupported, reason:Op SigmoidCrossEntropyWithLogitsV2 not supported reason: The type of this op is not found in op store, check whether the op store has this type of op. Op store name is tbe-custom.
The dtype, format or shape of input in op desc is not supported in op store, check the dtype, format or shape of input between the op store and the graph. Op store name is tbe-builtin.
```

-## Possible Causes
+##### Possible Causes

The input data type is not supported by the SigmoidCrossEntropyWithLogitsV2 operator. The possible cause is that the input data type is int64.

-## Solution
+##### Solution

Check the input data type in the Python code and modify the data type.

-
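For illustration, if the int64 tensor is a label passed to a sigmoid cross entropy loss, casting it before the loss call avoids the unsupported dtype; the names and shapes below are illustrative:

```
import torch

logits = torch.randn(4, 10).npu()
labels = torch.randint(0, 2, (4, 10)).npu()   # int64 by default
# Cast the labels to float32 before computing the loss.
loss = torch.nn.functional.binary_cross_entropy_with_logits(
    logits, labels.to(torch.float32))
```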

What Do I Do If a Hook Failure Occurs?

+

What Do I Do If a Hook Failure Occurs?

-## Symptom
+##### Symptom

```
Traceback (most recent call last):
@@ -4095,11 +4095,11 @@ Traceback (most recent call last):
StopIteration
```

-## Possible Causes
+##### Possible Causes

The loss structure of mmdet triggers a bug in PyTorch's native hook mechanism, leading to an infinite loop.

-## Solution
+##### Solution

Add a **try** statement at line 658 of the **/usr/local/python3.7.5/lib/python3.7/site-packages/torch/nn/modules/module.py** file so that the failing hook logic is skipped:

@@ -4124,19 +4124,19 @@ if len(self._backward_hooks) > 0:

return result
```

-

What Do I Do If the Error Message "load state\_dict error." Is Displayed When the Weight Is Loaded?

+

What Do I Do If the Error Message "load state\_dict error." Is Displayed When the Weight Is Loaded?

-## Symptom
+##### Symptom

![](figures/faq18.png)

![](figures/faq18-1.png)

-## Possible Causes
+##### Possible Causes

The keys of the **state\_dict** saved after model training differ from the keys expected when the model is loaded: when the model was saved, a **module** prefix was added to the beginning of each key \(typically because the model was wrapped by DataParallel or DistributedDataParallel\).

-## Solution
+##### Solution

When loading the weights, traverse the **state\_dict** dictionary, rename the keys, and load the new dictionary, as sketched below. For details about the test case, see **demo.py**.

@@ -4153,38 +4153,38 @@ The script is as follows:

    model.load_state_dict(state_dict)
    ```

-
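A minimal sketch of the key-renaming step; the checkpoint path and key layout are illustrative, and **model** is assumed to be constructed already:

```
import torch

checkpoint = torch.load("checkpoint.pth", map_location="cpu")  # hypothetical path
# Strip the leading "module." prefix from each key.
state_dict = {k[len("module."):] if k.startswith("module.") else k: v
              for k, v in checkpoint.items()}
model.load_state_dict(state_dict)
```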

FAQs About Distributed Model Training

+

FAQs About Distributed Model Training

-- **[What Do I Do If the Error Message "host not found." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-host-not-found-is-displayed-during-distributed-model-training.md)** +- **[What Do I Do If the Error Message "host not found." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-host-not-found-is-displayed-during-distributed-model-trainingmd)** -- **[What Do I Do If the Error Message "RuntimeError: connect\(\) timed out." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-runtimeerror-connect()-timed-out-is-displayed-during-distributed-m.md)** +- **[What Do I Do If the Error Message "RuntimeError: connect\(\) timed out." Is Displayed During Distributed Model Training?](#what-do-i-do-if-the-error-message-runtimeerror-connect-timed-out-is-displayed-during-distributed-mmd)** -

What Do I Do If the Error Message "host not found." Is Displayed During Distributed Model Training?

+

What Do I Do If the Error Message "host not found." Is Displayed During Distributed Model Training?

-## Symptom
+##### Symptom

![](figures/faq19.png)

-## Possible Causes
+##### Possible Causes

During distributed model training, the Huawei Collective Communication Library \(HCCL\) is invoked. You need to set the IP address and port number based on the site requirements. The error information indicates that the IP address is incorrect.

-## Solution
+##### Solution

Set the correct IP address in the running script. For a single-server deployment, use the IP address of that server. For a multi-server deployment, set the IP address in the script on each server to the IP address of the active node. A minimal initialization sketch follows.

-
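For illustration, a minimal HCCL initialization; **192.168.1.100** and **29688** are placeholders for the active node's IP address and a free port, and **world\_size**/**rank** depend on the deployment:

```
import torch.distributed as dist

# The "hccl" backend is provided by the Ascend PyTorch adaptation.
dist.init_process_group(backend="hccl",
                        init_method="tcp://192.168.1.100:29688",
                        world_size=8, rank=0)
```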

What Do I Do If the Error Message "RuntimeError: connect\(\) timed out." Is Displayed During Distributed Model Training?

+

What Do I Do If the Error Message "RuntimeError: connect\(\) timed out." Is Displayed During Distributed Model Training?

-## Symptom +##### Symptom ![](figures/1234.png) -## Possible Causes +##### Possible Causes During distributed model training, the system firewall may block the communication of the HCCL port. Check whether the communication port is enabled based on the error information and perform related settings. -## Solution +##### Solution Query the HCCL port that is blocked by the system firewall and enable the port. diff --git a/docs/en/PyTorch Online Inference Guide/PyTorch Online Inference Guide.md b/docs/en/PyTorch Online Inference Guide/PyTorch Online Inference Guide.md index 186f76fefd..fccfe6ab18 100644 --- a/docs/en/PyTorch Online Inference Guide/PyTorch Online Inference Guide.md +++ b/docs/en/PyTorch Online Inference Guide/PyTorch Online Inference Guide.md @@ -1,48 +1,48 @@ # PyTorch Online Inference Guide -- [Application Scenario](#application-scenario.md) -- [Basic Workflow](#basic-workflow.md) - - [Prerequisites](#prerequisites.md) - - [Online Inference Process](#online-inference-process.md) - - [Environment Variable Configuration](#environment-variable-configuration.md) - - [Sample Reference](#sample-reference.md) -- [Special Topics](#special-topics.md) - - [Mixed Precision](#mixed-precision.md) -- [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0.md) -

Application Scenario

+- [Application Scenario](#application-scenariomd) +- [Basic Workflow](#basic-workflowmd) + - [Prerequisites](#prerequisitesmd) + - [Online Inference Process](#online-inference-processmd) + - [Environment Variable Configuration](#environment-variable-configurationmd) + - [Sample Reference](#sample-referencemd) +- [Special Topics](#special-topicsmd) + - [Mixed Precision](#mixed-precisionmd) +- [How Do I Install GCC 7.3.0?](#how-do-i-install-gcc-7-3-0md) +

Application Scenario

Online inference, unlike offline inference, allows developers to perform inference directly with PyTorch models using the **model.eval\(\)** method. In this way, PyTorch-based inference apps can be directly ported to the Ascend AI Processor, which is especially useful in data center inference scenarios.

-## Supported Processors
+### Supported Processors

Ascend 910 AI Processor

Ascend 710 AI Processor

-

Basic Workflow

+

Basic Workflow

-- **[Prerequisites](#prerequisites.md)** +- **[Prerequisites](#prerequisitesmd)** -- **[Online Inference Process](#online-inference-process.md)** +- **[Online Inference Process](#online-inference-processmd)** -- **[Environment Variable Configuration](#environment-variable-configuration.md)** +- **[Environment Variable Configuration](#environment-variable-configurationmd)** -- **[Sample Reference](#sample-reference.md)** +- **[Sample Reference](#sample-referencemd)** -

Prerequisites

+

Prerequisites

The PyTorch framework and mixed precision module have been installed. For details, see the _PyTorch Installation Guide_. -

Online Inference Process

+

Online Inference Process

[Figure 1](#fig13802941161818) shows the online inference process. **Figure 1** Online inference process ![](figures/online-inference-process.png "online-inference-process") -

Environment Variable Configuration

+

Environment Variable Configuration

The following are the environment variables required for starting the inference process on PyTorch: @@ -82,7 +82,7 @@ export TASK_QUEUE_ENABLE=0

LD_LIBRARY_PATH

Dynamic library search path. Set this variable based on the preceding example.

-
NOTE:

If GCC 7.3.0 is installed in OSs such as CentOS 7.6, Debian, and BC-Linux, configure the related environment variables. For details, see 5.

+
NOTE:

If GCC 7.3.0 is installed in OSs such as CentOS 7.6, Debian, and BC-Linux, configure the related environment variables. For details, see "How Do I Install GCC 7.3.0?".

Required

@@ -133,9 +133,9 @@ export TASK_QUEUE_ENABLE=0 >![](public_sys-resources/icon-note.gif) **NOTE:** >For more log information, see the _CANN Log Reference_. -

Sample Reference

+

Sample Reference

-## Sample Code +#### Sample Code Try to minimize the initialization frequency across the app lifetime during inference. The inference mode is set using the **model.eval\(\)** method, and the inference process must run under the code branch **with torch.no\_grad\(\):**. @@ -406,7 +406,7 @@ if __name__ == '__main__': main() ``` -## Sample Running +#### Sample Running The following uses the ResNet-50 model as an example to describe how to perform online inference. @@ -420,7 +420,7 @@ The following uses the ResNet-50 model as an example to describe how to perform 3. Run inference. - Set environment variables by referring to [Environment Variable Configuration](#environment-variable-configuration.md) and then run the following command: + Set environment variables by referring to [Environment Variable Configuration](#environment-variable-configurationmd) and then run the following command: ``` python3 pytorch-resnet50-apex.py --data /data/imagenet \ @@ -433,14 +433,14 @@ The following uses the ResNet-50 model as an example to describe how to perform >The preceding command is an example only. Modify the arguments as needed. -

Special Topics

+

Special Topics

-- **[Mixed Precision](#mixed-precision.md)** +- **[Mixed Precision](#mixed-precisionmd)** -

Mixed Precision

+

Mixed Precision

-## Overview
+#### Overview

Mixed precision computing uses the float16 and float32 data types together in model computing, which suits the architecture features of the NPU. Replacing float32 with float16 has the following advantages:

@@ -450,7 +450,7 @@ Based on the architecture features of the NPU, mixed precision is involved in th

However, mixed precision training is limited by the numeric range that float16 can represent: converting float32 to float16 can affect training convergence. To accelerate some computations with float16 while preserving convergence, the mixed precision module Apex is used. Apex is a comprehensive optimization library that features high optimization performance and precision.

-## Supported Features
+#### Supported Features

[Table 1](#en-us_topic_0278765773_table10717173813332) describes the functions and optimization of the mixed precision module.

@@ -489,7 +489,7 @@ However, the mixed precision training is limited by the precision range expresse

>![](public_sys-resources/icon-note.gif) **NOTE:**
>In the current version, Apex is implemented using Python and does not support AscendCL or CUDA optimization.

-## Initializing the Mixed Precision Model
+#### Initializing the Mixed Precision Model

1. To use the mixed precision module Apex, you need to import the amp module from the Apex library as follows:

@@ -503,20 +503,20 @@ However, the mixed precision training is limited by the precision range expresse

       model, optimizer = amp.initialize(model, optimizer)
       ```

-    For details, see "Initialize the mixed precision model."# in [Sample Code](#sample-reference.md).
+    For details, see the "# Initialize the mixed precision model." comment in [Sample Code](#sample-referencemd).

    ```
    model, optimizer = amp.initialize(model, optimizer, opt_level="O2", loss_scale=1024, verbosity=1)
    ```

-## Mixed Precision Inference
+#### Mixed Precision Inference

After the mixed precision model is initialized, perform model forward propagation, as sketched below.

-Sample code: For details, see the implementation of **validate\(val\_loader, model, args\)** in [Sample Code](#sample-reference.md).
+Sample code: For details, see the implementation of **validate\(val\_loader, model, args\)** in [Sample Code](#sample-referencemd).

-
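A minimal sketch of that forward pass; **model** and **val\_loader** are assumed to exist, and the device string is illustrative:

```
import torch

model.eval()
with torch.no_grad():
    for images, _ in val_loader:
        images = images.to("npu:0", non_blocking=True)
        output = model(images)
```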

How Do I Install GCC 7.3.0?

+

How Do I Install GCC 7.3.0?

Perform the following steps as the **root** user. diff --git a/docs/en/PyTorch Operator Development Guide/PyTorch Operator Development Guide.md b/docs/en/PyTorch Operator Development Guide/PyTorch Operator Development Guide.md index a65e00f755..78351f5798 100644 --- a/docs/en/PyTorch Operator Development Guide/PyTorch Operator Development Guide.md +++ b/docs/en/PyTorch Operator Development Guide/PyTorch Operator Development Guide.md @@ -1,40 +1,40 @@ # PyTorch Operator Development Guide -- [Introduction](#introduction.md) -- [Operator Development Process](#operator-development-process.md) -- [Operator Development Preparations](#operator-development-preparations.md) - - [Setting Up the Environment](#setting-up-the-environment.md) - - [Looking Up Operators](#looking-up-operators.md) -- [Operator Adaptation](#operator-adaptation.md) - - [Prerequisites](#prerequisites.md) - - [Obtaining the PyTorch Source Code](#obtaining-the-pytorch-source-code.md) - - [Registering an Operator](#registering-an-operator.md) - - [Overview](#overview.md) - - [Registering an Operator for PyTorch 1.5.0](#registering-an-operator-for-pytorch-1-5-0.md) - - [Registering an Operator for PyTorch 1.8.1](#registering-an-operator-for-pytorch-1-8-1.md) - - [Developing an Operator Adaptation Plugin](#developing-an-operator-adaptation-plugin.md) - - [Compiling and Installing the PyTorch Framework](#compiling-and-installing-the-pytorch-framework.md) -- [Operator Function Verification](#operator-function-verification.md) - - [Overview](#overview-0.md) - - [Implementation](#implementation.md) -- [FAQs](#faqs.md) - - [Pillow==5.3.0 Installation Failed](#pillow-5-3-0-installation-failed.md) - - [pip3.7 install torchvision Installation Failed](#pip3-7-install-torchvision-installation-failed.md) - - ["torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed](#torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installed.md) - - [How Do I View Test Run Logs?](#how-do-i-view-test-run-logs.md) - - [Why Cannot the Custom TBE Operator Be Called?](#why-cannot-the-custom-tbe-operator-be-called.md) - - [How Do I Determine Whether the TBE Operator Is Correctly Called for PyTorch Adaptation?](#how-do-i-determine-whether-the-tbe-operator-is-correctly-called-for-pytorch-adaptation.md) - - [PyTorch Compilation Fails and the Message "error: ld returned 1 exit status" Is Displayed](#pytorch-compilation-fails-and-the-message-error-ld-returned-1-exit-status-is-displayed.md) - - [PyTorch Compilation Fails and the Message "error: call of overload...." Is Displayed](#pytorch-compilation-fails-and-the-message-error-call-of-overload-is-displayed.md) -- [Appendixes](#appendixes.md) - - [Installing CMake](#installing-cmake.md) - - [Exporting a Custom Operator](#exporting-a-custom-operator.md) -

Introduction

- -## Overview +- [Introduction](#introductionmd) +- [Operator Development Process](#operator-development-processmd) +- [Operator Development Preparations](#operator-development-preparationsmd) + - [Setting Up the Environment](#setting-up-the-environmentmd) + - [Looking Up Operators](#looking-up-operatorsmd) +- [Operator Adaptation](#operator-adaptationmd) + - [Prerequisites](#prerequisitesmd) + - [Obtaining the PyTorch Source Code](#obtaining-the-pytorch-source-codemd) + - [Registering an Operator](#registering-an-operatormd) + - [Overview](#overviewmd) + - [Registering an Operator for PyTorch 1.5.0](#registering-an-operator-for-pytorch-1-5-0md) + - [Registering an Operator for PyTorch 1.8.1](#registering-an-operator-for-pytorch-1-8-1md) + - [Developing an Operator Adaptation Plugin](#developing-an-operator-adaptation-pluginmd) + - [Compiling and Installing the PyTorch Framework](#compiling-and-installing-the-pytorch-frameworkmd) +- [Operator Function Verification](#operator-function-verificationmd) + - [Overview](#overview-0md) + - [Implementation](#implementationmd) +- [FAQs](#faqsmd) + - [Pillow==5.3.0 Installation Failed](#pillow-5-3-0-installation-failedmd) + - [pip3.7 install torchvision Installation Failed](#pip3-7-install-torchvision-installation-failedmd) + - ["torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed](#torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installedmd) + - [How Do I View Test Run Logs?](#how-do-i-view-test-run-logsmd) + - [Why Cannot the Custom TBE Operator Be Called?](#why-cannot-the-custom-tbe-operator-be-calledmd) + - [How Do I Determine Whether the TBE Operator Is Correctly Called for PyTorch Adaptation?](#how-do-i-determine-whether-the-tbe-operator-is-correctly-called-for-pytorch-adaptationmd) + - [PyTorch Compilation Fails and the Message "error: ld returned 1 exit status" Is Displayed](#pytorch-compilation-fails-and-the-message-error-ld-returned-1-exit-status-is-displayedmd) + - [PyTorch Compilation Fails and the Message "error: call of overload...." Is Displayed](#pytorch-compilation-fails-and-the-message-error-call-of-overload-is-displayedmd) +- [Appendixes](#appendixesmd) + - [Installing CMake](#installing-cmakemd) + - [Exporting a Custom Operator](#exporting-a-custom-operatormd) +

Introduction

+ +### Overview To enable the PyTorch deep learning framework to run on Ascend AI Processors, you need to use Tensor Boost Engine \(TBE\) to customize the framework operators. -

Operator Development Process

+

Operator Development Process

PyTorch operator development includes TBE operator development and operator adaptation to the PyTorch framework. @@ -69,7 +69,7 @@ PyTorch operator development includes TBE operator development and operator adap

Set up the development and operating environments required for operator development, execution, and verification.

-

Operator Development Preparations

+

Operator Development Preparations

2

@@ -86,7 +86,7 @@ PyTorch operator development includes TBE operator development and operator adap

Obtain the PyTorch source code from the Ascend Community.

-

Operator Adaptation

+

Operator Adaptation

4

@@ -116,24 +116,24 @@ PyTorch operator development includes TBE operator development and operator adap

Verify the operator functions in the real-world hardware environment.

-

Operator Function Verification

+

Operator Function Verification

-

Operator Development Preparations

+

Operator Development Preparations

-- **[Setting Up the Environment](#setting-up-the-environment.md)** +- **[Setting Up the Environment](#setting-up-the-environmentmd)** -- **[Looking Up Operators](#looking-up-operators.md)** +- **[Looking Up Operators](#looking-up-operatorsmd)** -

Setting Up the Environment

+

Setting Up the Environment

- The development or operating environment of CANN has been installed. For details, see the _CANN Software Installation Guide_. - Python 3.7.5 or 3.8 has been installed. -- CMake 3.12.0 or later has been installed. For details, see [Installing CMake](#installing-cmake.md). +- CMake 3.12.0 or later has been installed. For details, see [Installing CMake](#installing-cmakemd). - GCC 7.3.0 or later has been installed. For details about how to install and use GCC 7.3.0, see "Installing GCC 7.3.0" in the _CANN Software Installation Guide_. - The Git tool has been installed. To install Git for Ubuntu and CentOS, run the following commands: - Ubuntu @@ -152,7 +152,7 @@ PyTorch operator development includes TBE operator development and operator adap -

Looking Up Operators

+

Looking Up Operators

During operator development, you can query the list of operators supported by Ascend AI Processors and the list of operators adapted to PyTorch. Develop or adapt operators to PyTorch based on the query result. @@ -168,44 +168,44 @@ The following describes how to query the operators supported by Ascend AI Proces - For the list of operators adapted to PyTorch, see the _PyTorch Operator Support_. -

Operator Adaptation

+

Operator Adaptation

-- **[Prerequisites](#prerequisites.md)** +- **[Prerequisites](#prerequisitesmd)** -- **[Obtaining the PyTorch Source Code](#obtaining-the-pytorch-source-code.md)** +- **[Obtaining the PyTorch Source Code](#obtaining-the-pytorch-source-codemd)** -- **[Registering an Operator](#registering-an-operator.md)** +- **[Registering an Operator](#registering-an-operatormd)** -- **[Developing an Operator Adaptation Plugin](#developing-an-operator-adaptation-plugin.md)** +- **[Developing an Operator Adaptation Plugin](#developing-an-operator-adaptation-pluginmd)** -- **[Compiling and Installing the PyTorch Framework](#compiling-and-installing-the-pytorch-framework.md)** +- **[Compiling and Installing the PyTorch Framework](#compiling-and-installing-the-pytorch-frameworkmd)** -

Prerequisites

+

Prerequisites

-- The development and operating environments have been set up, and related dependencies have been installed. For details, see [Setting Up the Environment](#setting-up-the-environment.md). +- The development and operating environments have been set up, and related dependencies have been installed. For details, see [Setting Up the Environment](#setting-up-the-environmentmd). - TBE operators have been developed and deployed. For details, see the _CANN TBE Custom Operator Development Guide_. -

Obtaining the PyTorch Source Code

+

Obtaining the PyTorch Source Code

Currently, only PyTorch 1.5.0 and 1.8.1 are supported. To obtain the PyTorch source code, perform steps described in the "Installing the PyTorch Framework" in the _PyTorch Installation Guide_. The full code adapted to Ascend AI Processors is generated in the **pytorch/pytorch** directory. The PyTorch operator is also adapted and developed in this directory. -

Registering an Operator

+

Registering an Operator

-- **[Overview](#overview.md)** +- **[Overview](#overviewmd)** -- **[Registering an Operator for PyTorch 1.5.0](#registering-an-operator-for-pytorch-1-5-0.md)** +- **[Registering an Operator for PyTorch 1.5.0](#registering-an-operator-for-pytorch-1-5-0md)** -- **[Registering an Operator for PyTorch 1.8.1](#registering-an-operator-for-pytorch-1-8-1.md)** +- **[Registering an Operator for PyTorch 1.8.1](#registering-an-operator-for-pytorch-1-8-1md)** -

Overview

+

Overview

Currently, the NPU adaptation dispatch principle is as follows: The NPU operator is directly dispatched as the NPU adaptation function without being processed by the common function of the framework. That is, the operator execution call stack contains only the function call of the NPU adaptation and does not contain the common function of the framework. During compilation, the PyTorch framework generates the calling description of the middle layer of the new operator based on the definition in **native\_functions.yaml** and the type and device dispatch principle defined in the framework. For NPUs, the description is generated in **build/aten/src/ATen/NPUType.cpp**. -

Registering an Operator for PyTorch 1.5.0

+

Registering an Operator for PyTorch 1.5.0

-## Registering an Operator +##### Registering an Operator 1. Open the **native\_functions.yaml** file. @@ -249,7 +249,7 @@ Currently, the NPU adaptation dispatch principle is as follows: The NPU operator >The formats are for reference only. The function name during operator adaptation must be the same as **NPU\_Adapt\_Fun\_Name**. -## Examples +##### Examples The following uses the torch.add\(\) operator as an example to describe how to register an operator. @@ -335,9 +335,9 @@ The following uses the torch.add\(\) operator as an example to describe how to r -

Registering an Operator for PyTorch 1.8.1

+

Registering an Operator for PyTorch 1.8.1

-## Registering an Operator +##### Registering an Operator 1. Open the **native\_functions.yaml** file. @@ -358,7 +358,7 @@ The following uses the torch.add\(\) operator as an example to describe how to r -## Examples +##### Examples The following uses the torch.add\(\) operator as an example to describe how to register an operator. @@ -423,13 +423,13 @@ The following uses the torch.add\(\) operator as an example to describe how to r -

Developing an Operator Adaptation Plugin

+

Developing an Operator Adaptation Plugin

-## Overview
+#### Overview

You can develop an operator adaptation plugin that converts the formats of the input parameters, output parameters, and attributes of a PyTorch native operator into the formats expected by the corresponding TBE operator. The PyTorch source code adapted to Ascend AI Processors provides methods for adaptation association, type conversion and discrimination, and dynamic shape processing.

-## Adaptation Plugin Implementation
+#### Adaptation Plugin Implementation

1. Create an adaptation plugin file.

@@ -466,7 +466,7 @@ You can develop an operator adaptation plugin to convert the formats of the inpu

    - **m** is a fixed field.

-## Example
+#### Example

The following uses the torch.add\(\) operator as an example to describe how to adapt an operator.

@@ -631,9 +631,9 @@ The following uses the torch.add\(\) operator as an example to describe how to a

>![](public_sys-resources/icon-note.gif) **NOTE:**
>For details about the implementation code of **AddKernelNpu.cpp**, see the **pytorch/aten/src/ATen/native/npu/AddKernelNpu.cpp** document.

-

Compiling and Installing the PyTorch Framework

+

Compiling and Installing the PyTorch Framework

-## Compiling the PyTorch Framework +#### Compiling the PyTorch Framework 1. Go to the PyTorch working directory **pytorch/pytorch**. 2. Install the dependency. @@ -653,7 +653,7 @@ The following uses the torch.add\(\) operator as an example to describe how to a Specify the Python version in the environment for compilation. After the compilation is successful, the binary package **torch-\*.whl** is generated in the **pytorch/pytorch/dist** directory, for example, **torch-1.5.0+ascend.post3-cp37-cp37m-linux\_x86.whl** or **torch-1.8.1+ascend-cp37-cp37m-linux\_x86.whl**. -## Installing the PyTorch Framework +#### Installing the PyTorch Framework Go to the **pytorch/pytorch/dist** directory and run the following command to install PyTorch: @@ -669,34 +669,34 @@ _**\{arch\}**_ indicates the architecture information. The value can be **aarc After the code has been modified, you need to re-compile and re-install PyTorch. -

Operator Function Verification

+

Operator Function Verification

-- **[Overview](#overview-0.md)** +- **[Overview](#overview-0md)** -- **[Implementation](#implementation.md)** +- **[Implementation](#implementationmd)** -

Overview

+

Overview

-## Introduction +#### Introduction After operator adaptation is complete, you can run the PyTorch operator adapted to Ascend AI Processor to verify the operator running result. Operator verification involves all deliverables generated during operator development, including the implementation files, operator prototype definitions, operator information library, and operator plugins. This section describes only the verification method. -## Test Cases and Records +#### Test Cases and Records Use the PyTorch frontend to construct the custom operator function and run the function to verify the custom operator functions. The test cases and test tools are provided in the **pytorch/test/test\_npu** directory at **https://gitee.com/ascend/pytorch**. -

Implementation

+

Implementation

-## Introduction +#### Introduction This section describes how to test the functions of a PyTorch operator. -## Procedure +#### Procedure 1. Set environment variables. @@ -768,36 +768,36 @@ This section describes how to test the functions of a PyTorch operator. ``` -
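For orientation, the following is the general shape of such a function test as a standalone unittest; the shapes and tolerance are illustrative, and real cases should reuse the utilities under **pytorch/test/test\_npu**:

```
import unittest
import torch

class TestAdd(unittest.TestCase):
    def test_add_float32(self):
        a, b = torch.rand(2, 3), torch.rand(2, 3)
        expect = torch.add(a, b)                    # CPU reference result
        actual = torch.add(a.npu(), b.npu()).cpu()  # NPU result moved back
        self.assertTrue(torch.allclose(expect, actual, rtol=1e-3))

if __name__ == "__main__":
    unittest.main()
```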

FAQs

+

FAQs

-- **[Pillow==5.3.0 Installation Failed](#pillow-5-3-0-installation-failed.md)** +- **[Pillow==5.3.0 Installation Failed](#pillow-5-3-0-installation-failedmd)** -- **[pip3.7 install torchvision Installation Failed](#pip3-7-install-torchvision-installation-failed.md)** +- **[pip3.7 install torchvision Installation Failed](#pip3-7-install-torchvision-installation-failedmd)** -- **["torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed](#torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installed.md)** +- **["torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed](#torch-1-5-0xxxx-and-torchvision-do-not-match-when-torch--whl-is-installedmd)** -- **[How Do I View Test Run Logs?](#how-do-i-view-test-run-logs.md)** +- **[How Do I View Test Run Logs?](#how-do-i-view-test-run-logsmd)** -- **[Why Cannot the Custom TBE Operator Be Called?](#why-cannot-the-custom-tbe-operator-be-called.md)** +- **[Why Cannot the Custom TBE Operator Be Called?](#why-cannot-the-custom-tbe-operator-be-calledmd)** -- **[How Do I Determine Whether the TBE Operator Is Correctly Called for PyTorch Adaptation?](#how-do-i-determine-whether-the-tbe-operator-is-correctly-called-for-pytorch-adaptation.md)** +- **[How Do I Determine Whether the TBE Operator Is Correctly Called for PyTorch Adaptation?](#how-do-i-determine-whether-the-tbe-operator-is-correctly-called-for-pytorch-adaptationmd)** -- **[PyTorch Compilation Fails and the Message "error: ld returned 1 exit status" Is Displayed](#pytorch-compilation-fails-and-the-message-error-ld-returned-1-exit-status-is-displayed.md)** +- **[PyTorch Compilation Fails and the Message "error: ld returned 1 exit status" Is Displayed](#pytorch-compilation-fails-and-the-message-error-ld-returned-1-exit-status-is-displayedmd)** -- **[PyTorch Compilation Fails and the Message "error: call of overload...." Is Displayed](#pytorch-compilation-fails-and-the-message-error-call-of-overload-is-displayed.md)** +- **[PyTorch Compilation Fails and the Message "error: call of overload...." Is Displayed](#pytorch-compilation-fails-and-the-message-error-call-of-overload-is-displayedmd)** -

Pillow==5.3.0 Installation Failed

+

Pillow==5.3.0 Installation Failed

-## Symptom +#### Symptom **Pillow==5.3.0** installation failed. -## Possible Cause +#### Possible Cause Necessary dependencies are missing, such as libjpeg, python-devel, zlib-devel, and libjpeg-turbo-devel. -## Solutions +#### Solutions Run the following command to install the required dependencies: @@ -805,17 +805,17 @@ Run the following command to install the required dependencies: apt-get install libjpeg python-devel zlib-devel libjpeg-turbo-devel ``` -

pip3.7 install torchvision Installation Failed

+

pip3.7 install torchvision Installation Failed

-## Symptom +#### Symptom **pip3.7 install torchvision** installation failed. -## Possible Cause +#### Possible Cause The versions of PyTorch and TorchVision do not match. -## Solutions +#### Solutions Run the following command: @@ -823,9 +823,9 @@ Run the following command: pip3.7 install torchvision --no-deps ``` -

"torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed

+

"torch 1.5.0xxxx" and "torchvision" Do Not Match When torch-\*.whl Is Installed

-## Symptom
+#### Symptom

During the installation of **torch-**_\*_**.whl**, the message "ERROR: torchvision 0.6.0 has requirement torch==1.5.0, but you'll have torch 1.5.0a0+1977093 which is incompatible" is displayed.

@@ -833,15 +833,15 @@

However, the installation is successful.

-## Possible Cause
+#### Possible Cause

When PyTorch is installed, a version check is automatically triggered. The torchvision version installed in the environment is 0.6.0, and during the check the version of **torch-**_\*_**.whl** is found to be inconsistent with the required version 1.5.0. As a result, the error message is displayed.

-## Solutions
+#### Solutions

This problem has no impact on the actual result, and no action is required.

-

How Do I View Test Run Logs?

+

How Do I View Test Run Logs?

When an error message is displayed during the test, but the reference information is insufficient, how can we view more detailed run logs? @@ -862,21 +862,21 @@ Output the logs to the screen and redirect them to a specified text file. ``` -

Why Cannot the Custom TBE Operator Be Called?

+

Why Cannot the Custom TBE Operator Be Called?

-## Symptom
+#### Symptom

The custom TBE operator has been developed and adapted to PyTorch. However, the newly developed operator cannot be called during test case execution.

-## Possible Cause
+#### Possible Cause

- The environment variables are not set correctly.
- An error occurs in the YAML file. As a result, the operator is not correctly dispatched.
- The implementation of the custom TBE operator is incorrect. As a result, the operator cannot be called.

-## Solutions
+#### Solutions

-1. Set the operating environment by referring to [Verifying Operator Functions](#operator-function-verification.md). Pay special attention to the following settings:
+1. Set the operating environment by referring to [Operator Function Verification](#operator-function-verificationmd). Pay special attention to the following settings:

    ```
    . /home/HwHiAiUser/Ascend/ascend-toolkit/set_env.sh
    ```

@@ -916,7 +916,7 @@

-

How Do I Determine Whether the TBE Operator Is Correctly Called for PyTorch Adaptation?

+

How Do I Determine Whether the TBE Operator Is Correctly Called for PyTorch Adaptation?

Both the custom and built-in operators are stored in the installation directory as .py source code after installation. Therefore, you can edit the source code and add logs at the API entry to print the input parameters, and determine whether the input parameters are correct. @@ -966,15 +966,15 @@ The following uses the **zn\_2\_nchw** operator in the built-in operator packa ![](figures/en-us_image_0000001144082072.png) -
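For illustration, a generic sketch of such a log line added at the top of an operator's Python entry in the installed package; the function name and parameters here are hypothetical, not the actual **zn\_2\_nchw** signature:

```
# Hypothetical operator entry in the installed .py source; only the print is added.
def my_op_compute(input_x, output_y, kernel_name="my_op"):
    print("[my_op] input_x=%s, output_y=%s, kernel_name=%s"
          % (input_x, output_y, kernel_name))
    # ... original implementation continues unchanged ...
```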

PyTorch Compilation Fails and the Message "error: ld returned 1 exit status" Is Displayed

+

PyTorch Compilation Fails and the Message "error: ld returned 1 exit status" Is Displayed

-## Symptom +#### Symptom PyTorch compilation fails and the message "error: ld returned 1 exit status" is displayed. ![](figures/en-us_image_0000001190201973.png) -## Possible Cause +#### Possible Cause According to the log analysis, the possible cause is that the adaptation function implemented in _xxxx_**KernelNpu.cpp** does not match the dispatch implementation API parameters required by the PyTorch framework operator. In the preceding example, the function is **binary\_cross\_entropy\_npu**. Open the corresponding _xxxx_**KernelNpu.cpp** file and find the adaptation function. @@ -982,13 +982,13 @@ According to the log analysis, the possible cause is that the adaptation functio In the implementation, the type of the last parameter is **int**, which does not match the required **long**. -## Solutions +#### Solutions Modify the adaptation function implemented in _xxxx_**KernelNpu.cpp**. In the preceding example, change the type of the last parameter in the **binary\_cross\_entropy\_npu** function to **int64\_t** \(use **int64\_t** instead of **long** in the .cpp file\). -

PyTorch Compilation Fails and the Message "error: call of overload...." Is Displayed

+

PyTorch Compilation Fails and the Message "error: call of overload...." Is Displayed

-## Symptom +#### Symptom PyTorch compilation fails and the message "error: call of overload...." is displayed. @@ -996,7 +996,7 @@ PyTorch compilation fails and the message "error: call of overload...." is displ ![](figures/en-us_image_0000001190201935.png) -## Possible Cause +#### Possible Cause According to the log analysis, the error is located in line 30 in the _xxxx_**KernelNpu.cpp** file, indicating that the **NPUAttrDesc** parameter is invalid. In the preceding example, the function is **binary\_cross\_entropy\_attr**. Open the corresponding _xxxx_**KernelNpu.cpp** file and find the adaptation function. @@ -1004,20 +1004,20 @@ According to the log analysis, the error is located in line 30 in the _xxxx_**K In the implementation, the type of the second input parameter of **NPUAttrDesc** is **int**, which does not match the definition of **NPUAttrDesc**. -## Solutions +#### Solutions 1. Replace the incorrect code line in the **binary\_cross\_entropy\_attr\(\)** function with the code in the preceding comment. 2. Change the input parameter type of **binary\_cross\_entropy\_attr\(\)** to **int64\_t**. -

Appendixes

+

Appendixes

-- **[Installing CMake](#installing-cmake.md)** +- **[Installing CMake](#installing-cmakemd)** -- **[Exporting a Custom Operator](#exporting-a-custom-operator.md)** +- **[Exporting a Custom Operator](#exporting-a-custom-operatormd)** -

Installing CMake

+

Installing CMake

The following describes how to upgrade CMake to 3.12.1. @@ -1056,17 +1056,17 @@ The following describes how to upgrade CMake to 3.12.1. If the message "cmake version 3.12.1" is displayed, the installation is successful. -

Exporting a Custom Operator

+

Exporting a Custom Operator

-## Overview
+#### Overview

If a PyTorch model contains a custom operator, you can export the custom operator as an ONNX single-operator model, which can be easily ported to other AI frameworks. Three types of custom operator export are available: NPU-adapted TBE operator export, C++ operator export, and pure Python operator export.

-## Prerequisites
+#### Prerequisites

You have installed the PyTorch framework.

-## TBE Operator Export
+#### TBE Operator Export

A TBE operator can be exported using either of the following methods:

@@ -1221,7 +1221,7 @@ Method 2:

>![](public_sys-resources/icon-note.gif) **NOTE:**
>For details about the implementation code, see [test\_custom\_ops\_npu\_demo.py](https://gitee.com/ascend/pytorch/blob/master/test/test_npu/test_onnx/torch.onnx/custom_ops_demo/test_custom_ops_npu_demo.py). If you do not have the permission to obtain the code, contact Huawei technical support to join the **Ascend** organization.

-## C++ Operator Export
+#### C++ Operator Export

1. Customize an operator.

@@ -1287,7 +1287,7 @@ Method 2:

>![](public_sys-resources/icon-note.gif) **NOTE:**
>For details about the implementation code, see [test\_custom\_ops\_demo.py](https://gitee.com/ascend/pytorch/blob/master/test/test_npu/test_onnx/torch.onnx/custom_ops_demo/test_custom_ops_demo.py). If you do not have the permission to obtain the code, contact Huawei technical support to join the **Ascend** organization.

-## Pure Python Operator Export
+#### Pure Python Operator Export

1. Customize an operator.

diff --git a/docs/en/PyTorch Operator Support/PyTorch Operator Support.md b/docs/en/PyTorch Operator Support/PyTorch Operator Support.md
index 29c16ee65f..6488217b0b 100644
--- a/docs/en/PyTorch Operator Support/PyTorch Operator Support.md
+++ b/docs/en/PyTorch Operator Support/PyTorch Operator Support.md
@@ -1,7 +1,7 @@
 # PyTorch Operator Support
-- [Mapping Between PyTorch Native Operators and Ascend Adapted Operators](#mapping-between-pytorch-native-operators-and-ascend-adapted-operators.md)
-- [PyTorch Operators Customized by Ascend](#pytorch-operators-customized-by-ascend.md)
+- [Mapping Between PyTorch Native Operators and Ascend Adapted Operators](#mapping-between-pytorch-native-operators-and-ascend-adapted-operatorsmd)
+- [PyTorch Operators Customized by Ascend](#pytorch-operators-customized-by-ascendmd)
-

Mapping Between PyTorch Native Operators and Ascend Adapted Operators

+- [Mapping Between PyTorch Native Operators and Ascend Adapted Operators](#mapping-between-pytorch-native-operators-and-ascend-adapted-operatorsmd) +- [PyTorch Operators Customized by Ascend](#pytorch-operators-customized-by-ascendmd) +

Mapping Between PyTorch Native Operators and Ascend Adapted Operators

No.

@@ -5643,7 +5643,7 @@
-

PyTorch Operators Customized by Ascend

+

PyTorch Operators Customized by Ascend

No.

diff --git a/docs/en/RELEASENOTE/RELEASENOTE.md b/docs/en/RELEASENOTE/RELEASENOTE.md index 8bc732c5de..372600c758 100644 --- a/docs/en/RELEASENOTE/RELEASENOTE.md +++ b/docs/en/RELEASENOTE/RELEASENOTE.md @@ -1,40 +1,40 @@ # FrameworkPTAdapter 2.0.3 Release Notes -- [FrameworkPTAdapter 2.0.3](#frameworkptadapter-2-0-3.md) - - [Before You Start](#before-you-start.md) - - [New Features](#new-features.md) - - [Modified Features](#modified-features.md) - - [Resolved Issues](#resolved-issues.md) - - [Known Issues](#known-issues.md) - - [Compatibility](#compatibility.md) -- [FrameworkPTAdapter 2.0.2](#frameworkptadapter-2-0-2.md) - - [Before You Start](#before-you-start-0.md) - - [New Features](#new-features-1.md) - - [Modified Features](#modified-features-2.md) - - [Resolved Issues](#resolved-issues-3.md) - - [Known Issues](#known-issues-4.md) - - [Compatibility](#compatibility-5.md) -

FrameworkPTAdapter 2.0.3

+- [FrameworkPTAdapter 2.0.3](#frameworkptadapter-2-0-3md) + - [Before You Start](#before-you-startmd) + - [New Features](#new-featuresmd) + - [Modified Features](#modified-featuresmd) + - [Resolved Issues](#resolved-issuesmd) + - [Known Issues](#known-issuesmd) + - [Compatibility](#compatibilitymd) +- [FrameworkPTAdapter 2.0.2](#frameworkptadapter-2-0-2md) + - [Before You Start](#before-you-start-0md) + - [New Features](#new-features-1md) + - [Modified Features](#modified-features-2md) + - [Resolved Issues](#resolved-issues-3md) + - [Known Issues](#known-issues-4md) + - [Compatibility](#compatibility-5md) +

FrameworkPTAdapter 2.0.3

-- **[Before You Start](#before-you-start.md)** +- **[Before You Start](#before-you-startmd)** -- **[New Features](#new-features.md)** +- **[New Features](#new-featuresmd)** -- **[Modified Features](#modified-features.md)** +- **[Modified Features](#modified-featuresmd)** -- **[Resolved Issues](#resolved-issues.md)** +- **[Resolved Issues](#resolved-issuesmd)** -- **[Known Issues](#known-issues.md)** +- **[Known Issues](#known-issuesmd)** -- **[Compatibility](#compatibility.md)** +- **[Compatibility](#compatibilitymd)** -

Before You Start

+

Before You Start

This framework is modified based on the open-source PyTorch 1.5.0 developed by Facebook, inherits native PyTorch features, and uses NPUs for dynamic graph training. Models are adapted at the operator granularity, code can be reused, and existing networks can be ported to and run on NPUs with only device types or data types modified.

PyTorch 1.8.1 is supported from this version onward. The 1.8.1 adaptation inherits the features of the 1.5.0 adaptation and provides the same functions, except for the profiling tool, and it also optimizes the backend operator adaptation. Currently, PyTorch 1.8.1 supports only the ResNet-50 network model.

-

New Features

+

New Features

**Table 1** Features supported by PyTorch @@ -209,15 +209,15 @@ PyTorch 1.8.1 is supported by this version and later, and this version inherits
-

Modified Features

+

Modified Features

N/A -

Resolved Issues

+

Resolved Issues

N/A -

Known Issues

+

Known Issues

Known Issue

@@ -256,32 +256,32 @@ N/A
-

Compatibility

+

Compatibility

Atlas 800 \(model 9010\): CentOS 7.6, Ubuntu 18.04, BC-Linux 7.6, Debian 9.9, Debian 10, and openEuler 20.03 LTS. Atlas 800 \(model 9000\): CentOS 7.6, Euler 2.8, Kylin v10, BC-Linux 7.6, OpenEuler 20.03 LTS, and UOS 20 1020e. -

FrameworkPTAdapter 2.0.2

+

FrameworkPTAdapter 2.0.2

-- **[Before You Start](#before-you-start-0.md)** +- **[Before You Start](#before-you-start-0md)** -- **[New Features](#new-features-1.md)** +- **[New Features](#new-features-1md)** -- **[Modified Features](#modified-features-2.md)** +- **[Modified Features](#modified-features-2md)** -- **[Resolved Issues](#resolved-issues-3.md)** +- **[Resolved Issues](#resolved-issues-3md)** -- **[Known Issues](#known-issues-4.md)** +- **[Known Issues](#known-issues-4md)** -- **[Compatibility](#compatibility-5.md)** +- **[Compatibility](#compatibility-5md)** -

Before You Start

+

Before You Start

This framework is modified based on the open-source PyTorch 1.5.0 primarily developed by Facebook, inherits native PyTorch features, and uses NPUs for dynamic graph training. Models are adapted at the operator granularity, code can be reused, and existing networks can be ported to and run on NPUs with only device types or data types modified.

-

New Features

+

New Features

**Table 1** Features supported by PyTorch @@ -356,15 +356,15 @@ This framework is modified based on the open-source PyTorch 1.5.0 primarily deve -

Modified Features

+

Modified Features

N/A -

Resolved Issues

+

Resolved Issues

N/A -

Known Issues

+

Known Issues

Known Issue

@@ -403,7 +403,7 @@ N/A
-

Compatibility

+

Compatibility

Atlas 800 \(model 9010\): CentOS 7.6/Ubuntu 18.04/BC-Linux 7.6/Debian 9.9/Debian 10/openEuler 20.03 LTS -- Gitee