From 44433f91ef202446af7323b2a9ae144b5002bbde Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 14:17:39 +0800 Subject: [PATCH 01/11] update llama2-13b and qwen-7b results Signed-off-by: majorli --- models/nlp/large_language_model/llama2-13b/trtllm/README.md | 6 ++++++ .../qwen-7b/text-generation-inference/README.md | 6 ++++++ 2 files changed, 12 insertions(+) diff --git a/models/nlp/large_language_model/llama2-13b/trtllm/README.md b/models/nlp/large_language_model/llama2-13b/trtllm/README.md index d10629b1..afb1d13d 100755 --- a/models/nlp/large_language_model/llama2-13b/trtllm/README.md +++ b/models/nlp/large_language_model/llama2-13b/trtllm/README.md @@ -53,3 +53,9 @@ bash scripts/test_trtllm_llama2_13b_gpu2_build.sh # Inference bash scripts/test_trtllm_llama2_13b_gpu2.sh ``` + +## Results + +| Model | tokens | tokens per second | +| ---------- | ------ | ----------------- | +| Llama2 13B | 1596 | 33.39 | diff --git a/models/nlp/large_language_model/qwen-7b/text-generation-inference/README.md b/models/nlp/large_language_model/qwen-7b/text-generation-inference/README.md index 65bf7afa..348850b3 100644 --- a/models/nlp/large_language_model/qwen-7b/text-generation-inference/README.md +++ b/models/nlp/large_language_model/qwen-7b/text-generation-inference/README.md @@ -53,3 +53,9 @@ ENABLE_INFER_PG=1 CUDA_VISIBLE_DEVICES=0 USE_FLASH_ATTENTION=true text-generatio export CUDA_VISIBLE_DEVICES=1 python3 offline_inference.py --model2path ./data/qwen-7B ``` + +## Results + +| Model | QPS | +| ------- | ----- | +| Qwen-7B | 35.64 | -- Gitee From 924165b3fbe6f5b6c7b79c8333a63e6f4b276180 Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 14:19:55 +0800 Subject: [PATCH 02/11] rename chatglm3-6b path name Signed-off-by: majorli --- .../large_language_model/{chatglm => chatglm3-6b}/vllm/README.md | 0 .../{chatglm => chatglm3-6b}/vllm/offline_inference.py | 0 .../{chatglm => chatglm3-6b}/vllm/server_inference.py | 0 .../large_language_model/{chatglm => 
chatglm3-6b}/vllm/utils.py | 0 4 files changed, 0 insertions(+), 0 deletions(-) rename models/nlp/large_language_model/{chatglm => chatglm3-6b}/vllm/README.md (100%) rename models/nlp/large_language_model/{chatglm => chatglm3-6b}/vllm/offline_inference.py (100%) rename models/nlp/large_language_model/{chatglm => chatglm3-6b}/vllm/server_inference.py (100%) rename models/nlp/large_language_model/{chatglm => chatglm3-6b}/vllm/utils.py (100%) diff --git a/models/nlp/large_language_model/chatglm/vllm/README.md b/models/nlp/large_language_model/chatglm3-6b/vllm/README.md similarity index 100% rename from models/nlp/large_language_model/chatglm/vllm/README.md rename to models/nlp/large_language_model/chatglm3-6b/vllm/README.md diff --git a/models/nlp/large_language_model/chatglm/vllm/offline_inference.py b/models/nlp/large_language_model/chatglm3-6b/vllm/offline_inference.py similarity index 100% rename from models/nlp/large_language_model/chatglm/vllm/offline_inference.py rename to models/nlp/large_language_model/chatglm3-6b/vllm/offline_inference.py diff --git a/models/nlp/large_language_model/chatglm/vllm/server_inference.py b/models/nlp/large_language_model/chatglm3-6b/vllm/server_inference.py similarity index 100% rename from models/nlp/large_language_model/chatglm/vllm/server_inference.py rename to models/nlp/large_language_model/chatglm3-6b/vllm/server_inference.py diff --git a/models/nlp/large_language_model/chatglm/vllm/utils.py b/models/nlp/large_language_model/chatglm3-6b/vllm/utils.py similarity index 100% rename from models/nlp/large_language_model/chatglm/vllm/utils.py rename to models/nlp/large_language_model/chatglm3-6b/vllm/utils.py -- Gitee From a9fb598991a038d3c85201fdcc9e2022ba2773cc Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 14:41:03 +0800 Subject: [PATCH 03/11] igie rexnext50 not support int8 Signed-off-by: majorli --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 
0e015465..94454b72 100644 --- a/README.md +++ b/README.md @@ -227,7 +227,7 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 INT8 - Supported + - - -- Gitee From 98f48aa27961920302b4bb9cedbd18b28d478551 Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 15:03:56 +0800 Subject: [PATCH 04/11] add igie models to model list - part 1 Signed-off-by: majorli --- README.md | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 92 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 94454b72..4dcd4018 100644 --- a/README.md +++ b/README.md @@ -76,6 +76,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - - + + DenseNet161 + FP16 + Supported + - + + + INT8 + - + - + EfficientNet-B0 FP16 @@ -90,7 +101,7 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 EfficientNet_B1 FP16 - - + Supported Supported @@ -98,6 +109,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - Supported + + EfficientNetv2_rw_t + FP16 + Supported + - + + + INT8 + - + - + GoogLeNet FP16 @@ -143,9 +165,20 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 Supported - MobileNetV3 + MobileNetV3_Large FP16 + Supported + - + + + INT8 - + - + + + MobileNetV3_Small + FP16 + Supported Supported @@ -153,6 +186,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - - + + RegNet_x_1_6gf + FP16 + Supported + - + + + INT8 + - + - + RepVGG FP16 @@ -167,9 +211,20 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 Res2Net50 FP16 + Supported + Supported + + + INT8 - Supported + + ResNeSt50 + FP16 + Supported + - + INT8 - @@ -178,12 +233,23 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 ResNet101 FP16 - - - Supported + Supported + Supported INT8 + Supported + Supported + + + ResNet152 + FP16 + Supported - + + + INT8 + Supported - @@ -241,6 +307,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - - + + ShuffleNetV2_x0_5 + FP16 + Supported + - + + + INT8 + - + - + SqueezeNet 1.0 FP16 @@ -274,6 +351,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 Supported - + + Wide_ResNet50 + FP16 + Supported + - + + + INT8 + Supported + - + ### Detection -- 
Gitee From 4f7813ac49933b5fc96d1e36752653249e4dba31 Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 15:16:26 +0800 Subject: [PATCH 05/11] add igie models to model list - part 2 Signed-off-by: majorli --- README.md | 44 +++++++++++++++++++++++++++ models/cv/trace/repnet/igie/README.md | 2 +- 2 files changed, 45 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 4dcd4018..c9dd81e7 100644 --- a/README.md +++ b/README.md @@ -373,6 +373,39 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 IGIE IxRT + + CenterNet + FP16 + Supported + - + + + INT8 + - + - + + + FoveaBox + FP16 + Supported + - + + + INT8 + - + - + + + HRNet + FP16 + Supported + - + + + INT8 + - + - + RetinaNet FP16 @@ -516,6 +549,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 Supported - + + RepNet-Vehicle-ReID + FP16 + Supported + - + + + INT8 + - + - + ## NLP diff --git a/models/cv/trace/repnet/igie/README.md b/models/cv/trace/repnet/igie/README.md index a37c6c8b..03659b81 100644 --- a/models/cv/trace/repnet/igie/README.md +++ b/models/cv/trace/repnet/igie/README.md @@ -1,4 +1,4 @@ -# RepNet-VehicleReID +# RepNet-Vehicle-ReID ## Description -- Gitee From f1604f20b3d822daba1ad8d06d2d0a4d22259367 Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 15:46:18 +0800 Subject: [PATCH 06/11] add ixrt models to model list - part 1 Signed-off-by: majorli --- README.md | 48 ++++++++++++++++++- .../classification/densenet121/ixrt/README.md | 6 +-- .../efficientnet_v2/ixrt/README.md | 2 +- .../inceptionresnetv2/ixrt/README.md | 2 +- .../resnet_v1_d50/ixrt/README.md | 10 ++-- .../squeezenet_1.1/ixrt/README.md | 8 ++-- 6 files changed, 60 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index c9dd81e7..8b758361 100644 --- a/README.md +++ b/README.md @@ -69,7 +69,7 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 DenseNet121 FP16 Supported - - + Supported INT8 @@ -109,6 +109,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - Supported + + EfficientNetV2 + FP16 + - + Supported + + + 
INT8 + - + Supported + EfficientNetv2_rw_t FP16 @@ -146,12 +157,23 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 InceptionV3 FP16 Supported - - + Supported INT8 Supported + Supported + + + Inception_ResNet_V2 + FP16 + - + Supported + + + INT8 - + Supported MobileNetV2 @@ -285,6 +307,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 Supported - + + ResNet_V1_D50 + FP16 + - + Supported + + + INT8 + - + Supported + ResNeXt50_32x4d FP16 @@ -329,6 +362,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - Supported + + SqueezeNet 1.1 + FP16 + - + Supported + + + INT8 + - + Supported + Swin Transformer FP16 diff --git a/models/cv/classification/densenet121/ixrt/README.md b/models/cv/classification/densenet121/ixrt/README.md index 3468b21a..9b5c2078 100644 --- a/models/cv/classification/densenet121/ixrt/README.md +++ b/models/cv/classification/densenet121/ixrt/README.md @@ -54,6 +54,6 @@ bash scripts/infer_densenet_fp16_performance.sh ## Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -DenseNet | | FP16 | 1536.89 | 0.7442 | 0.9197 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +| -------- | --------- | --------- | ------- | -------- | -------- | +| DenseNet | 32 | FP16 | 1536.89 | 0.7442 | 0.9197 | diff --git a/models/cv/classification/efficientnet_v2/ixrt/README.md b/models/cv/classification/efficientnet_v2/ixrt/README.md index 098105ce..88ccc2aa 100755 --- a/models/cv/classification/efficientnet_v2/ixrt/README.md +++ b/models/cv/classification/efficientnet_v2/ixrt/README.md @@ -1,4 +1,4 @@ -# EfficientnetV2 +# EfficientNetV2 ## Description diff --git a/models/cv/classification/inceptionresnetv2/ixrt/README.md b/models/cv/classification/inceptionresnetv2/ixrt/README.md index c0be6674..64690193 100755 --- a/models/cv/classification/inceptionresnetv2/ixrt/README.md +++ b/models/cv/classification/inceptionresnetv2/ixrt/README.md @@ -1,4 +1,4 @@ -# InceptionResNetV2 +# Inception-ResNetV2 ## 
Description diff --git a/models/cv/classification/resnet_v1_d50/ixrt/README.md b/models/cv/classification/resnet_v1_d50/ixrt/README.md index 06a1ed34..42880951 100644 --- a/models/cv/classification/resnet_v1_d50/ixrt/README.md +++ b/models/cv/classification/resnet_v1_d50/ixrt/README.md @@ -1,4 +1,4 @@ -# ResNet50 +# ResNet_V1_D50 ## Description @@ -64,7 +64,7 @@ bash scripts/infer_resnet_v1_d50_int8_performance.sh ## Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -ResNet50 | | FP16 | 3887.55 | 0.77544 | 0.93568 -ResNet50 | | INT8 | 7148.58 | 0.7711 | 0.93514 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +| ------------- | --------- | --------- | ------- | -------- | -------- | +| ResNet_V1_D50 | 32 | FP16 | 3887.55 | 0.77544 | 0.93568 | +| ResNet_V1_D50 | 32 | INT8 | 7148.58 | 0.7711 | 0.93514 | diff --git a/models/cv/classification/squeezenet_1.1/ixrt/README.md b/models/cv/classification/squeezenet_1.1/ixrt/README.md index 088ee0ad..08fe037a 100644 --- a/models/cv/classification/squeezenet_1.1/ixrt/README.md +++ b/models/cv/classification/squeezenet_1.1/ixrt/README.md @@ -70,7 +70,7 @@ bash scripts/infer_squeezenet_v11_int8_performance.sh ## Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------------|-----------|----------|---------|----------|-------- -SqueezeNet 1.1 | | FP16 | 13701 | 0.58182 | 0.80622 -SqueezeNet 1.1 | | INT8 | 20128 | 0.50966 | 0.77552 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +| -------------- | --------- | --------- | ----- | -------- | -------- | +| SqueezeNet 1.1 | 32 | FP16 | 13701 | 0.58182 | 0.80622 | +| SqueezeNet 1.1 | 32 | INT8 | 20128 | 0.50966 | 0.77552 | -- Gitee From 09940120f10055e0e554ed5b0c3b13487361aeb9 Mon Sep 17 00:00:00 2001 From: majorli Date: Tue, 18 Jun 2024 15:52:50 +0800 Subject: [PATCH 07/11] complete ixrt hrnet missing results Signed-off-by: majorli --- README.md | 4 ++-- 
models/cv/classification/hrnet_w18/ixrt/README.md | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index 8b758361..5b58086e 100644 --- a/README.md +++ b/README.md @@ -146,12 +146,12 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 HRNet-W18 FP16 Supported - - + Supported INT8 - - - + Supported InceptionV3 diff --git a/models/cv/classification/hrnet_w18/ixrt/README.md b/models/cv/classification/hrnet_w18/ixrt/README.md index 00cf3b2e..278d5427 100644 --- a/models/cv/classification/hrnet_w18/ixrt/README.md +++ b/models/cv/classification/hrnet_w18/ixrt/README.md @@ -64,7 +64,7 @@ bash scripts/infer_hrnet_w18_int8_performance.sh ## Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -ResNet50 | | | | | -ResNet50 | | | | | +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +| -------- | --------- | --------- | ------- | -------- | -------- | +| ResNet50 | 32 | FP16 | 1474.26 | 0.76764 | 0.93446 | +| ResNet50 | 32 | INT8 | 1649.40 | 0.76158 | 0.93152 | -- Gitee From 2e6210da16dd300ccd54860ebc093f7ac4d9f861 Mon Sep 17 00:00:00 2001 From: majorli Date: Wed, 19 Jun 2024 09:53:14 +0800 Subject: [PATCH 08/11] add ixrt models to model list - part 2 Signed-off-by: majorli --- README.md | 45 +++++++++++++++++++--- models/cv/detection/detr/ixrt/README.md | 4 +- models/cv/detection/fcos/ixrt/README.md | 28 +++++++++----- models/cv/detection/yolov5s/ixrt/README.md | 2 +- 4 files changed, 61 insertions(+), 18 deletions(-) diff --git a/README.md b/README.md index 5b58086e..866ad422 100644 --- a/README.md +++ b/README.md @@ -428,6 +428,28 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - - + + DETR + FP16 + - + Supported + + + INT8 + - + - + + + FCOS + FP16 + - + Supported + + + INT8 + - + - + FoveaBox FP16 @@ -465,12 +487,12 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 YOLOv3 FP16 Supported - - + Supported INT8 Supported - - + Supported YOLOv4 @@ -487,12 +509,23 @@ 
DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 YOLOv5 FP16 Supported - - + Supported INT8 Supported + Supported + + + YOLOv5s + FP16 - + Supported + + + INT8 + - + Supported YOLOv6 @@ -509,12 +542,12 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 YOLOv7 FP16 Supported - - + Supported INT8 Supported - - + Supported YOLOv8 @@ -676,7 +709,7 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 ------- +--- ## 社区 diff --git a/models/cv/detection/detr/ixrt/README.md b/models/cv/detection/detr/ixrt/README.md index 5d05389d..28df3f60 100755 --- a/models/cv/detection/detr/ixrt/README.md +++ b/models/cv/detection/detr/ixrt/README.md @@ -1,4 +1,4 @@ -# Detr +# DETR ## Description @@ -63,4 +63,4 @@ bash scripts/infer_detr_fp16_performance.sh Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 --------|-----------|----------|----------|----------|------------ -Detr | 1 | FP16 | 65.84 | 0.370 | 0.198 +DETR | 1 | FP16 | 65.84 | 0.370 | 0.198 diff --git a/models/cv/detection/fcos/ixrt/README.md b/models/cv/detection/fcos/ixrt/README.md index 244e8f6a..49db1e04 100755 --- a/models/cv/detection/fcos/ixrt/README.md +++ b/models/cv/detection/fcos/ixrt/README.md @@ -1,4 +1,5 @@ # FCOS + ## Description FCOS is an anchor-free model based on the Fully Convolutional Network (FCN) architecture for pixel-wise object detection. It implements a proposal-free solution and introduces the concept of centerness. @@ -7,8 +8,14 @@ For more details, please refer to our [report on Arxiv](https://arxiv.org/abs/19 ## Setup ### Install -``` -yum install mesa-libGL + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-dev + pip3 install tqdm pip3 install onnx pip3 install onnxsim @@ -36,13 +43,14 @@ sh install_mmcv.sh Pretrained model: -- COCO2017数据集准备参考: https://cocodataset.org/ +- COCO2017数据集准备参考: - 图片目录: Path/To/val2017/*.jpg - 标注文件目录: Path/To/annotations/instances_val2017.json ### Model Conversion MMDetection is an open source object detection toolbox based on PyTorch. 
It is a part of the OpenMMLab project.It is utilized for model conversion. In MMDetection, Execute model conversion command, and the checkpoints folder needs to be created, (mkdir checkpoints) in project + ```bash git clone -b v2.25.0 https://github.com/open-mmlab/mmdetection.git @@ -59,12 +67,13 @@ python3 tools/deployment/pytorch2onnx.py \ --skip-postprocess \ --dynamic-export \ --cfg-options \ - model.test_cfg.deploy_nms_pre=-1 \ - + model.test_cfg.deploy_nms_pre=-1 ``` -If there are issues such as input parameter mismatch during model export, it may be due to ONNX version. To resolve this, please delete the last parameter (dynamic_slice) from the return value of the _slice_helper function in the /usr/local/lib/python3.10/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py file. + +If there are issues such as input parameter mismatch during model export, it may be due to ONNX version. To resolve this, please delete the last parameter (dynamic_slice) from the return value of the _slice_helper function in the /usr/local/lib/python3.10/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py file. 
## Inference + ```bash export PROJ_DIR=./ export DATASETS_DIR=/Path/to/coco/ @@ -73,6 +82,7 @@ export RUN_DIR=./ ``` ### FP16 + ```bash # Accuracy bash scripts/infer_fcos_fp16_accuracy.sh @@ -82,6 +92,6 @@ bash scripts/infer_fcos_fp16_performance.sh ## Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|----------|-------------| -Fcos | 1 | FP16 | 51.62 | 0.546 | 0.360 | \ No newline at end of file +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +| ----- | --------- | --------- | ----- | ------- | ------------ | +| FCOS | 1 | FP16 | 51.62 | 0.546 | 0.360 | diff --git a/models/cv/detection/yolov5s/ixrt/README.md b/models/cv/detection/yolov5s/ixrt/README.md index c189fc70..28e6cf73 100755 --- a/models/cv/detection/yolov5s/ixrt/README.md +++ b/models/cv/detection/yolov5s/ixrt/README.md @@ -1,4 +1,4 @@ -# YOLOv5-s +# YOLOv5s ## Description -- Gitee From f21e756bfc641dddcf4063b2eb8365fa0bf62717 Mon Sep 17 00:00:00 2001 From: majorli Date: Wed, 19 Jun 2024 09:59:27 +0800 Subject: [PATCH 09/11] mv solov1 to segmentation dir Signed-off-by: majorli --- models/cv/{detection => segmentation}/solov1/ixrt/README.md | 0 models/cv/{detection => segmentation}/solov1/ixrt/build_engine.py | 0 .../cv/{detection => segmentation}/solov1/ixrt/coco_instance.py | 0 models/cv/{detection => segmentation}/solov1/ixrt/common.py | 0 .../solov1/ixrt/scripts/infer_solov1_fp16_accuracy.sh | 0 .../solov1/ixrt/scripts/infer_solov1_fp16_performance.sh | 0 .../cv/{detection => segmentation}/solov1/ixrt/simplify_model.py | 0 .../solov1/ixrt/solo_r50_fpn_3x_coco.py | 0 .../cv/{detection => segmentation}/solov1/ixrt/solo_torch2onnx.py | 0 .../{detection => segmentation}/solov1/ixrt/solov1_inference.py | 0 10 files changed, 0 insertions(+), 0 deletions(-) rename models/cv/{detection => segmentation}/solov1/ixrt/README.md (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/build_engine.py (100%) rename 
models/cv/{detection => segmentation}/solov1/ixrt/coco_instance.py (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/common.py (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/scripts/infer_solov1_fp16_accuracy.sh (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/scripts/infer_solov1_fp16_performance.sh (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/simplify_model.py (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/solo_r50_fpn_3x_coco.py (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/solo_torch2onnx.py (100%) rename models/cv/{detection => segmentation}/solov1/ixrt/solov1_inference.py (100%) diff --git a/models/cv/detection/solov1/ixrt/README.md b/models/cv/segmentation/solov1/ixrt/README.md similarity index 100% rename from models/cv/detection/solov1/ixrt/README.md rename to models/cv/segmentation/solov1/ixrt/README.md diff --git a/models/cv/detection/solov1/ixrt/build_engine.py b/models/cv/segmentation/solov1/ixrt/build_engine.py similarity index 100% rename from models/cv/detection/solov1/ixrt/build_engine.py rename to models/cv/segmentation/solov1/ixrt/build_engine.py diff --git a/models/cv/detection/solov1/ixrt/coco_instance.py b/models/cv/segmentation/solov1/ixrt/coco_instance.py similarity index 100% rename from models/cv/detection/solov1/ixrt/coco_instance.py rename to models/cv/segmentation/solov1/ixrt/coco_instance.py diff --git a/models/cv/detection/solov1/ixrt/common.py b/models/cv/segmentation/solov1/ixrt/common.py similarity index 100% rename from models/cv/detection/solov1/ixrt/common.py rename to models/cv/segmentation/solov1/ixrt/common.py diff --git a/models/cv/detection/solov1/ixrt/scripts/infer_solov1_fp16_accuracy.sh b/models/cv/segmentation/solov1/ixrt/scripts/infer_solov1_fp16_accuracy.sh similarity index 100% rename from models/cv/detection/solov1/ixrt/scripts/infer_solov1_fp16_accuracy.sh rename to 
models/cv/segmentation/solov1/ixrt/scripts/infer_solov1_fp16_accuracy.sh diff --git a/models/cv/detection/solov1/ixrt/scripts/infer_solov1_fp16_performance.sh b/models/cv/segmentation/solov1/ixrt/scripts/infer_solov1_fp16_performance.sh similarity index 100% rename from models/cv/detection/solov1/ixrt/scripts/infer_solov1_fp16_performance.sh rename to models/cv/segmentation/solov1/ixrt/scripts/infer_solov1_fp16_performance.sh diff --git a/models/cv/detection/solov1/ixrt/simplify_model.py b/models/cv/segmentation/solov1/ixrt/simplify_model.py similarity index 100% rename from models/cv/detection/solov1/ixrt/simplify_model.py rename to models/cv/segmentation/solov1/ixrt/simplify_model.py diff --git a/models/cv/detection/solov1/ixrt/solo_r50_fpn_3x_coco.py b/models/cv/segmentation/solov1/ixrt/solo_r50_fpn_3x_coco.py similarity index 100% rename from models/cv/detection/solov1/ixrt/solo_r50_fpn_3x_coco.py rename to models/cv/segmentation/solov1/ixrt/solo_r50_fpn_3x_coco.py diff --git a/models/cv/detection/solov1/ixrt/solo_torch2onnx.py b/models/cv/segmentation/solov1/ixrt/solo_torch2onnx.py similarity index 100% rename from models/cv/detection/solov1/ixrt/solo_torch2onnx.py rename to models/cv/segmentation/solov1/ixrt/solo_torch2onnx.py diff --git a/models/cv/detection/solov1/ixrt/solov1_inference.py b/models/cv/segmentation/solov1/ixrt/solov1_inference.py similarity index 100% rename from models/cv/detection/solov1/ixrt/solov1_inference.py rename to models/cv/segmentation/solov1/ixrt/solov1_inference.py -- Gitee From cd128bd44f53b57db1434e18e009ab8638b53f94 Mon Sep 17 00:00:00 2001 From: majorli Date: Wed, 19 Jun 2024 10:16:01 +0800 Subject: [PATCH 10/11] add ixrt models to model list - part 3 Signed-off-by: majorli --- README.md | 13 ++++- models/cv/segmentation/solov1/ixrt/README.md | 2 +- .../bert_base_squad/igie/README.md | 6 +-- .../bert_base_squad/ixrt/README.md | 51 ++++++++++--------- 4 files changed, 44 insertions(+), 28 deletions(-) diff --git a/README.md 
b/README.md index 866ad422..190ed725 100644 --- a/README.md +++ b/README.md @@ -593,6 +593,17 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 - - + + SOLOv1 + FP16 + - + Supported + + + INT8 + - + - + ### Trace @@ -670,7 +681,7 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 INT8 - - - + Supported BERT Large SQuAD diff --git a/models/cv/segmentation/solov1/ixrt/README.md b/models/cv/segmentation/solov1/ixrt/README.md index fbe6fd97..d675f549 100644 --- a/models/cv/segmentation/solov1/ixrt/README.md +++ b/models/cv/segmentation/solov1/ixrt/README.md @@ -69,4 +69,4 @@ bash scripts/infer_solov1_fp16_performance.sh Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 --------|-----------|----------|----------|----------|------------ -Solov1 | 1 | FP16 | 24.67 | 0.541 | 0.338 +SOLOv1 | 1 | FP16 | 24.67 | 0.541 | 0.338 diff --git a/models/nlp/language_model/bert_base_squad/igie/README.md b/models/nlp/language_model/bert_base_squad/igie/README.md index cff33a64..4c4ab629 100644 --- a/models/nlp/language_model/bert_base_squad/igie/README.md +++ b/models/nlp/language_model/bert_base_squad/igie/README.md @@ -43,6 +43,6 @@ bash scripts/infer_bert_base_squad_fp16_performance.sh ## Results -Model |BatchSize |SeqLength |Precision |FPS | F1 Score ------------------|-----------|----------|----------|----------|-------- -Bertbase(Squad) | 8 | 256 | FP16 |901.81 | 88.08 +| Model | BatchSize | SeqLength | Precision | FPS | F1 Score | +| --------------- | --------- | --------- | --------- | ------ | -------- | +| BERT Base SQuAD | 8 | 256 | FP16 | 901.81 | 88.08 | diff --git a/models/nlp/language_model/bert_base_squad/ixrt/README.md b/models/nlp/language_model/bert_base_squad/ixrt/README.md index 0a7f4b33..0976e599 100644 --- a/models/nlp/language_model/bert_base_squad/ixrt/README.md +++ b/models/nlp/language_model/bert_base_squad/ixrt/README.md @@ -12,23 +12,23 @@ BERT is designed to pre-train deep bidirectional representations from unlabeled docker pull nvcr.io/nvidia/tensorrt:23.04-py3 ``` 
-### Install +## Install -#### On iluvatar +### Install on Iluvatar ```bash cmake -S . -B build cmake --build build -j16 ``` -#### On T4 +### Install on T4 ```bash cmake -S . -B build -DUSE_TENSORRT=true cmake --build build -j16 ``` -### Download +## Download ```bash cd python @@ -37,17 +37,6 @@ bash script/prepare.sh v1_1 ## Inference -### On T4 - -```bash -# FP16 -cd python -pip install onnx pycuda -# use --bs to set max_batch_size (dynamic) -bash script/build_engine.sh --bs 32 -bash script/inference_squad.sh --bs 32 -``` - ```bash # INT8 cd python @@ -55,25 +44,41 @@ pip install onnx pycuda bash script/build_engine.sh --bs 32 --int8 bash script/inference_squad.sh --bs 32 --int8 ``` -#### On iluvatar + +### On Iluvatar + +#### FP16 ```bash -# FP16 cd python/script bash infer_bert_base_squad_fp16_ixrt.sh ``` +#### INT8 + ```bash -# INT8 cd python/script bash infer_bert_base_squad_int8_ixrt.sh ``` +### On T4 + +```bash +# FP16 +cd python +pip install onnx pycuda +# use --bs to set max_batch_size (dynamic) +bash script/build_engine.sh --bs 32 +bash script/inference_squad.sh --bs 32 +``` + ## Results -Model | BatchSize | Precision | FPS | ACC -------|-----------|-----------|-----|---- -BERT-Base-SQuAD | 32 | fp16 | Latency QPS: 1543.40 sentences/s | "exact_match": 80.92, "f1": 88.20 +| Model | BatchSize | Precision | Latency QPS | exact_match | f1 | +| --------------- | --------- | --------- | ----------- | ----------- | ----- | +| BERT Base SQuAD | 32 | fp16 | 1444.69 | 80.92 | 88.20 | +| BERT Base SQuAD | 32 | fp16 | 2325.20 | 78.41 | 86.97 | + +## Referenece -## Referenece -- [bert-base-uncased.zip 外网链接](https://drive.google.com/file/d/1_DJDdKBanqJ6h3VGhH78F9EPgE2wK_Tw/view?usp=drive_link) +- [bert-base-uncased.zip](https://drive.google.com/file/d/1_DJDdKBanqJ6h3VGhH78F9EPgE2wK_Tw/view?usp=drive_link) -- Gitee From 1d6417065c44b4a69ef5800f09aca5199086ac8b Mon Sep 17 00:00:00 2001 From: majorli Date: Wed, 19 Jun 2024 11:03:14 +0800 Subject: [PATCH 11/11] add ixrt 
models to model list - part 4 Signed-off-by: majorli --- README.md | 35 +++++++++++++++++++ .../bert_base_squad/ixrt/README.md | 4 +-- 2 files changed, 37 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 190ed725..d33db306 100644 --- a/README.md +++ b/README.md @@ -696,6 +696,41 @@ DeepSparkInference将按季度进行版本更新,后续会逐步丰富模型 +### Large Language Model + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
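| Models | vLLM | TensorRT-LLM | TGI |
| ------------ | --------- | ------------ | --------- |
| Baichuan2-7B | Supported | - | - |
| ChatGLM-3-6B | Supported | - | - |
| Llama2-7B | - | Supported | - |
| Qwen-7B | - | - | Supported |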
+ ## Speech ### Speech Recognition diff --git a/models/nlp/language_model/bert_base_squad/ixrt/README.md b/models/nlp/language_model/bert_base_squad/ixrt/README.md index 0976e599..acc3592b 100644 --- a/models/nlp/language_model/bert_base_squad/ixrt/README.md +++ b/models/nlp/language_model/bert_base_squad/ixrt/README.md @@ -76,8 +76,8 @@ bash script/inference_squad.sh --bs 32 | Model | BatchSize | Precision | Latency QPS | exact_match | f1 | | --------------- | --------- | --------- | ----------- | ----------- | ----- | -| BERT Base SQuAD | 32 | fp16 | 1444.69 | 80.92 | 88.20 | -| BERT Base SQuAD | 32 | fp16 | 2325.20 | 78.41 | 86.97 | +| BERT Base SQuAD | 32 | FP16 | 1444.69 | 80.92 | 88.20 | +| BERT Base SQuAD | 32 | INT8 | 2325.20 | 78.41 | 86.97 | ## Referenece -- Gitee