diff --git a/docs/README_TEMPLATE.md b/docs/README_TEMPLATE.md index 345a38ed6af8de27372b6e6307de3cfe468f3c89..04daa0d98724f3640cb543808174148b9d20ff7e 100644 --- a/docs/README_TEMPLATE.md +++ b/docs/README_TEMPLATE.md @@ -1,18 +1,38 @@ -# MODEL_NAME +# MODEL_NAME (IGIE/IxRT/vLLM/TGI/TRT-LLM/IxFormer) -## Description +## Model Description A brief introduction about this model. +A brief introduction about this model. +A brief introduction about this model. + +## Supported Environments -## Setup +| Iluvatar GPU | IXUCA | +|--------------|-------| +| MR-V50 | 4.1.2 | +| MR-V100 | 4.2.0 | -### Install (remove this step if not necessary) +## Model Preparation -### Download (remove this step if not necessary) +### Prepare Resources + +```bash +python3 dataset/coco/download_coco.py +``` + +Go to huggingface. + +### Install Dependencies + +```bash +pip install -r requirements.txt +python3 setup.py install +``` -### Model Conversion (remove this step if not necessary) +### Model Conversion -## Inference +## Model Inference ### FP16 @@ -26,12 +46,13 @@ bash test_fp16.sh bash test_int8.sh ``` -## Results (leave empty for testing team to complete) +## Model Results -Model | BatchSize | Precision | FPS | ACC -------|-----------|-----------|-----|---- -MODEL_NAME | | | | +| Model | GPU | Precision | Performance | +|------------|------------|-----------|-------------| +| MODEL_NAME | MR-V100 x1 | | | -## Referenece (remove if not necessary) +## References - [refer-page-name](https://refer-links) +- [Paper](Paper_link) diff --git a/models/audio/speech_recognition/conformer/igie/README.md b/models/audio/speech_recognition/conformer/igie/README.md index 47bafe826dd1179629c058217bbe39af672c3cfd..4db0cbd6bf5b4828daf0c989099cef272417372d 100644 --- a/models/audio/speech_recognition/conformer/igie/README.md +++ b/models/audio/speech_recognition/conformer/igie/README.md @@ -1,15 +1,21 @@ -# Conformer +# Conformer (IGIE) -## Description +## Model Description Conformer is a speech recognition model proposed by Google in 2020. It combines the advantages of CNN and Transformer. CNN efficiently extracts local features, while Transformer is more effective in capturing long sequence dependencies. Conformer applies convolution to the Encoder layer of Transformer, enhancing the performance of Transformer in the ASR (Automatic Speech Recognition) domain. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the Aishell dataset. + +### Install Dependencies ```bash pip3 install -r requirements.txt @@ -17,12 +23,6 @@ cd ctc_decoder/swig && bash setup.sh cd ../../ ``` -### Download - -Pretrained model: - -Dataset: to download the Aishell dataset. - ### Model Conversion ```bash @@ -47,7 +47,7 @@ onnxsim encoder_bs24_seq384_static.onnx encoder_bs24_seq384_static_opt.onnx python3 alter_onnx.py --batch_size 24 --path encoder_bs24_seq384_static_opt.onnx ``` -## Inference +## Model Inference ```bash # Need to unzip aishell to the current directory. 
For details, refer to data.list @@ -63,7 +63,7 @@ bash scripts/infer_conformer_fp16_accuracy.sh bash scripts/infer_conformer_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | ACC | |-----------|-----------|-----------|----------|-------| diff --git a/models/audio/speech_recognition/conformer/ixrt/README.md b/models/audio/speech_recognition/conformer/ixrt/README.md index ed8584218b15d277bb3d24a93404961789129363..16ee007960ae8485f6f41fc8e92997c032fa1d1d 100644 --- a/models/audio/speech_recognition/conformer/ixrt/README.md +++ b/models/audio/speech_recognition/conformer/ixrt/README.md @@ -1,45 +1,40 @@ -# Conformer +# Conformer (IxRT) -## Description +## Model Description Conformer is a speech recognition model proposed by Google in 2020. It combines the advantages of CNN and Transformer. CNN efficiently extracts local features, while Transformer is more effective in capturing long sequence dependencies. Conformer applies convolution to the Encoder layer of Transformer, enhancing the performance of Transformer in the ASR (Automatic Speech Recognition) domain. -## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx - -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the Aishell dataset. -Download and put model in conformer_checkpoints. - ```bash +# Download and put model in conformer_checkpoints ln -s /home/deepspark/datasets/INFER/conformer/20210601_u2++_conformer_exp_aishell ./conformer_checkpoints -``` - -### Prepare Data -```bash -# Accuracy +# Prepare AISHELL Data DATA_DIR=/PATH/to/aishell_test_data TOOL_DIR="$(pwd)/tools" bash scripts/aishell_data_prepare.sh ${DATA_DIR} ${TOOL_DIR} ``` -## Model Conversion And Inference +### Install Dependencies + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx + +pip3 install -r requirements.txt +``` + +## Model Inference ### FP16 @@ -50,7 +45,7 @@ bash scripts/infer_conformer_fp16_accuracy.sh bash scripts/infer_conformer_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | QPS | CER | | --------- | --------- | --------- | ------- | ------ | diff --git a/models/audio/speech_recognition/transformer_asr/ixrt/README.md b/models/audio/speech_recognition/transformer_asr/ixrt/README.md index 601362a2c8d47c1d0cf6f14357918b412a4c217a..a759fdcdc7706cd0216924770698e3ead9c05cfc 100644 --- a/models/audio/speech_recognition/transformer_asr/ixrt/README.md +++ b/models/audio/speech_recognition/transformer_asr/ixrt/README.md @@ -1,20 +1,14 @@ -# Transformer ASR(BeamSearch) +# Transformer ASR (IxRT) -## Description +## Model Description Beam search allows us to exert control over the output of text generation. This is useful because we sometimes know exactly what we want inside the output. For example, in a Neural Machine Translation task, we might know which words must be included in the final translation with a dictionary lookup. 
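The Transformer ASR description above explains beam search only conceptually. As a rough illustration — not part of this repository's code — a minimal beam-search decoding loop over a hypothetical `score_next_tokens(seq)` callable (returning per-token log-probabilities for the next step) might look like the sketch below; the beam width, BOS/EOS tokens, and scoring function are assumptions for illustration only.

```python
# Minimal beam-search sketch (illustration only, not part of this repo).
# `score_next_tokens(seq)` is a hypothetical callable returning a
# {token: log_prob} dict for the next step of a partial hypothesis.

def beam_search(score_next_tokens, bos="<s>", eos="</s>", beam_width=3, max_len=20):
    # Each hypothesis is (token_list, cumulative_log_prob).
    beams = [([bos], 0.0)]
    finished = []

    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, logp in score_next_tokens(seq).items():
                candidates.append((seq + [token], score + logp))
        # Keep only the `beam_width` best partial hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break

    finished.extend(beams)
    return max(finished, key=lambda c: c[1])[0]
```

In practice the scoring callable would wrap the decoder's per-step softmax output, and constraints (e.g., dictionary-forced words) can be applied by masking or boosting the corresponding token scores before ranking.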
-## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: @@ -51,7 +45,13 @@ ln -s /PATH/to/data_aishell /home/data/speechbrain/aishell/ cp results/transformer/8886/*.csv /home/data/speechbrain/aishell/csv_data ``` -## Inference +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + +## Model Inference ### Build faster kernels @@ -78,7 +78,7 @@ python3 builder.py \ python3 inference.py hparams/train_ASR_transformer.yaml --data_folder=/home/data/speechbrain/aishell --engine_path transformer.engine ``` -## Results +## Model Results | Model | BatchSize | Precision | QPS | CER | |-----------------|-----------|-----------|-------|------| diff --git a/models/cv/classification/alexnet/igie/README.md b/models/cv/classification/alexnet/igie/README.md index 0720e0ffd3ab813085dc2753f1d85dbd24bde671..f6662bcb0ddc2f2d286aebb46deef1bbceb4e500 100644 --- a/models/cv/classification/alexnet/igie/README.md +++ b/models/cv/classification/alexnet/igie/README.md @@ -1,6 +1,6 @@ -# AlexNet +# AlexNet (IGIE) -## Description +## Model Description AlexNet, developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, is a groundbreaking convolutional neural network (CNN) architecture that achieved remarkable success in the 2012 ImageNet Large Scale Visual Recognition @@ -8,27 +8,27 @@ Challenge (ILSVRC). This neural network comprises eight layers, incorporating fi connected layers. The architecture employs the Rectified Linear Unit (ReLU) activation function to introduce non-linearity, allowing the model to learn complex features from input images. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight alexnet-owt-7be5be79.pth --output alexnet.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -52,7 +52,7 @@ bash scripts/infer_alexnet_int8_accuracy.sh bash scripts/infer_alexnet_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | |---------|-----------|-----------|----------|----------|----------| diff --git a/models/cv/classification/alexnet/ixrt/README.md b/models/cv/classification/alexnet/ixrt/README.md index a111e4060d0b219ddcbf6e87ced599e4955d27df..daae7a099fa38934c58c75c3181a740b378105e7 100644 --- a/models/cv/classification/alexnet/ixrt/README.md +++ b/models/cv/classification/alexnet/ixrt/README.md @@ -1,13 +1,19 @@ -# AlexNet +# AlexNet (IxRT) -## Description +## Model Description AlexNet is a classic convolutional neural network architecture. It consists of convolutions, max pooling and dense layers as the basic building blocks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -19,12 +25,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -32,7 +32,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/alexnet-owt-7be5be79.pth --output_model checkpoints/alexnet.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -60,7 +60,7 @@ bash scripts/infer_alexnet_int8_accuracy.sh bash scripts/infer_alexnet_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | |---------|-----------|-----------|----------|----------|----------| diff --git a/models/cv/classification/clip/igie/README.md b/models/cv/classification/clip/igie/README.md index 1f864a06984033f7c6b90f69886a9994b3503a55..0e41a1e2d9b7583aa81571e7b01f04569d450c28 100644 --- a/models/cv/classification/clip/igie/README.md +++ b/models/cv/classification/clip/igie/README.md @@ -1,18 +1,12 @@ -# CLIP +# CLIP (IGIE) -## Description +## Model Description CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task, similarly to the zero-shot capabilities of GPT-2 and 3. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: @@ -23,6 +17,12 @@ git clone https://huggingface.co/openai/clip-vit-base-patch32 clip-vit-base-patc Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -32,7 +32,7 @@ python3 export.py --output clip.onnx onnxsim clip.onnx clip_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -47,8 +47,8 @@ bash scripts/infer_clip_fp16_accuracy.sh bash scripts/infer_clip_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------|-----------|----------|----------|----------|-------- -CLIP | 32 | FP16 | 496.91 | 59.68 | 86.16 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------|-----------|-----------|--------|----------|----------| +| CLIP | 32 | FP16 | 496.91 | 59.68 | 86.16 | diff --git a/models/cv/classification/conformer_base/igie/README.md b/models/cv/classification/conformer_base/igie/README.md index d79d899db4c0631fc2cbb09ac0c1355dea8ce4fa..cf5c3e27c48edcb8adc04e8124dd82c14583a8c5 100644 --- a/models/cv/classification/conformer_base/igie/README.md +++ b/models/cv/classification/conformer_base/igie/README.md @@ -1,23 +1,23 @@ -# Conformer Base +# Conformer Base (IGIE) -## Description +## Model Description Conformer is a novel network architecture that addresses the limitations of conventional Convolutional Neural Networks (CNNs) and visual transformers. Rooted in the Feature Coupling Unit (FCU), Conformer efficiently fuses local features and global representations at different resolutions through interactive processes. Its concurrent architecture ensures the maximal retention of both local and global features. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -28,7 +28,7 @@ onnxsim conformer_base.onnx conformer_base_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -43,12 +43,12 @@ bash scripts/infer_conformer_base_fp16_accuracy.sh bash scripts/infer_conformer_base_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|----------|-------- -Conformer Base | 32 | FP16 | 428.73 | 83.83 | 96.59 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------------|-----------|-----------|--------|----------|----------| +| Conformer Base | 32 | FP16 | 428.73 | 83.83 | 96.59 | -## Reference +## References - [Conformer](https://github.com/pengzhiliang/Conformer) diff --git a/models/cv/classification/convnext_base/igie/README.md b/models/cv/classification/convnext_base/igie/README.md index 37ca8560b5d8db57395af9fef06f1fa3fb3cb856..4d4676cda66789a1754fc29961268d0460738b18 100644 --- a/models/cv/classification/convnext_base/igie/README.md +++ b/models/cv/classification/convnext_base/igie/README.md @@ -1,30 +1,30 @@ -# ConvNext Base +# ConvNext Base (IGIE) -## Description +## Model Description The ConvNeXt Base model represents a significant stride in the evolution of convolutional neural networks (CNNs), introduced by researchers at Facebook AI Research (FAIR) and UC Berkeley. It is part of the ConvNeXt family, which challenges the dominance of Vision Transformers (ViTs) in the realm of visual recognition tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight convnext_base-6075fbad.pth --output convnext_base.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_convnext_base_fp16_accuracy.sh bash scripts/infer_convnext_base_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/convnext_base/ixrt/README.md b/models/cv/classification/convnext_base/ixrt/README.md index 0e4da2e6bd8188db4833bfd88f2608b202982991..fbb660695163169f8b631a4de73c3036c75e0656 100644 --- a/models/cv/classification/convnext_base/ixrt/README.md +++ b/models/cv/classification/convnext_base/ixrt/README.md @@ -1,12 +1,18 @@ -# ConvNeXt Base +# ConvNeXt Base (IxRT) -## Description +## Model Description The ConvNeXt Base model represents a significant stride in the evolution of convolutional neural networks (CNNs), introduced by researchers at Facebook AI Research (FAIR) and UC Berkeley. It is part of the ConvNeXt family, which challenges the dominance of Vision Transformers (ViTs) in the realm of visual recognition tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. 
+ +### Install Dependencies ```bash # Install libGL @@ -15,28 +21,16 @@ yum install -y mesa-libGL ## Ubuntu apt install -y libgl1-mesa-glx -pip3 install tqdm -pip3 install onnx -pip3 install onnxsim -pip3 install tabulate -pip3 install ppq -pip3 install tqdm -pip3 install cuda-python +pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash python3 export.py --weight convnext_base-6075fbad.pth --output convnext_base.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -52,7 +46,7 @@ bash scripts/infer_convnext_base_fp16_accuracy.sh bash scripts/infer_convnext_base_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/convnext_base/ixrt/requirements.txt b/models/cv/classification/convnext_base/ixrt/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..520130b7d8ff1a3a6ef5b97c52eeb10fb870f6ed --- /dev/null +++ b/models/cv/classification/convnext_base/ixrt/requirements.txt @@ -0,0 +1,7 @@ +tqdm +onnx +onnxsim +tabulate +ppq +tqdm +cuda-python \ No newline at end of file diff --git a/models/cv/classification/convnext_s/igie/README.md b/models/cv/classification/convnext_s/igie/README.md index 9d133a66fc0ca1eef042495cfa1254d3f670a58c..3a4a0b360f474a24dfc2e834778062e75276ca8f 100644 --- a/models/cv/classification/convnext_s/igie/README.md +++ b/models/cv/classification/convnext_s/igie/README.md @@ -1,12 +1,18 @@ -# ConvNext-S (OpenMMLab) +# ConvNext-S (IGIE) -## Description +## Model Description ConvNeXt-S is a small-sized model in the ConvNeXt family, designed to balance performance and computational complexity. With 50.22M parameters and 8.69G FLOPs, it achieves 83.13% Top-1 accuracy on ImageNet-1k. Modernized from traditional ConvNets, ConvNeXt-S incorporates features such as large convolutional kernels (7x7), LayerNorm, and GELU activations, making it highly efficient and scalable. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -38,7 +38,7 @@ onnxsim convnext_s.onnx convnext_s_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +53,12 @@ bash scripts/infer_convnext_s_fp16_accuracy.sh bash scripts/infer_convnext_s_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------------ | --------- | --------- | -------- | -------- | -------- | | ConvNext-S | 32 | FP16 | 728.32 | 82.786 | 96.415 | -## Reference +## References -ConvNext-S: +- [ConvNext-S](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/convnext_small/igie/README.md b/models/cv/classification/convnext_small/igie/README.md index 9d1711a9d8a42f7fca1cb87a067cece2ae775524..fc8a0f5eca28a0efb0c0f5b5604e96bab7f81162 100644 --- a/models/cv/classification/convnext_small/igie/README.md +++ b/models/cv/classification/convnext_small/igie/README.md @@ -1,30 +1,30 @@ -# ConvNeXt Small +# ConvNeXt Small (IGIE) -## Description +## Model Description The ConvNeXt Small model represents a significant stride in the evolution of convolutional neural networks (CNNs), introduced by researchers at Facebook AI Research (FAIR) and UC Berkeley. It is part of the ConvNeXt family, which challenges the dominance of Vision Transformers (ViTs) in the realm of visual recognition tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight convnext_small-0c510722.pth --output convnext_small.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_convnext_small_fp16_accuracy.sh bash scripts/infer_convnext_small_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/convnext_small/ixrt/README.md b/models/cv/classification/convnext_small/ixrt/README.md index 70d1e0a8ba61705bc4b5af15ffd3bd6804f728b1..b47d5f12e64ded8114dbf7d54849f0a2ed8ec2a3 100644 --- a/models/cv/classification/convnext_small/ixrt/README.md +++ b/models/cv/classification/convnext_small/ixrt/README.md @@ -1,12 +1,18 @@ -# ConvNeXt Small +# ConvNeXt Small (IxRT) -## Description +## Model Description The ConvNeXt Small model represents a significant stride in the evolution of convolutional neural networks (CNNs), introduced by researchers at Facebook AI Research (FAIR) and UC Berkeley. It is part of the ConvNeXt family, which challenges the dominance of Vision Transformers (ViTs) in the realm of visual recognition tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,19 +24,13 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash python3 export.py --weight convnext_small-0c510722.pth --output convnext_small.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -46,7 +46,7 @@ bash scripts/infer_convnext_small_fp16_accuracy.sh bash scripts/infer_convnext_small_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/cspdarknet53/igie/README.md b/models/cv/classification/cspdarknet53/igie/README.md index ca15f15aa820c37d196a48199d04709ebe928647..0cdd89a26bc46f9319e03cea8340fff5a8e66963 100644 --- a/models/cv/classification/cspdarknet53/igie/README.md +++ b/models/cv/classification/cspdarknet53/igie/README.md @@ -1,12 +1,18 @@ -# CSPDarkNet53 +# CSPDarkNet53 (IGIE) -## Description +## Model Description CSPDarkNet53 is an enhanced convolutional neural network architecture that reduces redundant computations by integrating cross-stage partial network features and truncating gradient flow, thereby maintaining high accuracy while lowering computational costs. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,10 +38,9 @@ python3 export.py --cfg mmpretrain/configs/cspnet/cspdarknet50_8xb32_in1k.py --w # Use onnxsim optimize onnx model onnxsim cspdarknet53.onnx cspdarknet53_opt.onnx - ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -56,12 +55,12 @@ bash scripts/infer_cspdarknet53_fp16_accuracy.sh bash scripts/infer_cspdarknet53_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------------ | --------- | --------- | -------- | -------- | -------- | | CSPDarkNet53 | 32 | FP16 | 3214.387 | 79.063 | 94.492 | -## Reference +## References -CSPDarkNet53: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/cspdarknet53/ixrt/README.md b/models/cv/classification/cspdarknet53/ixrt/README.md index 40addd41de5c37c861b3944bf04364bd73bbcd5b..9065285828bb0b9b37c87aca070487ae094b66a8 100644 --- a/models/cv/classification/cspdarknet53/ixrt/README.md +++ b/models/cv/classification/cspdarknet53/ixrt/README.md @@ -1,12 +1,18 @@ -# CSPDarkNet53 +# CSPDarkNet53 (IxRT) -## Description +## Model Description CSPDarkNet53 is an enhanced convolutional neural network architecture that reduces redundant computations by integrating cross-stage partial network features and truncating gradient flow, thereby maintaining high accuracy while lowering computational costs. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -42,7 +42,7 @@ onnxsim cspdarknet5.onnx checkpoints/cspdarknet53_sim.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -68,13 +68,13 @@ bash scripts/infer_cspdarknet53_int8_accuracy.sh bash scripts/infer_cspdarknet53_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------------ | --------- | --------- | -------- | -------- | -------- | | CSPDarkNet53 | 32 | FP16 | 3282.318 | 79.09 | 94.52 | | CSPDarkNet53 | 32 | INT8 | 6335.86 | 75.49 | 92.66 | -## Reference +## References -CSPDarkNet53: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/cspresnet50/igie/README.md b/models/cv/classification/cspresnet50/igie/README.md index 46341ba62511a13a6dd01205b7ea7c4d2e999139..8abb227cf1eb893cace7fe16e7301e84462f4f6f 100644 --- a/models/cv/classification/cspresnet50/igie/README.md +++ b/models/cv/classification/cspresnet50/igie/README.md @@ -1,12 +1,20 @@ -# CSPResNet50 +# CSPResNet50 (IGIE) -## Description +## Model Description -CSPResNet50 combines the strengths of ResNet50 and CSPNet (Cross-Stage Partial Network) to create a more efficient and high-performing architecture. By splitting and fusing feature maps across stages, CSPResNet50 reduces redundant computations, optimizes gradient flow, and enhances feature representation. +CSPResNet50 combines the strengths of ResNet50 and CSPNet (Cross-Stage Partial Network) to create a more efficient and +high-performing architecture. By splitting and fusing feature maps across stages, CSPResNet50 reduces redundant +computations, optimizes gradient flow, and enhances feature representation. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +26,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,7 +40,7 @@ onnxsim cspresnet50.onnx cspresnet50_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +55,12 @@ bash scripts/infer_cspresnet50_fp16_accuracy.sh bash scripts/infer_cspresnet50_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------------ | --------- | --------- | -------- | -------- | -------- | | CSPResNet50 | 32 | FP16 | 4553.80 | 78.507 | 94.142 | -## Reference +## References -CSPResNet50: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/cspresnet50/ixrt/README.md b/models/cv/classification/cspresnet50/ixrt/README.md index fa054551f15b187883bfdf049c48202e37e7b638..2e14d0609c48a33409a9c37e412932f785853128 100644 --- a/models/cv/classification/cspresnet50/ixrt/README.md +++ b/models/cv/classification/cspresnet50/ixrt/README.md @@ -1,13 +1,17 @@ -# CSPResNet50 +# CSPResNet50 (IxRT) -## Description +## Model Description Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection. CSPResNet50 is the one of best models. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. 
+ +### Install Dependencies ```bash # Install libGL @@ -19,10 +23,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -35,7 +35,7 @@ python3 export_onnx.py \ --output_model ./checkpoints/cspresnet50.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -43,7 +43,6 @@ export DATASETS_DIR=/path/to/imagenet_val export CHECKPOINTS_DIR=./checkpoints export RUN_DIR=./ export CONFIG_DIR=config/CSPRESNET50_CONFIG - ``` ### FP16 @@ -64,9 +63,9 @@ bash scripts/infer_cspresnet50_int8_accuracy.sh bash scripts/infer_cspresnet50_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------|-----------|----------|---------|----------|-------- -CSPResNet50 | 32 | FP16 | 4555.95 | 78.51 | 94.17 -CSPResNet50 | 32 | INT8 | 8801.94 | 78.15 | 93.95 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------|-----------|-----------|---------|----------|----------| +| CSPResNet50 | 32 | FP16 | 4555.95 | 78.51 | 94.17 | +| CSPResNet50 | 32 | INT8 | 8801.94 | 78.15 | 93.95 | diff --git a/models/cv/classification/deit_tiny/igie/README.md b/models/cv/classification/deit_tiny/igie/README.md index 89cb1aa4720f995757568638d39366e9b4ad313e..374b665da87ab4a22cd2a1f5c2efd8e17f7244d5 100644 --- a/models/cv/classification/deit_tiny/igie/README.md +++ b/models/cv/classification/deit_tiny/igie/README.md @@ -1,12 +1,18 @@ -# DeiT-tiny +# DeiT-tiny (IGIE) -## Description +## Model Description DeiT Tiny is a lightweight vision transformer designed for data-efficient learning. It achieves rapid training and high accuracy on small datasets through innovative attention distillation methods, while maintaining the simplicity and efficiency of the model. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -35,10 +35,9 @@ python3 export.py --cfg mmpretrain/configs/deit/deit-tiny_pt-4xb256_in1k.py --we # Use onnxsim optimize onnx model onnxsim deit_tiny.onnx deit_tiny_opt.onnx - ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +52,12 @@ bash scripts/infer_deit_tiny_fp16_accuracy.sh bash scripts/infer_deit_tin_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------- | --------- | --------- | -------- | -------- | -------- | | DeiT-tiny | 32 | FP16 | 2172.771 | 74.334 | 92.175 | -## Reference +## References -Deit_tiny: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/deit_tiny/ixrt/README.md b/models/cv/classification/deit_tiny/ixrt/README.md index 42710f0ff58b11534208e7d331895f72f77bcc4f..1b5b42b2e85d5ab749cb3e64533071d14b9b8190 100644 --- a/models/cv/classification/deit_tiny/ixrt/README.md +++ b/models/cv/classification/deit_tiny/ixrt/README.md @@ -1,12 +1,18 @@ -# DeiT-tiny +# DeiT-tiny (IxRT) -## Description +## Model Description DeiT Tiny is a lightweight vision transformer designed for data-efficient learning. 
It achieves rapid training and high accuracy on small datasets through innovative attention distillation methods, while maintaining the simplicity and efficiency of the model. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -15,21 +21,9 @@ yum install -y mesa-libGL ## Ubuntu apt install -y libgl1-mesa-glx -pip3 install tqdm -pip3 install onnx -pip3 install onnxsim -pip3 install tabulate -pip3 install ppq -pip3 install tqdm -pip3 install cuda-python +pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -45,7 +39,7 @@ onnxsim deit_tiny.onnx deit_tiny_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -62,12 +56,12 @@ bash scripts/infer_deit_tiny_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------- | --------- | --------- | -------- | -------- | -------- | | DeiT-tiny | 32 | FP16 | 1446.690 | 74.34 | 92.21 | -## Reference +## References -Deit_tiny: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/deit_tiny/ixrt/requirements.txt b/models/cv/classification/deit_tiny/ixrt/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..520130b7d8ff1a3a6ef5b97c52eeb10fb870f6ed --- /dev/null +++ b/models/cv/classification/deit_tiny/ixrt/requirements.txt @@ -0,0 +1,7 @@ +tqdm +onnx +onnxsim +tabulate +ppq +tqdm +cuda-python \ No newline at end of file diff --git a/models/cv/classification/densenet121/igie/README.md b/models/cv/classification/densenet121/igie/README.md index 61deb581503b40e6f4d56691b2a89ed941c0ead3..2ca0d81e0b0f224a30dc4090191df201509a2da1 100644 --- a/models/cv/classification/densenet121/igie/README.md +++ b/models/cv/classification/densenet121/igie/README.md @@ -1,30 +1,30 @@ -# DenseNet121 +# DenseNet121 (IGIE) -## Description +## Model Description DenseNet-121 is a convolutional neural network architecture that belongs to the family of Dense Convolutional Networks.The network consists of four dense blocks, each containing a varying number of densely connected convolutional layers. Transition layers with pooling operations reduce the spatial dimensions between dense blocks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight densenet121-a639ec97.pth --output densenet121.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_densenet121_fp16_accuracy.sh bash scripts/infer_densenet121_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------|-----------|----------|---------|---------|-------- -DenseNet121 | 32 | FP16 | 2199.75 | 74.40 | 91.931 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------|-----------|-----------|---------|----------|----------| +| DenseNet121 | 32 | FP16 | 2199.75 | 74.40 | 91.931 | diff --git a/models/cv/classification/densenet121/ixrt/README.md b/models/cv/classification/densenet121/ixrt/README.md index 7e2afd493dc8bd5ec6ee9377faffa9ca3b53a9bb..d683c42214f8bffcfec0df473e9cacd2119a3410 100644 --- a/models/cv/classification/densenet121/ixrt/README.md +++ b/models/cv/classification/densenet121/ixrt/README.md @@ -1,12 +1,16 @@ -# DenseNet +# DenseNet (IxRT) -## Description +## Model Description Dense Convolutional Network (DenseNet), connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,10 +22,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -29,7 +29,7 @@ mkdir checkpoints python3 export_onnx.py --output_model checkpoints/densenet121.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/path/to/imagenet_val/ @@ -47,7 +47,7 @@ bash scripts/infer_densenet_fp16_accuracy.sh bash scripts/infer_densenet_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/densenet161/igie/README.md b/models/cv/classification/densenet161/igie/README.md index ba2b2757dd64cdd27e5c32bda227689490303c06..4b736b3f1749e1eb5f322e2bcac33ef25fa35a3d 100644 --- a/models/cv/classification/densenet161/igie/README.md +++ b/models/cv/classification/densenet161/igie/README.md @@ -1,30 +1,30 @@ -# DenseNet161 +# DenseNet161 (IGIE) -## Description +## Model Description DenseNet161 is a convolutional neural network architecture that belongs to the family of Dense Convolutional Networks (DenseNets). Introduced as an extension to the previous DenseNet models, DenseNet161 offers improved performance and deeper network capacity, making it suitable for various computer vision tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight densenet161-8d451a50.pth --output densenet161.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_densenet161_fp16_accuracy.sh bash scripts/infer_densenet161_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | ------ | -------- | -------- | diff --git a/models/cv/classification/densenet161/ixrt/README.md b/models/cv/classification/densenet161/ixrt/README.md index 58659f827d1e95fea322f779839cfb911de51b32..70dba728fc926745be8fc41769b929e652c629f4 100644 --- a/models/cv/classification/densenet161/ixrt/README.md +++ b/models/cv/classification/densenet161/ixrt/README.md @@ -1,12 +1,17 @@ -# DenseNet161 +# DenseNet161 (IxRT) -## Description +## Model Description DenseNet161 is a convolutional neural network architecture that belongs to the family of Dense Convolutional Networks (DenseNets). Introduced as an extension to the previous DenseNet models, DenseNet161 offers improved performance and deeper network capacity, making it suitable for various computer vision tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,18 +23,13 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: -Dataset: to download the validation dataset. - ### Model Conversion ```bash python3 export.py --weight densenet161-8d451a50.pth --output densenet161.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -44,7 +44,7 @@ bash scripts/infer_densenet161_fp16_accuracy.sh bash scripts/infer_densenet161_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/densenet169/igie/README.md b/models/cv/classification/densenet169/igie/README.md index 0acd325a3c21828d862bb568a87afec0a1e7c206..6560fdb9029661a255a94c32c9f9c460d93feeb6 100644 --- a/models/cv/classification/densenet169/igie/README.md +++ b/models/cv/classification/densenet169/igie/README.md @@ -1,30 +1,30 @@ -# DenseNet169 +# DenseNet169 (IGIE) -## Description +## Model Description DenseNet-169 is a variant of the Dense Convolutional Network (DenseNet) architecture, characterized by its 169 layers and a growth rate of 32. This network leverages the dense connectivity pattern, where each layer is connected to every other layer in a feed-forward fashion, resulting in a substantial increase in the number of direct connections compared to traditional convolutional networks. This connectivity pattern facilitates the reuse of features and enhances the flow of information and gradients throughout the network, which is particularly beneficial for deep architectures. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight densenet169-b2777c0a.pth --output densenet169.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_densenet169_fp16_accuracy.sh bash scripts/infer_densenet169_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/densenet169/ixrt/README.md b/models/cv/classification/densenet169/ixrt/README.md index 0e3aee4c481f79324e51fbb5aa9913a9e2f17986..c60b673a13c2ce119c1f6999d1efe03182d45640 100644 --- a/models/cv/classification/densenet169/ixrt/README.md +++ b/models/cv/classification/densenet169/ixrt/README.md @@ -1,12 +1,18 @@ -# DenseNet169 +# DenseNet169 (IxRT) -## Description +## Model Description Dense Convolutional Network (DenseNet), connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,19 +24,13 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash python3 export.py --weight densenet169-b2777c0a.pth --output densenet169.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -45,8 +45,8 @@ bash scripts/infer_densenet169_fp16_accuracy.sh bash scripts/infer_densenet169_fp16_performance.sh ``` -## Results +## Model Results -| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | -| -------- | --------- | --------- | ------- | -------- | -------- | +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------|-----------|-----------|---------|----------|----------| | DenseNet169 | 32 | FP16 | 1119.69 | 0.7558 | 0.9284 | diff --git a/models/cv/classification/densenet201/igie/README.md b/models/cv/classification/densenet201/igie/README.md index 072ff5915608a4cca556b0ae94e686d66c26e69a..a21fd44b0e2eb75b7984fdaa8e7f6db47193f5b7 100644 --- a/models/cv/classification/densenet201/igie/README.md +++ b/models/cv/classification/densenet201/igie/README.md @@ -1,30 +1,30 @@ -# DenseNet201 +# DenseNet201 (IGIE) -## Description +## Model Description DenseNet201 is a deep convolutional neural network that stands out for its unique dense connection architecture, where each layer integrates features from all previous layers, effectively reusing features and reducing the number of parameters. This design not only enhances the network's information flow and parameter efficiency but also increases the model's regularization effect, helping to prevent overfitting. DenseNet201 consists of multiple dense blocks and transition layers, capable of capturing rich feature representations while maintaining computational efficiency, making it suitable for complex image recognition tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight densenet201-c1103571.pth --output densenet201.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_densenet201_fp16_accuracy.sh bash scripts/infer_densenet201_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/densenet201/ixrt/README.md b/models/cv/classification/densenet201/ixrt/README.md index e7e2b9ae04ce38cd28fba3310f5a7638fffa4769..1fc3594f5ca30db42b69245530c1dd10d4f3cc15 100644 --- a/models/cv/classification/densenet201/ixrt/README.md +++ b/models/cv/classification/densenet201/ixrt/README.md @@ -1,12 +1,18 @@ -# DenseNet201 +# DenseNet201 (IxRT) -## Description +## Model Description DenseNet201 is a deep convolutional neural network that stands out for its unique dense connection architecture, where each layer integrates features from all previous layers, effectively reusing features and reducing the number of parameters. This design not only enhances the network's information flow and parameter efficiency but also increases the model's regularization effect, helping to prevent overfitting. DenseNet201 consists of multiple dense blocks and transition layers, capable of capturing rich feature representations while maintaining computational efficiency, making it suitable for complex image recognition tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -15,28 +21,16 @@ yum install -y mesa-libGL ## Ubuntu apt install -y libgl1-mesa-glx -pip3 install tqdm -pip3 install onnx -pip3 install onnxsim -pip3 install tabulate -pip3 install ppq -pip3 install tqdm -pip3 install cuda-python +pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash python3 export.py --weight densenet201-c1103571.pth --output densenet201.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -51,7 +45,7 @@ bash scripts/infer_densenet201_fp16_accuracy.sh bash scripts/infer_densenet201_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/densenet201/ixrt/requirements.txt b/models/cv/classification/densenet201/ixrt/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..520130b7d8ff1a3a6ef5b97c52eeb10fb870f6ed --- /dev/null +++ b/models/cv/classification/densenet201/ixrt/requirements.txt @@ -0,0 +1,7 @@ +tqdm +onnx +onnxsim +tabulate +ppq +tqdm +cuda-python \ No newline at end of file diff --git a/models/cv/classification/efficientnet_b0/igie/README.md b/models/cv/classification/efficientnet_b0/igie/README.md index 40ccbfad7fc97f067493330b46ff82145f4448f9..200016806875a74212785e7f02a6e28fc102615c 100644 --- a/models/cv/classification/efficientnet_b0/igie/README.md +++ b/models/cv/classification/efficientnet_b0/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNet B0 +# EfficientNet B0 (IGIE) -## Description +## Model Description EfficientNet-B0 is a lightweight yet highly efficient convolutional neural network architecture. It is part of the EfficientNet family, known for its superior performance in balancing model size and accuracy. Developed with a focus on resource efficiency, EfficientNet-B0 achieves remarkable results across various computer vision tasks, including image classification and feature extraction. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_b0_rwightman-7f5810bc.pth --output efficientnet_b0.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_efficientnet_b0_fp16_accuracy.sh bash scripts/infer_efficientnet_b0_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------------|-----------|----------|----------|----------|-------- -EfficientNet_B0 | 32 | FP16 | 2596.60 | 77.639 | 93.540 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------------|-----------|-----------|---------|----------|----------| +| EfficientNet_B0 | 32 | FP16 | 2596.60 | 77.639 | 93.540 | diff --git a/models/cv/classification/efficientnet_b0/ixrt/README.md b/models/cv/classification/efficientnet_b0/ixrt/README.md index e7c96a2823a069f719e30fea51e13128d8f5ffee..c7a7448b944914c315d36453c09ff12fd1820fe5 100644 --- a/models/cv/classification/efficientnet_b0/ixrt/README.md +++ b/models/cv/classification/efficientnet_b0/ixrt/README.md @@ -1,12 +1,18 @@ -# EfficientNet B0 +# EfficientNet B0 (IxRT) -## Description +## Model Description EfficientNet B0 is a convolutional neural network architecture that belongs to the EfficientNet family, which was introduced by Mingxing Tan and Quoc V. Le in their paper "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks." 
The EfficientNet family is known for achieving state-of-the-art performance on various computer vision tasks while being more computationally efficient than many existing models. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,19 +24,13 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash python3 export_onnx.py --origin_model /path/to/efficientnet_b0_rwightman-3dd342df.pth --output_model efficientnet_b0.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/path/to/imagenet_val/ @@ -54,9 +54,9 @@ bash scripts/infer_efficientnet_b0_int8_accuracy.sh bash scripts/infer_efficientnet_b0_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | -| --------------- | --------- | --------- | ------- | -------- | -------- | +|-----------------|-----------|-----------|---------|----------|----------| | EfficientNet B0 | 32 | FP16 | 2325.54 | 77.66 | 93.58 | | EfficientNet B0 | 32 | INT8 | 2666.00 | 74.27 | 91.85 | diff --git a/models/cv/classification/efficientnet_b1/igie/README.md b/models/cv/classification/efficientnet_b1/igie/README.md index 2707cbce04eae42f10d65e843462b141b8a466f3..1a36f8aa664fee616747372a84bdcdb7656f7277 100644 --- a/models/cv/classification/efficientnet_b1/igie/README.md +++ b/models/cv/classification/efficientnet_b1/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNet B1 +# EfficientNet B1 (IGIE) -## Description +## Model Description EfficientNet B1 is a convolutional neural network architecture that falls under the EfficientNet family, known for its remarkable balance between model size and performance. Introduced as part of the EfficientNet series, EfficientNet B1 offers a compact yet powerful solution for various computer vision tasks, including image classification, object detection and segmentation. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_b1-c27df63c.pth --output efficientnet_b1.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_efficientnet_b1_fp16_accuracy.sh bash scripts/infer_efficientnet_b1_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------------|-----------|----------|---------|---------|-------- -EfficientNet B1 | 32 | FP16 | 1292.31 | 78.823 | 94.494 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------------|-----------|-----------|---------|----------|----------| +| EfficientNet B1 | 32 | FP16 | 1292.31 | 78.823 | 94.494 | diff --git a/models/cv/classification/efficientnet_b1/ixrt/README.md b/models/cv/classification/efficientnet_b1/ixrt/README.md index a2076bbb06b268856365f058d86eab715e841e59..0fc5a210c6dacb8fa7ca753d5e42beefa237a26f 100644 --- a/models/cv/classification/efficientnet_b1/ixrt/README.md +++ b/models/cv/classification/efficientnet_b1/ixrt/README.md @@ -1,12 +1,16 @@ -# EfficientNet B1 +# EfficientNet B1 (IxRT) -## Description +## Model Description EfficientNet B1 is one of the variants in the EfficientNet family of neural network architectures, introduced by Mingxing Tan and Quoc V. Le in their paper "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks." EfficientNet B1 is a scaled-up version of the baseline model (B0) and is designed to achieve better performance on various computer vision tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,10 +22,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -29,7 +29,7 @@ mkdir checkpoints python3 export_onnx.py --output_model checkpoints/efficientnet-b1.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -57,9 +57,9 @@ bash scripts/infer_efficientnet_b1_int8_accuracy.sh bash scripts/infer_efficientnet_b1_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------------|-----------|----------|---------|----------|-------- -EfficientNet_B1 | 32 | FP16 | 1517.84 | 77.60 | 93.60 -EfficientNet_B1 | 32 | INT8 | 1817.88 | 75.32 | 92.46 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------------|-----------|-----------|---------|----------|----------| +| EfficientNet_B1 | 32 | FP16 | 1517.84 | 77.60 | 93.60 | +| EfficientNet_B1 | 32 | INT8 | 1817.88 | 75.32 | 92.46 | diff --git a/models/cv/classification/efficientnet_b2/igie/README.md b/models/cv/classification/efficientnet_b2/igie/README.md index 0a2b56dc5f22aaaa4b51d86934f0ca3ee54dddef..efdb3274804c862467a37d741eff84a355d064f6 100644 --- a/models/cv/classification/efficientnet_b2/igie/README.md +++ b/models/cv/classification/efficientnet_b2/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNet B2 +# EfficientNet B2 (IGIE) -## Description +## Model Description EfficientNet B2 is a member of the EfficientNet family, a series of convolutional neural network architectures that are designed to achieve excellent accuracy and efficiency. 
Introduced by researchers at Google, EfficientNets utilize the compound scaling method, which uniformly scales the depth, width, and resolution of the network to improve accuracy and efficiency. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_b2_rwightman-c35c1473.pth --output efficientnet_b2.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_efficientnet_b2_fp16_accuracy.sh bash scripts/infer_efficientnet_b2_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | -| --------------- | --------- | --------- | -------- | -------- | -------- | +|-----------------|-----------|-----------|----------|----------|----------| | EfficientNet B2 | 32 | FP16 | 1527.044 | 77.739 | 93.702 | diff --git a/models/cv/classification/efficientnet_b2/ixrt/README.md b/models/cv/classification/efficientnet_b2/ixrt/README.md index 059c80ba683405433fbe063a3c5413949f3450b2..e627737e04836ebbbc9ce9c254032a126d268d68 100644 --- a/models/cv/classification/efficientnet_b2/ixrt/README.md +++ b/models/cv/classification/efficientnet_b2/ixrt/README.md @@ -1,12 +1,18 @@ -# EfficientNet B2 +# EfficientNet B2 (IxRT) -## Description +## Model Description EfficientNet B2 is a member of the EfficientNet family, a series of convolutional neural network architectures that are designed to achieve excellent accuracy and efficiency. Introduced by researchers at Google, EfficientNets utilize the compound scaling method, which uniformly scales the depth, width, and resolution of the network to improve accuracy and efficiency. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,19 +24,13 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash python3 export.py --weight efficientnet_b2_rwightman-c35c1473.pth --output efficientnet_b2.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -45,7 +45,7 @@ bash scripts/infer_efficientnet_b2_fp16_accuracy.sh bash scripts/infer_efficientnet_b2_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/efficientnet_b3/igie/README.md b/models/cv/classification/efficientnet_b3/igie/README.md index ff3b3e31519b075e63a669a20306b9e56a9ac0c3..cd219d924d3d660c0636d8c5a6b14af0619adbe6 100644 --- a/models/cv/classification/efficientnet_b3/igie/README.md +++ b/models/cv/classification/efficientnet_b3/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNet B3 +# EfficientNet B3 (IGIE) -## Description +## Model Description EfficientNet B3 is a member of the EfficientNet family, a series of convolutional neural network architectures that are designed to achieve excellent accuracy and efficiency. 
Introduced by researchers at Google, EfficientNets utilize the compound scaling method, which uniformly scales the depth, width, and resolution of the network to improve accuracy and efficiency. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_b3_rwightman-b3899882.pth --output efficientnet_b3.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_efficientnet_b3_fp16_accuracy.sh bash scripts/infer_efficientnet_b3_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/efficientnet_b3/ixrt/README.md b/models/cv/classification/efficientnet_b3/ixrt/README.md index 1860a5e231f4c869a8ff5a8ce4e9773756fb9cc6..4851383ff494d7ed419dd9bd3356031e6661c16b 100644 --- a/models/cv/classification/efficientnet_b3/ixrt/README.md +++ b/models/cv/classification/efficientnet_b3/ixrt/README.md @@ -1,12 +1,18 @@ -# EfficientNet B3 +# EfficientNet B3 (IxRT) -## Description +## Model Description EfficientNet B3 is a member of the EfficientNet family, a series of convolutional neural network architectures that are designed to achieve excellent accuracy and efficiency. Introduced by researchers at Google, EfficientNets utilize the compound scaling method, which uniformly scales the depth, width, and resolution of the network to improve accuracy and efficiency. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -15,25 +21,16 @@ yum install -y mesa-libGL ## Ubuntu apt install -y libgl1-mesa-glx -pip3 install tqdm -pip3 install onnx -pip3 install onnxsim -pip3 install tabulate +pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
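The compound scaling mentioned in the EfficientNet B1/B2/B3 descriptions above can be made concrete with a short sketch. The coefficients (α = 1.2, β = 1.1, γ = 1.15, chosen so that α·β²·γ² ≈ 2) come from the original EfficientNet paper; mapping the B-variants to integer compound coefficients here is only an illustration, not code from this repository.

```python
# Illustrative sketch of EfficientNet compound scaling; coefficients are from the
# original paper and are not used anywhere in this repository's scripts.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases

def compound_scale(phi: int) -> dict:
    """Multipliers applied to the B0 baseline for compound coefficient phi."""
    return {
        "depth": round(ALPHA ** phi, 3),       # scales the number of layers
        "width": round(BETA ** phi, 3),        # scales the number of channels
        "resolution": round(GAMMA ** phi, 3),  # scales the input image size
    }

for phi in (1, 2, 3):  # roughly the B1, B2, B3 variants documented above
    print(f"phi={phi}: {compound_scale(phi)}")
```

The larger variants trade throughput for accuracy, which is the pattern visible in the FPS and Top-1 columns of the results tables in this section.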
- ### Model Conversion ```bash python3 export.py --weight efficientnet_b3_rwightman-b3899882.pth --output efficientnet_b3.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,7 +45,7 @@ bash scripts/infer_efficientnet_b3_fp16_accuracy.sh bash scripts/infer_efficientnet_b3_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/efficientnet_b3/ixrt/requirements.txt b/models/cv/classification/efficientnet_b3/ixrt/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..e1eda59c3910ca96c73128bab86d534dbd55bbae --- /dev/null +++ b/models/cv/classification/efficientnet_b3/ixrt/requirements.txt @@ -0,0 +1,4 @@ +tqdm +onnx +onnxsim +tabulate \ No newline at end of file diff --git a/models/cv/classification/efficientnet_b4/igie/README.md b/models/cv/classification/efficientnet_b4/igie/README.md index bced3cda89a2ce66f2563ad1a3656099773af369..8e97a99c7103480acb0c1785441fd124b1c18d2f 100644 --- a/models/cv/classification/efficientnet_b4/igie/README.md +++ b/models/cv/classification/efficientnet_b4/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNet B4 +# EfficientNet B4 (IGIE) -## Description +## Model Description EfficientNet B4 is a high-performance convolutional neural network model introduced in Google's paper "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks." It is part of the EfficientNet family, which leverages compound scaling to balance depth, width, and input resolution for better accuracy and efficiency. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_b4_rwightman-23ab8bcd.pth --output efficientnet_b4.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_efficientnet_b4_fp16_accuracy.sh bash scripts/infer_efficientnet_b4_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/efficientnet_v2/igie/README.md b/models/cv/classification/efficientnet_v2/igie/README.md index f483a9baf6cacd5a33156a992db559ee75127038..3ee88837d68f49d7ef8c3246bd6ad8f3d14ace7a 100644 --- a/models/cv/classification/efficientnet_v2/igie/README.md +++ b/models/cv/classification/efficientnet_v2/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNetV2-M +# EfficientNetV2-M (IGIE) -## Description +## Model Description EfficientNetV2 M is an optimized model in the EfficientNetV2 series, which was developed by Google researchers. It continues the legacy of the EfficientNet family, focusing on advancing the state-of-the-art in accuracy and efficiency through advanced scaling techniques and architectural innovations. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
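The `export.py` calls used throughout these READMEs turn a downloaded `.pth` checkpoint into an ONNX graph for the IGIE/IxRT build step. A minimal sketch of that kind of export is shown below, assuming the weight file referenced in the Model Conversion step that follows is a plain torchvision state dict; the repository's own `export.py` scripts may use different input shapes, opsets, or dynamic axes.

```python
# Hypothetical export sketch (not this repository's export.py): load a torchvision
# checkpoint and export it to ONNX, mirroring the "Model Conversion" steps.
import torch
import torchvision

model = torchvision.models.efficientnet_v2_m()
state = torch.load("efficientnet_v2_m-dc08266a.pth", map_location="cpu")  # file name from the README
model.load_state_dict(state)
model.eval()

# Batch size 32 matches the results tables; the spatial size is an assumption and
# should follow whatever preprocessing the inference scripts expect.
dummy = torch.randn(32, 3, 224, 224)
torch.onnx.export(model, dummy, "efficientnet_v2_m.onnx",
                  input_names=["input"], output_names=["output"], opset_version=13)
```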
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_v2_m-dc08266a.pth --output efficientnet_v2_m.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_efficientnet_v2_fp16_accuracy.sh bash scripts/infer_efficientnet_v2_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ---------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/efficientnet_v2/ixrt/README.md b/models/cv/classification/efficientnet_v2/ixrt/README.md index c83e812f84deb22aaaa5819bedef2720b44d8637..9355271c9a7ed4ac6cc8d056081f15b96dfe9d40 100755 --- a/models/cv/classification/efficientnet_v2/ixrt/README.md +++ b/models/cv/classification/efficientnet_v2/ixrt/README.md @@ -1,12 +1,20 @@ -# EfficientNetV2 +# EfficientNetV2 (IxRT) -## Description +## Model Description -EfficientNetV2 is an improved version of the EfficientNet architecture proposed by Google, aiming to enhance model performance and efficiency. Unlike the original EfficientNet, EfficientNetV2 features a simplified design and incorporates a series of enhancement strategies to further boost performance. +EfficientNetV2 is an improved version of the EfficientNet architecture proposed by Google, aiming to enhance model +performance and efficiency. Unlike the original EfficientNet, EfficientNetV2 features a simplified design and +incorporates a series of enhancement strategies to further boost performance. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +26,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -39,7 +41,7 @@ python3 -m models.export_onnx --output_model ../../checkpoints/efficientnet_v2.o cd ../../ ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/efficientnet_v2/ixrt @@ -68,9 +70,9 @@ bash scripts/infer_efficientnet_v2_int8_accuracy.sh bash scripts/infer_efficientnet_v2_int8_performance.sh ``` -## Results +## Model Results -Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) ----------------|-----------|-----------|----------|----------|-------- -EfficientnetV2 | 32 | FP16 | 1882.87 | 82.14 | 96.16 -EfficientnetV2 | 32 | INT8 | 2595.96 | 81.50 | 95.96 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------------|-----------|-----------|---------|----------|----------| +| EfficientnetV2 | 32 | FP16 | 1882.87 | 82.14 | 96.16 | +| EfficientnetV2 | 32 | INT8 | 2595.96 | 81.50 | 95.96 | diff --git a/models/cv/classification/efficientnet_v2_s/igie/README.md b/models/cv/classification/efficientnet_v2_s/igie/README.md index ea32a5550dcbe4644be70d0bcbe928c90f002fa3..00a733d53c244f89cc77fa6f03d95a460887e9f2 100644 --- a/models/cv/classification/efficientnet_v2_s/igie/README.md +++ b/models/cv/classification/efficientnet_v2_s/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNet_v2_s +# EfficientNet_v2_s (IGIE) -## Description +## Model Description EfficientNetV2 S is an optimized model in the EfficientNetV2 series, which was developed by Google researchers. 
It continues the legacy of the EfficientNet family, focusing on advancing the state-of-the-art in accuracy and efficiency through advanced scaling techniques and architectural innovations. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_v2_s-dd5fe13b.pth --output efficientnet_v2_s.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_efficientnet_v2_s_fp16_accuracy.sh bash scripts/infer_efficientnet_v2_s_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/efficientnet_v2_s/ixrt/README.md b/models/cv/classification/efficientnet_v2_s/ixrt/README.md index 3c6baa2254c0bfa97b6de0dea0ef5d5e1239c16c..92f3874ba1d2440385683388a7e606d96ec70e4a 100644 --- a/models/cv/classification/efficientnet_v2_s/ixrt/README.md +++ b/models/cv/classification/efficientnet_v2_s/ixrt/README.md @@ -1,30 +1,30 @@ -# EfficientNet_v2_s +# EfficientNet_v2_s (IxRT) -## Description +## Model Description EfficientNetV2 S is an optimized model in the EfficientNetV2 series, which was developed by Google researchers. It continues the legacy of the EfficientNet family, focusing on advancing the state-of-the-art in accuracy and efficiency through advanced scaling techniques and architectural innovations. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight efficientnet_v2_s-dd5fe13b.pth --output efficientnet_v2_s.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_efficientnet_v2_s_fp16_accuracy.sh bash scripts/infer_efficientnet_v2_s_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/efficientnetv2_rw_t/igie/README.md b/models/cv/classification/efficientnetv2_rw_t/igie/README.md index 390de644b6fdbb1995d5904799446a58b958fdd9..1f16673bb2e1b035fbabfcea2182dea98300569d 100644 --- a/models/cv/classification/efficientnetv2_rw_t/igie/README.md +++ b/models/cv/classification/efficientnetv2_rw_t/igie/README.md @@ -1,30 +1,30 @@ -# EfficientNetv2_rw_t +# EfficientNetv2_rw_t (IGIE) -## Description +## Model Description EfficientNetV2_rw_t is an enhanced version of the EfficientNet family of convolutional neural network architectures. It builds upon the success of its predecessors by introducing novel advancements aimed at further improving performance and efficiency in various computer vision tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies
+
+```bash
+pip3 install -r requirements.txt
+```
+
 ### Model Conversion
 
 ```bash
 python3 export.py --weight efficientnetv2_t_agc-3620981a.pth --output efficientnetv2_rw_t.onnx
 ```
 
-## Inference
+## Model Inference
 
 ```bash
 export DATASETS_DIR=/Path/to/imagenet_val/
@@ -39,8 +39,8 @@ bash scripts/infer_efficientnetv2_rw_t_fp16_accuracy.sh
 bash scripts/infer_efficientnetv2_rw_t_fp16_performance.sh
 ```
 
-## Results
+## Model Results
 
-Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%)
---------------------|-----------|----------|---------|---------|--------
-Efficientnetv2_rw_t | 32 | FP16 | 831.678 | 82.306 | 96.163
+| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) |
+|---------------------|-----------|-----------|---------|----------|----------|
+| Efficientnetv2_rw_t | 32 | FP16 | 831.678 | 82.306 | 96.163 |
diff --git a/models/cv/classification/efficientnetv2_rw_t/ixrt/README.md b/models/cv/classification/efficientnetv2_rw_t/ixrt/README.md
index 1e5d56d0d65c9b38a61cc9e5635d722ca76266ff..3ca2ec65dfa97c0165b356104238c7f2ad6b8a7d 100644
--- a/models/cv/classification/efficientnetv2_rw_t/ixrt/README.md
+++ b/models/cv/classification/efficientnetv2_rw_t/ixrt/README.md
@@ -1,12 +1,18 @@
-# EfficientNetv2_rw_t
+# EfficientNetv2_rw_t (IxRT)
 
-## Description
+## Model Description
 
 EfficientNetV2_rw_t is an enhanced version of the EfficientNet family of convolutional neural network architectures. It builds upon the success of its predecessors by introducing novel advancements aimed at further improving performance and efficiency in various computer vision tasks.
 
-## Setup
+## Model Preparation
 
-### Install
+### Prepare Resources
+
+Pretrained model: 
+
+Dataset: to download the validation dataset.
+
+### Install Dependencies
 
 ```bash
 # Install libGL
@@ -15,29 +21,16 @@ yum install -y mesa-libGL
 ## Ubuntu
 apt install -y libgl1-mesa-glx
 
-pip3 install tqdm
-pip3 install timm
-pip3 install onnx
-pip3 install onnxsim
-pip3 install tabulate
-pip3 install ppq
-pip3 install tqdm
-pip3 install cuda-python
+pip3 install -r requirements.txt
 ```
 
-### Download
-
-Pretrained model: 
-
-Dataset: to download the validation dataset.
- ### Model Conversion ```bash python3 export.py --weight efficientnetv2_t_agc-3620981a.pth --output efficientnetv2_rw_t.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -52,8 +45,8 @@ bash scripts/infer_efficientnetv2_rw_t_fp16_accuracy.sh bash scripts/infer_efficientnetv2_rw_t_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ---------------------|-----------|----------|---------|---------|-------- -Efficientnetv2_rw_t | 32 | FP16 | 1525.22 | 82.336 | 96.194 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|---------------------|-----------|-----------|---------|----------|----------| +| Efficientnetv2_rw_t | 32 | FP16 | 1525.22 | 82.336 | 96.194 | diff --git a/models/cv/classification/efficientnetv2_rw_t/ixrt/requirements.txt b/models/cv/classification/efficientnetv2_rw_t/ixrt/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..72371658b582e88b0aa61e0cd9e3ac469a63bc9f --- /dev/null +++ b/models/cv/classification/efficientnetv2_rw_t/ixrt/requirements.txt @@ -0,0 +1,8 @@ +tqdm +timm +onnx +onnxsim +tabulate +ppq +tqdm +cuda-python \ No newline at end of file diff --git a/models/cv/classification/googlenet/igie/README.md b/models/cv/classification/googlenet/igie/README.md index fe903822a44c04d4ed56e23a1e48356869f9a10e..4f1a030aa45fcd80a86c2130fbb37d0748b5afd5 100644 --- a/models/cv/classification/googlenet/igie/README.md +++ b/models/cv/classification/googlenet/igie/README.md @@ -1,30 +1,30 @@ -# GoogleNet +# GoogleNet (IGIE) -## Description +## Model Description Introduced in 2014, GoogleNet revolutionized image classification models by introducing the concept of inception modules. These modules utilize parallel convolutional filters of different sizes, allowing the network to capture features at various scales efficiently. With its emphasis on computational efficiency and the reduction of parameters, GoogleNet achieved competitive accuracy while maintaining a relatively low computational cost. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight googlenet-1378be20.pth --output googlenet.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_googlenet_int8_accuracy.sh bash scripts/infer_googlenet_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|---------|-------- -GoogleNet | 32 | FP16 | 6564.20 | 62.44 | 84.31 -GoogleNet | 32 | INT8 | 7910.65 | 61.06 | 83.26 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|---------|----------|----------| +| GoogleNet | 32 | FP16 | 6564.20 | 62.44 | 84.31 | +| GoogleNet | 32 | INT8 | 7910.65 | 61.06 | 83.26 | diff --git a/models/cv/classification/googlenet/ixrt/README.md b/models/cv/classification/googlenet/ixrt/README.md index 31170f0d714fa87784db194b5bce63ee37ad962a..b0d142307b34769527004ac47bb0e7930616b224 100644 --- a/models/cv/classification/googlenet/ixrt/README.md +++ b/models/cv/classification/googlenet/ixrt/README.md @@ -1,12 +1,18 @@ -# GoogLeNet +# GoogLeNet (IxRT) -## Description +## Model Description GoogLeNet is a type of convolutional neural network based on the Inception architecture. It utilises Inception modules, which allow the network to choose between multiple convolutional filter sizes in each block. An Inception network stacks these modules on top of each other, with occasional max-pooling layers with stride 2 to halve the resolution of the grid. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -31,7 +31,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/googlenet-1378be20.pth --output_model checkpoints/googlenet.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -59,9 +59,9 @@ bash scripts/infer_googlenet_int8_accuracy.sh bash scripts/infer_googlenet_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|----------|-------- -GoogLeNet | 32 | FP16 | 6470.34 | 62.456 | 84.33 -GoogLeNet | 32 | INT8 | 9358.11 | 62.106 | 84.30 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|---------|----------|----------| +| GoogLeNet | 32 | FP16 | 6470.34 | 62.456 | 84.33 | +| GoogLeNet | 32 | INT8 | 9358.11 | 62.106 | 84.30 | diff --git a/models/cv/classification/hrnet_w18/igie/README.md b/models/cv/classification/hrnet_w18/igie/README.md index 5284b949bb6dbbcbaa5ded4f7c4b1d7fed4d89aa..80eb7c9d02d567a8c63752f6f45993b564b3b006 100644 --- a/models/cv/classification/hrnet_w18/igie/README.md +++ b/models/cv/classification/hrnet_w18/igie/README.md @@ -1,12 +1,18 @@ -# HRNet-W18 +# HRNet-W18 (IGIE) -## Description +## Model Description HRNet, short for High-Resolution Network, presents a paradigm shift in handling position-sensitive vision challenges, such as human pose estimation, semantic segmentation, and object detection. 
The distinctive features of HRNet result in semantically richer and spatially more precise representations. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -35,10 +35,9 @@ python3 export.py --cfg mmpretrain/configs/hrnet/hrnet-w18_4xb32_in1k.py --weigh # Use onnxsim optimize onnx model onnxsim hrnet_w18.onnx hrnet_w18_opt.onnx - ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +52,12 @@ bash scripts/infer_hrnet_w18_fp16_accuracy.sh bash scripts/infer_hrnet_w18_fp16_performance.sh ``` -## Results +## Model Results Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- HRNet_w18 | 32 | FP16 | 954.18 | 76.74 | 93.42 -## Reference +## References -HRNet: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/hrnet_w18/ixrt/README.md b/models/cv/classification/hrnet_w18/ixrt/README.md index 93691d2bd07a79fe101c0430a98f0146be27907c..d09d4fe911b8ac6fecf53dc340ccb121186ad44e 100644 --- a/models/cv/classification/hrnet_w18/ixrt/README.md +++ b/models/cv/classification/hrnet_w18/ixrt/README.md @@ -1,12 +1,16 @@ -# HRNet-W18 +# HRNet-W18 (IxRT) -## Description +## Model Description HRNet-W18 is a powerful image classification model developed by Jingdong AI Research and released in 2020. It belongs to the HRNet (High-Resolution Network) family of models, known for their exceptional performance in various computer vision tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,10 +22,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -29,7 +29,7 @@ mkdir checkpoints python3 export_onnx.py --output_model checkpoints/hrnet-w18.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/path/to/imagenet_val/ @@ -56,9 +56,9 @@ bash scripts/infer_hrnet_w18_int8_accuracy.sh bash scripts/infer_hrnet_w18_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | -| -------- | --------- | --------- | ------- | -------- | -------- | +|----------|-----------|-----------|---------|----------|----------| | ResNet50 | 32 | FP16 | 1474.26 | 0.76764 | 0.93446 | | ResNet50 | 32 | INT8 | 1649.40 | 0.76158 | 0.93152 | diff --git a/models/cv/classification/inception_resnet_v2/ixrt/README.md b/models/cv/classification/inception_resnet_v2/ixrt/README.md index a845e9d6a4de1e12ac224e8c3e065a0640ef288e..4ab239f69dc411d40355d79bf6f82b3ddf5f178f 100755 --- a/models/cv/classification/inception_resnet_v2/ixrt/README.md +++ b/models/cv/classification/inception_resnet_v2/ixrt/README.md @@ -1,12 +1,18 @@ -# Inception-ResNet-V2 +# Inception-ResNet-V2 (IxRT) -## Description +## Model Description Inception-ResNet-V2 is a deep learning model proposed by Google in 2016, which combines the architectures of Inception and ResNet. 
This model integrates the dense connections of the Inception series with the residual connections of ResNet, aiming to enhance model performance and training efficiency. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -32,7 +32,7 @@ mkdir checkpoints python3 export_model.py --output_model /Path/to/checkpoints/inceptionresnetv2.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/inceptionresnetv2/ixrt @@ -61,7 +61,7 @@ bash scripts/infer_inceptionresnetv2_int8_accuracy.sh bash scripts/infer_inceptionresnetv2_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | |---------------------|-----------|-----------|---------|----------|----------| diff --git a/models/cv/classification/inception_v3/igie/README.md b/models/cv/classification/inception_v3/igie/README.md index fe7406167a7a4ba400a961fcab0cd3af961419d6..69dbd46b116281dc3db3f2da55543cf80dd6a025 100644 --- a/models/cv/classification/inception_v3/igie/README.md +++ b/models/cv/classification/inception_v3/igie/README.md @@ -1,30 +1,30 @@ -# Inception V3 +# Inception V3 (IGIE) -## Description +## Model Description Inception v3 is a convolutional neural network architecture designed for image recognition and classification tasks. Developed by Google, it represents an evolution of the earlier Inception models. Inception v3 is characterized by its deep architecture, featuring multiple layers with various filter sizes and efficient use of computational resources. The network employs techniques like factorized convolutions and batch normalization to enhance training stability and accelerate convergence. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
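The inference scripts in this section read the validation images from `DATASETS_DIR`. A quick sanity check such as the one below can catch an empty or mislaid directory before a long accuracy run; it assumes a torchvision `ImageFolder`-style layout (one sub-directory per class), which may not match every script in this repository.

```python
# Rough sanity check of an ImageNet-style validation directory before running the
# accuracy scripts. Layout assumption: DATASETS_DIR/<class_name>/<image files>.
import os
import sys

datasets_dir = os.environ.get("DATASETS_DIR", "/Path/to/imagenet_val/")
classes = [d for d in sorted(os.listdir(datasets_dir))
           if os.path.isdir(os.path.join(datasets_dir, d))]
n_images = sum(len(files) for _, _, files in os.walk(datasets_dir))

print(f"{len(classes)} class directories, {n_images} files under {datasets_dir}")
if len(classes) != 1000:
    sys.exit("Expected 1000 ImageNet classes; check DATASETS_DIR before running the scripts.")
```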
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight inception_v3_google-0cc3c7bd.pth --output inception_v3.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_inception_v3_int8_accuracy.sh bash scripts/infer_inception_v3_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) --------------|-----------|----------|----------|----------|-------- -Inception_v3 | 32 | FP16 | 3557.25 | 69.848 | 88.858 -Inception_v3 | 32 | INT8 | 3631.80 | 69.022 | 88.412 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|--------------|-----------|-----------|---------|----------|----------| +| Inception_v3 | 32 | FP16 | 3557.25 | 69.848 | 88.858 | +| Inception_v3 | 32 | INT8 | 3631.80 | 69.022 | 88.412 | diff --git a/models/cv/classification/inception_v3/ixrt/README.md b/models/cv/classification/inception_v3/ixrt/README.md index e0d938d3b989c7f1b937e8391003fc124dd65b5e..ba183ef3bf691bdee14e6fd4d1a4594f22579383 100755 --- a/models/cv/classification/inception_v3/ixrt/README.md +++ b/models/cv/classification/inception_v3/ixrt/README.md @@ -1,12 +1,18 @@ -# Inception V3 +# Inception V3 (IxRT) -## Description +## Model Description Inception v3 is a convolutional neural network architecture designed for image recognition and classification tasks. Developed by Google, it represents an evolution of the earlier Inception models. Inception v3 is characterized by its deep architecture, featuring multiple layers with various filter sizes and efficient use of computational resources. The network employs techniques like factorized convolutions and batch normalization to enhance training stability and accelerate convergence. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
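Whichever export path is used, it can be worth validating the resulting graph before handing it to the IGIE/IxRT build. The `onnx` package already appears in the requirements files above; the check below is a generic sketch rather than a step from these READMEs, and the file name follows the Inception V3 conversion command.

```python
# Generic post-export check: structural validation plus a dump of the graph's
# declared inputs and outputs. Not part of this repository's scripts.
import onnx

model = onnx.load("inception_v3.onnx")  # produced by the conversion step for this model
onnx.checker.check_model(model)

for tensor in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_value or d.dim_param for d in tensor.type.tensor_type.shape.dim]
    print(tensor.name, dims)
```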
- ### Model Conversion ```bash @@ -32,7 +32,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model inception_v3_google-0cc3c7bd.pth --output_model checkpoints/inception_v3.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/inception_v3/ixrt @@ -61,9 +61,9 @@ bash scripts/infer_inception_v3_int8_accuracy.sh bash scripts/infer_inception_v3_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) --------------|-----------|----------|----------|----------|-------- -Inception_v3 | 32 | FP16 | 3515.29 | 70.64 | 89.33 -Inception_v3 | 32 | INT8 | 4916.32 | 70.45 | 89.28 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|--------------|-----------|-----------|---------|----------|----------| +| Inception_v3 | 32 | FP16 | 3515.29 | 70.64 | 89.33 | +| Inception_v3 | 32 | INT8 | 4916.32 | 70.45 | 89.28 | diff --git a/models/cv/classification/mlp_mixer_base/igie/README.md b/models/cv/classification/mlp_mixer_base/igie/README.md index d82c728b9298d0c22948a2244e114b56e9221372..ecd32b94d94c66683c5c13ff1261d310205b8e4a 100644 --- a/models/cv/classification/mlp_mixer_base/igie/README.md +++ b/models/cv/classification/mlp_mixer_base/igie/README.md @@ -1,12 +1,18 @@ -# MLP-Mixer Base +# MLP-Mixer Base (IGIE) -## Description +## Model Description MLP-Mixer Base is a foundational model in the MLP-Mixer family, designed to use only MLP layers for vision tasks like image classification. Unlike CNNs and Vision Transformers, MLP-Mixer replaces both convolution and self-attention mechanisms with simple MLP layers to process spatial and channel-wise information independently. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,7 +38,7 @@ onnxsim mlp_mixer_base.onnx mlp_mixer_base_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +53,12 @@ bash scripts/infer_mlp_mixer_base_fp16_accuracy.sh bash scripts/infer_mlp_mixer_base_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------| --------- | --------- | -------- | -------- | -------- | | MLP-Mixer-Base | 32 | FP16 | 1477.15 | 72.545 | 90.035 | -## Reference +## References -MLP-Mixer-Base: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/mnasnet0_5/igie/README.md b/models/cv/classification/mnasnet0_5/igie/README.md index 86ec2dec3eb86c813681dd2f79e846cfbe3fac03..cc8f2c383b66544ee894ea46762a62c2c5140ea9 100644 --- a/models/cv/classification/mnasnet0_5/igie/README.md +++ b/models/cv/classification/mnasnet0_5/igie/README.md @@ -1,30 +1,30 @@ -# MNASNet0_5 +# MNASNet0_5 (IGIE) -## Description +## Model Description MNASNet0_5 is a neural network architecture optimized for mobile devices, designed through neural architecture search technology. It is characterized by high efficiency and excellent accuracy, offering 50% higher accuracy than MobileNetV2 while maintaining low latency and memory usage. 
MNASNet0_5 widely uses depthwise separable convolutions, supports multi-scale inputs, and demonstrates good robustness, making it suitable for real-time image recognition tasks in resource-constrained environments. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight mnasnet0.5_top1_67.823-3ffadce67e.pth --output mnasnet0_5.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_mnasnet0_5_fp16_accuracy.sh bash scripts/infer_mnasnet0_5_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/mnasnet0_75/igie/README.md b/models/cv/classification/mnasnet0_75/igie/README.md index 292ba158cd5c7e7480e734465d774f042b754ee2..9288814c317b2b2346429b8d2a90dd1babdaf343 100644 --- a/models/cv/classification/mnasnet0_75/igie/README.md +++ b/models/cv/classification/mnasnet0_75/igie/README.md @@ -1,30 +1,30 @@ -# MNASNet0_75 +# MNASNet0_75 (IGIE) -## Description +## Model Description MNASNet0_75 is a lightweight convolutional neural network designed for mobile devices, introduced in the paper "MNASNet: Multi-Objective Neural Architecture Search for Mobile." The model leverages Multi-Objective Neural Architecture Search (NAS) to achieve a balance between accuracy and efficiency by optimizing both performance and computational cost. With a width multiplier of 0.75, MNASNet0_75 reduces the number of channels compared to the standard MNASNet (width multiplier of 1.0), resulting in fewer parameters. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight mnasnet0_75-7090bc5f.pth --output mnasnet0_75.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_mnasnet0_75_fp16_accuracy.sh bash scripts/infer_mnasnet0_75_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/mobilenet_v2/igie/README.md b/models/cv/classification/mobilenet_v2/igie/README.md index 15b4d95b30a44f8ff84d9a4e9fdea9779ae5ab88..193f744fc75f2e01d707ca6414d9d0fe571109e0 100644 --- a/models/cv/classification/mobilenet_v2/igie/README.md +++ b/models/cv/classification/mobilenet_v2/igie/README.md @@ -1,30 +1,30 @@ -# MobileNetV2 +# MobileNetV2 (IGIE) -## Description +## Model Description MobileNetV2 is an improvement on V1. Its new ideas include Linear Bottleneck and Inverted Residuals, and is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. 
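The depthwise separable convolutions mentioned in the MNASNet and MobileNetV2 descriptions above factor a standard convolution into a per-channel spatial convolution followed by a 1x1 pointwise convolution. A minimal PyTorch sketch of the idea, not taken from either model's actual implementation:

```python
# Minimal sketch of a depthwise-separable convolution block, the building block
# referred to in the MNASNet / MobileNetV2 descriptions above (illustration only).
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # groups=in_ch makes this a per-channel (depthwise) spatial convolution
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # 1x1 pointwise convolution mixes information across channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU6(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 32, 56, 56)
print(DepthwiseSeparableConv(32, 64)(x).shape)  # torch.Size([1, 64, 56, 56])
```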
-## Setup
+## Model Preparation
 
-### Install
-
-```bash
-pip3 install -r requirements.txt
-```
-
-### Download
+### Prepare Resources
 
 Pretrained model: 
 
 Dataset: to download the validation dataset.
 
+### Install Dependencies
+
+```bash
+pip3 install -r requirements.txt
+```
+
 ### Model Conversion
 
 ```bash
 python3 export.py --weight mobilenet_v2-7ebf99e0.pth --output mobilenet_v2.onnx
 ```
 
-## Inference
+## Model Inference
 
 ```bash
 export DATASETS_DIR=/Path/to/imagenet_val/
@@ -48,9 +48,9 @@ bash scripts/infer_mobilenet_v2_int8_accuracy.sh
 bash scripts/infer_mobilenet_v2_int8_performance.sh
 ```
 
-## Results
+## Model Results
 
-Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%)
--------------|-----------|----------|---------|----------|--------
-MobileNetV2 | 32 | FP16 | 6910.65 | 71.96 | 90.60
-MobileNetV2 | 32 | INT8 | 8155.362 | 71.48 | 90.47
+| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) |
+|-------------|-----------|-----------|----------|----------|----------|
+| MobileNetV2 | 32 | FP16 | 6910.65 | 71.96 | 90.60 |
+| MobileNetV2 | 32 | INT8 | 8155.362 | 71.48 | 90.47 |
diff --git a/models/cv/classification/mobilenet_v2/ixrt/README.md b/models/cv/classification/mobilenet_v2/ixrt/README.md
index de892cf6a5441d18996f75e625ce5917b2175fbe..883b366e7e67d41c529b8a74b5b6739bfe28f4ac 100644
--- a/models/cv/classification/mobilenet_v2/ixrt/README.md
+++ b/models/cv/classification/mobilenet_v2/ixrt/README.md
@@ -1,23 +1,23 @@
-# MobileNetV2
+# MobileNetV2 (IxRT)
 
-## Description
+## Model Description
 
 The MobileNetV2 architecture is based on an inverted residual structure where the input and output of the residual block are thin bottleneck layers opposite to traditional residual models which use expanded representations in the input an MobileNetV2 uses lightweight depthwise convolutions to filter features in the intermediate expansion layer.
 
-## Setup
+## Model Preparation
 
-### Install
-
-```bash
-pip3 install -r requirements.txt
-```
-
-### Download
+### Prepare Resources
 
 Pretrained model: 
 
 Download the [imagenet](https://www.image-net.org/download.php) to download the validation dataset.
 
+### Install Dependencies
+
+```bash
+pip3 install -r requirements.txt
+```
+
 ### Model Conversion
 
 ```bash
@@ -25,7 +25,7 @@ mkdir checkpoints
 python3 export_onnx.py --origin_model /path/to/mobilenet_v2-b0353104 --output_model checkpoints/mobilenet_v2.onnx
 ```
 
-## Inference
+## Model Inference
 
 ```bash
 export PROJ_DIR=./
@@ -52,13 +52,12 @@ bash script/infer_mobilenet_v2_int8_accuracy.sh
 bash script/infer_mobilenet_v2_int8_performance.sh
 ```
 
-## Results
+## Model Results
 
 | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) |
 | ----------- | --------- | --------- | ------- | -------- | -------- |
 | MobileNetV2 | 32 | FP16 | 4835.19 | 0.7186 | 0.90316 |
 
-## Referenece
+## References
 
-- [MobileNetV2](https://arxiv.org/abs/1801.04381)
--
+- [Paper](https://arxiv.org/abs/1801.04381)
diff --git a/models/cv/classification/mobilenet_v3/igie/README.md b/models/cv/classification/mobilenet_v3/igie/README.md
index 501c8be80b7ea1b68feb829012b5c748f3ed3821..82ab6081ffd2ee75e11ed6a7b1ad490769870ab2 100644
--- a/models/cv/classification/mobilenet_v3/igie/README.md
+++ b/models/cv/classification/mobilenet_v3/igie/README.md
@@ -1,30 +1,30 @@
-# MobileNetV3_Small
+# MobileNetV3_Small (IGIE)
 
-## Description
+## Model Description
 
 MobileNetV3_Small is a lightweight convolutional neural network architecture designed for efficient mobile and embedded devices.
It is part of the MobileNet family, renowned for its compact size and high performance, making it ideal for applications with limited computational resources.The key focus of MobileNetV3_Small is to achieve a balance between model size, speed, and accuracy. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight mobilenet_v3_small-047dcff4.pth --output mobilenetv3_small.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_mobilenet_v3_fp16_accuracy.sh bash scripts/infer_mobilenet_v3_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------------|-----------|----------|---------|---------|-------- -MobileNetV3_Small | 32 | FP16 | 6837.86 | 67.612 | 87.404 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------------|-----------|-----------|---------|----------|----------| +| MobileNetV3_Small | 32 | FP16 | 6837.86 | 67.612 | 87.404 | diff --git a/models/cv/classification/mobilenet_v3/ixrt/README.md b/models/cv/classification/mobilenet_v3/ixrt/README.md index 7a80ec49623d2901b2ea6d1a4c7349a23ec589bd..cfbbf00c47274fe467636052d9eafef2bbb7f568 100644 --- a/models/cv/classification/mobilenet_v3/ixrt/README.md +++ b/models/cv/classification/mobilenet_v3/ixrt/README.md @@ -1,12 +1,18 @@ -# MobileNetV3 +# MobileNetV3 (IxRT) -## Description +## Model Description MobileNetV3 is a convolutional neural network that is tuned to mobile phone CPUs through a combination of hardware-aware network architecture search (NAS) complemented by the NetAdapt algorithm, and then subsequently improved through novel architecture advances. Advances include (1) complementary search techniques, (2) new efficient versions of nonlinearities practical for the mobile setting, (3) new efficient network design. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
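The Top-1(%) and Top-5(%) columns produced by the `*_accuracy.sh` scripts above are standard top-k accuracies over the validation set. A minimal NumPy sketch of the metric itself, not the repository's actual evaluator:

```python
# Top-k accuracy from raw logits; a sketch of the metric reported in the
# Top-1(%) / Top-5(%) columns, not this repository's evaluation code.
import numpy as np

def topk_accuracy(logits: np.ndarray, labels: np.ndarray, k: int = 5) -> float:
    """logits: (N, num_classes), labels: (N,). Returns accuracy in percent."""
    topk = np.argsort(-logits, axis=1)[:, :k]
    hits = (topk == labels[:, None]).any(axis=1)
    return 100.0 * hits.mean()

logits = np.random.randn(8, 1000)
labels = np.random.randint(0, 1000, size=8)
print(topk_accuracy(logits, labels, k=1), topk_accuracy(logits, labels, k=5))
```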
- ### Model Conversion ```bash @@ -31,7 +31,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/mobilenet_v3_small-047dcff4.pth --output_model checkpoints/mobilenet_v3.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -50,8 +50,8 @@ bash scripts/infer_mobilenet_v3_fp16_accuracy.sh bash scripts/infer_mobilenet_v3_fp16_performance.sh ``` -## Results +## Model Results -Model | BatchSize | Precision| FPS | Top-1(%) | Top-5(%) -------------|-----------|----------|----------|----------|-------- -MobileNetV3 | 32 | FP16 | 8464.36 | 67.62 | 87.42 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------|-----------|-----------|---------|----------|----------| +| MobileNetV3 | 32 | FP16 | 8464.36 | 67.62 | 87.42 | diff --git a/models/cv/classification/mobilenet_v3_large/igie/README.md b/models/cv/classification/mobilenet_v3_large/igie/README.md index b44fd185bd604451ce3ca03ac8793e730b10e608..fc4193b0c6c72d7fe34eb9b217d3c3f24603e0a3 100644 --- a/models/cv/classification/mobilenet_v3_large/igie/README.md +++ b/models/cv/classification/mobilenet_v3_large/igie/README.md @@ -1,30 +1,30 @@ -# MobileNetV3_Large +# MobileNetV3_Large (IGIE) -## Description +## Model Description MobileNetV3_Large builds upon the success of its predecessors by incorporating several innovative design strategies to enhance performance. It features larger model capacity and computational resources compared to MobileNetV3_Small, allowing for deeper network architectures and more complex feature representations. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight mobilenet_v3_large-8738ca79.pth --output mobilenetv3_large.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_mobilenet_v3_large_fp16_accuracy.sh bash scripts/infer_mobilenet_v3_large_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------------|-----------|----------|---------|---------|-------- -MobileNetV3_Large | 32 | FP16 | 3644.08 | 74.042 | 91.303 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------------|-----------|-----------|---------|----------|----------| +| MobileNetV3_Large | 32 | FP16 | 3644.08 | 74.042 | 91.303 | diff --git a/models/cv/classification/mvitv2_base/igie/README.md b/models/cv/classification/mvitv2_base/igie/README.md index c6a112ae03e26f2e232d37fa51f02ed404469bd1..3a3ed0933b99e11c7f82698e0bf192a582657f06 100644 --- a/models/cv/classification/mvitv2_base/igie/README.md +++ b/models/cv/classification/mvitv2_base/igie/README.md @@ -1,12 +1,18 @@ -# MViTv2-base +# MViTv2-base (IGIE) -## Description +## Model Description MViTv2_base is an efficient multi-scale vision Transformer model designed specifically for image classification tasks. By employing a multi-scale structure and hierarchical representation, it effectively captures both global and local image features while maintaining computational efficiency. The MViTv2_base has demonstrated excellent performance on multiple standard datasets and is suitable for a variety of visual recognition tasks. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,7 +38,7 @@ onnxsim mvitv2_base.onnx mvitv2_base_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +53,12 @@ bash scripts/infer_mvitv2_base_fp16_accuracy.sh bash scripts/infer_mvitv2_base_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | -------- | -------- | -------- | | MViTv2-base | 16 | FP16 | 58.76 | 84.226 | 96.848 | -## Reference +## References -MViTv2-base: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/regnet_x_16gf/igie/README.md b/models/cv/classification/regnet_x_16gf/igie/README.md index a79be028dc893f22817c2a83ef222a8eef001485..a52c9da2968e51b19e5883d21f2688bf5cc0af0b 100644 --- a/models/cv/classification/regnet_x_16gf/igie/README.md +++ b/models/cv/classification/regnet_x_16gf/igie/README.md @@ -1,31 +1,31 @@ -# RegNet_x_16gf +# RegNet_x_16gf (IGIE) -## Description +## Model Description RegNet_x_16gf is a deep convolutional neural network from the RegNet family, introduced in the paper "Designing Network Design Spaces" by Facebook AI. RegNet models emphasize simplicity, efficiency, and scalability, and they systematically explore design spaces to achieve optimal performance.The x in RegNet_x_16gf indicates it belongs to the RegNetX series, which focuses on optimizing network width and depth, while 16gf refers to its computational complexity of approximately 16 GFLOPs. The model features linear width scaling, group convolutions, and bottleneck blocks, providing high accuracy while maintaining computational efficiency. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
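The `16gf` suffix described above refers to compute cost rather than parameter count; an exact GFLOPs figure needs a profiling tool, but a quick parameter count is easy to read off the torchvision definition. A small sketch, assuming the torchvision model matches the checkpoint named in the conversion step:

```python
# Quick parameter count for the torchvision RegNetX-16GF definition; GFLOPs would
# require a FLOP-counting profiler, so only parameters are reported here.
import torchvision

model = torchvision.models.regnet_x_16gf()
n_params = sum(p.numel() for p in model.parameters())
print(f"regnet_x_16gf parameters: {n_params / 1e6:.1f}M")
```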
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight regnet_x_16gf-2007eb11.pth --output regnet_x_16gf.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -40,8 +40,8 @@ bash scripts/infer_regnet_x_16gf_fp16_accuracy.sh bash scripts/infer_regnet_x_16gf_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------------|-----------|----------|---------|---------|-------- -RegNet_x_16gf | 32 | FP16 | 970.928 | 80.028 | 94.922 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|---------------|-----------|-----------|---------|----------|----------| +| RegNet_x_16gf | 32 | FP16 | 970.928 | 80.028 | 94.922 | diff --git a/models/cv/classification/regnet_x_1_6gf/igie/README.md b/models/cv/classification/regnet_x_1_6gf/igie/README.md index faa39d56dcde6483d98858b62a6fc40f4b3edb31..4e136c85c85474b1ccb452be72a680f230810e28 100644 --- a/models/cv/classification/regnet_x_1_6gf/igie/README.md +++ b/models/cv/classification/regnet_x_1_6gf/igie/README.md @@ -1,30 +1,30 @@ -# RegNet_x_1_6gf +# RegNet_x_1_6gf (IGIE) -## Description +## Model Description RegNet is a family of models designed for image classification tasks, as described in the paper "Designing Network Design Spaces". The RegNet design space provides simple and fast networks that work well across a wide range of computational budgets.The architecture of RegNet models is based on the principle of designing network design spaces, which allows for a more systematic exploration of possible network architectures. This makes it easier to understand and modify the architecture.RegNet_x_1_6gf is a specific model within the RegNet family, designed for image classification tasks -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight regnet_x_1_6gf-a12f2b72.pth --output regnet_x_1_6gf.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_regnet_x_1_6gf_fp16_accuracy.sh bash scripts/infer_regnet_x_1_6gf_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------------|-----------|----------|---------|---------|-------- -RegNet_x_1_6gf | 32 | FP16 | 487.749 | 79.303 | 94.624 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------------|-----------|-----------|---------|----------|----------| +| RegNet_x_1_6gf | 32 | FP16 | 487.749 | 79.303 | 94.624 | diff --git a/models/cv/classification/regnet_y_1_6gf/igie/README.md b/models/cv/classification/regnet_y_1_6gf/igie/README.md index 189da9c8acb5bbdec6acaaa13189b73a9b09bb90..602a2afdd9b980a2a6f5c4058198911e70cdae1b 100644 --- a/models/cv/classification/regnet_y_1_6gf/igie/README.md +++ b/models/cv/classification/regnet_y_1_6gf/igie/README.md @@ -1,30 +1,30 @@ -# RegNet_y_1_6gf +# RegNet_y_1_6gf (IGIE) -## Description +## Model Description RegNet is a family of models designed for image classification tasks, as described in the paper "Designing Network Design Spaces". 
The RegNet design space provides simple and fast networks that work well across a wide range of computational budgets.The architecture of RegNet models is based on the principle of designing network design spaces, which allows for a more systematic exploration of possible network architectures. This makes it easier to understand and modify the architecture.RegNet_y_1_6gf is a specific model within the RegNet family, designed for image classification tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight regnet_y_1_6gf-b11a554e.pth --output regnet_y_1_6gf.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_regnet_y_1_6gf_fp16_accuracy.sh bash scripts/infer_regnet_y_1_6gf_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/repvgg/igie/README.md b/models/cv/classification/repvgg/igie/README.md index 0f02b88d30d724d35fe42ffc181d9c3428069285..80c5bf196216a353770112251d6c9805fd59a4a5 100644 --- a/models/cv/classification/repvgg/igie/README.md +++ b/models/cv/classification/repvgg/igie/README.md @@ -1,12 +1,18 @@ -# RepVGG +# RepVGG (IGIE) -## Description +## Model Description RepVGG is an innovative convolutional neural network architecture that combines the simplicity of VGG-style inference with a multi-branch topology during training. Through structural re-parameterization, RepVGG achieves high accuracy while significantly improving computational efficiency. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -35,7 +35,7 @@ python3 export.py --cfg mmpretrain/configs/repvgg/repvgg-A0_4xb64-coslr-120e_in1 ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -50,12 +50,12 @@ bash scripts/infer_repvgg_fp16_accuracy.sh bash scripts/infer_repvgg_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------ | --------- | --------- | -------- | -------- | -------- | | RepVGG | 32 | FP16 | 7423.035 | 72.345 | 90.543 | -## Reference +## References -RepVGG: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/repvgg/ixrt/README.md b/models/cv/classification/repvgg/ixrt/README.md index a1b2a36328a45115352b2ef91641278630e6d930..bb046c5d73abe1a5674e3bfaff9df84129ef0147 100644 --- a/models/cv/classification/repvgg/ixrt/README.md +++ b/models/cv/classification/repvgg/ixrt/README.md @@ -1,13 +1,17 @@ -# RepVGG +# RepVGG (IxRT) -## Description +## Model Description REPVGG is a family of convolutional neural network (CNN) architectures designed for image classification tasks. 
It was developed by researchers at the University of Oxford and introduced in their paper titled "REPVGG: Making VGG-style ConvNets Great Again" in 2021. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -19,10 +23,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -35,7 +35,7 @@ python3 export_onnx.py \ --output_model ./checkpoints/repvgg_A0.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -55,7 +55,7 @@ bash scripts/infer_repvgg_fp16_accuracy.sh bash scripts/infer_repvgg_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------ | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/res2net50/igie/README.md b/models/cv/classification/res2net50/igie/README.md index eb698705924ce0f16d85e963a9171c1989295fe8..6f4cb1a373888dfe0f90d699216e35ff54ce7981 100644 --- a/models/cv/classification/res2net50/igie/README.md +++ b/models/cv/classification/res2net50/igie/README.md @@ -1,12 +1,18 @@ -# Res2Net50 +# Res2Net50 (IGIE) -## Description +## Model Description Res2Net50 is a convolutional neural network architecture that introduces the concept of "Residual-Residual Networks" (Res2Nets) to enhance feature representation and model expressiveness, particularly in image recognition tasks.The key innovation of Res2Net50 lies in its hierarchical feature aggregation mechanism, which enables the network to capture multi-scale features more effectively. Unlike traditional ResNet architectures, Res2Net50 incorporates multiple parallel pathways within each residual block, allowing the network to dynamically adjust the receptive field size and aggregate features across different scales. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
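The structural re-parameterization referred to in the RepVGG descriptions above folds training-time branches into a single convolution at inference time. The core algebra is the standard conv + BatchNorm fusion sketched below; the full RepVGG merge also absorbs the 1x1 and identity branches, which this sketch deliberately omits.

```python
# Conv + BatchNorm fusion, the basic step behind RepVGG-style re-parameterization.
# Illustration only; RepVGG additionally merges 1x1 and identity branches.
import torch

def fuse_conv_bn(conv: torch.nn.Conv2d, bn: torch.nn.BatchNorm2d) -> torch.nn.Conv2d:
    fused = torch.nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                            stride=conv.stride, padding=conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # per output channel
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = (bias - bn.running_mean) * scale + bn.bias
    return fused

conv = torch.nn.Conv2d(8, 16, 3, padding=1, bias=False)
bn = torch.nn.BatchNorm2d(16).eval()
x = torch.randn(1, 8, 32, 32)
with torch.no_grad():
    print(torch.allclose(fuse_conv_bn(conv, bn)(x), bn(conv(x)), atol=1e-5))  # True
```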
- ### Model Conversion ```bash @@ -38,7 +38,7 @@ onnxsim res2net50.onnx res2net50_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -53,12 +53,12 @@ bash scripts/infer_res2net50_fp16_accuracy.sh bash scripts/infer_res2net50_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|----------|-------- -Res2Net50 | 32 | FP16 | 1641.961 | 78.139 | 93.826 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|----------|----------|----------| +| Res2Net50 | 32 | FP16 | 1641.961 | 78.139 | 93.826 | -## Reference +## References -Res2Net50: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/res2net50/ixrt/README.md b/models/cv/classification/res2net50/ixrt/README.md index 6700543f5ce219c793533d37996aeda1bda56e1d..9d97a31a462c018b9ce664e7999681cb8021a252 100644 --- a/models/cv/classification/res2net50/ixrt/README.md +++ b/models/cv/classification/res2net50/ixrt/README.md @@ -1,12 +1,18 @@ -# Res2Net50 +# Res2Net50 (IxRT) -## Description +## Model Description A novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within one single residual block. The Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer. The proposed Res2Net block can be plugged into the state-of-the-art backbone CNN models, e.g., ResNet, ResNeXt, and DLA. We evaluate the Res2Net block on all these models and demonstrate consistent performance gains over baseline models on widely-used datasets, e.g., CIFAR-100 and ImageNet. Further ablation studies and experimental results on representative computer vision tasks, i.e., object detection, class activation mapping, and salient object detection, further verify the superiority of the Res2Net over the state-of-the-art baseline methods. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
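Several conversion steps in this section (HRNet-W18, MViTv2, Res2Net50, ResNeSt50) run the exported graph through `onnxsim` on the command line, as in the Res2Net50 (IGIE) step above. The same pass can be invoked from Python; a small sketch, assuming the `onnxsim` package listed in the requirements files:

```python
# Programmatic equivalent of the `onnxsim res2net50.onnx res2net50_opt.onnx`
# command used in the conversion step above (sketch; the CLI works just as well).
import onnx
from onnxsim import simplify

model = onnx.load("res2net50.onnx")
model_simplified, ok = simplify(model)
assert ok, "onnx-simplifier could not validate the simplified graph"
onnx.save(model_simplified, "res2net50_opt.onnx")
```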
- ### Model Conversion ```bash @@ -31,7 +31,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/res2net50_14w_8s-6527dddc.pth --output_model checkpoints/res2net50.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -59,9 +59,9 @@ bash scripts/infer_res2net50_int8_accuracy.sh bash scripts/infer_res2net50_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|---------|-------- -Res2Net50 | 32 | FP16 | 921.37 | 77.92 | 93.71 -Res2Net50 | 32 | INT8 | 1933.74 | 77.80 | 93.62 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|---------|----------|----------| +| Res2Net50 | 32 | FP16 | 921.37 | 77.92 | 93.71 | +| Res2Net50 | 32 | INT8 | 1933.74 | 77.80 | 93.62 | diff --git a/models/cv/classification/resnest50/igie/README.md b/models/cv/classification/resnest50/igie/README.md index 21137c42b5c10e13a6a4529018b104ad3699c206..7b9ddc20985013e0052ad45b0fa441aed685a463 100644 --- a/models/cv/classification/resnest50/igie/README.md +++ b/models/cv/classification/resnest50/igie/README.md @@ -1,12 +1,18 @@ -# ResNeSt50 +# ResNeSt50 (IGIE) -## Description +## Model Description ResNeSt50 is a deep convolutional neural network model based on the ResNeSt architecture, specifically designed to enhance performance in visual recognition tasks such as image classification, object detection, instance segmentation, and semantic segmentation. ResNeSt stands for Split-Attention Networks, a modular network architecture that leverages channel-wise attention mechanisms across different network branches to capture cross-feature interactions and learn diverse representations. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -35,7 +35,7 @@ onnxsim resnest50.onnx resnest50_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -50,12 +50,12 @@ bash scripts/infer_resnest50_fp16_accuracy.sh bash scripts/infer_resnest50_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|----------|-------- -ResNeSt50 | 32 | FP16 | 344.453 | 80.93 | 95.347 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|---------|----------|----------| +| ResNeSt50 | 32 | FP16 | 344.453 | 80.93 | 95.347 | -## Reference +## References -ResNeSt50: +- [ResNeSt](https://github.com/zhanghang1989/ResNeSt) diff --git a/models/cv/classification/resnet101/igie/README.md b/models/cv/classification/resnet101/igie/README.md index a992f6b179c134cecf725b5b9f24c7b3116fea62..43f1c559780fca413b8e7ad11c5a038ddcb43f3f 100644 --- a/models/cv/classification/resnet101/igie/README.md +++ b/models/cv/classification/resnet101/igie/README.md @@ -1,30 +1,30 @@ -# ResNet101 +# ResNet101 (IGIE) -## Description +## Model Description ResNet101 is a convolutional neural network architecture that belongs to the ResNet (Residual Network) family.With a total of 101 layers, ResNet101 comprises multiple residual blocks, each containing convolutional layers with batch normalization and rectified linear unit (ReLU) activations. These residual blocks allow the network to effectively capture complex features at different levels of abstraction, leading to superior performance on image recognition tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnet101-63fe2227.pth --output resnet101.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_resnet101_int8_accuracy.sh bash scripts/infer_resnet101_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|----------|-------- -ResNet101 | 32 | FP16 | 2507.074 | 77.331 | 93.520 -ResNet101 | 32 | INT8 | 5458.890 | 76.719 | 93.348 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|----------|----------|----------| +| ResNet101 | 32 | FP16 | 2507.074 | 77.331 | 93.520 | +| ResNet101 | 32 | INT8 | 5458.890 | 76.719 | 93.348 | diff --git a/models/cv/classification/resnet101/ixrt/README.md b/models/cv/classification/resnet101/ixrt/README.md index 3869ea55bb571a068854a896a71600d8472fd1ff..5c2787477d017e19acd41a3937bacd49c2e726ef 100644 --- a/models/cv/classification/resnet101/ixrt/README.md +++ b/models/cv/classification/resnet101/ixrt/README.md @@ -1,12 +1,16 @@ -# Resnet101 +# Resnet101 (IxRT) -## Description +## Model Description ResNet-101 is a variant of the ResNet (Residual Network) architecture, and it belongs to a family of deep neural networks introduced by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun in their 2016 paper, "Deep Residual Learning for Image Recognition." The ResNet architecture is known for its effective use of residual connections, which help in training very deep neural networks. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,10 +22,6 @@ apt install -y libgl1-mesa-glx pip3 install -r reuirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -29,7 +29,7 @@ mkdir checkpoints python3 export_onnx.py --output_model checkpoints/resnet101.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -57,9 +57,9 @@ bash scripts/infer_resnet101_int8_accuracy.sh bash scripts/infer_resnet101_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|---------|----------|-------- -Resnet101 | 32 | FP16 | 2592.04 | 77.36 | 93.56 -Resnet101 | 32 | INT8 | 5760.69 | 76.88 | 93.43 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|---------|----------|----------| +| Resnet101 | 32 | FP16 | 2592.04 | 77.36 | 93.56 | +| Resnet101 | 32 | INT8 | 5760.69 | 76.88 | 93.43 | diff --git a/models/cv/classification/resnet152/igie/README.md b/models/cv/classification/resnet152/igie/README.md index 5066e0ca260e4f811b317bbde04e877fb31c13d7..173e1f38394b3da33d328768efea702bfe5ba77c 100644 --- a/models/cv/classification/resnet152/igie/README.md +++ b/models/cv/classification/resnet152/igie/README.md @@ -1,30 +1,30 @@ -# ResNet152 +# ResNet152 (IGIE) -## Description +## Model Description ResNet152 is a convolutional neural network architecture that is part of the ResNet (Residual Network) family, Comprising 152 layers, At the core of ResNet152 is the innovative residual learning framework, which addresses the challenges associated with training very deep neural networks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnet152-394f9c45.pth --output resnet152.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_resnet152_int8_accuracy.sh bash scripts/infer_resnet152_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------|-----------|----------|----------|----------|-------- -ResNet152 | 32 | FP16 | 1768.348 | 78.285 | 94.022 -ResNet152 | 32 | INT8 | 3864.913 | 77.637 | 93.728 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------|-----------|-----------|----------|----------|----------| +| ResNet152 | 32 | FP16 | 1768.348 | 78.285 | 94.022 | +| ResNet152 | 32 | INT8 | 3864.913 | 77.637 | 93.728 | diff --git a/models/cv/classification/resnet18/igie/README.md b/models/cv/classification/resnet18/igie/README.md index 3cdca0c785987b668aacfec0e64ca644311c38cc..e6373ba2d67b545418746ad573c5b5e82a809f86 100644 --- a/models/cv/classification/resnet18/igie/README.md +++ b/models/cv/classification/resnet18/igie/README.md @@ -1,30 +1,30 @@ -# ResNet18 +# ResNet18 (IGIE) -## Description +## Model Description ResNet-18 is a relatively compact deep neural network.The ResNet-18 architecture consists of 18 layers, including convolutional, pooling, and fully connected layers. 
It incorporates residual blocks, a key innovation that utilizes shortcut connections to facilitate the flow of information through the network. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnet18-f37072fd.pth --output resnet18.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_resnet18_int8_accuracy.sh bash scripts/infer_resnet18_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -ResNet18 | 32 | FP16 | 9592.98 | 69.77 | 89.09 -ResNet18 | 32 | INT8 | 21314.55 | 69.53 | 88.97 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------|-----------|-----------|----------|----------|----------| +| ResNet18 | 32 | FP16 | 9592.98 | 69.77 | 89.09 | +| ResNet18 | 32 | INT8 | 21314.55 | 69.53 | 88.97 | diff --git a/models/cv/classification/resnet18/ixrt/README.md b/models/cv/classification/resnet18/ixrt/README.md index 84ce36f64876ba9fd66a01af5bbdd460658f545e..1a69e4746808cdb18b7f033b80e2f5000e22d86b 100644 --- a/models/cv/classification/resnet18/ixrt/README.md +++ b/models/cv/classification/resnet18/ixrt/README.md @@ -1,12 +1,18 @@ -# Resnet18 +# ResNet18 (IxRT) -## Description +## Model Description ResNet-18 is a variant of the ResNet (Residual Network) architecture, which was introduced by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun in their 2016 paper, "Deep Residual Learning for Image Recognition." The ResNet architecture was pivotal in addressing the challenges of training very deep neural networks by introducing residual blocks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -31,7 +31,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/resnet18-f37072fd.pth --output_model checkpoints/resnet18.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -59,9 +59,9 @@ bash scripts/infer_resnet18_int8_accuracy.sh bash scripts/infer_resnet18_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -Resnet18 | 32 | FP16 | 9592.98 | 69.77 | 89.09 -Resnet18 | 32 | INT8 | 21314.55 | 69.53 | 88.97 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------|-----------|-----------|----------|----------|----------| +| Resnet18 | 32 | FP16 | 9592.98 | 69.77 | 89.09 | +| Resnet18 | 32 | INT8 | 21314.55 | 69.53 | 88.97 | diff --git a/models/cv/classification/resnet34/ixrt/README.md b/models/cv/classification/resnet34/ixrt/README.md index 48df97d3216c5c17675b80007a504032622648dc..b421869aca678ed23cdaeb785941219878dd9828 100644 --- a/models/cv/classification/resnet34/ixrt/README.md +++ b/models/cv/classification/resnet34/ixrt/README.md @@ -1,12 +1,16 @@ -# ResNet34 +# ResNet34 (IxRT) -## Description +## Model Description Residual Networks, or ResNets, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,10 +22,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -29,7 +29,7 @@ mkdir checkpoints python3 export_onnx.py --output_model checkpoints/resnet34.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -57,9 +57,9 @@ bash scripts/infer_resnet34_int8_accuracy.sh bash scripts/infer_resnet34_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -ResNet34 | 32 | FP16 | 6179.47 | 73.30 | 91.42 -ResNet34 | 32 | INT8 | 11256.36 | 73.13 | 91.34 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------|-----------|-----------|----------|----------|----------| +| ResNet34 | 32 | FP16 | 6179.47 | 73.30 | 91.42 | +| ResNet34 | 32 | INT8 | 11256.36 | 73.13 | 91.34 | diff --git a/models/cv/classification/resnet50/igie/README.md b/models/cv/classification/resnet50/igie/README.md index 1d670729f59208f4046056a6c6b98b6c6786e630..57347150c39861010b782e58bcb476b1b39c0a94 100644 --- a/models/cv/classification/resnet50/igie/README.md +++ b/models/cv/classification/resnet50/igie/README.md @@ -1,30 +1,30 @@ -# ResNet50 +# ResNet50 (IGIE) -## Description +## Model Description ResNet-50 is a convolutional neural network architecture that belongs to the ResNet.The key innovation in ResNet-50 is the introduction of residual blocks, which include shortcut connections (skip connections) to enable the flow of information directly from one layer to another. These shortcut connections help mitigate the vanishing gradient problem and facilitate the training of very deep networks. 
-## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnet50-0676ba61.pth --output resnet50.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_resnet50_int8_accuracy.sh bash scripts/infer_resnet50_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------|-----------|----------|----------|----------|-------- -ResNet50 | 32 | FP16 | 4417.29 | 76.11 | 92.85 -ResNet50 | 32 | INT8 | 8628.61 | 75.72 | 92.71 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------|-----------|-----------|---------|----------|----------| +| ResNet50 | 32 | FP16 | 4417.29 | 76.11 | 92.85 | +| ResNet50 | 32 | INT8 | 8628.61 | 75.72 | 92.71 | diff --git a/models/cv/classification/resnet50/ixrt/README.md b/models/cv/classification/resnet50/ixrt/README.md index 0d5fc9fa2347e50e7eb067d76f3d6af583477770..df3a089363ba90216906deec4c9ac06483f8b023 100644 --- a/models/cv/classification/resnet50/ixrt/README.md +++ b/models/cv/classification/resnet50/ixrt/README.md @@ -1,12 +1,18 @@ -# ResNet50 +# ResNet50 (IxRT) -## Description +## Model Description Residual Networks, or ResNets, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -31,7 +31,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/resnet50-0676ba61.pth --output_model checkpoints/resnet50.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/path/to/imagenet_val/ @@ -58,9 +58,9 @@ bash scripts/infer_resnet50_int8_accuracy.sh bash scripts/infer_resnet50_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | -| -------- | --------- | --------- | ------- | -------- | -------- | +|----------|-----------|-----------|---------|----------|----------| | ResNet50 | 32 | FP16 | 4077.58 | 0.76158 | 0.92872 | | ResNet50 | 32 | INT8 | 9113.07 | 0.74516 | 0.9287 | diff --git a/models/cv/classification/resnetv1d50/igie/README.md b/models/cv/classification/resnetv1d50/igie/README.md index f25284797274271c1b5e35650b201c0548287729..e8b669da74955336c8f00760c368aeef6b52b68e 100644 --- a/models/cv/classification/resnetv1d50/igie/README.md +++ b/models/cv/classification/resnetv1d50/igie/README.md @@ -1,12 +1,18 @@ -# ResNetV1D50 +# ResNetV1D50 (IGIE) -## Description +## Model Description ResNetV1D50 is an enhanced version of ResNetV1-50 that incorporates changes like dilated convolutions and adjusted downsampling, leading to better performance in large-scale image classification tasks. Its ability to capture richer image features makes it a popular choice in deep learning models. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -35,7 +35,7 @@ python3 export.py --cfg mmpretrain/configs/resnet/resnetv1d50_b32x8_imagenet.py ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -50,12 +50,12 @@ bash scripts/infer_resnetv1d50_fp16_accuracy.sh bash scripts/infer_resnetv1d50_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------- | --------- | --------- | -------- | -------- | -------- | | ResNetV1D50 | 32 | FP16 | 4017.92 | 77.517 | 93.538 | -## Reference +## References -ResNetV1D50: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/resnetv1d50/ixrt/README.md b/models/cv/classification/resnetv1d50/ixrt/README.md index 7445f4581d62d622e08d0bb0de2072d85252fcd2..954ff13a5e9299923f43d6bcbd25cffbac5db307 100644 --- a/models/cv/classification/resnetv1d50/ixrt/README.md +++ b/models/cv/classification/resnetv1d50/ixrt/README.md @@ -1,12 +1,16 @@ -# ResNetV1D50 +# ResNetV1D50 (IxRT) -## Description +## Model Description Residual Networks, or ResNets, learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. Instead of hoping each few stacked layers directly fit a desired underlying mapping, residual nets let these layers fit a residual mapping. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,10 +22,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirments.txt ``` -### Download - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -29,7 +29,7 @@ mkdir checkpoints python3 export_onnx.py --output_model checkpoints/resnet_v1_d50.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/path/to/imagenet_val/ @@ -56,7 +56,7 @@ bash scripts/infer_resnetv1d50_int8_accuracy.sh bash scripts/infer_resnetv1d50_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------------- | --------- | --------- | ------- | -------- | -------- | diff --git a/models/cv/classification/resnext101_32x8d/igie/README.md b/models/cv/classification/resnext101_32x8d/igie/README.md index 84f7a1d35737886bce726a4c37b3b0ffe6fa4803..de622298dcbf487069057bd40abddaee84a7e92f 100644 --- a/models/cv/classification/resnext101_32x8d/igie/README.md +++ b/models/cv/classification/resnext101_32x8d/igie/README.md @@ -1,30 +1,30 @@ -# ResNext101_32x8d +# ResNext101_32x8d (IGIE) -## Description +## Model Description ResNeXt101_32x8d is a deep convolutional neural network introduced in the paper "Aggregated Residual Transformations for Deep Neural Networks." It enhances the traditional ResNet architecture by incorporating group convolutions, offering a new dimension for scaling network capacity through "cardinality" (the number of groups) rather than merely increasing depth or width.The model consists of 101 layers and uses a configuration of 32 groups, each with a width of 8 channels. 
This design improves feature extraction while maintaining computational efficiency. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnext101_32x8d-8ba56ff5.pth --output resnext101_32x8d.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_resnext101_32x8d_fp16_accuracy.sh bash scripts/infer_resnext101_32x8d_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ---------------- | --------- | --------- | ------ | -------- | -------- | diff --git a/models/cv/classification/resnext101_64x4d/igie/README.md b/models/cv/classification/resnext101_64x4d/igie/README.md index 3d8042381537390526ef733b53e60c958f43423b..5f9d62f03d18c39bedd4e4881aa9b26fc0645da6 100644 --- a/models/cv/classification/resnext101_64x4d/igie/README.md +++ b/models/cv/classification/resnext101_64x4d/igie/README.md @@ -1,30 +1,30 @@ -# ResNext101_64x4d +# ResNext101_64x4d (IGIE) -## Description +## Model Description The ResNeXt101_64x4d is a deep learning model based on the deep residual network architecture, which enhances performance and efficiency through the use of grouped convolutions. With a depth of 101 layers and 64 filter groups, it is particularly suited for complex image recognition tasks. While maintaining excellent accuracy, it can adapt to various input sizes -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnext101_64x4d-173b62eb.pth --output resnext101_64x4d.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_resnext101_64x4d_fp16_accuracy.sh bash scripts/infer_resnext101_64x4d_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ---------------- | --------- | --------- | ------ | -------- | -------- | diff --git a/models/cv/classification/resnext50_32x4d/igie/README.md b/models/cv/classification/resnext50_32x4d/igie/README.md index a90b47c769a52f5911a427c87e692108ac56ea14..bb6532b9ff6d4d5ef890e881f18fee4892cc62e6 100644 --- a/models/cv/classification/resnext50_32x4d/igie/README.md +++ b/models/cv/classification/resnext50_32x4d/igie/README.md @@ -1,30 +1,30 @@ -# ResNext50_32x4d +# ResNext50_32x4d (IGIE) -## Description +## Model Description The ResNeXt50_32x4d model is a convolutional neural network architecture designed for image classification tasks. It is an extension of the ResNet (Residual Network) architecture, incorporating the concept of cardinality to enhance model performance. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnext50_32x4d-7cdf4587.pth --output resnext50_32x4d.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_resnext50_32x4d_fp16_accuracy.sh bash scripts/infer_resnext50_32x4d_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------------|-----------|----------|---------|----------|-------- -resnext50_32x4d | 32 | FP16 | 273.20 | 77.601 | 93.656 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------------|-----------|-----------|--------|----------|----------| +| ResNext50_32x4d | 32 | FP16 | 273.20 | 77.601 | 93.656 | diff --git a/models/cv/classification/resnext50_32x4d/ixrt/README.md b/models/cv/classification/resnext50_32x4d/ixrt/README.md index 0c7ed2fe0e20e82535660fbe4e707bf3c18e1371..a9782585329c1800bb3800c62a2e2f84a8238b09 100644 --- a/models/cv/classification/resnext50_32x4d/ixrt/README.md +++ b/models/cv/classification/resnext50_32x4d/ixrt/README.md @@ -1,30 +1,30 @@ -# ResNext50_32x4d +# ResNext50_32x4d (IxRT) -## Description +## Model Description The ResNeXt50_32x4d model is a convolutional neural network architecture designed for image classification tasks. It is an extension of the ResNet (Residual Network) architecture, incorporating the concept of cardinality to enhance model performance. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight resnext50_32x4d-7cdf4587.pth --output resnext50_32x4d.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_resnext50_32x4d_fp16_accuracy.sh bash scripts/infer_resnext50_32x4d_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | --------------- | --------- | --------- | ------ | -------- | -------- | diff --git a/models/cv/classification/seresnet50/igie/README.md b/models/cv/classification/seresnet50/igie/README.md index ddd891ec083a848bcb2fd24e567df59c106175eb..b2b736fa4fa53502e63745110f360956c9526580 100644 --- a/models/cv/classification/seresnet50/igie/README.md +++ b/models/cv/classification/seresnet50/igie/README.md @@ -1,12 +1,18 @@ -# SEResNet50 +# SEResNet50 (IGIE) -## Description +## Model Description SEResNet50 is an enhanced version of the ResNet50 network integrated with Squeeze-and-Excitation (SE) blocks, which strengthens the network's feature expression capability by explicitly emphasizing useful features and suppressing irrelevant ones. This improvement enables SEResNet50 to demonstrate higher accuracy in various visual recognition tasks compared to the standard ResNet50. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -32,10 +32,9 @@ git clone -b v0.24.0 https://github.com/open-mmlab/mmpretrain.git # export onnx model python3 export.py --cfg mmpretrain/configs/seresnet/seresnet50_8xb32_in1k.py --weight se-resnet50_batch256_imagenet_20200804-ae206104.pth --output seresnet50.onnx - ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -50,12 +49,12 @@ bash scripts/infer_seresnet_fp16_accuracy.sh bash scripts/infer_seresnet_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ---------- | --------- | --------- | -------- | -------- | -------- | | SEResNet50 | 32 | FP16 | 2548.268 | 77.709 | 93.812 | -## Reference +## References -SE_ResNet50: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/shufflenet_v1/ixrt/README.md b/models/cv/classification/shufflenet_v1/ixrt/README.md index ae50050c073f565f67f7353049af5ede9b538f13..2e4b668666ffbafb53b05badda8132bd6f4d15ac 100644 --- a/models/cv/classification/shufflenet_v1/ixrt/README.md +++ b/models/cv/classification/shufflenet_v1/ixrt/README.md @@ -1,13 +1,19 @@ -# ShuffleNetV1 +# ShuffleNetV1 (IxRT) -## Description +## Model Description ShuffleNet V1 is a lightweight neural network architecture primarily used for image classification and object detection tasks. It uses techniques such as deep separable convolution and channel shuffle to reduce the number of parameters and computational complexity of the model, thereby achieving low computational resource consumption while maintaining high accuracy. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -19,12 +25,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -37,7 +37,7 @@ python3 export_onnx.py \ --output_model ./checkpoints/shufflenet_v1.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -57,8 +57,8 @@ bash scripts/infer_shufflenet_v1_fp16_accuracy.sh bash scripts/infer_shufflenet_v1_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) --------------|-----------|----------|---------|----------|-------- -ShuffleNetV1 | 32 | FP16 | 3619.89 | 66.17 | 86.54 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|--------------|-----------|-----------|---------|----------|----------| +| ShuffleNetV1 | 32 | FP16 | 3619.89 | 66.17 | 86.54 | diff --git a/models/cv/classification/shufflenetv2_x0_5/igie/README.md b/models/cv/classification/shufflenetv2_x0_5/igie/README.md index 52d580b4783658599db8fc37fcca4dd90dc56f01..def9e304be2577254916c6f019eab65edae613e5 100644 --- a/models/cv/classification/shufflenetv2_x0_5/igie/README.md +++ b/models/cv/classification/shufflenetv2_x0_5/igie/README.md @@ -1,30 +1,30 @@ -# ShuffleNetV2_x0_5 +# ShuffleNetV2_x0_5 (IGIE) -## Description +## Model Description ShuffleNetV2_x0_5 is a lightweight convolutional neural network architecture designed for efficient image classification and feature extraction, it also incorporates other design optimizations such as depthwise separable convolutions, group convolutions, and efficient building blocks to further reduce computational complexity and improve efficiency. 
-## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight shufflenetv2_x0.5-f707e7126e.pth --output shufflenetv2_x0_5.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_shufflenetv2_x0_5_fp16_accuracy.sh bash scripts/infer_shufflenetv2_x0_5_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------------------|-----------|----------|----------|----------|-------- -ShuffleNetV2_x0_5 | 32 | FP16 | 11677.55 | 60.501 | 81.702 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------------------|-----------|-----------|----------|----------|----------| +| ShuffleNetV2_x0_5 | 32 | FP16 | 11677.55 | 60.501 | 81.702 | diff --git a/models/cv/classification/shufflenetv2_x1_0/igie/README.md b/models/cv/classification/shufflenetv2_x1_0/igie/README.md index 6ffc72007c684cb5cd2228d154e331579119c0a3..152432488859cd4fcc6c9db9045d9188ccd3b2c1 100644 --- a/models/cv/classification/shufflenetv2_x1_0/igie/README.md +++ b/models/cv/classification/shufflenetv2_x1_0/igie/README.md @@ -1,30 +1,30 @@ -# ShuffleNetV2_x1_0 +# ShuffleNetV2_x1_0 (IGIE) -## Description +## Model Description ShuffleNet V2_x1_0 is an efficient convolutional neural network (CNN) architecture that emphasizes a balance between computational efficiency and accuracy, particularly suited for deployment on mobile and embedded devices. The model refines the ShuffleNet series by introducing structural innovations that enhance feature reuse and reduce redundancy, all while maintaining simplicity and performance. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight shufflenetv2_x1-5666bf0f80.pth --output shufflenetv2_x1_0.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_shufflenetv2_x1_0_fp16_accuracy.sh bash scripts/infer_shufflenetv2_x1_0_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/shufflenetv2_x1_5/igie/README.md b/models/cv/classification/shufflenetv2_x1_5/igie/README.md index f882d3b87a194d1c0e7679cca096de47a7e13e03..a1ec634d5878907191760f110bdf172b30c99c48 100644 --- a/models/cv/classification/shufflenetv2_x1_5/igie/README.md +++ b/models/cv/classification/shufflenetv2_x1_5/igie/README.md @@ -1,30 +1,30 @@ -# ShuffleNetV2_x1_5 +# ShuffleNetV2_x1_5 (IGIE) -## Description +## Model Description ShuffleNetV2_x1_5 is a lightweight convolutional neural network specifically designed for efficient image recognition tasks on resource-constrained devices. It achieves high performance and low latency through the introduction of channel shuffling and pointwise group convolutions. 
Despite its small model size, it offers high accuracy and is suitable for a variety of vision tasks in mobile devices and embedded systems. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight shufflenetv2_x1_5-3c479a10.pth --output shufflenetv2_x1_5.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_shufflenetv2_x1_5_fp16_accuracy.sh bash scripts/infer_shufflenetv2_x1_5_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/shufflenetv2_x2_0/igie/README.md b/models/cv/classification/shufflenetv2_x2_0/igie/README.md index dac24a22f8b48f30e896a0f175b7d9a1efd65bf5..009b4aba0886760344820d3232d4889987c855e9 100644 --- a/models/cv/classification/shufflenetv2_x2_0/igie/README.md +++ b/models/cv/classification/shufflenetv2_x2_0/igie/README.md @@ -1,30 +1,30 @@ -# ShuffleNetV2_x2_0 +# ShuffleNetV2_x2_0 (IGIE) -## Description +## Model Description ShuffleNetV2_x2_0 is a lightweight convolutional neural network introduced in the paper "ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design" by Megvii (Face++). It is designed to achieve high performance with low computational cost, making it ideal for mobile and embedded devices.The x2_0 in its name indicates a width multiplier of 2.0, meaning the model has twice as many channels compared to the baseline ShuffleNetV2_x1_0. It employs Channel Shuffle to enable efficient information exchange between grouped convolutions, addressing the limitations of group convolutions. The core building block, the ShuffleNetV2 block, features a split-merge design and channel shuffle mechanism, ensuring both high efficiency and accuracy. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight shufflenetv2_x2_0-8be3c8ee.pth --output shufflenetv2_x2_0.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_shufflenetv2_x2_0_fp16_accuracy.sh bash scripts/infer_shufflenetv2_x2_0_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/squeezenet_v1_0/igie/README.md b/models/cv/classification/squeezenet_v1_0/igie/README.md index a3555dd8ff176dce8b59300407d544374090972e..411abc6213c90dd4783358915bbb6a3062b26a2c 100644 --- a/models/cv/classification/squeezenet_v1_0/igie/README.md +++ b/models/cv/classification/squeezenet_v1_0/igie/README.md @@ -1,30 +1,30 @@ -# SqueezeNet1_0 +# SqueezeNet1_0 (IGIE) -## Description +## Model Description SqueezeNet1_0 is a lightweight convolutional neural network introduced in the paper "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size." 
It was designed to achieve high classification accuracy with significantly fewer parameters, making it highly efficient for resource-constrained environments.The core innovation of SqueezeNet lies in the Fire Module, which reduces parameters using 1x1 convolutions in the "Squeeze layer" and expands feature maps through a mix of 1x1 and 3x3 convolutions in the "Expand layer." Additionally, delayed downsampling improves feature representation and accuracy. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight squeezenet1_0-b66bff10.pth --output squeezenet1_0.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_squeezenet_v1_0_fp16_accuracy.sh bash scripts/infer_squeezenet_v1_0_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -----------------|-----------|----------|----------|----------|-------- -Squeezenet_v1_0 | 32 | FP16 | 7777.50 | 58.08 | 80.39 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-----------------|-----------|-----------|---------|----------|----------| +| Squeezenet_v1_0 | 32 | FP16 | 7777.50 | 58.08 | 80.39 | diff --git a/models/cv/classification/squeezenet_v1_0/ixrt/README.md b/models/cv/classification/squeezenet_v1_0/ixrt/README.md index faea4b18eea97de42f508ed03f2b361f821723ea..bab185eebe1012f17b4fee1e31d42f467492acaf 100644 --- a/models/cv/classification/squeezenet_v1_0/ixrt/README.md +++ b/models/cv/classification/squeezenet_v1_0/ixrt/README.md @@ -1,14 +1,20 @@ -# SqueezeNet 1.0 +# SqueezeNet 1.0 (IxRT) -## Description +## Model Description SqueezeNet 1.0 is a deep learning model for image classification, designed to be lightweight and efficient for deployment on resource-constrained devices. It was developed by researchers at DeepScale and released in 2016. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -20,12 +26,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -33,7 +33,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/squeezenet1_0-b66bff10.pth --output_model checkpoints/squeezenetv10.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -61,9 +61,9 @@ bash scripts/infer_squeezenet_v1_0_int8_accuracy.sh bash scripts/infer_squeezenet_v1_0_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ----------------|-----------|----------|---------|----------|-------- -SqueezeNet 1.0 | 32 | FP16 | 7740.26 | 58.07 | 80.43 -SqueezeNet 1.0 | 32 | INT8 | 8871.93 | 55.10 | 79.21 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|----------------|-----------|-----------|---------|----------|----------| +| SqueezeNet 1.0 | 32 | FP16 | 7740.26 | 58.07 | 80.43 | +| SqueezeNet 1.0 | 32 | INT8 | 8871.93 | 55.10 | 79.21 | diff --git a/models/cv/classification/squeezenet_v1_1/ixrt/README.md b/models/cv/classification/squeezenet_v1_1/ixrt/README.md index ada0742aacf510b66c53b37c55857db30a1f63a1..477c5f8b85730acbff77a153c13a751d2ed5c56c 100644 --- a/models/cv/classification/squeezenet_v1_1/ixrt/README.md +++ b/models/cv/classification/squeezenet_v1_1/ixrt/README.md @@ -1,14 +1,20 @@ -# SqueezeNet 1.1 +# SqueezeNet 1.1 (IxRT) -## Description +## Model Description SqueezeNet 1.1 is a deep learning model for image classification, designed to be lightweight and efficient for deployment on resource-constrained devices. It was developed by researchers at DeepScale and released in 2016. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -20,12 +26,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -33,7 +33,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/squeezenet1_1-b8a52dc0.pth --output_model checkpoints/squeezenet_v1_1.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -62,7 +62,7 @@ bash scripts/infer_squeezenet_v1_1_int8_accuracy.sh bash scripts/infer_squeezenet_v1_1_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | ----- | -------- | -------- | diff --git a/models/cv/classification/svt_base/igie/README.md b/models/cv/classification/svt_base/igie/README.md index 26d020d0f5d35478678de98e807cca2b42aa1888..83da0b7ce91be20135a15c89f3bdbfded3eb3606 100644 --- a/models/cv/classification/svt_base/igie/README.md +++ b/models/cv/classification/svt_base/igie/README.md @@ -1,12 +1,18 @@ -# SVT Base +# SVT Base (IGIE) -## Description +## Model Description SVT Base is a mid-sized variant of the Sparse Vision Transformer (SVT) series, designed to combine the expressive power of Vision Transformers (ViTs) with the efficiency of sparse attention mechanisms. By employing sparse attention and multi-stage feature extraction, SVT-Base reduces computational complexity while retaining global modeling capabilities. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. 
+ +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -37,7 +37,7 @@ python3 export.py --cfg mmpretrain/configs/twins/twins-svt-base_8xb128_in1k.py - onnxsim svt_base.onnx svt_base_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -52,12 +52,12 @@ bash scripts/infer_svt_base_fp16_accuracy.sh bash scripts/infer_svt_base_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ----------| --------- | --------- | -------- | -------- | -------- | | SVT Base | 32 | FP16 | 673.165 | 82.865 | 96.213 | -## Reference +## References -SVT Base: +- [mmpretrain](https://github.com/open-mmlab/mmpretrain) diff --git a/models/cv/classification/swin_transformer/igie/README.md b/models/cv/classification/swin_transformer/igie/README.md index aafdbc70480d84d4be5df055124c125dac8e1024..6eaac7d977914c7ffaf71811608a2b4c41618124 100644 --- a/models/cv/classification/swin_transformer/igie/README.md +++ b/models/cv/classification/swin_transformer/igie/README.md @@ -1,18 +1,12 @@ -# Swin Transformer +# Swin Transformer (IGIE) -## Description +## Model Description Swin Transformer is a pioneering neural network architecture that introduces a novel approach to handling local and global information in computer vision tasks. Departing from traditional self-attention mechanisms, Swin Transformer adopts a hierarchical design, organizing its attention windows in a shifted manner. This innovation enables more efficient modeling of contextual information across different scales, enhancing the model's capability to capture intricate patterns. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: @@ -23,6 +17,12 @@ git clone https://huggingface.co/microsoft/swin-tiny-patch4-window7-224 swin-tin Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -32,7 +32,7 @@ python3 export.py --output swin_transformer.onnx onnxsim swin_transformer.onnx swin_transformer_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -47,8 +47,8 @@ bash scripts/infer_swin_transformer_fp16_accuracy.sh bash scripts/infer_swin_transformer_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ------------------|-----------|----------|----------|----------|-------- -Swin Transformer | 32 | FP16 |1104.52 | 80.578 | 95.2 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|------------------|-----------|-----------|---------|----------|----------| +| Swin Transformer | 32 | FP16 | 1104.52 | 80.578 | 95.2 | diff --git a/models/cv/classification/swin_transformer_large/ixrt/README.md b/models/cv/classification/swin_transformer_large/ixrt/README.md index e6cf19d4a5ca943f0a1a74608613a4f56bbd8f8b..0abed0405928fe68d1cb5b9376c32ed6790ffb4e 100644 --- a/models/cv/classification/swin_transformer_large/ixrt/README.md +++ b/models/cv/classification/swin_transformer_large/ixrt/README.md @@ -1,24 +1,12 @@ -# Swin Transformer Large +# Swin Transformer Large (IxRT) -## Description +## Model Description Swin Transformer-Large is a variant of the Swin Transformer, an architecture designed for computer vision tasks, particularly within the realms of image classification, object detection, and segmentation. The Swin Transformer-Large model represents an expanded version with more layers and parameters compared to its base configuration, aiming for improved performance and deeper processing of visual data. -## Setup +## Model Preparation -### Install - -```bash -export PROJ_ROOT=/PATH/TO/DEEPSPARKINFERENCE -export MODEL_PATH=${PROJ_ROOT}/models/cv/classification/swin_transformer_large/ixrt -cd ${MODEL_PATH} - -apt install -y libnuma-dev libgl1-mesa-glx - -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: @@ -28,7 +16,18 @@ or you can : ```bash bash ./scripts/prepare_model_and_dataset.sh +``` + +### Install Dependencies + +```bash +export PROJ_ROOT=/PATH/TO/DEEPSPARKINFERENCE +export MODEL_PATH=${PROJ_ROOT}/models/cv/classification/swin_transformer_large/ixrt +cd ${MODEL_PATH} +apt install -y libnuma-dev libgl1-mesa-glx + +pip3 install -r requirements.txt ``` ### Model Conversion @@ -42,7 +41,7 @@ python3 torch2onnx.py --model_path ./general_perf/model_zoo/popular/swin-large/s ``` -## Inference +## Model Inference ```bash git clone https://gitee.com/deep-spark/iluvatar-corex-ixrt.git --depth=1 @@ -83,7 +82,7 @@ wget -O workloads/swin-large-torch-fp32.json https://raw.githubusercontent.com/b python3 core/perf_engine.py --hardware_type ILUVATAR --task swin-large-torch-fp32 ``` -## Results +## Model Results | Model | BatchSize | Precision | QPS | Top-1 Acc | | ---------------------- | --------- | --------- | ----- | --------- | diff --git a/models/cv/classification/vgg11/igie/README.md b/models/cv/classification/vgg11/igie/README.md index 522ff3e7a108eedb95cb61074ac1b90ebb8d027c..08ad45003b054178833d729f91c3246b8d6f82ee 100644 --- a/models/cv/classification/vgg11/igie/README.md +++ b/models/cv/classification/vgg11/igie/README.md @@ -1,30 +1,30 @@ -# VGG11 +# VGG11 (IGIE) -## Description +## Model Description VGG11 is a deep convolutional neural network introduced by the Visual Geometry Group at the University of Oxford in the 
paper "Very Deep Convolutional Networks for Large-Scale Image Recognition." The model consists of 11 layers with trainable weights, including 8 convolutional layers and 3 fully connected layers. It employs small 3x3 convolutional kernels and 2x2 max-pooling layers to extract hierarchical features from input images. The ReLU activation function is used throughout the network to enhance non-linearity and mitigate the vanishing gradient problem. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight vgg11-8a719046.pth --output vgg11.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,8 +39,8 @@ bash scripts/infer_vgg11_fp16_accuracy.sh bash scripts/infer_vgg11_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ---------|-----------|----------|----------|----------|-------- -VGG11 | 32 | FP16 | 3872.86 | 69.03 | 88.6 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------|-----------|-----------|---------|----------|----------| +| VGG11 | 32 | FP16 | 3872.86 | 69.03 | 88.6 | diff --git a/models/cv/classification/vgg16/igie/README.md b/models/cv/classification/vgg16/igie/README.md index 292b03ca0e6e61f22de1e2092fe8fbdf06a73a8f..ff1c9e520e3cf28ed9b2c4f5464ab847141acc53 100644 --- a/models/cv/classification/vgg16/igie/README.md +++ b/models/cv/classification/vgg16/igie/README.md @@ -1,30 +1,30 @@ -# VGG16 +# VGG16 (IGIE) -## Description +## Model Description VGG16 is a convolutional neural network (CNN) architecture designed for image classification tasks.The architecture of VGG16 is characterized by its simplicity and uniform structure. It consists of 16 convolutional and fully connected layers, organized into five blocks, with the convolutional layers using small 3x3 filters. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight vgg16-397923af.pth --output vgg16.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_vgg16_int8_accuracy.sh bash scripts/infer_vgg16_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) ---------|-----------|----------|----------|----------|-------- -VGG16 | 32 | FP16 | 1830.53 | 71.55 | 90.37 -VGG16 | 32 | INT8 | 3528.01 | 71.53 | 90.32 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------|-----------|-----------|---------|----------|----------| +| VGG16 | 32 | FP16 | 1830.53 | 71.55 | 90.37 | +| VGG16 | 32 | INT8 | 3528.01 | 71.53 | 90.32 | diff --git a/models/cv/classification/vgg16/ixrt/README.md b/models/cv/classification/vgg16/ixrt/README.md index 7681a924aae81e772138b10546a408b08db9f31d..ae03982d5aae72f77ab8e2b8f2951f49e16002d9 100644 --- a/models/cv/classification/vgg16/ixrt/README.md +++ b/models/cv/classification/vgg16/ixrt/README.md @@ -1,13 +1,19 @@ -# VGG16 +# VGG16 (IxRT) -## Description +## Model Description VGG16 is a deep convolutional neural network model developed by the Visual Geometry Group at the University of Oxford. It finished second in the 2014 ImageNet Massive Visual Identity Challenge (ILSVRC). -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -19,12 +25,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -32,7 +32,7 @@ mkdir checkpoints python3 export_onnx.py --origin_model /path/to/vgg16-397923af.pth --output_model checkpoints/vgg16.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -60,9 +60,9 @@ bash scripts/infer_vgg16_int8_accuracy.sh bash scripts/infer_vgg16_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Top-1(%) |Top-5(%) -------|-----------|----------|---------|---------|-------- -VGG16 | 32 | FP16 | 1777.85 | 71.57 | 90.40 -VGG16 | 32 | INT8 | 4451.80 | 71.47 | 90.35 +| Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | +|-------|-----------|-----------|---------|----------|----------| +| VGG16 | 32 | FP16 | 1777.85 | 71.57 | 90.40 | +| VGG16 | 32 | INT8 | 4451.80 | 71.47 | 90.35 | diff --git a/models/cv/classification/wide_resnet101/igie/README.md b/models/cv/classification/wide_resnet101/igie/README.md index 93a5a3b8daa1d18ba43053f76ca90bdcb0b31bad..fb55e46ae0685bf36559dd7f33ad41b68a4a84fc 100644 --- a/models/cv/classification/wide_resnet101/igie/README.md +++ b/models/cv/classification/wide_resnet101/igie/README.md @@ -1,30 +1,30 @@ -# Wide ResNet101 +# Wide ResNet101 (IGIE) -## Description +## Model Description Wide ResNet101 is a variant of the ResNet architecture that focuses on increasing the network's width (number of channels per layer) rather than its depth. 
This approach, inspired by the paper "Wide Residual Networks," balances model depth and width to achieve better performance while avoiding the drawbacks of overly deep networks, such as vanishing gradients and feature redundancy.Wide ResNet101 builds upon the standard ResNet101 architecture but doubles (or quadruples) the number of channels in each residual block. This results in significantly improved feature representation, making it suitable for complex tasks like image classification, object detection, and segmentation. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight wide_resnet101_2-32ee1156.pth --output wide_resnet101.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -39,7 +39,7 @@ bash scripts/infer_wide_resnet101_fp16_accuracy.sh bash scripts/infer_wide_resnet101_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | -------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/classification/wide_resnet50/igie/README.md b/models/cv/classification/wide_resnet50/igie/README.md index 3aebe48ffc2ea9842c0a24f494af32486c91992a..14e4afe3da06e0324542bf885d48b68289579a9d 100644 --- a/models/cv/classification/wide_resnet50/igie/README.md +++ b/models/cv/classification/wide_resnet50/igie/README.md @@ -1,30 +1,30 @@ -# Wide ResNet50 +# Wide ResNet50 (IGIE) -## Description +## Model Description The distinguishing feature of Wide ResNet50 lies in its widened architecture compared to traditional ResNet models. By increasing the width of the residual blocks, Wide ResNet50 enhances the capacity of the network to capture richer and more diverse feature representations, leading to improved performance on various visual recognition tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. 
+### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --weight wide_resnet50_2-95faca4d.pth --output wide_resnet50.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -48,9 +48,9 @@ bash scripts/infer_wide_resnet50_int8_accuracy.sh bash scripts/infer_wide_resnet50_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | -| ------------- | --------- | --------- | -------- | -------- | -------- | +|---------------|-----------|-----------|----------|----------|----------| | Wide ResNet50 | 32 | FP16 | 2312.383 | 78.459 | 94.052 | | Wide ResNet50 | 32 | INT8 | 5195.654 | 77.957 | 93.798 | diff --git a/models/cv/classification/wide_resnet50/ixrt/README.md b/models/cv/classification/wide_resnet50/ixrt/README.md index 72fd5b4986a20b25edc633fe87de75560c5c5045..616ef1717463fc88875e9e434e815cfa81fddbb9 100644 --- a/models/cv/classification/wide_resnet50/ixrt/README.md +++ b/models/cv/classification/wide_resnet50/ixrt/README.md @@ -1,23 +1,23 @@ -# Wide ResNet50 +# Wide ResNet50 (IxRT) -## Description +## Model Description The distinguishing feature of Wide ResNet50 lies in its widened architecture compared to traditional ResNet models. By increasing the width of the residual blocks, Wide ResNet50 enhances the capacity of the network to capture richer and more diverse feature representations, leading to improved performance on various visual recognition tasks. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the validation dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -25,7 +25,7 @@ mkdir -p checkpoints/ python3 export.py --weight wide_resnet50_2-95faca4d.pth --output checkpoints/wide_resnet50.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/imagenet_val/ @@ -52,7 +52,7 @@ bash scripts/infer_wide_resnet50_int8_accuracy.sh bash scripts/infer_wide_resnet50_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Top-1(%) | Top-5(%) | | ------------- | --------- | --------- | -------- | -------- | -------- | diff --git a/models/cv/face_recognition/facenet/ixrt/README.md b/models/cv/face_recognition/facenet/ixrt/README.md index 36ee33dbcfef22c1c812184c3d4fda88abfa260e..1e4079f166c7e79641405105fc69a3d335846ad4 100644 --- a/models/cv/face_recognition/facenet/ixrt/README.md +++ b/models/cv/face_recognition/facenet/ixrt/README.md @@ -1,24 +1,12 @@ -# FaceNet +# FaceNet (IxRT) -## Description +## Model Description Facenet is a facial recognition system originally proposed and developed by Google. It utilizes deep learning techniques, specifically convolutional neural networks (CNNs), to transform facial images into high-dimensional feature vectors. These feature vectors possess high discriminative power, enabling comparison and identification of different faces. The core idea of Facenet is to map faces into a multi-dimensional space of feature vectors, achieving efficient representation and recognition of faces. 
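Since recognition reduces to comparing embedding vectors, a minimal sketch of how two faces are typically compared is shown below. The embeddings and the threshold are made up for illustration; in practice the vectors come from the network and the threshold is tuned on a validation set.

```python
import numpy as np

def same_person(emb_a: np.ndarray, emb_b: np.ndarray, threshold: float = 1.1) -> bool:
    """Compare two face embeddings by Euclidean distance after L2 normalization."""
    emb_a = emb_a / np.linalg.norm(emb_a)
    emb_b = emb_b / np.linalg.norm(emb_b)
    return float(np.linalg.norm(emb_a - emb_b)) < threshold

# Toy 512-dimensional vectors standing in for real FaceNet outputs.
rng = np.random.default_rng(0)
a, b = rng.normal(size=512), rng.normal(size=512)
print(same_person(a, a.copy()), same_person(a, b))   # True False
```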
-## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx - -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: @@ -30,10 +18,21 @@ cd ${DeepSparkInference_PATH}/models/cv/face/facenet/ixrt unzip 20180408-102900.zip ``` -### Model Conversion +### Install Dependencies ```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx +pip3 install -r requirements.txt +``` + +### Model Conversion + +```bash mkdir -p checkpoints mkdir -p facenet_weights git clone https://github.com/timesler/facenet-pytorch @@ -56,7 +55,7 @@ wget https://raw.githubusercontent.com/lanrax/Project_dataset/master/facenet_dat unzip facenet_datasets.zip ``` -## Inference +## Model Inference Because there are differences in model export, it is necessary to verify the following information before executing inference: In deploy.py, "/last_bn/BatchNormalization_output_0" refers to the output name of the BatchNormalization node in the exported ONNX model, such as "1187". "/avgpool_1a/GlobalAveragePool_output_0" refers to the output name of the GlobalAveragePool node, such as "1178". Additionally, make sure to update "/last_bn/BatchNormalization_output_0" in build_engine.py to the corresponding name, such as "1187". @@ -82,7 +81,7 @@ bash scripts/infer_facenet_int8_accuracy.sh bash scripts/infer_facenet_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | AUC | ACC | | ------- | --------- | --------- | --------- | ----- | ---------------- | diff --git a/models/cv/instance_segmentation/mask_rcnn/ixrt/README.md b/models/cv/instance_segmentation/mask_rcnn/ixrt/README.md index 3fc83674ed3ace64c6bf2164d350b40b7f5ff71a..063840f1fa40a0e9628b6c3a0b5da2ba3b4635cc 100644 --- a/models/cv/instance_segmentation/mask_rcnn/ixrt/README.md +++ b/models/cv/instance_segmentation/mask_rcnn/ixrt/README.md @@ -1,15 +1,12 @@ -# Mask R-CNN +# Mask R-CNN (IxRT) -## Description +## Model Description Mask R-CNN (Mask Region-Based Convolutional Neural Network) is an extension of the Faster R-CNN model, which is itself an improvement over R-CNN and Fast R-CNN. Developed by Kaiming He et al., Mask R-CNN is designed for object instance segmentation tasks, meaning it not only detects objects within an image but also generates high-quality segmentation masks for each instance. 
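The bbox and segm numbers reported for this model follow the standard COCO protocol. A minimal sketch of how such metrics are usually computed with pycocotools, assuming the COCO2017 annotations are on disk and `predictions.json` is a hypothetical detections file in COCO results format:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("coco/annotations/instances_val2017.json")
coco_dt = coco_gt.loadRes("predictions.json")  # hypothetical results file

for iou_type in ("bbox", "segm"):              # box mAP and mask mAP
    evaluator = COCOeval(coco_gt, coco_dt, iouType=iou_type)
    evaluator.evaluate()
    evaluator.accumulate()
    evaluator.summarize()                      # prints AP@[0.5:0.95], AP@0.5, ...
```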
-## Prepare +## Model Preparation -```bash -# go to current model home path -cd ${PROJ_ROOT}/models/cv/segmentation/mask_rcnn/ixrt -``` +### Prepare Resources Prepare weights and datasets referring to below steps: @@ -29,27 +26,25 @@ Visit [COCO site](https://cocodataset.org/) and get COCO2017 datasets - images directory: coco/images/val2017/*.jpg - annotations directory: coco/annotations/instances_val2017.json -## Setup - -```bash -cd scripts/ -``` +### Install Dependencies -### Prepare on MR GPU +Prepare on MR GPU ```bash +cd scripts/ bash init.sh ``` -### Prepare on NV GPU +Prepare on NV GPU ```bash +cd scripts/ bash init_nv.sh ``` -## Inference +## Model Inference -### FP16 Performance +### FP16 ```bash cd ../ @@ -59,13 +54,13 @@ bash scripts/infer_mask_rcnn_fp16_performance.sh bash scripts/infer_mask_rcnn_fp16_accuracy.sh ``` -## Results +## Model Results -Model | BatchSize | Precision | FPS | ACC -------|-----------|-----------|-----|---- -Mask R-CNN | 1 | FP16 | 12.15 | bbox mAP@0.5 : 0.5512, segm mAP@0.5 : 0.5189 +| Model | BatchSize | Precision | FPS | ACC | +|------------|-----------|-----------|-------|------------------------------------------------| +| Mask R-CNN | 1 | FP16 | 12.15 | bbox mAP@0.5 : 0.5512, segm mAP@0.5 : 0.5189 | -## Referenece +## Refereneces - [tensorrtx](https://github.com/wang-xinyu/tensorrtx/tree/master/rcnn) - [detectron2](https://github.com/facebookresearch/detectron2) diff --git a/models/cv/instance_segmentation/solov1/ixrt/README.md b/models/cv/instance_segmentation/solov1/ixrt/README.md index 45de0d380974ad938105fe249b78291a24f33cab..1c031d0c3b2cb24ebf65297905b869485d18a820 100644 --- a/models/cv/instance_segmentation/solov1/ixrt/README.md +++ b/models/cv/instance_segmentation/solov1/ixrt/README.md @@ -1,24 +1,32 @@ -# SOLOv1 +# SOLOv1 (IxRT) -## Description +## Model Description SOLO (Segmenting Objects by Locations) is a new instance segmentation method that differs from traditional approaches by introducing the concept of “instance categories”. Based on the location and size of each instance, SOLO assigns each pixel to a corresponding instance category. This method transforms the instance segmentation problem into a single-shot classification task, simplifying the overall process. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash -yum install mesa-libGL +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Dependency - The inference of the Solov1 model requires a dependency on a well-adapted mmcv-v1.7.0 library. Please inquire with the staff to obtain the relevant libraries. -You can follow here to build: https://gitee.com/deep-spark/deepsparkhub/blob/master/toolbox/MMDetection/prepare_mmcv.sh +You can follow the script [prepare_mmcv.sh](https://gitee.com/deep-spark/deepsparkhub/blob/master/toolbox/MMDetection/prepare_mmcv.sh) to build: ```bash cd mmcv @@ -26,12 +34,6 @@ sh build_mmcv.sh sh install_mmcv.sh ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
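Before converting the model, it may help to make the "instance category" idea from the description concrete: SOLO-style heads divide the image into an S x S grid, an object is assigned to the cell containing its center, and that cell index doubles as its category. A simplified sketch (the real model also uses object scale to pick the FPN level):

```python
def solo_instance_category(cx: float, cy: float, img_w: int, img_h: int, s: int = 40) -> int:
    """Map an object center (cx, cy) in pixels to a flat S*S grid-cell index."""
    col = min(int(cx / img_w * s), s - 1)
    row = min(int(cy / img_h * s), s - 1)
    return row * s + col

# An object centered at (512, 300) in an 800x800 image lands in one grid cell;
# the mask branch then predicts a mask for exactly that cell index.
print(solo_instance_category(512, 300, 800, 800))
```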
- ### Model Conversion ```bash @@ -40,7 +42,7 @@ python3 solo_torch2onnx.py --cfg /path/to/solo/solo_r50_fpn_3x_coco.py --checkpo mv r50_solo_bs1_800x800.onnx /Path/to/checkpoints/r50_solo_bs1_800x800.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -60,8 +62,8 @@ bash scripts/infer_solov1_fp16_accuracy.sh bash scripts/infer_solov1_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 ---------|-----------|----------|----------|----------|------------ -SOLOv1 | 1 | FP16 | 24.67 | 0.541 | 0.338 +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|-------|---------|--------------| +| SOLOv1 | 1 | FP16 | 24.67 | 0.541 | 0.338 | diff --git a/models/cv/multi_object_tracking/deepsort/igie/README.md b/models/cv/multi_object_tracking/deepsort/igie/README.md index 4e143861334301bf8a93d113bfebb0387fab551c..adfc9d0b0518f73a973c71a02e8c78bedfb668c5 100644 --- a/models/cv/multi_object_tracking/deepsort/igie/README.md +++ b/models/cv/multi_object_tracking/deepsort/igie/README.md @@ -1,23 +1,23 @@ -# DeepSort +# DeepSort (IGIE) -## Description +## Model Description DeepSort integrates deep neural networks with traditional tracking methods to achieve robust and accurate tracking of objects in video streams. The algorithm leverages a combination of a deep appearance feature extractor and the Hungarian algorithm for data association. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model(ckpt.t7): Dataset: to download the market1501 dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -27,7 +27,7 @@ python3 export.py --weight ckpt.t7 --output deepsort.onnx onnxsim deepsort.onnx deepsort_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/market1501/ @@ -51,9 +51,9 @@ bash scripts/infer_deepsort_int8_accuracy.sh bash scripts/infer_deepsort_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Acc(%) | ----------|-----------|----------|----------|----------| -DeepSort | 32 | FP16 |17164.67 | 99.32 | -DeepSort | 32 | INT8 |20399.12 | 99.29 | +| Model | BatchSize | Precision | FPS | Acc(%) | +|----------|-----------|-----------|----------|--------| +| DeepSort | 32 | FP16 | 17164.67 | 99.32 | +| DeepSort | 32 | INT8 | 20399.12 | 99.29 | diff --git a/models/cv/multi_object_tracking/fastreid/igie/README.md b/models/cv/multi_object_tracking/fastreid/igie/README.md index 16aaff6d6dbea7f4d2b196f5c8d0d92554d12715..0f5446026a44abd279680f3847a396a204207f16 100644 --- a/models/cv/multi_object_tracking/fastreid/igie/README.md +++ b/models/cv/multi_object_tracking/fastreid/igie/README.md @@ -1,23 +1,23 @@ -# FastReID +# FastReID (IGIE) -## Description +## Model Description FastReID is a research platform that implements state-of-the-art re-identification algorithms. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the vehicleid dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -31,7 +31,7 @@ python3 tools/deploy/onnx_export.py --config-file configs/VehicleID/bagtricks_R5 cd .. 
``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/VehicleID @@ -46,8 +46,8 @@ bash scripts/infer_fastreid_fp16_accuracy.sh bash scripts/infer_fastreid_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Rank-1(%) |Rank-5(%) |mAP | ----------|-----------|----------|----------|----------|----------|--------| -FastReid | 32 | FP16 | 1850.78 | 88.39 | 98.45 | 92.79 | +| Model | BatchSize | Precision | FPS | Rank-1(%) | Rank-5(%) | mAP | +|----------|-----------|-----------|---------|-----------|-----------|-------| +| FastReid | 32 | FP16 | 1850.78 | 88.39 | 98.45 | 92.79 | diff --git a/models/cv/multi_object_tracking/repnet/igie/README.md b/models/cv/multi_object_tracking/repnet/igie/README.md index 95a18415d5665a8cbc0cadb270ce91aadb854471..3a161ecc6ba859a1a42f25b5677752c1c0afda87 100644 --- a/models/cv/multi_object_tracking/repnet/igie/README.md +++ b/models/cv/multi_object_tracking/repnet/igie/README.md @@ -1,23 +1,23 @@ -# RepNet-Vehicle-ReID +# RepNet-Vehicle-ReID (IGIE) -## Description +## Model Description The paper "Deep Relative Distance Learning: Tell the Difference Between Similar Vehicles" introduces a model named Deep Relative Distance Learning (DRDL), specifically designed for the problem of vehicle re-identification. DRDL employs a dual-branch deep convolutional network architecture, combined with a coupled clusters loss function and a mixed difference network structure, effectively mapping vehicle images into Euclidean space for similarity measurement. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: to download the VehicleID dataset. +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -27,7 +27,7 @@ python3 export.py --weight epoch_14.pth --output repnet.onnx onnxsim repnet.onnx repnet_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/VehicleID/ @@ -42,12 +42,12 @@ bash scripts/infer_repnet_fp16_accuracy.sh bash scripts/infer_repnet_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |Acc(%) | ---------|-----------|----------|----------|----------| -RepNet | 32 | FP16 |1373.579 | 99.88 | +| Model | BatchSize | Precision | FPS | Acc(%) | +|--------|-----------|-----------|----------|--------| +| RepNet | 32 | FP16 | 1373.579 | 99.88 | -## Reference +## References -RepNet-MDNet-VehicleReID: +- [RepNet-MDNet-VehicleReID](https://github.com/CaptainEven/RepNet-MDNet-VehicleReID) diff --git a/models/cv/object_detection/atss/igie/README.md b/models/cv/object_detection/atss/igie/README.md index 23f0ae94ca2824f4c4c876bcc5a5d880e840bbde..357d7c957a399f456873256d038466a82af2ebb0 100644 --- a/models/cv/object_detection/atss/igie/README.md +++ b/models/cv/object_detection/atss/igie/README.md @@ -1,12 +1,22 @@ -# ATSS +# ATSS (IGIE) -## Description +## Model Description ATSS is an advanced adaptive training sample selection method that effectively enhances the performance of both anchor-based and anchor-free object detectors by dynamically choosing positive and negative samples based on the statistical characteristics of objects. The design of ATSS reduces reliance on hyperparameters, simplifies the sample selection process, and significantly improves detection accuracy without adding extra computational costs. 
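The adaptive selection described above fits in a few lines: for each ground-truth box, a fixed number of nearby anchors per pyramid level are taken as candidates, and a candidate becomes positive when its IoU exceeds the mean plus the standard deviation of the candidate IoUs. A simplified sketch with made-up IoUs; the real implementation also requires anchor centers to lie inside the ground-truth box:

```python
import numpy as np

def atss_positive_mask(candidate_ious: np.ndarray) -> np.ndarray:
    """Return a boolean mask of positive candidates for one ground-truth box."""
    threshold = candidate_ious.mean() + candidate_ious.std()
    return candidate_ious >= threshold

ious = np.array([0.05, 0.12, 0.33, 0.47, 0.52, 0.61])
print(atss_positive_mask(ious))   # only the best-aligned candidate passes here
```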
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209-985f7bd0.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - -```bash -wget https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209-985f7bd0.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight atss_r50_fpn_1x_coco_20200209-985f7bd0.pth --cfg atss onnxsim atss.onnx atss_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,13 +53,12 @@ bash scripts/infer_atss_fp16_accuracy.sh bash scripts/infer_atss_fp16_performance.sh ``` -## Results - -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -ATSS | 32 | FP16 | 81.671 | 0.541 | 0.367 | +## Model Results +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|--------|---------|--------------| +| ATSS | 32 | FP16 | 81.671 | 0.541 | 0.367 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/centernet/igie/README.md b/models/cv/object_detection/centernet/igie/README.md index 64156b2edcc94f154eec2fca25775005ff4605e5..a9e1516a008dd3ef45fb7270b2a6797f232e1518 100644 --- a/models/cv/object_detection/centernet/igie/README.md +++ b/models/cv/object_detection/centernet/igie/README.md @@ -1,12 +1,18 @@ -# CenterNet +# CenterNet (IGIE) -## Description +## Model Description CenterNet is an efficient object detection model that simplifies the traditional object detection process by representing targets as the center points of their bounding boxes and using keypoint estimation techniques to locate these points. This model not only excels in speed, achieving real-time detection while maintaining high accuracy, but also exhibits good versatility, easily extending to tasks such as 3D object detection and human pose estimation. CenterNet's network architecture employs various optimized fully convolutional networks and combines effective loss functions, making the model training and inference process more efficient. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
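As background for the exported model: "representing targets as center points" means the network predicts a per-class heatmap and detections are read off as its local maxima. A minimal sketch of that peak-extraction step, assuming a random PyTorch tensor in place of a real heatmap:

```python
import torch
import torch.nn.functional as F

def heatmap_peaks(heatmap: torch.Tensor, k: int = 100):
    """Keep local maxima of a (classes, H, W) heatmap and return the top-k scores."""
    pooled = F.max_pool2d(heatmap.unsqueeze(0), kernel_size=3, stride=1, padding=1)[0]
    peaks = heatmap * (heatmap == pooled)      # zero out non-maximum positions
    scores, flat_idx = peaks.flatten().topk(k)
    return scores, flat_idx                    # flat_idx encodes class and (y, x)

scores, _ = heatmap_peaks(torch.rand(80, 128, 128).sigmoid())
print(scores[:5])
```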
- ### Model Conversion ```bash @@ -31,7 +31,7 @@ Dataset: to download the valida python3 export.py --weight centernet_resnet18_140e_coco_20210705_093630-bb5b3bf7.pth --cfg centernet_r18_8xb16-crop512-140e_coco.py --output centernet.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -46,12 +46,12 @@ bash scripts/infer_centernet_fp16_accuracy.sh bash scripts/infer_centernet_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | -----------|-----------|----------|----------|----------|---------------| -CenterNet | 32 | FP16 | 799.70 | 0.423 | 0.258 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-----------|-----------|-----------|--------|---------|--------------| +| CenterNet | 32 | FP16 | 799.70 | 0.423 | 0.258 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/centernet/ixrt/README.md b/models/cv/object_detection/centernet/ixrt/README.md index 776b54fb257d9394cd9caafe36ed022f3019d356..031525be203b348abf0f369a099e22601b3300ad 100644 --- a/models/cv/object_detection/centernet/ixrt/README.md +++ b/models/cv/object_detection/centernet/ixrt/README.md @@ -1,12 +1,18 @@ -# CenterNet +# CenterNet (IxRT) -## Description +## Model Description CenterNet is an efficient object detection model that simplifies the traditional object detection process by representing targets as the center points of their bounding boxes and using keypoint estimation techniques to locate these points. This model not only excels in speed, achieving real-time detection while maintaining high accuracy, but also exhibits good versatility, easily extending to tasks such as 3D object detection and human pose estimation. CenterNet's network architecture employs various optimized fully convolutional networks and combines effective loss functions, making the model training and inference process more efficient. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -19,12 +25,6 @@ pip3 install -r requirements.txt # Contact the Iluvatar administrator to get the mmcv install package. ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -32,7 +32,7 @@ Dataset: to download the valida python3 export.py --weight centernet_resnet18_140e_coco_20210705_093630-bb5b3bf7.pth --cfg centernet_r18_8xb16-crop512-140e_coco.py --output centernet.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -48,12 +48,12 @@ bash scripts/infer_centernet_fp16_accuracy.sh bash scripts/infer_centernet_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | | --------- | --------- | --------- | ------- | ------- | ------------ | | CenterNet | 32 | FP16 | 879.447 | 0.423 | 0.258 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/detr/ixrt/README.md b/models/cv/object_detection/detr/ixrt/README.md index cde0acdaffda41ffc0f3f0f83229cf82fb501003..2b318fb544511f5b82ef9f5a8811c27cad89dac5 100755 --- a/models/cv/object_detection/detr/ixrt/README.md +++ b/models/cv/object_detection/detr/ixrt/README.md @@ -1,12 +1,18 @@ -# DETR +# DETR (IxRT) -## Description +## Model Description DETR (DEtection TRansformer) is a novel approach that views object detection as a direct set prediction problem. This method streamlines the detection process, eliminating the need for many hand-designed components like non-maximum suppression procedures or anchor generation, which are typically used to explicitly encode prior knowledge about the task. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -31,7 +31,7 @@ mkdir checkpoints python3 export_model.py --torch_file /path/to/detr_r50_8xb2-150e_coco_20221023_153551-436d03e8.pth --onnx_file checkpoints/detr_res50.onnx --bsz 1 ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -52,8 +52,8 @@ bash scripts/infer_detr_fp16_accuracy.sh bash scripts/infer_detr_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 ---------|-----------|----------|----------|----------|------------ -DETR | 1 | FP16 | 65.84 | 0.370 | 0.198 +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|-------|-----------|-----------|-------|---------|--------------| +| DETR | 1 | FP16 | 65.84 | 0.370 | 0.198 | diff --git a/models/cv/object_detection/fcos/igie/README.md b/models/cv/object_detection/fcos/igie/README.md index 2bcdbee2a63fc95e28cc608c5798ac8262c18f87..c4f9ece1fa04f25cbe6a7fc9bc77c5f4c99e20a1 100644 --- a/models/cv/object_detection/fcos/igie/README.md +++ b/models/cv/object_detection/fcos/igie/README.md @@ -1,12 +1,22 @@ -# FCOS +# FCOS (IGIE) -## Description +## Model Description FCOS is an innovative one-stage object detection framework that abandons traditional anchor box dependency and uses a fully convolutional network for per-pixel target prediction. By introducing a centerness branch and multi-scale feature fusion, FCOS enhances detection performance while simplifying the model structure, especially in detecting small and overlapping targets. Additionally, FCOS eliminates the need for hyperparameter tuning related to anchor boxes, streamlining the model training and tuning process. 
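The centerness branch mentioned above scores how close a location is to the center of the box it predicts, and that score is used to down-weight low-quality boxes far from object centers. A small sketch of the centerness definition from the FCOS paper, given the distances from a location to the four box sides:

```python
import math

def centerness(left: float, top: float, right: float, bottom: float) -> float:
    """FCOS centerness: sqrt of the min/max ratios of the horizontal and vertical distances."""
    return math.sqrt(
        (min(left, right) / max(left, right)) * (min(top, bottom) / max(top, bottom))
    )

print(centerness(50, 50, 50, 50))   # 1.0 at the exact center
print(centerness(90, 10, 10, 90))   # ~0.11 near a corner
```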
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco/fcos_r50_caffe_fpn_gn-head_1x_coco-821213aa.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - -```bash -wget https://download.openmmlab.com/mmdetection/v2.0/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco/fcos_r50_caffe_fpn_gn-head_1x_coco-821213aa.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight fcos_r50_caffe_fpn_gn-head_1x_coco-821213aa.pth --cfg onnxsim fcos.onnx fcos_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,12 +53,12 @@ bash scripts/infer_fcos_fp16_accuracy.sh bash scripts/infer_fcos_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -FCOS | 32 | FP16 | 83.09 | 0.522 | 0.339 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|-------|---------|--------------| +| FCOS | 32 | FP16 | 83.09 | 0.522 | 0.339 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/fcos/ixrt/README.md b/models/cv/object_detection/fcos/ixrt/README.md index 56b18c6980bfbc5a1689eaf26b9b002e25b296db..4f17fe4db5d62b6b89536db62108763f379c23a7 100755 --- a/models/cv/object_detection/fcos/ixrt/README.md +++ b/models/cv/object_detection/fcos/ixrt/README.md @@ -1,13 +1,22 @@ -# FCOS +# FCOS (IxRT) -## Description +## Model Description FCOS is an anchor-free model based on the Fully Convolutional Network (FCN) architecture for pixel-wise object detection. It implements a proposal-free solution and introduces the concept of centerness. For more details, please refer to our [report on Arxiv](https://arxiv.org/abs/1904.01355). -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +COCO2017: + +- val2017: Path/To/val2017/*.jpg +- annotations: Path/To/annotations/instances_val2017.json + +### Install Dependencies ```bash # Install libGL @@ -19,27 +28,16 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Dependency - The inference of the FCOS model requires a dependency on a well-adapted mmcv-v1.7.0 library. Please inquire with the staff to obtain the relevant libraries. -You can follow here to build: https://gitee.com/deep-spark/deepsparkhub/blob/master/toolbox/MMDetection/prepare_mmcv.sh +You can follow the script [prepare_mmcv.sh](https://gitee.com/deep-spark/deepsparkhub/blob/master/toolbox/MMDetection/prepare_mmcv.sh) to build: ```bash - cd mmcv sh build_mmcv.sh sh install_mmcv.sh ``` -### Download - -Pretrained model: - -- COCO2017数据集准备参考: - - 图片目录: Path/To/val2017/*.jpg - - 标注文件目录: Path/To/annotations/instances_val2017.json - ### Model Conversion MMDetection is an open source object detection toolbox based on PyTorch. It is a part of the OpenMMLab project.It is utilized for model conversion. 
In MMDetection, Execute model conversion command, and the checkpoints folder needs to be created, (mkdir checkpoints) in project @@ -65,7 +63,7 @@ python3 tools/deployment/pytorch2onnx.py \ If there are issues such as input parameter mismatch during model export, it may be due to ONNX version. To resolve this, please delete the last parameter (dynamic_slice) from the return value of the_slice_helper function in the /usr/local/lib/python3.10/site-packages/mmcv/onnx/onnx_utils/symbolic_helper.py file. -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -83,7 +81,7 @@ bash scripts/infer_fcos_fp16_accuracy.sh bash scripts/infer_fcos_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | | ----- | --------- | --------- | ----- | ------- | ------------ | diff --git a/models/cv/object_detection/foveabox/igie/README.md b/models/cv/object_detection/foveabox/igie/README.md index cbed39fddb63ae831cc5b07172c93dccf10ab0df..bc5a8a876a96658597c600544db350b5521df0e8 100644 --- a/models/cv/object_detection/foveabox/igie/README.md +++ b/models/cv/object_detection/foveabox/igie/README.md @@ -1,12 +1,18 @@ -# FoveaBox +# FoveaBox (IGIE) -## Description +## Model Description FoveaBox is an advanced anchor-free object detection framework that enhances accuracy and flexibility by directly predicting the existence and bounding box coordinates of objects. Utilizing a Feature Pyramid Network (FPN), it adeptly handles targets of varying scales, particularly excelling with objects of arbitrary aspect ratios. FoveaBox also demonstrates robustness against image deformations. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -34,7 +34,7 @@ python3 export.py --weight fovea_r50_fpn_4x4_1x_coco_20200219-ee4d5303.pth --cfg onnxsim foveabox.onnx foveabox_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -49,12 +49,12 @@ bash scripts/infer_foveabox_fp16_accuracy.sh bash scripts/infer_foveabox_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | ----------|-----------|----------|----------|----------|---------------| -FoveaBox | 32 | FP16 | 192.496 | 0.531 | 0.346 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|----------|-----------|-----------|---------|---------|--------------| +| FoveaBox | 32 | FP16 | 192.496 | 0.531 | 0.346 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/foveabox/ixrt/README.md b/models/cv/object_detection/foveabox/ixrt/README.md index 2682a25328e027e5ebadaa62b187a55694f9cfab..1949fca37aa98c2c96a1410ceca6cd0a0533e61e 100644 --- a/models/cv/object_detection/foveabox/ixrt/README.md +++ b/models/cv/object_detection/foveabox/ixrt/README.md @@ -1,12 +1,18 @@ -# FoveaBox +# FoveaBox (IxRT) -## Description +## Model Description FoveaBox is an advanced anchor-free object detection framework that enhances accuracy and flexibility by directly predicting the existence and bounding box coordinates of objects. 
Utilizing a Feature Pyramid Network (FPN), it adeptly handles targets of varying scales, particularly excelling with objects of arbitrary aspect ratios. FoveaBox also demonstrates robustness against image deformations. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -15,22 +21,9 @@ yum install -y mesa-libGL ## Ubuntu apt install -y libgl1-mesa-glx -pip3 install tqdm -pip3 install onnx -pip3 install onnxsim -pip3 install ultralytics -pip3 install pycocotools -pip3 install mmdeploy -pip3 install mmdet -pip3 install opencv-python==4.6.0.66 +pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -41,7 +34,7 @@ python3 export.py --weight fovea_r50_fpn_4x4_1x_coco_20200219-ee4d5303.pth --cfg onnxsim foveabox.onnx foveabox_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -56,12 +49,12 @@ bash scripts/infer_foveabox_fp16_accuracy.sh bash scripts/infer_foveabox_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | ----------|-----------|----------|----------|----------|---------------| -FoveaBox | 32 | FP16 | 181.304 | 0.531 | 0.346 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|----------|-----------|-----------|---------|---------|--------------| +| FoveaBox | 32 | FP16 | 181.304 | 0.531 | 0.346 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/foveabox/ixrt/requirements.txt b/models/cv/object_detection/foveabox/ixrt/requirements.txt new file mode 100644 index 0000000000000000000000000000000000000000..6b25e9d96c0fbcfc31464f8d950aa998f2c47885 --- /dev/null +++ b/models/cv/object_detection/foveabox/ixrt/requirements.txt @@ -0,0 +1,8 @@ +tqdm +onnx +onnxsim +ultralytics +pycocotools +mmdeploy +mmdet +opencv-python==4.6.0.66 \ No newline at end of file diff --git a/models/cv/object_detection/fsaf/igie/README.md b/models/cv/object_detection/fsaf/igie/README.md index d6fdbcfd85f55cd9458ecf86df433e595e1c82ae..ce71c0c5e189e21b62b09851e8b5f4d1d4268a2c 100644 --- a/models/cv/object_detection/fsaf/igie/README.md +++ b/models/cv/object_detection/fsaf/igie/README.md @@ -1,12 +1,22 @@ -# FSAF +# FSAF (IGIE) -## Description +## Model Description The FSAF (Feature Selective Anchor-Free) module is an innovative component for single-shot object detection that enhances performance through online feature selection and anchor-free branches. The FSAF module dynamically selects the most suitable feature level for each object instance, rather than relying on traditional anchor-based heuristic methods. This improvement significantly boosts the accuracy of object detection, especially for small targets and in complex scenes. Moreover, compared to existing anchor-based detectors, the FSAF module maintains high efficiency while adding negligible additional inference overhead. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. 
+ +```bash +wget https://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r50_fpn_1x_coco/fsaf_r50_fpn_1x_coco-94ccc51f.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - -```bash -wget https://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r50_fpn_1x_coco/fsaf_r50_fpn_1x_coco-94ccc51f.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight fsaf_r50_fpn_1x_coco-94ccc51f.pth --cfg fsaf_r50_fpn_ onnxsim fsaf.onnx fsaf_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,12 +53,12 @@ bash scripts/infer_fsaf_fp16_accuracy.sh bash scripts/infer_fsaf_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -FSAF | 32 | FP16 | 122.35 | 0.530 | 0.345 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|--------|---------|--------------| +| FSAF | 32 | FP16 | 122.35 | 0.530 | 0.345 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/fsaf/ixrt/README.md b/models/cv/object_detection/fsaf/ixrt/README.md index 03935e629b1cfe63a91dd566577a522902690e50..8fdb52cc114e1bad1c0aa53646c4c39abc12bdf4 100644 --- a/models/cv/object_detection/fsaf/ixrt/README.md +++ b/models/cv/object_detection/fsaf/ixrt/README.md @@ -1,12 +1,22 @@ -# FSAF +# FSAF (IxRT) -## Description +## Model Description The FSAF (Feature Selective Anchor-Free) module is an innovative component for single-shot object detection that enhances performance through online feature selection and anchor-free branches. The FSAF module dynamically selects the most suitable feature level for each object instance, rather than relying on traditional anchor-based heuristic methods. This improvement significantly boosts the accuracy of object detection, especially for small targets and in complex scenes. Moreover, compared to existing anchor-based detectors, the FSAF module maintains high efficiency while adding negligible additional inference overhead. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r50_fpn_1x_coco/fsaf_r50_fpn_1x_coco-94ccc51f.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- -```bash -wget https://download.openmmlab.com/mmdetection/v2.0/fsaf/fsaf_r50_fpn_1x_coco/fsaf_r50_fpn_1x_coco-94ccc51f.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight fsaf_r50_fpn_1x_coco-94ccc51f.pth --cfg fsaf_r50_fpn_ onnxsim fsaf.onnx fsaf_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,12 +53,12 @@ bash scripts/infer_fsaf_fp16_accuracy.sh bash scripts/infer_fsaf_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -FSAF | 32 | FP16 | 133.85 | 0.530 | 0.345 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|--------|---------|--------------| +| FSAF | 32 | FP16 | 133.85 | 0.530 | 0.345 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/hrnet/igie/README.md b/models/cv/object_detection/hrnet/igie/README.md index 49fc4dcd00fa07afdd156791253463487d1f5000..7889fb6fbb52d12768286e3d9de77e445f4540c4 100644 --- a/models/cv/object_detection/hrnet/igie/README.md +++ b/models/cv/object_detection/hrnet/igie/README.md @@ -1,12 +1,18 @@ -# HRNet +# HRNet (IGIE) -## Description +## Model Description HRNet is an advanced deep learning architecture for human pose estimation, characterized by its maintenance of high-resolution representations throughout the entire network process, thereby avoiding the low-to-high resolution recovery step typical of traditional models. The network features parallel multi-resolution subnetworks and enriches feature representation through repeated multi-scale fusion, which enhances the accuracy of keypoint detection. Additionally, HRNet offers computational efficiency and has demonstrated superior performance over previous methods on several standard datasets. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
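To picture the "repeated multi-scale fusion" mentioned in the description before converting the model: parallel branches at different resolutions are resized to a common resolution and summed. This is only a toy sketch with random tensors and equal channel counts; in the real HRNet the branch widths differ and 1x1 convolutions align them before fusion:

```python
import torch
import torch.nn.functional as F

# Toy feature maps from three parallel branches at full, 1/2 and 1/4 resolution.
branches = [torch.rand(1, 18, 64, 64), torch.rand(1, 18, 32, 32), torch.rand(1, 18, 16, 16)]

# Fuse toward the highest-resolution branch: upsample the coarser maps and sum.
target_size = branches[0].shape[-2:]
fused = sum(F.interpolate(b, size=target_size, mode="bilinear", align_corners=False)
            for b in branches)
print(fused.shape)   # torch.Size([1, 18, 64, 64])
```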
- ### Model Conversion ```bash @@ -34,7 +34,7 @@ python3 export.py --weight fcos_hrnetv2p_w18_gn-head_4x4_1x_coco_20201212_100710 onnxsim hrnet.onnx hrnet_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -49,12 +49,12 @@ bash scripts/infer_hrnet_fp16_accuracy.sh bash scripts/infer_hrnet_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -HRNet | 32 | FP16 | 64.282 | 0.491 | 0.326 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|--------|---------|--------------| +| HRNet | 32 | FP16 | 64.282 | 0.491 | 0.326 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/hrnet/ixrt/README.md b/models/cv/object_detection/hrnet/ixrt/README.md index 284faafd0497c095e25ac375bfd652c87a496b46..cd1737770152f031c621f373c573854bd0adc946 100644 --- a/models/cv/object_detection/hrnet/ixrt/README.md +++ b/models/cv/object_detection/hrnet/ixrt/README.md @@ -1,12 +1,18 @@ -# HRNet +# HRNet (IxRT) -## Description +## Model Description HRNet is an advanced deep learning architecture for human pose estimation, characterized by its maintenance of high-resolution representations throughout the entire network process, thereby avoiding the low-to-high resolution recovery step typical of traditional models. The network features parallel multi-resolution subnetworks and enriches feature representation through repeated multi-scale fusion, which enhances the accuracy of keypoint detection. Additionally, HRNet offers computational efficiency and has demonstrated superior performance over previous methods on several standard datasets. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -34,7 +34,7 @@ python3 export.py --weight fcos_hrnetv2p_w18_gn-head_4x4_1x_coco_20201212_100710 onnxsim hrnet.onnx hrnet_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -49,12 +49,12 @@ bash scripts/infer_hrnet_fp16_accuracy.sh bash scripts/infer_hrnet_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -HRNet | 32 | FP16 | 75.199 | 0.491 | 0.327 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|--------|---------|--------------| +| HRNet | 32 | FP16 | 75.199 | 0.491 | 0.327 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/paa/igie/README.md b/models/cv/object_detection/paa/igie/README.md index 5cbc5505c4d072f5bc881c863e091827bac3011e..f9797fa87ba952f73ce100877287ee2b06f9fc74 100644 --- a/models/cv/object_detection/paa/igie/README.md +++ b/models/cv/object_detection/paa/igie/README.md @@ -1,12 +1,18 @@ -# PAA +# PAA (IGIE) -## Description +## Model Description PAA (Probabilistic Anchor Assignment) is an algorithm for object detection that adaptively assigns positive and negative anchor samples using a probabilistic model. It employs a Gaussian mixture model to dynamically select positive and negative samples based on score distribution, avoiding the misassignment issues of traditional IoU threshold-based methods. PAA enhances detection accuracy, particularly in complex scenarios, and is compatible with existing detection frameworks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -34,7 +34,7 @@ python3 export.py --weight paa_r50_fpn_1x_coco_20200821-936edec3.pth --cfg paa_r onnxsim paa.onnx paa_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -49,12 +49,12 @@ bash scripts/infer_paa_fp16_accuracy.sh bash scripts/infer_paa_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | | ----- | --------- | --------- | ------- | ------- | ------------ | | PAA | 32 | FP16 | 138.414 | 0.555 | 0.381 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/retinaface/igie/README.md b/models/cv/object_detection/retinaface/igie/README.md index 3610a8bf666b55cf053afb8cd0dc9149135fc98b..0f0da855b0e59a69bf54c640d919840b396b7b02 100755 --- a/models/cv/object_detection/retinaface/igie/README.md +++ b/models/cv/object_detection/retinaface/igie/README.md @@ -1,12 +1,18 @@ -# RetinaFace +# RetinaFace (IGIE) -## Description +## Model Description RetinaFace is an efficient single-stage face detection model that employs a multi-task learning strategy to simultaneously predict facial locations, landmarks, and 3D facial shapes. It utilizes feature pyramids and context modules to extract multi-scale features and employs a self-supervised mesh decoder to enhance detection accuracy. 
RetinaFace demonstrates excellent performance on datasets like WIDER FACE, supports real-time processing, and its code and datasets are publicly available for researchers. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ```bash wget https://github.com/biubug6/Face-Detector-1MB-with-landmark/raw/master/weights/mobilenet0.25_Final.pth ``` @@ -38,7 +38,7 @@ python3 export.py --weight mobilenet0.25_Final.pth --output retinaface.onnx onnxsim retinaface.onnx retinaface_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/widerface/ @@ -53,12 +53,12 @@ bash scripts/infer_retinaface_fp16_accuracy.sh bash scripts/infer_retinaface_fp16_performance.sh ``` -## Results +## Model Results -| Model | BatchSize | Precision | FPS | Easy AP(%) | Medium AP (%) | Hard AP(%) | -| :--------: | :-------: | :-------: | :------: | :--------: | :-----------: | :--------: | -| RetinaFace | 32 | FP16 | 8304.626 | 80.13 | 68.52 | 36.59 | +| Model | BatchSize | Precision | FPS | Easy AP(%) | Medium AP (%) | Hard AP(%) | +|------------|-----------|-----------|----------|------------|---------------|------------| +| RetinaFace | 32 | FP16 | 8304.626 | 80.13 | 68.52 | 36.59 | -## Reference +## References -Face-Detector-1MB-with-landmark: +- [Face-Detector-1MB-with-landmark](https://github.com/biubug6/Face-Detector-1MB-with-landmark) diff --git a/models/cv/object_detection/retinaface/ixrt/README.md b/models/cv/object_detection/retinaface/ixrt/README.md index 2513eb34d69f52137d8b022b8ed157574a6b0032..fcb4e4803e0088a05161b1bfbfc00be837f3d88a 100644 --- a/models/cv/object_detection/retinaface/ixrt/README.md +++ b/models/cv/object_detection/retinaface/ixrt/README.md @@ -1,12 +1,22 @@ -# RetinaFace +# RetinaFace (IxRT) -## Description +## Model Description RetinaFace is an efficient single-stage face detection model that employs a multi-task learning strategy to simultaneously predict facial locations, landmarks, and 3D facial shapes. It utilizes feature pyramids and context modules to extract multi-scale features and employs a self-supervised mesh decoder to enhance detection accuracy. RetinaFace demonstrates excellent performance on datasets like WIDER FACE, supports real-time processing, and its code and datasets are publicly available for researchers. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://github.com/biubug6/Face-Detector-1MB-with-landmark/raw/master/weights/mobilenet0.25_Final.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -20,23 +30,14 @@ pip3 install -r requirements.txt python3 setup.py build_ext --inplace ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- -```bash -wget https://github.com/biubug6/Face-Detector-1MB-with-landmark/raw/master/weights/mobilenet0.25_Final.pth -``` - ### Model Conversion ```bash # export onnx model python3 torch2onnx.py --model mobilenet0.25_Final.pth --onnx_model mnetv1_retinaface.onnx +``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/widerface/ @@ -52,12 +53,12 @@ bash scripts/infer_retinaface_fp16_accuracy.sh bash scripts/infer_retinaface_fp16_performance.sh ``` -## Results +## Model Results -| Model | BatchSize | Precision | FPS | Easy AP(%) | Medium AP (%) | Hard AP(%) | -| :--------: | :-------: | :-------: | :------: | :--------: | :-----------: | :--------: | -| RetinaFace | 32 | FP16 | 8536.367 | 80.84 | 69.34 | 37.31 | +| Model | BatchSize | Precision | FPS | Easy AP(%) | Medium AP (%) | Hard AP(%) | +|------------|-----------|-----------|----------|------------|---------------|------------| +| RetinaFace | 32 | FP16 | 8536.367 | 80.84 | 69.34 | 37.31 | -## Reference +## References -Face-Detector-1MB-with-landmark: +- [Face-Detector-1MB-with-landmark](https://github.com/biubug6/Face-Detector-1MB-with-landmark) diff --git a/models/cv/object_detection/retinanet/igie/README.md b/models/cv/object_detection/retinanet/igie/README.md index 18067d85f6bd296aca7a3b00694a5765564a1fb7..35dfcbfbc721f4afb78122465ff252baa38a41e2 100644 --- a/models/cv/object_detection/retinanet/igie/README.md +++ b/models/cv/object_detection/retinanet/igie/README.md @@ -1,12 +1,18 @@ -# RetinaNet +# RetinaNet (IGIE) -## Description +## Model Description RetinaNet, an innovative object detector, challenges the conventional trade-off between speed and accuracy in the realm of computer vision. Traditionally, two-stage detectors, exemplified by R-CNN, achieve high accuracy by applying a classifier to a limited set of candidate object locations. In contrast, one-stage detectors, like RetinaNet, operate over a dense sampling of possible object locations, aiming for simplicity and speed. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
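What makes the dense, one-stage sampling described above workable is RetinaNet's focal loss, which shrinks the contribution of easy, well-classified locations so the many background anchors do not dominate training. This only affects how the released weights were trained, but a minimal sketch of the binary focal loss helps explain the design; the probabilities below are made up:

```python
import numpy as np

def focal_loss(p: np.ndarray, target: np.ndarray, alpha: float = 0.25, gamma: float = 2.0) -> np.ndarray:
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p_t = np.where(target == 1, p, 1.0 - p)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

probs = np.array([0.9, 0.6, 0.1])   # predicted foreground probabilities
labels = np.array([1, 1, 0])        # 1 = object, 0 = background
print(focal_loss(probs, labels))    # confident, correct predictions contribute least
```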
- ### Model Conversion ```bash @@ -34,7 +34,7 @@ python3 export.py --weight retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth --cfg onnxsim retinanet.onnx retinanet_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -49,12 +49,12 @@ bash scripts/infer_retinanet_fp16_accuracy.sh bash scripts/infer_retinanet_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | -----------|-----------|----------|----------|----------|---------------| -RetinaNet | 32 | FP16 | 160.52 | 0.515 | 0.335 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-----------|-----------|-----------|--------|---------|--------------| +| RetinaNet | 32 | FP16 | 160.52 | 0.515 | 0.335 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/rtmdet/igie/README.md b/models/cv/object_detection/rtmdet/igie/README.md index 8ed35963b1af9faf66346bd01b6e7316f2ed18f5..d1f0fc26fd8e3a7e3df087830ef8e8811ea993f6 100644 --- a/models/cv/object_detection/rtmdet/igie/README.md +++ b/models/cv/object_detection/rtmdet/igie/README.md @@ -1,12 +1,22 @@ -# RTMDet +# RTMDet (IGIE) -## Description +## Model Description RTMDet, presented by the Shanghai AI Laboratory, is a novel framework for real-time object detection that surpasses the efficiency of the YOLO series. The model's architecture is meticulously crafted for optimal efficiency, employing a basic building block consisting of large-kernel depth-wise convolutions in both the backbone and neck, which enhances the model's ability to capture global context. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmpose/v1/projects/rtmpose/rtmdet_nano_8xb32-100e_coco-obj365-person-05d8511e.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
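The "large-kernel depth-wise convolution" building block mentioned in the description is easy to picture: a convolution whose group count equals its channel count, followed by a 1x1 point-wise convolution that mixes channels. A rough sketch of such a block; the layer sizes are arbitrary and are not RTMDet's actual configuration:

```python
import torch
import torch.nn as nn

channels = 64
block = nn.Sequential(
    # Depth-wise 5x5: one filter per channel (groups == channels) gives a large
    # receptive field at low cost.
    nn.Conv2d(channels, channels, kernel_size=5, padding=2, groups=channels),
    # Point-wise 1x1 convolution mixes information across channels.
    nn.Conv2d(channels, channels, kernel_size=1),
    nn.BatchNorm2d(channels),
    nn.SiLU(),
)

print(block(torch.rand(1, channels, 80, 80)).shape)   # torch.Size([1, 64, 80, 80])
```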
- -```bash -wget https://download.openmmlab.com/mmpose/v1/projects/rtmpose/rtmdet_nano_8xb32-100e_coco-obj365-person-05d8511e.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight rtmdet_nano_8xb32-100e_coco-obj365-person-05d8511e.pt onnxsim rtmdet.onnx rtmdet_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,11 +53,12 @@ bash scripts/infer_rtmdet_fp16_accuracy.sh bash scripts/infer_rtmdet_fp16_performance.sh ``` -## Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | -----------|-----------|----------|----------|----------|---------------| -RTMDet | 32 | FP16 | 2627.15 | 0.619 | 0.403 | +## Model Results + +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| RTMDet | 32 | FP16 | 2627.15 | 0.619 | 0.403 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/sabl/igie/README.md b/models/cv/object_detection/sabl/igie/README.md index 25732e82c79af5130609848962a9045f25a5d0b9..4cb475c25c82ad4344da94ba26e86b07d04b4998 100644 --- a/models/cv/object_detection/sabl/igie/README.md +++ b/models/cv/object_detection/sabl/igie/README.md @@ -1,12 +1,22 @@ -# SABL +# SABL (IGIE) -## Description +## Model Description SABL (Side-Aware Boundary Localization) is an innovative approach in object detection that focuses on improving the precision of bounding box localization. It addresses the limitations of traditional bounding box regression methods, such as boundary ambiguity and asymmetric prediction errors, was first proposed in the paper "Side-Aware Boundary Localization for More Precise Object Detection". -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_1x_coco/sabl_retinanet_r50_fpn_1x_coco-6c54fd4f.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- -```bash -wget https://download.openmmlab.com/mmdetection/v2.0/sabl/sabl_retinanet_r50_fpn_1x_coco/sabl_retinanet_r50_fpn_1x_coco-6c54fd4f.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight sabl_retinanet_r50_fpn_1x_coco-6c54fd4f.pth --cfg sab onnxsim sabl.onnx sabl_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,12 +53,12 @@ bash scripts/infer_sabl_fp16_accuracy.sh bash scripts/infer_sabl_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | --------|-----------|----------|----------|----------|---------------| -SABL | 32 | FP16 | 189.42 | 0.530 | 0.356 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|-------|-----------|-----------|--------|---------|--------------| +| SABL | 32 | FP16 | 189.42 | 0.530 | 0.356 | -## Reference +## References -mmdetection: +- [mmdetection](https://github.com/open-mmlab/mmdetection.git) diff --git a/models/cv/object_detection/yolov10/igie/README.md b/models/cv/object_detection/yolov10/igie/README.md index fde2e5dc5b04fcaf28b35ba3f9a4bfacd0dda900..f94a3c1e45ea094c116d4f5b7b8ad9335abebdfa 100644 --- a/models/cv/object_detection/yolov10/igie/README.md +++ b/models/cv/object_detection/yolov10/igie/README.md @@ -1,34 +1,34 @@ -# YOLOv10 +# YOLOv10 (IGIE) -## Description +## Model Description YOLOv10, built on the Ultralytics Python package by researchers at Tsinghua University, introduces a new approach to real-time object detection, addressing both the post-processing and model architecture deficiencies found in previous YOLO versions. By eliminating non-maximum suppression (NMS) and optimizing various model components, YOLOv10 achieves state-of-the-art performance with significantly reduced computational overhead. Extensive experiments demonstrate its superior accuracy-latency trade-offs across multiple model scales. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +### Install Dependencies ```bash pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - ## Model Conversion ```bash git clone --depth 1 https://github.com/THU-MIG/yolov10.git -cd yolov10 +cd yolov10/ pip3 install -e . --no-deps -cd .. +cd ../ python3 export.py --weight yolov10s.pt --batch 32 ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -43,12 +43,12 @@ bash scripts/infer_yolov10_fp16_accuracy.sh bash scripts/infer_yolov10_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | | ------- | --------- | --------- | ------ | ------- | ------------ | | YOLOv10 | 32 | FP16 | 810.97 | 0.629 | 0.461 | -## Reference +## References -YOLOv10: +- [YOLOv10](https://docs.ultralytics.com/models/yolov10) diff --git a/models/cv/object_detection/yolov11/igie/README.md b/models/cv/object_detection/yolov11/igie/README.md index 9db80e45536a491f696adc8377eeaaec4bf629b8..96cb53858c8a62bf848fda790a3773fb4ed8a331 100644 --- a/models/cv/object_detection/yolov11/igie/README.md +++ b/models/cv/object_detection/yolov11/igie/README.md @@ -1,28 +1,28 @@ -# YOLOv11 +# YOLOv11 (IGIE) -## Description +## Model Description YOLOv11 is the latest generation of the YOLO (You Only Look Once) series object detection model released by Ultralytics. 
Building upon the advancements of previous YOLO models, such as YOLOv5 and YOLOv8, YOLOv11 introduces comprehensive upgrades to further enhance performance, flexibility, and usability. It is a versatile deep learning model designed for multi-task applications, supporting object detection, instance segmentation, image classification, keypoint pose estimation, and rotated object detection. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +### Install Dependencies ```bash pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - ## Model Conversion ```bash python3 export.py --weight yolo11n.pt --batch 32 ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -37,12 +37,12 @@ bash scripts/infer_yolov11_fp16_accuracy.sh bash scripts/infer_yolov11_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | | ------- | --------- | --------- | ------- | ------- | ------------ | | YOLOv11 | 32 | FP16 | 1519.25 | 0.551 | 0.393 | -## Reference +## References YOLOv11: diff --git a/models/cv/object_detection/yolov3/igie/README.md b/models/cv/object_detection/yolov3/igie/README.md index 7e1d769afe6584d0eb08af2202023f3432ee7ccd..566e805228bd9c7c6626d68ed457fc1f9f0149f0 100644 --- a/models/cv/object_detection/yolov3/igie/README.md +++ b/models/cv/object_detection/yolov3/igie/README.md @@ -1,12 +1,18 @@ -# YOLOv3 +# YOLOv3 (IGIE) -## Description +## Model Description YOLOv3 is a influential object detection algorithm.The key innovation of YOLOv3 lies in its ability to efficiently detect and classify objects in real-time with a single pass through the neural network. YOLOv3 divides an input image into a grid and predicts bounding boxes, class probabilities, and objectness scores for each grid cell. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -33,7 +33,7 @@ python3 export.py --weight yolov3.pt --output yolov3.onnx onnxsim yolov3.onnx yolov3_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -57,9 +57,9 @@ bash scripts/infer_yolov3_int8_accuracy.sh bash scripts/infer_yolov3_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|---------|-------------| -YOLOv3 | 32 | FP16 | 312.47 | 0.658 | 0.467 | -YOLOv3 | 32 | INT8 | 711.72 | 0.639 | 0.427 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|--------|---------|--------------| +| YOLOv3 | 32 | FP16 | 312.47 | 0.658 | 0.467 | +| YOLOv3 | 32 | INT8 | 711.72 | 0.639 | 0.427 | diff --git a/models/cv/object_detection/yolov3/ixrt/README.md b/models/cv/object_detection/yolov3/ixrt/README.md index d1f37fec61e169da484a63de689f4484a27ef4b2..fd429df128949182f135e67a2aff41f10e3be031 100644 --- a/models/cv/object_detection/yolov3/ixrt/README.md +++ b/models/cv/object_detection/yolov3/ixrt/README.md @@ -1,12 +1,21 @@ -# YOLOv3 +# YOLOv3 (IxRT) -## Description +## Model Description YOLOv3 is a influential object detection algorithm.The key innovation of YOLOv3 lies in its ability to efficiently detect and classify objects in real-time with a single pass through the neural network. YOLOv3 divides an input image into a grid and predicts bounding boxes, class probabilities, and objectness scores for each grid cell. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +- val2017: Path/To/val2017/*.jpg +- annotations: Path/To/annotations/instances_val2017.json + +### Install Dependencies ```bash # Install libGL @@ -18,15 +27,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - -- 图片目录: Path/To/val2017/*.jpg -- 标注文件目录: Path/To/annotations/instances_val2017.json - ### Model Conversion ```bash @@ -41,7 +41,7 @@ python3 detect.py --cfg cfg/yolov3.cfg --weights weights/yolov3.weights mv weights/export.onnx /Path/to/checkpoints/yolov3.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/yolov3/ixrt @@ -71,9 +71,9 @@ bash scripts/infer_yolov3_int8_accuracy.sh bash scripts/infer_yolov3_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|----------|-------------| -YOLOv3 | 32 | FP16 | 757.11 | 0.663 | 0.381 | -YOLOv3 | 32 | INT8 | 1778.34 | 0.659 | 0.356 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| YOLOv3 | 32 | FP16 | 757.11 | 0.663 | 0.381 | +| YOLOv3 | 32 | INT8 | 1778.34 | 0.659 | 0.356 | diff --git a/models/cv/object_detection/yolov4/igie/README.md b/models/cv/object_detection/yolov4/igie/README.md index f2a0320d1fe1263f1d136b2ffd69dee351cd70a5..cdb0c58e31cca6717c225e23c9ab88886e1d173c 100644 --- a/models/cv/object_detection/yolov4/igie/README.md +++ b/models/cv/object_detection/yolov4/igie/README.md @@ -1,12 +1,19 @@ -# YOLOv4 +# YOLOv4 (IGIE) -## Description +## Model Description YOLOv4 employs a two-step process, involving regression for bounding box positioning and classification for object categorization. 
it amalgamates past YOLO family research contributions with novel features like WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained cfg: +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,13 +25,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained cfg: -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --cfg yolov4/cfg/yolov4.cfg --weight yolov4.weights --output y onnxsim yolov4.onnx yolov4_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -62,14 +62,14 @@ bash scripts/infer_yolov4_int8_accuracy.sh bash scripts/infer_yolov4_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|----------|----------|-------------| -yolov4 | 32 | FP16 |285.218 | 0.741 | 0.506 | -yolov4 | 32 | INT8 |413.320 | 0.721 | 0.463 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| YOLOv4 | 32 | FP16 | 285.218 | 0.741 | 0.506 | +| YOLOv4 | 32 | INT8 | 413.320 | 0.721 | 0.463 | -## Reference +## References -DarkNet: -Pytorch-YOLOv4: +- [darknet](https://github.com/AlexeyAB/darknet) +- [pytorch-YOLOv4](https://github.com/Tianxiaomo/pytorch-YOLOv4) diff --git a/models/cv/object_detection/yolov4/ixrt/README.md b/models/cv/object_detection/yolov4/ixrt/README.md index 8b53a7527041d56cfb18873474fcd99d80036cf4..4bf799c87cfab4301df970873549e62b12502eac 100644 --- a/models/cv/object_detection/yolov4/ixrt/README.md +++ b/models/cv/object_detection/yolov4/ixrt/README.md @@ -1,12 +1,19 @@ -# YOLOv4 +# YOLOv4 (IxRT) -## Description +## Model Description YOLOv4 employs a two-step process, involving regression for bounding box positioning and classification for object categorization. it amalgamates past YOLO family research contributions with novel features like WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained cfg: +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,13 +25,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained cfg: -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -45,7 +45,7 @@ onnxsim data/yolov4.onnx data/yolov4_sim.onnx # Make sure the dataset path is "data/coco" ``` -## Inference +## Model Inference ### FP16 @@ -65,14 +65,14 @@ bash scripts/infer_yolov4_int8_accuracy.sh bash scripts/infer_yolov4_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | MAP@0.5 | | ------ | --------- | --------- | ------ | ------- | | YOLOv4 | 32 | FP16 | 303.27 | 0.730 | | YOLOv4 | 32 | INT8 | 682.14 | 0.608 | -## Reference +## References -DarkNet: -Pytorch-YOLOv4: +- [darknet](https://github.com/AlexeyAB/darknet) +- [pytorch-YOLOv4](https://github.com/Tianxiaomo/pytorch-YOLOv4) diff --git a/models/cv/object_detection/yolov5/igie/README.md b/models/cv/object_detection/yolov5/igie/README.md index f71018d449f2f21c2621f50da02bc5db04fd28ec..07f8ef3cffecccb7b413f678171f3a3f3e1cc0c7 100644 --- a/models/cv/object_detection/yolov5/igie/README.md +++ b/models/cv/object_detection/yolov5/igie/README.md @@ -1,12 +1,18 @@ -# YOLOv5-m +# YOLOv5-m (IGIE) -## Description +## Model Description The YOLOv5 architecture is designed for efficient and accurate object detection tasks in real-time scenarios. It employs a single convolutional neural network to simultaneously predict bounding boxes and class probabilities for multiple objects within an image. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -33,7 +33,7 @@ python3 export.py --weight yolov5m.pt --output yolov5m.onnx onnxsim yolov5m.onnx yolov5m_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -57,9 +57,9 @@ bash scripts/infer_yolov5_int8_accuracy.sh bash scripts/infer_yolov5_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|----------|-------------| -YOLOv5m | 32 | FP16 | 533.53 | 0.639 | 0.451 | -YOLOv5m | 32 | INT8 | 969.53 | 0.624 | 0.428 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|---------|-----------|-----------|--------|---------|--------------| +| YOLOv5m | 32 | FP16 | 533.53 | 0.639 | 0.451 | +| YOLOv5m | 32 | INT8 | 969.53 | 0.624 | 0.428 | diff --git a/models/cv/object_detection/yolov5/ixrt/README.md b/models/cv/object_detection/yolov5/ixrt/README.md index fd4e7e0b6ab22676d278d8aff82fca9e446877db..57ebca968de599e80270716cadd3e2718408831c 100644 --- a/models/cv/object_detection/yolov5/ixrt/README.md +++ b/models/cv/object_detection/yolov5/ixrt/README.md @@ -1,12 +1,21 @@ -# YOLOv5-m +# YOLOv5-m (IxRT) -## Description +## Model Description The YOLOv5 architecture is designed for efficient and accurate object detection tasks in real-time scenarios. It employs a single convolutional neural network to simultaneously predict bounding boxes and class probabilities for multiple objects within an image. The YOLOV5m is a medium-sized model. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. 
+ +- val2017: Path/To/val2017/*.jpg +- annotations: Path/To/annotations/instances_val2017.json + +### Install Dependencies ```bash # Install libGL @@ -18,15 +27,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - -- 图片目录: Path/To/val2017/*.jpg -- 标注文件目录: Path/To/annotations/instances_val2017.json - ### Model Conversion ```bash @@ -45,7 +45,7 @@ python3 export.py --weights yolov5m.pt --include onnx --opset 11 --batch-size 32 mv yolov5m.onnx /Path/to/checkpoints ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/yolov5/ixrt @@ -75,9 +75,9 @@ bash scripts/infer_yolov5_int8_accuracy.sh bash scripts/infer_yolov5_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|----------|-------------| -YOLOv5 | 32 | FP16 | 680.93 | 0.637 | 0.447 | -YOLOv5 | 32 | INT8 | 1328.50 | 0.627 | 0.425 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| YOLOv5 | 32 | FP16 | 680.93 | 0.637 | 0.447 | +| YOLOv5 | 32 | INT8 | 1328.50 | 0.627 | 0.425 | diff --git a/models/cv/object_detection/yolov5s/ixrt/README.md b/models/cv/object_detection/yolov5s/ixrt/README.md index cf048372b410f658c14f65a3ab03605e9cf29529..b4082db087acf23c286684fc5591a7f90a9efc4c 100755 --- a/models/cv/object_detection/yolov5s/ixrt/README.md +++ b/models/cv/object_detection/yolov5s/ixrt/README.md @@ -1,12 +1,18 @@ -# YOLOv5s +# YOLOv5s (IxRT) -## Description +## Model Description The YOLOv5 architecture is designed for efficient and accurate object detection tasks in real-time scenarios. It employs a single convolutional neural network to simultaneously predict bounding boxes and class probabilities for multiple objects within an image. The YOLOV5s is a tiny model. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash @@ -44,7 +44,7 @@ mv yolov5s.onnx ../checkpoints popd ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/yolov5s/ixrt @@ -74,9 +74,9 @@ bash scripts/infer_yolov5s_int8_accuracy.sh bash scripts/infer_yolov5s_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|----------|-------------| -YOLOv5s | 32 | FP16 | 1112.66 | 0.565 | 0.370 | -YOLOv5s | 32 | INT8 | 2440.54 | 0.557 | 0.351 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|---------|-----------|-----------|---------|---------|--------------| +| YOLOv5s | 32 | FP16 | 1112.66 | 0.565 | 0.370 | +| YOLOv5s | 32 | INT8 | 2440.54 | 0.557 | 0.351 | diff --git a/models/cv/object_detection/yolov6/igie/README.md b/models/cv/object_detection/yolov6/igie/README.md index af38205212174d5c01ad44d76cf16e1a61d56a41..4d2982ac59e65f425bf39a7f04bae791a185dfaf 100644 --- a/models/cv/object_detection/yolov6/igie/README.md +++ b/models/cv/object_detection/yolov6/igie/README.md @@ -1,12 +1,18 @@ -# YOLOv6 +# YOLOv6 (IGIE) -## Description +## Model Description YOLOv6 integrates cutting-edge object detection advancements from industry and academia, incorporating recent innovations in network design, training strategies, testing techniques, quantization, and optimization methods. This culmination results in a suite of deployment-ready networks, accommodating varied use cases across different scales. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 deploy/ONNX/export_onnx.py --weights ../yolov6s.pt --img 640 --dynamic-b cd .. ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,12 +53,12 @@ bash scripts/infer_yolov6_fp16_accuracy.sh bash scripts/infer_yolov6_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ----------|-----------|----------|----------|----------|-------------| -yolov6 | 32 | FP16 | 994.902 | 0.617 | 0.448 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| YOLOv6 | 32 | FP16 | 994.902 | 0.617 | 0.448 | -## Reference +## References -YOLOv6: +- [YOLOv6](https://github.com/meituan/YOLOv6) diff --git a/models/cv/object_detection/yolov6/ixrt/README.md b/models/cv/object_detection/yolov6/ixrt/README.md index 5d0acbcd26ded23b9414019898dad5454c62e921..3df05941cbe257afb6da2bb53d9583c23aecceac 100644 --- a/models/cv/object_detection/yolov6/ixrt/README.md +++ b/models/cv/object_detection/yolov6/ixrt/README.md @@ -1,12 +1,18 @@ -# YOLOv6 +# YOLOv6 (IxRT) -## Description +## Model Description YOLOv6 integrates cutting-edge object detection advancements from industry and academia, incorporating recent innovations in network design, training strategies, testing techniques, quantization, and optimization methods. This culmination results in a suite of deployment-ready networks, accommodating varied use cases across different scales. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ```bash # get yolov6s.pt wget https://github.com/meituan/YOLOv6/releases/download/0.4.0/yolov6s.pt @@ -46,7 +46,7 @@ mv ../yolov6s.onnx ../data/ popd ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -70,13 +70,13 @@ bash scripts/infer_yolov6_int8_accuracy.sh bash scripts/infer_yolov6_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | MAP@0.5 | -| ------ | --------- | --------- | -------- | ------- | +|--------|-----------|-----------|----------|---------| | YOLOv6 | 32 | FP16 | 1107.511 | 0.617 | | YOLOv6 | 32 | INT8 | 2080.475 | 0.583 | -## Reference +## References -YOLOv6: +- [YOLOv6](https://github.com/meituan/YOLOv6) diff --git a/models/cv/object_detection/yolov7/igie/README.md b/models/cv/object_detection/yolov7/igie/README.md index d23f22d644f6fdc9cb0c4dbde87008f55cf3bf82..bfad733eef843441c4a337f735762e37d3093da4 100644 --- a/models/cv/object_detection/yolov7/igie/README.md +++ b/models/cv/object_detection/yolov7/igie/README.md @@ -1,12 +1,18 @@ -# YOLOv7 +# YOLOv7 (IGIE) -## Description +## Model Description YOLOv7 is a state-of-the-art real-time object detector that surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -37,7 +37,7 @@ python3 export.py --weights ../yolov7.pt --simplify --img-size 640 640 --dynamic cd .. ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -61,13 +61,13 @@ bash scripts/infer_yolov7_int8_accuracy.sh bash scripts/infer_yolov7_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|----------|----------|-------------| -yolov7 | 32 | FP16 |341.681 | 0.695 | 0.509 | -yolov7 | 32 | INT8 |669.783 | 0.685 | 0.473 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| YOLOv7 | 32 | FP16 | 341.681 | 0.695 | 0.509 | +| YOLOv7 | 32 | INT8 | 669.783 | 0.685 | 0.473 | -## Reference +## References -YOLOv7: +- [YOLOv7](https://github.com/WongKinYiu/yolov7) diff --git a/models/cv/object_detection/yolov7/ixrt/README.md b/models/cv/object_detection/yolov7/ixrt/README.md index 2174c199b93a12950e97eeea9df36cb9d57afa20..b3b62a6a9b9e730f0ed8a1f6d7d48a344ca24018 100644 --- a/models/cv/object_detection/yolov7/ixrt/README.md +++ b/models/cv/object_detection/yolov7/ixrt/README.md @@ -1,12 +1,21 @@ -# YOLOv7 +# YOLOv7 (IxRT) -## Description +## Model Description YOLOv7 is an object detection model based on the YOLO (You Only Look Once) series. It is an improved version of YOLOv5 developed by the Ultralytics team. 
YOLOv7 aims to enhance the performance and efficiency of object detection through a series of improvements including network architecture, training strategies, and data augmentation techniques, in order to achieve more accurate and faster object detection. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +- val2017: Path/To/val2017/*.jpg +- annotations: Path/To/annotations/instances_val2017.json + +### Install Dependencies ```bash # Install libGL @@ -18,15 +27,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - -- 图片目录: Path/To/val2017/*.jpg -- 标注文件目录: Path/To/annotations/instances_val2017.json - ### Model Conversion ```bash @@ -38,7 +38,7 @@ mkdir /Your_Projects/To/checkpoints mv yolov7.onnx /Path/to/checkpoints/yolov7m.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=/Path/to/yolov7/ixrt @@ -68,9 +68,13 @@ bash scripts/infer_yolov7_int8_accuracy.sh bash scripts/infer_yolov7_int8_performance.sh ``` -## Results +## Model Results + +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|--------|---------|--------------| +| YOLOv7 | 32 | FP16 | 375.41 | 0.693 | 0.506 | +| YOLOv7 | 32 | INT8 | 816.71 | 0.688 | 0.471 | + +## References -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|---------|----------|-------------| -YOLOv7 | 32 | FP16 | 375.41 | 0.693 | 0.506 | -YOLOv7 | 32 | INT8 | 816.71 | 0.688 | 0.471 | +- [YOLOv7](https://github.com/WongKinYiu/yolov7) diff --git a/models/cv/object_detection/yolov8/igie/README.md b/models/cv/object_detection/yolov8/igie/README.md index b3ee9d5ef9744d44c6b64c8f7b8cc4059efc47ea..eff4ecdf2209d0e0c96463fc58c7baa33f5b7ce4 100644 --- a/models/cv/object_detection/yolov8/igie/README.md +++ b/models/cv/object_detection/yolov8/igie/README.md @@ -1,12 +1,18 @@ -# YOLOv8 +# YOLOv8 (IGIE) -## Description +## Model Description Yolov8 combines speed and accuracy in real-time object detection tasks. With a focus on simplicity and efficiency, this model employs a single neural network to make predictions, enabling fast and accurate identification of objects in images or video streams. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,19 +24,13 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- ### Model Conversion ```bash python3 export.py --weight yolov8s.pt --batch 32 ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -54,9 +54,9 @@ bash scripts/infer_yolov8_int8_accuracy.sh bash scripts/infer_yolov8_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 |MAP@0.5:0.95 | ---------|-----------|----------|----------|----------|-------------| -yolov8 | 32 | FP16 | 1002.98 | 0.617 | 0.449 | -yolov8 | 32 | INT8 | 1392.29 | 0.604 | 0.429 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | MAP@0.5:0.95 | +|--------|-----------|-----------|---------|---------|--------------| +| YOLOv8 | 32 | FP16 | 1002.98 | 0.617 | 0.449 | +| YOLOv8 | 32 | INT8 | 1392.29 | 0.604 | 0.429 | diff --git a/models/cv/object_detection/yolov8/ixrt/README.md b/models/cv/object_detection/yolov8/ixrt/README.md index 6ed7ea5364fe0d7eeecf65d31d75ca47ecb52676..09f2cc99b963baec3630c0e21b725c2f2fd84948 100644 --- a/models/cv/object_detection/yolov8/ixrt/README.md +++ b/models/cv/object_detection/yolov8/ixrt/README.md @@ -1,12 +1,18 @@ -# YOLOv8 +# YOLOv8 (IxRT) -## Description +## Model Description Yolov8 combines speed and accuracy in real-time object detection tasks. With a focus on simplicity and efficiency, this model employs a single neural network to make predictions, enabling fast and accurate identification of objects in images or video streams. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -33,7 +33,7 @@ python3 export.py --weight yolov8.pt --batch 32 onnxsim yolov8.onnx ./checkpoints/yolov8.onnx ``` -## Inference +## Model Inference ```bash export PROJ_DIR=./ @@ -60,9 +60,9 @@ bash scripts/infer_yolov8_int8_accuracy.sh bash scripts/infer_yolov8_int8_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | MAP@0.5 | -| ------ | --------- | --------- | -------- | ------- | +|--------|-----------|-----------|----------|---------| | YOLOv8 | 32 | FP16 | 1511.366 | 0.525 | | YOLOv8 | 32 | INT8 | 1841.017 | 0.517 | diff --git a/models/cv/object_detection/yolov9/igie/README.md b/models/cv/object_detection/yolov9/igie/README.md index 9b3313e4ce857d4f689ebdd9264741ac7dc84bec..5a11d80b4cf6b4bf7c3d43dbded6694ce8da21f7 100644 --- a/models/cv/object_detection/yolov9/igie/README.md +++ b/models/cv/object_detection/yolov9/igie/README.md @@ -1,28 +1,28 @@ -# YOLOv9 +# YOLOv9 (IGIE) -## Description +## Model Description YOLOv9 represents a major leap in real-time object detection by introducing innovations like Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN), significantly improving efficiency, accuracy, and adaptability. Developed by an open-source team and building on the YOLOv5 codebase, it sets new benchmarks on the MS COCO dataset. YOLOv9's architecture effectively addresses information loss in deep neural networks, enhancing learning capacity and ensuring higher detection accuracy. 
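+
+Since the conversion step below ultimately just exports this checkpoint to ONNX, the following minimal sketch shows what that export looks like with the Ultralytics Python API. It is illustrative only and assumes `export.py` wraps the Ultralytics exporter; the image size and opset used here are typical defaults, not values taken from this repository.
+
+```python
+from ultralytics import YOLO
+
+# Hedged sketch of the ONNX export, not a replacement for the repo's export.py.
+model = YOLO("yolov9s.pt")                                   # checkpoint from "Prepare Resources"
+model.export(format="onnx", imgsz=640, batch=32, opset=13)   # writes an ONNX file next to the weights
+```
+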
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +### Install Dependencies ```bash pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - ## Model Conversion ```bash python3 export.py --weight yolov9s.pt --batch 32 ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -37,12 +37,12 @@ bash scripts/infer_yolov9_fp16_accuracy.sh bash scripts/infer_yolov9_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | | ------ | --------- | --------- | ------ | ------- | ------------ | | YOLOv9 | 32 | FP16 | 814.42 | 0.625 | 0.464 | -## Reference +## References -YOLOv9: +- [YOLOv9](https://docs.ultralytics.com/models/yolov9) diff --git a/models/cv/object_detection/yolox/igie/README.md b/models/cv/object_detection/yolox/igie/README.md index 32c3d6389e248739684608afa69fc4650009f7ec..598c97b222fb5420f7af48eec09436aac75b6056 100644 --- a/models/cv/object_detection/yolox/igie/README.md +++ b/models/cv/object_detection/yolox/igie/README.md @@ -1,12 +1,18 @@ -# YOLOX +# YOLOX (IGIE) -## Description +## Model Description YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -21,27 +27,21 @@ pip3 install -r requirements.txt source /opt/rh/devtoolset-7/enable ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash # install yolox git clone https://github.com/Megvii-BaseDetection/YOLOX.git -cd YOLOX +cd YOLOX/ python3 setup.py develop # export onnx model python3 tools/export_onnx.py -c ../yolox_m.pth -o 13 -n yolox-m --input input --output output --dynamic --output-name ../yolox.onnx -cd .. +cd ../ ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -65,13 +65,13 @@ bash scripts/infer_yolox_int8_accuracy.sh bash scripts/infer_yolox_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 | ---------|-----------|----------|----------|----------| -yolox | 32 | FP16 |409.517 | 0.656 | -yolox | 32 | INT8 |844.991 | 0.637 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | +|-------|-----------|-----------|---------|---------| +| YOLOX | 32 | FP16 | 409.517 | 0.656 | +| YOLOX | 32 | INT8 | 844.991 | 0.637 | -## Reference +## References -YOLOX: +- [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) diff --git a/models/cv/object_detection/yolox/ixrt/README.md b/models/cv/object_detection/yolox/ixrt/README.md index 0a19d5bf2873f20b501ab0bce6eb89a8f072c387..9104b2e85604832c53ecf03cc2ceafdd0be3f42e 100644 --- a/models/cv/object_detection/yolox/ixrt/README.md +++ b/models/cv/object_detection/yolox/ixrt/README.md @@ -1,13 +1,19 @@ -# YOLOX +# YOLOX (IxRT) -## Description +## Model Description YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our [report on Arxiv](https://arxiv.org/abs/2107.08430). -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. 
+ +### Install Dependencies ```bash # Install libGL @@ -19,12 +25,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 tools/export_onnx.py --output-name ../yolox.onnx -n yolox-m -c yolox_m.p popd ``` -## Inference +## Model Inference ```bash # Set DATASETS_DIR @@ -73,13 +73,13 @@ bash scripts/infer_yolox_int8_accuracy.sh bash scripts/infer_yolox_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |MAP@0.5 | ---------|-----------|----------|----------|----------| -yolox | 32 | FP16 | 424.53 | 0.656 | -yolox | 32 | INT8 | 832.16 | 0.647 | +| Model | BatchSize | Precision | FPS | MAP@0.5 | +|-------|-----------|-----------|--------|---------| +| YOLOX | 32 | FP16 | 424.53 | 0.656 | +| YOLOX | 32 | INT8 | 832.16 | 0.647 | -## Reference +## References -YOLOX: +- [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX) diff --git a/models/cv/ocr/kie_layoutxlm/igie/README.md b/models/cv/ocr/kie_layoutxlm/igie/README.md index 75f799098bf9a90d8184ff7819029e091c3bf53c..4ef5301cdd19d56608bd2e47e0c6b0819ff4cb72 100644 --- a/models/cv/ocr/kie_layoutxlm/igie/README.md +++ b/models/cv/ocr/kie_layoutxlm/igie/README.md @@ -1,25 +1,24 @@ -# LayoutXLM +# LayoutXLM (IGIE) -## Description +## Model Description LayoutXLM is a groundbreaking multimodal pre-trained model for multilingual document understanding, achieving exceptional performance by integrating text, layout, and image data. -## Setup +## Model Preparation -```shell -pip3 install -r requirements.txt -``` - -## Download +### Prepare Resources Pretrained model: Dataset: to download the XFUND_zh dataset. -## Model Conversion +```bash +pip3 install -r requirements.txt +``` -```shell +## Model Conversion +```bash tar -xf ser_vi_layoutxlm_xfund_pretrained.tar tar -xf XFUND.tar @@ -35,13 +34,13 @@ python3 tools/export_model.py -c configs/kie/vi_layoutxlm/ser_vi_layoutxlm_xfund # Export the inference model to onnx model paddle2onnx --model_dir ./inference/ser_vi_layoutxlm --model_filename inference.pdmodel --params_filename inference.pdiparams --save_file ../kie_ser.onnx --opset_version 11 --enable_onnx_checker True -cd .. +cd ../ # Use onnxsim optimize onnx model onnxsim kie_ser.onnx kie_ser_opt.onnx ``` -## Inference +## Model Inference ```shell export DATASETS_DIR=/Path/to/XFUND/ @@ -49,19 +48,19 @@ export DATASETS_DIR=/Path/to/XFUND/ ### FP16 -```shell +```bash # Accuracy bash scripts/infer_kie_layoutxlm_fp16_accuracy.sh # Performance bash scripts/infer_kie_layoutxlm_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | Precision | FPS | Hmean | | ------- | --------- | --------- | ------ | ------ | | Kie_ser | 8 | FP16 | 107.65 | 93.61% | -## Reference +## References -PaddleOCR: +- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/kie/algorithm_kie_layoutxlm.md) diff --git a/models/cv/ocr/svtr/igie/README.md b/models/cv/ocr/svtr/igie/README.md index 47ea9d783d6f9fba002924cd263e371b5340145f..790e3799f18c6ea8e40ebf78677ff5c0031e9266 100644 --- a/models/cv/ocr/svtr/igie/README.md +++ b/models/cv/ocr/svtr/igie/README.md @@ -1,9 +1,18 @@ -# SVTR -## Description +# SVTR (IGIE) + +## Model Description + SVTR proposes a single vision model for scene text recognition. This model completely abandons sequence modeling within the patch-wise image tokenization framework. 
Under the premise of competitive accuracy, the model has fewer parameters and faster speed. -## Setup -```shell +## Model Preparation + +### Prepare Resources + +Pretrained model: + +Dataset: to download the lmdb evaluation datasets. + +```bash # Install libGL ## CentOS yum install -y mesa-libGL @@ -13,18 +22,14 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -## Download -Pretrained model: - -Dataset: to download the lmdb evaluation datasets. - ## Model Conversion -```shell + +```bash tar -xf rec_svtr_tiny_none_ctc_en_train.tar git clone -b release/2.6 https://github.com/PaddlePaddle/PaddleOCR.git --depth=1 -cd PaddleOCR +cd PaddleOCR/ # Export the trained model into inference model python3 tools/export_model.py -c ../rec_svtr_tiny_6local_6global_stn_en.yml -o Global.pretrained_model=../rec_svtr_tiny_none_ctc_en_train/best_accuracy Global.save_inference_dir=./inference/rec_svtr_tiny @@ -32,28 +37,33 @@ python3 tools/export_model.py -c ../rec_svtr_tiny_6local_6global_stn_en.yml -o G # Export the inference model to onnx model paddle2onnx --model_dir ./inference/rec_svtr_tiny --model_filename inference.pdmodel --params_filename inference.pdiparams --save_file ../SVTR.onnx --opset_version 13 --enable_onnx_checker True -cd .. +cd ../ # Use onnxsim optimize onnx model onnxsim SVTR.onnx SVTR_opt.onnx -``` +``` + +## Model Inference -## Inference -```shell +```bash export DATASETS_DIR=/Path/to/lmdb_evaluation/ ``` + ### FP16 -```shell + +```bash # Accuracy bash scripts/infer_svtr_fp16_accuracy.sh # Performance bash scripts/infer_svtr_fp16_performance.sh ``` -## Results -Model |BatchSize |Precision |FPS |Acc | ---------|-----------|----------|----------|----------| -SVTR | 32 | FP16 | 4936.47 | 88.29% | +## Model Results + +| Model | BatchSize | Precision | FPS | Acc | +|-------|-----------|-----------|---------|--------| +| SVTR | 32 | FP16 | 4936.47 | 88.29% | + +## References -## Reference -PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/text_recognition/algorithm_rec_svtr.md +- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/main/docs/algorithm/text_recognition/algorithm_rec_svtr.md) diff --git a/models/cv/pose_estimation/hrnetpose/igie/README.md b/models/cv/pose_estimation/hrnetpose/igie/README.md index 09771216b84e9066329ecdeb2c14c95c10b21a50..7833e893a33b68d0cb0135ac9fbb2c99c090774e 100644 --- a/models/cv/pose_estimation/hrnetpose/igie/README.md +++ b/models/cv/pose_estimation/hrnetpose/igie/README.md @@ -1,12 +1,22 @@ -# HRNetPose +# HRNetPose (IGIE) -## Description +## Model Description HRNetPose (High-Resolution Network for Pose Estimation) is a high-performance human pose estimation model introduced in the paper "Deep High-Resolution Representation Learning for Human Pose Estimation". It is designed to address the limitations of traditional methods by maintaining high-resolution feature representations throughout the network, enabling more accurate detection of human keypoints. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- -```bash -wget https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w32_coco_256x192-c78dce93_20200708.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight hrnet_w32_coco_256x192-c78dce93_20200708.pth --cfg td onnxsim hrnetpose.onnx hrnetpose_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,12 +53,12 @@ bash scripts/infer_hrnetpose_fp16_accuracy.sh bash scripts/infer_hrnetpose_fp16_performance.sh ``` -## Results +## Model Results -| Model | BatchSize | Input Shape | Precision | FPS | mAP@0.5(%) | -| :---------: | :-------: | :---------: | :-------: | :-------: | :--------: | -| HRNetPose | 32 | 252x196 | FP16 | 1831.20 | 0.926 | +| Model | BatchSize | Input Shape | Precision | FPS | mAP@0.5(%) | +|-----------|-----------|-------------|-----------|---------|------------| +| HRNetPose | 32 | 252x196 | FP16 | 1831.20 | 0.926 | -## Reference +## References -mmpose: +- [mmpose](https://github.com/open-mmlab/mmpose.git) diff --git a/models/cv/pose_estimation/lightweight_openpose/ixrt/README.md b/models/cv/pose_estimation/lightweight_openpose/ixrt/README.md index 4fa8923c7770e8cbc2746bbdcaa9db28116b9956..d335f055e12f47b64217b1f9c7d18079a6d64f8b 100644 --- a/models/cv/pose_estimation/lightweight_openpose/ixrt/README.md +++ b/models/cv/pose_estimation/lightweight_openpose/ixrt/README.md @@ -1,12 +1,21 @@ -# Lightweight OpenPose +# Lightweight OpenPose (IxRT) -## Description +## Model Description -This work heavily optimizes the OpenPose approach to reach real-time inference on CPU with negliable accuracy drop. It detects a skeleton (which consists of keypoints and connections between them) to identify human poses for every person inside the image. The pose may contain up to 18 keypoints: ears, eyes, nose, neck, shoulders, elbows, wrists, hips, knees, and ankles. On COCO 2017 Keypoint Detection validation set this code achives 40% AP for the single scale inference (no flip or any post-processing done). +This work heavily optimizes the OpenPose approach to reach real-time inference on CPU with negliable accuracy drop. It +detects a skeleton (which consists of keypoints and connections between them) to identify human poses for every person +inside the image. The pose may contain up to 18 keypoints: ears, eyes, nose, neck, shoulders, elbows, wrists, hips, +knees, and ankles. On COCO 2017 Keypoint Detection validation set this code achives 40% AP for the single scale +inference (no flip or any post-processing done). 
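+
+After the conversion step below produces `checkpoints/lightweight_openpose.onnx`, the exported graph can be sanity-checked outside of IxRT with ONNX Runtime. This is only an illustrative sketch; the interpretation of the outputs as keypoint heatmaps and part-affinity fields is an assumption, not something guaranteed by the repo's scripts.
+
+```python
+import numpy as np
+import onnxruntime as ort
+
+# Load the exported graph and run it once on random data to inspect the I/O shapes.
+sess = ort.InferenceSession("checkpoints/lightweight_openpose.onnx")
+inp = sess.get_inputs()[0]
+shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # substitute 1 for any dynamic dims
+dummy = np.random.rand(*shape).astype(np.float32)
+for meta, out in zip(sess.get_outputs(), sess.run(None, {inp.name: dummy})):
+    print(meta.name, out.shape)  # expected: keypoint heatmaps and part-affinity fields
+```
+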
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- dataset: +- checkpoints: + +### Install Dependencies ```bash # Install libGL @@ -18,10 +27,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download -- dataset: http://cocodataset.org/#download -- checkpoints: https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth - ### Model Conversion ```bash @@ -35,7 +40,7 @@ mkdir -p checkpoints onnxsim ./lightweight-human-pose-estimation.pytorch/human-pose-estimation.onnx ./checkpoints/lightweight_openpose.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco_pose/ @@ -51,12 +56,12 @@ bash scripts/infer_lightweight_openpose_fp16_accuracy.sh bash scripts/infer_lightweight_openpose_fp16_performance.sh ``` -## Results +## Model Results -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | -----------|-----------|----------|----------|----------|---------------| -Lightweight OpenPose | 1 | FP16 | 21030.833 | 0.660 | 0.401 | +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|----------------------|-----------|-----------|-----------|---------|--------------| +| Lightweight OpenPose | 1 | FP16 | 21030.833 | 0.660 | 0.401 | -## Reference +## References -https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch \ No newline at end of file +- [lightweight-human-pose-estimation](https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch) diff --git a/models/cv/pose_estimation/rtmpose/igie/README.md b/models/cv/pose_estimation/rtmpose/igie/README.md index 9497dcfdb4089c7d2eae913b64f7e8945339a4fe..9fdc499679c95d19c1d2c57f0d99071918559124 100644 --- a/models/cv/pose_estimation/rtmpose/igie/README.md +++ b/models/cv/pose_estimation/rtmpose/igie/README.md @@ -1,12 +1,22 @@ -# RTMPose +# RTMPose (IGIE) -## Description +## Model Description RTMPose, a state-of-the-art framework developed by Shanghai AI Laboratory, excels in real-time multi-person pose estimation by integrating an innovative model architecture with the efficiency of the MMPose foundation. The framework's architecture is meticulously designed to enhance performance and reduce latency, making it suitable for a variety of applications where real-time analysis is crucial. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +```bash +wget https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-m_simcc-aic-coco_pt-aic-coco_420e-256x192-63eb25f7_20230126.pth +``` + +### Install Dependencies ```bash # Install libGL @@ -18,16 +28,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. 
- -```bash -wget https://download.openmmlab.com/mmpose/v1/projects/rtmposev1/rtmpose-m_simcc-aic-coco_pt-aic-coco_420e-256x192-63eb25f7_20230126.pth -``` - ### Model Conversion ```bash @@ -38,7 +38,7 @@ python3 export.py --weight rtmpose-m_simcc-aic-coco_pt-aic-coco_420e-256x192-63e onnxsim rtmpose.onnx rtmpose_opt.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/coco/ @@ -53,13 +53,12 @@ bash scripts/infer_rtmpose_fp16_accuracy.sh bash scripts/infer_rtmpose_fp16_performance.sh ``` -## Results - -Model |BatchSize |Precision |FPS |IOU@0.5 |IOU@0.5:0.95 | -----------|-----------|----------|----------|----------|---------------| -RTMPose | 32 | FP16 | 2313.33 | 0.936 | 0.773 | +## Model Results +| Model | BatchSize | Precision | FPS | IOU@0.5 | IOU@0.5:0.95 | +|---------|-----------|-----------|---------|---------|--------------| +| RTMPose | 32 | FP16 | 2313.33 | 0.936 | 0.773 | -## Reference +## References -mmpose: +- [mmpose](https://github.com/open-mmlab/mmpose.git) diff --git a/models/cv/pose_estimation/rtmpose/ixrt/README.md b/models/cv/pose_estimation/rtmpose/ixrt/README.md index 51a832a181156ee205e6e1fbd70b40c612b04b18..a11d6e088aea61152a3f2d4c6c0372bc4b7539ce 100644 --- a/models/cv/pose_estimation/rtmpose/ixrt/README.md +++ b/models/cv/pose_estimation/rtmpose/ixrt/README.md @@ -1,12 +1,18 @@ -# RTMPose +# RTMPose (IxRT) -## Description +## Model Description RTMPose, a state-of-the-art framework developed by Shanghai AI Laboratory, excels in real-time multi-person pose estimation by integrating an innovative model architecture with the efficiency of the MMPose foundation. The framework's architecture is meticulously designed to enhance performance and reduce latency, making it suitable for a variety of applications where real-time analysis is crucial. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +Dataset: to download the validation dataset. + +### Install Dependencies ```bash # Install libGL @@ -18,12 +24,6 @@ apt install -y libgl1-mesa-glx pip3 install -r requirements.txt ``` -### Download - -Pretrained model: - -Dataset: to download the validation dataset. - ## Model Conversion ```bash @@ -39,7 +39,7 @@ python3 export.py --weight data/rtmpose/rtmpose-m_simcc-aic-coco_pt-aic-coco_420 onnxsim data/rtmpose/rtmpose.onnx data/rtmpose/rtmpose_opt.onnx ``` -## Inference +## Model Inference ### FP16 diff --git a/models/multimodal/diffusion_model/stable-diffusion/README.md b/models/multimodal/diffusion_model/stable-diffusion/README.md index 93bc5f3dfc67faf74099d11c2b1e305f96622440..4d6de94fd8082d52a921c20a51262c7414f1312c 100644 --- a/models/multimodal/diffusion_model/stable-diffusion/README.md +++ b/models/multimodal/diffusion_model/stable-diffusion/README.md @@ -4,9 +4,19 @@ Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Download the runwayml/stable-diffusion-v1-5 from [huggingface page](https://huggingface.co/runwayml/stable-diffusion-v1-5). + +```bash +cd stable-diffusion +mkdir -p data/ +ln -s /path/to/stable-diffusion-v1-5 ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -19,23 +29,13 @@ pip3 install http://files.deepspark.org.cn:880/deepspark/add-ons/diffusers-0.31. pip3 install -r requirements.txt ``` -### Download - -Download the runwayml/stable-diffusion-v1-5 from [huggingface page](https://huggingface.co/runwayml/stable-diffusion-v1-5). 
- -```bash -cd stable-diffusion -mkdir -p data/ -ln -s /path/to/stable-diffusion-v1-5 ./data/ -``` - -## Inference +## Model Inference ```bash export ENABLE_IXFORMER_INFERENCE=1 python3 demo.py ``` -## Reference +## References - [diffusers](https://github.com/huggingface/diffusers) diff --git a/models/multimodal/vision_language_model/chameleon_7b/vllm/README.md b/models/multimodal/vision_language_model/chameleon_7b/vllm/README.md index 879367dd8f3b0b3a069ddfb684db4d9967eecb3f..b98b6c75c48fc0be763fbe493e3bee5a74f3aa59 100755 --- a/models/multimodal/vision_language_model/chameleon_7b/vllm/README.md +++ b/models/multimodal/vision_language_model/chameleon_7b/vllm/README.md @@ -1,12 +1,21 @@ # Chameleon -## Description +## Model Description Chameleon, an AI system that mitigates these limitations by augmenting LLMs with plug-and-play modules for compositional reasoning. Chameleon synthesizes programs by composing various tools (e.g., LLMs, off-the-shelf vision models, web search engines, Python functions, and heuristic-based modules) for accomplishing complex reasoning tasks. At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response. We showcase the effectiveness of Chameleon on two multi-modal knowledge-intensive reasoning tasks: ScienceQA and TabMWP. Chameleon, powered by GPT-4, achieves an 86.54% overall accuracy on ScienceQA, improving the best published few-shot result by 11.37%. On TabMWP, GPT-4-powered Chameleon improves the accuracy by 17.0%, lifting the state of the art to 98.78%. Our analysis also shows that the GPT-4-powered planner exhibits more consistent and rational tool selection via inferring potential constraints from instructions, compared to a ChatGPT-powered planner. -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +# Download model from the website and make sure the model's path is "data/chameleon-7b" +mkdir data +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -18,16 +27,7 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -# Download model from the website and make sure the model's path is "data/chameleon-7b" -mkdir data -``` - -## Inference +## Model Inference ```bash export VLLM_ASSETS_CACHE=../vllm/ diff --git a/models/multimodal/vision_language_model/clip/ixformer/README.md b/models/multimodal/vision_language_model/clip/ixformer/README.md index e8b90f57f627546dae5e847660cf1d2b97c74fd2..3a8a9c9d0c76eb464a895b49b0c3ddc6918ea822 100644 --- a/models/multimodal/vision_language_model/clip/ixformer/README.md +++ b/models/multimodal/vision_language_model/clip/ixformer/README.md @@ -1,12 +1,22 @@ # CLIP (IxFormer) -## Description +## Model Description CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task, similarly to the zero-shot capabilities of GPT-2 and 3. We found CLIP matches the performance of the original ResNet50 on ImageNet zero-shot without using any of the original 1.28M labeled examples, overcoming several major challenges in computer vision. 
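+
+The zero-shot behaviour described above can be illustrated with a few lines of Hugging Face `transformers` code. This sketch is not one of the repository's test scripts: the image file and label prompts are placeholders, and the model path follows the `data/clip-vit-base-patch32` layout prepared below.
+
+```python
+from PIL import Image
+from transformers import CLIPModel, CLIPProcessor
+
+model = CLIPModel.from_pretrained("data/clip-vit-base-patch32")
+processor = CLIPProcessor.from_pretrained("data/clip-vit-base-patch32")
+
+labels = ["a photo of a cat", "a photo of a dog"]            # candidate captions
+inputs = processor(text=labels, images=Image.open("example.jpg"),
+                   return_tensors="pt", padding=True)
+
+probs = model(**inputs).logits_per_image.softmax(dim=-1)     # image-to-text similarity
+print(dict(zip(labels, probs[0].tolist())))
+```
+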
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: Go to the website to find the pre-trained model you need. Here, we choose clip-vit-base-patch32. + +```bash +# Download model from the website and make sure the model's path is "data/clip-vit-base-patch32" +mkdir -p data +unzip clip-vit-base-patch32.zip -d data/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -20,17 +30,7 @@ apt install -y libgl1-mesa-glx pip3 install -U transformers==4.27.1 ``` -### Download - -Pretrained model: Go to the website to find the pre-trained model you need. Here, we choose clip-vit-base-patch32. - -```bash -# Download model from the website and make sure the model's path is "data/clip-vit-base-patch32" -mkdir -p data -unzip clip-vit-base-patch32.zip -d data/ -``` - -## Run model +## Model Inference ### Test using the OpenAI interface diff --git a/models/multimodal/vision_language_model/fuyu_8b/vllm/README.md b/models/multimodal/vision_language_model/fuyu_8b/vllm/README.md index 7f2526b92a56f02ac9eccb4723a0891a2bf75780..c8992bd4a9beb39409a7cbb115806d60434ef837 100755 --- a/models/multimodal/vision_language_model/fuyu_8b/vllm/README.md +++ b/models/multimodal/vision_language_model/fuyu_8b/vllm/README.md @@ -1,14 +1,23 @@ # Fuyu-8B -## Description +## Model Description Fuyu-8B is a multi-modal text and image transformer trained by Adept AI. Architecturally, Fuyu is a vanilla decoder-only transformer - there is no image encoder. Image patches are instead linearly projected into the first layer of the transformer, bypassing the embedding lookup. We simply treat the transformer decoder like an image transformer (albeit with no pooling and causal attention). -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +# Download model from the website and make sure the model's path is "data/fuyu-8b" +mkdir data/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -20,16 +29,7 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -# Download model from the website and make sure the model's path is "data/fuyu-8b" -mkdir data -``` - -## Inference +## Model Inference ```bash export VLLM_ASSETS_CACHE=../vllm/ diff --git a/models/multimodal/vision_language_model/intern_vl/vllm/README.md b/models/multimodal/vision_language_model/intern_vl/vllm/README.md index c8a527798abc8aa315b316e53f3879dc87538e11..c7fdc256345f1446a632fe9e546817d4aee629a4 100644 --- a/models/multimodal/vision_language_model/intern_vl/vllm/README.md +++ b/models/multimodal/vision_language_model/intern_vl/vllm/README.md @@ -1,12 +1,22 @@ # InternVL2-4B -## Description +## Model Description InternVL2-4B is a large-scale multimodal model developed by WeTab AI, designed to handle a wide range of tasks involving both text and visual data. With 4 billion parameters, it is capable of understanding and generating complex patterns in data, making it suitable for applications such as image recognition, natural language processing, and multimodal learning. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd ${DeepSparkInference}/models/vision-language-understanding/Intern_VL/vllm +mkdir -p data/intern_vl +ln -s /path/to/InternVL2-4B ./data/intern_vl +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -23,17 +33,7 @@ pip3 install triton pip3 install ixformer ``` -### Download - -- Model: - -```bash -cd ${DeepSparkInference}/models/vision-language-understanding/Intern_VL/vllm -mkdir -p data/intern_vl -ln -s /path/to/InternVL2-4B ./data/intern_vl -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1 diff --git a/models/multimodal/vision_language_model/llava/vllm/README.md b/models/multimodal/vision_language_model/llava/vllm/README.md index 2ceeaed8182b2b75386647957d89b9eb0c2d6bba..d889579757866ad1f67ae15376cd5b00c6b7b1db 100644 --- a/models/multimodal/vision_language_model/llava/vllm/README.md +++ b/models/multimodal/vision_language_model/llava/vllm/README.md @@ -1,12 +1,21 @@ # LLava -## Description +## Model Description LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.The LLaVA-NeXT model was proposed in LLaVA-NeXT: Improved reasoning, OCR, and world knowledge by Haotian Liu, Chunyuan Li, Yuheng Li, Bo Li, Yuanhan Zhang, Sheng Shen, Yong Jae Lee. LLaVa-NeXT (also called LLaVa-1.6) improves upon LLaVa-1.5 by increasing the input image resolution and training on an improved visual instruction tuning dataset to improve OCR and common sense reasoning. -## Setup +## Model Preparation -### Install +### Prepare Resources + +-llava-v1.6-vicuna-7b-hf: + +```bash +# Download model from the website and make sure the model's path is "data/llava" +mkdir data/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -19,17 +28,7 @@ apt install -y libgl1-mesa-glx pip3 install transformers ``` -### Download - --llava-v1.6-vicuna-7b-hf: - -```bash -# Download model from the website and make sure the model's path is "data/llava" -mkdir data - -``` - -## Inference +## Model Inference ```bash export PT_SDPA_ENABLE_HEAD_DIM_PADDING=1 diff --git a/models/multimodal/vision_language_model/llava_next_video_7b/vllm/README.md b/models/multimodal/vision_language_model/llava_next_video_7b/vllm/README.md index 4fb09f4f6b85e27f1cd84b8c9833fd5aee08111e..584a36038c47021d4416d77881ebbf1ca9388d42 100755 --- a/models/multimodal/vision_language_model/llava_next_video_7b/vllm/README.md +++ b/models/multimodal/vision_language_model/llava_next_video_7b/vllm/README.md @@ -1,12 +1,21 @@ # LLaVA-Next-Video-7B -## Description +## Model Description LLaVA-Next-Video is an open-source chatbot trained by fine-tuning LLM on multimodal instruction-following data. The model is buit on top of LLaVa-NeXT by tuning on a mix of video and image data to achieves better video understanding capabilities. The videos were sampled uniformly to be 32 frames per clip. The model is a current SOTA among open-source models on VideoMME bench. 
Base LLM: lmsys/vicuna-7b-v1.5 -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +# Download model from the website and make sure the model's path is "data/LLaVA-NeXT-Video-7B-hf" +mkdir data/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -18,16 +27,7 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -# Download model from the website and make sure the model's path is "data/LLaVA-NeXT-Video-7B-hf" -mkdir data -``` - -## Inference +## Model Inference ```bash export VLLM_ASSETS_CACHE=../vllm/ diff --git a/models/multimodal/vision_language_model/minicpm_v_2/vllm/README.md b/models/multimodal/vision_language_model/minicpm_v_2/vllm/README.md index d2b2dd8689d18c2d2686560da07c6d362d3449cb..d5032beed53bfed69d553f65df0b52719e77d267 100644 --- a/models/multimodal/vision_language_model/minicpm_v_2/vllm/README.md +++ b/models/multimodal/vision_language_model/minicpm_v_2/vllm/README.md @@ -1,12 +1,22 @@ # MiniCPM V2 -## Description +## Model Description MiniCPM V2 is a compact and efficient language model designed for various natural language processing (NLP) tasks. Building on its predecessor, MiniCPM-V-1, this model integrates advancements in architecture and optimization techniques, making it suitable for deployment in resource-constrained environments.s -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: +Note: Due to the official weights missing some necessary files for vllm execution, you can download the additional files from here: to ensure that the file directory matches the structure shown here: . + +```bash +# Download model from the website and make sure the model's path is "data/MiniCPM-V-2" +mkdir data/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -21,18 +31,7 @@ pip3 install transformers pip3 install --user --upgrade pillow -i https://pypi.tuna.tsinghua.edu.cn/simple ``` -### Download - -- Model: -Note: Due to the official weights missing some necessary files for vllm execution, you can download the additional files from here: to ensure that the file directory matches the structure shown here: . - -```bash -# Download model from the website and make sure the model's path is "data/MiniCPM-V-2" -mkdir data - -``` - -## Inference +## Model Inference ```bash export PT_SDPA_ENABLE_HEAD_DIM_PADDING=1 diff --git a/models/nlp/llm/baichuan2-7b/vllm/README.md b/models/nlp/llm/baichuan2-7b/vllm/README.md index 9a8cb1ec99d04c561e3a5baddf7a9e2e93add045..e79fb7c74d5e6c802864e3862c82e55dbe25e683 100755 --- a/models/nlp/llm/baichuan2-7b/vllm/README.md +++ b/models/nlp/llm/baichuan2-7b/vllm/README.md @@ -1,15 +1,25 @@ # Baichuan2-7B (vLLM) -## Description +## Model Description Baichuan 2 is a new generation open-source large language model launched by Baichuan Intelligence. It is trained on high-quality data with 26 trillion tokens, which sounds like a substantial dataset. Baichuan 2 achieves state-of-the-art performance on various authoritative Chinese, multilingual, and domain-specific benchmarks of similar size, indicating its excellent capabilities in language understanding and generation.This release includes Base and Chat versions of 7B. 
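+
+Inference below is driven by `offline_inference.py` on top of vLLM. For orientation, a minimal vLLM sketch of the same idea is shown here; the path and prompt are placeholders, and unlike this sketch the bundled script also applies the Baichuan chat template (`template_baichuan.jinja`).
+
+```python
+from vllm import LLM, SamplingParams
+
+# Hedged example: plain completion without the chat template, for illustration only.
+llm = LLM(model="/data/baichuan/Baichuan2-7B-Base", trust_remote_code=True)
+params = SamplingParams(temperature=0.0, max_tokens=256)
+
+for out in llm.generate(["Briefly introduce the Yangtze River."], params):
+    print(out.outputs[0].text)
+```
+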
-## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: +[https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/tree/main](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/tree/main) + +```bash +mkdir /data/baichuan/ +mv Baichuan2-7B-Base.tar/zip /data/baichuan/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -24,25 +34,15 @@ apt install -y libgl1-mesa-glx pip3 install transformers ``` -### Download - -Pretrained model: -[https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/tree/main](https://huggingface.co/baichuan-inc/Baichuan2-7B-Base/tree/main) - -```bash -mkdir /data/baichuan/ -mv Baichuan2-7B-Base.tar/zip /data/baichuan/ -``` - -## Run model +## Model Inference ```bash python3 offline_inference.py --model /data/baichuan/Baichuan2-7B-Base/ --max-tokens 256 --trust-remote-code --chat_template template_baichuan.jinja --temperature 0.0 ``` -## Run Baichuan w8a16 quantization +### Run Baichuan w8a16 quantization -### Retrieve int8 weights +Retrieve int8 weights. Int8 weights will be saved at /data/baichuan/Baichuan2-7B-Base/int8 @@ -50,13 +50,11 @@ Int8 weights will be saved at /data/baichuan/Baichuan2-7B-Base/int8 python3 convert2int8.py --model-path /data/baichuan/Baichuan2-7B-Base/ ``` -### Run - ```bash python3 offline_inference.py --model /data/baichuan/Baichuan2-7B-Base/int8/ --chat_template template_baichuan.jinja --quantization w8a16 --max-num-seqs 1 --max-model-len 256 --trust-remote-code --temperature 0.0 --max-tokens 256 ``` -## Results +## Model Results | Model | Precision | tokens | QPS | |--------------|-----------|--------|--------| diff --git a/models/nlp/llm/chatglm3-6b-32k/vllm/README.md b/models/nlp/llm/chatglm3-6b-32k/vllm/README.md index c74b04fe49a27e6e13fc1e65727cead426d0dc3b..1fa1f415e4aa42007dc9686abc2dbc76d9993475 100644 --- a/models/nlp/llm/chatglm3-6b-32k/vllm/README.md +++ b/models/nlp/llm/chatglm3-6b-32k/vllm/README.md @@ -1,6 +1,6 @@ # ChatGLM3-6B-32K (vLLM) -## Description +## Model Description ChatGLM3-6B-32K further enhances the understanding of long text capabilities based on ChatGLM3-6B, enabling better handling of contexts up to 32K in length. Specifically, we have updated the positional encoding and designed more @@ -8,9 +8,18 @@ targeted long text training methods, using a 32K context length during the train context length is mostly within 8K, we recommend using ChatGLM3-6B; if you need to handle context lengths exceeding 8K, we recommend using ChatGLM3-6B-32K. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +```bash +mkdir -p /data/chatglm/ +mv chatglm3-6b-32k.zip/tar /data/chatglm/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -25,37 +34,28 @@ apt install -y libgl1-mesa-glx pip3 install transformers ``` -### Download - -Pretrained model: - -```bash -mkdir -p /data/chatglm/ -mv chatglm3-6b-32k.zip/tar /data/chatglm/ -``` - -## Run model +## Model Inference ```bash python3 offline_inference.py --model /data/chatglm/chatglm3-6b-32k --trust-remote-code --temperature 0.0 --max-tokens 256 ``` -## Use the server +### Use the server -### Start the server +Start the server. 
```bash python3 -m vllm.entrypoints.openai.api_server --model /data/chatglm/chatglm3-6b-32k --gpu-memory-utilization 0.9 --max-num-batched-tokens 8193 \ --max-num-seqs 32 --disable-log-requests --host 127.0.0.1 --port 12345 --trust-remote-code ``` -### Test using the OpenAI interface +Test using the OpenAI interface. ```bash python3 server_inference.py --host 127.0.0.1 --port 12345 --model_path /data/chatglm/chatglm3-6b-32k ``` -## Results +## Model Results | Model | Precision | tokens | QPS | |-----------------|-----------|--------|--------| diff --git a/models/nlp/llm/chatglm3-6b/vllm/README.md b/models/nlp/llm/chatglm3-6b/vllm/README.md index 5dc174039af016912b9415e2bb61fd3409260ace..7ca28a53705e503e4fd97572f2dc9668b0ae6b8c 100644 --- a/models/nlp/llm/chatglm3-6b/vllm/README.md +++ b/models/nlp/llm/chatglm3-6b/vllm/README.md @@ -1,14 +1,23 @@ # ChatGLM3-6B (vLLM) -## Description +## Model Description ChatGLM3-6B is trained on large-scale natural language text data, enabling it to understand and generate text. It can be applied to various natural language processing tasks such as dialogue generation, text summarization, and language translation. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: + +```bash +mkdir /data/chatglm/ +mv chatglm3-6b.zip/tar /data/chatglm/ +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -24,24 +33,15 @@ pip3 install vllm pip3 install transformers ``` -### Download - -Pretrained model: - -```bash -mkdir /data/chatglm/ -mv chatglm3-6b.zip/tar /data/chatglm/ -``` - -## Run model +## Model Inference ```bash python3 offline_inference.py --model /data/chatglm/chatglm3-6b --trust-remote-code --temperature 0.0 --max-tokens 256 ``` -## Use the server +### Use the server -### Start the server +Start the server. ```bash python3 -m vllm.entrypoints.openai.api_server --model /data/chatglm/chatglm3-6b --gpu-memory-utilization 0.9 --max-num-batched-tokens 8193 \ @@ -54,30 +54,24 @@ python3 -m vllm.entrypoints.openai.api_server --model /data/chatglm/chatglm3-6b python3 server_inference.py --host 127.0.0.1 --port 12345 --model_path /data/chatglm/chatglm3-6b ``` -## Benchmarking vLLM - -### Downloading the ShareGPT dataset +### Benchmarking vLLM ```bash +# Downloading the ShareGPT dataset. wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json -``` - -### Cloning the vllm project -```bash +# Cloning the vllm project git clone https://github.com/vllm-project/vllm.git -b v0.5.4 --depth=1 ``` -### Benchmarking - -#### Starting server +Starting server. ```bash python3 -m vllm.entrypoints.openai.api_server --model /data/chatglm/chatglm3-6b --gpu-memory-utilization 0.9 --max-num-batched-tokens 8193 \ --max-num-seqs 32 --disable-log-requests --host 127.0.0.1 --trust-remote-code ``` -#### Starting benchmark client +Starting benchmark client. 
```bash python3 benchmark_serving.py --host 127.0.0.1 --num-prompts 16 --model /data/chatglm/chatglm3-6b --dataset-name sharegpt \ diff --git a/models/nlp/llm/deepseek-r1-distill-llama-70b/vllm/README.md b/models/nlp/llm/deepseek-r1-distill-llama-70b/vllm/README.md index 5b4751342f2b3646cced146562b68b93a66b8314..b5f28f20f0ee53357b1f67c98b9bdc4972b1ad02 100644 --- a/models/nlp/llm/deepseek-r1-distill-llama-70b/vllm/README.md +++ b/models/nlp/llm/deepseek-r1-distill-llama-70b/vllm/README.md @@ -1,14 +1,24 @@ -# DeepSeek-R1-Distill-Llama-70B +# DeepSeek-R1-Distill-Llama-70B (vLLM) -## Description +## Model Description DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community. -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd deepSeek-r1-distill-llama-70b/vllm +mkdir -p data/ +ln -s /path/to/DeepSeek-R1-Distill-Llama-70B ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -18,28 +28,20 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -cd deepSeek-r1-distill-llama-70b/vllm -mkdir -p data/ -ln -s /path/to/DeepSeek-R1-Distill-Llama-70B ./data/ -``` +## Model Inference -## Inference with offline +### Inference with offline ```bash python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Llama-70B --max-tokens 256 -tp 8 --temperature 0.0 --max-model-len 3096 ``` -## Inference with serve +### Inference with serve ```bash vllm serve data/DeepSeek-R1-Distill-Llama-70B --tensor-parallel-size 8 --max-model-len 32768 --enforce-eager --trust-remote-code ``` -## Reference +## References -[DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) +- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) diff --git a/models/nlp/llm/deepseek-r1-distill-llama-8b/vllm/README.md b/models/nlp/llm/deepseek-r1-distill-llama-8b/vllm/README.md index 45f291108eccc3e06e97816f71f12a6815a900da..44516490830ac94cbf66c5f75cbaba61184557bd 100644 --- a/models/nlp/llm/deepseek-r1-distill-llama-8b/vllm/README.md +++ b/models/nlp/llm/deepseek-r1-distill-llama-8b/vllm/README.md @@ -1,14 +1,24 @@ -# DeepSeek-R1-Distill-Llama-8B +# DeepSeek-R1-Distill-Llama-8B (vLLM) -## Description +## Model Description DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community. 
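
The `vllm serve` command shown above for DeepSeek-R1-Distill-Llama-70B exposes an OpenAI-compatible HTTP API. A minimal sanity check from the same machine is sketched below, assuming the server's default port 8000 and that model loading has finished; the prompt and token budget are illustrative.

```bash
# List the served model name first; the "model" field in the request must match one of these entries.
curl -s http://127.0.0.1:8000/v1/models

# Send a short completion request through the OpenAI-compatible endpoint.
curl -s http://127.0.0.1:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "data/DeepSeek-R1-Distill-Llama-70B", "prompt": "Hello, who are you?", "max_tokens": 32, "temperature": 0.0}'
```
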
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd deepSeek-r1-distill-llama-8b/vllm +mkdir -p data/ +ln -s /path/to/DeepSeek-R1-Distill-Llama-8B ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -18,34 +28,26 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -cd deepSeek-r1-distill-llama-8b/vllm -mkdir -p data/ -ln -s /path/to/DeepSeek-R1-Distill-Llama-8B ./data/ -``` +## Model Inference -## Inference with offline +### Inference with offline ```bash python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Llama-8B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096 ``` -## Inference with serve +### Inference with serve ```bash vllm serve data/DeepSeek-R1-Distill-Llama-8B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager --trust-remote-code ``` -## Results +## Model Results | Model | QPS | |------------------------------|--------| | DeepSeek-R1-Distill-Llama-8B | 105.33 | -## Reference +## References -[DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) +- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) diff --git a/models/nlp/llm/deepseek-r1-distill-qwen-1.5b/vllm/README.md b/models/nlp/llm/deepseek-r1-distill-qwen-1.5b/vllm/README.md index e10d00457fb4ecd10d344ebba96406d371196879..32c97ebe77f0c3f43fd850f58f381401e34b5d30 100644 --- a/models/nlp/llm/deepseek-r1-distill-qwen-1.5b/vllm/README.md +++ b/models/nlp/llm/deepseek-r1-distill-qwen-1.5b/vllm/README.md @@ -1,14 +1,24 @@ -# DeepSeek-R1-Distill-Qwen-1.5B +# DeepSeek-R1-Distill-Qwen-1.5B (vLLM) -## Description +## Model Description DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd deepseek-r1-distill-qwen-1.5b/vllm +mkdir -p data/ +ln -s /path/to/DeepSeek-R1-Distill-Qwen-1.5B ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -18,34 +28,26 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -cd deepseek-r1-distill-qwen-1.5b/vllm -mkdir -p data/ -ln -s /path/to/DeepSeek-R1-Distill-Qwen-1.5B ./data/ -``` +## Model Inference -## Inference with offline +### Inference with offline ```bash python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-1.5B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096 ``` -## Inference with serve +### Inference with serve ```bash vllm serve data/DeepSeek-R1-Distill-Qwen-1.5B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager --trust-remote-code ``` -## Results +## Model Results | Model | QPS | |-------------------------------|--------| | DeepSeek-R1-Distill-Qwen-1.5B | 259.42 | -## Reference +## References -[DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) +- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) diff --git a/models/nlp/llm/deepseek-r1-distill-qwen-14b/vllm/README.md b/models/nlp/llm/deepseek-r1-distill-qwen-14b/vllm/README.md index 27cade520fcd73e4a91c252b7303dc69cf4155d6..b6c19863f24b1353722bd8470c65395acdf8cdf5 100644 --- a/models/nlp/llm/deepseek-r1-distill-qwen-14b/vllm/README.md +++ b/models/nlp/llm/deepseek-r1-distill-qwen-14b/vllm/README.md @@ -1,14 +1,24 @@ -# DeepSeek-R1-Distill-Qwen-14B +# DeepSeek-R1-Distill-Qwen-14B (vLLM) -## Description +## Model Description DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd deepseek-r1-distill-qwen-14b/vllm +mkdir -p data/ +ln -s /path/to/DeepSeek-R1-Distill-Qwen-14B ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -18,34 +28,26 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -cd deepseek-r1-distill-qwen-14b/vllm -mkdir -p data/ -ln -s /path/to/DeepSeek-R1-Distill-Qwen-14B ./data/ -``` +## Model Inference -## Inference with offline +### Inference with offline ```bash python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-14B --max-tokens 256 -tp 2 --temperature 0.0 --max-model-len 3096 ``` -## Inference with serve +### Inference with serve ```bash vllm serve data/DeepSeek-R1-Distill-Qwen-14B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager --trust-remote-code ``` -## Results +## Model Results | Model | QPS | |------------------------------|-------| | DeepSeek-R1-Distill-Qwen-14B | 88.01 | -## Reference +## References -[DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) +- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) diff --git a/models/nlp/llm/deepseek-r1-distill-qwen-32b/vllm/README.md b/models/nlp/llm/deepseek-r1-distill-qwen-32b/vllm/README.md index bc3fab8935d73ecadc0f70da75ad4ac90dc747f6..bda03579bc17a8c8fa2d46bc6d29a1d05b7b9802 100644 --- a/models/nlp/llm/deepseek-r1-distill-qwen-32b/vllm/README.md +++ b/models/nlp/llm/deepseek-r1-distill-qwen-32b/vllm/README.md @@ -1,14 +1,24 @@ -# DeepSeek-R1-Distill-Qwen-32B +# DeepSeek-R1-Distill-Qwen-32B (vLLM) -## Description +## Model Description DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd deepseek-r1-distill-qwen-32b/vllm +mkdir -p data/ +ln -s /path/to/DeepSeek-R1-Distill-Qwen-32B ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -18,34 +28,26 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -cd deepseek-r1-distill-qwen-32b/vllm -mkdir -p data/ -ln -s /path/to/DeepSeek-R1-Distill-Qwen-32B ./data/ -``` +## Model Inference -## Inference with offline +### Inference with offline ```bash python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-32B --max-tokens 256 -tp 4 --temperature 0.0 --max-model-len 3096 ``` -## Inference with serve +### Inference with serve ```bash vllm serve data/DeepSeek-R1-Distill-Qwen-32B --tensor-parallel-size 4 --max-model-len 32768 --enforce-eager --trust-remote-code ``` -## Results +## Model Results | Model | QPS | |------------------------------|-------| | DeepSeek-R1-Distill-Qwen-32B | 68.30 | -## Reference +## References -[DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) +- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) diff --git a/models/nlp/llm/deepseek-r1-distill-qwen-7b/vllm/README.md b/models/nlp/llm/deepseek-r1-distill-qwen-7b/vllm/README.md index 8cab93a6e79ead253a4f767c57a98a9a7d13fab2..b5d57e19df0f2cea5230bb888cc318342d77073f 100644 --- a/models/nlp/llm/deepseek-r1-distill-qwen-7b/vllm/README.md +++ b/models/nlp/llm/deepseek-r1-distill-qwen-7b/vllm/README.md @@ -1,14 +1,24 @@ -# DeepSeek-R1-Distill-Qwen-7B +# DeepSeek-R1-Distill-Qwen-7B (vLLM) -## Description +## Model Description DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. We open-source distilled 1.5B, 7B, 8B, 14B, 32B, and 70B checkpoints based on Qwen2.5 and Llama3 series to the community. 
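
The offline and serve commands for the distilled checkpoints above use different tensor-parallel degrees (`-tp` / `--tensor-parallel-size` from 1 up to 8). Before picking a degree, it can help to confirm how many accelerators are actually visible; the check below is a sketch that assumes the SDK's PyTorch build reports devices through the standard CUDA API.

```bash
# Count visible accelerators; the tensor-parallel degree must not exceed this number.
python3 -c "import torch; print(torch.cuda.device_count())"

# Optionally pin the run to specific devices, as other READMEs in this repository do.
export CUDA_VISIBLE_DEVICES=0,1,2,3
```
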
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd deepseek-r1-distill-qwen-7b/vllm +mkdir -p data/ +ln -s /path/to/DeepSeek-R1-Distill-Qwen-7B ./data/ +``` + +### Install Dependencies ```bash # Install libGL @@ -18,34 +28,26 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -cd deepseek-r1-distill-qwen-7b/vllm -mkdir -p data/ -ln -s /path/to/DeepSeek-R1-Distill-Qwen-7B ./data/ -``` +## Model Inference -## Inference with offline +### Inference with offline ```bash python3 offline_inference.py --model ./data/DeepSeek-R1-Distill-Qwen-7B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096 ``` -## Inference with serve +### Inference with serve ```bash vllm serve data/DeepSeek-R1-Distill-Qwen-7B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager --trust-remote-code ``` -## Results +## Model Results | Model | QPS | |-----------------------------|-------| | DeepSeek-R1-Distill-Qwen-7B | 90.48 | -## Reference +## References -[DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) +- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) diff --git a/models/nlp/llm/llama2-13b/trtllm/README.md b/models/nlp/llm/llama2-13b/trtllm/README.md index d525ad9fb31312898e8b51077fc1abdc3b785411..6bfa1abf40e64bc90b42e123a0533daaa204cded 100755 --- a/models/nlp/llm/llama2-13b/trtllm/README.md +++ b/models/nlp/llm/llama2-13b/trtllm/README.md @@ -1,27 +1,15 @@ # Llama2 13B (TensorRT-LLM) -## Description +## Model Description The Llama2 model is part of the Llama project which aims to unlock the power of large language models. The latest version of the Llama model is now accessible to individuals, creators, researchers, and businesses of all sizes. It includes model weights and starting code for pre-trained and fine-tuned Llama language models with parameters ranging from 7B to 70B. -## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx - -bash scripts/set_environment.sh . -``` - -### Download +### Prepare Resources - Model: - Dataset: @@ -29,18 +17,29 @@ bash scripts/set_environment.sh . ```bash # Download model from the website and make sure the model's path is "data/llama2-13b-chat" # Download dataset from the website and make sure the dataset's path is "data/datasets_cnn_dailymail" -mkdir data +mkdir data/ # Please download rouge.py to this path if your server can't attach huggingface.co. mkdir -p rouge/ wget --no-check-certificate https://raw.githubusercontent.com/huggingface/evaluate/main/metrics/rouge/rouge.py -P rouge ``` -## Inference +### Install Dependencies ```bash -export CUDA_VISIBLE_DEVICES=0,1 +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx +bash scripts/set_environment.sh . 
+``` + +## Model Inference + +```bash +export CUDA_VISIBLE_DEVICES=0,1 ``` ### FP16 @@ -52,7 +51,7 @@ bash scripts/test_trtllm_llama2_13b_gpu2_build.sh bash scripts/test_trtllm_llama2_13b_gpu2.sh ``` -## Results +## Model Results | Model | tokens | tokens per second | | ---------- | ------ | ----------------- | diff --git a/models/nlp/llm/llama2-70b/trtllm/README.md b/models/nlp/llm/llama2-70b/trtllm/README.md index 628ee896e4b3a269dc91f9c7aa0e8b4a80e8ab60..d01031caaf49abc2c2211873d41e6d764999c300 100644 --- a/models/nlp/llm/llama2-70b/trtllm/README.md +++ b/models/nlp/llm/llama2-70b/trtllm/README.md @@ -1,6 +1,6 @@ # LlaMa2 70B (TensorRT-LLM) -## Description +## Model Description we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. @@ -9,21 +9,9 @@ helpfulness and safety, may be a suitable substitute for closed-source models. W approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs. -## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx - -bash scripts/set_environment.sh . -``` - -### Download +### Prepare Resources - Model: @@ -39,7 +27,19 @@ mkdir -p rouge/ wget --no-check-certificate https://raw.githubusercontent.com/huggingface/evaluate/main/metrics/rouge/rouge.py -P rouge ``` -## Inference +### Install Dependencies + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx + +bash scripts/set_environment.sh . +``` + +## Model Inference ### FP16 diff --git a/models/nlp/llm/llama2-7b/trtllm/README.md b/models/nlp/llm/llama2-7b/trtllm/README.md index 3be201f191d2a6178353a7c08cb7358d8b518c93..210420ed2361cf1aa61cabc16575b485a54c7376 100644 --- a/models/nlp/llm/llama2-7b/trtllm/README.md +++ b/models/nlp/llm/llama2-7b/trtllm/README.md @@ -1,6 +1,6 @@ # LlaMa2 7B (TensorRT-LLM) -## Description +## Model Description we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. @@ -9,24 +9,9 @@ helpfulness and safety, may be a suitable substitute for closed-source models. W approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs. -## Setup +## Model Preparation -### Install - -In order to run the model smoothly, you need to get the sdk from [resource -center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx - -bash scripts/set_environment.sh . -``` - -### Download +### Prepare Resources - Model: @@ -42,7 +27,22 @@ mkdir -p rouge/ wget --no-check-certificate https://raw.githubusercontent.com/huggingface/evaluate/main/metrics/rouge/rouge.py -P rouge ``` -## Inference +### Install Dependencies + +In order to run the model smoothly, you need to get the sdk from [resource +center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. 
+ +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx + +bash scripts/set_environment.sh . +``` + +## Model Inference ### FP16 diff --git a/models/nlp/llm/llama2-7b/vllm/README.md b/models/nlp/llm/llama2-7b/vllm/README.md index 44e6db7154fb262e573ffec154906451d53cd500..82eafa3f0bc8a3548899b11bc066a3d4b8e9703a 100755 --- a/models/nlp/llm/llama2-7b/vllm/README.md +++ b/models/nlp/llm/llama2-7b/vllm/README.md @@ -1,6 +1,6 @@ # Llama2 7B (vLLM) -## Description +## Model Description we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. @@ -9,9 +9,19 @@ helpfulness and safety, may be a suitable substitute for closed-source models. W approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs. -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd ${DeepSparkInference}/models/nlp/large_language_model/llama2-7b/vllm +mkdir -p data/llama2 +ln -s /path/to/llama2-7b ./data/llama2 +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -29,17 +39,7 @@ pip3 install triton pip3 install ixformer ``` -### Download - -- Model: - -```bash -cd ${DeepSparkInference}/models/nlp/large_language_model/llama2-7b/vllm -mkdir -p data/llama2 -ln -s /path/to/llama2-7b ./data/llama2 -``` - -## Inference +## Model Inference ```bash python3 offline_inference.py --model ./data/llama2/llama2-7b --max-tokens 256 -tp 1 --temperature 0.0 diff --git a/models/nlp/llm/llama3-70b/vllm/README.md b/models/nlp/llm/llama3-70b/vllm/README.md index a6c4f4881a842b09c4e3d9ff8255f1a73a947573..789f10668a024974ed13acb860850e4582a4a1ee 100644 --- a/models/nlp/llm/llama3-70b/vllm/README.md +++ b/models/nlp/llm/llama3-70b/vllm/README.md @@ -1,26 +1,26 @@ # LlaMa3 70B (vLLM) -## Description +## Model Description -This model is the Meta Llama 3 large language model series (LLMs) released by Meta, which is a series of pre-trained and -instruction-tuned generative text models, available in 8B and 70B models. The model is 70B in size and is designed for -large-scale AI applications. +Llama 3 is Meta's latest large language model series, representing a significant advancement in open-source AI +technology. Available in 8B and 70B parameter versions, it's trained on a dataset seven times larger than its +predecessor, Llama 2. The model features an expanded 8K context window and a 128K token vocabulary for more efficient +language encoding. Optimized for conversational AI, Llama 3 demonstrates superior performance across various industry +benchmarks while maintaining strong safety and beneficialness standards. Its 70B version is particularly designed for +large-scale AI applications, offering enhanced reasoning and instruction-following capabilities. -The Llama 3 command-tuned model is optimized for conversational use cases and outperforms many available open source -chat models on common industry benchmarks. In addition, when developing these models, the research team paid great -attention to optimizing beneficialness and safety. 
+## Model Preparation -Llama 3 is a major improvement over Llama 2 and other publicly available models: +### Prepare Resources ---Trained on a dataset seven times larger than Llama 2 - ---Llama 2 has twice the context length of 8K - ---Encode the language more efficiently using a larger token vocabulary with 128K tokens +- Model: -## Setup +```bash +# Download model from the website and make sure the model's path is "data/Meta-Llama-3-70B-Instruct" +mkdir data/ +``` -### Install +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -33,17 +33,7 @@ yum install -y mesa-libGL apt install -y libgl1-mesa-glx ``` -### Download - -- Model: - -```bash -# Download model from the website and make sure the model's path is "data/Meta-Llama-3-70B-Instruct" -mkdir data - -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 diff --git a/models/nlp/llm/qwen-7b/vllm/README.md b/models/nlp/llm/qwen-7b/vllm/README.md index 7b2433b63f9f4da9d96ddc2600679c5f8447adaa..69340ba2c4b4d966bc242ba73ce6d6dfef49bc68 100644 --- a/models/nlp/llm/qwen-7b/vllm/README.md +++ b/models/nlp/llm/qwen-7b/vllm/README.md @@ -1,23 +1,27 @@ # Qwen-7B (vLLM) -## Description +## Model Description -Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language -processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first -installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct -models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat -models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance -across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning -from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities -for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks -like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and -Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These -models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind -the proprietary models. +Qwen-7B is a cutting-edge large language model developed as part of the Qwen series, offering advanced natural language +processing capabilities. With 7 billion parameters, it demonstrates exceptional performance across various downstream +tasks. The model comes in two variants: the base pretrained version and the Qwen-Chat version, which is fine-tuned using +human alignment techniques. Notably, Qwen-7B exhibits strong tool-use and planning abilities, making it suitable for +developing intelligent agent applications. It also includes specialized versions for coding (Code-Qwen) and mathematics +(Math-Qwen), showcasing improved performance in these domains compared to other open-source models. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: - Model: + +```bash +cd ${DeepSparkInference}/models/nlp/large_language_model/qwen-7b/vllm +mkdir -p data/qwen +ln -s /path/to/Qwen-7B ./data/qwen +``` + +### Install Dependencies In order to run the model smoothly, you need to get the sdk from [resource center](https://support.iluvatar.com/#/ProductLine?id=2) of Iluvatar CoreX official website. @@ -35,17 +39,7 @@ pip3 install triton pip3 install ixformer ``` -### Download - -- Model: - Model: - -```bash -cd ${DeepSparkInference}/models/nlp/large_language_model/qwen-7b/vllm -mkdir -p data/qwen -ln -s /path/to/Qwen-7B ./data/qwen -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1 diff --git a/models/nlp/llm/qwen1.5-14b/vllm/README.md b/models/nlp/llm/qwen1.5-14b/vllm/README.md index 07741c3f62cc4fb0d0594b9aebf67fd1500893c0..eb429d04ff49be21cdfdc5260e9c299011c141f4 100644 --- a/models/nlp/llm/qwen1.5-14b/vllm/README.md +++ b/models/nlp/llm/qwen1.5-14b/vllm/README.md @@ -1,6 +1,6 @@ # Qwen1.5-14B (vLLM) -## Description +## Model Description Qwen1.5 is a language model series including decoder language models of different model sizes. For each size, we release the base language model and the aligned chat model. It is based on the Transformer architecture with SwiGLU activation, @@ -8,19 +8,9 @@ attention QKV bias, group query attention, mixture of sliding window attention a have an improved tokenizer adaptive to multiple natural languages and codes. For the beta version, temporarily we did not include GQA (except for 32B) and the mixture of SWA and full attention. -## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx -``` - -### Download +### Prepare Resources - Model: @@ -30,13 +20,23 @@ mkdir data/qwen1.5 ln -s /path/to/Qwen1.5-14B ./data/qwen1.5 ``` -## Inference +### Install Dependencies + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx +``` + +## Model Inference ```bash python3 offline_inference.py --model ./data/qwen1.5/Qwen1.5-14B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 896 ``` -## Results +## Model Results | Model | QPS | |-------------|-------| diff --git a/models/nlp/llm/qwen1.5-32b/vllm/README.md b/models/nlp/llm/qwen1.5-32b/vllm/README.md index 97fdeca8ec3910a447292d6ff8e0b155031b6474..9b111e19f143edf675c9899fd4f202a1e2a53c40 100755 --- a/models/nlp/llm/qwen1.5-32b/vllm/README.md +++ b/models/nlp/llm/qwen1.5-32b/vllm/README.md @@ -1,15 +1,25 @@ # Qwen1.5-32B-Chat (vLLM) -## Description +## Model Description Qwen1.5 is a language model series including decoder language models of different model sizes. For each size, we release the base language model and the aligned chat model. It is based on the Transformer architecture with SwiGLU activation, attention QKV bias, group query attention, mixture of sliding window attention and full attention, etc. Additionally, we have an improved tokenizer adaptive to multiple natural languages and codes. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd ${DeepSparkInference}/models/nlp/large_language_model/qwen1.5-32b/vllm +mkdir -p data/qwen1.5 +ln -s /path/to/Qwen1.5-32B ./data/qwen1.5 +``` + +### Install Dependencies ```bash # Install libGL @@ -24,17 +34,7 @@ pip3 install triton pip3 install ixformer ``` -### Download - -- Model: - -```bash -cd ${DeepSparkInference}/models/nlp/large_language_model/qwen1.5-32b/vllm -mkdir -p data/qwen1.5 -ln -s /path/to/Qwen1.5-32B ./data/qwen1.5 -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1,2,3 diff --git a/models/nlp/llm/qwen1.5-72b/vllm/README.md b/models/nlp/llm/qwen1.5-72b/vllm/README.md index 340eb83dbcfe85f6fb81af4c33f3dfdcc6f85c81..6cc7c2adb6b607e155d2932507b70061b9c3d341 100644 --- a/models/nlp/llm/qwen1.5-72b/vllm/README.md +++ b/models/nlp/llm/qwen1.5-72b/vllm/README.md @@ -1,6 +1,6 @@ # Qwen1.5-72B (vLLM) -## Description +## Model Description Qwen1.5 is a language model series including decoder language models of different model sizes. For each size, we release the base language model and the aligned chat model. It is based on the Transformer architecture with SwiGLU activation, @@ -8,19 +8,9 @@ attention QKV bias, group query attention, mixture of sliding window attention a have an improved tokenizer adaptive to multiple natural languages and codes. For the beta version, temporarily we did not include GQA (except for 32B) and the mixture of SWA and full attention. -## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx -``` - -### Download +### Prepare Resources - Model: @@ -30,14 +20,24 @@ mkdir data/qwen1.5 ln -s /path/to/Qwen1.5-72B ./data/qwen1.5 ``` -## Inference +### Install Dependencies + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx +``` + +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1 python3 offline_inference.py --model ./data/qwen1.5/Qwen1.5-72B --max-tokens 256 -tp 8 --temperature 0.0 --max-model-len 3096 ``` -## Results +## Model Results | Model | QPS | |-------------|-------| diff --git a/models/nlp/llm/qwen1.5-7b/tgi/README.md b/models/nlp/llm/qwen1.5-7b/tgi/README.md index 3ca81f60b72ca7898e2f9cee6d242bc8d6d9238e..97cca48816a64e553e6964a3512422057a9c7a87 100644 --- a/models/nlp/llm/qwen1.5-7b/tgi/README.md +++ b/models/nlp/llm/qwen1.5-7b/tgi/README.md @@ -1,6 +1,6 @@ # Qwen1.5-7B (Text Generation Inference) -## Description +## Model Description Qwen1.5 is a language model series including decoder language models of different model sizes. For each size, we release the base language model and the aligned chat model. It is based on the Transformer architecture with SwiGLU activation, @@ -8,19 +8,9 @@ attention QKV bias, group query attention, mixture of sliding window attention a have an improved tokenizer adaptive to multiple natural languages and codes. For the beta version, temporarily we did not include GQA (except for 32B) and the mixture of SWA and full attention. 
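
The preparation steps for the Qwen1.5 checkpoints above symlink local directories under `data/qwen1.5`. Before launching inference, a quick check that the link resolves and the config is loadable can save a failed run; the sketch below uses the Qwen1.5-72B path from the section above and assumes `transformers` is installed.

```bash
# Confirm the symlink resolves and the usual checkpoint files are present.
ls -lh ./data/qwen1.5/Qwen1.5-72B

# Confirm the config can be parsed by transformers (no weights are loaded here).
python3 -c "from transformers import AutoConfig; print(AutoConfig.from_pretrained('./data/qwen1.5/Qwen1.5-72B', trust_remote_code=True))"
```
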
-## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx -``` - -### Download +### Prepare Resources - Model: @@ -30,7 +20,17 @@ mkdir -p data/qwen1.5 ln -s /path/to/Qwen1.5-7B ./data/qwen1.5 ``` -## Inference +### Install Dependencies + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx +``` + +## Model Inference ### Start webserver @@ -54,7 +54,7 @@ export CUDA_VISIBLE_DEVICES=1 python3 offline_inference.py --model2path ./data/qwen1.5/Qwen1.5-7B ``` -## Results +## Model Results | Model | QPS | |------------|-------| diff --git a/models/nlp/llm/qwen1.5-7b/vllm/README.md b/models/nlp/llm/qwen1.5-7b/vllm/README.md index d991b2d4ae361bbbf6b0f51a062a13186768017b..7b5eac5de3728a67d3b4cf2223ec8ef44dbb1196 100644 --- a/models/nlp/llm/qwen1.5-7b/vllm/README.md +++ b/models/nlp/llm/qwen1.5-7b/vllm/README.md @@ -1,6 +1,6 @@ # Qwen1.5-7B (vLLM) -## Description +## Model Description Qwen1.5 is a language model series including decoder language models of different model sizes. For each size, we release the base language model and the aligned chat model. It is based on the Transformer architecture with SwiGLU activation, @@ -8,19 +8,9 @@ attention QKV bias, group query attention, mixture of sliding window attention a have an improved tokenizer adaptive to multiple natural languages and codes. For the beta version, temporarily we did not include GQA (except for 32B) and the mixture of SWA and full attention. -## Setup +## Model Preparation -### Install - -```bash -# Install libGL -## CentOS -yum install -y mesa-libGL -## Ubuntu -apt install -y libgl1-mesa-glx -``` - -### Download +### Prepare Resources - Model: @@ -30,13 +20,23 @@ mkdir -p data/qwen1.5 ln -s /path/to/Qwen1.5-7B ./data/qwen1.5 ``` -## Inference +### Install Dependencies + +```bash +# Install libGL +## CentOS +yum install -y mesa-libGL +## Ubuntu +apt install -y libgl1-mesa-glx +``` + +## Model Inference ```bash python3 offline_inference.py --model ./data/qwen1.5/Qwen1.5-7B --max-tokens 256 -tp 1 --temperature 0.0 --max-model-len 3096 ``` -## Results +## Model Results | Model | QPS | |------------|--------| diff --git a/models/nlp/llm/qwen2-72b/vllm/README.md b/models/nlp/llm/qwen2-72b/vllm/README.md index 08859625592449605bc8db75ccf133b3b3fb911d..b3a7a7a52bb99bf84534a3b7ee2523e0f943341e 100755 --- a/models/nlp/llm/qwen2-72b/vllm/README.md +++ b/models/nlp/llm/qwen2-72b/vllm/README.md @@ -1,6 +1,6 @@ # Qwen2-72B-Instruct (vLLM) -## Description +## Model Description Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This @@ -14,9 +14,19 @@ reasoning, etc. Qwen2-72B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. Please refer to this section for detailed instructions on how to deploy Qwen2 for handling long texts. 
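
To exercise the long-context capability described above when serving, rather than the shorter offline run later in this file, one possible launch is sketched below. It assumes 8 visible devices, enough memory for the chosen context length, and the `data/qwen2` path created in the preparation step that follows.

```bash
# Sketch only: serve the Qwen2-72B checkpoint linked below with a 32K context window
# over the OpenAI-compatible API. Flags mirror the vLLM options used elsewhere in this
# repository; tune --max-model-len and --gpu-memory-utilization to what the hardware allows.
vllm serve ./data/qwen2/Qwen2-72B \
  --tensor-parallel-size 8 \
  --max-model-len 32768 \
  --gpu-memory-utilization 0.9 \
  --trust-remote-code
```
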
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd ${DeepSparkInference}/models/nlp/large_language_model/qwen2-72b/vllm +mkdir -p data/qwen2 +ln -s /path/to/Qwen2-72B ./data/qwen2 +``` + +### Install Dependencies ```bash # Install libGL @@ -31,17 +41,7 @@ pip3 install triton pip3 install ixformer ``` -### Download - -- Model: - -```bash -cd ${DeepSparkInference}/models/nlp/large_language_model/qwen2-72b/vllm -mkdir -p data/qwen2 -ln -s /path/to/Qwen2-72B ./data/qwen2 -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 diff --git a/models/nlp/llm/qwen2-7b/vllm/README.md b/models/nlp/llm/qwen2-7b/vllm/README.md index fb1556b094789cf9bf6db147fd89207365808aeb..b9433934b3a0bd49ef08c74959955637fba73783 100755 --- a/models/nlp/llm/qwen2-7b/vllm/README.md +++ b/models/nlp/llm/qwen2-7b/vllm/README.md @@ -1,6 +1,6 @@ # Qwen2-7B Instruct (vLLM) -## Description +## Model Description Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This @@ -13,9 +13,19 @@ reasoning, etc. Qwen2-7B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs. -## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +cd models/nlp/large_language_model/qwen2-7b/vllm +mkdir -p data/qwen2 +ln -s /path/to/Qwen2-7B-Instruct ./data/qwen2 +``` + +### Install Dependencies ```bash # Install libGL @@ -30,17 +40,7 @@ pip3 install triton pip3 install ixformer ``` -### Download - -- Model: https://modelscope.cn/models/Qwen/Qwen2-7B-Instruct - -```bash -cd models/nlp/large_language_model/qwen2-7b/vllm -mkdir -p data/qwen2 -ln -s /path/to/Qwen2-7B-Instruct ./data/qwen2 -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0 diff --git a/models/nlp/llm/stablelm/vllm/README.md b/models/nlp/llm/stablelm/vllm/README.md index bea23654c5d83ca9fd875ad94738b5db5a0d61bf..7b49863803e0d10f00c699e28a20516063048d72 100644 --- a/models/nlp/llm/stablelm/vllm/README.md +++ b/models/nlp/llm/stablelm/vllm/README.md @@ -1,6 +1,6 @@ # StableLM2-1.6B (vLLM) -## Description +## Model Description Stable LM 2 1.6B is a decoder-only language model with 1.6 billion parameters. It has been pre-trained on a diverse multilingual and code dataset, comprising 2 trillion tokens, for two epochs. This model is designed for various natural @@ -8,9 +8,18 @@ language processing tasks, including text generation and dialogue systems. Due t and diverse dataset, Stable LM 2 1.6B can effectively capture the nuances of language, including grammar, semantics, and contextual relationships, which enhances the quality and accuracy of the generated text. 
-## Setup +## Model Preparation -### Install +### Prepare Resources + +- Model: + +```bash +# Download model from the website and make sure the model's path is "data/stablelm/stablelm-2-1_6b" +mkdir -p data/stablelm/stablelm-2-1_6b +``` + +### Install Dependencies ```bash # Install libGL @@ -21,23 +30,14 @@ apt install -y libgl1-mesa-glx pip3 install transformers ``` -### Download - -- Model: - -```bash -# Download model from the website and make sure the model's path is "data/stablelm/stablelm-2-1_6b" -mkdir -p data/stablelm/stablelm-2-1_6b -``` - -## Inference +## Model Inference ```bash export CUDA_VISIBLE_DEVICES=0,1 python3 offline_inference.py --model ./data/stablelm/stablelm-2-1_6b --max-tokens 256 -tp 1 --temperature 0.0 ``` -## Results +## Model Results | Model | QPS | |----------|-------| diff --git a/models/nlp/plm/albert/ixrt/README.md b/models/nlp/plm/albert/ixrt/README.md index 2af14b2be4251270f21b3e9646861aed407abcf0..9552148ccda574b012a22e6a03b06fc8ec3747e2 100644 --- a/models/nlp/plm/albert/ixrt/README.md +++ b/models/nlp/plm/albert/ixrt/README.md @@ -1,20 +1,12 @@ -# ALBERT +# ALBERT (IxRT) -## Description +## Model Description Albert (A Lite BERT) is a variant of the BERT (Bidirectional Encoder Representations from Transformers) model that focuses on efficiency and scalability while maintaining strong performance in natural language processing tasks. The AlBERT model introduces parameter reduction techniques and incorporates self-training strategies to enhance its effectiveness. -## Setup +## Model Preparation -### Install - -```bash -apt install -y libnuma-dev - -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: @@ -29,6 +21,14 @@ cd ${MODEL_PATH} bash ./scripts/prepare_model_and_dataset.sh ``` +### Install Dependencies + +```bash +apt install -y libnuma-dev + +pip3 install -r requirements.txt +``` + ### Model Conversion Please correct the paths in the following commands or files. @@ -39,7 +39,7 @@ python3 torch2onnx.py --model_path ./general_perf/model_zoo/popular/open_albert/ onnxsim albert-torch-fp32.onnx albert-torch-fp32-sim.onnx ``` -## Inference +## Model Inference ```bash git clone https://gitee.com/deep-spark/iluvatar-corex-ixrt.git --depth=1 @@ -52,7 +52,6 @@ export PROJ_PATH=./ ### Performance ```bash - bash scripts/infer_albert_fp16_performance.sh ``` @@ -89,7 +88,7 @@ sed -i 's/tensorrt_legacy/tensorrt/' ./backends/ILUVATAR/runtime_backend_iluvata python3 core/perf_engine.py --hardware_type ILUVATAR --task albert-torch-fp32 ``` -## Results +## Model Results | Model | BatchSize | Precision | QPS | Exact Match | F1 Score | | ------ | --------- | --------- | ----- | ----------- | -------- | diff --git a/models/nlp/plm/bert_base_ner/igie/README.md b/models/nlp/plm/bert_base_ner/igie/README.md index 558dce76b30095310e14607a8cb44ca6d5a45ecc..8a6f826fd76a9373318c2854d7a1d6ca5c94d36a 100644 --- a/models/nlp/plm/bert_base_ner/igie/README.md +++ b/models/nlp/plm/bert_base_ner/igie/README.md @@ -1,23 +1,23 @@ -# BERT Base NER +# BERT Base NER (IGIE) -## Description +## Model Description BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. 
-## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash @@ -27,14 +27,13 @@ export DATASETS_DIR=/Path/to/china-people-daily-ner-corpus/ python3 get_weights.py # Do QAT for INT8 test, will take a long time -cd Int8QAT +cd Int8QAT/ python3 run_qat.py --model_dir ../test/ --datasets_dir ${DATASETS_DIR} python3 export_hdf5.py --model quant_base/pytorch_model.bin -cd .. - +cd ../ ``` -## Inference +## Model Inference ### INT8 @@ -45,8 +44,8 @@ bash scripts/infer_bert_base_ner_int8_accuracy.sh bash scripts/infer_bert_base_ner_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |SeqLength |Precision |FPS | F1 Score ------------------|-----------|----------|----------|----------|-------- -Bertbase(NER) | 8 | 256 | INT8 | 2067.252 | 96.2 +| Model | BatchSize | SeqLength | Precision | FPS | F1 Score | +|---------------|-----------|-----------|-----------|----------|----------| +| BERT Base NER | 8 | 256 | INT8 | 2067.252 | 96.2 | diff --git a/models/nlp/plm/bert_base_squad/igie/README.md b/models/nlp/plm/bert_base_squad/igie/README.md index 16d9e4e91e1213ba164b742e7a37e9e07a079434..5c8d9e9d6eda7f053a842adb64401c0b35cd5857 100644 --- a/models/nlp/plm/bert_base_squad/igie/README.md +++ b/models/nlp/plm/bert_base_squad/igie/README.md @@ -1,30 +1,30 @@ -# BERT Base SQuAD +# BERT Base SQuAD (IGIE) -## Description +## Model Description BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: +### Install Dependencies + +```bash +pip3 install -r requirements.txt +``` + ### Model Conversion ```bash python3 export.py --output bert-base-uncased-squad-v1.onnx ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/SQuAD/ @@ -39,7 +39,7 @@ bash scripts/infer_bert_base_squad_fp16_accuracy.sh bash scripts/infer_bert_base_squad_fp16_performance.sh ``` -## Results +## Model Results | Model | BatchSize | SeqLength | Precision | FPS | F1 Score | | --------------- | --------- | --------- | --------- | ------ | -------- | diff --git a/models/nlp/plm/bert_base_squad/ixrt/README.md b/models/nlp/plm/bert_base_squad/ixrt/README.md index 240c62bfc681976dec7a1b8c7ad3d000b7ca500d..e3c9aac3eb1ed6257c37e7e4b20a23953ec1a262 100644 --- a/models/nlp/plm/bert_base_squad/ixrt/README.md +++ b/models/nlp/plm/bert_base_squad/ixrt/README.md @@ -1,52 +1,54 @@ -# BERT Base SQuAD +# BERT Base SQuAD (IxRT) -## Description +## Model Description BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. 
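
After the `export.py` conversion step above produces `bert-base-uncased-squad-v1.onnx`, a lightweight structural check can catch a broken export before the inference scripts are run. This is a sketch and assumes the `onnx` Python package is installed.

```bash
# Validate the exported graph and print its opset; no inference is performed.
python3 -c "import onnx; m = onnx.load('bert-base-uncased-squad-v1.onnx'); onnx.checker.check_model(m); print(m.opset_import)"
```
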
-## Setup +## Model Preparation -### T4 requirement(tensorrt_version >= 8.6) +### Prepare Resources ```bash -docker pull nvcr.io/nvidia/tensorrt:23.04-py3 +cd python +bash script/prepare.sh v1_1 ``` -## Install +### Install Dependencies -```bash -pip3 install -r requirements.txt -``` - -### Install on Iluvatar +#### Install on Iluvatar ```bash cmake -S . -B build cmake --build build -j16 ``` -### Install on T4 +#### Install on NV + +Require tensorrt_version >= 8.6 ```bash -cmake -S . -B build -DUSE_TENSORRT=true -cmake --build build -j16 +# Get TensorRT docker image +docker pull nvcr.io/nvidia/tensorrt:23.04-py3 +# Run TensorRT docker ``` -## Download - ```bash -cd python -bash script/prepare.sh v1_1 +# Install requirements.txt in TensorRT docker +pip3 install -r requirements.txt + +# Build +cmake -S . -B build -DUSE_TENSORRT=true +cmake --build build -j16 ``` -## Inference +## Model Inference ### On Iluvatar #### FP16 ```bash -cd script +cd script/ # FP16 bash infer_bert_base_squad_fp16_ixrt.sh @@ -68,7 +70,7 @@ bash script/build_engine.sh --bs 32 --int8 bash script/inference_squad.sh --bs 32 --int8 ``` -## Results +## Model Results | Model | BatchSize | Precision | Latency QPS | exact_match | f1 | | --------------- | --------- | --------- | ----------- | ----------- | ----- | diff --git a/models/nlp/plm/bert_large_squad/igie/README.md b/models/nlp/plm/bert_large_squad/igie/README.md index 7302c3f0a0aaa637eb765f4e4dc7223a0ea5e926..aba63da27faf0e51f5315239833e2b136067f56e 100644 --- a/models/nlp/plm/bert_large_squad/igie/README.md +++ b/models/nlp/plm/bert_large_squad/igie/README.md @@ -1,27 +1,26 @@ -# BERT Large SQuAD +# BERT Large SQuAD (IGIE) -## Description +## Model Description BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. -## Setup +## Model Preparation -### Install - -```bash -pip3 install -r requirements.txt -``` - -### Download +### Prepare Resources Pretrained model: Dataset: -### Model Conversion +### Install Dependencies ```bash +pip3 install -r requirements.txt +``` + +### Model Conversion +```bash # Get FP16 Onnx Model python3 export.py --output bert-large-uncased-squad-v1.onnx @@ -40,11 +39,10 @@ bash run_qat.sh # model: quant_bert_large/pytorch_model.bin or quant_bert_large/model.safetensors python3 export_hdf5.py --model quant_bert_large/pytorch_model.bin --model_name large -cd .. 
- +cd ../ ``` -## Inference +## Model Inference ```bash export DATASETS_DIR=/Path/to/SQuAD/ @@ -68,9 +66,9 @@ bash scripts/infer_bert_large_squad_int8_accuracy.sh bash scripts/infer_bert_large_squad_int8_performance.sh ``` -## Results +## Model Results -Model |BatchSize |SeqLength |Precision |FPS | F1 Score ------------------|-----------|----------|----------|----------|-------- -Bertlarge(Squad) | 8 | 256 | FP16 | 302.273 | 91.102 -Bertlarge(Squad) | 8 | 256 | INT8 | 723.169 | 89.899 +| Model | BatchSize | SeqLength | Precision | FPS | F1 Score | +|------------------|-----------|-----------|-----------|---------|----------| +| BERT Large SQuAD | 8 | 256 | FP16 | 302.273 | 91.102 | +| BERT Large SQuAD | 8 | 256 | INT8 | 723.169 | 89.899 | diff --git a/models/nlp/plm/bert_large_squad/ixrt/README.md b/models/nlp/plm/bert_large_squad/ixrt/README.md index a8bcf5b7957f62b71142ea6e0527f5bfacc5ce59..47d9c62b4e79125b2a13c2d858534919776cddac 100644 --- a/models/nlp/plm/bert_large_squad/ixrt/README.md +++ b/models/nlp/plm/bert_large_squad/ixrt/README.md @@ -1,53 +1,55 @@ -# BERT Large SQuAD +# BERT Large SQuAD (IxRT) -## Description +## Model Description BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. -## Setup +## Model Preparation + +### Prepare Resources Get `bert-large-uncased.zip` from [Google Drive](https://drive.google.com/file/d/1eD8QBkbK6YN-_YXODp3tmpp3cZKlrPTA/view?usp=drive_link) -### NV requirement(tensorrt_version >= 8.6) - ```bash -docker pull nvcr.io/nvidia/tensorrt:23.04-py3 +cd python/ +bash script/prepare.sh v1_1 ``` -## Install +### Install Dependencies -```bash -pip3 install -r requirements.txt -``` - -### On Iluvatar +#### Install on Iluvatar ```bash cmake -S . -B build cmake --build build -j16 ``` -### On NV +#### Install on NV + +Require tensorrt_version >= 8.6 ```bash -cmake -S . -B build -DUSE_TENSORRT=true -cmake --build build -j16 +# Get TensorRT docker image +docker pull nvcr.io/nvidia/tensorrt:23.04-py3 +# Run TensorRT docker ``` -## Download - ```bash -cd python -bash script/prepare.sh v1_1 +# Install requirements.txt in TensorRT docker +pip3 install -r requirements.txt + +# Build +cmake -S . 
-B build -DUSE_TENSORRT=true +cmake --build build -j16 ``` -## Inference +## Model Inference ### FP16 ```bash -cd python +cd python/ # use --bs to set max_batch_size (dynamic) bash script/build_engine.sh --bs 32 @@ -62,10 +64,11 @@ pip install onnx pycuda bash script/build_engine.sh --bs 32 --int8 bash script/inference_squad.sh --bs 32 --int8 ``` -| Model | BatchSize | Precision | Latency QPS | exact_match | f1 | -|------------------|-----------|-----------|---------------------|-------------|-------| -| BERT-Large-SQuAD | 32 | FP16 | 470.26 sentences/s | 82.36 | 89.68 | -| BERT-Large-SQuAD | 32 | INT8 | 1490.47 sentences/s | 80.92 | 88.20 | -|------------------|-----------|-----------|---------------------|-------------|-------| -| BERT-Large-SQuAD | 32 | FP16 | 470.26 sentences/s | 82.36 | 89.68 | -| BERT-Large-SQuAD | 32 | INT8 | 1490.47 sentences/s | 80.92 | 88.20 | + +| Model | BatchSize | Precision | Latency QPS | exact_match | f1 | +|--------------------|-------------|-------------|-----------------------|---------------|---------| +| BERT-Large-SQuAD | 32 | FP16 | 470.26 sentences/s | 82.36 | 89.68 | +| BERT-Large-SQuAD | 32 | INT8 | 1490.47 sentences/s | 80.92 | 88.20 | +| ------------------ | ----------- | ----------- | --------------------- | ------------- | ------- | +| BERT-Large-SQuAD | 32 | FP16 | 470.26 sentences/s | 82.36 | 89.68 | +| BERT-Large-SQuAD | 32 | INT8 | 1490.47 sentences/s | 80.92 | 88.20 | diff --git a/models/nlp/plm/deberta/ixrt/README.md b/models/nlp/plm/deberta/ixrt/README.md index 3026d51da214bccf5be647cf60a655e2ddf3142e..e2b41d2bf643d48fbcf993bdeb9f728a842c2142 100644 --- a/models/nlp/plm/deberta/ixrt/README.md +++ b/models/nlp/plm/deberta/ixrt/README.md @@ -1,6 +1,6 @@ -# DeBERTa +# DeBERTa (IxRT) -## Description +## Model Description DeBERTa (Decoding-enhanced BERT with disentangled attention) is an enhanced version of the BERT (Bidirectional Encoder Representations from Transformers) model. It improves text representation learning by introducing disentangled attention @@ -9,9 +9,19 @@ self-attention matrix into different parts, focusing on different semantic infor capture relationships between texts.By incorporating decoding enhancement techniques, DeBERTa adjusts the decoder during fine-tuning to better suit specific downstream tasks, thereby improving the model’s performance on those tasks. -## Setup +## Model Preparation -### Install +### Prepare Resources + +Pretrained model: < > + +Dataset: < > to download the squad dataset. + +```bash +bash ./scripts/prepare_model_and_dataset.sh +``` + +### Install Dependencies ```bash export PROJ_ROOT=/PATH/TO/DEEPSPARKINFERENCE @@ -23,16 +33,6 @@ apt install -y libnuma-dev pip3 install -r requirements.txt ``` -### Download - -Pretrained model: < > - -Dataset: < > to download the squad dataset. 
-
-```bash
-bash ./scripts/prepare_model_and_dataset.sh
-```
-
### Model Conversion

```bash
@@ -43,7 +43,7 @@ python3 remove_clip_and_cast.py
```

-## Inference
+## Model Inference

```bash
git clone https://gitee.com/deep-spark/iluvatar-corex-ixrt.git --depth=1
@@ -96,7 +96,7 @@ sed -i 's/tensorrt_legacy/tensorrt/g' backends/ILUVATAR/common.py
python3 core/perf_engine.py --hardware_type ILUVATAR --task deberta-torch-fp32
```

-## Results
+## Model Results

| Model   | BatchSize | Precision | QPS   | Exact Match | F1 Score |
|---------|-----------|-----------|-------|-------------|----------|
diff --git a/models/nlp/plm/roberta/ixrt/README.md b/models/nlp/plm/roberta/ixrt/README.md
index 4e45bfa80e0191dea245aa6382201ce09157ae96..2f9455c916a797c2795f833c37e7b907859e4e02 100644
--- a/models/nlp/plm/roberta/ixrt/README.md
+++ b/models/nlp/plm/roberta/ixrt/README.md
@@ -1,6 +1,6 @@
-# RoBERTa
+# RoBERTa (IxRT)

-## Description
+## Model Description

Language model pretraining has led to significant performance gains but careful comparison between different approaches
is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we
@@ -11,9 +11,15 @@ it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. Th
previously overlooked design choices, and raise questions about the source of recently reported improvements. We release
our models and code.

-## Setup
+## Model Preparation

-### Install
+### Prepare Resources
+
+Pretrained model: 
+
+Dataset: 
+
+### Install Dependencies

```bash
export PROJ_ROOT=/PATH/TO/DEEPSPARKINFERENCE
@@ -23,11 +29,7 @@ cd ${MODEL_PATH}
pip3 install -r requirements.txt
```

-### Download
-
-Pretrained model: 
-
-Dataset: 
+### Model Conversion

```bash
# Go to path of this model
@@ -48,7 +50,7 @@ python3 export_onnx.py --model_path open_roberta/roberta-base-squad.pt --output_
onnxsim open_roberta/roberta-torch-fp32.onnx open_roberta/roberta-torch-fp32_sim.onnx
```

-## Inference
+## Model Inference

```bash
git clone https://gitee.com/deep-spark/iluvatar-corex-ixrt.git --depth=1
@@ -103,7 +105,7 @@ wget -O workloads/roberta-torch-fp32.json https://raw.githubusercontent.com/byte
python3 core/perf_engine.py --hardware_type ILUVATAR --task roberta-torch-fp32
```

-## Results
+## Model Results

| Model   | BatchSize | Precision | FPS    | F1       | Exact Match |
|---------|-----------|-----------|--------|----------|-------------|
diff --git a/models/nlp/plm/roformer/ixrt/README.md b/models/nlp/plm/roformer/ixrt/README.md
index 6d125955c49a925e3bbf73c09406b5c24a0436c9..627eb6397115d455f8feef09b0865db7f5f9a100 100644
--- a/models/nlp/plm/roformer/ixrt/README.md
+++ b/models/nlp/plm/roformer/ixrt/README.md
@@ -1,6 +1,6 @@
-# RoFormer
+# RoFormer (IxRT)

-## Description
+## Model Description

Position encoding recently has shown effective in the transformer architecture. It enables valuable supervision for
dependency modeling between elements at different positions of the sequence. In this paper, we first investigate various
@@ -13,18 +13,9 @@ the capability of equipping the linear self-attention with relative position enc
transformer with rotary position embedding, also called RoFormer, on various long text classification benchmark datasets.
-## Setup
+## Model Preparation

-### Install
-
-```bash
-apt install -y libnuma-dev
-
-pip3 install -r requirements.txt
-
-```
-
-### Download
+### Prepare Resources

Pretrained model: 

@@ -45,7 +36,16 @@ rm -f open_roformer.tar
popd
```

-### Deal with ONNX
+### Install Dependencies
+
+```bash
+apt install -y libnuma-dev
+
+pip3 install -r requirements.txt
+```
+
+### Model Conversion

```bash
# export onnx
@@ -56,7 +56,7 @@ onnxsim ./data/open_roformer/roformer-frozen_org.onnx ./data/open_roformer/rofor
python3 deploy.py --model_path ./data/open_roformer/roformer-frozen.onnx --output_path ./data/open_roformer/roformer-frozen.onnx
```

-## Inference
+## Model Inference

```bash
git clone https://gitee.com/deep-spark/iluvatar-corex-ixrt.git --depth=1
@@ -110,7 +110,7 @@ sed -i 's/segment:0/segment0/g; s/token:0/token0/g' model_zoo/roformer-tf-fp32.j
python3 core/perf_engine.py --hardware_type ILUVATAR --task roformer-tf-fp32
```

-## Results
+## Model Results

| Model    | BatchSize | Precision | FPS     | ACC     |
|----------|-----------|-----------|---------|---------|
diff --git a/models/nlp/plm/videobert/ixrt/README.md b/models/nlp/plm/videobert/ixrt/README.md
index cb3683845a1f676f2bb1b5882b495151413180a0..dde6c249627aeda93b8e328875f42169ba8c3440 100644
--- a/models/nlp/plm/videobert/ixrt/README.md
+++ b/models/nlp/plm/videobert/ixrt/README.md
@@ -1,22 +1,14 @@
-# VideoBERT
+# VideoBERT (IxRT)

-## Description
+## Model Description

VideoBERT is a model designed for video understanding tasks, extending the capabilities of BERT (Bidirectional Encoder
Representations from Transformers) to video data. It enhances video representation learning by integrating both visual
and textual information into a unified framework.

-## Setup
+## Model Preparation

-### Install
-
-```bash
-apt install -y libnuma-dev
-
-pip3 install -r requirements.txt
-```
-
-### Download
+### Prepare Resources

Pretrained model: 

@@ -31,7 +23,15 @@ cd ${MODEL_PATH}
bash ./scripts/prepare_model_and_dataset.sh
```

-## Inference
+### Install Dependencies
+
+```bash
+apt install -y libnuma-dev
+
+pip3 install -r requirements.txt
+```
+
+## Model Inference

```bash
git clone https://gitee.com/deep-spark/iluvatar-corex-ixrt.git --depth=1
@@ -72,7 +72,7 @@ wget -O workloads/videobert-onnx-fp32.json https://raw.githubusercontent.com/byt
python3 core/perf_engine.py --hardware_type ILUVATAR --task videobert-onnx-fp32
```

-## Results
+## Model Results

| Model     | BatchSize | Precision | QPS   | Top-1 ACC |
|-----------|-----------|-----------|-------|-----------|
diff --git a/models/others/recommendation/wide_and_deep/ixrt/README.md b/models/others/recommendation/wide_and_deep/ixrt/README.md
index c6653cabd3e38107513cb66ac01ca7e7120b05e7..62c39ed0427b2087a354aefc4cead33effdeda3c 100644
--- a/models/others/recommendation/wide_and_deep/ixrt/README.md
+++ b/models/others/recommendation/wide_and_deep/ixrt/README.md
@@ -1,12 +1,18 @@
-# Wide&Deep
+# Wide & Deep (IxRT)

-## Description
+## Model Description

Generalized linear models with nonlinear feature transformations are widely used for large-scale regression and
classification problems with sparse inputs. Memorization of feature interactions through a wide set of cross-product
feature transformations are effective and interpretable, while generalization requires more feature engineering effort.
With less feature engineering, deep neural networks can generalize better to unseen feature combinations through
low-dimensional dense embeddings learned for the sparse features.
However, deep neural networks with embeddings can over-generalize and recommend less relevant items when the user-item
interactions are sparse and high-rank. In this paper, we present Wide & Deep learning---jointly trained wide linear
models and deep neural networks---to combine the benefits of memorization and generalization for recommender systems.
We productionized and evaluated the system on Google Play, a commercial mobile app store with over one billion active
users and over one million apps. Online experiment results show that Wide & Deep significantly increased app
acquisitions compared with wide-only and deep-only models. We have also open-sourced our implementation in TensorFlow.

-## Setup
+## Model Preparation

-### Install
+### Prepare Resources
+
+Pretrained model: 
+
+Dataset: 
+
+### Install Dependencies

```bash
apt install -y libnuma-dev
@@ -14,11 +20,7 @@ apt install -y libnuma-dev

pip3 install -r requirements.txt
```

-### Download
-
-Pretrained model: 
-
-Dataset: 
+### Model Conversion

```bash
# Go to path of this model
@@ -35,7 +37,7 @@ python3 deploy.py --model_path open_wide_deep_saved_model/widedeep_sim.onnx --ou
python3 change2dynamic.py --model_path open_wide_deep_saved_model/widedeep_sim.onnx --output_path open_wide_deep_saved_model/widedeep_sim.onnx
```

-## Inference
+## Model Inference

```bash
export ORIGIN_ONNX_NAME=./open_wide_deep_saved_model/widedeep_sim
@@ -80,8 +82,8 @@ wget -O workloads/widedeep-tf-fp32.json https://raw.githubusercontent.com/byteda
python3 core/perf_engine.py --hardware_type ILUVATAR --task widedeep-tf-fp32
```

-## Results
+## Model Results

-| Model     | BatchSize | Precision | FPS      | ACC     |
-| --------- | --------- | --------- | -------- | ------- |
-| Wide&Deep | 1024      | FP16      | 77073.93 | 0.74597 |
+| Model       | BatchSize | Precision | FPS      | ACC     |
+|-------------|-----------|-----------|----------|---------|
+| Wide & Deep | 1024      | FP16      | 77073.93 | 0.74597 |
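
The Wide & Deep conversion commands above run `change2dynamic.py`, which (judging by its name and usage) rewrites the exported ONNX graph with dynamic shapes. Below is a minimal, optional sketch of how one might verify that conversion before benchmarking; it is not part of the published pipeline, and it assumes the `onnx` Python package is installed (for example via `requirements.txt`) and that the file sits at `open_wide_deep_saved_model/widedeep_sim.onnx` as in the commands above.

```bash
# Hypothetical sanity check; adjust the path if your model lives elsewhere.
python3 -c "
import onnx

model = onnx.load('open_wide_deep_saved_model/widedeep_sim.onnx')
onnx.checker.check_model(model)  # raises if the graph is structurally invalid

# Print each graph input with its dims; dynamic axes show up as symbolic
# names (dim_param) instead of fixed integers (dim_value).
for inp in model.graph.input:
    dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
    print(inp.name, dims)
"
```

Symbolic names in the printed shapes indicate axes that were made dynamic; plain integers mean the corresponding dimension is still static.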