# Model Zoo for AI Model Efficiency Toolkit

We provide a collection of popular neural network models and compare their floating-point and quantized performance. The results demonstrate that quantized models can achieve accuracy comparable to floating-point models. Alongside the results, we provide scripts and artifacts for users to quantize floating-point models using the [AI Model Efficiency ToolKit (AIMET)](https://github.com/quic/aimet).

## Table of Contents

- [Introduction](#introduction)
- [PyTorch Models](#pytorch-models)
- [Tensorflow Models](#tensorflow-models)
- [Installation and Usage](#installation-and-usage)
- [Team](#team)
- [License](#license)

## Introduction

Quantized inference is significantly faster than floating-point inference and enables models to run in a power-efficient manner on mobile and edge devices. We use AIMET, a library that includes state-of-the-art techniques for quantization, to quantize various models available in the [PyTorch](https://pytorch.org) and [TensorFlow](https://tensorflow.org) frameworks.

Each original FP32 source model is quantized using either post-training quantization (PTQ) or quantization-aware training (QAT), as available in AIMET. An example evaluation script is provided for each model. When PTQ is needed, the evaluation script performs PTQ before evaluation; wherever QAT is used, the fine-tuned model checkpoint is also provided. A minimal sketch of the PTQ flow follows the results table below.

## Tensorflow Models
| Task | Network [1] | Model Source [2] | Floating-Point (FP32) Model [3] | Quantized Model [4] | TensorFlow Version | Metric [5] | FP32 | W8A8 [6] | W4A8 [7] |
|---|---|---|---|---|---|---|---|---|---|
| Image Classification | ResNet-50 (v1) | GitHub Repo | Pretrained Model | See Documentation | 1.15 | (ImageNet) Top-1 Accuracy | 75.21% | 74.96% | TBD |
| Image Classification | ResNet-50-tf2 | GitHub Repo | Pretrained Model | Quantized Model | 2.4 | (ImageNet) Top-1 Accuracy | 74.9% | 74.8% | TBD |
| Image Classification | MobileNet-v2-1.4 | GitHub Repo | Pretrained Model | Quantized Model | 1.15 | (ImageNet) Top-1 Accuracy | 75% | 74.21% | TBD |
| Image Classification | MobileNet-v2-tf2 | GitHub Repo | Pretrained Model | See Example | 2.4 | (ImageNet) Top-1 Accuracy | 71.6% | 71.0% | TBD |
| Image Classification | EfficientNet Lite | GitHub Repo | Pretrained Model | Quantized Model | 2.4 | (ImageNet) Top-1 Accuracy | 74.93% | 74.99% | TBD |
| Object Detection | SSD MobileNet-v2 | GitHub Repo | Pretrained Model | See Example | 1.15 | (COCO) Mean Avg. Precision (mAP) | 0.2469 | 0.2456 | TBD |
| Object Detection | RetinaNet | GitHub Repo | Pretrained Model | See Example | 1.15 | (COCO) mAP (Detailed Results) | 0.35 | 0.349 | TBD |
| Object Detection | MobileDet-EdgeTPU | GitHub Repo | Pretrained Model | See Example | 2.4 | (COCO) Mean Avg. Precision (mAP) | 0.281 | 0.279 | TBD |
| Pose Estimation | Pose Estimation | Based on Ref. | Based on Ref. | Quantized Model | 2.4 | (COCO) mAP | 0.383 | 0.379 | TBD |
| Pose Estimation | Pose Estimation | Based on Ref. | Based on Ref. | Quantized Model | 2.4 | (COCO) mAR | 0.452 | 0.446 | TBD |
| Super Resolution | SRGAN | GitHub Repo | Pretrained Model | See Example | 2.4 | (BSD100) PSNR / SSIM (Detailed Results) | 25.45 / 0.668 | 24.78 / 0.628 | 25.41 / 0.666 (INT8 weights / INT16 activations) |
| Semantic Segmentation | DeeplabV3plus_mbnv2 | GitHub Repo | Pretrained Model | See Example | 2.4 | (PascalVOC) mIOU | 72.28 | 71.71 | TBD |
| Semantic Segmentation | DeeplabV3plus_xception | GitHub Repo | Pretrained Model | See Example | 2.4 | (PascalVOC) mIOU | 87.71 | 87.21 | TBD |
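The PTQ flow referenced in the Introduction follows the same pattern for every model: wrap the FP32 model in a quantization simulator, calibrate it on representative data, evaluate, and export. Below is a minimal sketch using AIMET's PyTorch `QuantizationSimModel` API (the TensorFlow models in the table follow the analogous flow via AIMET's TensorFlow API). The torchvision model, random calibration tensors, and output path are illustrative placeholders, not the model zoo's actual evaluation scripts.

```python
import os

import torch
from torchvision import models

from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

# Load a pretrained FP32 model; torchvision's ResNet-50 stands in for
# any model-zoo network here.
model = models.resnet50(pretrained=True).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# W8A8: 8-bit weights (default_param_bw) and 8-bit activations
# (default_output_bw). W4A8 would use default_param_bw=4 instead.
sim = QuantizationSimModel(
    model,
    dummy_input=dummy_input,
    quant_scheme=QuantScheme.post_training_tf_enhanced,
    default_param_bw=8,
    default_output_bw=8,
)

# Calibration pass: run representative data through the simulated model
# so AIMET can compute quantization encodings (scale/offset) per tensor.
# Random tensors are placeholders; a real run would use validation images
# from the target dataset.
def pass_calibration_data(sim_model, _):
    with torch.no_grad():
        for _ in range(8):
            sim_model(torch.randn(4, 3, 224, 224))

sim.compute_encodings(forward_pass_callback=pass_calibration_data,
                      forward_pass_callback_args=None)

# sim.model can now be evaluated exactly like the FP32 model to produce
# quantized accuracy numbers; finally, export the model together with
# its encodings.
os.makedirs("./output", exist_ok=True)
sim.export(path="./output", filename_prefix="resnet50_w8a8",
           dummy_input=dummy_input)
```

The `W8A8` and `W4A8` columns in the table map directly onto the `default_param_bw` / `default_output_bw` arguments; for QAT models, the released fine-tuned checkpoint is loaded instead of running the calibration step above.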