# Vision-Transformer-ViT

**Repository Path**: aloha-qing/Vision-Transformer-ViT

## Basic Information

- **Project Name**: Vision-Transformer-ViT
- **Description**: vision-transformer on cifar10
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 2
- **Created**: 2023-03-16
- **Last Updated**: 2024-05-10

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Vit-ImageClassification

## Introduction

This project uses ViT to perform image classification on the CIFAR-10 dataset. The ViT implementation and pretrained weights are from https://github.com/asyml/vision-transformer-pytorch.

![The architecture of ViT](pic/VIT.png)

## Installation

- PyTorch 1.7.1
- Python 3.7.3

## Datasets

Download CIFAR-10 from http://www.cs.toronto.edu/~kriz/cifar.html, or get it from https://pan.baidu.com/s/1ogAFopdVzswge2Aaru_lvw (code: k5v8). Create a `data` folder and unzip `cifar-10-python.tar.gz` under `./data`.

## Pre-trained model

Download the pretrained weights from https://pan.baidu.com/s/1CuUj-XIXwecxWMEcLoJzPg (code: ox9n). Create a `Vit_weights` folder and place the pretrained file under `./Vit_weights`.

## Train

```
python main.py
```

## Result

Starting from the pretrained weights, one epoch of fine-tuning reaches 98.1% accuracy.

| Model    | Dataset | Accuracy (%) |
| -------- | ------- | ------------ |
| ViT-B_16 | CIFAR10 | 98.1         |
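
## Example (sketch)

For reference, below is a minimal fine-tuning sketch of the general recipe (resize CIFAR-10 to the ViT input resolution, load pretrained weights, fine-tune for one epoch). It is not this repository's `main.py`: it swaps in `timm`'s `vit_base_patch16_224` as a stand-in for the vendored vision-transformer-pytorch model, and the batch size, learning rate, and input resolution are illustrative assumptions rather than the repo's actual settings.

```python
# Minimal fine-tuning sketch for ViT-B/16 on CIFAR-10 (not this repo's main.py).
# Uses timm's vit_base_patch16_224 as a stand-in for the vendored ViT implementation;
# the hyperparameters below are illustrative, not the repository's actual settings.
import timm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # ViT-B/16 pretrained weights expect 224x224 inputs, so resize CIFAR-10's 32x32 images.
    tfm = transforms.Compose([
        transforms.Resize(224),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])
    train_set = datasets.CIFAR10("./data", train=True, transform=tfm, download=False)
    loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

    # Stand-in model; the repository instead loads its checkpoint from ./Vit_weights.
    model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=10)
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)

    model.train()
    for images, labels in loader:  # a single epoch already suffices per the table above
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    main()
```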