# Vision-Transformer-ViT

**Repository Path**: aloha-qing/Vision-Transformer-ViT

## Basic Information

- **Project Name**: Vision-Transformer-ViT
- **Description**: vision-transformer on cifar10
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 2
- **Created**: 2023-03-16
- **Last Updated**: 2024-05-10

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Vit-ImageClassification

## Introduction

This project uses ViT to perform image classification on the CIFAR-10 dataset. The ViT implementation and pretrained weights are from https://github.com/asyml/vision-transformer-pytorch.

![The architecture of ViT](pic/VIT.png)

## Installation

- PyTorch 1.7.1
- Python 3.7.3

## Datasets

Download CIFAR-10 from http://www.cs.toronto.edu/~kriz/cifar.html, or get it from https://pan.baidu.com/s/1ogAFopdVzswge2Aaru_lvw (code: k5v8). Create a `data` folder and unzip `cifar-10-python.tar.gz` under `./data`.

## Pre-trained model

Download the pretrained weights from https://pan.baidu.com/s/1CuUj-XIXwecxWMEcLoJzPg (code: ox9n). Create a `Vit_weights` folder and place the pretrained file under `./Vit_weights`.

## Train

```
python main.py
```

## Result

Starting from the pretrained weights, one epoch of fine-tuning reaches 98.1% accuracy.

| Model    | Dataset | Accuracy (%) |
| -------- | ------- | ------------ |
| ViT-B_16 | CIFAR10 | 98.1         |
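
## Example (sketch)

For reference, below is a minimal fine-tuning sketch of the general recipe (resize CIFAR-10 to the ViT input resolution, load pretrained weights, fine-tune for one epoch). It is not this repository's `main.py`: it swaps in `timm`'s `vit_base_patch16_224` as a stand-in for the vendored vision-transformer-pytorch model, and the batch size, learning rate, and input resolution are illustrative assumptions rather than the repo's actual settings.

```python
# Minimal fine-tuning sketch for ViT-B/16 on CIFAR-10 (not this repo's main.py).
# Uses timm's vit_base_patch16_224 as a stand-in for the vendored ViT implementation;
# the hyperparameters below are illustrative, not the repository's actual settings.
import timm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # ViT-B/16 pretrained weights expect 224x224 inputs, so resize CIFAR-10's 32x32 images.
    tfm = transforms.Compose([
        transforms.Resize(224),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])
    train_set = datasets.CIFAR10("./data", train=True, transform=tfm, download=False)
    loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

    # Stand-in model; the repository instead loads its checkpoint from ./Vit_weights.
    model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=10)
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)

    model.train()
    for images, labels in loader:  # a single epoch already suffices per the table above
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    main()
```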