# long_seq_mae

**Repository Path**: facebookresearch/long_seq_mae

## Exploring Long-Sequence Masked Autoencoders

This is the code release for the paper [Exploring Long-Sequence Masked Autoencoders](https://arxiv.org/abs/2210.07224):

```
@Article{hu2022exploring,
  author  = {Ronghang Hu and Shoubhik Debnath and Saining Xie and Xinlei Chen},
  journal = {arXiv:2210.07224},
  title   = {Exploring Long-Sequence Masked Autoencoders},
  year    = {2022},
}
```

* This repo is a modification of the [MAE repo](https://github.com/facebookresearch/mae) and supports long-sequence pretraining on both GPUs and TPUs using PyTorch.
* This repo is based on [`timm==0.4.12`](https://github.com/rwightman/pytorch-image-models), which can be installed via `pip3 install timm==0.4.12`.

### Fine-tuning with pre-trained checkpoints

The following table provides the pre-trained checkpoints used in the paper:


| Model (pretrained w/ L=784, image size 448, patch size 16) | ViT-Base | ViT-Large |
| --- | --- | --- |
| COCO (train2017 + unlabeled2017) 4000-epoch | download | download |
| ImageNet-1k 800-epoch | download | download |
| ImageNet-1k 1600-epoch | download | download |

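
The `L=784` in the table header appears to be the number of patch tokens per image: a 448×448 input split into non-overlapping 16×16 patches gives a 28×28 grid. A minimal sketch of that arithmetic follows. `num_patches` is a hypothetical helper written for illustration and is not a function from this repo.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Count non-overlapping square patches in a square image.

    Hypothetical helper for illustration only; not part of this codebase.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size  # patches along one edge
    return per_side * per_side           # total patches = sequence length L


# Settings from the table header: image size 448, patch size 16.
long_seq_len = num_patches(448, 16)
print(long_seq_len)
```

The same formula gives 196 patches for a 224×224 input at patch size 16, so the checkpoints above were pretrained on sequences four times longer than that.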
### Using the codebase

* Follow [`PRETRAIN_LONG_SEQ_TPU.md`](PRETRAIN_LONG_SEQ_TPU.md) for long-sequence pretraining on Google Cloud TPUs (which we used for our experiments).
* Follow [`PRETRAIN_LONG_SEQ_GPU.md`](PRETRAIN_LONG_SEQ_GPU.md) for long-sequence pretraining on NVIDIA GPUs.
* Follow [`FINETUNE_DETECTION.md`](FINETUNE_DETECTION.md) to fine-tune on the object detection task using the ViTDet codebase from Detectron2.

This codebase is also compatible with the features of the original MAE repo. Follow [`README_MAE.md`](README_MAE.md) to use them (for example, fine-tuning on image classification).

### License

This project is released under the CC-BY-NC 4.0 license. See [LICENSE](LICENSE) for details.