# IRASim
## Installation
To set up the environment, run the following command:
```bash
bash scripts/install.sh
```
## Dataset
To download the complete dataset, run:
```bash
bash scripts/download.sh
```
The table below lists the download links and file sizes for the RT-1, Bridge, and Language-Table datasets, split into training data, evaluation data, and checkpoints.
| Category | Train | Size | Evaluation | Size | Checkpoints | Size |
|:----------------|:---------------------------------------------------------------------------------------------------|:------|:----------------------------------------------------------------------|:------|:-----------------------------------------------------------------------|:------|
| **RT-1** | [rt1_train_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/rt1_train_data.tar.gz) | 86G | [rt1_evaluation_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/rt1_evaluation_data.tar.gz) | 100G | [rt1_checkpoints_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/rt1_checkpoints_data.tar.gz) | 29G |
| **Bridge** | [bridge_train_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/bridge_train_data.tar.gz) | 31G | [bridge_evaluation_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/bridge_evaluation_data.tar.gz) | 63G | [bridge_checkpoints_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/bridge_checkpoints_data.tar.gz) | 32G |
| **Language-Table** | [languagetable_train_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/languagetable_train_data.tar.gz) | 200G | [languagetable_evaluation_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/languagetable_evaluation_data.tar.gz) | 194G | [languagetable_checkpoints_data.tar.gz](https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/languagetable_checkpoints_data.tar.gz) | 34G |
The complete dataset structure can be found in [dataset_structure.txt](https://github.com/bytedance/IRASim/blob/main/dataset_structure.txt).
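If you only need one split, the archives in the table above can be fetched and unpacked individually. A minimal sketch using the RT-1 training archive (the other URLs work the same way; the extraction target is simply the directory you run the command from):
```bash
# Download a single archive from the table above and unpack it in place.
wget https://lf-robot-opensource.bytetos.com/obj/lab-robot-public/opensource_IRASim_v1/rt1_train_data.tar.gz
tar -xzf rt1_train_data.tar.gz
```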
### 📢 Update (May 20, 2025)
We are excited to announce that the **IRASim dataset** is now available on **Hugging Face**:
🔗 [https://huggingface.co/datasets/fangqi/IRASim](https://huggingface.co/datasets/fangqi/IRASim)
To reconstruct the full dataset locally (a command sketch follows this list):
1. Download all dataset parts from the Hugging Face page.
2. Use the provided `merge.sh` script to merge the downloaded parts into **multiple ZIP archives**.
3. Extract each ZIP archive separately to obtain the complete dataset.
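A minimal sketch of these three steps, assuming `huggingface-cli` is installed and that `merge.sh` writes its ZIP archives into the working directory (both are assumptions, not verified against the dataset repo):
```bash
# 1. Download all dataset parts from the Hugging Face page.
huggingface-cli download fangqi/IRASim --repo-type dataset --local-dir IRASim_parts

# 2. Merge the parts into multiple ZIP archives using the provided script.
cd IRASim_parts
bash merge.sh

# 3. Extract each ZIP archive separately.
for f in *.zip; do unzip "$f"; done
```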
## Language Table Application
We recommend starting with the Language Table application. It provides a user-friendly keyboard interface for controlling the robotic arm on a 2D plane, starting from an initial image:
```bash
python3 application/languagetable.py
```
## Training
Below are example scripts for training the IRASim-Frame-Ada model on the RT-1 dataset.
To accelerate training, we recommend first encoding the videos into latent videos. The code also supports training directly on raw videos by setting `pre_encode` to `false`.
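For example, to switch to direct training on raw videos, flip the flag in the config. This is a hypothetical sketch: it assumes `pre_encode` appears as a top-level `pre_encode: true` line in the YAML file, which we have not verified.
```bash
# Set pre_encode to false in the training config
# (assumes a top-level `pre_encode: true` line in the YAML; hypothetical layout).
sed -i 's/^pre_encode: true/pre_encode: false/' configs/train/rt1/frame_ada.yaml
```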
### Single GPU Training
```bash
python3 main.py --config configs/train/rt1/frame_ada.yaml
```
### Multi-GPU Training on a Single Machine
```bash
torchrun --nproc_per_node 8 --nnodes 1 --node_rank 0 --rdzv_endpoint {node_address}:{port} --rdzv_id 107 --rdzv_backend c10d main.py --config configs/train/rt1/frame_ada.yaml
```
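For a single machine, the rendezvous endpoint can simply point at the local host. A concrete example; the port `29500` here is an arbitrary free port, not a value mandated by the repo:
```bash
torchrun --nproc_per_node 8 --nnodes 1 --node_rank 0 --rdzv_endpoint localhost:29500 --rdzv_id 107 --rdzv_backend c10d main.py --config configs/train/rt1/frame_ada.yaml
```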
## Evaluation
Below are example scripts for evaluating the IRASim-Frame-Ada model on the RT-1 dataset.
### Short Trajectory Setting
To quantitatively evaluate the model in the short-trajectory setting, first generate all evaluation videos:
```bash
torchrun --nproc_per_node 8 --nnodes 1 --node_rank 0 --rdzv_endpoint {node_address}:{port} --rdzv_id 107 --rdzv_backend c10d main.py --config configs/evaluation/rt1/frame_ada.yaml
```
We provide an automated script to calculate the metrics of the generated short videos:
```bash
python3 evaluate/evaluation_short_script.py
```
### Long Trajectory Setting
Long videos are generated in an autoregressive manner. First, generate the launch scripts for producing long videos in a multi-process manner:
```bash
python3 scripts/generate_command.py
```
Then run the generated script:
```bash
bash scripts/generate_long_video_rt1_frame_ada.sh
```
Finally, use the automated script to calculate the metrics of the generated long videos:
```bash
python3 evaluate/evaluation_long_script.py
```
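Putting the long-trajectory steps together, the whole pipeline is three commands, all taken from above (the script name corresponds to the RT-1 Frame-Ada configuration):
```bash
# Generate the multi-process launch scripts, run the RT-1 Frame-Ada one,
# then compute metrics over the generated long videos.
python3 scripts/generate_command.py
bash scripts/generate_long_video_rt1_frame_ada.sh
python3 evaluate/evaluation_long_script.py
```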
## Citation
If you find this code useful in your work, please consider citing:
```bibtex
@article{FangqiIRASim2024,
  title={IRASim: Learning Interactive Real-Robot Action Simulators},
  author={Fangqi Zhu and Hongtao Wu and Song Guo and Yuxiao Liu and Chilam Cheang and Tao Kong},
  journal={arXiv:2406.12802},
  year={2024}
}
```
## Acknowledgement
* Our implementation is largely adapted from [Latte](https://github.com/Vchitect/Latte).
* Our FVD implementation is adapted from [stylegan-v](https://github.com/universome/stylegan-v).
* Our FID implementation is adapted from [pytorch-fid](https://github.com/mseitzer/pytorch-fid).
* Our RT-1, Bridge, and Language-Table datasets are adapted from [RT-1](https://robotics-transformer1.github.io/), [Bridge](https://rail-berkeley.github.io/bridgedata/), and [open_x_embodiment](https://github.com/google-deepmind/open_x_embodiment).
## Discussion Group
If you have any questions during trial, running, or deployment, or any ideas and suggestions for the project, feel free to join our WeChat group discussion!
