# UnifoLM-WMA-0: A World-Model-Action (WMA) Framework under UnifoLM Family
🌎English | 🇨🇳中文

*(Demo videos omitted.)* **Note: the small window in the upper-right corner shows the world model's prediction of the upcoming action video.**
## 📑 Open-Source Roadmap
- [x] Training code
- [x] Inference code
- [x] Model checkpoints
- [ ] Real-robot deployment code
## ⚙️ Installation
```shell
conda create -n unifolm-wma python==3.10.18
conda activate unifolm-wma
conda install pinocchio=3.2.0 -c conda-forge -y
conda install ffmpeg=7.1.1 -c conda-forge
git clone --recurse-submodules https://github.com/unitreerobotics/unifolm-world-model-action.git
# If you already downloaded the repo:
cd unifolm-world-model-action
git submodule update --init --recursive
pip install -e .
cd external/dlimp
pip install -e .
```
## 🧰 Model Checkpoints
| Model | Description | Link |
|---------|-------|------|
|$\text{UnifoLM-WMA-0}_{Base}$| Model fine-tuned on the [Open-X](https://robotics-transformer-x.github.io/) dataset | [HuggingFace](https://huggingface.co/unitreerobotics/UnifoLM-WMA-0)|
|$\text{UnifoLM-WMA-0}_{Dual}$| Model jointly fine-tuned, in both decision-making and simulation modes, on five [Unitree open-source datasets](https://huggingface.co/collections/unitreerobotics/g1-dex1-datasets-68bae98bf0a26d617f9983ab) | [HuggingFace](https://huggingface.co/unitreerobotics/UnifoLM-WMA-0)|
## 🛢️ Datasets
In our experiments, we trained and evaluated on the following five open-source datasets:
| Dataset | Robot | Link |
|---------|-------|------|
|Z1_StackBox| [Unitree Z1](https://www.unitree.com/z1)|[Huggingface](https://huggingface.co/datasets/unitreerobotics/Z1_StackBox_Dataset)|
|Z1_DualArm_StackBox|[Unitree Z1](https://www.unitree.com/z1)|[Huggingface](https://huggingface.co/datasets/unitreerobotics/Z1_DualArmStackBox_Dataset)|
|Z1_DualArm_StackBox_V2|[Unitree Z1](https://www.unitree.com/z1)|[Huggingface](https://huggingface.co/datasets/unitreerobotics/Z1_DualArm_StackBox_Dataset_V2)|
|Z1_DualArm_Cleanup_Pencils|[Unitree Z1](https://www.unitree.com/z1)|[Huggingface](https://huggingface.co/datasets/unitreerobotics/Z1_DualArm_CleanupPencils_Dataset)|
|G1_Pack_Camera|[Unitree G1](https://www.unitree.com/g1)|[Huggingface](https://huggingface.co/datasets/unitreerobotics/G1_MountCameraRedGripper_Dataset)|
To train on a custom dataset, first make sure the data follows the [Huggingface LeRobot](https://github.com/huggingface/lerobot) dataset format. Assume the downloaded data directory is structured as follows:
```
source_dir/
├── dataset1_name
├── dataset2_name
├── dataset3_name
└── ...
```
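With this layout, the names to pass as `--dataset_name` in the conversion step are simply the subdirectory names of `source_dir`. A minimal sketch for enumerating them (the helper name is hypothetical):

```python
from pathlib import Path

def list_dataset_names(source_dir: str) -> list[str]:
    """Return the dataset subdirectory names under source_dir, sorted.

    Each returned name can be passed as --dataset_name to the
    conversion script, one dataset at a time.
    """
    return sorted(p.name for p in Path(source_dir).iterdir() if p.is_dir())
```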
Then run the following command to convert the format:
```shell
cd prepare_data
python prepare_training_data.py \
--source_dir /path/to/your/source_dir \
--target_dir /path/to/save/the/converted/data/directory \
--dataset_name "dataset1_name" \
    --robot_name "a tag of the robot in the dataset" # e.g., Unitree Z1 Robot Arm or Unitree G1 Robot with Gripper.
```
The converted data is structured as follows (note: model training only supports main-camera input; if the data contains a wrist-camera view, delete the corresponding video paths from the `data_dir` column of the CSV file):
```
target_dir/
├── videos
│ ├──dataset1_name
│ │ ├──camera_view_dir
│ │ ├── 0.mp4
│ │ ├── 1.mp4
│ │ └── ...
│ └── ...
├── transitions
│ ├── dataset1_name
│ │ ├── meta_data
│ │ ├── 0.h5
│ │ ├── 1.h5
│ │ └── ...
│ └── ...
└── dataset1_name.csv
```
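Since training only supports the main-camera view, any wrist-view entries must be removed from the `data_dir` column of the converted CSV. A minimal sketch of that cleanup, assuming wrist-view video paths can be identified by a substring such as `wrist` (the function name and the substring are assumptions; adjust them to your dataset's naming):

```python
import csv

def drop_wrist_view_rows(csv_path: str, out_path: str,
                         wrist_tag: str = "wrist") -> int:
    """Copy the converted CSV, keeping only rows whose data_dir path does
    not reference a wrist-view video. Returns the number of rows kept."""
    kept = 0
    with open(csv_path, newline="") as fin, open(out_path, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            # "wrist" in the path marks a wrist-camera video (assumed convention)
            if wrist_tag not in row["data_dir"].lower():
                writer.writerow(row)
                kept += 1
    return kept
```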
## 🚴‍♂️ Model Training
1. Our training strategy can be summarized as follows:
- **Step 1**: Fine-tune a video generation model on the [Open-X](https://robotics-transformer-x.github.io/) dataset so that it serves as a world model;
- **Step 2**: Post-train $\text{UnifoLM-WMA}$ in decision-making mode on the downstream task datasets;