# KeypointNeRF
Marko Mihajlovic · Aayush Bansal · Michael Zollhoefer · Siyu Tang · Shunsuke Saito
KeypointNeRF leverages human keypoints to instantly generate a volumetric radiance representation from 2-3 input images, without retraining or fine-tuning. It can represent human faces and full bodies.
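At its core is a relative spatial encoding: each 3D query point is described by its depth difference to every 3D keypoint in each input view, and these differences are Fourier-encoded. Below is a minimal PyTorch sketch of this idea; the function names are hypothetical and the actual implementation lives in this repository's source.

```python
# Minimal sketch of a relative spatial keypoint encoding.
# Hypothetical names; see the repository source for the real implementation.
import math
import torch

def fourier_encode(x, num_freqs=8):
    """NeRF-style positional encoding of a scalar tensor."""
    freqs = (2.0 ** torch.arange(num_freqs, device=x.device)) * math.pi
    x = x[..., None] * freqs                      # (..., num_freqs)
    return torch.cat([x.sin(), x.cos()], dim=-1)  # (..., 2 * num_freqs)

def relative_depth_encoding(query, keypoints, extrinsics):
    """
    query:      (N, 3) 3D query points in world coordinates
    keypoints:  (K, 3) 3D human keypoints in world coordinates
    extrinsics: (V, 3, 4) world-to-camera matrices, one per input view
    returns:    (N, V, K, 2 * num_freqs) relative spatial encoding
    """
    R, t = extrinsics[:, :, :3], extrinsics[:, :, 3]
    # Depth of a point = its z-coordinate in the camera frame.
    z_q = torch.einsum('vij,nj->vni', R, query)[..., 2] + t[:, 2:3]      # (V, N)
    z_k = torch.einsum('vij,kj->vki', R, keypoints)[..., 2] + t[:, 2:3]  # (V, K)
    rel = z_q[:, :, None] - z_k[:, None, :]                              # (V, N, K)
    return fourier_encode(rel).permute(1, 0, 2, 3)
```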
## News :new:
- [2022/10/01] We combined [ICON](https://github.com/YuliangXiu/ICON) with our relative spatial keypoint encoding for fast and convenient monocular reconstruction, without requiring the expensive SMPL feature.
More details are [here](#reconstruction-from-a-single-image).
## Installation
Please install the Python dependencies specified in `environment.yml`:
```bash
conda env create -f environment.yml
conda activate KeypointNeRF
```
## Data preparation
Please see [DATA_PREP.md](DATA_PREP.md) to set up the ZJU-MoCap dataset.
After this step, the data directory has the following structure:
```bash
./data/zju_mocap
├── CoreView_313
├── CoreView_315
├── CoreView_377
├── CoreView_386
├── CoreView_387
├── CoreView_390
├── CoreView_392
├── CoreView_393
├── CoreView_394
└── CoreView_396
```
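A quick way to verify the layout is to check that all ten subject directories exist. This is a convenience sketch, not part of the repository:

```python
# Convenience check that all ZJU-MoCap subjects listed above are in place.
from pathlib import Path

root = Path("./data/zju_mocap")
subjects = [313, 315, 377, 386, 387, 390, 392, 393, 394, 396]
missing = [s for s in subjects if not (root / f"CoreView_{s}").is_dir()]
print("all subjects found" if not missing else f"missing subjects: {missing}")
```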
## Train your own model on the ZJU dataset
Execute the `train.py` script to train the model on the ZJU dataset.
```bash
python train.py --config ./configs/zju.json --data_root ./data/zju_mocap
```
After training, the model checkpoint is stored under `./EXPERIMENTS/zju/ckpts/last.ckpt`; it is equivalent to the one provided [here](https://drive.google.com/file/d/1rsMb3DFFXaFw0iK7yoUmoDEaCW_XqfaN/view?usp=sharing).
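To double-check a checkpoint before evaluation, you can inspect its contents. This sketch assumes a PyTorch Lightning-style checkpoint with weights under a `state_dict` key, which the `.ckpt` extension suggests:

```python
# Inspect the trained checkpoint; assumes a PyTorch Lightning layout
# (weights stored under the 'state_dict' key).
import torch

ckpt = torch.load("./EXPERIMENTS/zju/ckpts/last.ckpt", map_location="cpu")
state = ckpt.get("state_dict", ckpt)  # fall back to a plain state dict
print(f"{len(state)} tensors; first key: {next(iter(state))}")
```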
## Evaluation
To render and evaluate images, execute:
```bash
python train.py --config ./configs/zju.json --data_root ./data/zju_mocap --run_val
python eval_zju.py --src_dir ./EXPERIMENTS/zju/images_v3
```
To visualize the dynamic results, execute:
```bash
python render_dynamic.py --config ./configs/zju.json --data_root ./data/zju_mocap --model_ckpt ./EXPERIMENTS/zju/ckpts/last.ckpt
```
(The first three views of an unseen subject are the input to KeypointNeRF; the last image is a rendered novel view)
We compare KeypointNeRF with recent state-of-the-art methods. The evaluation metrics are PSNR and SSIM.

| Model | PSNR ↑ | SSIM ↑ |
|---|---|---|
| pixelNeRF (Yu et al., CVPR'21) | 23.17 | 86.93 |
| PVA (Raj et al., CVPR'21) | 23.15 | 86.63 |
| NHP (Kwon et al., NeurIPS'21) | 24.75 | 90.58 |
| KeypointNeRF* (Mihajlovic et al., ECCV'22) | **25.86** | **91.07** |

(*Note that the results of KeypointNeRF are slightly higher than the numbers reported in the original paper because training views were not shuffled during training.)
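For reference, both metrics can be computed per image pair with scikit-image; `eval_zju.py` may differ in details such as foreground masking, and SSIM is scaled to a percentage here to match the table:

```python
# Sketch of the two evaluation metrics on one image pair (scikit-image).
# eval_zju.py may differ in details such as foreground masking.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def psnr_ssim(pred, gt):
    """pred, gt: (H, W, 3) float arrays in [0, 1]."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
    ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=1.0)
    return psnr, 100.0 * ssim  # SSIM as a percentage, as in the table above
```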
## Reconstruction from a Single Image
Our relative spatial encoding can be used to reconstruct humans from a single image. As an example, we leverage ICON and replace its expensive SDF feature with our relative spatial encoding.
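Conceptually, the swap replaces ICON's per-point SDF query against the fitted SMPL mesh with the keypoint encoding sketched in the introduction. A hypothetical illustration, not ICON's actual API:

```python
# Hypothetical illustration of the feature swap (not ICON's actual API).
# relative_depth_encoding is the sketch from the introduction above.
import torch

def point_feature(query, smpl_keypoints, extrinsics, pixel_feat):
    # The keypoint encoding replaces the expensive per-point SDF query
    # against the fitted SMPL mesh.
    rel = relative_depth_encoding(query, smpl_keypoints, extrinsics)  # (N, V, K, 2F)
    return torch.cat([pixel_feat, rel.flatten(1)], dim=-1)
```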