# lite-avatar

**Repository Path**: sunpengqi11/lite-avatar

## Basic Information

- **Project Name**: lite-avatar
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-04-25
- **Last Updated**: 2025-04-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# LiteAvatar

We introduce an audio2face model for realtime 2D chat avatars. It runs at 30 fps on CPU-only devices, without GPU acceleration.

## Pipeline

- An efficient ASR model from [ModelScope](https://modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch) for audio feature extraction.
- A mouth parameter prediction model that takes the audio features as input and generates mouth movements synchronized with the voice.
- A lightweight 2D face generator model for rendering the mouth movements, which can also be deployed on mobile devices for realtime inference.

## Data Preparation

Get the sample avatar data located at `./data/sample_data.zip` and extract it to a path of your choice.

🔥 More avatars can be found at [LiteAvatarGallery](https://modelscope.cn/models/HumanAIGC-Engineering/LiteAvatarGallery/summary).

## Installation

We recommend Python 3.10 and CUDA 11.8. Then build the environment as follows:

```shell
pip install -r requirements.txt
```

## Inference

```shell
python lite_avatar.py --data_dir /path/to/sample_data --audio_file /path/to/audio.wav --result_dir /path/to/result
```

The resulting MP4 video is saved in `result_dir`.

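The data preparation and inference steps above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names `extract_sample_data`, `build_inference_command`, and `run_inference` are hypothetical, and it assumes the script is run from the repository root so that `lite_avatar.py` resolves. Only the three command-line flags shown in the command above are used.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def extract_sample_data(zip_path, dest_dir):
    """Extract the sample avatar archive into dest_dir and return it as a Path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


def build_inference_command(data_dir, audio_file, result_dir, script="lite_avatar.py"):
    """Build the argument list for the inference command from the README."""
    return [
        sys.executable, str(script),
        "--data_dir", str(data_dir),
        "--audio_file", str(audio_file),
        "--result_dir", str(result_dir),
    ]


def run_inference(data_dir, audio_file, result_dir):
    """Run lite_avatar.py as a subprocess; raises CalledProcessError on failure."""
    # Assumption: creating the result directory up front is harmless;
    # the script itself may already do this.
    Path(result_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_inference_command(data_dir, audio_file, result_dir), check=True)
    return Path(result_dir)
```

For example, `extract_sample_data("./data/sample_data.zip", "./data")` followed by `run_inference("/path/to/sample_data", "/path/to/audio.wav", "/path/to/result")` would issue the same command as above. The layout of the extracted archive is not documented here, so the `data_dir` argument has to point at wherever the avatar data actually ends up.
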
## Interactive demo

The realtime interactive video chat demo powered by our LiteAvatar algorithm is available at [OpenAvatarChat](https://github.com/HumanAIGC-Engineering/OpenAvatarChat).

## Acknowledgement

We are grateful for the following open-source projects that we used in this project:

- [Paraformer](https://modelscope.cn/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch) and [FunASR](https://github.com/modelscope/FunASR) for audio feature extraction.

## Citation

If you find this project useful, please ⭐️ star the repository and cite our related paper:

```
@inproceedings{ZhuangQZZT22,
  author    = {Wenlin Zhuang and Jinwei Qi and Peng Zhang and Bang Zhang and Ping Tan},
  title     = {Text/Speech-Driven Full-Body Animation},
  booktitle = {Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, {IJCAI}},
  pages     = {5956--5959},
  year      = {2022}
}
```