EmoDiffGes: Emotion-Aware Co-Speech Holistic Gesture Generation with Progressive Synergistic Diffusion
# PyTorch 2.1.x supports Python <= 3.11, so pin an older interpreter
conda create -n emodiffges python=3.10
conda activate emodiffges
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
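Optionally, sanity-check the environment before continuing (a minimal check we suggest here, not part of the official setup):

```shell
# Confirm PyTorch imports and report CUDA availability in the active env.
python - <<'EOF'
try:
    import torch
    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch is not installed; re-run the conda install step above.")
EOF
```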
# Download the pretrained checkpoints
https://drive.google.com/drive/folders/1cjuC6DaC7UJVklAwvCW3ba6Ueg8KttPw
# Download the EmoRoBERTa model
hf download arpanghoshal/EmoRoBERTa
# Download the SMPL model
gdown https://drive.google.com/drive/folders/1MCks7CMNBtAzU2XihYezNmiGT_6pWex8?usp=drive_link -O ./datasets/hub --folder
# Download the BEAT2 dataset
hf download H-Liu1997/BEAT2 --repo-type dataset --local-dir ./datasets/BEAT_SMPL
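After the downloads, the working tree should look roughly as follows (a sketch inferred from the commands above; exact contents and checkpoint locations may differ):

```
.
├── configs/
│   └── diffuser_rvqvae_128.yaml
└── datasets/
    ├── hub/          # SMPL model files (gdown folder above)
    └── BEAT_SMPL/    # BEAT2 dataset
```

Note that without `--local-dir`, the EmoRoBERTa weights land in the Hugging Face cache (`~/.cache/huggingface`) by default.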
# Evaluate the pretrained diffusion model
python test.py -c configs/diffuser_rvqvae_128.yaml
# Train the RVQ-VAE
bash train_rvq.sh
# Train the diffusion model
python train.py -c configs/diffuser_rvqvae_128.yaml
# Run the demo
python demo.py -c configs/diffuser_rvqvae_128_hf.yaml
Our code borrows in part from EMAGE and EmoRoBERTa. Thanks to the authors, and please check out these useful repos.
If you find our code or paper helpful, please consider citing:
@article{li2025EmoDiffGes,
  author  = {Li, Xinru and Lin, Jingzhong and Zhang, Bohao and Qi, Yuanyuan and Wang, Changbo and He, Gaoqi},
  title   = {EmoDiffGes: Emotion-Aware Co-Speech Holistic Gesture Generation with Progressive Synergistic Diffusion},
  journal = {Computer Graphics Forum},
  volume  = {44},
  number  = {7},
  pages   = {e70261},
  doi     = {10.1111/cgf.70261},
  year    = {2025}
}

