AniPortrait

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations

Huawei Wei, Zejun Yang, Zhisheng Wang

Tencent Games Zhiji, Tencent

Pipeline

pipeline

TODO

Various Generated Videos

Self driven

Face reenactment

Audio driven

Installation

Build environment

We recommend Python >= 3.10 and CUDA 11.7. Then build the environment as follows:

pip install -r requirements.txt
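One way to get a matching environment is via conda (a sketch; the environment name `aniportrait` is our choice, and any Python >= 3.10 environment works):

```shell
# Hypothetical conda setup; adjust the name and Python version to taste.
conda create -n aniportrait python=3.10 -y
conda activate aniportrait
pip install -r requirements.txt
```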

Download weights

We will upload them to Hugging Face soon!

All the weights should be placed under the ./pretrained_weights directory. You can download the weights manually as follows:

  1. Download our trained weights, which include six parts: denoising_unet.pth, reference_unet.pth, pose_guider.pth, motion_module.pth, audio2mesh.pt, and audio2pose.pt.

  2. Download the pretrained weights of the base models and other components:

  3. Download dwpose weights (dw-ll_ucoco_384.onnx, yolox_l.onnx) following this.

Finally, these weights should be organized as follows:

./pretrained_weights/
|-- DWPose
|   |-- dw-ll_ucoco_384.onnx
|   `-- yolox_l.onnx
|-- image_encoder
|   |-- config.json
|   `-- pytorch_model.bin
|-- audio2mesh.pt
|-- audio2pose.pt
|-- denoising_unet.pth
|-- motion_module.pth
|-- pose_guider.pth
|-- reference_unet.pth
|-- sd-vae-ft-mse
|   |-- config.json
|   |-- diffusion_pytorch_model.bin
|   `-- diffusion_pytorch_model.safetensors
|-- stable-diffusion-v1-5
|   |-- feature_extractor
|   |   `-- preprocessor_config.json
|   |-- model_index.json
|   |-- unet
|   |   |-- config.json
|   |   `-- diffusion_pytorch_model.bin
|   `-- v1-inference.yaml
`-- wav2vec2-base-960h

Note: If you have installed some of the pretrained models, such as StableDiffusion V1.5, you can specify their paths in the config file (e.g. ./config/prompts/animation.yaml).
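To catch a misplaced checkpoint before a run fails mid-pipeline, a small stdlib-only check against the layout above can help (a sketch; the file list is abridged to the top-level weights and DWPose files named in the tree):

```python
from pathlib import Path

# Expected weight files, taken from the directory tree above
# (abridged: base-model subdirectories are not listed here).
EXPECTED = [
    "audio2mesh.pt",
    "audio2pose.pt",
    "denoising_unet.pth",
    "motion_module.pth",
    "pose_guider.pth",
    "reference_unet.pth",
    "DWPose/dw-ll_ucoco_384.onnx",
    "DWPose/yolox_l.onnx",
]

def missing_weights(root="./pretrained_weights"):
    """Return the expected weight files not present under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]

if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:", *missing, sep="\n  ")
    else:
        print("All expected weights are in place.")
```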

Inference

Here are the CLI commands for running the inference scripts:

Self driven

python -m scripts.pose2vid --config ./configs/prompts/animation.yaml -W 512 -H 512 -L 64

You can refer to the format of animation.yaml to add your own reference images or pose videos. To convert a raw video into a pose video (keypoint sequence), run the following command:

python -m scripts.vid2pose --video_path pose_video_path.mp4
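When scripting many runs, the same invocation can be assembled programmatically. A minimal sketch, assuming -W/-H set the output resolution and -L the frame count (inferred from the flag names; confirm against scripts/pose2vid.py):

```python
import subprocess

def build_pose2vid_cmd(config="./configs/prompts/animation.yaml",
                       width=512, height=512, length=64):
    """Build the self-driven inference command shown above.

    -W/-H are assumed to be output resolution and -L the number of
    generated frames; check the script's argument parser to confirm.
    """
    return ["python", "-m", "scripts.pose2vid", "--config", config,
            "-W", str(width), "-H", str(height), "-L", str(length)]

if __name__ == "__main__":
    # Launching requires the repo and pretrained weights to be in place.
    subprocess.run(build_pose2vid_cmd(), check=True)
```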

Face reenactment

python -m scripts.vid2vid --config ./configs/prompts/animation_facereenac.yaml -W 512 -H 512 -L 64

Add your source face videos and reference images in animation_facereenac.yaml.

Audio driven

python -m scripts.audio2vid --config ./configs/prompts/animation_audio.yaml -W 512 -H 512 -L 64

Add your audio files and reference images in animation_audio.yaml.
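Since -L caps the number of generated frames, a stdlib-only helper can size it to cover the whole driving clip. This assumes a PCM WAV input and a frame rate of 30 fps, which may not match the pipeline's actual output fps; treat both as placeholders:

```python
import wave

def frames_for_audio(wav_path, fps=30):
    """Estimate an -L value that spans the whole audio clip.

    fps is an assumption; match it to the fps used by audio2vid.
    Only PCM WAV files are supported by the stdlib wave module.
    """
    with wave.open(wav_path, "rb") as w:
        duration = w.getnframes() / w.getframerate()
    return int(duration * fps)
```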

Training

Coming soon!

Citation

@misc{wei2024aniportrait,
      title={AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations}, 
      author={Huawei Wei and Zejun Yang and Zhisheng Wang},
      year={2024},
      eprint={*},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
