AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations
Huawei Wei, Zejun Yang, Zhisheng Wang
Tencent Games Zhiji, Tencent
We recommend Python >= 3.10 and CUDA 11.7. Then build the environment as follows:
pip install -r requirements.txt
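If you want to confirm the setup before going further, here is a minimal sanity check (it assumes PyTorch was installed via requirements.txt):
import sys
import torch

# Python >= 3.10 is recommended for this repo.
assert sys.version_info >= (3, 10), "Python >= 3.10 is recommended"
# CUDA 11.7 is the recommended toolkit; torch.version.cuda reports the version
# PyTorch was built against, and torch.cuda.is_available() checks for a usable GPU.
print("torch CUDA build:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())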
We will upload the weights to Hugging Face soon! All the weights should be placed under the ./pretrained_weights directory. You can download them manually as follows:
- Download our trained weights, which include: denoising_unet.pth, reference_unet.pth, pose_guider.pth, motion_module.pth, audio2mesh.pt, and audio2pose.pt.
- Download the pretrained weights of the base models and other components: stable-diffusion-v1-5, sd-vae-ft-mse, image_encoder, and wav2vec2-base-960h (see the sketch just after this list for one way to fetch them).
- Download the DWPose weights (dw-ll_ucoco_384.onnx and yolox_l.onnx) following this.
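If you prefer to fetch the base components programmatically, the following is a rough sketch using huggingface_hub. The repo ids are assumptions based on the component names; adjust them if you use different mirrors or already have local copies:
from huggingface_hub import snapshot_download

# NOTE: the repo ids below are assumed from the component names in the layout
# further down; swap them out if they are unavailable in your region.
snapshot_download("runwayml/stable-diffusion-v1-5",
                  local_dir="./pretrained_weights/stable-diffusion-v1-5",
                  allow_patterns=["feature_extractor/*", "unet/*",
                                  "model_index.json", "v1-inference.yaml"])
snapshot_download("stabilityai/sd-vae-ft-mse",
                  local_dir="./pretrained_weights/sd-vae-ft-mse")
snapshot_download("facebook/wav2vec2-base-960h",
                  local_dir="./pretrained_weights/wav2vec2-base-960h")
# The image encoder is assumed to come from sd-image-variations-diffusers;
# only its image_encoder/ subfolder is needed.
snapshot_download("lambdalabs/sd-image-variations-diffusers",
                  local_dir="./pretrained_weights",
                  allow_patterns=["image_encoder/*"])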
Finally, these weights should be organized as follows:
./pretrained_weights/
|-- DWPose
| |-- dw-ll_ucoco_384.onnx
| `-- yolox_l.onnx
|-- image_encoder
| |-- config.json
| `-- pytorch_model.bin
|-- audio2mesh.pt
|-- audio2pose.pt
|-- denoising_unet.pth
|-- motion_module.pth
|-- pose_guider.pth
|-- reference_unet.pth
|-- sd-vae-ft-mse
| |-- config.json
| |-- diffusion_pytorch_model.bin
| `-- diffusion_pytorch_model.safetensors
|-- stable-diffusion-v1-5
| |-- feature_extractor
| | `-- preprocessor_config.json
| |-- model_index.json
| |-- unet
| | |-- config.json
| | `-- diffusion_pytorch_model.bin
| `-- v1-inference.yaml
`-- wav2vec2-base-960h
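A quick way to confirm everything landed in the right place is to check the layout above, for example:
from pathlib import Path

# Paths taken directly from the layout above.
root = Path("./pretrained_weights")
expected = [
    "DWPose/dw-ll_ucoco_384.onnx", "DWPose/yolox_l.onnx",
    "image_encoder/config.json", "image_encoder/pytorch_model.bin",
    "audio2mesh.pt", "audio2pose.pt",
    "denoising_unet.pth", "motion_module.pth",
    "pose_guider.pth", "reference_unet.pth",
    "sd-vae-ft-mse/config.json",
    "stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin",
    "wav2vec2-base-960h",
]
missing = [p for p in expected if not (root / p).exists()]
print("All expected weights found." if not missing else f"Missing: {missing}")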
Note: If you have already downloaded some of the pretrained models, such as Stable Diffusion V1.5, you can specify their paths in the config file (e.g. ./configs/prompts/animation.yaml).
Here is the CLI command for running the pose-driven inference script; -W and -H set the output resolution and -L the number of frames to generate:
python -m scripts.pose2vid --config ./configs/prompts/animation.yaml -W 512 -H 512 -L 64
You can refer to the format of animation.yaml to add your own reference images or pose videos. To convert a raw video into a pose video (a keypoint sequence), run the following command:
python -m scripts.vid2pose --video_path pose_video_path.mp4
For face reenactment (video-to-video), run:
python -m scripts.vid2vid --config ./configs/prompts/animation_facereenac.yaml -W 512 -H 512 -L 64
Add the source face videos and reference images in animation_facereenac.yaml.
For audio-driven animation, run:
python -m scripts.audio2vid --config ./configs/prompts/animation_audio.yaml -W 512 -H 512 -L 64
Add the audio files and reference images in animation_audio.yaml.
Coming soon!
@misc{wei2024aniportrait,
title={AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animations},
author={Huawei Wei and Zejun Yang and Zhisheng Wang},
year={2024},
eprint={*},
archivePrefix={arXiv},
primaryClass={cs.CV}
}