Shurong Yang*, Huadong Li*, Juhao Wu*, Minhao Jing*†, Linze Li, Renhe Ji‡, Jiajun Liang‡, Haoqiang Fan
MEGVII Technology
*Equal contribution †Project lead ‡Corresponding author
- [TODO] The code of MegActor-Sigma will be coming soon.
- [🔥🔥🔥 2024.08.28] The MegActor-Sigma arXiv paper is released.
- [✨✨✨ 2024.07.02] For ease of replication, we provide a 10-minute dataset on Google Drive, which should yield satisfactory performance.
- [🔥🔥🔥 2024.06.25] Training setup released. Please refer to Training for details.
- [🔥🔥🔥 2024.06.25] Integrated into OpenBayes, see the demo. Thank OpenBayes team!
- [🔥🔥🔥 2024.06.17] The online Gradio demo is released.
- [🔥🔥🔥 2024.06.13] The data curation pipeline is released.
- [🔥🔥🔥 2024.05.31] The MegActor arXiv paper is released.
- [🔥🔥🔥 2024.05.24] Inference settings are released.
Usability: animates a portrait with video while ensuring consistent motion.
Reproducibility: fully open-source and trained on publicly available datasets.
Efficiency: ⚡200 V100 hours of training suffice to achieve pleasing motion on portraits.
MegActor is an intermediate-representation-free portrait animator that uses the original video, rather than intermediate features, as the driving factor to generate realistic and vivid talking head videos. Specifically, we utilize two UNets: one extracts the identity and background features from the source image, while the other accurately generates and integrates motion features directly derived from the original videos. MegActor can be trained on low-quality, publicly available datasets and excels in facial expressiveness, pose diversity, subtle controllability, and visual quality.
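As a rough, non-authoritative illustration of the two-UNet design described above, the sketch below shows how a reference branch and a driving branch might be combined. All class names (ReferenceNet, DrivingNet) and the toy convolutional encoders are hypothetical stand-ins, not the actual MegActor modules.

```python
# Conceptual sketch of the two-UNet layout described above.
# All names (ReferenceNet, DrivingNet) and the toy convolutional encoders are
# hypothetical stand-ins -- they do not mirror the actual MegActor code.
import torch
import torch.nn as nn


class ReferenceNet(nn.Module):
    """Stand-in for the UNet that encodes identity and background from the source image."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, source_image: torch.Tensor) -> torch.Tensor:
        return self.encoder(source_image)


class DrivingNet(nn.Module):
    """Stand-in for the UNet that extracts motion directly from raw driving frames."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Fuse motion features with the identity/background features and decode a frame.
        self.fuse = nn.Conv2d(2 * channels, 3, 3, padding=1)

    def forward(self, driving_frame: torch.Tensor, ref_feats: torch.Tensor) -> torch.Tensor:
        motion_feats = self.encoder(driving_frame)
        return self.fuse(torch.cat([motion_feats, ref_feats], dim=1))


if __name__ == "__main__":
    ref_net, drv_net = ReferenceNet(), DrivingNet()
    source = torch.randn(1, 3, 256, 256)  # source portrait
    driver = torch.randn(1, 3, 256, 256)  # one raw driving frame
    out = drv_net(driver, ref_net(source))
    print(out.shape)  # torch.Size([1, 3, 256, 256])
```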
Demo videos: demo.mp4, demo4.mp4, demo6.mp4
Environments
Detailed environment settings can be found in environment.yaml.
- Linux
```bash
conda env create -f environment.yaml
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
conda install -c conda-forge cudatoolkit-dev -y
```
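After installation, a quick import check like the sketch below (assuming only the packages installed above) can confirm the environment resolved correctly:

```python
# Sanity check that the core dependencies installed above import cleanly.
import torch
import mmengine
import mmcv
import mmdet
import mmpose

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("mmengine:", mmengine.__version__)
print("mmcv:", mmcv.__version__)
print("mmdet:", mmdet.__version__)
print("mmpose:", mmpose.__version__)
```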
Dataset
- For a detailed description of the data processing procedure, please refer to the Data Process Pipeline below.
- A 10-minute dataset in this format is available on Google Drive; a quick sanity-check sketch follows below.
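If you download the sample dataset, a short probe like the one below can verify the clips are readable. The ./data/sample_10min path and the .mp4 extension are assumptions about the layout, not a specification of the actual format.

```python
# Hedged sketch: count and probe video clips in a downloaded dataset folder.
# The "./data/sample_10min" path and the .mp4 extension are assumed, not prescribed.
from pathlib import Path

import cv2  # opencv-python

data_root = Path("./data/sample_10min")
clips = sorted(data_root.rglob("*.mp4"))
print(f"Found {len(clips)} clips under {data_root}")

for clip in clips[:5]:  # probe only the first few clips
    cap = cv2.VideoCapture(str(clip))
    frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    print(f"{clip.name}: {frames} frames @ {fps:.1f} fps")
    cap.release()
```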
Pretrained weights
Our pretrained weights are available at https://huggingface.co/HVSiniX/RawVideoDriven, or simply run:
git clone https://huggingface.co/HVSiniX/RawVideoDriven && ln -s RawVideoDriven/weights weights
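If git-lfs is unavailable, the weights can also be fetched with huggingface_hub. The sketch below mirrors the symlink convention of the command above; the local weights subdirectory name is taken from that command, not verified independently.

```python
# Alternative download via huggingface_hub instead of git clone.
import os

from huggingface_hub import snapshot_download

local_path = snapshot_download(repo_id="HVSiniX/RawVideoDriven")
print("Weights downloaded to:", local_path)

# Mirror the ./weights symlink created by the git-clone command above.
if not os.path.exists("weights"):
    os.symlink(os.path.join(local_path, "weights"), "weights")
```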
We currently support two-stage training on single-node machines.
Stage 1 (image training):
bash train.sh train.py ./configs/train/train_stage1.yaml {number of gpus on this node}
Stage 2 (video training):
bash train.sh train.py ./configs/train/train_stage2.yaml {number of gpus on this node}
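Before launching, it can help to peek at the stage configs. The sketch below just loads the YAML with PyYAML and lists its top-level keys; no particular key names are assumed.

```python
# Hedged sketch: inspect a training config before launching a run.
import yaml  # PyYAML

with open("./configs/train/train_stage1.yaml") as f:
    cfg = yaml.safe_load(f)

print("Top-level config keys:", list(cfg.keys()))
```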
Currently only single-GPU inference is supported. We highly recommend using the --contour-preserve flag to better preserve the shape of the source face.
CUDA_VISIBLE_DEVICES=0 python eval.py --config configs/inference/inference.yaml --source {source image path} --driver {driving video path} --contour-preserve
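To animate one source portrait with several driving videos, a thin wrapper around the command above could look like the sketch below; the source and driver paths are placeholders.

```python
# Sketch: run the single-GPU inference command above over several driving videos.
import os
import subprocess

source = "assets/source.png"                             # placeholder source image
drivers = ["assets/driver1.mp4", "assets/driver2.mp4"]   # placeholder driving videos

env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")
for driver in drivers:
    subprocess.run(
        [
            "python", "eval.py",
            "--config", "configs/inference/inference.yaml",
            "--source", source,
            "--driver", driver,
            "--contour-preserve",
        ],
        check=True,
        env=env,
    )
```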
For the Gradio interface, please run:
python demo/run_gradio.py
@misc{yang2024megactorsigmaunlockingflexiblemixedmodal,
title={MegActor-$\Sigma$: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer},
author={Shurong Yang and Huadong Li and Juhao Wu and Minhao Jing and Linze Li and Renhe Ji and Jiajun Liang and Haoqiang Fan and Jin Wang},
year={2024},
eprint={2408.14975},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2408.14975},
}
@misc{yang2024megactor,
title={MegActor: Harness the Power of Raw Video for Vivid Portrait Animation},
author={Shurong Yang and Huadong Li and Juhao Wu and Minhao Jing and Linze Li and Renhe Ji and Jiajun Liang and Haoqiang Fan},
year={2024},
eprint={2405.20851},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Many thanks to the authors of mmengine, MagicAnimate, Controlnet_aux, and Detectron2.
If you have any questions, feel free to open an issue or contact us at [email protected], [email protected] or [email protected].
If you're seeking an internship and are interested in our work, please send your resume to [email protected] or [email protected].