This is the code repository implementing the paper:
# MakeItTalk: Speaker-Aware Talking-Head Animation

Yang Zhou, Xintong Han, Eli Shechtman, Jose Echevarria, Evangelos Kalogerakis, Dingzeyu Li

SIGGRAPH Asia 2020
**Abstract:** We present a method that generates expressive talking-head videos from a single facial image with audio as the only input. In contrast to previous attempts to learn direct mappings from audio to raw pixels for creating talking faces, our method first disentangles the content and speaker information in the input audio signal. The audio content robustly controls the motion of lips and nearby facial regions, while the speaker information determines the specifics of facial expressions and the rest of the talking-head dynamics. Another key component of our method is the prediction of facial landmarks reflecting the speaker-aware dynamics. Based on this intermediate representation, our method works with many portrait images in a single unified framework, including artistic paintings, sketches, 2D cartoon characters, Japanese mangas, and stylized caricatures. In addition, our method generalizes well for faces and characters that were not observed during training. We present extensive quantitative and qualitative evaluation of our method, in addition to user studies, demonstrating generated talking-heads of significantly higher quality compared to prior state-of-the-art methods.
*Figure. Given an audio speech signal and a single portrait image as input (left), our model generates speaker-aware talking-head animations (right). Neither the speech signal nor the input face image is observed during model training. Our method creates both non-photorealistic cartoon animations (top) and natural human face videos (bottom).*
## Requirements

- Python environment 3.6

  ```shell
  conda create -n makeittalk_env python=3.6
  conda activate makeittalk_env
  ```

- ffmpeg (https://ffmpeg.org/download.html)

  ```shell
  sudo apt-get install ffmpeg
  ```

- Python packages

  ```shell
  pip install -r requirements.txt
  ```
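Once the steps above are done, a quick sanity check like the following (our own snippet, not part of the repository) can confirm the interpreter version and that ffmpeg is reachable:

```python
# Sanity check for the environment above -- not a repo utility.
import shutil
import sys

# The conda env above pins python=3.6.
assert sys.version_info[:2] == (3, 6), "activate the makeittalk_env conda env"
# ffmpeg must be on PATH for audio/video processing.
assert shutil.which("ffmpeg") is not None, "ffmpeg not found on PATH"
print("environment looks OK")
```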
## Pre-trained Models

Download the following pre-trained models to the `examples/ckpt` folder.
| Model | Link to the model |
|---|---|
| Voice Conversion | Link |
| Speech Content Module | Link |
| Speaker-aware Module | Link |
| Image2Image Translation Module | Link |
| Non-photorealistic Warping (.exe) | Link |
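After downloading, a small check like the one below can verify everything landed in `examples/ckpt`. This is illustrative only; the filenames are placeholders, not the repository's actual checkpoint names, so edit them to match the downloaded files:

```python
# Verify the downloaded checkpoints exist -- illustrative only.
import os

CKPT_DIR = "examples/ckpt"
expected = [
    "voice_conversion.pth",   # hypothetical name
    "speech_content.pth",     # hypothetical name
    "speaker_aware.pth",      # hypothetical name
    "image2image.pth",        # hypothetical name
]
missing = [f for f in expected if not os.path.isfile(os.path.join(CKPT_DIR, f))]
print("missing checkpoints:", missing or "none")
```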
### Natural human faces / paintings (warping through the image-to-image translation module)
- Crop your portrait image to 256x256 and put it under the `examples` folder in `.jpg` format. Make sure the head is roughly in the middle (check the existing examples for reference); a center-crop sketch is given after this list.
- Put your test audio files under the `examples` folder as well, in `.wav` format.
- Animate!

  ```shell
  python main_end2end.py --jpg <portrait_file>
  ```

- Use the additional arguments `--amp_lip_x <x> --amp_lip_y <y> --amp_pos <pos>` to amplify lip motion (in the x/y-axis directions) and head motion displacements. The default values are `<x>=2.`, `<y>=2.`, `<pos>=1.`.
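For the cropping step, something like the following Pillow snippet (our own sketch, not a repository utility; the input filename is hypothetical) produces a centered 256x256 `.jpg` under `examples/`:

```python
# Center-crop a portrait to 256x256 and save it under examples/.
# This is our own sketch, not a repo utility.
from PIL import Image

def prepare_portrait(src_path, dst_path="examples/portrait.jpg", size=256):
    img = Image.open(src_path).convert("RGB")
    w, h = img.size
    side = min(w, h)
    # Take the largest centered square, then scale it to size x size.
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize((size, size))
    img.save(dst_path, "JPEG")

prepare_portrait("my_photo.png")  # hypothetical input file
```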
### Non-photorealistic cartoon faces (warping through Delaunay triangulation)
- Animate one of the existing puppets (a sketch of the triangulation-based warp idea follows this list):

  | Puppet Name | wilk | roy | sketch | color | cartoonM | danbooru1 |
  |---|---|---|---|---|---|---|
  | Image | | | | | | |

  ```shell
  python main_end2end_cartoon.py --jpg <cartoon_puppet_name>
  ```
- Create your own puppets (ToDo...)
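For intuition, the Delaunay-triangulation warp named above can be sketched as: triangulate the source landmarks, then map each triangle affinely to its predicted target position. The snippet below is an illustrative sketch under our own assumptions (the function name and the OpenCV/SciPy usage are ours), not the repository's warping code:

```python
# Illustrative landmark-driven warp via Delaunay triangulation.
# Not the repository's implementation; names and structure are ours.
import cv2
import numpy as np
from scipy.spatial import Delaunay

def warp_via_triangulation(img, src_pts, dst_pts):
    """Warp img so landmarks src_pts (n x 2) move to dst_pts (n x 2)."""
    out = np.zeros_like(img)
    tri = Delaunay(src_pts)  # triangulate once, on the source landmarks
    for simplex in tri.simplices:
        s = src_pts[simplex].astype(np.float32)  # source triangle (3 x 2)
        d = dst_pts[simplex].astype(np.float32)  # target triangle (3 x 2)
        # Affine map that carries the source triangle onto the target one.
        M = cv2.getAffineTransform(s, d)
        warped = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
        # Keep only the pixels inside the target triangle.
        mask = np.zeros(img.shape[:2], dtype=np.uint8)
        cv2.fillConvexPoly(mask, d.astype(np.int32), 1)
        out[mask == 1] = warped[mask == 1]
    return out
```

Real implementations typically warp only each triangle's bounding box rather than the full frame, but the per-triangle affine idea is the same.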