This repository contains the source code for the CVPR'2021 paper Motion Representations for Articulated Animation by Aliaksandr Siarohin, Oliver Woodford, Jian Ren, Menglei Chai and Sergey Tulyakov.
For more qualitative examples visit our project page.
Here is an example of several images produced by our method. In the first column the driving video is shown. For the remaining columns the top image is animated by using motions extracted from the driving video.
We support python3. To install the dependencies run:
pip install -r requirements.txt
There are several configuration files, one for each dataset, in the config folder, named config/dataset_name.yaml. See config/dataset.yaml for a description of each parameter; the parameters are also described in config/vox256.yaml. We adjusted the configuration to run on 1 V100 GPU; training on a 256x256 dataset takes approximately 2 days.
Checkpoints can be found in the checkpoints folder. Checkpoints are large, therefore we use git lfs to store them. Either use git lfs pull or download the checkpoints manually from GitHub.
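If the repository was cloned without git lfs, the files in the checkpoints folder are small text pointers rather than real weights. The sketch below is one way to detect that; it relies on the fact that a Git LFS pointer file begins with the line "version https://git-lfs.github.com/spec/v1". The helper names is_lfs_pointer and unpulled_checkpoints are hypothetical and not part of this repository.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer stub rather than real data."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def unpulled_checkpoints(folder="checkpoints"):
    """List file names in `folder` that are still LFS pointers (hypothetical helper)."""
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir()
                  if p.is_file() and is_lfs_pointer(p))


# Any name printed here still needs `git lfs pull` or a manual download.
for name in unpulled_checkpoints():
    print("not downloaded yet:", name)
```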
To run a demo, download a checkpoint and run the following command:
python demo.py --config config/dataset_name.yaml --driving_video path/to/driving --source_image path/to/source --checkpoint path/to/checkpoint
The result will be stored in result.mp4. To use Animation via Disentanglement add --mode avd; for standard animation add --mode standard instead.
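When the demo is launched from a script, the command above can be assembled as an argument list. The following is a minimal sketch: build_demo_command is a hypothetical helper, the paths are the placeholders used above, and only the flags mentioned in this README are used.

```python
import shlex


def build_demo_command(config, driving_video, source_image, checkpoint, mode=None):
    """Return the demo.py invocation as a list of arguments (hypothetical helper)."""
    cmd = ["python", "demo.py",
           "--config", config,
           "--driving_video", driving_video,
           "--source_image", source_image,
           "--checkpoint", checkpoint]
    if mode is not None:
        # Only the two modes named in this README are accepted here.
        if mode not in ("avd", "standard"):
            raise ValueError("mode must be 'avd' or 'standard'")
        cmd += ["--mode", mode]
    return cmd


cmd = build_demo_command("config/dataset_name.yaml", "path/to/driving",
                         "path/to/source", "path/to/checkpoint", mode="avd")
# Print a shell-ready version of the command for inspection.
print(shlex.join(cmd))
```

To actually run it, pass the list to subprocess.run(cmd, check=True); a list avoids quoting problems when paths contain spaces.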
Checkpoints from google drive: https://drive.google.com/drive/folders/1jCeFPqfU_wKNYwof0ONICwsj3xHlr_tb
We prepared a demo runnable in Google Colab, see: demo.ipynb.
To train a model run:
CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --device_ids 0
The code will create a folder in the log directory (each run will create a time-stamped new folder). Checkpoints will be saved to this folder.
To check the loss values during training see log.txt.
You can also check training data reconstructions in the train-vis subfolder.
Then to train Animation via Disentanglement (AVD) use:
CUDA_VISIBLE_DEVICES=0 python run.py --checkpoint log/{folder}/cpk.pth --config config/dataset_name.yaml --device_ids 0 --mode train_avd
Where {folder} is the name of the folder created in the previous step. (Note: use backslash '\' before space.)
This will use the same folder where the checkpoint was previously stored.
It will create a new checkpoint containing all the previous models and the trained avd_network.
You can monitor performance in the log file and visualizations in the train-vis folder.
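Since run folders are time-stamped and their names may contain spaces, the {folder} part of the command can be filled in and quoted programmatically. The sketch below makes two assumptions: the most recently modified sub-folder of log is the run to continue, and single-quoting via shlex is an acceptable alternative to backslash escaping. Both helper names and the example folder name are hypothetical.

```python
import shlex
from pathlib import Path, PurePosixPath


def latest_run_folder(log_dir="log"):
    """Return the most recently modified sub-folder of log_dir, or None."""
    root = Path(log_dir)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)


def avd_training_command(run_folder, config):
    """Build the AVD training command line with the checkpoint path safely quoted."""
    checkpoint = PurePosixPath(run_folder) / "cpk.pth"
    args = ["python", "run.py",
            "--checkpoint", str(checkpoint),
            "--config", config,
            "--device_ids", "0",
            "--mode", "train_avd"]
    # shlex.join quotes any argument that contains spaces.
    return "CUDA_VISIBLE_DEVICES=0 " + shlex.join(args)


# Hypothetical folder name containing a space.
print(avd_training_command("log/vox256 01_01_21", "config/vox256.yaml"))

run = latest_run_folder()
if run is not None:
    print(avd_training_command(run.as_posix(), "config/dataset_name.yaml"))
```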
To evaluate the reconstruction performance run:
CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode reconstruction --checkpoint log/{folder}/cpk.pth
Where {folder} is the name of the folder created in the previous step. (Note: use backslash '\' before space.)
The reconstruction subfolder will be created in the checkpoint folder.
The generated videos will be stored in this folder; they will also be stored in the png subfolder in lossless '.png' format for evaluation.
Instructions for computing metrics from the paper can be found here.
To obtain the TED dataset run the following commands:
git clone https://github.com/AliaksandrSiarohin/video-preprocessing
cd video-preprocessing
python load_videos.py --metadata ../data/ted384-metadata.csv --format .mp4 --out_folder ../data/TED384-v2 --workers 8 --image_shape 384,384
- Resize all the videos to the same size, e.g. 256x256. The videos can be in '.gif' or '.mp4' format, or a folder with images. We recommend the latter: for each video make a separate folder with all the frames in '.png' format. This format is lossless, and it has better I/O performance.
- Create a folder data/dataset_name with 2 subfolders, train and test; put training videos in train and testing videos in test.
- Create a config file config/dataset_name.yaml. See the description of the parameters in config/vox256.yaml. Specify the dataset root in dataset_params by setting root_dir: data/dataset_name. Adjust other parameters as desired, such as the number of epochs. Specify id_sampling: False if you do not want to use id_sampling.
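The folder layout from the steps above can be sketched as follows, assuming each video has already been extracted into its own folder of '.png' frames. The random 90/10 split, the fixed seed and the helper names split_names and build_dataset are illustrative assumptions, not something this repository prescribes.

```python
import random
import shutil
from pathlib import Path


def split_names(names, test_fraction=0.1, seed=0):
    """Deterministically split video names into (train, test) lists."""
    names = sorted(names)
    random.Random(seed).shuffle(names)
    # Keep at least one test video whenever there is any data.
    n_test = max(1, round(len(names) * test_fraction)) if names else 0
    return sorted(names[n_test:]), sorted(names[:n_test])


def build_dataset(frames_root, dataset_root, test_fraction=0.1, seed=0):
    """Copy per-video frame folders into dataset_root/train and dataset_root/test."""
    frames_root, dataset_root = Path(frames_root), Path(dataset_root)
    videos = [p.name for p in frames_root.iterdir() if p.is_dir()]
    train, test = split_names(videos, test_fraction, seed)
    for subset, subset_names in (("train", train), ("test", test)):
        (dataset_root / subset).mkdir(parents=True, exist_ok=True)
        for name in subset_names:
            shutil.copytree(frames_root / name, dataset_root / subset / name,
                            dirs_exist_ok=True)
    return train, test


# Example call (paths are placeholders):
# build_dataset("path/to/frame_folders", "data/dataset_name")
```

After this, root_dir: data/dataset_name in the config file points at the created folder. Note that dirs_exist_ok requires Python 3.8 or newer.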
Citation:
@inproceedings{siarohin2021motion,
author={Siarohin, Aliaksandr and Woodford, Oliver and Ren, Jian and Chai, Menglei and Tulyakov, Sergey},
title={Motion Representations for Articulated Animation},
booktitle = {CVPR},
year = {2021}
}