
Playable Video Generation



Figure 1. Illustration of the proposed CADDY model for playable video generation.


Playable Video Generation
Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci
ArXiv

Paper: arXiv (coming soon)
Website
Live Demo

Abstract: This paper introduces the unsupervised learning problem of playable video generation (PVG). In PVG, we aim at allowing a user to control the generated video by selecting a discrete action at every time step as when playing a video game. The difficulty of the task lies both in learning semantically consistent actions and in generating realistic videos conditioned on the user input. We propose a novel framework for PVG that is trained in a self-supervised manner on a large dataset of unlabelled videos. We employ an encoder-decoder architecture where the predicted action labels act as a bottleneck. The network is constrained to learn a rich action space using, as the main driving loss, a reconstruction loss on the generated video. We demonstrate the effectiveness of the proposed approach on several datasets with wide environment variety.

Overview

Given a set of completely unlabeled videos, we jointly learn a set of discrete actions and a video generation model conditioned on the learned actions. At test time, the user can control the generated video on the fly by providing action labels, as if they were playing a video game. We name our method CADDY. Our architecture for unsupervised playable video generation is composed of several components. An encoder E extracts frame representations from the input sequence. A temporal model estimates the successive states using a recurrent dynamics network R and an action network A, which predicts the action label corresponding to the current action performed in the input sequence. Finally, a decoder D reconstructs the input frames. The model is trained using reconstruction as the main driving loss.
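The data flow above can be made concrete with a minimal PyTorch sketch. This is an illustrative mock-up, not the repository's actual architecture: the module names (encoder, action_net, dynamics, decoder) and all layer sizes are hypothetical stand-ins for E, A, R, and D.

```python
import torch
import torch.nn as nn

class CADDY(nn.Module):
    """Illustrative stand-in for the encoder E, action network A,
    dynamics network R, and decoder D described above."""

    def __init__(self, frame_channels=3, state_dim=128, num_actions=7):
        super().__init__()
        # E: per-frame representation
        self.encoder = nn.Sequential(
            nn.Conv2d(frame_channels, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, state_dim, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # A: discrete action inferred from a pair of consecutive representations
        self.action_net = nn.Linear(2 * state_dim, num_actions)
        # R: recurrent dynamics advancing the environment state
        self.dynamics = nn.GRUCell(state_dim + num_actions, state_dim)
        # D: frame reconstruction from the current state (32x32 in this sketch)
        self.decoder = nn.Sequential(
            nn.Linear(state_dim, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, frame_channels, 4, stride=2, padding=1),
        )

    def forward(self, frames):
        # frames: (batch, time, channels, height, width)
        feats = [self.encoder(frames[:, t]) for t in range(frames.size(1))]
        state, reconstructions = feats[0], []
        for t in range(1, len(feats)):
            # The predicted action label acts as the bottleneck between frames
            logits = self.action_net(torch.cat([feats[t - 1], feats[t]], dim=1))
            action = torch.softmax(logits, dim=1)  # soft relaxation while training
            state = self.dynamics(torch.cat([state, action], dim=1), state)
            reconstructions.append(self.decoder(state))
        return torch.stack(reconstructions, dim=1)
```

In this sketch, CADDY()(torch.randn(2, 5, 3, 32, 32)) yields a (2, 4, 3, 32, 32) tensor of predicted frames; minimizing a reconstruction loss against the true frames[:, 1:] is what forces the discrete action bottleneck to capture semantically consistent actions.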

Installation

Conda

The complete environment for execution can be installed with:

conda env create -f env.yml

conda activate video-generation

Docker

Build the docker image: docker build -t video-generation:1.0 .

Run the docker image. Mount the root directory to /video-generation in the docker container: docker run -it --gpus all --ipc=host -v /path/to/directory/video-generation:/video-generation video-generation:1.0 /bin/bash

Directory structure

Please create the following directories in the root of the project (a small Python helper that does this is sketched after the list):

  • results
  • checkpoints
  • data
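Equivalently, assuming nothing beyond the directory names above:

```python
import os

# Create the directories expected by the training and evaluation scripts
for directory in ("results", "checkpoints", "data"):
    os.makedirs(directory, exist_ok=True)
```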

Datasets

Datasets can be downloaded at the following link: Google Drive

  • Breakout: Coming soon
  • BAIR: bair_256_ours.tar.gz
  • Tennis: Coming soon

Please extract them under the data folder.

Pretrained Models

Pretrained models can be downloaded at the following link: Google Drive

Please place the directories under the checkpoints folder.

Playing

After downloading the checkpoints, the models can be played with the following commands (an illustrative sketch of the play loop follows the list):

  • BAIR: python play.py --config configs/01_bair.yaml

  • Breakout: python play.py --config configs/breakout/02_breakout.yaml

  • Tennis: python play.py --config configs/03_tennis.yaml
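For orientation, the sketch below shows the general shape of an interactive play loop. Everything here is hypothetical (the init_state/step interface and the read_user_action and show_frame callbacks are made up for illustration); see play.py for the actual implementation.

```python
import torch

@torch.no_grad()
def play(model, seed_frames, read_user_action, show_frame, steps=100):
    """Generate frames on the fly, conditioning each step on a user-chosen action."""
    state = model.init_state(seed_frames)          # hypothetical: encode seed frames
    for _ in range(steps):
        action = read_user_action()                # discrete action id from the user
        state, frame = model.step(state, action)   # hypothetical: advance R, decode D
        show_frame(frame)                          # display the generated frame
```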

Training

The models can be trained with the following commands:

python train.py --config configs/<config_file>

Multi-GPU support is active by default. Runs can be logged through Weights & Biases by running wandb init before executing the training command.
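As a rough sketch of what multi-GPU training with Weights & Biases logging looks like once wandb init has been run (a stand-in model and random data replace the real pipeline in train.py):

```python
import torch
import torch.nn as nn
import wandb

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Conv2d(3, 3, 3, padding=1).to(device)           # stand-in for CADDY
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)                         # multi-GPU data parallelism
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
wandb.init(project="playable-video-generation")            # assumes `wandb init` was run

for step in range(100):
    frames = torch.randn(8, 3, 64, 64, device=device)      # stand-in video batch
    loss = nn.functional.mse_loss(model(frames), frames)   # reconstruction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    wandb.log({"reconstruction_loss": loss.item()}, step=step)
```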

Evaluation

Evaluation requires two steps. First, an evaluation dataset must be built. Second, evaluation is carried out on that dataset. To build the evaluation dataset, please issue:

python build_evaluation_dataset.py --config configs/<config_file>

To run evaluation issue:

python evaluate_dataset.py --config configs/evaluation/configs/<config_file>

Evaluation results are saved under the evaluation_results directory.
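As an illustration of frame-level comparison on such a dataset, the snippet below computes PSNR between stand-in generated and reference frames; the metric suite actually reported by evaluate_dataset.py may differ.

```python
import torch

def psnr(generated, reference, max_value=1.0):
    """Peak signal-to-noise ratio between two image tensors in [0, max_value]."""
    mse = torch.mean((generated - reference) ** 2)
    return 10 * torch.log10(max_value ** 2 / mse)

generated = torch.rand(16, 3, 64, 64)   # stand-in generated frames
reference = torch.rand(16, 3, 64, 64)   # stand-in ground-truth frames
print(f"PSNR: {psnr(generated, reference):.2f} dB")
```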
