Video future frame prediction based on Transformers
Stage 1: train the autoencoder first and save the checkpoint. Stage 2: load that checkpoint and train one of the following models:
train_FAR.py: fully autoregressive model
train_FAR_mp.py: multi-GPU training of the fully autoregressive model (single machine)
train_NAR.py: non-autoregressive model
train_NAR_mp.py: multi-GPU training of the non-autoregressive model (single machine)
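
The two-stage flow above (train the autoencoder, save the checkpoint, load it for stage 2) could look roughly like the sketch below. It assumes PyTorch, which this README does not state explicitly. `TinyAutoencoder`, `save_autoencoder`, and `load_frozen_autoencoder` are hypothetical names made up for illustration; they are not the repository's classes or functions. Freezing the autoencoder in stage 2 is also an assumption, not something the README confirms.

```python
import torch
import torch.nn as nn


class TinyAutoencoder(nn.Module):
    """Hypothetical stand-in for the stage 1 autoencoder (not the repo's model)."""

    def __init__(self, channels: int = 1, hidden: int = 8):
        super().__init__()
        self.encoder = nn.Conv2d(channels, hidden, kernel_size=3, padding=1)
        self.decoder = nn.Conv2d(hidden, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return self.decoder(torch.relu(self.encoder(x)))


def save_autoencoder(model: nn.Module, ckpt_path: str) -> None:
    """End of stage 1: persist only the weights."""
    torch.save(model.state_dict(), ckpt_path)


def load_frozen_autoencoder(ckpt_path: str) -> nn.Module:
    """Start of stage 2: rebuild the model, restore its weights, freeze them."""
    model = TinyAutoencoder()
    state = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(state)
    model.eval()
    for param in model.parameters():
        # Assumption: stage 2 keeps the autoencoder fixed.
        param.requires_grad_(False)
    return model
```

In this sketch, stage 2 would call `load_frozen_autoencoder(...)` once before building the Transformer that operates on the encoded frames.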
/MovingMNIST
    moving-mnist-train.npz
    moving-mnist-test.npz
    moving-mnist-val.npz
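
The README does not say which arrays the MovingMNIST `.npz` files contain, so a quick way to check a downloaded file is to list its contents rather than assume a key name. The helper below is a minimal sketch using NumPy; `describe_npz` is a hypothetical name, not part of the repository.

```python
import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype_string)} for every array in an .npz file."""
    with np.load(path) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }
```

For example, `describe_npz("MovingMNIST/moving-mnist-train.npz")` would report the stored array names together with their shapes and dtypes.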
/KTH
    boxing/
        person01_boxing_d1/
            image_0001.png
            image_0002.png
            ...
        person01_boxing_d2/
            image_0001.png
            image_0002.png
            ...
    handclapping/
        ...
    handwaving/
        ...
    jogging_no_empty/
        ...
    running_no_empty/
        ...
    walking_no_empty/
        ...
/BAIR
    test/
        example_0/
            0000.png
            0001.png
            ...
        example_1/
            0000.png
            0001.png
            ...
        example_...
    train/
        example_0/
            0000.png
            0001.png
            ...
        example_...
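
Both the KTH and BAIR layouts above store each clip as a folder of zero-padded PNG frames, so a local copy can be sanity-checked by grouping frames by their parent folder. The function below is a minimal sketch using only the standard library; `collect_sequences` is a hypothetical helper, not the repository's data loader.

```python
from pathlib import Path


def collect_sequences(dataset_root, pattern="*.png"):
    """Map each folder that directly contains frames to its sorted frame paths."""
    root = Path(dataset_root)
    sequences = {}
    for frame in root.rglob(pattern):
        # Key by the clip folder relative to the dataset root,
        # e.g. boxing/person01_boxing_d1 or train/example_0.
        sequences.setdefault(frame.parent.relative_to(root), []).append(frame)
    # Zero-padded names (image_0001.png, 0000.png) sort correctly as strings.
    return {seq: sorted(frames) for seq, frames in sorted(sequences.items())}
```

Calling `collect_sequences("KTH")` or `collect_sequences("BAIR")` returns one entry per clip folder, which makes it easy to spot empty or misplaced folders before training.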