Official implementation of the paper AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style, a method for attribute controlled image synthesis from layout which allows to specify the appearance of individual objects.
Authors: Stanislav Frolov, Avneesh Sharma, Jörn Hees, Tushar Karayil, Federico Raue, Andreas Dengel
Contact: [email protected]
Generated images using a reconfigurable layout and atttributes to control the appearance of individual objects. From left to right: add tree [green], add plane [metal, orange], add sheep [white], add horse [brown], add person, add jacket [pink], grass → grass [dry].
We created a synthetic dataset based on MNIST Dialog to study our approach. Each image contains multiple randomly placed digits on a black background, and each digit has multiple attributes (digit color, background color, and stroke style). From left to right we modify the specification of individual objects without affecting the rest of the image which enables controllable image generation. See paper for more details.
- Python 3.6
- PyTorch 1.5
# clone repository
git clone https://github.com/stanifrolov/attrlostgan.git
# install requirements
pip install -r requirements.txt
# setup for roi_layers
python setup.py build develop
# download Visual Genome dataset to `datasets/vg`
bash scripts/download_vg.sh
python scripts/preprocess_vg.py
Download the following checkpoints to ./pretrained
in order to use the demo notebooks.
- AttrLostGANv1 128x128
- AttrLostGANv2 128x128
- AttrLostGANv2 256x256
- Method by Ke Ma et al. using Attribute-guided Layout2Im
We provide two notebooks to play around with the pre-trained models in an interactive way.
- dmnist_demo.ipynb for the model trained on MNIST Dialog
- demo.ipynb for the models trained on Visual Genome (version and image size can be configured)
python train.py --version=<1 or 2> --img_size=<128 or 256>
python test.py --version=<1 or 2> --img_size=<128 or 256> --split=<train, val or test> --model_path=<path to model>
python dmnist_train.py --dataset_path=<path to dataset>
python dmnist_test.py --dataset_path=<path to dataset> --model_path=<path to model>
To compute the IS on images and SceneIS on object crops we used the official TensorFlow implementation at openai/improved-gan and TensorFlow version 1.15.3.
# IS
python ./eval/is.py --num_images=5096 --splits=5 --image_folder=<path to images>
# SceneIS
python ./eval/is.py --num_images=30000 --splits=10 --image_folder=<path to object crops>
To compute the FID between real and fake images, and SceneFID between real and fake object crops we used the official TensorFlow implementation at bioinf-jku/TTUR and TensorFlow version 1.15.3.
# FID
python ./eval/fid.py --size=<image size 128 or 256> <path to fake images> <path to real images>
# SceneFID
python ./eval/fid.py --size=128 <path to fake images> <path to real images>
To compute the Diversity Score (DS) we used the official implementation at richzhang/PerceptualSimilarity
The results in our paper are obtained by first training an object label classifier on fake data, and then reporting the test accuracy of the model with the lowest validation loss (both val and test on real data).
# step 1: train the classifier
python ./eval/resnet_object_classifier.py \
--train_path=<path to samples generated from train split> \
--val_path=<path to samples generated from val split> \
--test_path=<path to samples generated from test split> \
--model_name=resnet101 --feature_extract --total_epoch=20 --batch_size=256
# step 2: report test accuracy with the lowest validation loss using the log file
The results in our paper are obtained by first training a (multi-label) attribute classifier on fake data, and then reporting the test micro-F1 score of the model with lowest validation loss (both val and test on real data).
# step 1: train the classifier
python ./eval/resnet_attribute_classifier.py \
--train_path=<path to samples generated from train split> \
--val_path=<path to samples generated from val split> \
--test_path=<path to samples generated from test split> \
--model_name=resnet101 --feature_extract --total_epoch=20
# step 2: choose test predictions with the lowest validation loss using the log file
# step 3: compute Attr-F1
python ./eval/attr_f1.py --path=<path_to_csv>
This works builds upon LostGAN-v1 and LostGAN-v2, using the official implementations. We also used the code of Attribute-guided Layout2Im to train a baseline. Furthermore, we used open-source implementations of evaluation metrics such as IS, FID, and LPIPS.
If you found this research useful please consider citing:
@article{frolov2021attrlostgan,
title={AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style},
author={Frolov, Stanislav and Sharma, Avneesh and Hees, J{\"o}rn and Karayil, Tushar and Raue, Federico and Dengel, Andreas},
journal={arXiv preprint arXiv:2103.13722},
year={2021}
}