Official PyTorch implementation of the paper:
QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity
Siyu Huang* (Harvard), Jie An* (Rochester), Donglai Wei (BC), Jiebo Luo (Rochester), Hanspeter Pfister (Harvard)
CVPR 2023
We devise a new style transfer framework called QuantArt for high visual-fidelity stylization. The core idea is to push the latent representation of the generated artwork toward the centroids of the real artwork distribution using vector quantization. QuantArt achieves decent performance on various image style transfer tasks.
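To make the core idea concrete, here is a minimal sketch of nearest-centroid vector quantization in plain Python. It is only an illustration of the general technique, not the module used in this repository: the function names, the toy codebook, and the latent values are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Replace every latent vector with its nearest codebook entry (centroid)."""
    quantized: List[List[float]] = []
    indices: List[int] = []
    for z in latents:
        # Index of the codebook entry closest to z.
        idx = min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return quantized, indices


# Toy values (hypothetical): three 2-D centroids and three latent vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
latents = [[0.2, -0.1], [0.9, 1.3], [3.0, 0.5]]

quantized, indices = quantize(latents, codebook)
print(indices)    # [0, 1, 2]
print(quantized)  # [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
```

Each latent vector snaps to the closest centroid, which is the sense in which quantization pulls a generated representation toward the centroids of a learned distribution.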
- python=3.8.5
- pytorch=1.7.0
- pytorch-lightning=1.0.8
- cuda=10.2
We recommend using conda to create a new environment with all dependencies installed.
conda env create -f environment.yaml
conda activate quantart
Download the pre-trained landscape2art model and put it under logs/. Run
bash test.sh
The stylized landscape images (from imgs/) will be saved in logs/.
Stage-1: The datasets and pre-trained models for codebook pretraining are as follows:
Stage-2: The datasets and pre-trained models for style transfer experiments are as follows:
Task | Pre-trained Model | Content | Style |
---|---|---|---|
photo->artwork | coco2art | MS_COCO | WikiArt |
landscape->artwork | landscape2art | LandscapesHQ | WikiArt |
landscape->artwork (non-VQ) | landscape2art_continuous | LandscapesHQ | WikiArt |
face->artwork | face2art | FFHQ | Metfaces |
artwork->artwork | art2art | WikiArt | WikiArt |
photo->photo | coco2coco | MS_COCO | MS_COCO |
landscape->landscape | landscape2landscape | LandscapesHQ | LandscapesHQ |
Follow Datasets and Pre-trained Models to download more datasets and pre-trained models. For instance, for the landscape-to-artwork style transfer model, the folder structure should be:
QuantArt
├── configs
├── datasets
│ ├── lhq_1024_jpg
│ │ ├── lhq_1024_jpg
│ │ │ ├── 0000000.jpg
│ │ │ ├── 0000001.jpg
│ │ │ ├── 0000002.jpg
│ │ │ ├── ...
│ ├── painter-by-numbers
│ │ ├── train
│ │ │ ├── 100001.jpg
│ │ │ ├── 100002.jpg
│ │ │ ├── 100003.jpg
│ │ │ ├── ...
│ │ ├── test
│ │ │ ├── 0.jpg
│ │ │ ├── 100000.jpg
│ │ │ ├── 100004.jpg
│ │ │ ├── ...
├── logs
│ ├── landscape2art
│ │ ├── checkpoints
│ │ ├── configs
├── taming
├── environment.yaml
├── main.py
├── train.sh
└── test.sh
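Before launching a run, it can help to confirm that the layout above is in place. The following is a small sketch of such a check; the helper name `missing_entries` and the two path lists are assumptions written for this example (they simply mirror the tree above) and are not part of the repository.

```python
from pathlib import Path
from typing import List

# Directories and files copied from the folder structure shown above.
EXPECTED_DIRS = [
    "configs",
    "datasets/lhq_1024_jpg/lhq_1024_jpg",
    "datasets/painter-by-numbers/train",
    "datasets/painter-by-numbers/test",
    "logs/landscape2art/checkpoints",
    "logs/landscape2art/configs",
    "taming",
]
EXPECTED_FILES = ["environment.yaml", "main.py", "train.sh", "test.sh"]


def missing_entries(root: str) -> List[str]:
    """Return the expected directories and files that do not exist under root."""
    base = Path(root)
    missing = [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing


if __name__ == "__main__":
    # Run from the QuantArt root directory.
    for entry in missing_entries("."):
        print("missing:", entry)
```

An empty result means every directory and file from the tree was found; the check does not look inside the image folders.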
Run the following command to test the pre-trained model on the testing dataset:
python -u main.py --base logs/landscape2art/configs/test.yaml -n landscape2art -t False --gpus 0,
- --base: path to the config file.
- -n: result folder under logs/.
- -t: whether to train.
- --gpus: GPUs used.
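If you want to script several test runs, the command can be assembled from the four options described here. This is a hedged sketch: `build_command` is a hypothetical helper, and it only joins the documented flags into an argument list without calling or inspecting main.py.

```python
from typing import List


def build_command(base: str, name: str, train: bool, gpus: str) -> List[str]:
    """Assemble the main.py invocation from the documented options."""
    return [
        "python", "-u", "main.py",
        "--base", base,      # path to the config file
        "-n", name,          # result folder under logs/
        "-t", str(train),    # "True" or "False", as written in the README commands
        "--gpus", gpus,      # passed through unchanged, e.g. "0," or "0,1,2,3"
    ]


cmd = build_command("logs/landscape2art/configs/test.yaml", "landscape2art", False, "0,")
print(" ".join(cmd))
# python -u main.py --base logs/landscape2art/configs/test.yaml -n landscape2art -t False --gpus 0,
```

The list form can be handed to `subprocess.run(cmd, check=True)` from the repository root. Note that the GPU string is kept exactly as it appears in the README commands, including the trailing comma.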
Stage-1: Prepare the WikiArt dataset as above. Download the file lists painter-by-numbers-train.txt and painter-by-numbers-test.txt and put them under datasets/. Run the following command to train a Stage-1 model (i.e., an autoencoder and a codebook). Four GPUs are recommended but not required.
python -u main.py --base configs/vqgan_wikiart.yaml -t True --gpus 0,1,2,3
Two separate Stage-1 models are required: one for the content dataset and one for the style dataset.
Stage-2: Run bash train.sh or the following command to train a photo-to-artwork model:
python -u main.py --base configs/coco2art.yaml -t True --gpus 0,
- --base: path to the config file.
- -n: result folder under logs/.
- -t: whether to train.
- --gpus: GPUs used.
- --resume_from_checkpoint: resume training from a checkpoint.
More training configs for Stage-2 models can be found in configs/.
Unpaired data:
To test unpaired data, follow the comments in configs/custom_unpaired.yaml to specify model checkpoints and data paths. Then run
python -u main.py --base configs/custom_unpaired.yaml -n custom_unpaired -t False --gpus 0,
Paired data:
To test paired data, the corresponding content and style images (in two folders) must have the same file names. Follow the comments in configs/custom_paired.yaml to specify model checkpoints and data paths, then run
python -u main.py --base configs/custom_paired.yaml -n custom_paired -t False --gpus 0,
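Since paired testing relies on matching file names across the content and style folders, a quick check before running can save a failed run. The sketch below is an assumption-laden helper, not part of the repository: the function names, the set of image suffixes, and the example file names are all hypothetical.

```python
from pathlib import Path
from typing import List, Set, Tuple

# Assumption: which suffixes count as images; adjust to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def image_names(folder: str) -> Set[str]:
    """Collect the image file names found directly inside folder."""
    return {
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }


def unmatched(content: Set[str], style: Set[str]) -> Tuple[List[str], List[str]]:
    """Return names present only in the content set and only in the style set."""
    return sorted(content - style), sorted(style - content)


# Toy example with hypothetical file names.
only_content, only_style = unmatched({"a.jpg", "b.jpg"}, {"b.jpg", "c.jpg"})
print(only_content, only_style)  # ['a.jpg'] ['c.jpg']
```

With real data, call `unmatched(image_names(content_dir), image_names(style_dir))` using the two folders set in configs/custom_paired.yaml; both returned lists should be empty.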
@inproceedings{huang2023quantart,
title={QuantArt: Quantizing Image Style Transfer Towards High Visual Fidelity},
author={Siyu Huang and Jie An and Donglai Wei and Jiebo Luo and Hanspeter Pfister},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
month={June},
year={2023}
}
This repository is heavily built upon the amazing VQGAN.