This is an official implementation of paper 'Improving Diffusion Models for Authentic Virtual Try-on in the Wild'
🤗 Try our huggingface Demo
- demo model
- inference code
- training code
git clone https://github.com/yisol/IDM-VTON.git
cd IDM-VTON
conda env create -f environment.yaml
conda activate idm
You can download VITON-HD dataset from VITON-HD.
After download VITON-HD dataset, move vitonhd_test_tagged.json into the test folder.
Structure of the Dataset directory should be as follows.
train
|-- ...
test
|-- image
|-- image-densepose
|-- agnostic-mask
|-- cloth
|-- vitonhd_test_tagged.json
You can download DressCode dataset from DressCode.
We provide pre-computed densepose images and captions for garments here.
We used detectron2 for obtaining densepose images.
After download DressCode dataset, place image-densepose directories and caption text files as follows.
DressCode
|-- dresses
|-- images
|-- image-densepose
|-- dc_caption.txt
|-- ...
|-- lower_body
|-- images
|-- image-densepose
|-- dc_caption.txt
|-- ...
|-- upper_body
|-- images
|-- image-densepose
|-- dc_caption.txt
|-- ...
Inference using python file with arguments.
accelerate launch inference.py \
--width 768 --height 1024 --num_inference_steps 30 \
--output_dir "result" \
--unpaired \
--data_dir "DATA_DIR" \
--seed 42 \
--test_batch_size 2 \
--guidance_scale 2.0
You can simply run with the script file.
sh inference.sh
For DressCode dataset, put the category you want to generate images via category argument.
accelerate launch inference_dc.py \
--width 768 --height 1024 --num_inference_steps 30 \
--output_dir "result" \
--unpaired \
--data_dir "DATA_DIR" \
--seed 42
--test_batch_size 2
--guidance_scale 2.0
--category "upper_body"
You can simply run with the script file.
sh inference.sh
For the demo, GPUs are supported from zerogpu, and auto masking generation codes are based on OOTDiffusion and DCI-VTON.
Parts of the code are based on IP-Adapter.
@article{choi2024improving,
title={Improving Diffusion Models for Virtual Try-on},
author={Choi, Yisol and Kwak, Sangkyung and Lee, Kyungmin and Choi, Hyungwon and Shin, Jinwoo},
journal={arXiv preprint arXiv:2403.05139},
year={2024}
}
The codes and checkpoints in this repository are under the CC BY-NC-SA 4.0 license.