GitHub - kkennethwu/diptych-prompt-unofficail

Intro

In this project we try to inplement DiptychPrompting for food quality enhancement. I use SAM2 to segment the reference image, and enhance the reference attenstion in the Flux-Controlnet-Inpainting.

TO-DO

manaully segmentation should be fixed with more powerful Nature Langauge Segmentation Model.
Check if there is bug in the referenece attention enhancement. (the scale_factor can only be set to 1.0 for now, while the paper set it as 1.3.)
Use lora-dsitilled version of Flux-Controlnet-Inpainting, which has faster inference speed.
Try GPD dataset which has better visual quality in food images.

Dataset

Food101: https://paperswithcode.com/dataset/food-101
GPD: https://github.com/Openning07/GPA

Requirments

SAM2

conda create --name same2 python=3.10
conda activate sam2
pip isntall sam2

Flux-Controlnet-Inpainting

conda create --name diptych python=3.10
conda activate diptych
pip install -r requirements.txt

I use A6000 (48G) to inference the Flux model, becasue my 4090 will run into OOM issue.

Starting

cd DiptychPrompting
# process data in food-101
python data_preprocessor.py

# generate diptych prompt
python diptych_generator.py --cls {food_cls}

# run food quality enhancement
python flux_inpaint.py -c {food_cls} -i {image_dir/image_path} -t "one the table"

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
DiptychPrompting		DiptychPrompting
blip-diffusion		blip-diffusion
README.md		README.md
REPORT.url		REPORT.url

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Intro

TO-DO

Dataset

Requirments

Starting

About

Releases

Packages

Languages

kkennethwu/diptych-prompt-unofficail

Folders and files

Latest commit

History

Repository files navigation

Intro

TO-DO

Dataset

Requirments

Starting

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages