Break-A-Scene: Extracting Multiple Concepts from a Single Image

Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen-Or, Dani Lischinski

Given a single image with multiple concepts, annotated by loose segmentation masks, our method can learn a distinct token for each concept, and use natural language guidance to re-synthesize the individual concepts or combinations of them in various contexts.

Applications

Image Variations

Entangled Scene Decomposition

Background Extraction

Local Editing by Example

Installation

Install the conda virtual environment:

conda env create -f environment.yml
conda activate break-a-scene
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia

Usage

Step 1 - Extracting concepts from a single image

Create a new folder containing your single image, named img.jpg, and the corresponding masks, one mask per concept, e.g., mask0.png, mask1.png. An example folder is provided in examples/creature.
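Before training, it can help to check that the folder matches this layout. The following is a minimal sketch, not part of the repository: the helper name find_concept_masks is hypothetical, and it assumes masks are numbered consecutively from 0, as the file names above suggest.

```python
from pathlib import Path


def find_concept_masks(folder):
    """Return mask0.png, mask1.png, ... from `folder`, in index order.

    Assumption (not confirmed by the repository): masks are numbered
    consecutively starting at 0, and the image is named img.jpg.
    """
    folder = Path(folder)
    if not (folder / "img.jpg").is_file():
        raise FileNotFoundError(f"missing image: {folder / 'img.jpg'}")

    masks = []
    # Stop at the first missing index, so a gap in numbering ends the list.
    while (folder / f"mask{len(masks)}.png").is_file():
        masks.append(folder / f"mask{len(masks)}.png")

    if not masks:
        raise FileNotFoundError(f"no mask0.png found in {folder}")
    return masks
```

The length of the returned list is the value you would presumably pass as --num_of_assets in the training command.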

Then train the model by running the following:

python train.py \
  --instance_data_dir INPUT_PATH  \
  --num_of_assets NUMBER_OF_CONCEPTS \
  --initializer_tokens TOKEN0 TOKEN1 TOKEN2 \
  --class_data_dir PATH_TO_PRIOR_DIR \
  --phase1_train_steps 400 \
  --phase2_train_steps 400 \
  --output_dir OUTPUT_DIR

where --instance_data_dir is the path to the input folder, --num_of_assets is the number of concepts to extract, --initializer_tokens is an optional list of words describing the concepts (it can be omitted, but the model may produce better results with a proper initialization), --class_data_dir is the path to a folder of general images for the prior preservation loss (if you do not have such a folder, the script will generate the images for you, and they can be reused in future runs), --phase1_train_steps and --phase2_train_steps are the number of training steps per phase, and --output_dir is the path where the trained model is saved.

For example:

python train.py \
  --instance_data_dir examples/creature  \
  --num_of_assets 3 \
  --initializer_tokens creature bowl stone \
  --class_data_dir inputs/data_dir \
  --phase1_train_steps 400 \
  --phase2_train_steps 400 \
  --output_dir outputs/creature

Step 2 - Generating images

After training, a new model will be saved in OUTPUT_DIR with an extended vocabulary that contains the additional concepts <asset0> ... <assetN>, where N = NUMBER_OF_CONCEPTS - 1. For example, in the above case, there will be 3 additional tokens: <asset0>, <asset1> and <asset2>.
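The naming rule above can be sketched in a few lines of Python, which is handy when composing prompts programmatically. Both helper names (asset_tokens, fill_prompt) are hypothetical and not part of the repository; the only thing taken from the text is the <assetI> naming scheme.

```python
def asset_tokens(num_of_assets):
    """Return the token names <asset0> ... <assetN>, N = num_of_assets - 1."""
    if num_of_assets < 1:
        raise ValueError("num_of_assets must be at least 1")
    return [f"<asset{i}>" for i in range(num_of_assets)]


def fill_prompt(template, tokens):
    """Substitute tokens into a template by index, e.g. {0} -> <asset0>."""
    return template.format(*tokens)


tokens = asset_tokens(3)
beach_prompt = fill_prompt("a photo of {0} at the beach", tokens)
```

Here tokens holds the three names from the example, and beach_prompt is a prompt string that refers to the first concept.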

Now, you can generate images using:

python inference.py \
  --model_path TRAINED_MODEL_PATH \
  --prompt PROMPT \
  --output_path DESTINATION_PATH

For example, in the above case:

python inference.py \
  --model_path outputs/creature \
  --prompt "a photo of <asset0> at the beach" \
  --output_path "outputs/result.jpg"

Or:

python inference.py \
  --model_path outputs/creature \
  --prompt "an oil painting of <asset1> and <asset2>" \
  --output_path "outputs/result.jpg"
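Both examples above write to the same output file, so the second run overwrites the first. To render several prompts in one go, a small wrapper can give each prompt its own file. This is a rough sketch rather than a repository feature: the function names and the result_I.jpg naming are assumptions, and it uses only the three inference.py flags shown above.

```python
import subprocess
import sys
from pathlib import Path


def inference_command(model_path, prompt, output_path):
    """Build one inference.py call using only the flags documented above."""
    return [
        sys.executable, "inference.py",
        "--model_path", str(model_path),
        "--prompt", prompt,
        "--output_path", str(output_path),
    ]


def plan_jobs(model_path, prompts, out_dir):
    """Pair each prompt with its own output file (hypothetical naming)."""
    out_dir = Path(out_dir)
    return [
        inference_command(model_path, prompt, out_dir / f"result_{i}.jpg")
        for i, prompt in enumerate(prompts)
    ]


def run_jobs(jobs):
    """Run the planned commands one after another, stopping on failure."""
    for cmd in jobs:
        subprocess.run(cmd, check=True)
```

For instance, calling run_jobs(plan_jobs("outputs/creature", ["a photo of <asset0> at the beach", "an oil painting of <asset1> and <asset2>"], "outputs")) from the repository root would run the two examples above with separate output files.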

Citation

If you find this useful for your research, please cite the following:

@article{avrahami2023break,
  title={Break-A-Scene: Extracting Multiple Concepts from a Single Image},
  author={Avrahami, Omri and Aberman, Kfir and Fried, Ohad and Cohen-Or, Daniel and Lischinski, Dani},
  journal={arXiv preprint arXiv:2305.16311},
  year={2023}
}

Disclaimer

This is not an officially supported Google product.
