
Commit 34b6a03

Warlord-K, sayakpaul, and pcuenca authored
Add in Blog Post for SegMoE: Segmind Mixture of Diffusion Experts (huggingface#1781)
* Create thumbnail.png * Add thumbnail * Create segmoe.md * Add Segmoe * Update segmoe.md * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update _blog.yml Co-authored-by: Pedro Cuenca <[email protected]> * Update _blog.yml Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Add files via upload * Update segmoe.md * Update _blog.yml * Update _blog.yml * Update _blog.yml * Update segmoe.md * Update segmoe.md * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update segmoe.md Co-authored-by: Pedro Cuenca <[email protected]> * Update _blog.yml * Update _blog.yml * Update _blog.yml * Update _blog.yml * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update _blog.yml Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md Co-authored-by: Sayak Paul <[email protected]> * Update segmoe.md * Update segmoe.md --------- Co-authored-by: Sayak Paul <[email protected]> Co-authored-by: Pedro Cuenca <[email protected]>
1 parent 4a1de82 commit 34b6a03

File tree

3 files changed: +276 -0 lines changed


_blog.yml

+12
```diff
@@ -3426,3 +3426,15 @@
   - guide
   - collaboration
   - research
+
+- local: segmoe
+  title: "SegMoE: Segmind Mixture of Diffusion Experts"
+  author: Warlord-K
+  guest: true
+  thumbnail: /blog/assets/segmoe/thumbnail.png
+  date: February 3, 2024
+  tags:
+    - text-to-image
+    - stable-diffusion
+    - moe
+    - segmoe
```

assets/segmoe/thumbnail.png

1.21 MB

segmoe.md

+264
---
title: "SegMoE: Segmind Mixture of Diffusion Experts"
thumbnail: /blog/assets/segmoe/thumbnail.png
authors:
- user: Warlord-K
  guest: true
- user: Icar
  guest: true
- user: harishp
  guest: true
---

# SegMoE: Segmind Mixture of Diffusion Experts

SegMoE is an exciting framework for creating Mixture-of-Experts Diffusion models from scratch! It is comprehensively integrated within the Hugging Face ecosystem and comes fully supported in `diffusers` 🔥!

Among the features and integrations being released today:

- [Models on the Hub](https://huggingface.co/models?search=segmind/SegMoE), with their model cards and licenses (Apache 2.0)
- [GitHub repository](https://github.com/segmind/segmoe) to create your own MoE-style models.

## Table of Contents

- [What is SegMoE?](#what-is-segmoe)
- [About the name](#about-the-name)
- [Inference](#inference)
  - [Samples](#samples)
  - [Using 🤗 Diffusers](#using-🤗-diffusers)
  - [Using a Local Model](#using-a-local-model)
- [Comparison](#comparison)
- [Creating your Own SegMoE](#creating-your-own-segmoe)
- [Disclaimers and ongoing work](#disclaimers-and-ongoing-work)
- [Conclusion](#conclusion)
- [Additional Resources](#additional-resources)

## What is SegMoE?

SegMoE models follow the same architecture as Stable Diffusion. Like [Mixtral 8x7b](https://huggingface.co/blog/mixtral), a SegMoE model combines multiple models in one. The way this works is by replacing some Feed-Forward layers with a sparse MoE layer. A MoE layer contains a router network to select which experts process which tokens most efficiently.
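
To make the routing idea concrete, here is a minimal, illustrative PyTorch sketch of a sparse MoE feed-forward layer with a top-k router. It is a conceptual toy, not SegMoE's actual implementation, and the class and parameter names are our own:

```python
# Conceptual sketch only — not SegMoE's real code.
import torch
import torch.nn as nn


class SparseMoEFeedForward(nn.Module):
    def __init__(self, dim, hidden_dim, num_experts=4, num_experts_per_tok=2):
        super().__init__()
        self.num_experts_per_tok = num_experts_per_tok
        self.router = nn.Linear(dim, num_experts, bias=False)  # gate network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (batch, tokens, dim)
        logits = self.router(x)                                  # score every expert per token
        weights, indices = logits.topk(self.num_experts_per_tok, dim=-1)
        weights = weights.softmax(dim=-1)                        # normalize over the selected experts
        out = torch.zeros_like(x)
        for k in range(self.num_experts_per_tok):
            for e, expert in enumerate(self.experts):
                mask = indices[..., k] == e                      # tokens routed to expert e at rank k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out
```

Each token only passes through its top-k experts, which is how a MoE can hold many experts' worth of parameters while computing only a fraction of them per token.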

You can use the `segmoe` package to create your own MoE models! The process takes just a few minutes. For further information, please visit [the GitHub repository](https://github.com/segmind/segmoe). We take inspiration from the popular library [`mergekit`](https://github.com/arcee-ai/mergekit) to design `segmoe`, and we thank the `mergekit` contributors for such a useful library.

For more details on MoEs, see the Hugging Face 🤗 post: [hf.co/blog/moe](https://huggingface.co/blog/moe).

**SegMoE release TL;DR:**

- Release of SegMoE-4x2, SegMoE-2x1 and SegMoE-SD4x2 versions
- Release of custom MoE-making code

### About the name

The SegMoE MoEs are called **SegMoE-AxB**, where `A` refers to the number of expert models MoE-d together and `B` refers to the number of experts involved in the generation of each image. Depending on the configuration settings, only some layers of the model (the feed-forward blocks, the attention layers, or all of them) are replicated; the rest of the parameters are the same as in a Stable Diffusion model. For more details about how MoEs work, please refer to [the "Mixture of Experts Explained" post](https://huggingface.co/blog/moe).

## Inference

We release 3 merges on the Hub:

1. [SegMoE 2x1](https://huggingface.co/segmind/SegMoE-2x1-v0) has two expert models.
2. [SegMoE 4x2](https://huggingface.co/segmind/SegMoE-4x2-v0) has four expert models.
3. [SegMoE SD 4x2](https://huggingface.co/segmind/SegMoE-SD-4x2-v0) has four Stable Diffusion 1.5 expert models.

### Samples

Images generated using [SegMoE 4x2](https://huggingface.co/segmind/SegMoE-4x2-v0):

![image](https://cdn-uploads.huggingface.co/production/uploads/62f8ca074588fe31f4361dae/HgF6DLC-_3igZT6kFIq4J.png)

Images generated using [SegMoE 2x1](https://huggingface.co/segmind/SegMoE-2x1-v0):

![image](https://cdn-uploads.huggingface.co/production/uploads/62f8ca074588fe31f4361dae/ofIz_6VehCHRlpsfrxwFm.png)

Images generated using [SegMoE SD 4x2](https://huggingface.co/segmind/SegMoE-SD-4x2-v0):

![image](https://cdn-uploads.huggingface.co/production/uploads/62f8ca074588fe31f4361dae/z6T2lYPlbXifoh_D5EkLZ.png)

### Using 🤗 Diffusers

Please run the following command to install the `segmoe` package. Make sure you have the latest versions of `diffusers` and `transformers` installed.

```bash
pip install -U segmoe diffusers transformers
```

The following snippet loads the second model from the list above ("SegMoE 4x2") and runs generation on it.

```python
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
```

![image](https://github.com/Warlord-K/blog/assets/95569637/93e7c4a2-9012-44c3-b778-e5363ad5556c)

### Using a Local Model

Alternatively, a local model can be loaded; here, `segmoe_v0` is the path to a directory containing a local SegMoE model. Check out [Creating your Own SegMoE](#creating-your-own-segmoe) to learn how to build your own!

```python
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmoe_v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
```
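
If you prefer to work from a local copy of one of the released checkpoints rather than an on-the-fly download, one option is `huggingface_hub.snapshot_download`. This is a small sketch under the assumption that the Hub checkpoints use the same on-disk layout as a local SegMoE folder (the target directory name is arbitrary):

```python
from huggingface_hub import snapshot_download

# Download the released SegMoE 4x2 checkpoint into a local folder,
# then point SegMoEPipeline at that folder as shown above.
local_dir = snapshot_download("segmind/SegMoE-4x2-v0", local_dir="SegMoE-4x2-v0")
print(local_dir)
```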

## Comparison

Prompt understanding seems to improve, as shown in the images below. Each image shows the following models left to right: [SegMoE-2x1-v0](https://huggingface.co/segmind/SegMoE-2x1-v0), [SegMoE-4x2-v0](https://huggingface.co/segmind/SegMoE-4x2-v0), Base Model ([RealVisXL_V3.0](https://huggingface.co/SG161222/RealVisXL_V3.0)).

![image](https://github.com/segmind/segmoe/assets/95569637/bcdc1b11-bbf5-4947-b6bb-9f745ff0c040)

<div align="center">three green glass bottles</div>
<br>

![image](https://github.com/segmind/segmoe/assets/95569637/d50e2af0-66d2-4112-aa88-bd4df88cbd5e)

<div align="center">panda bear with aviator glasses on its head</div>
<br>

![image](https://github.com/segmind/segmoe/assets/95569637/aba2954a-80c2-428a-bf76-0a70a5e03e9b)

<div align="center">the statue of Liberty next to the Washington Monument</div>
<br>

![image](https://github.com/Warlord-K/blog/assets/95569637/f113f804-8217-4b7f-b3a5-213b658697d1)

<div align="center">Taj Mahal with its reflection. detailed charcoal sketch.</div>

## Creating your Own SegMoE

Simply prepare a `config.yaml` file with the following structure:

```yaml
base_model: Base Model Path, Model Card or CivitAI Download Link
num_experts: Number of experts to use
moe_layers: Type of Layers to Mix (can be "ff", "attn" or "all"). Defaults to "attn"
num_experts_per_tok: Number of experts to use per token
experts:
  - source_model: Expert 1 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
  - source_model: Expert 2 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
  - source_model: Expert 3 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
  - source_model: Expert 4 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
```

Any number of models can be combined. For detailed information on how to create a config file, please refer to the [GitHub repository](https://github.com/segmind/segmoe).

**Note:**
Both Hugging Face and CivitAI models are supported. For CivitAI models, paste the download link of the model, for example: `https://civitai.com/api/download/models/239306`. A filled-in example config is shown below.
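
As a purely illustrative sketch (the model choices, expert count, and prompts below are our own assumptions, not an official recipe), a filled-in `config.yaml` mixing Hub models with a CivitAI checkpoint could look like this. Note that all experts presumably need to share the base model's architecture (SDXL with SDXL, SD 1.5 with SD 1.5):

```yaml
# Hypothetical example — adjust the models and prompts to your use case.
base_model: SG161222/RealVisXL_V3.0
num_experts: 3
moe_layers: all
num_experts_per_tok: 2
experts:
  - source_model: stabilityai/stable-diffusion-xl-base-1.0
    positive_prompt: "photorealistic, highly detailed, sharp focus"
    negative_prompt: "blurry, low quality, deformed"
  - source_model: SG161222/RealVisXL_V3.0
    positive_prompt: "cinematic portrait, dramatic lighting, intricate detail"
    negative_prompt: "bad anatomy, worst quality"
  - source_model: https://civitai.com/api/download/models/239306
    positive_prompt: "vibrant colors, illustration, clean lines"
    negative_prompt: "washed out, low contrast"
```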

Then run the following command:

```bash
segmoe config.yaml segmoe_v0
```

This will create a folder called `segmoe_v0` with the following structure:

```bash
├── model_index.json
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   └── model.safetensors
├── text_encoder_2
│   ├── config.json
│   └── model.safetensors
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── tokenizer_2
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
└── vae
    ├── config.json
    └── diffusion_pytorch_model.safetensors
```

Alternatively, you can also use the Python API to create a mixture of experts model:

```python
from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("config.yaml", device="cuda")

pipeline.save_pretrained("segmoe_v0")
```
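
Since the object returned here is the same `SegMoEPipeline` class used for inference above, we assume it can be called for generation right away, before or after saving:

```python
# Assumption: the freshly created pipeline is directly callable, like the loaded ones above.
img = pipeline(
    prompt="cosmic canvas, orange city background, painting of a chubby cat",
    negative_prompt="nsfw, bad quality, worse quality",
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")
```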

### Push to Hub

The model can be pushed to the Hub via the `huggingface-cli`:

```bash
huggingface-cli upload segmind/segmoe_v0 ./segmoe_v0
```

The model can also be pushed to the Hub directly from Python:

```python
from huggingface_hub import create_repo, upload_folder

model_id = "segmind/SegMoE-v0"

repo_id = create_repo(repo_id=model_id, exist_ok=True).repo_id

upload_folder(
    repo_id=repo_id,
    folder_path="segmoe_v0",
    commit_message="Initial Commit",
    ignore_patterns=["step_*", "epoch_*"],
)
```

Detailed usage instructions can be found [here](https://huggingface.co/docs/huggingface_hub/guides/upload).
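
Once uploaded, the merge can be loaded back from the Hub by its repo id, just like the released checkpoints (here we reuse the hypothetical `segmind/SegMoE-v0` repo from the example above):

```python
from segmoe import SegMoEPipeline

# Load the uploaded merge directly from the Hub.
pipeline = SegMoEPipeline("segmind/SegMoE-v0", device="cuda")
```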

## Disclaimers and ongoing work

- **Slower Speed**: If the number of experts per token is larger than 1, the MoE performs computation across several expert models. This makes it slower than a single SD 1.5 or SDXL model.

- **High VRAM usage**: MoEs run inference very quickly but still need a large amount of VRAM (and hence an expensive GPU). This makes it challenging to use them in local setups, but they are great for deployments with multiple GPUs. As a reference point, SegMoE-4x2 requires 24GB of VRAM in half-precision (see the quick check below).
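
As a quick sanity check before loading, you can compare that ~24GB figure against your GPU's total memory. This uses a standard PyTorch call and is not part of `segmoe`:

```python
import torch

# Report the total VRAM of the first CUDA device;
# SegMoE-4x2 needs roughly 24 GB in half precision.
props = torch.cuda.get_device_properties(0)
print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
```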

## Conclusion

We built SegMoE to provide the community with a new tool that can potentially create SOTA Diffusion Models with ease, just by combining pretrained models while keeping inference times low. We're excited to see what you can build with it!

## Additional Resources

- [Mixture of Experts Explained](https://huggingface.co/blog/moe)
- [Mixture of Experts Models on Hugging Face](https://huggingface.co/models?other=moe)