English | 中文
Stable-Diffusion implemented in C++ with the ncnn framework (Shit Mountain + Blind Box ver.)
Zhihu: https://zhuanlan.zhihu.com/p/582552276
- Before using the model, please read the official stable-diffusion model license and abide by it; its terms are not repeated here.
- The code runs on CPU only and, after tuning, needs only 8 GB of RAM!
- Thanks to the PR from nihui, the output quality is now stable (the prompt must still be well written; you can refer to The Code of Quintessence). You are welcome to try it.
- Three main steps of Stable-Diffusion:
- CLIP: text-embedding
- iterative sampling with sampler
- decode the sampler results to obtain output images
- Model details:
- Weights: Naifu (u know where to find)
- Sampler: Euler ancestral (k-diffusion version)
- Resolution: 512*512
- Denoiser: CFGDenoiser, CompVisDenoiser
- Prompt: positive & negative, both supported :)
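
The sampler and denoiser settings above can be illustrated with a small pure-Python sketch. This is not the C++ code from this repo: it assumes the usual classifier-free guidance formula and the Euler ancestral update as it appears in k-diffusion, works on flat lists of floats instead of tensors, and the names `cfg_combine`, `ancestral_step`, and `euler_ancestral_step` are hypothetical.

```python
import math
import random


def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: move the negative-prompt prediction
    towards the positive-prompt prediction by a factor `scale`."""
    return [u + (c - u) * scale for u, c in zip(uncond, cond)]


def ancestral_step(sigma_from, sigma_to, eta=1.0):
    """Split the move sigma_from -> sigma_to into a deterministic part
    (sigma_down) and a fresh-noise part (sigma_up)."""
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


def euler_ancestral_step(x, denoised, sigma_from, sigma_to, rng):
    """One Euler ancestral update on a flat list of latent values.
    `denoised` is the (guided) denoiser output at noise level sigma_from."""
    sigma_down, sigma_up = ancestral_step(sigma_from, sigma_to)
    out = []
    for xi, di in zip(x, denoised):
        d = (xi - di) / sigma_from               # derivative estimate
        xi = xi + d * (sigma_down - sigma_from)  # deterministic Euler move
        xi = xi + rng.gauss(0.0, 1.0) * sigma_up # re-inject fresh noise
        out.append(xi)
    return out
```

In this sketch the last step (`sigma_to == 0`) adds no noise and simply returns the denoised values, which is why the final latent can be passed straight to the decoder.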
- Because it currently runs rather slowly, no prebuilt exe file is provided; please compile it yourself.
- Download the three bin files from Baidu Netdisk (百度网盘) or Google Drive and put them in the corresponding assets directory for compilation.
- A simple test prompt is given in this repo.
- Very sensitive to prompts: to get a high-quality picture, the prompt must be well written.
- Slow: one iterative step takes about 5-10 seconds.
I've uploaded the three onnx models used by Stable-Diffusion so that you can do some interesting work with them.
You can find them via the link above.
- Please abide by the stable diffusion model license, and DO NOT use the models for illegal purposes!
- If you use these onnx models in an open-source project, please let me know; I'll follow it and look forward to your next great work :)
- FrozenCLIPEmbedder
ncnn (input & output): token, multiplier, cond, conds
onnx (input & output): onnx::Reshape_0, 2271
z = onnx(onnx::Reshape_0=token)
origin_mean = z.mean()
z *= multiplier
new_mean = z.mean()
z *= origin_mean / new_mean
conds = torch.concat([cond,z], dim=-2)
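
The pseudocode above can be made runnable without torch. The following is a minimal sketch in plain Python, not the repo's implementation: it assumes `z` is a list of token rows (tokens by embedding dim), that `multiplier` holds one weight per token, and that `cond` is the list of rows accumulated so far, so concatenating along `dim=-2` becomes list concatenation. The function names are hypothetical.

```python
def tensor_mean(rows):
    """Mean over every value of a 2-D list (tokens x dim)."""
    return sum(sum(r) for r in rows) / sum(len(r) for r in rows)


def reweight_embedding(z, multiplier):
    """Scale each token row by its weight, then rescale everything so the
    overall mean matches the mean before weighting.
    Note: divides by the new mean, so it must be non-zero."""
    origin_mean = tensor_mean(z)
    scaled = [[v * m for v in row] for row, m in zip(z, multiplier)]
    new_mean = tensor_mean(scaled)
    ratio = origin_mean / new_mean
    return [[v * ratio for v in row] for row in scaled]


def append_chunk(cond, z):
    """Concatenate along the token axis (dim=-2 for a tokens x dim tensor)."""
    return cond + z


# Usage: emphasise the second token by a factor of 1.5
z = [[1.0, 3.0], [2.0, 2.0]]
conds = append_chunk([], reweight_embedding(z, [1.0, 1.5]))
```

The rescaling step keeps the overall magnitude of the embedding unchanged, so emphasising some tokens does not shift the mean of the conditioning as a whole.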
- UNetModel
ncnn (input & output): in0, in1, in2, c_in, c_out, outout
onnx (input & output): x, t, cc, out
outout = in0 + onnx(x=in0 * c_in, t=in1, cc=in2) * c_out
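
As a hedged illustration of the wrapper formula above, here is a small Python sketch. It assumes that `c_in` and `c_out` follow the scalings used by k-diffusion's CompVisDenoiser (`c_in = 1 / sqrt(sigma^2 + 1)`, `c_out = -sigma`); the source does not say how this repo computes them. `unet` stands for any callable that plays the role of the onnx model, and `compvis_scalings` and `denoise` are hypothetical names.

```python
import math


def compvis_scalings(sigma, sigma_data=1.0):
    """Input/output scalings for an eps-predicting model (assumed to match
    k-diffusion's CompVisDenoiser)."""
    c_in = 1.0 / math.sqrt(sigma**2 + sigma_data**2)
    c_out = -sigma
    return c_in, c_out


def denoise(unet, in0, in1, in2, sigma):
    """outout = in0 + unet(x=in0 * c_in, t=in1, cc=in2) * c_out,
    written for flat lists of floats."""
    c_in, c_out = compvis_scalings(sigma)
    eps = unet([v * c_in for v in in0], in1, in2)
    return [v + e * c_out for v, e in zip(in0, eps)]


# Usage with a stand-in model that just echoes its scaled input
echo_unet = lambda x, t, cc: x
result = denoise(echo_unet, [2.0], in1=0, in2=None, sigma=math.sqrt(3.0))
```

Folding `c_in` and `c_out` into the ncnn graph as extra inputs means the sampler only has to pass the raw latent and the two scalars, and gets the denoised latent back directly.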