DarkMnDragon/DiT_ControlNet

DiT with ControlNet Support

This repository provides a Diffusion Transformer (DiT) tuned for CIFAR10 and other pixel-space flows. Compared with a standard DiT, the main changes are long skip connections and a final convolutional layer (use_long_skip and final_conv in the configuration below).
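To make the idea of long skip connections concrete, here is a minimal sketch of how blocks could be paired. It assumes a U-ViT-style layout (half the blocks save their activations, one middle block, and the remaining blocks each consume one saved activation in reverse order). That layout and the helper name `long_skip_pairs` are assumptions for illustration, not taken from this repository's code.

```python
def long_skip_pairs(depth):
    """Return (source_block, target_block) index pairs for long skips.

    Hypothetical helper: assumes an odd depth split into depth // 2 input
    blocks, one middle block, and depth // 2 output blocks.
    """
    if depth % 2 == 0:
        raise ValueError("this sketch assumes an odd depth")
    half = depth // 2
    # The first output block (index half + 1) reuses the activation of the
    # last input block (index half - 1), and so on outward.
    return [(half - 1 - j, half + 1 + j) for j in range(half)]


pairs = long_skip_pairs(13)
for src, dst in pairs:
    print(f"block {src} -> block {dst}")
```

Under these assumptions, a depth of 13 gives six skip pairs around a single middle block (index 6), which is one reason an odd depth is convenient for this kind of layout.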

Note that combining an MMDiT-style ControlNet with a DiT that uses long skip connections can make training unstable. To address this, the DiT architecture has been modified, in particular the order of the residual additions within the ControlNet.

Checkpoint for DiT-S/2 on CIFAR10 ($\text{FID}_{50\text{k}} = 3.678$), with the following configuration:

DiT(input_size=32,
    patch_size=2,
    in_channels=3,
    out_channels=3,
    hidden_size=512,
    depth=13,
    num_heads=8,
    mlp_ratio=4,
    num_classes=0,
    use_long_skip=True,
    final_conv=True)
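As a quick sanity check on this configuration, the sketch below derives the sequence length and per-layer widths. It assumes the usual DiT conventions (non-overlapping square patches, hidden size split evenly across heads, MLP width equal to hidden_size times mlp_ratio); these conventions are assumptions here, not confirmed against this repository's implementation.

```python
# Values copied from the configuration above.
input_size, patch_size, in_channels = 32, 2, 3
hidden_size, num_heads, mlp_ratio = 512, 8, 4

# Assumed standard DiT patchify: non-overlapping patch_size x patch_size patches.
tokens_per_side = input_size // patch_size          # patches along each image side
num_tokens = tokens_per_side ** 2                   # transformer sequence length
patch_dim = patch_size * patch_size * in_channels   # raw pixel values per patch

# Assumed standard attention/MLP sizing.
head_dim = hidden_size // num_heads
mlp_hidden = hidden_size * mlp_ratio

print(num_tokens, patch_dim, head_dim, mlp_hidden)
```

With these assumptions, each 32x32 CIFAR10 image becomes 256 tokens of 12 raw values each, projected to a width of 512, with 64 dimensions per attention head and an MLP hidden width of 2048.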
