Stars
A simple screen-parsing tool towards a pure vision-based GUI agent
🔥 CNN for Watermark Removal using Deep Image Prior with PyTorch 🔥.
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers"
DeepEP: an efficient expert-parallel communication library
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling.
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
A framework for high-quality material transfer that allows users to adjust the degree of material application.
An open-source implementation of Regional Adaptive Sampling (RAS), a novel diffusion model sampling strategy that introduces regional variability in sampling steps
Official code of "MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation"
FlashVideo: Flowing Fidelity to Detail for Efficient High-Resolution Video Generation
SynCD: Generating Multi-Image Synthetic Data for Text-to-Image Customization
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
[CVPR 2024] code release for "DiffusionLight: Light Probes for Free by Painting a Chrome Ball"
Lora beYond Conventional methods, Other Rank adaptation Implementations for Stable diffusion.
An 8-step inversion and 8-step editing process works effectively with the FLUX-dev model. (3x speedup with results that are comparable or even superior to baseline methods)
High-Resolution Image Synthesis with Latent Diffusion Models
A generative world for general-purpose robotics & embodied AI learning.
PyTorch code and models for the DINOv2 self-supervised learning method.
Official implementation of the OneDiffusion paper
The code of our work "Golden Noise for Diffusion Models: A Learning Framework".
https://github.com/xie-lab-ml/Golden-Noise-for-Diffusion-Models for ComfyUI
A ComfyUI custom node that integrates Mistral AI's Pixtral Large vision model, enabling powerful multimodal AI capabilities within ComfyUI. Pixtral Large is a 124B parameter model (123B decoder + 1…