Stars
Implementation of "DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment, and Generate Anything
Custom nodes for using MV-Adapter in ComfyUI.
[ECCV-2024] This is the official implementation of ZeST.
High-resolution models for human tasks.
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
PyTorch code and models for the DINOv2 self-supervised learning method.
Easily create large video datasets from video URLs
[NeurIPS D&B Track 2024] Official implementation of HumanVid
[NeurIPS 2023] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
Clapper.app, a video synthesizer and sequencer designed for the age of AI cinema
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
[SIGGRAPH'24] CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization
WebUI extension for ControlNet
[SIGGRAPH 2024] "EASI-Tex: Edge-Aware Mesh Texturing from Single Image", ACM Transactions on Graphics.
An image prompt adapter that enables a pretrained text-to-image diffusion model to generate images from an image prompt.
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
High-speed downloads from mirror sites using HuggingFace's official download tool.
Official PyTorch implementation of "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" (ICML 2023)
DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance. [CVPR 2024] Official PyTorch implementation