Stars
InspireMusic: A Unified Framework for Music, Song, Audio Generation.
Realtime Video and Audio Streaming with WebRTC and Gradio
Official code for "ControlAR: Controllable Image Generation with Autoregressive Models"
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree
Official Implementation for paper: Negative Token Merging: Image-based Adversarial Feature Guidance
Official implementation of the paper “MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
[NeurIPS 2024] L4GM: Large 4D Gaussian Reconstruction Model
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
[arXiv'24] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
Official Repo for Paper "OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision"
Boosting Generative Novel View Synthesis with Sparse and Unposed Images
A minimal and universal controller for FLUX.1.
🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".
[ECCV 2024 & NeurIPS 2024] Official implementation of the paper TAPTR & TAPTRv2 & TAPTRv3
A course on aligning smol models.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
High-quality and editable surfel Gaussian generation through native 3D diffusion.
PyTorch implementation of "PersonaCraft: Personalized Full-Body Image Synthesis for Multiple Identities from Single References Using 3D-Model-Conditioned Diffusion"
Official implementation of "Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters"
Taming FLUX for Image Inversion & Editing; OpenSora for Video Inversion & Editing! (Official implementation for Taming Rectified Flow for Inversion and Editing.)