Stars
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation
[CVPR 2025] "A Distractor-Aware Memory for Visual Object Tracking with SAM2"
Official PyTorch implementation of "ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler"
HunyuanVideo: A Systematic Framework For Large Video Generation Model
[CVPR 2024] Official implementation of Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
ConceptAttention: A method for interpreting multi-modal diffusion transformers.
Original implementation of "Radiant Foam: Real-Time Differentiable Ray Tracing"
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Janus-Series: Unified Multimodal Understanding and Generation Models
Matting Anything Model (MAM), an efficient and versatile framework for estimating the alpha matte of any instance in an image with flexible and interactive visual or linguistic user prompt guidance.
This repository includes the official project of Mask Guided (MG) Matting, presented in our paper: Mask Guided Matting via Progressive Refinement Network
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
[CVPR 2024] MaGGIe: Mask Guided Gradual Human Instance Matting
[Information Fusion (Vol.103, Mar. '24)] Boosting Image Matting with Pretrained Plain Vision Transformers
OneDrive MassDL-Cli allows you to easily download public OneDrive folders from your Terminal with one simple command!
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
[ICML 2024] MagicPose (also known as MagicDance): Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation
[ECCV 2024] IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation
[CVPR 2024] DisCo: Referring Human Dance Generation in Real World
Implementation code: Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models
[CVPR 2025] CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images
Official Implementation of "Chrono: A Simple Blueprint for Representing Time in MLLMs"
A list of video inpainting (VI) papers