Stars
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
A generative world for general-purpose robotics & embodied AI learning.
FastVideo is an open-source framework for accelerating large video diffusion models.
BMVC'23 | Five A+ Network: You Only Need 9K Parameters for Underwater Image Enhancement
ECCV'22 Oral | Perceiving and Modeling Density for Single Image Dehazing.
[AAAI 2024] Follow-Your-Pose: This repo is the official implementation of "Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos"
Official implementation of AAAI-2024 paper "Boosting Multiple Instance Learning Models for Whole Slide Image Classification: A Model-agnostic Framework Based on Counterfactual Inference"
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
Official implementation of ID-unaware Deepfake Detection Model
[Siggraph Asia 2024] Follow-Your-Emoji: This repo is the official implementation of "Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation"
HunyuanVideo: A Systematic Framework For Large Video Generation Model
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Diffusion Models in Medical Imaging (Published in Medical Image Analysis Journal)
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.
🎓 Update Talking-Face Research Papers Daily, Now Integrated with LLM Analysis.
Endora: Video Generation Models as Endoscopy Simulators (MICCAI 2024)
[ECCV 2022] CelebV-HQ: A Large-Scale Video Facial Attributes Dataset
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning