The Chinese University of Hong Kong
https://zhenzhiwang.github.io/
Stars
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
FastVideo is a lightweight framework for accelerating large video diffusion models.
[CVPR 2025🔥] Identity-Preserving Text-to-Video Generation by Frequency Decomposition
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
🏝️ OASIS: Open Agent Social Interaction Simulations with One Million Agents. https://oasis.camel-ai.org
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
[NeurIPS D&B Track 2024] Official implementation of HumanVid
🎓 Update Talking-Face Research Papers Daily, Now Integrated with LLM Analysis.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Official implementation of Add-SD: Rational Generation without Manual Reference.
Official implementation of "ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos" (ACM ICMRW 2021)
[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
A working list of recent human video generation methods. This repository focuses on half/full-body human video generation methods; NeRF, Gaussian splatting, motion pose, and talking head/portrait are n…
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Official implementation of Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model (ICLR 2025 Oral)
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
CustomDiffusion360: Customizing Text-to-Image Diffusion with Camera Viewpoint Control
MiniSora: A community that aims to explore the implementation path and future development direction of Sora.
A collection of awesome video generation studies.