Stars
Official implementation of Continuous 3D Perception Model with Persistent State
[arXiv 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera
A generative world for general-purpose robotics & embodied AI learning.
[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model
Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"
[arXiv'24] Align3R: Aligned Monocular Depth Estimation for Dynamic Videos
HOT3D: An egocentric dataset for 3D hand and object tracking
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
[CoRL 2024] Im2Flow2Act: Flow as the Cross-domain Manipulation Interface
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
Estimating Body and Hand Motion in an Ego-sensed World
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Official PyTorch implementation of "Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects", NeurIPS 2023
ECCV 2024 SuperGaussian for generic 3D upsampling
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos