Starred repositories
[ICCV 2023 R6D] PyTorch implementation of CNOS: A Strong Baseline for CAD-based Novel Object Segmentation, based on Segment Anything and DINOv2
Training library for local feature detection and matching
Point-NeRF: Point-based Neural Radiance Fields
[CVPR'23] OpenScene: 3D Scene Understanding with Open Vocabularies
[ICLR 2024 Spotlight] SyncDreamer: Generating Multiview-consistent Images from a Single-view Image
Single Image to 3D using Cross-Domain Diffusion for 3D Generation
Official implementation of "Splatter Image: Ultra-Fast Single-View 3D Reconstruction" (CVPR 2024)
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction
[NeurIPS 2024] Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
[NeurIPS 2024] ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splatting
DN-Splatter + AGS-Mesh: Depth and Normal Priors for Gaussian Splatting
Real-time dense scene reconstruction with SLAM3R
[ICLR 2024] OpenSet 3D Neural Scene Segmentation with Pixel-wise Features and Rendered Novel Views
PyTorch code for the ECCV'22 paper "ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization"
PyTorch code for the ICRA'22 paper "Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation"
Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)
pySLAM is a visual SLAM pipeline in Python for monocular, stereo and RGBD cameras. It supports many modern local and global features, different loop-closing methods, a volumetric reconstruction pip…
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
HybridNets: End-to-End Perception Network
[CVPR2024] Code for "SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation".
A generative world for general-purpose robotics & embodied AI learning.