Stars
All Algorithms implemented in Python
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Use commands in English to control Blender with OpenAI's GPT-4
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
A procedural Blender pipeline for photorealistic training image generation
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
[CVPR 2024] RoMa: Robust Dense Feature Matching; RoMa is a robust dense feature matcher capable of estimating pixel-dense warps and reliable certainties for almost any image pair.
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
Baseline model for "GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping" (CVPR 2020)
[CVPR 2024 - Oral] Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
[CVPR 2024] Code for "SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation".
Kalman Filter implementation in Python in 30 lines, using only NumPy.
Toolbox for our GraspNet-1Billion dataset.
A Benchmark Dataset for Apple Detection and Segmentation
SceneTracker: Long-term Scene Flow Estimation Network
[CoRL 2024] Im2Flow2Act: Flow as the Cross-domain Manipulation Interface