Stars
[ICLR 2025 (Oral 📢)] Our OpenYOLO3D model achieves state-of-the-art performance in Open Vocabulary 3D Instance Segmentation on ScanNet200 and Replica datasets with up to ∼16x speedup compared to the …
🔥[ICLR'25] LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
[ECCV 2024] SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
An open source implementation of CLIP.
This is the official repository for "EgoLifter: Open-world 3D Segmentation for Egocentric Perception" (ECCV 2024)
[CVPR'23] Co-SLAM: Joint Coordinate and Sparse Parametric Encodings for Neural Real-Time SLAM
Remote web teleoperation for the Stretch mobile manipulators from Hello Robot Inc.
Dimensionality reduction in very large datasets using Siamese Networks
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
[CVPR'24] Aerial Lifting: Neural Urban Semantic and Building Instance Lifting from Aerial Imagery
[ICLR 2025 Spotlight] MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility
[CoRL 2024] Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
A Modular Framework for 3D Gaussian Splatting and Beyond
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
[RAL 2024] OpenGraphs: Open-Vocabulary Hierarchical 3D Scene Graphs in Large-Scale Outdoor Environments