Stars
[WACV 2024] Training-Free Layout Control with Cross-Attention Guidance
Official implementation of Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions (NeurIPS DB Track'24 Spotlight).
[CVPR 2024] Official implementation of "Towards Realistic Scene Generation with LiDAR Diffusion Models"
Layout-guided multi-view driving scene video generation with a latent diffusion model
[ROS package] Online Learning for Human Detection in 3D Point Clouds
SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints
A Grounded Simulation Testing Framework for Evaluating Social Navigation: https://arxiv.org/abs/2103.00047
RViz2 plugins for visualization of vision_msgs
Algorithm-agnostic computer vision message types for ROS.
The first behavioral foundation model to control a virtual physics-based humanoid agent for a wide range of whole-body tasks.
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection
[arXiv 2022] Official implementation of 3D Dual-Fusion: Dual-Domain Dual-Query Camera-LiDAR Fusion for 3D Object Detection
A curated list of robot social navigation.
Simple pedestrian simulator based on libpedsim
Pedestrian simulator powered by the social force model
OpenPCDet Toolbox for LiDAR-based 3D Object Detection.
Official PyTorch implementation of SegFormer
Metric depth estimation from a single image
UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery, ISPRS. Also includes other vision transformers and CNNs for satellite, aerial image …
[CVPR 2024 Highlight] GenAD: Generalized Predictive Model for Autonomous Driving & Foundation Models in Autonomous System
[NeurIPS 2023] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models