Tsinghua University
Shenzhen, Guangdong, China
Stars
[TPAMI reviewing] Towards Visual Grounding: A Survey
Janus-Series: Unified Multimodal Understanding and Generation Models
[ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
[AAAI 2025] Language Prompt for Autonomous Driving
Official implementation of "AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models"
[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario
😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.
This is the official repository for the Talk2LiDAR project.
[AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
[ECCV 2024] TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
PyTorch implementation of the CVPR 2024 paper (Highlight) IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection.
A curated list of robot social navigation.
[IEEE RAL 2024] Dual-Alignment Domain Adaptation for Pedestrian Trajectory Prediction
A curated list of awesome LLM for Autonomous Driving resources (continually updated)
[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models
A curated list of awesome End-to-End Autonomous Driving resources (continually updated)
[ICRA 2019] Crowd-aware Robot Navigation with Attention-based Deep Reinforcement Learning
Target journals and conferences in the field of robotics and computer vision.