Shandong University
- ShanDong
- https://www.zhihu.com/people/ga-yao-95
Stars
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
[IROS24] Temporal- and Viewpoint-Invariant Registration for Under-Canopy Footage using Deep-Learning-based Bird's-Eye View Prediction
Efficient and lightweight Vision-Language model for Visual Question Answering in autonomous driving scenarios. The approach replaces images in BLIP's architecture with spatio-temporal BEV feature maps
[ECCV 2024] RecurrentBEV: A Long-term Temporal Fusion Framework for Multi-view 3D Detection
ChatReviewer: Use ChatGPT to analyze a paper's strengths and weaknesses and suggest improvements
[ICLR 2025] Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving
MetaOcc: Surround-View 4D Radar and Camera Fusion Framework for 3D Occupancy Prediction with Dual Training Strategies
[ICRA 2024] Official code for BEVUDA: Multi-geometric Space Alignments for Domain Adaptive BEV 3D Object Detection
A plugin implementing the view transformer from perspective view to bird's-eye view, as used in BEVDet
[ECCV2024] This is the official implementation of GraphBEV, a BEV multi-modal framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.
Gaga: Group Any Gaussians via 3D-aware Memory Bank
[CVPR 2025 Almost Oral ; )] GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
GaussianAD: Gaussian-Centric End-to-End Autonomous Driving
Official Code Release for "Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection" in NeurIPS 2024
[AAAI 2025] ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder
UniDrive: Towards Universal Driving Perception Across Camera Configurations
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
[3DV 2025] LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation
[WACV 2024 Survey Paper] Multimodal Large Language Models for Autonomous Driving
CIGOcc: Complementary Information Guided Occupancy Prediction via Multi-Level Representation Fusion
GaussianPretrain for Visual Pre-training in Autonomous Driving, showcasing significant improvements across various 3D perception tasks, including 3D object detection, HD-map construction, and Occupa…