Starred repositories
A generative world for general-purpose robotics & embodied AI learning.
[ECCV2024] FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
[ICML 2024] GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
[CVPR'24] Interactive3D: Create What You Want by Interactive 3D Generation
A curated list of awesome LLM for Autonomous Driving resources (continually updated)
super-ai: unified-vision, math-think/mathink; private
[CVPR2024] Official Repository of Paper "Panacea: Panoramic and Controllable Video Generation for Autonomous Driving"
A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
[Survey] Masked Modeling for Self-supervised Representation Learning on Vision and Beyond (https://arxiv.org/abs/2401.00897)
A comprehensive paper list of Vision Transformer/Attention, including papers, code, and related websites
Collection of Remote Sensing Vision-Language Models
Infinite Photorealistic Worlds using Procedural Generation
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Earth observation tools for Meta AI Segment Anything
Awesome List of Attention Modules and Plug&Play Modules in Computer Vision
Replication of simple CV Projects including attention, classification, detection, keypoint detection, etc.
Yet another repository for developing and benchmarking deep learning-based change detection methods.
A Global Context-aware and Batch-independent Network for road extraction from VHR satellite imagery (ISPRS2021) https://www.sciencedirect.com/science/article/pii/S0924271621000873
The official repository of the paper "Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking" (ECCV2020 Spotlight)
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
Obtain a bird's-eye view of a scene from a single input image
Camera calibration & bird's-eye view transformation
OpenCV implementation of Torchvision's image augmentations
The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919