Stars
LimSim & LimSim++: Integrated traffic and autonomous driving simulators with (M)LLM support
fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing d…
Simple, unified interface to multiple Generative AI providers
Fetch citations and abstracts of a Google Scholar paper and generate prompt for LLM
(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
[CVPR2023] Official Implementation of "DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets"
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
[ICML'24 Oral] "MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions"
Build your neural network easily and fast, Mofan Python (莫烦Python) tutorials in Chinese
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Pretrained DeepLabv3 and DeepLabv3+ for Pascal VOC & Cityscapes
Bug-tracking for Jeff's algorithms book, notes, etc.
Examples and tutorials on using SOTA computer vision models and techniques. Learn everything from old-school ResNet, through YOLO and object-detection transformers like DETR, to the latest models l…
[arXiv'24] VistaDream: Sampling multiview consistent images for single-view scene reconstruction
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
Official PyTorch implementation of FocalFormer3D [ICCV 2023]
[ECCV 2024] Better Call SAL: Towards Learning to Segment Anything in Lidar
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters, extended Kalman filters, unscented Kalman filters, particle filte…
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Official repository for "AM-RADIO: Reduce All Domains Into One"
A general map auto annotation framework based on MapTR, with high flexibility in terms of spatial scale and element type
Eclipse SUMO is an open source, highly portable, microscopic and continuous traffic simulation package designed to handle large networks. It allows for intermodal simulation including pedestrians a…
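"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍, known as the "Apple Book" 🍎). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
Traffic light recognition in the HSV color space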
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…