Stars
A curated list of awesome resources on LLMs for autonomous driving (continually updated)
[WACV 2024 Survey Paper] Multimodal Large Language Models for Autonomous Driving
A lightweight framework for building LLM-based agents
LlamaIndex is a data framework for your LLM applications
GLM-4 series: Open Multilingual Multimodal Chat LMs
A library for efficient similarity search and clustering of dense vectors.
Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering and the fault tolerance and scalability of …
A demo of a multimodal RAG solution
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single-Image, Multi-Image, and Video on Your Phone
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A pure-C++ cross-platform LLM acceleration library with Python bindings; ChatGLM-6B-class models reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices
A high-throughput and memory-efficient inference and serving engine for LLMs
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.
YOLO-World + EfficientViT SAM
Official release of the InternLM2.5 base and chat models, with 1M-token context support
Strong and Open Vision Language Assistant for Mobile Devices
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Collection of AWESOME vision-language models for vision tasks
[ECCV 2024] 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views
A collection of papers on novel view synthesis
pix2tex: Using a ViT to convert images of equations into LaTeX code.
A collaboration-friendly studio for NeRFs