Stars
awesome-autonomous-driving
FastGPT is a knowledge-based platform built on LLMs that offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, le…
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search (a pymilvus usage sketch follows this list)
GPT4V-level open-source multi-modal model based on Llama3-8B
Chinese and English multimodal conversational language model
【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Grinding algorithm problems is all about patterns; stick with labuladong and you're set! English version supported! Crack LeetCode, not only how, but also why.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Several simple examples of calling custom CUDA operators from popular neural network toolkits.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning (a LoRA usage sketch follows this list).
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Transformer-related optimization, including BERT and GPT
COYO-700M: Large-scale Image-Text Pair Dataset
Fine-tuning of ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B on specific downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.
A high-throughput and memory-efficient inference and serving engine for LLMs (a generation sketch follows this list)
An open source implementation of CLIP (a zero-shot usage sketch follows this list).
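
For the Milvus entry above, a minimal sketch of insert and ANN search through the pymilvus MilvusClient; the local "milvus_demo.db" Milvus Lite file, collection name, and 8-dimensional vectors are illustrative assumptions, not anything prescribed by the repo.

```python
# Hedged sketch: basic insert + approximate nearest-neighbour search with pymilvus.
# The file name, collection name, and vector dimension are placeholder choices.
import random
from pymilvus import MilvusClient

client = MilvusClient("milvus_demo.db")                        # local, file-backed Milvus Lite instance
client.create_collection(collection_name="demo", dimension=8)

vectors = [[random.random() for _ in range(8)] for _ in range(10)]
client.insert(
    collection_name="demo",
    data=[{"id": i, "vector": v} for i, v in enumerate(vectors)],
)

# ANN search for the nearest neighbours of the first vector.
hits = client.search(collection_name="demo", data=[vectors[0]], limit=3)
print(hits)
```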
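For the 🤗 PEFT entry, a minimal LoRA sketch; the base checkpoint and target module names are illustrative choices, not defaults of the library.

```python
# Hedged sketch: wrap a causal LM with LoRA adapters using 🤗 PEFT.
# The checkpoint and target_modules below are illustrative.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # low-rank adapter dimension
    lora_alpha=16,                         # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)

model = get_peft_model(base, config)
model.print_trainable_parameters()         # only the adapter weights are trainable
```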
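For the vLLM serving-engine entry, a minimal offline batched-generation sketch; the model name and prompts are placeholders.

```python
# Hedged sketch: offline batched generation with vLLM's Python API.
# The checkpoint and prompts are placeholders.
from vllm import LLM, SamplingParams

prompts = [
    "The capital of France is",
    "Vector databases are useful because",
]
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

llm = LLM(model="facebook/opt-125m")       # loads weights and allocates the KV cache
outputs = llm.generate(prompts, params)

for out in outputs:
    print(out.prompt, "->", out.outputs[0].text)
```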
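For the open_clip entry, a zero-shot image-text matching sketch; the ViT-B-32 / LAION-2B checkpoint tag and the "cat.jpg" path are illustrative assumptions.

```python
# Hedged sketch: zero-shot image-text similarity with open_clip.
# The model/pretrained tags and the image path are illustrative.
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)          # 1 x 3 x 224 x 224 tensor
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)             # L2-normalise embeddings
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)

print(probs)  # per-caption match probabilities for the image
```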