Stars
The official repository for the paper: Evaluation of Retrieval-Augmented Generation: A Survey.
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research; we are continuously improving the project. Welcome to PR the works (p…
Framework for benchmarking vector search engines
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
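🚀 WebUI integrated platform for the latest LLMs | An all-in-one WebUI package of tools covering the full workflow for major language models. Supports mainstream LLM API interfaces and open-source models, along with knowledge bases, databases, role-play, mj text-to-image, LoRA and full-parameter fine-tuning, dataset creation, live2d, and other end-to-end application tools.
ChatGPT WebUI using gradio. Provides a simple, easy-to-use web UI for LLM chat and retrieval-based knowledge Q&A (RAG).
calflops is designed to calculate FLOPs, MACs, and parameters in various neural networks, such as Linear, CNN, RNN, GCN, and Transformer (BERT, LLaMA, and other large language models).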
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
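"Open-Source LLM Usage Guide" (开源大模型食用指南): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment.
GLM-4 series: Open Multilingual Multimodal Chat LMs
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.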
OpenBMB / mlc-MiniCPM
Forked from mlc-ai/mlc-llm
MiniCPM on Android platform.
A generative speech model for daily dialogue.
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
Chinese Llama/Alpaca LLM project, phase three (Chinese Llama-3 LLMs), developed from Meta Llama 3
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
Build multi-modal Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A general 2-8 bit quantization toolbox supporting GPTQ/AWQ/HQQ, with easy export to onnx/onnx-runtime.
Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…
Examples for using ONNX Runtime for machine learning inferencing.
An innovative library for efficient LLM inference via low-bit quantization