Starred repositories
Bring projects, wikis, and teams together with AI. AppFlowy is an AI collaborative workspace where you achieve more without losing control of your data. The best open source alternative to Notion.
Official release of InternLM2.5 base and chat models. 1M context support
StableLM: Stability AI Language Models
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
Question and Answer based on Anything.
ChatGLM3 series: Open Bilingual Chat LLMs
Running large language models on a single GPU for throughput-oriented scenarios.
Large Language Model Text Generation Inference
Summary of system papers/frameworks/codes/tools on training or serving large models
zhanzy178 / vllm
Forked from vllm-project/vllm. A high-throughput and memory-efficient inference and serving engine for LLMs
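A pure C++ cross-platform LLM acceleration library with Python bindings. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices.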
bmfire1 / onnx-tensorrt
Forked from onnx/onnx-tensorrt. ONNX-TensorRT: TensorRT backend for ONNX
A manually refined Chinese dialogue dataset and ChatGLM fine-tuning code
Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications based on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…
The Official Python Client for Lamini's API
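Chinese NLP solutions (large models, data, models, training, inference)
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational LLM)
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment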
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Event-driven network library for multi-threaded Linux server in C++11