Starred repositories
Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
ChatGLM3 series: Open Bilingual Chat LLMs | Open-source bilingual conversational language models
Question and Answer based on Anything.
Large Language Model Text Generation Inference
Running large language models on a single GPU for throughput-oriented scenarios.
Official release of InternLM2.5 base and chat models. 1M context support
Mobile-Agent: The Powerful Mobile Device Operation Assistant Family
The Official Python Client for Lamini's API
Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)
zhanzy178 / vllm
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs