Stars
A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
Codebase for Merging Language Models (ICML 2024)
📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉
FlashInfer: Kernel Library for LLM Serving
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to this project.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Reference implementation for DPO (Direct Preference Optimization)
AirLLM 70B inference with single 4GB GPU
SGLang is a fast serving framework for large language models and vision language models.
[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Extend existing LLMs way beyond the original training length with constant memory usage, without retraining
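"Hello 算法" (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; English version ongoing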
Foundational Models for State-of-the-Art Speech and Text Translation
ChatGLM2-6B: An Open Bilingual Chat LLM
Using a model to generate prompts for model applications: a labor-saving tool that uses a model to write image-generation prompts, supporting MidJourney, Stable Diffusion, and others.
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pre-training and instruction fine-tuning datasets
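MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, targeting the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.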
StableLM: Stability AI Language Models