SJTU
- Shanghai
- https://syfeng.net
Stars
SGLang is a fast serving framework for large language models and vision language models.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
veRL: Volcano Engine Reinforcement Learning for LLM
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
A high-throughput and memory-efficient inference and serving engine for LLMs
A PyTorch native library for large model training
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
A fast communication-overlapping library for tensor parallelism on GPUs.
Ongoing research training transformer models at scale
FlagGems is an operator library for large language models implemented in Triton Language.
Development repository for the Triton-Linalg conversion
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
Awesome LLM compression research papers and tools.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Development repository for the Triton language and compiler
An extension of TVMScript to write simple and high-performance GPU kernels with Tensor Cores.
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
FlashInfer: Kernel Library for LLM Serving
A cross-platform ChatGPT/Gemini UI (Web / PWA / Linux / Win / MacOS). Get your own cross-platform ChatGPT/Gemini/Claude LLM app in one click.
A tool that profiles OpenCL devices to find their peak capacities
A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.