Stars
Large Language Model (LLM) Systems Paper List
Efficient and easy multi-instance LLM serving
SGLang is a fast serving framework for large language models and vision language models.
Heterogeneous AI Computing Virtualization Middleware
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
《Machine Learning Systems: Design and Implementation》- Chinese Version
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
How to optimize some algorithms in CUDA.
A distributed, fast open-source graph database featuring horizontal scalability and high availability
A task runner / simpler Make alternative written in Go
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A unified scheduler for online and offline tasks
Kubernetes operator for managing the CephCSI plugins
jMetal: a framework for multi-objective optimization with metaheuristics
A lightweight library for portable low-level GPU computation using WebGPU.
Open standard for machine learning interoperability
The Prometheus monitoring system and time series database.
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…
A high-throughput and memory-efficient inference and serving engine for LLMs
TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.
CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
An observability database that aims to ingest, analyze, and store metrics, tracing, and logging data.
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)