Large Language Model (LLM) Systems Paper List

742 26 Updated Jan 19, 2025

Efficient and easy multi-instance LLM serving

Python 281 18 Updated Jan 23, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 7,628 738 Updated Jan 23, 2025

Heterogeneous AI Computing Virtualization Middleware

Go 1,216 249 Updated Jan 21, 2025

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 236 15 Updated Aug 31, 2024

《Machine Learning Systems: Design and Implementation》- Chinese Version

TeX 4,189 441 Updated Apr 13, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,423 143 Updated Jan 23, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,183 97 Updated Jan 23, 2025

How to optimize some algorithms in CUDA.

Cuda 1,840 153 Updated Jan 21, 2025

A distributed, fast open-source graph database featuring horizontal scalability and high availability

C++ 10,997 1,210 Updated Dec 10, 2024

A task runner / simpler Make alternative written in Go

Go 12,015 641 Updated Jan 18, 2025

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 34,999 5,958 Updated Jan 23, 2025

A unified scheduler for online and offline tasks

Go 521 78 Updated Jan 7, 2025

Kubernetes operator for managing the CephCSI plugins

Go 16 19 Updated Jan 20, 2025

A collection of graduate course materials from Southeast University

720 177 Updated Jan 4, 2024

jMetal: a framework for multi-objective optimization with metaheuristics

Java 522 406 Updated Jan 17, 2025

A lightweight library for portable low-level GPU computation using WebGPU.

C++ 3,798 178 Updated Dec 29, 2024

Open standard for machine learning interoperability

Python 18,298 3,700 Updated Jan 23, 2025

Add citations automatically to your paper.

Python 1 Updated Jul 4, 2024

The Prometheus monitoring system and time series database.

Go 56,939 9,316 Updated Jan 23, 2025

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSy…

2,758 318 Updated Aug 14, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 34,555 5,285 Updated Jan 23, 2025

TiDB - the open-source, cloud-native, distributed SQL database designed for modern applications.

Go 37,785 5,884 Updated Jan 23, 2025

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.

Go 30,415 3,849 Updated Jan 23, 2025

LLM training in simple, raw C/CUDA

Cuda 25,113 2,868 Updated Oct 2, 2024

Jupyter Notebook 138 7 Updated Mar 12, 2024

An observability database that aims to ingest, analyze, and store metrics, tracing, and logging data.

Go 278 87 Updated Jan 23, 2025

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving (OSDI 23)

Python 80 12 Updated Jul 14, 2023