leliyliu

my world

33 followers · 24 following

leliyliu.github.io

Achievements

Highlights

Pro

Lists (16)

AI system

23 repositories

benchmark

benckmarks

books

CGRA

12 repositories

DL accelerator

LaTeX writing

LLM-survey

low-precision network

10 repositories

NN-tools

notes

plug-and-play

10 repositories

quantum

simulator

34 repositories

solver

sparse-accelerator

Starred repositories

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,561 181 Updated Mar 4, 2025

mlcommons / chakra

Repository for MLCommons Chakra schema and tools

Python 88 49 Updated Feb 26, 2025

chenhongyu2048 / LLM-inference-optimization-paper

Summary of some awesome work for optimizing LLM inference

58 1 Updated Feb 4, 2025

interestingLSY / swiftLLM

A tiny yet powerful LLM inference system tailored for research purposes. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).

Python 142 11 Updated Jul 5, 2024

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 1,245 64 Updated Mar 4, 2025

VIA-Research / vTrain

Python 49 8 Updated Dec 31, 2024

mohuangrui / ucasproposal

LaTeX Proposal Template for the University of Chinese Academy of Sciences

TeX 638 144 Updated Oct 29, 2021

AISys-01 / vllm-CachedAttention

Forked from vllm-project/vllm

The code based on vLLM for the paper “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”.

Python 7 1 Updated Sep 19, 2024

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,472 242 Updated Feb 20, 2025

aliyun / SimAI

C++ 412 57 Updated Feb 28, 2025

openpsi-project / ReaLHF

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 236 14 Updated Jan 13, 2025

byungsoo-oh / ml-systems-papers

Curated collection of papers in machine learning systems

248 14 Updated Feb 28, 2025

kvcache-ai / Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,737 166 Updated Feb 23, 2025

upmem / upmem_llm_framework

UPMEM LLM Framework allows profiling PyTorch layers and functions and simulate those layers/functions with a given hardware profile.

Python 21 3 Updated Feb 11, 2025

UbiquitousLearning / mllm

Fast Multimodal LLM on Mobile Devices

C++ 725 83 Updated Mar 3, 2025

casys-kaist / LLMServingSim

LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale

Python 89 12 Updated Feb 24, 2025

DefTruth / CUDA-Learn-Notes

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,676 277 Updated Mar 4, 2025

zhentingqi / rStar

Python 901 105 Updated Jan 23, 2025

agiresearch / AIOS

AIOS: AI Agent Operating System

Python 3,882 474 Updated Mar 4, 2025

Jason-cs18 / HetServe-Foundation

An Overview of Efficiently Serving Foundation Models across Edge Devices

13 Updated Jan 17, 2025

antgroup / glake

GLake: optimizing GPU memory management and IO transmission.

Python 433 38 Updated Nov 27, 2024

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 11,327 1,136 Updated Mar 4, 2025

onejune2018 / Awesome-LLM-Eval

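Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for evaluation of LLMs. A curated list of tools, benchmarks/data, demos, leaderboards, and large models, mainly for evaluating foundation LLMs and aiming to explore the technical boundaries of generative AI.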

487 45 Updated Oct 25, 2024

AIoT-MLSys-Lab / Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

1,107 95 Updated Feb 27, 2025

KnowingNothing / compiler-and-arch

A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures

436 35 Updated Jan 15, 2025

metame-ai / awesome-llm-plaza

awesome llm plaza: daily tracking all sorts of awesome topics of llm, e.g. llm for coding, robotics, reasoning, multimod etc.

189 14 Updated Feb 27, 2025

DefTruth / Awesome-LLM-Inference

📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,573 247 Updated Mar 4, 2025

sramshetty / mixture-of-depths

An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"

Python 35 3 Updated Jun 7, 2024

Efficient-ML / Awesome-Efficient-AIGC

A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including language and vision, we are continuously improving the project. Wel…

170 11 Updated Feb 10, 2025

casys-kaist / NeuPIMs

NeuPIMs Simulator

Jupyter Notebook 71 21 Updated Jun 19, 2024

Starred topics

Tensorflow