Shenzhen, China (UTC +08:00)
Website: https://dengyangshen.netlify.app
ORCID: https://orcid.org/0009-0003-9487-1455
Starred repositories
🥧 Savoury implementation of the QUIC transport protocol and HTTP/3
HAProxy Load Balancer's development branch (mirror of git.haproxy.org)
An implementation of the QUIC transport protocol.
Internet-Drafts that make up the base QUIC specification
Cross-platform, C implementation of the IETF QUIC protocol, exposed to C, C++, C# and Rust.
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction (NeurIPS'24)
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
AIFM: High-Performance, Application-Integrated Far Memory
TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.
The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction".
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
[ICLR2025] MagicPIG: LSH Sampling for Efficient LLM Generation
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management (OSDI'24)
Long context evaluation for large language models