AI Inference Acceleration
6 repositories
vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs
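A minimal usage sketch, assuming vLLM's offline Python API; the model name and sampling settings below are arbitrary placeholders, not recommendations:

```python
# Minimal offline-generation sketch with vLLM; model and parameters are
# illustrative examples only.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")                    # load a model into the engine
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Hello, my name is"], params)   # batch of prompts
for out in outputs:
    print(out.outputs[0].text)                          # first completion per prompt
```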
CTranslate2: Fast inference engine for Transformer models
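A minimal usage sketch, assuming CTranslate2's Python translation API; the converted-model directory and the pre-tokenized input are placeholders (models must first be exported with CTranslate2's converters):

```python
# Minimal CTranslate2 translation sketch; "opus-mt-en-de-ct2" is a placeholder
# path to a model already converted to the CTranslate2 format.
import ctranslate2

translator = ctranslate2.Translator("opus-mt-en-de-ct2")      # load converted model (CPU by default)
# translate_batch expects pre-tokenized input, e.g. SentencePiece pieces.
results = translator.translate_batch([["▁Hello", "▁world"]])
print(results[0].hypotheses[0])                               # best hypothesis as target tokens
```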
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
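A minimal sketch of wrapping a PyTorch model with deepspeed.initialize; the toy model and config values are illustrative placeholders, and such scripts are normally run through the deepspeed launcher (e.g. `deepspeed train.py`):

```python
# Minimal DeepSpeed wrapping sketch; the toy model and config values are
# placeholders, not a tuned setup. fp16/ZeRO assume a CUDA device.
import torch
import deepspeed

model = torch.nn.Linear(128, 10)
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}
# Returns (engine, optimizer, dataloader, lr_scheduler); the engine handles
# data parallelism, ZeRO partitioning, and mixed precision.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```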
fastllm: A pure C++ LLM acceleration library for all platforms, callable from Python; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
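A minimal sketch, assuming LMDeploy's Python pipeline API; the model name is an arbitrary example and may need adjusting for your environment or LMDeploy version:

```python
# Minimal LMDeploy inference sketch via its Python pipeline API; the model
# name is an illustrative placeholder.
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2-chat-7b")                  # build an inference pipeline
responses = pipe(["Please introduce the concept of a KV cache."])
print(responses)                                               # list of Response objects with generated text
```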