sphantix

Stars

AI Inference Acceleration

6 repositories

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 32,264 stars · 4,909 forks · Updated Dec 21, 2024

Minimalist ML framework for Rust

Rust · 16,103 stars · 980 forks · Updated Dec 21, 2024

Fast inference engine for Transformer models

C++ · 3,473 stars · 309 forks · Updated Dec 18, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python · 35,946 stars · 4,169 forks · Updated Dec 20, 2024

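A pure C++, cross-platform LLM acceleration library with Python bindings. ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU. Supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.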

C++ · 3,348 stars · 346 forks · Updated Dec 17, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python · 4,887 stars · 444 forks · Updated Dec 20, 2024