Stars
An easy-to-use and powerful chaos engineering experiment toolkit (a chaos experiment injection tool open-sourced by Alibaba).
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
LLM Inference analyzer for different hardware platforms
A collection of pre-trained, state-of-the-art models in the ONNX format
Enforce the output format (JSON Schema, Regex etc) of a language model
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Tutorials for creating and using ONNX models
A validation and profiling tool for AI infrastructure
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
An introductory PyTorch tutorial; read it online at https://datawhalechina.github.io/thorough-pytorch/
Open standard for machine learning interoperability
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference - EMNLP 2024
Tensors and Dynamic neural networks in Python with strong GPU acceleration
NumPy aware dynamic Python compiler using LLVM
CUDA integration for Python, plus shiny features
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
🔥[IJCAI 2022, Official Code] for paper "Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks". Official Weights and Demos provided. The first aesthetics assessment dataset, algorithm, and benchmark for multi-theme scenarios.
GLake: optimizing GPU memory management and IO transmission.
General-purpose web UI for Kubernetes clusters
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Deep Learning Visualization Toolkit (the PaddlePaddle deep learning visualization tool)
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.