
An easy-to-use and powerful chaos engineering experiment toolkit. (An easy-to-use, powerful chaos experiment injection tool open-sourced by Alibaba.)

Go 6,031 955 Updated Dec 26, 2024

Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs

Python 179 14 Updated Dec 11, 2024

LLM Inference analyzer for different hardware platforms

Jupyter Notebook 46 11 Updated Nov 26, 2024

A collection of pre-trained, state-of-the-art models in the ONNX format

Jupyter Notebook 8,098 1,418 Updated Apr 30, 2024

Enforce the output format (JSON Schema, Regex etc) of a language model

Python 1,649 72 Updated Oct 16, 2024

MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance

Python 201 35 Updated Nov 20, 2023

BERT score for text generation

Jupyter Notebook 1,643 220 Updated Jul 30, 2024

Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.

Python 358 44 Updated Sep 11, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 473 28 Updated Nov 9, 2024

The LLM Evaluation Framework

Python 4,163 341 Updated Jan 2, 2025

Tutorials for creating and using ONNX models

Jupyter Notebook 3,416 634 Updated Jul 15, 2024

CUDA Kernel Benchmarking Library

Cuda 540 69 Updated Nov 20, 2024

A validation and profiling tool for AI infrastructure

Python 284 60 Updated Dec 12, 2024

The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.

727 47 Updated May 8, 2024

Introductory PyTorch tutorial; read online at https://datawhalechina.github.io/thorough-pytorch/

Jupyter Notebook 2,704 428 Updated Oct 30, 2024

Open standard for machine learning interoperability

Python 18,178 3,691 Updated Jan 3, 2025

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python 2,637 218 Updated Dec 20, 2024

Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024

C++ 175 13 Updated Apr 16, 2024

Jenkins automation server

Java 23,426 8,859 Updated Jan 3, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 85,486 23,015 Updated Jan 3, 2025

NumPy aware dynamic Python compiler using LLVM

Python 10,092 1,136 Updated Dec 17, 2024

CUDA integration for Python, plus shiny features

Python 1,890 291 Updated Nov 5, 2024

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 8,039 418 Updated Sep 6, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,723 217 Updated Jan 3, 2025

🔥[IJCAI 2022, Official Code] for paper "Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks". Official Weights and Demos provided. The first aesthetics assessment dataset, algorithm, and benchmark for multi-theme scenarios.

Python 301 19 Updated Nov 25, 2024

GLake: optimizing GPU memory management and IO transmission.

Python 406 35 Updated Nov 27, 2024

General-purpose web UI for Kubernetes clusters

Go 14,573 4,175 Updated Jan 2, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 8,537 1,506 Updated Jan 3, 2025

Deep Learning Visualization Toolkit (the deep learning visualization tool for PaddlePaddle)

HTML 4,798 630 Updated Dec 11, 2024

Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.

Python 9,717 901 Updated Jan 3, 2025