Starred repositories
Yet another PyTorch implementation of Stable Diffusion (probably easy to read)
cuDTW++: Ultra-Fast Dynamic Time Warping on CUDA-enabled GPUs
kaldi-asr/kaldi is the official location of the Kaldi project.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Implement Flash Attention using CuTe.
FlagPerf is an open-source software platform for benchmarking AI chips.
FlagGems is an operator library for large language models implemented in Triton Language.
TransferBench is a utility capable of benchmarking simultaneous copies between user-specified devices (CPUs/GPUs)
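A personal Chinese translation of the book "C++17 the complete guide", for study and exchange only; it will be removed on request if it infringes copyright.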
VPTQ, A Flexible and Extreme low-bit quantization algorithm
Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA
[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge managemen…
⚡️HivisionIDPhotos: a lightweight and efficient AI algorithm and tool for producing ID photos.
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
SGLang is a fast serving framework for large language models and vision language models.
Dynamic Memory Management for Serving LLMs without PagedAttention
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
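🤖 A WeChat bot built on WeChaty combined with AI services such as OpenAI ChatGPT, Kimi, and iFLYTEK. It can automatically reply to your WeChat messages, manage WeChat groups and friends, detect zombie followers, etc...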
A simple prompt-chatting AI based on wechaty and a fine-tuned NLP model