Stars
FlashInfer: Kernel Library for LLM Serving
High-performance In-browser LLM Inference Engine
SGLang is a fast serving framework for large language models and vision language models.
A code sample demonstrating how to share and rebuild a PyTorch GPU tensor via its pointer/reference between different processes.
A JIT assembler for x86/x64 architectures supporting MMX, SSE (1-4), AVX (1-2, 512), FPU, APX, and AVX10.2
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
An innovative library for efficient LLM inference via low-bit quantization
Running linear algebra as fast as possible on Apple silicon
Performance analysis tools based on Linux perf_events (aka perf) and ftrace
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
SpotServe: Serving Generative Large Language Models on Preemptible Instances
Reimplementation of RA3.exe (Red Alert 3 game launcher)
The world's simplest facial recognition api for Python and the command line
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
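qstock is developed by the "Python金融量化" (Python Financial Quant) WeChat public account and aims to be a personal package for quantitative investment research and analysis. It currently includes modules for data acquisition (data), visualization (plot), stock screening (stock), and quantitative backtesting (strategy backtest). qstock provides users with concise data interfaces and normalized financial market data. The visualization module offers a simple interface for web-based interactive charts; the stock screening module provides Tonghuashun (同花顺) screening data and custom screening, including RPS, MM trend, financial indicators, capital flow models…
Stock API | 韭菜小猪 | A-shares | US stocks | Hong Kong stocks | Stocks | Funds | JavaScript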
Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust