Stars
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
An extremely fast Python package and project manager, written in Rust.
A course on aligning smol models.
STM32Cube MCU Full FW Package for the STM32WL series - (HAL + LL Drivers, CMSIS Core, CMSIS Device, MW libraries plus a set of Projects running on boards provided by ST (Nucleo boards))
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
LangGraph Studio template for creating an agent that does web research to generate or enrich structured data.
⚡ Energy consumption metrology agent. Let "scaph" dive and bring back the metrics that will help you make your systems and applications more sustainable!
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, with automatic mixed precision (including fp8) and easy-to-configure FSDP and DeepSpeed support
Large-scale LLM inference engine
Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.
Accessible large language models via k-bit quantization for PyTorch.
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
A high-throughput and memory-efficient inference and serving engine for LLMs
Source for thebestmotherfuckingwebsite.co
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Large Language Model Text Generation Inference
🦜🔗 Build context-aware reasoning applications
A Bulletproof Way to Generate Structured JSON from Language Models
A framework for few-shot evaluation of language models.
Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
This repository contains an implementation of the LLaMA 2 (Large Language Model Meta AI) model, a Generative Pretrained Transformer (GPT) variant. The implementation focuses on the model architectu…
Lightweight onewire protocol library optimized for UART hardware on embedded systems