Stars
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold Networks (KANs) for language modeling
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Letta (formerly MemGPT) is a framework for creating LLM services with memory.
Awesome-LLM: a curated list of Large Language Models
A high-throughput and memory-efficient inference and serving engine for LLMs
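The frontend project of mallchat, an e-commerce system that supports both shopping and chatting. It is built to enterprise-grade internet development standards and must include everything an e-commerce platform needs: shopping cart, orders, payment, recommendations, search, user acquisition, user activation, push notifications, logistics, and customer service. Continuously updated.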
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
Examples and guides for using the OpenAI API
High-Resolution Image Synthesis with Latent Diffusion Models
Prompt-to-prompt extension of Stable Diffusion web UI
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
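ChatGPT Chinese guide 🔥: a Chinese-language guide to prompting ChatGPT, with prompt guides, application development guides, and a curated resource list. Use ChatGPT better and boost your productivity! 🚀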
A list of totally open alternatives to ChatGPT
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
A simple C++11 Thread Pool implementation
Real-time monitor and web admin for Celery distributed task queue