Stars
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
The simplest, fastest repository for training/finetuning medium-sized GPTs.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
A high-throughput and memory-efficient inference and serving engine for LLMs
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
A scalable generative AI framework built for researchers and developers working on large language models, multimodal models, and speech AI (automatic speech recognition and text-to-speech)
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Chinese version of CLIP, which achieves Chinese cross-modal retrieval and representation generation.
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
Video+code lecture on building nanoGPT from scratch
Transformer: PyTorch Implementation of "Attention Is All You Need"
4-bit quantization of LLaMA using GPTQ
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
AutoAWQ implements the AWQ algorithm for 4-bit quantization, with a 2x speedup during inference.
This is a collection of our NAS and Vision Transformer work.
OpenMMLab Model Compression Toolbox and Benchmark.
[ECCV 2024] Video Foundation Models & Data for Multimodal Understanding
A curated list for Efficient Large Language Models
OMG-LLaVA and OMG-Seg codebase [CVPR 2024 and NeurIPS 2024]
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones