Starred repositories
Official implementation for LaCo (EMNLP 2024 Findings)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
[ICLR 2022] "Unified Vision Transformer Compression" by Shixing Yu*, Tianlong Chen*, Jiayi Shen, Huan Yuan, Jianchao Tan, Sen Yang, Ji Liu, Zhangyang Wang
Export utility for unconstrained channel pruned models
A method to increase the speed and lower the memory footprint of existing vision transformers.
TinyFusion: Diffusion Transformers Learned Shallow
A family of compressed models obtained via pruning and knowledge distillation
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
Official repository for the AAAI 2025 paper (Can We Get Rid of Handcrafted Feature Extractors? SparseViT: Nonsemantics-Centered, Parameter-Efficient Image Manipulation Localization through Spare-Co…)
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
An official implementation of the paper "How Sparse Can We Prune A Deep Network: A Fundamental Limit Viewpoint".
[NeurIPS 2024] AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction".
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
[NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim
[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Supports Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
[ICML 2022] "DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks", by Yonggan Fu, Haichuan Yang, Jiayi Yuan, Meng Li, Cheng Wan, Raghurama…
Neural Network Compression Framework for enhanced OpenVINO™ inference
A collection of pre-trained, state-of-the-art models in the ONNX format
Official Pytorch Implementation of "Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity"
Prospect Pruning: Finding Trainable Weights at Initialization Using Meta-Gradients