[NeurIPS 2023] LLM-Pruner: On the Structural Pruning of Large Language Models. Support Llama-3/3.1, Llama-2, LLaMA, BLOOM, Vicuna, Baichuan, TinyLlama, etc.
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
[NeurIPS 2023] "The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter", Ajay Jaiswal, Shiwei Liu, Tianlong Chen, and Zhangyang Wang
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
Code for the NeurIPS 2023 paper: "ZipLM: Inference-Aware Structured Pruning of Language Models".
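The entries above (LLM-Pruner, SparseGPT, ZipLM, essential sparsity) all revolve around pruning pre-trained weights. As a rough illustration of the shared idea, and not any specific repository's method, here is a minimal unstructured magnitude-pruning sketch: zero out the smallest-magnitude fraction of a weight matrix.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.5):
    """Zero out the smallest-magnitude entries of w (unstructured pruning).

    A toy sketch of the pruning theme; real methods (e.g. second-order or
    structured pruning) use far more sophisticated saliency criteria.
    """
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    # Threshold = k-th smallest absolute value; prune everything at or below it.
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    mask = np.abs(w) > thresh
    return w * mask

# Usage: prune 30% of a random weight matrix.
w = np.random.randn(10, 10)
w_pruned = magnitude_prune(w, sparsity=0.3)
```

Surviving entries keep their original values; the pruned positions are exactly zero, which is what sparse kernels exploit for speedups.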
Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
This repository contains integer operators on GPUs for PyTorch.
Model Compression Toolbox for Large Language Models and Diffusion Models
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
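SmoothQuant's central trick is to migrate activation outliers into the weights: scale each input channel by s_j = max|X_j|^α / max|W_j|^(1-α), so that (X / s) @ (s · W) = X @ W exactly, but both factors become easier to quantize. A hedged sketch of that equivalence (not the official implementation, which operates inside transformer layers):

```python
import numpy as np

def smooth(X, W, alpha=0.5):
    """Per-input-channel smoothing: X @ W == (X / s) @ (s[:, None] * W).

    Sketch of the SmoothQuant scale-migration identity; X is [tokens, in],
    W is [in, out]. alpha balances difficulty between activations and weights.
    """
    act_max = np.abs(X).max(axis=0)                 # per-channel activation range
    w_max = np.abs(W).max(axis=1)                   # per-channel weight range
    s = act_max ** alpha / np.maximum(w_max, 1e-8) ** (1 - alpha)
    s = np.maximum(s, 1e-8)                         # guard against dead channels
    return X / s, W * s[:, None]

# The product is unchanged, but the smoothed activations have flatter ranges.
X = np.random.randn(6, 4)
W = np.random.randn(4, 3)
X_s, W_s = smooth(X, W)
```

After smoothing, both X_s and W_s can be quantized with simple per-tensor scales, which is what makes W8A8 inference practical.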
Image forgery recognition algorithm
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…
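Many of the quantization repositories above (GPTQ, AWQ, OmniQuant, AQLM) refine the same baseline: post-training weight quantization to low bit-widths. As a minimal point of reference, here is a symmetric round-to-nearest 4-bit quantizer with per-row scales; the listed methods improve on this naive scheme via error compensation, activation-aware scaling, or learned codebooks.

```python
import numpy as np

def quantize_rtn(w, bits=4):
    """Symmetric per-row round-to-nearest quantization (naive RTN baseline)."""
    qmax = 2 ** (bits - 1) - 1                      # 7 for signed 4-bit
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale[scale == 0] = 1.0                         # avoid division by zero
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    """Recover an approximate float matrix from integer codes and scales."""
    return q.astype(np.float32) * scale

# Round-trip: error per element is bounded by half a quantization step.
w = np.random.randn(4, 8).astype(np.float32)
q, s = quantize_rtn(w)
w_hat = dequantize(q, s)
```

With per-row scales the worst-case reconstruction error is scale/2, which is the gap that GPTQ-style error correction and AWQ-style scaling shrink further.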
Charlie Mnemonic: The First Personal Assistant with Long-Term Memory
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
A modular graph-based Retrieval-Augmented Generation (RAG) system