FFY0
Showing results

LLM KV cache compression made easy

Python 266 14 Updated Dec 12, 2024
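Several of the repositories above deal with KV cache compression. As a generic illustration (not the method of any specific repo listed here), score-based eviction keeps only the cached key/value entries whose accumulated attention scores are highest:

```python
# Hedged sketch: score-based KV cache eviction. The function name and the
# list-based cache layout are illustrative assumptions, not any repo's API.

def evict_kv_cache(keys, values, scores, budget):
    """Retain the `budget` cache entries with the highest scores.

    keys, values: per-position cached entries
    scores: accumulated attention weight each position has received
    budget: number of entries to keep
    """
    if len(keys) <= budget:
        return keys, values
    # Pick the top-`budget` indices by score, then restore original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = sorted(ranked[:budget])
    return [keys[i] for i in top], [values[i] for i in top]
```

Real systems typically operate on GPU tensors and may allocate the budget adaptively per attention head, but the selection step follows this shape.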
Python 29 3 Updated Nov 19, 2024
Python 201 9 Updated May 1, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

1,046 85 Updated Nov 23, 2024

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 139 5 Updated Dec 11, 2024

The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference

Python 46 Updated Dec 12, 2024

GLake: optimizing GPU memory management and IO transmission.

Python 393 34 Updated Nov 27, 2024

Dynamic Memory Management for Serving LLMs without PagedAttention

C 253 16 Updated Dec 6, 2024

AcadHomepage: A Modern and Responsive Academic Personal Homepage

SCSS 1,540 2,976 Updated Dec 14, 2024

A curated list for Efficient Large Language Models

Python 1,326 94 Updated Dec 9, 2024

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

189 3 Updated Dec 5, 2024

KV cache compression for high-throughput LLM inference

Python 97 5 Updated Dec 13, 2024

Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".

Python 13 1 Updated Sep 15, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 31,879 4,846 Updated Dec 14, 2024

📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉

2,994 204 Updated Dec 9, 2024

[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compresses the prompt and KV cache, achieving up to 20x compression with minimal performance loss.

Python 4,733 263 Updated Nov 17, 2024

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 1,079 145 Updated Dec 11, 2024

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

140 7 Updated Dec 7, 2024

An awesome repository & a comprehensive survey on the interpretability of LLM attention heads.

TeX 283 8 Updated Nov 15, 2024

Incorporates a memory mechanism into the Transformer and employs a parallel weighting structure to obtain better utterance-level representations for the speaker verification task

Python 19 Updated Mar 12, 2024

Source code for MetaSketch

Python 3 1 Updated Jan 25, 2024

A Clash for Linux backup repository based on Clash Core

Shell 2,603 1,077 Updated Nov 24, 2024

Probabilistic Data Structures and Algorithms in Python

Python 123 19 Updated Feb 24, 2020
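The Bloom filter is a classic example of the probabilistic data structures that the repository above covers. The sketch below is a generic textbook illustration, not code taken from that repository:

```python
# Hedged sketch of a Bloom filter: a bit array plus several hash functions.
# Membership queries can yield false positives but never false negatives.
import hashlib


class BloomFilter:
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item: str):
        # Derive `num_hashes` positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(item))
```

Sizing the bit array and the number of hash functions against the expected item count controls the false-positive rate.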

Learning materials for the Transformer, including code, XMind mind maps, PDFs, and more

Jupyter Notebook 350 59 Updated Sep 28, 2021

A simple, time-tested, family of random hash functions in Python, based on CRC32 and xxHash, affine transformations, and the Mersenne Twister. 🎲

Python 9 Updated Jun 6, 2022
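A seeded hash family in the spirit of the description above (CRC32 composed with an affine transformation) can be sketched as follows; the parameter names and construction are illustrative assumptions, not that repository's API:

```python
# Hedged sketch: a family of hash functions h(x) = (a * crc32(x) + b) mod m,
# parameterized by the affine coefficients (a, b) and table size m.
import zlib


def make_hash(a: int, b: int, m: int):
    """Return one member of the hash family, mapping bytes to [0, m)."""
    def h(data: bytes) -> int:
        return (a * zlib.crc32(data) + b) % m
    return h
```

Drawing (a, b) at random yields distinct functions over the same base hash, which is the usual way such families are used in sketches and hash tables.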

Python package built to ease deep learning on graph, on top of existing DL frameworks.

Python 13,600 3,019 Updated Oct 18, 2024

Provides a practical interactive interface for LLMs such as GPT/GLM, with special optimization of the paper reading/polishing/writing experience. Modular design with support for custom shortcut buttons & function plugins; project analysis & self-translation for Python, C++, and other codebases; PDF/LaTeX paper translation & summarization; parallel querying of multiple LLM models; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, Wenxin Yiyan, llama2, rwkv, claude2, m…

Python 66,320 8,135 Updated Dec 9, 2024

PyTorch implementation of Hash Embeddings (NIPS 2017). Submission to the NIPS Implementation Challenge.

Python 192 27 Updated Nov 12, 2018

Must-read papers on graph neural networks (GNN)

16,112 3,011 Updated Dec 20, 2023