Starred repositories
Curated collection of papers in machine learning systems
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
UPMEM LLM Framework allows profiling PyTorch layers and functions and simulating those layers/functions with a given hardware profile.
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attention-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS 🎉🎉).
An Overview of Efficiently Serving Large Language Models across Edge Devices
GLake: optimizing GPU memory management and IO transmission.
SGLang is a fast serving framework for large language models and vision language models.
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs, and models, mainly for the evaluation of LLMs, aiming to explore the technical frontiers of generative AI.
[TMLR 2024] Efficient Large Language Models: A Survey
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g., LLMs for coding, robotics, reasoning, multimodality, etc.
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision; we are continuously improving the project. Wel…
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM standards).
Analyze the inference of Large Language Models (LLMs). Analyze aspects like computation, storage, transmission, and hardware roofline model in a user-friendly interface.
Large Language Model (LLM) Systems Paper List
How to optimize common algorithms in CUDA.
FlashInfer: Kernel Library for LLM Serving
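Several of the starred tools above (notably the LLM inference analyzer) rely on the hardware roofline model to classify kernels as memory- or compute-bound. A minimal sketch of that model, using hypothetical hardware numbers not tied to any specific GPU or repository:

```python
def roofline_flops(arithmetic_intensity: float,
                   peak_flops: float,
                   peak_bandwidth: float) -> float:
    """Attainable FLOP/s for a kernel with the given FLOPs-per-byte ratio.

    The roofline model caps throughput at the lower of the compute roof
    (peak_flops) and the memory roof (intensity * peak_bandwidth).
    """
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)

# Hypothetical accelerator: 100 TFLOP/s peak compute, 1 TB/s memory bandwidth.
PEAK_FLOPS = 100e12
PEAK_BW = 1e12

# Low-intensity kernels (e.g., decode-phase GEMV in LLM serving, roughly a
# couple of FLOPs per weight byte) sit on the bandwidth-bound slope.
low = roofline_flops(2.0, PEAK_FLOPS, PEAK_BW)     # 2e12 FLOP/s, bandwidth-bound
# High-intensity kernels (e.g., large prefill GEMMs) hit the compute roof.
high = roofline_flops(500.0, PEAK_FLOPS, PEAK_BW)  # 1e14 FLOP/s, compute-bound
```

The ridge point (here 100 FLOPs/byte, i.e., peak_flops / peak_bandwidth) separates the two regimes; tools like the analyzers listed above compute where each layer of an LLM falls relative to it.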