Acceleration

Hardware and software acceleration for LLM training and inference

Papers

2023

  • (2023-02) High-throughput Generative Inference of Large Language Models with a Single GPU. Ying Sheng et al. Paper | GitHub

Useful Resources