Stars
[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank
📖A curated list of Awesome LLM/VLM Inference Papers with code: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
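📒"Statistical Learning Methods (Li Hang): Notes, from Principles to Implementation, Based on R": a 200-page PDF with detailed explanations of hand-derived formulas and implementations in R. 🎉🎉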
Basic methods for weakly supervised object detection (WSOD), including WSDDN, OICR, and PCL.
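Following a component-based design approach and using Rust's rich language features, this project designs and implements independent operating system kernel modules and operating system frameworks with different functions, which can be combined into operating system kernels with different features, forms, and architectures.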
Large Language Model (LLM) Systems Paper List
Curated collection of papers in machine learning systems
Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."