Stars
GPU programming related news and material links
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Experiments with Model Training, Deployment & Monitoring
First base model for full-duplex conversational audio
A list of Free Software network services and web applications which can be hosted on your own servers
A bibliography and survey of the papers surrounding o1
Making Long-Context LLM Inference 10x Faster and 10x Cheaper
Build real-time multimodal AI applications 🤖🎙️📹
Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch.
Efficient Triton Kernels for LLM Training
Audio tokenization, in the fastest way possible!
FlashInfer: Kernel Library for LLM Serving
A list of awesome compiler projects and papers for tensor computation and deep learning.
Applied AI experiments and examples for PyTorch
A lightweight library for portable low-level GPU computation using WebGPU.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
An Open Source text-to-speech system built by inverting Whisper.
A network filesystem client to connect to SSH servers
Since the emergence of ChatGPT in 2022, the acceleration of Large Language Models has become increasingly important. Here is a list of papers on accelerating LLMs, currently focusing mainly on infer…
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Instant voice cloning by MIT and MyShell.
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
A simple tutorial of Variational AutoEncoders with PyTorch