Stars
A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.
Hackable and optimized Transformers building blocks, supporting a composable construction.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
Efficient Triton Kernels for LLM Training
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
This repository is a comprehensive collection of research papers, annotations, and concise summaries in the field of Natural Language Processing (NLP). It focuses on machine learning and deep learn…
RandomityGuy / MarbleBlast4D
Forked from HackerPoet/Engine4D
Marble Blast, but in a higher dimension
NVIDIA curated collection of educational resources related to general purpose GPU programming.
📕 A clone of @rygorous's series of posts on the graphics pipeline.
GPUOcelot: A dynamic compilation framework for PTX
Data Compression, Lossless implementation
A deep learning framework written entirely in C++.
Documented and Unit Tested educational Deep Learning framework with Autograd from scratch.
A list of awesome beginner-friendly projects.
Python Data Science Handbook: full text in Jupyter Notebooks
Deduplicating archiver with compression and authenticated encryption.
Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
PyCon 2015 Pandas tutorial materials
The simplest way to run LLaMA on your local machine
Probabilistic language based on pattern matching and constraint propagation, 153 examples