Stars
15 results for source starred repositories
Applied AI experiments and examples for PyTorch
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
llama3 implementation one matrix multiplication at a time
Tile primitives for speedy kernels
You like pytorch? You like micrograd? You love tinygrad! ❤️
Samples for CUDA developers that demonstrate features of the CUDA Toolkit
Step-by-step optimization of CUDA SGEMM