guanrenyang

Renyang Guan guanrenyang

Master's student at SJTU

57 followers · 102 following

Shanghai Jiao Tong University
Shanghai


Lists (3)


Dataflow

16 repositories

LLM

4 repositories

🚀 My stack

Stars

13 stars written in Cuda


karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 24,760 stars · 2,804 forks · Updated Oct 2, 2024

rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library

Cuda · 1,813 stars · 309 forks · Updated Dec 19, 2024

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda · 1,752 stars · 79 forks · Updated Dec 13, 2024

BBuf / how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Cuda · 1,746 stars · 144 forks · Updated Dec 18, 2024

DefTruth / CUDA-Learn-Notes

📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attention-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS 🎉🎉).

Cuda · 1,685 stars · 176 forks · Updated Dec 19, 2024

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 1,573 stars · 160 forks · Updated Dec 19, 2024

Liu-xiandong / How_to_optimize_in_GPU

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda · 860 stars · 135 forks · Updated Jul 29, 2023

tspeterkim / flash-attention-minimal

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda · 657 stars · 58 forks · Updated Apr 7, 2024

NVIDIA / multi-gpu-programming-models

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda · 578 stars · 112 forks · Updated Oct 30, 2024

siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Cuda · 540 stars · 66 forks · Updated Dec 28, 2023

Cjkkkk / CUDA_gemm

A simple high performance CUDA GEMM implementation.

Cuda · 338 stars · 37 forks · Updated Jan 4, 2024

RussWong / CUDATutorial

A CUDA tutorial for learning CUDA programming from scratch

Cuda · 200 stars · 54 forks · Updated Jul 9, 2024

leimao / CUDA-GEMM-Optimization

CUDA Matrix Multiplication Optimization

Cuda · 147 stars · 13 forks · Updated Jul 19, 2024
