romitjain

Likes data

18 followers · 43 following

Stars

7 stars written in Cuda

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 24,788 stars · 2,806 forks · Updated Oct 2, 2024

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 1,608 stars · 160 forks · Updated Dec 23, 2024

siboehm / SGEMM_CUDA

Fast CUDA matrix multiplication from scratch

Cuda · 545 stars · 66 forks · Updated Dec 28, 2023

likejazz / llama3.cuda

llama3.cuda is a pure C/CUDA implementation for Llama 3 model.

Cuda · 318 stars · 22 forks · Updated Jun 4, 2024

NVIDIA / cuda-checkpoint

CUDA checkpoint and restore utility

Cuda · 247 stars · 13 forks · Updated Apr 17, 2024

Huanghongru / SGEMM-Implementation-and-Optimization

📝 Some source code about matrix multiplication implementation on CUDA

Cuda · 35 stars · 9 forks · Updated Sep 12, 2018

MDK8888 / vllmini

A minimal implementation of vllm.

Cuda · 30 stars · Updated Jul 27, 2024

Lists (3)

Reference implementations

Study

Try this
