A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,277 631 Updated Feb 14, 2025

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 2,032 113 Updated Feb 16, 2025

mlc-ai / xgrammar

Efficient, Flexible and Portable Structured Generation

C++ 695 42 Updated Feb 15, 2025

UbiquitousLearning / mllm

Fast Multimodal LLM on Mobile Devices

C++ 692 79 Updated Feb 9, 2025

Yangyi-Chen / Multimodal-AND-Large-Language-Models

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

587 40 Updated Feb 14, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 3,208 275 Updated Feb 16, 2025

xdit-project / xDiT

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,247 102 Updated Feb 10, 2025

pytorch / ao

PyTorch native quantization and sparsity for training and inference

Python 1,836 216 Updated Feb 14, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 689 31 Updated Feb 16, 2025

DefTruth / CUDA-Learn-Notes

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,320 247 Updated Feb 7, 2025

DefTruth / Awesome-LLM-Inference

📖A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,419 235 Updated Feb 13, 2025

showlab / Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

3,984 225 Updated Feb 14, 2025

UChi-JCL / CacheGen

Python 87 11 Updated Oct 9, 2024

steven2358 / awesome-generative-ai

A curated list of modern Generative Artificial Intelligence projects and services

7,502 806 Updated Feb 11, 2025

ollama / ollama

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.

Go 126,596 10,274 Updated Feb 17, 2025

Sumandora / remove-refusals-with-transformers

Implements harmful/harmless refusal removal using pure HF Transformers

Python 531 77 Updated Jun 12, 2024

huggingface / optimum-tpu

Google TPU optimizations for transformers models

Python 98 24 Updated Jan 21, 2025

IAAR-Shanghai / Awesome-Attention-Heads

An awesome repository & A comprehensive survey on interpretability of LLM attention heads.

TeX 306 9 Updated Feb 12, 2025

Stars

ZhuiyiTechnology / roformer

moonbit-community / XMLParser

pyutils / line_profiler

xlab-uiuc / acto

FMInference / DejaVu

ggml-org / llama.cpp

openai / openai-realtime-agents

TheAiSingularity / graphrag-local-ollama

stanford-oval / storm

SJTU-IPADS / PowerInfer

csguoh / Awesome-Mamba-in-Low-Level-Vision

ml-energy / zeus

NVIDIA / DALI