UCAS
- Beijing
Stars
HunyuanVideo: A Systematic Framework For Large Video Generation Model
libdrm_amdgpu bindings for Rust, and some methods ported from Mesa3D
ROCm Platform Runtime: ROCr, an HPC-market-enhanced, HSA-based runtime
Documentation of NVIDIA chip/hardware interfaces
AMDGPU Driver with KFD used by the ROCm project. Also contains the current Linux Kernel that matches this base driver
A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.
ASCII generator (image to text, image to image, video to video)
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
FlashInfer: Kernel Library for LLM Serving
Dynamic Memory Management for Serving LLMs without PagedAttention
AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference.
Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality"
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
Open-Sora: Democratizing Efficient Video Production for All
🦜🔗 Build context-aware reasoning applications
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
SGLang is a fast serving framework for large language models and vision language models.
Development repository for the Triton language and compiler
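"Dive into Deep Learning": written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.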
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation