Stars
The official code of HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation
Talk to any LLM with hands-free voice interaction, voice interruption, and Live2D talking face running locally across platforms
Code base for Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
From a nobody to a large language model (LLM) hero~ Stay tuned for updates!!!
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Official repository for VisionZip (CVPR 2025)
A Unified Tokenizer for Visual Generation and Understanding
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Code for MVSFormer++: Revealing the Devil in Transformer’s Details for Multi-View Stereo (ICLR 2024)
The repo for "Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator"
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
[CVPR 2024 - Highlight] FAR: Flexible, Accurate and Robust 6DoF Relative Camera Pose Estimation
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Virtual whiteboard for sketching hand-drawn-like diagrams
TransMLA: Multi-Head Latent Attention Is All You Need
Official repo for the paper "CamI2V: Camera-Controlled Image-to-Video Diffusion Model"
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
[CVPR 2025] Official PyTorch implementation of Rectified Diffusion Guidance for Conditional Generation
Analyze computation-communication overlap in V3/R1.
A library for advanced large language model reasoning
A curated collection of resources, tutorials, and best practices for learning and mastering NVIDIA CUTLASS