KnowingNothing
  • ByteDance
  • Beijing


GitHub mirror of the triton-lang/triton repo.

C++ 14 4 Updated Dec 21, 2024

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 87,996 8,380 Updated Dec 23, 2024

Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022

Python 1,073 86 Updated May 15, 2024

TiledCUDA is a highly efficient kernel template library designed to elevate CUDA C’s level of abstraction for processing tiles.

C++ 169 10 Updated Nov 18, 2024

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 63 4 Updated Dec 19, 2024

NVIDIA Linux open GPU with P2P support

C 954 91 Updated Dec 18, 2024

veRL: Volcano Engine Reinforcement Learning for LLM

Python 472 34 Updated Dec 22, 2024

The best OSS video generation models

Python 2,534 260 Updated Dec 18, 2024

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 1,044 103 Updated Dec 23, 2024

compilerbook

49 28 Updated Apr 25, 2021

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,297 106 Updated Dec 18, 2024

Collection of AWESOME vision-language models for vision tasks

2,644 227 Updated Dec 3, 2024

This is originally a collection of papers on neural network accelerators. Now it's more like my selection of research on deep learning and computer architecture.

1,882 385 Updated Aug 11, 2024

A native PyTorch Library for large model training

Python 2,779 227 Updated Dec 20, 2024

[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable

Python 125 8 Updated Sep 21, 2024

Applied AI experiments and examples for PyTorch

Python 191 17 Updated Dec 17, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 6,634 594 Updated Dec 23, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 669 27 Updated Sep 21, 2024

A fast communication-overlapping library for tensor parallelism on GPUs.

C++ 247 21 Updated Oct 30, 2024

A low-latency & high-throughput serving engine for LLMs

Python 275 34 Updated Sep 12, 2024

MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN (ASPLOS'24)

Python 47 4 Updated May 29, 2024

A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures

407 35 Updated Nov 28, 2024

Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

Python 262 23 Updated Nov 3, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,768 100 Updated Jan 21, 2024

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 685 39 Updated Dec 19, 2024

Scalable and robust tree-based speculative decoding algorithm

Python 322 37 Updated Aug 13, 2024

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 237 13 Updated Aug 31, 2024

A "large" language model running on a microcontroller

C++ 499 36 Updated Dec 9, 2023

Grok open release

Python 49,743 8,345 Updated Aug 30, 2024

Airport recommendations and airport reviews

4,275 108 Updated Nov 18, 2024