Stars
The Tensor Algebra SuperOptimizer for Deep Learning
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
oneAPI Deep Neural Network Library (oneDNN)
Puck is a high-performance ANN search engine
A Python package that extends the official PyTorch to easily obtain better performance on Intel platforms
A community-maintained Python framework for creating mathematical animations.
A unified, comprehensive and efficient recommendation library
Set of datasets for the deep learning recommendation model (DLRM).
Large Language Model for Generative Recommendation
A curated list of Generative Recommender Systems (Paper & Code)
Large Language Model-enhanced Recommender System Papers
A high-throughput and memory-efficient inference and serving engine for LLMs
The framework for the paper "Inter-layer Scheduling Space Definition and Exploration for Tiled Accelerators" in ISCA 2023.
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.
PyTorch-Based Fast and Efficient Processing for Various Machine Learning Applications with Diverse Sparsity
[MLSys'22] Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective
CogDL: A Comprehensive Library for Graph Deep Learning (WWW 2023)
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
ChatGLM-6B: An Open Bilingual Dialogue Language Model
FLOPs counter for convolutional networks in the PyTorch framework
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.