Stars
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
An implementation of a deep learning recommendation model (DLRM)
PyTorch domain library for recommendation systems
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Dynamic Memory Management for Serving LLMs without PagedAttention
Ongoing research training transformer models at scale
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and the SDPA implementation of FlashAttention v2.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
A PyTorch native library for large model training
AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and versatility of software and hardware.
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Tensors and dynamic neural networks in Python with strong GPU acceleration