Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
Deep Learning Energy Measurement and Optimization
An SSH command runner with a focus on simplicity
[NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers
Python packaging and dependency management made easy
PyTorch implementation of a 1.3B text-to-image generation model trained on 14 million image-text pairs
Learning Software Engineering By Building Web Services
Multi-DNN Inference Engine for Heterogeneous Mobile Processors
Visualizer for neural network, deep learning and machine learning models
Parses C mathematical library functions and shows their math formulas on hover