- Google
- Seattle, WA
- patemotter.com
Stars
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
A plugin loader for the Steam Deck.
Official inference library for Mistral models
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
Generative Models by Stability AI
Simple, safe way to store and distribute tensors
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Fast and memory-efficient exact attention
Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models.
Linear solvers in JAX and Equinox. https://docs.kidger.site/lineax
Hardware accelerated, batchable and differentiable optimizers in JAX.
Copybara: A tool for transforming and moving code between repositories.
The official Python library for the Google Gemini API
Orbax provides common checkpointing and persistence utilities for JAX users.
Type annotations and runtime checking for shape and dtype of JAX/NumPy/PyTorch/etc. arrays. https://docs.kidger.site/jaxtyping/
Performance-portable, length-agnostic SIMD with runtime dispatch
A complete guide to start and improve in machine learning (ML) and artificial intelligence (AI) in 2025 without ANY background in the field, and stay up-to-date with the latest news and state-of-the-art techniques.
A simple, performant, and scalable JAX LLM!
Flax is a neural network library for JAX that is designed for flexibility.