Stars
Continuous Profiling Platform. Debug performance issues down to a single line of code
A Cloud Native Batch System (Project under CNCF)
Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
🧊 The next generation Package Manager for Kubernetes 📦 Featuring a GUI and a CLI. Glasskube packages are dependency aware, GitOps ready and can get automatic updates via a central public package re…
Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
Fast container image distribution plugin with lazy pulling
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
Run serverless GPU workloads with fast cold starts on bare-metal servers, anywhere in the world
A multi-cluster batch queuing system for high-throughput workloads on Kubernetes.
dazzle is a rather experimental Docker image builder that builds independent layers
Transparent proxy server for llama.cpp's server to provide automatic model swapping
A Kubernetes CRD for prefetching container images onto nodes.
Testing designs for a benchmarking operator (in experimental mode!)