Stars
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from system components such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also …
A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters.
DLRover: An Automatic Distributed Deep Learning System
Optimized primitives for collective multi-GPU communication
Ongoing research training transformer models at scale
Prometheus exporter for custom eBPF metrics
eBPF-based Networking, Security, and Observability
A Kubernetes Dynamic Admission Controller that patches Pods to add additional information.
Data import service for Kubernetes, designed with KubeVirt in mind.
Operator for provisioning and configuring SR-IOV CNI plugin and device plugin
SR-IOV network device plugin for Kubernetes
🐶 Kubernetes CLI To Manage Your Clusters In Style!
Nydus - the Dragonfly image service, providing fast, secure and easy access to container images.
Automated management of large-scale applications on Kubernetes (incubating project under CNCF)
Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
Operate Fluent Bit and Fluentd in the Kubernetes way - Previously known as FluentBit Operator
Repo for the controller-runtime subproject of kubebuilder (sig-apimachinery)
Kubernetes Virtualization API and runtime in order to define and manage virtual machines.
Kubernetes IN Docker - local clusters for testing Kubernetes
Dynamically provisioning persistent local storage with Kubernetes
An open source ebook about the TensorFlow kernel and its implementation mechanisms.