Kubernetes Integration

Overview

Get metrics from the Kubernetes service in real time to:

  • Visualize and monitor Kubernetes states
  • Be notified about Kubernetes failovers and events

Note: This check only works with Agent v5. For Agent v6+, refer to the kubelet check.

Setup

Installation

The Kubernetes check is included in the Datadog Agent package, so you don't need to install anything else on your Kubernetes servers.

For more information on installing the Datadog Agent on your Kubernetes clusters, see the Kubernetes documentation page.

To collect Kubernetes State metrics, please refer to the kubernetes_state integration.

Configuration

Edit the kubernetes.yaml file to point to your server and port, and set the masters to monitor. See the sample kubernetes.yaml for all available configuration options.
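
As a rough illustration of the shape of this file, a minimal kubernetes.yaml might look like the sketch below. The option names (port, host, method, kubelet_port) are recalled from the Agent v5 sample file and the values are placeholders, so treat them as assumptions and verify them against the sample kubernetes.yaml for your Agent version.

```yaml
init_config:

instances:
  # Port of the cAdvisor API exposed by the kubelet (assumed default value).
  - port: 4194
    # Assumed optional overrides; uncomment only if auto-detection of the
    # kubelet through the container's default gateway does not work.
    # host: localhost
    # method: http
    # kubelet_port: 10255
```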

Validation

Run the Agent's status subcommand and look for kubernetes under the Checks section.

Data Collected

Metrics

See metadata.csv for a list of metrics provided by this integration.

Events

As of the 5.17.0 release, the Datadog Agent supports a built-in leader election option for the Kubernetes event collector. Once enabled, you no longer need to deploy an additional event collection container to your cluster. Instead, Agents coordinate to ensure that only one Agent instance gathers events at a given time. The following events are available:

  • Backoff
  • Conflict
  • Delete
  • DeletingAllPods
  • Didn't have enough resource
  • Error
  • Failed
  • FailedCreate
  • FailedDelete
  • FailedMount
  • FailedSync
  • Failedvalidation
  • FreeDiskSpaceFailed
  • HostPortConflict
  • InsufficientFreeCPU
  • InsufficientFreeMemory
  • InvalidDiskCapacity
  • Killing
  • KubeletsetupFailed
  • NodeNotReady
  • NodeoutofDisk
  • OutofDisk
  • Rebooted
  • TerminatedAllPods
  • Unable
  • Unhealthy
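
A minimal sketch of enabling event collection with leader election in kubernetes.yaml is shown here. The collect_events and leader_candidate option names are assumptions based on the Agent v5 sample configuration, so confirm them in the sample kubernetes.yaml before relying on them.

```yaml
init_config:

instances:
  - port: 4194
    # Assumed option: fetch events from the Kubernetes API.
    collect_events: true
    # Assumed option: let this Agent take part in leader election so that
    # only one Agent in the cluster collects events at a given time.
    leader_candidate: true
```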

Service Checks

The Kubernetes check does not include any service checks.

Troubleshooting

Can I install the Agent on my Kubernetes master node(s)?

Yes. Kubernetes 1.6 introduced the concept of taints and tolerations: rather than being off limits, the master is now simply tainted. To run the Agent pod on a master node, add the following toleration to the pod spec of your Deployment (or DaemonSet if you are running a multi-master setup):

spec:
  tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
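
To show where the toleration sits in a full manifest, here is a sketch of a DaemonSet. The resource name, labels, image tag, and apiVersion are illustrative assumptions and are not taken from the original; adapt them to your cluster (older clusters may need a different apiVersion).

```yaml
apiVersion: apps/v1          # assumption; depends on your cluster version
kind: DaemonSet
metadata:
  name: dd-agent             # hypothetical name
spec:
  selector:
    matchLabels:
      app: dd-agent          # hypothetical label
  template:
    metadata:
      labels:
        app: dd-agent
    spec:
      # Tolerations belong to the pod spec, alongside "containers".
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: dd-agent
          image: datadog/docker-dd-agent:latest   # assumed Agent v5 image
```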

Why is the Kubernetes check failing with a ConnectTimeout error to port 10250?

The Agent assumes that the kubelet API is available at the default gateway of the container. If that's not the case because you are using a software-defined network like Calico or Flannel, specify the kubelet host using an environment variable:

- name: KUBERNETES_KUBELET_HOST
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName

See this PR
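
For context, the variable goes in the env list of the Agent container inside the pod spec. The container name and image below are illustrative assumptions and are not taken from the original.

```yaml
containers:
  - name: dd-agent                          # hypothetical container name
    image: datadog/docker-dd-agent:latest   # assumed Agent v5 image
    env:
      # Point the Agent at the node that runs this pod instead of the
      # container's default gateway.
      - name: KUBERNETES_KUBELET_HOST
        valueFrom:
          fieldRef:
            fieldPath: spec.nodeName
```

This relies on the node name being resolvable from inside the pod. If it is not, the downward API field status.hostIP may be a workable alternative, though that is a suggestion rather than something stated in the original.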

Why is there a container in each Kubernetes pod with 0% CPU and minimal disk/ram?

These are pause containers (docker_image:gcr.io/google_containers/pause.*) that K8s injects into every pod to keep it populated even if the "real" container is restarting/stopped.

The docker_daemon check ignores them through a default exclusion list, but they will show up for K8s metrics like kubernetes.cpu.usage.total and kubernetes.filesystem.usage.

Further Reading

To get a better idea of how (or why) to integrate your Kubernetes service, check out our series of blog posts about it.