Enable gpu mode if gpu hardware detected.
layer-nvidia-cuda does the hardware detection and sets a state that the
worker can react to.

When gpu is available, worker updates config and restarts kubelet to
enable gpu mode. Worker then notifies master that it's in gpu mode via
the kube-control relation.

When master sees that a worker is in gpu mode, it updates to privileged
mode and restarts kube-apiserver.

The kube-control interface has subsumed the kube-dns interface
functionality.

An 'allow-privileged' config option has been added to both worker and
master charms. The gpu enablement respects the value of this option;
i.e., we can't enable gpu mode if the operator has set
allow-privileged="false".
tvansteenburgh committed Mar 23, 2017
1 parent edbc9f9 commit c87ac5e
Showing 15 changed files with 533 additions and 29 deletions.
10 changes: 10 additions & 0 deletions cluster/juju/layers/kubernetes-master/config.yaml
@@ -11,3 +11,13 @@ options:
type: string
default: 10.152.183.0/24
description: CIDR to use for Kubernetes services. Cannot be changed after deployment.
allow-privileged:
type: string
default: "auto"
description: |
Allow kube-apiserver to run in privileged mode. Supported values are
"true", "false", and "auto". If "true", kube-apiserver will run in
privileged mode by default. If "false", kube-apiserver will never run in
privileged mode. If "auto", kube-apiserver will not run in privileged
mode by default, but will switch to privileged mode if gpu hardware is
detected on a worker node.
1 change: 1 addition & 0 deletions cluster/juju/layers/kubernetes-master/layer.yaml
@@ -10,6 +10,7 @@ includes:
- 'interface:http'
- 'interface:kubernetes-cni'
- 'interface:kube-dns'
- 'interface:kube-control'
- 'interface:public-address'
options:
basic:
@@ -0,0 +1,64 @@
#!/usr/bin/env python

# Copyright 2015 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import re
import subprocess

from charmhelpers.core import unitdata

BIN_VERSIONS = 'bin_versions'


def get_version(bin_name):
    """Get the version of an installed Kubernetes binary.

    :param str bin_name: Name of binary
    :return: 3-tuple version (maj, min, patch)

    Example::

        >>> get_version('kubelet')
        (1, 6, 0)

    """
    db = unitdata.kv()
    bin_versions = db.get(BIN_VERSIONS, {})

    cached_version = bin_versions.get(bin_name)
    if cached_version:
        return tuple(cached_version)

    version = _get_bin_version(bin_name)
    bin_versions[bin_name] = list(version)
    db.set(BIN_VERSIONS, bin_versions)
    return version


def reset_versions():
    """Reset the cache of bin versions.

    """
    db = unitdata.kv()
    db.unset(BIN_VERSIONS)


def _get_bin_version(bin_name):
    """Get a binary version by calling it with --version and parsing output.

    """
    cmd = '{} --version'.format(bin_name).split()
    version_string = subprocess.check_output(cmd).decode('utf-8')
    return tuple(int(q) for q in re.findall("[0-9]+", version_string)[:3])
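The parsing step in `_get_bin_version` can be tried without a Kubernetes binary. The sketch below is illustrative only: `parse_version` is a hypothetical name, and the sample string is an assumed output format, not something taken from this commit. It applies the same regular expression and shows why a 3-tuple can be compared against a shorter bound such as `(1, 6)`.

```python
import re


def parse_version(version_string):
    """Return the first three integers found in version_string as a tuple."""
    return tuple(int(q) for q in re.findall(r"[0-9]+", version_string)[:3])


# Assumed sample output; a real binary may print a different format.
sample = "Kubernetes v1.5.3"
version = parse_version(sample)

# Tuples compare element by element, so a two-element bound is enough
# to express "older than 1.6".
older_than_1_6 = version < (1, 6)
```

Because the expression collects every run of digits, any number printed before the version itself would be picked up first, so this approach relies on the version being the first numeric content in the output.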
@@ -107,10 +107,17 @@ def destroy(self, key, strict=False):
if strict:
self.data.pop('{}-strict'.format(key))
else:
-self.data.pop('key')
+self.data.pop(key)
self.__save()
except KeyError:
pass

def get(self, key, default=None):
"""Return the value for ``key``, or the default if ``key`` doesn't exist.
"""
return self.data.get(key, default)

def to_s(self):
'''
Render the flags to a single string, prepared for the Docker
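The one-line change in `destroy` is easy to miss, so here is a minimal sketch of why it matters. `TinyFlags` is a hypothetical dict-backed stand-in, not the real `FlagManager`; it only reproduces the pop-inside-try pattern shown in the hunk above.

```python
class TinyFlags:
    """Hypothetical stand-in for a dict-backed flag store."""

    def __init__(self):
        self.data = {}

    def destroy_buggy(self, key):
        try:
            # Pops the literal string 'key', not the entry named by the argument.
            self.data.pop('key')
        except KeyError:
            # The KeyError is swallowed, so the call looks like it succeeded.
            pass

    def destroy_fixed(self, key):
        try:
            self.data.pop(key)
        except KeyError:
            pass


flags = TinyFlags()
flags.data['--v'] = '4'

flags.destroy_buggy('--v')
still_present = '--v' in flags.data

flags.destroy_fixed('--v')
removed = '--v' not in flags.data
```

With the quoted name, the flag silently survives the call; with the variable, it is removed.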
5 changes: 5 additions & 0 deletions cluster/juju/layers/kubernetes-master/metadata.yaml
@@ -20,7 +20,12 @@ provides:
kube-api-endpoint:
interface: http
cluster-dns:
# kube-dns is deprecated. Its functionality has been rolled into the
# kube-control interface. The cluster-dns relation will be removed in
# a future release.
interface: kube-dns
kube-control:
interface: kube-control
cni:
interface: kubernetes-cni
scope: container
155 changes: 146 additions & 9 deletions cluster/juju/layers/kubernetes-master/reactive/kubernetes_master.py
@@ -35,6 +35,7 @@
from charms.reactive import set_state
from charms.reactive import when, when_any, when_not
from charms.reactive.helpers import data_changed
from charms.kubernetes.common import get_version, reset_versions
from charms.kubernetes.flagmanager import FlagManager

from charmhelpers.core import hookenv
@@ -131,6 +132,7 @@ def install():
hookenv.log(install)
check_call(install)

reset_versions()
set_state('kubernetes-master.components.installed')


@@ -274,13 +276,28 @@ def start_master(etcd, tls):
set_state('kubernetes-master.components.started')


-@when('cluster-dns.connected')
-def send_cluster_dns_detail(cluster_dns):
+@when('kube-control.connected')
+def send_cluster_dns_detail(kube_control):
''' Send cluster DNS info '''
# Note that the DNS server doesn't necessarily exist at this point. We know
# where we're going to put it, though, so let's send the info anyway.
dns_ip = get_dns_ip()
-cluster_dns.set_dns_info(53, hookenv.config('dns_domain'), dns_ip)
+kube_control.set_dns(53, hookenv.config('dns_domain'), dns_ip)


@when_not('kube-control.connected')
def missing_kube_control():
    """Inform the operator they need to add the kube-control relation.

    If deploying via bundle this won't happen, but if operator is upgrading a
    charm in a deployment that pre-dates the kube-control relation, it'll be
    missing.

    """
    hookenv.status_set(
        'blocked',
        'Relate {}:kube-control kubernetes-worker:kube-control'.format(
            hookenv.service_name()))


@when('kube-api-endpoint.available')
@@ -529,12 +546,110 @@ def remove_nrpe_config(nagios=None):
nrpe_setup.remove_check(shortname=service)


def set_privileged(privileged, render_config=True):
    """Update the KUBE_ALLOW_PRIV flag for kube-apiserver and re-render config.

    If the flag already matches the requested value, this is a no-op.

    :param str privileged: "true" or "false"
    :param bool render_config: whether to render new config file
    :return: True if the flag was changed, else False

    """
    if privileged == "true":
        set_state('kubernetes-master.privileged')
    else:
        remove_state('kubernetes-master.privileged')

    flag = '--allow-privileged'
    kube_allow_priv_opts = FlagManager('KUBE_ALLOW_PRIV')
    if kube_allow_priv_opts.get(flag) == privileged:
        # Flag isn't changing, nothing to do
        return False

    hookenv.log('Setting {}={}'.format(flag, privileged))

    # Update --allow-privileged flag value
    kube_allow_priv_opts.add(flag, privileged, strict=True)

    # re-render config with new options
    if render_config:
        context = {
            'kube_allow_priv': kube_allow_priv_opts.to_s(),
        }

        # render the kube-defaults file
        render('kube-defaults.defaults', '/etc/default/kube-defaults', context)

        # signal that we need a kube-apiserver restart
        set_state('kubernetes-master.kube-apiserver.restart')

    return True


@when('config.changed.allow-privileged')
@when('kubernetes-master.components.started')
def on_config_allow_privileged_change():
    """React to changed 'allow-privileged' config value.

    """
    config = hookenv.config()
    privileged = config['allow-privileged']
    if privileged == "auto":
        return

    set_privileged(privileged)
    remove_state('config.changed.allow-privileged')


@when('kubernetes-master.kube-apiserver.restart')
def restart_kube_apiserver():
    """Restart kube-apiserver.

    """
    host.service_restart('kube-apiserver')
    remove_state('kubernetes-master.kube-apiserver.restart')


@when('kube-control.gpu.available')
@when('kubernetes-master.components.started')
@when_not('kubernetes-master.gpu.enabled')
def on_gpu_available(kube_control):
    """The remote side (kubernetes-worker) is gpu-enabled.

    We need to run in privileged mode.

    """
    config = hookenv.config()
    if config['allow-privileged'] == "false":
        hookenv.status_set(
            'active',
            'GPUs available. Set allow-privileged="auto" to enable.'
        )
        return

    set_privileged("true")
    set_state('kubernetes-master.gpu.enabled')


@when('kubernetes-master.gpu.enabled')
@when_not('kubernetes-master.privileged')
def disable_gpu_mode():
    """We were in gpu mode, but the operator has set allow-privileged="false",
    so we can't run in gpu mode anymore.

    """
    remove_state('kubernetes-master.gpu.enabled')


def create_addon(template, context):
'''Create an addon from a template'''
source = 'addons/' + template
target = '/etc/kubernetes/addons/' + template
render(source, target, context)
-cmd = ['kubectl', 'apply', '-f', target]
+# Need --force when upgrading between k8s versions where the templates have
+# changed.
+cmd = ['kubectl', 'apply', '--force', '-f', target]
check_call(cmd)
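The handlers above spread the `allow-privileged` decision across several reactive states. As a reading aid, the sketch below condenses the outcome into one pure function. `resolve_privileged` is a hypothetical helper that does not exist in the charm, and it assumes the config value is one of the three documented strings; any other value is treated like "auto" here, which is a simplification.

```python
def resolve_privileged(allow_privileged, gpu_available):
    """Return the --allow-privileged value ("true" or "false").

    Hypothetical summary of the master-side behaviour:
    - "true":  always privileged
    - "false": never privileged, even if a worker reports gpu hardware
    - "auto":  privileged only once a worker reports gpu hardware
    """
    if allow_privileged == "true":
        return "true"
    if allow_privileged == "false":
        return "false"
    return "true" if gpu_available else "false"


# Enumerate every combination of config value and gpu availability.
table = {
    (value, gpu): resolve_privileged(value, gpu)
    for value in ("true", "false", "auto")
    for gpu in (True, False)
}
```

The only combination where gpu detection changes the result is "auto", which matches the config description: the operator's explicit "true" or "false" always wins.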


@@ -683,6 +798,7 @@ def render_files():
api_opts = FlagManager('kube-apiserver')
controller_opts = FlagManager('kube-controller-manager')
scheduler_opts = FlagManager('kube-scheduler')
scheduler_opts.add('--v', '2')

# Get the tls paths from the layer data.
layer_options = layer.options('tls-client')
@@ -692,6 +808,11 @@ def render_files():
server_cert_path = layer_options.get('server_certificate_path')
server_key_path = layer_options.get('server_key_path')

# set --allow-privileged flag for kube-apiserver
set_privileged(
"true" if config['allow-privileged'] == "true" else "false",
render_config=False)

# Handle static options for now
api_opts.add('--min-request-timeout', '300')
api_opts.add('--v', '4')
@@ -701,17 +822,33 @@ def render_files():
api_opts.add('--kubelet-certificate-authority', ca_cert_path)
api_opts.add('--kubelet-client-certificate', client_cert_path)
api_opts.add('--kubelet-client-key', client_key_path)

-scheduler_opts.add('--v', '2')
# Needed for upgrade from 1.5.x to 1.6.0
# XXX: support etcd3
api_opts.add('--storage-backend', 'etcd2')
admission_control = [
'NamespaceLifecycle',
'LimitRanger',
'ServiceAccount',
'ResourceQuota',
'DefaultTolerationSeconds'
]
if get_version('kube-apiserver') < (1, 6):
hookenv.log('Removing DefaultTolerationSeconds from admission-control')
admission_control.remove('DefaultTolerationSeconds')
api_opts.add(
'--admission-control', ','.join(admission_control), strict=True)

# Default to 3 minute resync. TODO: Make this configurable?
controller_opts.add('--min-resync-period', '3m')
controller_opts.add('--v', '2')
controller_opts.add('--root-ca-file', ca_cert_path)

-context.update({'kube_apiserver_flags': api_opts.to_s(),
-                'kube_scheduler_flags': scheduler_opts.to_s(),
-                'kube_controller_manager_flags': controller_opts.to_s()})
+context.update({
+    'kube_allow_priv': FlagManager('KUBE_ALLOW_PRIV').to_s(),
+    'kube_apiserver_flags': api_opts.to_s(),
+    'kube_scheduler_flags': scheduler_opts.to_s(),
+    'kube_controller_manager_flags': controller_opts.to_s(),
+})

# Render the configuration files that contains parameters for
# the apiserver, scheduler, and controller-manager
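The version gate on `--admission-control` in `render_files` can be exercised in isolation. The following is a minimal sketch, with `admission_control_flag` as a hypothetical function name; it takes the version tuple as an argument instead of calling `get_version`, so it runs without a kube-apiserver binary.

```python
def admission_control_flag(apiserver_version):
    """Build the comma-separated --admission-control value for a version tuple."""
    plugins = [
        'NamespaceLifecycle',
        'LimitRanger',
        'ServiceAccount',
        'ResourceQuota',
        'DefaultTolerationSeconds',
    ]
    # Mirror the diff: drop DefaultTolerationSeconds for versions below 1.6.
    if apiserver_version < (1, 6):
        plugins.remove('DefaultTolerationSeconds')
    return ','.join(plugins)


old_value = admission_control_flag((1, 5, 3))
new_value = admission_control_flag((1, 6, 0))
```

Building the list in Python and emptying `KUBE_ADMISSION_CONTROL` in the template means the same charm revision can render a valid flag for either version.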
@@ -11,7 +11,7 @@ KUBE_API_ADDRESS="--insecure-bind-address=127.0.0.1"
KUBE_API_PORT="--insecure-port=8080"

# default admission control policies
-KUBE_ADMISSION_CONTROL="--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultTolerationSeconds,ResourceQuota"
+KUBE_ADMISSION_CONTROL=""

# Add your own!
KUBE_API_ARGS="{{ kube_apiserver_flags }}"
@@ -16,7 +16,7 @@ KUBE_LOGTOSTDERR="--logtostderr=true"
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
-KUBE_ALLOW_PRIV="--allow-privileged=false"
+KUBE_ALLOW_PRIV="{{ kube_allow_priv }}"

# How the controller-manager, scheduler, and proxy find the apiserver
KUBE_MASTER="--master=http://127.0.0.1:8080"
9 changes: 9 additions & 0 deletions cluster/juju/layers/kubernetes-worker/config.yaml
@@ -11,3 +11,12 @@ options:
description: |
Labels can be used to organize and to select subsets of nodes in the
cluster. Declare node labels in key=value format, separated by spaces.
allow-privileged:
type: string
default: "auto"
description: |
Allow privileged containers to run on worker nodes. Supported values are
"true", "false", and "auto". If "true", kubelet will run in privileged
mode by default. If "false", kubelet will never run in privileged mode.
If "auto", kubelet will not run in privileged mode by default, but will
switch to privileged mode if gpu hardware is detected.
2 changes: 2 additions & 0 deletions cluster/juju/layers/kubernetes-worker/layer.yaml
@@ -5,9 +5,11 @@ includes:
- 'layer:docker'
- 'layer:nagios'
- 'layer:tls-client'
- 'layer:nvidia-cuda'
- 'interface:http'
- 'interface:kubernetes-cni'
- 'interface:kube-dns'
- 'interface:kube-control'
options:
basic:
packages: