Skip to content

Commit

Permalink
Add support for user-defined healthchecks
Browse files Browse the repository at this point in the history
This PR adds support for user-defined health-check probes for Docker
containers. It adds a `HEALTHCHECK` instruction to the Dockerfile syntax plus
some corresponding "docker run" options. It can be used with a restart policy
to automatically restart a container if the check fails.

The `HEALTHCHECK` instruction has two forms:

* `HEALTHCHECK [OPTIONS] CMD command` (check container health by running a command inside the container)
* `HEALTHCHECK NONE` (disable any healthcheck inherited from the base image)

The `HEALTHCHECK` instruction tells Docker how to test a container to check that
it is still working. This can detect cases such as a web server that is stuck in
an infinite loop and unable to handle new connections, even though the server
process is still running.

When a container has a healthcheck specified, it has a _health status_ in
addition to its normal status. This status is initially `starting`. Whenever a
health check passes, it becomes `healthy` (whatever state it was previously in).
After a certain number of consecutive failures, it becomes `unhealthy`.

The options that can appear before `CMD` are:

* `--interval=DURATION` (default: `30s`)
* `--timeout=DURATION` (default: `30s`)
* `--retries=N` (default: `1`)

The health check will first run **interval** seconds after the container is
started, and then again **interval** seconds after each previous check completes.

If a single run of the check takes longer than **timeout** seconds then the check
is considered to have failed.

It takes **retries** consecutive failures of the health check for the container
to be considered `unhealthy`.

There can only be one `HEALTHCHECK` instruction in a Dockerfile. If you list
more than one then only the last `HEALTHCHECK` will take effect.

The command after the `CMD` keyword can be either a shell command (e.g. `HEALTHCHECK
CMD /bin/check-running`) or an _exec_ array (as with other Dockerfile commands;
see e.g. `ENTRYPOINT` for details).

The command's exit status indicates the health status of the container.
The possible values are:

- 0: success - the container is healthy and ready for use
- 1: unhealthy - the container is not working correctly
- 2: starting - the container is not ready for use yet, but is working correctly

If the probe returns 2 ("starting") when the container has already moved out of the
"starting" state then it is treated as "unhealthy" instead.

For example, to check every five minutes or so that a web-server is able to
serve the site's main page within three seconds:

    HEALTHCHECK --interval=5m --timeout=3s \
      CMD curl -f http://localhost/ || exit 1

To help debug failing probes, any output text (UTF-8 encoded) that the command writes
on stdout or stderr will be stored in the health status and can be queried with
`docker inspect`. Such output should be kept short (only the first 4096 bytes
are stored currently).

When the health status of a container changes, a `health_status` event is
generated with the new status. The health status is also displayed in the
`docker ps` output.

Signed-off-by: Thomas Leonard <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
  • Loading branch information
Thomas Leonard authored and thaJeztah committed Jun 2, 2016
1 parent d1b1b6a commit b6c7bec
Show file tree
Hide file tree
Showing 27 changed files with 1,170 additions and 80 deletions.
2 changes: 1 addition & 1 deletion api/server/router/container/backend.go
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ type execBackend interface {
ContainerExecCreate(name string, config *types.ExecConfig) (string, error)
ContainerExecInspect(id string) (*backend.ExecInspect, error)
ContainerExecResize(name string, height, width int) error
ContainerExecStart(name string, stdin io.ReadCloser, stdout io.Writer, stderr io.Writer) error
ContainerExecStart(ctx context.Context, name string, stdin io.ReadCloser, stdout io.Writer, stderr io.Writer) error
ExecExists(name string) (bool, error)
}

Expand Down
3 changes: 2 additions & 1 deletion api/server/router/container/exec.go
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,8 @@ func (s *containerRouter) postContainerExecStart(ctx context.Context, w http.Res
}

// Now run the user process in container.
if err := s.backend.ContainerExecStart(execName, stdin, stdout, stderr); err != nil {
// Maybe we should we pass ctx here if we're not detaching?
if err := s.backend.ContainerExecStart(context.Background(), execName, stdin, stdout, stderr); err != nil {
if execStartCheck.Detach {
return err
}
Expand Down
19 changes: 10 additions & 9 deletions builder/dockerfile/builder.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,15 +22,16 @@ import (
)

var validCommitCommands = map[string]bool{
"cmd": true,
"entrypoint": true,
"env": true,
"expose": true,
"label": true,
"onbuild": true,
"user": true,
"volume": true,
"workdir": true,
"cmd": true,
"entrypoint": true,
"healthcheck": true,
"env": true,
"expose": true,
"label": true,
"onbuild": true,
"user": true,
"volume": true,
"workdir": true,
}

// BuiltinAllowedBuildArgs is list of built-in allowed build args
Expand Down
66 changes: 34 additions & 32 deletions builder/dockerfile/command/command.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,40 +3,42 @@ package command

// Define constants for the command strings
const (
Env = "env"
Label = "label"
Maintainer = "maintainer"
Add = "add"
Copy = "copy"
From = "from"
Onbuild = "onbuild"
Workdir = "workdir"
Run = "run"
Cmd = "cmd"
Entrypoint = "entrypoint"
Expose = "expose"
Volume = "volume"
User = "user"
StopSignal = "stopsignal"
Arg = "arg"
Env = "env"
Label = "label"
Maintainer = "maintainer"
Add = "add"
Copy = "copy"
From = "from"
Onbuild = "onbuild"
Workdir = "workdir"
Run = "run"
Cmd = "cmd"
Entrypoint = "entrypoint"
Expose = "expose"
Volume = "volume"
User = "user"
StopSignal = "stopsignal"
Arg = "arg"
Healthcheck = "healthcheck"
)

// Commands is list of all Dockerfile commands
var Commands = map[string]struct{}{
Env: {},
Label: {},
Maintainer: {},
Add: {},
Copy: {},
From: {},
Onbuild: {},
Workdir: {},
Run: {},
Cmd: {},
Entrypoint: {},
Expose: {},
Volume: {},
User: {},
StopSignal: {},
Arg: {},
Env: {},
Label: {},
Maintainer: {},
Add: {},
Copy: {},
From: {},
Onbuild: {},
Workdir: {},
Run: {},
Cmd: {},
Entrypoint: {},
Expose: {},
Volume: {},
User: {},
StopSignal: {},
Arg: {},
Healthcheck: {},
}
107 changes: 107 additions & 0 deletions builder/dockerfile/dispatchers.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,9 @@ import (
"regexp"
"runtime"
"sort"
"strconv"
"strings"
"time"

"github.com/Sirupsen/logrus"
"github.com/docker/docker/api"
Expand Down Expand Up @@ -426,6 +428,111 @@ func cmd(b *Builder, args []string, attributes map[string]bool, original string)
return nil
}

// parseOptInterval(flag) is the duration of flag.Value, or 0 if
// empty. An error is reported if the value is given and is not positive.
func parseOptInterval(f *Flag) (time.Duration, error) {
s := f.Value
if s == "" {
return 0, nil
}
d, err := time.ParseDuration(s)
if err != nil {
return 0, err
}
if d <= 0 {
return 0, fmt.Errorf("Interval %#v must be positive", f.name)
}
return d, nil
}

// HEALTHCHECK foo
//
// Set the default healthcheck command to run in the container (which may be empty).
// Argument handling is the same as RUN.
//
func healthcheck(b *Builder, args []string, attributes map[string]bool, original string) error {
if len(args) == 0 {
return fmt.Errorf("HEALTHCHECK requires an argument")
}
typ := strings.ToUpper(args[0])
args = args[1:]
if typ == "NONE" {
if len(args) != 0 {
return fmt.Errorf("HEALTHCHECK NONE takes no arguments")
}
test := strslice.StrSlice{typ}
b.runConfig.Healthcheck = &container.HealthConfig{
Test: test,
}
} else {
if b.runConfig.Healthcheck != nil {
oldCmd := b.runConfig.Healthcheck.Test
if len(oldCmd) > 0 && oldCmd[0] != "NONE" {
fmt.Fprintf(b.Stdout, "Note: overriding previous HEALTHCHECK: %v\n", oldCmd)
}
}

healthcheck := container.HealthConfig{}

flInterval := b.flags.AddString("interval", "")
flTimeout := b.flags.AddString("timeout", "")
flRetries := b.flags.AddString("retries", "")

if err := b.flags.Parse(); err != nil {
return err
}

switch typ {
case "CMD":
cmdSlice := handleJSONArgs(args, attributes)
if len(cmdSlice) == 0 {
return fmt.Errorf("Missing command after HEALTHCHECK CMD")
}

if !attributes["json"] {
typ = "CMD-SHELL"
}

healthcheck.Test = strslice.StrSlice(append([]string{typ}, cmdSlice...))
default:
return fmt.Errorf("Unknown type %#v in HEALTHCHECK (try CMD)", typ)
}

interval, err := parseOptInterval(flInterval)
if err != nil {
return err
}
healthcheck.Interval = interval

timeout, err := parseOptInterval(flTimeout)
if err != nil {
return err
}
healthcheck.Timeout = timeout

if flRetries.Value != "" {
retries, err := strconv.ParseInt(flRetries.Value, 10, 32)
if err != nil {
return err
}
if retries < 1 {
return fmt.Errorf("--retries must be at least 1 (not %d)", retries)
}
healthcheck.Retries = int(retries)
} else {
healthcheck.Retries = 0
}

b.runConfig.Healthcheck = &healthcheck
}

if err := b.commit("", b.runConfig.Cmd, fmt.Sprintf("HEALTHCHECK %q", b.runConfig.Healthcheck)); err != nil {
return err
}

return nil
}

// ENTRYPOINT /usr/sbin/nginx
//
// Set the entrypoint (which defaults to sh -c on linux, or cmd /S /C on Windows) to
Expand Down
33 changes: 17 additions & 16 deletions builder/dockerfile/evaluator.go
Original file line number Diff line number Diff line change
Expand Up @@ -58,22 +58,23 @@ var evaluateTable map[string]func(*Builder, []string, map[string]bool, string) e

func init() {
evaluateTable = map[string]func(*Builder, []string, map[string]bool, string) error{
command.Env: env,
command.Label: label,
command.Maintainer: maintainer,
command.Add: add,
command.Copy: dispatchCopy, // copy() is a go builtin
command.From: from,
command.Onbuild: onbuild,
command.Workdir: workdir,
command.Run: run,
command.Cmd: cmd,
command.Entrypoint: entrypoint,
command.Expose: expose,
command.Volume: volume,
command.User: user,
command.StopSignal: stopSignal,
command.Arg: arg,
command.Env: env,
command.Label: label,
command.Maintainer: maintainer,
command.Add: add,
command.Copy: dispatchCopy, // copy() is a go builtin
command.From: from,
command.Onbuild: onbuild,
command.Workdir: workdir,
command.Run: run,
command.Cmd: cmd,
command.Entrypoint: entrypoint,
command.Expose: expose,
command.Volume: volume,
command.User: user,
command.StopSignal: stopSignal,
command.Arg: arg,
command.Healthcheck: healthcheck,
}
}

Expand Down
29 changes: 29 additions & 0 deletions builder/dockerfile/parser/line_parsers.go
Original file line number Diff line number Diff line change
Expand Up @@ -329,3 +329,32 @@ func parseMaybeJSONToList(rest string) (*Node, map[string]bool, error) {

return parseStringsWhitespaceDelimited(rest)
}

// The HEALTHCHECK command is like parseMaybeJSON, but has an extra type argument.
func parseHealthConfig(rest string) (*Node, map[string]bool, error) {
// Find end of first argument
var sep int
for ; sep < len(rest); sep++ {
if unicode.IsSpace(rune(rest[sep])) {
break
}
}
next := sep
for ; next < len(rest); next++ {
if !unicode.IsSpace(rune(rest[next])) {
break
}
}

if sep == 0 {
return nil, nil, nil
}

typ := rest[:sep]
cmd, attrs, err := parseMaybeJSON(rest[next:])
if err != nil {
return nil, nil, err
}

return &Node{Value: typ, Next: cmd, Attributes: attrs}, nil, err
}
33 changes: 17 additions & 16 deletions builder/dockerfile/parser/parser.go
Original file line number Diff line number Diff line change
Expand Up @@ -66,22 +66,23 @@ func init() {
// functions. Errors are propagated up by Parse() and the resulting AST can
// be incorporated directly into the existing AST as a next.
dispatch = map[string]func(string) (*Node, map[string]bool, error){
command.User: parseString,
command.Onbuild: parseSubCommand,
command.Workdir: parseString,
command.Env: parseEnv,
command.Label: parseLabel,
command.Maintainer: parseString,
command.From: parseString,
command.Add: parseMaybeJSONToList,
command.Copy: parseMaybeJSONToList,
command.Run: parseMaybeJSON,
command.Cmd: parseMaybeJSON,
command.Entrypoint: parseMaybeJSON,
command.Expose: parseStringsWhitespaceDelimited,
command.Volume: parseMaybeJSONToList,
command.StopSignal: parseString,
command.Arg: parseNameOrNameVal,
command.User: parseString,
command.Onbuild: parseSubCommand,
command.Workdir: parseString,
command.Env: parseEnv,
command.Label: parseLabel,
command.Maintainer: parseString,
command.From: parseString,
command.Add: parseMaybeJSONToList,
command.Copy: parseMaybeJSONToList,
command.Run: parseMaybeJSON,
command.Cmd: parseMaybeJSON,
command.Entrypoint: parseMaybeJSON,
command.Expose: parseStringsWhitespaceDelimited,
command.Volume: parseMaybeJSONToList,
command.StopSignal: parseString,
command.Arg: parseNameOrNameVal,
command.Healthcheck: parseHealthConfig,
}
}

Expand Down
10 changes: 10 additions & 0 deletions builder/dockerfile/parser/testfiles/health/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
FROM debian
ADD check.sh main.sh /app/
CMD /app/main.sh
HEALTHCHECK
HEALTHCHECK --interval=5s --timeout=3s --retries=1 \
CMD /app/check.sh --quiet
HEALTHCHECK CMD
HEALTHCHECK CMD a b
HEALTHCHECK --timeout=3s CMD ["foo"]
HEALTHCHECK CONNECT TCP 7000
9 changes: 9 additions & 0 deletions builder/dockerfile/parser/testfiles/health/result
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
(from "debian")
(add "check.sh" "main.sh" "/app/")
(cmd "/app/main.sh")
(healthcheck)
(healthcheck ["--interval=5s" "--timeout=3s" "--retries=1"] "CMD" "/app/check.sh --quiet")
(healthcheck "CMD")
(healthcheck "CMD" "a b")
(healthcheck ["--timeout=3s"] "CMD" "foo")
(healthcheck "CONNECT" "TCP 7000")
Loading

0 comments on commit b6c7bec

Please sign in to comment.