A controller that monitors and fixes your Docker-based microservices infrastructure.
kontrol is a very lightweight application you can run as a Docker container or a Docker Swarm service. It runs requests against a set of resources (containers or services) to determine whether those resources are operating normally (a.k.a. health check). Whenever a health check fails, it can launch commands against the Docker API in order to heal the infrastructure.
kontrol includes Slack notifications through an Incoming Webhook. Insert the webhook URL into your environment to enable the Slack integration. Anytime a healthcheck fails, the Slack channel you specified will receive a message.
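To illustrate what a notification through an Incoming Webhook involves, here is a minimal sketch of posting a message payload to a webhook URL using only Node's standard library. The function names `buildSlackRequest` and `postToSlack` are hypothetical and are not taken from kontrol's code; kontrol's own implementation may differ.

```javascript
const https = require('https')

// Builds the HTTP request description for a Slack Incoming Webhook
// (hypothetical helper, not part of kontrol)
function buildSlackRequest (webhookUrl, payload) {
  const body = JSON.stringify(payload)
  const { hostname, pathname } = new URL(webhookUrl)
  return {
    options: {
      method: 'POST',
      hostname,
      path: pathname,
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body)
      }
    },
    body
  }
}

// Sends the payload and resolves with the HTTP status code of the response
function postToSlack (webhookUrl, payload) {
  const { options, body } = buildSlackRequest(webhookUrl, payload)
  return new Promise((resolve, reject) => {
    const request = https.request(options, response => {
      response.resume() // discard the response body
      response.on('end', () => resolve(response.statusCode))
    })
    request.on('error', reject)
    request.end(body)
  })
}
```

A call such as `postToSlack(process.env.SLACK_WEBHOOK_URL, { text: 'healthcheck failed' })` would then send a simple text message to the channel bound to the webhook.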
Docker provides a built-in healthcheck for containers, but it can be too limited in some use cases. First, you will need a third-party tool anyway if you'd like to be notified on Slack about healthcheck failures. Second, Docker's healing operation only consists of restarting the faulty container itself. In a complex microservices scenario this might not be sufficient, as some containers depend on others being reachable.
For instance, imagine a service A that uses a service B. Both could be seen as healthy from Docker's point of view while service B is not reachable from service A for some reason (e.g. network failure, timeout, etc.). In that case it can be useful to expose a healthcheck endpoint in service A that reports the status of service B from the point of view of A. When that healthcheck fails in service A, you should be notified and service B possibly restarted automatically.
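The scenario above could be sketched as follows: service A exposes a healthcheck endpoint whose status reflects whether service B answers from A's side. This is only an illustration under assumptions: the `SERVICE_B_URL` variable, the `/healthcheck` path, the default address and the function names are all hypothetical.

```javascript
const http = require('http')

// Hypothetical address of service B as seen from service A (assumption)
const SERVICE_B_URL = process.env.SERVICE_B_URL || 'http://service-b:8080/'

// Resolves when service B answers with a 2xx status, rejects otherwise
function probeServiceB (url = SERVICE_B_URL, timeoutMs = 2000) {
  return new Promise((resolve, reject) => {
    const request = http.get(url, response => {
      response.resume() // discard the body, only the status matters
      if (response.statusCode >= 200 && response.statusCode < 300) resolve()
      else reject(new Error(`service B answered ${response.statusCode}`))
    })
    request.setTimeout(timeoutMs, () => request.destroy(new Error('service B timed out')))
    request.on('error', reject)
  })
}

// Maps the probe outcome to the healthcheck response of service A
async function healthcheck (probe = probeServiceB) {
  try {
    await probe()
    return { statusCode: 200, body: { serviceB: 'reachable' } }
  } catch (error) {
    return { statusCode: 500, body: { serviceB: 'unreachable', reason: error.message } }
  }
}

// Healthcheck endpoint exposed by service A
function createServer (probe = probeServiceB) {
  return http.createServer(async (req, res) => {
    if (req.url !== '/healthcheck') {
      res.writeHead(404)
      res.end()
      return
    }
    const { statusCode, body } = await healthcheck(probe)
    res.writeHead(statusCode, { 'Content-Type': 'application/json' })
    res.end(JSON.stringify(body))
  })
}
```

Service A would start it with something like `createServer().listen(3000)`, and a kontrol job could then target that endpoint, with a heal function restarting service B when the request fails.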
If you are looking for a solution to publish error logs to Slack, have a look at our Slack adapter for logspout.
Most of the configuration options come from got and dockerode, which power kontrol. The object exported by config.js should be structured like this:
- docker: the options used to initialize dockerode
- jobs: a map of healthcheck and notification/heal tasks to be performed, for each task identified by its key
  - cron: the CRON pattern to schedule it
  - delay: the delay in seconds before scheduling the task
  - notify: optional function with the error object as input when the healthcheck has failed or none when it is back to healthy after a failure (returns the Slack message payload to be sent for notification)
  - heal: optional function with the docker object (i.e. dockerode instance) and _ (lodash instance) as input and performing Docker commands to heal the infrastructure
  - all other options are sent to the got instance for the healthcheck request
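As an illustration of the structure described above, a configuration file might look like the following sketch. It is not the default config.js shipped with kontrol: the job key, the container name `/service-b`, the URL and the CRON pattern are hypothetical, and passing the request target as a `url` option to got is an assumption.

```javascript
// Illustrative config.js sketch (assumptions are flagged in the comments)
const config = {
  // Options handed to dockerode; socketPath is a standard dockerode option
  docker: { socketPath: '/var/run/docker.sock' },
  jobs: {
    // Hypothetical job key
    'service-a': {
      cron: '*/5 * * * *', // run the healthcheck every five minutes
      delay: 30, // wait 30 seconds before scheduling the task
      // Assumption: the healthcheck target is forwarded to got as `url`
      url: 'http://service-a:8080/healthcheck',
      // Receives the error on failure, nothing on recovery,
      // and returns the Slack message payload
      notify: error => ({
        text: error
          ? `service-a healthcheck failed: ${error.message}`
          : 'service-a is back to healthy'
      }),
      // Restarts the (hypothetical) service-b container through dockerode
      heal: async (docker, _) => {
        const containers = await docker.listContainers()
        const target = _.find(containers, c => c.Names.includes('/service-b'))
        if (target) await docker.getContainer(target.Id).restart()
      }
    }
  }
}

module.exports = config
```

The heal function only relies on `listContainers`, `getContainer` and `restart`, so it can be exercised against a stubbed docker object before being pointed at a real Docker socket.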
Here are the environment variables you can use to customize the behaviour:
Variable | Description | Defaults |
---|---|---|
CONFIG_FILEPATH | your configuration file path | config.js |
PORT | the server port | 8080 |
SLACK_WEBHOOK_URL | your Slack webhook URL | |
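The defaults in the table can be read as the following sketch, which resolves each variable from the environment and falls back to the documented default. The `resolveSettings` function and its property names are hypothetical and only illustrate the behaviour; they are not kontrol's actual code.

```javascript
// Resolves the settings from environment variables, applying the documented defaults
// (hypothetical helper, not part of kontrol)
function resolveSettings (env = process.env) {
  return {
    configFilepath: env.CONFIG_FILEPATH || 'config.js',
    port: Number(env.PORT || 8080),
    // No default: stays undefined when the variable is not set
    slackWebhookUrl: env.SLACK_WEBHOOK_URL
  }
}
```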
The default config.js is a great example to start from. It basically checks the kontrol container itself and restarts it on failure, as Docker would. In test mode, the kontrol container randomly fails with a 500 status code, so the container keeps restarting until you kill it manually. To build and launch the example, execute the following commands:
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxx
docker-compose build
docker-compose up
You can build the image with the following command:
docker build -t <your-image-name> .
This project is configured to use Travis to build the image and push it to Kalisio's Docker Hub.
The built image is tagged using the version property in the package.json file.
To enable Travis to do the job, you must define the following variables in the corresponding Travis project:
Variable | Description |
---|---|
DOCKER_USER | your username |
DOCKER_PASSWORD | your password |
This image is designed to be deployed using the Kargo project.
Check out the compose file for an overview of how the container is deployed.
Please read the Contributing file for details on our code of conduct, and the process for submitting pull requests to us.
This project is sponsored by
This project is licensed under the MIT License - see the license file for details