add loki datasource and logs to dashboard
retzkek committed May 15, 2021
1 parent caffc7b commit ba8c32a
Showing 3 changed files with 199 additions and 40 deletions.
80 changes: 51 additions & 29 deletions README.md
@@ -1,13 +1,30 @@
# ChiaMon

Example using [mtail](https://github.com/google/mtail) to collect metrics from
[Chia](https://chia.net) logs,
[chia_exporter](https://github.com/retzkek/chia_exporter) to collect metrics
from the Chia node, with a [docker-compose](https://github.com/docker/compose/)
stack to collect data with [Prometheus](https://prometheus.io/) and graph in
[Grafana](https://grafana.com).

![Chia dashboard](https://img.kmr.me/posts/chiamon2.png)
Example Chia monitoring stack, using:

* [mtail](https://github.com/google/mtail) to collect metrics from
[Chia](https://chia.net) logs
* [chia_exporter](https://github.com/retzkek/chia_exporter) to collect metrics
from the Chia node
* [node_exporter](https://github.com/prometheus/node_exporter) or [windows
exporter](https://github.com/prometheus-community/windows_exporter/) to
collect system metrics
* [prometheus](https://prometheus.io/) to store metrics
* [promtail](https://grafana.com/docs/loki/latest/clients/promtail/) and
[loki](https://grafana.com/docs/loki/latest/) to collect and store logs from
the Chia node and plotters (and system too if desired)
* [grafana](https://grafana.com) to display everything

This includes a [docker-compose](https://github.com/docker/compose/)
configuration to run everything, but this is primarily intended for development
and testing.

**WARNING this is NOT a one-click install, expect to need to do some work
setting everything up for your machine. PLEASE read the notes below and
understand what all the services are, what they do, and how they work
together.**

![Chia dashboard](https://img.kmr.me/posts/chiamon3.png)

## mtail program

@@ -19,7 +36,7 @@ The mtail program is in `mtail/chialog.mtail`. Currently it only collects harves
* `chia_harvester_proofs_total`: cumulative number of proofs won
* `chia_harvester_search_time`: histogram of proof search times

Please set log_level to INFO in your config.yaml
**NOTE** you need to set log_level to INFO in your Chia config.yaml to get harvester metrics.

## chia_exporter

@@ -29,12 +46,13 @@ API](https://github.com/Chia-Network/chia-blockchain/wiki/RPC-Interfaces).

## Grafana dashboard

The Grafana dashboard is in `grafana/dashboards/Chia.json`. It defines a number
of variables that will be auto-populated from the node metrics; use the
dropdowns to customize to show show the drives, mounts, etc that you're
interested in monitoring.
The example Grafana dashboard is in `grafana/dashboards/Chia.json`. It defines a
number of variables that will be auto-populated from the node metrics. Grafana
dashboards are [easily customized](https://grafana.com/docs/) to show what
you're interested in seeing, in the way you find best; this dashboard is just
meant to demonstrate what can be done.

## Linux/Mac
## Running on Linux/Mac

The docker-compose file will mount the Chia log from
`$HOME/.chia/mainnet/log/debug.log`, verify that this location is correct and
@@ -44,27 +62,23 @@ set the log level to INFO in the Chia configuration (usually at
Run:

docker-compose up -d

This will do the following:

* Build container image with configuration for mtail from source
* Build container image for chia_exporter from source
* Download node_exporter, prometheus, and grafana images from docker hub
* Run containers in the background, attached to the host network

The grafana service provisions the prometheus datasource and a basic dashboard
that displays harvester and node metrics.
* Download other images from docker hub
* Run containers in the background, attached to the host network (this makes it
easy to communicate with native services, but has some trade-offs. See notes.)

The grafana service provisions the prometheus and loki datasources and a basic
dashboard that displays harvester and node metrics.

Access Grafana at http://localhost:3000 and login with the default admin/admin
username and password (you'll be prompted to change the password).

### Notes

* This is not a production-ready deployment; there's no persistence of Prometheus
data or the Grafana database, so changes will be lost when the services are
recreated. To do that you'd want to bind-mount local paths to the respective
data directories; consult each project's documentation for details.

* It's highly encouraged to run the node exporter natively rather than in
docker - see the discussion in the [node_exporter
docs](https://github.com/prometheus/node_exporter#docker). If you do run it in
@@ -73,11 +87,19 @@ data directories; consult each project's documentation for details.
'/scratch:/scratch'`). See [issue #3](https://github.com/retzkek/chiamon/issues/3).

* On **Mac** you'll need to run node_exporter natively, not under Docker: `brew
install node_exporter`.
install node_exporter`. You'll probably need to change the networking setup
too, since Docker on Mac runs in a VM. See the windows docker-compose and
prometheus configs.

## Running on Windows

## Windows
The node exporter **does not** work on Windows; instead you need to use the
Windows exporter for system metrics. Modified config and example dashboard are
in the [windows branch](https://github.com/retzkek/chiamon/tree/windows). You
may also want to review the discussion in [issue
#2](https://github.com/retzkek/chiamon/issues/2).

Modified config and dashboard are in the [windows branch](https://github.com/retzkek/chiamon/tree/windows).
These steps will get you to a working setup (but aren't the only way):

* Install [Docker Desktop](https://www.docker.com/products/docker-desktop)
* Install [Visual Studio Code](https://code.visualstudio.com/)
@@ -87,5 +109,5 @@ Modified config and dashboard are in the [windows branch](https://github.com/ret
* Modify `docker-compose.yml`:
- Change volume paths to point to your home directory.
* Run services. In VSCode with docker extension you can just right-click on `docker-compose.yml` and select "Compose Up"
* Check target status in Prometheus at http://localhost:9090/targets
* Check target status in Prometheus at http://localhost:9090/targets
* Access Grafana at http://localhost:3000 (admin/admin).
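
The README steps above check scrape targets in the browser at http://localhost:9090/targets. As a rough sketch, the same status can be read from the Prometheus HTTP API endpoint `/api/v1/targets`. The helper names `fetch_targets` and `summarize_targets`, the job names, and the sample payload are illustrative assumptions and are not part of this commit.

```python
import json
from urllib.request import urlopen


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch target status from the Prometheus HTTP API (needs a running server)."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


def summarize_targets(payload):
    """Return sorted (job, instance, health) tuples for every active target."""
    active = payload.get("data", {}).get("activeTargets", [])
    return sorted(
        (
            target["labels"].get("job", ""),
            target["labels"].get("instance", ""),
            target.get("health", "unknown"),
        )
        for target in active
    )


# Hypothetical response, trimmed to the fields used above.
SAMPLE = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "node", "instance": "localhost:9100"}, "health": "up"},
            {"labels": {"job": "mtail", "instance": "localhost:3903"}, "health": "down"},
        ]
    },
}

if __name__ == "__main__":
    # Swap SAMPLE for fetch_targets() once the stack is running.
    for job, instance, health in summarize_targets(SAMPLE):
        print(f"{job:10} {instance:20} {health}")
```

Any target reported as `down` is worth checking before looking at the dashboard, since its panels would otherwise simply show no data.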
143 changes: 136 additions & 7 deletions grafana/dashboards/Chia.json
@@ -9,14 +9,24 @@
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
},
{
"datasource": "Loki",
"enable": true,
"expr": "{job=\"plotter\"}|~\"Starting plotting|Time for phase|Total time\"",
"hide": false,
"iconColor": "rgba(255, 96, 96, 1)",
"name": "plotter",
"showIn": 0,
"target": {}
}
]
},
"editable": true,
"gnetId": null,
"graphTooltip": 1,
"id": 1,
"iteration": 1620011566939,
"iteration": 1621098681551,
"links": [],
"panels": [
{
@@ -61,7 +71,7 @@
"overrides": []
},
"gridPos": {
"h": 7,
"h": 4,
"w": 6,
"x": 0,
"y": 0
@@ -263,6 +273,66 @@
"alignLevel": null
}
},
{
"datasource": "Prometheus",
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "orange",
"value": null
},
{
"color": "green",
"value": 1
}
]
},
"unit": "mojo"
},
"overrides": []
},
"gridPos": {
"h": 3,
"w": 6,
"x": 0,
"y": 4
},
"id": 28,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": [
"lastNotNull"
],
"fields": "",
"values": false
},
"text": {},
"textMode": "auto"
},
"pluginVersion": "7.5.4",
"targets": [
{
"exemplar": true,
"expr": "chia_wallet_confirmed_balance_mojo",
"interval": "",
"legendFormat": "",
"refId": "A"
}
],
"title": "Balance",
"type": "stat"
},
{
"aliasColors": {},
"bars": false,
@@ -576,9 +646,11 @@
"pluginVersion": "7.5.4",
"targets": [
{
"expr": "sum(chia_harvester_search_time_bucket{plots_eligible!=\"0\"}) by (le)",
"exemplar": true,
"expr": "sum(increase(chia_harvester_search_time_bucket{plots_eligible!=\"0\"}[$__range])) by (le)",
"format": "heatmap",
"hide": false,
"instant": true,
"interval": "",
"legendFormat": "{{le}}",
"refId": "C"
@@ -1139,6 +1211,66 @@
"align": false,
"alignLevel": null
}
},
{
"datasource": "Loki",
"fieldConfig": {
"defaults": {},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 20
},
"id": 31,
"options": {
"dedupStrategy": "none",
"showLabels": false,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": false
},
"pluginVersion": "7.5.4",
"targets": [
{
"expr": "{job=\"chia\"}|=\"harvester.harvester\"|=\"WARNING\"",
"refId": "A"
}
],
"title": "Harvester Warnings",
"type": "logs"
},
{
"datasource": "Loki",
"fieldConfig": {
"defaults": {},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 20
},
"id": 30,
"options": {
"dedupStrategy": "none",
"showLabels": false,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": false
},
"pluginVersion": "7.5.4",
"targets": [
{
"expr": "{job=\"plotter\"}!=\"uniform sort\"",
"refId": "A"
}
],
"title": "Plotter Logs",
"type": "logs"
}
],
"refresh": "1m",
@@ -1150,7 +1282,7 @@
{
"allValue": null,
"current": {
"selected": true,
"selected": false,
"text": "localhost:9100",
"value": "localhost:9100"
},
@@ -1182,7 +1314,6 @@
"allValue": null,
"current": {
"selected": true,
"tags": [],
"text": [
"/"
],
@@ -1218,7 +1349,6 @@
"allValue": null,
"current": {
"selected": true,
"tags": [],
"text": [
"All"
],
@@ -1254,7 +1384,6 @@
"allValue": null,
"current": {
"selected": true,
"tags": [],
"text": [
"All"
],
16 changes: 12 additions & 4 deletions grafana/datasources.yaml
@@ -1,10 +1,12 @@
# config file version
apiVersion: 1
apiVersion: 2

# list of datasources that should be deleted from the database
deleteDatasources:
- name: Prometheus
orgId: 1
- name: Loki
orgId: 1

# list of datasources to insert/update depending
# what's available in the database
@@ -13,11 +15,17 @@ datasources:
type: prometheus
access: proxy
orgId: 1
# <string> custom UID which can be used to reference this datasource in other parts of the configuration, if not specified will be generated automatically
uid: my_unique_uid
url: http://localhost:9090
isDefault: true
version: 1
# <bool> allow users to edit datasources from the UI.
editable: true

- name: Loki
type: loki
access: proxy
orgId: 1
url: http://localhost:3100
isDefault: false
version: 1
# <bool> allow users to edit datasources from the UI.
editable: true
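
Grafana can only use the provisioned datasources if the services behind those URLs are reachable. A minimal sketch for checking that, assuming the two URLs shown in the provisioning file and the readiness endpoints that Prometheus (`/-/ready`) and Loki (`/ready`) expose; the names `READY_PATHS`, `readiness_url`, and `is_ready` are hypothetical helpers and not part of this repository.

```python
from urllib.request import urlopen

# Readiness endpoint path for each datasource type.
READY_PATHS = {"prometheus": "/-/ready", "loki": "/ready"}

# Mirrors the two provisioned datasources (only the fields used here).
DATASOURCES = [
    {"name": "Prometheus", "type": "prometheus", "url": "http://localhost:9090"},
    {"name": "Loki", "type": "loki", "url": "http://localhost:3100"},
]


def readiness_url(datasource):
    """Build the readiness URL for a datasource entry."""
    return datasource["url"].rstrip("/") + READY_PATHS[datasource["type"]]


def is_ready(datasource, timeout=3):
    """Return True if the service answers its readiness endpoint with HTTP 200."""
    try:
        with urlopen(readiness_url(datasource), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


if __name__ == "__main__":
    for ds in DATASOURCES:
        state = "ready" if is_ready(ds) else "NOT ready"
        print(f"{ds['name']:12} {readiness_url(ds):35} {state}")
```

Because the compose stack attaches containers to the host network, `localhost` works both from the host and from inside the Grafana container; with a different network setup the URLs would need adjusting.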
