Move wiki information to docs (influxdata#9126)
sjwang90 authored Apr 22, 2021
1 parent 03b2dae commit 1bc87cc
Showing 13 changed files with 794 additions and 0 deletions.
19 changes: 19 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -74,6 +74,17 @@ make test-all

Use `make docker-kill` to stop the containers.

### For more developer resources
- [Code Style][codestyle]
- [Deprecation][deprecation]
- [Logging][logging]
- [Metric Format Changes][metricformat]
- [Packaging][packaging]
- [Profiling][profiling]
- [Reviews][reviews]
- [Sample Config][sample config]

[cla]: https://www.influxdata.com/legal/cla/
[new issue]: https://github.com/influxdata/telegraf/issues/new/choose
@@ -82,3 +93,11 @@ Use `make docker-kill` to stop the containers.
[processors]: /docs/PROCESSORS.md
[aggregators]: /docs/AGGREGATORS.md
[outputs]: /docs/OUTPUTS.md
[codestyle]: /docs/developers/CODE_STYLE.md
[deprecation]: /docs/developers/DEPRECATION.md
[logging]: /docs/developers/LOGGING.md
[metricformat]: /docs/developers/METRIC_FORMAT_CHANGES.md
[packaging]: /docs/developers/PACKAGING.md
[profiling]: /docs/developers/PROFILING.md
[reviews]: /docs/developers/REVIEWS.md
[sample config]: /docs/developers/SAMPLE_CONFIG.md
7 changes: 7 additions & 0 deletions docs/developers/CODE_STYLE.md
@@ -0,0 +1,7 @@
# Code Style
Code must be formatted using `gofmt`, which covers most code style
requirements. It is also highly recommended to use `goimports` to
automatically order imports.

Please try to keep line lengths under 80 characters. The exact number of
characters is not strict, but the limit generally helps with readability.
88 changes: 88 additions & 0 deletions docs/developers/DEPRECATION.md
@@ -0,0 +1,88 @@
# Deprecation
Deprecation is the primary tool for making changes in Telegraf. A deprecation
indicates that the community should move away from using a feature, and
documents that the feature will be removed in the next major update (2.0).

Key to deprecation is that the feature remains in Telegraf and the behavior is
not changed.

We do not have a strict definition of a breaking change. All code changes
change behavior; whether to deprecate or to make the change immediately is
decided based on the impact.

## Deprecate plugins

Add a comment to the plugin's sample config that includes the deprecation
version and any replacement.

```toml
[[inputs.logparser]]
## DEPRECATED: The 'logparser' plugin is deprecated in 1.10. Please use the
## 'tail' plugin with the grok data_format as a replacement.
```

Add the deprecation warning to the plugin's README:

```markdown
# Logparser Input Plugin

### **Deprecated in 1.10**: Please use the [tail][] plugin along with the
`grok` [data formats][].

[tail]: /plugins/inputs/tail/README.md
[data formats]: /docs/DATA_FORMATS_INPUT.md
```

Log a warning message if the plugin is used. If the plugin is a
ServiceInput, place this in the `Start()` function. For regular Inputs, log it
only the first time the `Gather` function is called.
```go
log.Println("W! [inputs.logparser] The logparser plugin is deprecated in 1.10. " +
"Please use the tail plugin with the grok data_format as a replacement.")
```
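
One way to log the warning only on the first `Gather` call is to guard it with `sync.Once`. The following is a minimal sketch, not the project's prescribed method: the `Logger` interface, the `countingLogger` type, and the simplified `Gather()` signature (the real method takes an accumulator) are assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// Logger is a minimal stand-in for the plugin logger (assumption).
type Logger interface {
	Warnf(format string, args ...interface{})
}

// countingLogger is a hypothetical logger that records every warning.
type countingLogger struct {
	warnings []string
}

func (l *countingLogger) Warnf(format string, args ...interface{}) {
	l.warnings = append(l.warnings, fmt.Sprintf(format, args...))
}

// Logparser is a hypothetical regular (non-service) input plugin.
type Logparser struct {
	Log Logger

	warnOnce sync.Once // guards the one-time deprecation warning
}

// Gather emits the deprecation warning on the first call only.
// The signature is simplified for this sketch.
func (l *Logparser) Gather() error {
	l.warnOnce.Do(func() {
		l.Log.Warnf("The logparser plugin is deprecated in 1.10. " +
			"Please use the tail plugin with the grok data_format as a replacement.")
	})
	// ... collect metrics here ...
	return nil
}

func main() {
	logger := &countingLogger{}
	plugin := &Logparser{Log: logger}
	for i := 0; i < 3; i++ {
		_ = plugin.Gather()
	}
	fmt.Println(len(logger.warnings)) // prints 1
}
```

`sync.Once` is also safe if `Gather` is ever called concurrently, which a plain boolean flag would not be.
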
## Deprecate options

Mark the option as deprecated in the sample config, and include the
deprecation version and any replacement.
```toml
## Broker URL
## deprecated in 1.7; use the brokers option
# url = "amqp://localhost:5672/influxdb"
```

In the plugin's configuration struct, mention that the option is deprecated:

```go
type AMQPConsumer struct {
	URL string `toml:"url"` // deprecated in 1.7; use brokers
}
```

Finally, use the plugin's `Init() error` method to display a log message at warn level. The message should include the offending configuration option and any suggested replacement:
```go
func (a *AMQPConsumer) Init() error {
	// Warn when the deprecated option is set in the configuration.
	if a.URL != "" {
		a.Log.Warnf("Use of deprecated configuration: 'url'; please use the 'brokers' option")
	}
	return nil
}
```

## Deprecate metrics

In the README, document the metric as deprecated. If there is a replacement
field, tag, or measurement, mention it.

```markdown
- system
  - fields:
    - uptime_format (string, deprecated in 1.10: use `uptime` field)
```

Add filtering to the sample config, but leave it commented out.

```toml
[[inputs.system]]
## Uncomment to remove deprecated metrics.
# fielddrop = ["uptime_format"]
```
75 changes: 75 additions & 0 deletions docs/developers/LOGGING.md
@@ -0,0 +1,75 @@
# Logging

## Plugin Logging

You can access the Logger for a plugin by defining a field named `Log`. This
`Logger` is configured internally with the plugin name and alias so they do not
need to be specified for each log call.

```go
type MyPlugin struct {
	Log telegraf.Logger `toml:"-"`
}
```

You can then use this Logger in the plugin. Use the method corresponding to
the log level of the message.
```go
p.Log.Errorf("Unable to write to file: %v", err)
```

## Agent Logging

In other sections of the code, you must add the log level and module
manually:
```go
log.Printf("E! [agent] Error writing to %s: %v", output.LogName(), err)
```

## When to Log

Log a message if an error occurs but the plugin can continue working. For
example if the plugin handles several servers and only one of them has a fatal
error, it can be logged as an error.

Use logging judiciously for debug purposes. Since Telegraf does not currently
support setting the log level on a per-module basis, it is especially important
not to overdo it with debug logging.

If the plugin is listening on a socket, log a message with the address of the socket:
```go
p.Log.Infof("Listening on %s://%s", protocol, l.Addr())
```

## When not to Log

Don't use logging to emit performance data or other metadata about the plugin.
Instead, use the `internal` plugin and the `selfstats` package.

Don't log fatal errors in the plugin that require the plugin to return.
Instead, return them from the function and Telegraf will handle the logging.

Don't log static configuration errors. Check for them in the plugin's `Init()`
function and return an error there.
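
The guidance on static configuration errors can be sketched as follows. The `Listener` plugin and its `service_address` option are hypothetical names chosen for illustration; the point is only that `Init()` returns the error instead of logging it.

```go
package main

import (
	"errors"
	"fmt"
)

// Listener is a hypothetical plugin with one required option.
type Listener struct {
	ServiceAddress string `toml:"service_address"`
}

// Init validates static configuration and returns an error
// rather than logging it; the caller decides how to report it.
func (l *Listener) Init() error {
	if l.ServiceAddress == "" {
		return errors.New("'service_address' must be set")
	}
	return nil
}

func main() {
	bad := &Listener{}
	fmt.Println(bad.Init()) // prints: 'service_address' must be set

	good := &Listener{ServiceAddress: "tcp://127.0.0.1:8094"}
	fmt.Println(good.Init()) // prints: <nil>
}
```
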

Don't log a warning every time a plugin is called for situations that are
normal on some systems.

## Log Level

The log level is indicated by a single character at the start of the log
message. Adding this prefix is not required when using the Plugin Logger.
- `D!` Debug
- `I!` Info
- `W!` Warning
- `E!` Error

## Style

Log messages should be capitalized and be a single line.

If it includes data received from another system or process, such as the text
of an error message, the text should be quoted with `%q`.

Use the `%v` format for the Go error type instead of `%s` to ensure a nil error
is printed.
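
The effect of these verbs can be seen in a small standalone program. This sketch uses only the standard `fmt` package; the helper function names and message texts are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// formatWithV formats the error with %v, which handles a nil error cleanly.
func formatWithV(err error) string {
	return fmt.Sprintf("Error writing: %v", err)
}

// formatWithS formats the error with %s, which garbles a nil error.
func formatWithS(err error) string {
	return fmt.Sprintf("Error writing: %s", err)
}

// formatRemote quotes text received from another system with %q,
// which also escapes newlines so the message stays on a single line.
func formatRemote(text string) string {
	return fmt.Sprintf("Server replied: %q", text)
}

func main() {
	var nilErr error

	fmt.Println(formatWithV(nilErr))                  // Error writing: <nil>
	fmt.Println(formatWithS(nilErr))                  // Error writing: %!s(<nil>)
	fmt.Println(formatWithV(errors.New("disk full"))) // Error writing: disk full
	fmt.Println(formatRemote("bad\nrequest"))         // Server replied: "bad\nrequest"
}
```
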
42 changes: 42 additions & 0 deletions docs/developers/METRIC_FORMAT_CHANGES.md
@@ -0,0 +1,42 @@
# Metric Format Changes

When making changes to an existing input plugin, care must be taken not to change the metric format in ways that will cause trouble for existing users. This document helps developers understand how to make metric format changes safely.

## Changes can cause incompatibilities
If the metric format changes, data collected in the new format can be incompatible with data in the old format. Database queries designed around the old format may not work with the new format. This can cause application failures.

Some metric format changes don't cause incompatibilities. Also, some unsafe changes are necessary. How do you know what changes are safe and what to do if your change isn't safe?

## Guidelines
The main guideline is just to keep compatibility in mind when making changes. Often developers are focused on making a change that fixes their particular problem and they forget that many people use the existing code and will upgrade. When you're coding, keep existing users and applications in mind.

### Renaming, removing, reusing
Database queries refer to the metric and its tags and fields by name. Any Telegraf code change that changes those names has the potential to break an existing query. Similarly, removing tags or fields can break queries.

Changing the meaning of an existing tag value or field value or reusing an existing one in a new way isn't safe. Although queries that use these tags/fields may not break, they will not work as they did before the change.

Adding a field doesn't break existing queries. Queries that select all fields and/or tags (like "select * from") will return an extra series but this is often useful.

### Performance and storage
Time series databases can store large amounts of data but many of them don't perform well on high cardinality data. If a metric format change includes a new tag that holds high cardinality data, database performance could be reduced enough to cause existing applications not to work as they previously did. Metric format changes that dramatically increase the number of tags or fields of a metric can increase database storage requirements unexpectedly. Both of these types of changes are unsafe.

### Make unsafe changes opt-in
If your change has the potential to seriously affect existing users, the change must be opt-in. To do this, add a plugin configuration setting that lets the user select the metric format. Make the setting's default value select the old metric format. When new users add the plugin they can choose the new format and get its benefits. When existing users upgrade, their config files won't have the new setting so the default will ensure that there is no change.

When adding a setting, avoid using a boolean and consider instead a string or int for future flexibility. A boolean can only handle two formats but a string can handle many. For example, compare `use_new_format=true` and `features=["enable_foo_fields"]`; the latter is much easier to extend and still very descriptive.

If you want to encourage existing users to use the new format you can log a warning once on startup when the old format is selected. The warning should tell users in a gentle way that they can upgrade to a better metric format. If it doesn't make sense to maintain multiple metric formats forever, you can change the default on a major release or even remove the old format completely. See [Deprecation](/docs/developers/DEPRECATION.md) for details.
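
The opt-in approach described in this section could look roughly like the sketch below. Everything in it is an assumption for illustration: the `Example` plugin, the integer `metric_version` setting, and the field names are hypothetical. An unset value falls back to the old format, so existing configurations keep their behavior.

```go
package main

import "fmt"

// Example is a hypothetical input plugin with an opt-in metric format.
type Example struct {
	// MetricVersion selects the metric format. The zero value means the
	// setting is absent from the config, as it is for existing users.
	MetricVersion int `toml:"metric_version"`
}

// Init applies the default and rejects unknown versions.
func (e *Example) Init() error {
	switch e.MetricVersion {
	case 0:
		e.MetricVersion = 1 // unset: keep the old format
	case 1, 2:
		// explicitly selected, nothing to do
	default:
		return fmt.Errorf("unsupported metric_version %d", e.MetricVersion)
	}
	return nil
}

// fieldName returns the (hypothetical) field name for the selected format.
func (e *Example) fieldName() string {
	if e.MetricVersion == 2 {
		return "value_seconds"
	}
	return "value"
}

func main() {
	existing := &Example{} // upgraded config without the new setting
	_ = existing.Init()
	fmt.Println(existing.fieldName()) // prints: value

	optedIn := &Example{MetricVersion: 2}
	_ = optedIn.Init()
	fmt.Println(optedIn.fieldName()) // prints: value_seconds
}
```

An integer leaves room for a later version 3 without adding another setting, which a boolean could not offer.
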

### Utility
Changes should be useful to many or most users. A change that is only useful for a small number of users may not be accepted, even if it's off by default.

## Summary table

| | delete | rename | add |
| ------- | ------ | ------ | --- |
| metric | unsafe | unsafe | safe |
| tag | unsafe | unsafe | be careful with cardinality |
| field | unsafe | unsafe | ok as long as it's useful for existing users and is worth the added space |

## References
InfluxDB Documentation: "Schema and data layout"
44 changes: 44 additions & 0 deletions docs/developers/PACKAGING.md
@@ -0,0 +1,44 @@
# Packaging

## Package using Docker

This packaging method uses the CI images, and is very similar to how the
official packages are created on release. This is the recommended method for
building the rpm/deb as it is less system dependent.

Pull the CI images from quay. The version corresponds to the version of Go
that is used to build the binary:
```
docker pull quay.io/influxdb/telegraf-ci:1.9.7
```

Start a shell in the container:
```
docker run -ti quay.io/influxdb/telegraf-ci:1.9.7 /bin/bash
```

From within the container:
```
go get -d github.com/influxdata/telegraf
cd /go/src/github.com/influxdata/telegraf
# Use tag of Telegraf version you would like to build
git checkout release-1.10
git reset --hard 1.10.2
make deps
# This builds _all_ platforms and architectures; will take a long time
./scripts/build.py --release --package
```

If you would like to only build a subset of the packages run this:

```
# Use the platform and arch arguments to skip unwanted packages:
./scripts/build.py --release --package --platform=linux --arch=amd64
```

From the host system, copy the build artifacts out of the container. Replace
`romantic_ptolemy` with the name of your container, as shown by `docker ps`:
```
docker cp romantic_ptolemy:/go/src/github.com/influxdata/telegraf/build/telegraf-1.10.2-1.x86_64.rpm .
```
55 changes: 55 additions & 0 deletions docs/developers/PROFILING.md
@@ -0,0 +1,55 @@
# Profiling
This article describes how to collect performance traces and memory profiles
from Telegraf. If you are submitting this for an issue, please include the
version.txt generated below.

Use the `--pprof-addr` option to enable the profiler. The easiest way to do
this may be to add this line to `/etc/default/telegraf`:
```
TELEGRAF_OPTS="--pprof-addr localhost:6060"
```

Restart Telegraf to activate the profile address.

#### Trace Profile
Collect a trace while the performance issue is occurring. This example
collects a 10-second trace:
```
curl 'http://localhost:6060/debug/pprof/trace?seconds=10' > trace.bin
telegraf --version > version.txt
go env GOOS GOARCH >> version.txt
```

The `trace.bin` and `version.txt` files can be sent in for analysis or, if desired, you can
analyze the trace with:
```
go tool trace trace.bin
```

#### Memory Profile
Collect a heap memory profile:
```
curl 'http://localhost:6060/debug/pprof/heap' > mem.prof
telegraf --version > version.txt
go env GOOS GOARCH >> version.txt
```

Analyze:
```
go tool pprof mem.prof
(pprof) top5
```

#### CPU Profile
Collect a 30s CPU profile:
```
curl 'http://localhost:6060/debug/pprof/profile' > cpu.prof
telegraf --version > version.txt
go env GOOS GOARCH >> version.txt
```

Analyze:
```
go tool pprof cpu.prof
(pprof) top5
```