Skip to content

A lightweight, cloud-native data transfer agent and aggregator

License

Notifications You must be signed in to change notification settings

LeiZhang-Hunter/loggie

 
 

Repository files navigation

go.dev reference CII Best Practices dockerHub

English | 中文

Loggie is a lightweight, high-performance, cloud-native agent and aggregator based on Golang.

  • Supports multiple pipeline and pluggable components, including data transfer, filtering, parsing, and alerting.
  • Uses native Kubernetes CRD for operation and management.
  • Offers a range of observability, reliability, and automation features suitable for production environments.

Based on Loggie, we can build a cloud-native scalable log data platform.

Features

Next-generation cloud-native log collection and transmission

Building pipelines based on CRD

Loggie includes LogConfig/ClusterLogConfig/Interceptor/Sink CRDs, allowing for the creation of data collection, transfer, processing, and sending pipelines through simple YAML file creation.

eg:

apiVersion: loggie.io/v1beta1
kind: LogConfig
metadata:
  name: tomcat
  namespace: default
spec:
  selector:
    type: pod
    labelSelector:
      app: tomcat
  pipeline:
    sources: |
      - type: file
        name: common
        paths:
          - stdout
          - /usr/local/tomcat/logs/*.log
    sinkRef: default
    interceptorRef: default

crd-usage

Multiple architectures

  • Agent: Deployed via DaemonSet, Loggie can collect log files without the need for containers to mount volumes.

  • Sidecar: Supports non-intrusive auto-injection of Loggie sidecars, without the need to manually add them to the Deployment/StatefulSet templates.

  • Aggregator: Supports deployment as an independent intermediate machine, which can receive aggregated data sent by Loggie Agent and can also be used to consume and process various data sources.

But regardless of the deployment architecture, Loggie still maintains a simple and intuitive internal design.

High Performance

Benchmark

Configure Filebeat and Loggie to collect logs, and send them to a Kafka topic without using client compression, with the Kafka topic partition configured as 3.

With sufficient resources for the Agent specification, modify the number of files collected, the concurrency of the sending client (configure Filebeat worker and Loggie parallelism), and observe their respective CPU, memory, and pod network card transmission rates.

Agent File Size File Count Sink Concurrency CPU MEM (rss) Transmission Rates
Filebeat 3.2G 1 3 7.5~8.5c 63.8MiB 75.9MiB/s
Filebeat 3.2G 1 8 10c 65MiB 70MiB/s
Filebeat 3.2G 10 8 11c 65MiB 80MiB/s
Loggie 3.2G 1 3 2.1c 60MiB 120MiB/s
Loggie 3.2G 1 8 2.4c 68.7MiB 120MiB/s
Loggie 3.2G 10 8 3.5c 70MiB 210MiB/s

Adaptive Sink Concurrency

With sink concurrency configuration enabled, Loggie can:

  • Automatically adjust the downstream data sending parallelism based on the actual downstream data response, making full use of the downstream server's performance without affecting its performance.
  • Adjust the downstream data sending speed appropriately when upstream data collection is blocked to relieve upstream blocking.

Lightweight Streaming Data Analysis and Monitoring

Logs are a universal data type and are not related to platforms or systems. How to better utilize this data is the core capability that Loggie focuses on and develops.

Real-time parsing and transformation

With the configuration of transformer interceptors and the configuration of functional actions, Loggie can achieve:

  • Parsing of various data formats (json, grok, regex, split, etc.)
  • Conversion of various fields (add, copy, move, set, del, fmt, etc.)
  • Support for conditional judgment and processing logic (if, else, return, dropEvent, ignoreError, etc.)

eg:

interceptors:
  - type: transformer
    actions:
      - action: regex(body)
        pattern: (?<ip>\S+) (?<id>\S+) (?<u>\S+) (?<time>\[.*?\]) (?<url>\".*?\") (?<status>\S+) (?<size>\S+)
      - if: equal(status, 404)
        then:
          - action: add(topic, not_found)
          - action: return()
      - if: equal(status, 500)
        then:
          - action: dropEvent()

Detection, recognition, and alerting

Helps you quickly detect potential problems and anomalies in the data and issue timely alerts. Support custom webhooks to connect to various alert channels.

Supports matching methods such as:

  • No data: no log data generated within the configured time period.
  • Fuzzy matching
  • Regular expression matching
  • Conditional judgment
    • Field comparison: equal/less/greater...

Log data aggregation and monitoring

Often, metric data is not only exposed through prometheus exporters, but log data itself can also provide a source of metrics. For example, by counting the access logs of a gateway, you can calculate the number of 5xx or 4xx status codes within a certain time interval, aggregate the qps of a certain interface, and calculate the total amount of body data, etc.

eg:

- type: aggregator
  interval: 1m
  select:
    # operator:COUNT/COUNT-DISTINCT/SUM/AVG/MAX/MIN
    - {key: amount, operator: SUM, as: amount_total}
    - {key: quantity, operator: SUM, as: qty_total}
  groupBy: ["city"]
  calculate:
    - {expression: " ${amount_total} / ${qty_total} ", as: avg_amount}

Observability and fast troubleshooting

  • Loggie provides configurable and rich metrics, and dashboards that can be imported into Grafana with one click.

  • Quickly troubleshoot Loggie itself and any problems in data transmission by Loggie terminal.

FAQs

Loggie vs Filebeat/Fluentd/Logstash/Flume

Loggie Filebeat Fluentd Logstash Flume
Language Golang Golang Ruby JRuby Java
Multiple Pipelines single queue single queue
Multiple output one output copy
Aggregator
Log Alarm
Kubernetes container log collection support container stdout and logs files in container stdout stdout
Configuration delivery through CRD manual manual manual manual
Monitoring support Prometheus metrics,and can be configured to output indicator log files separately, sending metrics, etc. prometheus metrics need exporter need exporter
Resource Usage low low average high high

RoadMap

RoadMap 2023

License

Apache-2.0

Contributions

Pull requests, comments and suggestions are welcome.

Please see CONTRIBUTING.md for more information.

About

A lightweight, cloud-native data transfer agent and aggregator

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Go 99.5%
  • Other 0.5%