
feat: otel prometheus update #198

Open · wants to merge 4 commits into base: main
Conversation

@3benbox 3benbox commented Jul 3, 2024

  • update opentelemetry image for prometheusremotewrite exporter
  • update prometheus image for remote write ability
  • deploy prometheus service for opentelemetry remote write
  • move configmap data to yaml file for easier operations
  • remove extra opentelemetry exporter for prometheus/simulations (it didn't appear to be used)

Prometheus no longer scrapes anything; metrics are pushed to it from OpenTelemetry.
This should reduce duplicate metrics and provide a single collector that can forward to different exporters as required.
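The push path described above can be sketched roughly as follows. This is a hypothetical minimal collector-side config, not taken verbatim from this PR; the `prometheus` service name, port, and `otlp` receiver are illustrative assumptions:

```yaml
# Hypothetical sketch; service name, port, and receiver are assumptions.
exporters:
  prometheusremotewrite:
    endpoint: http://prometheus:9090/api/v1/write

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheusremotewrite]
```

Note that Prometheus only accepts pushed samples when the remote write receiver is enabled (the `--web.enable-remote-write-receiver` flag on recent releases), which is presumably what the "update prometheus image for remote write ability" bullet covers.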

@3benbox 3benbox requested review from samika98 and nathanielc July 3, 2024 18:56
@@ -0,0 +1,4 @@
global:
  scrape_interval: 10s
  scrape_timeout: 5s
Contributor

These seem a bit aggressive to me. Do we need near real-time metrics for our use cases? Can we scrape every 30 seconds to a minute?

Contributor Author

I just pulled these settings along from the existing config.
Want to drop them down to the defaults?

From the [Prometheus configuration docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#duration):

```
# How frequently to scrape targets by default.
[ scrape_interval: <duration> | default = 1m ]

# How long until a scrape request times out.
[ scrape_timeout: <duration> | default = 10s ]
```

Contributor

If we don't have a reason to scrape aggressively then we can save resources by setting defaults
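Concretely, dropping back to the documented defaults would mean the `global` block could be omitted entirely, or written out explicitly as:

```yaml
global:
  scrape_interval: 1m   # Prometheus default
  scrape_timeout: 10s   # Prometheus default
```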

action: replace
target_label: kubernetes_container

processors:
Contributor

Can we filter out duplicates while processing?

Contributor Author

We can if necessary; see this issue describing the situation:

open-telemetry/opentelemetry-collector-contrib#14900
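One option along these lines would be dropping known-duplicated series in the collector with the contrib `filter` processor. A hypothetical sketch, assuming the duplicated metric names are known in advance (the metric name below is a placeholder, not a real metric from this deployment):

```yaml
processors:
  filter/dedup:
    metrics:
      exclude:
        match_type: strict
        metric_names:
          - duplicated_metric_name  # placeholder; not a real metric here
```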

@samika98 samika98 left a comment (Contributor)

Looks good to me! I have a question/concern about having both push- and pull-based metrics in the system.
