Performance: pipeline rate always lower than the source rate #571

Open
hariso opened this issue Aug 15, 2022 · 2 comments
Labels
bug Something isn't working

Comments

@hariso
Contributor

hariso commented Aug 15, 2022

Bug description

It appears that the pipeline rate is always lower than the generator source rate. For example, when the source generates records at 15k msg/s, the pipeline rate (i.e. the number of acknowledged records flowing through the pipeline per second) is around 10k msg/s, which gives the impression that 10k msg/s is the most Conduit can handle.

However, in the 10k msg/s test, the pipeline rate is around 6700 msg/s.
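
To make the comparison concrete, here is a minimal sketch that computes the relative shortfall for the two data points reported above. The helper name `relative_drift` is hypothetical and not part of Conduit or the benchmark scripts; the rates are the approximate figures quoted in this issue.

```python
def relative_drift(source_rate: float, pipeline_rate: float) -> float:
    """Fraction of the source rate that the pipeline rate falls short by."""
    if source_rate <= 0:
        raise ValueError("source_rate must be positive")
    return (source_rate - pipeline_rate) / source_rate


# Approximate rates (msg/s) quoted in this issue: (source, pipeline).
reported = {
    "15k msg/s workload": (15_000, 10_000),
    "10k msg/s workload": (10_000, 6_700),
}

for name, (source, pipeline) in reported.items():
    print(f"{name}: drift = {relative_drift(source, pipeline):.1%}")
```

Both workloads come out at roughly a third (33.3% and 33.0%), which suggests the shortfall scales with the source rate rather than hitting a fixed ceiling at 10k msg/s.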

Steps to reproduce

  1. Run https://github.com/ConduitIO/streaming-benchmarks/blob/haris/ec2-helpers/workloads/small-messages-15k-msg-per-sec.sh
  2. Run https://github.com/ConduitIO/streaming-benchmarks/blob/haris/ec2-helpers/workloads/small-messages-10k-msg-per-sec.sh
  3. Compare results

Version

{ "version": "v0.3.0-nightly.20220811", "os": "linux", "arch": "amd64" }

@hariso hariso added bug Something isn't working triage Needs to be triaged labels Aug 15, 2022
@meroxa-machine meroxa-machine moved this to Triage in Conduit Main Aug 15, 2022
@neovintage neovintage removed the status in Conduit Main Aug 15, 2022
@neovintage
Contributor

At lower rates, the throughput measured on the source side matches what is happening inside Conduit. Only at the higher rates (15k or 10k msg/s) do the source connector and Conduit start to drift apart. That drift is usually about 30%.

@hariso
Contributor Author

hariso commented Sep 23, 2022

At lower rates, the throughput measured on the source side matches what is happening inside Conduit. Only at the higher rates (15k or 10k msg/s) do the source connector and Conduit start to drift apart. That drift is usually about 30%.

As you suggested, I added two more workloads (1k msg/s and 5k msg/s). For the 1k msg/s workload, the drift is around 30%, but for the 5k msg/s workload, it's much bigger.
