Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in per-operation cancellation in variant_stream #454

Closed
r33s3n6 opened this issue Feb 25, 2025 · 2 comments · Fixed by #457
Closed

Race condition in per-operation cancellation in variant_stream #454

r33s3n6 opened this issue Feb 25, 2025 · 2 comments · Fixed by #457
Labels
bug Something isn't working

Comments

@r33s3n6
Copy link

r33s3n6 commented Feb 25, 2025

Description

I've encountered issues with ineffective cancellation when executing long-running SQL queries. While not consistently reproducible, this issue appears to occur under certain conditions with any SQL query.

Steps to Reproduce

The issue occurs when executing the following code, with all tasks running on a single io_context on a single thread:

auto [ec] = co_await conn.async_execute(
    sql, mysql_result, diag, asio::cancel_after(timeout, asio::as_tuple(asio::deferred)));

In my testing, if the SQL query is slow and the result set is large, the cancellation timer is more likely to be triggered in the following scenario:

  1. An async_read_some() operation completes.
  2. The completion token for async_read_some() is queued but has not yet been executed.
  3. The timer handler executes but fails to cancel anything, as no outstanding asynchronous operation is active at that moment.
  4. The async_read_some() completion token is then executed, calling run_algo_op, which does not properly account for this situation.

Possible Cause and Fix

It seems necessary to check self.cancelled() != asio::cancellation_type::none within each loop iteration to ensure proper cancellation handling. This issue affects not only run_algo_op, but also all state machines initiated by asio::async_compose (e.g., variant_stream::connect_op).

Any insights or potential fixes would be greatly appreciated. Thanks!

@r33s3n6
Copy link
Author

r33s3n6 commented Feb 25, 2025

Sorry for overlooking #199 earlier; however, could it be that variant_stream::connect_op also needs a fix?

@anarthal
Copy link
Collaborator

Your diagnostics are absolutely correct. It looks like I overlooked the variant_stream part, thanks for noticing.

@anarthal anarthal changed the title Ineffective cancellation handling Race condition in per-operation cancellation in variant_stream Feb 25, 2025
@anarthal anarthal added the bug Something isn't working label Feb 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants