Is this a duplicate?
Type of Bug
Compile-time Error
Component
CUDA Experimental (cudax)
Describe the bug
While the usual idiom is to pass the lambda function as an rvalue, like this:

ctx.parallel_for(shape, deps...)->*[](size_t i, auto a, auto b) {
    ...
};

we may want to pass an lvalue instead:

auto fn = [](size_t i, auto a, auto b) {
    ...
};
ctx.parallel_for(shape, deps...)->*fn;

However, STF currently assumes the callable is an rvalue and performs invalid operations such as moving it instead of forwarding it. The mechanisms used to differentiate host, device, and host/device lambdas also do not work as expected, resulting in compilation errors or runtime bugs because the appropriate parallel_for implementation is not called.
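For context, here is a minimal standalone sketch (not CUDASTF code; the launch helper and its signature are hypothetical stand-ins for the ->* sink) of why a deduced callable must be forwarded rather than unconditionally moved:

#include <utility>

// Hypothetical helper that stores the callable by value (illustration only).
template <typename Fun>
void launch(Fun&& f)
{
    // Writing `auto stored = std::move(f);` would treat f as an rvalue even
    // when the caller passed an lvalue lambda, moving from an object the
    // caller still owns. Forwarding preserves the caller's value category:
    // an lvalue is copied, an rvalue is moved.
    auto stored = std::forward<Fun>(f);
    stored(size_t{0});
}

int main()
{
    auto fn = [](size_t i) { /* ... */ };
    launch(fn);                          // lvalue: must be copied, not moved from
    launch([](size_t i) { /* ... */ });  // rvalue: may be moved
}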
How to Reproduce

auto fn = [](size_t i, auto a, auto b) {
    ...
};
ctx.parallel_for(shape, deps...)->*fn;
Expected behavior
Passing the lambda as an lvalue should behave the same as passing it as an rvalue.
Reproduction link
No response
Operating System
No response
nvidia-smi output
No response
NVCC version
No response