Is this a duplicate?
Type of Bug
Compile-time Error
Component
CUDA Experimental (cudax)
Describe the bug
While the usual idiom is to pass the lambda function as an rvalue, like this:

ctx.parallel_for(shape, deps...)->*[](size_t i, auto a, auto b) {
    ...
};

we may want to pass an lvalue instead:

auto fn = [](size_t i, auto a, auto b) {
    ...
};
ctx.parallel_for(shape, deps...)->*fn;

However, STF currently assumes the callable is an rvalue and performs invalid operations such as moving it instead of forwarding it. The mechanisms used to differentiate host, device, and host/device lambdas also do not work as expected, resulting in compilation errors or runtime bugs because the appropriate parallel_for implementation is not called.
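For context, here is a minimal standalone sketch (not CUDASTF code; the launch helper and its signature are hypothetical stand-ins for the ->* sink) of why a deduced callable must be forwarded rather than unconditionally moved:

#include <utility>

// Hypothetical helper that stores the callable by value (illustration only).
template <typename Fun>
void launch(Fun&& f)
{
    // Writing `auto stored = std::move(f);` would treat f as an rvalue even
    // when the caller passed an lvalue lambda, moving from an object the
    // caller still owns. Forwarding preserves the caller's value category:
    // an lvalue is copied, an rvalue is moved.
    auto stored = std::forward<Fun>(f);
    stored(size_t{0});
}

int main()
{
    auto fn = [](size_t i) { /* ... */ };
    launch(fn);                          // lvalue: must be copied, not moved from
    launch([](size_t i) { /* ... */ });  // rvalue: may be moved
}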
How to Reproduce

auto fn = [](size_t i, auto a, auto b) {
    ...
};
ctx.parallel_for(shape, deps...)->*fn;
Expected behavior
Passing the lambda as an lvalue should behave the same as passing it as an rvalue.
Reproduction link
No response
Operating System
No response
nvidia-smi output
No response
NVCC version
No response