I believe we should be able to do baseline variants of scan kernels for OpenMP and OpenMP target (as of OpenMP 5.0). Is there a reasonable way to do baseline variants of scan kernels for GPU back-ends?
Good call @rhornung67. Scan was added in OpenMP 5.0, so we can conditionally add a baseline for OpenMP. We can use CUB and rocPRIM for the base variants of the CUDA and HIP scan kernels.
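For reference, here is a rough sketch of what an OpenMP 5.0 scan baseline could look like, using the `scan` directive with an `inscan` reduction. The function names `inclusive_scan_omp` and `exclusive_scan_omp` are made up for illustration and are not existing perf suite code. It also assumes a compiler that supports the OpenMP 5.0 `scan` directive; a real variant would need to be gated on that support at configure time. Built without OpenMP enabled, the pragmas are ignored and the loops run sequentially with the same result.

```cpp
#include <cassert>
#include <vector>

// Hypothetical baseline: inclusive prefix sum via the OpenMP 5.0 scan directive.
// out[i] = in[0] + ... + in[i]
void inclusive_scan_omp(const std::vector<long long>& in,
                        std::vector<long long>& out)
{
  assert(out.size() == in.size());
  const long n = static_cast<long>(in.size());
  long long sum = 0;

  #pragma omp parallel for reduction(inscan, +: sum)
  for (long i = 0; i < n; ++i) {
    sum += in[i];                     // input phase: update the scan variable
    #pragma omp scan inclusive(sum)
    out[i] = sum;                     // scan phase: read the prefix result
  }
}

// Hypothetical baseline: exclusive prefix sum.
// out[i] = in[0] + ... + in[i-1], with out[0] = 0
void exclusive_scan_omp(const std::vector<long long>& in,
                        std::vector<long long>& out)
{
  assert(out.size() == in.size());
  const long n = static_cast<long>(in.size());
  long long sum = 0;

  #pragma omp parallel for reduction(inscan, +: sum)
  for (long i = 0; i < n; ++i) {
    out[i] = sum;                     // scan phase comes first for exclusive
    #pragma omp scan exclusive(sum)
    sum += in[i];                     // input phase: update the scan variable
  }
}
```

Integer data is used here so the result does not depend on summation order; with floating-point data the parallel result can differ slightly from a sequential scan.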
List of kernels to add to the perf suite:
Reduction-only kernels. See #114 (Add reduction only kernels) and #179 (Add Tests to Perf Suite).
More atomic kernels. See #115 (Add element-wise atomic kernels) and #179 (Add Tests to Perf Suite).
Variants of kernels using RAJA teams, to compare against existing kernels that use the RAJA::kernel interface.
Variants of kernels using vectorization API when the dust settles on it.
More kernels from the PolyBench suite that we can't support with RAJA::kernel but can do with RAJA teams, such as triangular matrix operations.
Kernels that call functions in different translation units