Update the OpenCL kernel to run with 128/256 threads/block, based on the equivalent CUDA kernel; see commit f2b9db2 (gromacs@f2b9db2) on the master branch of the main GROMACS repository.
Evaluate the performance of the new kernel for AMD and NVIDIA GPUs and decide on the final version or versions of the OpenCL kernel that will be used.
As this will increase register pressure, I suggest trying 128 threads/block too. Additionally, reduction will become tricky without the lane-shuffle ops.
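To illustrate the reduction concern: without lane-shuffle operations, the usual fallback is a tree reduction through local memory with a barrier between passes. The sketch below is a serial, host-side C emulation of that pattern, not the GROMACS kernel code; the function name `tree_reduce` and the power-of-two size requirement are assumptions made for illustration.

```c
#include <stddef.h>

/* Serial emulation of a local-memory tree reduction (hypothetical helper).
 * In an actual OpenCL kernel, `buf` would be a __local array, the inner
 * loop over `lid` would be the work-items executing in parallel, and each
 * pass of the outer loop would end with barrier(CLK_LOCAL_MEM_FENCE).
 * Assumes n is a power of two, e.g. 128 or 256. */
static float tree_reduce(float *buf, size_t n)
{
    for (size_t stride = n / 2; stride > 0; stride /= 2) {
        /* Each "work-item" lid < stride adds in its partner's value. */
        for (size_t lid = 0; lid < stride; lid++) {
            buf[lid] += buf[lid + stride];
        }
        /* A kernel would need a work-group barrier here before the next pass. */
    }
    return buf[0]; /* the sum of all n inputs ends up in element 0 */
}
```

Each halving pass costs a local-memory round trip plus a barrier, which is the overhead that shuffle-based reductions avoid in the CUDA kernel.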