128/256 threads/block #92

Open
ancahamuraru opened this issue Mar 31, 2015 · 2 comments

Comments

@ancahamuraru

Update the OpenCL kernel to use 128/256 threads/block, based on the equivalent CUDA kernel; see commit f2b9db2 from the main Gromacs master branch: gromacs@f2b9db2

Evaluate the performance of the new kernel for AMD and NVIDIA GPUs and decide on the final version or versions of the OpenCL kernel that will be used.

@pszi1ard

pszi1ard commented Apr 1, 2015

As this will increase register pressure, I suggest trying 128 threads/block too. Additionally, reduction will become tricky without the lane-shuffle ops.

@ancahamuraru ancahamuraru changed the title 256 threads/block 128/256 threads/block Apr 2, 2015
@ancahamuraru
Author

Thanks for the comment. That was my mistake: I forgot to mention 128 threads/block.
The issue title and description have now been updated.


2 participants