Great work putting this together!
I am trying to run PRNN on a GeForce RTX 2080 (46 SMs, compute capability 7.5).
I've tried the following tile sizes:
TileConfig<24, 1152, 1152, 192, 288, 6, 36, direction, T>
TileConfig<32, 1024, 1024, 128, 256, 4, 32, direction, T>
TileConfig<32, 1024, 1024, 64, 512, 1, 32, direction, T>
TileConfig<40, 640, 640, 80, 128, 5, 4, direction, T>
When I run the benchmark with any of these, using batchsize=4, timesteps=20, and the maximum layer size for each tile configuration, the best forward-pass throughput I get is 0.00478542 TFLOP/s.
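For scale, here is a rough sanity check of how much arithmetic that workload contains. It is only a sketch under two assumptions: that the benchmark counts just the recurrent matrix multiply (2 FLOPs per weight, per batch element, per timestep), and that the second `TileConfig` parameter (1152 in the first configuration) is the layer size. The helper names `recurrent_flops` and `implied_seconds` are hypothetical and not part of PRNN.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: only the recurrent matrix multiply is counted, i.e. one
// multiply and one add per weight, per batch element, per timestep.
// The real benchmark may count FLOPs differently.
double recurrent_flops(std::int64_t layer_size, std::int64_t batch,
                       std::int64_t timesteps) {
    assert(layer_size > 0 && batch > 0 && timesteps > 0);
    return 2.0 * static_cast<double>(layer_size)
               * static_cast<double>(layer_size)
               * static_cast<double>(batch)
               * static_cast<double>(timesteps);
}

// Wall-clock time implied by a reported throughput given in TFLOP/s.
double implied_seconds(double flops, double tflop_per_s) {
    assert(tflop_per_s > 0.0);
    return flops / (tflop_per_s * 1e12);
}
```

Under those assumptions, `recurrent_flops(1152, 4, 20)` is 212,336,640 FLOPs (about 0.21 GFLOP), and `implied_seconds` at 0.00478542 TFLOP/s comes out to roughly 44 ms for the whole forward run.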
Are the tile sizes inappropriate, or is the issue something else?
Thank you.