forked from pytorch/FBGEMM
fp16 gemm using avx512 (pytorch#137)
Summary:
Pull Request resolved: pytorch#137

fp16 GEMM was not using avx512 and was falling behind fp32 performance for large m cases. This diff enables avx512 for fp16 GEMM. Further tuning of the register blocking size may be needed. Longer term, we would also need to use JIT'ing for fp16.

Reviewed By: jianyuh

Differential Revision: D17786712

fbshipit-source-id: bebf8723d03db7e128097310745a8103b712ee06
1 parent 8786c08, commit 82d259d
Showing 9 changed files with 3,306 additions and 479 deletions.