forked from pytorch/FBGEMM
Commit
Adding FP 32 SLS, and unifying it with 8 Bit SLS (pytorch#206)
Summary:
Pull Request resolved: pytorch#206

This adds support for 32 bit indices to JITed FP 32 SLS Op. This diff includes the following features of SLS:
1. Normalize by lengths
2. Modified prefetch distances for avx2 vs. avx512
3. Adds support for 32 bit indices
4. Has support for weighted SLS, and supports positional weights
5. Does not specialize for blocksize 1 for avx512, as this reorders reduction

Reviewed By: jspark1105

Differential Revision: D18210640

fbshipit-source-id: f9b4de5707a59cae5d34cb898c0cf52bc5f2a91f
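For readers unfamiliar with the operator, the following is a minimal scalar sketch of what an SLS (SparseLengthsSum) computes, covering the features listed in the summary: 32 bit indices, optional per-index or positional weights, and normalization by lengths. It is not the JITed kernel from this commit and does not reflect FBGEMM's actual API. The function name `sls_reference`, its parameter list, and the row-major table layout are assumptions made for illustration only. Positional weights are assumed here to be indexed by position within each segment rather than by position in the flat index array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference routine (NOT FBGEMM's API).
// table:   num_rows x block_size floats, row-major (assumed layout).
// indices: 32 bit row indices, concatenated over all output segments.
// lengths: number of indices belonging to each output segment.
// weights: nullptr for plain SLS; otherwise one weight per index, or,
//          when positional_weights is true, one weight per position
//          inside a segment (assumption for this sketch).
// Returns false on an out-of-range index or inconsistent lengths.
bool sls_reference(
    const std::vector<float>& table,
    std::size_t block_size,
    const std::vector<std::int32_t>& indices,
    const std::vector<int>& lengths,
    const float* weights,
    bool positional_weights,
    bool normalize_by_lengths,
    std::vector<float>& out) {
  assert(block_size > 0);
  const std::size_t num_rows = table.size() / block_size;
  out.assign(lengths.size() * block_size, 0.0f);

  std::size_t pos = 0;  // running offset into indices
  for (std::size_t seg = 0; seg < lengths.size(); ++seg) {
    const int len = lengths[seg];
    if (len < 0 || pos + static_cast<std::size_t>(len) > indices.size()) {
      return false;
    }
    float* dst = out.data() + seg * block_size;

    for (int i = 0; i < len; ++i) {
      const std::int32_t idx = indices[pos + i];
      if (idx < 0 || static_cast<std::size_t>(idx) >= num_rows) {
        return false;
      }
      float w = 1.0f;
      if (weights != nullptr) {
        w = positional_weights ? weights[i] : weights[pos + i];
      }
      const float* row = table.data() + static_cast<std::size_t>(idx) * block_size;
      // Accumulate the (optionally weighted) row into the output segment.
      for (std::size_t j = 0; j < block_size; ++j) {
        dst[j] += w * row[j];
      }
    }

    // Normalize by lengths: turn the sum into a mean for non-empty segments.
    if (normalize_by_lengths && len > 0) {
      const float scale = 1.0f / static_cast<float>(len);
      for (std::size_t j = 0; j < block_size; ++j) {
        dst[j] *= scale;
      }
    }
    pos += static_cast<std::size_t>(len);
  }
  // Every index must have been consumed by some segment.
  return pos == indices.size();
}
```

The scalar loop accumulates rows in index order. A vectorized kernel may group additions differently, which is the kind of reduction reordering the summary's point 5 is concerned with, so floating-point results can differ in the last bits between implementations.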
1 parent 10eff7b, commit 81cc0a1
Showing 10 changed files with 1,006 additions and 199 deletions.