forked from torvalds/linux
Add support for the hardware version of the Hamming weight function, popcnt, present in CPUs which advertise it under CPUID, Function 0x0000_0001_ECX[23]. On CPUs which don't support it, we fall back to the default lib/hweight.c sw versions.

A synthetic benchmark comparing popcnt with __sw_hweight64 showed almost a 3x speedup on a F10h machine.

Signed-off-by: Borislav Petkov <[email protected]>
LKML-Reference: <20100318112015.GC11152@aftab>
Signed-off-by: H. Peter Anvin <[email protected]>
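The detect-then-fall-back idea in the commit message can be sketched in user space. This is only an illustration, not the kernel mechanism: the patch uses boot-time `ALTERNATIVE` patching, whereas the sketch below checks CPUID on every call. The names `sw_hweight32`, `hw_hweight32`, `cpu_has_popcnt` and `hweight32` are hypothetical, the software routine is a generic bit-parallel count rather than a copy of lib/hweight.c, and the code assumes GCC or Clang (for `<cpuid.h>` and extended inline asm).

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper header providing __get_cpuid() */
#endif

/* Portable bit-parallel population count, used as the software fallback. */
static unsigned int sw_hweight32(uint32_t w)
{
	w = w - ((w >> 1) & 0x55555555u);                 /* 2-bit sums */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

/* Returns nonzero if CPUID leaf 1 reports POPCNT in ECX bit 23. */
static int cpu_has_popcnt(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ecx >> 23) & 1;
#else
	return 0;	/* not x86: always use the software path */
#endif
}

/* Hardware path; only meant to be called when cpu_has_popcnt() is true. */
static unsigned int hw_hweight32(uint32_t w)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t res;

	/* AT&T operand order: source first, destination second. */
	__asm__("popcnt %1, %0" : "=r"(res) : "r"(w) : "cc");
	return res;
#else
	return sw_hweight32(w);
#endif
}

/* Runtime dispatch: hardware popcnt if advertised, software otherwise. */
static unsigned int hweight32(uint32_t w)
{
	return cpu_has_popcnt() ? hw_hweight32(w) : sw_hweight32(w);
}
```

Calling `hweight32(x)` gives the same result on either path. A real implementation would cache the CPUID result, or, as this patch does, remove the check entirely by patching the instruction stream once at boot.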
Borislav Petkov authored and H. Peter Anvin committed Apr 6, 2010
1 parent 1527bc8, commit d61931d
Showing 8 changed files with 108 additions and 18 deletions.
@@ -0,0 +1,59 @@
#ifndef _ASM_X86_HWEIGHT_H
#define _ASM_X86_HWEIGHT_H

#ifdef CONFIG_64BIT
/* popcnt %rdi, %rax */
#define POPCNT ".byte 0xf3,0x48,0x0f,0xb8,0xc7"
#define REG_IN "D"
#define REG_OUT "a"
#else
/* popcnt %eax, %eax */
#define POPCNT ".byte 0xf3,0x0f,0xb8,0xc0"
#define REG_IN "a"
#define REG_OUT "a"
#endif

/*
 * __sw_hweightXX are called from within the alternatives below
 * and callee-clobbered registers need to be taken care of. See
 * ARCH_HWEIGHT_CFLAGS in <arch/x86/Kconfig> for the respective
 * compiler switches.
 */
static inline unsigned int __arch_hweight32(unsigned int w)
{
	unsigned int res = 0;

	asm (ALTERNATIVE("call __sw_hweight32", POPCNT, X86_FEATURE_POPCNT)
		: "="REG_OUT (res)
		: REG_IN (w));

	return res;
}

static inline unsigned int __arch_hweight16(unsigned int w)
{
	return __arch_hweight32(w & 0xffff);
}

static inline unsigned int __arch_hweight8(unsigned int w)
{
	return __arch_hweight32(w & 0xff);
}

static inline unsigned long __arch_hweight64(__u64 w)
{
	unsigned long res = 0;

#ifdef CONFIG_X86_32
	return __arch_hweight32((u32)w) +
	       __arch_hweight32((u32)(w >> 32));
#else
	asm (ALTERNATIVE("call __sw_hweight64", POPCNT, X86_FEATURE_POPCNT)
		: "="REG_OUT (res)
		: REG_IN (w));
#endif /* CONFIG_X86_32 */

	return res;
}

#endif
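The two `POPCNT` byte strings above hard-code specific registers, and a small decoder makes it easier to check them against the accompanying comments. The sketch below is a user-space illustration only: `struct popcnt_enc`, `decode_popcnt`, `popcnt64` and `popcnt32` are hypothetical names, and the decoder handles nothing beyond the register-direct `F3 [REX] 0F B8 /r` form. Register number 0 is `%eax`/`%rax` and 7 is `%edi`/`%rdi`.

```c
#include <stddef.h>
#include <stdint.h>

/* The two byte sequences from the POPCNT macros in the header. */
static const uint8_t popcnt64[] = { 0xf3, 0x48, 0x0f, 0xb8, 0xc7 };
static const uint8_t popcnt32[] = { 0xf3, 0x0f, 0xb8, 0xc0 };

/* Fields of interest in a register-to-register POPCNT encoding. */
struct popcnt_enc {
	int valid;	/* 1 if the bytes match F3 [REX] 0F B8 /r with mod == 3 */
	int rex_w;	/* 1 if REX.W selects 64-bit operands */
	int dst;	/* ModRM.reg (plus REX.R): destination register number */
	int src;	/* ModRM.rm (plus REX.B): source register number */
};

static struct popcnt_enc decode_popcnt(const uint8_t *b, size_t len)
{
	struct popcnt_enc e = { 0, 0, 0, 0 };
	size_t i = 0;
	int rex = 0;

	if (len < 4 || b[i++] != 0xf3)		/* mandatory F3 prefix */
		return e;
	if ((b[i] & 0xf0) == 0x40)		/* optional REX prefix */
		rex = b[i++];
	if (len - i != 3 || b[i] != 0x0f || b[i + 1] != 0xb8)
		return e;
	if ((b[i + 2] >> 6) != 3)		/* register-direct form only */
		return e;

	e.valid = 1;
	e.rex_w = (rex >> 3) & 1;
	e.dst = ((b[i + 2] >> 3) & 7) | ((rex & 4) << 1);
	e.src = (b[i + 2] & 7) | ((rex & 1) << 3);
	return e;
}
```

Decoding `popcnt64` yields source 7 and destination 0 with REX.W set, matching `popcnt %rdi, %rax`; decoding `popcnt32` yields 0 for both, matching `popcnt %eax, %eax`. These are the registers named by `REG_IN` and `REG_OUT`. A plausible reading, not stated in the patch itself, is that they are pinned so that the `call __sw_hweightXX` alternative and the `popcnt` alternative take their argument and return their result in the same places.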