x86/asm/bitops: Use __builtin_clz{l|ll} to evaluate constant expressions
Micro-optimize the bitops code some more, similar to commits:

  fdb6649 ("x86/asm/bitops: Use __builtin_ctzl() to evaluate constant expressions")
  2fcff79 ("powerpc: Use builtin functions for fls()/__fls()/fls64()")

From a recent discussion, I noticed that x86 is lacking an optimization
that appears in arch/powerpc/include/asm/bitops.h related to constant
folding.  If you add a BUILD_BUG_ON(__builtin_constant_p(param)) to
these functions, you'll find cases where the use of inline asm
pessimized the compiler's ability to perform constant folding,
resulting in runtime calculation of a value that could have been
computed at compile time.

Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
nickdesaulniers authored and Ingo Molnar committed Sep 6, 2023
1 parent 4accdb9 commit 3dae5c4
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions arch/x86/include/asm/bitops.h
@@ -293,6 +293,9 @@ static __always_inline unsigned long variable_ffz(unsigned long word)
  */
 static __always_inline unsigned long __fls(unsigned long word)
 {
+	if (__builtin_constant_p(word))
+		return BITS_PER_LONG - 1 - __builtin_clzl(word);
+
 	asm("bsr %1,%0"
 	    : "=r" (word)
 	    : "rm" (word));
@@ -360,6 +363,9 @@ static __always_inline int fls(unsigned int x)
 {
 	int r;
 
+	if (__builtin_constant_p(x))
+		return x ? 32 - __builtin_clz(x) : 0;
+
 #ifdef CONFIG_X86_64
 	/*
 	 * AMD64 says BSRL won't clobber the dest reg if x==0; Intel64 says the
@@ -401,6 +407,9 @@ static __always_inline int fls(unsigned int x)
 static __always_inline int fls64(__u64 x)
 {
 	int bitpos = -1;
+
+	if (__builtin_constant_p(x))
+		return x ? 64 - __builtin_clzll(x) : 0;
 	/*
 	 * AMD64 says BSRQ won't clobber the dest reg if x==0; Intel64 says the
 	 * dest reg is undefined if x==0, but their CPU architect says its
