Skip to content

Commit

Permalink
string: memchr_inv() speed improvements
Browse files Browse the repository at this point in the history
- Generate a 64-bit pattern more efficiently

memchr_inv needs to generate a 64-bit pattern filled with a target
character.  The operation can be done by more efficient way.

- Don't call the slow check_bytes() if the memory area is 64-bit aligned

memchr_inv compares contiguous 64-bit words with the 64-bit pattern as
much as possible.  The outside of the region is checked by check_bytes()
that scans for each byte.  Unfortunately, the first 64-bit word is
unexpectedly scanned by check_bytes() even if the memory area is aligned
to a 64-bit boundary.

Both changes were originally suggested by Eric Dumazet.

Signed-off-by: Akinobu Mita <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Cc: Brian Norris <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
mita authored and torvalds committed Mar 23, 2012
1 parent a403d93 commit f43804b
Showing 1 changed file with 16 additions and 4 deletions.
20 changes: 16 additions & 4 deletions lib/string.c
Original file line number Diff line number Diff line change
Expand Up @@ -785,12 +785,24 @@ void *memchr_inv(const void *start, int c, size_t bytes)
if (bytes <= 16)
return check_bytes8(start, value, bytes);

value64 = value | value << 8 | value << 16 | value << 24;
value64 = (value64 & 0xffffffff) | value64 << 32;
prefix = 8 - ((unsigned long)start) % 8;
value64 = value;
#if defined(ARCH_HAS_FAST_MULTIPLIER) && BITS_PER_LONG == 64
value64 *= 0x0101010101010101;
#elif defined(ARCH_HAS_FAST_MULTIPLIER)
value64 *= 0x01010101;
value64 |= value64 << 32;
#else
value64 |= value64 << 8;
value64 |= value64 << 16;
value64 |= value64 << 32;
#endif

prefix = (unsigned long)start % 8;
if (prefix) {
u8 *r = check_bytes8(start, value, prefix);
u8 *r;

prefix = 8 - prefix;
r = check_bytes8(start, value, prefix);
if (r)
return r;
start += prefix;
Expand Down

0 comments on commit f43804b

Please sign in to comment.