Skip to content

Commit

Permalink
lib: make memzero_explicit more robust against dead store elimination
Browse files Browse the repository at this point in the history
In commit 0b053c9 ("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be elimiated as dead store.

While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest to use llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus elimitate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.

Based on some experiments, icc is a bit special on its own, while it
doesn't seem to eliminate the memset(), it could do so with an own
implementation, and then result in similar findings as with llvm.

The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.

It's clearly visible with gcc/llvm though, with the below code: if we
would have used barrier() only here, llvm would have omitted clearing,
not so with barrier_data() variant:

  static inline void memzero_explicit(void *s, size_t count)
  {
    memset(s, 0, count);
    barrier_data(s);
  }

  int main(void)
  {
    char buff[20];
    memzero_explicit(buff, sizeof(buff));
    return 0;
  }

  $ gcc -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x0000000000400400  <+0>: lea   -0x28(%rsp),%rax
   0x0000000000400405  <+5>: movq  $0x0,-0x28(%rsp)
   0x000000000040040e <+14>: movq  $0x0,-0x20(%rsp)
   0x0000000000400417 <+23>: movl  $0x0,-0x18(%rsp)
   0x000000000040041f <+31>: xor   %eax,%eax
   0x0000000000400421 <+33>: retq
  End of assembler dump.

  $ clang -O2 test.c
  $ gdb a.out
  (gdb) disassemble main
  Dump of assembler code for function main:
   0x00000000004004f0  <+0>: xorps  %xmm0,%xmm0
   0x00000000004004f3  <+3>: movaps %xmm0,-0x18(%rsp)
   0x00000000004004f8  <+8>: movl   $0x0,-0x8(%rsp)
   0x0000000000400500 <+16>: lea    -0x18(%rsp),%rax
   0x0000000000400505 <+21>: xor    %eax,%eax
   0x0000000000400507 <+23>: retq
  End of assembler dump.

As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
this in compiler-gcc.h only to be picked up. For a fallback or otherwise
unsupported compiler, we define it as a barrier. Similarly, for ecc which
does not support gcc inline asm.

Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller <[email protected]>
Tested-by: Stephan Mueller <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Theodore Ts'o <[email protected]>
Cc: Stephan Mueller <[email protected]>
Cc: Hannes Frederic Sowa <[email protected]>
Cc: mancha security <[email protected]>
Cc: Mark Charlebois <[email protected]>
Cc: Behan Webster <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
  • Loading branch information
borkmann authored and herbertx committed May 4, 2015
1 parent 8c98ebd commit 7829fb0
Show file tree
Hide file tree
Showing 4 changed files with 23 additions and 2 deletions.
16 changes: 15 additions & 1 deletion include/linux/compiler-gcc.h
Original file line number Diff line number Diff line change
Expand Up @@ -9,10 +9,24 @@
+ __GNUC_MINOR__ * 100 \
+ __GNUC_PATCHLEVEL__)


/* Optimization barrier */

/* The "volatile" is due to gcc bugs */
#define barrier() __asm__ __volatile__("": : :"memory")
/*
* This version is i.e. to prevent dead stores elimination on @ptr
* where gcc and llvm may behave differently when otherwise using
* normal barrier(): while gcc behavior gets along with a normal
* barrier(), llvm needs an explicit input variable to be assumed
* clobbered. The issue is as follows: while the inline asm might
* access any memory it wants, the compiler could have fit all of
* @ptr into memory registers instead, and since @ptr never escaped
* from that, it proofed that the inline asm wasn't touching any of
* it. This version works well with both compilers, i.e. we're telling
* the compiler that the inline asm absolutely may see the contents
* of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
*/
#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")

/*
* This macro obfuscates arithmetic on a variable address so that gcc
Expand Down
3 changes: 3 additions & 0 deletions include/linux/compiler-intel.h
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,12 @@
/* Intel ECC compiler doesn't support gcc specific asm stmts.
* It uses intrinsics to do the equivalent things.
*/
#undef barrier_data
#undef RELOC_HIDE
#undef OPTIMIZER_HIDE_VAR

#define barrier_data(ptr) barrier()

#define RELOC_HIDE(ptr, off) \
({ unsigned long __ptr; \
__ptr = (unsigned long) (ptr); \
Expand Down
4 changes: 4 additions & 0 deletions include/linux/compiler.h
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,10 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
# define barrier() __memory_barrier()
#endif

#ifndef barrier_data
# define barrier_data(ptr) barrier()
#endif

/* Unreachable code */
#ifndef unreachable
# define unreachable() do { } while (1)
Expand Down
2 changes: 1 addition & 1 deletion lib/string.c
Original file line number Diff line number Diff line change
Expand Up @@ -607,7 +607,7 @@ EXPORT_SYMBOL(memset);
void memzero_explicit(void *s, size_t count)
{
memset(s, 0, count);
barrier();
barrier_data(s);
}
EXPORT_SYMBOL(memzero_explicit);

Expand Down

0 comments on commit 7829fb0

Please sign in to comment.