Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

      This came out of porting qrwlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"
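
As a rough illustration of the atomic changes described above, here is a hedged sketch (not code from this merge; the variable and function names are invented):

        #include <linux/atomic.h>

        static atomic_t example_flags = ATOMIC_INIT(0);

        static void example_usage(void)
        {
                int val;

                /* Coherent logic ops replace atomic_{set,clear}_mask(): */
                atomic_or(0x01, &example_flags);        /* was atomic_set_mask(0x01, ...)   */
                atomic_and(~0x01, &example_flags);      /* was atomic_clear_mask(0x01, ...) */
                atomic_xor(0x02, &example_flags);

                /* Relaxed/acquire/release flavours allow weaker ordering where safe: */
                val = atomic_add_return_relaxed(1, &example_flags);  /* no implied barriers */
                val = atomic_read_acquire(&example_flags);           /* load with ACQUIRE   */
                atomic_set_release(&example_flags, val);             /* store with RELEASE  */
        }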

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
torvalds committed Sep 3, 2015
2 parents 4c12ab7 + d420acd commit ca520ca
Showing 139 changed files with 2,425 additions and 3,585 deletions.
4 changes: 3 additions & 1 deletion Documentation/atomic_ops.txt
@@ -266,7 +266,9 @@ with the given old and new values. Like all atomic_xxx operations,
atomic_cmpxchg will only satisfy its atomicity semantics as long as all
other accesses of *v are performed through atomic_xxx operations.

atomic_cmpxchg must provide explicit memory barriers around the operation.
atomic_cmpxchg must provide explicit memory barriers around the operation,
although if the comparison fails then no memory ordering guarantees are
required.

The semantics for atomic_cmpxchg are the same as those defined for 'cas'
below.
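
For illustration (a sketch, not part of this patch; the helper name is
invented), a typical atomic_cmpxchg() retry loop relies only on the ordering
of the iteration that succeeds:

        static bool add_unless_negative(atomic_t *v, int a)
        {
                int old = atomic_read(v);

                for (;;) {
                        int prev;

                        if (old < 0)
                                return false;
                        /*
                         * atomic_cmpxchg() acts as a full barrier only when the
                         * compare succeeds; a failed attempt provides no ordering
                         * guarantee, so simply retry with the value it returned.
                         */
                        prev = atomic_cmpxchg(v, old, old + a);
                        if (prev == old)
                                return true;
                        old = prev;
                }
        }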
11 changes: 11 additions & 0 deletions Documentation/fault-injection/fault-injection.txt
@@ -15,6 +15,10 @@ o fail_page_alloc

injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

o fail_futex

injects futex deadlock and uaddr fault errors.

o fail_make_request

injects disk IO errors on devices permitted by setting
@@ -113,6 +117,12 @@ configuration of fault-injection capabilities.
specifies the minimum page allocation order to be injected
failures.

- /sys/kernel/debug/fail_futex/ignore-private:

Format: { 'Y' | 'N' }
The default is 'N'; setting it to 'Y' will disable failure injection
when dealing with private (address space) futexes.

o Boot option

In order to inject faults while debugfs is not available (early boot time),
@@ -121,6 +131,7 @@ use the boot option:
failslab=
fail_page_alloc=
fail_make_request=
fail_futex=
mmc_core.fail_request=<interval>,<probability>,<space>,<times>

How to add new fault injection capability
6 changes: 3 additions & 3 deletions Documentation/memory-barriers.txt
@@ -2327,9 +2327,7 @@ about the state (old or new) implies an SMP-conditional general memory barrier
explicit lock operations, described later). These include:

xchg();
cmpxchg();
atomic_xchg(); atomic_long_xchg();
atomic_cmpxchg(); atomic_long_cmpxchg();
atomic_inc_return(); atomic_long_inc_return();
atomic_dec_return(); atomic_long_dec_return();
atomic_add_return(); atomic_long_add_return();
@@ -2342,7 +2340,9 @@ explicit lock operations, described later). These include:
test_and_clear_bit();
test_and_change_bit();

/* when succeeds (returns 1) */
/* when succeeds */
cmpxchg();
atomic_cmpxchg(); atomic_long_cmpxchg();
atomic_add_unless(); atomic_long_add_unless();

These are used for such things as implementing ACQUIRE-class and RELEASE-class
99 changes: 52 additions & 47 deletions Documentation/static-keys.txt
@@ -1,30 +1,45 @@
Static Keys
-----------

By: Jason Baron <[email protected]>
DEPRECATED API:

The use of 'struct static_key' directly is now DEPRECATED. In addition,
static_key_{true,false}() is also DEPRECATED, i.e. DO NOT use the following:

struct static_key false = STATIC_KEY_INIT_FALSE;
struct static_key true = STATIC_KEY_INIT_TRUE;
static_key_true()
static_key_false()

The updated API replacements are:

DEFINE_STATIC_KEY_TRUE(key);
DEFINE_STATIC_KEY_FALSE(key);
static_branch_likely()
static_branch_unlikely()

0) Abstract

Static keys allow the inclusion of seldom used features in
performance-sensitive fast-path kernel code, via a GCC feature and a code
patching technique. A quick example:

struct static_key key = STATIC_KEY_INIT_FALSE;
DEFINE_STATIC_KEY_FALSE(key);

...

if (static_key_false(&key))
if (static_branch_unlikely(&key))
do unlikely code
else
do likely code

...
static_key_slow_inc();
static_branch_enable(&key);
...
static_key_slow_inc();
static_branch_disable(&key);
...

The static_key_false() branch will be generated into the code with as little
The static_branch_unlikely() branch will be generated into the code with as little
impact to the likely code path as possible.


@@ -56,7 +71,7 @@ the branch site to change the branch direction.

For example, if we have a simple branch that is disabled by default:

if (static_key_false(&key))
if (static_branch_unlikely(&key))
printk("I am the true branch\n");

Thus, by default the 'printk' will not be emitted. And the code generated will
@@ -75,68 +90,55 @@ the basis for the static keys facility.

In order to make use of this optimization you must first define a key:

struct static_key key;

Which is initialized as:

struct static_key key = STATIC_KEY_INIT_TRUE;
DEFINE_STATIC_KEY_TRUE(key);

or:

struct static_key key = STATIC_KEY_INIT_FALSE;
DEFINE_STATIC_KEY_FALSE(key);


If the key is not initialized, it is default false. The 'struct static_key',
must be a 'global'. That is, it can't be allocated on the stack or dynamically
The key must be global, that is, it can't be allocated on the stack or dynamically
allocated at run-time.

The key is then used in code as:

if (static_key_false(&key))
if (static_branch_unlikely(&key))
do unlikely code
else
do likely code

Or:

if (static_key_true(&key))
if (static_branch_likely(&key))
do likely code
else
do unlikely code

A key that is initialized via 'STATIC_KEY_INIT_FALSE', must be used in a
'static_key_false()' construct. Likewise, a key initialized via
'STATIC_KEY_INIT_TRUE' must be used in a 'static_key_true()' construct. A
single key can be used in many branches, but all the branches must match the
way that the key has been initialized.
Keys defined via DEFINE_STATIC_KEY_TRUE() or DEFINE_STATIC_KEY_FALSE() may
be used in either static_branch_likely() or static_branch_unlikely()
statements.

The branch(es) can then be switched via:
Branch(es) can be set true via:

static_key_slow_inc(&key);
...
static_key_slow_dec(&key);
static_branch_enable(&key);

Thus, 'static_key_slow_inc()' means 'make the branch true', and
'static_key_slow_dec()' means 'make the branch false' with appropriate
reference counting. For example, if the key is initialized true, a
static_key_slow_dec(), will switch the branch to false. And a subsequent
static_key_slow_inc(), will change the branch back to true. Likewise, if the
key is initialized false, a 'static_key_slow_inc()', will change the branch to
true. And then a 'static_key_slow_dec()', will again make the branch false.
or false via:

static_branch_disable(&key);

An example usage in the kernel is the implementation of tracepoints:
The branch(es) can then be switched via reference counts:

static inline void trace_##name(proto) \
{ \
if (static_key_false(&__tracepoint_##name.key)) \
__DO_TRACE(&__tracepoint_##name, \
TP_PROTO(data_proto), \
TP_ARGS(data_args), \
TP_CONDITION(cond)); \
}
static_branch_inc(&key);
...
static_branch_dec(&key);

Tracepoints are disabled by default, and can be placed in performance critical
pieces of the kernel. Thus, by using a static key, the tracepoints can have
absolutely minimal impact when not in use.
Thus, 'static_branch_inc()' means 'make the branch true', and
'static_branch_dec()' means 'make the branch false' with appropriate
reference counting. For example, if the key is initialized true, a
static_branch_dec() will switch the branch to false, and a subsequent
static_branch_inc() will change the branch back to true. Likewise, if the
key is initialized false, a 'static_branch_inc()' will change the branch to
true, and a 'static_branch_dec()' will again make the branch false.
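
Putting the pieces above together, a minimal usage sketch (hypothetical names,
not an example taken from the tree) might look like:

        #include <linux/jump_label.h>
        #include <linux/printk.h>

        /* The key starts out false: the unlikely branch begins life as a NOP. */
        static DEFINE_STATIC_KEY_FALSE(sample_debug_key);

        void sample_fast_path(void)
        {
                if (static_branch_unlikely(&sample_debug_key))
                        pr_info("debug path enabled\n");  /* patched in at run-time */
                /* ... likely code continues here ... */
        }

        /* Slow-path control, e.g. from a debugfs write handler: */
        void sample_debug_on(void)
        {
                static_branch_inc(&sample_debug_key);   /* first increment makes the branch true */
        }

        void sample_debug_off(void)
        {
                static_branch_dec(&sample_debug_key);   /* last decrement makes it false again */
        }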


4) Architecture level code patching interface, 'jump labels'
@@ -150,9 +152,12 @@ simply fall back to a traditional, load, test, and jump sequence.

* #define JUMP_LABEL_NOP_SIZE, see: arch/x86/include/asm/jump_label.h

* __always_inline bool arch_static_branch(struct static_key *key), see:
* __always_inline bool arch_static_branch(struct static_key *key, bool branch), see:
arch/x86/include/asm/jump_label.h

* __always_inline bool arch_static_branch_jump(struct static_key *key, bool branch),
see: arch/x86/include/asm/jump_label.h

* void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type),
see: arch/x86/kernel/jump_label.c

@@ -173,7 +178,7 @@ SYSCALL_DEFINE0(getppid)
{
int pid;

+ if (static_key_false(&key))
+ if (static_branch_unlikely(&key))
+ printk("I am the true branch\n");

rcu_read_lock();
6 changes: 6 additions & 0 deletions arch/Kconfig
@@ -71,6 +71,12 @@ config JUMP_LABEL
( On 32-bit x86, the necessary options added to the compiler
flags may increase the size of the kernel slightly. )

config STATIC_KEYS_SELFTEST
bool "Static key selftest"
depends on JUMP_LABEL
help
Boot time self-test of the branch patching code.

config OPTPROBES
def_bool y
depends on KPROBES && HAVE_OPTPROBES
42 changes: 27 additions & 15 deletions arch/alpha/include/asm/atomic.h
@@ -29,13 +29,13 @@
* branch back to restart the operation.
*/

#define ATOMIC_OP(op) \
#define ATOMIC_OP(op, asm_op) \
static __inline__ void atomic_##op(int i, atomic_t * v) \
{ \
unsigned long temp; \
__asm__ __volatile__( \
"1: ldl_l %0,%1\n" \
" " #op "l %0,%2,%0\n" \
" " #asm_op " %0,%2,%0\n" \
" stl_c %0,%1\n" \
" beq %0,2f\n" \
".subsection 2\n" \
@@ -45,15 +45,15 @@ static __inline__ void atomic_##op(int i, atomic_t * v) \
:"Ir" (i), "m" (v->counter)); \
} \

#define ATOMIC_OP_RETURN(op) \
#define ATOMIC_OP_RETURN(op, asm_op) \
static inline int atomic_##op##_return(int i, atomic_t *v) \
{ \
long temp, result; \
smp_mb(); \
__asm__ __volatile__( \
"1: ldl_l %0,%1\n" \
" " #op "l %0,%3,%2\n" \
" " #op "l %0,%3,%0\n" \
" " #asm_op " %0,%3,%2\n" \
" " #asm_op " %0,%3,%0\n" \
" stl_c %0,%1\n" \
" beq %0,2f\n" \
".subsection 2\n" \
@@ -65,13 +65,13 @@ static inline int atomic_##op##_return(int i, atomic_t *v) \
return result; \
}

#define ATOMIC64_OP(op) \
#define ATOMIC64_OP(op, asm_op) \
static __inline__ void atomic64_##op(long i, atomic64_t * v) \
{ \
unsigned long temp; \
__asm__ __volatile__( \
"1: ldq_l %0,%1\n" \
" " #op "q %0,%2,%0\n" \
" " #asm_op " %0,%2,%0\n" \
" stq_c %0,%1\n" \
" beq %0,2f\n" \
".subsection 2\n" \
@@ -81,15 +81,15 @@ static __inline__ void atomic64_##op(long i, atomic64_t * v) \
:"Ir" (i), "m" (v->counter)); \
} \

#define ATOMIC64_OP_RETURN(op) \
#define ATOMIC64_OP_RETURN(op, asm_op) \
static __inline__ long atomic64_##op##_return(long i, atomic64_t * v) \
{ \
long temp, result; \
smp_mb(); \
__asm__ __volatile__( \
"1: ldq_l %0,%1\n" \
" " #op "q %0,%3,%2\n" \
" " #op "q %0,%3,%0\n" \
" " #asm_op " %0,%3,%2\n" \
" " #asm_op " %0,%3,%0\n" \
" stq_c %0,%1\n" \
" beq %0,2f\n" \
".subsection 2\n" \
@@ -101,15 +101,27 @@ static __inline__ long atomic64_##op##_return(long i, atomic64_t * v) \
return result; \
}

#define ATOMIC_OPS(opg) \
ATOMIC_OP(opg) \
ATOMIC_OP_RETURN(opg) \
ATOMIC64_OP(opg) \
ATOMIC64_OP_RETURN(opg)
#define ATOMIC_OPS(op) \
ATOMIC_OP(op, op##l) \
ATOMIC_OP_RETURN(op, op##l) \
ATOMIC64_OP(op, op##q) \
ATOMIC64_OP_RETURN(op, op##q)

ATOMIC_OPS(add)
ATOMIC_OPS(sub)

#define atomic_andnot atomic_andnot
#define atomic64_andnot atomic64_andnot

ATOMIC_OP(and, and)
ATOMIC_OP(andnot, bic)
ATOMIC_OP(or, bis)
ATOMIC_OP(xor, xor)
ATOMIC64_OP(and, and)
ATOMIC64_OP(andnot, bic)
ATOMIC64_OP(or, bis)
ATOMIC64_OP(xor, xor)

#undef ATOMIC_OPS
#undef ATOMIC64_OP_RETURN
#undef ATOMIC64_OP
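
For orientation, ATOMIC_OP(or, bis) above expands to roughly the following
(a sketch of the expansion; the output-constraint line is assumed from the
full macro body, which is collapsed in this hunk):

        static __inline__ void atomic_or(int i, atomic_t *v)
        {
                unsigned long temp;
                __asm__ __volatile__(
                "1:     ldl_l %0,%1\n"          /* load-locked v->counter    */
                "       bis %0,%2,%0\n"         /* Alpha 'bis' is bitwise OR */
                "       stl_c %0,%1\n"          /* store-conditional         */
                "       beq %0,2f\n"            /* retry if the store failed */
                ".subsection 2\n"
                "2:     br 1b\n"
                ".previous"
                : "=&r" (temp), "=m" (v->counter)
                : "Ir" (i), "m" (v->counter));
        }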
8 changes: 6 additions & 2 deletions arch/arc/include/asm/atomic.h
@@ -172,9 +172,13 @@ static inline int atomic_##op##_return(int i, atomic_t *v) \

ATOMIC_OPS(add, +=, add)
ATOMIC_OPS(sub, -=, sub)
ATOMIC_OP(and, &=, and)

#define atomic_clear_mask(mask, v) atomic_and(~(mask), (v))
#define atomic_andnot atomic_andnot

ATOMIC_OP(and, &=, and)
ATOMIC_OP(andnot, &= ~, bic)
ATOMIC_OP(or, |=, or)
ATOMIC_OP(xor, ^=, xor)

#undef ATOMIC_OPS
#undef ATOMIC_OP_RETURN