Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "The main changes in the locking subsystem in this cycle were:

   - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem,
     which is an array of tools in tools/memory-model/ that formally
     describe the Linux memory coherency model (a.k.a.
     Documentation/memory-barriers.txt), and also produce 'litmus tests'
     in the form of kernel code which can be directly executed and tested.

     Here's a high level background article about an earlier version of
     this work on LWN.net:

        https://lwn.net/Articles/718628/

     The design principles:

      "There is reason to believe that Documentation/memory-barriers.txt
       could use some help, and a major purpose of this patch is to
       provide that help in the form of a design-time tool that can
       produce all valid executions of a small fragment of concurrent
       Linux-kernel code, which is called a "litmus test". This tool's
       functionality is roughly similar to a full state-space search.
       Please note that this is a design-time tool, not useful for
       regression testing. However, we hope that the underlying
       Linux-kernel memory model will be incorporated into other tools
       capable of analyzing large bodies of code for regression-testing
       purposes."

     [...]

      "A second tool is klitmus7, which converts litmus tests to
       loadable kernel modules for direct testing. As with herd7, the
       klitmus7 code is freely available from

         http://diy.inria.fr/sources/index.html

       (and via "git" at https://github.com/herd/herdtools7)"

     [...]

     Credits go to:

      "This patch was the result of a most excellent collaboration
       founded by Jade Alglave and also including Alan Stern, Andrea
       Parri, and Luc Maranget."

     ... and to the gents listed in the MAINTAINERS entry:

        LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
        M:      Alan Stern <[email protected]>
        M:      Andrea Parri <[email protected]>
        M:      Will Deacon <[email protected]>
        M:      Peter Zijlstra <[email protected]>
        M:      Boqun Feng <[email protected]>
        M:      Nicholas Piggin <[email protected]>
        M:      David Howells <[email protected]>
        M:      Jade Alglave <[email protected]>
        M:      Luc Maranget <[email protected]>
        M:      "Paul E. McKenney" <[email protected]>

     The LKMM project already found several bugs in Linux locking
     primitives and improved the understanding and the documentation of
     the Linux memory model all around.

   - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov); a sketch
     of the resulting wrappers follows this list

   - Add RWSEM API debugging and reorganize the lock debugging Kconfig
     (Waiman Long)

   - ... misc cleanups and other smaller changes"
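
For the KASAN item above, the wrappers added by asm-generic/atomic-instrumented.h
follow roughly this shape (a sketch rather than a verbatim copy of the header;
kasan_check_read()/kasan_check_write() are the existing annotations from
<linux/kasan-checks.h>):

	static __always_inline int atomic_read(const atomic_t *v)
	{
		kasan_check_read(v, sizeof(*v));
		return arch_atomic_read(v);
	}

	static __always_inline void atomic_set(atomic_t *v, int i)
	{
		kasan_check_write(v, sizeof(*v));
		arch_atomic_set(v, i);
	}

Architectures that switch over (x86 in this series) provide their native
operations under the arch_atomic_*() names and include the instrumented
header, so KASAN sees every atomic access without any change to call sites.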

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  locking/Kconfig: Restructure the lock debugging menu
  locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable
  locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches
  lockdep: Make the lock debug output more useful
  locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
  locking/atomic, asm-generic, x86: Add comments for atomic instrumentation
  locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations
  locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h
  locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h
  locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants
  tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference()
  tools/memory-model: Add documentation of new litmus test
  tools/memory-model: Remove mention of docker/gentoo image
  locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more
  locking/lockdep: Show unadorned pointers
  mutex: Drop linkage.h from mutex.h
  tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference
  tools/memory-model: Convert underscores to hyphens
  tools/memory-model: Add a S lock-based external-view litmus test
  tools/memory-model: Add required herd7 version to README file
  ...
torvalds committed Apr 2, 2018
2 parents 8747a29 + 19193bc commit 701f3b3
Showing 59 changed files with 5,106 additions and 301 deletions.
51 changes: 49 additions & 2 deletions Documentation/locking/lockdep-design.txt
@@ -27,7 +27,8 @@ lock-class.
State
-----

The validator tracks lock-class usage history into 4n + 1 separate state bits:
The validator tracks lock-class usage history into 4 * nSTATEs + 1 separate
state bits:

- 'ever held in STATE context'
- 'ever held as readlock in STATE context'
@@ -37,7 +38,6 @@ The validator tracks lock-class usage history into 4n + 1 separate state bits:
Where STATE can be either one of (kernel/locking/lockdep_states.h)
- hardirq
- softirq
- reclaim_fs

- 'ever used' [ == !unused ]

@@ -169,6 +169,53 @@ Note: When changing code to use the _nested() primitives, be careful and
check really thoroughly that the hierarchy is correctly mapped; otherwise
you can get false positives or false negatives.

Annotations
-----------

Two constructs can be used to annotate and check where and if certain locks
must be held: lockdep_assert_held*(&lock) and lockdep_*pin_lock(&lock).

As the name suggests, the lockdep_assert_held* family of macros asserts that a
particular lock is held at a certain time (and generates a WARN() otherwise).
This annotation is used widely across the kernel, e.g. in kernel/sched/
core.c:

	void update_rq_clock(struct rq *rq)
	{
		s64 delta;

		lockdep_assert_held(&rq->lock);
		[...]
	}

where holding rq->lock is required to safely update a rq's clock.

The other family of macros is lockdep_*pin_lock(), which is admittedly only
used for rq->lock ATM. Despite their limited adoption, these annotations
generate a WARN() if the lock of interest is "accidentally" unlocked. This
turns out to be especially helpful when debugging code with callbacks, where
an upper layer assumes a lock remains taken, but a lower layer thinks it can
maybe drop and reacquire the lock ("unwittingly" introducing races).
lockdep_pin_lock() returns a 'struct pin_cookie' that is then used by
lockdep_unpin_lock() to check that nobody tampered with the lock, e.g.
kernel/sched/sched.h

	static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
	{
		rf->cookie = lockdep_pin_lock(&rq->lock);
		[...]
	}

	static inline void rq_unpin_lock(struct rq *rq, struct rq_flags *rf)
	{
		[...]
		lockdep_unpin_lock(&rq->lock, rf->cookie);
	}

While comments about locking requirements might provide useful information,
the runtime checks performed by annotations are invaluable when debugging
locking problems, and they carry the same level of detail when inspecting
code. Always prefer annotations when in doubt!

Proof of 100% correctness:
--------------------------

34 changes: 24 additions & 10 deletions Documentation/memory-barriers.txt
@@ -14,7 +14,11 @@ DISCLAIMER
This document is not a specification; it is intentionally (for the sake of
brevity) and unintentionally (due to being human) incomplete. This document is
meant as a guide to using the various memory barriers provided by Linux, but
in case of any doubt (and there are many) please ask.
in case of any doubt (and there are many) please ask. Some doubts may be
resolved by referring to the formal memory consistency model and related
documentation at tools/memory-model/. Nevertheless, even this memory
model should be viewed as the collective opinion of its maintainers rather
than as an infallible oracle.

To repeat, this document is not a specification of what Linux expects from
hardware.
@@ -48,7 +52,7 @@ CONTENTS

- Varieties of memory barrier.
- What may not be assumed about memory barriers?
- Data dependency barriers.
- Data dependency barriers (historical).
- Control dependencies.
- SMP barrier pairing.
- Examples of memory barrier sequences.
@@ -399,7 +403,7 @@ Memory barriers come in four basic varieties:
where two loads are performed such that the second depends on the result
of the first (eg: the first load retrieves the address to which the second
load will be directed), a data dependency barrier would be required to
make sure that the target of the second load is updated before the address
make sure that the target of the second load is updated after the address
obtained by the first load is accessed.

A data dependency barrier is a partial ordering on interdependent loads
@@ -550,8 +554,15 @@ There are certain things that the Linux kernel memory barriers do not guarantee:
Documentation/DMA-API.txt


DATA DEPENDENCY BARRIERS
------------------------
DATA DEPENDENCY BARRIERS (HISTORICAL)
-------------------------------------

As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was
added to READ_ONCE(), which means that about the only people who
need to pay attention to this section are those working on DEC Alpha
architecture-specific code and those working on READ_ONCE() itself.
For those who need it, and for those who are interested in the history,
here is the story of data-dependency barriers.
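
As a concrete illustration (a sketch, not part of this hunk; "gp" and
"->data" are made-up names), the classic address-dependent pair of loads
no longer needs an explicit barrier on the reader side:

	struct foo *p;
	int d;

	p = READ_ONCE(gp);		/* implies smp_read_barrier_depends() */
	if (p)
		d = READ_ONCE(p->data);	/* ordered after the load of gp,
					   even on DEC Alpha */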

The usage requirements of data dependency barriers are a little subtle, and
it's not always obvious that they're needed. To illustrate, consider the
@@ -2839,8 +2850,9 @@ as that committed on CPU 1.


To intervene, we need to interpolate a data dependency barrier or a read
barrier between the loads. This will force the cache to commit its coherency
queue before processing any further requests:
barrier between the loads (which as of v4.15 is supplied unconditionally
by the READ_ONCE() macro). This will force the cache to commit its
coherency queue before processing any further requests:

CPU 1 CPU 2 COMMENT
=============== =============== =======================================
@@ -2869,8 +2881,8 @@ Other CPUs may also have split caches, but must coordinate between the various
cachelets for normal memory accesses. The semantics of the Alpha removes the
need for hardware coordination in the absence of memory barriers, which
permitted Alpha to sport higher CPU clock rates back in the day. However,
please note that smp_read_barrier_depends() should not be used except in
Alpha arch-specific code and within the READ_ONCE() macro.
please note that (again, as of v4.15) smp_read_barrier_depends() should not
be used except in Alpha arch-specific code and within the READ_ONCE() macro.


CACHE COHERENCY VS DMA
@@ -3035,7 +3047,9 @@ the data dependency barrier really becomes necessary as this synchronises both
caches with the memory coherence system, thus making it seem like pointer
changes vs new data occur in the right order.

The Alpha defines the Linux kernel's memory barrier model.
The Alpha defines the Linux kernel's memory model, although as of v4.15
the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE()
greatly reduced Alpha's impact on the memory model.

See the subsection on "Cache Coherency" above.

18 changes: 18 additions & 0 deletions MAINTAINERS
@@ -8162,6 +8162,24 @@ M: Kees Cook <[email protected]>
S: Maintained
F: drivers/misc/lkdtm*

LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
M: Alan Stern <[email protected]>
M: Andrea Parri <[email protected]>
M: Will Deacon <[email protected]>
M: Peter Zijlstra <[email protected]>
M: Boqun Feng <[email protected]>
M: Nicholas Piggin <[email protected]>
M: David Howells <[email protected]>
M: Jade Alglave <[email protected]>
M: Luc Maranget <[email protected]>
M: "Paul E. McKenney" <[email protected]>
R: Akira Yokosawa <[email protected]>
L: [email protected]
S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
F: tools/memory-model/
F: Documentation/memory-barriers.txt

LINUX SECURITY MODULE (LSM) FRAMEWORK
M: Chris Wright <[email protected]>
L: [email protected]
20 changes: 16 additions & 4 deletions arch/alpha/include/asm/cmpxchg.h
@@ -38,19 +38,31 @@
#define ____cmpxchg(type, args...) __cmpxchg ##type(args)
#include <asm/xchg.h>

/*
* The leading and the trailing memory barriers guarantee that these
* operations are fully ordered.
*/
#define xchg(ptr, x) \
({ \
	__typeof__(*(ptr)) __ret; \
	__typeof__(*(ptr)) _x_ = (x); \
	(__typeof__(*(ptr))) __xchg((ptr), (unsigned long)_x_, \
				    sizeof(*(ptr))); \
	smp_mb(); \
	__ret = (__typeof__(*(ptr))) \
		__xchg((ptr), (unsigned long)_x_, sizeof(*(ptr))); \
	smp_mb(); \
	__ret; \
})

#define cmpxchg(ptr, o, n) \
({ \
	__typeof__(*(ptr)) __ret; \
	__typeof__(*(ptr)) _o_ = (o); \
	__typeof__(*(ptr)) _n_ = (n); \
	(__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \
				       (unsigned long)_n_, sizeof(*(ptr))); \
	smp_mb(); \
	__ret = (__typeof__(*(ptr))) __cmpxchg((ptr), \
		(unsigned long)_o_, (unsigned long)_n_, sizeof(*(ptr))); \
	smp_mb(); \
	__ret; \
})

#define cmpxchg64(ptr, o, n) \
27 changes: 0 additions & 27 deletions arch/alpha/include/asm/xchg.h
@@ -12,18 +12,13 @@
* Atomic exchange.
* Since it can be used to implement critical sections
* it must clobber "memory" (also for interrupts in UP).
*
* The leading and the trailing memory barriers guarantee that these
* operations are fully ordered.
*
*/

static inline unsigned long
____xchg(_u8, volatile char *m, unsigned long val)
{
unsigned long ret, tmp, addr64;

smp_mb();
__asm__ __volatile__(
" andnot %4,7,%3\n"
" insbl %1,%4,%1\n"
@@ -38,7 +33,6 @@ ____xchg(_u8, volatile char *m, unsigned long val)
".previous"
: "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
: "r" ((long)m), "1" (val) : "memory");
smp_mb();

return ret;
}
@@ -48,7 +42,6 @@ ____xchg(_u16, volatile short *m, unsigned long val)
{
unsigned long ret, tmp, addr64;

smp_mb();
__asm__ __volatile__(
" andnot %4,7,%3\n"
" inswl %1,%4,%1\n"
@@ -63,7 +56,6 @@ ____xchg(_u16, volatile short *m, unsigned long val)
".previous"
: "=&r" (ret), "=&r" (val), "=&r" (tmp), "=&r" (addr64)
: "r" ((long)m), "1" (val) : "memory");
smp_mb();

return ret;
}
@@ -73,7 +65,6 @@ ____xchg(_u32, volatile int *m, unsigned long val)
{
unsigned long dummy;

smp_mb();
__asm__ __volatile__(
"1: ldl_l %0,%4\n"
" bis $31,%3,%1\n"
@@ -84,7 +75,6 @@ ____xchg(_u32, volatile int *m, unsigned long val)
".previous"
: "=&r" (val), "=&r" (dummy), "=m" (*m)
: "rI" (val), "m" (*m) : "memory");
smp_mb();

return val;
}
@@ -94,7 +84,6 @@ ____xchg(_u64, volatile long *m, unsigned long val)
{
unsigned long dummy;

smp_mb();
__asm__ __volatile__(
"1: ldq_l %0,%4\n"
" bis $31,%3,%1\n"
@@ -105,7 +94,6 @@ ____xchg(_u64, volatile long *m, unsigned long val)
".previous"
: "=&r" (val), "=&r" (dummy), "=m" (*m)
: "rI" (val), "m" (*m) : "memory");
smp_mb();

return val;
}
@@ -135,21 +123,13 @@ ____xchg(, volatile void *ptr, unsigned long x, int size)
* Atomic compare and exchange. Compare OLD with MEM, if identical,
* store NEW in MEM. Return the initial value in MEM. Success is
* indicated by comparing RETURN with OLD.
*
* The leading and the trailing memory barriers guarantee that these
* operations are fully ordered.
*
* The trailing memory barrier is placed in SMP unconditionally, in
* order to guarantee that dependency ordering is preserved when a
* dependency is headed by an unsuccessful operation.
*/

static inline unsigned long
____cmpxchg(_u8, volatile char *m, unsigned char old, unsigned char new)
{
unsigned long prev, tmp, cmp, addr64;

smp_mb();
__asm__ __volatile__(
" andnot %5,7,%4\n"
" insbl %1,%5,%1\n"
@@ -167,7 +147,6 @@ ____cmpxchg(_u8, volatile char *m, unsigned char old, unsigned char new)
".previous"
: "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64)
: "r" ((long)m), "Ir" (old), "1" (new) : "memory");
smp_mb();

return prev;
}
@@ -177,7 +156,6 @@ ____cmpxchg(_u16, volatile short *m, unsigned short old, unsigned short new)
{
unsigned long prev, tmp, cmp, addr64;

smp_mb();
__asm__ __volatile__(
" andnot %5,7,%4\n"
" inswl %1,%5,%1\n"
@@ -195,7 +173,6 @@ ____cmpxchg(_u16, volatile short *m, unsigned short old, unsigned short new)
".previous"
: "=&r" (prev), "=&r" (new), "=&r" (tmp), "=&r" (cmp), "=&r" (addr64)
: "r" ((long)m), "Ir" (old), "1" (new) : "memory");
smp_mb();

return prev;
}
@@ -205,7 +182,6 @@ ____cmpxchg(_u32, volatile int *m, int old, int new)
{
unsigned long prev, cmp;

smp_mb();
__asm__ __volatile__(
"1: ldl_l %0,%5\n"
" cmpeq %0,%3,%1\n"
@@ -219,7 +195,6 @@ ____cmpxchg(_u32, volatile int *m, int old, int new)
".previous"
: "=&r"(prev), "=&r"(cmp), "=m"(*m)
: "r"((long) old), "r"(new), "m"(*m) : "memory");
smp_mb();

return prev;
}
@@ -229,7 +204,6 @@ ____cmpxchg(_u64, volatile long *m, unsigned long old, unsigned long new)
{
unsigned long prev, cmp;

smp_mb();
__asm__ __volatile__(
"1: ldq_l %0,%5\n"
" cmpeq %0,%3,%1\n"
@@ -243,7 +217,6 @@ ____cmpxchg(_u64, volatile long *m, unsigned long old, unsigned long new)
".previous"
: "=&r"(prev), "=&r"(cmp), "=m"(*m)
: "r"((long) old), "r"(new), "m"(*m) : "memory");
smp_mb();

return prev;
}