Merge tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull KCSAN updates from Paul McKenney:
 "This provides KCSAN fixes and also the ability to take memory barriers
  into account for weakly-ordered systems. This last can increase the
  probability of detecting certain types of data races"
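
The weak-memory modeling is opt-in; a minimal illustrative config fragment
(option names taken from the kcsan.rst changes below; other dependencies not
shown):

  CONFIG_KCSAN=y
  CONFIG_KCSAN_WEAK_MEMORY=y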

* tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  kcsan: Only test clear_bit_unlock_is_negative_byte if arch defines it
  kcsan: Avoid nested contexts reading inconsistent reorder_access
  kcsan: Turn barrier instrumentation into macros
  kcsan: Make barrier tests compatible with lockdep
  kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
  compiler_attributes.h: Add __disable_sanitizer_instrumentation
  objtool, kcsan: Remove memory barrier instrumentation from noinstr
  objtool, kcsan: Add memory barrier instrumentation to whitelist
  sched, kcsan: Enable memory barrier instrumentation
  mm, kcsan: Enable barrier instrumentation
  x86/qspinlock, kcsan: Instrument barrier of pv_queued_spin_unlock()
  x86/barriers, kcsan: Use generic instrumentation for non-smp barriers
  asm-generic/bitops, kcsan: Add instrumentation for barriers
  locking/atomics, kcsan: Add instrumentation for barriers
  locking/barriers, kcsan: Support generic instrumentation
  locking/barriers, kcsan: Add instrumentation for barriers
  kcsan: selftest: Add test case to check memory barrier instrumentation
  kcsan: Ignore GCC 11+ warnings about TSan runtime support
  kcsan: test: Add test cases for memory barrier instrumentation
  kcsan: test: Match reordered or normal accesses
  ...
torvalds committed Jan 11, 2022
2 parents 1c824bf + b473a38 commit 1be5bdf
Showing 27 changed files with 1,347 additions and 172 deletions.
76 changes: 63 additions & 13 deletions Documentation/dev-tools/kcsan.rst
@@ -204,17 +204,17 @@ Ultimately this allows to determine the possible executions of concurrent code,
and if that code is free from data races.

KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
``atomic_*``, etc.), but is oblivious of any ordering guarantees and simply
assumes that memory barriers are placed correctly. In other words, KCSAN
assumes that as long as a plain access is not observed to race with another
conflicting access, memory operations are correctly ordered.

This means that KCSAN will not report *potential* data races due to missing
memory ordering. Developers should therefore carefully consider the required
memory ordering requirements that remain unchecked. If, however, missing
memory ordering (that is observable with a particular compiler and
architecture) leads to an observable data race (e.g. entering a critical
section erroneously), KCSAN would report the resulting data race.
``atomic_*``, etc.), and a subset of ordering guarantees implied by memory
barriers. With ``CONFIG_KCSAN_WEAK_MEMORY=y``, KCSAN models load or store
buffering, and can detect missing ``smp_mb()``, ``smp_wmb()``, ``smp_rmb()``,
``smp_store_release()``, and all ``atomic_*`` operations with equivalent
implied barriers.

Note that KCSAN will not report all data races due to missing memory ordering,
specifically where a memory barrier would be required to prohibit subsequent
memory operations from reordering before the barrier. Developers should
therefore carefully consider the memory ordering requirements that remain
unchecked.

Race Detection Beyond Data Races
--------------------------------
@@ -268,6 +268,56 @@ marked operations, if all accesses to a variable that is accessed concurrently
are properly marked, KCSAN will never trigger a watchpoint and therefore never
report the accesses.

Modeling Weak Memory
~~~~~~~~~~~~~~~~~~~~

KCSAN's approach to detecting data races due to missing memory barriers is
based on modeling access reordering (with ``CONFIG_KCSAN_WEAK_MEMORY=y``).
Each plain memory access for which a watchpoint is set up is also selected for
simulated reordering within the scope of its function (at most one in-flight
access).

Once an access has been selected for reordering, it is checked again alongside
every subsequent access until the end of the function scope. If an appropriate
memory barrier is encountered, the access is no longer considered for simulated
reordering.
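
Purely as an illustration of the mechanism described above (not the actual
KCSAN runtime; the names below are made up), the selection and flushing logic
can be sketched as::

  #include <stddef.h>

  /* At most one in-flight "reordered" access per function scope. */
  static struct reorder_access {
          const volatile void *ptr;
          size_t size;
          int valid;
  } reorder;

  /* When a watchpoint is set up for a plain access, possibly also select
   * the access for simulated reordering. */
  static void maybe_select_for_reorder(const volatile void *ptr, size_t size)
  {
          if (!reorder.valid) {
                  reorder.ptr = ptr;
                  reorder.size = size;
                  reorder.valid = 1;
          }
  }

  /* At every subsequent access in the same function, the delayed access is
   * checked again for conflicts with concurrent accesses. */
  static void recheck_reordered_access(void)
  {
          if (reorder.valid) {
                  /* check reorder.ptr / reorder.size for data races here */
          }
  }

  /* An appropriate memory barrier flushes the access: it is no longer
   * considered for simulated reordering. */
  static void flush_reordered_access(void)
  {
          reorder.valid = 0;
  }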

When the result of a memory operation should be ordered by a barrier, KCSAN can
then detect data races where the conflict only occurs as a result of a missing
barrier. Consider the example::

  int x, flag;
  void T1(void)
  {
          x = 1;                       // data race!
          WRITE_ONCE(flag, 1);         // correct: smp_store_release(&flag, 1)
  }
  void T2(void)
  {
          while (!READ_ONCE(flag));    // correct: smp_load_acquire(&flag)
          ... = x;                     // data race!
  }

When weak memory modeling is enabled, KCSAN can consider ``x`` in ``T1`` for
simulated reordering. After the write of ``flag``, ``x`` is again checked for
concurrent accesses: because ``T2`` is able to proceed after the write of
``flag``, a data race is detected. With the correct barriers in place, ``x``
would not be considered for reordering after the proper release of ``flag``,
and no data race would be detected.
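
For comparison, the corrected version suggested by the comments in the example
above (a sketch using ``smp_store_release()`` and ``smp_load_acquire()``)
would be::

  int x, flag;
  void T1(void)
  {
          x = 1;                       // no data race
          smp_store_release(&flag, 1);
  }
  void T2(void)
  {
          while (!smp_load_acquire(&flag));
          ... = x;                     // no data race
  }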

Deliberate trade-offs in complexity, as well as practical limitations, mean that
only a subset of data races due to missing memory barriers can be detected. With
currently available compiler support, the implementation is limited to modeling
the effects of "buffering" (delaying accesses), since the runtime cannot
"prefetch" accesses. Also recall that watchpoints are only set up for plain
accesses, which are therefore the only access type for which KCSAN simulates
reordering; reordering of marked accesses is not modeled.

A consequence of the above is that acquire operations do not require barrier
instrumentation (no prefetching). Furthermore, marked accesses introducing
address or control dependencies do not require special handling (the marked
access cannot be reordered, later dependent accesses cannot be prefetched).
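
For example (illustrative only; the names below are made up), a marked load
carrying an address dependency needs no additional instrumentation::

  struct foo *p = READ_ONCE(shared_ptr);  // marked: not simulated for reordering
  val = p->data;                          // dependent access: cannot be "prefetched"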

Key Properties
~~~~~~~~~~~~~~

@@ -290,8 +340,8 @@ Key Properties
4. **Detects Racy Writes from Devices:** Due to checking data values upon
setting up watchpoints, racy writes from devices can also be detected.

5. **Memory Ordering:** KCSAN is *not* explicitly aware of the LKMM's ordering
rules; this may result in missed data races (false negatives).
5. **Memory Ordering:** KCSAN is aware of only a subset of LKMM ordering rules;
this may result in missed data races (false negatives).

6. **Analysis Accuracy:** For observed executions, due to using a sampling
strategy, the analysis is *unsound* (false negatives possible), but aims to
10 changes: 5 additions & 5 deletions arch/x86/include/asm/barrier.h
@@ -19,9 +19,9 @@
#define wmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "sfence", \
X86_FEATURE_XMM2) ::: "memory", "cc")
#else
#define mb() asm volatile("mfence":::"memory")
#define rmb() asm volatile("lfence":::"memory")
#define wmb() asm volatile("sfence" ::: "memory")
#define __mb() asm volatile("mfence":::"memory")
#define __rmb() asm volatile("lfence":::"memory")
#define __wmb() asm volatile("sfence" ::: "memory")
#endif

/**
@@ -51,8 +51,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
/* Prevent speculative execution past this barrier. */
#define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)

#define dma_rmb() barrier()
#define dma_wmb() barrier()
#define __dma_rmb() barrier()
#define __dma_wmb() barrier()

#define __smp_mb() asm volatile("lock; addl $0,-4(%%" _ASM_SP ")" ::: "memory", "cc")

1 change: 1 addition & 0 deletions arch/x86/include/asm/qspinlock.h
@@ -53,6 +53,7 @@ static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)

static inline void queued_spin_unlock(struct qspinlock *lock)
{
kcsan_release();
pv_queued_spin_unlock(lock);
}

54 changes: 40 additions & 14 deletions include/asm-generic/barrier.h
@@ -14,12 +14,38 @@
#ifndef __ASSEMBLY__

#include <linux/compiler.h>
#include <linux/kcsan-checks.h>
#include <asm/rwonce.h>

#ifndef nop
#define nop() asm volatile ("nop")
#endif

/*
* Architectures that want generic instrumentation can define __ prefixed
* variants of all barriers.
*/

#ifdef __mb
#define mb() do { kcsan_mb(); __mb(); } while (0)
#endif

#ifdef __rmb
#define rmb() do { kcsan_rmb(); __rmb(); } while (0)
#endif

#ifdef __wmb
#define wmb() do { kcsan_wmb(); __wmb(); } while (0)
#endif

#ifdef __dma_rmb
#define dma_rmb() do { kcsan_rmb(); __dma_rmb(); } while (0)
#endif

#ifdef __dma_wmb
#define dma_wmb() do { kcsan_wmb(); __dma_wmb(); } while (0)
#endif
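
An architecture opts in simply by providing the __ prefixed primitive, as the
x86 hunk above does; a hypothetical sketch (the arch name and mnemonic below
are made up):

  /* arch/foo/include/asm/barrier.h -- hypothetical */
  #define __mb()  asm volatile("fence" ::: "memory")
  /* <asm-generic/barrier.h> then defines mb() as
   * do { kcsan_mb(); __mb(); } while (0), as seen above. */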

/*
* Force strict CPU ordering. And yes, this is required on UP too when we're
* talking to devices.
@@ -62,15 +88,15 @@
#ifdef CONFIG_SMP

#ifndef smp_mb
#define smp_mb() __smp_mb()
#define smp_mb() do { kcsan_mb(); __smp_mb(); } while (0)
#endif

#ifndef smp_rmb
#define smp_rmb() __smp_rmb()
#define smp_rmb() do { kcsan_rmb(); __smp_rmb(); } while (0)
#endif

#ifndef smp_wmb
#define smp_wmb() __smp_wmb()
#define smp_wmb() do { kcsan_wmb(); __smp_wmb(); } while (0)
#endif

#else /* !CONFIG_SMP */
@@ -123,19 +149,19 @@ do { \
#ifdef CONFIG_SMP

#ifndef smp_store_mb
#define smp_store_mb(var, value) __smp_store_mb(var, value)
#define smp_store_mb(var, value) do { kcsan_mb(); __smp_store_mb(var, value); } while (0)
#endif

#ifndef smp_mb__before_atomic
#define smp_mb__before_atomic() __smp_mb__before_atomic()
#define smp_mb__before_atomic() do { kcsan_mb(); __smp_mb__before_atomic(); } while (0)
#endif

#ifndef smp_mb__after_atomic
#define smp_mb__after_atomic() __smp_mb__after_atomic()
#define smp_mb__after_atomic() do { kcsan_mb(); __smp_mb__after_atomic(); } while (0)
#endif

#ifndef smp_store_release
#define smp_store_release(p, v) __smp_store_release(p, v)
#define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
#endif

#ifndef smp_load_acquire
@@ -178,13 +204,13 @@ do { \
#endif /* CONFIG_SMP */

/* Barriers for virtual machine guests when talking to an SMP host */
#define virt_mb() __smp_mb()
#define virt_rmb() __smp_rmb()
#define virt_wmb() __smp_wmb()
#define virt_store_mb(var, value) __smp_store_mb(var, value)
#define virt_mb__before_atomic() __smp_mb__before_atomic()
#define virt_mb__after_atomic() __smp_mb__after_atomic()
#define virt_store_release(p, v) __smp_store_release(p, v)
#define virt_mb() do { kcsan_mb(); __smp_mb(); } while (0)
#define virt_rmb() do { kcsan_rmb(); __smp_rmb(); } while (0)
#define virt_wmb() do { kcsan_wmb(); __smp_wmb(); } while (0)
#define virt_store_mb(var, value) do { kcsan_mb(); __smp_store_mb(var, value); } while (0)
#define virt_mb__before_atomic() do { kcsan_mb(); __smp_mb__before_atomic(); } while (0)
#define virt_mb__after_atomic() do { kcsan_mb(); __smp_mb__after_atomic(); } while (0)
#define virt_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
#define virt_load_acquire(p) __smp_load_acquire(p)

/**
3 changes: 3 additions & 0 deletions include/asm-generic/bitops/instrumented-atomic.h
@@ -67,6 +67,7 @@ static inline void change_bit(long nr, volatile unsigned long *addr)
*/
static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
{
kcsan_mb();
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
return arch_test_and_set_bit(nr, addr);
}
@@ -80,6 +81,7 @@ static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
*/
static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
{
kcsan_mb();
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
return arch_test_and_clear_bit(nr, addr);
}
@@ -93,6 +95,7 @@ static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
*/
static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
{
kcsan_mb();
instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
return arch_test_and_change_bit(nr, addr);
}
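
The kcsan_mb() calls above reflect that these value-returning RMW bitops imply
full memory barriers; a hypothetical pattern relying on that ordering (all
names below are made up, a sketch only):

  #include <linux/bitops.h>

  static unsigned long flags;
  static int data, snapshot;

  static void publish(void)
  {
          data = 1;
          test_and_set_bit(0, &flags);        /* fully ordered RMW */
  }

  static void consume(void)
  {
          if (test_and_clear_bit(0, &flags))  /* fully ordered RMW */
                  snapshot = data;            /* observes data == 1 */
  }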
3 changes: 3 additions & 0 deletions include/asm-generic/bitops/instrumented-lock.h
@@ -22,6 +22,7 @@
*/
static inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
{
kcsan_release();
instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
arch_clear_bit_unlock(nr, addr);
}
@@ -37,6 +38,7 @@ static inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
*/
static inline void __clear_bit_unlock(long nr, volatile unsigned long *addr)
{
kcsan_release();
instrument_write(addr + BIT_WORD(nr), sizeof(long));
arch___clear_bit_unlock(nr, addr);
}
@@ -71,6 +73,7 @@ static inline bool test_and_set_bit_lock(long nr, volatile unsigned long *addr)
static inline bool
clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
{
kcsan_release();
instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
return arch_clear_bit_unlock_is_negative_byte(nr, addr);
}
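
The kcsan_release() calls above reflect the release semantics of these unlock
operations; a hypothetical lock-bit sketch pairing clear_bit_unlock() with the
acquire-style test_and_set_bit_lock() from this file (names are made up):

  #include <linux/bitops.h>

  static unsigned long lockword;

  static void locked_section(void)
  {
          while (test_and_set_bit_lock(0, &lockword))  /* acquire on success */
                  cpu_relax();
          /* ... critical section ... */
          clear_bit_unlock(0, &lockword);              /* release */
  }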