Merge tag 'locking-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking and atomics updates from Thomas Gleixner:
 "The regular pile:

   - A few improvements to the mutex code

   - Documentation updates for atomics to clarify the difference between
     cmpxchg() and try_cmpxchg() and to explain the forward progress
     expectations.

   - Simplification of the atomics fallback generator

   - The addition of arch_atomic_long*() variants and generic arch_*()
     bitops based on them.

   - Add the missing might_sleep() invocations to the down*() operations
     of semaphores.

  The PREEMPT_RT locking core:

   - Scheduler updates to support the state preserving mechanism for
     'sleeping' spin- and rwlocks on RT.

     This mechanism is carefully preserving the state of the task when
     blocking on a 'sleeping' spin- or rwlock and takes regular wake-ups
     targeted at the same task into account. The preserved or updated
     (via a regular wakeup) state is restored when the lock has been
     acquired.

   - Restructuring of the rtmutex code so it can be utilized and
     extended for the RT specific lock variants.

   - Restructuring of the ww_mutex code to allow sharing of the ww_mutex
     specific functionality for rtmutex based ww_mutexes.

   - Header file disentangling to allow substitution of the regular lock
     implementations with the PREEMPT_RT variants without creating an
     unmaintainable #ifdef mess.

   - Shared base code for the PREEMPT_RT specific rw_semaphore and
     rwlock implementations.

     Contrary to the regular rw_semaphores and rwlocks the PREEMPT_RT
     implementation is writer unfair because it is infeasible to do
     priority inheritance on multiple readers. Experience over the years
     has shown that real-time workloads are not the typical workloads
     which are sensitive to writer starvation.

     The alternative solution would be to allow only a single reader
     which has been tried and discarded as it is a major bottleneck
     especially for mmap_sem. Aside of that many of the writer
     starvation critical usage sites have been converted to a writer
     side mutex/spinlock and RCU read side protections in the past
     decade so that the issue is less prominent than it used to be.

   - The actual rtmutex based lock substitutions for PREEMPT_RT enabled
     kernels which affect mutex, ww_mutex, rw_semaphore, spinlock_t and
     rwlock_t. The spin/rw_lock*() functions disable migration across
     the critical section to preserve the existing semantics vs per-CPU
     variables.

   - Rework of the futex REQUEUE_PI mechanism to handle the case of
     early wake-ups which interleave with a re-queue operation to
     prevent the situation that a task would be blocked on both the
     rtmutex associated to the outer futex and the rtmutex based hash
     bucket spinlock.

     While this situation cannot happen on !RT enabled kernels the
     changes make the underlying concurrency problems easier to
     understand in general. As a result the difference between !RT and
     RT kernels is reduced to the handling of waiting for the critical
     section. !RT kernels simply spin-wait as before and RT kernels
     utilize rcu_wait().

   - The substitution of local_lock for PREEMPT_RT with a spinlock which
     protects the critical section while staying preemptible. The CPU
     locality is established by disabling migration.

  The underlying concepts of this code have been in use in PREEMPT_RT for
  way more than a decade. The code has been refactored several times over
  the years and this final incarnation has been optimized once again to be
  as non-intrusive as possible, i.e. the RT specific parts are mostly
  isolated.

  It has been extensively tested in the 5.14-rt patch series and it has
  been verified that !RT kernels are not affected by these changes"

* tag 'locking-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (92 commits)
  locking/rtmutex: Return success on deadlock for ww_mutex waiters
  locking/rtmutex: Prevent spurious EDEADLK return caused by ww_mutexes
  locking/rtmutex: Dequeue waiter on ww_mutex deadlock
  locking/rtmutex: Dont dereference waiter lockless
  locking/semaphore: Add might_sleep() to down_*() family
  locking/ww_mutex: Initialize waiter.ww_ctx properly
  static_call: Update API documentation
  locking/local_lock: Add PREEMPT_RT support
  locking/spinlock/rt: Prepare for RT local_lock
  locking/rtmutex: Add adaptive spinwait mechanism
  locking/rtmutex: Implement equal priority lock stealing
  preempt: Adjust PREEMPT_LOCK_OFFSET for RT
  locking/rtmutex: Prevent lockdep false positive with PI futexes
  futex: Prevent requeue_pi() lock nesting issue on RT
  futex: Simplify handle_early_requeue_pi_wakeup()
  futex: Reorder sanity checks in futex_requeue()
  futex: Clarify comment in futex_requeue()
  futex: Restructure futex_requeue()
  futex: Correct the number of requeued waiters for PI
  futex: Remove bogus condition for requeue PI
  ...
torvalds committed Aug 30, 2021
2 parents 08403e2 + a055fcc commit e5e726f
Showing 76 changed files with 5,941 additions and 2,791 deletions.
94 changes: 94 additions & 0 deletions Documentation/atomic_t.txt
@@ -271,3 +271,97 @@ WRITE_ONCE. Thus:
SC *y, t;

is allowed.


CMPXCHG vs TRY_CMPXCHG
----------------------

int atomic_cmpxchg(atomic_t *ptr, int old, int new);
bool atomic_try_cmpxchg(atomic_t *ptr, int *oldp, int new);

Both provide the same functionality, but try_cmpxchg() can lead to more
compact code. The functions relate like:

bool atomic_try_cmpxchg(atomic_t *ptr, int *oldp, int new)
{
  int ret, old = *oldp;
  ret = atomic_cmpxchg(ptr, old, new);
  if (ret != old)
    *oldp = ret;
  return ret == old;
}

and:

int atomic_cmpxchg(atomic_t *ptr, int old, int new)
{
  (void)atomic_try_cmpxchg(ptr, &old, new);
  return old;
}

Usage:

old = atomic_read(&v);                    old = atomic_read(&v);
for (;;) {                                do {
  new = func(old);                          new = func(old);
  tmp = atomic_cmpxchg(&v, old, new);     } while (!atomic_try_cmpxchg(&v, &old, new));
  if (tmp == old)
    break;
  old = tmp;
}

NB. try_cmpxchg() also generates better code on some platforms (notably x86)
where the function more closely matches the hardware instruction.
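
The relationship above can be tried out in userspace. The following is a minimal sketch, assuming C11 <stdatomic.h> as a stand-in for the kernel atomics; model_try_cmpxchg(), model_cmpxchg(), add_capped() and the update_with_*() helpers are hypothetical names for illustration and are not part of the kernel API. It relies on atomic_compare_exchange_strong() writing the observed value back into *oldp on failure, which is the same contract as try_cmpxchg().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for atomic_try_cmpxchg(): on failure, *oldp receives the
 * value that was actually found in *ptr. */
static bool model_try_cmpxchg(atomic_int *ptr, int *oldp, int new)
{
	return atomic_compare_exchange_strong(ptr, oldp, new);
}

/* Stand-in for atomic_cmpxchg(), built on the try_ variant: returns the
 * value observed in *ptr, which equals 'old' only on success. */
static int model_cmpxchg(atomic_int *ptr, int old, int new)
{
	(void)model_try_cmpxchg(ptr, &old, new);
	return old;
}

/* Hypothetical func(): increment, but never go above 10. */
static int add_capped(int old)
{
	return old < 10 ? old + 1 : old;
}

/* Left-hand column of the usage example: cmpxchg() style loop. */
static int update_with_cmpxchg(atomic_int *v)
{
	int old = atomic_load(v), new, tmp;

	for (;;) {
		new = add_capped(old);
		tmp = model_cmpxchg(v, old, new);
		if (tmp == old)
			break;
		old = tmp;	/* retry with the value we lost against */
	}
	return new;
}

/* Right-hand column: try_cmpxchg() refreshes 'old' for us on failure. */
static int update_with_try_cmpxchg(atomic_int *v)
{
	int old = atomic_load(v), new;

	do {
		new = add_capped(old);
	} while (!model_try_cmpxchg(v, &old, new));
	return new;
}
```

Both helpers store the same value for the same starting state; the try_ form simply has no separate tmp variable and no explicit reload step.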


FORWARD PROGRESS
----------------

In general strong forward progress is expected of all unconditional atomic
operations -- those in the Arithmetic and Bitwise classes and xchg(). However
a fair amount of code also requires forward progress from the conditional
atomic operations.

Specifically 'simple' cmpxchg() loops are expected to not starve one another
indefinitely. However, this is not evident on LL/SC architectures, because
while an LL/SC architecture 'can/should/must' provide forward progress
guarantees between competing LL/SC sections, such a guarantee does not
transfer to cmpxchg() implemented using LL/SC. Consider:

old = atomic_read(&v);
do {
  new = func(old);
} while (!atomic_try_cmpxchg(&v, &old, new));

which on LL/SC becomes something like:

old = atomic_read(&v);
do {
  new = func(old);
} while (!({
  asm volatile ("1: LL  %[oldval], %[v]\n"
                "   CMP %[oldval], %[old]\n"
                "   BNE 2f\n"
                "   SC  %[new], %[v]\n"
                "   BNE 1b\n"
                "2:\n"
                : [oldval] "=&r" (oldval), [v] "m" (v)
                : [old] "r" (old), [new] "r" (new)
                : "memory");
  success = (oldval == old);
  if (!success)
    old = oldval;
  success; }));

However, even the forward branch from the failed compare can cause the LL/SC
to fail on some architectures, let alone whatever the compiler makes of the C
loop body. As a result there is no guarantee whatsoever that the cacheline
containing @v will stay on the local CPU and that progress is made.

Even native CAS architectures can fail to provide forward progress for their
primitive (see Sparc64 for an example).

Such implementations are strongly encouraged to add exponential backoff loops
to a failed CAS in order to ensure some progress. Affected architectures are
also strongly encouraged to inspect/audit the atomic fallbacks, refcount_t and
their locking primitives.
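
As an illustration of the backoff recommendation above, here is a minimal userspace sketch, assuming C11 <stdatomic.h> rather than the kernel atomics. The names relax_once(), backoff_wait() and add_with_backoff(), and the BACKOFF_MIN/BACKOFF_MAX bounds, are hypothetical and chosen arbitrarily; a real architecture would use its own pause/relax primitive and tuned limits.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define BACKOFF_MIN 1u		/* assumed initial delay, in relax iterations */
#define BACKOFF_MAX 1024u	/* assumed cap so the delay stays bounded */

/* Hypothetical stand-in for a cpu_relax()-style hint; here it is only a
 * compiler barrier that keeps the delay loop from being optimized away. */
static inline void relax_once(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Spin for *delay iterations, then double the delay up to BACKOFF_MAX. */
static void backoff_wait(unsigned int *delay)
{
	for (unsigned int i = 0; i < *delay; i++)
		relax_once();
	if (*delay < BACKOFF_MAX)
		*delay *= 2;
}

/* Atomically add 'delta' to *v with a CAS loop that backs off after every
 * failed attempt. Returns the value that was stored. */
static int add_with_backoff(atomic_int *v, int delta)
{
	unsigned int delay = BACKOFF_MIN;
	int old = atomic_load_explicit(v, memory_order_relaxed);

	/* On failure the weak CAS refreshes 'old' with the current value. */
	while (!atomic_compare_exchange_weak(v, &old, old + delta))
		backoff_wait(&delay);

	return old + delta;
}
```

The doubling delay gives a competing CPU a growing window in which to complete its own update, which is the point of the recommendation; whether this is sufficient on a given architecture is not something this sketch can establish.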
4 changes: 2 additions & 2 deletions drivers/staging/media/atomisp/pci/atomisp_ioctl.c
@@ -1904,8 +1904,8 @@ int __atomisp_streamoff(struct file *file, void *fh, enum v4l2_buf_type type)
dev_dbg(isp->dev, "Stop stream on pad %d for asd%d\n",
atomisp_subdev_source_pad(vdev), asd->index);

- BUG_ON(!rt_mutex_is_locked(&isp->mutex));
- BUG_ON(!mutex_is_locked(&isp->streamoff_mutex));
+ lockdep_assert_held(&isp->mutex);
+ lockdep_assert_held(&isp->streamoff_mutex);

if (type != V4L2_BUF_TYPE_VIDEO_CAPTURE) {
dev_dbg(isp->dev, "unsupported v4l2 buf type\n");