Skip to content

Commit

Permalink
Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/ker…
Browse files Browse the repository at this point in the history
…nel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes, perhaps most notably:

      - Throttling callback invocation based on the number of callbacks
        that are now ready to invoke instead of on the total number of
        callbacks

      - Several patches that suppress false-positive boot-time
        diagnostics, for example, due to lockdep not yet being
        initialized

      - Make expedited RCU CPU stall warnings dump stacks of any tasks
        that are blocking the stalled grace period. (Normal RCU CPU
        stall warnings have done this for many years)

      - Lazy-callback fixes to avoid delays during boot, suspend, and
        resume. (Note that lazy callbacks must be explicitly enabled, so
        this should not (yet) affect production use cases)

 - Make kfree_rcu() and friends take advantage of polled grace periods,
   thus reducing memory footprint by almost two orders of magnitude,
   admittedly on a microbenchmark

   This also begins the transition from kfree_rcu(p) to
   kfree_rcu_mightsleep(p). This transition was motivated by bugs where
   kfree_rcu(p), which can block, was typed instead of the intended
   kfree_rcu(p, rh)

 - SRCU updates, perhaps most notably fixing a bug that causes SRCU to
   fail when booted on a system with a non-zero boot CPU. This
   surprising situation actually happens for kdump kernels on the
   powerpc architecture

   This also adds an srcu_down_read() and srcu_up_read(), which act like
   srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
   critical section to be handed off from one task to another

 - Clean up the now-useless SRCU Kconfig option

   There are a few more commits that are not yet acked or pulled into
   maintainer trees, and these will be in a pull request for a later
   merge window

 - RCU-tasks updates, perhaps most notably these fixes:

      - A strange interaction between PID-namespace unshare and the
        RCU-tasks grace period that results in a low-probability but
        very real hang

      - A race between an RCU tasks rude grace period on a single-CPU
        system and CPU-hotplug addition of the second CPU that can
        result in a too-short grace period

      - A race between shrinking RCU tasks down to a single callback
        list and queuing a new callback to some other CPU, but where
        that queuing is delayed for more than an RCU grace period. This
        can result in that callback being stranded on the non-boot CPU

 - Torture-test updates and fixes

 - Torture-test scripting updates and fixes

 - Provide additional RCU CPU stall-warning information in kernels built
   with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
   timeout limit for expedited RCU CPU stall warnings

* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
  rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
  kernel/notifier: Remove CONFIG_SRCU
  init: Remove "select SRCU"
  fs/quota: Remove "select SRCU"
  fs/notify: Remove "select SRCU"
  fs/btrfs: Remove "select SRCU"
  fs: Remove CONFIG_SRCU
  drivers/pci/controller: Remove "select SRCU"
  drivers/net: Remove "select SRCU"
  drivers/md: Remove "select SRCU"
  drivers/hwtracing/stm: Remove "select SRCU"
  drivers/dax: Remove "select SRCU"
  drivers/base: Remove CONFIG_SRCU
  rcu: Disable laziness if lazy-tracking says so
  rcu: Track laziness during boot and suspend
  rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
  rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
  rcu: Align the output of RCU CPU stall warning messages
  rcu: Add RCU stall diagnosis information
  sched: Add helper nr_context_switches_cpu()
  ...
  • Loading branch information
torvalds committed Feb 21, 2023
2 parents 8ca8d89 + bba8d3d commit 8cc01d4
Show file tree
Hide file tree
Showing 54 changed files with 1,734 additions and 839 deletions.
4 changes: 2 additions & 2 deletions Documentation/RCU/NMI-RCU.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ Although RCU is usually used to protect read-mostly data structures,
it is possible to use RCU to provide dynamic non-maskable interrupt
handlers, as well as dynamic irq handlers. This document describes
how to do this, drawing loosely from Zwane Mwaikambo's NMI-timer
work in "arch/x86/kernel/traps.c".
work in an old version of "arch/x86/kernel/traps.c".

The relevant pieces of code are listed below, each followed by a
brief explanation::
Expand Down Expand Up @@ -116,7 +116,7 @@ Answer to Quick Quiz:

This same sad story can happen on other CPUs when using
a compiler with aggressive pointer-value speculation
optimizations.
optimizations. (But please don't!)

More important, the rcu_dereference_sched() makes it
clear to someone reading the code that the pointer is
Expand Down
13 changes: 11 additions & 2 deletions Documentation/RCU/UP.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ by having call_rcu() directly invoke its arguments only if it was called
from process context. However, this can fail in a similar manner.

Suppose that an RCU-based algorithm again scans a linked list containing
elements A, B, and C in process contexts, but that it invokes a function
elements A, B, and C in process context, but that it invokes a function
on each element as it is scanned. Suppose further that this function
deletes element B from the list, then passes it to call_rcu() for deferred
freeing. This may be a bit unconventional, but it is perfectly legal
Expand All @@ -59,7 +59,8 @@ Example 3: Death by Deadlock
Suppose that call_rcu() is invoked while holding a lock, and that the
callback function must acquire this same lock. In this case, if
call_rcu() were to directly invoke the callback, the result would
be self-deadlock.
be self-deadlock *even if* this invocation occurred from a later
call_rcu() invocation a full grace period later.

In some cases, it would possible to restructure to code so that
the call_rcu() is delayed until after the lock is released. However,
Expand All @@ -85,6 +86,14 @@ Quick Quiz #2:

:ref:`Answers to Quick Quiz <answer_quick_quiz_up>`

It is important to note that userspace RCU implementations *do*
permit call_rcu() to directly invoke callbacks, but only if a full
grace period has elapsed since those callbacks were queued. This is
the case because some userspace environments are extremely constrained.
Nevertheless, people writing userspace RCU implementations are strongly
encouraged to avoid invoking callbacks from call_rcu(), thus obtaining
the deadlock-avoidance benefits called out above.

Summary
-------

Expand Down
13 changes: 6 additions & 7 deletions Documentation/RCU/lockdep.rst
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,8 @@ checking of rcu_dereference() primitives:
value of the pointer itself, for example, against NULL.

The rcu_dereference_check() check expression can be any boolean
expression, but would normally include a lockdep expression. However,
any boolean expression can be used. For a moderately ornate example,
consider the following::
expression, but would normally include a lockdep expression. For a
moderately ornate example, consider the following::

file = rcu_dereference_check(fdt->fd[fd],
lockdep_is_held(&files->file_lock) ||
Expand All @@ -97,10 +96,10 @@ code, it could instead be written as follows::
atomic_read(&files->count) == 1);

This would verify cases #2 and #3 above, and furthermore lockdep would
complain if this was used in an RCU read-side critical section unless one
of these two cases held. Because rcu_dereference_protected() omits all
barriers and compiler constraints, it generates better code than do the
other flavors of rcu_dereference(). On the other hand, it is illegal
complain even if this was used in an RCU read-side critical section unless
one of these two cases held. Because rcu_dereference_protected() omits
all barriers and compiler constraints, it generates better code than do
the other flavors of rcu_dereference(). On the other hand, it is illegal
to use rcu_dereference_protected() if either the RCU-protected pointer
or the RCU-protected data that it points to can change concurrently.

Expand Down
6 changes: 4 additions & 2 deletions Documentation/RCU/rcu.rst
Original file line number Diff line number Diff line change
Expand Up @@ -77,15 +77,17 @@ Frequently Asked Questions
search for the string "Patent" in Documentation/RCU/RTFP.txt to find them.
Of these, one was allowed to lapse by the assignee, and the
others have been contributed to the Linux kernel under GPL.
Many (but not all) have long since expired.
There are now also LGPL implementations of user-level RCU
available (https://liburcu.org/).

- I hear that RCU needs work in order to support realtime kernels?

Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
Realtime-friendly RCU are enabled via the CONFIG_PREEMPTION
kernel configuration parameter.

- Where can I find more information on RCU?

See the Documentation/RCU/RTFP.txt file.
Or point your browser at (http://www.rdrop.com/users/paulmck/RCU/).
Or point your browser at (https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit)
or (https://docs.google.com/document/d/1GCdQC8SDbb54W1shjEXqGZ0Rq8a6kIeYutdSIajfpLA/edit?usp=sharing).
21 changes: 16 additions & 5 deletions Documentation/RCU/rcu_dereference.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,9 @@ Follow these rules to keep your RCU code working properly:
can reload the value, and won't your code have fun with two
different values for a single pointer! Without rcu_dereference(),
DEC Alpha can load a pointer, dereference that pointer, and
return data preceding initialization that preceded the store of
the pointer.
return data preceding initialization that preceded the store
of the pointer. (As noted later, in recent kernels READ_ONCE()
also prevents DEC Alpha from playing these tricks.)

In addition, the volatile cast in rcu_dereference() prevents the
compiler from deducing the resulting pointer value. Please see
Expand All @@ -34,7 +35,7 @@ Follow these rules to keep your RCU code working properly:
takes on the role of the lockless_dereference() primitive that
was removed in v4.15.

- You are only permitted to use rcu_dereference on pointer values.
- You are only permitted to use rcu_dereference() on pointer values.
The compiler simply knows too much about integral values to
trust it to carry dependencies through integer operations.
There are a very few exceptions, namely that you can temporarily
Expand Down Expand Up @@ -240,6 +241,7 @@ precautions. To see this, consider the following code fragment::
struct foo *q;
int r1, r2;

rcu_read_lock();
p = rcu_dereference(gp2);
if (p == NULL)
return;
Expand All @@ -248,7 +250,10 @@ precautions. To see this, consider the following code fragment::
if (p == q) {
/* The compiler decides that q->c is same as p->c. */
r2 = p->c; /* Could get 44 on weakly order system. */
} else {
r2 = p->c - r1; /* Unconditional access to p->c. */
}
rcu_read_unlock();
do_something_with(r1, r2);
}

Expand Down Expand Up @@ -297,6 +302,7 @@ Then one approach is to use locking, for example, as follows::
struct foo *q;
int r1, r2;

rcu_read_lock();
p = rcu_dereference(gp2);
if (p == NULL)
return;
Expand All @@ -306,7 +312,12 @@ Then one approach is to use locking, for example, as follows::
if (p == q) {
/* The compiler decides that q->c is same as p->c. */
r2 = p->c; /* Locking guarantees r2 == 144. */
} else {
spin_lock(&q->lock);
r2 = q->c - r1;
spin_unlock(&q->lock);
}
rcu_read_unlock();
spin_unlock(&p->lock);
do_something_with(r1, r2);
}
Expand Down Expand Up @@ -364,7 +375,7 @@ the exact value of "p" even in the not-equals case. This allows the
compiler to make the return values independent of the load from "gp",
in turn destroying the ordering between this load and the loads of the
return values. This can result in "p->b" returning pre-initialization
garbage values.
garbage values on weakly ordered systems.

In short, rcu_dereference() is *not* optional when you are going to
dereference the resulting pointer.
Expand Down Expand Up @@ -430,7 +441,7 @@ member of the rcu_dereference() to use in various situations:
SPARSE CHECKING OF RCU-PROTECTED POINTERS
-----------------------------------------

The sparse static-analysis tool checks for direct access to RCU-protected
The sparse static-analysis tool checks for non-RCU access to RCU-protected
pointers, which can result in "interesting" bugs due to compiler
optimizations involving invented loads and perhaps also load tearing.
For example, suppose someone mistakenly does something like this::
Expand Down
Loading

0 comments on commit 8cc01d4

Please sign in to comment.