Merge tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU changes from Ingo Molnar:

 - Debugging for smp_call_function()

 - RT raw/non-raw lock ordering fixes

 - Strict grace periods for KASAN

 - New smp_call_function() torture test

 - Torture-test updates

 - Documentation updates

 - Miscellaneous fixes

[ This doesn't actually pull the tag - I've dropped the last merge from
  the RCU branch due to questions about the series.   - Linus ]

* tag 'core-rcu-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
  smp: Make symbol 'csd_bug_count' static
  kernel/smp: Provide CSD lock timeout diagnostics
  smp: Add source and destination CPUs to __call_single_data
  rcu: Shrink each possible cpu krcp
  rcu/segcblist: Prevent useless GP start if no CBs to accelerate
  torture: Add gdb support
  rcutorture: Allow pointer leaks to test diagnostic code
  rcutorture: Hoist OOM registry up one level
  refperf: Avoid null pointer dereference when buf fails to allocate
  rcutorture: Properly synchronize with OOM notifier
  rcutorture: Properly set rcu_fwds for OOM handling
  torture: Add kvm.sh --help and update help message
  rcutorture: Add CONFIG_PROVE_RCU_LIST to TREE05
  torture: Update initrd documentation
  rcutorture: Replace HTTP links with HTTPS ones
  locktorture: Make function torture_percpu_rwsem_init() static
  torture: document --allcpus argument added to the kvm.sh script
  rcutorture: Output number of elapsed grace periods
  rcutorture: Remove KCSAN stubs
  rcu: Remove unused "cpu" parameter from rcu_report_qs_rdp()
  ...
torvalds committed Oct 18, 2020
2 parents 373014b + b36c830 commit 41eea65
Showing 57 changed files with 1,582 additions and 421 deletions.
@@ -963,7 +963,7 @@ exit and perhaps also vice versa. Therefore, whenever the
``->dynticks_nesting`` field is incremented up from zero, the
``->dynticks_nmi_nesting`` field is set to a large positive number, and
whenever the ``->dynticks_nesting`` field is decremented down to zero,
the the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that
the number of misnested interrupts is not sufficient to overflow the
counter, this approach corrects the ``->dynticks_nmi_nesting`` field
every time the corresponding CPU enters the idle loop from process
4 changes: 2 additions & 2 deletions Documentation/RCU/Design/Requirements/Requirements.rst
@@ -2162,7 +2162,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
this sort of thing.
#. If a CPU is in a portion of the kernel that is absolutely positively
no-joking guaranteed to never execute any RCU read-side critical
sections, and RCU believes this CPU to to be idle, no problem. This
sections, and RCU believes this CPU to be idle, no problem. This
sort of thing is used by some architectures for light-weight
exception handlers, which can then avoid the overhead of
``rcu_irq_enter()`` and ``rcu_irq_exit()`` at exception entry and
@@ -2431,7 +2431,7 @@ However, there are legitimate preemptible-RCU implementations that do
not have this property, given that any point in the code outside of an
RCU read-side critical section can be a quiescent state. Therefore,
*RCU-sched* was created, which follows “classic” RCU in that an
RCU-sched grace period waits for for pre-existing interrupt and NMI
RCU-sched grace period waits for pre-existing interrupt and NMI
handlers. In kernels built with ``CONFIG_PREEMPT=n``, the RCU and
RCU-sched APIs have identical implementations, while kernels built with
``CONFIG_PREEMPT=y`` provide a separate implementation for each.
2 changes: 1 addition & 1 deletion Documentation/RCU/whatisRCU.rst
@@ -360,7 +360,7 @@ order to amortize their overhead over many uses of the corresponding APIs.

There are at least three flavors of RCU usage in the Linux kernel. The diagram
above shows the most common one. On the updater side, the rcu_assign_pointer(),
sychronize_rcu() and call_rcu() primitives used are the same for all three
synchronize_rcu() and call_rcu() primitives used are the same for all three
flavors. However for protection (on the reader side), the primitives used vary
depending on the flavor:

153 changes: 135 additions & 18 deletions Documentation/admin-guide/kernel-parameters.txt
@@ -3095,6 +3095,10 @@
and gids from such clients. This is intended to ease
migration from NFSv2/v3.

nmi_backtrace.backtrace_idle [KNL]
Dump stacks even of idle CPUs in response to an
NMI stack-backtrace request.

nmi_debug= [KNL,SH] Specify one or more actions to take
when a NMI is triggered.
Format: [state][,regs][,debounce][,die]
@@ -4174,46 +4178,55 @@
This wake_up() will be accompanied by a
WARN_ONCE() splat and an ftrace_dump().

rcutree.rcu_unlock_delay= [KNL]
In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
this specifies an rcu_read_unlock()-time delay
in microseconds. This defaults to zero.
Larger delays increase the probability of
catching RCU pointer leaks, that is, buggy use
of RCU-protected pointers after the relevant
rcu_read_unlock() has completed.

rcutree.sysrq_rcu= [KNL]
Commandeer a sysrq key to dump out Tree RCU's
rcu_node tree with an eye towards determining
why a new grace period has not yet started.

rcuperf.gp_async= [KNL]
rcuscale.gp_async= [KNL]
Measure performance of asynchronous
grace-period primitives such as call_rcu().

rcuperf.gp_async_max= [KNL]
rcuscale.gp_async_max= [KNL]
Specify the maximum number of outstanding
callbacks per writer thread. When a writer
thread exceeds this limit, it invokes the
corresponding flavor of rcu_barrier() to allow
previously posted callbacks to drain.

rcuperf.gp_exp= [KNL]
rcuscale.gp_exp= [KNL]
Measure performance of expedited synchronous
grace-period primitives.

rcuperf.holdoff= [KNL]
rcuscale.holdoff= [KNL]
Set test-start holdoff period. The purpose of
this parameter is to delay the start of the
test until boot completes in order to avoid
interference.

rcuperf.kfree_rcu_test= [KNL]
rcuscale.kfree_rcu_test= [KNL]
Set to measure performance of kfree_rcu() flooding.

rcuperf.kfree_nthreads= [KNL]
rcuscale.kfree_nthreads= [KNL]
The number of threads running loops of kfree_rcu().

rcuperf.kfree_alloc_num= [KNL]
rcuscale.kfree_alloc_num= [KNL]
Number of allocations and frees done in an iteration.

rcuperf.kfree_loops= [KNL]
Number of loops doing rcuperf.kfree_alloc_num number
rcuscale.kfree_loops= [KNL]
Number of loops doing rcuscale.kfree_alloc_num number
of allocations and frees.

rcuperf.nreaders= [KNL]
rcuscale.nreaders= [KNL]
Set number of RCU readers. The value -1 selects
N, where N is the number of CPUs. A value
"n" less than -1 selects N-n+1, where N is again
@@ -4222,23 +4235,23 @@
A value of "n" less than or equal to -N selects
a single reader.

rcuperf.nwriters= [KNL]
rcuscale.nwriters= [KNL]
Set number of RCU writers. The values operate
the same as for rcuperf.nreaders.
the same as for rcuscale.nreaders.
N, where N is the number of CPUs

rcuperf.perf_type= [KNL]
rcuscale.perf_type= [KNL]
Specify the RCU implementation to test.

rcuperf.shutdown= [KNL]
rcuscale.shutdown= [KNL]
Shut the system down after performance tests
complete. This is useful for hands-off automated
testing.

rcuperf.verbose= [KNL]
rcuscale.verbose= [KNL]
Enable additional printk() statements.

rcuperf.writer_holdoff= [KNL]
rcuscale.writer_holdoff= [KNL]
Write-side holdoff between grace periods,
in microseconds. The default of zero says
no holdoff.
@@ -4291,6 +4304,18 @@
are zero, rcutorture acts as if is interpreted
they are all non-zero.

rcutorture.irqreader= [KNL]
Run RCU readers from irq handlers, or, more
accurately, from a timer handler. Not all RCU
flavors take kindly to this sort of thing.

rcutorture.leakpointer= [KNL]
Leak an RCU-protected pointer out of the reader.
This can of course result in splats, and is
intended to test the ability of things like
CONFIG_RCU_STRICT_GRACE_PERIOD=y to detect
such leaks.

rcutorture.n_barrier_cbs= [KNL]
Set callbacks/threads for rcu_barrier() testing.

@@ -4512,8 +4537,8 @@
refscale.shutdown= [KNL]
Shut down the system at the end of the performance
test. This defaults to 1 (shut it down) when
rcuperf is built into the kernel and to 0 (leave
it running) when rcuperf is built as a module.
refscale is built into the kernel and to 0 (leave
it running) when refscale is built as a module.

refscale.verbose= [KNL]
Enable additional printk() statements.
@@ -4659,6 +4684,98 @@
Format: integer between 0 and 10
Default is 0.

scftorture.holdoff= [KNL]
Number of seconds to hold off before starting
test. Defaults to zero for module insertion and
to 10 seconds for built-in smp_call_function()
tests.

scftorture.longwait= [KNL]
Request ridiculously long waits randomly selected
up to the chosen limit in seconds. Zero (the
default) disables this feature. Please note
that requesting even small non-zero numbers of
seconds can result in RCU CPU stall warnings,
softlockup complaints, and so on.

scftorture.nthreads= [KNL]
Number of kthreads to spawn to invoke the
smp_call_function() family of functions.
The default of -1 specifies a number of kthreads
equal to the number of CPUs.

scftorture.onoff_holdoff= [KNL]
Number seconds to wait after the start of the
test before initiating CPU-hotplug operations.

scftorture.onoff_interval= [KNL]
Number seconds to wait between successive
CPU-hotplug operations. Specifying zero (which
is the default) disables CPU-hotplug operations.

scftorture.shutdown_secs= [KNL]
The number of seconds following the start of the
test after which to shut down the system. The
default of zero avoids shutting down the system.
Non-zero values are useful for automated tests.

scftorture.stat_interval= [KNL]
The number of seconds between outputting the
current test statistics to the console. A value
of zero disables statistics output.

scftorture.stutter_cpus= [KNL]
The number of jiffies to wait between each change
to the set of CPUs under test.

scftorture.use_cpus_read_lock= [KNL]
Use use_cpus_read_lock() instead of the default
preempt_disable() to disable CPU hotplug
while invoking one of the smp_call_function*()
functions.

scftorture.verbose= [KNL]
Enable additional printk() statements.

scftorture.weight_single= [KNL]
The probability weighting to use for the
smp_call_function_single() function with a zero
"wait" parameter. A value of -1 selects the
default if all other weights are -1. However,
if at least one weight has some other value, a
value of -1 will instead select a weight of zero.

scftorture.weight_single_wait= [KNL]
The probability weighting to use for the
smp_call_function_single() function with a
non-zero "wait" parameter. See weight_single.

scftorture.weight_many= [KNL]
The probability weighting to use for the
smp_call_function_many() function with a zero
"wait" parameter. See weight_single.
Note well that setting a high probability for
this weighting can place serious IPI load
on the system.

scftorture.weight_many_wait= [KNL]
The probability weighting to use for the
smp_call_function_many() function with a
non-zero "wait" parameter. See weight_single
and weight_many.

scftorture.weight_all= [KNL]
The probability weighting to use for the
smp_call_function_all() function with a zero
"wait" parameter. See weight_single and
weight_many.

scftorture.weight_all_wait= [KNL]
The probability weighting to use for the
smp_call_function_all() function with a
non-zero "wait" parameter. See weight_single
and weight_many.

skew_tick= [KNL] Offset the periodic timer tick per cpu to mitigate
xtime_lock contention on larger systems, and/or RCU lock
contention on all systems with CONFIG_MAXSMP set.
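The -1 convention described for the scftorture.weight_* parameters can be sketched in userspace C as follows. This is a minimal illustration of the rule as documented in the hunk above, not the code in the scftorture module: resolve_weights() and DEFAULT_WEIGHT are hypothetical names, and the actual default weights are not stated in this diff.

```c
#include <stddef.h>

/* Placeholder default; the value the kernel really uses is an assumption here. */
#define DEFAULT_WEIGHT 1

/*
 * Resolve user-supplied weights following the documented rule:
 * if every weight is -1, each one gets the default; otherwise
 * any weight left at -1 is treated as zero.
 */
static void resolve_weights(const int *in, int *out, size_t n)
{
	size_t i;
	int all_unset = 1;

	/* First pass: did the user set at least one weight explicitly? */
	for (i = 0; i < n; i++)
		if (in[i] != -1)
			all_unset = 0;

	/* Second pass: apply the default-or-zero rule to each weight. */
	for (i = 0; i < n; i++) {
		if (in[i] != -1)
			out[i] = in[i];
		else
			out[i] = all_unset ? DEFAULT_WEIGHT : 0;
	}
}
```

Under this reading, setting only scftorture.weight_single to a nonzero value would leave every other call type with a weight of zero, so only that one call type would be exercised.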
3 changes: 2 additions & 1 deletion MAINTAINERS
@@ -17672,8 +17672,9 @@ S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
F: Documentation/RCU/torture.rst
F: kernel/locking/locktorture.c
F: kernel/rcu/rcuperf.c
F: kernel/rcu/rcuscale.c
F: kernel/rcu/rcutorture.c
F: kernel/rcu/refscale.c
F: kernel/torture.c

TOSHIBA ACPI EXTRAS DRIVER
6 changes: 4 additions & 2 deletions arch/x86/kvm/mmu/page_track.c
@@ -229,7 +229,8 @@ void kvm_page_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
return;

idx = srcu_read_lock(&head->track_srcu);
hlist_for_each_entry_rcu(n, &head->track_notifier_list, node)
hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
srcu_read_lock_held(&head->track_srcu))
if (n->track_write)
n->track_write(vcpu, gpa, new, bytes, n);
srcu_read_unlock(&head->track_srcu, idx);
@@ -254,7 +255,8 @@ void kvm_page_track_flush_slot(struct kvm *kvm, struct kvm_memory_slot *slot)
return;

idx = srcu_read_lock(&head->track_srcu);
hlist_for_each_entry_rcu(n, &head->track_notifier_list, node)
hlist_for_each_entry_srcu(n, &head->track_notifier_list, node,
srcu_read_lock_held(&head->track_srcu))
if (n->track_flush_slot)
n->track_flush_slot(kvm, slot, n);
srcu_read_unlock(&head->track_srcu, idx);
48 changes: 48 additions & 0 deletions include/linux/rculist.h
@@ -63,9 +63,17 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
RCU_LOCKDEP_WARN(!(cond) && !rcu_read_lock_any_held(), \
"RCU-list traversed in non-reader section!"); \
})

#define __list_check_srcu(cond) \
({ \
RCU_LOCKDEP_WARN(!(cond), \
"RCU-list traversed without holding the required lock!");\
})
#else
#define __list_check_rcu(dummy, cond, extra...) \
({ check_arg_count_one(extra); })

#define __list_check_srcu(cond) ({ })
#endif

/*
@@ -385,6 +393,25 @@ static inline void list_splice_tail_init_rcu(struct list_head *list,
&pos->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

/**
* list_for_each_entry_srcu - iterate over rcu list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_head within the struct.
* @cond: lockdep expression for the lock required to traverse the list.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as list_add_rcu()
* as long as the traversal is guarded by srcu_read_lock().
* The lockdep expression srcu_read_lock_held() can be passed as the
* cond argument from read side.
*/
#define list_for_each_entry_srcu(pos, head, member, cond) \
for (__list_check_srcu(cond), \
pos = list_entry_rcu((head)->next, typeof(*pos), member); \
&pos->member != (head); \
pos = list_entry_rcu(pos->member.next, typeof(*pos), member))

/**
* list_entry_lockless - get the struct for this entry
* @ptr: the &struct list_head pointer.
@@ -683,6 +710,27 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
&(pos)->member)), typeof(*(pos)), member))

/**
* hlist_for_each_entry_srcu - iterate over rcu list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the hlist_node within the struct.
* @cond: lockdep expression for the lock required to traverse the list.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as hlist_add_head_rcu()
* as long as the traversal is guarded by srcu_read_lock().
* The lockdep expression srcu_read_lock_held() can be passed as the
* cond argument from read side.
*/
#define hlist_for_each_entry_srcu(pos, head, member, cond) \
for (__list_check_srcu(cond), \
pos = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),\
typeof(*(pos)), member); \
pos; \
pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\
&(pos)->member)), typeof(*(pos)), member))

/**
* hlist_for_each_entry_rcu_notrace - iterate over rcu list of given type (for tracing)
* @pos: the type * to use as a loop cursor.
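The list_for_each_entry_srcu() and hlist_for_each_entry_srcu() macros added above run their lockdep check once, in the init clause of the for loop, before the cursor is set. A userspace analogue of that pattern might look like the sketch below. It assumes a plain singly linked list and a failure counter standing in for RCU_LOCKDEP_WARN(). All names in it are hypothetical stand-ins, not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct node {
	int value;
	struct node *next;
};

/* Counts traversals started without the required "lock" held. */
static int check_failures;

static void list_check(bool cond)
{
	if (!cond)
		check_failures++; /* the kernel macro would emit a lockdep warning here */
}

/*
 * The check is evaluated exactly once, via the comma operator in the
 * for-init clause, and does not otherwise affect the traversal.
 */
#define for_each_node_checked(pos, head, cond) \
	for (list_check(cond), (pos) = (head); (pos) != NULL; (pos) = (pos)->next)

/* Sum all values; lock_held models the caller's lockdep expression. */
static int sum_nodes(struct node *head, bool lock_held)
{
	struct node *pos;
	int sum = 0;

	for_each_node_checked(pos, head, lock_held)
		sum += pos->value;
	return sum;
}
```

As with the kernel macros, a failed check only reports the problem and the traversal still proceeds. In the page_track.c call sites in this commit, the condition passed is srcu_read_lock_held(&head->track_srcu).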