Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle were:

   - changes related to No-CBs CPUs and NO_HZ_FULL

   - RCU-tasks implementation

   - torture-test updates

   - miscellaneous fixes

   - locktorture updates

   - RCU documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
  workqueue: Use cond_resched_rcu_qs macro
  workqueue: Add quiescent state between work items
  locktorture: Cleanup header usage
  locktorture: Cannot hold read and write lock
  locktorture: Fix __acquire annotation for spinlock irq
  locktorture: Support rwlocks
  rcu: Eliminate deadlock between CPU hotplug and expedited grace periods
  locktorture: Document boot/module parameters
  rcutorture: Rename rcutorture_runnable parameter
  locktorture: Add test scenario for rwsem_lock
  locktorture: Add test scenario for mutex_lock
  locktorture: Make torture scripting account for new _runnable name
  locktorture: Introduce torture context
  locktorture: Support rwsems
  locktorture: Add infrastructure for torturing read locks
  torture: Address race in module cleanup
  locktorture: Make statistics generic
  locktorture: Teach about lock debugging
  locktorture: Support mutexes
  locktorture: Add documentation
  ...
torvalds committed Oct 13, 2014
2 parents 5ff0b9e + fd19bda commit d6dd50e
Showing 63 changed files with 1,936 additions and 547 deletions.
33 changes: 24 additions & 9 deletions Documentation/RCU/stallwarn.txt
@@ -56,8 +56,20 @@ RCU_STALL_RAT_DELAY
two jiffies. (This is a cpp macro, not a kernel configuration
parameter.)

-When a CPU detects that it is stalling, it will print a message similar
-to the following:
+rcupdate.rcu_task_stall_timeout
+
+This boot/sysfs parameter controls the RCU-tasks stall warning
+interval. A value of zero or less suppresses RCU-tasks stall
+warnings. A positive value sets the stall-warning interval
+in jiffies. An RCU-tasks stall warning starts with the line:
+
+INFO: rcu_tasks detected stalls on tasks:
+
+And continues with the output of sched_show_task() for each
+task stalling the current RCU-tasks grace period.
+
+For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling,
+it will print a message similar to the following:

INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies)

@@ -174,8 +186,12 @@ o A CPU looping with preemption disabled. This condition can
o A CPU looping with bottom halves disabled. This condition can
result in RCU-sched and RCU-bh stalls.

-o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
-without invoking schedule().
+o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
+kernel without invoking schedule(). Note that cond_resched()
+does not necessarily prevent RCU CPU stall warnings. Therefore,
+if the looping in the kernel is really expected and desirable
+behavior, you might need to replace some of the cond_resched()
+calls with calls to cond_resched_rcu_qs().

o A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
happen to preempt a low-priority task in the middle of an RCU
@@ -208,11 +224,10 @@ o A hardware failure. This is quite unlikely, but has occurred
This resulted in a series of RCU CPU stall warnings, eventually
leading to the realization that the CPU had failed.

-The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning.
-SRCU does not have its own CPU stall warnings, but its calls to
-synchronize_sched() will result in RCU-sched detecting RCU-sched-related
-CPU stalls. Please note that RCU only detects CPU stalls when there is
-a grace period in progress. No grace period, no CPU stall warnings.
+The RCU, RCU-sched, RCU-bh, and RCU-tasks implementations have CPU stall
+warnings. Note that SRCU does -not- have CPU stall warnings. Please note
+that RCU only detects CPU stalls when there is a grace period in progress.
+No grace period, no CPU stall warnings.

To diagnose the cause of the stall, inspect the stack traces.
The offending function will usually be near the top of the stack.
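
As an illustration of the rcu_task_stall_timeout rule described in this
file's diff (zero or less suppresses the warnings, a positive value is an
interval in jiffies), here is a minimal sketch. The function name
describe_task_stall_timeout is hypothetical, and the sysfs path is an
assumption based on the usual /sys/module/<module>/parameters layout; it
may not exist on every kernel.

```shell
#!/bin/sh
# Describe how an rcu_task_stall_timeout value would be interpreted,
# following the rule stated in the stallwarn.txt change.
# (Hypothetical helper, not part of the kernel tree.)
describe_task_stall_timeout() {
    value=$1
    if [ "$value" -le 0 ]; then
        echo "RCU-tasks stall warnings suppressed"
    else
        echo "RCU-tasks stall warning interval: $value jiffies"
    fi
}

# Assumed sysfs location; the check is skipped if the file is not readable.
param=/sys/module/rcupdate/parameters/rcu_task_stall_timeout
if [ -r "$param" ]; then
    describe_task_stall_timeout "$(cat "$param")"
fi
```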
68 changes: 67 additions & 1 deletion Documentation/kernel-parameters.txt
@@ -1723,6 +1723,49 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
lockd.nlm_udpport=M [NFS] Assign UDP port.
Format: <integer>

locktorture.nreaders_stress= [KNL]
Set the number of locking read-acquisition kthreads.
Defaults to being automatically set based on the
number of online CPUs.

locktorture.nwriters_stress= [KNL]
Set the number of locking write-acquisition kthreads.

locktorture.onoff_holdoff= [KNL]
Set time (s) after boot for CPU-hotplug testing.

locktorture.onoff_interval= [KNL]
Set time (s) between CPU-hotplug operations, or
zero to disable CPU-hotplug testing.

locktorture.shuffle_interval= [KNL]
Set task-shuffle interval (jiffies). Shuffling
tasks allows some CPUs to go into dyntick-idle
mode during the locktorture test.

locktorture.shutdown_secs= [KNL]
Set time (s) after boot at which to shut down the system. This
is useful for hands-off automated testing.

locktorture.stat_interval= [KNL]
Time (s) between statistics printk()s.

locktorture.stutter= [KNL]
Time (s) to stutter testing, for example,
specifying five seconds causes the test to run for
five seconds, wait for five seconds, and so on.
This tests the locking primitive's ability to
transition abruptly to and from idle.

locktorture.torture_runnable= [BOOT]
Start locktorture running at boot time.

locktorture.torture_type= [KNL]
Specify the locking implementation to test.

locktorture.verbose= [KNL]
Enable additional printk() statements.

logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver
Format: <irq>

@@ -2900,6 +2943,24 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Lazy RCU callbacks are those which RCU can
prove do nothing more than free memory.

rcutorture.cbflood_inter_holdoff= [KNL]
Set holdoff time (jiffies) between successive
callback-flood tests.

rcutorture.cbflood_intra_holdoff= [KNL]
Set holdoff time (jiffies) between successive
bursts of callbacks within a given callback-flood
test.

rcutorture.cbflood_n_burst= [KNL]
Set the number of bursts making up a given
callback-flood test. Set this to zero to
disable callback-flood testing.

rcutorture.cbflood_n_per_burst= [KNL]
Set the number of callbacks to be registered
in a given burst of a callback-flood test.

rcutorture.fqs_duration= [KNL]
Set duration of force_quiescent_state bursts.

@@ -2939,7 +3000,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
Set time (s) between CPU-hotplug operations, or
zero to disable CPU-hotplug testing.

-rcutorture.rcutorture_runnable= [BOOT]
+rcutorture.torture_runnable= [BOOT]
Start rcutorture running at boot time.

rcutorture.shuffle_interval= [KNL]
@@ -3001,6 +3062,11 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
rcupdate.rcu_cpu_stall_timeout= [KNL]
Set timeout for RCU CPU stall warning messages.

rcupdate.rcu_task_stall_timeout= [KNL]
Set timeout in jiffies for RCU task stall warning
messages. Disable with a value less than or equal
to zero.

rdinit= [KNL]
Format: <full_path>
Run specified binary instead of /init from the ramdisk,
147 changes: 147 additions & 0 deletions Documentation/locking/locktorture.txt
@@ -0,0 +1,147 @@
Kernel Lock Torture Test Operation

CONFIG_LOCK_TORTURE_TEST

The CONFIG_LOCK_TORTURE_TEST config option provides a kernel module
that runs torture tests on core kernel locking primitives. The kernel
module, 'locktorture', may be built after the fact on the running
kernel to be tested, if desired. The tests periodically output status
messages via printk(), which can be examined via the dmesg command (perhaps
grepping for "torture"). The test is started when the module is loaded,
and stops when the module is unloaded. This program is based on how RCU
is tortured, via rcutorture.

This torture test consists of creating a number of kernel threads which
acquire the lock and hold it for a specific amount of time, thus simulating
different critical region behaviors. The amount of contention on the lock
can be increased by enlarging this critical region hold time, by
creating more kthreads, or both.


MODULE PARAMETERS

This module has the following parameters:


** Locktorture-specific **

nwriters_stress Number of kernel threads that will stress exclusive lock
ownership (writers). The default value is twice the number
of online CPUs.

nreaders_stress Number of kernel threads that will stress shared lock
ownership (readers). The default is the same as the number
of writer threads. If the user did not specify nwriters_stress,
then both readers and writers will be the number of online CPUs.

torture_type Type of lock to torture. By default, only spinlocks will
be tortured. This module can torture the following locks,
with string values as follows:

o "lock_busted": Simulates a buggy lock implementation.

o "spin_lock": spin_lock() and spin_unlock() pairs.

o "spin_lock_irq": spin_lock_irq() and spin_unlock_irq()
pairs.

o "rw_lock": read/write lock() and unlock() rwlock pairs.

o "rw_lock_irq": read/write lock_irq() and unlock_irq()
rwlock pairs.

o "mutex_lock": mutex_lock() and mutex_unlock() pairs.

o "rwsem_lock": read/write down() and up() semaphore pairs.

torture_runnable Start locktorture at boot time in the case where the
module is built into the kernel, otherwise wait for
torture_runnable to be set via sysfs before starting.
By default it will begin once the module is loaded.


** Torture-framework (RCU + locking) **

shutdown_secs The number of seconds to run the test before terminating
the test and powering off the system. The default is
zero, which disables test termination and system shutdown.
This capability is useful for automated testing.

onoff_interval The number of seconds between each attempt to execute a
randomly selected CPU-hotplug operation. Defaults
to zero, which disables CPU hotplugging. In
CONFIG_HOTPLUG_CPU=n kernels, locktorture will silently
refuse to do any CPU-hotplug operations regardless of
what value is specified for onoff_interval.

onoff_holdoff The number of seconds to wait until starting CPU-hotplug
operations. This would normally only be used when
locktorture was built into the kernel and started
automatically at boot time, in which case it is useful
in order to avoid confusing boot-time code with CPUs
coming and going. This parameter is only useful if
CONFIG_HOTPLUG_CPU is enabled.

stat_interval Number of seconds between statistics-related printk()s.
By default, locktorture will report stats every 60 seconds.
Setting the interval to zero causes the statistics to
be printed -only- when the module is unloaded.

stutter The length of time to run the test before pausing for this
same period of time. Defaults to "stutter=5", so as
to run and pause for (roughly) five-second intervals.
Specifying "stutter=0" causes the test to run continuously
without pausing, which is the old default behavior.

shuffle_interval The number of seconds to keep the test threads affinitied
to a particular subset of the CPUs, defaults to 3 seconds.
Used in conjunction with test_no_idle_hz.

verbose Enable verbose debugging printing, via printk(). Enabled
by default. This extra information is mostly related to
high-level errors and reports from the main 'torture'
framework.
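
As a sketch of how the parameters above might be combined on a modprobe
command line: the helper name locktorture_cmd and the values chosen are
illustrative assumptions, not recommendations. The script only prints the
command, so it is safe to run unprivileged.

```shell
#!/bin/sh
# Build a modprobe command line for locktorture from three of the
# parameters documented above. (Hypothetical helper; values are examples.)
locktorture_cmd() {
    torture_type=$1
    nwriters=$2
    stat_interval=$3
    echo "modprobe locktorture torture_type=$torture_type" \
         "nwriters_stress=$nwriters stat_interval=$stat_interval"
}

# Print the command rather than running it; run the printed command
# as root to actually load the module.
locktorture_cmd rwsem_lock 4 30
```

Printing the command first makes it easy to review the parameter string
before loading a module that deliberately stresses the running kernel.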


STATISTICS

Statistics are printed in the following format:

spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0
       (A)           (B)         (C)           (D)        (E)

(A): Lock type that is being tortured -- torture_type parameter.

(B): Number of writer lock acquisitions. If dealing with a read/write primitive
a second "Reads" statistics line is printed.

(C): Number of times the lock was acquired.

(D): Min and max number of times threads failed to acquire the lock.

(E): true/false values if there were errors acquiring the lock. This should
-only- be positive if there is a bug in the locking primitive's
implementation. Otherwise a lock should never fail (i.e., spin_lock()).
Of course, the same applies for (C), above. A dummy example of this is
the "lock_busted" type.
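
A statistics line in the format shown above could be picked apart along
these lines. This is a minimal sketch: the function name parse_lock_stats
is hypothetical, and it assumes the "Total:" and "Fail:" labels are each
followed by their value as a separate whitespace-delimited field, as in
the sample line.

```shell
#!/bin/sh
# Extract the total acquisition count and the failure flag from
# locktorture statistics lines read on stdin.
# (Hypothetical helper; relies on the field layout of the sample line.)
parse_lock_stats() {
    awk '{
        total = ""; fail = ""
        for (i = 1; i < NF; i++) {
            if ($i == "Total:") total = $(i + 1)
            if ($i == "Fail:")  fail  = $(i + 1)
        }
        # Only report lines that carry both fields.
        if (total != "" && fail != "")
            print "total=" total " fail=" fail
    }'
}

echo "spin_lock-torture: Writes: Total: 93746064 Max/Min: 0/0 Fail: 0" |
    parse_lock_stats
```

Searching for the labels rather than using fixed field positions keeps the
sketch working when dmesg prefixes each line with a timestamp.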

USAGE

The following script may be used to torture locks:

#!/bin/sh

modprobe locktorture
sleep 3600
rmmod locktorture
dmesg | grep torture:

The output can be manually inspected for the error flag of "!!!".
One could of course create a more elaborate script that automatically
checked for such errors. The "rmmod" command forces a "SUCCESS",
"FAILURE", or "RCU_HOTPLUG" indication to be printk()ed. The first
two are self-explanatory, while the last indicates that while there
were no locking failures, CPU-hotplug problems were detected.
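
The automatic check mentioned above might look like the following sketch.
The function name check_torture_log is hypothetical; the only facts it
relies on are the ones stated in this section, namely that relevant lines
contain "torture:" and that errors are flagged with "!!!".

```shell
#!/bin/sh
# Scan log text on stdin for locktorture lines carrying the "!!!" flag.
# Returns 0 if no flagged line is seen, 1 otherwise.
# (Hypothetical helper, not part of the kernel tree.)
check_torture_log() {
    if grep -F 'torture:' | grep -F -q '!!!'; then
        echo "locktorture reported errors"
        return 1
    fi
    echo "no error flag found"
    return 0
}

# Typical use after "rmmod locktorture":
#   dmesg | check_torture_log
```

Because the function reads stdin, it can be applied to a saved console log
as well as to live dmesg output.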

Also see: Documentation/RCU/torture.txt