Skip to content

Commit

Permalink
Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/…
Browse files Browse the repository at this point in the history
…git/paulmck/linux-rcu into core/rcu

The major features of this series are:

 - making RCU more aggressive about entering dyntick-idle mode in order to
   improve energy efficiency

 - converting a few more call_rcu()s to kfree_rcu()s

 - applying a number of rcutree fixes and cleanups to rcutiny

 - removing CONFIG_SMP #ifdefs from treercu

 - allowing RCU CPU stall times to be set via sysfs

 - adding CPU-stall capability to rcutorture

 - adding more RCU-abuse diagnostics

 - updating documentation

 - fixing yet more issues located by the still-ongoing top-to-bottom
   inspection of RCU, this time with a special focus on the
   CPU-hotplug code path.

Signed-off-by: Ingo Molnar <[email protected]>
  • Loading branch information
Ingo Molnar committed Feb 28, 2012
2 parents 586c6e7 + 1cc8596 commit bdd4431
Show file tree
Hide file tree
Showing 29 changed files with 2,965 additions and 642 deletions.
1,902 changes: 1,745 additions & 157 deletions Documentation/RCU/RTFP.txt

Large diffs are not rendered by default.

14 changes: 14 additions & 0 deletions Documentation/RCU/checklist.txt
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,20 @@ over a rather long period of time, but improvements are always welcome!
operations that would not normally be undertaken while a real-time
workload is running.

In particular, if you find yourself invoking one of the expedited
primitives repeatedly in a loop, please do everyone a favor:
Restructure your code so that it batches the updates, allowing
a single non-expedited primitive to cover the entire batch.
This will very likely be faster than the loop containing the
expedited primitive, and will be much much easier on the rest
of the system, especially to real-time workloads running on
the rest of the system.

In addition, it is illegal to call the expedited forms from
a CPU-hotplug notifier, or while holding a lock that is acquired
by a CPU-hotplug notifier. Failing to observe this restriction
will result in deadlock.

7. If the updater uses call_rcu() or synchronize_rcu(), then the
corresponding readers must use rcu_read_lock() and
rcu_read_unlock(). If the updater uses call_rcu_bh() or
Expand Down
87 changes: 80 additions & 7 deletions Documentation/RCU/stallwarn.txt
Original file line number Diff line number Diff line change
Expand Up @@ -12,14 +12,38 @@ CONFIG_RCU_CPU_STALL_TIMEOUT
This kernel configuration parameter defines the period of time
that RCU will wait from the beginning of a grace period until it
issues an RCU CPU stall warning. This time period is normally
ten seconds.
sixty seconds.

RCU_SECONDS_TILL_STALL_RECHECK
This configuration parameter may be changed at runtime via the
/sys/module/rcutree/parameters/rcu_cpu_stall_timeout, however
this parameter is checked only at the beginning of a cycle.
So if you are 30 seconds into a 70-second stall, setting this
sysfs parameter to (say) five will shorten the timeout for the
-next- stall, or the following warning for the current stall
(assuming the stall lasts long enough). It will not affect the
timing of the next warning for the current stall.

This macro defines the period of time that RCU will wait after
issuing a stall warning until it issues another stall warning
for the same stall. This time period is normally set to three
times the check interval plus thirty seconds.
Stall-warning messages may be enabled and disabled completely via
/sys/module/rcutree/parameters/rcu_cpu_stall_suppress.

CONFIG_RCU_CPU_STALL_VERBOSE

This kernel configuration parameter causes the stall warning to
also dump the stacks of any tasks that are blocking the current
RCU-preempt grace period.

RCU_CPU_STALL_INFO

This kernel configuration parameter causes the stall warning to
print out additional per-CPU diagnostic information, including
information on scheduling-clock ticks and RCU's idle-CPU tracking.

RCU_STALL_DELAY_DELTA

Although the lockdep facility is extremely useful, it does add
some overhead. Therefore, under CONFIG_PROVE_RCU, the
RCU_STALL_DELAY_DELTA macro allows five extra seconds before
giving an RCU CPU stall warning message.

RCU_STALL_RAT_DELAY

Expand Down Expand Up @@ -64,6 +88,54 @@ INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffi

This is rare, but does happen from time to time in real life.

If the CONFIG_RCU_CPU_STALL_INFO kernel configuration parameter is set,
more information is printed with the stall-warning message, for example:

INFO: rcu_preempt detected stall on CPU
0: (63959 ticks this GP) idle=241/3fffffffffffffff/0
(t=65000 jiffies)

In kernels with CONFIG_RCU_FAST_NO_HZ, even more information is
printed:

INFO: rcu_preempt detected stall on CPU
0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 drain=0 . timer=-1
(t=65000 jiffies)

The "(64628 ticks this GP)" indicates that this CPU has taken more
than 64,000 scheduling-clock interrupts during the current stalled
grace period. If the CPU was not yet aware of the current grace
period (for example, if it was offline), then this part of the message
indicates how many grace periods behind the CPU is.

The "idle=" portion of the message prints the dyntick-idle state.
The hex number before the first "/" is the low-order 12 bits of the
dynticks counter, which will have an even-numbered value if the CPU is
in dyntick-idle mode and an odd-numbered value otherwise. The hex
number between the two "/"s is the value of the nesting, which will
be a small positive number if in the idle loop and a very large positive
number (as shown above) otherwise.

For CONFIG_RCU_FAST_NO_HZ kernels, the "drain=0" indicates that the
CPU is not in the process of trying to force itself into dyntick-idle
state, the "." indicates that the CPU has not given up forcing RCU
into dyntick-idle mode (it would be "H" otherwise), and the "timer=-1"
indicates that the CPU has not recented forced RCU into dyntick-idle
mode (it would otherwise indicate the number of microseconds remaining
in this forced state).


Multiple Warnings From One Stall

If a stall lasts long enough, multiple stall-warning messages will be
printed for it. The second and subsequent messages are printed at
longer intervals, so that the time between (say) the first and second
message will be about three times the interval between the beginning
of the stall and the first message.


What Causes RCU CPU Stall Warnings?

So your kernel printed an RCU CPU stall warning. The next question is
"What caused it?" The following problems can result in RCU CPU stall
warnings:
Expand Down Expand Up @@ -128,4 +200,5 @@ is occurring, which will usually be in the function nearest the top of
that portion of the stack which remains the same from trace to trace.
If you can reliably trigger the stall, ftrace can be quite helpful.

RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE.
RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE
and with RCU's event tracing.
33 changes: 30 additions & 3 deletions Documentation/RCU/torture.txt
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,13 @@ onoff_interval
CPU-hotplug operations regardless of what value is
specified for onoff_interval.

onoff_holdoff The number of seconds to wait until starting CPU-hotplug
operations. This would normally only be used when
rcutorture was built into the kernel and started
automatically at boot time, in which case it is useful
in order to avoid confusing boot-time code with CPUs
coming and going.

shuffle_interval
The number of seconds to keep the test threads affinitied
to a particular subset of the CPUs, defaults to 3 seconds.
Expand All @@ -79,6 +86,24 @@ shutdown_secs The number of seconds to run the test before terminating
zero, which disables test termination and system shutdown.
This capability is useful for automated testing.

stall_cpu The number of seconds that a CPU should be stalled while
within both an rcu_read_lock() and a preempt_disable().
This stall happens only once per rcutorture run.
If you need multiple stalls, use modprobe and rmmod to
repeatedly run rcutorture. The default for stall_cpu
is zero, which prevents rcutorture from stalling a CPU.

Note that attempts to rmmod rcutorture while the stall
is ongoing will hang, so be careful what value you
choose for this module parameter! In addition, too-large
values for stall_cpu might well induce failures and
warnings in other parts of the kernel. You have been
warned!

stall_cpu_holdoff
The number of seconds to wait after rcutorture starts
before stalling a CPU. Defaults to 10 seconds.

stat_interval The number of seconds between output of torture
statistics (via printk()). Regardless of the interval,
statistics are printed when the module is unloaded.
Expand Down Expand Up @@ -271,11 +296,13 @@ The following script may be used to torture RCU:
#!/bin/sh

modprobe rcutorture
sleep 100
sleep 3600
rmmod rcutorture
dmesg | grep torture:

The output can be manually inspected for the error flag of "!!!".
One could of course create a more elaborate script that automatically
checked for such errors. The "rmmod" command forces a "SUCCESS" or
"FAILURE" indication to be printk()ed.
checked for such errors. The "rmmod" command forces a "SUCCESS",
"FAILURE", or "RCU_HOTPLUG" indication to be printk()ed. The first
two are self-explanatory, while the last indicates that while there
were no RCU failures, CPU-hotplug problems were detected.
36 changes: 16 additions & 20 deletions Documentation/RCU/trace.txt
Original file line number Diff line number Diff line change
Expand Up @@ -33,23 +33,23 @@ rcu/rcuboost:
The output of "cat rcu/rcudata" looks as follows:

rcu_sched:
0 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=545/1/0 df=50 of=0 ri=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
1 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=967/1/0 df=58 of=0 ri=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
2 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1081/1/0 df=175 of=0 ri=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
3 c=20942 g=20943 pq=1 pgp=20942 qp=1 dt=1846/0/0 df=404 of=0 ri=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
4 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=369/1/0 df=83 of=0 ri=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
5 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=381/1/0 df=64 of=0 ri=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
6 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1037/1/0 df=183 of=0 ri=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
7 c=20897 g=20897 pq=1 pgp=20896 qp=0 dt=1572/0/0 df=382 of=0 ri=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
0 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=545/1/0 df=50 of=0 ql=163 qs=NRW. kt=0/W/0 ktl=ebc3 b=10 ci=153737 co=0 ca=0
1 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=967/1/0 df=58 of=0 ql=634 qs=NRW. kt=0/W/1 ktl=58c b=10 ci=191037 co=0 ca=0
2 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1081/1/0 df=175 of=0 ql=74 qs=N.W. kt=0/W/2 ktl=da94 b=10 ci=75991 co=0 ca=0
3 c=20942 g=20943 pq=1 pgp=20942 qp=1 dt=1846/0/0 df=404 of=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=72261 co=0 ca=0
4 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=369/1/0 df=83 of=0 ql=48 qs=N.W. kt=0/W/4 ktl=e0e7 b=10 ci=128365 co=0 ca=0
5 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=381/1/0 df=64 of=0 ql=169 qs=NRW. kt=0/W/5 ktl=fb2f b=10 ci=164360 co=0 ca=0
6 c=20972 g=20973 pq=1 pgp=20973 qp=0 dt=1037/1/0 df=183 of=0 ql=62 qs=N.W. kt=0/W/6 ktl=d2ad b=10 ci=65663 co=0 ca=0
7 c=20897 g=20897 pq=1 pgp=20896 qp=0 dt=1572/0/0 df=382 of=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=75006 co=0 ca=0
rcu_bh:
0 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=545/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
1 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=967/1/0 df=3 of=0 ri=1 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
2 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1081/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
3 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1846/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
4 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=369/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
5 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=381/1/0 df=4 of=0 ri=1 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
6 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1037/1/0 df=6 of=0 ri=1 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
7 c=1474 g=1474 pq=1 pgp=1473 qp=0 dt=1572/0/0 df=8 of=0 ri=1 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0
0 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=545/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/0 ktl=ebc3 b=10 ci=0 co=0 ca=0
1 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=967/1/0 df=3 of=0 ql=0 qs=.... kt=0/W/1 ktl=58c b=10 ci=151 co=0 ca=0
2 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1081/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/2 ktl=da94 b=10 ci=0 co=0 ca=0
3 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1846/0/0 df=8 of=0 ql=0 qs=.... kt=0/W/3 ktl=d1cd b=10 ci=0 co=0 ca=0
4 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=369/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/4 ktl=e0e7 b=10 ci=0 co=0 ca=0
5 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=381/1/0 df=4 of=0 ql=0 qs=.... kt=0/W/5 ktl=fb2f b=10 ci=0 co=0 ca=0
6 c=1480 g=1480 pq=1 pgp=1480 qp=0 dt=1037/1/0 df=6 of=0 ql=0 qs=.... kt=0/W/6 ktl=d2ad b=10 ci=0 co=0 ca=0
7 c=1474 g=1474 pq=1 pgp=1473 qp=0 dt=1572/0/0 df=8 of=0 ql=0 qs=.... kt=0/W/7 ktl=cf15 b=10 ci=0 co=0 ca=0

The first section lists the rcu_data structures for rcu_sched, the second
for rcu_bh. Note that CONFIG_TREE_PREEMPT_RCU kernels will have an
Expand Down Expand Up @@ -119,10 +119,6 @@ o "of" is the number of times that some other CPU has forced a
CPU is offline when it is really alive and kicking) is a fatal
error, so it makes sense to err conservatively.

o "ri" is the number of times that RCU has seen fit to send a
reschedule IPI to this CPU in order to get it to report a
quiescent state.

o "ql" is the number of RCU callbacks currently residing on
this CPU. This is the total number of callbacks, regardless
of what state they are in (new, waiting for grace period to
Expand Down
9 changes: 1 addition & 8 deletions arch/s390/kernel/irq.c
Original file line number Diff line number Diff line change
Expand Up @@ -165,13 +165,6 @@ static inline int ext_hash(u16 code)
return (code + (code >> 9)) & 0xff;
}

static void ext_int_hash_update(struct rcu_head *head)
{
struct ext_int_info *p = container_of(head, struct ext_int_info, rcu);

kfree(p);
}

int register_external_interrupt(u16 code, ext_int_handler_t handler)
{
struct ext_int_info *p;
Expand Down Expand Up @@ -202,7 +195,7 @@ int unregister_external_interrupt(u16 code, ext_int_handler_t handler)
list_for_each_entry_rcu(p, &ext_int_hash[index], entry)
if (p->code == code && p->handler == handler) {
list_del_rcu(&p->entry);
call_rcu(&p->rcu, ext_int_hash_update);
kfree_rcu(p, rcu);
}
spin_unlock_irqrestore(&ext_int_hash_lock, flags);
return 0;
Expand Down
12 changes: 1 addition & 11 deletions drivers/target/tcm_fc/tfc_sess.c
Original file line number Diff line number Diff line change
Expand Up @@ -85,16 +85,6 @@ static struct ft_tport *ft_tport_create(struct fc_lport *lport)
return tport;
}

/*
* Free tport via RCU.
*/
static void ft_tport_rcu_free(struct rcu_head *rcu)
{
struct ft_tport *tport = container_of(rcu, struct ft_tport, rcu);

kfree(tport);
}

/*
* Delete a target local port.
* Caller holds ft_lport_lock.
Expand All @@ -114,7 +104,7 @@ static void ft_tport_delete(struct ft_tport *tport)
tpg->tport = NULL;
tport->tpg = NULL;
}
call_rcu(&tport->rcu, ft_tport_rcu_free);
kfree_rcu(tport, rcu);
}

/*
Expand Down
Loading

0 comments on commit bdd4431

Please sign in to comment.