Skip to content

Commit

Permalink
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/lin…
Browse files Browse the repository at this point in the history
…ux/kernel/git/tip/linux-2.6-tip

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
  rcu: remove all rcu head initializations, except on_stack initializations
  rcu head introduce rcu head init on stack
  Debugobjects transition check
  rcu: fix build bug in RCU_FAST_NO_HZ builds
  rcu: RCU_FAST_NO_HZ must check RCU dyntick state
  rcu: make SRCU usable in modules
  rcu: improve the RCU CPU-stall warning documentation
  rcu: reduce the number of spurious RCU_SOFTIRQ invocations
  rcu: permit discontiguous cpu_possible_mask CPU numbering
  rcu: improve RCU CPU stall-warning messages
  rcu: print boot-time console messages if RCU configs out of ordinary
  rcu: disable CPU stall warnings upon panic
  rcu: enable CPU_STALL_VERBOSE by default
  rcu: slim down rcutiny by removing rcu_scheduler_active and friends
  rcu: refactor RCU's context-switch handling
  rcu: rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk
  rcu: shrink rcutiny by making synchronize_rcu_bh() be inline
  rcu: fix now-bogus rcu_scheduler_active comments.
  rcu: Fix bogus CONFIG_PROVE_LOCKING in comments to reflect reality.
  rcu: ignore offline CPUs in last non-dyntick-idle CPU check
  ...
  • Loading branch information
torvalds committed May 18, 2010
2 parents 1014cfe + 72d5a9f commit f262af3
Show file tree
Hide file tree
Showing 21 changed files with 470 additions and 143 deletions.
94 changes: 71 additions & 23 deletions Documentation/RCU/stallwarn.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,56 +3,104 @@ Using RCU's CPU Stall Detector
The CONFIG_RCU_CPU_STALL_DETECTOR kernel config parameter enables
RCU's CPU stall detector, which detects conditions that unduly delay
RCU grace periods. The stall detector's idea of what constitutes
"unduly delayed" is controlled by a pair of C preprocessor macros:
"unduly delayed" is controlled by a set of C preprocessor macros:

RCU_SECONDS_TILL_STALL_CHECK

This macro defines the period of time that RCU will wait from
the beginning of a grace period until it issues an RCU CPU
stall warning. It is normally ten seconds.
stall warning. This time period is normally ten seconds.

RCU_SECONDS_TILL_STALL_RECHECK

This macro defines the period of time that RCU will wait after
issuing a stall warning until it issues another stall warning.
It is normally set to thirty seconds.
issuing a stall warning until it issues another stall warning
for the same stall. This time period is normally set to thirty
seconds.

RCU_STALL_RAT_DELAY

The CPU stall detector tries to make the offending CPU rat on itself,
as this often gives better-quality stack traces. However, if
the offending CPU does not detect its own stall in the number
of jiffies specified by RCU_STALL_RAT_DELAY, then other CPUs will
complain. This is normally set to two jiffies.
The CPU stall detector tries to make the offending CPU print its
own warnings, as this often gives better-quality stack traces.
However, if the offending CPU does not detect its own stall in
the number of jiffies specified by RCU_STALL_RAT_DELAY, then
some other CPU will complain. This delay is normally set to
two jiffies.

The following problems can result in an RCU CPU stall warning:
When a CPU detects that it is stalling, it will print a message similar
to the following:

INFO: rcu_sched_state detected stall on CPU 5 (t=2500 jiffies)

This message indicates that CPU 5 detected that it was causing a stall,
and that the stall was affecting RCU-sched. This message will normally be
followed by a stack dump of the offending CPU. On TREE_RCU kernel builds,
RCU and RCU-sched are implemented by the same underlying mechanism,
while on TREE_PREEMPT_RCU kernel builds, RCU is instead implemented
by rcu_preempt_state.

On the other hand, if the offending CPU fails to print out a stall-warning
message quickly enough, some other CPU will print a message similar to
the following:

INFO: rcu_bh_state detected stalls on CPUs/tasks: { 3 5 } (detected by 2, 2502 jiffies)

This message indicates that CPU 2 detected that CPUs 3 and 5 were both
causing stalls, and that the stall was affecting RCU-bh. This message
will normally be followed by stack dumps for each CPU. Please note that
TREE_PREEMPT_RCU builds can be stalled by tasks as well as by CPUs,
and that the tasks will be indicated by PID, for example, "P3421".
It is even possible for a rcu_preempt_state stall to be caused by both
CPUs -and- tasks, in which case the offending CPUs and tasks will all
be called out in the list.

Finally, if the grace period ends just as the stall warning starts
printing, there will be a spurious stall-warning message:

INFO: rcu_bh_state detected stalls on CPUs/tasks: { } (detected by 4, 2502 jiffies)

This is rare, but does happen from time to time in real life.

So your kernel printed an RCU CPU stall warning. The next question is
"What caused it?" The following problems can result in RCU CPU stall
warnings:

o A CPU looping in an RCU read-side critical section.

o A CPU looping with interrupts disabled.
o A CPU looping with interrupts disabled. This condition can
result in RCU-sched and RCU-bh stalls.

o A CPU looping with preemption disabled.
o A CPU looping with preemption disabled. This condition can
result in RCU-sched stalls and, if ksoftirqd is in use, RCU-bh
stalls.

o A CPU looping with bottom halves disabled. This condition can
result in RCU-sched and RCU-bh stalls.

o For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
without invoking schedule().

o A bug in the RCU implementation.

o A hardware failure. This is quite unlikely, but has occurred
at least once in a former life. A CPU failed in a running system,
at least once in real life. A CPU failed in a running system,
becoming unresponsive, but not causing an immediate crash.
This resulted in a series of RCU CPU stall warnings, eventually
leading the realization that the CPU had failed.

The RCU, RCU-sched, and RCU-bh implementations have CPU stall warning.
SRCU does not do so directly, but its calls to synchronize_sched() will
result in RCU-sched detecting any CPU stalls that might be occurring.

To diagnose the cause of the stall, inspect the stack traces. The offending
function will usually be near the top of the stack. If you have a series
of stall warnings from a single extended stall, comparing the stack traces
can often help determine where the stall is occurring, which will usually
be in the function nearest the top of the stack that stays the same from
trace to trace.
The RCU, RCU-sched, and RCU-bh implementations have CPU stall
warning. SRCU does not have its own CPU stall warnings, but its
calls to synchronize_sched() will result in RCU-sched detecting
RCU-sched-related CPU stalls. Please note that RCU only detects
CPU stalls when there is a grace period in progress. No grace period,
no CPU stall warnings.

To diagnose the cause of the stall, inspect the stack traces.
The offending function will usually be near the top of the stack.
If you have a series of stall warnings from a single extended stall,
comparing the stack traces can often help determine where the stall
is occurring, which will usually be in the function nearest the top of
that portion of the stack which remains the same from trace to trace.
If you can reliably trigger the stall, ftrace can be quite helpful.

RCU bugs can often be debugged with the help of CONFIG_RCU_TRACE.
35 changes: 19 additions & 16 deletions Documentation/RCU/trace.txt
Original file line number Diff line number Diff line change
Expand Up @@ -256,23 +256,23 @@ o Each element of the form "1/1 0:127 ^0" represents one struct
The output of "cat rcu/rcu_pending" looks as follows:

rcu_sched:
0 np=255892 qsp=53936 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
1 np=261224 qsp=54638 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
2 np=237496 qsp=49664 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
3 np=236249 qsp=48766 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
4 np=221310 qsp=46850 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
5 np=237332 qsp=48449 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
6 np=219995 qsp=46718 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834
7 np=249893 qsp=49390 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888
0 np=255892 qsp=53936 rpq=85 cbr=0 cng=14417 gpc=10033 gps=24320 nf=6445 nn=146741
1 np=261224 qsp=54638 rpq=33 cbr=0 cng=25723 gpc=16310 gps=2849 nf=5912 nn=155792
2 np=237496 qsp=49664 rpq=23 cbr=0 cng=2762 gpc=45478 gps=1762 nf=1201 nn=136629
3 np=236249 qsp=48766 rpq=98 cbr=0 cng=286 gpc=48049 gps=1218 nf=207 nn=137723
4 np=221310 qsp=46850 rpq=7 cbr=0 cng=26 gpc=43161 gps=4634 nf=3529 nn=123110
5 np=237332 qsp=48449 rpq=9 cbr=0 cng=54 gpc=47920 gps=3252 nf=201 nn=137456
6 np=219995 qsp=46718 rpq=12 cbr=0 cng=50 gpc=42098 gps=6093 nf=4202 nn=120834
7 np=249893 qsp=49390 rpq=42 cbr=0 cng=72 gpc=38400 gps=17102 nf=41 nn=144888
rcu_bh:
0 np=146741 qsp=1419 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314
1 np=155792 qsp=12597 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180
2 np=136629 qsp=18680 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936
3 np=137723 qsp=2843 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863
4 np=123110 qsp=12433 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671
5 np=137456 qsp=4210 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235
6 np=120834 qsp=9902 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921
7 np=144888 qsp=26336 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542
0 np=146741 qsp=1419 rpq=6 cbr=0 cng=6 gpc=0 gps=0 nf=2 nn=145314
1 np=155792 qsp=12597 rpq=3 cbr=0 cng=0 gpc=4 gps=8 nf=3 nn=143180
2 np=136629 qsp=18680 rpq=1 cbr=0 cng=0 gpc=7 gps=6 nf=0 nn=117936
3 np=137723 qsp=2843 rpq=0 cbr=0 cng=0 gpc=10 gps=7 nf=0 nn=134863
4 np=123110 qsp=12433 rpq=0 cbr=0 cng=0 gpc=4 gps=2 nf=0 nn=110671
5 np=137456 qsp=4210 rpq=1 cbr=0 cng=0 gpc=6 gps=5 nf=0 nn=133235
6 np=120834 qsp=9902 rpq=2 cbr=0 cng=0 gpc=6 gps=3 nf=2 nn=110921
7 np=144888 qsp=26336 rpq=0 cbr=0 cng=0 gpc=8 gps=2 nf=0 nn=118542

As always, this is once again split into "rcu_sched" and "rcu_bh"
portions, with CONFIG_TREE_PREEMPT_RCU kernels having an additional
Expand All @@ -284,6 +284,9 @@ o "np" is the number of times that __rcu_pending() has been invoked
o "qsp" is the number of times that the RCU was waiting for a
quiescent state from this CPU.

o "rpq" is the number of times that the CPU had passed through
a quiescent state, but not yet reported it to RCU.

o "cbr" is the number of times that this CPU had RCU callbacks
that had passed through a grace period, and were thus ready
to be invoked.
Expand Down
11 changes: 11 additions & 0 deletions include/linux/debugobjects.h
Original file line number Diff line number Diff line change
Expand Up @@ -20,12 +20,14 @@ struct debug_obj_descr;
* struct debug_obj - representaion of an tracked object
* @node: hlist node to link the object into the tracker list
* @state: tracked object state
* @astate: current active state
* @object: pointer to the real object
* @descr: pointer to an object type specific debug description structure
*/
struct debug_obj {
struct hlist_node node;
enum debug_obj_state state;
unsigned int astate;
void *object;
struct debug_obj_descr *descr;
};
Expand Down Expand Up @@ -60,6 +62,15 @@ extern void debug_object_deactivate(void *addr, struct debug_obj_descr *descr);
extern void debug_object_destroy (void *addr, struct debug_obj_descr *descr);
extern void debug_object_free (void *addr, struct debug_obj_descr *descr);

/*
* Active state:
* - Set at 0 upon initialization.
* - Must return to 0 before deactivation.
*/
extern void
debug_object_active_state(void *addr, struct debug_obj_descr *descr,
unsigned int expect, unsigned int next);

extern void debug_objects_early_init(void);
extern void debug_objects_mem_init(void);
#else
Expand Down
1 change: 0 additions & 1 deletion include/linux/init_task.h
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,6 @@ extern struct group_info init_groups;
{ .first = &init_task.pids[PIDTYPE_PGID].node }, \
{ .first = &init_task.pids[PIDTYPE_SID].node }, \
}, \
.rcu = RCU_HEAD_INIT, \
.level = 0, \
.numbers = { { \
.nr = 0, \
Expand Down
50 changes: 32 additions & 18 deletions include/linux/rcupdate.h
Original file line number Diff line number Diff line change
Expand Up @@ -56,8 +56,6 @@ struct rcu_head {
};

/* Exported common interfaces */
extern void synchronize_rcu_bh(void);
extern void synchronize_sched(void);
extern void rcu_barrier(void);
extern void rcu_barrier_bh(void);
extern void rcu_barrier_sched(void);
Expand All @@ -66,8 +64,6 @@ extern int sched_expedited_torture_stats(char *page);

/* Internal to kernel */
extern void rcu_init(void);
extern int rcu_scheduler_active;
extern void rcu_scheduler_starting(void);

#if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
#include <linux/rcutree.h>
Expand All @@ -83,6 +79,14 @@ extern void rcu_scheduler_starting(void);
(ptr)->next = NULL; (ptr)->func = NULL; \
} while (0)

static inline void init_rcu_head_on_stack(struct rcu_head *head)
{
}

static inline void destroy_rcu_head_on_stack(struct rcu_head *head)
{
}

#ifdef CONFIG_DEBUG_LOCK_ALLOC

extern struct lockdep_map rcu_lock_map;
Expand All @@ -106,12 +110,13 @@ extern int debug_lockdep_rcu_enabled(void);
/**
* rcu_read_lock_held - might we be in RCU read-side critical section?
*
* If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in
* an RCU read-side critical section. In absence of CONFIG_PROVE_LOCKING,
* If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an RCU
* read-side critical section. In absence of CONFIG_DEBUG_LOCK_ALLOC,
* this assumes we are in an RCU read-side critical section unless it can
* prove otherwise.
*
* Check rcu_scheduler_active to prevent false positives during boot.
* Check debug_lockdep_rcu_enabled() to prevent false positives during boot
* and while lockdep is disabled.
*/
static inline int rcu_read_lock_held(void)
{
Expand All @@ -129,13 +134,15 @@ extern int rcu_read_lock_bh_held(void);
/**
* rcu_read_lock_sched_held - might we be in RCU-sched read-side critical section?
*
* If CONFIG_PROVE_LOCKING is selected and enabled, returns nonzero iff in an
* RCU-sched read-side critical section. In absence of CONFIG_PROVE_LOCKING,
* this assumes we are in an RCU-sched read-side critical section unless it
* can prove otherwise. Note that disabling of preemption (including
* disabling irqs) counts as an RCU-sched read-side critical section.
* If CONFIG_DEBUG_LOCK_ALLOC is selected, returns nonzero iff in an
* RCU-sched read-side critical section. In absence of
* CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
* critical section unless it can prove otherwise. Note that disabling
* of preemption (including disabling irqs) counts as an RCU-sched
* read-side critical section.
*
* Check rcu_scheduler_active to prevent false positives during boot.
* Check debug_lockdep_rcu_enabled() to prevent false positives during boot
* and while lockdep is disabled.
*/
#ifdef CONFIG_PREEMPT
static inline int rcu_read_lock_sched_held(void)
Expand Down Expand Up @@ -177,7 +184,7 @@ static inline int rcu_read_lock_bh_held(void)
#ifdef CONFIG_PREEMPT
static inline int rcu_read_lock_sched_held(void)
{
return !rcu_scheduler_active || preempt_count() != 0 || irqs_disabled();
return preempt_count() != 0 || irqs_disabled();
}
#else /* #ifdef CONFIG_PREEMPT */
static inline int rcu_read_lock_sched_held(void)
Expand All @@ -192,6 +199,15 @@ static inline int rcu_read_lock_sched_held(void)

extern int rcu_my_thread_group_empty(void);

#define __do_rcu_dereference_check(c) \
do { \
static bool __warned; \
if (debug_lockdep_rcu_enabled() && !__warned && !(c)) { \
__warned = true; \
lockdep_rcu_dereference(__FILE__, __LINE__); \
} \
} while (0)

/**
* rcu_dereference_check - rcu_dereference with debug checking
* @p: The pointer to read, prior to dereferencing
Expand Down Expand Up @@ -221,8 +237,7 @@ extern int rcu_my_thread_group_empty(void);
*/
#define rcu_dereference_check(p, c) \
({ \
if (debug_lockdep_rcu_enabled() && !(c)) \
lockdep_rcu_dereference(__FILE__, __LINE__); \
__do_rcu_dereference_check(c); \
rcu_dereference_raw(p); \
})

Expand All @@ -239,8 +254,7 @@ extern int rcu_my_thread_group_empty(void);
*/
#define rcu_dereference_protected(p, c) \
({ \
if (debug_lockdep_rcu_enabled() && !(c)) \
lockdep_rcu_dereference(__FILE__, __LINE__); \
__do_rcu_dereference_check(c); \
(p); \
})

Expand Down
29 changes: 28 additions & 1 deletion include/linux/rcutiny.h
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,10 @@

void rcu_sched_qs(int cpu);
void rcu_bh_qs(int cpu);
static inline void rcu_note_context_switch(int cpu)
{
rcu_sched_qs(cpu);
}

#define __rcu_read_lock() preempt_disable()
#define __rcu_read_unlock() preempt_enable()
Expand Down Expand Up @@ -74,7 +78,17 @@ static inline void rcu_sched_force_quiescent_state(void)
{
}

#define synchronize_rcu synchronize_sched
extern void synchronize_sched(void);

static inline void synchronize_rcu(void)
{
synchronize_sched();
}

static inline void synchronize_rcu_bh(void)
{
synchronize_sched();
}

static inline void synchronize_rcu_expedited(void)
{
Expand Down Expand Up @@ -114,4 +128,17 @@ static inline int rcu_preempt_depth(void)
return 0;
}

#ifdef CONFIG_DEBUG_LOCK_ALLOC

extern int rcu_scheduler_active __read_mostly;
extern void rcu_scheduler_starting(void);

#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */

static inline void rcu_scheduler_starting(void)
{
}

#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */

#endif /* __LINUX_RCUTINY_H */
Loading

0 comments on commit f262af3

Please sign in to comment.