Merge tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes

 - Callback-offload updates, perhaps most notably a new
   RCU_NOCB_CPU_DEFAULT_ALL Kconfig option that causes all CPUs to be
   offloaded at boot time, regardless of kernel boot parameters.

   This is useful to battery-powered systems such as ChromeOS and
   Android. In addition, a new RCU_NOCB_CPU_CB_BOOST kernel boot
   parameter prevents offloaded callbacks from interfering with
   real-time workloads and with energy-efficiency mechanisms

 - Polled grace-period updates, perhaps most notably making these APIs
   account for both normal and expedited grace periods

 - Tasks RCU updates, perhaps most notably reducing the CPU overhead of
   RCU tasks trace grace periods by more than a factor of two on a
   system with 15,000 tasks.

   The reduction is expected to increase with the number of tasks, so it
   seems reasonable to hypothesize that a system with 150,000 tasks
   might see a 20-fold reduction in CPU overhead

 - Torture-test updates

 - Updates that merge RCU's dyntick-idle tracking into context tracking,
   thus reducing the overhead of transitioning to kernel mode from
   either idle or nohz_full userspace execution for kernels that track
   context independently of RCU.

   This is expected to be helpful primarily for kernels built with
   CONFIG_NO_HZ_FULL=y

* tag 'rcu.2022.07.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (98 commits)
  rcu: Add irqs-disabled indicator to expedited RCU CPU stall warnings
  rcu: Diagnose extended sync_rcu_do_polled_gp() loops
  rcu: Put panic_on_rcu_stall() after expedited RCU CPU stall warnings
  rcutorture: Test polled expedited grace-period primitives
  rcu: Add polled expedited grace-period primitives
  rcutorture: Verify that polled GP API sees synchronous grace periods
  rcu: Make Tiny RCU grace periods visible to polled APIs
  rcu: Make polled grace-period API account for expedited grace periods
  rcu: Switch polled grace-period APIs to ->gp_seq_polled
  rcu/nocb: Avoid polling when my_rdp->nocb_head_rdp list is empty
  rcu/nocb: Add option to opt rcuo kthreads out of RT priority
  rcu: Add nocb_cb_kthread check to rcu_is_callbacks_kthread()
  rcu/nocb: Add an option to offload all CPUs on boot
  rcu/nocb: Fix NOCB kthreads spawn failure with rcu_nocb_rdp_deoffload() direct call
  rcu/nocb: Invert rcu_state.barrier_mutex VS hotplug lock locking order
  rcu/nocb: Add/del rdp to iterate from rcuog itself
  rcu/tree: Add comment to describe GP-done condition in fqs loop
  rcu: Initialize first_gp_fqs at declaration in rcu_gp_fqs()
  rcu/kvfree: Remove useless monitor_todo flag
  rcu: Cleanup RCU urgency state for offline CPU
  ...
torvalds committed Aug 3, 2022
2 parents c2a24a7 + 34bc7b4 commit 7d9d077
Showing 76 changed files with 2,124 additions and 1,279 deletions.
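
The merge message says the polled grace-period APIs now account for both normal and expedited grace periods. A minimal userspace sketch of that idea follows. It is a toy model, not the kernel's implementation: every `toy_` name is hypothetical, counter wraparound is ignored, and it assumes no grace period is already in flight when the cookie is taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a single counter of completed grace periods of either kind. */
struct toy_rcu {
	unsigned long gp_completed;
};

/* Take a cookie that is satisfied once one more grace period completes. */
static unsigned long toy_get_state(const struct toy_rcu *r)
{
	assert(r);
	return r->gp_completed + 1;
}

/* Poll without blocking: has the grace period named by the cookie elapsed? */
static bool toy_poll_state(const struct toy_rcu *r, unsigned long cookie)
{
	assert(r);
	return r->gp_completed >= cookie;
}

/* A normal grace period ends: advance the shared counter. */
static void toy_normal_gp_end(struct toy_rcu *r)
{
	assert(r);
	r->gp_completed++;
}

/* An expedited grace period ends: it advances the same counter. */
static void toy_expedited_gp_end(struct toy_rcu *r)
{
	assert(r);
	r->gp_completed++;
}
```

Because both kinds of grace period advance the same counter in this model, a cookie taken before an expedited grace period is reported as complete by the poll even if no normal grace period has run.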
10 changes: 5 additions & 5 deletions Documentation/RCU/Design/Requirements/Requirements.rst
@@ -1844,10 +1844,10 @@ that meets this requirement.

Furthermore, NMI handlers can be interrupted by what appear to RCU to be
normal interrupts. One way that this can happen is for code that
-directly invokes rcu_irq_enter() and rcu_irq_exit() to be called
+directly invokes ct_irq_enter() and ct_irq_exit() to be called
from an NMI handler. This astonishing fact of life prompted the current
-code structure, which has rcu_irq_enter() invoking
-rcu_nmi_enter() and rcu_irq_exit() invoking rcu_nmi_exit().
+code structure, which has ct_irq_enter() invoking
+ct_nmi_enter() and ct_irq_exit() invoking ct_nmi_exit().
And yes, I also learned of this requirement the hard way.

Loadable Modules
@@ -2195,7 +2195,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
sections, and RCU believes this CPU to be idle, no problem. This
sort of thing is used by some architectures for light-weight
exception handlers, which can then avoid the overhead of
-rcu_irq_enter() and rcu_irq_exit() at exception entry and
+ct_irq_enter() and ct_irq_exit() at exception entry and
exit, respectively. Some go further and avoid the entireties of
irq_enter() and irq_exit().
Just make very sure you are running some of your tests with
@@ -2226,7 +2226,7 @@ scheduling-clock interrupt be enabled when RCU needs it to be:
+-----------------------------------------------------------------------+
| **Answer**: |
+-----------------------------------------------------------------------+
-| One approach is to do ``rcu_irq_exit();rcu_irq_enter();`` every so |
+| One approach is to do ``ct_irq_exit();ct_irq_enter();`` every so |
| often. But given that long-running interrupt handlers can cause other |
| problems, not least for response time, shouldn't you work to keep |
| your interrupt handler's runtime within reasonable bounds? |
6 changes: 3 additions & 3 deletions Documentation/RCU/stallwarn.rst
@@ -97,12 +97,12 @@ warnings:
which will include additional debugging information.

- A low-level kernel issue that either fails to invoke one of the
-variants of rcu_user_enter(), rcu_user_exit(), rcu_idle_enter(),
-rcu_idle_exit(), rcu_irq_enter(), or rcu_irq_exit() on the one
+variants of rcu_eqs_enter(true), rcu_eqs_exit(true), ct_idle_enter(),
+ct_idle_exit(), ct_irq_enter(), or ct_irq_exit() on the one
hand, or that invokes one of them too many times on the other.
Historically, the most frequent issue has been an omission
of either irq_enter() or irq_exit(), which in turn invoke
-rcu_irq_enter() or rcu_irq_exit(), respectively. Building your
+ct_irq_enter() or ct_irq_exit(), respectively. Building your
kernel with CONFIG_RCU_EQS_DEBUG=y can help track down these types
of issues, which sometimes arise in architecture-specific code.

34 changes: 34 additions & 0 deletions Documentation/admin-guide/kernel-parameters.txt
@@ -3667,6 +3667,9 @@
just as if they had also been called out in the
rcu_nocbs= boot parameter.

+Note that this argument takes precedence over
+the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option.
+
noiotrap [SH] Disables trapped I/O port accesses.

noirqdebug [X86-32] Disables the code which attempts to detect and
@@ -4560,6 +4563,9 @@
no-callback mode from boot but the mode may be
toggled at runtime via cpusets.

+Note that this argument takes precedence over
+the CONFIG_RCU_NOCB_CPU_DEFAULT_ALL option.
+
rcu_nocb_poll [KNL]
Rather than requiring that offloaded CPUs
(specified by rcu_nocbs= above) explicitly
@@ -4669,6 +4675,34 @@
When RCU_NOCB_CPU is set, also adjust the
priority of NOCB callback kthreads.

+rcutree.rcu_divisor= [KNL]
+Set the shift-right count to use to compute
+the callback-invocation batch limit bl from
+the number of callbacks queued on this CPU.
+The result will be bounded below by the value of
+the rcutree.blimit kernel parameter. Every bl
+callbacks, the softirq handler will exit in
+order to allow the CPU to do other work.
+
+Please note that this callback-invocation batch
+limit applies only to non-offloaded callback
+invocation. Offloaded callbacks are instead
+invoked in the context of an rcuoc kthread, which
+scheduler will preempt as it does any other task.
+
+rcutree.nocb_nobypass_lim_per_jiffy= [KNL]
+On callback-offloaded (rcu_nocbs) CPUs,
+RCU reduces the lock contention that would
+otherwise be caused by callback floods through
+use of the ->nocb_bypass list. However, in the
+common non-flooded case, RCU queues directly to
+the main ->cblist in order to avoid the extra
+overhead of the ->nocb_bypass list and its lock.
+But if there are too many callbacks queued during
+a single jiffy, RCU pre-queues the callbacks into
+the ->nocb_bypass queue. The definition of "too
+many" is supplied by this kernel boot parameter.
+
rcutree.rcu_nocb_gp_stride= [KNL]
Set the number of NOCB callback kthreads in
each group, which defaults to the square root
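
The rcutree.rcu_divisor= text in the hunk above describes a small computation: shift the queued-callback count right by the divisor, then bound the result below by rcutree.blimit. The rule can be sketched in userspace C roughly as follows. The function name `toy_batch_limit` is hypothetical and this is not the kernel's code; the kernel's handling of out-of-range divisor values is not modeled, so the sketch simply asserts a sane range.

```c
#include <assert.h>

/*
 * Toy model of the batch-limit rule described for rcutree.rcu_divisor.
 * All names are hypothetical; this is an illustration, not kernel code.
 */
static long toy_batch_limit(long queued, int divisor, long blimit)
{
	long bl;

	/* A shift count must be non-negative and smaller than the type width. */
	assert(queued >= 0);
	assert(divisor >= 0 && divisor < (int)(sizeof(long) * 8 - 1));

	bl = queued >> divisor;	/* shift right by the divisor */
	if (bl < blimit)	/* bounded below by blimit */
		bl = blimit;
	return bl;
}
```

With illustrative values of divisor 7 and blimit 10 (chosen for this example only), 10,000 queued callbacks give a batch limit of 78, while 100 queued callbacks fall back to the floor of 10.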
6 changes: 3 additions & 3 deletions Documentation/features/time/context-tracking/arch-support.txt
@@ -1,7 +1,7 @@
#
-# Feature name: context-tracking
-# Kconfig: HAVE_CONTEXT_TRACKING
-# description: arch supports context tracking for NO_HZ_FULL
+# Feature name: user-context-tracking
+# Kconfig: HAVE_CONTEXT_TRACKING_USER
+# description: arch supports user context tracking for NO_HZ_FULL
#
-----------------------
| arch |status|
1 change: 1 addition & 0 deletions MAINTAINERS
@@ -5165,6 +5165,7 @@ F: include/linux/console*

CONTEXT TRACKING
M: Frederic Weisbecker <[email protected]>
+M: "Paul E. McKenney" <[email protected]>
S: Maintained
F: kernel/context_tracking.c
F: include/linux/context_tracking*
8 changes: 4 additions & 4 deletions arch/Kconfig
@@ -784,18 +784,18 @@ config HAVE_ARCH_WITHIN_STACK_FRAMES
and similar) by implementing an inline arch_within_stack_frames(),
which is used by CONFIG_HARDENED_USERCOPY.

-config HAVE_CONTEXT_TRACKING
+config HAVE_CONTEXT_TRACKING_USER
bool
help
Provide kernel/user boundaries probes necessary for subsystems
that need it, such as userspace RCU extended quiescent state.
Syscalls need to be wrapped inside user_exit()-user_enter(), either
optimized behind static key or through the slow path using TIF_NOHZ
flag. Exceptions handlers must be wrapped as well. Irqs are already
-protected inside rcu_irq_enter/rcu_irq_exit() but preemption or signal
+protected inside ct_irq_enter/ct_irq_exit() but preemption or signal
handling on irq exit still need to be protected.

-config HAVE_CONTEXT_TRACKING_OFFSTACK
+config HAVE_CONTEXT_TRACKING_USER_OFFSTACK
bool
help
Architecture neither relies on exception_enter()/exception_exit()
@@ -807,7 +807,7 @@ config HAVE_CONTEXT_TRACKING_OFFSTACK

- Critical entry code isn't preemptible (or better yet:
not interruptible).
-- No use of RCU read side critical sections, unless rcu_nmi_enter()
+- No use of RCU read side critical sections, unless ct_nmi_enter()
got called.
- No use of instrumentation, unless instrumentation_begin() got
called.
2 changes: 1 addition & 1 deletion arch/arm/Kconfig
@@ -84,7 +84,7 @@ config ARM
select HAVE_ARCH_TRANSPARENT_HUGEPAGE if ARM_LPAE
select HAVE_ARM_SMCCC if CPU_V7
select HAVE_EBPF_JIT if !CPU_ENDIAN_BE32
-select HAVE_CONTEXT_TRACKING
+select HAVE_CONTEXT_TRACKING_USER
select HAVE_C_RECORDMCOUNT
select HAVE_BUILDTIME_MCOUNT_SORT
select HAVE_DEBUG_KMEMLEAK if !XIP_KERNEL
4 changes: 2 additions & 2 deletions arch/arm/kernel/entry-common.S
@@ -28,7 +28,7 @@
#include "entry-header.S"

saved_psr .req r8
-#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING)
+#if defined(CONFIG_TRACE_IRQFLAGS) || defined(CONFIG_CONTEXT_TRACKING_USER)
saved_pc .req r9
#define TRACE(x...) x
#else
@@ -38,7 +38,7 @@ saved_pc .req lr

.section .entry.text,"ax",%progbits
.align 5
-#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING) || \
+#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING_USER) || \
IS_ENABLED(CONFIG_DEBUG_RSEQ))
/*
* This is the fast syscall return path. We do as little as possible here,
12 changes: 6 additions & 6 deletions arch/arm/kernel/entry-header.S
@@ -366,25 +366,25 @@ ALT_UP_B(.L1_\@)
* between user and kernel mode.
*/
.macro ct_user_exit, save = 1
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
.if \save
stmdb sp!, {r0-r3, ip, lr}
-bl context_tracking_user_exit
+bl user_exit_callable
ldmia sp!, {r0-r3, ip, lr}
.else
-bl context_tracking_user_exit
+bl user_exit_callable
.endif
#endif
.endm

.macro ct_user_enter, save = 1
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
.if \save
stmdb sp!, {r0-r3, ip, lr}
-bl context_tracking_user_enter
+bl user_enter_callable
ldmia sp!, {r0-r3, ip, lr}
.else
-bl context_tracking_user_enter
+bl user_enter_callable
.endif
#endif
.endm
5 changes: 3 additions & 2 deletions arch/arm/mach-imx/cpuidle-imx6q.c
@@ -3,6 +3,7 @@
* Copyright (C) 2012 Freescale Semiconductor, Inc.
*/

+#include <linux/context_tracking.h>
#include <linux/cpuidle.h>
#include <linux/module.h>
#include <asm/cpuidle.h>
@@ -24,9 +25,9 @@ static int imx6q_enter_wait(struct cpuidle_device *dev,
imx6_set_lpm(WAIT_UNCLOCKED);
raw_spin_unlock(&cpuidle_lock);

-rcu_idle_enter();
+ct_idle_enter();
cpu_do_idle();
-rcu_idle_exit();
+ct_idle_exit();

raw_spin_lock(&cpuidle_lock);
if (num_idle_cpus-- == num_online_cpus())
2 changes: 1 addition & 1 deletion arch/arm64/Kconfig
@@ -176,7 +176,7 @@ config ARM64
select HAVE_C_RECORDMCOUNT
select HAVE_CMPXCHG_DOUBLE
select HAVE_CMPXCHG_LOCAL
-select HAVE_CONTEXT_TRACKING
+select HAVE_CONTEXT_TRACKING_USER
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
14 changes: 7 additions & 7 deletions arch/arm64/kernel/entry-common.c
@@ -41,7 +41,7 @@ static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs)

if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
lockdep_hardirqs_off(CALLER_ADDR0);
-rcu_irq_enter();
+ct_irq_enter();
trace_hardirqs_off_finish();

regs->exit_rcu = true;
@@ -76,15 +76,15 @@ static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs)
if (regs->exit_rcu) {
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare();
-rcu_irq_exit();
+ct_irq_exit();
lockdep_hardirqs_on(CALLER_ADDR0);
return;
}

trace_hardirqs_on();
} else {
if (regs->exit_rcu)
-rcu_irq_exit();
+ct_irq_exit();
}
}

@@ -161,7 +161,7 @@ static void noinstr arm64_enter_nmi(struct pt_regs *regs)
__nmi_enter();
lockdep_hardirqs_off(CALLER_ADDR0);
lockdep_hardirq_enter();
-rcu_nmi_enter();
+ct_nmi_enter();

trace_hardirqs_off_finish();
ftrace_nmi_enter();
@@ -182,7 +182,7 @@ static void noinstr arm64_exit_nmi(struct pt_regs *regs)
lockdep_hardirqs_on_prepare();
}

-rcu_nmi_exit();
+ct_nmi_exit();
lockdep_hardirq_exit();
if (restore)
lockdep_hardirqs_on(CALLER_ADDR0);
@@ -199,7 +199,7 @@ static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs)
regs->lockdep_hardirqs = lockdep_hardirqs_enabled();

lockdep_hardirqs_off(CALLER_ADDR0);
-rcu_nmi_enter();
+ct_nmi_enter();

trace_hardirqs_off_finish();
}
@@ -218,7 +218,7 @@ static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs)
lockdep_hardirqs_on_prepare();
}

-rcu_nmi_exit();
+ct_nmi_exit();
if (restore)
lockdep_hardirqs_on(CALLER_ADDR0);
}
2 changes: 1 addition & 1 deletion arch/csky/Kconfig
@@ -42,7 +42,7 @@ config CSKY
select HAVE_ARCH_AUDITSYSCALL
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_SECCOMP_FILTER
-select HAVE_CONTEXT_TRACKING
+select HAVE_CONTEXT_TRACKING_USER
select HAVE_VIRT_CPU_ACCOUNTING_GEN
select HAVE_DEBUG_BUGVERBOSE
select HAVE_DEBUG_KMEMLEAK
8 changes: 4 additions & 4 deletions arch/csky/kernel/entry.S
@@ -19,11 +19,11 @@
.endm

.macro context_tracking
-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
mfcr a0, epsr
btsti a0, 31
bt 1f
-jbsr context_tracking_user_exit
+jbsr user_exit_callable
ldw a0, (sp, LSAVE_A0)
ldw a1, (sp, LSAVE_A1)
ldw a2, (sp, LSAVE_A2)
@@ -159,8 +159,8 @@ ret_from_exception:
and r10, r9
cmpnei r10, 0
bt exit_work
-#ifdef CONFIG_CONTEXT_TRACKING
-jbsr context_tracking_user_enter
+#ifdef CONFIG_CONTEXT_TRACKING_USER
+jbsr user_enter_callable
#endif
1:
#ifdef CONFIG_PREEMPTION
2 changes: 1 addition & 1 deletion arch/loongarch/Kconfig
@@ -75,7 +75,7 @@ config LOONGARCH
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_ASM_MODVERSIONS
-select HAVE_CONTEXT_TRACKING
+select HAVE_CONTEXT_TRACKING_USER
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DMA_CONTIGUOUS
select HAVE_EXIT_THREAD
2 changes: 1 addition & 1 deletion arch/mips/Kconfig
@@ -56,7 +56,7 @@ config MIPS
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE if CPU_SUPPORTS_HUGEPAGES
select HAVE_ASM_MODVERSIONS
-select HAVE_CONTEXT_TRACKING
+select HAVE_CONTEXT_TRACKING_USER
select HAVE_TIF_NOHZ
select HAVE_C_RECORDMCOUNT
select HAVE_DEBUG_KMEMLEAK
2 changes: 1 addition & 1 deletion arch/powerpc/Kconfig
@@ -202,7 +202,7 @@ config PPC
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK
select HAVE_ASM_MODVERSIONS
-select HAVE_CONTEXT_TRACKING if PPC64
+select HAVE_CONTEXT_TRACKING_USER if PPC64
select HAVE_C_RECORDMCOUNT
select HAVE_DEBUG_KMEMLEAK
select HAVE_DEBUG_STACKOVERFLOW
2 changes: 1 addition & 1 deletion arch/powerpc/include/asm/context_tracking.h
@@ -2,7 +2,7 @@
#ifndef _ASM_POWERPC_CONTEXT_TRACKING_H
#define _ASM_POWERPC_CONTEXT_TRACKING_H

-#ifdef CONFIG_CONTEXT_TRACKING
+#ifdef CONFIG_CONTEXT_TRACKING_USER
#define SCHEDULE_USER bl schedule_user
#else
#define SCHEDULE_USER bl schedule
2 changes: 1 addition & 1 deletion arch/riscv/Kconfig
@@ -86,7 +86,7 @@ config RISCV
select HAVE_ARCH_THREAD_STRUCT_WHITELIST
select HAVE_ARCH_VMAP_STACK if MMU && 64BIT
select HAVE_ASM_MODVERSIONS
-select HAVE_CONTEXT_TRACKING
+select HAVE_CONTEXT_TRACKING_USER
select HAVE_DEBUG_KMEMLEAK
select HAVE_DMA_CONTIGUOUS if MMU
select HAVE_EBPF_JIT if MMU