Merge branches 'debug-choice', 'devel-stable' and 'misc' into for-linus
Russell King committed Sep 5, 2013
4 parents d8dfad3 + 8d258be + 5cc91e0 + 9fc2105 commit 141b974
Showing 103 changed files with 1,590 additions and 1,343 deletions.
42 changes: 32 additions & 10 deletions Documentation/arm/Booting
@@ -18,7 +18,8 @@ following:
2. Initialise one serial port.
3. Detect the machine type.
4. Setup the kernel tagged list.
5. Call the kernel image.
5. Load initramfs.
6. Call the kernel image.


1. Setup and initialise RAM
@@ -120,12 +121,27 @@ tagged list.
The boot loader must pass at a minimum the size and location of the
system memory, and the root filesystem location. The dtb must be
placed in a region of memory where the kernel decompressor will not
overwrite it. The recommended placement is in the first 16KiB of RAM
with the caveat that it may not be located at physical address 0 since
the kernel interprets a value of 0 in r2 to mean neither a tagged list
nor a dtb were passed.
overwrite it, whilst remaining within the region which will be covered
by the kernel's low-memory mapping.

5. Calling the kernel image
A safe location is just above the 128MiB boundary from start of RAM.

5. Load initramfs.
------------------

Existing boot loaders: OPTIONAL
New boot loaders: OPTIONAL

If an initramfs is in use then, as with the dtb, it must be placed in
a region of memory where the kernel decompressor will not overwrite it
while also within the region which will be covered by the kernel's
low-memory mapping.

A safe location is just above the device tree blob which itself will
be loaded just above the 128MiB boundary from the start of RAM as
recommended above.

6. Calling the kernel image
---------------------------

Existing boot loaders: MANDATORY
@@ -136,11 +152,17 @@ is stored in flash, and is linked correctly to be run from flash,
then it is legal for the boot loader to call the zImage in flash
directly.

The zImage may also be placed in system RAM (at any location) and
called there. Note that the kernel uses 16K of RAM below the image
to store page tables. The recommended placement is 32KiB into RAM.
The zImage may also be placed in system RAM and called there. The
kernel should be placed in the first 128MiB of RAM. It is recommended
that it is loaded above 32MiB in order to avoid the need to relocate
prior to decompression, which will make the boot process slightly
faster.

When booting a raw (non-zImage) kernel the constraints are tighter.
In this case the kernel must be loaded at an offset into system RAM equal
to TEXT_OFFSET - PAGE_OFFSET.
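
The placement advice above can be sketched as a small helper that a boot
loader might use to pick load addresses. This is only an illustration:
plan_boot_layout(), struct boot_layout and align_up() are hypothetical names,
and page-aligning the initramfs is an assumption that the text does not
require. The caller is still responsible for checking that the zImage fits
inside the first 128MiB of RAM.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024u * 1024u)

/* Hypothetical container for the three load addresses. */
struct boot_layout {
	uint32_t zimage;    /* above 32MiB, avoids relocation before decompression */
	uint32_t dtb;       /* just above the 128MiB boundary */
	uint32_t initramfs; /* just above the dtb */
};

/* Round addr up to a power-of-two alignment. */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/* Pick addresses relative to the start of RAM, as recommended above. */
struct boot_layout plan_boot_layout(uint32_t ram_start, uint32_t dtb_size)
{
	struct boot_layout l;

	l.zimage = ram_start + 32 * MiB;
	l.dtb = ram_start + 128 * MiB;
	/* Assumption: 4KiB alignment for the initramfs start. */
	l.initramfs = align_up(l.dtb + dtb_size, 4096);
	return l;
}
```

With RAM starting at 0x80000000, this places the zImage at 0x82000000 and the
dtb at 0x88000000, with the initramfs on the next page boundary after the dtb.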

In either case, the following conditions must be met:
In any case, the following conditions must be met:

- Quiesce all DMA capable devices so that memory does not get
corrupted by bogus network packets or disk data. This will save
121 changes: 121 additions & 0 deletions Documentation/arm/kernel_mode_neon.txt
@@ -0,0 +1,121 @@
Kernel mode NEON
================

TL;DR summary
-------------
* Use only NEON instructions, or VFP instructions that don't rely on support
code
* Isolate your NEON code in a separate compilation unit, and compile it with
'-mfpu=neon -mfloat-abi=softfp'
* Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your
NEON code
* Don't sleep in your NEON code, and be aware that it will be executed with
preemption disabled


Introduction
------------
It is possible to use NEON instructions (and in some cases, VFP instructions) in
code that runs in kernel mode. However, for performance reasons, the NEON/VFP
register file is not preserved and restored at every context switch or taken
exception like the normal register file is, so some manual intervention is
required. Furthermore, special care is required for code that may sleep [i.e.,
may call schedule()], as NEON or VFP instructions will be executed in a
non-preemptible section for reasons outlined below.


Lazy preserve and restore
-------------------------
The NEON/VFP register file is managed using lazy preserve (on UP systems) and
lazy restore (on both SMP and UP systems). This means that the register file is
kept 'live', and is only preserved and restored when multiple tasks are
contending for the NEON/VFP unit (or, in the SMP case, when a task migrates to
another core). Lazy restore is implemented by disabling the NEON/VFP unit after
every context switch, resulting in a trap when subsequently a NEON/VFP
instruction is issued, allowing the kernel to step in and perform the restore if
necessary.

Any use of the NEON/VFP unit in kernel mode should not interfere with this, so
it is required to do an 'eager' preserve of the NEON/VFP register file, and
enable the NEON/VFP unit explicitly so no exceptions are generated on first
subsequent use. This is handled by the function kernel_neon_begin(), which
should be called before any kernel mode NEON or VFP instructions are issued.
Likewise, the NEON/VFP unit should be disabled again after use to make sure user
mode will hit the lazy restore trap upon next use. This is handled by the
function kernel_neon_end().


Interruptions in kernel mode
----------------------------
For reasons of performance and simplicity, it was decided that there shall be no
preserve/restore mechanism for the kernel mode NEON/VFP register contents. This
implies that interruptions of a kernel mode NEON section can only be allowed if
they are guaranteed not to touch the NEON/VFP registers. For this reason, the
following rules and restrictions apply in the kernel:
* NEON/VFP code is not allowed in interrupt context;
* NEON/VFP code is not allowed to sleep;
* NEON/VFP code is executed with preemption disabled.

If latency is a concern, it is possible to put back to back calls to
kernel_neon_end() and kernel_neon_begin() in places in your code where none of
the NEON registers are live. (Additional calls to kernel_neon_begin() should be
reasonably cheap if no context switch occurred in the meantime.)
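
The back to back pattern described above might look roughly like the
following sketch. To keep it buildable outside the kernel,
kernel_neon_begin() and kernel_neon_end() are replaced by user-space
stand-ins that only track state; process_buffer(), neon_process_chunk() and
CHUNK_BYTES are hypothetical names, and the chunk size is an arbitrary
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the real kernel functions (assumption). */
static int neon_active;
static int neon_sections;

static void kernel_neon_begin(void)
{
	assert(!neon_active);
	neon_active = 1;
	neon_sections++;
}

static void kernel_neon_end(void)
{
	assert(neon_active);
	neon_active = 0;
}

#define CHUNK_BYTES 4096u /* hypothetical chunk size */

/* Placeholder for the real NEON routine; must run inside a NEON section. */
static void neon_process_chunk(uint8_t *buf, size_t len)
{
	assert(neon_active);
	for (size_t i = 0; i < len; i++)
		buf[i] = (uint8_t)~buf[i];
}

void process_buffer(uint8_t *buf, size_t len)
{
	kernel_neon_begin();
	while (len > 0) {
		size_t n = len < CHUNK_BYTES ? len : CHUNK_BYTES;

		neon_process_chunk(buf, n);
		buf += n;
		len -= n;
		if (len > 0) {
			/* No NEON registers are live here, so open a
			 * window in which preemption is possible. */
			kernel_neon_end();
			kernel_neon_begin();
		}
	}
	kernel_neon_end();
}
```

Each chunk boundary is a point where no NEON register holds live data, which
is what makes the end/begin pair safe there.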


VFP and support code
--------------------
Earlier versions of VFP (prior to version 3) rely on software support for things
like IEEE-754 compliant underflow handling etc. When the VFP unit needs such
software assistance, it signals the kernel by raising an undefined instruction
exception. The kernel responds by inspecting the VFP control registers and the
current instruction and arguments, and emulates the instruction in software.

Such software assistance is currently not implemented for VFP instructions
executed in kernel mode. If such a condition is encountered, the kernel will
fail and generate an OOPS.


Separating NEON code from ordinary code
---------------------------------------
The compiler is not aware of the special significance of kernel_neon_begin() and
kernel_neon_end(), i.e., that it is only allowed to issue NEON/VFP instructions
between calls to these respective functions. Furthermore, GCC may generate NEON
instructions of its own at -O3 level if -mfpu=neon is selected, and even if the
kernel is currently compiled at -O2, future changes may result in NEON/VFP
instructions appearing in unexpected places if no special care is taken.

Therefore, the recommended and only supported way of using NEON/VFP in the
kernel is by adhering to the following rules:
* isolate the NEON code in a separate compilation unit and compile it with
'-mfpu=neon -mfloat-abi=softfp';
* issue the calls to kernel_neon_begin(), kernel_neon_end() as well as the calls
into the unit containing the NEON code from a compilation unit which is *not*
built with the GCC flag '-mfpu=neon' set.

As the kernel is compiled with '-msoft-float', the above will guarantee that
both NEON and VFP instructions will only ever appear in designated compilation
units at any optimization level.
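
As a rough sketch of the split described above, the wrapper below is what
would live in an ordinary compilation unit, while the inner routine stands
for code that would sit in its own file built with
'-mfpu=neon -mfloat-abi=softfp'. Both are shown in one block only so that the
example is self-contained. kernel_neon_begin() and kernel_neon_end() are
replaced by user-space stand-ins here (in the kernel they are declared in
<asm/neon.h>), and xor_block() and xor_block_neon_inner() are hypothetical
names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the real kernel functions (assumption). */
static int neon_depth;

static void kernel_neon_begin(void) { neon_depth++; }
static void kernel_neon_end(void) { neon_depth--; }

/*
 * Would live in a separate file compiled with the NEON flags, so that
 * the compiler can only emit NEON instructions here.
 */
static void xor_block_neon_inner(uint8_t *dst, const uint8_t *src, size_t len)
{
	assert(neon_depth == 1); /* must be called inside a NEON section */
	for (size_t i = 0; i < len; i++)
		dst[i] ^= src[i];
}

/*
 * Would live in a unit built *without* '-mfpu=neon': it brackets the
 * call into the NEON unit and contains no NEON code itself.
 */
void xor_block(uint8_t *dst, const uint8_t *src, size_t len)
{
	kernel_neon_begin();
	xor_block_neon_inner(dst, src, len);
	kernel_neon_end();
}
```

Callers only ever see xor_block(), so the begin/end bracketing cannot be
forgotten at individual call sites.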


NEON assembler
--------------
NEON assembler is supported with no additional caveats as long as the rules
above are followed.


NEON code generated by GCC
--------------------------
The GCC option -ftree-vectorize (implied by -O3) tries to exploit implicit
parallelism, and generates NEON code from ordinary C source code. This is fully
supported as long as the rules above are followed.


NEON intrinsics
---------------
NEON intrinsics are also supported. However, as code using NEON intrinsics
relies on the GCC header <arm_neon.h> (which #includes <stdint.h>), you should
observe the following in addition to the rules above:
* Compile the unit containing the NEON intrinsics with '-ffreestanding' so GCC
uses its builtin version of <stdint.h> (this is a C99 header which the kernel
does not supply);
* Include <arm_neon.h> last, or at least after <linux/types.h>
4 changes: 3 additions & 1 deletion Documentation/devicetree/bindings/arm/l2cc.txt
@@ -16,9 +16,11 @@ Required properties:
performs the same operation).
"marvell,aurora-outer-cache": Marvell Controller designed to be
compatible with the ARM one with outer cache mode.
"bcm,bcm11351-a2-pl310-cache": For Broadcom bcm11351 chipset where an
"brcm,bcm11351-a2-pl310-cache": For Broadcom bcm11351 chipset where an
offset needs to be added to the address before passing down to the L2
cache controller
"bcm,bcm11351-a2-pl310-cache": DEPRECATED by
"brcm,bcm11351-a2-pl310-cache"
- cache-unified : Specifies the cache is a unified cache.
- cache-level : Should be set to 2 for a level 2 cache.
- reg : Physical base address and size of cache controller's memory mapped
62 changes: 59 additions & 3 deletions arch/arm/Kconfig
@@ -52,6 +52,7 @@ config ARM
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_UID16
select IRQ_FORCED_THREADING
select KTIME_SCALAR
select PERF_USE_VMALLOC
select RTC_LIB
@@ -1372,6 +1373,15 @@ config ARM_ERRATA_798181
which sends an IPI to the CPUs that are running the same ASID
as the one being invalidated.

config ARM_ERRATA_773022
bool "ARM errata: incorrect instructions may be executed from loop buffer"
depends on CPU_V7
help
This option enables the workaround for the 773022 Cortex-A15
(up to r0p4) erratum. In certain rare sequences of code, the
loop buffer may deliver incorrect instructions. This
workaround disables the loop buffer to avoid the erratum.

endmenu

source "arch/arm/common/Kconfig"
@@ -1613,13 +1623,49 @@ config ARCH_NR_GPIO

source kernel/Kconfig.preempt

config HZ
config HZ_FIXED
int
default 200 if ARCH_EBSA110 || ARCH_S3C24XX || ARCH_S5P64X0 || \
ARCH_S5PV210 || ARCH_EXYNOS4
default AT91_TIMER_HZ if ARCH_AT91
default SHMOBILE_TIMER_HZ if ARCH_SHMOBILE
default 100

choice
depends on !HZ_FIXED
prompt "Timer frequency"

config HZ_100
bool "100 Hz"

config HZ_200
bool "200 Hz"

config HZ_250
bool "250 Hz"

config HZ_300
bool "300 Hz"

config HZ_500
bool "500 Hz"

config HZ_1000
bool "1000 Hz"

endchoice

config HZ
int
default HZ_FIXED if HZ_FIXED
default 100 if HZ_100
default 200 if HZ_200
default 250 if HZ_250
default 300 if HZ_300
default 500 if HZ_500
default 1000
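
The way the entries above resolve to a final HZ value can be mirrored in plain
C as a rough sketch. resolve_hz() and enum hz_choice are hypothetical names
used only for illustration, and the sketch assumes HZ_FIXED evaluates to 0
when no platform default applies.

```c
/* Hypothetical mirror of the "Timer frequency" choice entries. */
enum hz_choice {
	HZ_CHOICE_100,
	HZ_CHOICE_200,
	HZ_CHOICE_250,
	HZ_CHOICE_300,
	HZ_CHOICE_500,
	HZ_CHOICE_1000,
};

/* A non-zero platform value wins; otherwise the user's choice applies. */
int resolve_hz(int hz_fixed, enum hz_choice choice)
{
	if (hz_fixed)
		return hz_fixed;

	switch (choice) {
	case HZ_CHOICE_100: return 100;
	case HZ_CHOICE_200: return 200;
	case HZ_CHOICE_250: return 250;
	case HZ_CHOICE_300: return 300;
	case HZ_CHOICE_500: return 500;
	default:            return 1000; /* matches the final "default 1000" */
	}
}
```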

config SCHED_HRTICK
def_bool HIGH_RES_TIMERS
@@ -1756,6 +1802,9 @@ config HAVE_ARCH_TRANSPARENT_HUGEPAGE
def_bool y
depends on ARM_LPAE

config ARCH_WANT_GENERAL_HUGETLB
def_bool y

source "mm/Kconfig"

config FORCE_MAX_ZONEORDER
@@ -2175,6 +2224,13 @@ config NEON
Say Y to include support code for NEON, the ARMv7 Advanced SIMD
Extension.

config KERNEL_MODE_NEON
bool "Support for NEON in kernel mode"
default n
depends on NEON
help
Say Y to include support for NEON in kernel mode.

endmenu

menu "Userspace binary formats"
@@ -2199,7 +2255,7 @@ source "kernel/power/Kconfig"

config ARCH_SUSPEND_POSSIBLE
depends on !ARCH_S5PC100
depends on CPU_ARM920T || CPU_ARM926T || CPU_SA1100 || \
depends on CPU_ARM920T || CPU_ARM926T || CPU_FEROCEON || CPU_SA1100 || \
CPU_V6 || CPU_V6K || CPU_V7 || CPU_XSC3 || CPU_XSCALE || CPU_MOHAWK
def_bool y
