Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions, but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available (a brief sketch follows this list).

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.
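
   A minimal sketch of the execute-only usage referenced above (assumes a
   kernel built with ARM64_EPAN running on EPAN-capable hardware;
   illustrative code, not part of this merge):

       #include <sys/mman.h>

       int main(void)
       {
               /* Request an execute-only anonymous mapping. With EPAN,
                * PROT_EXEC without PROT_READ no longer implies that user
                * space can also read the pages. */
               void *p = mmap(NULL, 4096, PROT_EXEC,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
               return p == MAP_FAILED;
       }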

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
torvalds committed Apr 26, 2021
2 parents 6a71382 + a27a881 commit 31a24ae
Showing 122 changed files with 3,862 additions and 914 deletions.
3 changes: 1 addition & 2 deletions Documentation/admin-guide/kernel-parameters.txt
@@ -2279,8 +2279,7 @@
state is kept private from the host.
Not valid if the kernel is running in EL2.

-Defaults to VHE/nVHE based on hardware support and
-the value of CONFIG_ARM64_VHE.
+Defaults to VHE/nVHE based on hardware support.

kvm-arm.vgic_v3_group0_trap=
[KVM,ARM] Trap guest accesses to GICv3 group-0
54 changes: 54 additions & 0 deletions Documentation/admin-guide/perf/hisi-pmu.rst
@@ -53,6 +53,60 @@ Example usage of perf::
$# perf stat -a -e hisi_sccl3_l3c0/rd_hit_cpipe/ sleep 5
$# perf stat -a -e hisi_sccl3_l3c0/config=0x02/ sleep 5

For HiSilicon uncore PMU v2 whose identifier is 0x30, the topology is the same
as PMU v1, but some new functions are added to the hardware.

(a) The L3C PMU supports filtering by core/thread within the cluster; the
cores/threads of interest can be specified as a bitmap::

$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_core=0x3/ sleep 5

This will only count the operations from core/thread 0 and 1 in this cluster.

(b) Tracetag allows the user to choose to count only read, write or atomic
operations via the tt_req parameter in perf. The default value counts all
operations. tt_req is 3 bits: 3'b100 represents read operations, 3'b101
represents write operations, 3'b110 represents atomic store operations and
3'b111 represents atomic non-store operations; other values are reserved::

$# perf stat -a -e hisi_sccl3_l3c0/config=0x02,tt_req=0x4/ sleep 5

This will only count the read operations in this cluster.

(c) Datasrc allows the user to check where the data comes from. It is 5 bits.
Some important codes are as follows:
5'b00001: comes from L3C in this die;
5'b01000: comes from L3C in the cross-die;
5'b01001: comes from L3C which is in another socket;
5'b01110: comes from the local DDR;
5'b01111: comes from the cross-die DDR;
5'b10000: comes from cross-socket DDR;
etc. It is mainly helpful for determining how close the data source is to the
CPU cores. If datasrc_cfg is used on a multi-chip system, datasrc_skt shall
also be configured in the perf command::

$# perf stat -a -e hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xE/,
hisi_sccl3_l3c0/config=0xb9,datasrc_cfg=0xF/ sleep 5

(d) Some HiSilicon SoCs encapsulate multiple CPU and IO dies. Each CPU die
contains several Compute Clusters (CCLs). The I/O dies are called Super I/O
clusters (SICLs) and contain multiple I/O clusters (ICLs). Each CCL/ICL in the
SoC has a unique ID, which is 11 bits: a 6-bit SCCL-ID and a 5-bit CCL/ICL-ID.
For an I/O die, the ICL-ID is encoded as follows:
5'b00000: I/O_MGMT_ICL;
5'b00001: Network_ICL;
5'b00011: HAC_ICL;
5'b10000: PCIe_ICL;

Users can configure IDs to count data coming from a specific CCL/ICL by
setting srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by
setting tgtid_cmd & tgtid_msk. A set bit in srcid_msk/tgtid_msk means the PMU
will not check the corresponding bit when matching against the
srcid_cmd/tgtid_cmd.
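
As a hedged illustration (the PMU instance name, event code and ID values
below are placeholders, not values taken from this commit), counting only
traffic whose source CCL/ICL ID is 0x2, with no mask bits set, might look
like::

$# perf stat -a -e hisi_sccl3_hha0/config=0x02,srcid_cmd=0x2,srcid_msk=0x0/ sleep 5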

If all of these options are disabled, the PMU counts in its default mode: it
does not distinguish by filter condition or ID information, and returns the
total counter values in the PMU counters.

The current driver does not support sampling, so "perf record" is
unsupported. Attaching to a task is also unsupported, as the events are all
uncore.

13 changes: 10 additions & 3 deletions Documentation/arm64/booting.rst
@@ -202,9 +202,10 @@ Before jumping into the kernel, the following conditions must be met:

- System registers

-All writable architected system registers at the exception level where
-the kernel image will be entered must be initialised by software at a
-higher exception level to prevent execution in an UNKNOWN state.
+All writable architected system registers at or below the exception
+level where the kernel image will be entered must be initialised by
+software at a higher exception level to prevent execution in an UNKNOWN
+state.

- SCR_EL3.FIQ must have the same value across all CPUs the kernel is
executing on.
Expand Down Expand Up @@ -270,6 +271,12 @@ Before jumping into the kernel, the following conditions must be met:
having 0b1 set for the corresponding bit for each of the auxiliary
counters present.

For CPUs with the Fine Grained Traps (FEAT_FGT) extension present:

- If EL3 is present and the kernel is entered at EL2:

- SCR_EL3.FGTEn (bit 27) must be initialised to 0b1.
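
A minimal sketch of satisfying this requirement (illustrative EL3
firmware/boot-loader code based on the bit position stated above; it is
not part of this commit)::

	mrs	x0, scr_el3
	orr	x0, x0, #(1 << 27)	// SCR_EL3.FGTEn = 0b1
	msr	scr_el3, x0
	isb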

The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level.
34 changes: 34 additions & 0 deletions Documentation/arm64/pointer-authentication.rst
@@ -107,3 +107,37 @@ filter out the Pointer Authentication system key registers from
KVM_GET/SET_REG_* ioctls and mask those features from cpufeature ID
register. Any attempt to use the Pointer Authentication instructions will
result in an UNDEFINED exception being injected into the guest.


Enabling and disabling keys
---------------------------

The prctl PR_PAC_SET_ENABLED_KEYS allows the user program to control which
PAC keys are enabled in a particular task. It takes two arguments, the
first being a bitmask of PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY
and PR_PAC_APDBKEY specifying which keys shall be affected by this prctl,
and the second being a bitmask of the same bits specifying whether the key
should be enabled or disabled. For example::

prctl(PR_PAC_SET_ENABLED_KEYS,
PR_PAC_APIAKEY | PR_PAC_APIBKEY | PR_PAC_APDAKEY | PR_PAC_APDBKEY,
PR_PAC_APIBKEY, 0, 0);

disables all keys except the IB key.

The main reason why this is useful is to enable a userspace ABI that uses PAC
instructions to sign and authenticate function pointers and other pointers
exposed outside of the function, while still allowing binaries conforming to
the ABI to interoperate with legacy binaries that do not sign or authenticate
pointers.

The idea is that a dynamic loader or early startup code would issue this
prctl very early after establishing that a process may load legacy binaries,
but before executing any PAC instructions.

For compatibility with previous kernel versions, processes start up with IA,
IB, DA and DB enabled, and are reset to this state on exec(). Processes created
via fork() and clone() inherit the key enabled state from the calling process.

It is recommended to avoid disabling the IA key, as this has a higher
performance overhead than disabling any of the other keys.
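
A complementary sketch (hypothetical code, not from this commit; it assumes
kernel/libc headers that define the PR_PAC_* constants) of querying the
current state with PR_PAC_GET_ENABLED_KEYS::

	#include <stdio.h>
	#include <sys/prctl.h>
	#include <linux/prctl.h>

	int main(void)
	{
		/* Returns a bitmask of the currently enabled PAC keys. */
		long keys = prctl(PR_PAC_GET_ENABLED_KEYS, 0, 0, 0, 0);

		if (keys < 0) {
			perror("PR_PAC_GET_ENABLED_KEYS");
			return 1;
		}
		printf("IB key enabled: %s\n",
		       (keys & PR_PAC_APIBKEY) ? "yes" : "no");
		return 0;
	}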
2 changes: 1 addition & 1 deletion Documentation/arm64/tagged-address-abi.rst
@@ -40,7 +40,7 @@ space obtained in one of the following ways:
during creation and with the same restrictions as for ``mmap()`` above
(e.g. data, bss, stack).

-The AArch64 Tagged Address ABI has two stages of relaxation depending
+The AArch64 Tagged Address ABI has two stages of relaxation depending on
how the user addresses are used by the kernel:

1. User addresses not accessed by the kernel but used for address space
9 changes: 9 additions & 0 deletions Documentation/dev-tools/kasan.rst
@@ -161,6 +161,15 @@ particular KASAN features.

- ``kasan=off`` or ``=on`` controls whether KASAN is enabled (default: ``on``).

- ``kasan.mode=sync`` or ``=async`` controls whether KASAN is configured in
synchronous or asynchronous mode of execution (default: ``sync``).
Synchronous mode: a bad access is detected immediately when a tag
check fault occurs.
Asynchronous mode: detection of a bad access is delayed. When a tag check
fault occurs, the information is stored in hardware (in the TFSR_EL1
register for arm64). The kernel periodically checks the hardware and
only reports tag faults during these checks.

- ``kasan.stacktrace=off`` or ``=on`` disables or enables collection of alloc
  and free stack traces (default: ``on``).
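
For instance (a hedged illustration combining the parameters documented
above; the available modes depend on how the kernel was configured), an
MTE-based KASAN kernel could be booted in asynchronous mode with stack
trace collection disabled::

	kasan=on kasan.mode=async kasan.stacktrace=off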

41 changes: 20 additions & 21 deletions arch/arm64/Kconfig
@@ -108,9 +108,9 @@ config ARM64
select GENERIC_CPU_AUTOPROBE
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
select GENERIC_FIND_FIRST_BIT
select GENERIC_IDLE_POLL_SETUP
select GENERIC_IRQ_IPI
select GENERIC_IRQ_MULTI_HANDLER
select GENERIC_IRQ_PROBE
select GENERIC_IRQ_SHOW
select GENERIC_IRQ_SHOW_LEVEL
@@ -138,6 +138,7 @@ config ARM64
select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_HW_TAGS if (HAVE_ARCH_KASAN && ARM64_MTE)
select HAVE_ARCH_KFENCE
@@ -195,6 +196,7 @@ config ARM64
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN_GENERIC
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE
select NEED_SG_DMA_LENGTH
@@ -1069,6 +1071,9 @@ config SYS_SUPPORTS_HUGETLBFS
config ARCH_HAS_CACHE_LINE_SIZE
def_bool y

config ARCH_HAS_FILTER_PGPROT
def_bool y

config ARCH_ENABLE_SPLIT_PMD_PTLOCK
def_bool y if PGTABLE_LEVELS > 2

@@ -1430,19 +1435,6 @@ config ARM64_USE_LSE_ATOMICS
built with binutils >= 2.25 in order for the new instructions
to be used.

config ARM64_VHE
bool "Enable support for Virtualization Host Extensions (VHE)"
default y
help
Virtualization Host Extensions (VHE) allow the kernel to run
directly at EL2 (instead of EL1) on processors that support
it. This leads to better performance for KVM, as they reduce
the cost of the world switch.

Selecting this option allows the VHE feature to be detected
at runtime, and does not affect processors that do not
implement this feature.

endmenu

menu "ARMv8.2 architectural features"
@@ -1696,10 +1688,23 @@ config ARM64_MTE

endmenu

menu "ARMv8.7 architectural features"

config ARM64_EPAN
bool "Enable support for Enhanced Privileged Access Never (EPAN)"
default y
depends on ARM64_PAN
help
Enhanced Privileged Access Never (EPAN) allows Privileged
Access Never to be used with Execute-only mappings.

The feature is detected at runtime, and will remain disabled
if the cpu does not implement the feature.
endmenu

config ARM64_SVE
bool "ARM Scalable Vector Extension support"
default y
depends on !KVM || ARM64_VHE
help
The Scalable Vector Extension (SVE) is an extension to the AArch64
execution state which complements and extends the SIMD functionality
@@ -1728,12 +1733,6 @@ config ARM64_SVE
booting the kernel. If unsure and you are not observing these
symptoms, you should assume that it is safe to say Y.

CPUs that support SVE are architecturally required to support the
Virtualization Host Extensions (VHE), so the kernel makes no
provision for supporting SVE alongside KVM without VHE enabled.
Thus, you will need to enable CONFIG_ARM64_VHE if you want to support
KVM in the same kernel image.

config ARM64_MODULE_PLTS
bool "Use PLTs to allow module memory to spill over into vmalloc area"
depends on MODULES
1 change: 1 addition & 0 deletions arch/arm64/configs/defconfig
@@ -1156,6 +1156,7 @@ CONFIG_CRYPTO_DEV_HISI_TRNG=m
CONFIG_CMA_SIZE_MBYTES=32
CONFIG_PRINTK_TIME=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_INFO_REDUCED=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_KERNEL=y
2 changes: 1 addition & 1 deletion arch/arm64/crypto/aes-modes.S
@@ -701,7 +701,7 @@ AES_FUNC_START(aes_mac_update)
cbz w5, .Lmacout
encrypt_block v0, w2, x1, x7, w8
st1 {v0.16b}, [x4] /* return dg */
-cond_yield .Lmacout, x7
+cond_yield .Lmacout, x7, x8
b .Lmacloop4x
.Lmac1x:
add w3, w3, #4
2 changes: 1 addition & 1 deletion arch/arm64/crypto/sha1-ce-core.S
@@ -121,7 +121,7 @@ CPU_LE( rev32 v11.16b, v11.16b )
add dgav.4s, dgav.4s, dg0v.4s

cbz w2, 2f
-cond_yield 3f, x5
+cond_yield 3f, x5, x6
b 0b

/*
2 changes: 1 addition & 1 deletion arch/arm64/crypto/sha2-ce-core.S
@@ -129,7 +129,7 @@ CPU_LE( rev32 v19.16b, v19.16b )

/* handled all input blocks? */
cbz w2, 2f
-cond_yield 3f, x5
+cond_yield 3f, x5, x6
b 0b

/*
4 changes: 2 additions & 2 deletions arch/arm64/crypto/sha3-ce-core.S
@@ -184,11 +184,11 @@ SYM_FUNC_START(sha3_ce_transform)
eor v0.16b, v0.16b, v31.16b

cbnz w8, 3b
-cond_yield 3f, x8
+cond_yield 4f, x8, x9
cbnz w2, 0b

/* save state */
-3: st1 { v0.1d- v3.1d}, [x0], #32
+4: st1 { v0.1d- v3.1d}, [x0], #32
st1 { v4.1d- v7.1d}, [x0], #32
st1 { v8.1d-v11.1d}, [x0], #32
st1 {v12.1d-v15.1d}, [x0], #32
2 changes: 1 addition & 1 deletion arch/arm64/crypto/sha512-ce-core.S
@@ -195,7 +195,7 @@ CPU_LE( rev64 v19.16b, v19.16b )
add v10.2d, v10.2d, v2.2d
add v11.2d, v11.2d, v3.2d

-cond_yield 3f, x4
+cond_yield 3f, x4, x5
/* handled all input blocks? */
cbnz w2, 0b

2 changes: 1 addition & 1 deletion arch/arm64/include/asm/arch_gicv3.h
@@ -173,7 +173,7 @@ static inline void gic_pmr_mask_irqs(void)

static inline void gic_arch_enable_irqs(void)
{
-asm volatile ("msr daifclr, #2" : : : "memory");
+asm volatile ("msr daifclr, #3" : : : "memory");
}

#endif /* __ASSEMBLY__ */
21 changes: 0 additions & 21 deletions arch/arm64/include/asm/arch_timer.h
@@ -165,25 +165,6 @@ static inline void arch_timer_set_cntkctl(u32 cntkctl)
isb();
}

/*
* Ensure that reads of the counter are treated the same as memory reads
* for the purposes of ordering by subsequent memory barriers.
*
* This insanity brought to you by speculative system register reads,
* out-of-order memory accesses, sequence locks and Thomas Gleixner.
*
* http://lists.infradead.org/pipermail/linux-arm-kernel/2019-February/631195.html
*/
#define arch_counter_enforce_ordering(val) do { \
u64 tmp, _val = (val); \
\
asm volatile( \
" eor %0, %1, %1\n" \
" add %0, sp, %0\n" \
" ldr xzr, [%0]" \
: "=r" (tmp) : "r" (_val)); \
} while (0)

static __always_inline u64 __arch_counter_get_cntpct_stable(void)
{
u64 cnt;
@@ -224,8 +205,6 @@ static __always_inline u64 __arch_counter_get_cntvct(void)
return cnt;
}

#undef arch_counter_enforce_ordering

static inline int arch_timer_arch_init(void)
{
return 0;
20 changes: 1 addition & 19 deletions arch/arm64/include/asm/asm_pointer_auth.h
@@ -13,30 +13,12 @@
* so use the base value of ldp as thread.keys_user and offset as
* thread.keys_user.ap*.
*/
-.macro ptrauth_keys_install_user tsk, tmp1, tmp2, tmp3
+.macro __ptrauth_keys_install_user tsk, tmp1, tmp2, tmp3
mov \tmp1, #THREAD_KEYS_USER
add \tmp1, \tsk, \tmp1
alternative_if_not ARM64_HAS_ADDRESS_AUTH
b .Laddr_auth_skip_\@
alternative_else_nop_endif
ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APIA]
msr_s SYS_APIAKEYLO_EL1, \tmp2
msr_s SYS_APIAKEYHI_EL1, \tmp3
ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APIB]
msr_s SYS_APIBKEYLO_EL1, \tmp2
msr_s SYS_APIBKEYHI_EL1, \tmp3
ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDA]
msr_s SYS_APDAKEYLO_EL1, \tmp2
msr_s SYS_APDAKEYHI_EL1, \tmp3
ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APDB]
msr_s SYS_APDBKEYLO_EL1, \tmp2
msr_s SYS_APDBKEYHI_EL1, \tmp3
.Laddr_auth_skip_\@:
alternative_if ARM64_HAS_GENERIC_AUTH
ldp \tmp2, \tmp3, [\tmp1, #PTRAUTH_USER_KEY_APGA]
msr_s SYS_APGAKEYLO_EL1, \tmp2
msr_s SYS_APGAKEYHI_EL1, \tmp3
alternative_else_nop_endif
.endm

.macro __ptrauth_keys_install_kernel_nosync tsk, tmp1, tmp2, tmp3