Skip to content

Commit

Permalink
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kern…
Browse files Browse the repository at this point in the history
…el/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "Highlights include a major rework of our kPTI page-table rewriting
  code (which makes it both more maintainable and considerably faster in
  the cases where it is required) as well as significant changes to our
  early boot code to reduce the need for data cache maintenance and
  greatly simplify the KASLR relocation dance.

  Summary:

   - Remove unused generic cpuidle support (replaced by PSCI version)

   - Fix documentation describing the kernel virtual address space

   - Handling of some new CPU errata in Arm implementations

   - Rework of our exception table code in preparation for handling
     machine checks (i.e. RAS errors) more gracefully

   - Switch over to the generic implementation of ioremap()

   - Fix lockdep tracking in NMI context

   - Instrument our memory barrier macros for KCSAN

   - Rework of the kPTI G->nG page-table repainting so that the MMU
     remains enabled and the boot time is no longer slowed to a crawl
     for systems which require the late remapping

   - Enable support for direct swapping of 2MiB transparent huge-pages
     on systems without MTE

   - Fix handling of MTE tags with allocating new pages with HW KASAN

   - Expose the SMIDR register to userspace via sysfs

   - Continued rework of the stack unwinder, particularly improving the
     behaviour under KASAN

   - More repainting of our system register definitions to match the
     architectural terminology

   - Improvements to the layout of the vDSO objects

   - Support for allocating additional bits of HWCAP2 and exposing
     FEAT_EBF16 to userspace on CPUs that support it

   - Considerable rework and optimisation of our early boot code to
     reduce the need for cache maintenance and avoid jumping in and out
     of the kernel when handling relocation under KASLR

   - Support for disabling SVE and SME support on the kernel
     command-line

   - Support for the Hisilicon HNS3 PMU

   - Miscellanous cleanups, trivial updates and minor fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
  arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
  arm64: fix KASAN_INLINE
  arm64/hwcap: Support FEAT_EBF16
  arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
  arm64/hwcap: Document allocation of upper bits of AT_HWCAP
  arm64: enable THP_SWAP for arm64
  arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
  arm64: errata: Remove AES hwcap for COMPAT tasks
  arm64: numa: Don't check node against MAX_NUMNODES
  drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
  perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
  docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
  arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
  mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
  mm: kasan: Skip unpoisoning of user pages
  mm: kasan: Ensure the tags are visible before the tag in page->flags
  drivers/perf: hisi: add driver for HNS3 PMU
  drivers/perf: hisi: Add description for HNS3 PMU driver
  drivers/perf: riscv_pmu_sbi: perf format
  perf/arm-cci: Use the bitmap API to allocate bitmaps
  ...
  • Loading branch information
torvalds committed Aug 1, 2022
2 parents a82c58c + 892f723 commit 0cec3f2
Show file tree
Hide file tree
Showing 123 changed files with 3,971 additions and 1,627 deletions.
3 changes: 2 additions & 1 deletion Documentation/ABI/testing/sysfs-devices-system-cpu
Original file line number Diff line number Diff line change
Expand Up @@ -493,12 +493,13 @@ What: /sys/devices/system/cpu/cpuX/regs/
/sys/devices/system/cpu/cpuX/regs/identification/
/sys/devices/system/cpu/cpuX/regs/identification/midr_el1
/sys/devices/system/cpu/cpuX/regs/identification/revidr_el1
/sys/devices/system/cpu/cpuX/regs/identification/smidr_el1
Date: June 2016
Contact: Linux ARM Kernel Mailing list <[email protected]>
Description: AArch64 CPU registers

'identification' directory exposes the CPU ID registers for
identifying model and revision of the CPU.
identifying model and revision of the CPU and SMCU.

What: /sys/devices/system/cpu/aarch32_el0
Date: May 2021
Expand Down
8 changes: 7 additions & 1 deletion Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -400,6 +400,12 @@
arm64.nomte [ARM64] Unconditionally disable Memory Tagging Extension
support

arm64.nosve [ARM64] Unconditionally disable Scalable Vector
Extension support

arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
Extension support

ataflop= [HW,M68k]

atarimouse= [HW,MOUSE] Atari Mouse
Expand Down Expand Up @@ -3161,7 +3167,7 @@
improves system performance, but it may also
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
kpti=0 [ARM64]
if nokaslr then kpti=0 [ARM64]
nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
Expand Down
136 changes: 136 additions & 0 deletions Documentation/admin-guide/perf/hns3-pmu.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
======================================
HNS3 Performance Monitoring Unit (PMU)
======================================

HNS3(HiSilicon network system 3) Performance Monitoring Unit (PMU) is an
End Point device to collect performance statistics of HiSilicon SoC NIC.
On Hip09, each SICL(Super I/O cluster) has one PMU device.

HNS3 PMU supports collection of performance statistics such as bandwidth,
latency, packet rate and interrupt rate.

Each HNS3 PMU supports 8 hardware events.

HNS3 PMU driver
===============

The HNS3 PMU driver registers a perf PMU with the name of its sicl id.::

/sys/devices/hns3_pmu_sicl_<sicl_id>

PMU driver provides description of available events, filter modes, format,
identifier and cpumask in sysfs.

The "events" directory describes the event code of all supported events
shown in perf list.

The "filtermode" directory describes the supported filter modes of each
event.

The "format" directory describes all formats of the config (events) and
config1 (filter options) fields of the perf_event_attr structure.

The "identifier" file shows version of PMU hardware device.

The "bdf_min" and "bdf_max" files show the supported bdf range of each
pmu device.

The "hw_clk_freq" file shows the hardware clock frequency of each pmu
device.

Example usage of checking event code and subevent code::

$# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_time
config=0x00204
$# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_packet_num
config=0x10204

Each performance statistic has a pair of events to get two values to
calculate real performance data in userspace.

The bits 0~15 of config (here 0x0204) are the true hardware event code. If
two events have same value of bits 0~15 of config, that means they are
event pair. And the bit 16 of config indicates getting counter 0 or
counter 1 of hardware event.

After getting two values of event pair in usersapce, the formula of
computation to calculate real performance data is:::

counter 0 / counter 1

Example usage of checking supported filter mode::

$# cat /sys/devices/hns3_pmu_sicl_0/filtermode/bw_ssu_rpu_byte_num
filter mode supported: global/port/port-tc/func/func-queue/

Example usage of perf::

$# perf list
hns3_pmu_sicl_0/bw_ssu_rpu_byte_num/ [kernel PMU event]
hns3_pmu_sicl_0/bw_ssu_rpu_time/ [kernel PMU event]
------------------------------------------

$# perf stat -g -e hns3_pmu_sicl_0/bw_ssu_rpu_byte_num,global=1/ -e hns3_pmu_sicl_0/bw_ssu_rpu_time,global=1/ -I 1000
or
$# perf stat -g -e hns3_pmu_sicl_0/config=0x00002,global=1/ -e hns3_pmu_sicl_0/config=0x10002,global=1/ -I 1000


Filter modes
--------------

1. global mode
PMU collect performance statistics for all HNS3 PCIe functions of IO DIE.
Set the "global" filter option to 1 will enable this mode.
Example usage of perf::

$# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,global=1/ -I 1000

2. port mode
PMU collect performance statistic of one whole physical port. The port id
is same as mac id. The "tc" filter option must be set to 0xF in this mode,
here tc stands for traffic class.

Example usage of perf::

$# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0xF/ -I 1000

3. port-tc mode
PMU collect performance statistic of one tc of physical port. The port id
is same as mac id. The "tc" filter option must be set to 0 ~ 7 in this
mode.
Example usage of perf::

$# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0/ -I 1000

4. func mode
PMU collect performance statistic of one PF/VF. The function id is BDF of
PF/VF, its conversion formula::

func = (bus << 8) + (device << 3) + (function)

for example:
BDF func
35:00.0 0x3500
35:00.1 0x3501
35:01.0 0x3508

In this mode, the "queue" filter option must be set to 0xFFFF.
Example usage of perf::

$# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0xFFFF/ -I 1000

5. func-queue mode
PMU collect performance statistic of one queue of PF/VF. The function id
is BDF of PF/VF, the "queue" filter option must be set to the exact queue
id of function.
Example usage of perf::

$# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0/ -I 1000

6. func-intr mode
PMU collect performance statistic of one interrupt of PF/VF. The function
id is BDF of PF/VF, the "intr" filter option must be set to the exact
interrupt id of function.
Example usage of perf::

$# perf stat -a -e hns3_pmu_sicl_0/config=0x00301,bdf=0x3500,intr=0/ -I 1000
1 change: 1 addition & 0 deletions Documentation/admin-guide/perf/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@ Performance monitor support

hisi-pmu
hisi-pcie-pmu
hns3-pmu
imx-ddr
qcom_l2_pmu
qcom_l3_pmu
Expand Down
4 changes: 4 additions & 0 deletions Documentation/arm64/elf_hwcaps.rst
Original file line number Diff line number Diff line change
Expand Up @@ -301,6 +301,10 @@ HWCAP2_WFXT

Functionality implied by ID_AA64ISAR2_EL1.WFXT == 0b0010.

HWCAP2_EBF16

Functionality implied by ID_AA64ISAR1_EL1.BF16 == 0b0010.

4. Unused AT_HWCAP bits
-----------------------

Expand Down
10 changes: 4 additions & 6 deletions Documentation/arm64/memory.rst
Original file line number Diff line number Diff line change
Expand Up @@ -33,9 +33,8 @@ AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::
0000000000000000 0000ffffffffffff 256TB user
ffff000000000000 ffff7fffffffffff 128TB kernel logical memory map
[ffff600000000000 ffff7fffffffffff] 32TB [kasan shadow region]
ffff800000000000 ffff800007ffffff 128MB bpf jit region
ffff800008000000 ffff80000fffffff 128MB modules
ffff800010000000 fffffbffefffffff 124TB vmalloc
ffff800000000000 ffff800007ffffff 128MB modules
ffff800008000000 fffffbffefffffff 124TB vmalloc
fffffbfff0000000 fffffbfffdffffff 224MB fixed mappings (top down)
fffffbfffe000000 fffffbfffe7fffff 8MB [guard region]
fffffbfffe800000 fffffbffff7fffff 16MB PCI I/O space
Expand All @@ -51,9 +50,8 @@ AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support):
0000000000000000 000fffffffffffff 4PB user
fff0000000000000 ffff7fffffffffff ~4PB kernel logical memory map
[fffd800000000000 ffff7fffffffffff] 512TB [kasan shadow region]
ffff800000000000 ffff800007ffffff 128MB bpf jit region
ffff800008000000 ffff80000fffffff 128MB modules
ffff800010000000 fffffbffefffffff 124TB vmalloc
ffff800000000000 ffff800007ffffff 128MB modules
ffff800008000000 fffffbffefffffff 124TB vmalloc
fffffbfff0000000 fffffbfffdffffff 224MB fixed mappings (top down)
fffffbfffe000000 fffffbfffe7fffff 8MB [guard region]
fffffbfffe800000 fffffbffff7fffff 16MB PCI I/O space
Expand Down
6 changes: 6 additions & 0 deletions Documentation/arm64/silicon-errata.rst
Original file line number Diff line number Diff line change
Expand Up @@ -82,10 +82,14 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #1319537 | ARM64_ERRATUM_1319367 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #1742098 | ARM64_ERRATUM_1742098 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #853709 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #1319367 | ARM64_ERRATUM_1319367 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #1655431 | ARM64_ERRATUM_1742098 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1188873,1418040| ARM64_ERRATUM_1418040 |
Expand All @@ -102,6 +106,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2077057 | ARM64_ERRATUM_2077057 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A510 | #2441009 | ARM64_ERRATUM_2441009 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2119858 | ARM64_ERRATUM_2119858 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A710 | #2054223 | ARM64_ERRATUM_2054223 |
Expand Down
2 changes: 1 addition & 1 deletion Documentation/features/vm/ioremap_prot/arch-support.txt
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
| alpha: | TODO |
| arc: | ok |
| arm: | TODO |
| arm64: | TODO |
| arm64: | ok |
| csky: | TODO |
| hexagon: | TODO |
| ia64: | TODO |
Expand Down
11 changes: 6 additions & 5 deletions Documentation/memory-barriers.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1894,6 +1894,7 @@ There are some more advanced barrier functions:

(*) dma_wmb();
(*) dma_rmb();
(*) dma_mb();

These are for use with consistent memory to guarantee the ordering
of writes or reads of shared memory accessible to both the CPU and a
Expand Down Expand Up @@ -1925,11 +1926,11 @@ There are some more advanced barrier functions:
The dma_rmb() allows us guarantee the device has released ownership
before we read the data from the descriptor, and the dma_wmb() allows
us to guarantee the data is written to the descriptor before the device
can see it now has ownership. Note that, when using writel(), a prior
wmb() is not needed to guarantee that the cache coherent memory writes
have completed before writing to the MMIO region. The cheaper
writel_relaxed() does not provide this guarantee and must not be used
here.
can see it now has ownership. The dma_mb() implies both a dma_rmb() and
a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
to guarantee that the cache coherent memory writes have completed before
writing to the MMIO region. The cheaper writel_relaxed() does not provide
this guarantee and must not be used here.

See the subsection "Kernel I/O barrier effects" for more information on
relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for
Expand Down
11 changes: 6 additions & 5 deletions Documentation/virt/kvm/arm/hyp-abi.rst
Original file line number Diff line number Diff line change
Expand Up @@ -60,12 +60,13 @@ these functions (see arch/arm{,64}/include/asm/virt.h):

* ::

x0 = HVC_VHE_RESTART (arm64 only)
x0 = HVC_FINALISE_EL2 (arm64 only)

Attempt to upgrade the kernel's exception level from EL1 to EL2 by enabling
the VHE mode. This is conditioned by the CPU supporting VHE, the EL2 MMU
being off, and VHE not being disabled by any other means (command line
option, for example).
Finish configuring EL2 depending on the command-line options,
including an attempt to upgrade the kernel's exception level from
EL1 to EL2 by enabling the VHE mode. This is conditioned by the CPU
supporting VHE, the EL2 MMU being off, and VHE not being disabled by
any other means (command line option, for example).

Any other value of r0/x0 triggers a hypervisor-specific handling,
which is not documented here.
Expand Down
6 changes: 6 additions & 0 deletions MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -9038,6 +9038,12 @@ F: Documentation/admin-guide/perf/hisi-pcie-pmu.rst
F: Documentation/admin-guide/perf/hisi-pmu.rst
F: drivers/perf/hisilicon

HISILICON HNS3 PMU DRIVER
M: Guangbin Huang <[email protected]>
S: Supported
F: Documentation/admin-guide/perf/hns3-pmu.rst
F: drivers/perf/hisilicon/hns3_pmu.c

HISILICON QM AND ZIP Controller DRIVER
M: Zhou Wang <[email protected]>
L: [email protected]
Expand Down
3 changes: 3 additions & 0 deletions arch/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,9 @@ config HAVE_FUNCTION_DESCRIPTORS
config TRACE_IRQFLAGS_SUPPORT
bool

config TRACE_IRQFLAGS_NMI_SUPPORT
bool

#
# An arch should select this if it provides all these things:
#
Expand Down
4 changes: 1 addition & 3 deletions arch/arm/include/asm/io.h
Original file line number Diff line number Diff line change
Expand Up @@ -139,11 +139,9 @@ extern void __iomem *__arm_ioremap_caller(phys_addr_t, size_t, unsigned int,
extern void __iomem *__arm_ioremap_pfn(unsigned long, unsigned long, size_t, unsigned int);
extern void __iomem *__arm_ioremap_exec(phys_addr_t, size_t, bool cached);
void __arm_iomem_set_ro(void __iomem *ptr, size_t size);
extern void __iounmap(volatile void __iomem *addr);

extern void __iomem * (*arch_ioremap_caller)(phys_addr_t, size_t,
unsigned int, void *);
extern void (*arch_iounmap)(volatile void __iomem *);

/*
* Bad read/write accesses...
Expand Down Expand Up @@ -380,7 +378,7 @@ void __iomem *ioremap_wc(resource_size_t res_cookie, size_t size);
#define ioremap_wc ioremap_wc
#define ioremap_wt ioremap_wc

void iounmap(volatile void __iomem *iomem_cookie);
void iounmap(volatile void __iomem *io_addr);
#define iounmap iounmap

void *arch_memremap_wb(phys_addr_t phys_addr, size_t size);
Expand Down
9 changes: 1 addition & 8 deletions arch/arm/mm/ioremap.c
Original file line number Diff line number Diff line change
Expand Up @@ -418,7 +418,7 @@ void *arch_memremap_wb(phys_addr_t phys_addr, size_t size)
__builtin_return_address(0));
}

void __iounmap(volatile void __iomem *io_addr)
void iounmap(volatile void __iomem *io_addr)
{
void *addr = (void *)(PAGE_MASK & (unsigned long)io_addr);
struct static_vm *svm;
Expand Down Expand Up @@ -446,13 +446,6 @@ void __iounmap(volatile void __iomem *io_addr)

vunmap(addr);
}

void (*arch_iounmap)(volatile void __iomem *) = __iounmap;

void iounmap(volatile void __iomem *cookie)
{
arch_iounmap(cookie);
}
EXPORT_SYMBOL(iounmap);

#if defined(CONFIG_PCI) || IS_ENABLED(CONFIG_PCMCIA)
Expand Down
9 changes: 1 addition & 8 deletions arch/arm/mm/nommu.c
Original file line number Diff line number Diff line change
Expand Up @@ -230,14 +230,7 @@ void *arch_memremap_wb(phys_addr_t phys_addr, size_t size)
return (void *)phys_addr;
}

void __iounmap(volatile void __iomem *addr)
{
}
EXPORT_SYMBOL(__iounmap);

void (*arch_iounmap)(volatile void __iomem *);

void iounmap(volatile void __iomem *addr)
void iounmap(volatile void __iomem *io_addr)
{
}
EXPORT_SYMBOL(iounmap);
Loading

0 comments on commit 0cec3f2

Please sign in to comment.