Commit

Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "This is the material which was staged after willystuff in linux-next.

  Subsystems affected by this patch series: mm (debug, selftests,
  pagecache, thp, rmap, migration, kasan, hugetlb, pagemap, madvise),
  and selftests"

* emailed patches from Andrew Morton <[email protected]>: (113 commits)
  selftests: kselftest framework: provide "finished" helper
  mm: madvise: MADV_DONTNEED_LOCKED
  mm: fix race between MADV_FREE reclaim and blkdev direct IO read
  mm: generalize ARCH_HAS_FILTER_PGPROT
  mm: unmap_mapping_range_tree() with i_mmap_rwsem shared
  mm: warn on deleting redirtied only if accounted
  mm/huge_memory: remove stale locking logic from __split_huge_pmd()
  mm/huge_memory: remove stale page_trans_huge_mapcount()
  mm/swapfile: remove stale reuse_swap_page()
  mm/khugepaged: remove reuse_swap_page() usage
  mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()
  mm: streamline COW logic in do_swap_page()
  mm: slightly clarify KSM logic in do_swap_page()
  mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
  mm: optimize do_wp_page() for exclusive pages in the swapcache
  mm/huge_memory: make is_transparent_hugepage() static
  userfaultfd/selftests: enable hugetlb remap and remove event testing
  selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test
  mm: enable MADV_DONTNEED for hugetlb mappings
  kasan: disable LOCKDEP when printing reports
  ...
torvalds committed Mar 25, 2022
2 parents aa5b537 + 25fd2d4 commit 29c8c18
Showing 73 changed files with 2,458 additions and 956 deletions.
17 changes: 11 additions & 6 deletions Documentation/dev-tools/kasan.rst
@@ -30,7 +30,7 @@ Software tag-based KASAN mode is only supported in Clang.

The hardware KASAN mode (#3) relies on hardware to perform the checks but
still requires a compiler version that supports memory tagging instructions.
This mode is supported in GCC 10+ and Clang 11+.
This mode is supported in GCC 10+ and Clang 12+.

Both software KASAN modes work with SLUB and SLAB memory allocators,
while the hardware tag-based KASAN currently only supports SLUB.
@@ -206,6 +206,9 @@ additional boot parameters that allow disabling KASAN or controlling features:
Asymmetric mode: a bad access is detected synchronously on reads and
asynchronously on writes.

- ``kasan.vmalloc=off`` or ``=on`` disables or enables tagging of vmalloc
allocations (default: ``on``).

- ``kasan.stacktrace=off`` or ``=on`` disables or enables alloc and free stack
traces collection (default: ``on``).

@@ -279,8 +282,8 @@ Software tag-based KASAN uses 0xFF as a match-all pointer tag (accesses through
pointers with the 0xFF pointer tag are not checked). The value 0xFE is currently
reserved to tag freed memory regions.

Software tag-based KASAN currently only supports tagging of slab and page_alloc
memory.
Software tag-based KASAN currently only supports tagging of slab, page_alloc,
and vmalloc memory.

Hardware tag-based KASAN
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -303,8 +306,8 @@ Hardware tag-based KASAN uses 0xFF as a match-all pointer tag (accesses through
pointers with the 0xFF pointer tag are not checked). The value 0xFE is currently
reserved to tag freed memory regions.

Hardware tag-based KASAN currently only supports tagging of slab and page_alloc
memory.
Hardware tag-based KASAN currently only supports tagging of slab, page_alloc,
and VM_ALLOC-based vmalloc memory.

If the hardware does not support MTE (pre ARMv8.5), hardware tag-based KASAN
will not be enabled. In this case, all KASAN boot parameters are ignored.
@@ -319,6 +322,8 @@ checking gets disabled.
Shadow memory
-------------

The contents of this section are only applicable to software KASAN modes.

The kernel maps memory in several different parts of the address space.
The range of kernel virtual addresses is large: there is not enough real
memory to support a real shadow region for every address that could be
@@ -349,7 +354,7 @@ CONFIG_KASAN_VMALLOC

With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the
cost of greater memory usage. Currently, this is supported on x86,
riscv, s390, and powerpc.
arm64, riscv, s390, and powerpc.

This works by hooking into vmalloc and vmap and dynamically
allocating real shadow memory to back the mappings.
65 changes: 59 additions & 6 deletions Documentation/vm/page_owner.rst
@@ -78,7 +78,7 @@ Usage

2) Enable page owner: add "page_owner=on" to boot cmdline.

3) Do the job what you want to debug
3) Do the job that you want to debug.

4) Analyze information from page owner::

@@ -89,22 +89,75 @@ Usage

Page allocated via order XXX, ...
PFN XXX ...
// Detailed stack
// Detailed stack

Page allocated via order XXX, ...
PFN XXX ...
// Detailed stack
// Detailed stack

The ``page_owner_sort`` tool ignores ``PFN`` rows, puts the remaining rows
in buf, uses regexp to extract the page order value, counts the times
and pages of buf, and finally sorts them according to the times.
and pages of buf, and finally sorts them according to the parameter(s).

See the result about who allocated each page
in the ``sorted_page_owner.txt``. General output::

XXX times, XXX pages:
Page allocated via order XXX, ...
// Detailed stack
// Detailed stack

By default, ``page_owner_sort`` is sorted according to the times of buf.
If you want to sort by the pages nums of buf, use the ``-m`` parameter.
If you want to sort by the page nums of buf, use the ``-m`` parameter.
The detailed parameters are:

fundamental function:

Sort:
-a Sort by memory allocation time.
-m Sort by total memory.
-p Sort by pid.
-P Sort by tgid.
-n Sort by task command name.
-r Sort by memory release time.
-s Sort by stack trace.
-t Sort by times (default).

additional function:

Cull:
-c Cull by comparing stacktrace instead of total block.
--cull <rules>
Specify culling rules. Culling syntax is key[,key[,...]]. Choose a
multi-letter key from the **STANDARD FORMAT SPECIFIERS** section.


<rules> is a single argument in the form of a comma-separated list,
which offers a way to specify individual culling rules. The recognized
keywords are described in the **STANDARD FORMAT SPECIFIERS** section below.
<rules> can be specified by the sequence of keys k1,k2, ..., as described in
the STANDARD SORT KEYS section below. Mixed use of abbreviated and
complete-form of keys is allowed.


Examples:
./page_owner_sort <input> <output> --cull=stacktrace
./page_owner_sort <input> <output> --cull=st,pid,name
./page_owner_sort <input> <output> --cull=n,f

Filter:
-f Filter out the information of blocks whose memory has been released.

Select:
--pid <PID> Select by pid.
--tgid <TGID> Select by tgid.
--name <command> Select by task command name.

STANDARD FORMAT SPECIFIERS
==========================

KEY LONG DESCRIPTION
p pid process ID
tg tgid thread group ID
n name task command name
f free whether the page has been released or not
st stacktrace stack trace of the page allocation
2 changes: 2 additions & 0 deletions arch/alpha/include/uapi/asm/mman.h
@@ -74,6 +74,8 @@
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */

#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */

/* compatibility flags */
#define MAP_FILE 0
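
A userspace sketch of how the new advice value might be exercised. It assumes the value 24 shown in the uapi hunks of this commit also applies on the build architecture, and the helper name drop_locked_pages() is hypothetical. Kernels that predate this commit are expected to reject the unknown advice with EINVAL, so the helper reports the errno value instead of aborting.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_DONTNEED_LOCKED
#define MADV_DONTNEED_LOCKED 24 /* value taken from the uapi hunks in this commit */
#endif

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, try to lock it,
 * then ask the kernel to drop the pages even though they are locked.
 * Returns 0 on success or the errno value of the failing call.
 */
static int drop_locked_pages(size_t len)
{
	int ret = 0;
	char *p;

	assert(len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return errno;

	p[0] = 1;             /* fault in at least one page */
	(void)mlock(p, len);  /* best effort: may fail under RLIMIT_MEMLOCK */

	if (madvise(p, len, MADV_DONTNEED_LOCKED) != 0)
		ret = errno;  /* EINVAL is expected on kernels without the flag */

	munmap(p, len);
	return ret;
}
```

Calling drop_locked_pages(4096) should return 0 on a kernel that carries this change; a return of EINVAL suggests the running kernel does not know the advice value.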

2 changes: 1 addition & 1 deletion arch/arm64/Kconfig
@@ -208,7 +208,7 @@ config ARM64
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
select KASAN_VMALLOC if KASAN_GENERIC
select KASAN_VMALLOC if KASAN
select MODULES_USE_ELF_RELA
select NEED_DMA_MAP_STATE
select NEED_SG_DMA_LENGTH
6 changes: 6 additions & 0 deletions arch/arm64/include/asm/vmalloc.h
@@ -25,4 +25,10 @@ static inline bool arch_vmap_pmd_supported(pgprot_t prot)

#endif

#define arch_vmap_pgprot_tagged arch_vmap_pgprot_tagged
static inline pgprot_t arch_vmap_pgprot_tagged(pgprot_t prot)
{
return pgprot_tagged(prot);
}

#endif /* _ASM_ARM64_VMALLOC_H */
5 changes: 4 additions & 1 deletion arch/arm64/include/asm/vmap_stack.h
@@ -17,10 +17,13 @@
*/
static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
{
void *p;

BUILD_BUG_ON(!IS_ENABLED(CONFIG_VMAP_STACK));

return __vmalloc_node(stack_size, THREAD_ALIGN, THREADINFO_GFP, node,
p = __vmalloc_node(stack_size, THREAD_ALIGN, THREADINFO_GFP, node,
__builtin_return_address(0));
return kasan_reset_tag(p);
}

#endif /* __ASM_VMAP_STACK_H */
5 changes: 3 additions & 2 deletions arch/arm64/kernel/module.c
@@ -58,12 +58,13 @@ void *module_alloc(unsigned long size)
PAGE_KERNEL, 0, NUMA_NO_NODE,
__builtin_return_address(0));

if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
vfree(p);
return NULL;
}

return p;
/* Memory is intended to be executable, reset the pointer tag. */
return kasan_reset_tag(p);
}

enum aarch64_reloc_op {
2 changes: 1 addition & 1 deletion arch/arm64/mm/pageattr.c
@@ -85,7 +85,7 @@ static int change_memory_common(unsigned long addr, int numpages,
*/
area = find_vm_area((void *)addr);
if (!area ||
end > (unsigned long)area->addr + area->size ||
end > (unsigned long)kasan_reset_tag(area->addr) + area->size ||
!(area->flags & VM_ALLOC))
return -EINVAL;
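
A userspace simulation of why the bounds check above resets the tag before doing address arithmetic. It assumes the pointer tag occupies the top byte of a 64-bit address and that 0xFF is the match-all tag, as the KASAN documentation in this commit describes. The helpers set_tag(), reset_tag() and range_fits() are hypothetical stand-ins and not the kernel's implementation of kasan_reset_tag().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the pointer tag lives in the top byte (bits 56-63). */
#define TAG_SHIFT  56
#define TAG_MASK   (0xffULL << TAG_SHIFT)
#define TAG_KERNEL 0xffULL /* match-all tag */

/* Replace the top byte of addr with the given tag. */
static uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Restore the match-all tag so the value compares like a plain address. */
static uint64_t reset_tag(uint64_t addr)
{
	return set_tag(addr, TAG_KERNEL);
}

/* True when [addr, addr + len) ends within the area [base, base + size). */
static bool range_fits(uint64_t addr, uint64_t len,
		       uint64_t base, uint64_t size)
{
	assert(len <= UINT64_MAX - addr); /* no wraparound in this sketch */
	return addr + len <= base + size;
}
```

With an untagged addr and a base that carries a non-0xFF tag, the unsigned comparison fails even though the range lies inside the area; passing reset_tag(base) makes the same check succeed.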

3 changes: 2 additions & 1 deletion arch/arm64/net/bpf_jit_comp.c
@@ -1304,7 +1304,8 @@ u64 bpf_jit_alloc_exec_limit(void)

void *bpf_jit_alloc_exec(unsigned long size)
{
return vmalloc(size);
/* Memory is intended to be executable, reset the pointer tag. */
return kasan_reset_tag(vmalloc(size));
}

void bpf_jit_free_exec(void *addr)
2 changes: 2 additions & 0 deletions arch/mips/include/uapi/asm/mman.h
@@ -101,6 +101,8 @@
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */

#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */

/* compatibility flags */
#define MAP_FILE 0

2 changes: 2 additions & 0 deletions arch/parisc/include/uapi/asm/mman.h
@@ -55,6 +55,8 @@
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */

#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */

#define MADV_MERGEABLE 65 /* KSM may merge identical pages */
#define MADV_UNMERGEABLE 66 /* KSM may not merge identical pages */

1 change: 0 additions & 1 deletion arch/powerpc/mm/book3s64/trace.c
@@ -3,6 +3,5 @@
* This file is for defining trace points and trace related helpers.
*/
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define CREATE_TRACE_POINTS
#include <trace/events/thp.h>
#endif
2 changes: 1 addition & 1 deletion arch/s390/kernel/module.c
@@ -45,7 +45,7 @@ void *module_alloc(unsigned long size)
p = __vmalloc_node_range(size, MODULE_ALIGN, MODULES_VADDR, MODULES_END,
gfp_mask, PAGE_KERNEL_EXEC, VM_DEFER_KMEMLEAK, NUMA_NO_NODE,
__builtin_return_address(0));
if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
vfree(p);
return NULL;
}
3 changes: 0 additions & 3 deletions arch/x86/Kconfig
@@ -337,9 +337,6 @@ config GENERIC_CALIBRATE_DELAY
config ARCH_HAS_CPU_RELAX
def_bool y

config ARCH_HAS_FILTER_PGPROT
def_bool y

config ARCH_HIBERNATION_POSSIBLE
def_bool y

2 changes: 1 addition & 1 deletion arch/x86/kernel/module.c
@@ -78,7 +78,7 @@ void *module_alloc(unsigned long size)
MODULES_END, gfp_mask,
PAGE_KERNEL, VM_DEFER_KMEMLEAK, NUMA_NO_NODE,
__builtin_return_address(0));
if (p && (kasan_module_alloc(p, size, gfp_mask) < 0)) {
if (p && (kasan_alloc_module_shadow(p, size, gfp_mask) < 0)) {
vfree(p);
return NULL;
}
1 change: 0 additions & 1 deletion arch/x86/mm/init.c
@@ -31,7 +31,6 @@
* We need to define the tracepoints somewhere, and tlb.c
* is only compiled when SMP=y.
*/
#define CREATE_TRACE_POINTS
#include <trace/events/tlb.h>

#include "mm_internal.h"
2 changes: 2 additions & 0 deletions arch/xtensa/include/uapi/asm/mman.h
@@ -109,6 +109,8 @@
#define MADV_POPULATE_READ 22 /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE 23 /* populate (prefault) page tables writable */

#define MADV_DONTNEED_LOCKED 24 /* like DONTNEED, but drop locked pages too */

/* compatibility flags */
#define MAP_FILE 0

35 changes: 26 additions & 9 deletions include/linux/gfp.h
@@ -54,9 +54,17 @@ struct vm_area_struct;
#define ___GFP_THISNODE 0x200000u
#define ___GFP_ACCOUNT 0x400000u
#define ___GFP_ZEROTAGS 0x800000u
#define ___GFP_SKIP_KASAN_POISON 0x1000000u
#ifdef CONFIG_KASAN_HW_TAGS
#define ___GFP_SKIP_ZERO 0x1000000u
#define ___GFP_SKIP_KASAN_UNPOISON 0x2000000u
#define ___GFP_SKIP_KASAN_POISON 0x4000000u
#else
#define ___GFP_SKIP_ZERO 0
#define ___GFP_SKIP_KASAN_UNPOISON 0
#define ___GFP_SKIP_KASAN_POISON 0
#endif
#ifdef CONFIG_LOCKDEP
#define ___GFP_NOLOCKDEP 0x2000000u
#define ___GFP_NOLOCKDEP 0x8000000u
#else
#define ___GFP_NOLOCKDEP 0
#endif
@@ -232,24 +240,33 @@ struct vm_area_struct;
*
* %__GFP_ZERO returns a zeroed page on success.
*
* %__GFP_ZEROTAGS returns a page with zeroed memory tags on success, if
* __GFP_ZERO is set.
* %__GFP_ZEROTAGS zeroes memory tags at allocation time if the memory itself
* is being zeroed (either via __GFP_ZERO or via init_on_alloc, provided that
* __GFP_SKIP_ZERO is not set). This flag is intended for optimization: setting
* memory tags at the same time as zeroing memory has minimal additional
* performance impact.
*
* %__GFP_SKIP_KASAN_UNPOISON makes KASAN skip unpoisoning on page allocation.
* Only effective in HW_TAGS mode.
*
* %__GFP_SKIP_KASAN_POISON returns a page which does not need to be poisoned
* on deallocation. Typically used for userspace pages. Currently only has an
* effect in HW tags mode.
* %__GFP_SKIP_KASAN_POISON makes KASAN skip poisoning on page deallocation.
* Typically, used for userspace pages. Only effective in HW_TAGS mode.
*/
#define __GFP_NOWARN ((__force gfp_t)___GFP_NOWARN)
#define __GFP_COMP ((__force gfp_t)___GFP_COMP)
#define __GFP_ZERO ((__force gfp_t)___GFP_ZERO)
#define __GFP_ZEROTAGS ((__force gfp_t)___GFP_ZEROTAGS)
#define __GFP_SKIP_KASAN_POISON ((__force gfp_t)___GFP_SKIP_KASAN_POISON)
#define __GFP_SKIP_ZERO ((__force gfp_t)___GFP_SKIP_ZERO)
#define __GFP_SKIP_KASAN_UNPOISON ((__force gfp_t)___GFP_SKIP_KASAN_UNPOISON)
#define __GFP_SKIP_KASAN_POISON ((__force gfp_t)___GFP_SKIP_KASAN_POISON)

/* Disable lockdep for GFP context tracking */
#define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)

/* Room for N __GFP_FOO bits */
#define __GFP_BITS_SHIFT (25 + IS_ENABLED(CONFIG_LOCKDEP))
#define __GFP_BITS_SHIFT (24 + \
3 * IS_ENABLED(CONFIG_KASAN_HW_TAGS) + \
IS_ENABLED(CONFIG_LOCKDEP))
#define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))
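
The shift arithmetic above can be checked with a small standalone sketch. It mirrors the expression 24 + 3 * HW_TAGS + LOCKDEP from this hunk, with the two config options modelled as 0/1 arguments; the function names gfp_bits_shift() and gfp_bits_mask() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit value of ___GFP_NOLOCKDEP as defined in this hunk when LOCKDEP is on. */
#define GFP_NOLOCKDEP_BIT 0x8000000u

/* Mirrors __GFP_BITS_SHIFT; hw_tags and lockdep stand in for IS_ENABLED(). */
static unsigned int gfp_bits_shift(int hw_tags, int lockdep)
{
	assert(hw_tags == 0 || hw_tags == 1);
	assert(lockdep == 0 || lockdep == 1);
	return 24 + 3 * hw_tags + lockdep;
}

/* Mirrors __GFP_BITS_MASK: all bits below the shift are set. */
static uint32_t gfp_bits_mask(int hw_tags, int lockdep)
{
	return (1u << gfp_bits_shift(hw_tags, lockdep)) - 1;
}
```

With both options enabled the shift is 28 and the mask 0xfffffff covers the 0x8000000 bit. With only LOCKDEP enabled the shift is 25 and the mask is 0x1ffffff, so, going by the constants as they appear in this hunk, 0x8000000 would fall outside the mask; that combination may be worth double-checking against the tree.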

/**
6 changes: 0 additions & 6 deletions include/linux/huge_mm.h
@@ -183,7 +183,6 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,

void prep_transhuge_page(struct page *page);
void free_transhuge_page(struct page *page);
bool is_transparent_hugepage(struct page *page);

bool can_split_folio(struct folio *folio, int *pextra_pins);
int split_huge_page_to_list(struct page *page, struct list_head *list);
@@ -341,11 +340,6 @@ static inline bool transhuge_vma_enabled(struct vm_area_struct *vma,

static inline void prep_transhuge_page(struct page *page) {}

static inline bool is_transparent_hugepage(struct page *page)
{
return false;
}

#define transparent_hugepage_flags 0UL

#define thp_get_unmapped_area NULL