mm: introduce Reported pages
In order to pave the way for free page reporting in virtualized
environments, we will need a way to get pages out of the free lists and
identify those pages after they have been returned.  To accomplish this,
this patch adds the concept of a Reported Buddy, which is essentially
meant to just be the Uptodate flag used in conjunction with the Buddy page
type.

To prevent the reported pages from leaking outside of the buddy lists I
added a check to clear the PageReported bit in the del_page_from_free_list
function.  As a result any reported page that is split, merged, or
allocated will have the flag cleared prior to the PageBuddy value being
cleared.
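
The ordering described above can be sketched in a small userspace model. This
is only an illustration: struct model_page, the MODEL_PG_* bits and both
helper functions are hypothetical stand-ins, not the kernel's struct page or
its real flag layout.

```c
#include <assert.h>

/* Hypothetical flag bits for this model; in the patch PG_reported aliases PG_uptodate. */
#define MODEL_PG_BUDDY    (1u << 0)
#define MODEL_PG_REPORTED (1u << 1)

struct model_page {
	unsigned int flags;
	struct model_page *prev, *next;	/* stand-in for page->lru */
};

/* Insert @page right after the list sentinel @head and mark it as a free buddy page. */
static void model_add_to_free_list(struct model_page *page,
				   struct model_page *head)
{
	page->next = head->next;
	page->prev = head;
	head->next->prev = page;
	head->next = page;
	page->flags |= MODEL_PG_BUDDY;
}

/* Same ordering as the patch: Reported is dropped before Buddy is cleared. */
static void model_del_page_from_free_list(struct model_page *page)
{
	if (page->flags & MODEL_PG_REPORTED)
		page->flags &= ~MODEL_PG_REPORTED;

	/* Unlink the page from its free list. */
	page->prev->next = page->next;
	page->next->prev = page->prev;
	page->prev = page->next = page;

	/* Reported is only meaningful while Buddy is still set. */
	assert(!(page->flags & MODEL_PG_REPORTED));
	page->flags &= ~MODEL_PG_BUDDY;
}
```

Because every path that splits, merges, or allocates a free page goes through
the delete helper, the Reported bit in this model cannot outlive the page's
time on a free list.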

The process for reporting pages is fairly simple.  Once we free a page
that meets the minimum order for page reporting we will schedule a worker
thread to start 2s or more in the future.  That worker thread will begin
working from the lowest supported page reporting order up to MAX_ORDER - 1
pulling unreported pages from the free list and storing them in the
scatterlist.

When processing each individual free list it is necessary for the worker
thread to release the zone lock when it needs to stop and report the full
scatterlist of pages.  To reduce the work of the next iteration the worker
thread will rotate the free list so that the first unreported page in the
free list becomes the first entry in the list.
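
The rotation step can be illustrated on a circular doubly linked list with a
sentinel head, which is how kernel lists are laid out. The sketch below is a
userspace model with hypothetical names (struct node, node_add_tail,
rotate_to_front); it assumes the rotation is done by moving the sentinel to
sit just before the chosen entry.

```c
#include <assert.h>

struct node {
	struct node *prev, *next;
};

/* Append @n at the tail of the list whose sentinel is @head. */
static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink @n and leave it pointing at itself. */
static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/*
 * Make @first the first entry of the list by moving the sentinel @head
 * so that it sits immediately before @first. Relative order of the
 * entries is preserved; the ones that were ahead of @first end up at
 * the tail.
 */
static void rotate_to_front(struct node *first, struct node *head)
{
	assert(first != head);	/* the sentinel itself is not an entry */
	if (head->next == first)
		return;		/* already at the front */
	node_del(head);
	node_add_tail(head, first);
}
```

With entries a, b, c in that order, rotating to b yields b, c, a, so the next
pass starts at b without walking past a again.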

It will then call a reporting function providing information on how many
entries are in the scatterlist.  Once the function completes it will
return the pages to the free area from which they were allocated and start
over pulling more pages from the free areas until there are no longer
enough pages to report on to keep the worker busy, or we have processed as
many pages as were contained in the free area when we started processing
the list.
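
The fill, report, and return cycle can be sketched as follows. This is a
simplified userspace model, not the patch's implementation: the free list is
an array of "already reported" flags, the scatterlist is an array of indices,
and struct model_reporter and the model_* functions are hypothetical. It omits
the zone lock and the processing budget, and it flushes a partially filled
batch at the end of the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_CAPACITY 32	/* same value as PAGE_REPORTING_CAPACITY */

struct model_reporter {
	/* stand-in for the report() callback; returns 0 on success */
	int (*report)(struct model_reporter *r, const size_t *batch,
		      unsigned int nents);
	unsigned int calls;	/* number of report() invocations */
	unsigned int pages;	/* total entries handed to report() */
};

/* Example callback that only counts what it is given. */
static int model_count_report(struct model_reporter *r, const size_t *batch,
			      unsigned int nents)
{
	(void)batch;
	r->calls++;
	r->pages += nents;
	return 0;
}

/* Report one batch, then mark its entries as reported. */
static int model_drain(struct model_reporter *r, bool *reported,
		       const size_t *batch, unsigned int nents)
{
	int err;

	assert(nents > 0 && nents <= MODEL_CAPACITY);
	err = r->report(r, batch, nents);
	if (err)
		return err;	/* entries stay unreported on failure */
	for (unsigned int i = 0; i < nents; i++)
		reported[batch[i]] = true;
	return 0;
}

/* Walk one free list, batching unreported entries up to MODEL_CAPACITY. */
static int model_report_free_list(struct model_reporter *r, bool *reported,
				  size_t n)
{
	size_t batch[MODEL_CAPACITY];
	unsigned int fill = 0;
	int err;

	for (size_t i = 0; i < n; i++) {
		if (reported[i])
			continue;
		batch[fill++] = i;
		if (fill == MODEL_CAPACITY) {
			err = model_drain(r, reported, batch, fill);
			if (err)
				return err;
			fill = 0;
		}
	}
	return fill ? model_drain(r, reported, batch, fill) : 0;
}
```

In this model a list of 70 entries with one already reported produces three
callback invocations: two full batches of 32 and one final batch of 5.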

The worker thread will work in a round-robin fashion, making its way through
each zone requesting reporting, and through each reportable free list
within that zone.  Once all free areas within the zone have been processed
it will check to see if there have been any requests for reporting while
it was processing.  If so it will reschedule the worker thread to start up
again in roughly 2s and exit.
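
One way to track "was reporting requested while the worker was busy" is a
three-state atomic, matching the atomic_t state field in struct
page_reporting_dev_info. The sketch below uses C11 atomics in userspace; the
state names and the model_* helpers are assumptions made for illustration and
are not taken from the diff shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states for this model. */
enum {
	MODEL_IDLE = 0,		/* no work pending, worker not running */
	MODEL_REQUESTED,	/* a free happened, worker should run */
	MODEL_ACTIVE,		/* worker is currently processing */
};

/*
 * Called when a suitably large page is freed. Returns true only when the
 * caller should schedule the delayed worker, i.e. on the IDLE -> REQUESTED
 * transition. Requests made while the worker is ACTIVE are only recorded.
 */
static bool model_request(atomic_int *state)
{
	if (atomic_load(state) == MODEL_REQUESTED)
		return false;	/* already pending, nothing to do */
	return atomic_exchange(state, MODEL_REQUESTED) == MODEL_IDLE;
}

/* Called by the worker before it starts walking the zones. */
static void model_worker_start(atomic_int *state)
{
	atomic_store(state, MODEL_ACTIVE);
}

/*
 * Called by the worker when it has processed every zone. Returns true if
 * a request arrived in the meantime and the worker should be rescheduled.
 */
static bool model_worker_finish(atomic_int *state)
{
	int expected = MODEL_ACTIVE;

	if (atomic_compare_exchange_strong(state, &expected, MODEL_IDLE))
		return false;	/* nothing new, go idle */

	/* With a single worker, the only other possible value is REQUESTED. */
	assert(expected == MODEL_REQUESTED);
	return true;
}
```

The compare-and-exchange on exit is what closes the race: a request that
lands between the last free list being processed and the worker going idle
leaves the state at REQUESTED, so the worker reschedules itself instead of
losing the request.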

Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
Alexander Duyck authored and torvalds committed Apr 7, 2020
1 parent 624f58d commit 36e66c5
Showing 7 changed files with 434 additions and 4 deletions.
11 changes: 11 additions & 0 deletions include/linux/page-flags.h
@@ -168,6 +168,9 @@ enum pageflags {

/* non-lru isolated movable page */
PG_isolated = PG_reclaim,

/* Only valid for buddy pages. Used to track pages that are reported */
PG_reported = PG_uptodate,
};

#ifndef __GENERATING_BOUNDS_H
@@ -436,6 +439,14 @@ TESTCLEARFLAG(Young, young, PF_ANY)
PAGEFLAG(Idle, idle, PF_ANY)
#endif

/*
* PageReported() is used to track reported free pages within the Buddy
* allocator. We can use the non-atomic version of the test and set
* operations as both should be shielded with the zone lock to prevent
* any possible races on the setting or clearing of the bit.
*/
__PAGEFLAG(Reported, reported, PF_NO_COMPOUND)

/*
* On an anonymous page mapped into a user virtual memory area,
* page->mapping points to its anon_vma, not to a struct address_space;
25 changes: 25 additions & 0 deletions include/linux/page_reporting.h
@@ -0,0 +1,25 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_PAGE_REPORTING_H
#define _LINUX_PAGE_REPORTING_H

#include <linux/mmzone.h>
#include <linux/scatterlist.h>

#define PAGE_REPORTING_CAPACITY 32

struct page_reporting_dev_info {
/* function that alters pages to make them "reported" */
int (*report)(struct page_reporting_dev_info *prdev,
struct scatterlist *sg, unsigned int nents);

/* work struct for processing reports */
struct delayed_work work;

/* Current state of page reporting */
atomic_t state;
};

/* Tear-down and bring-up for page reporting devices */
void page_reporting_unregister(struct page_reporting_dev_info *prdev);
int page_reporting_register(struct page_reporting_dev_info *prdev);
#endif /*_LINUX_PAGE_REPORTING_H */
11 changes: 11 additions & 0 deletions mm/Kconfig
@@ -236,6 +236,17 @@ config COMPACTION
it and then we would be really interested to hear about that at
[email protected].

#
# support for free page reporting
config PAGE_REPORTING
bool "Free page reporting"
def_bool n
help
Free page reporting allows for the incremental acquisition of
free pages from the buddy allocator for the purpose of reporting
those pages to another entity, such as a hypervisor, so that the
memory can be freed within the host for other uses.

#
# support for page migration
#
1 change: 1 addition & 0 deletions mm/Makefile
@@ -111,3 +111,4 @@ obj-$(CONFIG_HMM_MIRROR) += hmm.o
obj-$(CONFIG_MEMFD_CREATE) += memfd.o
obj-$(CONFIG_MAPPING_DIRTY_HELPERS) += mapping_dirty_helpers.o
obj-$(CONFIG_PTDUMP_CORE) += ptdump.o
obj-$(CONFIG_PAGE_REPORTING) += page_reporting.o
17 changes: 13 additions & 4 deletions mm/page_alloc.c
@@ -74,6 +74,7 @@
#include <asm/div64.h>
#include "internal.h"
#include "shuffle.h"
#include "page_reporting.h"

/* prevent >1 _updater_ of zone percpu pageset ->high and ->batch fields */
static DEFINE_MUTEX(pcp_batch_high_lock);
@@ -896,6 +897,10 @@ static inline void move_to_free_list(struct page *page, struct zone *zone,
static inline void del_page_from_free_list(struct page *page, struct zone *zone,
unsigned int order)
{
/* clear reported state and update reported page count */
if (page_reported(page))
__ClearPageReported(page);

list_del(&page->lru);
__ClearPageBuddy(page);
set_page_private(page, 0);
@@ -959,7 +964,7 @@ buddy_merge_likely(unsigned long pfn, unsigned long buddy_pfn,
static inline void __free_one_page(struct page *page,
unsigned long pfn,
struct zone *zone, unsigned int order,
- int migratetype)
+ int migratetype, bool report)
{
struct capture_control *capc = task_capc(zone);
unsigned long uninitialized_var(buddy_pfn);
@@ -1044,6 +1049,10 @@ static inline void __free_one_page(struct page *page,
add_to_free_list_tail(page, zone, order, migratetype);
else
add_to_free_list(page, zone, order, migratetype);

/* Notify page reporting subsystem of freed page */
if (report)
page_reporting_notify_free(order);
}

/*
@@ -1360,7 +1369,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
if (unlikely(isolated_pageblocks))
mt = get_pageblock_migratetype(page);

- __free_one_page(page, page_to_pfn(page), zone, 0, mt);
+ __free_one_page(page, page_to_pfn(page), zone, 0, mt, true);
trace_mm_page_pcpu_drain(page, 0, mt);
}
spin_unlock(&zone->lock);
@@ -1376,7 +1385,7 @@ static void free_one_page(struct zone *zone,
is_migrate_isolate(migratetype))) {
migratetype = get_pfnblock_migratetype(page, pfn);
}
- __free_one_page(page, pfn, zone, order, migratetype);
+ __free_one_page(page, pfn, zone, order, migratetype, true);
spin_unlock(&zone->lock);
}

@@ -3227,7 +3236,7 @@ void __putback_isolated_page(struct page *page, unsigned int order, int mt)
lockdep_assert_held(&zone->lock);

/* Return isolated page to tail of freelist. */
- __free_one_page(page, page_to_pfn(page), zone, order, mt);
+ __free_one_page(page, page_to_pfn(page), zone, order, mt, false);
}

/*