mm: don't use compound_head() in virt_to_head_page()
compound_head() is implemented on the assumption that checking the tail
flag can race.  This assumption only holds when we access a struct page
at an arbitrary position.

virt_to_head_page() is called in a different situation.  We call it
only on addresses within allocated pages, so there is no race on the
tail flag.  In this case we do not need to handle the race and can
reduce overhead slightly.  This patch implements compound_head_fast(),
which is similar to compound_head() except that it omits the tail flag
race handling.  virt_to_head_page() then uses this optimized function
to improve performance.
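
To make the difference between the two variants concrete, here is a minimal
user-space sketch. It is a model under stated assumptions, not the kernel
code: `struct model_page`, `model_compound_head()` and
`model_compound_head_fast()` are hypothetical names, a plain `bool` stands in
for the PageTail() flag, and a C11 acquire fence stands in for smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct model_page {
	bool tail;                     /* models the PageTail() flag */
	struct model_page *first_page; /* meaningful only while tail is set */
};

/*
 * Careful variant: read first_page, issue a read barrier, then re-check
 * the tail flag, because the compound page may be dismantled concurrently.
 */
static struct model_page *model_compound_head(struct model_page *page)
{
	if (page->tail) {
		struct model_page *head = page->first_page;

		/* Assumption: this fence plays the role of smp_rmb(). */
		atomic_thread_fence(memory_order_acquire);
		if (page->tail)
			return head;
	}
	return page;
}

/*
 * Fast variant: the caller guarantees the page stays allocated, so the
 * flag cannot change underneath us and no barrier or re-check is needed.
 */
static struct model_page *model_compound_head_fast(struct model_page *page)
{
	if (page->tail)
		return page->first_page;
	return page;
}
```

In a single-threaded setting both functions return the same result; the
sketch only illustrates which work the fast variant skips.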

I saw a 1.8% win in a fast-path loop over kmem_cache_alloc/free (14.063
ns -> 13.810 ns) when the target object is on a tail page.
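
The 1.8% figure follows from the two timings quoted above. A small helper,
with the hypothetical name `relative_gain_percent()`, shows the arithmetic:

```c
#include <assert.h>

/* Relative improvement, in percent, between two per-iteration timings. */
static double relative_gain_percent(double before_ns, double after_ns)
{
	return (before_ns - after_ns) / before_ns * 100.0;
}
```

Calling `relative_gain_percent(14.063, 13.810)` gives roughly 1.8.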

Signed-off-by: Joonsoo Kim <[email protected]>
Acked-by: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
JoonsooKim authored and torvalds committed Feb 10, 2015
1 parent 9aabf81 commit ccaafd7
Showing 1 changed file with 26 additions and 1 deletion.
27 changes: 26 additions & 1 deletion include/linux/mm.h
@@ -446,13 +446,31 @@ static inline struct page *compound_head_by_tail(struct page *tail)
 	return tail;
 }
 
+/*
+ * Since either compound page could be dismantled asynchronously in THP
+ * or we access asynchronously arbitrary positioned struct page, there
+ * would be tail flag race. To handle this race, we should call
+ * smp_rmb() before checking tail flag. compound_head_by_tail() did it.
+ */
 static inline struct page *compound_head(struct page *page)
 {
 	if (unlikely(PageTail(page)))
 		return compound_head_by_tail(page);
 	return page;
 }
 
+/*
+ * If we access compound page synchronously such as access to
+ * allocated page, there is no need to handle tail flag race, so we can
+ * check tail flag directly without any synchronization primitive.
+ */
+static inline struct page *compound_head_fast(struct page *page)
+{
+	if (unlikely(PageTail(page)))
+		return page->first_page;
+	return page;
+}
+
 /*
  * The atomic page->_mapcount, starts from -1: so that transitions
  * both from it and to it can be tracked, using atomic_inc_and_test
@@ -531,7 +549,14 @@ static inline void get_page(struct page *page)
 static inline struct page *virt_to_head_page(const void *x)
 {
 	struct page *page = virt_to_page(x);
-	return compound_head(page);
+
+	/*
+	 * We don't need to worry about synchronization of tail flag
+	 * when we call virt_to_head_page() since it is only called for
+	 * already allocated page and this page won't be freed until
+	 * this virt_to_head_page() is finished. So use _fast variant.
+	 */
+	return compound_head_fast(page);
 }
 
 /*