mm/readahead: Fix readahead with large folios
Reading 100KB chunks from a big file (e.g. dd bs=100K) leads to poor
readahead behaviour.  Studying the traces in detail, I noticed two
problems.

The first is that we were setting the readahead flag on the folio which
contains the last byte read from the block.  This is wrong because we
will trigger readahead at the end of the read without waiting to see
if a subsequent read is going to use the pages we just read.  Instead,
we need to set the readahead flag on the first folio _after_ the one
which contains the last byte that we're reading.
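The difference between the two marking rules can be sketched in userspace C. This is only an illustration, assuming 'index' is the first page index of a folio of 2^order pages and 'mark' is the page index at which readahead should fire; round_up_pow2(), old_sets_readahead() and new_sets_readahead() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's round_up(); align must be a power of two. */
static unsigned long round_up_pow2(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Old rule: flag the folio whose range [index, index + 2^order) contains mark. */
static bool old_sets_readahead(unsigned long index, unsigned long mark,
			       unsigned int order)
{
	/* Unsigned subtraction wraps when mark < index, so the test fails. */
	return mark - index < (1UL << order);
}

/* New rule: flag only the folio that starts at mark rounded up to the folio size. */
static bool new_sets_readahead(unsigned long index, unsigned long mark,
			       unsigned int order)
{
	return index == round_up_pow2(mark, 1UL << order);
}
```

With order 2 (four-page folios) and mark 6, the old rule flags the folio at index 4, which still holds pages the current read is consuming, while the new rule flags the folio at index 8. When mark is already aligned to the folio size, both rules pick the same folio.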

The second is that we required the index of the folio with the
readahead flag set to exactly match start + size - async_size.
If we've rounded this, either down (as previously) or up (as now),
we'll think we hit a folio marked as readahead by a different read,
and try to read the wrong pages.  So round the expected index to the
order of the folio we hit.
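A small userspace sketch of the rounded comparison follows. It assumes all quantities are in pages; struct ra_window is a hypothetical, trimmed-down stand-in for the kernel's struct file_ra_state, and align_up(), expected_index() and is_expected_callback() are illustrative names rather than kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fields of struct file_ra_state used here. */
struct ra_window {
	unsigned long start;      /* first page of the current window */
	unsigned long size;       /* window length in pages */
	unsigned long async_size; /* pages left when async readahead should start */
};

/* Userspace stand-in for the kernel's round_up(); align must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Index at which the marked folio is expected, rounded to the folio's order. */
static unsigned long expected_index(const struct ra_window *ra, unsigned int order)
{
	return align_up(ra->start + ra->size - ra->async_size, 1UL << order);
}

/* True when index looks like the callback for this window (sequential access). */
static bool is_expected_callback(const struct ra_window *ra, unsigned long index,
				 unsigned int order)
{
	return index == expected_index(ra, order) ||
	       index == ra->start + ra->size;
}
```

For a window with start 0, size 32 and async_size 22, the unrounded position is page 10. A four-page folio (order 2) cannot start there, so the marked folio sits at index 12, and comparing against the rounded value is what lets that hit be recognised as belonging to this window.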

Reported-by: Guo Xuenan <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Matthew Wilcox (Oracle) committed May 5, 2022
1 parent 170f37d commit b9ff43d
Showing 1 changed file with 9 additions and 6 deletions.
15 changes: 9 additions & 6 deletions mm/readahead.c
@@ -474,7 +474,8 @@ static inline int ra_alloc_folio(struct readahead_control *ractl, pgoff_t index,
 
 	if (!folio)
 		return -ENOMEM;
-	if (mark - index < (1UL << order))
+	mark = round_up(mark, 1UL << order);
+	if (index == mark)
 		folio_set_readahead(folio);
 	err = filemap_add_folio(ractl->mapping, folio, index, gfp);
 	if (err)
@@ -555,8 +556,9 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	struct file_ra_state *ra = ractl->ra;
 	unsigned long max_pages = ra->ra_pages;
 	unsigned long add_pages;
-	unsigned long index = readahead_index(ractl);
-	pgoff_t prev_index;
+	pgoff_t index = readahead_index(ractl);
+	pgoff_t expected, prev_index;
+	unsigned int order = folio ? folio_order(folio) : 0;
 
 	/*
 	 * If the request exceeds the readahead window, allow the read to
@@ -575,8 +577,9 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	 * It's the expected callback index, assume sequential access.
 	 * Ramp up sizes, and push forward the readahead window.
 	 */
-	if ((index == (ra->start + ra->size - ra->async_size) ||
-	     index == (ra->start + ra->size))) {
+	expected = round_up(ra->start + ra->size - ra->async_size,
+			1UL << order);
+	if (index == expected || index == (ra->start + ra->size)) {
 		ra->start += ra->size;
 		ra->size = get_next_ra_size(ra, max_pages);
 		ra->async_size = ra->size;
@@ -662,7 +665,7 @@ static void ondemand_readahead(struct readahead_control *ractl,
 	}
 
 	ractl->_index = ra->start;
-	page_cache_ra_order(ractl, ra, folio ? folio_order(folio) : 0);
+	page_cache_ra_order(ractl, ra, order);
 }
 
 void page_cache_sync_ra(struct readahead_control *ractl,
