Merge tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache
Pull page cache updates from Matthew Wilcox:

 - Appoint myself page cache maintainer

 - Fix how scsicam uses the page cache

 - Use the memalloc_nofs_save() API to replace AOP_FLAG_NOFS

 - Remove the AOP flags entirely

 - Remove pagecache_write_begin() and pagecache_write_end()

 - Documentation updates

 - Convert several address_space operations to use folios:
     - is_dirty_writeback
     - readpage becomes read_folio
     - releasepage becomes release_folio
     - freepage becomes free_folio

 - Change filler_t to require a struct file pointer be the first
   argument like ->read_folio

* tag 'folio-5.19' of git://git.infradead.org/users/willy/pagecache: (107 commits)
  nilfs2: Fix some kernel-doc comments
  Appoint myself page cache maintainer
  fs: Remove aops->freepage
  secretmem: Convert to free_folio
  nfs: Convert to free_folio
  orangefs: Convert to free_folio
  fs: Add free_folio address space operation
  fs: Convert drop_buffers() to use a folio
  fs: Change try_to_free_buffers() to take a folio
  jbd2: Convert release_buffer_page() to use a folio
  jbd2: Convert jbd2_journal_try_to_free_buffers to take a folio
  reiserfs: Convert release_buffer_page() to use a folio
  fs: Remove last vestiges of releasepage
  ubifs: Convert to release_folio
  reiserfs: Convert to release_folio
  orangefs: Convert to release_folio
  ocfs2: Convert to release_folio
  nilfs2: Remove comment about releasepage
  nfs: Convert to release_folio
  jfs: Convert to release_folio
  ...
torvalds committed May 25, 2022
2 parents 8642174 + 516edb4 commit fdaf9a5
Showing 161 changed files with 1,233 additions and 1,221 deletions.
4 changes: 2 additions & 2 deletions Documentation/filesystems/caching/netfs-api.rst
@@ -433,11 +433,11 @@ has done a write and then the page it wrote from has been released by the VM,
 after which it *has* to look in the cache.

 To inform fscache that a page might now be in the cache, the following function
-should be called from the ``releasepage`` address space op::
+should be called from the ``release_folio`` address space op::

 void fscache_note_page_release(struct fscache_cookie *cookie);

-if the page has been released (ie. releasepage returned true).
+if the page has been released (ie. release_folio returned true).

 Page release and page invalidation should also wait for any mark left on the
 page to say that a DIO write is underway from that page::
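
The rule in the netfs-api.rst hunk above (note the release to fscache only when release_folio reports success) can be sketched in userspace C. This is a minimal illustration under stated assumptions, not kernel code: `struct folio`, `struct fscache_cookie`, `gfp_t` and `fscache_note_page_release()` are mocks that only borrow the kernel names, and `myfs_release_folio()` is a hypothetical filesystem callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock stand-ins for the sketch; these are NOT the kernel definitions. */
typedef unsigned int gfp_t;

struct fscache_cookie {
	unsigned int release_notes;	/* how many releases were reported */
};

struct folio {
	bool has_private;		/* filesystem private data attached */
	bool under_writeback;		/* still being written to the cache */
	struct fscache_cookie *cookie;
};

/* Mock of fscache_note_page_release(): only counts the calls. */
static void fscache_note_page_release(struct fscache_cookie *cookie)
{
	cookie->release_notes++;
}

/*
 * Hypothetical ->release_folio(): returns true when the folio was
 * released, false when it has to stay. fscache is told about the
 * release only on the success path.
 */
static bool myfs_release_folio(struct folio *folio, gfp_t gfp)
{
	(void)gfp;			/* unused in this sketch */

	if (folio->under_writeback)
		return false;		/* cannot release yet, nothing to note */

	folio->has_private = false;
	fscache_note_page_release(folio->cookie);
	return true;
}
```

A caller would invoke `myfs_release_folio(&folio, 0)` and treat a `false` return as "leave the folio alone"; the counter in the mock cookie only changes on the `true` path.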
2 changes: 1 addition & 1 deletion Documentation/filesystems/fscrypt.rst
@@ -1256,7 +1256,7 @@ inline encryption hardware will encrypt/decrypt the file contents.
 When inline encryption isn't used, filesystems must encrypt/decrypt
 the file contents themselves, as described below:

-For the read path (->readpage()) of regular files, filesystems can
+For the read path (->read_folio()) of regular files, filesystems can
 read the ciphertext into the page cache and decrypt it in-place. The
 page lock must be held until decryption has finished, to prevent the
 page from becoming visible to userspace prematurely.
2 changes: 1 addition & 1 deletion Documentation/filesystems/fsverity.rst
@@ -559,7 +559,7 @@ already verified). Below, we describe how filesystems implement this.
 Pagecache
 ~~~~~~~~~

-For filesystems using Linux's pagecache, the ``->readpage()`` and
+For filesystems using Linux's pagecache, the ``->read_folio()`` and
 ``->readahead()`` methods must be modified to verify pages before they
 are marked Uptodate. Merely hooking ``->read_iter()`` would be
 insufficient, since ``->read_iter()`` is not used for memory maps.
36 changes: 18 additions & 18 deletions Documentation/filesystems/locking.rst
@@ -237,20 +237,20 @@ address_space_operations
 prototypes::

 int (*writepage)(struct page *page, struct writeback_control *wbc);
-int (*readpage)(struct file *, struct page *);
+int (*read_folio)(struct file *, struct folio *);
 int (*writepages)(struct address_space *, struct writeback_control *);
 bool (*dirty_folio)(struct address_space *, struct folio *folio);
 void (*readahead)(struct readahead_control *);
 int (*write_begin)(struct file *, struct address_space *mapping,
-loff_t pos, unsigned len, unsigned flags,
+loff_t pos, unsigned len,
 struct page **pagep, void **fsdata);
 int (*write_end)(struct file *, struct address_space *mapping,
 loff_t pos, unsigned len, unsigned copied,
 struct page *page, void *fsdata);
 sector_t (*bmap)(struct address_space *, sector_t);
 void (*invalidate_folio) (struct folio *, size_t start, size_t len);
-int (*releasepage) (struct page *, int);
-void (*freepage)(struct page *);
+bool (*release_folio)(struct folio *, gfp_t);
+void (*free_folio)(struct folio *);
 int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
 bool (*isolate_page) (struct page *, isolate_mode_t);
 int (*migratepage)(struct address_space *, struct page *, struct page *);
@@ -262,22 +262,22 @@ prototypes::
 int (*swap_deactivate)(struct file *);

 locking rules:
-All except dirty_folio and freepage may block
+All except dirty_folio and free_folio may block

 ====================== ======================== ========= ===============
-ops PageLocked(page) i_rwsem invalidate_lock
+ops folio locked i_rwsem invalidate_lock
 ====================== ======================== ========= ===============
 writepage: yes, unlocks (see below)
-readpage: yes, unlocks shared
+read_folio: yes, unlocks shared
 writepages:
-dirty_folio maybe
+dirty_folio: maybe
 readahead: yes, unlocks shared
 write_begin: locks the page exclusive
 write_end: yes, unlocks exclusive
 bmap:
 invalidate_folio: yes exclusive
-releasepage: yes
-freepage: yes
+release_folio: yes
+free_folio: yes
 direct_IO:
 isolate_page: yes
 migratepage: yes (both)
@@ -289,13 +289,13 @@ swap_activate: no
 swap_deactivate: no
 ====================== ======================== ========= ===============

-->write_begin(), ->write_end() and ->readpage() may be called from
+->write_begin(), ->write_end() and ->read_folio() may be called from
 the request handler (/dev/loop).

-->readpage() unlocks the page, either synchronously or via I/O
+->read_folio() unlocks the folio, either synchronously or via I/O
 completion.

-->readahead() unlocks the pages that I/O is attempted on like ->readpage().
+->readahead() unlocks the folios that I/O is attempted on like ->read_folio().

 ->writepage() is used for two purposes: for "memory cleansing" and for
 "sync". These are quite different operations and the behaviour may differ
@@ -372,12 +372,12 @@ invalidate_lock before invalidating page cache in truncate / hole punch
 path (and thus calling into ->invalidate_folio) to block races between page
 cache invalidation and page cache filling functions (fault, read, ...).

-->releasepage() is called when the kernel is about to try to drop the
-buffers from the page in preparation for freeing it. It returns zero to
-indicate that the buffers are (or may be) freeable. If ->releasepage is zero,
-the kernel assumes that the fs has no private interest in the buffers.
+->release_folio() is called when the kernel is about to try to drop the
+buffers from the folio in preparation for freeing it. It returns false to
+indicate that the buffers are (or may be) freeable. If ->release_folio is
+NULL, the kernel assumes that the fs has no private interest in the buffers.

-->freepage() is called when the kernel is done dropping the page
+->free_folio() is called when the kernel has dropped the folio
 from the page cache.

 ->launder_folio() may be called prior to releasing a folio if
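
The locking.rst hunks above say that read_folio is entered with the folio locked and unlocks it "either synchronously or via I/O completion". A rough userspace sketch of those two paths follows. Everything here is an assumption made for illustration: `struct folio` and `struct pending_io` are mocks, `folio_unlock()` and `folio_mark_uptodate()` are mock functions that only reuse the kernel names, and `myfs_read_folio()` / `myfs_read_end_io()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock folio: only the two flags the locking rules talk about. */
struct folio {
	bool locked;
	bool uptodate;
};

/* Mock in-flight read: a single slot is enough for the sketch. */
struct pending_io {
	struct folio *folio;
};

/* Mock of folio_unlock(): unlocking an unlocked folio is a bug. */
static void folio_unlock(struct folio *folio)
{
	assert(folio->locked);
	folio->locked = false;
}

/* Mock of folio_mark_uptodate(). */
static void folio_mark_uptodate(struct folio *folio)
{
	folio->uptodate = true;
}

/* Hypothetical completion handler: the asynchronous unlock path. */
static void myfs_read_end_io(struct pending_io *io)
{
	folio_mark_uptodate(io->folio);
	folio_unlock(io->folio);
	io->folio = NULL;
}

/*
 * Hypothetical ->read_folio(). The caller holds the folio lock.
 * Inline data is filled in and unlocked right away; otherwise the
 * folio stays locked until myfs_read_end_io() runs.
 */
static int myfs_read_folio(struct folio *folio, bool data_is_inline,
			   struct pending_io *io)
{
	assert(folio->locked);

	if (data_is_inline) {
		folio_mark_uptodate(folio);
		folio_unlock(folio);	/* synchronous unlock */
		return 0;
	}

	io->folio = folio;		/* "submitted": unlock happens later */
	return 0;
}
```

In both paths the folio ends up unlocked exactly once; the only difference is whether that happens before `myfs_read_folio()` returns or later in the completion handler.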
9 changes: 4 additions & 5 deletions Documentation/filesystems/netfs_library.rst
@@ -96,7 +96,7 @@ attached to an inode (or NULL if fscache is disabled)::
 Buffered Read Helpers
 =====================

-The library provides a set of read helpers that handle the ->readpage(),
+The library provides a set of read helpers that handle the ->read_folio(),
 ->readahead() and much of the ->write_begin() VM operations and translate them
 into a common call framework.

@@ -136,20 +136,19 @@ Read Helper Functions
 Three read helpers are provided::

 void netfs_readahead(struct readahead_control *ractl);
-int netfs_readpage(struct file *file,
-struct page *page);
+int netfs_read_folio(struct file *file,
+struct folio *folio);
 int netfs_write_begin(struct file *file,
 struct address_space *mapping,
 loff_t pos,
 unsigned int len,
-unsigned int flags,
 struct folio **_folio,
 void **_fsdata);

 Each corresponds to a VM address space operation. These operations use the
 state in the per-inode context.

-For ->readahead() and ->readpage(), the network filesystem just point directly
+For ->readahead() and ->read_folio(), the network filesystem just point directly
 at the corresponding read helper; whereas for ->write_begin(), it may be a
 little more complicated as the network filesystem might want to flush
 conflicting writes or track dirty data and needs to put the acquired folio if
2 changes: 1 addition & 1 deletion Documentation/filesystems/porting.rst
@@ -624,7 +624,7 @@ any symlink that might use page_follow_link_light/page_put_link() must
 have inode_nohighmem(inode) called before anything might start playing with
 its pagecache. No highmem pages should end up in the pagecache of such
 symlinks. That includes any preseeding that might be done during symlink
-creation. __page_symlink() will honour the mapping gfp flags, so once
+creation. page_symlink() will honour the mapping gfp flags, so once
 you've done inode_nohighmem() it's safe to use, but if you allocate and
 insert the page manually, make sure to use the right gfp flags.

86 changes: 41 additions & 45 deletions Documentation/filesystems/vfs.rst
@@ -620,9 +620,9 @@ Writeback.
 The first can be used independently to the others. The VM can try to
 either write dirty pages in order to clean them, or release clean pages
 in order to reuse them. To do this it can call the ->writepage method
-on dirty pages, and ->releasepage on clean pages with PagePrivate set.
-Clean pages without PagePrivate and with no external references will be
-released without notice being given to the address_space.
+on dirty pages, and ->release_folio on clean folios with the private
+flag set. Clean pages without PagePrivate and with no external references
+will be released without notice being given to the address_space.

 To achieve this functionality, pages need to be placed on an LRU with
 lru_cache_add and mark_page_active needs to be called whenever the page
@@ -656,7 +656,7 @@ by memory-mapping the page. Data is written into the address space by
 the application, and then written-back to storage typically in whole
 pages, however the address_space has finer control of write sizes.

-The read process essentially only requires 'readpage'. The write
+The read process essentially only requires 'read_folio'. The write
 process is more complicated and uses write_begin/write_end or
 dirty_folio to write data into the address_space, and writepage and
 writepages to writeback data to storage.
@@ -722,20 +722,20 @@ cache in your filesystem. The following members are defined:
 struct address_space_operations {
 int (*writepage)(struct page *page, struct writeback_control *wbc);
-int (*readpage)(struct file *, struct page *);
+int (*read_folio)(struct file *, struct folio *);
 int (*writepages)(struct address_space *, struct writeback_control *);
 bool (*dirty_folio)(struct address_space *, struct folio *);
 void (*readahead)(struct readahead_control *);
 int (*write_begin)(struct file *, struct address_space *mapping,
-loff_t pos, unsigned len, unsigned flags,
+loff_t pos, unsigned len,
 struct page **pagep, void **fsdata);
 int (*write_end)(struct file *, struct address_space *mapping,
 loff_t pos, unsigned len, unsigned copied,
 struct page *page, void *fsdata);
 sector_t (*bmap)(struct address_space *, sector_t);
 void (*invalidate_folio) (struct folio *, size_t start, size_t len);
-int (*releasepage) (struct page *, int);
-void (*freepage)(struct page *);
+bool (*release_folio)(struct folio *, gfp_t);
+void (*free_folio)(struct folio *);
 ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
 /* isolate a page for migration */
 bool (*isolate_page) (struct page *, isolate_mode_t);
@@ -747,7 +747,7 @@ cache in your filesystem. The following members are defined:
 bool (*is_partially_uptodate) (struct folio *, size_t from,
 size_t count);
-void (*is_dirty_writeback) (struct page *, bool *, bool *);
+void (*is_dirty_writeback)(struct folio *, bool *, bool *);
 int (*error_remove_page) (struct mapping *mapping, struct page *page);
 int (*swap_activate)(struct file *);
 int (*swap_deactivate)(struct file *);
@@ -772,14 +772,14 @@ cache in your filesystem. The following members are defined:

 See the file "Locking" for more details.

-``readpage``
-called by the VM to read a page from backing store. The page
-will be Locked when readpage is called, and should be unlocked
-and marked uptodate once the read completes. If ->readpage
-discovers that it needs to unlock the page for some reason, it
-can do so, and then return AOP_TRUNCATED_PAGE. In this case,
-the page will be relocated, relocked and if that all succeeds,
-->readpage will be called again.
+``read_folio``
+called by the VM to read a folio from backing store. The folio
+will be locked when read_folio is called, and should be unlocked
+and marked uptodate once the read completes. If ->read_folio
+discovers that it cannot perform the I/O at this time, it can
+unlock the folio and return AOP_TRUNCATED_PAGE. In this case,
+the folio will be looked up again, relocked and if that all succeeds,
+->read_folio will be called again.

 ``writepages``
 called by the VM to write out pages associated with the
@@ -832,9 +832,6 @@ cache in your filesystem. The following members are defined:
 passed to write_begin is greater than the number of bytes copied
 into the page).

-flags is a field for AOP_FLAG_xxx flags, described in
-include/linux/fs.h.
-
 A void * may be returned in fsdata, which then gets passed into
 write_end.

@@ -867,36 +864,35 @@ cache in your filesystem. The following members are defined:
 address space. This generally corresponds to either a
 truncation, punch hole or a complete invalidation of the address
 space (in the latter case 'offset' will always be 0 and 'length'
-will be folio_size()). Any private data associated with the page
+will be folio_size()). Any private data associated with the folio
 should be updated to reflect this truncation. If offset is 0
 and length is folio_size(), then the private data should be
-released, because the page must be able to be completely
-discarded. This may be done by calling the ->releasepage
+released, because the folio must be able to be completely
+discarded. This may be done by calling the ->release_folio
 function, but in this case the release MUST succeed.

-``releasepage``
-releasepage is called on PagePrivate pages to indicate that the
-page should be freed if possible. ->releasepage should remove
-any private data from the page and clear the PagePrivate flag.
-If releasepage() fails for some reason, it must indicate failure
-with a 0 return value. releasepage() is used in two distinct
-though related cases. The first is when the VM finds a clean
-page with no active users and wants to make it a free page. If
-->releasepage succeeds, the page will be removed from the
-address_space and become free.
+``release_folio``
+release_folio is called on folios with private data to tell the
+filesystem that the folio is about to be freed. ->release_folio
+should remove any private data from the folio and clear the
+private flag. If release_folio() fails, it should return false.
+release_folio() is used in two distinct though related cases.
+The first is when the VM wants to free a clean folio with no
+active users. If ->release_folio succeeds, the folio will be
+removed from the address_space and be freed.

 The second case is when a request has been made to invalidate
-some or all pages in an address_space. This can happen through
-the fadvise(POSIX_FADV_DONTNEED) system call or by the
-filesystem explicitly requesting it as nfs and 9fs do (when they
+some or all folios in an address_space. This can happen
+through the fadvise(POSIX_FADV_DONTNEED) system call or by the
+filesystem explicitly requesting it as nfs and 9p do (when they
 believe the cache may be out of date with storage) by calling
 invalidate_inode_pages2(). If the filesystem makes such a call,
-and needs to be certain that all pages are invalidated, then its
-releasepage will need to ensure this. Possibly it can clear the
-PageUptodate bit if it cannot free private data yet.
+and needs to be certain that all folios are invalidated, then
+its release_folio will need to ensure this. Possibly it can
+clear the uptodate flag if it cannot free private data yet.

-``freepage``
-freepage is called once the page is no longer visible in the
+``free_folio``
+free_folio is called once the folio is no longer visible in the
 page cache in order to allow the cleanup of any private data.
 Since it may be called by the memory reclaimer, it should not
 assume that the original address_space mapping still exists, and
@@ -935,14 +931,14 @@ cache in your filesystem. The following members are defined:
 without needing I/O to bring the whole page up to date.

 ``is_dirty_writeback``
-Called by the VM when attempting to reclaim a page. The VM uses
+Called by the VM when attempting to reclaim a folio. The VM uses
 dirty and writeback information to determine if it needs to
 stall to allow flushers a chance to complete some IO.
-Ordinarily it can use PageDirty and PageWriteback but some
-filesystems have more complex state (unstable pages in NFS
+Ordinarily it can use folio_test_dirty and folio_test_writeback but
+some filesystems have more complex state (unstable folios in NFS
 prevent reclaim) or do not set those flags due to locking
 problems. This callback allows a filesystem to indicate to the
-VM if a page should be treated as dirty or writeback for the
+VM if a folio should be treated as dirty or writeback for the
 purposes of stalling.

 ``error_remove_page``
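
The ``read_folio`` entry in the vfs.rst hunks above describes a retry protocol: if the filesystem cannot do the I/O now, it unlocks the folio and returns AOP_TRUNCATED_PAGE, after which the folio is looked up again, relocked, and read_folio is called again. The following userspace sketch models that loop. It is an illustration under assumptions, not the VM's actual code: `struct folio` is a mock, `AOP_TRUNCATED_PAGE` is defined locally as a stand-in constant, and `struct myfs_state`, `myfs_read_folio()` and `read_folio_with_retry()` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the kernel's return code; defined here for the sketch. */
#define AOP_TRUNCATED_PAGE 0x80001

/* Mock folio with just the state the protocol needs. */
struct folio {
	bool locked;
	bool uptodate;
};

/* Hypothetical filesystem state: how often the backend is still busy. */
struct myfs_state {
	int busy_rounds;
	int calls;
};

/*
 * Hypothetical ->read_folio(). When the I/O cannot be performed now,
 * the folio is unlocked BEFORE AOP_TRUNCATED_PAGE is returned.
 */
static int myfs_read_folio(struct myfs_state *st, struct folio *folio)
{
	assert(folio->locked);		/* caller holds the folio lock */
	st->calls++;

	if (st->busy_rounds > 0) {
		st->busy_rounds--;
		folio->locked = false;
		return AOP_TRUNCATED_PAGE;
	}

	folio->uptodate = true;
	folio->locked = false;		/* read finished: unlock */
	return 0;
}

/* Simplified caller: "look up again, relock" is modelled as setting locked. */
static int read_folio_with_retry(struct myfs_state *st, struct folio *folio)
{
	int err;

	do {
		folio->locked = true;
		err = myfs_read_folio(st, folio);
	} while (err == AOP_TRUNCATED_PAGE);

	return err;
}
```

With `busy_rounds` set to 2, the callback runs three times before it succeeds. A real caller also has to cope with the lookup or relock failing, which this sketch leaves out.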
13 changes: 13 additions & 0 deletions MAINTAINERS
@@ -14878,6 +14878,19 @@ F: Documentation/core-api/padata.rst
 F: include/linux/padata.h
 F: kernel/padata.c

+PAGE CACHE
+M: Matthew Wilcox (Oracle) <[email protected]>
+L: [email protected]
+S: Supported
+T: git git://git.infradead.org/users/willy/pagecache.git
+F: Documentation/filesystems/locking.rst
+F: Documentation/filesystems/vfs.rst
+F: include/linux/pagemap.h
+F: mm/filemap.c
+F: mm/page-writeback.c
+F: mm/readahead.c
+F: mm/truncate.c
+
 PAGE POOL
 M: Jesper Dangaard Brouer <[email protected]>
 M: Ilias Apalodimas <[email protected]>
12 changes: 5 additions & 7 deletions block/fops.c
@@ -372,9 +372,9 @@ static int blkdev_writepage(struct page *page, struct writeback_control *wbc)
 	return block_write_full_page(page, blkdev_get_block, wbc);
 }

-static int blkdev_readpage(struct file * file, struct page * page)
+static int blkdev_read_folio(struct file *file, struct folio *folio)
 {
-	return block_read_full_page(page, blkdev_get_block);
+	return block_read_full_folio(folio, blkdev_get_block);
 }

 static void blkdev_readahead(struct readahead_control *rac)
@@ -383,11 +383,9 @@ static void blkdev_readahead(struct readahead_control *rac)
 }

 static int blkdev_write_begin(struct file *file, struct address_space *mapping,
-		loff_t pos, unsigned len, unsigned flags, struct page **pagep,
-		void **fsdata)
+		loff_t pos, unsigned len, struct page **pagep, void **fsdata)
 {
-	return block_write_begin(mapping, pos, len, flags, pagep,
-			blkdev_get_block);
+	return block_write_begin(mapping, pos, len, pagep, blkdev_get_block);
 }

 static int blkdev_write_end(struct file *file, struct address_space *mapping,
@@ -412,7 +410,7 @@ static int blkdev_writepages(struct address_space *mapping,
 const struct address_space_operations def_blk_aops = {
 	.dirty_folio = block_dirty_folio,
 	.invalidate_folio = block_invalidate_folio,
-	.readpage = blkdev_readpage,
+	.read_folio = blkdev_read_folio,
 	.readahead = blkdev_readahead,
 	.writepage = blkdev_writepage,
 	.write_begin = blkdev_write_begin,
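
The block/fops.c hunks above drop the `flags` argument from write_begin, and the merge message says the memalloc_nofs_save() API replaces AOP_FLAG_NOFS. The scoped pattern that implies can be sketched in userspace C as follows. This is an illustration under assumptions only: `memalloc_nofs_save()` and `memalloc_nofs_restore()` are mocks that borrow the kernel names, `task_flags` and the `PF_MEMALLOC_NOFS` bit value are made up for the sketch, and the `myfs_*` functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock per-task flag word and bit; values are made up for the sketch. */
#define PF_MEMALLOC_NOFS 0x1u
static unsigned int task_flags;

/* Mock of memalloc_nofs_save(): enter NOFS scope, return the old bit. */
static unsigned int memalloc_nofs_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOFS;

	task_flags |= PF_MEMALLOC_NOFS;
	return old;
}

/* Mock of memalloc_nofs_restore(): put the saved bit back. */
static void memalloc_nofs_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Mock allocator-side check: may an allocation recurse into the fs? */
static bool allocation_may_enter_fs(void)
{
	return !(task_flags & PF_MEMALLOC_NOFS);
}

/* Records what the hypothetical write_begin observed. */
static bool saw_nofs;

/* Hypothetical write_begin in the post-merge shape: no flags argument. */
static int myfs_write_begin(long long pos, unsigned len)
{
	(void)pos;
	(void)len;
	saw_nofs = !allocation_may_enter_fs();
	return 0;
}

/* Hypothetical caller that previously would have passed AOP_FLAG_NOFS. */
static int myfs_write_from_fs_context(long long pos, unsigned len)
{
	unsigned int nofs = memalloc_nofs_save();
	int err = myfs_write_begin(pos, len);

	memalloc_nofs_restore(nofs);
	return err;
}
```

The design point is that the constraint travels with the calling context rather than with each function signature, so a helper such as the hypothetical `myfs_write_begin()` needs no flag parameter, and nested save/restore pairs unwind correctly because each restore reinstates the bit it saved.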