Merge tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've added new features such as zone capacity for ZNS
  and a new GC policy, ATGC, along with in-memory segment management. In
  addition, we could improve the decompression speed significantly by
  changing virtual mapping method. Even though we've fixed lots of small
  bugs in compression support, I feel that it becomes more stable so
  that I could give it a try in production.

  Enhancements:
   - support zone capacity in NVMe Zoned Namespace devices
   - introduce in-memory current segment management
   - add standard casefolding support
   - support age threshold based garbage collection
   - improve decompression speed by changing virtual mapping method

  Bug fixes:
   - fix condition checks in some ioctl() such as compression, move_range, etc
   - fix 32/64bits support in data structures
   - fix memory allocation in zstd decompress
   - add some boundary checks to avoid kernel panic on corrupted image
   - fix disallowing compression for non-empty file
   - fix slab leakage of compressed block writes

  In addition, it includes code refactoring for better readability and
  minor bug fixes for compression and zoned device support"

* tag 'f2fs-for-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (51 commits)
  f2fs: code cleanup by removing unnecessary check
  f2fs: wait for sysfs kobject removal before freeing f2fs_sb_info
  f2fs: fix writecount false positive in releasing compress blocks
  f2fs: introduce check_swap_activate_fast()
  f2fs: don't issue flush in f2fs_flush_device_cache() for nobarrier case
  f2fs: handle errors of f2fs_get_meta_page_nofail
  f2fs: fix to set SBI_NEED_FSCK flag for inconsistent inode
  f2fs: reject CASEFOLD inode flag without casefold feature
  f2fs: fix memory alignment to support 32bit
  f2fs: fix slab leak of rpages pointer
  f2fs: compress: fix to disallow enabling compress on non-empty file
  f2fs: compress: introduce cic/dic slab cache
  f2fs: compress: introduce page array slab cache
  f2fs: fix to do sanity check on segment/section count
  f2fs: fix to check segment boundary during SIT page readahead
  f2fs: fix uninit-value in f2fs_lookup
  f2fs: remove unneeded parameter in find_in_block()
  f2fs: fix wrong total_sections check and fsmeta check
  f2fs: remove duplicated code in sanity_check_area_boundary
  f2fs: remove unused check on version_bitmap
  ...
torvalds committed Oct 16, 2020
2 parents 54a4c78 + 788e96d commit 7a3dade
Showing 28 changed files with 1,795 additions and 493 deletions.
3 changes: 2 additions & 1 deletion Documentation/ABI/testing/sysfs-fs-f2fs
@@ -22,7 +22,8 @@ Contact: "Namjae Jeon" <[email protected]>
Description: Controls the victim selection policy for garbage collection.
Setting gc_idle = 0(default) will disable this option. Setting
gc_idle = 1 will select the Cost Benefit approach & setting
-gc_idle = 2 will select the greedy approach.
+gc_idle = 2 will select the greedy approach & setting
+gc_idle = 3 will select the age-threshold based approach.

What: /sys/fs/f2fs/<disk>/reclaim_segments
Date: October 2013
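
The gc_idle values listed in this hunk can be summarized as a small lookup. This is a minimal sketch only: the helper name `gc_idle_policy_name` and the returned strings are hypothetical, restate the description text, and are not kernel identifiers.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the kernel): maps a gc_idle value to
 * the victim selection policy named in the sysfs description.
 * Returns NULL for values the description does not define.
 */
static const char *gc_idle_policy_name(int gc_idle)
{
	switch (gc_idle) {
	case 0:
		return "disabled (default)";
	case 1:
		return "cost-benefit";
	case 2:
		return "greedy";
	case 3:
		return "age-threshold";	/* value added by this change */
	default:
		return NULL;
	}
}
```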
82 changes: 67 additions & 15 deletions Documentation/filesystems/f2fs.rst
@@ -127,14 +127,14 @@ active_logs=%u Support configuring the number of active logs. In the
current design, f2fs supports only 2, 4, and 6 logs.
Default number is 6.
disable_ext_identify Disable the extension list configured by mkfs, so f2fs
-does not aware of cold files such as media files.
+is not aware of cold files such as media files.
inline_xattr Enable the inline xattrs feature.
noinline_xattr Disable the inline xattrs feature.
inline_xattr_size=%u Support configuring inline xattr size, it depends on
flexible inline xattr feature.
-inline_data Enable the inline data feature: New created small(<~3.4k)
+inline_data Enable the inline data feature: Newly created small (<~3.4k)
files can be written into inode block.
-inline_dentry Enable the inline dir feature: data in new created
+inline_dentry Enable the inline dir feature: data in newly created
directory entries can be written into inode block. The
space of inode block which is used to store inline
dentries is limited to ~3.4k.
@@ -203,9 +203,9 @@ usrjquota=<file> Appoint specified file and type during mount, so that quota
grpjquota=<file> information can be properly updated during recovery flow,
prjjquota=<file> <quota file>: must be in root directory;
jqfmt=<quota type> <quota type>: [vfsold,vfsv0,vfsv1].
-offusrjquota Turn off user journelled quota.
-offgrpjquota Turn off group journelled quota.
-offprjjquota Turn off project journelled quota.
+offusrjquota Turn off user journalled quota.
+offgrpjquota Turn off group journalled quota.
+offprjjquota Turn off project journalled quota.
quota Enable plain user disk quota accounting.
noquota Disable all plain disk quota option.
whint_mode=%s Control which write hints are passed down to block
@@ -266,6 +266,8 @@ inlinecrypt When possible, encrypt/decrypt the contents of encrypted
inline encryption hardware. The on-disk format is
unaffected. For more details, see
Documentation/block/inline-encryption.rst.
+atgc Enable age-threshold garbage collection, it provides high
+effectiveness and efficiency on background GC.
======================== ============================================================

Debugfs Entries
@@ -301,7 +303,7 @@ Usage

# insmod f2fs.ko

-3. Create a directory trying to mount::
+3. Create a directory to use when mounting::

# mkdir /mnt/f2fs

@@ -315,7 +317,7 @@ mkfs.f2fs
The mkfs.f2fs is for the use of formatting a partition as the f2fs filesystem,
which builds a basic on-disk layout.

-The options consist of:
+The quick options consist of:

=============== ===========================================================
``-l [label]`` Give a volume label, up to 512 unicode name.
@@ -337,17 +339,21 @@ The options consist of:
1 is set by default, which conducts discard.
=============== ===========================================================

+Note: please refer to the manpage of mkfs.f2fs(8) to get full option list.
+
fsck.f2fs
---------
The fsck.f2fs is a tool to check the consistency of an f2fs-formatted
partition, which examines whether the filesystem metadata and user-made data
are cross-referenced correctly or not.
Note that, initial version of the tool does not fix any inconsistency.

-The options consist of::
+The quick options consist of::

-d debug level [default:0]

+Note: please refer to the manpage of fsck.f2fs(8) to get full option list.
+
dump.f2fs
---------
The dump.f2fs shows the information of specific inode and dumps SSA and SIT to
@@ -371,6 +377,37 @@ Examples::
# dump.f2fs -s 0~-1 /dev/sdx (SIT dump)
# dump.f2fs -a 0~-1 /dev/sdx (SSA dump)

+Note: please refer to the manpage of dump.f2fs(8) to get full option list.
+
+sload.f2fs
+----------
+The sload.f2fs gives a way to insert files and directories in the exisiting disk
+image. This tool is useful when building f2fs images given compiled files.
+
+Note: please refer to the manpage of sload.f2fs(8) to get full option list.
+
+resize.f2fs
+-----------
+The resize.f2fs lets a user resize the f2fs-formatted disk image, while preserving
+all the files and directories stored in the image.
+
+Note: please refer to the manpage of resize.f2fs(8) to get full option list.
+
+defrag.f2fs
+-----------
+The defrag.f2fs can be used to defragment scattered written data as well as
+filesystem metadata across the disk. This can improve the write speed by giving
+more free consecutive space.
+
+Note: please refer to the manpage of defrag.f2fs(8) to get full option list.
+
+f2fs_io
+-------
+The f2fs_io is a simple tool to issue various filesystem APIs as well as
+f2fs-specific ones, which is very useful for QA tests.
+
+Note: please refer to the manpage of f2fs_io(8) to get full option list.
+
Design
======

@@ -383,7 +420,7 @@ consists of a set of sections. By default, section and zone sizes are set to one
segment size identically, but users can easily modify the sizes by mkfs.

F2FS splits the entire volume into six areas, and all the areas except superblock
-consists of multiple segments as described below::
+consist of multiple segments as described below::

align with the zone size <-|
|-> align with the segment size
@@ -486,7 +523,7 @@ one inode block (i.e., a file) covers::
`- direct node (1018)
`- data (1018)

-Note that, all the node blocks are mapped by NAT which means the location of
+Note that all the node blocks are mapped by NAT which means the location of
each node is translated by the NAT table. In the consideration of the wandering
tree problem, F2FS is able to cut off the propagation of node updates caused by
leaf data writes.
@@ -566,7 +603,7 @@ When F2FS finds a file name in a directory, at first a hash value of the file
name is calculated. Then, F2FS scans the hash table in level #0 to find the
dentry consisting of the file name and its inode number. If not found, F2FS
scans the next hash table in level #1. In this way, F2FS scans hash tables in
-each levels incrementally from 1 to N. In each levels F2FS needs to scan only
+each levels incrementally from 1 to N. In each level F2FS needs to scan only
one bucket determined by the following equation, which shows O(log(# of files))
complexity::

@@ -707,7 +744,7 @@ WRITE_LIFE_LONG " WRITE_LIFE_LONG
Fallocate(2) Policy
-------------------

-The default policy follows the below posix rule.
+The default policy follows the below POSIX rule.

Allocating disk space
The default operation (i.e., mode is zero) of fallocate() allocates
@@ -720,7 +757,7 @@ Allocating disk space
as a method of optimally implementing that function.

However, once F2FS receives ioctl(fd, F2FS_IOC_SET_PIN_FILE) in prior to
-fallocate(fd, DEFAULT_MODE), it allocates on-disk blocks addressess having
+fallocate(fd, DEFAULT_MODE), it allocates on-disk block addressess having
zero or random data, which is useful to the below scenario where:

1. create(fd)
@@ -739,7 +776,7 @@ Compression implementation
cluster can be compressed or not.

- In cluster metadata layout, one special block address is used to indicate
-cluster is compressed one or normal one, for compressed cluster, following
+a cluster is a compressed one or normal one; for compressed cluster, following
metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs
stores data including compress header and compressed data.

@@ -772,3 +809,18 @@ Compress metadata layout::
+-------------+-------------+----------+----------------------------+
| data length | data chksum | reserved | compressed data |
+-------------+-------------+----------+----------------------------+
+
+NVMe Zoned Namespace devices
+----------------------------
+
+- ZNS defines a per-zone capacity which can be equal or less than the
+zone-size. Zone-capacity is the number of usable blocks in the zone.
+F2FS checks if zone-capacity is less than zone-size, if it is, then any
+segment which starts after the zone-capacity is marked as not-free in
+the free segment bitmap at initial mount time. These segments are marked
+as permanently used so they are not allocated for writes and
+consequently are not needed to be garbage collected. In case the
+zone-capacity is not aligned to default segment size(2MB), then a segment
+can start before the zone-capacity and span across zone-capacity boundary.
+Such spanning segments are also considered as usable segments. All blocks
+past the zone-capacity are considered unusable in these segments.
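
The zone-capacity rule described in the added paragraph can be sketched as a per-segment calculation. This is an illustration under stated assumptions, not the f2fs implementation: the helper name `usable_blocks_in_seg` is hypothetical, segments are assumed to be 2MB of 4KB blocks (512 blocks), offsets are counted in blocks from the start of the zone, and a segment starting exactly at the capacity boundary is treated as unusable because it contains no usable block.

```c
#include <stdint.h>

/* Assumption: default 2MB segment made of 4KB blocks. */
#define BLOCKS_PER_SEG 512u

/*
 * Hypothetical helper: number of usable blocks in the segment that starts
 * at block offset seg_start inside its zone, for a zone whose capacity is
 * zone_capacity blocks.
 */
static uint32_t usable_blocks_in_seg(uint32_t seg_start, uint32_t zone_capacity)
{
	/* Segment starts at or past the capacity: it would be marked not-free. */
	if (seg_start >= zone_capacity)
		return 0;

	/* Segment lies entirely below the capacity: every block is usable. */
	if (zone_capacity - seg_start >= BLOCKS_PER_SEG)
		return BLOCKS_PER_SEG;

	/* Segment spans the capacity boundary: only the leading blocks count. */
	return zone_capacity - seg_start;
}
```

With a capacity of 1000 blocks, for example, the second segment (starting at block 512) spans the boundary and keeps only its first 488 blocks.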
6 changes: 3 additions & 3 deletions fs/f2fs/acl.c
@@ -160,7 +160,7 @@ static void *f2fs_acl_to_disk(struct f2fs_sb_info *sbi,
return (void *)f2fs_acl;

fail:
-kvfree(f2fs_acl);
+kfree(f2fs_acl);
return ERR_PTR(-EINVAL);
}

@@ -190,7 +190,7 @@ static struct posix_acl *__f2fs_get_acl(struct inode *inode, int type,
acl = NULL;
else
acl = ERR_PTR(retval);
-kvfree(value);
+kfree(value);

return acl;
}
@@ -240,7 +240,7 @@ static int __f2fs_set_acl(struct inode *inode, int type,

error = f2fs_setxattr(inode, name_index, "", value, size, ipage, 0);

-kvfree(value);
+kfree(value);
if (!error)
set_cached_acl(inode, type, acl);

17 changes: 14 additions & 3 deletions fs/f2fs/checkpoint.c
@@ -107,7 +107,7 @@ struct page *f2fs_get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index)
return __get_meta_page(sbi, index, true);
}

-struct page *f2fs_get_meta_page_nofail(struct f2fs_sb_info *sbi, pgoff_t index)
+struct page *f2fs_get_meta_page_retry(struct f2fs_sb_info *sbi, pgoff_t index)
{
struct page *page;
int count = 0;
@@ -243,6 +243,8 @@ int f2fs_ra_meta_pages(struct f2fs_sb_info *sbi, block_t start, int nrpages,
blkno * NAT_ENTRY_PER_BLOCK);
break;
case META_SIT:
+if (unlikely(blkno >= TOTAL_SEGS(sbi)))
+goto out;
/* get sit block addr */
fio.new_blkaddr = current_sit_addr(sbi,
blkno * SIT_ENTRY_PER_BLOCK);
@@ -1047,8 +1049,12 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
get_pages(sbi, is_dir ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
retry:
-if (unlikely(f2fs_cp_error(sbi)))
+if (unlikely(f2fs_cp_error(sbi))) {
+trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
+get_pages(sbi, is_dir ?
+F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
return -EIO;
+}

spin_lock(&sbi->inode_lock[type]);

@@ -1619,11 +1625,16 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)

f2fs_flush_sit_entries(sbi, cpc);

+/* save inmem log status */
+f2fs_save_inmem_curseg(sbi);
+
err = do_checkpoint(sbi, cpc);
if (err)
f2fs_release_discard_addrs(sbi);
else
f2fs_clear_prefree_segments(sbi, cpc);
+
+f2fs_restore_inmem_curseg(sbi);
stop:
unblock_operations(sbi);
stat_inc_cp_count(sbi->stat_info);
@@ -1654,7 +1665,7 @@ void f2fs_init_ino_entry_info(struct f2fs_sb_info *sbi)
}

sbi->max_orphans = (sbi->blocks_per_seg - F2FS_CP_PACKS -
-NR_CURSEG_TYPE - __cp_payload(sbi)) *
+NR_CURSEG_PERSIST_TYPE - __cp_payload(sbi)) *
F2FS_ORPHANS_PER_BLOCK;
}
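
The max_orphans expression in this hunk can be worked through as plain arithmetic. The sketch below is standalone and only illustrative: the function `max_orphans` and its parameters are hypothetical stand-ins for the `sbi` fields, and the three constant values are assumptions about the f2fs headers that should be checked against the tree.

```c
/* Assumed constant values; verify against the f2fs headers before relying on them. */
#define F2FS_CP_PACKS          2	/* checkpoint pack blocks (assumed) */
#define NR_CURSEG_PERSIST_TYPE 6	/* persistent current segments (assumed) */
#define F2FS_ORPHANS_PER_BLOCK 1020	/* orphan inode entries per block (assumed) */

/*
 * Hypothetical standalone version of the expression in
 * f2fs_init_ino_entry_info(): blocks left in one segment after the
 * checkpoint packs, the persistent curseg summaries and the checkpoint
 * payload, multiplied by the orphan entries that fit in each block.
 */
static unsigned int max_orphans(unsigned int blocks_per_seg,
				unsigned int cp_payload)
{
	return (blocks_per_seg - F2FS_CP_PACKS -
		NR_CURSEG_PERSIST_TYPE - cp_payload) *
		F2FS_ORPHANS_PER_BLOCK;
}
```

Under these assumptions, a 512-block segment with no checkpoint payload gives (512 - 2 - 6 - 0) * 1020 = 514080 orphan entries.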
