Skip to content

Commit

Permalink
nilfs2: fix segctor bug that causes file system corruption
Browse files Browse the repository at this point in the history
There is a bug in the function nilfs_segctor_collect, which results in
active data being written to a segment, that is marked as clean.  It is
possible, that this segment is selected for a later segment
construction, whereby the old data is overwritten.

The problem shows itself with the following kernel log message:

  nilfs_sufile_do_cancel_free: segment 6533 must be clean

Usually a few hours later the file system gets corrupted:

  NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0
  NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660)

The issue can be reproduced with a file system that is nearly full and
with the cleaner running, while some IO intensive task is running.
Although it is quite hard to reproduce.

This is what happens:

 1. The cleaner starts the segment construction
 2. nilfs_segctor_collect is called
 3. sc_stage is on NILFS_ST_SUFILE and segments are freed
 4. sc_stage is on NILFS_ST_DAT current segment is full
 5. nilfs_segctor_extend_segments is called, which
    allocates a new segment
 6. The new segment is one of the segments freed in step 3
 7. nilfs_sufile_cancel_freev is called and produces an error message
 8. Loop around and the collection starts again
 9. sc_stage is on NILFS_ST_SUFILE and segments are freed
    including the newly allocated segment, which will contain active
    data and can be allocated at a later time
10. A few hours later another segment construction allocates the
    segment and causes file system corruption

This can be prevented by simply reordering the statements.  If
nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments
the freed segments are marked as dirty and cannot be allocated any more.

Signed-off-by: Andreas Rohner <[email protected]>
Reviewed-by: Ryusuke Konishi <[email protected]>
Tested-by: Andreas Rohner <[email protected]>
Signed-off-by: Ryusuke Konishi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
zeitgeist87 authored and torvalds committed Jan 15, 2014
1 parent a6da83f commit 70f2fe3
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions fs/nilfs2/segment.c
Original file line number Diff line number Diff line change
Expand Up @@ -1440,17 +1440,19 @@ static int nilfs_segctor_collect(struct nilfs_sc_info *sci,

nilfs_clear_logs(&sci->sc_segbufs);

err = nilfs_segctor_extend_segments(sci, nilfs, nadd);
if (unlikely(err))
return err;

if (sci->sc_stage.flags & NILFS_CF_SUFREED) {
err = nilfs_sufile_cancel_freev(nilfs->ns_sufile,
sci->sc_freesegs,
sci->sc_nfreesegs,
NULL);
WARN_ON(err); /* do not happen */
sci->sc_stage.flags &= ~NILFS_CF_SUFREED;
}

err = nilfs_segctor_extend_segments(sci, nilfs, nadd);
if (unlikely(err))
return err;

nadd = min_t(int, nadd << 1, SC_MAX_SEGDELTA);
sci->sc_stage = prev_stage;
}
Expand Down

0 comments on commit 70f2fe3

Please sign in to comment.