md: avoid endless recovery loop when waiting for fail device to complete.

If a device fails in a way that causes pending requests to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
It will then incorrectly look like a spare device and md will try to
recover it even though it is failed.
This leads to a recovery process starting and instantly aborting over
and over again.

We should check if the device is faulty before considering it to be a
spare.  This will avoid trying to start a recovery that cannot
proceed.

This bug was introduced in 2.6.26 so this patch is suitable for any
kernel since then.

Cc: [email protected]
Reported-by: Jim Paradis <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
neilbrown committed Jun 28, 2011
1 parent 2992c4b commit 4274215
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions drivers/md/md.c
@@ -7088,6 +7088,7 @@ static int remove_and_add_spares(mddev_t *mddev)
 	list_for_each_entry(rdev, &mddev->disks, same_set) {
 		if (rdev->raid_disk >= 0 &&
 		    !test_bit(In_sync, &rdev->flags) &&
+		    !test_bit(Faulty, &rdev->flags) &&
 		    !test_bit(Blocked, &rdev->flags))
 			spares++;
 		if (rdev->raid_disk < 0
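
To illustrate the effect of the added check outside the kernel, here is a minimal user-space sketch of the spare-counting condition before and after this change. It is a simplified model, not the md code itself: `struct model_rdev`, `model_test_bit`, `is_spare_old`, `is_spare_new` and `count_spares` are hypothetical names introduced for illustration, and the flag values are stand-ins rather than the kernel's bit numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rdev flag bits; values are illustrative. */
enum rdev_flag { In_sync, Faulty, Blocked };

/* Simplified model of an array member device (not the kernel's rdev type). */
struct model_rdev {
	int raid_disk;        /* slot in the array, or -1 if unassigned */
	unsigned long flags;  /* bitmask indexed by enum rdev_flag */
};

/* User-space analogue of test_bit() for this model. */
static bool model_test_bit(enum rdev_flag bit, const unsigned long *flags)
{
	return (*flags >> bit) & 1UL;
}

/* Condition before the patch: a failed device still in a slot counts. */
static bool is_spare_old(const struct model_rdev *rdev)
{
	return rdev->raid_disk >= 0 &&
	       !model_test_bit(In_sync, &rdev->flags) &&
	       !model_test_bit(Blocked, &rdev->flags);
}

/* Condition after the patch: a Faulty device is never counted. */
static bool is_spare_new(const struct model_rdev *rdev)
{
	return rdev->raid_disk >= 0 &&
	       !model_test_bit(In_sync, &rdev->flags) &&
	       !model_test_bit(Faulty, &rdev->flags) &&
	       !model_test_bit(Blocked, &rdev->flags);
}

/* Count devices that the given predicate treats as recoverable spares. */
static int count_spares(const struct model_rdev *devs, size_t n,
			bool (*is_spare)(const struct model_rdev *))
{
	int spares = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (is_spare(&devs[i]))
			spares++;
	return spares;
}
```

In this model, a device that has been marked Faulty but still occupies a slot (because its pending requests have not completed) is counted by `is_spare_old` and skipped by `is_spare_new`, so a non-zero count no longer comes from a device that cannot be recovered.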