Recovering data from a failed RAID array with ddrescue: a practical walkthrough

Published: (June 11, 2026 at 08:57 AM EDT)
Source: Dev.to


When a RAID array fails, the worst thing you can do is panic and start poking at it immediately. I’ve seen too many cases where an impatient rebuild attempt overwrote the only good copy of data. This walkthrough covers how to safely approach a degraded or failed RAID, with ddrescue as your best friend.

Before running mdadm --assemble, before doing anything, clone your physical disks. A RAID 5 with one failed drive can lose everything the moment a second drive throws a read error during rebuild. This isn’t hypothetical: it’s how most total RAID losses happen. The golden rule: image first, recover second.

Check current RAID state

cat /proc/mdstat

More detail

mdadm --detail /dev/md0

Look for:
[UUU_]: one drive failed (underscore = missing)
[UU__]: two drives failed (catastrophic for RAID 5)
State: degraded, recovering, or failed
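As a rough illustration of reading that status string, the sketch below pulls the [U_] pattern out of /proc/mdstat-style text and counts the underscores. The function name mdstat_missing is made up for this example, and it assumes the usual layout where the status brackets appear on the line after the array's "mdN :" line.

```shell
# Hypothetical helper: reads /proc/mdstat-style text on stdin and prints
# each array name, its [U_] status string, and the number of missing members.
mdstat_missing() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        match($0, /\[[U_]+\]/) {               # e.g. [UUU_] or [_UUU]
            status = substr($0, RSTART, RLENGTH)
            missing = gsub(/_/, "_", status)   # gsub returns the match count
            print name, status, "missing=" missing
        }
    '
}

# Usage (on the affected machine):
#   mdstat_missing < /proc/mdstat
```

Anything above missing=1 on a RAID 5 means the array cannot be rebuilt from parity alone.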

Do NOT run mdadm --manage /dev/md0 --add /dev/sdX yet. Stop the array instead:

mdadm --stop /dev/md0

ddrescue is the right tool because it handles read errors gracefully: it maps bad sectors, retries them, and lets you resume interrupted sessions. Never use plain dd on a failing disk: by default it aborts at the first read error and keeps no map of what was skipped. Install ddrescue:

Debian/Ubuntu

sudo apt install gddrescue

RHEL/CentOS

sudo dnf install ddrescue

Clone each RAID member to a separate image file (you need enough storage — same total size as all disks combined):
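Before starting a clone, it may be worth checking that the destination can actually hold every image. The sketch below is one way to do that arithmetic; sum_bytes and fits are hypothetical helper names, and the device list and /mnt/backup path in the usage comment are simply the ones used in this walkthrough.

```shell
# Hypothetical helper: add up a list of byte counts (one per argument).
sum_bytes() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# Hypothetical helper: succeeds when "available" ($2) can hold "needed" ($1).
fits() {
    [ "$2" -ge "$1" ]
}

# Usage, assuming GNU df and the three-disk layout from this article:
#   needed=$(sum_bytes $(for d in /dev/sda /dev/sdb /dev/sdc; do
#                            sudo blockdev --getsize64 "$d"; done))
#   avail=$(df --output=avail -B1 /mnt/backup | tail -n 1)
#   fits "$needed" "$avail" && echo "enough space" || echo "NOT enough space"
```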

First pass: copy everything readable, skip bad sectors fast

sudo ddrescue -d -r0 /dev/sda /mnt/backup/sda.img /mnt/backup/sda.log

Second pass: retry bad sectors up to 3 times

sudo ddrescue -d -r3 /dev/sda /mnt/backup/sda.img /mnt/backup/sda.log
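Between passes it can help to see how much of the disk is still unread. The sketch below totals the block sizes in a ddrescue mapfile by status. It assumes the documented mapfile layout (comment lines starting with #, one current-position line, then "pos size status" lines in hex, with + for rescued and - for bad sectors); mapfile_summary is a hypothetical name, and every status other than + or - is lumped together as pending.

```shell
# Hypothetical helper: reads a ddrescue mapfile on stdin and prints the
# number of bytes rescued, marked bad, and still pending.
mapfile_summary() {
    rescued=0 bad=0 pending=0
    while read -r pos size status rest; do
        case $pos in ''|'#'*) continue ;; esac    # skip blanks and comments
        case $size in 0x*|0X*) ;; *) continue ;; esac  # skip the status line
        case $status in
            '+') rescued=$((rescued + $size)) ;;  # finished blocks
            '-') bad=$((bad + $size)) ;;          # bad-sector blocks
            *)   pending=$((pending + $size)) ;;  # non-tried/trimmed/scraped
        esac
    done
    echo "rescued=$rescued bad=$bad pending=$pending"
}

# Usage:
#   mapfile_summary < /mnt/backup/sda.log
```

If bad stays above zero after the retry pass, note which disk it was: that image has holes, and the array's redundancy will have to cover them.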

Key flags:
-d: direct disk access (bypass kernel cache)
-r0 / -r3: retry bad sectors 0 or 3 times

The .log mapfile is critical: it lets you resume if the clone is interrupted. Repeat for every disk in the array (sdb, sdc, etc.).

Once you have image files, assemble a software RAID from the images using loop devices, never from the raw physical disks again:

Set up loop devices

sudo losetup /dev/loop0 /mnt/backup/sda.img
sudo losetup /dev/loop1 /mnt/backup/sdb.img
sudo losetup /dev/loop2 /mnt/backup/sdc.img
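One optional variation, not part of the steps above, is to attach the images read-only so that nothing can write to them by accident. The sketch below wraps losetup with its --read-only, --find and --show options; attach_images and the DRY_RUN switch are hypothetical names for this example. With --find, losetup picks the next free loop device and --show prints its name, so the devices may not be loop0 to loop2.

```shell
# Hypothetical helper: attach each image as a read-only loop device.
# Set DRY_RUN=1 to print the commands instead of running them.
attach_images() {
    for img in "$@"; do
        ${DRY_RUN:+echo} sudo losetup --read-only --find --show "$img"
    done
}

# Usage:
#   DRY_RUN=1 attach_images /mnt/backup/sda.img /mnt/backup/sdb.img
#   attach_images /mnt/backup/sd[abc].img
```

A read-only loop device only makes sense together with a read-only assembly, since mdadm cannot update superblocks on it.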

Try to assemble (read-only is ideal)

sudo mdadm --assemble --readonly /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2

If mdadm complains about mismatched superblocks or won’t assemble, try with --force:

sudo mdadm --assemble --force /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2
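Before reaching for --force, it can be useful to see how far apart the members actually are. One common cause of a refused assembly is that the members carry different event counters in their superblocks. The sketch below extracts the "Events :" line from mdadm --examine output; examine_events is a hypothetical name, and the parsing assumes the 1.x metadata output format.

```shell
# Hypothetical helper: reads `mdadm --examine` output on stdin and prints
# the value of its "Events :" field.
examine_events() {
    awk -F' : ' '/^[ \t]*Events :/ { print $2 }'
}

# Usage, comparing the loop devices from this walkthrough:
#   for d in /dev/loop0 /dev/loop1 /dev/loop2; do
#       printf '%s ' "$d"
#       sudo mdadm --examine "$d" | examine_events
#   done
```

A member whose counter is far behind the others was probably kicked out of the array long before the final failure, so its data is likely stale.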

Mount read-only first — never mount degraded arrays read-write

sudo mount -o ro /dev/md0 /mnt/raid_recovery

Check what’s there

ls -la /mnt/raid_recovery/
df -h /mnt/raid_recovery/

If the filesystem is ext4 and won’t mount, try fsck on the loop-assembled md device before mounting:

sudo fsck.ext4 -n /dev/md0 # -n = dry run, no writes

XFS arrays need xfs_repair -n /dev/md0 for the dry-run equivalent.

Pitfall 1: Dirty bit / write-intent bitmap mismatch

mdadm will want to do a full resync. On loop images this is safe, but watch for it.

Pitfall 2: Mixed sector sizes

sudo blockdev --getpbsz /dev/sda

Pitfall 3: RAID 6 with two failed disks

Pitfall 4: Chunk size mismatch

--force, you may need to specify --chunk=512 (or whatever the original was). Check old mdadm.conf or strings on a disk image for metadata.
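For the mixed sector size check in Pitfall 2, the blockdev call has to be repeated per disk and the results compared. The sketch below does the comparison on "device size" pairs; same_sector_size is a hypothetical name, and the device list in the usage comment is the three-disk example from this walkthrough.

```shell
# Hypothetical helper: reads "device size" lines on stdin and succeeds
# only if every device reports the same sector size.
same_sector_size() {
    awk '{ seen[$2] = 1 }
         END { n = 0; for (s in seen) n++; exit (n > 1) }'
}

# Usage:
#   for d in /dev/sda /dev/sdb /dev/sdc; do
#       echo "$d $(sudo blockdev --getpbsz "$d")"
#   done | same_sector_size || echo "mixed physical sector sizes"
```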

Hash check critical files

find /mnt/raid_recovery -name "*.db" -exec md5sum {} + > /tmp/recovered_hashes.txt
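The hash list is most useful when it can be compared against the copy you make on the healthy disk. The sketch below hashes with paths relative to the directory, so two trees in different locations produce comparable output; hash_tree is a hypothetical name and /mnt/healthy_copy is a placeholder for wherever the copy lives.

```shell
# Hypothetical helper: print md5sum lines for files matching a name
# pattern ($2) under a directory ($1), with relative paths, sorted by path.
hash_tree() {
    ( cd "$1" && find . -type f -name "$2" -exec md5sum {} + | sort -k 2 )
}

# Usage, after copying the files off the recovered array:
#   hash_tree /mnt/raid_recovery '*.db' > /tmp/recovered_hashes.txt
#   hash_tree /mnt/healthy_copy  '*.db' > /tmp/copied_hashes.txt
#   diff /tmp/recovered_hashes.txt /tmp/copied_hashes.txt && echo "copy verified"
```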

Check filesystem integrity

sudo dmesg | grep -iE "raid|md0|error" | tail -30

Don’t unmount until you’ve copied everything critical to a separate, healthy disk.

If your array has two or more failed members with severe bad sectors, software reassembly may not be enough. The logical structure (stripe layout, chunk boundaries) can be reconstructed manually, but it’s extremely time-consuming and error-prone without specialized tools. At that point it’s worth reading a detailed overview of RAID failure modes and professional recovery options before deciding whether to escalate.

The most important takeaway: image everything before you touch anything. ddrescue + loop devices gives you a safe sandbox to experiment in without risking your only copy of the data. Good luck, and may your parity drives never fail.
