Recovering data from a failed RAID array with ddrescue: a practical walkthrough
Source: Dev.to
When a RAID array fails, the worst thing you can do is panic and start poking at it immediately. I’ve seen too many cases where an impatient rebuild attempt overwrote the only good copy of data. This walkthrough covers how to safely approach a degraded or failed RAID, with ddrescue as your best friend.
Before running mdadm --assemble, before doing anything, clone your physical disks. A RAID 5 with one failed drive can lose everything the moment a second drive throws a read error during rebuild. This isn’t hypothetical: it’s one of the most common ways a degraded array turns into a total loss. The golden rule: image first, recover second.
Check current RAID state
cat /proc/mdstat
More detail
mdadm --detail /dev/md0
Look for:
[UUU_]: one drive failed (underscore = missing)
[UU__]: two drives failed (catastrophic for RAID 5)
State: degraded, recovering, or failed
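If you have several arrays, counting underscores by eye gets old quickly. Here is a minimal sketch that reads mdstat-style text and prints each array name with its number of missing members. The function name mdstat_missing is my own, and it assumes the usual layout where the [UUU_] status sits on the line after the "mdN :" line.

```shell
# Print "<array> <missing-count>" for each array found in mdstat-style
# text on stdin. Assumes the status bracket only contains U and _.
mdstat_missing() {
  awk '
    /^md[0-9]+ :/ { name = $1 }            # e.g. "md0 : active raid5 ..."
    match($0, /\[[U_]+\]/) {               # e.g. "[UUU_]"
      status = substr($0, RSTART + 1, RLENGTH - 2)
      missing = gsub(/_/, "_", status)     # gsub returns the number of matches
      print name, missing
    }
  '
}

# Usage on a live system:
#   mdstat_missing < /proc/mdstat
```

For the [UUU_] example above this would report one missing member for md0. It only reads text, so it is safe to run before you have imaged anything.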
Do NOT run mdadm --manage /dev/md0 --add /dev/sdX yet. Stop the array instead:
mdadm --stop /dev/md0
ddrescue is the right tool because it handles read errors gracefully: it maps bad sectors, retries them, and lets you resume interrupted sessions. Never use dd for a failing disk. Install it:
Debian/Ubuntu
sudo apt install gddrescue
RHEL/CentOS
sudo dnf install ddrescue
Clone each RAID member to a separate image file. You need at least as much free storage as the total size of all member disks combined:
First pass: copy everything readable, skip bad sectors fast
sudo ddrescue -d -r0 /dev/sda /mnt/backup/sda.img /mnt/backup/sda.log
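After the first pass, the mapfile tells you how much is still unread, which helps decide how much effort the retry pass deserves. The sketch below totals the block list of a mapfile by status. It assumes the GNU ddrescue mapfile layout (comment lines starting with #, one current-position line, then "pos size status" lines in hex, where + means rescued and - means bad sector); the function name mapfile_summary is hypothetical.

```shell
# Summarize a ddrescue mapfile: bytes rescued, bytes in bad sectors,
# and bytes still pending (non-tried, non-trimmed, non-scraped).
mapfile_summary() {
  rescued=0; bad=0; pending=0
  while read -r pos size status rest; do
    case $pos in '#'*|'') continue ;; esac      # skip comments and blank lines
    case $size in 0x*) ;; *) continue ;; esac   # skip the current-position line
    case $status in
      +) rescued=$((rescued + $size)) ;;
      -) bad=$((bad + $size)) ;;
      *) pending=$((pending + $size)) ;;
    esac
  done < "$1"
  echo "rescued=$rescued bad=$bad pending=$pending"
}

# Usage:
#   mapfile_summary /mnt/backup/sda.log
```

If your package ships ddrescuelog, its -t option should report similar totals without any scripting.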
Second pass: retry bad sectors up to 3 times
sudo ddrescue -d -r3 /dev/sda /mnt/backup/sda.img /mnt/backup/sda.log
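With three or more members it is easy to mistype a device or run out of room halfway through. As a rough sketch, the two helpers below check that the destination can hold all images and print (without running) both passes for every member. The names enough_space and clone_plan are my own, and the devices and /mnt/backup path are simply the ones used in this walkthrough.

```shell
# Succeed if free_bytes (first argument) covers the sum of the remaining
# arguments (member sizes in bytes).
enough_space() {
  free_bytes=$1; shift
  total=0
  for size in "$@"; do total=$((total + size)); done
  [ "$free_bytes" -ge "$total" ]
}

# Print, but do not run, both ddrescue passes for each member disk.
clone_plan() {
  dest=$1; shift
  for disk in "$@"; do
    name=${disk##*/}                      # /dev/sda -> sda
    for retries in 0 3; do
      printf 'sudo ddrescue -d -r%s %s %s/%s.img %s/%s.log\n' \
        "$retries" "$disk" "$dest" "$name" "$dest" "$name"
    done
  done
}

# Usage (reading sizes needs root):
#   free=$(df --output=avail -B1 /mnt/backup | tail -n 1)
#   enough_space "$free" $(sudo blockdev --getsize64 /dev/sda /dev/sdb /dev/sdc) \
#     && clone_plan /mnt/backup /dev/sda /dev/sdb /dev/sdc
```

Printing the commands first gives you a chance to read them before anything touches a failing disk.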
Key flags:
-d: direct disk access (bypass kernel cache)
-r0 / -r3: retry bad sectors 0 or 3 times
The .log mapfile is critical: it lets you resume if the clone is interrupted.
Repeat for every disk in the array (sdb, sdc, etc.).
Once you have image files, assemble a software RAID from the images using loop devices, never from the raw physical disks again:
Set up loop devices
sudo losetup /dev/loop0 /mnt/backup/sda.img
sudo losetup /dev/loop1 /mnt/backup/sdb.img
sudo losetup /dev/loop2 /mnt/backup/sdc.img
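Hardcoded loop numbers can collide with loop devices that are already in use (snap packages mount through them, for instance). A hedged alternative is to let losetup pick the next free device with --find and print it with --show. In this sketch, attach_images and the LOSETUP override are hypothetical names; the override only exists so the function can be dry-run without root.

```shell
# Attach each image to the next free loop device and print the device name.
# LOSETUP is an illustrative override for dry runs; by default the real
# tool is called through sudo.
attach_images() {
  for img in "$@"; do
    ${LOSETUP:-sudo losetup} --find --show "$img"
  done
}

# Usage:
#   attach_images /mnt/backup/sda.img /mnt/backup/sdb.img /mnt/backup/sdc.img
# Dry run, printing the arguments instead of attaching anything:
#   LOSETUP=echo attach_images /mnt/backup/sda.img
```

If the printed devices are not loop0 to loop2, substitute them in the mdadm commands that follow.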
Try to assemble (read-only is ideal)
sudo mdadm --assemble --readonly /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2
If mdadm complains about mismatched superblocks or won’t assemble, try with --force:
sudo mdadm --assemble --force /dev/md0 /dev/loop0 /dev/loop1 /dev/loop2
Mount read-only first — never mount degraded arrays read-write
sudo mount -o ro /dev/md0 /mnt/raid_recovery
Check what’s there
ls -la /mnt/raid_recovery/
df -h /mnt/raid_recovery/
If the filesystem is ext4 and won’t mount, try fsck on the loop-assembled md device before mounting:
sudo fsck.ext4 -n /dev/md0 # -n = dry run, no writes
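The dry run reports its verdict through the exit status as well as the text output. A small sketch for decoding it follows, based on the bitmask documented in the e2fsck man page (1 = errors corrected, 4 = errors left uncorrected, 8 = operational error, 16 = usage error). The function name fsck_meaning is my own, and any other bits are simply reported as "other".

```shell
# Translate an e2fsck exit status (a bitmask) into short labels.
fsck_meaning() {
  code=$1
  if [ "$code" -eq 0 ]; then echo "clean"; return 0; fi
  out=""
  if [ $((code & 1)) -ne 0 ]; then out="$out errors-corrected"; fi
  if [ $((code & 4)) -ne 0 ]; then out="$out errors-left-uncorrected"; fi
  if [ $((code & 8)) -ne 0 ]; then out="$out operational-error"; fi
  if [ $((code & 16)) -ne 0 ]; then out="$out usage-error"; fi
  if [ -z "$out" ]; then out=" other-$code"; fi
  echo "${out# }"                         # drop the leading space
}

# Usage:
#   sudo fsck.ext4 -n /dev/md0; fsck_meaning $?
```

With -n nothing is repaired, so a damaged filesystem should show up as errors-left-uncorrected rather than errors-corrected.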
XFS arrays need xfs_repair -n /dev/md0 for the dry-run equivalent.
Pitfall 1: Dirty bit / write-intent bitmap mismatch
mdadm will want to do a full resync. On loop images this is safe, but watch for it.
Pitfall 2: Mixed sector sizes
sudo blockdev --getpbsz /dev/sda
Pitfall 3: RAID 6 with two failed disks
Pitfall 4: Chunk size mismatch
--force, you may need to specify --chunk=512 (or whatever the original was). Check old mdadm.conf or strings on a disk image for metadata.
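For Pitfall 2, the blockdev command only covers one disk, so here is a minimal sketch for comparing the value across all members. The helper name mixed_sector_sizes is hypothetical; it only compares the numbers it is given and succeeds when they are not all identical.

```shell
# Succeed (exit 0) when the sector sizes passed as arguments differ.
mixed_sector_sizes() {
  first=$1
  for s in "$@"; do
    [ "$s" = "$first" ] || return 0
  done
  return 1
}

# Usage (needs root; device names are the examples from this walkthrough):
#   if mixed_sector_sizes $(sudo blockdev --getpbsz /dev/sda /dev/sdb /dev/sdc); then
#     echo "members report different physical sector sizes"
#   fi
```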
Hash check critical files
find /mnt/raid_recovery -name "*.db" -exec md5sum {} + > /tmp/recovered_hashes.txt
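The hash list becomes more useful once you compare it against the copies you made. Here is a rough sketch, assuming GNU md5sum and that the copy keeps the same relative layout as the recovery mount. The function name verify_copy and the /mnt/healthy_disk path in the usage line are illustrative only.

```shell
# Hash matching files under src using relative paths, then check the
# copies under dest against those hashes. Fails if any file differs or
# is missing, and also if no file matched the pattern.
verify_copy() {
  src=$1; dest=$2; pattern=$3
  list=$(mktemp) || return 1
  (cd "$src" && find . -type f -name "$pattern" -exec md5sum {} + > "$list")
  (cd "$dest" && md5sum --quiet -c "$list")
  status=$?
  rm -f "$list"
  return $status
}

# Usage:
#   verify_copy /mnt/raid_recovery /mnt/healthy_disk '*.db'
```

Relative paths matter here: hashes recorded with absolute paths would be checked against the recovery mount itself rather than against the copy.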
Check the kernel log for RAID and I/O errors
sudo dmesg | grep -iE "raid|md0|error" | tail -30
Don’t unmount until you’ve copied everything critical to a separate, healthy disk.
If your array has two or more failed members with severe bad sectors, software reassembly may not be enough. The logical structure (stripe layout, chunk boundaries) can be reconstructed manually, but that is extremely time-consuming and error-prone without specialized tools. At that point it’s worth reading a detailed overview of RAID failure modes and professional recovery options before deciding whether to escalate.
The most important takeaway: image everything before you touch anything. ddrescue plus loop devices gives you a safe sandbox to experiment in without risking your only copy of the data. Good luck, and may your parity drives never fail.