Data recovery blog

I came across a couple of pretty good sites detailing some of the tools and processes I use for data recovery.

This wiki steps you through recovering data from a bootable Linux CD or USB stick. It recommends RIPLinux, though I personally use SystemRescueCd, which is another bootable Linux distribution designed for data and system recovery.

The important utilities are ddrescue, badblocks, and smartctl. I also use foremost, autopsy, ntfs-3g, ssh/scp, rsync, samba, and 7z when doing data recovery work.

The first task is to see what kind of shape the drive is in. For this we use smartctl to query the drive's S.M.A.R.T. statistics. These numbers tell us how many errors the drive is producing, and can give us a rough estimate of how much time we have to work with.

Running "smartctl --all /dev/<device>" prints the stats for that particular drive.

For example, on one of my systems I run the following:

$ sudo smartctl --all /dev/sde
smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is

Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31000340AS
Serial Number:    5QJ0TCMT
Firmware Version: SD15
User Capacity:    1,000,204,886,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Jun 16 22:45:09 2010 MDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

SMART overall-health self-assessment test result: PASSED

There is more data displayed than this, but we are really only interested in the first 20 lines or so at this point. What we see is the drive make, model, serial number, and firmware version, which we can use to check the warranty status of the drive. We also see a generalized health assessment for the drive.
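Further down in the same output is the attribute table, which is where the actual error counts live. As a rough sketch, the helper below filters "smartctl -A" output down to three attributes that I am assuming are the most telling ones for a failing disk (IDs 5, 197, and 198: reallocated, pending, and uncorrectable sectors). The function name smart_key_attrs is made up for this example, and the column layout is assumed to be the usual ATA attribute table with the raw value in column 10.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -A` output on stdin and prints the
# attribute name and raw value for IDs 5, 197 and 198.
# Assumes the standard ATA attribute table, where column 1 is the ID,
# column 2 the attribute name and column 10 the raw value.
smart_key_attrs() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2, $10 }'
}

# Example usage (device name is a placeholder):
#   sudo smartctl -A /dev/sde | smart_key_attrs
```

Non-zero raw values here, especially ones that keep growing between runs, are a hint to image the drive sooner rather than later.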

Note that I used sudo in this example. Some recovery distros (SystemRescueCd, for example) give you root or administrative privileges by default, which means that you do not need to use sudo.

In this case, I would feel reasonably safe using ddrescue to create an image of the drive, and then recovering data from that image rather than from the drive itself. One word of warning: ddrescue creates an image that is the same size as the drive you are trying to pull data from.

The drive above is a 1 terabyte drive, so you will need at least that much free space on your recovery media. Currently I have 4 x 1 terabyte drives in a RAID 5 array which I use for data recovery. This gives me approximately 2.8T of usable space to work with. Of course I will upgrade this over time as the size of drives increases.
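To make the space requirement concrete, here is a minimal sketch of a pre-flight check followed by a typical two-pass ddrescue run. The function name image_fits, the device /dev/sdX, and the /mnt/recovery paths are all placeholders I picked for illustration, and the ddrescue flags shown are just one common combination, not the only sensible one.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a file of SIZE_BYTES would fit in the
# free space of the filesystem holding DIR.
image_fits() {
    size_bytes=$1
    dir=$2
    # `df -Pk` prints one line per filesystem; column 4 is the available
    # space in 1024-byte blocks.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    need_kb=$(( (size_bytes + 1023) / 1024 ))
    [ "$need_kb" -le "$avail_kb" ]
}

# Example usage (device and paths are placeholders):
#   size=$(blockdev --getsize64 /dev/sdX)
#   image_fits "$size" /mnt/recovery || echo "not enough space"
#
#   # First pass: copy everything that reads cleanly, skip the slow areas.
#   ddrescue -n /dev/sdX /mnt/recovery/disk.img /mnt/recovery/disk.log
#   # Second pass: go back for the bad areas with direct access, 3 retries.
#   ddrescue -d -r3 /dev/sdX /mnt/recovery/disk.img /mnt/recovery/disk.log
```

The third argument to ddrescue is its log file. Keeping it means an interrupted run can resume where it left off instead of re-reading a drive that may not have many reads left in it.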

Once you have an image of the failing disk, you can mount its partitions and copy data off (this is where 7z, rsync, and ssh come in handy). Foremost is useful if the filesystem is completely corrupt and you cannot mount it. You simply tell foremost what kind of data you are looking for, and it will search the drive image for that file type.
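Mounting a partition out of a whole-disk image mostly comes down to knowing its byte offset. As a sketch: the helper below (partition_offset is a name I made up) multiplies the partition's start sector by the sector size, which I am assuming is 512 bytes unless told otherwise. The mount point and rsync destination in the comments are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints the byte offset of a partition inside a
# whole-disk image, given its start sector and (optionally) the sector
# size. The sector size defaults to 512 bytes, which is an assumption.
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}
    echo $(( start_sector * sector_size ))
}

# Example usage (paths and host are placeholders):
#   fdisk -l -u /path/to/diskimage.img     # note the Start sector
#   mount -o ro,loop,offset="$(partition_offset 2048)" \
#       /path/to/diskimage.img /mnt/recovered
#   rsync -a /mnt/recovered/ user@host:/srv/recovered/
```

Mounting read-only means nothing you do while browsing the files can alter the image.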

For example: foremost -t jpeg -i /path/to/diskimage.img -o /path/to/output

This will search your disk image for files whose headers indicate a JPEG image. One limitation to be aware of is that foremost does not recover filenames, so finding your data can be tricky: foremost does not differentiate between JPEGs from your browser cache and your family vacation or wedding photos. In this case, sorting your directory view by file size will likely help, since cached web images tend to be much smaller than photos from a camera.
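If you would rather do the size sort from the command line, something like the following works as a rough sketch. It assumes GNU find (for -printf), and largest_files is a name invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: lists the COUNT largest files under DIR, largest
# first, one per line as "size-in-bytes<TAB>path". COUNT defaults to 20.
# Relies on GNU find's -printf.
largest_files() {
    dir=$1
    count=${2:-20}
    find "$dir" -type f -printf '%s\t%p\n' | sort -rn | head -n "$count"
}

# Example usage (path is a placeholder):
#   largest_files /path/to/output 50
```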

Autopsy and The Sleuth Kit can automate some of these tasks for you, though having a firm understanding of what is going on under the hood will help you when the automated tools fail.

If you carve out the partitions from your disk image, you can also fix file system errors and write that partition to a new drive. See my earlier post for more details.
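For a rough idea of what carving looks like, here is a minimal sketch using dd. It is not necessarily the method from the earlier post: carve_partition is a made-up name, the start sector and sector count would come from the partition table of your image, 512-byte sectors are assumed, and the repair tools and target device in the comments are only examples.

```shell
#!/bin/sh
# Hypothetical helper: copies SECTORS 512-byte sectors, beginning at
# sector START, from a whole-disk IMAGE into a standalone file OUT.
carve_partition() {
    image=$1
    start=$2
    sectors=$3
    out=$4
    dd if="$image" of="$out" bs=512 skip="$start" count="$sectors"
}

# Example usage (all values are placeholders):
#   carve_partition /path/to/diskimage.img 2048 204800 /path/to/part1.img
#   e2fsck -f /path/to/part1.img     # ext2/3/4; ntfsfix is the NTFS analogue
#   dd if=/path/to/part1.img of=/dev/sdY1 bs=4M
```

Working on the carved copy means a repair that goes wrong costs you nothing but another dd from the original image.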