Hard drive recovery 301


Let’s say you have a hard drive whose media is failing but whose controller card is still functional. Let’s further say you have a desire to pull a partition off that drive and see what’s still salvageable. And let’s further say you have a computer you’re okay with leaving on for a month or so to do it. All of these things were true about a hard drive that Glendon Mellow, The Flying Trilobite, sent along to me to try to recover — there were some family photos and tax returns that he hadn’t had backed up anyplace when the drive started failing. Being the samaritan that I am, I took the project on as a way to hone my own skills. I also had a feeling I could write a blog post afterward so others might benefit.

This isn’t a 101-level course. Hell, it’s not even a 201, as it assumes you know enough to use Linux’s terminal (no GUIs in this post!) and how to connect your hard drive through a USB adapter or directly. It also assumes the hard drive is in a state where it might still be readable even if Windows itself can’t get at the data. This last one is a fairly big assumption, and I trust you’re going to be able to identify when that’s the case.

The first thing you’re going to need to know in this procedure is that everything you do on a hard drive with failing media could provoke the media to fail further, so you definitely want to be careful not to do any writing to the drive you’re trying to recover. The second thing you need to know is that everything you do could take an absurd amount of time. Patience and caution are the watchwords here.

Connect the hard drive to your Linux box by whatever means necessary. We’ll be doing most of our work from the console; feel free to use whatever console terminal you want, and whatever shell, but my personal preference is to use a tabbed terminal like Guake, and bash as my shell of choice.

If you’re using a USB adapter and can plug it in live while your computer is turned on, dmesg will show you which drive you’ve just plugged in, usually in square brackets like [sdb].
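
If the kernel log is noisy, one way to pick out the device name is to filter for the bracketed token. This is only a sketch: find_bracketed_disks is a hypothetical helper name, the sample log lines are made up for illustration, and it assumes your disk shows up with an sdX-style name.

```shell
# Print each distinct bracketed disk name (e.g. [sdb]) found in log text on stdin.
find_bracketed_disks() {
    grep -oE '\[sd[a-z]+\]' | sort -u
}

# Try it on made-up sample lines shaped like kernel messages.
printf '%s\n' \
    'sd 6:0:0:0: [sdb] Attached SCSI disk' \
    'usb 1-1: new high-speed USB device' \
    'sd 6:0:0:0: [sdb] Write Protect is off' | find_bracketed_disks

# On a live system (may need sudo): dmesg | tail -n 50 | find_bracketed_disks
```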

So, now that we know that Linux can see the controller, we need to find out the parameters of the drive. I assume you know something about the partition structure of the drive, and therefore what partition is the target. In your terminal, use:

sudo parted /dev/sdb

We’ll want to use sector units for our next part, so first type “unit s”, then “p” to print the stats on the device. Again, this might take a very long time, and might even fail. If it doesn’t, you should see something like:

(parted) unit s
(parted) p
Model: ATA TOSHIBA MK1665GS (scsi)
Disk /dev/sdb: 312581808s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start       End         Size        Type      File system     Flags
 1      2048s       27265023s   27262976s   primary   ntfs            diag
 2      27265024s   27469823s   204800s     primary   ntfs            boot
 3      27469824s   97733495s   70263672s   primary   ntfs
 4      97734654s   312580095s  214845442s  extended
 5      97734656s   136800255s  39065600s   logical   ext4
 6      136802304s  287389695s  150587392s  logical   ext4
 7      287391744s  312580095s  25188352s   logical   linux-swap(v1)
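
If you’d rather not copy numbers by hand, the Start and Size columns can be pulled out of that listing with awk. A minimal sketch, assuming the sector-unit layout shown above; partition_geometry is a hypothetical helper name, and the sample text is just a few rows from the listing.

```shell
# Read parted's sector-unit listing on stdin and print the start sector and
# sector count for the partition number given as the first argument.
partition_geometry() {
    awk -v part="$1" '$1 == part && $2 ~ /^[0-9]+s$/ {
        sub(/s$/, "", $2)   # strip the trailing "s" unit from Start
        sub(/s$/, "", $4)   # and from Size
        print "skip=" $2 " count=" $4
    }'
}

# Try it on rows copied from the listing above.
partition_geometry 3 <<'EOF'
Disk /dev/sdb: 312581808s
 2 27265024s 27469823s 204800s primary ntfs boot
 3 27469824s 97733495s 70263672s primary ntfs
EOF

# On a live system: sudo parted -s /dev/sdb unit s print | partition_geometry 3
```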

Let’s imagine, in our hypothetical, that partition 3 (the largest NTFS partition) is the one we’re looking to recover. It’s where I store my Windows-side Home drive, so it’s a good example. It’s up to you to decide what partition you want to recover, or you could simply do every one of these partitions one at a time if you really aren’t sure.
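
It helps to know how many bytes that partition will take up before you pick a destination. The arithmetic below uses the partition 3 numbers from the parted listing and the 512-byte sector size; the free-space check is an added sketch that assumes df reports the filesystem under the current directory on its second output line.

```shell
sector_size=512
start=27469824
end=97733495

# Sector ranges are inclusive, so the count is end - start + 1.
count=$(( end - start + 1 ))
bytes=$(( count * sector_size ))
echo "partition 3: $count sectors, $bytes bytes"

# Free space, in bytes, on the filesystem holding the current directory.
avail_kb=$(df -Pk . | awk 'NR == 2 { print $4 }')
avail_bytes=$(( avail_kb * 1024 ))
if [ "$avail_bytes" -ge "$bytes" ]; then
    echo "enough room here for the dump"
else
    echo "not enough room here for the dump"
fi
```

The count works out to the same figure parted prints in the Size column, which is a cheap sanity check before committing to a month-long copy.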

So, the next thing we need to do is ensure we have a place we can put a drive dump of that partition that has sufficient space. Assuming your block size is 512 like mine, cd to that location, then do the following:

sudo dd bs=512 if=/dev/sdb of=./drivedump-sdb3.dmp conv=noerror,sync iflag=direct skip=27469824 count=70263672

bs is the Block Size listed above.
if is the Input File (in this case the whole sdb drive).
of is the Output File (the dump file in the current directory).
conv set to noerror,sync sets the dd command to plow through any drive errors it finds and replace them with zeroes, rather than aborting the copy.
iflag sets the input method to “direct”, meaning reads bypass the kernel’s cache and go straight to the drive.
skip tells dd where to start the copy from: specifically, at the start of the third partition as listed by parted above.
count tells dd how many blocks to copy, taken from the size of the third partition listed by parted above.
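
If you want to convince yourself that skip and count do what you expect before pointing dd at a dying drive, you can rehearse on a scratch file. This is a sketch under assumptions: the file names are hypothetical, the fake disk is only eight sectors long, and iflag=direct is left out because not every filesystem that holds temporary files supports it.

```shell
# Build an 8-sector fake disk in a temporary directory. Each 512-byte sector
# is filled with its own digit, so it is obvious which sectors got copied.
workdir=$(mktemp -d)
for n in 0 1 2 3 4 5 6 7; do
    head -c 512 /dev/zero | tr '\0' "$n"
done > "$workdir/fake-disk.img"

# Copy 3 sectors starting at sector 2, using skip= and count= the same way
# the real command does.
dd bs=512 if="$workdir/fake-disk.img" of="$workdir/fake-part.dmp" \
   conv=noerror,sync skip=2 count=3 2>/dev/null

# The result should be 3 * 512 bytes, starting with 2s and ending with 4s.
ls -l "$workdir/fake-part.dmp"
```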

And then you wait. However long it takes. In my case, with Glendon’s drive, my netbook chunked away at the 26 gig partition for over a month. But, when it was done, I had a full drive dump that I could mount as a drive and explore locally.
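
A copy that runs for weeks should survive you closing the terminal. One common pattern, also suggested in the comments, is to start the job under nohup, send its output to a log file, and put it in the background. The sketch below uses a one-second stand-in command rather than the real dd, and the log path is a hypothetical name.

```shell
logdir=$(mktemp -d)

# Stand-in for the long copy: replace the sh -c '...' part with the real dd
# command. If that command needs sudo, start from a root shell or it will
# stall waiting for a password it cannot ask for.
nohup sh -c 'sleep 1; echo "copy finished"' > "$logdir/dd-run.log" 2>&1 &
job_pid=$!
echo "started job $job_pid, logging to $logdir/dd-run.log"

# In real use you would walk away here; the demo just waits for the stand-in.
wait "$job_pid"
cat "$logdir/dd-run.log"
```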

If you want to watch to see how big the drive dump is getting, you can do something fairly simple in a new shell tab:

watch -n1 ls -la ./drivedump-sdb3.dmp

This runs the ls command every second and refreshes the output. That way you aren’t going and checking manually every time you decide to go look to see how far along it’s gotten.
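
Because dd with conv=noerror,sync writes the output front to back, the file size also tells you roughly how far along you are. A small sketch that turns it into a percentage; dump_progress is a hypothetical helper name, and the expected size is something you work out yourself (sector count times 512).

```shell
# Print how much of the expected dump size has been written so far.
dump_progress() {
    file=$1
    expected=$2
    current=$(( $(wc -c < "$file") ))
    echo "$(( current * 100 / expected ))% ($current of $expected bytes)"
}

# Try it on a 256-byte sample file against an expected size of 1024 bytes.
sample_dump=$(mktemp)
head -c 256 /dev/zero > "$sample_dump"
dump_progress "$sample_dump" 1024

# For the example partition above, something like:
#   while sleep 60; do dump_progress ./drivedump-sdb3.dmp 35975000064; done
```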

To mount the drive once it’s done, do the following:

mkdir temp

sudo mount -o loop -t ntfs ./drivedump-sdb3.dmp temp

Of course, if the partition was a type other than ntfs, feel free to change the -t parameter to the filesystem you were recovering. Now the temp folder should have all the contents of the drive you’ve copied. Grab them and restore them however you please.
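
Since that dump may have cost you a month, it is worth protecting before you poke around in it, as one of the commenters also points out. The sketch below is an assumption-laden illustration: protect_image and mount_image_ro are hypothetical helper names, the working copy needs as much free space again as the dump itself, and the mount helper is only defined here, not run, because it needs root.

```shell
# Make a working copy, write-protect the original, and confirm the two match.
protect_image() {
    image=$1
    cp "$image" "$image.work"     # explore the copy, keep the original pristine
    chmod a-w "$image"            # guard against accidental writes
    cmp "$image" "$image.work"    # exits non-zero if the copy differs
}

# Mount an image read-only (ro) through a loop device. Needs root.
mount_image_ro() {
    sudo mount -o loop,ro -t ntfs "$1" "$2"
}

# Try the copy step on a small sample file.
sample_img=$(mktemp)
head -c 2048 /dev/zero > "$sample_img"
protect_image "$sample_img" && echo "working copy verified"

# Real use, after protect_image ./drivedump-sdb3.dmp:
#   mount_image_ro ./drivedump-sdb3.dmp.work temp
```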

If there were a lot of errors during the copy, many files may not be fully intact, but they’re at least browsable and you can see what you can salvage this way. Imagine each media error that dd encountered as a bullet hole somewhere on your hard drive. If the bullet hit a file, that file might be badly damaged and unreadable, or simply missing a few letters, depending on what kind of file it is. A text document, for instance, is more resilient than a jpeg image, and an image is more resilient than a zip file or Word document.
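
To get a rough feel for how many bullet holes there are, you can count the sectors in the dump that are entirely zeroes. Treat this as a sketch, not a diagnosis: count_zero_sectors is a hypothetical helper name, genuinely empty space on the drive is also all zeroes (so the number is an upper bound), and launching one dd per sector is far too slow for a whole partition, so try it on a small slice.

```shell
# Count 512-byte sectors in a file that contain nothing but zero bytes.
count_zero_sectors() {
    image=$1
    total=$(( $(wc -c < "$image") / 512 ))
    zeros=0
    i=0
    while [ "$i" -lt "$total" ]; do
        # Strip the NUL bytes from one sector and count what is left.
        nonzero=$(( $(dd if="$image" bs=512 skip="$i" count=1 2>/dev/null | tr -d '\0' | wc -c) ))
        if [ "$nonzero" -eq 0 ]; then
            zeros=$(( zeros + 1 ))
        fi
        i=$(( i + 1 ))
    done
    echo "$zeros of $total sectors are all zeroes"
}

# Try it on a 4-sector sample: data, zeroes, data, zeroes.
sample_slice=$(mktemp)
{
    head -c 512 /dev/zero | tr '\0' 'A'
    head -c 512 /dev/zero
    head -c 512 /dev/zero | tr '\0' 'B'
    head -c 512 /dev/zero
} > "$sample_slice"
count_zero_sectors "$sample_slice"
```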

If you have any other tips and tricks, feel free to throw them together here. And if you have any questions, don’t hesitate to ask. I’d be glad to impart my vast (snicker) wisdom.

Comments

  1. says

    Rule #1 is to have at least 3 copies of all critical data. Rule #2 is to get one of your copies from those. Assuming you’ve violated rules #1 and #2:

    Use dd on the ‘c’ (whole drive) raw partition. I.e.: dd if=/dev/rsd7c of=/tmp/whatever

  2. freetheworldof says

    I personally use ‘EaseUS Data Recovery’ from easeus.com and/or ‘Spinrite’ from grc.com.
    Much easier.

  3. says

    I used to do it this way; I have a whole Linux workstation for such jobs. Then I bought a $50 direct cloner and my life has never been easier. Cloned a TB drive overnight, ran fsck on it, and had a bootable drive. Even used it on a bunch of Windows drives; afterward chkdsk at least made the surviving data readable.

  4. says

    This is probably less sketchy than what I did in the past, though from what I understand, it probably wouldn’t have worked. A friend at school kind of assumed he would have to just start over, when in the midst of writing a paper, his daughter knocked his laptop off the couch and onto their laminate floor (he didn’t blame her, *he* was the one who left it on the couch). The HDD clicked when I connected it to my laptop, which I had booted into Linux, and failed utterly to even register – except as an unknown device.

    I ended up hermetically sealing it with a food saver sealer and putting it into the freezer for a couple hours. While it was in the freezer, we cranked up the AC in his apartment (so condensation wouldn’t start to form immediately upon removal from the bag). I connected the drive to the pertinent bits of an external drive case I had disassembled (trying to ensure it would be completely in the open for heat dissipation) and plugged it into the laptop with Linux running – after smacking it against my palm. It opened up immediately and, speedily like the ninja, I got into “users”, his name, “my documents” and salvaged his goddamned paper! I managed to get his entire documents folder, and most of his music saved as well. Unfortunately it started making an even uglier clicking sound than before and within moments made a *really* ugly sound, but I got what he really needed and some shit he wanted.

    Not terribly elegant, but it worked and made me feel rather badass.

  5. CompulsoryAccount7746, Sky Captain says

    Important: Mount that drive image readonly! Even make a backup of the image while you work in case the first gets mangled.
     
    For dealing with partition table issues in general, I’m fond of disktype (prints exhaustive info) and gpart.
     
    In the worst situations when the filesystem itself has been damaged, and repair tools fail to reconstruct it (reverting to the backup after each attempt), there’s still the most desperate method of all: “data carving”.
     
    That’s when you scour the drive for bytes that resemble headers of common filetypes and essentially dd them out of the unindexed wreckage that used to be a directory tree.

  6. left0ver1under says

    Since suggestions are welcome, here are some that might be worth trying:

    Roadkil’s Unstoppable Copier
    http://www.roadkil.net/program.php?ProgramID=29

    TestDisk
    http://www.cgsecurity.org/wiki/TestDisk

    According to their websites, this is what they do:

    * Unstoppable Copier (various windows versions are available) is designed to keep reading files off a damaged drive, regardless of read errors or the types of errors.

    * TestDisk (a DOS program) is for recovering lost partitions and files.

    I haven’t used these programs myself because (not being a smartass) I keep data losses from happening. Twenty-eight grams of prevention is worth 454 grams of cure. Here’s what I’d suggest to anyone, not just the person with the damaged HDD:

    (1) Only keep your OS and installed programs on your PC, never your data. Always keep data on external USB hard drives – it’s easier to get at the disks, you don’t have to disassemble the computer. And in a directory on a backup drive, keep copies of all your favourite programs and utilities, the installers or zip files. It will make setting up again much faster.

    (2) Back up the data automatically about once a week or once a fortnight. There’s software that will do it for you (i.e. only the files that change), but even a manual drag-and-drop from drive to drive is good enough unless it’s a large HDD.

    There’s probably better software, but microsloth’s SyncToy will do it:

    http://www.microsoft.com/en-us/download/details.aspx?id=15155

    (3) Multiboot (aka Dual boot) your computer. When you have more than one OS on the computer (FreeDOS and windows 7 on mine), it’s usually possible to recover data on the failed OS’s partition using the primitive OS. It may slow down cold bootups, but that’s what hibernate is for, to avoid cold boots.

    Multibooting is a good reason to keep (copies of) installation CDs from older OSes. However, you cannot use two NT OSes on the same computer (NT, XP, Vista, 7, 8), they will conflict with each other. Pre-NT OSes (95, 98, ME – why?) will not cause conflict.

    (4) Make a hard drive image. This is best used after a system recovery and installation of software. It will copy your entire hard drive and its contents, and can copy an entire working installation onto a computer, software, OS and all.

    (5) Have disk recovery software and tools ready ahead of time, and know how to use them. I still keep an external USB floppy disk drive and write protected boot disks (plus a freeware NTFS reader for DOS) just in case everything goes wrong, plus a bootable USB flash device with FreeDOS. When all else fails, DOS doesn’t.

    This is one that I have used and I guarantee that it does work – how to create a bootable USB flash drive, and the files needed to do it:

    http://www.sevenforums.com/tutorials/46707-ms-dos-bootable-flash-drive-create.html

    And DuWayne, I like and believe your report.

  7. kemist, Dark Lord of the Sith says

    Rule #1 is to have at least 3 copies of all critical data.

    Or, keep everything of importance on the cloud.

    Also, is there a specific reason to mount that hard drive on a USB drive rather than a SATA ? USB must serialize all that data and reparallelize it at the other end – this takes time.

  8. CompulsoryAccount7746, Sky Captain says

    @left0ver1under:

    I still keep an external USB floppy disk drive and write protected boot disks (plus a freeware NTFS reader for DOS)

    That would’ve been my advice eight years ago but Knoppix is quite comfortable with NTFS these days.

    If a Windows environment is absolutely necessary, there’s the Ultimate Boot CD, too.

  9. fwtbc says

    This is my general approach to things, too.

    For NTFS specifically, I’d look at the ntfsprogs set of utilities and see if ntfsclone has options to not bail out on errors for recovery purposes. Using a utility like this is probably a better option because if it can read the file allocation map and then just copy the blocks that are actually used, it could greatly reduce the time taken and the load on the drive.

    But dd is a good FS-agnostic approach.

    If you’ve got the pv utility installed, you can use it for a nice progress bar. It’s just an input/output pipe like cat, so you can do things like:

    # pv hugefile.tar.bz2 | tar -jxf -

    and get a nice ascii progress bar + speed statistics/ETA. If it’s taking input via a pipe, then it won’t print a progress bar unless you tell it how much data it should expect with the --size option.

    # dd if=foo bs=1024k count=100 | pv --size=104857600 > bar

  10. docsarvis says

    Here is another vote for Data Rescue. I have owned a registered copy for at least 10 years, and it has recovered several friends’ drives. My drives? I have backups of my backups, and backups of those.

  11. says

    When it comes to backing up larger files (ie. media files and ISO images), I backup to a drive inside the box that I keep disabled except when backing shit up. Document files and pictures go into multiple clouds (as well as the offline drive). Really important things get written to DVD and saved to flash drive, stored in a fire resistant lock box. Honestly though, clouds are really the way to go. I use multiple computers on a regular basis, including public terminals at school (when I need to print something). Clouds make everything so much easier. And as much as I dislike MS, I really like being able to run a power point directly from the cloud.

  12. kemist, Dark Lord of the Sith says

    Also, linux trick if you actually want to use that computer while it’s crunching your data :

    You put “&” at the end of your command. This launches the job in the background. Then the NICE level (priority) of the job can be adjusted according to whatever else goes on on the computer. It also continues if you log off.

    All people who’ve done things like molecular dynamics (where jobs commonly last for days or weeks) learn that trick.

    Using a utility like this is probably a better option because if it can read the file allocation map and then just copy the blocks that are actually used, it could greatly reduce the time taken and the load on the drive.

    If the file allocation map is still readable. And depending on how it reads them, it might be slightly more risky with a soon-to-fail HD. Data blocks for a single file are not necessarily contiguous. This means a new seek and lots more HD spinning for each block. It slows down the copying (searching for a block is an added delay) and puts more stress on the HD controller.

  13. kemist, Dark Lord of the Sith says

    And as much as I dislike MS, I really like being able to run a power point directly from the cloud.

    There are other options.

    Dropbox is one. Or GoogleDrive. You can run and modify your files in these. And share them in groups, and on all your machines, tablets and phones included.

  14. kemist, Dark Lord of the Sith says

    (2) Backup the data automatically about once a week or once a fortnight. There’s software that will do it for you (i.e. only the files that changes), but even a manual drag-and-drop from drive to drive is good enough unless it’s a large HDD.

    Might as well buy a RAID controller. They’re cheaper now.

    But, the cloud these days is safer, simpler and cheaper than those two solutions. Multiple copies in different geographical locations. That’s the strategy data centers have adopted, now available to everybody.

  15. Arion says

    I take it nobody here has ever heard of Hiren’s Boot CD? I’m extremely disappointed Jason. You of all people should know about this :P. Anyway, it’s the fastest, easiest OSS to work with the low level software on your HDD. No worries about whether or not it’s dead, the utilities, to scan it, clone it, and, if possible, repair it are all on one CD.
    It even has a nice little lite version of Windows for those who are afraid of GUIs, with most of the same software.

  16. says

    Most of the good advice has been given, I see, but I still have some tidbits to share.

    kemist, Dark Lord of the Sith @ 12:

    You put “&” at the end of your command. This launches the job in the background.

    On the terminals I’ve worked with, logging out would usually kill the job if you only did that. I usually follow up with “disown %X”, where X is the job ID of the process (usually 1, check with “jobs”). That detaches it from the current terminal, and guarantees it’ll keep running once you log off.

    This does make it a pain to get status info, though. I will either redirect the output to a file (“> FILE 2>&1” before the “&”) then use “tail -f FILE” to follow along, or throw the entire command into its own screen session (“screen -dmS LABEL COMMAND ARGUMENTS”) and detach/reattach as needed.

    Might as well buy a RAID controller. They’re cheaper now.

    I prefer software RAID. The performance loss is minimal, but it’s a helluva lot easier to get back your data if something goes sideways. I may be a bit out of date on this, but hardware/hybrid controllers tend to use their own proprietary formats when managing RAID data, sometimes even on a model-by-model basis. Getting your data back may mean tracking down the exact same hardware you used, years after it’s been discontinued.

    The cloud is an excellent option, though. If you feel uncomfortable letting other computers store your data, run it through “gpg” first. Problem solved!

  17. says

    Kemist –

    I use MS Office exclusively. I regularly need to turn in papers, power points and spreadsheets in MS Office formats. I just don’t have the time to screw around with Libre or OO. Not that this is their fault (bloody damned MS offers the option to save in OO, assholes!), just that the end result is that I use MS Office.

  18. says

    Might as well buy a RAID controller.

    Why? I pretty much exclusively use Gigabyte boards, but assumed that onboard RAID control is pretty much standard. I prefer to drag and drop and disable my back-up drive, but that is because large file back-ups are an infrequent occurrence for me, pretty much only for when I rip blurays or buy new music or audiobooks. It is apparently quite easy though, to install drives in RAID with Gigabyte or Asus boards. I kind of assumed that was the case for most boards produced in the last 6 years or so.

  19. kemist, Dark Lord of the Sith says

    I use MS Office exclusively. I regularly need to turn in papers, power points and spreadsheets in MS Office formats. I just don’t have the time to screw around with Libre or OO. Not that this is their fault (bloody damned MS offers the option to save in OO, assholes!), just that the end result is that I use MS Office.

    I never bothered to buy MS Office. I did use OO for some time, then was “forced” to learn Latex for some coursework. Haven’t gone back to OO or MS Office since.

    If you get some free time, one thing to consider would be to try and learn Latex. You can produce very high quality stuff, pdf format, with much less hassle. That is, after you have spent some time playing with it.

    I kind of assumed that was the case for most boards produced in the last 6 years or so.

    Last board I bought was from Asus, about 5-6 years ago, and I chose the RAID option (haven’t actually implemented it for data protection though, I preferred to have two separate disks under different OS at the time). Haven’t checked if it’s more or less common since.

  20. F says

    External drives: Make sure the drive is meant to stay on if you plan to leave it on. I’ve seen people lose data because these drives crap out when left running constantly. Particularly the ones sold as “backup drives”.

  21. says

    Most of what everyone is saying is beyond me – I can add that so far, almost every file Jason’s recovered is in great shape. Without going into personal detail, there are some family photos that are more than just nostalgia to have recovered and it’s amazing to find them again. Much of what’s on there is pre-Facebook, so they weren’t even uploaded anywhere.

    Some look like they’ve been artfully collaged together, but even those have areas we can crop.

    After that 3rd-hand laptop died, I bought a brand-new pc, the first new, up-to-date computer I have ever had in my life. And I bought a back-up hard drive and back-up regularly. That computer’s motherboard has since died, and everything was restored from the backup. We’re a lot more careful now.

    Jason, can’t thank you enough for all your hard work and for getting so much off the drive. This is like Christmas and winning the lottery rolled into one.

  22. says

    kemist –

    The problem is that I am usually required to turn in coursework in the MS format. Instructors want to be able to open stuff with MS office for grading/adding notes. And at this point I just don’t care. There are some perks to dealing with MS, MS mathematics (both the base software and the office plugin) being a huge one. It makes it really easy to type my math homework and show my work. I know you can do that with OO, it’s just a lot easier with Mathematics. The other nice perk is that you can run Power Point as a web app – making last minute adjustments and then run the PP itself, regardless of whether the machine running it has PP installed.

    And as much as I would prefer to be done with Windows, Win8 is actually a significant improvement, if for no other reason than it doesn’t come weighed down with so much crap. But the framework itself is a lot sexier – a lot lighter. By itself it uses about 20% less system resources and allocates resources a lot more efficiently. I ran an upgrade install over a 20 day old Win7 clean install. With 7, Windows itself and the apps I use frequently enough that I leave them open were using roughly 39% of my memory (running Chrome with several tabs open, IE with a couple of tabs open, Minitab, MS mathematics and Word). After installing Win8, memory usage with the same applications running was down to 20%. Also, with Win7 cold booting took between 15-20 seconds. Win8 boots essentially instantly after posting, boot time is less than five seconds.

    F –

    I don’t see how there would be a problem with external drives, unless the controller fails – in which case the drive itself and the data is still intact (though you should probably use Linux to try to access it). But I have an external drive that has been running more or less steadily for three years now and had no problems. I have a couple others that are regularly switched from machine to machine, even switching drives with some frequency. The drives themselves only spin up when you access them.

  23. says

    Nice tutorial, bookmarking this for my WSHTF collection.

    Rather than just ls’ing the file with your watch command you can get more detail by telling dd to print its stats instead. dd will return the status report you normally only see on completion if you send it the USR1 signal. Either get dd’s PID and use kill or change your watch to something like:

    watch -n X pkill -USR1 dd

    … and you’ll get progress and detail on total volume copied and current speed in kB/s every X seconds.

    You could also run both from within screen rather than just backgrounding; handy if you’re remote. Certainly wouldn’t hurt to nohup any long-running dd too if you’re not in a screen session and want to be extra paranoid about accidentally logging out.

  24. says

    Nice catch, Jon. Had no idea dd responded to -USR1. Guess it pays to read the man files.

    Arion: No need to be disappointed, I have Hiren’s Boot CD, Dave’s Ultimate Boot, and Bart PE all burned to disks in my last-resort pile. But I always go with booting from my Ubuntu live USB as my first resort, a Knoppix CD as my second.

  25. says

    So what about getting data off ZIP disks? A friend just bought a new reader and still can’t open several disks. Is it likely to be possible to retrieve the data using Linux?

  26. F says

    DuWayne:

    Yeah, I mean that a lot of people buy drives which are intended for backup or quick transfers, then leave them on all the time. Various parts in these aren’t rated for continuous/many hours of operation. Plenty of external drives, however, are. They are the same as internal drives, in an enclosure. Sure, plenty of times all you have to do is pry the standard drive out and put it in a new enclosure/computer. But when the problem is damaged data, not due to a drop or something, I find a lot of hardware people saying that you shouldn’t leave such-and-such a drive running continuously. (Kind of like the “Don’t leave an optical disk in the drive all the time.”)

    I’ve just seen this come up in support forums too many times not to mention it.

    And sure, I would certainly try dd (which, actually, you can get for Windows if that’s what you want; it should already be available on OS X, I’d think). But if you want to read the drives to display contents, and your current OS is just giving you trouble, sure, try a Linux live CD like knoppix (sometimes older versions are better if the latest doesn’t work).

    If you are using Windows here, it cannot read non-Windows filesystems, so if the disk was formatted for Mac or Linux, yes, definitely try a Linux distro.
