“Hey, Marcus, those hard drive recovery services: how good are they?”
The answer is: pretty good.
The longer-form answer is: it depends on whether the drives you were using are still available on Ebay. How old is your storage?
The snappy answer is: it’s an IQ test. If you need a drive recovery service you don’t understand computers well enough to own one, and you failed the IT IQ test.
Working in computer security, I get that sort of question all the time. The worst is when it’s something like a cryptolocker virus. Cryptolocker takes over your hard drive, encrypts it with a random key that it sends to its controller somewhere out in the cloud, and then you can buy the key back for a bitcoin (about $600 right now).
Hollywood Presbyterian Medical Center paid a $17,000 ransom in bitcoin to a hacker who seized control of the hospital’s computer systems and would give back access only when the money was paid, the hospital’s chief executive said Wednesday.
Apparently Hollywood Presbyterian Medical Center has no idea how to use computers. That should be comforting to their patients. $17,000 in bitcoin gets their data back but doesn’t solve the basic problem that apparently they have no backups, automation of system administration, or disaster recovery plan. When I say “don’t know enough IT to operate a computer” that’s what I’m talking about.
The cryptolocker attack doesn’t apply to hard drive recovery – since the data is overwritten with an encrypted copy, there’s nothing a drive recovery company can do for you. But, if you have good backups, you’ll just laugh at cryptolocker.
Let me teach you Marcus’ iron laws of data:
- If you don’t have three copies of it, you barely have one.
- If your data backup approach doesn’t have some scheduled way of making sure your files are preserved, you will neglect them and you will later regret neglecting them.
- Hard drives get old.
- Segregate disposable from important.
- Recovery is much more expensive than backups.
I’m going to walk through those points and lay out my home system backup approach. It’s not expensive, onerous or complicated. And I have only lost data twice in my life (so far!):
- The first time, when a Seagate 20meg hdd blew up on me and two months’ worth of coding on my first ever consulting project blew up with it. That was when I decided to get serious about not losing data ever again.
- The second time, when I left a company where I worked and, because I was afraid they might sue me, I turned over all the media I had that held data (Hillary Clinton, that’s how you’re supposed to do it) and asked them to be careful with it. In a fit of pique they dumpstered the drives – including email archives going back to when I first got on the internet in 1981. They were trying to irritate me and they succeeded beyond their wildest imaginings.
Point #1: Have 3 Copies
The first copy is your working copy. In my case, that’s my work desktop (I have a game desktop that is 100% disposable) which has all my emails, code, reports, invoices, 15 years’ worth of digital photography and video, CD-ROM images of rare applications from 1985, etc. It is divided into 3 drives, each of which holds about 4tb. In this case, size doesn’t matter; I’m just bragging – it really is what you do with it that counts.
I have 2 external USB hard drives for each of the drives. That’s a total of 9 drives, in other words. Whenever I need to increase the size of one of the 3 drives in my system I buy 2 more USB hard drives and copy the internal drive to the newer, bigger drives, then replace the internal drive with the larger one, copy a copy back, and move on.
External drive #1 sits, powered off, in a pelican case on a shelf in my office. It is powered off and in a pelican case in honor of my friend Olaf, a fine photographer, who had a system (I built for him back in my case-modding phase!) with pull-out drive racks and all sorts of nice stuff, and a goodly array of USB backup drives that were powered up when the goon down the hall in the building he shared managed to trigger the sprinkler system. Powered up hard drive meets water, data goes “bye!” Olaf called me and asked “Hey, Marcus, those hard drive recovery services: how good are they?” and I told him:
“Think of this as an opportunity to re-invent your photography.”
Another reason to keep it powered off: if some hacker gets into your system and decides to overwrite your drive (or you get cryptolocker) it can’t touch the powered-off drive sitting on the shelf in the pelican case. This is a Sun Tzu-approved technique I call “security through pure laziness.”
External drive #2 sits in a safe deposit box at my bank, 15 miles away. The nice thing about bank vaults, other than that they’re probably pretty hard to get into, is that they all have fire suppression systems. Any disaster that wipes out both my house and the bank … well, I live 2000 feet above sea-level in a stable zone, it’d probably be a disaster so bad I’ll be busy for a while and may not even survive. I’ll be ready to re-invent myself after that, I bet.
Point #2: Trigger Your Backup Schedule
About once a month I go to town and get groceries, deposit checks, whatever. Most of the time, I try to synchronize my external drives (remember: I only have one at home at any given time) with the internal drives. Not every drive needs to get synchronized every time; I have a pretty good memory for that, and it doesn’t take very long anyway, unless I’ve been ripping a lot of movies or music, doing photo shoots, or collecting game-play videos (easy to eat 100gb that way).
Make sure you label your backup drives. I use great big labels and a post-it note. The label says what it is: “b-photo” or whatever, and the post-it is the last time I synchronized that particular drive.
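If you would rather not trust post-it notes and memory, the same bookkeeping fits in a few lines. This is only a sketch of the idea: the `overdue` helper, the 31-day threshold, the "b-work" label and the dates are all made up for illustration.

```python
from datetime import date

def overdue(last_synced, today, max_age_days=31):
    """Return labels of drives whose last sync is older than max_age_days."""
    return sorted(
        label for label, when in last_synced.items()
        if (today - when).days > max_age_days
    )

# The "post-it notes": drive label -> date of last synchronization.
# Labels and dates here are hypothetical examples.
notes = {
    "b-photo": date(2016, 1, 4),
    "b-work": date(2016, 2, 10),
}

print(overdue(notes, today=date(2016, 2, 18)))  # prints ['b-photo']
```

Here "b-photo" was last synchronized 45 days before the check, so it is the one that needs a trip to the bank.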
It takes more time to describe it than to do it.
Synchronizing drives: do it however you want to. Some operating systems have built-in tools for it. If you’re trapped in Windows-land you can use Microsoft SyncToy. I use Funduc’s Directory Toolkit because it has worked fine for me since 1994 or so.
I like to manually synchronize, because of my friend Norm. Norm had a super cool backup system – it was a network attached storage server running FreeBSD and ZFS, which lived in his studio – a separate building – a couple hundred feet from his house. He set up his systems using some UNIX tools to automatically synchronize his desktop with the storage server. It was a very spiffy arrangement, until the time he was installing a new drive and zeroized the file system on the wrong drive and the automatic synchronizer process he’d set up mirrored the empty filesystem onto the storage server. (Hint: rsync is not always your friend)
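If you do automate, one way to take the edge off Norm’s failure mode is a sanity check before anything gets deleted from the backup. The sketch below is not what Norm ran and it is not an rsync feature; `safe_mirror` and its `max_loss` threshold are hypothetical names invented for this example, and it assumes a plain file-by-file copy between two directories.

```python
import os
import shutil

def list_files(root):
    """Return the set of file paths under root, relative to root."""
    return {
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, _, names in os.walk(root)
        for name in names
    }

def safe_mirror(src, dst, max_loss=0.25):
    """Mirror src onto dst, but refuse if too many backup files would vanish.

    max_loss is a hypothetical threshold: the fraction of dst's files that
    may be deleted in one run before we stop and make a human look at it.
    Returns (files copied, files deleted).
    """
    src_files = list_files(src)
    dst_files = list_files(dst)
    doomed = dst_files - src_files
    if dst_files and len(doomed) / len(dst_files) > max_loss:
        raise RuntimeError(
            f"refusing to mirror: {len(doomed)} of {len(dst_files)} "
            "backup files would be deleted"
        )
    for rel in src_files:
        target = os.path.join(dst, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(src, rel), target)
    for rel in doomed:
        os.remove(os.path.join(dst, rel))
    return len(src_files), len(doomed)
```

With a guard like this, a freshly zeroed source directory trips the check instead of emptying the backup. It does nothing about files that were silently corrupted, which is a different problem.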
Another trigger for my backup schedule is my travel schedule. I have a laptop. I’ve segregated my email and working files into a separate drive (a small one, about 200gb) and I have a matching 200gb encrypted partition on my laptop. Before I go somewhere, I synchronize the working file area with the laptop. When I get home, I synchronize it back. A nice side effect of that is that I always have a pretty fresh emergency copy of my working files that’s generally no more out of sync than a couple days.
Point #3: Hard Drives Get Old
Solid state drives get old, hard drives get old, even DVDs get old. My old ASR-33 punch tapes got old. The only way to keep your data from ageing away on you is to keep refreshing it to new media every so often. Then throw the old media away. Keeping a forward-rolling archive means you will never find yourself staring at Ebay listings for TRS-80 external floppy drives, and wondering “how can I read this thing?” That means: spend the money on the space and don’t worry too much about organizing it. I have a filesystem called “archives” and I’m pretty sure that if I created it after 1992 it’s in there. It just may take me days to find it.
There is nothing worse than coming to depend on a hard drive and discovering it’s erroring. Most drives give some warning before they outright die. If you get a drive error, now is a good time to re-assess your storage needs!
DVDs get old. Remember Olaf, the guy I told you about earlier? He had stacks of DVDs on spindles and tried to recover some of his lost stuff from those. Unfortunately his old DVD writer appeared to be kind of unreliable, and the disks age just sitting there. He had a lot of data errors and corrupted files.
The way serious storage people do things nowadays, they use a file database as an index to a bunch of hierarchical storage. That’s how Amazon can sell space so cheaply – it’s a tiered array of caches starting with system RAM then solid state drives then hard drives and finally petabyte tape robots. The big storage systems checksum the files when they are accessed and make sure the files haven’t lost a bit somewhere. That’s actually a thing! When you are storing petabytes and hundreds of millions of files, the probability that a system RAM error or a drive mis-read will flip a bit goes up – we’re talking astronomical numbers in a mega-storage array and even things with a very low probability happen all the time.
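You can get a poor man’s version of that checksumming at home. The sketch below is an assumption-laden illustration, not how any of the big storage systems actually do it: it records a SHA-256 hash for every file under a directory, and later reports files whose contents no longer match. The function names are made up for this example.

```python
import hashlib
import os

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hash."""
    manifest = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest

def verify(root, manifest):
    """Return sorted relative paths that are missing or whose hash changed."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of(full) != expected:
            bad.append(rel)
    return sorted(bad)
```

A changed hash cannot tell a flipped bit from a deliberate edit, so this is mostly useful for archives that are not supposed to change, like old photos.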
Hard drives try to tell when they have bad sectors but sometimes they can’t and you’ve now got a corrupted file. ZFS checksums every block it stores and, when it has redundant copies to draw on (a mirror or RAID-Z, for example), reconstructs the correct one if something goes wrong – and it tells you to get a new drive. My friends who play storage harder than I do use ZFS.
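The self-healing idea is easy to illustrate at the file level, which is emphatically not how ZFS implements it (ZFS works on blocks inside the filesystem). This is a toy sketch assuming you have several copies of one file plus a hash you recorded while the file was known to be good; `heal_copies` is a hypothetical helper, not part of any real tool.

```python
import hashlib
import shutil

def file_hash(path):
    """SHA-256 of a file; reads it whole, so only suitable for small files."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()

def heal_copies(copies, expected_hash):
    """Overwrite corrupted copies from the first copy matching expected_hash.

    Assumes every path in copies exists. Returns the list of paths that
    were repaired. Raises if no copy matches the expected hash.
    """
    good = [path for path in copies if file_hash(path) == expected_hash]
    if not good:
        raise RuntimeError("no intact copy left")
    repaired = [path for path in copies if path not in good]
    for path in repaired:
        shutil.copyfile(good[0], path)
    return repaired
```

The checksum is what makes this work: without it, you have several copies that disagree and no way to know which one is right.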
Point #4: Segregate Data
If you’re installing an operating system carve a partition out of the drive and put all the O/S and your apps there. Then carve another partition for your working files. My home hard drive is carved into:
- Work files (back up weekly)
- Temporary files (back up weekly)
- Music (back up once a month)
When I synchronize the drives I synch work files and music and that’s it. If my O/S install takes a bullet (it’s Windows, so that happens about once a year) I don’t have to worry about it – I just blow the contents of that partition away and reinstall Windows then re-populate the other partitions.
If you get your personal stuff mixed in with Windows or whatever you are going to have to start backing up that crap, which means you’re backing up the temporary install files, yadda yadda all the glarp the O/S wants to pull down. Don’t do that. Set “my documents” to point to your own partition and don’t ever deliberately store anything in C:\Windows.
Consider taking segregation to the point of physical separation. My systems are now built with two 100gb Kingston SSDs in a mirrored drive configuration. I boot Windows on SSD and my machine runs like your favorite metaphor for “really fast.” That way, the SSDs can wear out and blow up or whatever and/or if I want to do a Windows upgrade I just pull the old SSDs, do the upgrade onto new SSDs, and when that’s all done I put the old SSDs in an external case and use them for giving movies to friends or upgrading laptops.
The other reason segregating is good is that you get to decide what to take with you, what to leave behind, and when.
Point #5: Recovery Is More Expensive Than Backups
Hard drive recovery is a really cool thing. Unless you’re the person who needs it. Hard core hard drive recovery shops do things like buy duplicates of a damaged drive, then swap the logic boards to see if that’s the problem. If that doesn’t work they go into a clean-box and dismantle the platters and try to move them to a donor drive. Sometimes it even works. It’s really cool when it does.
Let’s look at some costs:
- Pelican Case, $32 – keeps the USB external drives dry and safe. Looks cool. You can use a Zero briefcase with cut foam if you’re really styling. Think waterproof.
- Triplicate Storage, $350 – That’s the cost of a triplicated 4tb, approximately. Seagate external 4tb expansion drives (right now) are $109. So you need 2 of those and an internal 4tb drive.
- Safe Deposit Box, $40/year – I have one of the large ones and it turned out to be handy for storing car titles, deeds, stock certificates, and blackmail materials. Just kidding about the stock certificates.
So for less than $500 you can have 4tb of guaranteed storage. You can skip the safe deposit box if you have a friend you see once a month – just swap the pelican case and you’ll look like a spy or something cool like that. You can be someone else’s offsite backup.
If you have better internet bandwidth than I do, you can get cloud backup solutions that just work, and work great. Those services are about $20/month, or about $240/year.
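For a rough comparison, here is the arithmetic on the numbers above. It takes the prices quoted in this post at face value and assumes you do not replace any hardware during the period, which is optimistic given Point #3; the function names are just for illustration.

```python
# Prices quoted above, in dollars.
PELICAN_CASE = 32
TRIPLICATE_4TB = 350
SAFE_DEPOSIT_PER_YEAR = 40
CLOUD_PER_MONTH = 20

def local_cost(years):
    """One-time hardware plus the yearly safe deposit box fee."""
    return PELICAN_CASE + TRIPLICATE_4TB + SAFE_DEPOSIT_PER_YEAR * years

def cloud_cost(years):
    """Cloud subscription at the monthly rate, no hardware."""
    return CLOUD_PER_MONTH * 12 * years

for years in (1, 2, 5):
    print(years, local_cost(years), cloud_cost(years))
```

That works out to $422 against $240 for the first year, $462 against $480 after two years, and $582 against $1200 after five. On these assumptions the drives-and-bank approach pulls ahead somewhere around the two-year mark.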
I do not know what the automatic cloud sync things do with cryptolocker (which overwrites and encrypts files in place) – my guess is that you get a copy of the encrypted data synchronized to your cloud backup, but then you have the option of going back to retrieve versions from before the attack. You hope.
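The “go back to a version from before the attack” idea only works if old versions are never overwritten. Here is a minimal sketch of that principle, not a description of what any particular cloud service does: every backup run goes into its own dated folder, and restoring means picking the newest folder older than the attack. The function names and the timestamp format are assumptions for this example, and copying everything every time is wasteful compared to what real tools do.

```python
import os
import shutil
import time

def snapshot(src, backup_root, stamp=None):
    """Copy src into a new dated folder under backup_root; return its path.

    Earlier snapshots are never touched, so files encrypted in place after
    a snapshot was taken cannot overwrite the clean copies in it.
    """
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = os.path.join(backup_root, stamp)
    shutil.copytree(src, dest)  # raises if dest already exists
    return dest

def restore_before(backup_root, cutoff_stamp):
    """Return the newest snapshot folder whose name sorts before cutoff_stamp."""
    stamps = sorted(s for s in os.listdir(backup_root) if s < cutoff_stamp)
    if not stamps:
        raise LookupError("no snapshot predates the cutoff")
    return os.path.join(backup_root, stamps[-1])
```

The catch is the same one as with the powered-off drive on the shelf: the snapshots have to live somewhere the attacker’s software cannot write to.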
One Last Thing: More Segregation
If you sell a device or give it to someone else, be careful what happens to the existing storage in the device. Back in the early 00’s I did some consulting for a company that produced cell phone storage encryption. We bought a bunch of used phones on Ebay and nearly every one of them had something interesting on it. Only one out of the dozen we got was wiped (and it was just formatted using the O/S’ format command, which is reversible with forensic software). I have purchased used laptops on Ebay and found people’s files in them. It’s disappointing how banal most people’s data is.*
The only way to keep that sort of thing from happening is to destroy the disk or encrypt your stuff and then de-key the encrypted volume when you hand the device to someone else.
On my laptop, I use Truecrypt (Free, but discontinued and getting hard to find) to create a virtual volume that’s the size of my working files partition. That way, I have C:\windows and inside C:\windows\tc I have a 200gb Truecrypt volume. If I ever give someone my laptop all I need to do is delete that volume to free up the space – as long as the recipient hasn’t got the key and the volume is unmounted, I don’t need to worry about wiping the drive.
I do this for data segregation more than security but I do cross a lot of international borders and I don’t like DHS’ habit of occasionally looking on people’s hard drives. So if you dismount the Truecrypt volume, it’s just a big file and if DHS images your disk they’ll think you’re probably into child porn or drugs and they’ll freak out, but they won’t notice the big file right away and you’ve probably passed on through. Virtual volumes are a great way of segregating your data. Since I work in information security, I periodically get annoyingly helpful people lecturing me about Truecrypt’s flaws**, and I have to patiently tell them that – for me – it’s a system administration tool and my way of securing customer data is to keep it off my laptop.
At my first real job, I was a user support consultant at a university, and had taken to hanging out with another guy, Kevin, who worked in the CS department, who was pretty cool. One day a woman came into my office, tears pouring down her face, incoherent. It turns out she had been trying to format a floppy disk Format B: and had typed Format C: instead. The hard drive had the only copy of her mostly-finished dissertation on it. “Distraught” was an understatement. Kevin had a copy of Norton Utilities and we went over and looked at her hard drive – it turned out that she had stopped the format before it got very far – it had just wiped the file allocation table and a bunch of the data blocks. Kevin used Norton to reconstruct a file allocation table consisting of the blocks of the document, and eventually re-arranged the document into the correct order. He was the great hero of the hard drive that week, for sure! If she’d had backups, Kevin never would have had such a chance to be a star. By the way, when I’m editing a document I really don’t want to possibly lose, I periodically just email myself copies. That way my in-box has a nice sequence of the various revisions of the document.
(* Anyone want a framework for a novel? Someone buys a laptop on Ebay, gets it, and discovers there’s data on it. The data appears to include video of someone being murdered by the police. The hero falls into a web of awfulness as they try to figure out the owner of the laptop, and what happened. Stay tuned…)
(** Infosec practitioners tend to be very big on secret squirrel “NSA backdoor” legends. In the case of Truecrypt, the author appears to have come under pressure from the US Government to put a backdoor in, refused, been threatened with dire consequences, and dropped developing the software entirely.)