After the Yahoo! disclosure, some of the other large providers fell all over themselves to deny doing the same: “we didn’t!” “no, not us!” etc.
Methinks they doth protest too much.
The FBI has issued over 300,000 national security letters since 2006. They’re a basic form letter saying, pretty much: “give us all the transactional data you have regarding this email address.” They specifically are not asking for message contents. They’re asking for information that’ll allow the establishment of entity relationships: webs of communication. Who talks to whom and when.
This is the “metadata” that you have heard so much about. It’s all tied to a technique called “Traffic Analysis” – a term of art that was classified until the mid 1980s.* The US National Security Agency’s basic course in cryptanalysis included a course on the topic starting in the 1950s; it was a 200-level advanced course in which the students were expected to completely compromise the communications of the mythical nation of Zendia, using a series of encrypted intercepts.**
One of the crypto-heads I used to work with at Trusted Information Systems had actually gone through all the exercises and “solved” the game. The cover of the book tells you much of what you need to know: you assemble a map of the entities that communicate, how often they communicate, the message/response patterns (do they only originate messages, or do they exchange them?), and the message sizes (short messages may be acknowledgements, longer messages may be orders). The resulting map shows an organization’s communication patterns and – hopefully – its org chart. If you look closely at the cover image you can possibly infer that DKT, the node at the upper right, is the commander and LRJ, TV1, and MPA are field commanders, etc.
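The map-building described above can be sketched in a few lines. This is a minimal illustration, not the method taught in the Zendian course: the intercept list is invented (only the callsigns DKT, LRJ, TV1 and MPA come from the cover; QX7 is made up), the 100-character cutoff for an "acknowledgement" is an assumption, and "whoever originates the most long messages is the commander" is just one plausible heuristic.

```python
from collections import Counter, defaultdict

# Hypothetical intercept log: (sender, receiver, message_length).
# No message content is needed, only who, to whom, and how much.
INTERCEPTS = [
    ("DKT", "LRJ", 900),
    ("LRJ", "DKT", 40),
    ("DKT", "TV1", 850),
    ("TV1", "DKT", 35),
    ("DKT", "MPA", 920),
    ("MPA", "DKT", 45),
    ("LRJ", "QX7", 300),
    ("QX7", "LRJ", 30),
]

SHORT = 100  # assumed cutoff: shorter messages are treated as acknowledgements


def build_map(intercepts):
    """Count links and per-node long/short traffic without reading any content."""
    links = Counter()            # (sender, receiver) -> number of messages
    long_sent = Counter()        # node -> long messages originated
    short_sent = Counter()       # node -> short messages originated
    peers = defaultdict(set)     # node -> everyone it talks to
    for sender, receiver, length in intercepts:
        links[(sender, receiver)] += 1
        peers[sender].add(receiver)
        peers[receiver].add(sender)
        if length < SHORT:
            short_sent[sender] += 1
        else:
            long_sent[sender] += 1
    return links, long_sent, short_sent, peers


def likely_commander(intercepts):
    """Guess the top node: the one that originates the most long messages."""
    _, long_sent, _, _ = build_map(intercepts)
    return long_sent.most_common(1)[0][0]
```

With this made-up traffic, `likely_commander(INTERCEPTS)` picks DKT, because DKT sends long messages to three nodes that answer only with short ones. Real traffic would need timing and volume over many days before a guess like that means anything.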
You can imagine how easy it is to build such webs using internet technologies, if you’re able to get a feed behind your target’s encryption envelope. Kieran Healy does a delightful demonstration of Trafficke Analysis of the Sons of Liberty which identifies Paul Revere as a major node.
A National Security Letter looks like this:
Where it gets tricky is that the big part that’s blocked out is the communications target. So the interesting metric is not how many NSLs have been sent but rather how many records have been returned in response to them. We know that, in the case of phone service providers, express shipping services (UPS and FedEx), bulk mail (USPS), Yahoo!, and many web sites, the answer appears to be “all of them.” The PATRIOT Act included “money laundering” provisions for “anti terrorism” that require banks to report transactions over $10,000, and the credit card companies have been suspiciously silent. Very, very, very silent.
Take all that data and you can build some interesting applications. For example, if you’ve spent $400 with www.cheaperthandirt.com and gotten a 20lb package via UPS with an ORM-D endorsement, they know you bought ammunition, probably about 1000rd. American gun nuts haven’t put 2+2 together and realized why there’s no need for a gun registration database: what else do you buy from Bushmaster that costs $1500 and ships with an insurance declaration, signature required, to the address of someone with a Federal Firearms License (FFL)? Just keep thinking about this: the internet providers are the tip of a 300,000 NSL iceberg. When you start integrating all that data, you don’t need all the contents: you just subpoena it from one place and infer the rest.
I used to hang out with a photographer, a bunch of years ago, who had a bit of a fondness for cocaine. Once when I was at the studio, he was txting with someone about something, then announced to me that he had just scored his weekend’s entertainment.
Me: “But…. but…. txt messages aren’t secure!”
Him: “I didn’t call it cocaine. I said I was ordering a pizza.”
Me: “Yeah, but when they figure out he’s dealing, they’ll pull all his call records, and you’ll be in there ordering a pizza every Friday afternoon.”
Analysis of that sort used to be my “thing” (specifically: looking at system log data to identify the penetration-point and activities of system breaches) and it’s a straightforward process of putting together the information and a timeline. The example I gave above, of buying a gun: what’s the time-line going to look like for that? And, if you take that sequence of events and trawl through a large amount of data, how many other gun-buys are going to fall out? The NSA has market data that most companies would sacrifice babies to get their hands on.
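A rough sketch of that timeline-and-trawl process, under loud assumptions: the two record sets below are invented, the field names (`who`, `when`, `merchant`, `endorsement`) are hypothetical, and the rule "a charge at a given merchant followed by an ORM-D shipment within a week" is one guess at what such a sequence might look like, not anything known about a real system.

```python
from datetime import date, timedelta

# Hypothetical records from two unrelated sources, joined on a customer id.
CARD_CHARGES = [
    {"who": "alice", "when": date(2016, 3, 1), "merchant": "ammo-shop", "amount": 400},
    {"who": "bob", "when": date(2016, 3, 2), "merchant": "bookstore", "amount": 25},
    {"who": "carol", "when": date(2016, 3, 5), "merchant": "ammo-shop", "amount": 380},
]
SHIPMENTS = [
    {"who": "alice", "when": date(2016, 3, 4), "weight_lb": 20, "endorsement": "ORM-D"},
    {"who": "bob", "when": date(2016, 3, 5), "weight_lb": 2, "endorsement": None},
    {"who": "carol", "when": date(2016, 4, 20), "weight_lb": 20, "endorsement": "ORM-D"},
]


def timeline(who):
    """Merge every record about one person into a single time-ordered list."""
    events = [("charge", c) for c in CARD_CHARGES if c["who"] == who]
    events += [("shipment", s) for s in SHIPMENTS if s["who"] == who]
    return sorted(events, key=lambda e: e[1]["when"])


def trawl(window_days=7):
    """Find everyone with a charge followed by an ORM-D shipment inside the window."""
    window = timedelta(days=window_days)
    hits = set()
    for c in CARD_CHARGES:
        if c["merchant"] != "ammo-shop":
            continue
        for s in SHIPMENTS:
            if (s["who"] == c["who"] and s["endorsement"] == "ORM-D"
                    and timedelta(0) <= s["when"] - c["when"] <= window):
                hits.add(c["who"])
    return sorted(hits)
```

Once the sequence is written down for one person, running it across everyone is just a loop. Here `trawl()` returns only alice; widening the window to 60 days pulls in carol as well, which shows how much the results depend on tuning choices nobody outside ever sees.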
Precomputing and visualizing is the name of the game, and there’s a great deal of research and development devoted to producing analysis software. Palantir (above) is very popular with the surveillance state.****
Pretty much any company that does anything with lots of end-user transactions, or which builds security tools, is going to get an NSL, eventually. In fact, if you don’t, it probably means either a) they found a hole in your crypto and just don’t need to ask, or b) you haven’t got enough customers. There has been some push-back: after 11 years of litigation, Nicholas Merrill – founder of an ISP (Calyx) that received an NSL – finally got court clearance to talk about it. The Electronic Frontier Foundation and other organizations are working through the courts controlled by the “most transparent government, ever.”
Yahoo!, Google, Facebook, Microsoft, LinkedIn – they’ve complained openly. Who hasn’t? 299,995 more, we should suppose.
The whole thing is a trap for stupid bad guys, if that. That’s what’s so frustrating about the gigantic amount of effort and money that’s being spent to develop these systems: they only work retroactively. Like in the example I gave earlier of the cocaine-buying photographer, if you catch a terrorist you can then investigate their friends (because you know who their circle of friends is) – but you’re only going to catch the terrorist once they actually, you know, do something. Sure, it would be a bad idea to try to buy ammonium nitrate and powdered aluminum on eBay (eBay hasn’t said anything about NSLs one way or another) and it’d probably be a bad idea to buy a pressure-cooker on amazon.com (Amazon hasn’t said much).***** Ultimately, though, these systems can only detect repeat instances of past failures. If they are trawling through purchase records for ammonium nitrate + aluminum + pressure cooker, they won’t catch ammonium nitrate + aluminum + scuba tank, or CO2 drink machine tank, or oxyacetylene torch oxygen bottle.
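The blind spot is easy to show in code. The sketch below assumes the simplest possible detector, a fixed set of item names that must all appear in someone's purchase records; the rule, the function name and the baskets are hypothetical, and a real system would presumably be fuzzier, but the limitation has the same shape.

```python
# Hypothetical watch rule derived from one past incident.
KNOWN_SIGNATURE = frozenset({"ammonium nitrate", "powdered aluminum", "pressure cooker"})


def matches(purchases, signature=KNOWN_SIGNATURE):
    """Flag a basket only if it contains every item in the signature."""
    return signature <= set(purchases)


# A repeat of the past incident, plus something irrelevant.
repeat = ["ammonium nitrate", "powdered aluminum", "pressure cooker", "duct tape"]
# Same idea, different container: the rule has never seen this combination.
variant = ["ammonium nitrate", "powdered aluminum", "scuba tank"]
```

`matches(repeat)` is true and `matches(variant)` is false. You can add the variant to the rule set afterwards, but only after someone has used it.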
We should be suspicious of such systems because they don’t do what they pretend to do, namely, protect us. They are, however, very effective at digging up dirt in people’s past. For example, when Eliot Spitzer was busted for prostitution, how do you think the FBI was able to get his credit card records going back for years? When David Petraeus was taken off the political table, just as there were rumblings about him being a possible candidate for office, how was the FBI able to get all of the emails and txts between him and his lover? There’s dirt that’s there for digging and it’s going to be dug according to the prevailing politics of a time. That’s been a primary operational mode of the FBI since its inception. Nixon started the Watergate “plumbers” because the FBI wasn’t giving him the covert intelligence he wanted (we later learned that “Deep Throat” was FBI) – in other words, the FBI was up to its neck in internal politics and was using selective disclosure to break political opponents, including the president. So: they got Spitzer but not Jeffrey Epstein. They didn’t get the Boston Marathon Bomber until it was thoroughly too late.
Privacy has only ever been the prerogative of the rich and powerful (so they can do their shenanigans without others knowing what they’re up to); the idea that the middle class or even the poor should have privacy is a relatively recent invention of the late Enlightenment. Silicon Valley is just continuing the trend of being embraced into the service of the oligarchy. The newly-minted rich dotcommers have a stark choice: your money, or some other guy’s rights.
Warrant Canaries are a ridiculous idea that responds to the ridiculous situation. Since the NSL prohibits you from talking about it, the idea is to post a statement on a provider’s site that says “We still have not gotten an NSL.” Then, as soon as the organization gets an NSL, they have to take it down because the first rule of NSL club is: you don’t talk about NSLs. That’s a clever way of weaponizing the non-disclosure agreement against itself. But what happens when everyone gets NSLs?
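From the watcher's side, a canary is nothing more than a check that a statement is still there. A minimal sketch, with assumptions flagged: the phrase is the example wording from the paragraph above, and the freshness rule (the statement must have been re-posted within `max_age_days`) is an addition of this sketch, on the theory that a page nobody has touched in a year proves nothing either way.

```python
from datetime import date

CANARY_PHRASE = "We still have not gotten an NSL."  # example wording, not a standard


def canary_alive(page_text, posted, today, max_age_days=30):
    """The canary counts only if the statement is present AND recently re-posted."""
    fresh = (today - posted).days <= max_age_days
    return CANARY_PHRASE in page_text and fresh
```

Note what the check cannot tell you: a missing canary might mean an NSL, or a web redesign, or a lawyer getting nervous. And once every provider's canary is gone, the function returns false everywhere and carries no information at all.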
PBS: How the US Government Turned Silicon Valley Into a Surveillance Partner
Electronic Frontier Foundation on National Security Letters
Merrill and Calyx – NSL Details Unmasked
(* I can’t find an exact date. Callimahos’ Aegean Park Press edition of The Zendian Problem is dated 1989)
(** By the way, google image search for “traffic analysis zendia” results in a lot of pictures of some attractive young lady; apparently there is a model/celebrity with that name. Her parents must have either been lucky, or cryptographers.)
(*** Apparently my copy was a good investment: Amazon lists it for over $1,700. It can be found on the internet via google books.)
(**** This presentation has some neat bits about some of the things you can do with it)
(***** I am actually concerned by what they appear to have said. “Amazon does not give data from AWS…” which is their cloud service platform. Why so specific? What about store transaction data?)
There is a theory paper written by Microsoft employees on Traffic Analysis which was floating around the Net. I have a copy somewhere. Are you aware of it? I will post the title when I find my copy.
Found it: “Introducing Traffic Analysis” by George Danezis and Richard Clayton
Marcus Ranum says
Amazingly it doesn’t mention Callimahos’ book or classes. That makes me wonder if they actually have studied the topic. (On further reflection and examination of the references, I suppose they have, but I’m surprised they missed such an important text, especially since it’s practical)