This is a really fascinating, if highly specialized, book about an important topic.
Donald MacKenzie’s Mechanizing Proof: Computing, Risk, and Trust is a good companion-piece to Olav Lysne’s The Huawei and Snowden Questions; taken together they provide a grim view of computer security and all of the implications of the software-driven world we find ourselves in. When I first read them, last year, I wandered around like Private Hudson in ALIENS: “Game over, man! This is some really pretty shit we’re in.” There are three questions regarding software:
- Why is it so bad?
- Why is it so good?
- Given that writing security or reliability-critical software is manifestly harder than writing plain old “just make it work” applications, and plain old software is pretty bad, how can we make security/reliability-critical software good enough to rely on?
The cynic in me notes that industry’s response to question #3 is to throw patches at software and hope that the problem will go away. Meanwhile, everyone who has thought about the problem for a second realizes that if there’s one bug, there are probably two or more; they are just unknown. Since I got involved in computer security in the late 1980s, we’ve basically seen an entire civilization built on top of beta-test quality code. If that. There are critical pieces of software (e.g.: Microsoft Office) that have been a continuous mine of bugs for over 20 years, with no sign that the vein is running out. That is partly because software vendors are busy adding new bugs as fast as the old ones can be fixed.
What is needed, everyone eventually realizes, is a way of making software that is good by design. And that’s the topic of Mechanizing Proof. I’m tempted to say it’s an exciting book that reads like a novel, but that would be a lie. It’s as exciting as some popular pot-boilers, and it deals with a reality that is historical record, but it’s a highly specialized topic. That it’s an extremely important topic doesn’t change the fact that it’s highly specialized. It’s a history of the early days of smart people realizing “this stuff is going to be a problem.” And that’s not a minor thing: when John Von Neumann says “this stuff is going to be a problem,” the wise man will shut up and listen. Which people did, at the time; they don’t, anymore.
The meeting was held in the Hotel Sonnenbichl on the outskirts of Garmisch. The vista from its windows and terrace unrolled across the little town to the towering mountains. The keynote speech opening the conference was given by Alan Perlis of Carnegie Mellon University in Pittsburgh, a leading figure in U.S. computer science. “Our goal this week,” Perlis told the small, select group of participants, “is the conversion of mushyware to firmware, to transmute our products from jello to crystals.” Buxton recalled the tone being set by frank admissions in an after-dinner talk by Robert S. Barton, who had played a leading role in the development of Burroughs’ innovative B-5000 computer. Barton told participants “that we were all guilty, essentially, of concealing the fact that big pieces of software were increasingly disaster areas.”
You see? Old fogeys complaining that software sucks is nothing new; it began shortly after the invention of software and has continued more or less apace. The main change is that nowadays, coders are drinking Cristal and driving Lamborghinis, lighting their shirts on fire with $100 bills in VIP lounges in strip clubs. That is my dim memory of the 90s, anyway.
There was general agreement at Garmisch that “software engineering” was needed. But what was it? As an analogy, an inspiration, a slogan, few would have disagreed, but it remained an empty box awaiting content. Consensus around the slogan largely evaporated when it came to deciding more concretely what software engineering should consist of. The report of the Garmisch meeting noted that “The phrase ‘software engineering’ was chosen deliberately” by the study group that set up the meeting because it was “provocative in implying the need for software manufacture to be based on the types of theoretical foundations and practical disciplines that are traditional in the established branches of engineering.” That formulation, however, immediately indicated the potential for radically different emphases on the relative importance of “practical disciplines” and “theoretical foundations.”
The whole book chronicles the white-knuckled warfare that has obtained between those practical disciplines and the theoretical foundations. One school of thought was that we could devise tools that would prove that a piece of software did what it was supposed to, then apply those tools to the foundations of the code-mass, and build trust upwards through the system. The other school of thought was that those foundations could be well-constructed by experts and code that depended on them would gain a degree of reliability through code re-use. I see this as a dialogue between builders and architects – the architects are responsible for knowing the properties of the components and making designs that factor in engineering overhead, and the builders are responsible for reading the architects’ blueprints and then doing something else entirely. It is a fact that architects often design things where the I-beams don’t line up correctly, or there is no space to run the brown water from the bathroom, and the builders just fudge things back together. It is also a fact that occasionally buildings collapse and kill a lot of people, so why hasn’t that happened with software, yet?
My opinion is that it has happened with software; the collapse is just not as dramatic and lethal. It’s merely incredibly expensive. I’m speaking here from the perspective of someone who made a good, lucrative career out of sheltering people from the consequences of their unwise choice to use Microsoft Windows. It wasn’t strictly their fault, I would say, because there were no reasonably viable alternatives anyway – but let’s just say that I don’t ride amusement park rides that are computer-controlled unless I can see that the controllers are Siemens PLCs and not a Windows box.
MacKenzie is writing history of technology, not popular science, so his book lacks the literary flourish of Simon Singh or Siddhartha Mukherjee, but it’s highly readable and it’s interesting. I’d say it’s just short of fascinating, unless you work in the field or in a field affected by software. Of course every field is affected by software, whether you acknowledge it or not: we have moved into an era where your bank’s IT department may screw up and discover that all the money has gone – all of your money has gone – because someone depended on an unpatched version of Apache Struts. That’s not implausible; it’s merely unpleasant. What really sucks is that U.S. Missile Command replaced its old 1960s command/control systems with stuff running on Windows. Probably Windows 95. Do you feel better, now?
I have to confess that I am firmly in the camp that says formal verification is a pipe-dream. That’s because, for formal processes to work, we’d also have to be able to describe the business purpose of the software formally. It doesn’t matter if the software is perfect, if it allows the user to transfer negative one million dollars to someone else’s account – the overarching system in which the software works has to have a verifiable explanation of how the world should work, not merely the software implementing a simulation of the world. All of this stuff seems like details that matter only to a practitioner, but if you know someone who thinks that their ‘soul’ is going to be uploaded into a computer, I’d suggest you start them with The Huawei and Snowden Questions and then break Mechanizing Proof to them gently.
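The negative-one-million-dollars problem can be made concrete. Here is a hypothetical sketch (the function, the account names, and the flawed specification are all invented for illustration) of code that perfectly satisfies its stated spec – the spec itself is what’s broken, because nobody wrote down that the amount must be positive:

```python
def transfer(balances: dict, src: str, dst: str, amount: int) -> None:
    """Spec (flawed): after the call, balances[src] has decreased by
    `amount`, balances[dst] has increased by `amount`, and the total
    amount of money is conserved. Nowhere does it require amount > 0."""
    balances[src] -= amount
    balances[dst] += amount

accounts = {"alice": 100, "mallory": 100}
# Mallory "sends" Alice negative one million dollars.
transfer(accounts, "mallory", "alice", -1_000_000)
# Every clause of the spec holds: Mallory's balance decreased by the
# (negative) amount, Alice's increased by it, and the total is still 200.
# A verifier checking only this spec would pronounce the code correct.
```

A formal tool proving this code against this spec would succeed; the error lives in the gap between the spec and how money is actually supposed to work, which is exactly the layer formal verification does not reach.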
The Huawei and Snowden Questions is a topic for another day, but let me explain its relevance, briefly. Basically, it’s a shot below the water-line fired by a theoretician who points out that:
- We cannot reasonably trust the provenance of our software (Huawei)
- We cannot reasonably trust the operators of our software (Snowden)
- It is implausible to compose systems of higher trust out of untrustworthy components
- Game over, man
People have been saying that for decades (I have slides from my old “writing secure software” course that I used to teach in the 90s) but Lysne says it well, and in a particularly timely way.
The debate between security-by-design and discover-flaws-and-fix was a subtext of my career for about 20 years. Alan Paller, the founder of the SANS conference, and I used to stake out that debate every so often – with SANS arguing, professionally, that you could run insecure software on public internets if you just tried hard enough, or something. In fact, SANS has built a massive reputation out of teaching people “patch, patch, patch.” I used to bump into Paller every so often and would snark at him, “So, have we fixed all the bugs yet?” Of course the problem is that nobody wants to pay the time and resource costs to actually design software to be good. There’s another problem: well-designed software is harder to repurpose by just throwing a few new chunks of code into it – that breaks the design. Bad software is more nimble, which means it’s more commercially successful.