From the Archives: Evaluating FiveThirtyEight

This is a repost of a simple analysis I did in 2012, evaluating the presidential predictions of FiveThirtyEight.  What a different time it was.  If readers are interested, I could try to repeat the analysis for 2020.

The news is saying that Nate Silver (who does election predictions at FiveThirtyEight) got fifty states out of fifty. It’s being reported as a victory of math nerds over pundits.

In my humble opinion, getting 50 out of 50 is somewhat meaningless. A lot of those states weren’t exactly swing states! And if he’d gotten some of them wrong, that wouldn’t mean his probabilistic predictions were wrong: a forecast that gives a candidate a 70% chance of carrying a state is supposed to miss about 30% of the time. Likewise, getting them all right doesn’t mean he was right.
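To make the distinction concrete, here is a minimal sketch (in Python, with made-up forecast numbers, not Silver’s actual 2012 probabilities) of the difference between counting correct calls and scoring the probabilities themselves with a proper scoring rule such as the Brier score:

```python
import numpy as np

# Hypothetical forecast probabilities (NOT Silver's actual 2012 numbers):
# probability that candidate A wins each of six states.
p_win = np.array([0.99, 0.95, 0.85, 0.70, 0.60, 0.45])
# Actual results: 1 if candidate A won the state, 0 otherwise.
won = np.array([1, 1, 1, 1, 0, 0])

# "Pundit" scoring: how many states were called correctly?
called_correctly = np.mean((p_win > 0.5) == (won == 1))

# Brier score: mean squared error of the probabilities themselves.
# A proper scoring rule, so it rewards well-calibrated probabilities,
# not just landing on the right side of 50%.
brier = np.mean((p_win - won) ** 2)

print(f"States called correctly: {called_correctly:.0%}")
print(f"Brier score (lower is better): {brier:.3f}")
```

On these invented numbers the forecaster calls 5 of 6 states but still pays a penalty for the overconfident misses, which is the sense in which the probabilities, not the call count, are the thing to evaluate.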


Ethics of accuracy

Andreas Avester summarized Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil. Now, I’m not sure how many readers remember this, but I’m a professional data scientist. Which doesn’t really qualify me as an authority to talk about data science, much less the ethics thereof, but, hey, it’s a thing. I have thoughts.

In my view there are two distinct ethical issues with data science: 1) our models might make mistakes, or 2) our models might be too accurate. As I said in Andreas’ comments:

The first problem is obvious, so let me explain the second one. Suppose you found an algorithm that perfectly predicted people’s healthcare expenses, and started using it to price health insurance. Then you might as well not have health insurance at all, because each person is paying exactly their own expected costs either way; there’s no risk pooling left. This is “fair” in the sense that everyone pays exactly the burden they place on society. But it’s “unfair” in that the amount of healthcare expenses people have is mostly beyond their control. I think it would be better if our algorithms were actually less accurate, and we just charged everyone the same price (modulo, I don’t know, smoking).
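To put rough numbers on that, here is a minimal sketch (simulated expenses drawn from a hypothetical distribution) comparing what individuals pay under a perfectly individualized premium versus a single flat premium:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated annual healthcare expenses for 10,000 people (hypothetical
# distribution: most people cost little, a few cost a great deal).
expenses = rng.lognormal(mean=7.0, sigma=1.5, size=10_000)

# Pricing scheme 1: a "perfectly accurate" model charges each person
# exactly their own expected expenses -- no risk pooling at all.
individual_premiums = expenses

# Pricing scheme 2: everyone pays the same flat premium (the average).
flat_premium = expenses.mean()

print(f"Flat premium everyone pays:      ${flat_premium:,.0f}")
print(f"Median individualized premium:   ${np.median(individual_premiums):,.0f}")
print(f"95th percentile individualized:  ${np.percentile(individual_premiums, 95):,.0f}")
```

Under the individualized scheme the unlucky tail pays enormous premiums; under the flat scheme everyone pays the average, which is exactly the risk pooling that a perfectly accurate model would eliminate.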
