Bayes’ Theorem: Deceptively Simple

Bayes' Theorem, in classic form

Good ol’ Bayes’ Theorem. Have you ever wondered where it comes from, though? If you don’t know probability, there doesn’t seem to be any obvious logic to it. Once you’ve had it explained to you, though, it seems blindingly obvious and almost tautological. I know of a couple of good explainers, such as E.T. Jaynes’ Probability Theory, but just for fun I challenged myself to re-derive the Theorem from scratch. Took me about twenty minutes; if you’d like to try, stop reading here and try working it out yourself.

[Read more…]

Women’s Work

[I know, I’m a good three months late on this. It’s too good for the trash bin, though, and knowing CompSci it’ll be relevant again within the next year.]

This swells my heart.

LAURIE SEGALL: Computer science, it hasn’t always been dominated by men. It wasn’t until 1984 that the number of women studying computer science started falling. So how does that fit into your argument as to why there aren’t more women in tech?

JAMES DAMORE: So there are several reasons for why it was like that. Partly, women weren’t allowed to work other jobs so there was less freedom for people; and, also… it was simply different kinds of work. It was more like accounting rather than modern-day computer programming. And it wasn’t as lucrative, so part of the reason so many men go into tech is because it’s high paying. I know of many people at Google that- they weren’t necessarily passionate about it, but it was what would provide for their family, and so they still worked there.

SEGALL: You say those jobs are more like accounting. I mean, look at Grace Hopper who pioneered computer programming; Margaret Hamilton, who created the first ever software which was responsible for landing humans on the moon; Katherine Johnson, Mary Jackson, Dorothy Vaughan, they were responsible for John Glenn accurately making his trajectory to the moon. Those aren’t accounting-type jobs?

DAMORE: Yeah, so, there were select positions that weren’t, and women are definitely capable of being confident programmers.

SEGALL: Do you believe those women were outliers?

DAMORE: … No, I’m just saying that there are confident women programmers. There are many at Google, and the women at Google aren’t any worse than the men at Google.

Segall deserves kudos for getting Damore to reverse himself. Even he admits there’s no evidence women are worse coders than men, in line with the current scientific evidence. I’m also fond of the people who make solid logical arguments against Damore’s views. We even have good research on why computing went from being dominated by women to dominated by men, and how occupations flip between male- and female-dominated as their social standing changes.

But there’s something lacking in Segall’s presentation of the history of women in computing. She isn’t alone: I’ve been reading a tonne of stories about the history of women in computing, and all of them suffer from the same omission. Why did women dominate computing, at first? We think of math and logic as being “a guy thing,” so this is terribly strange. [Read more…]

Stochastic Supertasks

I really loved this illustration of the paradoxes of infinity from Infinite Series, so much so that I’ll sum it up here.

What if you could do one task an infinite number of times over a finite time span? The obvious thing to do, granted this superpower, is start depositing money into your bank account. Let’s say you decide on plunking in one dollar at a time, an infinite number of times. Not happy at having to process an infinite number of transactions, your bank decides to withdraw a 1 cent fee after every one. How much money do you have in your bank account, once the dust settles?

Zero bucks. Think about it: at some point during the task you’ll have deposited N dollars in the account. The total amount the bank takes, however, keeps growing over time and at some point it’s guaranteed to reach N dollars too. This works for any value of N, and so any amount of cash you deposit will get removed.

In contrast, if the bank had decided to knock 1 cent off your deposit before it reached your bank account, you’d both have an infinite amount of cash! This time around, there is no explicit subtraction to balance against the deposits, so your funds grow without bound.
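To make the difference concrete, here is a minimal sketch of both fee schemes at finite stages. It assumes every cent carries a serial number and that the bank always takes the oldest cent still in the account; the video and the argument above don't pin down which cent gets taken, so that rule, along with the function names, is my own choice for illustration.

```python
from collections import deque


def fee_after_deposit(steps):
    """Deposit 100 labelled cents per step, then the bank removes one cent."""
    account = deque()
    next_label = 1
    for _ in range(steps):
        for _ in range(100):
            account.append(next_label)
            next_label += 1
        account.popleft()  # assumption: the fee is always the oldest cent
    return account


def fee_before_deposit(steps):
    """The bank skims a cent first, so only 99 labelled cents arrive per step."""
    account = deque()
    next_label = 1
    for _ in range(steps):
        for _ in range(99):
            account.append(next_label)
            next_label += 1
    return account


for steps in (1, 10, 1000):
    after = fee_after_deposit(steps)
    before = fee_before_deposit(steps)
    # columns: steps, balance in cents, oldest surviving cent (for each scheme)
    print(steps, len(after), after[0], len(before), before[0])
```

At every finite stage the two balances are identical, 99 cents per step. The difference is in which cents survive: under the first scheme the oldest surviving cent keeps moving up, so any particular cent you name is eventually gone, while under the second scheme cent number 1 never leaves.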

Weird, right? Welcome to the Ross–Littlewood paradox. My formulation is a bit more fun than the original balls-and-urns approach, but does almost the same job (picture the balls as pennies). It does fail when it comes to removing a specific item from the container, though; in the Infinite Series video, the host challenges everyone to figure out what happens if you remove the median ball from the urn after adding nine, assuming each ball has a unique number assigned to it. Can’t do that with cash.

My attempt is below the fold.

[Read more…]

“Science Is Endangered by Statistical Misunderstanding”

He’s baaaaaack. I’ve linked to David Colquhoun’s previous paper on p-values,[1] but I’ve since learned he’s about to publish a sequel.

Despite decades of warnings, many areas of science still insist on labelling a result of P < 0.05 as “significant”. This practice must account for a substantial part of the lack of reproducibility in some areas of science. And this is before you get to the many other well-known problems, like multiple comparisons, lack of randomisation and P-hacking. Science is endangered by statistical misunderstanding, and by university presidents and research funders who impose perverse incentives on scientists. [2]

[Read more…]

Math Can Be Weird

Take the Cantor function, a “Devil’s Staircase.”

The Cantor function, in the range [0:1]. It looks like a jagged staircase.

It looks like a squiggly mess, yet it is continuous and at almost every point there’s a well-defined slope: perfectly horizontal. The only exceptions are at points along the X-axis which are part of the Cantor set, an uncountable set of points with zero total length. Even at one of these points, however, the net vertical increase is zero! We can see this by calculating the limits as we approach such a point from either side.

Wikipedia has a good write-up on how to evaluate the Cantor function (I used it in the above approximation).

  1. Express x in base 3.
  2. If x contains a 1, replace every digit after the first 1 by 0.
  3. Replace all 2s with 1s.
  4. Interpret the result as a binary number. The result is c(x).
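Those four steps translate fairly directly into code. The following is a rough sketch rather than the exact code behind my plot: it uses exact fractions, peels off base-3 digits one at a time, and stops after a fixed number of digits. The function name and the `digits` cut-off are my own additions, and inputs are assumed to lie in [0, 1].

```python
from fractions import Fraction


def cantor(x, digits=40):
    """Approximate c(x) for 0 <= x <= 1 using the first `digits` base-3 digits."""
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)  # 1 = 0.222... in base 3, which maps to 0.111... = 1
    result = Fraction(0)
    weight = Fraction(1, 2)  # value of the current binary digit
    for _ in range(digits):
        x *= 3
        digit = int(x)  # next base-3 digit of x
        x -= digit
        if digit == 1:
            # Steps 2 and 4: keep this 1, treat every later digit as 0.
            result += weight
            break
        if digit == 2:
            # Step 3: a 2 becomes a binary 1.
            result += weight
        weight /= 2
    return result


print(cantor(Fraction(1, 3)), cantor(Fraction(2, 3)), cantor(Fraction(1, 9)))
```

Any x with a terminating base-3 expansion shorter than `digits` is evaluated exactly; everything else is truncated, so treat those outputs as approximations.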

The point x = 1/3 is part of the Cantor set, and thus satisfies our needs. Following the above rules, the output of the function there is 0.1 in binary, or 0.5 in decimal. Let’s calculate both limits, to get a feel for how much vertical distance is climbed at that point.

Approaching the limits of c(1/3). Spoiler alert, they both wind up equalling 1/2.

If we approach x = 1/3 from the right, we flatline at y = 1/2. If we approach it from the left, we wind up evaluating the geometric series y = 1/4 + 1/8 + 1/16 + … to calculate the height, which gets arbitrarily close to y = 1/2. The height of the “jump” at x = 1/3 vanishes into insignificance! That’s a good thing, as otherwise the Cantor function would have a vertical gap at that point and it wouldn’t be continuous.

Calculating the slope of the Cantor function at x = 1/3. Spoiler alert, it approaches a perfectly vertical slope.
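If you'd like to check both claims numerically, here is a small sketch. It assumes we approach from the left along the points x = 1/3 - 3^-n, whose base-3 expansion is 0.0222…2 (n - 1 twos), so the function's value there is the partial sum 1/4 + 1/8 + … + 1/2^n. The choice of approach points and the helper names are mine.

```python
from fractions import Fraction


def left_height(n):
    """c(1/3 - 3**-n): the partial sum 1/4 + 1/8 + ... + 1/2**n."""
    return sum(Fraction(1, 2**k) for k in range(2, n + 1))


def secant_slope(n):
    """Slope of the line from (1/3 - 3**-n, left_height(n)) to (1/3, 1/2)."""
    rise = Fraction(1, 2) - left_height(n)
    run = Fraction(1, 3**n)
    return rise / run


for n in (2, 5, 10, 20):
    print(n, left_height(n), float(secant_slope(n)))
```

The gap to 1/2 is exactly 1/2^n, which shrinks to nothing, while the slope is (3/2)^n, which blows up. The height of the hop vanishes even as the curve gets steeper and steeper.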

Yet even though every single vertical hop is arbitrarily small, it’s obvious the Cantor function has some sort of vertical increase. How else could it contain both (0,0) and (1,1)? In fact, if you measured the arc length of the Cantor function, it would be two units. At every point where the slope isn’t horizontal it is arbitrarily steep, so no matter where you put the vertical or horizontal bits you wind up travelling the Manhattan distance between (0,0) and (1,1), which is 2. We know the horizontal components add up to one unit, since the Cantor set has length 0 and the total horizontal distance is 1, so the uncountable number of arbitrarily small vertical “hops” must also have a net length of one unit.
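One way to watch the arc length creep up to 2 is through the usual piecewise-linear approximations of the Cantor function: start with the diagonal y = x, then repeatedly flatten the middle third of every sloped piece. That construction is my assumption here, not something spelled out above. After n rounds there are 2^n sloped pieces, each 3^-n wide and 2^-n tall, and the flat pieces cover a total width of 1 - (2/3)^n.

```python
import math


def approx_arc_length(n):
    """Arc length of the n-th piecewise-linear approximation."""
    r = (2 / 3) ** n
    flat = 1 - r  # total length of the horizontal pieces
    # 2**n sloped pieces of length sqrt(3**-2n + 2**-2n) simplify to this:
    sloped = math.sqrt(r**2 + 1)
    return flat + sloped


for n in (0, 1, 5, 20, 50):
    print(n, approx_arc_length(n))
```

At n = 0 this is just the diagonal, with length sqrt(2). As n grows the flat part tends to 1 and the sloped part tends to 1 as well, matching the split between horizontal travel and vertical “hops” described above.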

The Cantor function manages to climb vertically without actually climbing vertically. Pretty wild, eh?

Oh, and credit where credit is due, I was introduced to the Cantor function by PBS’s Infinite Series. Check it out for a weekly dose of math.