John Oliver weighed in on the replication crisis, and I think he did a great job. I’d have liked a bit more on university press departments, who can write misleading press releases that journalists jump on, but he did have to simplify things for a lay audience.
It got me thinking about what “false” means, though. “True” is usually defined as “in line with reality,” so “false” should mean “not in line with reality,” the precise compliment.
But don’t think about it in terms of a single thing, but in multiple data points applied to a specific theory. Suppose we analyze that data, and find that all but a few datapoints are predicted by the hypothesis we’re testing. Does this mean the hypothesis is false, since it isn’t in line with reality in all cases, or true, because it’s more in line with reality than not? Falsification argues that it is false, and exploits that to come up with this epistemology:
- Gather data.
- Is that data predicted by the hypothesis? If so, repeat step 1.
- If not, replace this hypothesis with another that predicts all the data we’ve seen so far, and repeat step 1.
That’s what I had in mind when I said that frequentism works on streams of hypotheses, hopping from one “best” hypothesis to the next. The addition of time changes the original definitions slightly, so that “true” really means “in line with reality in all instances” while “false” means “in at least one instance, it is not in line with reality.”
Notice the asymmetry, though. A hypothesis has to reach a pretty high bar to be considered “true,” and “false” hypotheses range from “in line with reality, with one exception” to “never in line with reality.” Some of those “false” hypotheses are actually quite valuable to us, as John Oliver’s segment demonstrates. He never explains what “statistical significance” means, for instance, but later on uses “significance” in the “effect size” sense. This will mislead most of the audience away from the reality of the situation, and in the absolute it makes his segment “false.” Nonetheless, that segment was a net positive at getting people to understand and care for the replication crisis, so labeling it “false” is a disservice.
We need something fuzzier than the strict binary of falsification. What if we didn’t compliment “true” in the set-theory sense, but in the definitional sense? Let “true” remain “in line with reality in all instances,” but change “false” from “in at least one instance, it is not in reality” to “never in line with reality.” This creates a gap, though: that hypothesis from earlier is neither “true” nor “false,” as it isn’t true in all cases nor false in all. It must be in a third category, as part of some sort of paraconsistent logic.
This is where the Bayesian interpretation of statistics comes from, it deliberately disclaims an absolute “true” or “false” label for descriptions of the world, instead holding them up as two ends of a continuum. Every hypothesis in the third category inbetween, hoping that future data will reveal that its closer to one end of the continuum or the other.
I think it’s a neat way to view the Bayesian/Frequentism debate, as a mere disagreement over what “false” means.