OHJ: The Covington Review (Part 1)

This week I am doing a series on early reviews of my book On the Historicity of Jesus. If you know of reviews I don’t cover by the end of the first week of July, post them in comments (though please also remark on your own estimation of their merits).


One of those early reviews posted is by Nicholas Covington (at Hume’s Apprentice of SkepticInk), author and blogger, with a strong interest in counter-apologetics, naturalist philosophy, and historical argument. He is blogging his review as a series, and so far only parts 1 and 2 are available. I will post more as he does. But here is my commentary on part 1, on a question of method. For the remaining parts, see closing paragraph.

First, of course, I concur with Covington’s opening warnings. I have written on the same point before (Fincke Is Right: Arguing Jesus Didn’t Exist Should Not Be a Strategy), and I have likewise made the same point about the Possibility Fallacy (Proving History, pp. 26-29). And I am glad he plans to get to his own estimates of numbers by the end of his series (that’s important).

Second, his part 1 only addresses the question of prior probability. He correctly points out that that precedes our examination of specific evidence for or against historicity, so the historicity of Jesus is not decided by its prior (as I also explain in OHJ, ch. 6).

Third, so far he has only one point to make about this. Essentially, he repeats what I call the Alternative Class Objection, which I already fully address in the book (OHJ, pp. 245-46).

I could leave it at that. But Covington proposes a reference class I didn’t give as an example, one that gives occasion to discuss an important methodological point that is easily gotten wrong. [Note that after I composed the following, but before I published it, Covington updated his article with a paragraph noticing on his own the last point I make here.]

The Objection

Covington’s conclusion on this one point is:

All in all, Carrier’s prior probability of 33% for the historicity of Jesus is reasonable but not entirely beyond challenge, and it may be equally reasonable for us to hold to a prior probability much higher if we use a different reference class such as the one I mentioned.

By which he means what is essentially the converse of Stephen Law’s Contamination Principle (which I actually refer to in OHJ, see the author index). Stephen Law’s principle is that the more unbelievable things there are in a story, the less believable the mundane things in that story are. Stated as such, he is correct (although the question remains how much less). Covington proposes the converse: the more true things there are in a story, the more believable the rest of the story is. Stated as such, he is correct…provided we commute the principle to the correct reference class of information. Covington skips that step. Consequently, his objection (which admittedly he does not really have that much confidence in) is not valid. Although it could in principle be fixed up to work, that would require the Gospels to look substantially different than they actually do (as I explain in detail in OHJ, ch. 10).

An easy example of what I mean is to take Law’s example (which Covington discusses), in which someone claims a certain person they call Bert “flew around the room by flapping his arms before dying, coming back to life and turning their sofa into a donkey,” and add the detail that Bert voted for President Obama in 2008 and lives in Seattle. Does the fact that there really was an election for a man really named Obama in 2008, and there really is a city named Seattle, increase the probability that Bert exists at all? Or by any appreciable amount? No. Because fiction routinely includes factually true details (in fact, studies of urban legends show they actually accumulate such details over time, so reliably that experts in the subject consider the proliferation of factual details a sign of a story not being true: OHJ, pp. 480-81, n. 195). And this is where we have to pay attention to reference classes: is it improbable for the story of a non-existent person to contain true facts of the world? No. To the contrary, it’s almost universally the case (pick any myth placed in an actual historical context and you’ll find things in it that are true, like the names and locations of cities and other geographical and political and cultural facts). So it is actually expected (see OHJ, pp. 214-34).

Therefore, the presence of true facts of the world in a story does not increase the probability of the rest of the story being true, at least not by any significant amount. Except counterfactually, of course: it increases it relative to the same story but where all those true facts are replaced by false ones. But the fact that false facts lower a story’s probability does not entail that true facts raise it; they only raise it relative to that hypothetical but non-existent version of the story containing false facts of the world. And that’s not the question we are asking here. We are asking how likely the stories we actually have are. Not the likelihood of stories we don’t have. See my discussion of a similar problem regarding Nazareth archaeology in OHJ, p. 258, n. 8. Contemplating the stories we don’t have can be a useful exercise (as P(e|h) must equal 1 – P(~e|h)), but only in a certain way (I explain all of this in PH, pp. 52, 230, 255-56, 302 n. 13).
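
The asymmetry described above can be sketched numerically. This is only an illustration: the prior of 0.33 echoes the figure Covington quotes, but the likelihoods are made-up values (not taken from OHJ or PH), chosen so that accurate background facts are almost as expected in an invented story as in a historical one, and so that each pair of likelihoods respects P(e|h) = 1 – P(~e|h).

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' theorem for a hypothesis h against its negation ~h."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1.0 - prior) * p_e_given_not_h)


# Hypothetical inputs, for illustration only.
prior = 0.33                 # prior probability that the story's hero existed
p_true_given_h = 0.99        # accurate background facts, if the hero existed
p_true_given_not_h = 0.95    # accurate background facts, if the hero is invented

# The story we actually have: its background facts check out.
with_true_facts = posterior(prior, p_true_given_h, p_true_given_not_h)

# The story we do not have: its background facts are wrong.
# Each likelihood is the complement of the one used above.
with_false_facts = posterior(prior, 1 - p_true_given_h, 1 - p_true_given_not_h)

print(f"prior:                  {prior:.3f}")
print(f"true background facts:  {with_true_facts:.3f}")
print(f"false background facts: {with_false_facts:.3f}")
```

Under these assumed numbers, true background facts move the probability from 0.33 to about 0.34, while false ones would drop it to about 0.09. The size of both effects depends entirely on the assumed likelihoods; the point of the sketch is only that the two effects need not be symmetrical.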

Examples to the Contrary

An example I have discussed (as have other scholars making the exact same point: see NIF, pp. 174-87) is the book of Acts: peppered with true facts of the world (some cribbed from Josephus precisely for the purpose), yet nevertheless not at all believable on almost any other detail (OHJ, ch. 9). This is how historical fiction gets written. It does not follow that because Luke was really good at checking incidental details of regional geography and politics (he sometimes wasn’t, but even if he was), the stories he inserts those details into are credible. To the contrary, Luke weaves false tales and then inserts true background facts to make them seem believable.

Therefore we cannot use the insertion of true background facts to support the truth of the stories. Those are separate reference classes, and they do not inform each other. That an author is good at the one only tells you the probability that he continues to be good at that, and therefore Luke’s often getting background facts right lends credence to other background facts in Luke that we can’t independently verify. Nothing more. If you want to up the credence of Luke’s stories (the individual pericopes, as narrative units), you need evidence that he regularly gets stories right, not just the background facts. And that is precisely what we can’t verify, whereas we can show he often (and deliberately) gets stories wrong, or uses such suspicious methods of composing them that they can’t be credited as being the result of honest inquiry (again, see OHJ, ch. 9). Which sets the prior probability of any of his other stories being true to a low value, not a high one.

I do the same thing with the Gospels: demonstrate that they are composed in such a suspicious and consistently unbelievable manner that there is no way to get a high prior that any story in them is true (OHJ, ch. 10). That’s why you can’t use them to support the historicity of anything in them that we can’t independently verify. And we can’t independently verify Jesus (OHJ, chs. 8, 9, and 11).

In fact, in precisely that context I discuss in both PH and OHJ what I think Covington wants to do. See “iteration, method of” in the index to PH, which I mention in OHJ, p. 509. I’ll quote the latter here now because it’s relevant:

… from the survey in this chapter it’s clear that if we went from pericope to pericope assessing the likelihood of it being true (rather than invented to communicate a desired point or to fit a pre-planned narrative structure), each time updating our prior probability that anything in the Gospels can be considered reliable evidence for a historical Jesus, then that probability would consistently go down (or level off somewhere low), but never rise. In fact I have not found a single pericope in these Gospels that is more likely true than false. These Gospels are therefore no different than the dozens of other Gospels that weren’t selected for the canon (as discussed in Element 44). They are all just made-up stories.

To change this conclusion, historicists need to find a way to prove that something about the historical Jesus in the Gospels is probably true (not possibly true, but probably true). They have often attempted this, but so far only with completely invalid methods (as I have already thoroughly documented in Chapter 5 of Proving History). I see no prospect of any valid method ever succeeding at that task. But only time will tell. For now, my conclusion is that we can ascertain nothing in the Gospels that can usefully verify the historicity of Jesus.
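
The iteration described in that passage can be sketched in a few lines. As a stand-in for the full method, this sketch uses Laplace's rule of succession, (s + 1) / (n + 2), to re-estimate the chance that the next story is reliable after each story is assessed. That choice, the function names, and the run of ten verdicts are all assumptions made for illustration; they are not the calculation in PH or OHJ.

```python
def rule_of_succession(successes: int, trials: int) -> float:
    """Laplace's estimate of the chance that the next trial succeeds."""
    return (successes + 1) / (trials + 2)


def iterate(verdicts: list[bool]) -> list[float]:
    """Re-estimate reliability after each story, carrying the tally forward."""
    estimates = []
    successes = 0
    for trials, judged_probably_true in enumerate(verdicts, start=1):
        successes += int(judged_probably_true)
        estimates.append(rule_of_succession(successes, trials))
    return estimates


# Hypothetical run: ten stories assessed, none judged more likely true than false.
estimates = iterate([False] * 10)

for count, estimate in enumerate(estimates, start=1):
    print(f"after {count:2d} stories: {estimate:.3f}")
```

With no story ever judged probably true, the estimate falls at every step and never rises. A single verified story would push it back up, which is why the challenge in the quoted passage is to find even one.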

Note my use of the correct reference class: stories, not background facts. Background facts (like that Pontius Pilate was governing Judea in the 30s A.D.) are wholly unconnected from the truth of anything in the stories. That Pilate existed is not connected to whether Jesus Christ existed, any more than it is connected to whether Joseph of Arimathea existed (see OHJ, index). It therefore does not lend credence to either.

Certainly, if the Gospels got that detail wrong, then the probability of the story being true would plummet. Hence my conclusion is not that the Gospels plummet the probability of historicity (as they would if they got everything wrong), but that they have no effect on it that we can discern (except as I extract in chapter 6 to construct the only reference class for which we have enough data to build a probability out of: see OHJ, p. 395).

So notice, for example, that the “gospel” that placed Jesus a hundred years earlier under king Jannaeus (OHJ, pp. 281-89) also gets the same kind of historical fact correct (there really was a king Jannaeus and he really was the last in an uninterrupted line of kings of Judea). Yet both stories can’t be true. Thus, getting right who was in office when you set your story tells us nothing about whether the hero of your story even existed.

So Covington is right that the only way my prior probability can be challenged is by coming up with a better reference class. But that reference class cannot exclude the Rank-Raglan data and accomplish anything–because that data would go back into e and thus drop the probability all over again, as I demonstrate mathematically in OHJ, pp. 239-44 and 245-46. This is why using the Josephan Messiahs class doesn’t work (OHJ, p. 246). And [as his revision now acknowledges] the approach Covington suggests wouldn’t work for the same reason–unless the Gospels were substantially different than they are. And lo. They aren’t.
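
Why the data "goes back into e" can be shown with a small counting sketch. It is a generic identity, not a reproduction of the demonstration in OHJ: start from a broader reference class, then treat membership in the narrower class as evidence, and the posterior equals the prior you would have drawn from the narrower class directly. The narrow-class counts (4 historical out of 15) borrow the upper end of the margin mentioned in the comments below; the broad-class counts are invented.

```python
from fractions import Fraction


def posterior(prior: Fraction, p_e_given_h: Fraction, p_e_given_not_h: Fraction) -> Fraction:
    """Bayes' theorem for a hypothesis h against its negation ~h."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)


# Hypothetical counts, for illustration only.
broad_total, broad_historical = 100, 80    # invented broader reference class
narrow_total, narrow_historical = 15, 4    # high scorers within that class

broad_mythical = broad_total - broad_historical
narrow_mythical = narrow_total - narrow_historical

# Route 1: take the prior straight from the narrow class.
direct = Fraction(narrow_historical, narrow_total)

# Route 2: take the prior from the broad class, then count "is a high scorer"
# as evidence, with likelihoods read off the same counts.
via_broad = posterior(
    prior=Fraction(broad_historical, broad_total),
    p_e_given_h=Fraction(narrow_historical, broad_historical),
    p_e_given_not_h=Fraction(narrow_mythical, broad_mythical),
)

print(direct, via_broad)
```

Both routes give the same fraction whatever counts are supplied, provided the narrow class is a subset of the broad one. Choosing a more favorable starting class therefore changes nothing unless the narrowing data is also left out of the evidence.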


Covington’s entries in this series are indexed here. He has also responded to this commentary on part 1 and I have replied in turn (see comment). My commentary on his remaining sections is as follows:


For a complete list of my responses to critiques of OHJ, see the last section of my List of Responses to Defenders of the Historicity of Jesus.


  1. millssg99 says

    I have read many a book of fiction set in real times, places, people, etc. At the same time other characters or events are complete fiction (the whole point). I don’t see how pointing to accurate background details makes people think that makes anything else in the story true. Setting a fiction in a historical context doesn’t help.

    • says

      As my cited pages in NIF show, other historians discussing Acts have said exactly the same thing. So experts are with you on that one. Until they don’t want to be.

  2. says

    The question that occurs to me is, why is 12 items on the Rank-Raglan list used as the cutoff? I don’t see anything about the list that suggests having a majority of items is particularly significant. Why not 10, or 8, or even 16? This would be a bigger concern if Jesus was right near the chosen cutoff point but, since the chosen cutoff point directly affects the prior we come up with, it is not irrelevant either.

    Similarly, why the exact Rank-Raglan list at all? Would a subset of items be better? It seems hard to argue against using a subset while at the same time avoiding opening things up to a much expanded list of legend/myth like items. What would happen if more military and/or political struggle related items (of a typically legendary sort) were included in the list?

    I also noticed some of the Rank-Raglan items seem to be correlated with each other. For example, surely the most common reason someone would be trying to kill a baby would be political (i.e. they are heir to a king). Stories not including a death will fail to meet both the mysterious death and died on a hill items. Those not crowned a king won’t rule at all (uneventfully or otherwise).

    All of this suggests to me that, at the very least, the better way to incorporate the Rank-Raglan list is by incorporating it item by item, possibly along with items not on the list, as evidence rather than as a prior probability using an arbitrary cutoff point. This might not matter as much if the other evidence moved the prior significantly but, at least when you are arguing a fortiori, the other evidences cancel each other out leaving the posterior probability almost exactly where you started.

    • says

      The question that occurs to me is, why is 12 items on the Rank-Raglan list used as the cutoff?

      Because the probability of non-existence rises the higher you score, and Jesus scores near the top of the list–and even just from Mark he scores 14. So placing the cut-off at 11 was generous. Once you get below 12, the probability of chance matches is so high the ranking becomes meaningless (you are then talking about too large a reference class, violating the Rule of Greater Knowledge; this is all explained in chapter 6).

      The significance is that this effect (the probability of non-existence rises the higher you score) is an observation, not a prediction; hence there is something significant about the score, regardless of the causal interdependence or oddness of some of the criteria (an issue I address in ch. 6). That’s what makes it useful. If you added more criteria, you would have to raise the benchmark, so as to ensure you are still tracking this phenomenon, and not just adding random people to the class.

      For the same reason…

      …why the exact Rank-Raglan list at all? Would a subset of items be better?

      Because it’s the narrowest applicable reference class for which we have a lot of data. That’s “a lot” by ancient history standards, not scientific standards. But still far more than we have for any other class (I give examples in chapter 6). No other class comes close, leaving margins of error too wide to use (which is why I find the other mythic elements in the Gospels unusable as evidence against historicity; the Rank-Raglan class is the only one for which we have a sizable number of class members to draw inferences from, and note that even then I allow a wide margin of error).

      A subset would increase the probability of chance matches and thus reduce the relevance of the class (per above) and also result in missing otherwise identifiable members of the class (because of the spotty survival of evidence, we don’t have the complete dossiers on every prospective member).

      To the contrary, we should want to increase the criteria as much as possible, without resulting in leaving out too many applicable cases, or including too many inapplicable ones. So you want a criteria-set that (a) yields a lot of examples fitting the profile (i.e., not just two exemplars; which requires fewer criteria) and (b) reduces the risk of examples fitting the profile purely by accident (which requires more criteria). So you want something that’s not too little and not too much.

      Since Jesus scores 14 to 20, we don’t want too many fewer than that to count (to ensure we are placing Jesus in a relevant class, and not just ignoring the fact that he scores so high), but we also don’t want so many required that all you have left in the class are Jesus and Oedipus. Conversely you want to avoid the multiple comparisons fallacy. So you need parallels that are widely paralleled, i.e. by many exemplars, and not things that appear only twice (see “Lincoln-Kennedy” in the index).

      Mathematically you could do this the long way and run up your iterated priors by progressively narrowing the class (from the set of all who score 1 hit and above, to the set of all who score 2 hits and above, and so on). Thus it actually doesn’t matter where you cut it off, unless you cut so far that the class becomes too small to be of any use (because this creates margins of error too wide to be useful, as per every other case).

      I cut to the chase in ch. 6 by showing that even if we started with something like the “score 1+” class, by the time we got to the “score 12+” class we would end up in the same place as if we just started at the 12+ class. But if you really wanted to, you could burn several pages of math doing every step singly all the way there. The trick is finding class members at each stage (a lot of unnecessary work), in order to graph how historicity declines. Knowing that that will be a waste of time in the end saves you the trouble of doing it. You could even show that by calculating for every possible ratio at every possible stage of this iterated process: you still end up in the same place. And that’s what I show in ch. 6 (with some examples).

      And then, once you got beyond 12, you would be narrowing the class so much that your margins of error start to consume your ratios, negating any utility of continuing. The margin of 1-to-4 historical out of 15 members is the best you are ever going to get.

      The key observation is how there are so few historical persons beyond a certain point, and within a class of non-trivial size, and Jesus is well beyond that point. If this were all just random accident, there should be as many historical persons (by proportion) scoring above 11 as score below. That that isn’t the case entails this is a thing, and not an accident: it really genuinely is unusual for any historical person to score so high. So that Jesus scores well above that is a significant observation. All I do is calculate how significant, even allowing for wide margins of error.

  3. dthunt says

    Outside of the historicity bit, do you find a lot of other opportunities to put things in Bayesian terms when talking with people?

    If so, have you observed a preference in people for attacking your priors (if you choose to disclose them) instead of just considering the evidence you put forward?

    • says

      Do I use BT in other things? Yes. But usually not explicitly (so that I don’t have to explain the framework). It is easily done without the person you are talking to even being aware of it (unless they are also Bayesians).

      Do I detect a preference for attacking my priors? Curiously, no. Almost always they attack e or P(e|h) or P(e|~h). Except when I am explicit about using Bayes–then they will flip to attacking the prior instead–often because that is the apologetic they’ve been trained in (the mantra “arbitrary priors” is like “a banana is obviously intelligently designed”), or the thing they least understand (making it easier to confuse ignorance with incredulity). But I also know how to preempt those attacks, and designed chapter 6 in OHJ to do so, making it a lot harder. Indeed, technically impossible, since there I actually start with every possible prior, and show that we end up in the same place (so no attack on the prior is possible; they have to attack the reference class’s likelihood ratio, which is attacking the evidence). Notably, that trick can be used in every other Bayesian argument about anything, so it’s worth getting the book just to have that model for doing that, in defense of any sound prior you ever employ.

    • dthunt says

      Oh, I have the book! Unsurprisingly, I had it open to chapter six when I wrote the question.

      I can see a stark difference between writing a book like OHJ, where you’re trying to give a fairly comprehensive evaluation of the evidence and your audience most likely is unjustifiably certain on various hypotheses, and a social situation in which two people are trying to update each other by sharing evidence.

      In the former situation, you’re deliberately calling a good deal of attention to the bucket of things that you’ve formed as “a prior”. It’s expected, really, that people respond to it, and I feel a little embarrassed to have asked the question.

      The core reason I asked the question at all is that some people who purportedly understand Bayes’ theorem still ignore discussion of evidence in favor of relentlessly moving to attack someone’s nebulous priors, if a difference is detected with respect to their own, and I am confused by this. It’s made me reluctant to share more than the faintest hint about priors, even ones that are meticulously examined.

      I don’t know if this sort of lashing out is just a reflexive attack on a person who apparently disagrees (the rough equivalent of shouting), or if the attacker is just genuinely confused about the idea that there may be big evidence that has not been shared (potentially for very good reasons, like that it would not be convincing to the other party, or requires quite a lot of background to make any sense of, and will take lots of time, or could hurt someone).

      I hadn’t considered that this might be some sort of bizarre attack on the mere idea of inference or straight up apologetics. I believe you have substantially more experience in these sorts of discussions than I do, and if that is in your hypothesis space, then perhaps it ought to be in mine as well. I have definitely seen people declare a variety of subjects as being completely immune to analysis.

      I will examine the construction of chapter six more closely, on your advice.

      Thank you for your insights. I am very much looking forward to seeing how the format is received. I am enjoying the book.

  4. Bruce says

    I love this post, because it gives one a feel for using Bayes Theorem without overwhelming people with detail. I felt a few bits of “Proving History” were hard slogging for me to read, because I am not used to technical discussions of probability. (That book is very enjoyable, even if one just skims the few paragraphs that are tough.)

    But this post makes it clear (or should) that apologists who cite references to Pontius Pilate as if that were evidence are doing the equivalent of finding a reference to British Prime Ministers John Major and Tony Blair in the Harry Potter series and deciding that those works of fiction were a documentary. Thanks.

  5. says

    Covington posted a reply as Part 1a.


    (1) Indeed, I have polished all the Rank-Raglan criteria to improve their clarity and accuracy for the original purpose. This is only partly based on Dundes. It is partly my own refinement. The effect of which is to match the examples actually proffered by Rank and Raglan (thus eliminating their verbal ambiguity in naming the criteria) while making the criteria more strict (and thus harder to score, as any method of criteria should be).

    (2) I don’t count scorers from outside Greco-Roman antiquity, as I do not believe those can be relevant. They are too historically and culturally out-of-context, and being post-Christian in a Christian world, too easily a product of influence from Christianity (a phenomenon that cannot be explained for the origins of Christianity itself).

    (3) Covington’s math is inapplicable because his set is incomplete (e.g. he left out several: see last paragraph in my commentary on the Hallquist review; and even that list is unlikely to be complete). You would have to make sure you had all high scorers before you could calculate the ratio. I would not even expect Covington to engage the massive literature review necessary to locate all modern and medieval scorers, even in Western culture, much less beyond. And again, such a task would be moot, since those exemplars should not be counted (for the reasons stated in item (2)).

    (4) The concern that members even from Greco-Roman antiquity have been left out can only be answered by finding members from Greco-Roman antiquity that were left out. And then recalculating.

    (5) Item (1) is why Mithradates does not rate: the criteria were applied so loosely almost anyone would score (OHJ, pp. 231-32, n. 193). That nullifies the utility of the criteria. It’s the wrong reference class (same as the Josephan Christs class: ch. 6 § 5). You should think of it this way: Mithradates belongs to two sets, the set of “loosely scored RR heroes” and the set of “tightly scored RR heroes.” Only the latter set is relevant for my argument. It is a subset of the former set, and by the Rule of Greater Knowledge, when you know a member belongs to a narrower set, you have to use the narrower set (if you have good enough data for that set for the switch to produce any benefit in knowledge). Otherwise you are violating a basic principle of logic (omitting known facts).

    For example, if you say John is unlikely to have shot several people in the head because he was a politician and politicians rarely do that, you are not deploying valid logic if John was a politician in Nazi Germany. Leaving out the narrowing factor does not get you to a valid counter-class. If John belongs to the narrower class “politicians in Nazi Germany,” then you have to include the consequences of that fact in your analysis.

  6. Erick Rolon says

    Could you comment about Sepphoris?
    Why is an important city near the actual Nazareth not mentioned in the Gospels? Thanks

    • says

      Not a useful observation. I discuss why in On the Historicity of Jesus, p. 257 n. 8. There are lots of places in Galilee Jesus doesn’t go. We can’t draw any conclusions from that, unfortunately.