Miracles and Probability II

In the previous post, we focused on how you can use the principle of analogy to conclude that if something contradicts what you already know (what we will call background knowledge), then, unsurprisingly, it is in all likelihood false. We saw the limitations of this principle, however, especially when it is generalized to cover rare and new events, not just miracles. So we need some other tool or insight that lets us avoid ruling out something rare and new when we have good evidence for it. To illustrate, we can treat our background knowledge as all the accepted truths in the world, and take a miracle occurring in the Gospels as our hypothesis. We can then assign a degree of certainty (a probability) to our belief that the miracle is true, given our background knowledge; this is known as the prior probability. Now, we were essentially arguing in the last post that the prior probability of miracles is low because our background knowledge consists of truths that contradict them. But what if, as Mike Licona and others claim, we have good evidence for a miracle, for example, the resurrection? We need to find a way to incorporate that “good evidence” into our argument. After all, we can’t rule out miracles a priori, because we can never be one hundred percent certain that the supernatural doesn’t exist.

Bayes’ Theorem

The solution to this problem is Bayes’ theorem. The theorem’s conclusion takes into account both our background knowledge and the evidence we have for the hypothesis we make. Whereas before we were jumping to the conclusion that the probability of a miracle is low based on our prior knowledge of how the world works, we will now also take into account the actual evidence for the miracle. Quantitatively speaking, the theorem makes us express our premises in terms of degrees rather than absolutes by forcing us to label them numerically as probabilities. This is important because most claims in life, especially about historical events, can only be discussed in terms of probabilities; we often say things like “most likely” or “more likely”, for example. So those who object to doing math in history should think again, because every day we already speak in terms of probabilities. Moreover, the theorem forces you to think of alternative hypotheses, reducing confirmation bias. This formalized and systematic approach to viewing your hypothesis allows for a clarity unrivaled by other methods. I offer two quotes below that explain its history and importance.

In simple terms, Bayes’s Theorem is a logical formula that deals with cases of empirical ambiguity, calculating how confident we can be in any particular conclusion, given what we know at the time.  The theorem was discovered in the late eighteenth century and has since been formally proved, mathematically and logically, so we now know its conclusions are always necessarily true if its premises are true (probabilities).  [Richard Carrier]
Bayes’s theorem is at the heart of everything from genetics to Google, from health insurance to hedge funds. It is a central relationship for thinking concretely about uncertainty, and–given quantitative data, which is sadly not always a given–for using mathematics as a tool for thinking clearly about the world. [Chris Wiggins, Scientific American]


It’s probably best to jump into the formula, because the mathematical relationship between the premises reveals a lot about the theorem’s mechanics. We are trying to find how likely or unlikely (how probable) our hypothesis is. The hypothesis we’ve been working with is that miracles occurred as purported in the Gospels. So our question is how probable it is that the miracles in the Gospels occurred, relative to the evidence and background knowledge we have: P(h|e,b), which is the posterior or epistemic probability. The prior probability, P(h|b), is how typical our explanation (hypothesis) is, or how plausible it is, relative to our background knowledge alone. And, finally, the evidence that we have for the miracles occurring, e, enters through how expected that evidence is if our explanation (hypothesis) is true, P(e|h,b), often called the consequent or explanatory probability. In summary, we start with a prior probability, P(h|b), and when we take new evidence into account through P(e|h,b), we obtain the updated posterior probability, P(h|e,b). Two forms of the equation are given: the first weighs one hypothesis against its negation (~h), while the second weighs it against multiple competing hypotheses. Equation 2 can be thought of as the share of the total that belongs to your hypothesis (h1) once the competing hypotheses (h2, h3, …) are included, which again reduces confirmation bias. The prior probabilities of the competing hypotheses must sum to one, while the expected probabilities, P(e|h1,b), P(e|h2,b) and so on, need not. Please see below for more detail.


1)  P(h|e,b) = [P(h|b) * P(e|h,b)] / [ P(h|b) * P(e|h,b) + P(~h|b) * P(e|~h,b) ]
2)  P(h1|e,b) = [P(h1|b) * P(e|h1,b)] / [ P(h1|b) * P(e|h1,b) + P(h2|b) * P(e|h2,b) + P(h3|b) * P(e|h3,b) + … ]


P(h|e,b) = probability that the hypothesis is true, given the evidence and background knowledge; Epistemic probability or Posterior probability
P(e|h,b) = how expected the evidence is if our explanation is true; Consequent probability, also called Expected probability or Explanatory probability
P(e|~h,b) = how expected the evidence is if our explanation is false
P(h|b) = how typical our explanation (hypothesis) is; Prior probability or Intrinsic probability
P(~h|b) = how atypical our explanation is; equal to 1 – P(h|b)
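
The two forms of the equation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names posterior_binary and posterior_multi are made up for this sketch, and the numbers passed to them are arbitrary placeholders, not estimates for any real hypothesis.

```python
def posterior_binary(prior_h, p_e_given_h, p_e_given_not_h):
    """Equation 1: P(h|e,b) for a hypothesis h weighed against its negation ~h."""
    prior_not_h = 1.0 - prior_h                      # P(~h|b)
    numerator = prior_h * p_e_given_h                # P(h|b) * P(e|h,b)
    denominator = numerator + prior_not_h * p_e_given_not_h
    return numerator / denominator


def posterior_multi(priors, likelihoods):
    """Equation 2: P(h_i|e,b) for each of several competing hypotheses.

    priors[i] is P(h_i|b); these must sum to one.
    likelihoods[i] is P(e|h_i,b); these need not sum to one.
    """
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to one")
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Placeholder numbers, chosen only to exercise the formulas.
binary_result = posterior_binary(0.1, 0.8, 0.2)
multi_result = posterior_multi([0.1, 0.6, 0.3], [0.8, 0.2, 0.2])
print(binary_result)
print(multi_result)
```

In this made-up case the two competitors h2 and h3 share the same expected probability, so together they behave exactly like ~h, and the first entry of the multi-hypothesis result matches the binary result.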

Prior Probability

Calculating our prior probability is of great importance because it can mean the difference between a probable event and an improbable one. A prior is derived from the background knowledge relevant to your hypothesis. This leads us to the concept of reference classes. A reference class can be thought of as a category of claims that all address a similar scenario. This information can be used (referenced) to help us find how typical our explanation is; in other words, it lets us estimate our priors. As an example from “Proving History” by Richard Carrier, a hypothesis you might propose to explain the evidence in the Gospels (the empty tomb, etc.) is that Jesus Christ was raised from the dead by a supernatural agent. How do you derive a prior probability based on your background knowledge? What you can do is look for similar scenarios that were believed to have occurred in the past. For instance, Romulus, Asclepius, Zalmoxis, Inanna, Lazarus, the many saints in Matthew and the Moabite of 2 Kings have all been purported to have been raised from the dead by a supernatural agent. So our reference class is all persons purported to have been raised from the dead by a supernatural agent. We have at least ten of them from antiquity and probably more. Since the prior probability is based only on background knowledge and is not conditioned on the evidence, we can treat each of the persons claimed to have been raised by a supernatural agent as equivalent. That is, there’s no more reason to believe one story over another, since all equally contradict our background knowledge. If that is the case, then classical probability theory says we can divide the sample space into equal pieces that sum to one. With ten members in the class, the prior probability that Jesus Christ was raised from the dead by a supernatural agent would be 1/10, or 0.1.
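
The equal-division step described above is simple enough to write out. The sketch below assumes a class of exactly ten members, since the post says "at least ten" without listing precisely ten; the labels claim_1 to claim_10 and the function name uniform_priors are placeholders invented for this illustration.

```python
from fractions import Fraction


def uniform_priors(members):
    """Give every member of a reference class the same prior, summing to one."""
    n = len(members)
    if n == 0:
        raise ValueError("reference class is empty")
    return {member: Fraction(1, n) for member in members}


# Placeholder labels standing in for ten equally credible resurrection claims.
claims = [f"claim_{i}" for i in range(1, 11)]
priors = uniform_priors(claims)

print(priors["claim_1"], float(priors["claim_1"]))
print(sum(priors.values()))
```

Using exact fractions keeps the check that the priors sum to one free of rounding error. Adding more members to the class would shrink each prior accordingly.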


It’s important to realize that we could have used a broader reference class and arrived at a much lower probability. For this reason and others, probabilities formed via Bayesian reasoning are classified as subjective, but they are not arbitrary, since we have good reasons for our methodology. To be conservative, we picked the narrowest reference class possible. But we could have found other attributes in common and defined a more generic class. For example, all of these claims of risen figures also share the attribute of being supernatural, miraculous or mythological. If that’s the new class, then we know that there are hundreds of thousands of cases where people wrote, spoke or claimed that a miracle was true when in fact it wasn’t. That would give us a prior of 1/100,000 or lower. But as a heuristic we will pick the narrowest reference class in order to produce the most conservative estimate. This rule of thumb reduces the chance that our presuppositions or biases will influence the estimate. This is known as an a fortiori estimate. Getting back to reference classes, it’s worth noting that our background knowledge of all accepted truths is quite large; that is, there are a lot of truths that can contradict our belief in a miracle occurring. For example, we are creative and inventive, which suggests that many miracle stories are fabricated; we are meaning-making machines and see agency in inanimate objects; we make things up in order to influence others; we can be honestly mistaken; we can be credulous; and so forth. All of these are a part of b, and all are viable hypotheses that can explain the evidence. If incorporated, these can have the effect of lowering the posterior probability, but creating reference classes for multiple hypotheses can be challenging, although the principle is the same as in the single case.
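
To see how much the choice of reference class matters, the sketch below runs the same evidence through the narrow prior (1/10) and the broad prior (1/100,000) from the discussion above. The two expected probabilities, 0.9 and 0.1, are hypothetical values picked only to hold the evidence fixed; they are not estimates from this post.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """P(h|e,b) for a hypothesis weighed against its negation."""
    numerator = prior_h * p_e_given_h
    return numerator / (numerator + (1.0 - prior_h) * p_e_given_not_h)


# ASSUMED values, for illustration only: how expected the evidence is
# if the hypothesis is true, and if it is false.
P_E_GIVEN_H = 0.9
P_E_GIVEN_NOT_H = 0.1

NARROW_PRIOR = 1 / 10        # narrowest reference class (a fortiori choice)
BROAD_PRIOR = 1 / 100_000    # broader class of miracle claims in general

narrow_posterior = posterior(NARROW_PRIOR, P_E_GIVEN_H, P_E_GIVEN_NOT_H)
broad_posterior = posterior(BROAD_PRIOR, P_E_GIVEN_H, P_E_GIVEN_NOT_H)

print(f"narrow class: {narrow_posterior:.6f}")
print(f"broad class:  {broad_posterior:.6f}")
```

With identical evidence, the narrow class leaves the hypothesis far more probable than the broad class does, which is why picking the narrowest class is the estimate most generous to the hypothesis.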

Epistemic Probabilities

It’s worth distinguishing between epistemic probabilities and physical probabilities. Epistemic probabilities are degrees of belief that a claimed event really happened, as opposed to someone making it up (or being mistaken), while physical probabilities are the relative frequencies with which events occur. As an example, the probability that a person chosen at random has a myocardial bridge in their heart is pretty small, the incidence being 3%. But your degree of belief that a particular person has a myocardial bridge can be quite high, since it’s conditioned on the evidence at hand, say a recent angiogram. Moreover, epistemic probabilities often concern events that occur just once, like historical claims, whereas physical probabilities are often statistical averages of repeated phenomena. So you can’t empirically derive epistemic probabilities by repeating an experiment (say, by flipping a coin many times and taking the long-term average, which gives a relative frequency or probability of 0.5); instead you must rely on thought experiments, deriving a reference class. The former method is known as the frequentist approach, while the latter is known as the Bayesian approach. It’s best to think of these as different approaches designed for different kinds of problems rather than as rivals. Please see the quote below, which highlights how the Bayesian method makes its assumptions explicit. The next post will discuss the consequent probability and eventually compute an epistemic probability for our hypothesis; we’ll stick with the miraculous hypothesis that Jesus was raised from the dead by a supernatural agent in order to explain a wide range of claims found in the Gospels and Epistles.
The specification of the prior is often the most subjective aspect of Bayesian probability theory, and it is one of the reasons statisticians held Bayesian inference in contempt. But closer examination of traditional statistical methods reveals that they all have their hidden assumptions and tricks built into them. Indeed, one of the advantages of Bayesian probability theory is that one’s assumptions are made up front, and any element of subjectivity in the reasoning process is directly exposed. [Olshausen]
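
The contrast drawn in this section between physical and epistemic probabilities can be sketched as follows. The coin simulation stands in for the frequentist long-run average, and the myocardial bridge calculation applies equation 1 to a single patient. The 3% incidence comes from the text; the two angiogram figures (0.95 and 0.01) are assumptions invented for this illustration and are not clinical data.

```python
import random


def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """P(h|e,b) for a hypothesis weighed against its negation (equation 1)."""
    numerator = prior_h * p_e_given_h
    return numerator / (numerator + (1.0 - prior_h) * p_e_given_not_h)


# Physical probability: long-run relative frequency of heads for a fair coin.
rng = random.Random(0)               # fixed seed so the run is repeatable
flips = 100_000
heads = sum(rng.random() < 0.5 for _ in range(flips))
relative_frequency = heads / flips

# Epistemic probability: belief that one particular patient has a bridge.
BASE_RATE = 0.03                     # incidence quoted in the post
P_POSITIVE_IF_BRIDGE = 0.95          # ASSUMED: angiogram shows a bridge that exists
P_POSITIVE_IF_NO_BRIDGE = 0.01       # ASSUMED: angiogram shows one that does not
belief_after_angiogram = posterior(
    BASE_RATE, P_POSITIVE_IF_BRIDGE, P_POSITIVE_IF_NO_BRIDGE
)

print(f"relative frequency of heads: {relative_frequency:.3f}")
print(f"belief before angiogram:     {BASE_RATE:.3f}")
print(f"belief after angiogram:      {belief_after_angiogram:.3f}")
```

The coin figure can be refined by flipping more coins, while the patient's figure can only be refined by gathering more evidence about that one patient.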