I have often been asked how we should evaluate arguments from consensus. That’s where someone says “the consensus of experts is that P, therefore we should agree P is true.” On the one hand, this looks like an Argument from Authority, a recognized fallacy. On the other hand, we commonly think it should add weight to a conclusion that the relevant experts endorse it. Science itself is based on this assumption. As is religion, lest a religionist think they can defeat science by rejecting all appeals to authority: such a tack would defeat all religion as well, and even your own judgment, since if all appeals to authority are invalid, so is every appeal to yourself as an authority (on your religion, or even on your own life and experience).
And yet, it is often enough the case that a consensus of experts is wrong (as proved even by the fact that the scientific consensus has frequently changed, as has the consensus in any other domain of expertise, from history to motorboat repair). And our brains are cognitively biased to over-trust those we accept as authorities (the Asch effect), putting us at significant risk of false belief if we are not sufficiently critical in our reliance on experts. It’s only more complicated when we have warring experts and must choose between them, even though we are not experts ourselves.
So what do we do?
The best treatments of the Argument from Authority as a fallacy are at Princeton, FallacyFiles, Wikipedia, and Nizkor (all of which have valuable insight worth reading up on), although that last (like many other treatments online) incorrectly states that the Argument from Authority is only a fallacy when the authority appealed to is not legitimate (e.g. “this sort of reasoning is fallacious only when the person is not a legitimate authority in a particular context”). That’s incorrect because the Argument from Authority is a non sequitur in deductive logic: even the most capable and relevant expert authority on the subject of P can be mistaken. Therefore it cannot deductively follow that P is true merely because they say P is true. Wikipedia gets this right.
But that just means authority can only have logical merit in an inductive argument, which in reality means a probabilistic argument: endorsement of P by certain authorities will normally increase the probability that P is true. The question is by how much, and when…and why. I already cover this question, and the underlying mathematical logic of it (including the role of Condorcet’s Jury Theorem) in Proving History (index, “consensus,” p. 334, and “Why History Requires Expertise,” pp. 17-20). The linked webpages above cover it further, adding for example criteria for discerning the weight of an expert opinion (and I discuss the philosophy and epistemology of relying on expert opinion in Sense and Goodness without God, “The Method of Expert Testimony,” II.3.6, pp. 58-59, which should be read in the context of “Finding the Good Method,” II.3.1, pp. 51-53).
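The core arithmetic behind Condorcet’s Jury Theorem can be illustrated in a few lines. This is only a minimal sketch of the theorem’s math, not anything from Proving History; the function name, panel sizes, and the 0.6 competence figure are my own illustrative choices:

```python
from math import comb

def majority_correct(n: int, p: float) -> float:
    """Probability that a strict majority of n independent experts,
    each individually correct with probability p, reaches the correct
    verdict (ties count as failure; independence is assumed)."""
    k_min = n // 2 + 1  # smallest strict majority
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# With individually fallible but better-than-chance experts (p = 0.6),
# the majority verdict grows more reliable as the panel grows:
for n in (1, 11, 101):
    print(n, round(majority_correct(n, 0.6), 4))
```

Note the theorem cuts both ways: if the experts are individually worse than chance on a question (p below 0.5), a larger consensus becomes *less* likely to be right, which is why the vetting conditions discussed below matter so much.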
Here I will add more observations. But for a fuller understanding, you should read my previous writings on the subject (cited above).
Whose Opinion Should We Count?
Laypeople need to be able to evaluate the argumentative and evidential value of “expert opinion,” and that includes a “consensus” of expert opinion. They need to be able to tell when that has value and when it does not, and how much value it has, without themselves being experts. Indeed, even if they are experts, they often need to make these evaluations without having to re-do all the research and study that consensus is based on; otherwise we would be demanding an absurd scale of inefficiency in the expert group, nixing their ability to divide their labor and requiring every expert to reproduce all the work of every other expert, a patent impossibility.
But a consensus has zero argumentative value when the individual scholars comprising that consensus have neither (a) examined the strongest case against that consensus nor (b) examined enough of it to be able to identify and articulate significant errors of fact or logic in it. So it is fallacious (indeed, a conspicuously unreliable practice) to just cite the consensus on anything without first ascertaining whose opinions within that consensus actually count. The most reliable population to heed consists of all qualified experts (those who have the requisite expertise in the subject being appealed to, e.g. climate science, evolutionary biology, economics, the historicity of Jesus) who have met either condition (a) or (b); all experts who meet neither condition should therefore be excluded from consideration.
Notably, when questioning the historicity of Jesus, this means excluding from consideration nearly all historians of Jesus. Because almost none have met either condition (a) or (b). And this is even apart from other reasons we should discount them, which I enumerate in chapters 1 and 5 of Proving History, where I show that historians of Jesus have all been generating their conclusions from demonstrably invalid methods, and worse, have accordingly generated countless contradictory conclusions from the same body of evidence. As I state there, unless differences are admitted to be a matter of opinion rather than fact (index, “disagreement,” p. 335), “When everyone picks up the same method, applies it to the same facts, and gets a different result, we can be certain that that method is invalid and should be abandoned” (p. 14). And yet this is exactly what we observe has happened in Jesus studies. Therefore the “expert consensus” on the historicity of Jesus cannot be appealed to, because it is useless. Unlike the consensus of historians on almost any other subject. (Although please heed my past remarks on this; as well as my discussion of what this means regarding the burden of evidence in Proving History, “Axiom 6,” pp. 29-30.)
The second cull comes from eliminating from the pool of experts to count those who articulate their reasons for their conclusion when those reasons are self-evidently illogical (you can directly observe their conclusion is arrived at by a fallacious step of reasoning) or false (you can reliably confirm that a statement of fact they made is false). Cranks, of course, will “believe” they see fallacies and falsehoods in an expert argument when really there are neither, but I can only give advice to the sane. If you are a crank, you are beyond rational argument. Hopefully most of my readers are not cranks, but have taken the trouble to avoid excess delusionality and become competent evaluators of facts and logic. Or if you haven’t done that yet, please do.
This is where laypeople in the historicity debate can start to get a handle on why they should no longer trust the consensus of experts in Jesus studies. You can thus see why, so far, Bart Ehrman’s opinion is to be discounted, likewise Maurice Casey’s, Akin & Horn’s, Crossan & MacDonald’s, even, astonishingly, that of Goodacre and Bermejo-Rubio. There is something else driving their opinions, something other than a careful and objective examination of the facts. In some cases, I think it’s just institutional error (they are repeating things other experts told them, that they did not know were false) or institutional inertia (it’s just easier to not think about challenging the past consensus), in others, something more (Ehrman I suspect is too arrogant to admit his mistakes and thus has fallen victim to the escalation of commitment bias; Casey I suspect is simply insane). Even Bermejo-Rubio, whose mistakes are all subtle errors of logic (because an expertise in logic is unfortunately lacking from the training of most historians), I think is ultimately really a victim of both institutional inertia and commitment bias.
Counting Experts Who Actually Checked
We needn’t expect experts to always check. Many challenges to authority are vacuous and would be a monumental waste of time to examine. Thus, for the very purpose of efficiency, experts apply reasonable criteria for judging which challenges are worth examining and which can be ignored. I discuss important aspects of this in my critique of the delusional mythicist Joseph Atwill. In general, a challenge needs to be presented in an efficient and competent manner (with strong and thorough citations of evidence, and clear and sound logic). Any challenge that fails even that rudimentary standard can safely be ignored. Because if the challenge is valid, there should be no reason why it can’t be presented correctly that way. That it is not being presented that way is a huge red flag. It most often means it can’t be. Because it’s false.
Of course, a challenge presented in an efficient and competent manner can be dismissed very quickly: if an expert looks at it and immediately spots fatal errors (of logic or fact). As I did in the case of Atwill. That meets the (b) requirement noted above. And no further inefficient waste of time is necessary. Unless the challenge can be re-presented without those errors, and the errors are acknowledged, apologized for, and explained. Because one needs to explain why one was trusting a conclusion based on false facts and fallacious reasoning, and indeed how one can be competent to make a challenge at all if one can’t even get basic facts right. Failing to address this, or outright refusing to, is an indicator of delusionality, and experts have no obligation to engage with the insane.
But that noted, the strongest consensus argument exists when condition (a) is fully satisfied by every member of the expert community whose opinion you are counting, and you are counting more than a handful, and every one of them subsequently agrees, and no other expert who disagrees satisfies even condition (b). Here an “expert” must be someone with a Ph.D. or equivalent in a field significantly encompassing or overlapping the subject in question. And we only count experts (in the counted group or the other group) who do not make their case on a self-evident fallacy or falsehood. But meeting all those conditions is rare. Usually you get something a little short of that. For example, 95% agree, and 5% remain recalcitrant. Or some experts who haven’t met condition (a) but do meet condition (b) gainsay the opinion of the experts meeting condition (a). Or some of the experts in either group rely on a self-evident fallacy or falsehood, but not enough to obviously invalidate their conclusion.
This can make evaluating a consensus difficult.
The “strongest consensus” condition should add a strong weight to a conclusion. Everything below that, less weight, by some degree. In reality, few experts actually examine consensus-challenging arguments; most simply ignore them, rendering their opinion on them of less value, unless you can confirm such a challenge already has little prima facie merit (e.g. it does not pass the minimum requirement stated above). And a strong enough consensus argument can still exist when more than a handful of experts satisfy condition (b), and they outnumber experts gainsaying them by a substantial amount. If fewer than a handful of experts do this, then we can still have a valid argument from authority (i.e. their expertise can still weigh in favor of their being right); it just isn’t an argument from consensus.
To achieve the minimum satisfaction of (b), a representative sample of the consensus body (I would have to say at least three experts whose opinions can be established as independent of each other and whose subjective biases are not strongly aligned) must examine the strongest case at least enough to have found significant (rather than trivial, minor, or non-essential) errors in it (of either fact or logic) if any there be, and prove those errors exist (by presenting the requisite evidence or analysis, and showing that they actually correctly understand the argument they are rebutting and representing it honestly and accurately), or state that they found none (in which case this consensus would be affirming the challenge is correct, and thus the consensus should change).
Of course, a challenger should have the opportunity to expose any errors or dishonesty in the consensus case against them, and other experts should have the opportunity to join the counted consensus group (by meeting condition (b) in the manner above) so as to replicate or challenge their findings. But this is what constructive dialogue in an expert community is for. Eventually, it will become clear to all non-delusional participants that the challenger can’t meet the objections, or the consensus must change. And this does not entail a black-or-white result. The change to the consensus may simply be to admit that the truth about P cannot be presently known, contrary to the previous consensus which assumed P was true. And this can be accomplished by a challenger arguing P is false.
When the number of experts in a field entering the counted group in this manner becomes very large (e.g. at least twenty and ideally a hundred or even a thousand or more) and their collective opinion is substantially consistent (e.g. at least 95% agree) we will again have a strong argument from consensus. But we still have a valid argument from consensus when there is, say, 67% agreement among 10 experts (who have met condition (b), and no one else has). It will just be a weak one. But enough to warrant some degree of agnosticism among non-experts regarding the disputed claim P.
But Still Only Inductive
And yet for all that, it is still possible for the consensus to be incorrect. Indeed, this is still possible even when condition (a) is fully satisfied by hundreds of experts, the most ideal consensus normally achievable. But what we have at that point is a consensus that is increasingly less likely to be wrong. The strongest consensus argument has the lowest probability of being incorrect, while the weakest valid consensus argument has the highest probability of being incorrect, without that probability being so high as to nullify its value. Anything weaker than that is simply invalid.
In Bayesian terms, if we have confirmed that the consensus has achieved at least the minimum requirements to be of any value (per the examples above), then the probability of that consensus being wrong in its conclusion that P is true is simply the prior probability that P is false (the converse of which is the prior probability that P is true). Thus to overcome such a consensus, evidence must be presented that is so much more improbable on any theory other than the consensus being wrong that the prior probability against its being wrong is overcome (and overcome by enough to say it is more probable than not that the consensus is wrong).
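That update can be sketched in a few lines using the odds form of Bayes’ theorem. This is only a hypothetical illustration: the 5% prior and the likelihood ratios are made-up numbers, not estimates for any real dispute:

```python
def posterior_consensus_wrong(prior_wrong: float, likelihood_ratio: float) -> float:
    """Posterior probability that the consensus is wrong, given new evidence.

    prior_wrong: prior probability that the consensus conclusion P is false.
    likelihood_ratio: how many times more probable the new evidence is if
    the consensus is wrong than if it is right (> 1 favors the challenger).
    Uses the odds form of Bayes' theorem."""
    prior_odds = prior_wrong / (1 - prior_wrong)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Suppose a soundly established consensus leaves only a 5% prior that P is false.
# Evidence merely 2x more likely on the challenge is not nearly enough:
print(posterior_consensus_wrong(0.05, 2))   # ~0.095, still well under 50%
# But evidence 20x more likely on the challenge tips the balance:
print(posterior_consensus_wrong(0.05, 20))  # ~0.51, just over 50%
```

The design point is the one made in the text: the stronger (i.e. more soundly established) the consensus, the lower the prior that it is wrong, and so the more improbable the challenger’s evidence must be on the consensus being right before the challenge becomes the more probable conclusion.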
This is one reason why anyone challenging a consensus bears the burden of evidence. But note that this again only applies to a soundly established consensus, per the procedures outlined above. Sometimes a consensus of experts is not soundly established, as in Proving History I have shown is the case in Jesus studies. And merely polling experts does not generate a valid consensus in any case (except against specious challenges, i.e. those, as explained above, that don’t even merit an expert’s attention), because all experts won’t have met even condition (b) for any challenge to the consensus, and so their opinion on the matter cannot be reliable even though they are an expert.
But this applies quite broadly. For example, an expert can meet condition (b) even when a challenge is not examined by that expert, but depends on a fact that the expert independently confirms is false, sometimes even without knowing they are addressing a challenge. For example, a valid argument from consensus holds when a layperson checks a standard expert reference book in a field and finds that something a crank is saying is false–even though the expert opinion in that reference wasn’t addressing their challenge or even aware of it. Because an expert confirming a false premise in a challenge meets condition (b), and a standard reference will have been vetted by more than a handful of experts. As another example, often someone will approach me and ask about some crank theory or other, which I haven’t heard of before, and I’ll ask them what facts these cranks hang their case on. If those facts are demonstrably false, and I can show they are false, I don’t need to examine the case further. I have already met condition (b). (Unless the person who asked me about it isn’t correctly reporting what the challenger argued, but then they’ve failed to meet the minimum requirement of warranting an expert’s time and attention.)
Overall, arguments from consensus, if valid at all, only increase the probability of an examined claim P. They don’t guarantee P is true. But a strong argument from consensus can greatly increase the probability of P.
Rebutting a Consensus Response to a Challenge
Often a challenger doesn’t accept the consensus rejection of their challenge. Often the challenger is delusional. But let’s suppose we know the challenger’s work well enough to doubt that. How can a layperson evaluate the matter when we have some appreciable expert consensus meeting condition (b) in rejecting a challenge, but the challenger calls foul?
Here we return to checking the experts rejecting P even after meeting condition (b), to see if their rejection is based on logically sound argument: no relevant fallacies, no relevant falsehoods. A non-delusional challenger who cries foul will have identified relevant fallacies or falsehoods in the expert consensus rejection of P. If they don’t, even given the opportunity, then you should probably start changing your opinion about that challenger’s delusionality.
Reasons warranting the conclusion that a consensus rejection of P is invalid include identifying an obvious fallacy in the experts’ reasoning or the repeated assertion of false facts. The latter is especially damning: if the experts comprising a cited consensus keep citing fact X as a reason for their opinion, and fact X is demonstrably false, then that consensus is worthless. I wrote about a classic example of this in Christian apologetics, where a consensus that there was an empty tomb was based on the belief that women were not trusted as witnesses in antiquity (a factually false claim, as well as a fallacy, since the women aren’t cited as witnesses in any early Christian source: see Habermas and the Devious Trick). Another example I point out in Hitler Homer Bible Christ (n. 9 p. 342): many scholars cited as the consensus in favor of the Testimonium Flavianum in Josephus base their opinion on the claim that an Arabic fragment derives from an earlier text than was employed by Eusebius, not knowing that that has been proved to be false (it derives from Eusebius).
The opinions of such scholars are to be discarded. No matter how expert they may otherwise be, they cannot be counted toward a valid expert consensus on that matter.
Generally, when it is proved (with honest and accurate representations of their arguments) that their opinion (their conclusion) is invalid or unsound (i.e. based on fallacious logic or factually false claims, or both), then a consensus of experts has failed to satisfy the conditions required for a valid argument from consensus. Such a consensus is to be rejected.
Of course, showing an argument from consensus has no value does not establish the consensus is wrong. It only establishes that the existence of that consensus itself has no value for determining whether that consensus is true. The Argument from Fallacy is still a fallacy: showing that an argument from consensus is fallacious (because that consensus has no argumentative value) does not entail the challenge to that consensus is correct. It only eliminates one argument against that challenge. The challenger still bears the burden of showing that the challenge is also true.
A question that does arise, however, is what to conclude when a consensus of experts does not change even though it persists in arriving at its conclusion invalidly or unsoundly even after being shown that it is doing so. An expert community that behaves this way is discredited. Its opinions then cease to hold any evidential or argumentative value. This is why fundamentalist experts cannot be counted in any argument from consensus. That requires showing that fundamentalists do indeed persist in sticking by false or fallacious reasoning. But once you’ve done that, such experts should simply be bracketed out of consideration as pseudo-scholars, and only the remaining body of experts considered relevant when citing consensus.
Of course, a discredited body of experts will continue to deny that they have been discredited. Thus the burden will always remain on the outside observer to decide which is the case. Have those experts been discredited, or is the claim that they have been discredited baseless? This can be difficult, but is not beyond the ability of a non-expert, since the only expertise required is that of being able to evaluate the logical validity of either side’s arguments.
It is less common that both sides will continue to claim the facts are different from what the other side claims, but even when that happens, it reduces again to a problem in simple logic: examine on what basis one side claims the facts are X and on what basis the other side claims the facts are ~X. At some point you will be able to identify one side or the other is arriving at that claim through invalid logic–or else you will be able to personally verify one side or the other is incorrect (e.g. if a weatherman says it is raining outside and you can directly observe yourself that it is not). Thus actual expertise is not needed to vet the relative reliability of experts. Except expertise in reasoning, which everyone should endeavor to have.
Finally, it is important to note the logical significance of a divided consensus. It is almost never the case that an expert population agrees 100% on every issue (every single member of that expert community agreeing). Yet if there is disagreement, this calls into question the validity of that expert community’s claims to expertise (since if their methods and standards, by which they qualify as experts in the first place, are so unreliable that they cannot generate consistent results, then it can be questioned whether their expertise has any value in the matter at all). How can two experts, using the same methods on the same facts, get different results? There are several causal hypotheses with enough frequency in practice to be plausible enough to test:
- The disagreement is admitted by all sides to be unresolvable on present evidence (e.g. all experts agree that their disagreement is a matter of opinion that cannot be conclusively resolved on present evidence, and they are more stating what they feel to be most probable given their limited data). In this case, the disagreement is insignificant to the function of an argument from consensus, as long as such an argument is being used to establish the general point and not any disputed particulars.
- The disagreement is on minor nuances and not substantive matters (e.g. all experts might agree with a more general statement of the matter and only disagree on small details that are not conclusively provable on present evidence). In this case, the disagreement is insignificant to the function of an argument from consensus, as long as such an argument is being used to establish the general point and not any disputed particulars. (This is not the case if the disputed particulars are not minor, but in fact shouldn’t be in dispute if these experts are using valid methods. Then our situation is one of either of the following.)
- The disagreement is caused by failures to meet condition (a) or (b), as discussed above. In which case experts are diminishing the value of their authority by affirming opinions as proved that in fact they have not responsibly vetted (i.e. they have not satisfied conditions (a) or (b), which they ought to know invalidates the strength of their opinion in the matter). Such experts should be advised not to do that, and to responsibly vet their own opinions first. Their opinion cannot be cited as part of the consensus until they do that. Counting such opinions is the most common error in making an argument from consensus. It is too often simply assumed that every qualified expert will know all about P, because P falls under subject S and they are experts in subject S. That is almost never true. Unless you can demonstrate meeting at least condition (b), in at least the broadest sense (as illustrated above), then being an expert on S does not in fact make you an expert on P. All you can affirm as an expert in that case is that you’ve never heard of any reason to believe P is true (or false), and probably would have if it were. But that is an extremely weak argument, and should always be acknowledged as such.
- The disagreement is caused by subjective biases on one side or the other. In which case, it will be possible to identify which side is violating expert objectivity (by seeing which side most often errs on key facts or logic), and then bracket their opinions out of the pool of experts, no longer to be counted as relevant. Often that means the only valid consensus that remains is that of the other side. This is where we are now in the dispute between scientists and creationists.
That last rule is perhaps the most useful.
Whenever you see two bodies of experts disagreeing with each other, and you are not an expert in the same subject, first identify which of the four categories that disagreement falls under:
- noncommittal disagreement
- trivial disagreement
- uninformed disagreement
- biased disagreement
The first two are unimportant and you needn’t trouble yourself over them (experts who admit their disagreement is a matter of opinion, and experts who disagree over things you concede are trivial, are not disagreeing in any manner that poses a problem for the layperson). The second two are important, but of those, the first allows you to determine which side to trust by simply looking at which side has bothered to check the claim they are talking about (i.e. has met at least condition (b) with respect to any challenge to their opinion). If one of them hasn’t, but is just arguing from the armchair, while the other side has examined the best case against them, and appears to have answered it without self-evident fallacy or falsehood, then you know which side is most likely right (depending on how many experts are in their camp: a conclusion must be regarded as tentative until the number of experts conceding it is large).
And you can do all that without having to be an expert yourself.
Which leaves the last scenario: Where both sides appear to have at least tried to meet condition (b), are disagreeing about something that isn’t trivial, and are insisting it’s not an arguable matter of opinion. What you do then is try to test the credibility of both sides. Locate the genuine experts on either side (ignore amateurs taking up their banner; you only want to vet the qualified experts here) and check their references and diagram their logic, until you start finding mistakes. You must necessarily find some, because P cannot be both true and false: if two people disagree whether some claim P is true and are sure they are right, one of them must have made a mistake in their reasoning somewhere, either relying on a premise that is false, or on an argument that is fallacious.
Generally, eventually, you will find one side to be disproportionately more dishonest about the facts (citing bogus sources, or misrepresenting what those sources say or demonstrate, or not even citing sources for their claims at all, or any evidence you can independently verify) or illogical in its reasoning (and basic competency in detecting fallacies is all you need here, a competency everyone should have, or certainly labor to develop if they don’t).
Then you will know which side’s opinion you can safely discount.