The Context

This is now the third post in this series (see part 1, part 2, and, next, part 4). Here the focus shifts to the issue of finding Pr(H|E) in Biblical Studies. When in Biblical Studies is Pr(H|E) found more easily or more reliably by a method other than Bayes’ Theorem?

The Issue


Christoph writes, in a FB comment on Laura Hunt’s excellent lecture (sorry outsiders, it’s a closed group ;) ):

“I took a look again at my Hidden Criticism (open access if you are interested) and on pp. 43-44 I actually touch on this issue but present it in a misleading way. I was trying to explain there why it is difficult to come up with statistical values for priors in biblical studies. What I should have ACTUALLY done there is perhaps to explain why it is so difficult to have direct access to posteriors. In any case, what I wrote back then seems quite imprecise to me now, to say the least.”

So what’s Christoph talking about here? On pages 43-44 he’s describing how we might use Bayes to confirm theories. Suppose the theory is that a certain text contains a non-obvious critique of the Roman Empire. He uses the medical analogy, just like we did in the previous post, and says the first thing to do is find the prior probability. Just as a random person in a population could have a 1 in 1,000 chance of having a disease, so a random letter in the ancient world could have a 1 in X chance of containing a Roman critique.

He then illustrates the reference class problem. Should the X here include all writings in the ancient world, or all writings among the Jewish population, or perhaps all letters from early Christians? This choice of class is actually quite important. If we include all writings in the ancient world, we have to include all the receipts, logs, bills, legal renderings, etc. If all these are included, then any random text we pick up might only have (to pick a random number) a 1 in 10,000 chance of being critical of Rome. If we instead use Jewish texts from the time of Roman rule, there might be a 1 in 4 chance. This vast difference could easily determine the outcome of a Bayesian analysis. After the particular evidence (E) is considered, Pr(H|E) in the first case could still be extremely low, while in the second Pr(H|E) could be quite high.

Here I should also emphasize that such a problem is not unique to Biblical Studies. In the medical context the same problem arises. Suppose we are trying to find Pr(H|E) for a person in a small village who tests positive. Say 1 in 10,000 people in the world have a disease, but 1 in 4 people in the village have it. Suppose our likelihoods are Pr(E+|H+)=.75 and Pr(E+|H-)=.1. In this case changing the prior from .0001 to .25 changes the probability that a person with a positive test result actually has the disease from .075% to 71.43%, an enormous range. Keep in mind that in this case both numbers are true and well-known. The choice isn’t between known or unknown priors, or between one prior that is true versus another that is false. Both options are true and known, and Bayes itself doesn’t tell you how to choose between them. Some other principle must be at play here.
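The two figures above can be checked with a few lines of Python. This is only a sketch of the standard two-hypothesis form of Bayes' Theorem; the function and variable names are my own, and the inputs are the priors and likelihoods from the example.

```python
def posterior(prior, p_pos_given_h, p_pos_given_not_h):
    """Pr(H+|E+) by Bayes' Theorem for a yes/no hypothesis."""
    true_pos = p_pos_given_h * prior                # Pr(E+|H+) * Pr(H+)
    false_pos = p_pos_given_not_h * (1.0 - prior)   # Pr(E+|H-) * Pr(H-)
    return true_pos / (true_pos + false_pos)

# Likelihoods from the example above.
P_POS_GIVEN_H = 0.75
P_POS_GIVEN_NOT_H = 0.10

# Same test, same person, two equally "true" reference classes.
world = posterior(0.0001, P_POS_GIVEN_H, P_POS_GIVEN_NOT_H)
village = posterior(0.25, P_POS_GIVEN_H, P_POS_GIVEN_NOT_H)

print(f"world prior:   {world:.3%}")    # 0.075%
print(f"village prior: {village:.2%}")  # 71.43%
```

Nothing in the calculation itself favors one prior over the other; the function happily accepts either.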

Now in the book Christoph suggests that we use a series of nested conditions to help clarify issues. This is an interesting and helpful idea in its own right, but it’s not clear how, practically, one should go about using this to determine which prior to use. Deciding on priors is a perennial problem, so this is not to knock Christoph for not having provided a definitive answer. However, it does mean that when he says “What I should have ACTUALLY done there is perhaps to explain why it is so difficult to have direct access to posteriors” we should note that the issue is (I think) not that the problem of priors has been solved, but that the difficulty in jumping straight to the posterior, Pr(H|E), has been ignored. Basically, it can be true both that the prior remains problematic and that it is less (perhaps much less) problematic than trying to find Pr(H|E) a different way.

So how are we to evaluate this? How do we know if we have better epistemological access to Pr(H) or Pr(H|E)? (This is a bit too simple, but for now it will do.) Now there is no clear E in the text here (fitting, since Christoph was talking about priors) but we can add one. And since it is Christoph, it seems fitting to use θριαμβεύειν as our datum (full disclosure: as I haven’t read much of either his 2017 or 2022, this is a hypothetical example, not Christoph’s position). In our hypothetical case, H is the hypothesis that the text critiques the Roman Empire via a subtext and E is the datum that θριαμβεύω is used in one of its grammatical forms.

The next thing to settle is how to determine which probability has better epistemological access. Here I’ll suggest that a useful heuristic is the general consensus of experts. If a consensus of experts can agree on an approximate value (or range of values, or a distribution of values) in case A but not in case B, then case A seems to be epistemologically clearer. Now there are lots of details we could get lost in here, but I’m just trying to get at the basic point: if the experts can agree in one case and say, “Yeah, that seems basically right” but not in another, that’s a sign of better epistemological access.

Pr(H)

In this case, we’d be asking experts if they’d be comfortable saying, “Yeah a text critiques the Roman Empire via a subtext about 1%, 5%, 50%, etc. of the time. That seems about right.” Now I am not an expert here, but I’ve asked this type of question before and in my experience, I’d say there’d be almost no agreement at all. Actually, I think that one would get enormous pushback to the question itself. The most common answer might well be “There’s no way to even answer that question!”

When we look at what Pr(H) means this makes sense. Pr(H) = critique texts (C) divided by the total number of texts (T), Pr(H) = C/T. The problem with coming up with a value for this is that we don’t have much of a grasp on either of these. The total number of texts is enormous and a bit hard to define. The number of critique texts is quite contested and also really hard to grasp. Hence our knowledge about both the numerator and denominator is so vague that even asking about the value Pr(H) feels like a trick question.

Pr(H|E)

On the other hand, Pr(H|E) is much better defined. Here Pr(H|E) = the number of texts that both have θριαμβεύειν and are Roman critiques (θ∩C) divided by the total number of texts with θριαμβεύειν (θ). Now texts with θριαμβεύειν are going to be much more limited than the total number of ancient texts. Christoph, using the TLG, suggests that there are around 100 uses (2017, 82). So the denominator is going to be known fairly reliably. But what about θ∩C?

Well again, there are always going to be debates, but since the set under consideration is much more circumscribed, the opportunities for consensus are larger. Some uses of θριαμβεύειν will obviously not fit H, some will obviously fit H, and then there will be the debated middle. As I said in the last post, where actual quantities are used, all you need to calculate Pr(H|E) is the total number of E (here the number of θριαμβεύειν texts) and the number of H∩E texts (here the number of θριαμβεύειν texts that are used as subversive critiques). Both of these are available here, at least to a greater degree than total texts and Roman critique texts. This would lead naturally to a greater level of agreement.
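To make the "debated middle" concrete, here is a minimal sketch of the direct calculation, Pr(H|E) = H∩E / E, that returns a range instead of a single value. The three tallies are purely hypothetical placeholders that I chose to sum to about 100; they are not counts from Christoph or the TLG, and the function name is my own.

```python
def direct_posterior_range(clear_yes, debated, clear_no):
    """Bounds on Pr(H|E) = |H∩E| / |E| when some uses are disputed."""
    total = clear_yes + debated + clear_no  # |E|: every text containing the word
    if total == 0:
        raise ValueError("need at least one text containing E")
    low = clear_yes / total               # no debated use counts toward H
    high = (clear_yes + debated) / total  # every debated use counts toward H
    return low, high

# HYPOTHETICAL tallies for illustration only.
low, high = direct_posterior_range(clear_yes=10, debated=20, clear_no=70)
print(f"Pr(H|E) lies between {low:.2f} and {high:.2f}")
```

The point of the range is that experts can disagree about the middle cases and still agree on the bounds, which is the kind of consensus the heuristic asks for.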

More agreement makes perfect sense because Pr(C|θ) concerns only a subset of the texts that Pr(C) concerns. Remember, Pr(C|θ) = θ∩C / θ while Pr(C) = C/T; consequently, Pr(C|θ) pertains only to those texts that have θριαμβεύειν in them, and the only critiques it counts are the subset of C that has θ. Anytime only a subset of the evidence is examined there is a greater likelihood of agreement: there is simply less stuff to disagree about. As a side note, it’s also much easier to do research on a small subset of data like θριαμβεύειν than on the entire corpus of ancient literature around Rome. This makes it much easier to actually come to an informed opinion about the subject.

Pr(E|H)

There is of course more to a Bayesian analysis than simply looking at Pr(H). There is the issue of likelihoods, Pr(E|H) [and often Pr(E|~H)]. These must also be known to do a Bayesian analysis. So how does Pr(E|H) compare to Pr(H|E) with regard to epistemological access? In the above example Pr(E|H) is Pr(θ|C), the probability of finding θριαμβεύειν assuming we are looking at critiques. To find this, we count the texts that have both a critique and θριαμβεύειν and divide by the total number of critiques: θ∩C / C. This is very similar to Pr(H|E), which, as we discussed above, is Pr(C|θ) = θ∩C / θ. The only difference between the two is the denominator; the numerators are the same.

Since the numerators are the same, we have the same epistemological access to both. The question then becomes: to which denominator, C or θ, do we have better access? The answer seems clearly to be θ. To find it, all we have to do is find the instances where θ is used. For C, however, we have to make a judgement about which texts have an embedded Roman critique. To use our heuristic about expert agreement, surely it’s true that it’s easier to get expert consensus on how often θριαμβεύειν is found than on how often Roman critiques are found. This would mean that Pr(H|E) is easier to find than Pr(E|H).
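The relationship between the two quantities can be written out in counts. This is a sketch in notation I am adding here, with |X| for the number of texts in set X and T for the set of all texts:

```latex
\begin{align*}
\Pr(C \mid \theta) &= \frac{|\theta \cap C|}{|\theta|}, \qquad
\Pr(\theta \mid C) = \frac{|\theta \cap C|}{|C|} \\[4pt]
\Pr(C \mid \theta) &= \frac{\Pr(\theta \mid C)\,\Pr(C)}{\Pr(\theta)}
 = \frac{\dfrac{|\theta \cap C|}{|C|}\cdot\dfrac{|C|}{|T|}}{\dfrac{|\theta|}{|T|}}
 = \frac{|\theta \cap C|}{|\theta|}
\end{align*}
```

Written this way, the route through Bayes' Theorem brings in |C| and |T| only to cancel them again, and those are exactly the two counts that are hardest to pin down.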

Conclusion


If all this is correct, then at least in this case study from Biblical Studies, we have better epistemological access to Pr(H|E) than to either Pr(H) or Pr(E|H). Consequently, the more reliable way to find Pr(H|E) would be to use conditional probability directly rather than Bayes’ Theorem. In the next post, I’ll discuss the problem of finding Pr(H|E) in the context of historical backgrounds in the New Testament.