When I worked at MD Anderson Cancer Center, we spent a lot of compute cycles evaluating the function g(a, b, c, d), defined as the probability that a sample from a beta(a, b) random variable is larger than a sample from a beta(c, d) random variable. This function was often in the inner loop of simulations that ran for hours or even days.
I developed ways to evaluate this function more efficiently because it was a bottleneck. Along the way I found a new symmetry. W. R. Thompson had studied what I call the function g back in 1933 and reported two symmetries:
g(a, b, c, d) = 1 − g(c, d, a, b)
and
g(a, b, c, d) = g(d, c, b, a).
I found that
g(a, b, c, d) = g(d, b, c, a)
as well. See a proof here.
You can conclude from these rules that
- g(a, b, c, d) = g(d, c, b, a) = g(d, b, c, a) = g(a, c, b, d)
- g(a, b, c, d) = 1 − g(c, d, a, b)
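The symmetries above are easy to spot-check numerically. The following is a minimal sketch, not the efficient method alluded to earlier: it approximates g with a plain midpoint rule applied to the integral of the beta(a, b) density times the beta(c, d) CDF. The helper name `beta_pdf`, the grid size `n`, and the test parameters are choices made for this illustration.

```python
import math

def beta_pdf(x, a, b):
    """Density of a beta(a, b) random variable at 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log1p(-x))

def g(a, b, c, d, n=4000):
    """Approximate P(X > Y) for X ~ beta(a, b), Y ~ beta(c, d).

    Uses the midpoint rule on n cells for the integral of
    pdf_X(x) * cdf_Y(x) over (0, 1).
    """
    h = 1.0 / n
    cdf_y = 0.0   # approximate P(Y < left edge of the current cell)
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        mass_y = beta_pdf(x, c, d) * h   # approximate mass of Y in this cell
        # CDF of Y at the midpoint: earlier cells plus half of this cell.
        total += beta_pdf(x, a, b) * h * (cdf_y + 0.5 * mass_y)
        cdf_y += mass_y
    return total

a, b, c, d = 2, 3, 4, 5   # arbitrary test parameters
base = g(a, b, c, d)
print(base, g(d, c, b, a), g(d, b, c, a), g(a, c, b, d))
print(base, 1 - g(c, d, a, b))
```

The four values on the first printed line should agree up to quadrature error, as should the two on the second line. This is only a numerical check for one set of parameters, not a proof.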
I was just looking at a book that mentioned the symmetries of the cross ratio, which I will denote
r(a, b, c, d) = (a − c)(b − d) / [(b − c)(a − d)].
Here is Theorem 4.2 from [1] written in my notation.
Let a, b, c, d be four points on a projective line with cross ratio r(a, b, c, d) = λ. Then we have
- r(a, b, c, d) = r(b, a, d, c) = r(c, d, a, b) = r(d, c, b, a).
- r(a, b, d, c) = 1/λ
- r(a, c, b, d) = 1 − λ
- the values for the remaining permutations are consequences of these three basic rules.
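The three basic rules can be checked with exact rational arithmetic. This is a small sketch rather than anything from [1]: the function name `r` follows the notation above, and the four points are arbitrary distinct integers chosen for illustration.

```python
from fractions import Fraction

def r(a, b, c, d):
    """Cross ratio (a - c)(b - d) / ((b - c)(a - d)), computed exactly."""
    return Fraction((a - c) * (b - d), (b - c) * (a - d))

a, b, c, d = 0, 1, 3, 7   # arbitrary distinct points
lam = r(a, b, c, d)

# Rule 1: invariance under these three permutations.
print(lam == r(b, a, d, c) == r(c, d, a, b) == r(d, c, b, a))
# Rule 2: swapping the last two arguments inverts the cross ratio.
print(r(a, b, d, c) == 1 / lam)
# Rule 3: swapping the middle two arguments gives 1 - lambda.
print(r(a, c, b, d) == 1 - lam)
```

Using `Fraction` avoids floating point round-off, so the comparisons are exact equalities.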
This looks awfully familiar. Rules 1 and 3 for cross ratios correspond to rules 1 and 2 for beta inequalities, though not in the same order. Both g and r are invariant under reversing their arguments, but are otherwise invariant under different permutations of the arguments.
Both g and r take on 6 distinct values, taking on each 4 times. I feel like there is some deeper connection here but I can’t see it. Maybe I’ll come back to this later when I have the time to explore it. If you see something, please leave a comment.
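For the cross ratio, the count of six values taken four times each can be confirmed by brute force over all 24 permutations. The sketch below again uses exact arithmetic; the sample points are an arbitrary choice, and special configurations of points can make some of the six values coincide.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def r(a, b, c, d):
    """Cross ratio (a - c)(b - d) / ((b - c)(a - d)), computed exactly."""
    return Fraction((a - c) * (b - d), (b - c) * (a - d))

points = (0, 1, 3, 7)   # arbitrary distinct points in general position
counts = Counter(r(*p) for p in permutations(points))

# Print each distinct value with the number of permutations producing it.
for value, count in sorted(counts.items()):
    print(value, count)
```

The same tally could be run for g by rounding numerically computed values before counting, at the cost of choosing a tolerance.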
There is no rule for beta inequalities analogous to rule 2 for cross ratios, at least not that I know of. I don’t know of any connection between g(a, b, c, d) and g(a, b, d, c).
Update: There cannot be a function h such that g(a, b, d, c) = h(g(a, b, c, d)), because I found parameters that lead to the same value of g(a, b, c, d) but different values of g(a, b, d, c). If there is a relation between g(a, b, c, d) and g(a, b, d, c), it must involve the parameters and not just the value of g.
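One family of parameters that illustrates this (not necessarily the parameters referred to in the update) comes from setting the second pair equal to the first. The symmetry g(a, b, c, d) = 1 − g(c, d, a, b) forces g(a, b, a, b) = 1/2 for every a and b, while g(a, b, b, a) still depends on a and b. A beta(2, 1) density is 2x and a beta(1, 2) CDF is 2x − x², so:

```latex
\begin{align*}
g(1, 1, 1, 1) &= \tfrac{1}{2}, & g(1, 1, 1, 1) &= \tfrac{1}{2}, \\
g(2, 1, 2, 1) &= \tfrac{1}{2}, & g(2, 1, 1, 2) &= \int_0^1 2x \,(2x - x^2)\, dx
  = \tfrac{4}{3} - \tfrac{1}{2} = \tfrac{5}{6}.
\end{align*}
```

The left column gives the same value of g(a, b, c, d) for both parameter sets, while the right column gives two different values of g(a, b, d, c), so no function of 1/2 alone can produce both.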
[1] Jürgen Richter-Gebert. Perspectives on Projective Geometry: A Guided Tour Through Real and Complex Geometry. Springer 2011.
I would be interested to learn more about the efficient ways you’ve developed to compute g(a,b,c,d). The following link shows the approach I came up with, cast slightly differently in terms of the number of successes/trials of observed binomial proportions; it is linear in the (minimum of the two) number of trials, i.e., effectively min(a+b,c+d). I have struggled to find any way to improve on this.
https://gist.github.com/possibly-wrong/4a823f9acc65b49c4f6037573a115df2
I’ve written several papers on inequality probabilities for beta distributions and other distributions. You can find these by scrolling through my publications: https://www.johndcook.com/blog/articles/