Statistical computing often involves working with ratios of factorials. These factorials are often too big to fit in a floating point number, and so we work with logarithms. So if we need to compute log(a! / b!), we call software that calculates log(a!) and log(b!) directly without computing a! or b! first. More on that here. But sometimes this doesn’t work.
Suppose a = 10^40 and b = a – 10^10. Our first problem is that we cannot even compute b directly. Since a – b is 30 orders of magnitude smaller than a, we’d need about 100 bits of precision to begin to tell a and b apart, about twice as much as a standard floating point number. (Why 100? That’s the log base 2 of 10^30. And how much precision does a floating point number have? See here.)
Our next problem is that even if we could accurately compute b, the log gamma function is going to be approximately the same for a and b, and so the difference will be highly inaccurate.
So what do we do? We could use some sort of extended precision package, but that is not necessary. There’s an elegant solution using ordinary precision.
Let f(x) = log(Γ(x)). The mean value theorem says that
f(a+1) – f(b+1) = (a–b) f‘(c+1)
for some c between a+1 and b+1. We don’t need to compute b, only a – b, which we know is 10^10. The derivative of the log of the gamma function is called the digamma function, written ψ(x). So
log(a! / b!) = 10^10 ψ(c+1).
But what is c? The mean value theorem only says that the equation above is true for some c. What c do we use? It doesn’t matter. Since a and b are so relatively close, the digamma function will take on essentially the same value for any value between a+1 and b+1. Therefore
log(a! / b!) ≈ 10^10 ψ(a).
We could compute this in two lines of Python:
from scipy.special import digamma
4 thoughts on “Computing log gamma differences”
“the difference between a and b is 30 orders of magnitude”
Shouldn’t that be something more like: “the difference between a and b is 30 orders of magnitude smaller than a (b is the same order of magnitude–40–as a)”?
Thanks. I reworded that part to say “Since a – b is 30 orders of magnitude smaller than a, …”
It’s also worth noting that this is the same as using a first-order taylor expansion. Your comment that c does not matter is equivalent to the second-derivative of log-gamma at a or b being much smaller than (a-b)^2 (bounding the error term).
Using the digamma function in this case can be “expensive”. In your example, since b = a – x, with x = 10^10, we can approximate the factorial ratio by
a!/(a-x)! ~ (a-x)^x = a^x * (1-x/a)^x ~ a^x * (1 – x^2 / a)
since x/a is small. In fact, 1 – x^2 /a ~ 1 since x^2 / a is still very small, so
log(a!/(a-x)!) ~ x * log(a) = 10^10 * log(a)
which is somehow obvious in hindsight since the digamma function can be roughly approximated by the (natural) log function… :)