Suppose you add two long numbers together. As you work your way toward the result, adding digits from right to left, sometimes you carry a 1 and sometimes you don’t. How often do you carry 1?
Now suppose you add three numbers at a time. Now your carry might be 0, 1, or 2. How often does each appear? How do things change if you’re not working in base 10?
John Holte goes into this problem in detail in [1]. We’ll only look at his first result here.
Suppose we’re adding m numbers together in base b, with each digit being uniformly distributed. Then at each step of the addition process, the probability distribution of the amount we carry out only depends on the amount we carried in from the previous step. In other words, the carries form a Markov chain!
This means that we can do more than describe the distribution of the amounts carried. We can specify the transition probabilities for carries, i.e. the distribution on the amount carried out, conditional on the amount carried in.
Examples
When we add two base ten numbers, m = 2 and b = 10, we have the transition matrix
This means that on average, we’re as likely to carry 0 as to carry 1. But more specifically, there’s a 55% chance that we’ll carry a 1 if we carried a 1 on the previous step, and also a 55% chance that we won’t have a carry this time if we didn’t have one last time.
If we add three numbers, things get a little more interesting. Now the transition matrix is
This gives the probabilities of each out carry for each in carry. We can compute the long run distribution from the matrix above to find that when adding three long numbers, we’ll carry 0 or 1 with probability 1/6 each, and carry 1 with probability 2/3.
When adding three binary numbers, the transition matrix is fairly different than for base 10
but the long run distributions are the same: carry a 1 two thirds of the time, and split the remaining probability equally between carrying 0 and 2.
The transition matrices don’t change much for larger bases; base 100 doesn’t look much different than base 10, for example.
Finally, let’s do an example adding five numbers in base 10. The transition matrix is
and the long run distribution is 1/60, 13/60, 33/60, 13/60, and 1/60 for carries of 0 through 4.
General case
In general, when adding m numbers in base b, Holte gives the transition probabilities as
Here is some Python code to compute the transition matrices.
from scipy import floor, zeros, array from scipy.special import comb from numpy.linalg import matrix_power # transition probability def pi(i, j, b, m): s = 0 for r in range(0, j - int(floor(i/b)) + 1): s += (-1)**r * comb(m+1, r) * comb(m-i-1 +b*(j-r+1), m) return s/b**m def transition_matrix(b, m): P = [ [pi(i, j, b, m) for j in range(m)] for i in range(m) ] return array(P)
You can find the long run probabilities by raising the transition matrix to a large power.
Related posts
[1] John M. Holte. The American Mathematical Monthly, Vol. 104, No. 2 (Feb., 1997), pp. 138-149
I wonder how does this changes when the numbers come from numerical data following the Newcomb-Benford’s law of anomalous numbers, instead of having digits uniformly distributed.
Jakub, here’s my guess. Benford’s law says that digits are roughly uniform in log-space. So here’s a simulation in IDL (you probably never heard of it, but it’s what I use) that does the uniform distribution to recreate the 45/55 matrix and then a log-space distribution. Since small digits are more highly represented in log-space, the probability of carries is smaller. My transition matrix for two numbers is
.84 .76
.15 .23
Here’s the full output of my code and the code itself.
Uniform
0.0998725 0.0995840 0.100013 0.100205 0.100090 0.100240 0.0999095 0.0999435 0.0999900 0.100153
transition matrix
0.549245 0.449910
0.450755 0.550090
Log
0.278547 0.168131 0.121233 0.0945430 0.0775195 0.0659935 0.0570550 0.0507760 0.0453455 0.0408560
transition matrix
0.841960 0.769524
0.158040 0.230476
pro markov_carry
n=1000000
print, ‘Uniform’
;get two sets of uniform distribution of n digits
digits=fix(randomu(seed,n,2)*10)
;validate that the digits are uniform
print, histogram(digits)/(2.0*n)
;create an empty transition matrix
transition=lonarr(2,2)
;loop through each digit and see if the result is greater than 10
;and add to the appropriate part of the transition matrix
prev_carry=0
for i=0,n-1 do begin
carry=(digits[i,0]+digits[i,1]+prev_carry) ge 10
transition[prev_carry,carry]++
prev_carry=carry
end
;print the normalized transition matrix
print, ‘transition matrix’
print, transition/rebin(total(transition,2),2,2)
;now we do the same thing except we pick our digits to
;be uniform in log10 space
print, ‘Log’
digits=fix((10^randomu(seed,n,2)-1)*10/9)
print, histogram(digits)/(2.0*n)
transition=lonarr(2,2)
prev_carry=0
for i=0,n-1 do begin
carry=(digits[i,0]+digits[i,1]+prev_carry) ge 10
transition[prev_carry,carry]++
prev_carry=carry
end
print, ‘transition matrix’
print, transition/rebin(total(transition,1),2,2)
end
John,
Interesting stuff; but I suspect a typo in the 100 number in the following text;
“The transition matrices don’t change much for larger bases; base 100 doesn’t look much different than base 100, for example.”