Distribution of carries

Suppose you add two long numbers together. As you work your way toward the result, adding digits from right to left, sometimes you carry a 1 and sometimes you don’t. How often do you carry 1?

Now suppose you add three numbers at a time. Now your carry might be 0, 1, or 2. How often does each appear? How do things change if you’re not working in base 10?

John Holte goes into this problem in detail in [1]. We’ll only look at his first result here.

Suppose we’re adding m numbers together in base b, with each digit being uniformly distributed. Then at each step of the addition process, the probability distribution of the amount we carry out only depends on the amount we carried in from the previous step. In other words, the carries form a Markov chain!

This means that we can do more than describe the distribution of the amounts carried. We can specify the transition probabilities for carries, i.e. the distribution on the amount carried out, conditional on the amount carried in.

Examples

When we add two base ten numbers, m = 2 and b = 10, we have the transition matrix

\begin{pmatrix} 0.55 & 0.45 \\ 0.45 & 0.55 \end{pmatrix}

This means that on average, we’re as likely to carry 0 as to carry 1. But more specifically, there’s a 55% chance that we’ll carry a 1 if we carried a 1 on the previous step, and also a 55% chance that we won’t have a carry this time if we didn’t have one last time.

If we add three numbers, things get a little more interesting. Now the transition matrix is

\begin{pmatrix} 0.220 & 0.660 & 0.120 \\ 0.165 & 0.670 & 0.165 \\ 0.120 & 0.660 & 0.220 \end{pmatrix}

This gives the probabilities of each out carry for each in carry. We can compute the long run distribution from the matrix above to find that when adding three long numbers, we’ll carry 0 or 1 with probability 1/6 each, and carry 1 with probability 2/3.

When adding three binary numbers, the transition matrix is fairly different than for base 10

\begin{pmatrix} 0.500 & 0.500 & 0.000 \\ 0.125 & 0.750 & 0.125 \\ 0.000 & 0.500 & 0.500 \\ \end{pmatrix}

but the long run distributions are the same: carry a 1 two thirds of the time, and split the remaining probability equally between carrying 0 and 2.

The transition matrices don’t change much for larger bases; base 100 doesn’t look much different than base 10, for example.

Finally, let’s do an example adding five numbers in base 10. The transition matrix is

\begin{pmatrix} 0.020 & 0.305 & 0.533 & 0.140 & 0.003 \\ 0.013 & 0.259 & 0.548 & 0.176 & 0.005 \\ 0.008 & 0.216 & 0.553 & 0.216 & 0.008 \\ 0.005 & 0.176 & 0.548 & 0.260 & 0.013 \\ 0.003 & 0.140 & 0.533 & 0.305 & 0.020 \\ \end{pmatrix}

and the long run distribution is 1/60, 13/60, 33/60, 13/60, and 1/60 for carries of 0 through 4.

General case

In general, when adding m numbers in base b, Holte gives the transition probabilities as

\pi_{ij} = b^{-m} \sum_{r=0}^{j-\lfloor i/b\rfloor}(-1)^r {m+1\choose r}{m - i -1 + b(j-r+1) \choose m}

Here is some Python code to compute the transition matrices.

from scipy import floor, zeros, array
from scipy.special import comb
from numpy.linalg import matrix_power

# transition probability
def pi(i, j, b, m):
    s = 0
    for r in range(0, j - int(floor(i/b)) + 1):
        s += (-1)**r * comb(m+1, r) * comb(m-i-1 +b*(j-r+1), m)
    return s/b**m

def transition_matrix(b, m):
    P =  [ [pi(i, j, b, m) for j in range(m)]
            for i in range(m) ]
    return array(P)

You can find the long run probabilities by raising the transition matrix to a large power.

Related posts

[1] John M. Holte. The American Mathematical Monthly, Vol. 104, No. 2 (Feb., 1997), pp. 138-149

3 thoughts on “Distribution of carries

  1. I wonder how does this changes when the numbers come from numerical data following the Newcomb-Benford’s law of anomalous numbers, instead of having digits uniformly distributed.

  2. Jakub, here’s my guess. Benford’s law says that digits are roughly uniform in log-space. So here’s a simulation in IDL (you probably never heard of it, but it’s what I use) that does the uniform distribution to recreate the 45/55 matrix and then a log-space distribution. Since small digits are more highly represented in log-space, the probability of carries is smaller. My transition matrix for two numbers is

    .84 .76
    .15 .23

    Here’s the full output of my code and the code itself.

    Uniform
    0.0998725 0.0995840 0.100013 0.100205 0.100090 0.100240 0.0999095 0.0999435 0.0999900 0.100153
    transition matrix
    0.549245 0.449910
    0.450755 0.550090
    Log
    0.278547 0.168131 0.121233 0.0945430 0.0775195 0.0659935 0.0570550 0.0507760 0.0453455 0.0408560
    transition matrix
    0.841960 0.769524
    0.158040 0.230476

    pro markov_carry
    n=1000000

    print, ‘Uniform’

    ;get two sets of uniform distribution of n digits
    digits=fix(randomu(seed,n,2)*10)

    ;validate that the digits are uniform
    print, histogram(digits)/(2.0*n)

    ;create an empty transition matrix
    transition=lonarr(2,2)

    ;loop through each digit and see if the result is greater than 10
    ;and add to the appropriate part of the transition matrix
    prev_carry=0
    for i=0,n-1 do begin
    carry=(digits[i,0]+digits[i,1]+prev_carry) ge 10
    transition[prev_carry,carry]++
    prev_carry=carry
    end

    ;print the normalized transition matrix
    print, ‘transition matrix’
    print, transition/rebin(total(transition,2),2,2)

    ;now we do the same thing except we pick our digits to
    ;be uniform in log10 space

    print, ‘Log’

    digits=fix((10^randomu(seed,n,2)-1)*10/9)

    print, histogram(digits)/(2.0*n)

    transition=lonarr(2,2)

    prev_carry=0
    for i=0,n-1 do begin
    carry=(digits[i,0]+digits[i,1]+prev_carry) ge 10
    transition[prev_carry,carry]++
    prev_carry=carry
    end

    print, ‘transition matrix’
    print, transition/rebin(total(transition,1),2,2)

    end

  3. John,

    Interesting stuff; but I suspect a typo in the 100 number in the following text;

    “The transition matrices don’t change much for larger bases; base 100 doesn’t look much different than base 100, for example.”

Comments are closed.