Average digit sum

Suppose you write down a number and take the sum of its digits. In what base will this sum be the smallest on average?

Let’s do a couple examples comparing base 10 and base 2. The number 2022 in base 10 has digit sum 6, but its binary equivalent 11111100110 has digit sum 8, so the base 10 representation has the smaller digit sum. But if we take 1024 in base 10, the digit sum is 7, but 1024 = 2¹⁰ and so the sum of its binary digits is just 1.

In [1] the author proves that for sufficiently large N, the average digit sum for all positive integers less than N is the least in base 2. Moreover, the author shows that as N goes to infinity, the average digit sum of numbers less than N written in radix r is asymptotically equal to

(r – 1) log N / 2 log r.

This is an increasing function of r and so not only does base 2 have the minimum average digit sum, the average digit sum increases with the size of the base.

Let’s see how this works out with N = 1,000,000 and r ranging from 2 to 10 using the following Mathematica code.

    Table[
        Sum[ Total[IntegerDigits[n, r]], 
            {n, 0, 1000000}] /1000000., 
        {r, 2, 10}
    ]

This returns

    {9.885, 12.336, 14.8271, 16.5625, 18.5806, 
     20.5883, 22.2383, 24.1978, 27.}

Now let’s see what values the asymptotic formula predicted.

    Table[(r - 1) Log[1000000.]/(2 Log[r]), {r, 2, 10}]

This returns

    {9.96578, 12.5754, 14.9487, 17.1681, 19.2765, 
     21.2993, 23.2535, 25.1508, 27.}

So for each r the asymptotic formula produces a good approximation to the actual result.

[1] L. E. Bush. Asymptotic Formula for the Average Sum of the Digits of Integers. The American Mathematical Monthly, Mar., 1940, Vol. 47, No. 3, pp. 154-156