Aliquot ratio distribution

The previous post looked at repeatedly applying the function s(n) which is the sum of the divisors of n less than n. It is an open question whether the sequence

s( s( s( … s(n) … ) ) )

always converges or enters a loop. In fact, it’s an open question of whether the sequence starting with n = 276 converges.

A heuristic argument for why the sequence ought to converge, at least much of the time, is that s(n) is usually less than n. This can be made precise as explained here: in the limit as N goes to infinity, the proportion of numbers less than N for which s(n) < n is at least 0.752.

Even though applying s usually leads to a smaller result, conceivably it could lead to a much larger result when it increases, as with the hailstone problem.

I made some histograms of the ratio s(n) / n to get a feel for how much s increases or decreases its argument. (I imagine this has all been done before, but I haven’t seen it.)

Here’s my first attempt, for n up to 1000.

Histogram of aliquot ratios

So apparently the ratio is often less than 1.

I thought it would be clearer to work on a log scale and center the graph at 0 so I next made the following graph.

Histogram of log aliquot ratios

This shows that the aliquot function s never increases its argument by much, and it often reduces its argument quite a bit, at least for numbers less than 1000.

Next I made a histogram with numbers less than 1,000,000 to see whether the pattern continued. Here’s what I got.

Histogram of log aliquot ratios for numbers less than a million

The most obvious feature is the huge spike. If you can ignore that, mentally cropping the image to just the bottom part, it looks basically like the previous graph.

What’s up with that spike? It’s in the bin containing log(0.5). In other words, the ratio s(n)/n is often near 0.5.

So I thought maybe there’s some number theoretic reason the ratio is often exactly 1/2. That’s not the case. It’s only exactly 1/2 once, when n = 2, but it’s often near 1/2.

In any case, applying s very often returns a smaller number than you started with, so it’s plausible that the iterated aliquot sequence would often converge. But empirically it looks like some sequences, such as the one starting at n = 276, diverge. They somehow weave through all the numbers for which the sequence would come tumbling down.

The iterated aliquot problem

Let s(n) be the sum of the proper divisors of n, the divisors less than n itself. A number n is called excessive, deficient, or perfect depending on whether s(n) – n is positive, negative, or 0 respectively, as I wrote about a few days ago.

The number s(n) is called the aliquot sum of n. The iterated aliquot problem asks whether repeated iterations of s always converge to a perfect number or to 0. This problem has something of the flavor of the Collatz conjecture, a.k.a. the hailstone problem.

As mentioned here, about 1/4 of all integers are excessive and about 3/4 are deficient. That is, s(n) is smaller than n for about 3 out of 4 integers. Based on just this, it seems plausible that iterated aliquots would often converge. (See the next post for a visualization of how much the function s increases or decreases its argument.)

We can investigate the iterated aliquot problem with the following Python code.

    from sympy import divisors
    def s(n):
        if n == 0:
            return 0
        return sum(divisors(n)) - n
    def perfect(n):
        return n == s(n)
    def iterate(n):
        i = 0
        while True:
            if perfect(n) or n == 0:
                return i
            n = s(n)
            i += 1

The function iterate counts how many iterations it takes for repeated applications of s to converge.

For n = 1 through 24, the sequence of iterates converges to 0, the only exception being the perfect number 6.

When n = 25 the sequence converges, but it converges to 6 rather than 0. That is, 25 is the smallest number which is not perfect but for which the sequence terminates in a perfect number.

When n = 138, the sequence converges, but it takes 178 iterations to do so. For smaller n the sequence converges much sooner.

When n = 220, we see something new. s(220) = 284, and s(284) = 220. That is, our code is stuck in an infinite loop.

So we need to go back and refine our question: does the sequence of iterations always converge to 0, converge to a perfect number, or enter a loop?

We have to modify the Python code above to look for loops;

    def iterate(n):
        i = 0
        seen = set()
        while True:
            if perfect(n) or n == 0 or n in seen:
                return i
            n = s(n)
            i += 1

At least for n < 276 the sequence converges.

When I tried 276, the iterations got much larger and slower. I let it run for 400 iterations and the last iteration was


This took 150 seconds. I switched over to Mathematica and it calculated the same result in 0.6 seconds.

Then I asked Mathematica to run 600 iterations and that took 64 seconds. I went up to 700 iterations and that took 6133 seconds. So it seems that each set of 100 iterations takes about 10 times longer than the previous one.

The last result I got was


Maybe the iterations turn around eventually, but so far it looks like they’ve reached escape velocity. The growth isn’t monotonic. It goes up and down, but more often goes up.

The problem of whether the sequence always converges is unsolved. I don’t know whether it’s known whether the particular sequence starting at 276 converges.

Update: See the first comment. It’s an open problem whether the sequence starting at 276 converges.

Excessive, deficient, and perfect numbers

I learned recently that weekly excess deaths in the US have dipped into negative territory [1], and wondered whether we should start speaking of deficient deaths by analogy with excessive and deficient numbers.

The ancient Greeks defined a number n to be excessive, deficient, or perfect according to whether the sum of the number’s proper divisors was greater than, less than, or equal to n. “Proper” divisor means that n itself isn’t included. Excessive numbers are also called “abundant” numbers.

For example, 12 is excessive because

1 + 2 + 3 + 4 + 6 = 16 > 12,

8 is deficient because

1 + 2 + 4 < = 7 < 8,

and 6 is perfect because

1 + 2 + 3 = 6.

Perfect numbers

Nearly all numbers are excessive or deficient; perfect numbers constitute a very thin middle ground [2]. There are 51 known perfect numbers, all even so far. Euclid proved that if M is a Mersenne prime, then n = M(M+1)/2 is even and perfect. Euler proved the converse: if n is an even perfect number, n = M(M + 1)/2 for some Mersenne prime M.

No one has proved that odd perfect numbers don’t exist, but mathematicians keep proving more and more properties that an odd perfect number must have, if such a number exists. Maybe some day this list will include an impossibility, proving that there are no odd perfect numbers. Currently we know that if an odd perfect number exists, it must have more 1500 digits, its largest prime factor must have more than 8 digits, etc.

Density of excessive numbers

A little less than 1 in 4 numbers are excessive. More precisely, the proportion of excessive numbers less than N approaches a limiting value as N goes to infinity, and that limit is known to be between 0.2474 and 0.2480. See [3]. This means that a little more than 1 in 4 numbers is deficient.

Perfect number density

I said that perfect numbers form a thin middle ground between excessive and deficient numbers. We only know of 51 perfect numbers, but there may be infinitely many of them. Even so, they are thinly distributed, with zero density in the limit. So the density of deficient numbers is 1 minus the density of excessive numbers.

There is a conjecture that there are infinitely many perfect numbers, but that they are extremely rare. As stated above, n is a perfect prime if and only if n = M(M+1)/2 for a Mersenne prime M. A Mersenne prime is a prime number of the form 2p – 1 where p is prime. The number of such exponents p less than a given bound x is conjectured to be asymptotically proportional to

eγ log x / log 2

and the following graph gives empirical support for the conjecture. More on this conjecture here.

Predicted vs actual Mersenne primes

If the p‘s are thinly spread out, then the Mersenne primes, 2 raised to the power of these p‘s, are much more thinly spread out, and so even perfect numbers are very thinly spread out.

Related posts

[1] Based on CDC data

[2] Some sources define a number n to be excessive if the sum of its proper divisors is greater than or equal to n, meaning that by this definition perfect numbers are included in excessive numbers.

[3] Marc Deléglise Bounds for the density of abundant numbers

Close but no cigar

The following equation is almost true.

\sum_{n=1}^\infty \frac{\lfloor n \tanh \pi\rfloor}{10^n} = \frac{1}{81}

And by almost true, I mean correct to well over 200 decimal places. This sum comes from [1]. Here I will show why the two sides are very nearly equal and why they’re not exactly equal.

Let’s explore the numerator of the sum with a little code.

    >>> from math import tanh, pi
    >>> for n in range(1, 11): print(n*tanh(pi))

When we take the floor (the integer part [2]) of the numbers above, the pattern seems to be

n tanh π⌋ = n – 1

If the pattern continues, our sum would be 1/81. To see this, multiply the series by 100, evaluate the equation below at x = 1/10, and divide by 100.

\begin{align*} \sum_{n=1}^\infty n x^{n-1} &= \sum_{n=0}^\infty \frac{d}{dx} x^n \\ &= \frac{d}{dx} \sum_{n=0}^\infty x^n \\ &= \frac{d}{dx} \frac{1}{1-x} \\ &= \frac{1}{(1-x)^2} \end{align*}

Our sum is close to 1/81, but not exactly equal to it, because

n tanh π⌋ = n – 1

holds for a lot of n‘s but not for all n.

Note that

tanh π = 0.996… = 1 – 0.00372…

and so

n tanh π⌋ = n – 1

will hold as long as n < 1/0.00372… = 268.2…


⌊268 tanh π⌋ = 268-1


⌊269 tanh π⌋ = 269-2.

So the 269th term on the left side

\sum_{n=1}^\infty \frac{\lfloor n \tanh \pi\rfloor}{10^n} = \frac{1}{81}

is less than the 269th term of the sum

10-2 + 2×10-3 + 3×10-4 + … = 1/81

for the right side.

We can compare the decimal expansions of both sides by using the Mathematica command

    N[Sum[Floor[n Tanh[Pi]]/10^n, {n, 1, 300}], 300]

This shows the following:

\begin{align*} \frac{1}{81} &= 0.\underbrace{012345679}_{\text{repeat forever}} \ldots\\ \sum_{n=1}^\infty \frac{\lfloor n \tanh \pi\rfloor}{10^n} &= 0.\underbrace{012345679}_{\text{repeat 29 times}} \ldots 0123456\framebox{6}\ldots \end{align*}

Related posts

[1] J. M. Borwein and P. B. Borwein. Strange Series and High Precision Fraud. The American Mathematical Monthly, Vol. 99, No. 7, pp. 622-640

[2] The floor of a real number x is the greatest integer ≤ x. For positive x, this is the integer part of x, but not for negative x.

Divisibility by any prime

Here is a general approach to determining whether a number is divisible by a prime. I’ll start with a couple examples before I state the general rule. This method is documented in [1].

First example: Is 2759 divisible by 31?

Yes, because

\begin{align*} \begin{vmatrix} 275 & 9 \\ 3 & 1 \end{vmatrix} &= 248 \\ \begin{vmatrix} 24 & 8 \\ 3 & 1 \end{vmatrix} &= 0 \end{align*}

and 0 is divisible by 31.

Is 75273 divisible by 61? No, because

\begin{align*} \begin{vmatrix} 7527 & 3 \\ 6 & 1 \end{vmatrix} &= 7509 \\ \begin{vmatrix} 750 & 9\\ 6 & 1 \end{vmatrix} &= 696\\ \begin{vmatrix} 69 & 6\\ 6 & 1 \end{vmatrix} &= 33 \end{align*}

and 33 is not divisible by 61.

What in the world is going on?

Let p be an odd prime and n a number we want to test for divisibility by p. Write n as 10a + b where b is a single digit. Then there is a number k, depending on p, such that n is divisible by p if and only if

\begin{align*} \begin{vmatrix} a & b \\ k & 1 \end{vmatrix} \end{align*}

is divisible by p.

So how do we find k?

  • If p ends in 1, we can take k = ⌊p / 10⌋.
  • If p ends in 3, we can take k = ⌊7p / 10⌋.
  • If p ends in 7, we can take k = ⌊3p / 10⌋.
  • If p ends in 9, we can take k = ⌊9p / 10⌋.

Here ⌊x⌋ means the floor of x, the largest integer no greater than x. Divisibility by even primes and primes ending in 5 is left as an exercise for the reader. The rule takes more effort to carry out when k is larger, but this rule generally takes less time than long division by p.

One final example. Suppose we want to test divisibility by 37. Since 37*3 = 111, k = 11.

Let’s test whether 3293 is divisible by 37.

329 – 11×3 = 296

29 – 11×6 = -37

and so yes, 3293 is divisible by 37.

[1] R. A. Watson. Tests for Divisibility. The Mathematical Gazette, Vol. 87, No. 510 (Nov., 2003), pp. 493-494

Sums of consecutive reciprocals

The sum of the reciprocals of consecutive integers is never an integer. That is, for all positive integers m and n with n > m, the sum

\sum_{k=m}^n \frac{1}{k}

is never an integer. This was proved by József Kürschák in 1908.

This means that the harmonic numbers defined by

H_n = \sum{k=1}^n \frac{1}{k}

are never integers for n > 1. The harmonic series diverges, so the sequence of harmonic numbers goes off to infinity, but it does so carefully avoiding all integers along the way.

Kürschák’s theorem says that not only are the harmonic numbers never integers, the difference of two distinct harmonic numbers is never an integer. That is, HnHm is never an integer unless m = n.

Related posts

Average sum of digits

The smaller the base you write numbers in, the smaller their digits sums will be on average.

This need not be true of any given number, only for numbers on average. For example, let n = 2021. In base 10, the sum of the digits is 5, but in base 2 the sum of the digits is 8 because n is written as 11111100101 in binary. So for our particular example, a smaller base lead to a larger digit sum. In fact, all smaller bases than 10 lead to a larger digit sum for 2021. But on average, numbers have smaller digit sums when written in binary than when written in base 10.

Let’s explore this with a little Python code, then look at a theoretical result.

First, we need some code to express numbers in different bases, and here’s a simple function for that.

    # From
    def numberToBase(n, b):
        if n == 0:
            return [0]
        digits = []
        while n:
            digits.append(int(n % b))
            n //= b
        return digits[::-1]

Using the code above, our digit sum function is trivial:

    def digitSum(n, b):
        return sum(numberToBase(n, b))

Now for our simulation.

    import numpy as np

    upper_bound = 1000000000
    sample_size = 1000
    bases = np.arange(2, 17)

    sample = np.random.randint(0, upper_bound, size = sample_size)
    out = [sum(digitSum(n, b) for n in sample)/sample_size for b in bases]

A theorem in [1] says that

\frac{S(b, N)}{N} \sim \frac{(r-1)\log N}{2\log b}

where S(b, N) is the sum of the number of digits of all numbers from 1 to N when written in base b. This value is asymptotically equal to the expression on the right.

We can compute the average number of digits we’d expect in our simulation using the asymptotic result as an approximation.

    asymp = [0.5*(b-1)*np.log(upper_bound) / np.log(b) for b in bases]

The table below compares the simulation results to theory.

    | base | simulation | asymptotic |
    |    2 |     14.899 |     14.949 |
    |    3 |     18.660 |     18.863 |
    |    4 |     22.317 |     22.423 |
    |    5 |     25.398 |     25.752 |
    |    6 |     28.345 |     28.915 |
    |    7 |     30.996 |     31.949 |
    |    8 |     34.839 |     34.880 |
    |    9 |     36.422 |     37.726 |
    |   10 |     40.275 |     40.500 |
    |   11 |     41.480 |     43.211 |
    |   12 |     44.353 |     45.868 |
    |   13 |     47.454 |     48.476 |
    |   14 |     49.949 |     51.041 |
    |   15 |     51.772 |     53.567 |
    |   16 |     53.610 |     56.058 |

[1] L. E. Bush. An Asymptotic Formula for the Average Sum of the Digits of Integers. The American Mathematical Monthly , Mar., 1940, Vol. 47, pp. 154-156.

A connected topology for the integers

You can define a topology on the positive integers by choosing as an open basis sets the series of the form an + b where a and b are relatively prime positive integers. Solomon Golumb defines this topology in [1] and proves that it is connected.

But that paper doesn’t prove that proposed basis really is a basis. That is implicitly an exercise for the reader, and we carry out that exercise here. The proof will say that certain things can be computed, and we’ll go a step further an actually compute them with Python.


To prove that these sets form the basis of a topology, we need to establish two things.

  1. Every point is contained in some basis element.
  2. The intersection of basis elements is another basis element or empty.

The first is easy. For any positive number x, the sequence (x+1)n + x contains x, and x+1 is relatively prime to x. Any other number relatively prime to x would work just as well.

The second is more interesting. Consider two sequences: an + b and cm + d. A number x in the intersection satisfies the pair of equations

x = b mod a
x = d mod c.

If a and c are relatively prime, the Chinese Remainder Theorem says that the equation has a solution y, and the solutions are unique mod ac. So the intersection of the two series is acn + y.

If a and c are not relatively prime, let r be their greatest common divisor. Then the intersection of an + b and cm + d is another sequence of the same form if b and d are congruent mod r and empty otherwise.

Separating points

A topological space is connected if it is not the union of two disjoint open sets. An easy way to make a space connected is to not have any disjoint open sets. But Golumb’s topology has plenty of disjoint open sets.

In fact, his topology is Hausdorff, meaning that any two points can be separated by disjoint open sets. The proof is simple. Take two points s and t. Let p be any prime bigger than both s and t, and consider the two sets pn + s and pn + t.

Python calculation

Our proof split into two cases: when the strides of the two series are relatively prime and when they are not.

Relatively prime strides

To get concrete, suppose we have the sequences

2, 9, 16, 23, …


3, 14, 25, 35, …

or in other words 7n + 2 and 11n + 3. According to the theory above, these sequences intersect because 7 and 11 are relatively prime, and the intersection should be of the form 77n + y, and we should be able to compute y from the Chinese Remainder Theorem. We will use the function crt from SymPy. (See another example of using this function here.)

    >>> from sympy.ntheory.modular import crt
    >>> crt([7,11], [2, 3], symmetric=False)
    >>> (58, 77)

This reports that y = 58.

Now let’s verify that the intersection of our two series looks like 77n + 58.

    >>> A = set(2+7*n for n in range(100))
    >>> B = set(3+11*n for n in range(100))
    >>> sorted(A.intersection(B))
    [58, 135, 212, 289, 366, 443, 520, 597, 674]

Strides with a common factor: empty intersection

Now for the case where a and c have greatest common divisor r > 1. Consider, for example,

1, 16, 31, 46, …


2, 14, 26, 38, …

The sequences 15n + 1 and 12m + 2 do not intersect. The greatest common divisor of 15 and 12 is 3. Numbers in the first series are congruent to 1 mod 3 and numbers in the second series are congruent to 2 mod 3, so no number can be in both. If we didn’t realize this, we could call

    crt([15,12], [1, 2], symmetric=False)

and see that it returns None.

Strides with a common factor: finding the intersection

But now let’s look at

1, 16, 31, 46, …


13, 25, 37, 49, …

Now both sequences are congruent to 1 mod 3, so it’s at least feasible that the two series may intersect. And in fact they do.

    >>> crt([15,12], [1, 13], symmetric=False)
    (1, 60)

This says that the set of solutions are all congruent to 1 mod 60. Now 1 itself is not a solution, but 61 is, and so is 60n + 1 for all positive integers n. We can illustrate this as before by computing the intersection of the two series.

    >>> A = set(1+15*n for n in range(30))
    >>> B = set(13+12*n for n in range(30))
    >>> sorted(A.intersection(B))
    [61, 121, 181, 241, 301, 361]

Related posts

[1] Solomon W. Golomb. A Connected Topology for the Integers. The American Mathematical Monthly , Oct., 1959, pp. 663-665

Factors of consecutive products

Pick a positive integer k and take the product of k consecutive integers greater than k. Then the result is divisible by a prime number greater than k. This theorem was first proved 128 years ago [1].

For example, suppose we pick k = 5. If we take the product


then it’s obviously divisible by a prime larger than 5 because 23 is one of the factors. But we could also take, for example,


Although the sequence {32, 33, 34, 35, 36} doesn’t contain any primes, it contains a couple numbers with prime factors larger than 5, namely 33 = 3*11 and 34 = 2*17. Sylvester’s theorem guarantees that there will always be one number out of 5 consecutive integers that is divisible by a prime greater than 5, provided the first number is itself greater than 5.

For one more example, we can pick k = 8 and look at


One of the numbers between 140 and 147 (inclusive) must be divisible by a prime greater than 8, and in fact five of them are, starting with 141 = 3*47.

In both of our examples there were multiple numbers that had a prime factor large enough to satisfy Sylvester’s theorem. Are there examples where only one number has a factor large enough? Yes, but not many.

For example, {8, 9} is a pair of consecutive integers, only one of which has a prime factor bigger than 2. And {8, 9, 10} is an example of 3 consecutive integers, only one of which has a prime factor larger than 3. I wrote a program to search for examples with k = 4 and didn’t find any.

In any case, a theorem dating back to 1897 [2] proves that there could only be finitely many examples for k > 2. Lehmer extends the results [2] in his paper [3].


[1] J. J. Sylvester, On arithmetical series, Messenger Math., 21 (1892) 1-19, and 87-120; Collected Mathematical Papers, 4 (1912) 687-731.

[2] G. St0rmer, Quelques theoremes sur l’equation de Pell x² – Dy² = ± 1 et leurs applications,
Skr. Norske Vid. Akad. Oslo, I no. 2 (1897).

[3] D. H. Lehmer, The Prime Factors of Consecutive Integers, The American Mathematical Monthly , Feb., 1965, Vol. 72, No. 2, Part 2, pp. 19-20.

Ruling out gaps in the list of Mersenne primes

The largest known primes are Mersenne primes, in part because the Lucas-Lehmer algorithm makes it more efficient to test Mersenne numbers for primality than to test arbitrary numbers. These numbers have the form 2n – 1.

Trend in Mersenne primes

The Great Internet Mersenne Prime Search (GIMPS) announced today that they have tested whether 2n – 1 is prime for all values of n less than 100,000,000 [1]. More specifically, they’ve tested each exponent at least once, though they’d like to test each exponent twice before giving up on it.

If we assume all of their initial calculations are correct, we can say, for example, that 282,589,933 – 1 is the 51st Mersenne prime. Before we had to say that it was the 51st known Mersenne prime because there could be a smaller, undiscovered Mersenne prime, in which case 282,589,933 – 1 might be the 52nd Mersenne prime. Mersenne primes have not always been discovered in order. For example, the 29th Mersenne prime was discovered after the 30th and 31st such numbers.

So for now, again assuming all the GIMPS calculations have been correct, we know all Mersenne primes less than 2100,000,000, and there are 51 of them.

Related posts

[1] There’s no need to test every value of n. We know that if 2n – 1 is prime, then n is prime. So we only need to check prime values of n.