Estimating the proportion of smooth numbers

river rocks

A number is said to be “smooth” if all its prime factors are small. To make this precise, a number is said to be y-smooth if it only has prime factors less than or equal to y. So, for example, 1000 is 5-smooth.

The de Bruijn function φ(x, y) counts the number of y-smooth positive integers up to x, and so φ(x, y)/x is the probability that a number between 1 and x is y-smooth.

It turns out that

φ(x, x1/u)/x ≈ 1/uu.

This means, for example, that the proportion of numbers with less than 100 digits whose factors all have less than 20 digits is roughly 1/55 = 1/3125.

Source: The Joy of Factoring

Here’s a little Python code to experiment with this approximation. We’ll look at the proportion of 96-bit numbers whose largest prime factor has at most 32 bits.

    from secrets import randbits
    from sympy.ntheory.factor_ import smoothness

    smooth = 0
    for _ in range(100):
        x = randbits(96)
        s = smoothness(x)[0]
        if s < 2**32:
            smooth += 1
    print("Num small:", smooth)

The SymPy function smoothness returns a pair, and the first element of the pair is the largest prime dividing its argument.

In our case u = 3, and so we’d expect about 1/27 of our numbers to have no factors larger than 32 bits. I ran this five times, and got 8, 2, 5, 3, and 7. From the approximation above we’d expect results around 4, so our results are not surprising.