Poisson distribution and prime numbers

Let ω(n) be the number of distinct prime factors of x. A theorem of Landau says that for N large, then for randomly selected positive integers less than N, ω-1 has a Poisson(log log N) distribution. This statement holds in the limit as N goes to infinity.

Apparently N has to be extremely large before the result approximately holds. I ran an experiment for N = 10,000,000 and the fit is not great.

Here’s the data:

|----+----------+-----------|
|    | Observed | Predicted |
|----+----------+-----------|
|  1 |   665134 |  620420.6 |
|  2 |  2536837 | 1724733.8 |
|  3 |  3642766 | 2397330.6 |
|  4 |  2389433 | 2221480.4 |
|  5 |   691209 | 1543897.1 |
|  6 |    72902 |  858389.0 |
|  7 |     1716 |  397712.0 |
|  8 |        1 |  157945.2 |
|  9 |        0 |   54884.8 |
| 10 |        0 |   16952.9 |
|----+----------+-----------|

And here’s a plot:

bar chart of actual vs predicted

I found the above theorem in these notes. It’s possible I’ve misunderstood something or there’s an error in the notes. I haven’t found a more formal reference on the theorem yet.

Update: According to the results in the comments below, it seems that the notes I cited may be wrong. The notes say “distinct prime factors”, i.e. ω(n), while numerical results suggest they meant to say Ω(n), the number of prime factors counted with multiplicity. I verified the numbers given below. Here’s a plot.

 

Here’s the Python code I used to generate the table. (To match the code for the revised graph, replace omega which calculated ω(n) with the SymPy function primeomega which calculates Ω(n).)

from sympy import factorint
from numpy import empty, log, log2
from scipy.stats import poisson

N = 10000000

def omega(n):
    return len(factorint(n))

table = empty(N, int)
table[0] = table[1] = 0
for n in range(2, N):
    table[n] = omega(n)

# upper bound on the largest value of omega(n) for n < N.
u = int(log2(N))

for n in range(1, u+1):
    print(n, len(table[table==n]), poisson(log(log(N))).pmf(n-1)*N)

Related posts

2 thoughts on “Poisson distribution and prime numbers

  1. It was interesting so I started playing around with it in R. I decided that the distribution didn’t look much like the log(log(N)) Poisson distribution and the two means weren’t even close.

    So I decided to drop the unique part of the prime factors and it feels* much better. (My math aint good enough to know if I am right).
    Here is the inefficient R code
    (Two simulations – one using the average of the original data and one using the log(log(N)) mean)
    ~~~~~~~~~~~~
    library(gmp)
    N<- 10000001
    pfc <-matrix(0,N-1,1)

    for (i in 2:N) {
    # pfc[i-1]<-length(unique(factorize(i)))
    pfc[i-1]<-length(factorize(i))
    }

    meanpfc<-sum(pfc-1)/(N-1)

    ppfc<-rpois(N-1,meanpfc)+1
    ppfc2<-rpois(N-1,log(log(N-1)))+1

    u<-max(ppfc,pfc,ppfc2)
    fppfc2<-factor(ppfc2,levels=1:u)
    fppfc<-factor(ppfc,levels=1:u)
    fpfc<-factor(pfc,levels=1:u)

    obs<-table(fpfc)
    pred<-table(fppfc)
    predlln<-table(fppfc2)
    cbind(obs,pred, predlln)
    ~~~~~~~~~~~~~~~~~~
    Table of data
    obs pred predlln
    1 664579 617455 620119
    2 1904325 1716905 1722713
    3 2444359 2394090 2395522
    4 2050696 2223963 2223467
    5 1349779 1546252 1545203
    6 774078 863598 858951
    7 409849 399329 397826
    8 207207 159158 157759
    9 101787 55695 55203
    10 49163 17139 16926
    11 23448 4815 4786
    12 11068 1257 1179
    13 5210 266 276
    14 2406 64 58
    15 1124 9 11
    16 510 4 1
    17 233 1 0
    18 102 0 0
    19 45 0 0
    20 21 0 0
    21 7 0 0
    22 3 0 0
    23 1 0 0

    ~~~~~~~~~~
    The average of the original data (minus 1) is 2.7861251. The average of the simulated Poisson distribution with the previous mean is 2.7854423 (results will vary!) and log(log(10000000)) =2.779943

    It's still odd in the right hand tail but much better in the left hand tail.

Comments are closed.