Connecting probability and number theory

E. Kowalski has a post this morning called Finding life beyond the Central Limit Theorem. The post mentions a theorem I’ve never heard of that connects basic concepts in probability and number theory.

Surely the Central Limit Theorem and the normal distribution are important in probability. You might even call them “central.” And counting prime divisors of integers is a natural thing to do in number theory. It turns out these two concepts are closely related.

Pick a big number N. Now select a random integer n between 1 and N. Count the number of distinct prime divisors of n and call that ω(n). Roughly speaking, ω(n) acts like a normal random variable with mean and variance log log N.

There are plenty of other connections between probability and number theory. I took probability from Jeffrey Vaaler, a number theorist. I remember three informal remarks he made to suggest connections between number theory and probability.

  1. Being relatively prime is analogous to being independent. Suppose p and q are two positive integers with no factors in common. Now pick numbers at random between 1 and pq. Being divisible by p is independent of being divisible by q.
  2. There’s a heuristic in number theory that prime numbers behave randomly except when there’s a good reason that they don’t.
  3. Some things are so hard to find that the best way to look for them is to search at random. Often an effective way to prove that something exists is to show that the probability of selecting the thing you’re looking for from some larger set is positive.

7 thoughts on “Connecting probability and number theory

  1. I wrote a MATLAB script to test that log log N fact. It seems to fit if my computer allows me to take prime factors of N values larger than 2^32 :)


    clc,clear

    N = 2^32;
    nTests = 5000;

    counts = zeros(nTests,1);
    for iTest = 1:nTests
    n = randi(N);
    counts(iTest) = length(unique(factor(n)));
    end

    x = 0:0.01:max(counts);
    mu1 = log(log(N));
    lambda1 = log(log(N));

    y = normpdf(x,mu1,sqrt(lambda1));

    xCounts = 0:max(counts);
    nElements = histc(counts,xCounts);
    nElements = nElements ./ sum(nElements);
    bar(xCounts,nElements,'BarWidth',1)

    hold on
    plot(x,y,'r', 'LineWidth',2)
    hold off

    mu2 = mean(counts);
    lambda2 = var(counts);

  2. Being relatively prime is analogous to being independent. Aren’t these two events mutually exclusive and hence dependent ?

  3. The two primes p and q above are different so being one or the other are mutually exclusive states. If a number is 3, it cannot be 5, for example. But being divisible by one or the other is not mutually exclusive. Numbers can be divisible by 3 and by 5. And that’s where the analogy with independence comes in. If you pick numbers at random from the set 1 … N where N is divisible by 15, then being divisible by 3 is independent of being divisible by 5.

  4. But between 1 and p*q no integer is divisible by both p and q ? Since p and q are relative primes. Is there something that escapes me here?

  5. Sorry for the late reply, but yes, got it, ‘between’ includes the boundaries and everything makes sense, indeed. Thanks

Comments are closed.