An empirical look at the Goldbach conjecture

The Goldbach conjecture says that every even number bigger than 2 is the sum of two primes. I imagine Goldbach tried out his idea on numbers up to a certain point and guessed that the pattern would keep going. He lived in the 18th century, so he would have done all his calculations by hand. What might he have done if he could have written a Python program?

Let’s start with a list of primes, say the first 100 primes. The 100th prime is p = 541. If an even number less than p is the sum of two primes, both of those primes must themselves be less than p. So by looking at the sums of pairs of primes less than p, we’ll know whether the Goldbach conjecture is true for even numbers less than p. And while we’re at it, we could keep track not just of whether a number is the sum of two primes, but also of how many ways it is a sum of two primes.

    from sympy import prime
    from numpy import zeros
    
    N = 100
    p = prime(N)
    
    primes = [prime(i) for i in range(1, N+1)]
    sums = zeros(p, int)
    
    for i in range(N):
        # j >= i so we don't double count
        for j in range(i, N):
            s = primes[i] + primes[j]
            if s >= p:
                break
            sums[s] += 1
    
    # Take the even slots starting with 4
    evens = sums[4::2]
    
    print( min(evens), max(evens) )

This prints 1 and 32. The former means that every even number from 4 up to p was hit at least once, i.e. that every number under consideration is the sum of two primes. The latter means that at least one number less than p can be written as a sum of two primes in 32 different ways.
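As a sanity check on these counts, here is a minimal sketch that lists the representations of a single even number directly, using trial division instead of sympy. The helper names `is_prime` and `goldbach_pairs` are made up for this illustration and are not part of the program above.

```python
def is_prime(n):
    """Trial division; fine for the small numbers considered here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pairs(n):
    """Return all (a, b) with a <= b, a + b == n, and both a and b prime."""
    return [(a, n - a) for a in range(2, n // 2 + 1)
            if is_prime(a) and is_prime(n - a)]


# The six ways to write 100 as a sum of two primes
print(goldbach_pairs(100))

# Every even number from 4 to 540 should have at least one pair
print(all(goldbach_pairs(n) for n in range(4, 541, 2)))
```

The length of `goldbach_pairs(n)` should agree with `sums[n]` from the program above, since both count each unordered pair once.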

According to the Wikipedia article on the Goldbach conjecture, Nils Pipping manually verified the Goldbach conjecture for even numbers up to 100,000 in 1938, an amazing feat.

There are 9,592 primes less than 100,000 and so we would need to take N = 9592 in our program to reproduce Pipping’s result. This took about seven minutes.

Update: As suggested in the comments, nearly all of the time is being spent generating the list of primes. When I changed the line

    primes = [prime(i) for i in range(1, N+1)]

to

    from sympy import primerange
    primes = list(primerange(1, p + 1))  # upper bound is exclusive; p + 1 keeps all N primes

the runtime dropped from 7 minutes to 18 seconds.
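Since generating the primes was the bottleneck, another option is to sieve for them. The following is a rough sketch, not the code timed above: it swaps sympy and numpy for a plain-Python sieve of Eratosthenes, and the function names `primes_below` and `goldbach_counts` are hypothetical. I have not timed it against the figures quoted here.

```python
def primes_below(limit):
    """Sieve of Eratosthenes: all primes strictly less than limit."""
    if limit < 3:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for k in range(2, int(limit ** 0.5) + 1):
        if is_prime[k]:
            # cross off multiples of k, starting at k*k
            for m in range(k * k, limit, k):
                is_prime[m] = False
    return [k for k in range(limit) if is_prime[k]]


def goldbach_counts(limit):
    """counts[s] = number of ways to write s < limit as a sum of two primes."""
    primes = primes_below(limit)
    counts = [0] * limit
    for i in range(len(primes)):
        # j >= i so each unordered pair is counted once
        for j in range(i, len(primes)):
            s = primes[i] + primes[j]
            if s >= limit:
                break
            counts[s] += 1
    return counts


counts = goldbach_counts(541)
evens = counts[4::2]  # even slots starting with 4
print(min(evens), max(evens))
```

With `limit = 100_000` the sieve returns the 9,592 primes almost instantly, and the nested pure-Python loops will then likely dominate the running time.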

8 thoughts on “An empirical look at the Goldbach conjecture”

  1. “Every integer larger than 1 is an average of 2 primes” is preferable because odd integers are not avoided.

  2. [code]
    my @primes = (1..*).grep(*.is-prime);
    my %sums = (@primes[^100] X+ @primes[^100]).Bag;
    say %sums{4, 6 … @primes[99]}.minmax;
    [/code]

    That does double-count, though. I don’t really see the harm in that?

  3. What is Python doing for that to take 7 minutes?! My version, with LuaJIT, takes 2 seconds. 2 minutes for numbers up to 1,000,000.

    I suspect the answer comes from your use of prime(n). Looking at the documentation, that’s going to be very slow: it does a binary search to find m such that li(m) > n, then calculates pi(m-1), then finds the next prime n – pi(m-1) times.

    Try using the primerange generator in those loops instead. It should go a lot faster.

  4. My original code took 7 minutes, most of which I discovered later was generating the list of primes. My revised code took 18 seconds.

  5. I’ve just realised that John D. Cook signs himself “John” in comments. The one on the 20th was from me, and I am not him. Sorry if that caused any confusion.

    The list of counts provides good reason to believe that the Goldbach conjecture might be true. Have a look at the minimum and maximum from each block of 100,000 in the first million even numbers: (I know already that this isn’t going to format nicely)

    4 – 99998: min 1 max 4336
    100000 – 199998: min 1139 max 7862
    200000 – 299998: min 2058 max 10796
    300000 – 399998: min 2860 max 14188
    400000 – 499998: min 3633 max 16998
    500000 – 599998: min 4401 max 20736
    600000 – 699998: min 5126 max 24152
    700000 – 799998: min 5821 max 25368
    800000 – 899998: min 6548 max 29174
    900000 – 999998: min 7247 max 31188
    1000000 – 1099998: min 7925 max 34150
    1100000 – 1199998: min 8582 max 37216
    1200000 – 1299998: min 9276 max 40170
    1300000 – 1399998: min 9889 max 43186
    1400000 – 1499998: min 10551 max 45682
    1500000 – 1599998: min 11163 max 48088
    1600000 – 1699998: min 11834 max 49130
    1700000 – 1799998: min 12452 max 52492
    1800000 – 1899998: min 13075 max 55122
    1900000 – 1999998: min 13650 max 55976

    If the count is ever 0, the conjecture is false. But while the count varies quite a bit, the local minima are steadily increasing. For these small numbers, it looks promising.
