Exploring the sum-product conjecture

Quanta Magazine posted an article yesterday about the sum-product problem of Paul Erdős and Endre Szemerédi. This problem starts with a finite set of real numbers A, then considers the sizes of the sets A+A and A*A. That is, if we add every element of A to every element of A, how many distinct sums are there? If we take products instead, how many distinct products are there?

Proven results

Erdős and Szemerédi proved that there are constants c and ε > 0 such that

max{|A+A|, |A*A|} ≥ c|A|^(1+ε)

In other words, either A+A or A*A is substantially bigger than A. Erdős and Szemerédi only proved that some positive ε exists, but they suspected ε could be chosen close to 1, i.e. that either |A+A| or |A*A| is bounded below by a fixed multiple of |A|² or nearly so. George Shakan later showed that one can take ε to be any value less than

1/3 + 5/5277 = 0.3342899…

but the conjecture remains that the upper limit on ε is 1.

Python code

The following Python code will let you explore the sum-product conjecture empirically. It randomly selects sets of size N from the non-negative integers less than R, then computes the sum and product sets using set comprehensions.

    from numpy.random import choice

    def trial(R, N):
        # R = integer range, N = sample size
        x = choice(R, N, replace=False)
        s = {a+b for a in x for b in x}
        p = {a*b for a in x for b in x}
        return (len(s), len(p))

When I first tried this code I thought it had a bug. I called trial 10 times and got the same values for |A+A| and |A*A| every time. That was because I chose R large relative to N. In that case, it is likely that every sum and every product will be unique, aside from the redundancy from commutativity. That is, if R >> N, it is likely that |A+A| and |A*A| will both equal N(N+1)/2. Things get more interesting when N is closer to R.
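
For example, here’s a quick way to see both regimes, using the trial function above. The particular values of R and N below are arbitrary choices for illustration.

    N = 20
    print(N*(N+1)//2)       # 210, the count if all sums and products are distinct
    print(trial(10**6, N))  # R >> N: both counts will likely be 210
    print(trial(40, N))     # N close to R: collisions shrink both sets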

Probability vs combinatorics

The Erdős-Szemerédi problem is a problem in combinatorics, looking for deterministic lower bounds. But it seems natural to consider a probabilistic extension. Instead of asking about lower bounds on |A+A| and |A*A| you could ask for the distribution on |A+A| and |A*A| when the sets A are drawn from some probability distribution.

If the set A is drawn from a continuous distribution, then |A+A| and |A*A| both equal N(N+1)/2 with probability 1. Only careful choices, ones that would happen randomly with probability zero, could prevent the sums and products from being unique, modulo commutativity, as in the case R >> N above.

If the set A is an arithmetic sequence then |A+A| is small and |A*A| is large, and the opposite holds if A is a geometric sequence. So it might be interesting to look at the correlation of |A+A| and |A*A| when A comes from a discrete distribution, such as choosing N integers uniformly from [1, R] when N/R is not too small.
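
Here is a minimal sketch of such an experiment, reusing the trial function above (which samples from 0 to R − 1 rather than from [1, R]). The helper name sum_product_corr and the parameter values are arbitrary choices.

    import numpy as np

    def sum_product_corr(R, N, trials=1000):
        # Draw many random sets A, record |A+A| and |A*A| for each,
        # and return the correlation between the two counts.
        sums = np.empty(trials)
        prods = np.empty(trials)
        for i in range(trials):
            sums[i], prods[i] = trial(R, N)
        return np.corrcoef(sums, prods)[0, 1]

    print(sum_product_corr(200, 100))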

Projecting Unicode to ASCII

Sometimes you need to downgrade Unicode text to more restricted ASCII text. For example, while working on my previous post, I was surprised that there didn’t appear to be an asteroid named after Poincaré. There is one, but it was listed as Poincare in my list of asteroid names.

Python module

I used the Python module unidecode to convert names to ASCII before searching, and that fixed the problem. Here’s a small example showing how the code works.

    import unidecode

    for x in ["Poincaré", "Gödel"]:
        print(x, unidecode.unidecode(x))

This produces

    Poincaré Poincare
    Gödel Godel

Installing the unidecode module also installs a command line utility by the same name. So you could, for example, pipe text to that utility.

As someone pointed out on Hacker News, this isn’t so impressive for Western languages, but if you need to project Arabic, Russian, or Chinese, unidecode is close to black magic:

    >>> from unidecode import unidecode
    >>> unidecode("北亰")
    'Bei Jing '

(Someone has said in the comments that 北亰 is a typo and should be 北京. I can’t say whether this is right, but I can say that unidecode transliterates both to “Bei Jing.”)

Projections

I titled this post “Projecting Unicode to ASCII” because this code is a projection in the mathematical sense. A projection is a function P such that for all inputs x,

P(P(x)) = P(x).

That is, applying the function twice does the same thing as applying the function once. The name comes from projection in the colloquial sense, such as projecting a three dimensional object onto a two dimensional plane. An equivalent term is to say P is idempotent. [1]

The unidecode function maps the full range of Unicode characters into the range 0x00 to 0x7F, and if you apply it to a character already in that range, the function leaves it unchanged. So the function is a projection, or you could say the function is idempotent.
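
Here’s a quick check of the projection property on the examples above.

    from unidecode import unidecode

    for x in ["Poincaré", "Gödel", "北亰"]:
        once = unidecode(x)
        # Applying the function a second time changes nothing.
        assert unidecode(once) == once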

Projection is such a simple condition that it hardly seems worth giving it a name. And yet it is extremely useful. A general principle in user interface design is to make something a projection if the user expects it to be a projection. Users probably don’t have the vocabulary to say “I expected this to be a projection” but they’ll be frustrated if something is almost a projection but not quite.

For example, if software has a button to convert an image from color to grayscale, it would be surprising if (accidentally) clicking the button a second time had any effect. It would be unexpected if it returned the original color image, and it would be even more unexpected if it did something else, such as keeping the image in grayscale but lowering the resolution.


[1] The term “idempotent” may be used more generally than “projection,” the latter being more common in linear algebra. Some people may think of a projection as linear idempotent function. We’re not exactly doing linear algebra here, but people do think of portions of Unicode geometrically, speaking of “planes.”

Groups of order 2019

How many groups have 2019 elements? What are these groups?

2019 is a semiprime, i.e. the product of two primes, 3 and 673. For every semiprime s, there are either one or two distinct groups of order s.

As explained here, if s = pq with p > q, all groups of order s are isomorphic if q is not a factor of p − 1. If q does divide p − 1, then there are exactly two non-isomorphic groups of order s, one abelian and one non-abelian. In our case, 3 does divide 672, so there are two groups of order 2019. The first is easy: the cyclic group of order 2019. The second is more complex.

You could take the direct product of the cyclic groups of order 3 and 673, but that turns out to be isomorphic to the cyclic group of order 2019. But if you take the semidirect product you get the other group of order 2019, the non-abelian one.

Semidirect products

Starting with two groups G and H, the direct product G × H is the Cartesian product of G and H with multiplication defined componentwise. The semidirect product of G and H, written [1]

G \rtimes H

also starts with the Cartesian product of G and H but defines multiplication differently.

In our application, G will be the integers mod 673 with addition and H will be a three-element subgroup of the integers mod 673 with multiplication [2], namely the set {1, 255, 417} [3]. Note that 1 is its own inverse and 255 and 417 are inverses of each other.

Product

Now the group product of (g₁, h₁) and (g₂, h₂) is defined to be

(g₁ + h₁⁻¹g₂, h₁h₂)

So, for example, the product of (5, 255) and (334, 417) is (5 + 417*334, 255*417) which reduces to (645, 1) working mod 673.

(We haven’t defined the semidirect product in general, but the procedure above suffices to define the semidirect product for any two groups of prime order, and so it is sufficient to find all groups of semiprime order.)

Note that our group is non-abelian. For example, if we reverse the order of multiplication above we get (263, 1).

Identity

The identity element is just the pair consisting of the identity elements from G and H. In our case, this is (0, 1) because 0 is the additive identity and 1 is the multiplicative identity.

Inverse

The inverse of an element (g, h) is given by

(−gh, h⁻¹).

So, for example, the inverse of (600, 255) is (444, 417).

Python code

The goal of this post is to be concrete rather than general.

So to make everything perfectly explicit, we’ll write a little Python code implementing the product and inverse.

    
    def hinv(h):
        if h == 255:
            return 417
        if h == 417:
            return 255
        if h == 1:
            return 1
    
    def prod(x, y):
        g1, h1 = x
        g2, h2 = y
        g3 = (g1 + hinv(h1)*g2) % 673
        h3 = (h1*h2) % 673
        return (g3, h3)
    
    def group_inv(x):
        g, h = x
        return ((-g*h)%673, hinv(h))
    
    x = (5, 255)
    y = (334, 417)
    print(prod(x, y))
    print(prod(y, x))
    print(group_inv((600, 255)))

The following code verifies that our group satisfies the axioms of a group.

    from itertools import product
    
    identity = (0, 1)
    h_list = [1, 255, 417]
    
    def elem(x):
        g, h = x
        g_ok = (0 <= g <= 672)
        h_ok = (h in h_list)
        return (g_ok and h_ok)
    
    group = list(product(range(673), h_list))
    assert(len(group) == 2019)
    
    # closed under multiplication
    for x in group:
        for y in group:
            assert( elem(prod(x, y)) )
    
    # multiplication is associative (note: this triple loop runs 2019**3 times, so it is slow)
    for x in group:
        for y in group:
            for z in group:
                xy_z = prod(prod(x, y),z)
                x_yz = prod(x, prod(y,z))
                assert(xy_z == x_yz)
    
    # identity acts like it's supposed to
    for x in group:
        assert( prod(x, identity) == x )
        assert( prod(identity, x) == x )
    
    # every element has an inverse
    for x in group:
        ginv = group_inv(x)
        assert( elem(ginv) )
        assert( prod(x, ginv) == identity )
        assert( prod(ginv, x) == identity )


[1] The symbol for semidirect product is ⋊. It’s U+22CA in Unicode and \rtimes in LaTeX.

[2] In general, the semidirect product depends on a choice of an action of the group H on the group G. Here the action is multiplication by an element of H. Different actions can result in different groups. Sometimes the particular choice of action is made explicit as a subscript on the ⋊ symbol.

[3] How did I find these numbers? There are 672 non-zero numbers mod 673, so I picked a number, it happened to be 5, and raised it to the powers 672/3 and 2*672/3.
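
A quick Python check of this construction, following the recipe in the footnote:

    # Raise 5 to the powers 672/3 and 2*672/3 mod 673 and check that,
    # together with 1, this recovers H = {1, 255, 417}.
    a = pow(5, 672 // 3, 673)
    b = pow(5, 2 * 672 // 3, 673)
    assert {1, a, b} == {1, 255, 417}
    assert a * b % 673 == 1   # the two non-identity elements are inverses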


Check sums and error detection

The previous post looked at Crockford’s base 32 encoding, a minor variation on the way math conventionally represents base 32 numbers, with concessions for human use. By not using the letter O, for example, it avoids confusion with the digit 0.

Crockford recommends the following check sum procedure, a simple error detection code:

The check symbol encodes the number modulo 37, 37 being the least prime number greater than 32.

That is, we take the remainder when the base 32 number is divided by 37 and append the result to the original encoded number. The remainder could be larger than 31, so we need to expand our alphabet of symbols. Crockford recommends using the symbols *, ~, $, =, and U to represent remainders of 32, 33, 34, 35, and 36.

Crockford says his check sum will “detect wrong-symbol and transposed-symbol errors.” We will show that this is the case in the proof below.

Python example

Here’s a little Python code to demonstrate how the checksum works.

    from base32_crockford import encode, decode

    s = "H88CMK9BVJ1V"
    w = "H88CMK9BVJ1W" # wrong last char
    t = "H88CMK9BVJV1" # transposed last chars

    def append_checksum(s):
        return encode(decode(s), checksum=True)

    print(append_checksum(s))
    print(append_checksum(w))
    print(append_checksum(t))

This produces the following output.

    H88CMK9BVJ1VP
    H88CMK9BVJ1WQ
    H88CMK9BVJV1E

The checksum character of the original string is P. When the last character is changed, the checksum changes from P to Q. Similarly, transposing the last two characters changes the checksum from P to E.

The following code illustrates that the check sum can be a non-alphabetic character.

    s = "H88CMK9BVJ10"
    n = decode(s)
    r = n % 37
    print(r)
    print(encode(n, checksum=True))

This produces

    32
    H88CMK9BVJ10*

As we said above, a remainder of 32 is represented by *.

Proof

If you change one character in a base 32 number, its remainder by 37 will change as well, and so the check sum changes.

Specifically, if you change the nth digit from the right, counting from 0, by an amount k, then you change the number by k·32ⁿ, where 0 < |k| < 32. So k is not divisible by 37, nor is 32ⁿ, and because 37 is prime, k·32ⁿ is not divisible by 37 [1]. Hence the remainder mod 37 changes. The same argument holds if we replace 37 by any larger prime.

Now what about transpositions? If you swap consecutive digits a and b in a number, you also change the remainder by 37 (or any larger prime) and hence the check sum.

Again, let’s be specific. Suppose we transpose the nth and n+1st digits from the right, again counting from 0. Denote these digits by a and b respectively. Then swapping these two digits changes the number by an amount

(a·32ⁿ⁺¹ + b·32ⁿ) − (b·32ⁿ⁺¹ + a·32ⁿ) = 31(a − b)·32ⁿ

If a ≠ b, then a − b is a number between −31 and 31, but not 0, and so a − b is not divisible by 37. Neither is 31 divisible by 37, nor is any power of 32, so the transposition changes the remainder mod 37, i.e. changes the check sum. And as before, the same argument works if we replace 37 by any larger prime.
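
Here is a small, self-contained brute-force check of both claims, independent of the base32_crockford module. The range of positions checked is arbitrary.

    # No assertion fails: no single-digit change or adjacent transposition
    # leaves the remainder mod 37 unchanged.
    for n in range(8):
        for k in range(1, 32):          # change a digit by k (or -k)
            assert (k * 32**n) % 37 != 0
        for a in range(32):             # transpose adjacent digits a and b
            for b in range(32):
                if a != b:
                    assert (31 * (a - b) * 32**n) % 37 != 0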


[1] A prime p divides a product ab only if it divides a or it divides b. This isn’t true for composite numbers. For example, 6 divides 4*9 = 36, but 6 doesn’t divide 4 or 9.

Base 32 and base 64 encoding

Math has a conventional way to represent numbers in bases larger than 10, and software development has a couple variations on this theme that are only incidentally mathematical.

Math convention

By convention, math books typically represent numbers in bases larger than 10 by using letters as new digit symbols following 9. For example, base 16 would use 0, 1, 2, …, 9, A, B, C, D, E, and F as its “digits.” This works for bases up to 36; base 36 would use all the letters of the alphabet. There’s no firm convention for whether to use upper or lower case letters.

Base 64 encoding

The common use for base 64 encoding isn’t to represent bits as numbers per se, but to have an efficient way to transmit bits in a context that requires text characters.

There are around 100 possible characters on a keyboard, and 64 is the largest power of 2 less than 100 [1], and so base 64 is the most dense encoding using common characters in a base that is a power of 2.

Base 64 encoding does not follow the math convention of using the digits first and then adding more symbols; it’s free not to because there’s no intention of treating the output as numbers. Instead, the capital letters A through Z represent the numbers 0 through 25, the lower case letters a through z represent the numbers 26 through 51, and the digits 0 through 9 represent the numbers 52 through 61. The symbol + is used for 62 and / is used for 63.
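
Python’s standard base64 module uses this alphabet, which makes the convention easy to see at the extremes.

    import base64

    # Three zero bytes are 24 zero bits, i.e. four groups of 000000, each
    # encoding to 'A' (value 0). Three 0xFF bytes give four groups of
    # 111111, each encoding to '/' (value 63).
    print(base64.b64encode(b"\x00\x00\x00"))   # b'AAAA'
    print(base64.b64encode(b"\xff\xff\xff"))   # b'////'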

Crockford’s base 32 encoding

Douglas Crockford proposed an interesting form of base 32 encoding. His encoding mostly follows the math convention: 0, 1, 2, …, 9, A, B, …, except he does not use the letters I, L, O, and U. This eliminates the possibility of confusing i, I, or l with 1, or confusing O with 0. Crockford had one more letter he could eliminate, and he chose U in order to avoid an “accidental obscenity.” [2]

Crockford’s base 32 encoding is a compromise between efficiency and human legibility. It is more efficient than hexadecimal, representing 25% more bits per character. It’s less efficient than base 64, representing 17% fewer bits per character, but is more legible than base 64 encoding because it eliminates commonly confused characters.

His encoding is also case insensitive. He recommends using only capital letters for output, but permitting upper or lower case letters in input. This is in the spirit of Postel’s law, also known as the robustness principle:

Be conservative in what you send, and liberal in what you accept.

See the next post for an explanation of Crockford’s check sum proposal.

A password generator

Here’s a Python script to generate passwords using Crockford’s base 32 encoding.

    from secrets import randbits
    from base32_crockford import encode

    def gen_pwd(numbits):
        print(encode(randbits(numbits)))

For example, gen_pwd(60) would create a 12-character password with 60 bits of entropy, and this password would be free of commonly confused characters.


[1] We want to use powers of 2 because it’s easy to convert between base 2 and base 2ⁿ: start at the right end and convert bits in groups of n. For example, to convert a binary string to hexadecimal (base 2⁴ = 16), convert groups of four bits each to hexadecimal. So to convert the binary number 101111001 to hex, we break it into 1 0111 1001 and convert each piece to hex, with 1 -> 1, 0111 -> 7, and 1001 -> 9, to find 101111001 -> 179. If we used a base that is not a power of 2, the conversion would be more complicated and not so localized.

[2] All the words on George Carlin’s infamous list include either an I or a U, and so none can result from Crockford’s base 32 encoding. If one were willing to risk accidental obscenities, it would be good to put U back in and remove S since the latter resembles 5, particularly in some fonts.

RSA with one shared prime

The RSA encryption setup begins by finding two large prime numbers. These numbers are kept secret, but their product is made public. We discuss below just how difficult it is to recover two large primes from knowing their product.

Suppose two people share one prime. That is, one person chooses primes p and q and the other chooses p and r, and q ≠ r. (This has happened [1].) Then you can easily find p. All three primes can be obtained in less than a millisecond as illustrated by the Python script below.

In a way it would be better to share both your primes with someone else than to share just one. If both your primes are the same as someone else’s, you could read each other’s messages, but a third party attacker could not. If you’re lucky, the person you share primes with doesn’t know that you share primes or isn’t interested in reading your mail. But if that person’s private key is ever disclosed, so is yours.

Python code

Nearly all the time required to run the script is devoted to finding the primes, so we time just the factorization.

    from secrets import randbits
    from sympy import nextprime, gcd
    from timeit import default_timer as timer
    
    numbits = 2048
    
    p = nextprime(randbits(numbits))
    q = nextprime(randbits(numbits))
    r = nextprime(randbits(numbits))                                      
    
    m = p*q
    n = p*r
    
    t0 = timer()
    g = gcd(m, n)
    assert(p == g)
    assert(q == m//p)
    assert(r == n//p)
    t1 = timer()
    
    # Print time in milliseconds
    print(1000*(t1 - t0))

Python notes

There are a few noteworthy things about the Python code.

First, it uses a cryptographic random number generator, not the default generator, to create 2048-bit random numbers.

Second, it uses the portable default_timer which chooses the appropriate timer on different operating systems. Note that it returns wall clock time, not CPU time.

Finally, it uses integer division // to recover q and r. If you use the customary division operator Python will carry out integer division and attempt to cast the result to a float, resulting in the error message “OverflowError: integer division result too large for a float.”

GCD vs factoring

If you wanted to find the greatest common divisor of two small numbers by hand, you’d probably start by factoring the two numbers. But as Euclid knew, you don’t have to factor two numbers before you can find the greatest common factor of the numbers. For large numbers, the Euclidean algorithm is orders of magnitude faster than factoring.
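
For reference, here is the classic algorithm in a few lines of Python. In practice you would use math.gcd or SymPy’s gcd as above; euclid_gcd is just an illustrative name.

    def euclid_gcd(a, b):
        # Repeatedly replace (a, b) with (b, a mod b);
        # the last nonzero value of a is the gcd.
        while b:
            a, b = b, a % b
        return a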

Nobody has announced being able to factor a 1024 bit number yet. The numbers m and n above have four times as many bits. The largest of the RSA numbers factored so far has 768 bits. This was done in 2009 and took approximately two CPU millennia, i.e. around 2000 CPU years.

Fast Euclidean algorithm

The classic Euclidean algorithm takes O(n²) operations to find the greatest common divisor of two integers of length n. But there are modern sub-quadratic variations on the Euclidean algorithm that take

O(n log²(n) log(log(n)))

operations, which is much faster for very large numbers. I believe SymPy is using the classic Euclidean algorithm, but it could transparently switch to using the fast Euclidean algorithm for large inputs.


[1] As noted in the comments, this has happened due to faulty software implementation, but it shouldn’t happen by chance. By the prime number theorem, the number of n-bit primes is approximately 2ⁿ / (n log 2). If M people choose 2M primes each n bits long, the probability of two of these primes being the same is roughly

2²⁻ⁿ M n log 2.

If M = 7.5 billion (the current world population) and n = 1024, the probability of two primes being the same is about 10⁻²⁹⁵. This is roughly the probability of winning the Powerball jackpot 40 times in a row.

Simulating identification by zip code, sex, and birthdate

As mentioned in the previous post, Latanya Sweeney estimated that 87% of Americans can be identified by the combination of zip code, sex, and birth date. We’ll do a quick-and-dirty estimate and a simulation to show that this result is plausible. There’s no point being too realistic with a simulation because the actual data that Sweeney used is even more realistic. We’re just showing that her result is reasonable.

Quick estimate

Suppose average life expectancy is around 78 years. If everyone is between 0 and 78, then there are 78*365 possible birth dates and twice that many combinations of birth date and sex.

What’s the average population of a US zip code? We’ll use 9,330 for reasons explained in [1].

We have 56,940 possible birth date and sex combinations for 9,330 people. There have to be many unused birth date and sex combinations in a typical zip code, and it’s plausible that many combinations will only be used once. We’ll run a Python simulation to see just how many we’d expect to be used one time.

Python simulation

The array demo below will keep track of the possible demographic values, i.e. combinations of birth date and sex. We’ll loop over the population of the zip code, randomly assigning everyone to a demographic value, then see what proportion of people end up with a demographic value that no one else in the zip code shares.

    from random import randrange
    from numpy import zeros
    
    zctasize = 9330
    demosize = 365*78*2
    demo = zeros(demosize)
    
    for _ in range(zctasize):
        d = randrange(demosize)
        demo[d] += 1
    
    unique = len(demo[demo == 1])
    print(unique/zctasize)

I ran this simulation 10 times and got values ranging from 84.3% to 85.7%.

Analytic solution

As Road White points out in the comments, you can estimate the number of unique demographics by a probability calculation.

Suppose there are z inhabitants in our zip code and d demographic categories. We’re assuming here (and above) that all demographic categories are equally likely, even though that’s not true of birth dates.

We start by looking at a particular demographic category. The probability that exactly one person falls in that category is

\frac{z}{d}\left(1 - \frac{1}{d}\right)^{z-1}

To find the expected number of demographic slots with exactly one entry we multiply by d, and to get the proportion p of people this represents we divide by z.

\log p = (z-1)\log\left(1 - \frac{1}{d}\right) \approx - \frac{z}{d}

and so

p \approx \exp(-z/d)

which in our case is 84.8%, consistent with our simulation above.

Here we used the Taylor series approximation

\log(1 + x) = x + {\cal O}(x^2)

If z is of the same magnitude as d or smaller, then the error in the approximation above is O(1/d).

You could also work this as a Poisson distribution, then condition on the probability of a slot being occupied.

By the way, a similar calculation shows that the expected number of demographic slots containing two people is r exp(-r)/2 where r = z/d. So while 85% can be uniquely identified, another 7% can be narrowed down to two possibilities.
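
Plugging in the numbers used in the simulation:

    from math import exp

    z = 9330        # population of a typical zip code (ZCTA)
    d = 365*78*2    # number of demographic categories
    r = z/d

    print(exp(-r))       # approximate proportion uniquely identified
    print(r*exp(-r)/2)   # the quantity r exp(-r)/2 discussed above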


[1] I don’t know, but I do know the average population of a “zip code tabulation area” or ZCTA, and that’s 9,330 according to the data linked to here. As I discuss in that post, the Census Bureau reports population by ZCTA, not by zip code per se, for several reasons. Zip codes can cross state boundaries, they are continually adjusted, and some zip codes contain only post office boxes.

Sine of a googol

How do you evaluate the sine of a large number in floating point arithmetic? What does the result even mean?

Sine of a trillion

Let’s start by finding the sine of a trillion (10¹²) using floating point arithmetic. There are a couple ways to think about this. The floating point number t = 1.0e12 can only be represented to 15 or 16 significant decimal figures (to be precise, 53 bits [1]), and so you could think of it as a representative of the interval of numbers that all have the same floating point representation. Any result that is the sine of a number in that interval should be considered correct.

Another way to look at this is to say that t can be represented exactly—its binary representation requires only 40 bits and we have 53 bits of significand precision available—and so we expect sin(t) to return the sine of exactly one trillion, correct to full precision.

It turns out that IEEE arithmetic does the latter, computing sin(1e12) correctly to 16 digits.

Here’s the result in Python

    >>> sin(1.0e12)
    -0.6112387023768895

and verified by Mathematica by computing the value to 20 decimal places

    In:= N[Sin[10^12], 20]
    Out= -0.61123870237688949819

Range reduction

Note that the result above is not what you’d get if you were first to take the remainder when dividing by 2π and then taking the sine.

    >>> sin(1.0e12 % (2*pi))
    -0.6112078499756778

This makes sense. The result of dividing a trillion by the floating point representation of 2π, 159154943091.89536, is correct to full floating point precision. But most of that precision is in the integer part. The fractional part only has about five digits of precision, and so we should expect the result above to be correct to at most five digits. In fact it’s accurate to four digits.

When your processor computes sin(1e12) it does not naively take the remainder by 2π and then compute the sine. If it did, you’d get four significant figures rather than 16.

We started out by saying that there were two ways of looking at the problem, and according to the first one, returning only four significant figures would be quite reasonable. If we think of the value t as a measured quantity, measured to the precision of floating point arithmetic (though hardly anything can be measured that accurately), then four significant figures would be all we should expect. But the people who designed the sine function implementation on your processor did more than they might be expected to do, finding the sine of a trillion to full precision. They do this using a range reduction algorithm that retains far more precision than naively doing a division. (I worked on this problem a long time ago.)
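
If you don’t have Mathematica handy, the mpmath library (assuming it is installed) gives an independent check, since it evaluates the sine at whatever working precision you set.

    from mpmath import mp, mpf, sin as mpsin

    mp.dps = 50                # work with 50 decimal digits
    print(mpsin(mpf(10)**12))  # agrees with the result above in its leading digits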

Sine of a googol?

What if we try to take the sine of a ridiculously large number like a googol (10¹⁰⁰)? This won’t work because a googol cannot be represented exactly as a floating point number (i.e. as an IEEE 754 double). It’s not too big; floating point numbers can be as large as around 10³⁰⁸. The problem is that a number that big cannot be represented to full precision. But we can represent 2³³³ exactly, and a googol is between 2³³² and 2³³³. And amazingly, Python’s sine function (or rather the sine function built into the AMD processor on my computer) returns the correct result to full precision.

    >>> sin(2**333)
    0.9731320373846353

How do we know this is right? I verified it in Mathematica:

    In:= Sin[2.0^333]
    Out= 0.9731320373846353

How do we know Mathematica is right? We can do naive range reduction using extended precision, say 200 decimal places.

    In:= p = N[Pi, 200]
    In:= x = 2^333
    In:= theta = x - IntegerPart[x/ (2 p)] 2 p
    Out= 1.8031286178334699559384196689…

and verify that the sine of the range reduced value is correct.

    In:= Sin[theta]
    Out= 0.97313203738463526…

Interval arithmetic interpretation

Because floating point numbers have 53 bits of precision, all real numbers between 2⁵⁶ − 2² and 2⁵⁶ + 2² are represented as the floating point number 2⁵⁶. This is a range of width 8, wider than 2π, and so the sines of the numbers in this range cover the possible values of sine, i.e. [-1, 1]. So if you think of a floating point number as a sample from the set of all real numbers with the same binary representation, every possible value of sine is a valid return value for numbers bigger than 2⁵⁶. But if you think of a floating point number as representing itself exactly, then it makes sense to ask for the sine of numbers like 2³³³ and larger, up to the limits of floating point representation [2].


[1] An IEEE 754 double has 52 significand bits, but these bits can represent 53 bits of precision because the first bit of the fractional part is always 1, and so it does not need to be represented. See more here.

[2] The largest exponent of an IEEE double is 1023, and the largest significand is 2 − 2⁻⁵² (i.e. all 1’s), so the largest possible value of a double is

(2⁵³ − 1) 2¹⁰²⁴⁻⁵³

and in fact the Python expression sin((2**53 - 1)*2**(1024-53)) returns the correct value to 16 significant figures.

Ellipsoid distance on Earth


To first approximation, Earth is a sphere. But it bulges at the equator, and to second approximation, Earth is an oblate spheroid. Earth is not exactly an oblate spheroid either, but the error in the oblate spheroid model is about 100x smaller than the error in the spherical model.

Finding the distance between two points on a sphere is fairly simple. Here’s a calculator to compute the distance, and here’s a derivation of the formula used in the calculator.

Finding the distance between two points on an ellipsoid is much more complicated. (A spheroid is a kind of ellipsoid.) Wikipedia gives a description of Vincenty’s algorithm for finding the distance between two points on Earth using an oblate spheroid model (specifically WGS-84). I’ll include a Python implementation below.

Comparison with spherical distance

How much difference does it make when you calculate distance on an oblate spheroid rather than a sphere? To address that question I looked at the coordinates of several cities around the world using the CityData function in Mathematica. Latitude is in degrees north of the equator and longitude is in degrees east of the prime meridian.

    |--------------+--------+---------|
    | City         |    Lat |    Long |
    |--------------+--------+---------|
    | Houston      |  29.78 |  -95.39 |
    | Caracas      |  10.54 |  -66.93 |
    | London       |  51.50 |   -0.12 |
    | Tokyo        |  35.67 |  139.77 |
    | Delhi        |  28.67 |   77.21 |
    | Honolulu     |  21.31 | -157.83 |
    | Sao Paulo    | -23.53 |  -46.63 |
    | New York     |  40.66 |  -73.94 |
    | Los Angeles  |  34.02 | -118.41 |
    | Cape Town    | -33.93 |   18.46 |
    | Sydney       | -33.87 |  151.21 |
    | Tromsø       |  69.66 |   18.94 |
    | Singapore    |   1.30 |  103.85 |
    |--------------+--------+---------|

Here are the error extremes.

The spherical model underestimates the distance from London to Tokyo by 12.88 km, and it overestimates the distance from London to Cape Town by 45.40 km.

The relative error is most negative for London to New York (-0.157%) and most positive for Tokyo to Sydney (0.545%).

Update: The comparison above used the same radius for both the spherical and ellipsoidal models. As suggested in the comments, a better comparison would use the mean radius (average of equatorial and polar radii) in the spherical model rather than the equatorial radius.

With that change the most negative absolute error is between Tokyo and New York at -30,038 m. The most negative relative error is between London and New York at -0.324%.

The largest positive error, absolute and relative, is between Tokyo and Sydney. The absolute error is 29,289 m and the relative error is 0.376%.

Python implementation

The code below is a direct implementation of the equations in the Wikipedia article.

Note that longitude and latitude below are assumed to be in radians. You can convert from degrees to radians with NumPy’s deg2rad function.

from numpy import sin, cos, tan, arctan, arctan2, arccos, pi

def spherical_distance(lat1, long1, lat2, long2):
    phi1 = 0.5*pi - lat1
    phi2 = 0.5*pi - lat2
    r = 0.5*(6378137 + 6356752) # mean radius in meters
    t = sin(phi1)*sin(phi2)*cos(long1-long2) + cos(phi1)*cos(phi2)
    return r * arccos(t)

def ellipsoidal_distance(lat1, long1, lat2, long2):

    a = 6378137.0 # equatorial radius in meters 
    f = 1/298.257223563 # ellipsoid flattening 
    b = (1 - f)*a 
    tolerance = 1e-11 # to stop iteration

    phi1, phi2 = lat1, lat2
    U1 = arctan((1-f)*tan(phi1))
    U2 = arctan((1-f)*tan(phi2))
    L1, L2 = long1, long2
    L = L2 - L1

    lambda_old = L + 0

    while True:
    
        t = (cos(U2)*sin(lambda_old))**2
        t += (cos(U1)*sin(U2) - sin(U1)*cos(U2)*cos(lambda_old))**2
        sin_sigma = t**0.5
        cos_sigma = sin(U1)*sin(U2) + cos(U1)*cos(U2)*cos(lambda_old)
        sigma = arctan2(sin_sigma, cos_sigma) 
    
        sin_alpha = cos(U1)*cos(U2)*sin(lambda_old) / sin_sigma
        cos_sq_alpha = 1 - sin_alpha**2
        cos_2sigma_m = cos_sigma - 2*sin(U1)*sin(U2)/cos_sq_alpha
        C = f*cos_sq_alpha*(4 + f*(4-3*cos_sq_alpha))/16
    
        t = sigma + C*sin_sigma*(cos_2sigma_m + C*cos_sigma*(-1 + 2*cos_2sigma_m**2))
        lambda_new = L + (1 - C)*f*sin_alpha*t
        if abs(lambda_new - lambda_old) <= tolerance:
            break
        else:
            lambda_old = lambda_new

    u2 = cos_sq_alpha*((a**2 - b**2)/b**2)
    A = 1 + (u2/16384)*(4096 + u2*(-768+u2*(320 - 175*u2)))
    B = (u2/1024)*(256 + u2*(-128 + u2*(74 - 47*u2)))
    t = cos_2sigma_m + 0.25*B*(cos_sigma*(-1 + 2*cos_2sigma_m**2))
    t -= (B/6)*cos_2sigma_m*(-3 + 4*sin_sigma**2)*(-3 + 4*cos_2sigma_m**2)
    delta_sigma = B * sin_sigma * t
    s = b*A*(sigma - delta_sigma)

    return s
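
Here is a minimal usage sketch, using the London and Tokyo coordinates from the table above.

    from numpy import deg2rad

    lat1, long1 = deg2rad(51.50), deg2rad(-0.12)    # London
    lat2, long2 = deg2rad(35.67), deg2rad(139.77)   # Tokyo

    print(spherical_distance(lat1, long1, lat2, long2))
    print(ellipsoidal_distance(lat1, long1, lat2, long2))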

Clearing up the confusion around Jacobi functions

The Jacobi elliptic functions sn and cn are analogous to the trigonometric functions sine and cosine. They come up in applications such as nonlinear oscillations and conformal mapping. Unfortunately there are multiple conventions for defining these functions. The purpose of this post is to clear up the confusion around these different conventions.

Plot of Jacobi sn

The image above is a plot of the function sn [1].

Modulus, parameter, and modular angle

Jacobi functions take two inputs. We typically think of a Jacobi function as being a function of its first input, the second input being fixed. This second input is a “dial” you can turn that changes their behavior.

There are several ways to specify this dial. I started with the word “dial” rather than “parameter” because in this context parameter takes on a technical meaning, one way of describing the dial. In addition to the “parameter,” you could describe a Jacobi function in terms of its modulus or modular angle. This post will be a Rosetta stone of sorts, showing how each of these ways of describing a Jacobi elliptic function are related.

The parameter m is a real number in [0, 1]. The complementary parameter is m′ = 1 − m. Abramowitz and Stegun, for example, write the Jacobi functions sn and cn as sn(u, m) and cn(u, m). They also use m₁ rather than m′ to denote the complementary parameter.

The modulus k is the square root of m. It would be easier to remember if m stood for modulus, but that’s not conventional. Instead, m is for parameter and k is for modulus. The complementary modulus k′ is the square root of the complementary parameter.

The modular angle α is defined by m = sin² α.

Note that as the parameter m goes to zero, so does the modulus k and the modular angle α. As any one of these three goes to zero, the Jacobi functions sn and cn converge to their counterparts sine and cosine. So whether your dial is the parameter, modulus, or modular angle, sn converges to sine and cn converges to cosine as you turn the dial toward zero.

As the parameter m goes to 1, so does the modulus k, but the modular angle α goes to π/2. So if your dial is the parameter or the modulus, it goes to 1. But if you think of your dial as modular angle, it goes to π/2. In either case, as you turn the dial to the right as far as it will go, sn converges to hyperbolic tangent and cn converges to hyperbolic secant.

Quarter periods

In addition to parameter, modulus, and modular angle, you’ll also see Jacobi functions described in terms of K and K′. These are called the quarter periods for good reason. The functions sn and cn have period 4K as you move along the real axis, or move horizontally anywhere in the complex plane. They also have period 4iK′. That is, the functions repeat when you move a distance 4K′ vertically [2].

The quarter periods are a function of the modulus. The quarter period K along the real axis is

K(m) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1-m\sin^2\theta}}

and the quarter period K′ along the imaginary axis is given by K′(m) = K(m′) = K(1 − m).

The function K(m) is known as “the complete elliptic integral of the first kind.”

Amplitude

So far we’ve focused on the second input to the Jacobi functions, and three conventions for specifying it.

There are two conventions for specifying the first argument, written either as φ or as u. These are related by

u = \int_0^{\varphi} \frac{d\theta}{\sqrt{1-m\sin^2\theta}}

The angle φ is called the amplitude. (Yes, it’s an angle, but it’s called an amplitude.)

When we said above that the Jacobi functions had period 4K, this was in terms of the variable u. Note that when φ = π/2, u = K.

Jacobi elliptic functions in Mathematica

Mathematica uses the u convention for the first argument and the parameter convention for the second argument.

The Mathematica function JacobiSN[u, m] computes the function sn with argument u and parameter m. In the notation of A&S, sn(u, m).

Similarly, JacobiCN[u, m] computes the function cn with argument u and parameter m. In the notation of A&S, cn(u, m).

We haven’t talked about the Jacobi function dn up to this point, but it is implemented in Mathematica as JacobiDN[u, m].

The function that solves for the amplitude φ as a function of u is JacobiAmplitude[u, m].

The function that computes the quarter period K from the parameter m is EllipticK[m].

Jacobi elliptic functions in Python

The SciPy library has one Python function that computes four mathematical functions at once. The function scipy.special.ellipj takes two arguments, u and m, just like Mathematica, and returns sn(u, m), cn(u, m), dn(u, m), and the amplitude φ(u, m).

The function K(m) is implemented in Python as scipy.special.ellipk.
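
Here is a small sanity check of these conventions; the values of u and m are arbitrary.

    from numpy import sin, cos, isclose
    from scipy.special import ellipj, ellipk

    u, m = 0.7, 0.5

    # At m = 0 the Jacobi functions reduce to their trigonometric
    # counterparts and the amplitude is just u.
    sn, cn, dn, phi = ellipj(u, 0)
    print(isclose(sn, sin(u)), isclose(cn, cos(u)), isclose(dn, 1), isclose(phi, u))

    # K(m) is the quarter period along the real axis: sn(K, m) = 1.
    K = ellipk(m)
    print(isclose(ellipj(K, m)[0], 1))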


[1] The plot was made using JacobiSN[0.5, z] and the function ComplexPlot described here.

[2] Strictly speaking, 4iK′ is a period. It’s the smallest vertical period for cn, but 2iK′ is the smallest vertical period for sn.