Random spherical coordinates

The topic of generating random points on a unit sphere has come up several times here. The standard method using normal random samples generates points (x, y, z) in Cartesian coordinates.

If you wanted points in spherical coordinates, you could first generate points in Cartesian coordinates, then convert the points to spherical coordinates. But it would seem more natural, and possibly more efficient, to directly generate spherical coordinates (ρ, θ, ϕ). That’s what we’ll do in this post.

Unfortunately there are several conventions for spherical coordinates, as described here, so we should start by saying we’re using the math convention that (ρ, θ, ϕ) means (radius, longitude, polar angle). The polar angle ϕ is 0 at the north pole and π at the south pole.

Generating the first two spherical coordinate components is trivial. On a unit sphere, ρ = 1, and θ we generate uniformly on [0, 2π]. The polar angle ϕ requires more thought. If we simply generated ϕ uniformly from [0, π] we’d put too many points near the poles and too few points near the equator.

As I wrote about a few days ago, the x, y, and z coordinates of points on a sphere are uniformly distributed. In particular, the z coordinate is uniformly distributed, and so we need cos ϕ to be uniformly distributed, not ϕ itself. We can accomplish this by generating uniform points from [−1, 1] and taking the inverse cosine.

Here’s Python code summarizing the algorithm for generating random points on a sphere in spherical coordinates.

from numpy import random, pi, arccos

def random_pt(): # in spherical coordinates
    rho = 1
    theta = 2*pi*random.random()
    phi = arccos(1 - 2*random.random())
    return (rho, theta, phi)

See the next post for a demonstration.

Random samples from a tetrahedron

Exactly one month ago I wrote about sampling points from a triangle. This post will look at the three dimensional analog, sampling from a tetrahedron. The generalization to higher dimensions works as well.

Sampling from a triangle

In the triangle post, I showed that a naive approach doesn’t work but a variation does. If the vertices of the triangle are A, B, and C, then the naive approach is to generate three uniform random numbers, normalize them, and use them to form a linear combination of the vertices. That is, generate

r₁, r₂, r₃

uniformly from [0, 1], then define

r₁′ = r₁/(r₁ + r₂ + r₃)
r₂′ = r₂/(r₁ + r₂ + r₃)
r₃′ = r₃/(r₁ + r₂ + r₃)

and return as a random sample the point

r₁′ A + r₂′ B + r₃′ C.

This oversamples points near the middle of the triangle and undersamples points near the vertices. But everything above works if you replace uniform samples with exponential samples.

Sampling from a tetrahedron

The analogous method works for a tetrahedron with vertices A, B, C, and D. Generate exponential random variables rᵢ for i = 1, 2, 3, 4. Then normalize, defining each rᵢ′ to be rᵢ divided by the sum of all the rᵢ. Then the random sample is

r₁′ A + r₂′ B + r₃′ C + r₄′ D.

In the case of the triangle, it was easy to visualize that uniformly generated rᵢ did not lead to uniform samples of the triangle, but that exponentially generated rᵢ did. This is harder to do in three dimensions. Even if the points were uniformly sampled from a tetrahedron, they might not look uniformly distributed from a given perspective.

So how might we demonstrate that our method works? One approach would be to take a cube inside a tetrahedron and show that the proportion of samples that land inside the cube is what we’d expect, namely the ratio of the volume of the cube to the volume of the tetrahedron.

This brings up a couple interesting subproblems.

  1. How do we compute the volume of a tetrahedron?
  2. How can we test whether our cube lies entirely within the tetrahedron?

Volume of a tetrahedron

To find the volume of a tetrahedron, form a matrix M whose columns are the vectors from three of the vertices to the remaining vertex. So we could take A − D, B − D, and C − D. The volume of the tetrahedron is the absolute value of the determinant of M divided by 6.
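Here’s what that computation looks like in Python, as a sketch; the function name is mine.

import numpy as np

def tetrahedron_volume(A, B, C, D):
    # Columns of M are the vectors from A, B, and C to the vertex D.
    M = np.column_stack([A - D, B - D, C - D])
    return abs(np.linalg.det(M)) / 6

Applied to the vertices in the illustration below, this returns 35.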

Testing whether a cube is inside a tetrahedron

A tetrahedron is convex, and so the line segment joining any two points inside the tetrahedron lies entirely inside the tetrahedron. It follows that if all the vertices of a cube are inside the tetrahedron, the entire cube is inside.

So this brings us to the problem of testing whether a point is inside a tetrahedron. We can do this by converting the coordinates of the point to barycentric coordinates, then testing whether all the coordinates are non-negative and add to 1.
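Here’s a sketch of such a test in Python, assuming the points are numpy arrays and the tetrahedron is non-degenerate; the function name is mine.

import numpy as np

def in_tetrahedron(P, A, B, C, D, tol=1e-12):
    # Solve for the barycentric weights of P on A, B, and C.
    M = np.column_stack([A - D, B - D, C - D])
    w = np.linalg.solve(M, P - D)
    # The weight on D makes all four weights sum to 1 by construction,
    # so it only remains to check that all are non-negative.
    w = np.append(w, 1 - w.sum())
    return bool(np.all(w >= -tol))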

Random sampling illustration

I made up four points as vertices of a tetrahedron.

import numpy as np

A = [0, 0, 0]
B = [5, 0, 0]
C = [0, 7, 0]
D = [0, 1, 6]
A, B, C, D = map(np.array, [A, B, C, D])

For my test cube I chose the cube whose vertex coordinates are all combinations of 1 and 2. So the vertex nearest the origin is (1, 1, 1) and the vertex furthest from the origin is (2, 2, 2). You can verify that all the vertices are inside the tetrahedron, and so the cube is inside the tetrahedron.

The tetrahedron has volume 35, and the cube has volume 1, so we expect 1/35th of the random points to land inside the cube. Here’s the code to sample the tetrahedron and calculate the proportion of points that land in the cube.

from scipy.stats import expon

def incube(v):
    return 1 <= v[0] <= 2 and 1 <= v[1] <= 2 and 1 <= v[2] <= 2

def good_sample(A, B, C, D):
    r = expon.rvs(size=4)
    r /= sum(r)
    return r[0]*A + r[1]*B + r[2]*C + r[3]*D

def bad_sample(A, B, C, D):
    r = np.random.random(size=4)
    r /= sum(r)
    return r[0]*A + r[1]*B + r[2]*C + r[3]*D

N = 1_000_000

badcount  = 0
goodcount = 0

for _ in range(N):
    v = bad_sample(A, B, C, D)
    if incube(v):
        badcount += 1        
    v = good_sample(A, B, C, D)
    if incube(v):
        goodcount += 1

print(badcount/N, goodcount/N, 1/35)

This produced

0.094388 0.028372 0.028571…

The good (exponential) method matched the expected proportion to three decimal places, but the bad (uniform) method put about 3.3 times as many points in the cube as expected.

Note that three decimal agreement is about what we’d expect via the central limit theorem since 1/√N = 0.001.

If you’re porting the sampler above to an environment where you don’t have a function for generating exponential random samples, you can roll your own by returning −log(u) where u is a uniform sample from (0, 1].
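Here’s what that might look like in Python.

from math import log
from random import random

def exp_sample():
    # If u is uniform on (0, 1], then -log(u) is exponential with mean 1.
    # random() returns values in [0, 1), so use 1 - random() to avoid log(0).
    return -log(1 - random())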

More general 3D regions

After writing the post about sampling from triangles, I wrote a followup post about sampling from polygons. In a nutshell, divide your polygon into triangles, then randomly select a triangle in proportion to its area, then sample within that triangle. You can do the analogous process in three dimensions by dividing a general volume into tetrahedra.

Cycles in Marsaglia’s mental RNG

Last week I wrote about a mental random number generator designed by George Marsaglia. It’s terrible compared to any software RNG, but it produces better output than most people would if asked to say a list of random digits.

Marsaglia’s RNG starts with a two-digit number as a seed state, then at each step replaces n with the 10s digit of n plus 6 times the 1s digit. Here’s the generator in Python form, taken from the earlier post.

state = 42 # Set the seed to any two digit number

def random_digit():
    tens = lambda n: n // 10
    ones = lambda n: n % 10
    global state
    state = tens(state) + 6*ones(state)
    return ones(state)

Only the last digit of the state is returned: the state has two digits but the output has one digit.

Alexander Janßen left an interesting comment on Mastodon regarding the generator:

I’m not sure if it’s already documented anywhere, but a random seed of 59 sticks and always returns a 9. This is similar to the initial state of 0 (which is more trivial). In return that means that the state 59 is never reached from any other sequence.

To explore this further, let’s look at just the states of the mental RNG, not the output digits.

If you start with the state 1, you will produce each of the numbers from 1 to 58 in a cycle. It follows that if you start at any number between 1 and 58 you’ll produce the same sequence of states in a cycle, though starting at a different point in the cycle.

… → 1 → 6 → 36 → 39 → … → 1 → …

If you start at 59, you’ll stay at 59 because 5 + 6×9 = 59. In this case, the RNG literally does what is described in the Dilbert cartoon below.

[Dilbert cartoon: “Over here we have our random number generator.” “Nine nine nine nine nine nine…”]

Collatz interlude

If you start with a seed larger than 59, you’ll eventually land on a state less than 59, and then produce the same cycle as described above. This is reminiscent of the Collatz conjecture, which I describe here as follows.

The Collatz conjecture asks whether the following procedure always terminates at 1. Take any positive integer n. If it’s odd, multiply it by 3 and add 1. Otherwise, divide it by 2. For obvious reasons the Collatz conjecture is also known as the 3n + 1 conjecture.

The Collatz conjecture remains an open question, though there has been some progress on the problem. I routinely get email from people who say they have no training in math but believe they have solved the problem. Recently they have begun to explain how an LLM led them to their proof.

Does the Collatz sequence get stuck in a cycle for some starting point? Nobody knows. But we can show every starting point for Marsaglia’s mental RNG ends up in a cycle.

Back to Marsaglia

If we start the mental RNG with 99, we next get 63, and then 24. So starting at 99, we end up in a cycle after two steps. The same is true starting at 69, 79, or 89, since these also map to states above 59. Otherwise, any starting point between 60 and 98 leads us to a cycle after one step.

Marsaglia’s generator is supposed to be seeded with a two-digit number. What if we seed it with a larger number? For example, what if we start at 1066 and consider the 10s “digit” to be 106?

Each step of the algorithm reduces the state by roughly an order of magnitude, and so if we start with a state greater than 1000, we eventually get to a state less than 1000. Then every state between 100 and 1000 leads to a cycle, usually the cycle of states 1 through 58 described above. But the following three-digit states map directly to the fixed point 59, and a few more reach it indirectly; for example, 649 maps to 118, which maps to 59.

118, 177, 236, 295, 354, 413, 472, 531, 590
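A brute force check of this claim takes only a couple lines of Python.

def step(n):
    return n // 10 + 6 * (n % 10)

print([n for n in range(100, 1000) if step(n) == 59])
# [118, 177, 236, 295, 354, 413, 472, 531, 590]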


Random samples from a polygon

Ted Dunning left a comment on my post on random sampling from a triangle saying you could extend this to sampling from a polygon by dividing the polygon into triangles, and selecting a triangle each time with probability proportional to the triangle’s area.

To illustrate this, let’s start with an irregular pentagon.

To pick a point inside, I used the centroid, the average of the vertices. Connecting the centroid to each of the vertices splits the pentagon into triangles. (Here I implicitly used the fact that this pentagon is convex. The centroid of a non-convex polygon could be outside the polygon.)

We can find the areas of the triangles using Heron’s formula.
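Here’s a sketch of the whole process in Python. It assumes the polygon is convex, and the function names are my own.

import numpy as np

rng = np.random.default_rng()

def heron(P, Q, R):
    # Area of a triangle from the lengths of its three sides.
    a = np.linalg.norm(Q - R)
    b = np.linalg.norm(P - R)
    c = np.linalg.norm(P - Q)
    s = (a + b + c) / 2
    return np.sqrt(s * (s - a) * (s - b) * (s - c))

def sample_polygon(vertices):
    vs = [np.asarray(v, dtype=float) for v in vertices]
    centroid = np.mean(vs, axis=0)
    # Fan the polygon into triangles around the centroid.
    triangles = [(centroid, vs[i], vs[(i + 1) % len(vs)])
                 for i in range(len(vs))]
    # Pick a triangle with probability proportional to its area.
    areas = np.array([heron(*t) for t in triangles])
    P, Q, R = triangles[rng.choice(len(triangles), p=areas/areas.sum())]
    # Sample within the chosen triangle using normalized exponential weights.
    w = rng.exponential(size=3)
    w /= w.sum()
    return w[0]*P + w[1]*Q + w[2]*R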

Here’s what we get for random samples.

Randomly selecting points inside a triangle

If you have a triangle with vertices A, B, and C, how would you generate random points inside the triangle ABC?

Barycentric coordinates

One idea would be to use barycentric coordinates.

  1. Generate random numbers α, β, and γ from the interval [0, 1].
  2. Normalize the numbers to have sum 1 by dividing each by their sum.
  3. Return αA + βB + γC.

This generates points inside the triangle, but not uniformly.

Accept-reject

Another idea is to use an accept-reject method. Draw a rectangle around the triangle, generate points in the rectangle, and throw them away if they land outside the triangle.

An advantage to this method is that it obviously works because it doesn’t rely on anything subtle. Three cheers for brute force!

The method is fairly efficient; on average only half the points will be rejected.

A disadvantage to all accept-reject methods is that they have variable runtime, though this only matters in some applications.

Accept-flip

There is a clever variation on the accept-reject method. Create a parallelogram by attaching a flipped copy of the triangle. Now randomly sample from the parallelogram. Every sample point will either land inside the original triangle or in its flipped twin. If it landed in the original triangle, keep it. If it landed in the twin, flip it back over and use that point.

This is like an accept-reject method, except there’s no waste. Every point is kept, possibly after flipping.

You can find code for implementing this method on Stack Overflow.
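Here’s a minimal sketch of the idea, not the code from Stack Overflow.

import numpy as np

def sample_triangle(A, B, C, rng=np.random.default_rng()):
    # (u, v) is uniform over the parallelogram spanned by B - A and C - A.
    u, v = rng.random(2)
    if u + v > 1:
        # The point landed in the flipped twin, so flip it back.
        u, v = 1 - u, 1 - v
    return A + u*(B - A) + v*(C - A)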

Barycentric coordinates fixed

Felix left a note in the comments that the barycentric approach would work if you draw the random weights from an exponential distribution rather than a uniform distribution. That went against my intuition. I had thought about using a different distribution on the weights, though I hadn’t worked out what it would need to be, and I did not expect it to be exponential. Apparently it works: a plot of samples generated this way looks uniformly distributed.
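In code, the fix is a one-line change from the naive method; here’s a sketch.

import numpy as np

def sample_triangle_barycentric(A, B, C, rng=np.random.default_rng()):
    w = rng.exponential(size=3)  # exponential rather than uniform weights
    w /= w.sum()
    return w[0]*A + w[1]*B + w[2]*C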


A mental random number generator


George Marsaglia was a big name in random number generation. I’ve referred to his work multiple times here, most recently in this article from March on randomly generating points on a sphere. He is best remembered for his DIEHARD battery of tests for RNG quality. See, for example, this post.

I recently learned about a mental RNG that Marsaglia developed. It’s not great, but it’s pretty good for something that’s easy to mentally compute. The output is probably better than if you simply tried to make up random numbers; people are notoriously bad at doing that. Hillel Wayne explores Marsaglia’s method in detail here.

Here’s a summary of the generator in Python.

state = 42 # Set the seed to any two digit number

def random_digit():
    tens = lambda n: n // 10
    ones = lambda n: n % 10
    global state
    state = tens(state) + 6*ones(state)
    return ones(state)


Legendre and Ethereum

What does an eighteenth century French mathematician have to do with the Ethereum cryptocurrency?

A pseudorandom number generator based on Legendre symbols, known as the Legendre PRF, has been proposed as part of a zero knowledge proof mechanism to demonstrate proof of custody in Ethereum 2.0. I’ll explain what each of these terms means and include some Python code.

The equation x² = a mod p is solvable for some values of a and not for others. The Legendre symbol

\left(\frac{a}{p}\right)

is defined to be 1 if a has a square root mod p, and −1 if it does not. Here a is a positive integer and p is a (large) prime [1]. Note that this is not a fraction, though it looks like one.

As a varies, the Legendre symbols vary randomly. Not literally randomly, of course, but they act random enough to be useful as a pseudorandom number generator.

Legendre PRF

Generating bits by computing Legendre symbols is a lot more work than generating bits using a typical PRNG, so what makes the Legendre PRF of interest to Ethereum?

Legendre symbols can be computed fairly efficiently. You wouldn’t want to use the Legendre PRF to generate billions of random numbers for some numerical integration computation, but for a zero knowledge proof you only need to generate a few dozen bits.

To prove that you know a key k, you can generate a string of pseudorandom bits that depend on the key. If you can do this for a few dozen bits, it’s far more likely that you know the key than that you have guessed the bits correctly. Given k, the Legendre PRF computes

L_k(x) = \frac{1}{2}\left( 1 - \left( \frac{k+x}{p}\right) \right)

for n consecutive values of x [2].

One reason this function is of interest is that it naturally lends itself to multiparty computation (MPC). The various parties could divide up the range of x values that each will compute.

The Legendre PRF has not been accepted as part of Ethereum 2.0. It’s only been proposed, and there is active research into whether it is secure enough.

Python code

Here’s a little Python script to demonstrate using Lₖ(x).

    from sympy import legendre_symbol

    def L(x, k, p):
        return (1 - legendre_symbol(x + k, p)) // 2

    k = 20250626
    p = 2**521 - 1
    print([L(x, k, p) for x in range(10)])

This produces the following.

    [1, 1, 1, 1, 0, 0, 1, 0, 0, 1]


[1] There is a third possibility: the Legendre symbol equals 0 if a is a multiple of p. We can safely ignore this possibility for this post.

[2] The Legendre symbol takes on the values ±1 and so we subtract this from 1 to get values {0, 2}, then divide by 2 to get bits {0, 1}.

A simple way to generate random points on a sphere

I recently ran across a simple way to generate random points uniformly distributed on a sphere: Generate random points in a cube until you get one inside the sphere, then push it to the surface of the sphere by normalizing it [1].

In more detail, to generate random points on the unit sphere, generate triples

(u₁, u₂, u₃)

of independent random values uniformly generated on [−1, 1] and compute the sum of their squares

S² = u₁² + u₂² + u₃²

If S² > 1 then reject the sample and try again. Otherwise return

(u₁/S, u₂/S, u₃/S).

There are a couple things I like about this approach. First, it’s intuitively plausible that it works. Second, it has minimal dependencies. You could code it up quickly in a new environment, say in an embedded system, though you do need a square root function.
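Here’s a sketch of the method in Python.

import numpy as np

def random_sphere_point(rng=np.random.default_rng()):
    while True:
        u = rng.uniform(-1, 1, size=3)  # random point in the cube
        s2 = u @ u                      # squared distance from the origin
        # Accept only points inside the sphere. The check s2 > 0 guards
        # against the (vanishingly unlikely) draw of the origin itself.
        if 0 < s2 <= 1:
            return u / np.sqrt(s2)      # push out to the surface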

Comparison with other methods

It’s not the most common approach. The most common approach to generate points on a sphere in n dimensions is to generate n independent values from a standard normal distribution, then normalize. The advantage of this approach is that it generalizes efficiently to any number of dimensions.

It’s not obvious that the common approach works, unless you’ve studied a fair amount of probability, and it raises the question of how to generate samples from a normal random variable. The usual method, the Box-Muller algorithm, requires a logarithm and a sine in addition to a square root.
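For comparison, here’s a sketch of the standard approach; it works in any number of dimensions.

import numpy as np

def random_sphere_point_gaussian(n=3, rng=np.random.default_rng()):
    v = rng.standard_normal(n)  # rotationally symmetric in n dimensions
    return v / np.linalg.norm(v)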

The method starting with points in a cube is more efficient (when n = 3, though not for large n) than the standard approach. About half the samples that you generate from the cube will be thrown out, so on average you’ll need to generate six values from [−1, 1] to produce one point on the sphere. It’s not the most efficient approach (Marsaglia gives a faster algorithm in the paper cited above) but it’s good enough.

Accept-reject methods

One disadvantage to accept-reject methods in general is that although they often have good average efficiency, their worst-case efficiency is theoretically infinitely bad: it’s conceivable that you’ll reject every sample, never getting one inside the sphere. That is not a problem in practice. What may be a problem in practice is that the runtime is variable.

When computing in parallel, a task may have to wait on the last thread to finish. The runtime for the task is the runtime of the slowest thread. In that case you may want to use an algorithm that is less efficient on average but that has constant runtime.

It also matters in cryptography whether processes have a constant runtime. An attacker may be able to infer something about the path taken through code by observing how long it took for the code to execute.


[1] George Marsaglia. Choosing a point from the surface of a sphere. The Annals of Mathematical Statistics. 1972, Vol. 43, No. 2, 645–646.

Factored random numbers

A couple days ago Michael Nielsen posted an image of a one-page paper that gives an algorithm for generating factored random numbers, uniformly distributed from 1 to some designated N.

The algorithm does not generate random numbers then factor them. It’s more efficient than that, generating the factorization along with the final result. It does require testing for whether a number is prime, but this is more efficient than factorization.

I thought about trying to code up the algorithm in Python, but then I saw that @iconjack beat me to it.

from sympy import isprime
from random import random, randint

def randfacts(N):
    # Generate a uniform random integer in [1, N] along with its factorization.
    while True:
        n, r, s = N, 1, []
        while n > 1:
            if r > N: break
            # Draw a decreasing random sequence, keeping the primes.
            if isprime(n := randint(1,n)):
                r *= n
                s.append(n)
        else:
            # The else clause runs when the loop wasn't broken out of.
            # Accept r with probability r/N; otherwise start over.
            if random() < r/N:
                return r, s

What is a Pentanomial GFSR random number generator?

The ISO random number generation standard, ISO 28640, speaks of a “Pentanomial GFSR method” for generating random variates. What is this? We’ll break it down, starting with GFSR.

GFSR

In short, a GFSR random number generator is what is now more commonly called a linear feedback shift register, or LFSR. The terminology “GFSR” was already dated when the ISO standard was published in 2010.

Tausworthe [1] came up with LFSR random variate generators in 1965. His paper looked at linear feedback shift registers but didn’t use the term LFSR. LFSRs are bit sequences in which the current bit is a linear combination of certain previous bits, something like a Fibonacci sequence. The algebra is carried out mod 2, so addition amounts to XOR.

As a toy example, consider

xₙ₊₅ = xₙ₊₂ + xₙ.

The first five bits would be the seed for the generator, then the recurrence above determines the rest. So, for example, we might start with 01101. Then the sequence of bits would be

011011101010000100101100111110001101110101…
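A few lines of Python reproduce the sequence above; the function name is mine.

def lfsr(seed, nbits):
    # x_{n+5} = x_{n+2} + x_n mod 2; the seed supplies x_0 through x_4.
    bits = list(seed)
    for n in range(nbits - len(seed)):
        bits.append(bits[n] ^ bits[n + 2])
    return ''.join(str(b) for b in bits)

print(lfsr([0, 1, 1, 0, 1], 42))  # prints the 42 bits above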

It seems the term GFSR dates to 1973 with Lewis and Payne [2]. As far as I can tell, between 1965 and 1973 Tausworthe’s idea was implemented in practice using two previous terms, and this specialization of LFSRs became known as FSRs. Then [2] generalized this to include possibly more previous terms, restoring the generality of [1].

That explains “GFSR method” but what is this “pentanomial”?

Pentanomial

Recurrence relations can be described in terms of their characteristic polynomials. For example, in the familiar Fibonacci sequence

Fₙ₊₂ = Fₙ₊₁ + Fₙ

the characteristic polynomial is

x² − x − 1.

The characteristic polynomial of the toy generator above would be

x⁵ − x² − 1.

And since −1 is the same as 1 when working mod 2, we’d usually write this polynomial as

x⁵ + x² + 1.

Between 1965 and 1973, FSRs were implemented with recurrences whose characteristic polynomial was a trinomial. Lewis and Payne looked at recurrences that depend on more previous terms. In particular, the choice of 4 previous terms, corresponding to a polynomial with 5 non-zero coefficients, had nice properties, hence the term “pentanomial.”

Primitive polynomials

The polynomials come before the recurrences. That is, we don’t write down the polynomial after designing the recurrence; the recurrence is designed by picking characteristic polynomials with special algebraic properties. This is because a primitive polynomial corresponds to the longest possible sequence of bits before things repeat.

The period of an LFSR corresponding to an nth degree primitive polynomial is 2ⁿ − 1. This is the maximum possible because an LFSR should not be seeded with all zeros: that seed is allowed mechanically, but the sequence it produces is nothing but zeros. As long as at least one bit in the seed is set, the LFSR will produce a sequence of 2ⁿ − 1 bits before repeating.
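We can verify this for the toy generator above by iterating its state, a window of five bits, until the state repeats. A quick sketch:

state = (0, 1, 1, 0, 1)  # any nonzero seed works
seen = set()
while state not in seen:
    seen.add(state)
    # Shift the window and append the new bit x_{n+5} = x_n + x_{n+2}.
    state = state[1:] + (state[0] ^ state[2],)

print(len(seen))  # 31 = 2**5 - 1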

Practicalities

In practice LFSRs are based on higher-degree polynomials than our toy example. For example, one might use

xₙ₊₆₀₇ = xₙ + xₙ₊₁₆₇ + xₙ₊₃₀₇ + xₙ₊₄₆₁

after seeding it with 607 random bits. This corresponds to the primitive pentanomial

x⁶⁰⁷ + x⁴⁶¹ + x³⁰⁷ + x¹⁶⁷ + 1.

This LFSR would have period 2⁶⁰⁷ − 1, or on the order of 10¹⁸³ bits.

LFSRs are very fast, but they’re unsuitable for cryptography because cryptanalysis can exploit their linear structure. So in order to use LFSRs as secure random number generators, they have to be combined with some sort of nonlinear processing, such as described here.


[1] Robert C. Tausworthe. Random Numbers Generated by Linear Recurrence Modulo Two. Mathematics of Computation, 19. 1965 pp. 201–209.

[2] T. G. Lewis and W. H. Payne. Generalized Feedback Shift Register Pseudorandom Number Algorithm. Journal of the ACM. Vol. 20, No. 3. July 1973. pp. 456–468.