Almost Sure posted an interesting fact on X:
If a point (x, y, z) is chosen at random uniformly on the unit sphere, then x, y, and z each have the uniform distribution on [−1, 1] and zero correlations (but not independent!) This follows from Archimedes’ “On the Sphere and Cylinder” published in 225BC.
Archimedes effectively showed that projecting a sphere to a cylinder wrapping round its equator preserves area, so projection of (x,y,z) is uniform on the cylinder. Used in Lambert cylindrical equal area projection.
In this post I want to empirically demonstrate the distribution of the coordinates, their lack of linear correlation, and their dependence.
The usual way to generate random points on a sphere is to generate three samples from a standard normal distribution and normalize by dividing by the Euclidean norm of the three points. So to generate a list of N points on a sphere, I generate an N × 3 matrix of normal samples and divide each row by its norm.
import numpy as np N = 10000 # Generate N points on a sphere M = norm.rvs(size=(N, 3)) M /= np.linalg.norm(M, axis=1)[:, np.newaxis]
Next, I find the correlation of each column with the others.
x = M[:, 0] y = M[:, 1] z = M[:, 2] cxy, _ = pearsonr(x, y) cyz, _ = pearsonr(y, z) cxz, _ = pearsonr(x, z) print(cxy, cyz, cxz)
When I ran this I got correlations of -0.0023, 0.0047, and -0.0177. For reasons explained in the previous post, I’d expect values in the interval [−0.02, 0.02] if the columns were independent, so the correlations are consistent with that hypothesis.
To demonstrate that each column is uniformly distributed on [−1, 1] you could look at histograms.
import matplotlib.pyplot as plt plt.hist(x, bins=int(N**0.5)) plt.show() plt.hist(y, bins=int(N**0.5)) plt.show() plt.hist(z, bins=int(N**0.5)) plt.show()
I’ll spare you the plots, but they look like what you’d expect. If you wanted to do something more analytical than visual, you could compute something like the Kolmogorov-Smirnov statistic.
The coordinates are not independent, though they have zero linear correlation. Maybe a test for nonlinear dependence will work. I tried Spearman’s rank correlation and Kendall’s tau, and neither detected the dependence. But distance correlation did.
from dcor import distance_correlation cxy = distance_correlation(x, y) cyz = distance_correlation(y, z) cxz = distance_correlation(x, z) print(cxy, cyz, cxz)
Each of the distance correlations was approximately 0.11, substantially greater than zero, implying dependence between the coordinates.