Solving hard problems

We help companies solve hard problems in mathematics, statistics, and computing. Let’s explore how we might work together.

Three-party Diffie-Hellman in one shot

Posted on 17 November 2025 by John

Elliptic curve Diffie-Hellman

Given a point P on an elliptic curve E, and a random number a, aP means to add P to itself a times, using the addition on E. The point aP can be computed efficiently, even if a is a very large number [1]. However, if E has a large number of points, and if a is chosen at random from a large range, then it is not practical to recover a given only P and aP.

This is the elliptic curve version of the discrete logarithm problem, and its presumed difficulty is the basis of the security of Diffie-Hellman key exchange.

Two-party Diffie-Hellman

With two-party Diffie-Hellman key exchange, two parties, Alice and Bob, generate random private keys a and b respectively. They agree on a point P on an elliptic curve E. Alice computes aP and sends it to Bob. Simultaneously Bob computes bP and sends it to Alice. Then Alice can compute

a(bP) = (ab)P

and Bob can compute

b(aP) = (ba)P = (ab)P.

Then both Alice and Bob know a shared secret, the point (ab)P on E, but neither party has revealed a private key.
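
As a rough illustration, here is a minimal sketch of the exchange in Python. It assumes the curve y² = x³ + 3 over F_p, with p the BN254 prime and r the subgroup order quoted in the pairing post further down this page. The base point P = (1, 2) is an illustrative choice; it satisfies the curve equation since 4 = 1 + 3. The helper names ec_add and scalar_mult are made up for this example, and scalar_mult follows the double-and-add idea from footnote [1]. The code is not constant time and is not meant for real use.

```python
import secrets

# Curve y^2 = x^3 + 3 over F_p, with p the BN254 prime (assumed setting for this sketch).
p = 21888242871839275222246405745257275088696311157297823662689037894645226208583
# Order r of the prime-order subgroup, used here only as the range for private keys.
r = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def ec_add(P1, P2):
    """Add two points on y^2 = x^3 + 3; None stands for the point at infinity."""
    if P1 is None:
        return P2
    if P2 is None:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) is the point at infinity
    if P1 == P2:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p    # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mult(k, P):
    """Compute kP by double-and-add, scanning the bits of k from the low end."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result

P = (1, 2)                         # illustrative base point: 2^2 = 1^3 + 3
a = secrets.randbelow(r - 1) + 1   # Alice's private key
b = secrets.randbelow(r - 1) + 1   # Bob's private key

aP = scalar_mult(a, P)             # Alice sends this to Bob
bP = scalar_mult(b, P)             # Bob sends this to Alice

alice_secret = scalar_mult(a, bP)  # a(bP)
bob_secret = scalar_mult(b, aP)    # b(aP)
assert alice_secret == bob_secret
```

Only aP and bP cross the wire; the private keys a and b never leave their owners.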

Three-party Diffie-Hellman

You could extend the approach above to three parties, say adding Carol, but this would require extra communication: Alice could send (ab)P to Carol, which she could multiply by her private key c to obtain abcP. Similarly, everyone else could arrive at abcP. Each person has to do a computation, send and receive a message, do another computation, and send and receive another message.

Joux [2] came up with a way to do Diffie-Hellman key exchange with three people and only one round of sending and receiving messages. The setup uses a pairing e( , ) of two elliptic curve subgroups, G₁ and G₂, as in the previous post. Fix generators P ∈ G₁ and Q ∈ G₂. Each party multiplies P and Q by their private key and sends the results to the other two parties.

Alice receives bP from Bob and cQ from Carol. This is enough for her to compute

e(bP, cQ)^a = e(P, Q)^abc.

Similarly, Bob receives aP from Alice and cQ from Carol, enabling him to compute

e(aP, cQ)^b = e(P, Q)^abc.

And finally, Carol receives aP from Alice and bQ from Bob, enabling her to compute

e(aP, bQ)^c = e(P, Q)^abc.

So all three parties can compute the shared secret e(P, Q)^abc, but no party knows the other parties’ private keys.
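
The algebra of the one-round protocol can be checked with a toy stand-in for the pairing. The sketch below is not an elliptic curve pairing and offers no security. It assumes G₁ and G₂ are both the integers mod r = 101 under addition, so that P = Q = 1 and "aP" is just a mod r. It defines a hypothetical map e(x, y) = g^(xy) mod q, with q = 607 and g = 64, an element of order 101 in the multiplicative group mod 607. This map is bilinear, which is all the protocol needs in order to be correct.

```python
import secrets

r = 101                       # toy prime group order (assumption for this sketch)
q = 607                       # prime with r dividing q - 1 = 6 * 101
g = pow(2, (q - 1) // r, q)   # 2^6 = 64, an element of order r mod q

P = Q = 1                     # generators of the toy groups G1 = G2 = integers mod r

def mul(k, X):
    """Toy 'scalar multiplication': kX in the integers mod r."""
    return (k * X) % r

def e(X, Y):
    """Toy bilinear map into the order-r subgroup of the integers mod q."""
    return pow(g, (X * Y) % r, q)

# Private keys for Alice, Bob, and Carol.
a, b, c = (secrets.randbelow(r - 1) + 1 for _ in range(3))

# One round: each party publishes its multiples of P and Q.
aP, aQ = mul(a, P), mul(a, Q)
bP, bQ = mul(b, P), mul(b, Q)
cP, cQ = mul(c, P), mul(c, Q)

# Each party pairs what it received and raises the result to its own key.
alice = pow(e(bP, cQ), a, q)
bob = pow(e(aP, cQ), b, q)
carol = pow(e(aP, bQ), c, q)

assert alice == bob == carol == pow(e(P, Q), a * b * c, q)
```

In this toy setting anyone can read a, b, and c straight off the published values, so it only demonstrates that the three computations agree. A real deployment would replace mul and e with elliptic curve operations from a vetted pairing library.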

Footnotes

[1] If you want to multiply a point by 2¹⁰⁰, for example, you don’t carry out 2¹⁰⁰ additions; you carry out 100 doublings. Of course not every positive integer is a power of 2, but every positive integer is the sum of powers of 2, i.e. it can be written in binary. So as you’re doing your doublings, sum the terms that correspond to 1s in the binary representation of the number you’re multiplying by.

[2] Antoine Joux. A One Round Protocol for Tripartite Diffie–Hellman. Journal of Cryptology (2004) 17: 263–276.

Elliptic curve pairings in cryptography

Posted on 16 November 2025 by John

Pairings can mean a variety of related things in group theory, but for our purposes a pairing is a bilinear mapping from two groups to a third group.

e: G₁ × G₂ → G_T

Typically the group operation on G₁ and G₂ is written additively and the group operation on G_T is written multiplicatively. In fact, G_T will always be the multiplicative group of a finite field, i.e. G_T consists of the non-zero elements of a finite field under multiplication. (The “T” stands for “target.”)

Here bilinear [1] means that if P is an element of G₁ and Q is an element of G₂, and a and b are nonnegative integers,

e(aP, bQ) = e(P, Q)^ab.

There are a few provisos …

First, the pairing must be non-degenerate, i.e. e(P, Q) ≠ 1 for some P and Q.

Second, the pairing must be efficiently computable.

Third, the embedding degree must not be “too high.” This means that if G_T is the multiplicative group of a field with p^k elements, k is not too big. We will look at two examples in which k = 12.

The second and third provisos are important even though they’re not stated rigorously.

Cryptography often speaks of pairing elliptic curves, but in fact it uses pairings of prime-order subgroups of the additive groups of elliptic curves. Because the subgroups have prime order, they are cyclic, and so the pairing is determined by its value on a generator from each subgroup.

Example: BN254

The previous post briefly mentioned a pairing between two elliptic curves, BN254 and alt_bn128, that is used in Ethereum and was used in Zcash in the original Sprout shielded protocol.

The elliptic curve BN254 is defined over the field F_p, the integers mod p, where

p = 21888242871839275222246405745257275088696311157297823662689037894645226208583.

and the elliptic curve alt_bn128 is defined over the field F_p[i], i.e. the field F_p, with an imaginary element i adjoined.

Both elliptic curves have a subgroup of order

r = 21888242871839275222246405745257275088548364400416034343698204186575808495617,

which is prime. So in the pairing the groups G₁ and G₂ are isomorphic to the integers mod r. The target group G_T has order p¹² − 1, so the embedding degree k equals 12, which is “not too high.”
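
The embedding degree can be checked numerically. A standard way to state it, assumed here since the post describes it only informally, is that k is the smallest positive integer for which r divides p^k − 1. The sketch below searches for that k using the values of p and r given above. The function name embedding_degree and the search bound max_k are made up for this example.

```python
p = 21888242871839275222246405745257275088696311157297823662689037894645226208583
r = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def embedding_degree(p, r, max_k=50):
    """Return the smallest k >= 1 with p^k = 1 mod r, or None if none exists up to max_k."""
    power = 1
    for k in range(1, max_k + 1):
        power = (power * p) % r
        if power == 1:
            return k
    return None

k = embedding_degree(p, r)
print(k)
```

If the values are as stated, the search should stop at k = 12 and not at any smaller k.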

Example: BLS12-381

Another example also comes from Ethereum and Zcash. Ethereum uses BN254 in smart contracts, but it uses BLS12-381 in its consensus layer. Zcash switched from BN254 to BLS12-381 in the Sapling release.

BLS12-381 is defined over a prime field with roughly 2³⁸¹ elements and has embedding degree 12, hence 12-381. The BLS stands for Paulo Barreto, Ben Lynn, and Michael Scott. Elliptic curve names often look mysterious, but they’re actually pretty descriptive. I discuss BLS12-381 in more detail here. As in the example above, BLS12-381 is defined over a field F_p and is paired with a curve over F_p[i], i.e. the same field with an imaginary element adjoined. The equation for BLS12-381 is

y² = x³ + 4

and the equation for the curve it is paired with is

y² = x³ + 4(1 + i)

As before the target group is the multiplicative group of a finite field of order p¹².

[1] You’ll also see bilinearity defined by

e(P + Q, R) = e(P, R) e(Q, R)

and

e(P, R + S) = e(P, R) e(P, S).

These definitions are equivalent. To see that the definition here implies the definition at the top, write out aP as P + P + … + P etc.

Since we’re working in subgroups of prime order, there is a generator for each subgroup. Write out each element as a multiple of a generator, then the definition at the top implies the definition here.

Adding an imaginary unit to a finite field

Posted on 16 November 2025 by John

Let p be a prime number. Then the integers mod p form a finite field.

The number of elements in a finite field must be a power of a prime, i.e. the order q = pⁿ for some n. When n > 1, we can take the elements of our field to be polynomials of degree at most n − 1 with coefficients in the integers mod p.

Addition works just as you’d expect addition to work, adding coefficients mod p, but multiplication is a little more complicated. You multiply field elements by multiplying their polynomial representatives, but then you divide by an irreducible polynomial and take the remainder.

When n = 2, for some p you can define the field by adding an imaginary unit.

When you can and cannot adjoin an i

For some finite fields of order p, you can construct a field of order p² by adjoining an element i to the field, very much the way you form the complex numbers from the real numbers. For example, you can create a field with 49 elements by taking pairs (a, b) of integers mod 7 and multiplying them as if they were a + bi. So

(a, b) * (c, d) = (ac − bd, ad + bc).

This is equivalent to choosing the polynomial x² + 1 as your irreducible polynomial and following every polynomial multiplication by taking the remainder modulo x² + 1.

This works for a field with 49 elements, but not for a field of 25 elements. That’s because over the integers mod 5 the polynomial x² + 1 already has a root. Two of them in fact: x = 2 or x = 3. So you could say that mod 5, i = 2. Or i = 3 if you prefer. You can still form a field of 25 elements by taking pairs of elements from a field of 5 elements, but you have to choose a different irreducible polynomial, because x² + 1 factors as

x² + 1 = (x − 2)(x + 2)

when working over the integers mod 5. You could use

x² + x + 1

as your irreducible polynomial. To prove that this polynomial is irreducible mod 5, plug in the numbers 0, 1, 2, 3, and 4 and confirm that none of them make the polynomial equal 0.
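
The root check and the resulting multiplication rule are easy to try out. In the sketch below a pair (a, b) stands for a + bx, and products are reduced using x² = −x − 1, which is what taking the remainder modulo x² + x + 1 amounts to. The function names f and mult25 are made up for illustration.

```python
p = 5

def f(x):
    """The candidate irreducible polynomial x^2 + x + 1, evaluated mod p."""
    return (x * x + x + 1) % p

# A quadratic is irreducible over the integers mod p exactly when it has no root there.
roots = [x for x in range(p) if f(x) == 0]
assert roots == []

def mult25(u, v):
    """Multiply a + bx and c + dx modulo x^2 + x + 1 over the integers mod 5."""
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd x^2, and x^2 = -x - 1
    return ((a * c - b * d) % p, (a * d + b * c - b * d) % p)

elements = [(a, b) for a in range(p) for b in range(p)]
nonzero = [u for u in elements if u != (0, 0)]

# In a field every nonzero element has a multiplicative inverse.
for u in nonzero:
    assert any(mult25(u, v) == (1, 0) for v in nonzero)
```

The final loop is a brute-force sanity check that the 25 pairs really do form a field under this multiplication.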

In general, you can create a field of order p² by adjoining an element i if and only if p = 3 mod 4.
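
This criterion can be spot-checked by brute force for small primes. The sketch below tests, for each odd prime below 200, whether x² + 1 has a root mod p; adjoining i works exactly when it does not. The helper names is_prime and has_sqrt_of_minus_one are made up for this example.

```python
def is_prime(n):
    """Trial division; fine for the small numbers used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n**0.5) + 1))

def has_sqrt_of_minus_one(p):
    """True if x^2 + 1 has a root in the integers mod p."""
    return any((x * x + 1) % p == 0 for x in range(p))

for p in filter(is_prime, range(3, 200)):
    can_adjoin_i = not has_sqrt_of_minus_one(p)
    assert can_adjoin_i == (p % 4 == 3)
```

A finite check like this is evidence rather than proof, but it agrees with the examples above: 7 is 3 mod 4 and 5 is not.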

Next we’ll look at an example of making a very large finite field even larger by adding an imaginary element.

Example from Ethereum

The Ethereum virtual machine has support for a pairing—more on that in a future post—of two elliptic curves, BN254 and alt_bn128. The BN254 curve is defined by

y² = x³ + 3

over the field F_p, the integers mod p, where

p = 21888242871839275222246405745257275088696311157297823662689037894645226208583.

The curve alt_bn128 is defined by

y² = x³ + 3/(9 + i)

over the field F_p[i], i.e. the field F_p, with an element i adjoined. Note that the last two digits of p are 83, and so p is congruent to 3 mod 4.

Special point on curve

The Ethereum documentation (EIP-197) singles out a particular point (x, y) on alt_bn128:

x = a + bi
y = c + di

where

a = 10857046999023057135944570762232829481370756359578518086990519993285655852781
b = 11559732032986387107991004021392285783925812861821192530917403151452391805634
c = 8495653923123431417604973247489272438418190587263600148770280649306958101930
d = 4082367875863433681332203403145435568316851327593401208105741076214120093531.

We will show that this point is on the curve as an exercise in working in the field F_p[i]. We’ll write Python code from scratch, not using any libraries, so all the details will be explicit.

def add(pair0, pair1, p):
    a, b = pair0
    c, d = pair1
    return ((a + c) % p, (b + d) % p)

def mult(pair0, pair1, p):
    a, b = pair0
    c, d = pair1
    return ((a*c - b*d) % p, (b*c + a*d) % p)

p = 21888242871839275222246405745257275088696311157297823662689037894645226208583
a = 10857046999023057135944570762232829481370756359578518086990519993285655852781
b = 11559732032986387107991004021392285783925812861821192530917403151452391805634
c = 8495653923123431417604973247489272438418190587263600148770280649306958101930
d = 4082367875863433681332203403145435568316851327593401208105741076214120093531

# Find (e, f) such that (e, f)*(9, 1) = (1, 0).
# 9e - f = 1
# e + 9f = 0
# Multiply first equation by 9 and add.
e = (9*pow(82, -1, p)) % p
f = (-e*pow(9, -1, p)) % p
prod = mult((e, f), (9, 1), p)
assert(prod[0] == 1 and prod[1] == 0)

y2 = mult((c, d), (c, d), p)
x3 = mult((a, b), mult((a, b), (a, b), p), p)
rhs = add(x3, mult((3, 0), (e, f), p), p)

assert(y2[0] == rhs[0])
assert(y2[1] == rhs[1])
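
The inverse of 9 + i was found above by solving a small linear system. A more general route, sketched here as an alternative rather than as the method used in the post, is to multiply by the conjugate: 1/(a + bi) = (a − bi)/(a² + b²). The function name inv is made up for this example. The formula relies on a² + b² being nonzero mod p, which holds for every nonzero element when p is congruent to 3 mod 4.

```python
p = 21888242871839275222246405745257275088696311157297823662689037894645226208583

def mult(pair0, pair1, p):
    """Multiply a + bi and c + di in F_p[i]."""
    a, b = pair0
    c, d = pair1
    return ((a*c - b*d) % p, (b*c + a*d) % p)

def inv(pair, p):
    """Inverse of a + bi in F_p[i], using the conjugate a - bi."""
    a, b = pair
    norm_inv = pow(a*a + b*b, -1, p)   # 1/(a^2 + b^2) mod p
    return ((a * norm_inv) % p, (-b * norm_inv) % p)

e, f = inv((9, 1), p)
assert mult((e, f), (9, 1), p) == (1, 0)

# Agrees with the values found by hand: e = 9/82 and f = -e/9.
assert e == (9 * pow(82, -1, p)) % p
assert f == (-e * pow(9, -1, p)) % p
```

With inv in hand, the constant term 3/(9 + i) of the curve equation is mult((3, 0), inv((9, 1), p), p).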

Four generalizations of the Pythagorean theorem

Posted on 13 November 2025 by John

Here are four theorems that generalize the Pythagorean theorem. Follow the links for more details regarding each equation.

1. Theorem by Apollonius for general triangles.

2. Edsger Dijkstra’s extension of the Pythagorean theorem for general triangles.

$\text{sgn}(\alpha + \beta - \gamma) = \text{sgn}(a^2 + b^2 - c^2)$

3. A generalization of the Pythagorean theorem to tetrahedra.

$V_0^2 = \sum_{i=1}^n V_i^2$

4. A unified Pythagorean theorem that covers spherical, plane, and hyperbolic geometry.

$A(c) = A(a) + A(b) - \kappa \frac{A(a) \, A(b)}{2\pi}$

Elementary symmetric polynomials and optimization

Posted on 12 November 2025 by John

The mth elementary symmetric polynomial in n variables

$e_m(x_1, x_2, \ldots, x_n)$

is the sum of all products of m distinct variables. So, for example,

$\begin{align*} e_1(w, x, y, z) &= w + x + y + z \\ e_2(w, x, y, z) &= wx + wy + wz + xy + xz + yz \\ e_3(w, x, y, z) &= xyz + wyz + wxz + wxy \\ e_4(w, x, y, z) &= wxyz \end{align*}$

These polynomials came up in the previous post. The problem there, choosing weights to minimize the variance of a weighted sum of random variables, can be solved using elementary symmetric polynomials.

To state the optimization problem more generally, suppose you want to minimize

$t_1^2 x_1 + t_2^2x_2 + \cdots + t_n^2 x_n$

where the t_i and x_i are positive and the t_i sum to 1. You can use Lagrange multipliers to show that the solution is

$t_i = \frac{e_n(x_1, x_2, \cdots, x_n)}{x_i \,e_{n-1}(x_1, x_2, \cdots, x_n)}$
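
The formula is easy to check on a small example. The sketch below computes elementary symmetric polynomials directly from the definition and uses exact fractions, so the constraint that the weights sum to 1 can be tested without rounding error. The function names elem_sym and optimal_weights and the sample values are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elem_sym(m, xs):
    """m-th elementary symmetric polynomial: sum of products of m distinct entries."""
    return sum(prod(c) for c in combinations(xs, m))

def optimal_weights(xs):
    """Weights t_i = e_n / (x_i * e_{n-1}), returned as exact fractions."""
    n = len(xs)
    e_n, e_n1 = elem_sym(n, xs), elem_sym(n - 1, xs)
    return [Fraction(e_n, x * e_n1) for x in xs]

xs = [1, 2, 4]            # sample positive integers
ts = optimal_weights(xs)

assert sum(ts) == 1
# The Lagrange condition: t_i * x_i is the same for every i.
assert len({t * x for t, x in zip(ts, xs)}) == 1
print(ts)
```

Dividing numerator and denominator by e_n shows the weights are proportional to 1/x_i, which is a quick way to compute them in practice.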

Weighting an average to minimize variance

Posted on 12 November 2025 by John

Suppose you have $100 to invest in two independent assets, A and B, and you want to minimize volatility. Suppose A is more volatile than B. Then putting all your money on A would be the worst thing to do, but putting all your money on B would not be the best thing to do.

The optimal allocation would be some mix of A and B, with more (but not all) going to B. We will formalize this problem and determine the optimal allocation, then generalize the problem to more assets.

Two variables

Let X and Y be two independent random variables with finite variance and assume at least one of X and Y is not constant. We want to find t that minimizes

$\text{Var}[tX + (1-t)Y]$

subject to the constraint 0 ≤ t ≤ 1. Because X and Y are independent,

$\text{Var}[tX + (1-t)Y] = t^2 \text{Var}[X] + (1-t)^2 \text{Var}[Y]$

Taking the derivative with respect to t and setting it to zero shows that

$t = \frac{\text{Var}[Y]}{\text{Var}[X] + \text{Var}[Y]}$

So the smaller the variance on Y, the less we allocate to X. If Y is constant, we allocate nothing to X and go all in on Y. If X and Y have equal variance, we allocate an equal amount to each. If X has twice the variance of Y, we allocate 1/3 to X and 2/3 to Y.
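
As a sanity check, the closed-form answer can be compared with a brute-force search over t. The sketch below uses the case where X has twice the variance of Y; the function names variance and best_t and the grid spacing are arbitrary choices for this example.

```python
def variance(t, var_x, var_y):
    """Variance of tX + (1 - t)Y for independent X and Y."""
    return t**2 * var_x + (1 - t)**2 * var_y

def best_t(var_x, var_y):
    """Closed-form minimizer from setting the derivative to zero."""
    return var_y / (var_x + var_y)

var_x, var_y = 2.0, 1.0                  # X has twice the variance of Y
grid = [k / 1000 for k in range(1001)]   # candidate values of t in [0, 1]
t_grid = min(grid, key=lambda t: variance(t, var_x, var_y))

print(best_t(var_x, var_y), t_grid)
```

The grid minimizer should land within one grid step of the closed-form value of 1/3.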

Multiple variables

Now suppose we have n independent random variables X_i for i running from 1 to n, and at least one of the variables is not constant. Then we want to minimize

$\text{Var}\left[ \sum_{i=1}^n t_i X_i \right] = \sum_{i=1}^n t_i^2 \text{Var}[X_i]$

subject to the constraint

$\sum_{i=1}^n t_i = 1$

and all t_i non-negative. We can solve this optimization problem with Lagrange multipliers and find that

$t_i \text{Var}[X_i] = t_j \text{Var}[X_j]$

for all 1 ≤ i, j ≤ n. These (n − 1) equations along with the constraint that all the t_i sum to 1 give us a system of equations whose solution is

$t_i = \frac{\prod_{j \ne i} \text{Var}[X_j]}{\sum_{k = 1}^n \prod_{j \ne k} \text{Var}[X_j]}$

Incidentally, the denominator has a name: the (n − 1)st elementary symmetric polynomial in n variables. More on this in the next post.

Brownian motion and Riemann zeta

Posted on 10 November 2025 by John

Excellent video by Almost Sure: What does Riemann Zeta have to do with Brownian Motion?

Connects several things that I’ve written about here including Brownian motion, the Riemann zeta function, and the Kolmogorov-Smirnov test.

Rolling correlation

Posted on 9 November 2025 by John

Suppose you have data on the closing prices of two stocks over 1,000 days and you want to look at the correlation between the two asset prices over time in rolling 30 day windows.

It seems that the rolling correlation is periodic, peaking about every 50 days.

But this is an artifact of the rolling window, not a feature of the data. I created the two simulated stock time series by creating random walks. The price of the stock each day is the price the previous day plus a sample from a normal random variable with mean zero and variance 1.

import numpy as np
from scipy.stats import norm

n = 1000
x = np.cumsum(norm.rvs(size=n))
y = np.cumsum(norm.rvs(size=n))

If you use a wider window, say 60 days, you’ll still see a periodic pattern in the rolling correlation, though with lower frequency.
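
The post does not show how the rolling correlation itself was computed. One way to do it, sketched here with plain NumPy, is to take the Pearson correlation over each window of consecutive days. The function name rolling_corr is made up, and the random walks are generated with NumPy's seeded generator rather than scipy.stats.norm so that the run is reproducible.

```python
import numpy as np

def rolling_corr(x, y, window):
    """Pearson correlation of x and y over each run of `window` consecutive days."""
    n = len(x)
    return np.array([
        np.corrcoef(x[i:i + window], y[i:i + window])[0, 1]
        for i in range(n - window + 1)
    ])

rng = np.random.default_rng(20251109)   # arbitrary seed, chosen for reproducibility
n = 1000
x = np.cumsum(rng.standard_normal(n))   # two independent random walks
y = np.cumsum(rng.standard_normal(n))

r30 = rolling_corr(x, y, 30)            # 30-day windows
r60 = rolling_corr(x, y, 60)            # 60-day windows
```

Plotting r30 and r60 against the day index gives curves to compare for the two window widths.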

Analog of Heron’s formula on a sphere

Posted on 8 November 2025 by John

The area of a triangle can be computed directly from the lengths of its sides via Heron’s formula.

$A = \sqrt{s(s-a)(s-b)(s-c)}$

Here s is the semiperimeter, s = (a + b + c)/2.

Is there an analogous formula for spherical triangles? It’s not obvious there should be, but there is a formula by Simon Antoine Jean L’Huilier (1750–1840).

$\tan^2 \frac{S}{4} = \tan \frac{s}{2} \tan \frac{s-a}{2} \tan \frac{s-b}{2} \tan \frac{s-c}{2}$

Here we denote area by S for surface area, rather than A because in the context of spherical trigonometry A usually denotes the angle opposite side a. The same convention applies in plane trigonometry, but the potential for confusion is greater in L’Huilier’s formula since the area appears inside a tangent function.

Now tan θ ≈ θ for small θ, and so L’Huilier’s formula reduces to Heron’s formula for small triangles.

Imagine the Earth as a sphere of radius 1 and take a spherical triangle with one vertex at the north pole and two vertices on the equator 90° longitude apart. Then a = b = c = π/2 and s = 3π/4. Such a triangle takes up 1/8 of the Earth’s surface area of 4π, so the area S is π/2. You can verify that L’Huilier’s formula gives the correct area.

It’s not a proof, but it’s a good sanity check that L’Huilier’s formula is correct for small triangles and for at least one big triangle.
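
Both checks can be run numerically. The sketch below solves L’Huilier’s formula for S on the unit sphere and compares it with the octant triangle and with Heron’s formula for a small triangle. The function names spherical_area and heron_area and the side length 0.01 are choices made for this example.

```python
from math import atan, pi, sqrt, tan

def spherical_area(a, b, c):
    """Area of a triangle on the unit sphere with side lengths a, b, c (L'Huilier)."""
    s = (a + b + c) / 2
    t = tan(s/2) * tan((s - a)/2) * tan((s - b)/2) * tan((s - c)/2)
    # tan^2(S/4) = t, so S = 4 atan(sqrt(t)), which returns a value in [0, 2*pi)
    return 4 * atan(sqrt(t))

def heron_area(a, b, c):
    """Area of a plane triangle with side lengths a, b, c (Heron)."""
    s = (a + b + c) / 2
    return sqrt(s * (s - a) * (s - b) * (s - c))

octant = spherical_area(pi/2, pi/2, pi/2)   # the big triangle described above
small = spherical_area(0.01, 0.01, 0.01)    # a small equilateral triangle
flat = heron_area(0.01, 0.01, 0.01)

print(octant, pi/2)
print(small, flat)
```

The first pair of numbers should agree to floating point precision, and the second pair should agree to several significant figures.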

How much is a gigawatt?

Posted on 7 November 2025 by John

There’s increasing talk of gigawatt data centers. Currently the largest data center, Switch’s Citadel Campus in Nevada, uses 850 megawatts of power. OpenAI’s Stargate data center, under construction, is supposed to use 1.2 gigawatts.

Gigawatt

An average French nuclear reactor produces about a gigawatt of power. If the US were allowed to build nuclear reactors, we could simply build one reactor for every gigantic data center. Unfortunately, the Nuclear Regulatory Commission essentially prohibits the construction of profitable nuclear reactors.

An American home uses about 1200 watts of power, so a gigawatt of electricity could power 800,000 homes. So roughly, a gigawatt is a megahome.

Gigawatt-year

A gigawatt is a unit of power, not energy. Energy is power over some time period.

A gigawatt-year is about 3 × 10¹⁶ joules, or 30 petajoules.

A SpaceX Starship launch releases 50 terajoules of energy, so a gigawatt-year is about 600 Starship launches.

A couple months ago I wrote about illustrating cryptographic strength in terms of the amount of energy needed to break it, and how much water that much energy would boil. Let’s do something similar for a gigawatt-year.

It takes about 300 kilojoules of energy to boil a liter of water [1], so 30 petajoules would boil 100 billion liters of water. So a gigawatt-year of energy would be enough to boil Coniston Water, the third largest lake in England.

If you could convert a kilogram of matter to energy according to E = mc², this would release 90 petajoules. So a gigawatt-year is the energy in about 300 grams of matter.
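
The back-of-the-envelope figures in this post can be reproduced in a few lines. The sketch below takes the inputs quoted above (1200 watts per home, 50 terajoules per launch, 300 kilojoules per liter) at face value and assumes a year of 365.25 days; the variable names are made up for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
C = 299_792_458                      # speed of light in m/s

gw_year = 1e9 * SECONDS_PER_YEAR     # joules in a gigawatt-year

homes = 1e9 / 1200                   # homes powered at 1200 watts each
launches = gw_year / 50e12           # Starship launches at 50 terajoules each
liters = gw_year / 300e3             # liters of water boiled at 300 kilojoules each
grams = 1000 * gw_year / C**2        # mass equivalent via E = mc^2, in grams

print(f"{gw_year:.3g} J, {homes:.3g} homes, {launches:.3g} launches, "
      f"{liters:.3g} liters, {grams:.3g} g")
```

The unrounded results come out slightly above the rounded figures in the text, because a year is closer to 3.16 × 10⁷ seconds than to 3 × 10⁷.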

[1] In detail, boiling a liter of water here means raising its temperature from 20° C to 100° C at sea level.
