Proving you know a product

There is a way to prove that you know two numbers a and b, and their product c = ab, without revealing a, b, or c. This isn’t very exciting without more context — maybe you know that 7 × 3 = 21 — but it’s a building block of more interesting zero knowledge proofs, such as proving that a cryptocurrency transaction is valid without revealing the amount of the transaction.

The proof mechanism requires an elliptic curve G and a pairing of G with itself. (More on pairings shortly.) It also requires a generator g of the group structure on G.

The prover takes the three secret numbers and multiplies the generator g by each, encrypting the numbers as ag, bg, and cg. When G is a large elliptic curve, say one with on the order of 2^256 points, then computing products like ag can be done quickly, but recovering a from g and ag is impractical. In a nutshell, multiplication is easy but division [1] is practically impossible [2].

The verifier receives ag, bg, and cg. How can he verify that ab = c without knowing a, b, or c? Here’s where pairings come in.

I go more into pairings here, but essentially a pairing is a mapping from two groups to a third group

e: G1 × G2 → GT

such that

e(aP, bQ) = e(P, Q)^(ab).

In our case G1 and G2 are both equal to the group G above, and the target group GT doesn’t matter for our discussion here. Also, P and Q will both be our generator g.

By the defining property of a pairing,

e(ag, bg) = e(g, g)^(ab)

and

e(cg, g) = e(g, g)^c.

So if ab = c, then e(g, g)^(ab) and e(g, g)^c will be equal.
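To make this concrete, here is a sketch in Python assuming the py_ecc package and its bn128 module. BN128 is an asymmetric pairing, so the single group G and generator g above are replaced by the two standard generators that py_ecc exports as G1 and G2, and py_ecc's pairing function takes its G2 argument first.

from py_ecc.bn128 import G1, G2, multiply, pairing

a, b = 7, 3
c = a * b                    # the claim to verify: c = ab

# What the prover publishes; a, b, and c themselves stay secret
aP = multiply(G1, a)
bQ = multiply(G2, b)
cP = multiply(G1, c)

# The verifier's check: e(aP, bQ) = e(cP, Q), i.e. e(P, Q)^(ab) = e(P, Q)^c
assert pairing(bQ, aP) == pairing(G2, cP)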


[1] The literature will usually speak of discrete logarithms rather than division. The group structure on an elliptic curve is Abelian, and so it is usually written as addition. If you write the group operation as multiplication, then you’re taking logs rather than dividing. The multiplicative notation highlights the similarity to working in the multiplicative group modulo a large prime.

[2] The computation is theoretically possible but not possible in practice without spending enormous resources, or inventing a large scale quantum computer. This is the discrete logarithm assumption.

How to prove you know a discrete logarithm

In a high school math class, the solution to the equation

b^x = y

is the logarithm of y in base b. The implicit context of the equation is the real numbers, and the solution is easy to calculate.

The same problem in the context of finite groups is called the discrete logarithm problem, and it is difficult to solve for large groups. In particular, it is impractical to solve when working modulo a sufficiently large prime number or when working over a sufficiently large elliptic curve [1]. In either context, the exponential b^x can be computed efficiently but its inverse cannot.

Now suppose you want to prove that you know x without revealing x itself. That is, you’d like to construct a zero knowledge proof that you know x. How could you do this?

Here’s one way.

  1. You, the prover, create a random number r, compute t = b^r, and send the verifier t.
  2. The other party, the verifier, creates a random number c, the challenge, and sends it to you.
  3. You calculate s = r + cx and send s to the verifier.
  4. The verifier checks whether b^s = t y^c, and believes you if and only if equality holds.

Let’s see why this works.

First of all, what have you revealed to the verifier? Two values: t and s. The value t is the exponential of a random number, and so another random number. The value s is based on x, and so conceivably you’ve revealed your secret. But the verifier does not know r, only a value computed from r (i.e. t), and the verifier cannot recover r from t because this would require computing a discrete logarithm.

Next, why should b^s = t y^c? Because

b^s = b^(r + cx) = b^r b^(cx) = t (b^x)^c = t y^c.

Finally, why should the verifier believe you know x? If you don’t know x, but were able to come up with an s that satisfies the verifier, then you were able to compute the discrete logarithm of t y^c.
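Here is a minimal sketch of the protocol in Python, working in the multiplicative group modulo a prime. The modulus (the Mersenne prime 2^127 − 1) and the base b = 3 are illustrative choices, far smaller than anything you would use in practice.

from secrets import randbelow

p = 2**127 - 1            # a Mersenne prime, used here only for illustration
b = 3

x = randbelow(p)          # the prover's secret
y = pow(b, x, p)          # public value y = b^x

# Step 1: the prover commits to a random r
r = randbelow(p)
t = pow(b, r, p)

# Step 2: the verifier sends a random challenge c
c = randbelow(p)

# Step 3: the prover responds with s = r + cx
s = r + c*x

# Step 4: the verifier checks that b^s = t y^c
assert pow(b, s, p) == t * pow(y, c, p) % p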


[1] At least without a large-scale quantum computer. Shor’s algorithm could efficiently compute discrete logarithms if only there were a large quantum computer to run it on.

Efficiently computing multiple modular inverses at once

Suppose you have a large prime number M and you need to find the inverse of several numbers mod M.  Montgomery’s trick is a way to combine the computation of the inverses to take less time than computing the inverses individually. Peter Montgomery (1947–2020) came up with this trick in 1985.

We will illustrate Montgomery’s trick by inverting three numbers—a, b, and c—though the trick extends to any number of numbers. It is commonly used in cryptography.

Modular inverses are much slower to calculate than modular products, so doing fewer of the former and more of the latter is a good tradeoff. Montgomery’s method only calculates one modular inverse, regardless of how many numbers need to be inverted.

The idea is to directly invert the product of all the numbers and use multiplication to find the inverses of the individual numbers. In our case, we compute

x = ab
y = xc = abc
c⁻¹ = x y⁻¹
x⁻¹ = c y⁻¹
b⁻¹ = a x⁻¹
a⁻¹ = b x⁻¹

To show that this actually saves time, we’ll run some Python code to invert three random numbers modulo a very large prime, much larger than occurs in practice. The reason is to make the computation time longer and easier to demonstrate. In practice, Montgomery’s trick saves a little time off of a lot of calculations. Here we’ll save a lot of time off a handful of calculations.

import sys
import time
from secrets import randbelow

# raise the limit on converting very large integers to strings
sys.set_int_max_str_digits(100000)

# the 32nd Mersenne prime
M = 2**756839 - 1

def simple(a, b, c, M):
    return [pow(x, -1, M) for x in [a, b, c]]

def montgomery(a, b, c, M):
    x = a*b % M
    y = x*c % M
    yinv = pow(y, -1, M)    # the only modular inversion
    cinv = x*yinv % M       # (ab)/(abc) = 1/c
    xinv = c*yinv % M       # c/(abc) = 1/(ab)
    binv = a*xinv % M       # a/(ab) = 1/b
    ainv = b*xinv % M       # b/(ab) = 1/a
    return [ainv, binv, cinv]
    
a = randbelow(M)
b = randbelow(M)
c = randbelow(M)

start = time.perf_counter()
result = simple(a, b, c, M)
elapsed = time.perf_counter() - start
print(elapsed)

start = time.perf_counter()
result = montgomery(a, b, c, M)
elapsed = time.perf_counter() - start
print(elapsed)

When we ran this, the direct approach took 121.8 seconds, and Montgomery’s trick took 47.6 seconds.


RSA as a pairing

The last couple posts have been about group pairings, specifically Tate pairings as they’re used in cryptography. This post will show that RSA encryption can be seen as a special case of pairing-based cryptography.

The idea comes from Ben Lynn’s 2007 dissertation. Lynn is the “L” in BLS signatures—one of the topics in his dissertation—and in BLS elliptic curves.

A pairing is a bilinear mapping from two groups to a third group

e: G1 × G2 → GT.

Here bilinear means that if P is an element of G1 and Q is an element of G2, and a and b are nonnegative integers, then

e(aP, bQ) = e(P, Q)^(ab).

There are more criteria for a pairing to be useful in cryptography, but we won’t need those for this post.

Ben Lynn’s dissertation mentions that exponentiation is a special case of pairing if you let G1 and GT be the multiplicative group of the integers mod r and let G2 be the additive group of integers mod (r − 1). Then you can define a pairing by

e(g, a) = g^a.

Typically you can’t just write down a simple expression for a pairing, but in this case you can.
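Here is a toy numerical check of the bilinearity property for this pairing, with small illustrative numbers chosen just for this post. Writing the scalar multiples in G1 multiplicatively, the multiple u·g means g^u, while multiples in G2 are ordinary multiples mod r − 1.

r = 101                      # a prime, so exponents can be reduced mod r - 1
g, a = 2, 5                  # an element of G1 and an element of G2
u, v = 3, 7                  # scalars

lhs = pow(pow(g, u, r), (v * a) % (r - 1), r)   # e(u·g, v·a)
rhs = pow(pow(g, a, r), (u * v) % (r - 1), r)   # e(g, a)^(uv)
assert lhs == rhs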

RSA encryption corresponds to taking r = n = pq, where p and q are large primes. The modulus n is made public but the factorization into p and q is held secret. A message [1] is encrypted by exponentiation mod n where the exponent is the public key. In Lynn’s notation, the message is g and the public key is a.

The security of RSA encryption depends on the fact that you can’t recover g from g^a mod n unless you know a trapdoor, the factorization of n [2]. This is true of pairings more generally: it is not practical to recover the inputs to a pairing from the output unless you know a trapdoor.


[1] In practice, RSA isn’t used to encrypt entire messages. Instead, it is used to encrypt a key for a symmetric encryption algorithm such as AES, and that key is used to encrypt the message. This is done for efficiency.

[2] Or, more specifically, a private key that can easily be computed if you know the factorization of n. It’s conceivable that breaking RSA encryption is easier than factoring, but so far that does not appear to be the case.

Three-party Diffie-Hellman in one shot

Elliptic curve Diffie-Hellman

Given a point P on an elliptic curve E, and a random number a, aP means to add P to itself a times, using the addition on E. The point aP can be computed efficiently, even if a is a very large number [1]. However, if E has a large number of points, and if a is chosen at random from a large range, then it is not practical to compute a given P and aP.

This is the elliptic curve version of the discrete logarithm problem, and its presumed difficulty is the basis of the security of Diffie-Hellman key exchange.

Two-party Diffie-Hellman

With two-party Diffie-Hellman key exchange, two parties, Alice and Bob, generate random private keys a and b respectively. They agree on a point P on an elliptic curve E. Alice computes aP and sends it to Bob. Simultaneously Bob computes bP and sends it to Alice. Then Alice can compute

a(bP) = (ab)P

and Bob can compute

b(aP) = (ba)P = (ab)P.

Then both Alice and Bob know a shared secret, the point (ab)P on E, but neither party has revealed a private key.
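Here is a sketch of the exchange in Python, assuming the py_ecc package; the point P is the BN128 generator that py_ecc exports as G1.

from secrets import randbelow
from py_ecc.bn128 import G1, multiply, curve_order

a = randbelow(curve_order)     # Alice's private key
b = randbelow(curve_order)     # Bob's private key

aP = multiply(G1, a)           # Alice sends aP to Bob
bP = multiply(G1, b)           # Bob sends bP to Alice

# Both parties arrive at the same shared secret (ab)P
assert multiply(bP, a) == multiply(aP, b)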

Three-party Diffie-Hellman

You could extend the approach above to three parties, say adding Carol, but this would require extra communication: Alice could send (ab)P to Carol, which she could multiply by her private key c to obtain (abc)P. Similarly, everyone else could arrive at (abc)P. Each person has to do a computation, send and receive a message, do another computation, and send and receive another message.

Joux [2] came up with a way to do Diffie-Hellman key exchange with three people and only one round of sending and receiving messages. The setup uses a pairing e( , ) of two elliptic curve subgroups, G1 and G2, as in the previous post. Fix generators P ∈ G1 and Q ∈ G2. Each party multiplies P and Q by their private key and sends the results to the other two parties.

Alice receives bP from Bob and cQ from Carol. This is enough for her to compute

e(bP, cQ)^a = e(P, Q)^(abc).

Similarly, Bob receives aP from Alice and cQ from Carol, enabling him to compute

e(aP, cQ)^b = e(P, Q)^(abc).

And finally, Carol receives aP from Alice and bQ from Bob, enabling her to compute

e(aP, bQ)^c = e(P, Q)^(abc).

So all three parties can compute the shared secret e(P, Q)^(abc), but no party knows the other parties’ private keys.
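Here is a sketch of Joux’s protocol in Python, again assuming py_ecc; P and Q are the generators that py_ecc exports as G1 and G2, and its pairing function takes the G2 point first.

from secrets import randbelow
from py_ecc.bn128 import G1, G2, multiply, pairing, curve_order

a = randbelow(curve_order)     # Alice's private key
b = randbelow(curve_order)     # Bob's private key
c = randbelow(curve_order)     # Carol's private key

# Each party broadcasts its multiples of P and Q
aP, aQ = multiply(G1, a), multiply(G2, a)
bP, bQ = multiply(G1, b), multiply(G2, b)
cP, cQ = multiply(G1, c), multiply(G2, c)

# Each party combines what it received with its own private key
alice = pairing(cQ, bP) ** a
bob   = pairing(cQ, aP) ** b
carol = pairing(bQ, aP) ** c

assert alice == bob == carol   # the shared secret e(P, Q)^(abc)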

Footnotes

[1] If you want to multiply a point by 2^100, for example, you don’t carry out 2^100 additions; you carry out 100 doublings. Of course not every positive integer is a power of 2, but every positive integer is the sum of powers of 2, i.e. it can be written in binary. So as you’re doing your doublings, sum the terms that correspond to 1s in the binary representation of the number you’re multiplying by.
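Here is a short generic sketch of this double-and-add idea. The group operation is passed in as a function, so ordinary integer addition can stand in for point addition to sanity check it against plain multiplication.

def scalar_mult(n, P, add, zero):
    # Compute nP by double-and-add: scan the binary representation of n,
    # doubling P at each step and summing the doublings at the 1 bits.
    result = zero
    while n:
        if n & 1:
            result = add(result, P)
        P = add(P, P)
        n >>= 1
    return result

# Sanity check with integer addition standing in for the group law
assert scalar_mult(2**100 + 5, 7, lambda u, v: u + v, 0) == (2**100 + 5) * 7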

[2] Antoine Joux. A One Round Protocol for Tripartite Diffie–Hellman. Journal of Cryptology (2004) 17: 263–276.

Elliptic curve pairings in cryptography

Pairings can mean a variety of related things in group theory, but for our purposes a pairing is a bilinear mapping from two groups to a third group.

e: G1 × G2 → GT

Typically the group operation on G1 and G2 is written additively and the group operation on GT is written multiplicatively. In fact, GT will always be the multiplicative group of a finite field, i.e. GT consists of the non-zero elements of a finite field under multiplication. (The “T” stands for “target.”)

Here bilinear [1] means that if P is an element of G1 and Q is an element of G2, and a and b are nonnegative integers,

e(aP, bQ) = e(P, Q)^(ab).

There are a few provisos …


First, the pairing must be non-degenerate, i.e. e(P, Q) ≠ 1 for some P and Q.

Second, the pairing must be efficiently computable.

Third, the embedding degree must not be “too high.” This means that if GT is the multiplicative group of a field with p^k elements, k is not too big. We will look at two examples in which k = 12.

The second and third provisos are important even though they’re not stated rigorously.

Cryptography often speaks of pairing elliptic curves, but in fact it uses pairings of prime-order subgroups of the additive groups of elliptic curves. Because the subgroups have prime order, they are cyclic, and so the pairing is determined by its value on a generator from each subgroup.

Example: BN254

The previous post briefly mentioned a pairing between two elliptic curves, BN254 and alt_bn128, that is used in Ethereum and was used in Zcash in the original Sprout shielded protocol.

The elliptic curve BN254 is defined over the field Fp, the integers mod p, where

p = 21888242871839275222246405745257275088696311157297823662689037894645226208583,

and the elliptic curve alt_bn128 is defined over the field Fp[i], i.e. the field Fp, with an imaginary element i adjoined.

Both elliptic curves have a subgroup of order

r = 21888242871839275222246405745257275088548364400416034343698204186575808495617,

which is prime. So in the pairing the groups G1 and G2 are isomorphic to the integers mod r. The target group GT has order p^12 − 1, so the embedding degree k equals 12, which is “not too high.”
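If you want these constants programmatically, the py_ecc package (an assumed dependency) implements this curve pair in its bn128 module and exposes the prime and the subgroup order.

from py_ecc.bn128 import field_modulus, curve_order

print(field_modulus)   # the prime p above
print(curve_order)     # the subgroup order r above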

Example: BLS12-381

Another example also comes from Ethereum and Zcash. Ethereum uses BN254 in smart contracts, but it uses BLS12-381 in its consensus layer. Zcash switched from BN254 to BLS12-381 in the Sapling release.

BLS12-381 is defined over a prime order field with on the order of 2^381 elements and has embedding degree 12, hence the 12 and 381 in the name. The BLS stands for Paulo Barreto, Ben Lynn, and Michael Scott. Elliptic curve names often look mysterious, but they’re actually pretty descriptive. I discuss BLS12-381 in more detail here. As in the example above, BLS12-381 is defined over a field Fp and is paired with a curve over Fp[i], i.e. the same field with an imaginary element adjoined. The equation for BLS12-381 is

y² = x³ + 4

and the equation for the curve it is paired with is

y² = x³ + 4(1 + i)

As before the target group is the multiplicative group of a finite field of order p^12.


[1] You’ll also see bilinearity defined by

e(P + Q, R) = e(P, R) e(Q, R)

and

e(P, R + S) = e(P, R) e(P, S).

These definitions are equivalent. To see that the definition here implies the definition at the top, write out aP as P + P + … + P etc.

Since we’re working in subgroups of prime order, there is a generator for each subgroup. Write out each element as a multiple of a generator, then the definition at the top implies the definition here.

Physical Keys and Encryption Keys

A physical key, such as a house key, is a piece of metal with cuts of differing depths. Typically there may be around 6 cuts, with five different possible depths for each cut. This allows 5^6 = 15,625 possible keys.

Encryption keys, such as AES keys, are a string of bits, often 128 bits, for a total of 2^128 possible keys.

How long would a physical key have to be to have the same level of security as an encryption key? We’d need to solve

5^n = 2^128

which means

n = 128 / log₂ 5 ≈ 55.12.

So we’d need a key with around 55 notches.

metal key with 55 notches
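For the record, the arithmetic is a one-liner in Python.

from math import log2

print(128 / log2(5))   # prints about 55.1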

This only takes into account combinatorial possibilities, not the difficulty of attacking a physical key or a binary key. There are incomparably more possibilities for binary keys, but encryption attacks can be automated and carried out remotely (unless a computer is air gapped). A physical lock can only be attacked in person. It takes a lock picker orders of magnitude more time to try a key than a password cracking program. On the other hand, locks aren’t picked by trying thousands of keys.

Related post: Measuring cryptographic strength in liters of boiling water

RSA with multiple primes

Typically RSA public keys are the product of two large primes, n = pq. But there’s no reason they couldn’t be the product of, say, three primes, n = pqr, or more primes, as long as φ(n), or λ(n), is calculated correctly.

Encryption is done the same way. Decryption could be done the same way, except there is the opportunity for it to be more efficient. The trick is to use the CRT (Chinese Remainder Theorem) in a way similar to Garner’s algorithm. This is why RSA with multiple primes is sometimes used for digital signatures.
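Here is a sketch of three-prime RSA with CRT-based decryption. The parameters are toy choices for illustration (256-bit primes from sympy’s randprime), and the recombination below is a plain CRT rather than Garner’s algorithm proper.

from sympy import randprime

p = randprime(2**255, 2**256)
q = randprime(2**255, 2**256)
r = randprime(2**255, 2**256)
n = p*q*r
phi = (p - 1)*(q - 1)*(r - 1)

e = 65537
d = pow(e, -1, phi)

m = 123456789            # a message, m < n
c = pow(m, e, n)         # encryption works just as in two-prime RSA

# Decryption via the CRT: work modulo each prime, then recombine
parts = []
for prime in (p, q, r):
    m_i = pow(c, d % (prime - 1), prime)    # partial decryption mod one prime
    N_i = n // prime
    parts.append(m_i * N_i * pow(N_i, -1, prime))
assert sum(parts) % n == m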

The difficulty of factoring n using the GNFS algorithm doesn’t change depending on the number of factors n has, as long as all the factors are sufficiently large, far too large to find using trial division.

Daniel Bernstein’s post-quantum RSA paper was based on keys that are the product of a large number of 4096-bit primes. This way all the arithmetic is carried out modulo 4096-bit primes, not modulo terabyte primes.

A quiet change to RSA

An RSA public key is a pair of numbers (e, n) where e is an exponent and n = pq where p and q are large prime numbers. The original RSA paper said choose a private key d and compute e. In practice now we choose e and compute d. Furthermore, e is now almost always 65537 for reasons given here. So the public key is essentially just n.

Euler’s totient function

The relationship between the exponent and the private decryption key in the original RSA paper was

ed = 1 mod φ(n).

It is easy to compute e given d, or d given e, when you know Euler’s totient function of n,

φ(n) = (p − 1)(q − 1).

The security of RSA encryption rests on the assumption that it is impractical to compute φ(n) unless you know p and q.

Carmichael’s totient function

Gradually over the course of several years, the private key d changed from being the solution to

ed = 1 mod φ(n)

to being the solution to

ed = 1 mod λ(n)

where Euler’s totient function φ(n) was replaced with Carmichael’s totient function λ(n).

The heart of the original RSA paper was Euler’s generalization of Fermat’s little theorem which says if a is relatively prime to n, then

a^φ(n) = 1 (mod n).

Carmichael’s λ(n) is defined to be the smallest number that can replace φ(n) in the equation above. It follows that λ(n) divides φ(n).

Why the change?

Using Carmichael’s totient rather than Euler’s totient results in smaller private keys and thus faster decryption. When n = pq for odd primes p and q,

λ(n) = lcm( (p − 1), (q − 1) ) = (p − 1)(q − 1) / gcd( (p − 1), (q − 1) )

so λ(n) is smaller than φ(n) by a factor of gcd( (p − 1), (q − 1) ). This factor is at least 2 since p − 1 and q − 1 are both even.
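For a quick numerical illustration with small primes (chosen arbitrarily here), the ratio φ(n)/λ(n) is exactly this gcd.

from math import gcd, lcm

p, q = 1009, 2017
phi = (p - 1)*(q - 1)
lam = lcm(p - 1, q - 1)
print(phi // lam, gcd(p - 1, q - 1))   # both values are 1008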

However, an experiment suggests the savings is trivial. When I generated ten RSA public keys, the gcd was never more than 8.

from sympy import randprime, gcd

for _ in range(10):
    p = randprime(2**1023, 2**1024)
    q = randprime(2**1023, 2**1024)
    print(gcd(p-1, q-1))

I repeated the experiment with 100 samples. The median of the gcd’s was 2, the mean was 35.44, and the maximum was 2370. So while the gcd might occasionally be moderately large, it is usually just 2 or 4.

Better efficiency

The efficiency gained from using Carmichael’s totient is minimal. More efficiency can be gained by using Garner’s algorithm.

Time needed to factor large integers

The optimal choice of factoring algorithm depends on the size of the number you want to factor. For numbers larger than a googol (10^100) the GNFS (general number field sieve) algorithm is the fastest known factoring algorithm, making GNFS the algorithm of choice for trying to factor public keys for RSA encryption.

The expected time required to factor a number n using the GNFS is proportional to

exp( f(n) g(n) )

where f(n) is relatively constant and g(n) varies more strongly with n. More specifically

f(n) = (64/9)^(1/3) + o(1)

and

g(n) = (log n)^(1/3) (log log n)^(2/3).

The notation o(1) means a term that goes to zero as n increases. Very often in practice you can ignore o(1) terms when your value of n is large. In cryptographic applications n is certainly large, at least 2^1024, and yet the o(1) term is still significant for values of n seen in practice. I’ve seen one source say that for keys used in practice the o(1) term is around 0.27.

Security levels

It is widely stated that factoring a 1024-bit private key is comparable in difficulty to breaking an 80-bit symmetric encryption key, i.e. that 1024-bit keys provide 80-bit security.

To find the security level of a 1024-bit RSA key, the size of the o(1) term matters. But given this, if we want to find the security level of more RSA key sizes, it doesn’t matter how big the o(1) term is. All that matters is that the term is roughly constant over the range of key sizes we are interested in.

If f(n) is relatively constant, then the log of the time to factor n is proportional to g(n), and the log of the time to break an encryption with security level k is proportional to k. So the security level of a key n should be proportional to g(n). Let’s see if that’s the case, using the reported security levels of various key sizes.

import numpy as np

# RSA key sizes and security levels
data = {
    1024 : 80,
    2048 : 112,
    3072 : 128,
    4096 : 140,
    7680 : 192,
}
for d in data:
    x = d*np.log(2) # natural log of 2^d
    y = x**(1/3)*(np.log(x)**(2/3)) 
    print(data[d]/y)

This prints the following:

2.5580
2.6584
2.5596
2.4819
2.6237

So the ratio is fairly constant, as expected. All the reported security levels are multiples of 4, which suggests there was some rounding done before reporting them. This would account for some of the variation above.

The output above suggests that the security level of an RSA key with b bits is roughly

2.55 x^(1/3) log(x)^(2/3)

where x = log(2) b.
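Packaged as a function, with the constant 2.55 taken from the fit above, so the output is only a rough estimate:

from math import log

def rsa_security_level(bits):
    # roughly 2.55 x^(1/3) (log x)^(2/3) with x = bits * log 2
    x = bits * log(2)
    return 2.55 * x**(1/3) * log(x)**(2/3)

print(rsa_security_level(2048))   # prints roughly 107, close to the nominal 112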

Aside on RSA security

RSA encryption is not broken by factoring keys but by exploiting implementation flaws.

Factoring a 2048-bit RSA key would require more energy than the world produces in a year, as explained here.