Proving you know a product

There is a way to prove that you know two numbers a and b, and their product cab, without revealing ab, or c. This isn’t very exciting without more context — maybe you know that 7 × 3 = 21 — but it’s a building block of more interesting zero knowledge proofs, such as proving that a cryptocurrency transaction is valid without revealing the amount of the transaction.

The proof mechanism requires an elliptic curve G and a pairing of G with itself. (More on pairings shortly.) It also requires a generator g of the group structure on G.

The prover takes the three secret numbers and multiplies the generator g by each, encrypting the numbers as agbg, and cg. When G is a large elliptic curve, say one with on the order of 2256 points, then computing products like ag can be done quickly, but recovering a from g and ag is impractical. In a nutshell, multiplication is easy but division [1] is practically impossible [2].

The verifier receives agbg, and cg. How can he verify that ab = c without knowing ab, or c? Here’s where pairing come in.

I go more into pairings here, but essentially a pairing is a mapping from two groups to a third group

eG1 × G2 → GT

such that

e(aPbQ) = e(PQ)ab.

In our case G1 and G2 are both equal to the group G above, and the target group GT doesn’t matter for our discussion here. Also, P and Q will both be our generator g.

By the defining property of a pairing,

e(agbg) = e(gg)ab

and

e(cgg) = e(gg)c.

So if abc, then e(gg)ab and e(gg)c will be equal.

Related posts

[1] The literature will usually speak of discrete logarithms rather than division. The group structure on an elliptic curve is Abelian, and so it is usually written as addition. If you write the group operation as multiplication, then you’re taking logs rather than dividing. The multiplicative notation highlights the similarity to working in the multiplicative group modulo a large prime.

[2] The computation is theoretically possible but not possible in practice without spending enormous resources, or inventing a large scale quantum computer. This is the discrete logarithm assumption.

How to prove you know a discrete logarithm

In a high school math class, the solution to the equation

bxy

is the logarithm of y in base b. The implicit context of the equation is the real numbers, and the solution is easy to calculate.

The same problem in the context of finite groups is called the discrete logarithm problem, and it is difficult to solve for large groups. In particular, it is impractical to solve when working modulo a sufficiently large prime number or when working over a sufficiently large elliptic curve [1]. In either context, the exponential bx can be computed efficiently but its inverse cannot.

Now suppose you want to prove that you know x without revealing x itself. That is, you’d like to construct a zero knowledge proof that you know x. How could you do this?

Here’s one way.

  1. You, the prover, create a random number r, compute t = br, and send the verifier t.
  2. The other party, the verifier, creates a random number c, the challenge, and sends it to you.
  3. You calculate sr + cx and send s to the verifier.
  4. The verifier checks whether bst yc. and believes you if and only if equality holds.

Let’s see why this works.

First of all, what have you revealed to the prover? Two values: t and s. The value t is the exponential of a random number, and so another random number. The value s is based on x, and so conceivably you’ve revealed your secret. But the verifier does not know r, only a value computed from r (i.e. t) and the verifier cannot recover r from t because this would require computing a discrete logarithm.

Next, why should bst yc? Because

bsbr + cx = br bcx = t (bx)c = t yc.

Finally, why should the verifier believe you know x? If you don’t know x, but were able to come up with an s that satisfies the verifier, then you were able to compute the discrete logarithm of t yc.

Related posts

[1] At least without a large-scale quantum computer. Shor’s algorithm could efficiently compute discrete logarithms if only there were a large quantum computer to run it on.

What is a Pedersen commitment?

I’m taking a break from my series on celestial navigation. The previous posts give the basics, but I haven’t thought of a way to go further that I’m satisfied with. So now for something completely different: Pedersen commitments.

Pedersen commitments are a building block of zero knowledge proofs (ZKP), and they give an opportunity to look at a couple other interesting topics: nothing-up-my-sleeve constructions and homomorphic encryption.

A Pedersen commitment to a value v takes a random number x and two generators of an elliptic curve, G and H, and returns

C = vGxH.

The significance of C is that it appears to be a random number to the recipient, but the sender who calculated it can later show that it was computed from v and xC is called a commitment to the value v because the sender cannot later say that C was computed from a different v and a different x.

Mathematical details

The addition in

C = vGxH

is carried out on an elliptic curve, such as Ed25519 in the case of Monero. Multiplication is defined by repeated addition, though it’s not computed that way [1]. G and H are not just points on the elliptic curve but points in a large, prime-order subgroup of the elliptic curve.

Because the value x is random, the possible values of C are uniformly distributed on the curve, and so someone observing C learns nothing about v. For that reason x is called a blinding factor.

The difficulty of the discrete logarithm problem insures that it is impractical come up with different values v‘ and x‘ such that

v Gx H = v‘ Gx‘ H.

This depends on two assumptions.

The first assumption is that the discrete logarithm problem is hard to solve given current algorithms and hardware. The prevailing opinion is that it is unlikely anyone will come up with an efficient algorithm for solving the discrete logarithm problem on current hardware. However, Shor’s algorithm could solve the discrete logarithm problem efficiently if and when a practical, large-scale quantum computer is created.

The second assumption is that the generator H was chosen at random and not calculated to be a backdoor.

How to make and use a backdoor

Because G and H are members of the same prime-order (i.e. cyclic) group, there exists some integer h such

HhG

If the generator H was randomly selected, nobody knows h and nobody can calculate it. But if H was calculated by first selecting h and multiplying hG then there is a backdoor.

Now

C = vGxH = vGx(hG) = (vxh)G.

If you know h, you can pick a new v‘ and solve for x‘ such that

vxhv‘ + xh.

That would mean that in the context of a cryptocurrency that uses Pedersen commitments, such as Monero or the Liquid Network on top of Bitcoin, you could initially commit to spending v and later claimed that you committed to spending v‘.

Note that solving for x‘ requires modular arithmetic, not solving the discrete logarithm problem.

How to prove no backdoor

The way to prove that the generator H was chosen in good faith is to be transparent about how it was created. In practice this means using some sort of cryptographic hash function. For example, Bulletproofs hashed “bulletproof_g” and “bulletproof_h” to create its values of G and H. Bulletproofs require multiple values of G and H and so consecutive integers were concatenated to the strings before hashing.

Reversing a cryptographic hash like SHA256 is impractical, even assuming you have a quantum computer, and so it is extremely unlikely that there is a backdoor when the generators were created by hashing a natural string.

It’s said that Pedersen commitments do not require a trusted setup. That’s true in spirit, but more precisely they require a transparent setup that is easy to trust.

Homomorphic encryption

The function

C: (vx) ↦ vG + xH

is a group homomorphism from pairs of integers to the subgroup generated by G and H. This means that

C(vx) + C(v‘, x‘) = C(vv‘, xx‘)

or in other words, you can combine multiple commitments into a single commitment. The sum of a commitment to (vx) and a commitment to (v‘, x‘) is a commitment to (vv‘, xx‘).

Related posts

[1] In practice the number x is enormous, say on the order of the number of points on the elliptic curve, and so software does not add H to itself x times. Instead it uses a process analogous to fast exponentiation. In fact, if you write the group operation multiplicatively rather than additively, it is exactly fast exponentiation.

Zero knowledge proof of compositeness

A zero knowledge proof (ZKP) answers a question without revealing anything more than answer. For example, a digital signature proves your possession of a private key without revealing that key.

Here’s another example, one that’s more concrete than a digital signature. Suppose you have a deck of 52 cards, 13 of each of spades, hearts, diamonds, and clubs. If I draw a spade from the deck, I can prove that I drew a spade without showing which card I drew. If I show you that all the hearts, diamonds, and clubs are still in the deck, then you know that the missing card must be a spade.

Composite numbers

You can think of Fermat’s primality test as a zero knowledge proof. For example, I can convince you that the following number is composite without telling you what its factors are.

n = 244948974278317817239218684105179099697841253232749877148554952030873515325678914498692765804485233435199358326742674280590888061039570247306980857239550402418179621896817000856571932268313970451989041

Fermat’s little theorem says that if n is a prime and b is not a multiple of n, then

bn−1 = 1 (mod n).

A number b such that bn−1 ≠ 1 (mod n) is a proof that n is not prime, i.e. n is composite. So, for example, b = 2 is a proof that n above is composite. This can be verified very quickly using Python:

    >>> pow(2, n-1, n)
    10282 ... 4299

I tried the smallest possible base [1] and it worked. In general you may have to try a few bases. And for a few rare numbers (Carmichael numbers) you won’t be able to find a base. But if you do find a base b such that bn−1 is not congruent to 1 mod n, you know with certainty that n is composite.

Prime numbers

The converse of Fermat’s little theorem is false. It can be used to prove a number is not prime, but it cannot prove that a number is prime. But it can be used to show that a number is probably prime. (There’s some subtlety as to what it means for a number to probably be prime. See here.)

Fermat’s little theorem can give you a zero knowledge proof that a number is composite. Can it give you a zero knowledge proof that a number is prime? There are a couple oddities in this question.

First, what would it mean to have a zero knowledge proof that a number is prime? What knowledge are you keeping secret? When you prove that a number is composite, the prime factors are secret (or even unknown), but what’s the secret when you say a number is prime? Strictly speaking a ZKP doesn’t have to keep anything secret, but in practice it always does.

Second, what about the probability of error? Zero knowledge proofs do not have to be infallible. A ZKP can have some negligible probability of error, and usually do.

It’s not part of the definition, but practical ZKPs must be easier to verify than the direct approach to what they prove. So you could have something like a primality certificate that takes far less computation to verify than the computation needed to determine from scratch that a number is prime.

Proving other things

You could think of non-constructive proofs as ZKPs. For example, you could think of the intermediate value theorem as a ZKP: it proves that a function has a zero in an interval without giving you any information about where that zero may be located.

What makes ZKPs interesting in application is that they can prove things of more general interest than mathematical statements [2]. For example, cryptocurrencies can provide ZKPs that accounting constraints hold without revealing the inputs or outputs of the transaction. You could prove that nobody tried to spend a negative amount and that the sum of the inputs equals the sum of the outputs.

Related posts

[1] You could try b = 1, but then bn−1 is always 1. This example shows that the existence of a base for which bn−1 = 1 (mod n) doesn’t prove anything.

[2] You might object that accounting rules are mathematical statements, and of course they are. But they’re of little interest to mathematicians and of great interest to the parties in a transaction.

Lewis Carroll and Zero Knowledge Proofs

Illustration from Through the Looking Glass

Elliptic curves are often used in cryptography, and in particular they are used in zero-knowledge proofs (ZKP). Cryptocurrencies such as Zcash use ZKP to protect the privacy of users.

Several of the elliptic curves used in ZKP have whimsical names taken from characters by Lewis Carroll. This post will look at these five elliptic curves:

  • Jubjub
  • Baby Jubjub
  • Bandersnatch
  • Tweedledee
  • Tweedledum

Charles Dodgson was a mathematician, and perhaps there’s some connection from his mathematical work to elliptic curves and ZKP, the connection explored here is with his literary works written under the name Lewis Carroll.

Jabberwocky names

The first three curves—Jubjub, Baby Jubjub, and Bandersnatch—all get their name from Lewis Carroll’s poem Jabberwocky.

“Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!”

These curves all have a twisted Edwards curve form and a Montgomery curve form, just like the relationship between Ed25519 and Curve25519 that I wrote about a few days ago.

As its name suggests, the Baby Jubjub elliptic curve is related to the Jubjub curve but smaller.

Bandersnatch is similar to Jubjub, but arithmetic over this curve can be implemented more efficiently.

Looking Glass names

The last two curves—Tweedledum and Tweedledee—take their names from Through the Looking Glass.

And as their names suggest, Tweedledum and Tweedledee and very closely related. Both have the equation

y² = x³ + 5

but over different fields. Tweedledum is defined over the integers mod p and has q elements. Tweedledee is defined over the integers mod q and has p elements. (Mind your ps and qs!)

Here

p = 2254 + 4707489545178046908921067385359695873
q = 2254 + 4707489544292117082687961190295928833

Note that Tweedledum and Tweedledee are a “pair” of curves used together, but they do not form a pairing in the technical sense of that word.

More Lewis Carroll posts

More elliptic curve posts