Why use hash puzzles for proof-of-work?

A couple days ago I wrote about the problem that Bitcoin requires to be solved as proof-of-work. In a nutshell, you need to tweak a block of transactions until the SHA256 double hash of its header is below a target value [1]. Not all cryptocurrencies use proof of work, but those that do mostly use hash-based puzzles.

Other cryptocurrencies use a different hashing problem, but they still use hashes. Litecoin and Dogecoin use the same proof-of-work problem, similar to the one Bitcoin uses, but with the scrypt (pronounced S-crypt) hash function. Several cryptocurrencies use a hashing problem based on Equihash. Monero uses its RandomX algorithm for proof-of-work, and although this algorithm has multiple components, it ultimately solves a hashing problem. [2]

Why hash puzzles?

Why do cryptocurrencies use hashing problems for proof of work? In principle they could use any computational problem that is hard to solve but easy to verify, such as numerous problems in computational number theory.

One reason is that computer scientists are confident that quantum computing would not reduce the difficulty of solving hash puzzles, even though it would reduce the difficulty of factoring-based puzzles. Also, there is general agreement that it’s unlikely a mathematical breakthrough will find a weakness in hashing functions.

Ian Cassels said “Cryptography is a mixture of mathematics and muddle, and without the muddle the mathematics can be used against you.” Hashing is much more muddle than mathematics.

Why not do something useful?

Hash puzzles work well for demonstrating work done, but they’re otherwise useless. They keep the wheels of cryptocurrencies turning, but the solutions themselves are intrinsically worthless.

Wouldn’t it be nice if crypto miners were solving useful problems like protein folding? You could do that. In fact there is a cryptocurrency, FoldingCoin, that does just that. But FoldingCoin has a market cap seven orders of magnitude smaller than Bitcoin’s, on the order of $200,000 compared to Bitcoin’s market cap of $2T.

Cryptocurrencies that use proof of useful work have not taken off. This might necessarily be the case. Requiring practical work creates divergent incentives. If you base a currency on the difficulty of protein folding computations, for example, it would cause major disruption if a pharmaceutical company decided to get into mining protein folding cryptocurrency at a loss because it values the results.

Going back to Cassels’ remark about mathematics and muddle, practical real-world problems often have a mathematical structure. Which is a huge blessing, except when you’re designing problems to be hard. Hash-based problems have gradually become easier to solve over time, and cryptocurrencies have adjusted. But a mathematical breakthrough for solving a practical problem would have the potential to disrupt a currency faster than the market could adapt.

[1] You don’t change the transaction amounts, but you may change the order in which the transactions are arranged into a Merkle tree so that you get different hash values. You can also change a 32-bit nonce, and a timestamp, but most of the degrees of freedom you need in order to find an acceptable hash come from rearranging the tree.

[2] Both scrypt and Equihash were designed to be memory intensive and to thwart the advantage of custom ASIC mining hardware. However, people have found ways to use ASIC hardware to solve scrypt and Equihash problems. RandomX requires running a randomly generated program before hashing the output, in an attempt to frustrate efforts to develop specialized mining hardware.

What is the Bitcoin proof-of-work problem?

In order to prevent fraud, anyone wanting to add a block to the Bitcoin blockchain must prove that they’ve put in a certain amount of computational work. This post will focus on what problem must be solved in order to produce proof of work.

You’ll see the proof of work function described as finding strings whose SHA256 hash value begins with a specified number of 0s. That’s sort of a zeroth-level approximation of the problem.

The string s you’re trying to find has the form data + nonce where the data comes from the block you’re wanting to add and the nonce is a value you concatenate on the end. You try different values until you get an acceptable hash value.

You’re not computing SHA256(s) but rather the double hash:

SHA256²(s) = SHA256( SHA256(s) )

The only way to find such a string s is by brute force [1], and applying the hash function twice doubles the amount of brute force work needed.
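As a minimal sketch, here is the double hash computed with Python’s standard hashlib module. The function name sha256d is my own label for illustration, not something taken from the Bitcoin source.

```python
import hashlib

def sha256d(s: bytes) -> bytes:
    """Return SHA256(SHA256(s)) as 32 raw bytes."""
    inner = hashlib.sha256(s).digest()      # first pass over the input
    return hashlib.sha256(inner).digest()   # second pass over the 32-byte digest

digest = sha256d(b"hello")
print(digest.hex())
```

Note that the second hash is applied to the raw 32-byte digest, not to its hexadecimal text.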

And you’re not exactly trying to produce leading zeros; you’re trying to produce a value less than a target T. This is roughly the same thing, but not quite.

To illustrate this, suppose you have a 2FA fob that generates six-digit random numbers. You’ve been asked to wait until your fob generates a number less than 2025. Waiting until you have three leading zeros would be sufficient, but that would be making the task harder than it needs to be. You’d be waiting for a number less than 1000 when you’re only asked to wait for a number less than 2025.

A SHA256 hash value is a 256-bit number. If your target T is a power of 2

T = 2²⁵⁶⁻ⁿ

then finding a value of s such that

SHA256²(s) < T

really is finding an s whose (double) hash begins with n zeros, though T is not required to be a power of 2.
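Here is a small check of that claim, treating hash values as plain Python integers. The helper leading_zero_bits and the particular values of n and the targets are my own choices for illustration.

```python
def leading_zero_bits(h: int, width: int = 256) -> int:
    """Number of leading zero bits when h is written as a width-bit number."""
    return width - h.bit_length()

n = 20
T = 2**(256 - n)  # a power-of-2 target

# With a power-of-2 target, "h < T" and "at least n leading zeros" agree.
for h in (0, 1, T - 1, T, T + 1):
    assert (h < T) == (leading_zero_bits(h) >= n)

# With a target that is not a power of 2, counting zeros is not enough.
T2 = 3 * 2**234                  # lies between 2**235 and 2**236
h_ok = 2**235                    # 20 leading zeros, below T2
h_bad = 2**235 + 2**234 + 1      # also 20 leading zeros, but above T2
print(h_ok < T2, h_bad < T2)     # True False
```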

Finding a value of s with

SHA256²(s) < 2²⁵⁶⁻ⁿ

will require, on average, testing 2ⁿ values of s.
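The following toy search illustrates the brute force loop. It is a sketch under simplifying assumptions: the data is an arbitrary byte string rather than a real block header, the byte order is my own choice rather than Bitcoin’s, and n is tiny so the loop finishes quickly. The names sha256d and mine are hypothetical.

```python
import hashlib

def sha256d(s: bytes) -> bytes:
    """Double SHA256 of s."""
    return hashlib.sha256(hashlib.sha256(s).digest()).digest()

def mine(data: bytes, n: int):
    """Search 32-bit nonces until the double hash of data + nonce is below 2**(256 - n)."""
    target = 2**(256 - n)
    for nonce in range(2**32):
        s = data + nonce.to_bytes(4, "little")   # s = data + nonce
        h = int.from_bytes(sha256d(s), "big")    # hash read as a 256-bit integer
        if h < target:
            return nonce, h
    raise RuntimeError("no 32-bit nonce works for this data")

nonce, h = mine(b"toy block", 12)
print(nonce, hex(h))
```

With n = 12 the loop should take on the order of 2¹² = 4096 tries, though any particular run may need more or fewer.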

The value of T is adjusted over time in order to keep the amount of necessary work roughly constant. As miners have become more efficient, the value of T has gone down and the amount of work has gone up. But the value of T can go up as well. It is currently fluctuating around 2¹⁷⁶, i.e. hashes must have around 80 leading zero bits.

Now here’s where things get a little more complicated. I said at the beginning that the string s has the form

s = data + nonce

where the data comes from the block you’re trying to add and the nonce is a number you twiddle in order to get the desired hash value. But the nonce is a 32-bit integer. If you need to hash on the order of 2⁸⁰ strings in order to find one with 80 leading zeros, you can’t do that just by adjusting a 32-bit nonce.
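To put a number on the shortfall, here is the arithmetic using the rough figure of 80 leading zero bits from above. The variable names are mine.

```python
hashes_needed = 2**80    # rough number of hashes to try, per the estimate above
nonce_values = 2**32     # distinct values of a 32-bit nonce

# Number of different versions of the data needed, each paired with every nonce.
data_variants = hashes_needed // nonce_values
assert data_variants == 2**48
print(data_variants)     # 281474976710656
```

So on average you would run through the entire nonce range about 2⁴⁸ times, each time with different data.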

In practice you’re going to have to twiddle the contents of what I’ve called data. The data contains a Merkle tree of transactions, and you can change the hash values by adjusting the order in which transactions are arranged in the tree, in addition to adjusting the nonce.

[1] Unless someone finds a flaw in SHA256, which cryptographers have been trying to do for years and have not been able to do. And if a significant weakness is found in SHA256, it may not translate into a significant flaw in SHA256².

Golden hospital gowns

Here’s something I posted on X a couple days ago:

There’s no direct connection between AI and cryptocurrency, but they have a similar vibe.

They both leave you wondering whether the emperor is sumptuously clothed, naked, or a mix of both.

Maybe he’s wearing a hospital gown with gold threads.

In case you’re unfamiliar with the story, this is an allusion to The Emperor’s New Clothes, one of the most important stories in literature.

I propose golden hospital gown as a metaphor for things that are a fascinating mixture of good and bad, things that have large numbers of haters and fanboys, both with valid points. There’s continual improvement and a lot of work to be done sorting out what works well and what does not.

I tried to get Grok to create an image of what I had in mind by a golden hospital gown. The results were not what I wanted, but passable, which is kinda the point of the comment above. It’s amazing that AI can produce anything remotely resembling a desired image starting from a text description. But there is a very strong temptation to settle for mediocre and vaguely creepy images that aren’t what we really want.

[Image: man in a yellow hospital gown with hairy back and legs exposed]

Base 58 encoding and Bitcoin addresses

A few weeks ago I wrote about base32 and base64 encoding. I’ll review these quickly then discuss base58 and its use in Bitcoin.

Base32 and Base64

All three methods have the goal of compactly representing large numbers while maintaining readability. Douglas Crockford’s base32 encoding is the most conservative: it’s case-insensitive and it does not use the letters I, L, O, or U. The first three letters are omitted because of visual similarity to digits, and the last to avoid “accidental obscenities.”

Base64 is not concerned with avoiding visual similarities, and uses the full upper and lower case alphabet, plus two more symbols, + and /.

Base58

Base58 is nearly as efficient as base64, but more concerned about confusing letters and numbers. The number 1, the lower case letter l, and the upper case letter I all look similar, so base58 retains the digit 1 and does not use the lower case letter l or the capital letter I.

The number 0 looks like the lower case letter o and the upper case letter O. Here base58 makes an unusual choice: it keeps the lower case letter o, but does not use the digit 0 or the capital letter O. This is odd because every other encoding that I can think of keeps the 10 digits and differs over what letters to use.

Bases like 32 and 64 have the advantage of being trivial to convert back and forth with binary. To convert a binary number to base 2ⁿ, you start at the least significant end and convert groups of n bits. Since 58 is not a power of 2, converting to base 58 is more involved.
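The contrast can be sketched in a few lines of Python. The function names are my own, and the alphabet string is the one Bitcoin uses, which follows the rules described above. This sketch encodes an integer only; Bitcoin’s actual address encoding also has to handle leading zero bytes and a checksum, which I leave out.

```python
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def to_base_2n(x: int, n: int) -> list:
    """Digits of x in base 2**n, found by peeling off n bits at a time."""
    mask = (1 << n) - 1
    digits = [x & mask]          # least significant group first
    x >>= n
    while x:
        digits.append(x & mask)
        x >>= n
    return digits[::-1]          # most significant digit first

def to_base58(x: int) -> str:
    """Base58 text for a non-negative integer, by repeated division by 58."""
    if x == 0:
        return B58[0]
    chars = []
    while x > 0:
        x, r = divmod(x, 58)     # requires big-integer division, not bit shifts
        chars.append(B58[r])
    return "".join(reversed(chars))

print(to_base_2n(0b1111100001, 5))   # [31, 1]
print(to_base58(58))                 # 21
```

The power-of-2 case only needs masks and shifts on fixed-size groups of bits, while base58 needs division of the whole number at each step.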

Bitcoin addresses

Bitcoin addresses are written in base58, and in fact base58 was developed for Bitcoin.

A Bitcoin address is a 25 byte (200 bit) number. Now

log₅₈ 2²⁰⁰ ≈ 34.14

and so it may take up to 35 characters to represent a Bitcoin address in base58. Using base64 would have taken up to 34 characters, so base58 pays a very small price for preventing a class of errors relative to base64. Base32 would require 40 characters.
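These character counts are easy to check. The following is just the arithmetic above in Python, with variable names of my choosing.

```python
import math

bits = 200  # size of a Bitcoin address in bits

base58_chars = math.ceil(bits / math.log2(58))
base64_chars = math.ceil(bits / 6)   # 6 bits per base64 character
base32_chars = math.ceil(bits / 5)   # 5 bits per base32 character

print(round(bits / math.log2(58), 2))              # 34.14
print(base58_chars, base64_chars, base32_chars)    # 35 34 40

# Exact integer check that 35 base58 characters are needed and suffice.
assert 58**34 < 2**200 <= 58**35
```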

As noted above, converting between binary and base58 is more complicated than converting between binary and either base32 or base64. However, converting to base58 is trivial compared to everything else that goes into forming a Bitcoin address. The steps, documented here, involve taking an ECDSA public key, applying a secure hash function three times, and appending a checksum.

Bitcoin key mechanism and elliptic curves over finite fields

Bitcoin uses the Elliptic Curve Digital Signature Algorithm (ECDSA) based on elliptic curve cryptography. The particular elliptic curve is known as secp256k1, which is the curve

y² = x³ + 7

over a finite field (a.k.a. Galois field) to be described shortly.

[Image: graph of elliptic curve y² = x³ + 7]

Addition on elliptic curves in the plane is defined geometrically in terms of where lines intersect the curve. We won’t go into the geometry here, except to say that it boils down to a set of equations involving real numbers. But we’re not working over real numbers; we’re working over a finite field.

Finite field modulus

The idea is to take the equations motivated by the geometry in the plane then use those equations to define addition when you’re not working over real numbers but over a different field. In the case of secp256k1, the field is the finite field of integers mod p where

p = 2²⁵⁶ − 2³² − 977

Here p was chosen to be relatively close to 2²⁵⁶. It’s not the largest prime less than 2²⁵⁶; there are a lot of primes between p and 2²⁵⁶. Other factors also went into the choice of p. Note that we’re not working in the integers mod p per se; we’re working in an Abelian group whose addition law is defined by an elliptic curve over the integers mod p.
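As a quick sanity check, the formula for p can be compared with its hexadecimal form: 55 F’s, an E, and then FFFFFC2F. The snippet below is only a check of the arithmetic, not part of any standard.

```python
p = 2**256 - 2**32 - 977

# p is a 256-bit number, so it is indeed just below 2**256.
assert p.bit_length() == 256

# Hexadecimal form of p: 64 hex digits in all.
assert format(p, "X") == "F" * 55 + "E" + "FFFFFC2F"

# How far p falls short of 2**256.
print(2**256 - p)   # 4294968273
```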

(Update: Here’s another post about secp256k1’s sister curve, secp256r1, another curve modulo a 256-bit prime, but with different structure.)

Base point

Next, we pick a base point g on the elliptic curve. The standard defining secp256k1 says that g is

0279BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798

in “compressed form” or

0479BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

in “uncompressed form”.

The base point is a specially chosen point on the elliptic curve, and so it is a pair of numbers mod p, not a single number. How do you extract x and y from these compressed or uncompressed forms?

Compressed form

The compressed form only gives x and you’re supposed to solve for y. The uncompressed form gives you x and y. However, the numbers are slightly encoded. In compressed form, the string starts with either “02” or “03” and the rest of the string is the hexadecimal representation of x. There will be two values of y satisfying

y² = x³ + 7 mod p

and the “02” or “03” tells you which one to pick. If the compressed form starts with 02, pick the root that is even, i.e. whose least significant bit is 0. And if the compressed form starts with 03, pick the root that is odd. (The two roots will add to p, and p is odd, so one of the roots will be even and one will be odd.)
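Here is a sketch of how you might recover y from the compressed form. The function name decompress is hypothetical. The square root step uses a shortcut not discussed above: because p leaves a remainder of 3 when divided by 4, a square root of a mod p, when one exists, is a raised to the power (p + 1)/4.

```python
p = 2**256 - 2**32 - 977

def decompress(compressed_hex: str):
    """Recover (x, y) on y^2 = x^3 + 7 mod p from a compressed point string."""
    prefix, x_hex = compressed_hex[:2], compressed_hex[2:]
    if prefix not in ("02", "03"):
        raise ValueError("compressed form must start with 02 or 03")
    x = int(x_hex, 16)
    rhs = (pow(x, 3, p) + 7) % p
    y = pow(rhs, (p + 1) // 4, p)      # candidate square root, valid since p % 4 == 3
    if (y * y) % p != rhs:
        raise ValueError("no point on the curve has this x")
    want_odd = (prefix == "03")
    if (y % 2 == 1) != want_odd:       # wrong parity, so take the other root
        y = p - y
    return x, y

g_compressed = "0279BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
x, y = decompress(g_compressed)
print(format(y, "064X"))
```

Changing the prefix from 02 to 03 returns the other root, p − y.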

Uncompressed form

The uncompressed form will always start with 04. After this follow the hexadecimal representations of x and y concatenated together.

In either case we have

x = 79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798

and

y = 483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

We can verify this with a little Python code:

    x = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
    y = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
    p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
    assert((y*y - x*x*x - 7) % p == 0)

Exponentiation over elliptic curve

Starting with our base point g, define kg to be g added to itself k times. Note again that the sense of “addition” here is addition in the elliptic curve, not addition in the field of integers mod p. The key to elliptic curve cryptography is that kg can be computed efficiently, but solving for k starting from the product kg cannot. You can compute kg using the fast exponentiation algorithm, but solving for k requires computing discrete logarithms. (This is the ECDLP: Elliptic Curve Discrete Logarithm Problem.)
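To make this concrete, here is a minimal sketch of addition on the curve and of computing kg by the double-and-add form of fast exponentiation. The function names and the use of None for the group identity (the point at infinity) are my own conventions. This is for illustration only: it makes no attempt at the constant-time arithmetic that real cryptographic code needs. The value n is the order of g given in the standard.

```python
p = 2**256 - 2**32 - 977
g = (0x79BE667E_F9DCBBAC_55A06295_CE870B07_029BFCDB_2DCE28D9_59F2815B_16F81798,
     0x483ADA77_26A3C465_5DA4FBFC_0E1108A8_FD17B448_A6855419_9C47D08F_FB10D4B8)
n = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141

def add(P, Q):
    """Add two points on y^2 = x^3 + 7 mod p; None stands for the identity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                    # P + (-P) = identity
    if P == Q:
        m = 3 * x1 * x1 * pow(2 * y1 % p, -1, p) % p   # slope of the tangent line
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p  # slope of the chord
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

def multiply(k, P):
    """Compute kP with about log2(k) doublings and additions."""
    result = None
    while k > 0:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

assert multiply(n, g) is None   # adding g to itself n times gives the identity
```

The modular inverse pow(a, -1, p) requires Python 3.8 or later. Going the other way, from the point multiply(k, g) back to k, is the problem that has no known efficient solution.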

Why is this called “exponentiation” and not “multiplication”? Arithmetic on the elliptic curve is commutative, and in commutative (i.e. Abelian) groups the group operation is usually denoted as addition. And repeated addition is called multiplication.

But in general group theory, the group operation is denoted as multiplication, and repeated application of the group operation is called exponentiation. It’s conventional to use the general term “exponentiation” even though over an Abelian group it makes more sense to call it multiplication.

You undo exponentiation by taking logarithms, so the process of solving for k is called the discrete logarithm problem. The security of elliptic curve cryptography depends on the difficulty of computing discrete logarithms.

Counting bits of security

The best algorithms for solving discrete logarithm problems in a group of size n currently require O(√n) operations. How big is n in our case?

The base point g was chosen to have a large order, and in fact its order is approximately 2²⁵⁶. Specifically, the order of g written in hexadecimal is

n = FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141.

This means that we get approximately 256/2 = 128 bits of security because √(2²⁵⁶) = 2¹²⁸.
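The bit counts can be confirmed directly from the value of n above. This is only a check of sizes, not a statement about the exact cost of any particular attack.

```python
import math

n = 0xFFFFFFFF_FFFFFFFF_FFFFFFFF_FFFFFFFE_BAAEDCE6_AF48A03B_BFD25E8C_D0364141

print(n.bit_length())               # 256
print(math.isqrt(n).bit_length())   # 128
```

Here math.isqrt returns the integer part of √n, which is a 128-bit number.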
