RSA numbers and factoring

It’s much easier to multiply numbers together than to factor them apart. That’s the basis of RSA encryption.

In particular, the RSA encryption scheme rests on the assumption that given two large primes p and q, one can quickly find the product pq but it is much harder to recover the factors p and q. For the size numbers you’ll see in math homework, say two or three digits, factoring is a little harder than multiplication. But when you get into numbers that are hundreds of digits long, factoring is orders of magnitude more difficult. Or so it seems. There’s no proof that factoring has to be as difficult as it appears to be. And sometimes products have special structure that makes factoring much easier, hence the need for so-called safe primes.

The ability to factor large products quickly is sufficient to break RSA, but we don’t know whether it’s necessary. Conceivably there is an efficient algorithm for breaking RSA encryption that does not require factoring, or that could be turned into an efficient algorithm for factoring. But no such algorithm is publicly known, and it people who use RSA are betting that such an algorithm doesn’t exist.

Factoring challenges

How hard is it to factor large products? We can get some idea from contest results. In 1991, RSA Laboratories published a list of factoring challenges, the so-called RSA numbers. The smallest of these, RSA-100, was a 100-digit number that was factored shortly after the challenge was announced.

Last week, Noblis, Inc. announced that their company had factored RSA-230, factoring a 230-digit number into two 115-digit primes. Because multiplication is so much easier than factoring, it’s simple to verify the result even though it was hard to produce it.

RSA232 = 17969491597941066732916128449573246156367561808012600070888918835531726460341490933493372247868650755230855864199929221814436684722874052065257937495694348389263171152522525654410980819170611742509702440718010364831638288518852689
p = 4528450358010492026612439739120166758911246047493700040073956759261590397250033699357694507193523000343088601688589
q = 3968132623150957588532394439049887341769533966621957829426966084093049516953598120833228447171744337427374763106901

if RSA232 == p*q:
    print("Check!")

A slightly larger RSA challenge problem, RSA-232 had previously been factored. The RSA numbers larger than RSA-232 have not been factored at the time of writing this post.

(RSA-232 has 232 decimal digits. But some RSA numbers are named after their length in bits and some are named after their length in digits. For example, RSA-704 is smaller than RSA-230 because the former has 704 bits and the latter has 230 decimal digits.)

Number field sieve

The expected time required to factor a semiprime (product of two primes) using the general number field sieve is O( exp(c log(n)1/3 log log(n)2/3) ) where the value of c is a little elusive. Some sources say c = (64/9)1/3 = 1.92, but as I understand it that value is based on a heuristic and the smallest rigorously proven value is c = 2.77.

Using the more optimistic value of c, we estimate about 1023 operations would be necessary to factor RSA-230.

Shor’s quantum algorithm

In theory, Shor’s algorithm could factor semiprimes very quickly using a quantum computer. At this time, the largest number that has been factored with Shor’s algorithm is 21. So it’s not much of a threat to crypto security yet, but it has enormous potential. If a large-scale quantum computer were available. Shor’s algorithm could factor primes in O( log(n)² log(log(n)) log(log(log(n))) ) time [1].

This means that Shor’s algorithm could factor RSA-230 in about 3 million operations.

More on number theory and cryptography

[1]  Obligatory old joke: What sound does a number theorist make when drowning? log log log …