Factoring RSA100

Earlier today I wrote about factoring four 255-bit numbers that I needed for a post. Just out of curiosity, I wanted to see how long it would take to factor RSA 100, the smallest of the factoring challenges posed by RSA Laboratories in 1991. This is a 100-digit (330-bit) number that is the product of two primes.

I used the CADO-NFS software. The software was developed in France, and CADO is a French acronym for Crible Algébrique: Distribution, Optimisation. NFS stands for number field sieve, the fastest algorithm for factoring numbers with over 100 digits.

RSA 100 was first factored in 1991 using a few days of compute time on a MasPar MP-1 computer, a machine that cost $500,000 at the time, equivalent to around $1,250,000 today.

My effort took about 23 minutes (1376 seconds) on a System76 Meerkat mini that I paid $600 for in 2022.

The MP1 was about the size of a refrigerator. The Meerkat is about 3″ × 3″ × 1.5″.

Pairing-unfriendly curves

A couple days ago I wrote about two pairs of closely related elliptic curves: Tweedledum and Tweedledee, and Pallas and Vesta.

In each pair, the order of one curve is the order of the base field of the other curve. The curves in each pair are used together in cryptography, but they don’t form a “pairing” in the technical sense of a bilinear pairing, and in fact none of the curves are “pairing-friendly” as described below.

An elliptic curve E/F_q is said to be pairing-friendly if r divides q^k − 1 for some small k. Here r is the size of the largest prime-order subgroup of the curve, but since our curves have prime order p, r = p.

As for what constitutes a small value of k, something on the order of 10 would be considered small. The larger k is, the less pairing-friendly the curve is. We will show that our curves are extremely pairing-unfriendly.

Since q is not a multiple of p in our examples, there must be some power of q such that

q^k = 1 mod p.

The question is whether k is large, i.e. whether the order of q mod p is large. We could try successive values of k, but that won’t get us very far. To be more clever, we use Lagrange’s theorem that says the order of an element divides the order of the group. So k must be one of the factors of p − 1. (We subtract 1 because we’re looking at the multiplicative group mod p, which removes 0.)
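Here is a sketch of how to compute the order of q mod p along these lines in Python. This code isn’t from the original post; it uses SymPy’s factorint and assumes p − 1 can be factored, which is the hard part.

    from sympy import factorint

    def multiplicative_order(a, p):
        # The order of a in the multiplicative group mod a prime p.
        # By Lagrange's theorem it divides p - 1, so start with p - 1
        # and strip out prime factors as long as the power stays 1.
        order = p - 1
        for prime, exponent in factorint(p - 1).items():
            for _ in range(exponent):
                if pow(a, order // prime, p) == 1:
                    order //= prime
                else:
                    break
        return order

For Tweedledum, for example, k is multiplicative_order(q, p) with the values of p and q given below.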

Finding the divisors of n − 1 requires factoring n − 1, which isn’t easy, but isn’t insurmountable either. The previous post reports the time required to do this in Python and in Mathematica for each of the following values of n.

p = 2^254 + 4707489544292117082687961190295928833
q = 2^254 + 4707489545178046908921067385359695873
r = 2^254 + 45560315531419706090280762371685220353
s = 2^254 + 45560315531506369815346746415080538113

Tweedledum has order p and its base field has order q.

k = 28948022309329048855892746252171976963322203655954433126947083963168578338816

Tweedledee has order q and its base field has order p.

k = 28948022309329048855892746252171976963322203655955319056773317069363642105856

Vesta has order r and its base field has order s.

k = 14474011154664524427946373126085988481681528240970780357977338382174983815168

Pallas has order s and its base field has order r.

k = 14474011154664524427946373126085988481681528240970823689839871374196681474048

It’s safe to say in each case k is not a small number.

 

Time to factor big integers in Python and Mathematica

This post will look at the time required to factor n − 1 for each of the following prime numbers n in Python (SymPy) and Mathematica. The next post will explain why I wanted to factor these numbers.

p = 2^254 + 4707489544292117082687961190295928833
q = 2^254 + 4707489545178046908921067385359695873
r = 2^254 + 45560315531419706090280762371685220353
s = 2^254 + 45560315531506369815346746415080538113
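The SymPy timings below can be reproduced with a harness along these lines. This is a sketch, not the exact code behind the table.

    from time import perf_counter
    from sympy import factorint

    p = 2**254 + 4707489544292117082687961190295928833

    start = perf_counter()
    factors = factorint(p - 1)   # the n - 1 factorization for n = p
    print(factors)
    print(perf_counter() - start, "seconds")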

Here are the timing results.

    |   |   Python | Mathematica |
    |---+----------+-------------|
    | p |    0.913 |       0.616 |
    | q |    0.003 |       0.002 |
    | r |  582.107 |      14.915 |
    | s | 1065.925 |      20.763 |

This is hardly a carefully designed benchmark, but it’s enough to suggest Mathematica can be one to two orders of magnitude faster than Python.

Here are the factorizations.

p − 1 = 2^34 × 3 × 4322432633228119 × 129942003317277863333406104563609448670518081918257
q − 1 = 2^33 × 3 × 5179 × 216901160674121772178243990852639108850176422522235334586122689
r − 1 = 2^32 × 3 × 463 × 539204044132271846773 × 8999194758858563409123804352480028797519453
s − 1 = 2^32 × 3^2 × 1709 × 24859 × 1690502597179744445941507 × 10427374428728808478656897599072717

Primitive Pythagorean triangles with the same area

A Pythagorean triangle is a right triangle with integer sides. A primitive Pythagorean triangle is one in which the sides have no factor in common. For example a triangle with sides (30, 40, 50) is a Pythagorean triangle but not a primitive Pythagorean triangle.

It is possible for two primitive Pythagorean triangles to have the same area. The smallest example is (20, 21, 29) and (12, 35, 37). Both have area 210.

It’s also possible for three primitive Pythagorean triangles to have the same area, but the smallest example is much larger. The triangles (4485, 5852, 7373), (1380, 19019, 19069), and (3059, 8580, 9109) all have area 13123110, an example discovered by C. L. Shedd in 1945.
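Here’s a quick Python check of these claims. It’s not part of the original post, but it verifies that each triple is a primitive Pythagorean triangle and prints its area.

    from math import gcd

    def is_primitive_pythagorean(a, b, c):
        # Right triangle with integer sides and no common factor
        return a*a + b*b == c*c and gcd(gcd(a, b), c) == 1

    triangles = [(20, 21, 29), (12, 35, 37),
                 (4485, 5852, 7373), (1380, 19019, 19069), (3059, 8580, 9109)]
    for a, b, c in triangles:
        print((a, b, c), is_primitive_pythagorean(a, b, c), a*b//2)

The first two triangles report area 210 and the last three report area 13123110.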

Nobody has found an example of four primitive Pythagorean triangles having the same area, and I don’t know whether the existence of such triangles has been settled. But it has been proven that if they exist, they have area greater than 9.3 × 10^24. See OEIS A093536.

Incidentally, the triangle (20, 21, 29) came up in the post Do perimeter and area determine a triangle? from February of this year.

Counting sums of squares

In the process of writing the previous post, I ran across the Landau-Ramanujan theorem. It turned out to not be what I needed, but it’s an interesting result.

What portion of the numbers less than N can be written as the sum of the squares of two non-negative integers? Edmund Landau discovered in 1908, and Srinivasa Ramanujan independently rediscovered in 1913, that the proportion is asymptotically

c / (log N)^{1/2}

where c, the Landau-Ramanujan constant, equals 0.76422….

Let’s see how the proportion of numbers less than 1000 that are sums of two squares compares to the estimate given by the theorem.

from math import sqrt, log

N = 1000
c = 0.76422
print("Predicted: ", c / sqrt(log(N)))

sumsq = N*[0]
for i in range(N):
    for j in range(N):
        n = i**2 + j**2
        if n < N:
            sumsq[n] = 1
            
print("Exact: ", sum(sumsq)/N)

The predicted proportion is 0.291 and the exact proportion is 0.330.

The script takes O(N²) time to run, so we'd need to do something more clever if we wanted to investigate very large N.
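One easy improvement, not mentioned in the post: since i² + j² < N forces both i and j to be less than √N, the loops only need to run up to √N, which brings the time down to roughly O(N).

from math import isqrt

def sum_two_squares_proportion(N):
    # Mark each n < N that is a sum of two squares, looping only up to sqrt(N)
    sumsq = bytearray(N)
    for i in range(isqrt(N - 1) + 1):
        for j in range(isqrt(N - 1 - i*i) + 1):
            sumsq[i*i + j*j] = 1
    return sum(sumsq) / N

print(sum_two_squares_proportion(1000))   # 0.33, as before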

Moessner’s Magic

The sum of the first n odd numbers is n². For example,

1 + 3 + 5 + 7 + 9 = 25 = 5²

Here’s another way to say the same thing. Start with the list of positive integers, remove every second integer from the list, and take the cumulative sum. The resulting list is the list of squares.

We begin with

1, 2, 3, 4, 5, …

then have

1, 3, 5, 7, 9, …

and end up with

1, 4, 9, 16, 25, …

Cubes

Now start over with the list of positive integers. We’re going to remove every third integer, take the cumulative sum, then remove every second integer, and take the cumulative sum. The result is a list of cubes.

Here are the steps described above for the positive integers up to 20:

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20
1, 3, 7, 12, 19, 27, 37, 48, 61, 75, 91, 108, 127, 147
1, 7, 19, 37, 61, 91, 127
1, 8, 27, 64, 125, 216, 343

General case

In general, given an integer k, we remove every kth element and take the cumulative sum. Then we reduce k by 1 and repeat. We keep doing this until k reaches 1, at which point we stop. The end result is the list of kth powers, where k is the value we started with.

The result at the top of the post, corresponding to k = 2, has been known since ancient times. The generalization to larger k wasn’t known until Alfred Moessner observed it in 1951. Oskar Perron proved Moessner’s conjecture later the same year. The result has been called Moessner’s theorem or Moessner’s magic.

Python implementation

A little Python code will make it easier to play with Moessner’s process for larger numbers.

from numpy import cumsum

def moessner(N, k):
    x = range(1, N+1)
    while k > 1:
        # remove every kth element, then take the cumulative sum
        x = [x[n] for n in range(len(x)) if n % k != k - 1]
        # print(x)
        x = cumsum(x)
        # print(x)
        k = k - 1
    return x

To see the intermediate steps, uncomment the print statements.

If we run the following code

for k in [2, 3, 4, 5]:
    print( moessner(25, k) )

we get the following.

[1   4   9  16  25  36  49  64  81 100 121 144 169]
[1   8  27  64 125 216 343 512 729]
[1   16   81  256  625 1296 2401]
[1   32  243 1024 3125]


Powers of 3 + √2

Yesterday’s post looked at the distribution of powers of x mod 1. For almost all x > 1 the distribution is uniform in the limit. But there are exceptions, and the post raised the question of whether 3 + √2 is an exception.

A plot made it look like 3 + √2 is an exception, but that turned out to be a numerical problem.

A higher precision calculation showed that the zeros on the right end of the plot were erroneous.

So this raises the question of how to calculate (3 + √2)^n accurately for large n. The way I created the second plot was to use bc to numerically calculate the powers of 3 + √2. In this post, I’ll look at using Mathematica to calculate the powers symbolically.

For all positive integers n,

(3 + √2)^n = a_n + b_n √2

where a_n and b_n are positive integers. We want to compute the a and b values.

If you ask Mathematica to compute (3 + √2)^n it will simply echo the expression. But if you use the Expand function it will give you what you want. For example,

    Expand[(3 + Sqrt[2])^10]

returns

    1404491 + 993054 √2

We can use the Coefficient function to split a + b √2 into a and b.

    parts[n_] := 
        Module[{x = (3 + Sqrt[2])^n}, 
            {Coefficient[x, Sqrt[2], 0], Coefficient[x, Sqrt[2], 1]}]

Now parts[10] returns the pair {1404491, 993054}.
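The same computation can be done in Python with SymPy. This isn’t from the original post, just a rough equivalent for readers who don’t have Mathematica.

    from sympy import sqrt, expand

    def parts(n):
        # Expand (3 + sqrt(2))^n and read off the coefficients of 1 and sqrt(2)
        x = expand((3 + sqrt(2))**n)
        return x.coeff(sqrt(2), 0), x.coeff(sqrt(2), 1)

    print(parts(10))   # (1404491, 993054)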

Here’s something interesting. If we set

(3 + √2)^n = a_n + b_n √2

as above, then the two halves of the expression on the right are asymptotically equal. That is, as n goes to infinity, the ratio

a_n / (b_n √2)

converges to 1.

We can see this by defining

    ratio[n_] := 
        Module[ {a = Part[ parts[n], 1], b = Part[parts[n], 2]}, 
        N[a / (b Sqrt[2])]]

and evaluating ratio at increasing values of n. ratio[12] returns 1.00001 and ratio[13] returns 1, not that the ratio is exactly 1, but it is as close to 1 as a floating point number can represent.

This seems to be true more generally, as we can investigate with the following function.

    ratio2[p_, q_, r_, n_] := 
        Module[{x = (p + q Sqrt[r])^n}, 
            N[Coefficient[x, Sqrt[r], 0]/(Coefficient[x, Sqrt[r], 1] Sqrt[r])]]

When r is a prime and

(p + q√r)^n = a_n + b_n √r

then it seems that the ratio a_n / (b_n √r) converges to 1 as n goes to infinity. For example, ratio2[3, 5, 11, 40] returns 1, meaning that the two halves of the expression for (3 + 5√11)^n are asymptotically equal.

I don’t know whether the suggested result is true, or how to prove it if it is true. Feels like a result from algebraic number theory, which is not something I know much about.

Update: An anonymous person on X suggested a clever and simple proof. Observe that

\begin{align*} a_n &= \frac{(3+\sqrt{2})^n + (3-\sqrt{2})^n}{2} \\ b_n\sqrt{2} &= \frac{(3 + \sqrt{2})^n - (3-\sqrt{2})^n}{2} \end{align*}

In this form it’s clear that the ratio a_n / (b_n √2) converges to 1, and the proof can be generalized to cover more cases.

Multiples and powers mod 1

For a positive real number x, the expression x mod 1 means the fractional part of x:

x mod 1 = x − ⌊ x ⌋.

This post will look at the distributions of nx mod 1 and x^n mod 1 for n = 1, 2, 3, ….

The distribution of nx mod 1 is easy to classify. If x is rational then nx mod 1 will cycle through a finite number of values in the interval [0, 1]. If x is irrational then nx mod 1 will be uniformly distributed in the interval.

The distribution of x^n mod 1 is more subtle. We have three basic facts.

  1. If 0 ≤ x < 1, then x^n → 0 as n → ∞.
  2. If x = 1 then x^n = 1 for all n.
  3. For almost all x > 1 the sequence x^n mod 1 is uniformly distributed. But for some values of x it is not, and there’s no known classification for these exceptional values of x.

The rest of the post will focus on the third fact.

Suppose x = √2. Then for all even n, x^n mod 1 = 0. The sequence isn’t uniformly distributed in [0, 1] because half the sequence is piled up at 0.

For another example, let’s look at powers of the golden ratio φ = (1 + √5)/2. For even values of n the sequence φ^n mod 1 converges to 1, and for odd values of n it converges to 0. You can find a proof here.
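A few terms make this concrete. Here’s a quick check, not in the original post; double precision is fine for exponents this small.

    phi = (1 + 5**0.5) / 2
    for n in range(2, 12):
        # fractional parts: near 1 for even n, near 0 for odd n
        print(n, round(phi**n % 1, 4))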

At one time it was not known whether x^n mod 1 is uniformly distributed for x = 3 + √2. (See HAKMEM, item 32.) I don’t know whether this is still unknown. Let’s see what we can tell from a plot.

Apparently sometime around n = 25 the sequence suddenly converges to 0. That looks awfully suspicious. Let’s calculate x^25 using bc.

    $ echo '(3 + sqrt(2))^25' | bc -l
    13223107213151867.43024035874391918714

This says x^25 mod 1 should be around 0.43, not near 0. The reason we’re seeing all zeros is that all the bits in the significand are being used to store the integer part and none are left over for the fractional part. A standard floating point number has 53 bits of significand and

(3 + √2)^25 > 2^53.

For more details, see Anatomy of a floating point number.
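To do the same calculation in Python rather than bc we need extended precision. Here’s a sketch using the mpmath library; it isn’t part of the original post.

    from mpmath import mp, sqrt, floor

    print((3 + 2**0.5)**25 % 1)   # 0.0: double precision has no bits left for the fraction

    mp.dps = 50                   # 50 decimal digits of working precision is plenty here
    y = (3 + sqrt(2))**25
    print(y - floor(y))           # about 0.43024..., matching bc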

Now let’s see what happens when we calculate x^n mod 1 correctly using bc. Here’s the revised plot.

Is this uniformly distributed? Well, it’s not obviously not uniformly distributed. But we can’t tell by looking at a plot because uniform distribution is an asymptotic property.

We didn’t give the definition of a uniformly distributed sequence until now because an intuitive understanding of the term was sufficient. Now we should be more precise.

Given a set E, a sequence ω, and an integer N, define A(E, ω, N) to be the number of terms among the first N elements of the sequence ω that fall inside E. We say ω is uniformly distributed mod 1 if for every 0 ≤ a ≤ b ≤ 1,

lim_{N → ∞} A([a, b), ω, N) / N = b − a.

That is, in the limit, the proportion of points that fall in an interval equals the length of the interval.
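To make the definition concrete, here’s a quick numerical check, not in the original post, using the multiples of √2, which we said earlier are uniformly distributed mod 1.

    from math import sqrt

    def proportion(N, a, b):
        # Fraction of the first N terms of n*sqrt(2) mod 1 that land in [a, b)
        hits = sum(1 for n in range(1, N + 1) if a <= (n * sqrt(2)) % 1 < b)
        return hits / N

    print(proportion(10_000, 0.2, 0.5))   # close to b - a = 0.3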

Now let’s go back to the example x = √2. We know that the powers of this value of x are not uniformly distributed mod 1. But we also said that for almost all x > 1 the powers of x are uniformly distributed. That means every tiny little interval around √2 contains a number y such that the powers of y are uniformly distributed mod 1. Now if y is very close to √2 and we plot the first few values of y^n mod 1 then half of the values will be near 0. The sequence will not look uniformly distributed, though it is uniformly distributed in the limit.

Legendre and Ethereum

What does an eighteenth century French mathematician have to do with the Ethereum cryptocurrency?

A pseudorandom number generator based on Legendre symbols, known as the Legendre PRF, has been proposed as part of a zero knowledge proof mechanism to demonstrate proof of custody in Ethereum 2.0. I’ll explain what each of these terms means and include some Python code.

The equation x² = a mod p is solvable for some values of a and not for others. The Legendre symbol

\left(\frac{a}{p}\right)

is defined to be 1 if a has a square root mod p, and −1 if it does not. Here a is a positive integer and p is a (large) prime [1]. Note that this is not a fraction, though it looks like one.
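One way to compute the symbol is Euler’s criterion: for a not divisible by p, a^((p−1)/2) mod p equals 1 when a is a quadratic residue and p − 1 when it is not. Here’s a small sketch; the script further down uses SymPy’s legendre_symbol instead.

    def legendre(a, p):
        # Euler's criterion: a^((p-1)/2) mod p is 1 for residues, p - 1 for non-residues
        t = pow(a, (p - 1) // 2, p)
        return -1 if t == p - 1 else t

    print([legendre(a, 11) for a in range(1, 11)])
    # [1, -1, 1, 1, 1, -1, -1, -1, 1, -1]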

As a varies, the Legendre symbols vary randomly. Not literally randomly, of course, but they act random enough to be useful as a pseudorandom number generator.

Legendre PRF

Generating bits by computing Legendre symbols is a lot more work than generating bits using a typical PRNG, so what makes the Legendre PRF of interest to Ethereum?

Legendre symbols can be computed fairly efficiently. You wouldn’t want to use the Legendre PRF to generate billions of random numbers for some numerical integration computation, but for a zero knowledge proof you only need to generate a few dozen bits.

To prove that you know a key k, you can generate a string of pseudorandom bits that depend on the key. If you can do this for a few dozen bits, it’s far more likely that you know the key than that you have guessed the bits correctly. Given k, the Legendre PRF computes

L_k(x) = \frac{1}{2}\left( 1 - \left( \frac{k+x}{p}\right) \right)

for n consecutive values of x [2].

One reason this function is of interest is that it naturally lends itself to multiparty computation (MPC). The various parties could divide up the range of x values that each will compute.

The Legendre PRF has not been accepted as part of Ethereum 2.0. It’s only been proposed, and there is active research into whether it is secure enough.

Python code

Here’s a little Python script to demonstrate using L_k(x).

    from sympy import legendre_symbol

    def L(x, k, p):
        return (1 - legendre_symbol(x + k, p)) // 2

    k = 20250626
    p = 2**521 - 1
    print([L(x, k, p) for x in range(10)])

This produces the following.

    [1, 1, 1, 1, 0, 0, 1, 0, 0, 1]


[1] There is a third possibility: the Legendre symbol equals 0 if a is a multiple of p. We can safely ignore this possibility for this post.

[2] The Legendre symbol takes on the values ±1 and so we subtract this from 1 to get values {0, 2}, then divide by 2 to get bits {0, 1}.

Golden powers revisited

Years ago I wrote a post Golden powers are nearly integers. The post was picked up by Hacker News and got a lot of traffic. The post was commenting on a line from Terry Tao:

The powers φ, φ^2, φ^3, … of the golden ratio lie unexpectedly close to integers: for instance, φ^11 = 199.005… is unusually close to 199.

In the process of writing my recent post on base-φ numbers I came across some equations that explain exactly why golden powers are nearly integers.

Let φ be the golden ratio and ψ = −1/φ. That is, φ and ψ are the larger and smaller roots of

x² − x − 1 = 0.

Then powers of φ reduce to an integer and an integer multiple of φ. This is true for negative powers of φ as well, and so powers of ψ also reduce to an integer and an integer multiple of ψ. And in fact, the integers alluded to are Fibonacci numbers.

φ^n = F_n φ + F_{n−1}
ψ^n = F_n ψ + F_{n−1}

These equations can be found in TAOCP 1.2.8 exercise 11.

Adding the two equations leads to [1]

φ^n = F_{n+1} + F_{n−1} − ψ^n

So yes, φ^n is nearly an integer. In fact, it’s nearly the sum of the (n + 1)st and (n − 1)st Fibonacci numbers. The error in this approximation is −ψ^n, and so the error decreases exponentially with alternating signs.
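A quick numerical check of this identity, not in the original post:

    from math import sqrt

    def fib(n):
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

    phi = (1 + sqrt(5)) / 2
    psi = -1 / phi

    n = 11
    print(phi**n)                    # 199.005...
    print(fib(n + 1) + fib(n - 1))   # 199
    print(-psi**n)                   # the error, about 0.005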


[1] φ^n + ψ^n = F_n (φ + ψ) + 2 F_{n−1} = F_n + 2 F_{n−1} = F_n + F_{n−1} + F_{n−1} = F_{n+1} + F_{n−1}