Many encryption algorithms rely on the difficulty of factoring a large number n. If you want to make n hard to factor, you want it to have only two factors. Otherwise, the more factors n has, the smaller the smallest factor must be.
So if you want n to be the product of two large primes, p and q, you want to pick these primes to be roughly the same size so that the smaller factor is as large as possible. If you’re limited on the size of n, then you want p and q to be roughly of size √n. But not too close to √n. You may see in a description of a cryptographic algorithm, such as RSA, “Pick two large primes p and q, but not too close together, …” Why is that?
The answer goes back to Fermat (1607–1665). His factoring trick is to start with an odd composite n and look for numbers a and b such that
n = a² − b²
because if you can do that, then
n = (a + b)(a − b).
This trick always works [1], but it’s only practical when the factors are close together. If they are close together, you can do a brute force search for a and b. But otherwise you’re better off doing something else.
Small example
To give an example, suppose we want to factor n = 12319. Then √n = 110.99, so we can start looking for a and b by trying a = 111. We increment a until a² − n is a square, b².
Now 111² − 12319 = 2, so that didn’t work. But 112² − 12319 = 225, which is a square, and so we take a = 112 and b = 15. This tells us p = 112+15 = 127 and q = 112 − 15 = 97.
Larger example with Python code
Now let’s do a larger example, too big to do by hand and more similar to a real application.
from sympy import sqrt, log, ceiling, Integer def is_square(n): return type(sqrt(n)) == Integer def fermat_factor(n): num_digits = int(log(n, 10).evalf() + 1) a = ceiling( sqrt(n).evalf(num_digits) ) counter = 0 while not is_square(a*a - n): a += 1 counter += 1 b = sqrt(a*a - n) return(a+b, a-b, counter) p = 314159200000000028138418196395985880850000485810513 q = 314159200000000028138415196395985880850000485810479 print( fermat_factor(p*q) )
This recovers p and q in 3,580 iterations. Note that the difference in p and q is large in absolute terms, approximately 3 × 1027, but small relative to p and q.
Related posts
[1] If n = pq, then you can set a = (p + q)/2 and b = (p − q)/2.
Should it be 3580 iterations?
Yes, thanks.