Many encryption algorithms rely on the difficulty of factoring a large number *n*. If you want to make *n* hard to factor, you want it to have only two factors. Otherwise, the more factors *n* has, the smaller the smallest factor must be.

So if you want *n* to be the product of two large primes, *p* and *q*, you want to pick these primes to be roughly the same size so that the smaller factor is as large as possible. If you’re limited on the size of *n*, then you want *p* and *q* to be roughly of size √*n*. But not too close to √*n*. You may see in a description of a cryptographic algorithm, such as RSA, “Pick two large primes *p* and *q*, but not too close together, …” Why is that?

The answer goes back to Fermat (1607–1665). His factoring trick is to start with an odd composite *n* and look for numbers *a* and *b* such that

*n* = *a*² – *b*²

because if you can do that, then

*n* = (*a* + *b*)(*a* – *b*).

This trick always works [1], but it’s only practical when the factors are close together. If they are close together, you can do a brute force search for *a* and *b*. But otherwise you’re better off doing something else.

## Small example

To give an example, suppose we want to factor *n* = 12319. Then √*n* = 110.99, so we can start looking for *a* and *b* by trying *a* = 111. We increment *a* until *a*² – *n* is a square, *b*².

Now 111² – 12319 = 2, so that didn’t work. But 112² – 12319 = 225, which is a square, and so we take *a* = 112 and *b* = 15. This tells us *p* = 112+15 = 127 and *q* = 112 – 15 = 97.

## Larger example with Python code

Now let’s do a larger example, too big to do by hand and more similar to a real application.

from sympy import sqrt, log, ceiling, Integer def is_square(n): return type(sqrt(n)) == Integer def fermat_factor(n): num_digits = int(log(n, 10).evalf() + 1) a = ceiling( sqrt(n).evalf(num_digits) ) counter = 0 while not is_square(a*a - n): a += 1 counter += 1 b = sqrt(a*a - n) return(a+b, a-b, counter) p = 314159200000000028138418196395985880850000485810513 q = 314159200000000028138415196395985880850000485810479 print( fermat_factor(p*q) )

This recovers *p* and *q* in 3,580 iterations. Note that the difference in *p* and *q* is large in absolute terms, approximately 3 × 10^{27}, but small relative to *p* and *q*.

## Related posts

[1] If *n* = *pq*, then you can set *a* = (*p + **q*)/2 and *b* = (*p* – *q*)/2.

Should it be 3580 iterations?

Yes, thanks.