# Strong primes

There are a couple of different definitions of a strong prime. In number theory, a strong prime is one that is closer to the next prime than to the previous prime. For example, 11 is a strong prime because it is closer to 13 than to 7.
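Here is a minimal sketch of that definition, using SymPy's `isprime`, `nextprime`, and `prevprime`. The function name `is_strong_nt` is my own label for illustration, not a SymPy function.

```python
from sympy import isprime, nextprime, prevprime

def is_strong_nt(p):
    """True if the prime p is strictly closer to the next prime than to the previous one."""
    if p < 3 or not isprime(p):
        return False  # 2 has no previous prime, so it cannot qualify
    # p - prevprime(p) > nextprime(p) - p, rearranged to avoid subtraction
    return 2*p > nextprime(p) + prevprime(p)

# Strong primes (number theory sense) below 50
print([p for p in range(3, 50) if is_strong_nt(p)])  # [11, 17, 29, 37, 41]
```

Note that 5 is not strong under this definition: it is exactly midway between 3 and 7, and the inequality is strict.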

In cryptography, strong primes are, roughly speaking, primes whose products are likely to be hard to factor. More specifically, though still not too specific, p is a strong prime if

1. p is large
2. p – 1 has a large prime factor q
3. q – 1 has a large prime factor r
4. p + 1 has a large prime factor s

The meaning of “large” is not precise, and varies over time. In (1), large means large enough that it is suitable for use in cryptography, such as in RSA encryption. This standard increases over time due to increases in computational power and improvements in factoring technology. The meaning of “large” in (2), (3), and (4) is not precise, but makes sense in relative terms. For example in (2), the smaller the ratio (p – 1)/q the better.
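To make criteria (2) through (4) concrete, here is a small sketch that extracts q, r, and s for a given prime using SymPy's `factorint`. The helper names `largest_prime_factor` and `strength_components` are hypothetical, chosen only for this illustration, and the example prime is far too small for real use.

```python
from sympy import factorint

def largest_prime_factor(n):
    # factorint returns {prime: exponent}; default=1 covers n = 1, which has no prime factors
    return max(factorint(n), default=1)

def strength_components(p):
    """Return (q, r, s) from the cryptographic definition for a prime p."""
    q = largest_prime_factor(p - 1)
    r = largest_prime_factor(q - 1)
    s = largest_prime_factor(p + 1)
    return q, r, s

p = 47
q, r, s = strength_components(p)
print(q, r, s)       # 23 11 3
print((p - 1) // q)  # 2, the smallest possible ratio for an odd prime
```

Here 47 does well on (2) and (3), since 46 = 2 × 23 and 22 = 2 × 11, but poorly on (4), since 48 = 2⁴ × 3 has no large prime factor.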

## Relation between the definitions

The Wikipedia article on strong primes makes the following claim without any details:

> A computationally large safe prime is likely to be a cryptographically strong prime.

I don’t know whether this has been proven, or even if it’s true, but I’d like to explore it empirically. (Update: see the section on safe primes below. I misread “safe” above as “strong.” Just as well: it led to an interesting investigation.)

We’ll need some way to quantify whether a prime is strong in the cryptographic sense. This has probably been done before, but for my purposes I’ll use the sum of the logarithms of q, r, and s. We should look at these relative to the size of p, but all the p’s I generate will be roughly the same size.

## Python code

I’ll generate 100-bit primes just so my script will run quickly. These primes are too small for use in practice, but hopefully the results here will be representative of larger primes.

```python
from sympy import nextprime, prevprime, factorint, randprime
import numpy as np

# largest prime factor
def lpf(n):
    return max(factorint(n).keys())

# log base 2, converting to float first so large Python ints are handled
def log2(n):
    return np.log2(float(n))

num_samples = 100
data = np.zeros((num_samples, 5))

bitsize = 100

for i in range(num_samples):
    p = randprime(2**bitsize, 2**(bitsize+1))
    # 1 if p is strong in the number theory sense, 0 otherwise
    data[i,0] = 2*p > nextprime(p) + prevprime(p)
    q = lpf(p-1)
    r = lpf(q-1)
    s = lpf(p+1)
    data[i,1] = log2(q)
    data[i,2] = log2(r)
    data[i,3] = log2(s)
    data[i,4] = log2(q*r*s)
```

The columns of our matrix correspond to whether the prime is strong in the number theory sense, the number of bits in q, r, and s, and the total bits in the three numbers. (Technically the log base 2 rather than the number of bits.)

## Results

There were 75 strong primes and 25 non-strong primes. Here are the averages:

```
|-----+--------+------------|
|     | strong | not strong |
|-----+--------+------------|
| q   |   63.6 |       58.8 |
| r   |   41.2 |       37.0 |
| s   |   66.3 |       64.3 |
| sum |  171.0 |      160.1 |
|-----+--------+------------|
```

The numbers are consistently higher for strong primes. However, the differences are small relative to the standard deviations of the values. Here are the standard deviations:

```
|-----+--------+------------|
|     | strong | not strong |
|-----+--------+------------|
| q   |   20.7 |       15.6 |
| r   |   19.8 |       12.3 |
| s   |   18.7 |       19.9 |
| sum |   30.8 |       41.9 |
|-----+--------+------------|
```
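Summaries like these can be pulled out of a matrix laid out as in the script, with the strong/not-strong flag in column 0 and the log values in columns 1 through 4. The following is only a sketch: the function `summarize` is a hypothetical helper, the `toy` matrix holds made-up placeholder values rather than results from the experiment, and I'm assuming NumPy's default (population) standard deviation, which may not be what produced the numbers above.

```python
import numpy as np

def summarize(data):
    """Per-group column means and standard deviations for columns 1..4."""
    strong = data[:, 0] == 1  # boolean mask built from the flag column
    values = data[:, 1:]      # columns: log2 q, log2 r, log2 s, log2(q*r*s)
    return {
        "strong": (values[strong].mean(axis=0), values[strong].std(axis=0)),
        "not strong": (values[~strong].mean(axis=0), values[~strong].std(axis=0)),
    }

# Placeholder rows for illustration only, not data from the experiment
toy = np.array([
    [1, 10, 5, 12, 27],
    [1, 14, 7, 16, 37],
    [0,  8, 4, 10, 22],
    [0, 12, 6, 14, 32],
], dtype=float)

for group, (mean, std) in summarize(toy).items():
    print(group, mean, std)
```

Passing `ddof=1` to `std` would give the sample standard deviation instead.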

## Safe primes

I realized after publishing this post that the Wikipedia quote didn’t say what I thought it did. It said that safe primes are likely to be cryptographically strong primes. I misread that as strong primes. But the investigation above remains valid. It shows weak evidence that strong primes in the number theoretical sense are also strong primes in the cryptographic sense.

Note that safe does not imply strong; it only implies the second criterion in the definition of strong. Also, strong does not imply safe.

To test empirically whether safe primes are likely to be cryptographically strong, I modified my code to generate safe primes and compute the strength as before, the sum of the logs base 2 of q, r, and s. We should expect the strength to be larger since the largest factor of p – 1 will always be as large as possible, q = (p – 1)/2. But there’s no obvious reason why r or s should be large.
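One way to generate safe primes, sketched here under my own assumptions rather than as the exact modification used for the experiment, is to draw a random prime q and keep it only when 2q + 1 is also prime. The function name `random_safe_prime` is hypothetical; `randprime` and `isprime` are SymPy functions.

```python
from sympy import isprime, randprime

def random_safe_prime(bitsize):
    """Return a random safe prime p = 2q + 1 with exactly `bitsize` bits."""
    while True:
        # q has bitsize - 1 bits, so p = 2q + 1 has bitsize bits
        q = randprime(2**(bitsize - 2), 2**(bitsize - 1))
        p = 2*q + 1
        if isprime(p):
            return p

p = random_safe_prime(100)
print(p, isprime((p - 1) // 2))
```

Most candidates are rejected, so the loop typically runs many times before it returns, but for 100-bit primes this is still fast.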

For 100-bit safe primes, I got an average strength of 225.4 with standard deviation 22.8, much larger than in my first experiment, and with less variance.