# Any number can start a factorial

Any positive number can be found at the beginning of a factorial. That is, for every positive positive integer n, there is an integer m such that the leading digits of m! are the digits of n.

There’s a tradition in math to use the current year when you need an arbitrary numbers; you’ll see this in competition problems and recreational math articles. So let’s demonstrate the opening statement with n = 2019. Then the smallest solution is m = 3177, i.e.

3177! = 2019…000

The factorial of 3177 is a number with 9749 digits, the first of which are 2019, and the last 793 of which are zeros.

The solution m = 3177 was only the first. The next solution is 6878, and there are infinitely more.

Not only does every number appear at the beginning of a factorial, it appears at the beginning of infinitely many factorials.

We can say even more. Persi Diaconis proved that factorials obey Benford’s law, and so we can say how often a number n appears at the beginning of a factorial.

If a sequence of numbers, like factorials, obeys Benford’s law, then the leading digit d in base b appears with probability

logb(1 + 1/d).

If we’re interested in 4-digit numbers like 2019, we can think of these as base 10,000 digits [1]. This means that the proportion factorials that begin with 2019 equals

log10000(1 + 1/2019)

or about 1 in every 18,600 factorials.

By the way, powers of 2 also obey Benford’s law, and so you can find any number at the beginning of a power of 2. For example,

22044 = 2019…

Related: Benford’s law blog posts

[1] As pointed out in the comments, this approach underestimates the frequency of factorials that start with 2019. It would count numbers of the form 2019 xxxx xxxx, but not numbers of the form 201 9xxx xxxx etc. The actual frequency would be 4x higher.

# A simple unsolved problem

Are there infinitely many positive integers n such that tan(n) > n?

David P. Bellamy and Felix Lazebnik [1] asked in 1998 whether there were infinitely many solutions to |tan(n)| > n and tan(n) > n/4. In both cases the answer is yes. But at least as recently as 2014 the problem at the top of the page was still open [2].

It seems that tan(n) is not often greater than n. There are only five instances for n less than one billion:

1
260515
37362253
122925461
534483448

In fact, there are no more solutions if you search up to over two billion as the following C code shows.

#include <math.h>
#include <stdio.h>
#include <limits.h>

int main() {
for (int n = 0; n < INT_MAX; n++)
if (tan(n) > n)
printf("%d\n", n);
}

If there are infinitely many solutions, it would be interesting to know how dense they are.

Update: The known numbers for which tan(n) > n are listed in OEIS sequence A249836. According to the OEIS site it is still unknown whether the complete list is finite or infinite.

***

[1] David P. Bellamy, Felix Lazebnik, and Jeffrey C. Lagarias, Problem E10656: On the number of positive integer solutions of tan(n) > n, Amer. Math. Monthly 105 (1998) 366.

[2] Felix Lazebnik. Surprises. Mathematics Magazine , Vol. 87, No. 3 (June 2014), pp. 212-22

# Primes that don’t look like primes

Primes usually look odd. They’re literally odd [1], but they typically don’t look like they have a pattern, because a pattern would often imply a way to factor the number.

However, 12345678910987654321 is prime!

I posted this on Twitter, and someone with the handle lagomoof replied that the analogous numbers are true for bases 2, 3, 4, 6, 9, 10, 16, and 40. So, for example, the hexadecimal number 123456789abcdef10fedcba987654321 would be prime. The following Python code proves this is true.

from sympy import isprime

for b in range(2, 41):
n = 0
for i in range(0, b):
n += (i+1)*b**i
for i in range(b+1, 2*b):
n += (2*b-i)*b**i
print(b, isprime(n))

By the way, some have suggested that these numbers are palindromes. They’re not quite because of the 10 in the middle. You can show that the only prime palindrome with an even number of digits is 11. This is an example of how a pattern in the digits leads to a way to factor the number.

## Related posts

[1] “All primes are odd except 2, which is the oddest of all.” — Concrete Mathematics

# Predicted distribution of Mersenne primes

Mersenne primes are prime numbers of the form 2p – 1. It turns out that if 2p – 1 is a prime, so is p; the requirement that p is prime is a theorem, not part of the definition.

So far 51 Mersenne primes have discovered [1]. Maybe that’s all there are, but it is conjectured that there are an infinite number Mersenne primes. In fact, it has been conjectured that as x increases, the number of primes px is asymptotically

eγ log x / log 2

where γ is the Euler-Mascheroni constant. For a heuristic derivation of this conjecture, see Conjecture 3.20 in Not Always Buried Deep.

How does the actual number of Mersenne primes compared to the number predicted by the conjecture? We’ll construct a plot below using Python. Note that the conjecture is asymptotic, and so it could make poor predictions for now and still be true for much larger numbers. But it appears to make fairly good predictions over the range where we have discovered Mersenne primes.

import numpy as np
import matplotlib.pyplot as plt

# p's for which 2^p - 1 is prime.
# See https://oeis.org/A000043
ps = [2, 3, 5, ... , 82589933]

# x has 200 points from 10^1 to 10^8
# spaced evenly on a logarithmic scale
x = np.logspace(1, 8, 200)

# number of p's less than x such that 2^p - 1 is prime
actual = [np.searchsorted(ps, t) for t in x]

exp_gamma = np.exp(0.5772156649)
predicted = [exp_gamma*np.log2(t) for t in x]

plt.plot(x, actual)
plt.plot(x, predicted, "--")

plt.xscale("log")
plt.xlabel("p")
plt.ylabel(r"Mersenne primes $\leq 2^p-1$")
plt.legend(["actual", "predicted"])

## Related posts

[1] Fifty one Mersenne primes have been verified. But these may not be the smallest Mersenne primes. It has not yet been verified that there are no Mersenne primes yet to be discovered between the 47th and 51st known ones. The plot in this post assumes the known Mersenne primes are consecutive, and so it is speculative toward the right end.

# Beating the odds on the Diffie-Hellman decision problem

There are a couple variations on the Diffie-Hellman problem in cryptography: the computation problem (CDH) and the decision problem (DDH). This post will explain both and give an example of where the former is hard and the latter easy.

## The Diffie-Hellman problems

The Diffie-Hellman problems are formulated for an Abelian group. The main group we have in mind is the multiplicative group of non-zero integers modulo a large prime p. But we start out more generally because the Diffie-Hellman problems are harder over some groups than others as we will see below.

Let g be the generator of a group. When we think of the group operation written as multiplication, this means that every element of the group is a power of g.

## Computational Diffie-Hellman problem

Let a and b be two integers. Given ga and gb, the CDH problem is to compute gab. Note that the exponent is ab and not a+b. The latter would be easy: just multiply ga and gb.

To compute gab we apparently need to know either a or b, which we are not given. Solving for either exponent is the discrete logarithm problem, which is impractical for some groups.

It’s conceivable that there is a way to solve CDH without solving the discrete logarithm problem, but at this time the most efficient attacks on CDH compute discrete logs.

### Diffie-Hellman key exchange

Diffie-Hellman key exchange depends on the assumption that CDH is a hard problem.

Suppose Alice and Bob both agree on a generator g, and select private keys a and b respectively. Alice sends Bob ga and Bob sends Alice gb. This doesn’t reveal their private keys because the discrete logarithm problem is (believed to be) hard. Now both compute gab and use it as their shared key. Alice computes gab by raising gb to the power a, and Bob computes gab by raising ga to the power b.

If someone intercepted the exchange between Alice and Bob, they would possess ga and gb and would want to compute gab. This the the CDH.

When working over the integers modulo a large prime p, CDH is hard if p-1 has a large factor, such as when p – 1 = 2q for a prime q. If p-1 has only small factors, i.e. if p-1 is “smooth”, then the discrete logarithm problem is tractable via the Silver-Pohlig-Hellman algorithm [1]. But if p is large and p-1 has a large prime factor, CDH is hard as far as we currently know.

## Decision Diffie-Hellman problem

The DDH problem also starts with knowledge of ga and gb. But instead of asking to compute gab it asks whether one can distinguish gab from gc for a randomly chosen c with probability better than 0.5.

Clearly DDH is weaker than CDH because if we can solve CDH we know the answer to the DDH question with certainty. Still, the DDH problem is believed to be hard over some groups, such as certain elliptic curves, but not over other groups, as illustrated below.

### DDH for multiplicative group mod p

Suppose we’re working in the multiplicative group of non-zero integers modulo a prime p. Using Legendre symbols, which are practical to compute even for very large numbers, you can determine which whether ga is a square mod p, which happens if and only if a is even. So by computing the Legendre symbol for ga mod p, we know the parity of a. Similarly we can find the parity of b, and so we know the parity of ab. If ab is odd but gc is a square mod p, we know that the answer to the DDH problem is no. Similarly if ab is even but gc is not a square mod p, we also know that the answer to the DDH problem is no.

Since half the positive integers mod p are squares, we can answer the DDH problem with certainty half the time by the parity argument above. If our parity argument is inconclusive we have to guess. So overall we can answer he DDH problem correctly with probability 0.75.

## Related posts

[1] As is common when you have a lot of names attached to a theorem, there were multiple discoveries. Silver discovered the algorithm first, but Pohlig and Hellman discovered it independently and published first.

# Fame, difficulty, and usefulness

Pierre Fermat is best known for two theorems, dubbed his “last” theorem and his “little” theorem. His last theorem is famous, difficult to prove, and useless. His little theorem is relatively arcane, easy to prove, and extremely useful.

There’s little relation between technical difficulty and usefulness.

## Fermat’s last theorem

Fermat’s last theorem says there are no positive integer solutions to

an + bn = cn

for integers n > 2. This theorem was proven three and a half centuries after Fermat proposed it. The theorem is well known, even in pop culture. For example, Captain Picard tries to prove it in Star Trek: Next Generation. Fermat’s last theorem was famous for being an open problem that was easy to state. Now that it has been proven, perhaps it will fade from popular consciousness.

The mathematics developed over the years in attempts to prove Fermat’s last theorem has been very useful, but the theorem itself is of no practical importance that I’m aware of.

## Fermat’s little theorem

Fermat’s little theorem says that if p is a prime and a is any integer not divisible by p, then ap − 1 − 1 is divisible by p. This theorem is unknown in pop culture but familiar in math circles. It’s proved near the beginning of any introductory number theory class.

The theorem, and its generalization by Euler, comes up constantly in applications of number theory. See three examples here.

# Twisted elliptic curves

This morning I was sitting at a little bakery thinking about what to do before I check out of my hotel. I saw that the name of the bakery was Twist Bakery & Cafe, and that made me think of writing about twisted elliptic curves when I got back to my room.

## Twist of an elliptic curve

Suppose p is an odd prime and let E be the elliptic curve with equation

y² = x³ + ax² + bx + c

where all the variables come from the integers mod p. Now pick some d that is not a square mod p, i.e. there is no x such that x² – d is divisible by p. Then the curve E‘ with equation

dy² = x³ + ax² + bx + c

is a twist of E. More explicit it’s a quadratic twist of E. This is usually what people have in mind when they just say twist with no modifier, but there are other kinds of twists.

Let’s get concrete. We’ll work with integers mod 7. Then the squares are 1, 4, and 2. (If that last one looks strange, note that 3*3 = 2 mod 7.) And so the non-squares are 3, 5, and 6. Suppose our curve E is

y² = x³ + 3x + 2.

Then we can form a twist of E by multiplying the left side by 3, 5, or 6. Let’s use 3. So E‘ given by

3y² = x³ + 3x + 2

is a twist of E.

## Twisted curve

There’s something potentially misleading in the term “twisted curve”. The curve E‘ is a twist of E, so E‘ is twisted relative to E, and vice versa. You can’t say E‘ is twisted and E is not.

But you might object that E‘ has the form

dy² = x³ + ax² + bx + c

where d is a non-square, and E does not, so E‘ is the one that’s twisted. But a change of variables can leave the curve the same while changing the form of the equation. Every curve has an equation in which the coefficient of y² is 1 and a form in which the coefficient is a non-square. Specifically,

dy² = x³ + ax² + bx + c

and

y² = x³ + dax² + d²bx + d³c

specify the same curve.

The two curves E and E‘ are not isomorphic over the field of integers mod p, but they are isomorphic over the quadratic extension of the integers mod p. That is, if you extend the field by adding the square root of d, the two curves are isomorphic over that field.

## Partitioning x coordinates

For a given x, f(x) = x³ + ax² + bx + c is either a square or a non-square. If it is a square, then

y² = f(x)

has a solution and

dy² = f(x)

does not. Conversely, if f(x) is not a square, then

dy² = f(x)

has a solution and

y² = f(x)

does not. So a given x coordinate either corresponds to a point on E or a point on E‘, but not both.

## Application to cryptography

In elliptic curve cryptography, it’s good if not only is the curve you’re using secure, so are its twists. Suppose you intend to work over a curve E, and someone sends you an x coordinate that’s not on E. If you don’t check whether x corresponds to a point on E, you are effectively working on a twist E‘ rather than E. That can be a security hole if E‘ is not as secure a curve as E, i.e. if the discrete logarithm problem on E‘ can be solved more efficiently than the same problem on E.

## Related posts

Let p be an odd prime number. If the equation

x² = n mod p

has a solution then n is a square mod p, or in classical terminology, n is a quadratic residue mod p. Half of the numbers between 0 and p are quadratic residues and half are not. The residues are distributed somewhat randomly. For large p, quadratic residues are distributed enough like random numbers to be useful in cryptographic applications. For example, see this post. But the distribution isn’t entirely random as we’ll demonstrate here.

First let’s look at a couple illustrations, using p = 101 and p = 103. We’ll use the following Python code to plot the residues.

import matplotlib.pyplot as plt

for p in [101, 103]:
s = {x**2 % p for x in range(1, p)}
for q in s:
plt.vlines(q, 0, 1)
plt.savefig("residues_mod_{}.svg".format(p))
plt.close()

Here are the plots.

## Symmetry and anti-symmetry

If you look closely at the first image, you’ll notice it appears to be symmetric about the middle. And if you look closely at the second image, you’ll notice it appears to be anti-symmetric, i.e. the left side has bars where the right side has gaps and vice versa.

The following code shows that these observations are correct.

p = 101
s = {x**2 % p for x in range(1, p)}
for q in s:
assert(p - q in s)

p = 103
s = {x**2 % p for x in range(1, p)}
for q in s:
assert(p - q not in s)

Both of these observations hold more generally. If p = 1 mod 4, the residues are symmetric: n is a quadratic residue if and only if pn is a quadratic residue. And if p = 3 mod 4, the residues are anti-symmetric: n is a quadratic residue if and only if pn is not a quadratic residue. Both of these statements are consequences of the quadratic reciprocity theorem.

If you look again at the plot of residues for p = 103, you’ll notice that not only is it anti-symmetric, it is also darker on the left side. We can verify with the following code

print(len( [q for q in s if q < 51] ))
print(len( [q for q in s if q > 51] ))

that there are 28 residues on the left and and 23 on the right. This is a general pattern called quadratic excess. The quadratic excess of an odd prime p, denoted E(p) is the number of quadratic residues to the left of the middle minus the number to the right of the middle. If p = 1 mod 4, E(p) is zero by symmetry, but if p = 3 mod 4, E(p) is always positive.

Here is a plot of quadratic excess for primes less than 3000 and congruent to 3 mod 4.

## Statistical properties

In this post I’ve said a couple things that are in tension. At the top I said that quadratic residues act sufficiently like random bits to be useful in cryptography. Then later on I showed that they have major departures from random behavior. That would be a problem if you looked at the entire sequence of residues, but if p has thousands of digits, and you’re looking at a small sample of the reciprocity values, that’s a different situation.

Suppose you pick two large enough primes, p and q, and keep them secret but tell me their product N = pq. Then you pick a random value n and ask me whether it’s a quadratic residue mod N, the best I can do is say “Maybe, with probability 1/2.” There are crypto systems that depend on this being a computationally hard problem to solve, unless the factors p and q are known.

However, you need to be a little careful. If n is smaller than √N, I could test whether n is a square, working in the integers, not the integers mod N. If it is a square integer, then it’s a square mod N. If it’s not a square integer, it might be a square mod N, but I wouldn’t know.

There’s a related issue I may explore in the future. Suppose instead of just asking whether a particular number is a quadratic residue, you take a range of numbers and return 0 or 1 depending on whether each number is a residue. Can I distinguish the sequence from a random sequence using statistical tests? If I took a sequence of length greater than N/2 then I could tell by looking at symmetry or antisymmetry. But what if the range is small relative to N? Maybe N is a thousand-digit numnber but you only give me a sequence of a million bits. Would the distribution be distinguishable from uniform random bits? Is there any sort of autocorrelation that might betray the sequence as not random?

# Software to factor integers

In my previous post, I showed how changing one bit of a semiprime (i.e. the product of two primes) creates an integer that can be factored much faster. I started writing that post using Python with SymPy, but moved to Mathematica because factoring took too long.

## SymPy vs Mathematica

When I’m working in Python, SymPy lets me stay in Python. I’ll often use SymPy for a task that Mathematica could do better just so I can stay in one environment. But sometimes efficiency is a problem.

SymPy is written in pure Python, for better and for worse. When it comes to factoring large integers, it’s for worse. I tried factoring a 140-bit integer with SymPy, and killed the process after over an hour. Mathematica factored the same integer in 1/3 of a second.

## Mathematica vs PARI/GP

The previous post factors 200-bit semiprimes. The first example, N = pq where

p = 1078376712338123201911958185123
q = 1126171711601272883728179081277

took 99.94 seconds to factor using Mathematica. A random sample of 13 products of 100-bit primes and they took an average of 99.1 seconds to factor.

Using PARI/GP, factoring the value of N above took 11.4 seconds to factor. I then generated a sample of 10 products of 100-bit primes and on average they took 10.4 seconds to factor using PARI/GP.

So in these examples, Mathematica is several orders of magnitude faster than SymPy, and PARI/GP is one order of magnitude faster than Mathematica.

It could be that the PARI/GP algorithms are relatively better at factoring semiprimes. To compare the efficiency of PARI/GP and Mathematica on non-semiprimes, I repeated the exercise in the previous post, flipping each bit of N one at a time and factoring.

This took 240.3 seconds with PARI/GP. The same code in Mathematica took 994.5 seconds. So in this example, PARI/GP is about 4 times faster where as for semiprimes it was 10 times faster.

## Python and PARI

There is a Python interface to PARI called cypari2. It should offer the convenience of working in Python with the efficiency of PARI. Unfortunately, the installation failed on my computer. I think SageMath interfaces Python to PARI but I haven’t tried it.

# Making public keys factorable with Rowhammer

The security of RSA encryption depends on the fact that the product of two large primes is difficult to factor. So if p and q are large primes, say 2048 bits each, then you can publish n = pq with little fear that someone can factor n to recover p and q.

But if you can change n by a tiny amount, you may make it much easier to factor. The Rowhammer attack does this by causing DRAM memory to flip bits. Note that we’re not talking about breaking someone’s encryption in the usual sense. We’re talking about secretly changing their encryption to a system we can break.

To illustrate on a small scale what a difference changing one bit can make, let p = 251 and q = 643.  Then n = pq = 161393. If we flip the last bit of n we get m = 161392. Although n is hard to factor by hand because it has no small factors, m is easy to factor, and in fact

161392 = 24 × 7 × 11 × 131.

For a larger example, I generated two 100-bit random primes in Mathematica

p = 1078376712338123201911958185123
q = 1126171711601272883728179081277

and was able to have it factor n = pq in about 100 seconds. But Mathematica was able to factor n-1 in a third of a second.

So far we have looked at flipping the least significant bit. But Rowhammer isn’t that precise. It might flip some other bit.

If you flip any bit of a product of two large primes, you’re likely to get an easier factoring problem, though the difficulty depends on the number you start with and which bit you flip. To illustrate this, I flipped each of the bits one at a time and measured how long it took to factor the result.

The median time to factor n with one bit flipped was 0.4 seconds. Here’s a plot of the factoring times as a function of which bit was flipped.

The plot shows about 80% of the data. Twenty percent of the time the value was above 11 seconds, and the maximum value was 74 seconds. So in every case flipping one bit made the factorization easier, usually quite a lot easier, but only a little easier in the worst case.

To verify that the results above were typical, I did a second experiment. This time I generated a sequence of pairs of random 100-bit primes. I factored their product, then factored the product with a randomly chosen bit flipped. Here are the factoring times in seconds.

|----------+---------|
| Original | Flipped |
|----------+---------|
|  117.563 |   3.828 |
|  108.672 |   4.875 |
|   99.641 |   0.422 |
|  103.031 |   0.000 |
|   99.188 |   0.000 |
|  102.453 |   0.234 |
|   79.594 |   0.094 |
|   91.031 |   0.875 |
|   64.313 |   0.000 |
|   95.719 |   6.500 |
|  131.125 |   0.375 |
|   97.219 |   0.000 |
|   98.828 |   0.203 |
|----------+---------|

By the way, we started this post by talking about maliciously flipping a bit. The same thing can happen without a planned attack if memory is hit by a cosmic ray.