Density of square-free numbers, cube-free numbers, etc.

An integer n is said to be square-free if no perfect square (other than 1) divides n. Similarly, n is said to be cube-free if no perfect cube (other than 1) divides n.

In general, n is said to be k-free if it has no divisor d > 1 where d is a kth power [1].

As x gets larger, the proportion of numbers less than x that are k-free approaches 1/ζ(k). That is

\lim_{x\to\infty} \frac{Q_k(x)}{x} = \frac{1}{\zeta(k)}

where Qk(x) is the number of k-free positive integers less than or equal to x and ζ is the Riemann zeta function.

The values of ζ at even integers are known in closed form. So, for example, the proportion of square-free numbers less than x approaches 6/π² since ζ(2) = π²/6. For odd integers the values of ζ must be calculated numerically.

Once you have a theorem that says one thing converges to another, it’s natural to ask next is how fast it converges. How well does the limit work as an approximation of the finite terms? Are there any bounds on the errors?

In [2] the authors give new estimates the remainder term

R_k(x) = Q_k(x) - \frac{x}{\zeta(k)}

Python code

Let’s play with this using Python. The code below is meant to be clear rather than efficient.

    from sympy import factorint

    def kfree(n, k):
        exponents = factorint(n).values()
        return max(exponents) < k

    def Q(k, x):
        q = 1 # because 1 is not k-free
        for n in range(2, x+1):
            if kfree(n, k):
                q += 1
        return q

    print(Q(4, 1000000))

This says Q4(1,000,000) = 923,939.

So Q4(106)/106 = 0.923939. Now ζ(4) = π4/90, so 1/ζ(4) = 0.9239384….

We have R4(106) = 0.597 and so in this case R is quite small.

(If we didn’t know ζ in closed form, we could calculate it in python with scipy.special.zeta.)

More posts involving Riemann zeta

[1] It would be clearer if we said kth power-free, but people shorten this to k-free.

In my opinion the term “k-free” is misleading. The term ought to mean a number not divisible by k, or maybe a number that has no digit k when written in some base. But the terminology is what it is.

[2] Michael Mossinghoff, Tomás Oliviera e Sliva, and Timothy Trudgian. The distribution of k-free numbers. Mathematics of Computation. Posted November 24, 2020.

Digitally delicate primes

A digitally delicate prime is a prime number such that if any of its digits are changed by any amount, the result is not prime. The smallest such number is 294001. This means that variations on this number such as 394001 or 291001 or 294081 are all composite.

A paper [1] posted last week gives several new results concerning digitally delicate primes. Here I’ll just mention a few things that are known.

First, there are infinitely many digitally delicate primes. They are uncommon, but they have positive density. That is, as x goes to infinity, the ratio of the number of digitally delicate primes less than x to the total number of primes less than x converges to a small positive number.

This result has been extended to bases other than base 10. In [1] the authors extend the result to widely digitally delicate primes, meaning that the result holds even if you consider the “digits” of a number to include leading zeros.

The following Python code tests whether a number is a digitally delicate prime. It then takes the first six digitally delicate primes and verifies that they are indeed digitally delicate.

from sympy import isprime
from math import floor, log10

def num_digits(n):
    return floor(log10(n))+1

def digit_in_pos(n, pos):
    return (n // 10**pos) % 10

def set_digit(n, pos, new):
    "change digit in given position to new value"
    old = digit_in_pos(n, pos)
    return n + (new - old)*10**pos

def is_digitally_delicate_prime(n):
    if not isprime(n):
        return False
    nd = num_digits(n)
    for pos in range(nd):
        for newdigit in range(1, 10):
            if digit_in_pos(n, pos) == newdigit:
            m = set_digit(n, pos, newdigit)
            if isprime(m):
                print("m = ", m)
                return False
    return True

for p in [294001, 505447, 584141, 604171, 971767, 1062599]:

[1] Michael Filaseta and Jeremiah Southwick. Primes that become composite after changing an arbitrary digit. Mathematics of Computation. Article electronically published on November 24, 2020.

Cubic calendars

Suppose you’re tasked with designing a Christmas countdown figure that uses numbered cube faces.

Christmas countdown Santa

How would you put the numbers on the cubes?

Suppose you want the calendar to say “00” on Christmas Day. If we were accustomed to counting in base six, you could label the sides of each cube 0 through 5, and you could count up to 55six = 35ten days until Christmas. But let’s work in base 10 like nearly all humans do.

We would at least like to count to 24 so we could start using our calendar on December 1. But to count up to 24, we have to get past 22. That means we have to represent 00, 11, and 22, and that means we have to put a 0, 1, and 2 on each cube. But then we have six remaining faces to receive seven remaining digits, so we can’t get past 22.

Here’s where we break our implicit rules. We’ve tacitly assumed each face of each cube has only one label. But since a ‘6’ is an upside-down ‘9’, we can effectively put seven labels on six faces. One cube, call it cube A, can be labeled 0, 1, 2, 3, 4, and 5, and the other, call it cube B, can be labeled 0, 1, 2, 6 (9), 7, and 8.

We can put cube B first on 0, and use cube A to count up to 5. Then we put cube A first on 0 and use cube B to count from 6 up to 9 (upside-down 6). We can do the analogous procedure for leading digits 1 and 2 to count all the way up to 29. To count any further we have to put cube A first with 3 showing because that’s the only way to get a number in the 30s. We can count up to 32, then we’re stuck.

We could break another implicit rule. We’ve assumed there must be two digits. What if we’re content to have a single 0 on Christmas Day and represent single digits by not using a cube in the first slot? Now only one cube has to have a 0 on it.

I don’t think this helps. To get past 32, we need a 3 on both cubes. We already have a 3 on cube A, but if we replace the 0 on cube B with a 3, now we can’t represent 10 or 20.

So it seems 32 is as high as we can count without thinking of another rule to bend.

Exercise for the reader: How many consecutive numbers could you produce using octahedra rather than cubes? Since the sides of octahedra are triangles, not cubes, you might argue that the faces should be oriented with the base of the triangle on bottom and so the trick of using an upside-down 6 to represent 9 should be disallowed.

Drawing commutative diagrams with Quiver

I recently discovered quiver, a tool for drawing commutative diagrams. It looks like a nice tool for drawing diagrams more generally, but it’s designed particularly to include the features you need when drawing the kinds of diagrams that are ubiquitous in category theory.

You can draw diagrams using the online app and export the result to LaTeX (tikz).

Here are some examples.

quiver diagrams

The app is mostly easy to figure out. The object and arrow labels are specified using LaTeX.

Quiver lets you position labels on either side of an arrow, or even inside the arrow as in the e in the first diagram above.

You can change the arrow and arrowhead styles, as in the second example.

It took me a little while to figure out how to have two arrows with the same destination and source, as in the third example. You use the “offset” setting to move arrows up or down do that they’re not on top of each other. And by default, the software helpfully assumes that if you want to move one arrow up a bit from center, you probably want to move the other one down from center by the same amount.

You can use the “curve” setting to draw arrows as in the final example. You can also draw arrows between arrows, as is common when working with 2-categories.

More category theory posts

Refinements to the prime number theorem

Let π(x) be the number of primes less than x. The simplest form of the prime number theorem says that π(x) is asymptotically equal to x/log(x), where log means natural logarithm. That is,

\lim_{x\to\infty} \frac{\pi(x)}{x/\log(x)} = 1

This means that in the limit as x goes to infinity, the relative error in approximating π(x) with x/log(x) goes to 0. However, there is room for improvement. The relative approximation error goes to 0 faster if we replace x/log(x) with li(x) where

\text{li}(x) = \int_0^x \frac{dt}{\log t}

The prime number theorem says that for large x, the error in approximating π(x) by li(x) is small relative to π(x) itself. It would appear that li(x) is not only an approximation for π(x), but it is also an upper bound. That is, it seems that li(x) > π(x). However, that’s not true for all x.

Littlewood proved in 1914 that there is some x for which π(x) > li(x). We still don’t know a specific number x for which this holds, though we know such numbers exist. The smallest such x is the definition of Skewes’ number. The number of digits in Skewes’ number is known to be between 20 and 317, and is believed to be close to the latter.

Littlewood not only proved that li(x) – π(x) is sometimes negative, he proved that it changes sign infinitely often. So naturally there is interest in estimating li(x) – π(x) for very large values of x.

A new result was published a few days ago [1] refining previous bounds to prove that

|\pi(x) - \text{li}(x)| \leq 235 x (\log x)^{0.52} \exp(-0.8\sqrt{\log x})

for all x > exp(2000).

When x = exp(2000), the right side is roughly 10857 and π(x) is roughly 10865, and so the relative error is roughly 10-8. That is, the li(x) approximation to π(x) is accurate to 8 significant figures, and the accuracy increases as x gets larger.


[1] Platt and Trudgian. The error term in the prime number theorem. Mathematics of Computation. November 16, 2020.

Minimizing random Boolean expressions

The previous post looked at all Boolean expressions on three or four variables and how much they can be simplified. The number of Boolean expressions on n variables is


and so the possibilities explode as n increases. We could do n = 3 and 4, but 5 would be a lot of work, and 6 is out of the question.

So we do what we always do when a space is too big to explore exhaustively: we explore at random.

The Python module we’ve been using, qm, specifies a function of n Boolean variables in terms of the set of product terms on which the function evaluates to 1. These product terms can be encoded as integers, and so a Boolean function of n variables corresponds to a subset of the integers 0 through 2n – 1.

We can generate a subset of these numbers by generating a random mask consisting of 0s and 1s, and keeping the numbers where the mask value is 1. We could do this with code like the following.

     N= 2**n
     x = np.arange(N)
     mask = np.random.randint(2, size=N)
     ones = set(mask*x)

There’s a small problem with this approach: the set ones always contains 0. We want it to contain 0 if and only if the 0th mask value is a 1.

The following code generates a Boolean expression on n variables, simplifies it, and returns the length of the simplified expression [1].

    def random_sample(n):
        N = 2**n
        x = np.arange(N)
        mask = np.random.randint(2, size=N)
        ones = set(mask*x)
        if mask[0] == 0:
        return len(qm(ones=ones, dc={}))

We can create several random samples and make a histogram with the following code.

    def histogram(n, reps):
        counts = np.zeros(2**n+1, dtype=int)
        for _ in range(reps):
            counts[random_sample(n)] += 1
        return counts

The data in the following graph comes from calling histogram(5, 1000).

data = [0, 0, 0, 0, 4, 44, 145, 339, 296, 128, 38, 5, 0, 1]

Note that the length of the random expressions is distributed symmetrically around 16 (half of 25). So minimization turns a distribution centered around 16 into a distribution centered around 8.

The code is slow because the Quine-McCluskey algorithm is slow, and our Python implementation of the algorithm isn’t as fast as it could be. But Boolean minimization is an NP problem, so no exact algorithm is going to scale well. To get faster results, we could switch to something like the Expresso Heuristic Logic Minimizer, which often gets close to a minimum expression.


[1] The code above will fail if the set of terms where the function is 1 is empty. However this is extremely unlikely: we’d expect it to happen once in every 2^(2^n) times and so when n = 5 this is less than one time in four billion. The fully correct approach would be to call qm with zeros=x when ones is empty.

How much can Boolean expressions be simplified?

In the previous post we looked at how to minimize Boolean expressions using a Python module qm. In this post we’d like to look at how much the minimization process shortens expressions.

Witn n Boolean variables, you can create 2^n terms that are a product of distinct variables. You can specify a Boolean function by specifying the subset of such terms on which it takes the value 1, and so there are 2^(2^n) Boolean functions on n variables. For very small values of n we can minimize every possible Boolean function.

To do this, we need a way to iterate through the power set (set of all subsets) of the integers up to 2^n. Here’s a function to do that, borrowed from itertools recipes.

    from itertools import chain, combinations
    def powerset(iterable):
        xs = list(iterable)
        return chain.from_iterable(
            combinations(xs, n) for n in range(len(xs) + 1))

Next, we use this code to run all Boolean functions on 3 variables through the minimizer. We use a matrix to keep track of how long the input expressions are and how long the minimized expressions are.

    from numpy import zeros
    from qm import q 

    n = 3
    N = 2**n
    tally = zeros((N,N), dtype=int)
    for p in powerset(range(N)):
        if not p:
            continue # qm can't take an empty set
        i = len(p)
        j = len(qm(ones=p, dc={}))
        tally[i-1, j-1] += 1 

Here’s a table summarizing the results [1].

The first column gives the number of product terms in the input expression and the subsequent columns give the number of product terms in the output expressions.

For example, of the expressions of length 2, there were 12 that could be reduced to expressions of length 1 but the remaining 16 could not be reduced. (There are 28 possible input expressions of length 2 because there are 28 ways to choose 2 items from a set of 8 things.)

There are no nonzero values above the main diagonal, i.e. no expression got longer in the process of minimization. Of course that’s to be expected, but it’s reassuring that nothing went obviously wrong.

We can repeat this exercise for expressions in 4 variables by setting n = 4 in the code above. This gives the following results.

We quickly run into a wall as n increases. Not only does the Quine-McCluskey algorithm take about twice as long every time we add a new variable, the number of possible Boolean functions grows even faster. There were 2^(2^3) = 256 possibilities to explore when n = 3, and 2^(2^4) = 65,536 when n = 4.

If we want to explore all Boolean functions on five variables, we need to look at 2^(2^5) = 4,294,967,296 possibilities. I estimate this would take over a year on my laptop. The qm module could be made more efficient, and in fact someone has done that. But even if you made the code a billion times faster, six variables would still be out of the question.

To explore functions of more variables, we need to switch from exhaustive enumeration to random sampling. I may do that in a future post. (Update: I did.)


[1] The raw data for the tables presented as images is available here.

Minimizing boolean expressions

This post will look at how to take an expression for a Boolean function and look for a simpler expression that corresponds to the same function. We’ll show how to use a Python implementation of the Quine-McCluskey algorithm.


We will write AND like multiplication, OR like addition, and use primes for negation. For example,

wx + z


(w AND x) OR (NOT z).

Minimizing expressions

You may notice that the expression

wxz + wxz

can be simplified to wz, for example, but it’s not feasible to simplify complicated expressions without a systematic approach.

One such approach is the Quine-McCluskey algorithm. Its run time increases exponentially with the problem size, but for a small number of terms it’s quick enough [1]. We’ll show how to use the Python module qm which implements the algorithm.

Specifying functions

How are you going to pass a Boolean expression to a Python function? You could pass it an expression as a string and expect the function to parse the string, but then you’d have to specify the grammar of the little language you’ve created. Or you could pass in an actual Python function, which is more work than necessary, especially if you’re going to be passing in a lot of expressions.

A simpler way is pass in the set of places where the function evaluates to 1, encoded as numbers.

For example, suppose your function is

wxyz + wxyz

This function evaluates to 1 when either the first term evaluates to 1 or the second term evaluates to 1. That is, when either

(w, x, y, z) = (1, 1, 0, 1)


(w, x, y, z) = (0, 1, 1, 0).

Interpreting the left sides as binary numbers, you could specify the expression with the set {13, 6} which describes where the function is 1.

If you prefer, you could express your numbers in binary to make the correspondence to terms more explicit, i.e. {0b1101,0b110}.

Using qm

One more thing before we use qm: your Boolean expression might not be fully specified. Maybe you want it to be 1 on some values, 0 on others, and you don’t care what it equals on the rest.

The qm module lets you specify these with arguments ones, zeroes, and dc. If you specify two out of these three sets, qm will infer the third one.

For example, in the code below

    from qm import qm
    print(qm(ones={0b111, 0b110, 0b1101}, dc={}))

we’re asking qm to minimize the expression

xyz + xyz‘ + wxyz.

Since the don’t-care set is empty, we’re saying our function equals 0 everywhere we haven’t said that it equals 1. The function prints

    ['1101', '011X']

which corresponds to

wxyz + wxy,

the X meaning that the fourth variable, z, is not part of the second term.

Note that the minimized expression is not unique: we could tell by inspection that

xyz + xyz‘ + wxyz.

could be reduced to

xy + wxyz.

Also, our code defines a minimum expression to be one with the fewest sums. Both simplifications in this example have two sums. But xy + wxyz is simpler than wxyz + wxy in the sense of having one less term, so there’s room for improvement, or at least discussion, as to how to quantify the complexity of an expression.

In the next post I use qm to explore how much minimization reduces the size of Boolean expressions.


[1] The Boolean expression minimization problem is in NP, and so no known algorithm that always produces an exact answer will scale well. But there are heuristic algorithms like Espresso and its variations that usually provide optimal or near-optimal results.

Rotating symbols in LaTeX

Linear logic uses an unusual symbol, an ampersand rotated 180 degrees, for multiplicative disjunction.


The symbol is U+214B in Unicode.

I was looking into how to produce this character in LaTeX when I found that the package cmll has two commands that produce this character, one semantic and one descriptive: \parr and \invamp [1].

This got me to wondering how you might create a symbol like the one above if there wasn’t one built into a package. You can do that by using the graphicx package and the \rotatebox command. Here’s how you could roll your own par operator:


There’s a backslash in front of the & because it’s a special character in LaTeX. If you wanted to rotate a K, for example, there would be no need for a backslash.

The \rotatebox command can rotate any number of degrees, and so you could rotate an ampersand 30° with


to produce a tilted ampersand.


Related posts

[1] The name \parr comes from the fact that the operator is sometimes pronounced “par” in linear logic. (It’s not simply \par because LaTeX already has a command \par for inserting a paragraph break.)

The name \invamp is short for “inverse ampersand.” Note however that the symbol is not an inverted ampersand in the sense of being a reflection; it is an ampersand rotated 180°.

The smallest number with a given number of divisors

Suppose you want to find the smallest number with 5 divisors. After thinking about it a little you might come up with 16, because

16 = 24

and the divisors of 16 are 2k where k = 0, 1, 2, 3, or 4.

This approach generalizes: For any prime q, the smallest number with q divisors is 2q-1.

Now suppose you want to find the smallest number with 6 divisors. One candidate would be 32 = 25, but you could do better. Instead of just looking at numbers divisible by the smallest prime, you could consider numbers that are divisible by the two smallest primes. And in fact

12 = 22 3

is the smallest number with 6 divisors.

This approach also generalizes. If h is the product of 2 primes, say h = pq where pq, then the smallest number with h divisors is

2p-1 3q-1.

The divisors come from letting the exponent on 2 range from 0 to p-1 and letting the exponent on 3 range from 0 to q-1.

For example, the smallest number with 35 divisors is

5184 = 27-1 35-1.

Note that we did not require p and q to be different. We said pq, and not p > q. And so, for example, the smallest number with 25 divisors is

1296 = 25-1 35-1.

Now, suppose we want to find the smallest number with 1001 divisors. The number 1001 factors as 7*11*13, which has some interesting consequences. It turns out that the smallest number with 1001 divisors is

213-1 311-1 57-1.

Does this solution generalize? Usually, but not always.

Let h = pqr where p, q, and r are primes with pqr. Then the smallest number with h divisors is

2p-1 3q-1 5r-1

with one exception. The smallest number with 8 divisors would be 30 = 2*3*5 if the theorem always held, but in fact the smallest number with 8 divisors is 24.

In [1] M. E. Gorst examines the exceptions to the general pattern. We’ve looked at the smallest number with h divisors when h is the product of 1, or 2, or 3 (not necessarily distinct) primes. Gorst considers values of h equal to the product of up to 6 primes.

We’ve said that the pattern above holds for all h the product of 1 or 2 primes, and for all but one value of h the product of 3 primes. There are two exceptions for h the product of 4 primes. That is, if h = pqrs where pqrs are primes, then the smallest number with h divisors is

2p-1 3q-1 5r-1 7s-1

with two exceptions. The smallest number with 24 divisors is 23 × 3 × 5, and the smallest number with 3 × 23 divisors is 23 × 32 × 5.

When h is the product of 5 or 6 primes, there are infinitely many exceptions, but they have a particular form given in [1].

The result discussed here came up recently in something I was working on, but I don’t remember now what. If memory serves, which it may not, I wanted to assume something like what is presented here but wasn’t sure it was true.


[1] M. E. Grost. The Smallest Number with a Given Number of Divisors. The American Mathematical Monthly, September 1968, pp. 725-729.