Connecting the FFT and quadratic reciprocity

Some readers will look at the title of this post and think “Ah yes, the FFT. I use it all the time. But what is this quadratic reciprocity?”

Others will look at the same title and think “Gauss called the quadratic reciprocity theorem the jewel in the crown of mathematics. But what is this FFT thing? I think I remember an engineer saying something about that.”

Gauss proved a theorem that relates quadratic reciprocity and the FFT. For distinct odd primes p and q, the following equation holds.

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = \frac{\text{Tr} {\cal F}_{pq}}{ \text{Tr} {\cal F}_p\, \text{Tr} {\cal F}_q}

I’ll spend the rest of this post unpacking this equation.

Legendre symbols

The expressions on the left are not fractions but rather Legendre symbols. The parentheses are not for grouping but are part of the symbol.

The Legendre symbol

\left(\frac{a}{r}\right)

is defined to be 0 if a is a multiple of r, 1 if a has a square root mod r, and −1 otherwise.
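For an odd prime r, the Legendre symbol can be computed with Euler’s criterion: (a/r) ≡ a^((r−1)/2) (mod r). Here is a minimal sketch in Python (the function name is my own):

```python
def legendre(a, r):
    # Euler's criterion: a^((r-1)/2) mod r equals 0, 1, or r-1 (i.e. -1) for odd prime r
    t = pow(a, (r - 1) // 2, r)
    return t - r if t == r - 1 else t

print(legendre(2, 7))   # 1: 2 is a square mod 7, since 3*3 = 2 mod 7
print(legendre(3, 7))   # -1: 3 has no square root mod 7
```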

Fourier transforms

The Discrete Fourier Transform (DFT) of a vector of length n multiplies the vector by the n × n Fourier matrix Fn whose j, k entry is equal to exp(2πi jk / n). The Fast Fourier Transform (FFT) is a way to compute the DFT more quickly than directly multiplying by the Fourier matrix. Since the DFT is nearly always computed using the FFT algorithm, the DFT is commonly referred to as the FFT.
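To make this concrete, here is a sketch comparing multiplication by a Fourier matrix with NumPy’s FFT. Note that np.fft.fft uses the opposite sign convention, exp(−2πi jk/n), so the matrix below is the conjugate of the one defined above.

```python
import numpy as np

n = 8
J, K = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
F = np.exp(-2j * np.pi * J * K / n)  # conjugate of the matrix in the text

x = np.arange(n, dtype=float)
print(np.allclose(F @ x, np.fft.fft(x)))  # True
```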

Matrix trace

The trace of a matrix is the sum of the elements along the main diagonal. So the trace of the Fourier matrix of size n is

\text{Tr} {\cal F}_n = \sum_{j=1}^n \exp(2\pi ij^2/n)

Numerical illustration

The quadratic reciprocity theorem, also due to Gauss, is usually stated as

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}

We can illustrate the theorem at the top of the page numerically with the following Python code, using the quadratic reciprocity theorem to evaluate the product of the Legendre symbols.

from numpy import exp, pi

tr = lambda p: sum(exp(2j*pi*k**2/p) for k in range(1, p+1))  # trace of the Fourier matrix
p, q = 59, 17
print( tr(p*q)/(tr(p)*tr(q)) )
print( (-1)**((p-1)*(q-1)//4) )  # integer division keeps the exponent an integer

The first print statement produces (0.9999999999998136-1.4048176871018313e-13j) due to loss of precision in the floating point calculations, but this is essentially 1, which is what the second print statement produces.

If we change q to 19, both statements print −1 (after rounding the first result).

Quadratic Gauss sum

We can quickly prove

\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = \frac{\text{Tr} {\cal F}_{pq}}{ \text{Tr} {\cal F}_p\, \text{Tr} {\cal F}_q}

if we assume the quadratic reciprocity theorem and the following equation for the trace of the Fourier matrix.

\text{Tr} {\cal F}_n = \sum_{j=1}^n \exp(2\pi ij^2/n) =
\left\{
	\begin{array}{ll}
		\sqrt{n} & \mbox{if } n \equiv 1 \bmod{4} \\
		0 & \mbox{if } n \equiv 2 \bmod{4} \\
		i\sqrt{n} & \mbox{if } n \equiv 3 \bmod{4} \\
		(1+i)\sqrt{n} & \mbox{if } n \equiv 0 \bmod{4}
	\end{array}
\right.
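We can sanity-check this formula numerically, using the same sum as in the code above, with one n from each residue class mod 4:

```python
from numpy import exp, pi, isclose, sqrt

tr = lambda n: sum(exp(2j*pi*j**2/n) for j in range(1, n+1))

print(isclose(tr(5), sqrt(5)))         # True: n = 1 mod 4
print(isclose(tr(6), 0))               # True: n = 2 mod 4
print(isclose(tr(7), 1j*sqrt(7)))      # True: n = 3 mod 4
print(isclose(tr(8), (1+1j)*sqrt(8)))  # True: n = 0 mod 4
```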

This proof is historically backward. It assumes quadratic reciprocity, but Gauss proved quadratic reciprocity by first proving the equation we’re trying to prove. He then obtained the expression on the right hand side of the quadratic reciprocity theorem using the equation above for the trace of the Fourier matrix.

The trace of the Fourier matrix is now called a quadratic Gauss sum. It’s a special case of more general sums that Gauss studied, motivated by his exploration of quadratic reciprocity.

Incidentally, Gauss gave many proofs of the quadratic reciprocity theorem. I don’t know where the proof outlined here fits into the sequence of proofs he developed.


Two-digit zip codes

It’s common to truncate US zip codes to the first three digits for privacy reasons. Truncating to the first two digits is less common, but occurs in some data sets.

HIPAA Safe Harbor requires sparsely populated 3-digit zip codes to be suppressed, because even when rolled up to three digits some regions are still sparsely populated.

How sparse can a two-digit zip code region be? Empty. The population of all zip codes starting with 09 is zero. That’s because all zip codes of the form 09XXX are overseas military bases and not anyone’s permanent residence.

Aside from that, the least populated 2-digit zip code is 69 with a population of about 175,000.

The most populated 2-digit zip code is 33 with a population of about 10,500,000.

So the ratio of the largest population to the smallest non-zero population is about 60.

What if we roll up all the way to 1-digit zip codes? Now things are more evenly distributed. The smallest region is 5 with a population of about 17.5M and the largest is 9 with a population of about 54M. A ratio of about 3.


Bessel zero spacing

Bessel functions are to polar coordinates what sines and cosines are to rectangular coordinates. This is why Bessel functions often arise in applications with radial symmetry.

The locations of the zeros of Bessel functions are important in applications, and so you can find software for computing these zeros in mathematical libraries. In days gone by you could find them in printed tables, such as Table 9.5 in Abramowitz and Stegun (A&S).

Bessel functions are solutions to Bessel’s differential equation,

x^2 y'' + x y' + (x^2 - \nu^2) y = 0

For each ν the functions Jν and Yν, known as the Bessel functions of the first and second kind respectively, form a basis for the solutions to Bessel’s equation. These functions are analogous to cosine and sine.

As x → ∞, Bessel functions asymptotically behave like damped sinusoidal waves. Specifically,

\begin{align*} J_\nu(x) &\sim \sqrt{\frac{2}{\pi x}} \cos(x - \pi\nu/2 - \pi/4) \\ Y_\nu(x) &\sim \sqrt{\frac{2}{\pi x}} \sin(x - \pi\nu/2 - \pi/4) \end{align*}

So since for large x Bessel functions behave something like sin(x), you’d expect the spacing between consecutive zeros to approach π, and this is indeed the case.

We can say more. If ν² > ¼ then the spacing between zeros decreases toward π, and if ν² < ¼ the spacing between zeros increases toward π. This is not just true of Jν and Yν but also of their linear combinations, i.e. of any solution of Bessel’s equation with parameter ν.

If you look carefully, you can see this in the plots of J0 and J1 below. The solid blue curve, the plot of J0, crosses the x-axis at points closer together than π, and the dashed orange curve, the plot of J1, crosses the x-axis at points further apart than π.
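The claim is easy to check numerically. The sketch below finds the first several zeros of J0 by bisection, using the integral representation J0(x) = (1/π) ∫₀^π cos(x sin t) dt so that only NumPy is needed; scipy.special.jn_zeros would do this directly.

```python
import numpy as np

def j0(x):
    # Midpoint rule for J0(x) = (1/pi) * integral of cos(x sin t) over [0, pi]
    t = (np.arange(2000) + 0.5) * np.pi / 2000
    return np.mean(np.cos(x * np.sin(t)))

# Scan for sign changes of J0, then refine each zero by bisection
xs = np.linspace(0.5, 35, 2000)
vals = np.array([j0(x) for x in xs])
zeros = []
for i in np.where(np.sign(vals[:-1]) != np.sign(vals[1:]))[0]:
    lo, hi = xs[i], xs[i + 1]
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if j0(lo) * j0(mid) <= 0:
            hi = mid
        else:
            lo = mid
    zeros.append(0.5 * (lo + hi))

print(np.diff(zeros))  # spacings are below pi and increase toward pi
```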

For more on the spacing of Bessel zeros see [1].


[1] F. T. Metcalf and Milos Zlamal. On the Zeros of Solutions to Bessel’s Equation. The American Mathematical Monthly, Vol. 73, No. 7, pp. 746–749

Coloring the queen’s graph

Suppose we have an n × n chessboard. The case n = 8 is of course most common, but we consider all positive integer values of n.

The graph of a chess piece has an edge between two squares if and only if the piece can legally move between the two squares.

Now suppose we have a rule that if a piece can move between two squares, the squares must be colored differently. The minimum number of colors necessary to color the chessboard this way is the chromatic number of that piece’s graph.

The chromatic number of a rook is clearly at least n: since a rook can move between any two squares in a row, every square in the row must be a different color. And in fact the chromatic number for a rook’s graph is exactly n.

Bishops are a little more complicated. If n is even, then the chromatic number for a bishop is n. If n is odd, the number is n for a white bishop but n − 1 for a black bishop. Thinking of a 3 × 3 board shows why the two bishops are different.

Now a queen is sorta like a rook and a bishop combined, but determining the chromatic number for a queen’s graph is more difficult. The chromatic number of a queen can be no less than that of a rook since the former can make all the moves of the latter. So the queen’s chromatic number is at least n, but is it equal to n?

The results stated above can be found in [1].

Let k(n) be the chromatic number of a queen’s graph on an n × n chessboard. The results in [1] regarding k(n) are summarized as follows.

  • k(1) = 1
  • k(2) = 4
  • k(3) = k(4) = 5
  • k(6) = 7
  • k(n) = n if n is not divisible by 2 or 3.
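The last bullet can be checked constructively. When n is coprime to 6, the coloring c(i, j) = (i + 2j) mod n uses distinct colors along every row, column, and diagonal. (This particular construction is my own illustration; the paper’s argument may differ.) A brute-force verification:

```python
from math import gcd

def queen_coloring(n):
    # c(i, j) = (i + 2j) mod n; proper when gcd(n, 6) = 1
    assert gcd(n, 6) == 1
    return [[(i + 2 * j) % n for j in range(n)] for i in range(n)]

def is_proper(color):
    # every row, column, diagonal, and anti-diagonal must use distinct colors
    n = len(color)
    lines  = [[(i, j) for j in range(n)] for i in range(n)]
    lines += [[(i, j) for i in range(n)] for j in range(n)]
    lines += [[(i, j) for i in range(n) for j in range(n) if i + j == s]
              for s in range(2 * n - 1)]
    lines += [[(i, j) for i in range(n) for j in range(n) if i - j == d]
              for d in range(-n + 1, n)]
    return all(len({color[i][j] for i, j in line}) == len(line) for line in lines)

print(is_proper(queen_coloring(5)))  # True
print(is_proper(queen_coloring(7)))  # True
```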

The authors stated that they had found an example proving that k(8) ≤ 9 and conjectured that k(8) = 9. You can find a C++ program that confirms this conjecture here.

The authors conjectured that k(n) ≤ n + 1 for all n > 3. I don’t know whether this has been proven. My impression following a brief search is that the problem of finding k(n) for all n is still not fully resolved.

Update: According to a link in the comments, n = 27 is the smallest n such that k(n) is unknown.


[1] M. R. Iyer and V. V. Menon. On Coloring the n × n Chessboard. The American Mathematical Monthly, 1966, Vol. 73, pp. 721–725.

Regex to match SWIFT-BIC codes

A SWIFT-BIC number identifies a bank, not a particular bank account. The BIC part stands for Bank Identifier Code.

I had to look up the structure of SWIFT-BIC codes recently, and here it is:

  • Four letters to identify the bank
  • Two letters to identify the country
  • Two letters or digits to identify the location
  • Optionally, three letters or digits to identify a branch

Further details are given in the ISO 9362 standard.

We can use this as an example to illustrate several regular expression features, and how regular expressions are used in practice.

Regular expressions

If your regular expression flavor supports listing a number of repetitions in braces, you could write the above format as

    [A-Z]{6}[A-Z0-9]{2,5}

This would work, for example, with egrep but not with grep. YMMV.

That’s concise, but a little too permissive. It allows anywhere from 2 to 5 alphanumeric characters on the end. But the standard says 2 or 5 alphanumeric characters after the country code, not between 2 and 5. For example, 3 characters after the country code would not be valid. So we could reduce our false positive rate a little by changing the regex to

    [A-Z]{6}[A-Z0-9]{2}([A-Z0-9]{3})?$

Without the dollar sign on the end, ABCDEF12X would still match because the part of the regex up to the optional ([A-Z0-9]{3})? at the end would match at the beginning of the string. The dollar sign marks the end of the string, so it says the code has to end either after 8 or 11 characters and stop.
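A quick check with Python’s re module illustrates the difference (re.match anchors at the start of the string):

```python
import re

bad  = r"[A-Z]{6}[A-Z0-9]{2}([A-Z0-9]{3})?"
good = bad + "$"

print(bool(re.match(bad,  "ABCDEF12X")))  # True: matches the first 8 characters
print(bool(re.match(good, "ABCDEF12X")))  # False: string doesn't end after 8 or 11
```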

If your regex flavor does not support counts in braces, you could spell everything out:

    [A-Z][A-Z][A-Z][A-Z][A-Z][A-Z][A-Z0-9][A-Z0-9]([A-Z0-9]{3})?$

Convenience versus accuracy

If you want to match only valid SWIFT-BIC codes, you can get perfect accuracy by checking against an exhaustive list of SWIFT-BIC codes. You could even write a regular expression that matches codes on this list and only codes on the list, but what would the point be? Regular expressions usually trade accuracy for convenience.

I don’t have a list of all valid SWIFT-BIC codes. If I did, it might be out of date by the time I download it. But if I’m trying to pull bank codes out of a text file, the regex

    [A-Z]{6}[A-Z0-9]{2}([A-Z0-9]{3})?$

is likely to do a pretty good job. Regular expressions are usually used in a context where there’s some tolerance for error. Maybe you use a regular expression to do a first pass, then weed out the mismatches with a manual review.

Capturing parts

Maybe you want to do more than just find SWIFT codes. Maybe you want to look at their pieces.

For example, the fifth and sixth characters of a SWIFT code are the ISO 3166 two-letter abbreviation for the country the bank is in. (With one exception: XK represents Kosovo, which does not have an ISO 3166 code.)

You could replace

    [A-Z]{6}

at the front of the regular expression with

    [A-Z]{4}([A-Z]{2})

which will not change which strings match, but it will store the fifth and sixth characters as the first captured group. How you access captured groups varies between regular expression implementations.
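In Python, for example, captured groups are available through the match object. (DEUTDEFF is a commonly cited example code, for Deutsche Bank in Germany.)

```python
import re

swift = re.compile(r"[A-Z]{4}([A-Z]{2})[A-Z0-9]{2}([A-Z0-9]{3})?$")

m = swift.match("DEUTDEFF500")
print(m.group(1))  # DE: the country code
print(m.group(2))  # 500: the optional branch code
```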

Legibility

The first proposed regular expression

    [A-Z]{6}[A-Z0-9]{2,5}

is easy to read, at least in my opinion. It has grown over the course of this post to

    [A-Z]{4}([A-Z]{2})[A-Z0-9]{2}([A-Z0-9]{3})?$

which is not as easy to read. This is typical: you start with a quick-and-dirty regular expression, then refine it until it meets your needs. Regular expressions tend to get uglier as they become more precise.

There are ways to make regular expressions more readable by using something like the /x modifier in Perl, which lets you insert white space and comments inside a regular expression.

That’s nice, but it’s also a little odd. If you’re going to use a complicated regular expression in production code, then you should format it nicely and add comments. But then you have to ask why you’re using a complicated regular expression in production code. I’m not saying this is never appropriate, but it’s not the most common use case.

I could imagine using a simple regular expression when you want quick and dirty, and using an exhaustive list of SWIFT codes in production. A complex, well-commented regular expression seems to fall into a sort of no man’s land in between.

Bad takes on chaos theory

I just finished reading The Three Body Problem. At the end of the book is a preview of Cixin Liu’s book Supernova Era. A bit of dialog in that preview stood out to me because it touches on themes I’ve written about before.

“I’ve heard about that. When a butterfly flaps its wings, there’s a hurricane on the other side of the world.”

“That’s right,” Specs said, nodding. “A chaotic system.”

Huahua said, “I want to be that butterfly.”

Specs shook his head again. “You don’t understand at all. We’re all butterflies, just like every butterfly. Every grain of sand and every drop of rain is a butterfly. That’s why the world is unpredictable.”

Most popular interpretations of chaos theory are misguided. Two such misguided interpretations are illustrated in the passage above.

When hearing of chaos theory, many jump to the same conclusion as Huahua in the excerpt above, who wants to be the butterfly that starts a hurricane. They think chaos theory implies that butterfly effects can be engineered. This is the optimistic fallacy.

Sometimes a small deliberate effort can lead to a large intended effect. But chaos theory would not predict this. In fact, reasoning by analogy from chaos theory would suggest this is impossible. More on that here and here.

Another misguided interpretation of chaos theory is the pessimistic fallacy that “the world is unpredictable,” as Specs says above. But we know that’s not true. Some aspects of the world are very predictable. As Orphan Annie says, the sun will come out tomorrow.

Even people who say the world is unpredictable don’t live as if the world were unpredictable. Deep down they know full well that in important ways the world is predictable.


It’s true that not everything is as predictable as we may have imagined, weather being a famous example. Chaos theory was born out of the surprising observation that weather simulations are very sensitive to changes in initial conditions.

We do not live in a world in which we can tickle a particular butterfly in order to deliberately direct the course of the future. But neither do we live in a world without discernible causes and effects.

New Ways To Make Code Run Faster


The news from Meta last week is a vivid reminder of the importance of making code run faster and more power-efficiently. Meta intends to purchase 350,000 Nvidia H100 GPUs this year [1]. Assuming 350W TDP [2] and $0.1621 per kWh [3] average US energy cost, one expects a figure of $174 million per year in electricity expenses just to power the GPUs (a rough estimate of the actual cost). For this and many other datacenter-scale and real-time critical applications, every bit of increased performance can be impactful.

Many approaches can be taken to making code run faster. The excellent book Hacker’s Delight contains many tricks for speeding up low-level code kernels [4]. Also, superoptimization techniques find the very fastest performing implementations of short, loop-free code kernels by exhaustive search [5].
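As a tiny example of the genre, here is one classic Hacker’s Delight-style identity: x & (x − 1) clears the lowest set bit of x, which gives a branch-light power-of-two test (shown in Python for illustration).

```python
def is_power_of_two(x):
    # x & (x - 1) clears the lowest set bit; powers of two have exactly one set bit
    return x > 0 and (x & (x - 1)) == 0

print([x for x in range(1, 33) if is_power_of_two(x)])  # [1, 2, 4, 8, 16, 32]
```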

Since exhaustive search scales exponentially with code size, it’s no surprise that other methods have been tried. Recent work with reinforcement learning reduces the number of scalar multiples needed for matrix products, discovering new Strassen-like algorithms [6]. Other work focuses more on hardware design; for example, PrefixRL optimizes chip design for targeted problems using reinforcement learning [7].

Finding the absolute fastest code for a given task is in general NP-complete [8]. This problem is out of reach for tractable solution by such methods. However, phenomenal progress in SAT/SMT solver efficiency for NP-complete problems has been made over the last 20 years (someone has even quipped, “NP is the new P” [9]). And indeed, these methods have been applied to superoptimization problems [10, 11]. Perhaps methodologies based on efficient SAT and SMT solvers will afford further opportunities for advancement.

The need for speed and power efficiency has become so acute that radically different non-von Neumann processors are being built [12], in some cases orders of magnitude more power efficient but restrictive in the problems they can solve. In the coming years we can expect to see a huge amount of activity and development on this important problem.

[1] “Meta will have 350,000 of Nvidia’s fastest AI GPUs by end of year, buying AMD’s MI300, too,” Tom’s Hardware, Jan. 19, 2024, https://www.tomshardware.com/tech-industry/meta-will-have-350000-of-nvidias-fastest-ai-gpus-by-end-of-year-buying-amds-mi300-too

[2] “NVIDIA H100 Tensor Core GPU Datasheet,” https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet

[3] “Electricity Rates for Every State in The U.S.,” https://www.energybot.com/electricity-rates/

[4] Henry Warren, “Hacker’s Delight,” 2nd Edition, 2012, https://web.archive.org/web/20190915025154/http://www.hackersdelight.org/

[5] “Superoptimization”, https://en.wikipedia.org/wiki/Superoptimization

[6] Fawzi, A., Balog, M., Huang, A. et al. “Discovering faster matrix multiplication algorithms with reinforcement learning,” Nature 610, 47–53 (2022). https://doi.org/10.1038/s41586-022-05172-4

[7] Rajarshi Roy et al., “PrefixRL: Optimization of Parallel Prefix Circuits using Deep Reinforcement Learning,” https://arxiv.org/abs/2205.07000

[8] Hennessy, John L., and Thomas Gross. “Postpass code optimization of pipeline constraints.” ACM Transactions on Programming Languages and Systems (TOPLAS) 5, no. 3 (1983): 422-448.

[9] Arie Gurfinkel, “Algorithms for SAT,” https://ece.uwaterloo.ca/~agurfink/ece750t29/assets/pdf/02_SAT.pdf

[10] Abhinav Jangda and Greta Yorsh. 2017. “Unbounded superoptimization,” in Proceedings of the 2017 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward! 2017). Association for Computing Machinery, New York, NY, USA, 78–88. https://doi.org/10.1145/3133850.3133856

[11] Phitchaya Mangpo Phothilimthana, Aditya Thakur, Rastislav Bodik, and Dinakar Dhurjati. 2016. “GreenThumb: superoptimizer construction framework,” in Proceedings of the 25th International Conference on Compiler Construction (CC 2016). Association for Computing Machinery, New York, NY, USA, 261–262. https://doi.org/10.1145/2892208.2892233

[12] “OpenAI Agreed to Buy $51 Million of AI Chips From a Startup Backed by CEO Sam Altman,” Wired, Dec. 3, 2023, https://www.wired.com/story/openai-buy-ai-chips-startup-sam-altman/

 

Brute force cryptanalysis

A naive view of simple substitution ciphers is that they are secure because there are 26! ways to permute the English alphabet, and so an attacker would have to try 26! ≈ 4 × 10²⁶ permutations. However, such brute force is not required. In practice, simple substitution ciphers are breakable by hand in a few minutes, and you can find software that automates the process.

However, for modern encryption, apparently brute force is required. If you encrypt a message using AES with a 128-bit key, for example, you can’t do much better than try 2¹²⁸ keys. You might be able to do a little better, but as far as is openly known, you can’t do orders of magnitude better.

Even for obsolete encryption methods such as DES it still takes a lot more effort to break encryption than to apply encryption. The basic problem with DES is that it used 56-bit keys, and trying 2⁵⁶ keys is feasible [1]. You won’t be able to do it on your laptop, but it can be done using many processors in parallel [2]. Still, you’d need more than a passing curiosity about a DES encrypted message before you’d go to the time and expense of breaking it.

If breaking a simple substitution cipher really did require brute force, it would offer 88-bit security. That is, 26! roughly equals 2⁸⁸. So any cipher offering b-bit security for b > 88 is more secure in practice than breaking simple substitution ciphers would be in naive theory. This would include AES, as well as many of its competitors that weren’t chosen for the standard, such as Twofish.
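The 88-bit figure is simply the base-2 log of 26!, which is easy to check:

```python
from math import factorial, log2

# bits of security if breaking simple substitution required brute force
print(log2(factorial(26)))  # about 88.4
```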

For all the block ciphers mentioned here, the number of bits of security they offer is equal to the size of the key in bits. This isn’t always the case. For example, the security level of an RSA key is much less than the size of the key, and the relation between key size and security level is nonlinear.

A 1024-bit RSA modulus is believed to offer on the order of 87 bits of security, which incidentally is comparable to 26! as mentioned above. NIST FIPS 186-5 recommends 2048 bits as the minimum RSA modulus size. This gives about 117 bits of security.

The security of RSA depends on the difficulty of factoring the product of large primes [3], and so you can compute the security level of a key based on the efficiency of the best known factoring algorithm, which is currently the General Number Field Sieve. More on this here.
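The usual way to turn this into numbers is the heuristic GNFS running time exp((64/9)^{1/3} (ln N)^{1/3} (ln ln N)^{2/3}). Taking the base-2 log of that work factor reproduces the figures above, roughly; this is a back-of-the-envelope sketch, not a precise security estimate.

```python
from math import log

def rsa_security_bits(modulus_bits):
    # Heuristic GNFS work factor exp((64/9)^(1/3) (ln N)^(1/3) (ln ln N)^(2/3)),
    # expressed in bits (log base 2)
    lnN = modulus_bits * log(2)
    work = (64 / 9) ** (1 / 3) * lnN ** (1 / 3) * log(lnN) ** (2 / 3)
    return work / log(2)

print(round(rsa_security_bits(1024)))  # 87
print(round(rsa_security_bits(2048)))  # 117
```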


[1] There are ways to do better than brute force against DES, if you have an enormous number of messages all encrypted with the same key.

[2] In 1998, the EFF built a machine called Deep Crack with 1,856 custom processors that could crack a DES-encoded message in at most nine days, four and a half days on average.

[3] Nobody has proved that breaking RSA requires factoring. There is a variation on RSA that is provably as hard as factoring but as far as I know it has never been widely used.

Straddling checkerboard encryption

Introduction

Computers fundamentally changed cryptography, opening up new possibilities for making and breaking codes. At first it may not have been clear which side benefited most, but now it’s clear that computers gave more power to code makers than code breakers.

We now have cryptographic primitives that cannot be attacked more efficiently than by brute force, as far as we know. The weak link is how these primitives are implemented and combined, not the primitives themselves.

Before computers there was more of a cat and mouse game between encryption and cryptanalysis. Encryption schemes that were convenient to carry out by hand could usually be broken by hand eventually. But if you only needed secrecy briefly, a simple scheme might provide that secrecy for long enough. This post will look at one such scheme, the straddling checkerboard.

Checkerboards

Perhaps the most obvious way to conveniently turn letters into numbers is to arrange the letters into a 5 × 5 grid. This has to leave out one letter, and in practice this meant combining I and J. Or if you needed digits, you could use a 6 × 6 grid and put J back in. You’d scramble the alphabet in the grid according to some key, then encrypt each letter by its coordinates.

       12345
      +-----
     1|EBISP
     2|XWLVN
     3|AOYZQ
     4|MDCKH
     5|RTUFG

This is no better than a simple substitution cipher because someone intercepting a message encrypted this way would easily guess that pairs of digits represent letters. However, if you then permuted the digits with a transposition cipher, you’d have something more formidable. This is essentially what the ADFGVX cipher did, which stumped cryptanalysts for a while.

The straddling checkerboard is a variation on the method above. Letters would be arranged in a 3 × 10 grid rather than 5 × 5. Some letters would be encrypted as a single digit and some as a pair of digits.

       1234567890
      +----------
      |  EBISPXWL
     1|VNAOYZQMDC
     2|KHRTUFGJ./

In the example above, E would be encrypted as 3, N would be encrypted as 12, and so on. This is an instance of a prefix code. In order to be able to decode the digits unambiguously, no letter could be encoded as 1 or 2; these digits always signaled the beginning of a pair.

Prefix codes are often used in non-secret codes, such as country codes for telephone numbers. More examples of prefix codes in this post.

Because 1 and 2 could not be used to encode single letters, there were 28 slots to fill. These could be filled with other symbols, and in practice period and slash were added [1].
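Here is a sketch of encryption and decryption with the checkerboard above. Decryption can proceed greedily because 1 and 2 always begin a pair, which is exactly what makes this a prefix code.

```python
# The checkerboard from the post: blanks in the top row force the prefix property
top  = "  EBISPXWL"   # row with no digit prefix; columns are 1234567890
row1 = "VNAOYZQMDC"   # row prefixed by 1
row2 = "KHRTUFGJ./"   # row prefixed by 2
cols = "1234567890"

enc = {}
for c, ch in zip(cols, top):
    if ch != " ":
        enc[ch] = c
for c, ch in zip(cols, row1):
    enc[ch] = "1" + c
for c, ch in zip(cols, row2):
    enc[ch] = "2" + c
dec = {v: k for k, v in enc.items()}

def encrypt(msg):
    return "".join(enc[ch] for ch in msg)

def decrypt(digits):
    out, i = [], 0
    while i < len(digits):
        n = 2 if digits[i] in "12" else 1  # 1 and 2 always begin a pair
        out.append(dec[digits[i:i + n]])
        i += n
    return "".join(out)

print(encrypt("SEND"))           # 631219
print(decrypt(encrypt("SEND")))  # SEND
```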

Efficiency

The straddling checkerboard gives a more efficient encoding than does the checkerboard since typically fewer digits will be required. If efficiency were the only concern, we’d put the eight most frequent letters on the top row, something like the following [2].

       1234567890
      +----------
      |  ETAOINSR
     1|BCDFGHJKLM
     2|PQUVWXYZ./

This would be more efficient but less secure since the arrangement of the letters would be more predictable.

Security

The straddling checkerboard presents a bit of a challenge to the cryptanalyst since it’s not known a priori whether a digit is part of a pair (if the vertical coordinates are not always 1 and 2).

The straddling checkerboard didn’t offer much security even in its day. It would have been better if there had been some further processing done on the digits, such as how the ADFGVX cipher permuted its coordinates.

The message, interpreted as a number N, could have been further encrypted as aN + b where a and b were randomly chosen numbers that were part of the key. As far as I know, nothing like this was ever done. This would have provided more security but would also require more effort and increase the chance of introducing errors.
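A toy version of that idea (entirely my own illustration, not a historical scheme): treat the digit string as an integer N and transmit aN + b. In practice leading zeros would need separate handling.

```python
# Toy affine step: the key pair (a, b) is hypothetical, chosen arbitrarily here
def affine_encrypt(digits, a, b):
    return str(a * int(digits) + b)

def affine_decrypt(s, a, b):
    n = int(s) - b
    assert n % a == 0, "wrong key"
    return str(n // a)

c = affine_encrypt("631219", a=7919, b=1234)
print(affine_decrypt(c, a=7919, b=1234))  # 631219
```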


[1] David Kahn. The Codebreakers. Chapter 18.

[2] You may have expected the last letter on the first row to be H, going by the printer’s order ETAOIN SHRDLU. Peter Norvig discovered a slightly different order of letter frequencies based on the Google corpus.

Email subscription changes

I will soon be discontinuing the email subscription option for this blog. I recommend that email subscribers switch over to subscribing to the RSS feed for the blog. If you’re unfamiliar with RSS, here is an article on how to get started.

(I recommend RSS in general, and not just for subscribing to this blog. RSS lets you decide what sources you want to receive rather than leave the choice up to someone else’s algorithm.)

Update: I have replaced my email subscription service with a Substack account.

You can also find me on Twitter and on Mastodon at johndcook@mathstodon.xyz.