Cryptography, hydrodynamics, and celestial mechanics

V. I. Arnold

Last night I was reading a paper by the late Russian mathematician V. I. Arnold “Polymathematics: is mathematics a single science or a set of arts?” and posted a lightly edited extract of it on Twitter. It begins

All mathematics is divided into three parts: cryptography, hydrodynamics, and celestial mechanics.

Arnold is alluding to the opening line to Julius Caesar’s Gallic Wars [1] which suggests he’s being a little playful. The unedited version leaves no doubt that he’s being playful or cynical.

All mathematics is divided into three parts: cryptography (paid for by CIA, KGB and the like), hydrodynamics (supported by manufacturers of atomic submarines), and celestial mechanics (financed by military and other institutions dealing with missiles, such as NASA).

I edited out the parenthetical remarks, in part edit the sentence down to a tweet, but also because when you take out the humor/cynicism he makes a valid if hyperbolic point. He goes on to back this up.

Cryptography has generated number theory, algebraic geometry over finite fields, algebra, combinatorics and computers.

Hydrodynamics has procreated complex analysis, partial differential equations, Lie groups and algebra theory, cohomology theory and scientific computing.

Celestial mechanics is the origin of dynamical systems, linear algebra, topology, variational calculus and symplectic geometry.

Arnold adds a footnote to the comment about cryptography:

The creator of modern algebra, Viète, was the cryptographer of King Henry IV of France.

Of course not all mathematics was motivated by cryptography, hydrodynamics, and celestial mechanics, but an awful lot of it was. And his implicit argument that applied math gives birth to pure math is historically correct. Sometimes pure math gives rise to applied math, but much more often it’s the other way around.

His statements roughly match my own experience. Much of the algebra and number theory that I’ve learned has been motivated by cryptography. I dove into Knuth’s magnum opus, volume 2, because I wanted to implement the RSA algorithm in C++.

I got started in scientific computing in a computational fluid dynamics (CDF) lab. I didn’t work in the lab, but my roommate did, and I went there to use the computers. That’s where I would try my hand at numerical analysis and where I wrote simulation code for my dissertation. My dissertation in partial differential equations was related (very abstractly) to fluid flow in porous media.

I didn’t know anything about celestial mechanics until I sat in on Richard Arenstorf‘s class as a postdoc. So celestial mechanics was not my personal introduction to dynamical systems etc., but Arnold is correct that these fields came out of celestial mechanics.

Related posts

[1] “Gallia est omnis divisa in partes tres.” which translates “Gaul is a whole divided into three parts.”

The quality of an RNG depends on the application

A random number generator can be good for some purposes and not for others. This isn’t surprising given the fundamentally impossible task such generators are supposed to perform. Technically a random number generator is a pseudo random number generator because it cannot produce random numbers. But random is as random does, and for many purposes the output of a pseudorandom number generator is random for practical purposes. But this brings us back to purposes.

Let β be an irrational number and consider the sequence

xnnβ mod 1

for n = 0, 1, 2, … That is, we start at 0 and repeatedly add β, taking the fractional part each time. This gives us a sequence of points in the unit interval. If you think of bending the interval into a circle by joining the 1 end to 0, then our sequence goes around the circle, each time moving β/360°.

Is this a good random number generator? For some purposes yes. For others no. We’ll give an example of each.


If your purpose is Monte Carlo integration, then yes it is. Our sequence has low discrepancy. You can approximate the integral of a function f over [0, 1] by taking the average of f(xn) over the first N elements of the sequence. Doing Monte Carlo integration with this particular RNG amounts to quasi Monte Carlo (QMC) integration, which is often more efficient than Monte Carlo integration.

Here’s an example using β = e.

    import numpy as np
    # Integrate f with N steps of (quasi) Monte Carlo 
    def f(x): return 1 + np.sin(2*np.pi*x)
    N = 1000
    # Quasi Monte Carlo
    sum = 0
    x = 0
    e = np.exp(1) 
    for n in range(N):
        sum += f(x)
        x = (x + e) % 1
    # Monte Carlo
    sum = 0
    for _ in range(N):
        sum += f(np.random.random())

This code prints


The exact value of the integral is 1, and so the error using QMC between 4 and 5 times smaller than the error using MC. To put it another way, integration using our simple RNG is much more accurate than using the generally better RNG that NumPy uses.


Now suppose we’re doing some sort of simulation that requires computing the gaps between consecutive random numbers. Let’s look at the set of gaps we get using our simple RNG.

    gapset = set()
    x = 0
    for _ in range(N):
        newx = (x + e) % 1
        gap = np.abs(newx - x)
        x = newx
        gapset.add(np.round(gap, 10))
    print( gapset )

Here we rounded the gaps to 10 decimal places so we don’t have minuscule gaps caused by floating point error.

And here’s our output:

    {0.7182818285, 0.2817181715}

There are only two gap sizes! This is a consequence of the three-gap theorem that Greg Egan mentioned on Twitter this morning. Our situation is a slightly different in that we’re looking at gaps between consecutive terms, not the gaps that the interval is divided into. That’s why we have two gaps rather than three.

If we use NumPy’s random number generator, we get 1000 different gap sizes.

Histogram of gap sizes


Random number generators with excellent statistical properties may be completely unsuitable for use in cryptography. A lot of people don’t know this or don’t believe it. I’ve seen examples of insecure encryption systems that use random number generators with good statistical properties but bad cryptographic properties.

These systems violate Kirchoff’s principle that the security of an encryption system should reside in the key, not in the algorithm. Kirchoff assumed that encryption algorithms will eventually be known, and difficult to change, and so the strength of the system should rely on the keys it uses. At best these algorithms provide security by obscurity: they’re easily breakable knowing the algorithm, but the algorithm is obscure. But these systems may not even provide security by obscurity because it may be possible to infer the algorithm from the output.

Fit for purpose

The random number generator in this post would be terrible for encryption because the sequence is trivially predictable. It would also fail some statistical tests, though it would pass others. It passes at least one statistical test, namely using the sequence for accurate Monte Carlo integration.

Even so, the sequence would pass a one-sided test but not a two-sided test. If you tested whether the sequence, when used in Monte Carlo integration, produced results with error below some threshold, it would pass. But if you looked at the distribution of the integration errors, you’d see that they’re smaller than would be expected from a random sequence. The sequence provides suspiciously good integration results, failing a test of randomness but suggesting the sequence might be useful in numerical integration.

Related posts

Mental hash function

A few years ago I wrote about Manual Blum’s proposed method for mentally computing a secure hash function. He proposed using this method as a password manager, using the hash of a web site’s name as the password for the site.

I first wrote about Blum’s method on the Heidelberg Laureate Forum blog, then wrote some footnotes to the post here.

NB: This method is insecure for reasons Rob Shearer describes in the comments.


Blum’s algorithm requires creating and memorizing two things: a random map from letters to digits, and a random permutation of digits.

Let f be a function that maps the set {A, B, C, …, Z} to the set {0, 1, 2, …, 9} created by a random number generator, and let g be a random permutation of {0, 1, 2, …, 9}.

Start with the text you want to hash. Then use f to convert the text to a sequence of digits, replacing each character c with f(c). Label this sequence of digits

a0 a1 a2an.

To create the first digit of your hash value, add the first and last digits of your sequence of digits, take the last digit of the sum, and apply the permutation g to it. In symbols, the first digit of the output, b0, is given by

b0 = g( (a0 + an) % 10 )

where a % 10 is the remainder when a is divided by 10.

For the rest of the digits, add the previous output to the current input, take the last digit, and apply the permutation. That is, for k > 0,

bk = g( (bk-1 + ak) % 10 ).

A few years ago Blum recommended using at least 12 characters.

Python code

Here’s some Python code to implement the method above and make all the details explicit.

    from numpy.random import default_rng
    rng = default_rng(20220516)
    fdata = rng.integers(10, size=26)
    def f(ch):
        ch = ch.lower()
        return fdata[ord(ch) - ord('a')]
    gdata = rng.permutation(10)
    def g(n):
        i = list(gdata).index(n)
        return gdata[(i+1) % 10]
    def hash(text):
        digits = [f(c) for c in text]
        N = len(text)
        out = [0] * N
        out[0] = g((digits[0] + digits[-1]) % 10)
        for i in range(1, N):
            out[i] = g((out[i-1] + digits[i]) % 10)
        return "".join(str(d) for d in out)

This prints 26752.

Is mental cryptography hopeless?

It’s been eight years since I talked to Manuel Blum. Perhaps he realized years ago that this system is flawed. But I still find his initial question interesting: are there encryption algorithms that humans can execute mentally that cannot be easily broken by computers?

It would be interesting if such an algorithm could at least put up a good fight. Cryptographic algorithms are considered broken if they can be compromised given a few weeks of computing. Could a mental hashing algorithm at least withstand an hour of computer attack?

Encryption in groups of unknown order

One way of looking at RSA encryption, a way that generalizes to new methods, is that the method is based on group operations inside a group of unknown order, i.e. unknown to most people. Another way of putting it is that RSA encryption takes place in a group where everybody knows how to multiply but not everyone knows how to divide. This will be explained below.

RSA encryption starts by finding two large primes p and q. You compute the product

n = pq

and make it public, but keep the factors p and q secret. The RSA method lets anyone send you encrypted messages by doing certain operations in the multiplicative group of the integers mod n. (Details here.) Let’s call this group G.

The elements of G are the integers from 1 to n-1 that are relative prime to n. The group operation is multiplication mod n, i.e. to multiply two elements of G, multiply them first as ordinary integers, then divide the result by n and keep the remainder.

The order of G, the number of elements in G, is

φ(n) = (p – 1)(q – 1).

You know p and q, and so you know φ(n), but the public does not. The security of RSA depends on the assumption that the public cannot compute φ(n). If someone could factor n, they could compute φ(n), but it is assumed that for large enough p and q it is not computationally feasible to factor n.

The public knows how to multiply in G but not how to divide. That is, anyone can carry out multiplication, but they cannot compute multiplicative inverses. Only you know how to divide in G, because you know φ(n) and Euler’s theorem.

In some sense the public knows everything there is to know about G. It’s the multiplicative group of integers mod n, and you tell them what n is. And that does tell them all they need to know to send you messages, but in practice it doesn’t tell them enough to decrypt messages anyone else sends you.

When you’re using RSA for public key cryptography, you’re telling the world “Here’s an n. To communicate securely with me, carry out certain algorithms with the integers relatively prime to n.”

Someone might object “But how do we know whether an integer is relatively prime to n? You haven’t told us its factors.”

You could reply “Don’t worry about it. It’s extremely unlikely that you’d run into a number that isn’t relatively prime to n. In fact, if you did, you’d break my system wide open. But if you’re worried about it, you can efficiently confirm that your numbers are relatively prime to n.”

Let’s unpack that last statement. We’ve already said that the number of positive integers less and n and relatively prime to n is (p – 1)(q – 1). So the number that are not relatively prime is

pq – (p – 1)(q – 1) = p + q – 1

and the probability of accidentally running into a one of these numbers is

(p + q – 1)/pq

Now if p and q are 300 digit numbers, for example, then this probability is on the order of one chance in 10300.

The Euclidean algorithm lets you find the greatest common factor of enormous numbers quickly. If you have a number k and you want to test whether it’s relatively prime to n, you can compute gcd(k, n). If k was chosen at random, the gcd is extremely likely to be 1, i.e. relatively prime to n. But if the answer is not 1, then it’s either p or q, in which case you’ve broken the encryption.


If you can factor large numbers, you can break RSA encryption. But it’s conceivable that you may be able to break RSA without being able to factor large numbers. That is in fact the case when RSA is implemented poorly. But aside from implementation flaws, nobody knows whether breaking RSA is as hard as factoring. Michael Rabin came up with a variation on RSA that is provably as hard as factoring, though I don’t know whether it has ever been used in practice.

More cryptography posts here.

Also a crypto library

The home page for the OpenSSL project says

OpenSSL is a robust, commercial-grade, and full-featured toolkit for the Transport Layer Security (TLS) and Secure Sockets Layer (SSL) protocols. It is also a general-purpose cryptography library. …

If you’ve never heard of the project before, you would rightly suppose that OpenSSL implements SSL (and its successor TLS). But you might not realize that OpenSSL “is also a general-purpose cryptography library.”

After thinking about it a bit, you might realize that software implementing SSL must have some encryption capability, but it doesn’t follow that this capability would necessarily be readily accessible. In fact, OpenSSL has implements a lot of cryptography algorithms and makes them easy to use from the command line. For example, this post shows how to compute hash functions using the openssl command.

Earlier today I wrote a thread on @CompSciFact about the famous example of encrypting an image of the Linux mascot Tux using ECB (Electronic Code Book) mode. As the saying goes, you should never use ECB “because you can see the penguin.”

Original encrypted Tux image

I wanted to try reproducing the example, and my first thought was to use Python. But setting up encryption libraries is a fairly lengthy process, while AES encryption using openssl is a one-liner.

My encrypted Tux image

You can still see the outline of Tux, but my penguin looks quite different from the famous example for a variety of reasons. For starters, I don’t know what key was used in the original image. Also, there are a variety of ways to extract the data from an image, encrypt it, and put it back. I basically followed Filippo Valsorda’s post The ECB Penguin but I had to make a few changes to get it to work due to changes in GIMP since that post was written.

Using cryptography broken 50 years ago

Old cryptography never dies. After a method is broken, its use declines, but never goes to zero.

And when I say “broken,” I do not mean no longer recommended, but broken to the point of being trivial to decrypt. I recently ran across an anecdote from World War I showing this is nothing new. The Vigenère cipher had been broken decades before the war broke out but was widely used anyway.

In this post I’ll explain what the Vigenère cipher is, give a little history of its rise and fall, and explain how it can be broken.

Vigenère cipher

A simple substitution cipher replaces one letter with another. For example, maybe you replace A with X, B with J, C with B, etc. Simple substitution ciphers are so easy to break that they’re included in pulp puzzle books.

The Vigenère cipher is a step up from simple substitution. It combines a key of length n with the clear text in such a way that you effectively have n different simple substitution ciphers. It’s not hard to break—I’ll explain how to break it below—but it’s less vulnerable that simple substitution. It makes it harder to spot high-frequency letters like E because not all E’s are encrypted the same way.

In its simplest form Vigenère essentially adds a key and a message mod 26. For example, if your message is “Attack the bridge at dawn” and your key is “ossifrage” the encryption would work like this.


Starting from the left, O is the 14th letter of the alphabet (counting from 0), and so A is moved ahead 14 places to 0. Since the clear text and the key are involved symmetrically, you could say that the letter O is moved head zero places since A is the 0th letter of the alphabet.

The next two letters of the clear text and the key happen to be repeated [1]. T and S are the 19th and 18th letters of the alphabet, and 19 + 18 = 37, which is congruent to 11 mod 16. The 11th letter of the alphabet is L.

Here’s a little Python code to carry out the encryption.

    key   = "OSSIFRAGE"

    A = ord('A')
    for i in range(len(clear)):
        c = ord(clear[i]) - A
        k = ord(key[i % len(key)]) - A
        e = (c + k) % 26
        print(chr(e + A), end="")

A more secure version of Vigenère moves the clear text through scrambled alphabets. The simplified version above is a particularly insecure special case. However, using scrambled alphabets (“polyalphabetic substitution”) doesn’t make the method that much stronger.

World War I

According to [2],

[The Vigenère cipher] was commonly regarded as unbreakable and was widely used up through World War I, even though the Prussian cryptographer Friedrich Wilhelm Kasiski had published a method for breaking it in 1863.

David Kahn [3] gives more detail about Kasiski’s book:

Die Geheimschriften und die Dechiffrir-kunst concentrates on answering the problem that had vexed cryptanalysts for more than 300 years: how to achieve a general solution for polyalphabetic ciphers with repeating keywords. … But the 95-page volume seems to have stirred almost no comment at the time.

So armies were depending on the security of Vigenère over 50 years after it had been broken. This was worse than using DES today, around 50 years after it came out. Weaknesses have been found in DES, and it’s 56-bit keys are too short to resist a brute force attack from contemporary computers. But DES has not been broken as thoroughly as Vigenère had been broken by the time of WWI.

Breaking Vigenère

So how would you go about attacking Vigenère? At first it seems hard to break. You can’t naively look at letter frequencies. In the example above, the clear text has four A’s, but they are encrypted three different ways.

However, Vigenère with a key of length n is just a set of n different simple substitution ciphers. You can chop the cipher text into n pieces and break each one separately.

How would you know the key length n? You could just try brute force. Try 1, then 2, then 3, etc. For example, to see whether the example above might have a key of length 2, you could analyze the letters in even positions and the letters in odd positions separately. If n isn’t too large (and in practice it often wasn’t large) brute force is efficient.

A more sophisticated approach would be to try different alignments and see which one results in statistical properties most consistent with English text. (Or French text if you suspect your clear text was in French etc.) This was the motivation for William Friedman developing his index of coincidence.

In hindsight this seems fairly obvious, but it was not obvious to anyone for three centuries before Kasiski, nor to many for decades after.

Why study obsolete cryptography?

Classical encryption methods are completely obsolete. The methods described in David Kahn’s book The Codebreakers can now be broken almost instantly. So why study old cryptography?

My interest in the subject was renewed recently because I had use for it in new projects. Not directly, though I have seen obsolete encryption in use, but indirectly. Some of the ideas from classical cryptography are relevant to searching for patterns in data, even though nothing was encrypted.

There are still lessons to learn from classical cryptography. For example, even a subtle lack of randomness in keys can be exploited. This was done during WWII, and flaws in random number generators are still causing security failures today.

Related posts

[1] This is a little foreshadowing of what can go wrong even with non-repeating keys: If the clear text and the key are English prose, coincidences like this will happen, and they are an exploitable weakness.

[2] James S. Craft and Lawrence C. Washington. An Introduction to Number Theory with Cryptography, 2nd edition. CRC Press.

[3] David Kahn. The Codebreakers: The Story of Secret Writing. Scribner, 1967.

Index of coincidence

Index of coincidence is a statistic developed by William Friedman for use in cryptanalysis. It measures how unevenly symbols are distributed in a message.

It’s a kind of signature that could be used, for example, to infer the language of a text, even if the text has been encrypted with a simple substitution cipher. It also has more sophisticated uses, such as inferring key length in text that has been encrypted with a Vigenère cipher.

If a discrete random variable has n possible values, each occurring with probability pi, then its index of coincidence is

I = \sum_{i=1}^n p_i^2

In cryptanalysis, the probabilities are the letter frequencies in the language of the message. The sum above is often multiplied by some constant, but this makes no difference in application because you would compare index of coincidence values, not interpret the index in the abstract.

Often the sum above is multiplied by n. This amounts to dividing the sum above by the corresponding sum for a uniform distribution.

Index of coincidence is occasionally use in applications, though not in cryptography anymore.


The letter frequencies in Google’s English language corpus are given here. In this corpus the probabilities range from 0.1249 for e down to 0.0009 for z. The index of coincidence would be

0.1249² + 0.0928² + … + 0.0009² = 0.06612

You’re more likely to see this value multiplied by 26, which gives 1.719.

Relation to entropy

Index of coincidence seems a lot like entropy, and in fact it is a simple transformation of an entropy, though not Shannon entropy. Index of coincidence is closely related to Rényi entropy.

Rényi entropy of order α is defined for a random variable X to be

H_\alpha(X) = \frac{1}{1 - \alpha} \log_2 \left(\sum_{i=1}^n p_i^\alpha \right)

and so Rényi entropy of order 2 is the negative log of the index of coincidence. That is, if H is the Rényi entropy of order 2 and I is the index of coincidence, then

\begin{align*} H &= -\log I \\ I &= \exp(-H) \end{align*}

Index of coicidence and Rényi entropy are sometimes defined with multiplicative constants that would slightly change the equations above.

Since exp(-x) is a decreasing function, increasing index of coincidence corresponds to decreasing Rényi entropy. Sorting in increasing order by index of coincidence is the same as sorting in decreasing order by Rényi entropy.


What does RIPEMD stand for?

The RIPEMD-160 secure hash function may be best known these days for its role as part of the implementation of Bitcoin. I’ve wondered what “RIPEMD” stands for, and today I stumbled on an explanation [1]:

“RIPEMD” stands for “RIPE Message Digest,” where “RIPE” stands for “RACE Integrity Primitives Evaluation” and where “RACE” stands for “Research and Development in Advanced Communications Technologies in Europe”—a nice example of a recursive abbreviation.

This deserves a diagram:

I created the diagram above with DITAA.

Related posts

[1] Introduction to Cryptography with Open-Source Software by Alasdair McAndrew. CRC Press. 2011

Hashing phone numbers

A cryptographic hash is also known as a one-way function because given an input x, one can quickly compute the hash h(x), but it is extremely time-consuming to try to recover x if you only know h(x).

Even if the hashing algorithm is considered “broken,” it may take an enormous effort to break it. Google demonstrated that they could break a SHA-1 hash, but they used a GPU-century of compute power to do so. Attacks have become more efficient since then, but it still takes many orders of magnitude less work to compute a hash than to attempt to invert it. [1]

However, if you know that the hash value comes from a small set of possible inputs, brute force can discover which one. I wrote about this a couple years ago in the post Hashing names does not protect privacy. You could, for example, easily create a table of the hashes of all nine-digit social security numbers.

I often explain this to clients who have been told that hashed data is “encrypted.” This is subtle, because the data is encrypted, in a way, but not in the way they think.

A paper came out a few weeks ago that hashed 118 billion phone numbers.

The limited amount of possible mobile phone numbers combined with the rapid increase in affordable storage capacity makes it feasible to create key-value databases of phone numbers indexed by their hashes and then to perform constant-time lookups for each given hash value. We demonstrate this by using a high-performance cluster to create an in-memory database of all 118 billion possible mobile phone numbers from [reference] (i.e., mobile phone numbers allowed by Google’s libphonenumber and the WhatsApp registration API) paired with their SHA-1 hashes.

The authors were able to use this data base to query

10% of US mobile phone numbers for WhatsApp and 100% for Signal. For Telegram we find that its API exposes a wide range of sensitive information, even about numbers not registered with the service.

Related posts

[1] Hash functions are not invertible, even in theory, in the sense of a unique x leading to a hash value h(x). Suppose you’re computing a 256-bit hash on files that are one kilobyte (8192 bits). If you’re mapping a space of 28192 possible files into a space of 2256 possible hash values, the mapping cannot be one-to-one. However, if you know the inputs are not random bits but German prose, and you find a file of German prose that has a matching hash value, you’ve almost certainly recovered the file that led to the hash value.

Perfectly nonlinear functions

The other day I heard someone suggest that a good cocktail party definition of cryptography is puzzle making. Cryptographers create puzzles that are easy to solve given the key, but ideally impossible without the key.

Linearity is very useful in solving puzzles, and so a puzzle maker would like to create functions that are as far from linear as possible. That is why cryptographers are interested in perfectly nonlinear functions (PNs) and almost perfect nonlinear functions (APNs), which we will explore here.

The functions we are most interested in map a set of bits to a set of bits. To formalize things, we will look at maps between two finite Abelian groups A and B. If we look at lists of bits with addition defined as component-wise XOR [1], we have an Abelian group, but the ideas below apply to other kinds of Abelian groups as well.

We will measure the degree of nonlinearity of a function F from A to B by the size of the sets

\delta(a, b) = \{x \mid F(a + x) - F(x) = b\}

where a and x are in A and b is in B [2].

If F is a linear function, F(a + x) = F(a) + F(x), and so δ(a, b) simplifies to

\delta(a, b) = \{x \mid F(a) = b\}

which equals all of the domain A, if F(a) = b. So for linear functions, the set δ(a, b) is potentially as large as possible, i.e. all of A. Perfectly nonlinear functions will be functions where the sets δ(a, b) are as small as possible, and almost perfectly nonlinear functions will be functions were δ(a, b) is as small as possible in a given context.

The uniformity of a function F is defined by

\underset{a \ne 0}{\max_{a \in A, \, b\in B}} \,\, |\delta(a,b)|

The restriction a ≠ 0 is necessary since otherwise the uniformity of every function would be the same.

For linear functions, the uniformity is as large as possible, i.e. |A|. As we will show below, a nonlinear function can have maximum uniformity as well, but some nonlinear functions have smaller uniformity. So some nonlinear functions are more nonlinear than others.

We will compute the uniformity of a couple functions, mapping 4 bits to 1 bit, with the following Python code.

    A = range(16)
    B = range(2)
    nonZeroA = range(1, 16)

    def delta(f, a, b):
        return {x for x in A if f(x^a) ^ f(x) == b}

    def uniformity(f):
        return max(len(delta(f, a, b)) for a in nonZeroA for b in B)

We could look at binary [3] functions with domains and ranges of different sizes by changing the definitions of A, B, and nonZeroA.

Note that in our groups, addition is component-wise XOR, denoted by ^ in the code above, and that subtraction is the same as addition since XOR is its own inverse. Of course in general addition would be defined differently and subtraction would not be the same as addition. When working over a binary field, things are simpler, sometimes confusingly simpler.

We want to compute the uniformity of two functions. Given bits x0, x1, x2, and x3, we define

F(x0, x1, x2, x3) = x0 x1


G(x0, x1, x2, x3) = x0 x1 + x2 x3.

We can implement these in Python as follows.

    def F(x):
        x = [c == '1' for c in format(x, "04b")]
        return x[0] and x[1]

    def G(x):
        x = [c == '1' for c in format(x, "04b")]
        return (x[0] and x[1]) ^ (x[2] and x[3])

Next we compute the uniformity of the two functions.

    print( uniformity(F) )
    print( uniformity(G) )

This shows that the uniformity of F is 16 and the uniformity of G is 8. Remember that functions with smaller uniformity are considered more nonlinear. Both functions F and G are nonlinear, but G is more nonlinear in the sense defined here.

To see that F is nonlinear, let x = (1, 0, 0, 0) and y = (0, 1, 0, 0).

1 = F(x + y) ≠ F(x) + F(y) = 0.

It turns out that 8 is the smallest possible uniformity for a function from 4 bits to 1 bit, and the function G is said to be perfectly nonlinear.

I haven’t defined perfect nonlinearity or almost perfect nonlinearaity precisely, only saying that they have something to do with maximum nonlinearity (minimum uniformity) in some context. I may say more about that context in a future post.

Related posts

[1] XOR stands for exclusive or. For Boolean variables p and q, p XOR q is true if either p is true or q is true, but not both. The component-wise XOR of two binary numbers sets each bit of the result to the XOR of the corresponding input bits. For example, 1001 XOR 1010 = 0011.

[2] This definition comes from Perfect nonlinear functions and cryptography by Céline Blondeau and Kaisa Nyberg.

[3] The next post replaces binary numbers with base 5 and other bases.