Finite rings

It occurred to me recently that I rarely hear about finite rings. I did a Google Ngram search to make sure this isn’t just my experience.

[Google Ngram chart comparing "finite group," "finite ring," and "finite field"]

Why are finite groups and finite fields common while finite rings are not?

Finite groups have relatively weak algebraic structure, and demonstrate a lot of variety. Finite fields have very strong algebraic structure. Their complete classification has been known for a long time and is easy to state.

I imagine that most of the references to finite groups above have to do with classifying finite groups, and that most of the references to finite fields have to do with applications of finite fields, which are many.

You can see that references to finite groups hit their peak around the time of the Feit-Thompson theorem in 1962, and drop sharply after the classification of finite simple groups was essentially done in 1994. There’s a timeline of the progress toward the classification theorem on Wikipedia.

Rings have more structure than groups, but less structure than fields. Finite rings in particular are in a kind of delicate position: they easily become fields. Wedderburn’s little theorem says every finite domain is a field.
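To make the distinction concrete, here is a quick Python illustration (just modular arithmetic, nothing fancy): the integers mod 6 form a finite ring that is not a field, since 2 · 3 = 0 makes 2 a zero divisor, while the integers mod 7 form a finite domain and, as the theorem promises, every nonzero element is invertible.

    # Z/6 is a finite commutative ring but not a field:
    # 2 * 3 == 0 (mod 6), so 2 is a zero divisor and has no inverse.
    print([x for x in range(6) if 2 * x % 6 == 1])   # [] -- nothing inverts 2

    # Z/7 has no zero divisors, so it is a finite domain, and indeed
    # every nonzero element has a multiplicative inverse.
    print(all(any(a * x % 7 == 1 for x in range(7)) for a in range(1, 7)))  # True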

The classification of finite rings is much simpler than that of finite groups. And in applications you often want a finite field. Even if a finite ring (not necessarily a field) would do, you’d often use a finite field anyway.

In summary, my speculation as to why you don’t hear much about finite rings is that they’re not as interesting to classify as finite groups, and not as useful in application as finite fields.

Posts on finite simple groups

Posts on finite fields

Mixing error-correcting codes and cryptography

Secret codes and error-correcting codes have nothing to do with each other. Except when they do!

Error-correcting codes

Error-correcting codes make digital communication possible. Without some way to detect and correct errors, the corruption of a single bit could wreak havoc. A simple example of error detection is a checksum. A more sophisticated example is an erasure code, a method data centers use to protect customer data against hard drive failures or even entire data centers going offline.
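As a minimal sketch of the checksum idea, here is a single parity bit in Python: the sender appends the XOR of the data bits, and the receiver recomputes it. A single flipped bit is detected, though not located or repaired.

    def parity(bits):
        """XOR of all the bits: the simplest possible checksum."""
        p = 0
        for b in bits:
            p ^= b
        return p

    data = [1, 0, 1, 1, 0, 1, 0, 1]
    sent = data + [parity(data)]       # append the check bit

    received = sent.copy()
    received[3] ^= 1                   # one bit corrupted in transit

    print(parity(received[:-1]) == received[-1])   # False: the error is detected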

People who work in coding theory are quick to point out that they do not work in cryptography. “No, not that kind of code. Error-correcting codes, not secret codes.” The goal isn’t secrecy. The goal is to maximize the probability of correctly transmitting data while minimizing the amount of extra information added.

Codes and ciphers

You don’t hear the word “code” used in connection with cryptography much anymore. People used to refer to “codes and ciphers” in one breath. Historically, the technical distinction was that a code operated on words, while a cipher operated on characters. Codes in this sense have long been obsolete, but people still speak of codes colloquially.

David Kahn’s classic book on pre-modern cryptography is entitled The Codebreakers, not the Cipherbreakers, because the public at the time was more familiar with the term code than the term cipher. Maybe that’s still the case: Jason Fagone, for example, entitled his biography of Elizebeth Friedman The Woman Who Smashed Codes. Perhaps the author suggested The Woman Who Smashed Ciphers and an editor objected.

Code-based cryptography

If you’re accustomed to the older use of “codes,” the term “code-based cryptography” is redundant. But it means something specific in modern usage: cryptographic systems that incorporate error-correcting codes. So error-correcting codes and secret “codes” do have something to do with each other after all!

Robert McEliece had this idea back in 1978. His encryption method starts with a particular error-correcting code, a binary Goppa code, and scrambles it with an invertible linear transformation. At a very high level, McEliece’s method boils down to a secret factorization, sorta like RSA but even more like oil and vinegar. The public key is the product of the Goppa code and the linear transformation, but only the owner knows the factorization of this key.

To encrypt a message with McEliece’s method, the sender adds a specific amount of random noise, noise that the Goppa code can remove. An attacker faces a challenging computational problem to recover the message without knowing how to factor the public key.
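Here is a toy sketch of the classic construction in Python, with a tiny [7,4] Hamming code (which corrects a single error) standing in for the binary Goppa code, and with the textbook scrambler matrix S and permutation P playing the role of the secret invertible transformation. The parameters are laughably small and insecure; the point is only the mechanics: the public key looks like a random code, but the owner can strip off S and P and use fast Hamming decoding.

    import numpy as np

    rng = np.random.default_rng(42)

    # Generator and parity-check matrices of the [7,4] Hamming code,
    # which corrects any single-bit error.
    G = np.array([[1,0,0,0,0,1,1],
                  [0,1,0,0,1,0,1],
                  [0,0,1,0,1,1,0],
                  [0,0,0,1,1,1,1]])
    H = np.array([[0,1,1,1,1,0,0],
                  [1,0,1,1,0,1,0],
                  [1,1,0,1,0,0,1]])

    # Private key: a random invertible scrambler S and a random permutation P.
    while True:
        S = rng.integers(0, 2, (4, 4))
        if round(np.linalg.det(S)) % 2 == 1:          # invertible over GF(2)
            break
    S_inv = np.round(np.linalg.inv(S) * np.linalg.det(S)).astype(int) % 2  # adjugate = inverse mod 2
    P = np.eye(7, dtype=int)[rng.permutation(7)]

    # Public key: the scrambled generator matrix, which looks like a random code.
    G_pub = S @ G @ P % 2

    # Encrypt: encode the message with the public key, then add one bit of noise.
    m = np.array([1, 0, 1, 1])
    e = np.zeros(7, dtype=int)
    e[rng.integers(7)] = 1
    c = (m @ G_pub + e) % 2

    # Decrypt: undo the permutation, correct the error with the Hamming code,
    # then undo the scrambler.
    y = c @ P.T % 2                       # P.T is the inverse permutation
    s = H @ y % 2                         # syndrome = the column of H at the error
    for j in range(7):
        if np.array_equal(H[:, j], s):
            y[j] ^= 1                     # flip the corrupted bit
            break
    recovered = y[:4] @ S_inv % 2         # G is systematic, so these bits are m S
    print(np.array_equal(recovered, m))   # True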

Post-quantum cryptography

McEliece’s method did not attract much interest at the time because it requires much larger public keys than other methods, such as RSA. However, there is renewed interest in McEliece’s approach because his scheme is apparently quantum-resistant whereas RSA and other popular public key systems are not.

If and when large quantum computers become practical, they could factor the products of large primes efficiently, and thus break RSA. They could also solve the discrete logarithm and elliptic curve discrete logarithm problems, breaking Diffie-Hellman and elliptic curve cryptosystems. All public key cryptosystems now in common use would be broken.

Why worry about this now while quantum computers don’t exist? (They exist, but only as prototypes. So far the largest number a quantum computer has been able to factor is 21.) The reason is that it takes a long time to develop, analyze, standardize, and deploy encryption methods. There’s also the matter of forward security: someone could store encrypted messages with the hope of decrypting them in the future. This doesn’t matter for cat photos transmitted over TLS, but it could matter for state secrets; governments may be encrypting documents that they wish to keep secret for decades.

NIST is sponsoring a competition to develop and standardize quantum-resistant encryption methods. Two months ago NIST announced the candidates that advanced to the second round. Seven of these methods use code-based cryptography, including the classic McEliece method and six variations: BIKE, HQC, LEDAcrypt, NTS-KEM, ROLLO, and RQC.

Related posts

US Army applying new areas of math

Many times on this blog I’ve argued that the difference between pure and applied math is motivation. As my graduate advisor used to say, “Applied mathematics is not a subject classification. It’s an attitude.”

[Image: Uncle Sam wants homotopy type theory]

Traditionally there was general agreement regarding what is pure math and what is applied. Number theory and topology, for example, are pure, while differential equations and numerical analysis are applied.

But then public key cryptography and topological data analysis brought number theory and topology over into the applied column, at least for some people. And there are people working in differential equations and numerical analysis who aren’t actually interested in applications. It would be more accurate to say that some areas of math are more directly and more commonly applied than others. Also, some areas of math are predominantly filled with people interested in applications and some are not.

The US Army is interested in applying some areas of math that you would normally think of as very pure, including homotopy type theory (HoTT).

From an Army announcement:

Modeling frameworks are desired that are able to eschew the usual computational simplification assumptions and realistically capture … complexities of real world environments and phenomena, while still maintaining some degree of computational tractability. Of specific interest are causal and predictive modeling frameworks, hybrid model frameworks that capture both causal and predictive features, statistical modeling frameworks, and abstract categorical models (cf. Homotopy Type Theory).

And later in the same announcement

Homotopy Type Theory and its applications are such an area that is of significant interest in military applications.

HoTT isn’t the only area of math the Army announcement mentions. There are the usual suspects, such as (stochastic) PDEs, but also areas of math ordinarily thought of as pure, such as topology; the word “topological” appears 23 times in the document.

This would be fascinating. It can be interesting when a statistical model works well in application, but it’s no surprise: that’s what statistics was developed for. It’s more interesting when something finds an unexpected application, such as when number theory entered cryptography. The applications the Army has in mind are even more interesting because the math involved is more abstract and, one would have thought, less likely to be applied.

Related posts

Riffing on mistakes

I mentioned on Twitter yesterday that one way to relieve the boredom of grading math papers is to explore mistakes. If a statement is wrong, what would it take to make it right? Is it approximately correct? Is there some different context where it is correct? Several people said they’d like to see examples, so this blog post is a sort of response.

***

One famous example of this is the so-called Freshman’s Dream theorem:

(a + b)^p = a^p + b^p

This is not true over the real numbers, but it is true, for example, when working with integers mod a prime p.

(More generally, the Freshman’s Dream is true in any commutative ring of prime characteristic p. This is more than an amusing result; it’s useful in applications of finite fields.)
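A quick brute-force check in Python over the integers mod 7 (any prime modulus would do):

    p = 7
    print(all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
              for a in range(p) for b in range(p)))   # True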

***

A common misunderstanding in calculus is that a series converges if its terms converge to zero. The canonical counterexample is the harmonic series. Its terms converge to zero, but the sum diverges.

But this can’t happen in the p-adic numbers. There, if the terms of a series converge to zero, the series converges (though maybe not absolutely).
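A standard example: in the 2-adic numbers the terms of the geometric series 1 + 2 + 4 + 8 + ⋯ tend to zero (the 2-adic absolute value of 2^n is 2^−n), so the series converges:

1 + 2 + 4 + 8 + ⋯ = 1/(1 − 2) = −1

a sum that would be nonsense over the real numbers.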

***

Here’s something sorta along these lines. It looks wrong, and someone might arrive at it via a wrong understanding, but it’s actually correct.

sin(x − y) sin(x + y) = (sin(x) − sin(y)) (sin(x) + sin(y))
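A quick numerical spot check of the identity in Python:

    from math import sin, isclose
    from random import random

    for _ in range(1000):
        x, y = 10 * random() - 5, 10 * random() - 5
        lhs = sin(x - y) * sin(x + y)
        rhs = (sin(x) - sin(y)) * (sin(x) + sin(y))
        assert isclose(lhs, rhs, abs_tol=1e-12)
    print("identity holds at 1000 random points")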

***

Odd integers end in odd digits, but that might not be true if you’re not working in base 10. See Odd numbers in odd bases.

***

You can misunderstand how percentages work and still get a useful result. See Sales tax included.

***

When probabilities are small, you can often get by with adding them together even when strictly speaking they don’t add. See Probability mistake can make a good approximation.

A genius can admit finding things difficult


Karen Uhlenbeck has just received the Abel Prize. Many say that the Fields Medal is the analog of the Nobel Prize for mathematics, but others say that the Abel Prize is a better analog. The Abel Prize is a recognition of achievement over a career whereas the Fields Medal is only awarded for work done before age 40.

I had a course from Karen Uhlenbeck in graduate school. She was obviously brilliant, but what I remember most from the class was her candor about things she didn’t understand. She was already famous at the time, having won a MacArthur genius award and other honors, so she didn’t have to prove herself.

When she presented the definition of a manifold, she made an offhand comment that it took her a month to really understand that definition when she was a student. She obviously understands manifolds now, having spent her career working with them.

I found her comment extremely encouraging. It shows it’s possible to become an expert in something you don’t immediately grasp, even if it takes you weeks to grok its most fundamental concept.

Uhlenbeck wasn’t just candid about things she had found difficult in the past. She was also candid about things she found difficult at the time. She would grumble in the middle of a lecture, saying things like “I can never remember this.” She was not a polished lecturer—far from it—but she was inspiring.

Related posts

(The connection between Karen Uhlenbeck, Ted Odell, and John Tate is that they were all University of Texas math faculty.)

Photo of Karen Uhlenbeck in 1982 by George Bergman [GFDL], via Wikimedia Commons

Thermocouple polynomials and other sundries

I was looking up something on the NIST (National Institute of Standards and Technology) web site the other day and ran across thermocouple polynomials. I wondered what that could be, assuming “thermocouple” was a metaphor for some algebraic property. No, it refers to physical thermocouples. The polynomials are functions for computing voltage as a function of temperature, and temperature as a function of voltage, for a variety of types of thermocouples. See the NIST ITS-90 Thermocouple Database.

I keep running into NIST’s eclectic collection of useful information. Three examples:

I wonder what’s going to take me back to NIST next.

Digital signatures with oil and vinegar

“Unbalanced oil and vinegar” is a colorful name for a cryptographic signature method. This post will give a high-level description of the method and explain where the name comes from.

The RSA encryption algorithm depends on the fact that computers can easily multiply enormous numbers, but they cannot efficiently factor the product of two enormous primes. Whenever you have something that’s easy to do but hard to undo, you might be able to make an encryption algorithm out of it.

The unbalanced oil and vinegar (UOV) digital signature algorithm is analogous to RSA in that it also depends on the difficulty of a kind of factoring. But UOV is based on the difficulty of factoring the composition of a linear and a nonlinear operator, not the product of two primes. One advantage of UOV over RSA is that UOV is quantum-resistant. That is, if large quantum computers become practical, UOV signatures will remain hard to forge (or so it is currently believed) whereas RSA signatures would be easy to forge.

Solving large systems of multivariate polynomial equations over finite fields is hard, provably NP-hard, unless there’s some special structure that makes things easier. Several proposed post-quantum digital signature algorithms are based on this, such as the LUOV variant on UOV.

The idea behind UOV is to create systems of equations that have a special structure, with some “oil” variables and some “vinegar” variables, so named because they do not mix, or rather mix in a very simple, convenient way. This special structure is kept secret, and is obscured by composition with an invertible linear operator. This operator acts like a blender, thoroughly mixing the oil and vinegar. The term “unbalanced” refers to the fact that the scheme is more secure if you do not have equal numbers of “oil” and “vinegar” variables.
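Concretely, in the standard description of the scheme, each polynomial in the secret “central map” has the form

f_k(x_1, \dots, x_v, y_1, \dots, y_o) = \sum_{i \le j} \alpha^{(k)}_{ij} x_i x_j + \sum_{i, j} \beta^{(k)}_{ij} x_i y_j + \text{(linear terms)}

where the x’s are the vinegar variables and the y’s are the oil variables. The crucial feature is that no y_i y_j terms appear: oil never meets oil. Assign random values to the vinegar variables and each equation collapses to a linear equation in the oil variables, which is easy to solve. Composing with the secret invertible linear operator hides this structure from anyone who only sees the public key.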

Polynomials over finite fields. Polynomials over finite fields everywhere!

Someone wanting to sign a file with the UOV algorithm knows the oil-and-vinegar structure and produces a vector that is mapped to a specified value, inverting the composition of the linear operator and the polynomials. They can do this because they know the factorization into this special structure. Someone wanting to verify a UOV signature only knows the (apparently unstructured) composition. They just see a large system of multivariate polynomial equations. They can stick a signature in and verify that the output is what it’s supposed to be, but they couldn’t produce a signature because they can’t invert the system. [1]

How large do these systems of polynomials need to be? On the order of a hundred equations and variables, though with more variables than polynomials. That’s not large compared to linear systems, where one can efficiently solve systems with millions of equations and variables. And the polynomials are only quadratic. So in one sense the systems are small. But it takes hundreds of kilobytes [2] to describe such systems, which makes the public keys for UOV large relative to currently popular digital signature algorithms such as ECDSA. The signatures produced by UOV are small, but the public keys are large.

Related posts

[1] The system is not invertible in the sense of being one-to-one because it’s underdetermined. By inverting the system we mean producing any input that maps to the desired output. This solution is not generally unique.

[2] Representing m quadratic polynomials in n variables over a field of size b bits requires bmn²/2 bits. So 80 quadratic polynomials in 120 variables over GF(2^8) would require 8 × 80 × 120²/2 = 4,608,000 bits = 576 kilobytes. The LUOV variation on UOV mentioned above reduces the key sizes quite a bit, but it still requires larger public keys than ECDSA.

Counting irreducible polynomials over finite fields

You can construct a finite field of order p^n for any prime p and positive integer n. The elements are polynomials modulo an irreducible polynomial of degree n, with coefficients in the integers mod p. The choice of irreducible polynomial matters, though the fields you get from any two choices will be isomorphic.

For example, the AES encryption algorithm uses the finite field GF(2^8), i.e. the finite field with 2^8 = 256 elements. Except we need to be a little careful about saying “the” field. Since we’re doing concrete calculations, the choice of irreducible polynomial matters, and AES dictates the polynomial

x^8 + x^4 + x^3 + x + 1.
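As an aside, arithmetic in this field is easy to implement. Here is a minimal sketch (not necessarily how any particular AES implementation does it) of multiplication in GF(2^8), reducing by the AES polynomial as we go. For example, 0x57 times 0x83 works out to 0xC1.

    def gf256_mul(a, b):
        """Multiply a and b as elements of GF(2^8): polynomials over GF(2)
        reduced modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
        product = 0
        for _ in range(8):
            if b & 1:
                product ^= a           # add (XOR) the current multiple of a
            b >>= 1
            carry = a & 0x80
            a = (a << 1) & 0xFF        # multiply a by x ...
            if carry:
                a ^= 0x1B              # ... and reduce, since x^8 = x^4 + x^3 + x + 1
        return product

    print(hex(gf256_mul(0x57, 0x83)))  # 0xc1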

Another example from cryptography is Galois Counter Mode (GCM), which uses the finite field GF(2^128), specifying the irreducible polynomial

x^128 + x^7 + x^2 + x + 1.

How many other irreducible polynomials could we have used to construct GF(2^8), or any other finite field for that matter? We’ll assume the leading coefficient is 1, i.e. we’ll count monic polynomials, because otherwise we can just divide by the leading coefficient.

The number of monic irreducible polynomials of degree n over a field with q elements is given by

I_q(n) = \frac{1}{n} \sum_{d | n} \mu(d) q^{n/d}

where μ is the Möbius function and the sum is over all positive integers that divide n. We can implement this function succinctly in Python.

    from sympy import mobius, divisors

    def I_q(n, q):
        # number of monic irreducible polynomials of degree n over GF(q)
        terms = [mobius(d) * q**(n // d) for d in divisors(n)]
        return sum(terms) // n

We can compute I_q(8, 2) to find out there are 30 monic irreducible polynomials of degree 8 with coefficients in GF(2), i.e. with one-bit coefficients. There are 256 monic polynomials—the coefficient of x^k can be either 0 or 1 for k = 0 … 7—but only 30 of these are irreducible. Similarly, there are 2^128 monic polynomials of degree 128 with binary coefficients, and about 2^121 of them are irreducible. Clearly it’s convenient in applications like GCM to use a polynomial of low weight, i.e. one with few non-zero coefficients.

Note that in the paragraph above we counted the number of monic irreducible polynomials with coefficients in GF(2) that we could use in constructing GF(2^8). We haven’t considered how many monic irreducible polynomials there are over GF(2^8), i.e. with coefficients not just in GF(2) but in GF(2^8). That would be a much larger number. If we call I_q(8, 256) we get 2,305,843,008,676,823,040.
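For the record, both computations mentioned above can be checked directly with the function defined earlier:

    print(I_q(8, 2))     # 30
    print(I_q(8, 256))   # 2305843008676823040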

Scaling up differential privacy: lessons from the US Census

The paper Issues Encountered Deploying Differential Privacy describes some of the difficulties the US Census Bureau has run into while deploying differential privacy for the 2020 census. It’s not surprising that they would have difficulties. It’s surprising that they would even consider applying differential privacy on such an enormous scale.

If your data project is smaller than the US Census, you can probably make differential privacy work.

Related posts