Finite rings

It occurred to me recently that I rarely hear about finite rings. I did a Google Ngram search to make sure this isn’t just my experience.

Finite group, finite ring, finite field ngram

Why are finite groups and finite fields common while finite rings are not?

Finite groups have relatively weak algebraic structure, and demonstrate a lot of variety. Finite fields have very strong algebraic structure. Their complete classification has been known for a long time and is easy to state.

I imagine that most of the references to finite groups above have to do with classifying finite groups, and that most of the references to finite fields have to do with applications of finite fields, which are many.

You can see that references to finite groups hit their peak around the time of the Feit-Thompson theorem in 1962, and drop sharply after the classification of finite simple groups was essentially done in 1994. There’s a timeline of the progress toward the classification theorem on Wikipedia.

Rings have more structure than groups, but less structure than fields. Finite rings in particular are in a kind of delicate position: they easily become fields. Wedderburn’s little theorem says every finite domain is a field.

The classification of finite rings is much simpler than that of finite groups. And in applications you often want a finite field. Even if a finite ring (not necessarily a field) would do, you’d often use a finite field anyway.

In summary, my speculation as to why you don’t hear much about finite rings is that they’re not as interesting to classify as finite groups, and not as useful in application as finite fields.

Posts on finite simple groups

Posts on finite fields

US Army applying new areas of math

Many times on this blog I’ve argued that the difference between pure and applied math is motivation. As my graduate advisor used to say, “Applied mathematics is not a subject classification. It’s an attitude.”

Uncle Sam wants homotopy type theory

Traditionally there was general agreement regarding what is pure math and what is applied. Number theory and topology, for example, are pure, while differential equations and numerical analysis are applied.

But then public key cryptography and topological data analysis brought number theory and topology over into the applied column, at least for some people. And there are people working in differential equations and numerical analysis that aren’t actually interested in applications. It would be more accurate to say that some areas of math are more directly and more commonly applied than others. Also, some areas of math are predominantly filled with people interested in applications and some are not.

The US Army is interested in applying some areas of math that you would normally think of as very pure, including homotopy type theory (HoTT).

From an Army announcement:

Modeling frameworks are desired that are able to eschew the usual computational simplification assumptions and realistically capture … complexities of real world environments and phenomena, while still maintaining some degree of computational tractability. Of specific interest are causal and predictive modeling frameworks, hybrid model frameworks that capture both causal and predictive features, statistical modeling frameworks, and abstract categorical models (cf. Homotopy Type Theory).

And later in the same announcement:

Homotopy Type Theory and its applications are such an area that is of significant interest in military applications.

HoTT isn’t the only area of math the Army announcement mentions. There are the usual suspects, such as (stochastic) PDEs, but also more ostensibly pure areas of math such as topology; the word “topological” appears 23 times in the document.

This would be fascinating. It can be interesting when a statistical model works well in application, but it’s no surprise: that’s what statistics was developed for. It’s more interesting when something finds an unexpected application, such as when number theory entered cryptography. The applications the Army has in mind are even more interesting because the math involved is more abstract and, one would have thought, less likely to be applied.

Related posts

Riffing on mistakes

I mentioned on Twitter yesterday that one way to relieve the boredom of grading math papers is to explore mistakes. If a statement is wrong, what would it take to make it right? Is it approximately correct? Is there some different context where it is correct? Several people said they’d like to see examples, so this blog post is a sort of response.

***

One famous example of this is the so-called Freshman’s Dream theorem:

(a + b)^p = a^p + b^p

This is not true over the real numbers, but it is true, for example, when working with integers mod a prime p.

(More generally, the Freshman’s Dream holds in any commutative ring of prime characteristic p. This is more than an amusing result; it’s useful in applications of finite fields.)
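
Here’s a quick sanity check of the congruence for one prime, a minimal sketch in Python:

    from itertools import product

    p = 7  # any prime will do
    for a, b in product(range(p), repeat=2):
        assert (a + b)**p % p == (a**p + b**p) % p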

***

A common misunderstanding in calculus is that a series converges if its terms converge to zero. The canonical counterexample is the harmonic series: its terms converge to zero, but the sum diverges.

But this can’t happen in the p-adic numbers. There, if the terms of a series converge to zero, the series converges (though maybe not absolutely).

***

Here’s something sorta along these lines. It looks wrong, and someone might arrive at it via a wrong understanding, but it’s actually correct.

sin(x – y) sin(x + y) = (sin(x) – sin(y)) (sin(x) + sin(y))

***

Odd integers end in odd digits, but that might not be true if you’re not working in base 10. See Odd numbers in odd bases.

***

You can misunderstand how percentages work, but still get a useful result. See Sales tax included.

***

When probabilities are small, you can often get by with adding them together even when strictly speaking they don’t add. See Probability mistake can make a good approximation.

A genius can admit finding things difficult

Karen Uhlenbeck

Karen Uhlenbeck has just received the Abel Prize. Many say that the Fields Medal is the analog of the Nobel Prize for mathematics, but others say that the Abel Prize is a better analog. The Abel prize is a recognition of achievement over a career whereas the Fields Medal is only awarded for work done before age 40.

I had a course from Karen Uhlenbeck in graduate school. She was obviously brilliant, but what I remember most from the class was her candor about things she didn’t understand. She was already famous at the time, having won a MacArthur genius award and other honors, so she didn’t have to prove herself.

When she presented the definition of a manifold, she made an offhand comment that it took her a month to really understand that definition when she was a student. She obviously understands manifolds now, having spent her career working with them.

I found her comment extremely encouraging. It shows it’s possible to become an expert in something you don’t immediately grasp, even if it takes you weeks to grok its most fundamental concept.

Uhlenbeck wasn’t just candid about things she had found difficult in the past. She was also candid about things she found difficult at the time. She would grumble in the middle of a lecture, saying things like “I can never remember this.” She was not a polished lecturer—far from it—but she was inspiring.

Related posts

(The connection between Karen Uhlenbeck, Ted Odell, and John Tate is that they were all University of Texas math faculty.)

Photo of Karen Uhlenbeck in 1982 by George Bergman [GFDL], via Wikimedia Commons

Base85 encoding

I wrote a while back about Base32 and Base64 encoding, and yesterday I wrote about Bitcoin’s Base58 encoding. For completeness I wanted to mention Base85 encoding, also known as Ascii85. Adobe uses it in PostScript and PDF files, and git uses it for encoding patches.

Like Base64, the goal of Base85 encoding is to encode binary data as printable ASCII characters. But it uses a larger set of characters, and so it can be a little more efficient. Specifically, it can encode 4 bytes (32 bits) in 5 characters.

Why 85?

There are 95 printable ASCII characters, and

log_95(2^32) = 4.87

and so it would take 5 characters to encode 4 bytes even if you used all possible printable ASCII characters. Given that you have to use 5 characters, what’s the smallest base that will still work? It’s 85 because

log_85(2^32) = 4.993

and

log_84(2^32) = 5.006.

(If you’re not comfortable with logarithms, see an alternate explanation in the footnote [1].)
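If you want to check those numbers yourself, here’s a quick sketch using Python’s math.log, which takes an optional base argument:

    from math import log

    print(log(2**32, 95))  # about 4.87
    print(log(2**32, 85))  # about 4.99
    print(log(2**32, 84))  # about 5.01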

Now Base85 is different from the other bases I’ve written about because it only works on 4 bytes at a time. That is, if you have a number larger than 4 bytes, you break it into words of 4 bytes and convert each word to Base 85.

Character set

The 95 printable ASCII characters are 32 through 126. Base 85 uses characters 33 (“!”) through 117 (“u”). ASCII character 32 is a space, so it makes sense you’d want to avoid that one. Since Base85 uses a consecutive range of characters, you can first convert a number to a pure mathematical radix 85 form, then add 33 to each digit to find its Base85 character.

Example

Suppose we start with the word 0x89255d9, equal to 143807961 in decimal.

143807961 = 2×85^4 + 64×85^3 + 14×85^2 + 18×85 + 31

and so the radix 85 representation is (2, 64, 14, 18, 31). Adding 33 to each we find that the ASCII values of the characters in the Base85 representation are (35, 97, 47, 51, 64), or (‘#’, ‘a’, ‘/’, ‘3’, ‘@’) and so #a/3@ is the Base85 encoding of 0x89255d9.
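
To make the example concrete, here’s a small sketch of that conversion in Python. The function name is mine, not part of any standard library.

    def base85_word(n):
        """Encode a 32-bit word n as five Base85 (Ascii85) characters."""
        digits = []
        for _ in range(5):
            n, r = divmod(n, 85)
            digits.append(r)
        # The digits come out least significant first, so reverse them,
        # then shift each digit by 33 to land in the '!' through 'u' range.
        return "".join(chr(d + 33) for d in reversed(digits))

    print(base85_word(0x89255d9))  # prints #a/3@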

Z85

The Z85 encoding method is also based on a radix 85 representation, but it chose to use a different subset of the 95 printable characters. Compared to Base85, Z85 adds seven characters

    v w x y z { }

and removes seven characters

    ` \ " ' _ , ;

to make the encoding work more easily with programming languages. For example, you can quote Z85 strings with single or double quotes because neither kind of quote is a valid Z85 character. And you don’t have to worry about escape sequences since the backslash character is not part of a Z85 representation.

Gotchas

There are a couple things that could trip someone up with Base85. First of all, Base 85 only works on 32-bit words, as noted above. For larger numbers it’s not a base conversion in the usual mathematical sense.

Second, the letter z can be used to denote a word consisting of all zeros. Since such words come up disproportionately often, this is a handy shortcut, though it means you can’t just divide characters into groups of 5 when converting back to binary.

Related posts

[1] 95^4 = 81450625 < 2^32 = 4294967296, so four characters from an alphabet of 95 elements is not enough to represent 2^32 possibilities. So we need at least five characters.

85^5 = 4437053125 > 2^32, so five characters is enough, and in fact it’s enough for them to come from an alphabet of size 85. But 84^5 = 4182119424 < 2^32, so an alphabet of 84 characters isn’t enough to represent 32 bits with five characters.

The Soviet license plate game and Kolmogorov complexity

Soviet license plate

Physicist Lev Landau used to play a mental game with Soviet license plates [1]. The plates had the form of two digits, a dash, two more digits, and some letters.

Rules of the game

His game was to apply high school math operators to the numbers on both sides of the dash so that the dash could be replaced by an equal sign. For example, given the license plate 44-74, one solution would be

4! + 4 = 7*4.

Note that we can insert operators, such as !, +, and *, but not more digits.

Is there a solution for every possible license plate? That depends on what class of operators you allow.

You could trivialize the game by applying the fractional part operation {x} to both sides since the fractional part of an integer is zero. You could disallow the fractional part operator on the grounds that it’s not clearly a high school math operation, or just disallow it because it makes the game uninteresting.

Universal solution

It turns out there’s a universal solution, starting with the observation that

√(n + 1) = sec arctan √n.

If one side is greater than the other by one, the formula above gives an immediate solution. For example, a solution for the license plate 89-88 would be

√89 = sec arctan √88.

If the difference is greater, the formula can be applied repeatedly. For example, we could apply the formula twice to obtain

√(n + 2) = sec arctan √(n + 1) = sec arctan sec arctan √n

and so a possible solution for 35-37 is

sec arctan sec arctan √35 = √37.
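
Here’s a quick numerical check of that solution, a sketch using only Python’s standard math module:

    from math import sqrt, atan, cos

    sec = lambda x: 1/cos(x)
    lhs = sec(atan(sec(atan(sqrt(35)))))
    print(lhs, sqrt(37))  # both print roughly 6.0828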

Kolmogorov complexity

Given that a solution is always possible, we can make the game more interesting by looking for the simplest solution. We have some intuitive idea what this means. With our example of 44-74, the first solution

4! + 4 = 7*4

is simpler than the universal solution

sec arctan sec arctan … √44 = √74

which would require applying secant and arctangent each 30 times.

The Kolmogorov complexity of an object is the length of the shortest computer program to produce the object. We could compute the Kolmogorov complexity of the functions applied to the digits on each side in order to measure how complex a solution is.

To make this precise, we need to specify what our programming language is, and that’s not as easy as it sounds. If we think of mathematics notation as a programming language, do we want to count ! as one character and arctan as 6 characters? That doesn’t seem right. If we wrote “arctan” as “atn” we would use fewer characters without producing a different solution.

Complexity of Python code

To make things more objective, we could look at the length of actual computer programs rather than imagining math notation to be a programming language. Say we choose Python. Then here are a couple functions that compute our two solutions for license plate 44-74.

    from math import sqrt, cos, atan

    def f():
        # The universal solution: apply sec(arctan(.)) to sqrt(44) thirty times,
        # which works out to sqrt(74).
        sec = lambda x: 1/cos(x)
        y = sqrt(44)
        for _ in range(30):
            y = sec(atan(y))
        return y

    def g():
        # The same value, computed directly.
        return sqrt(74)

We could measure the complexity of our functions f and g by counting the number of characters in each one. But there still are difficulties.

What about the import statement? It should count toward the length of f because f uses everything imported, but g could have used a shorter statement that only imported sqrt. More fundamentally, are we cheating by even importing a library?

Furthermore, the two functions above don’t produce exactly the same output due to limited precision. We could imagine that our imported functions are infinitely precise, but then we’re not really using Python but rather an idealized version of Python.

And what about the loop? That introduced new digits, 3 and 0, and so violates the rules of Landau’s game. So should we unroll the loop before calculating complexity?

Thought experiment

Kolmogorov complexity is a very useful concept, but it’s more of a thought experiment than something you can practically compute. We can imagine the shortest program to compute something, but we can seldom know that we’ve actually found such a program. All we can know in practice is upper bounds.

In theory you can enumerate all Turing machines of a given length, or all Python programs of a given length, and find the shortest one that does a given task [2], but the list grows exponentially with length.

However, it is possible to compute the length of particular programs if we deal with some of the complications mentioned above. We could make Landau’s game a two-player game by seeing who could come up with the simpler solution in a fixed amount of time.

Back to Landau

If we allow sine and degree in our set of operators, there’s a universal solution due to B. S. Gorobets. For n ≥ 6, n! is a multiple of 360, and so

sin (n!)° = 0.

And if n is less than 6, its two-digit representation begins with zero, so we can multiply the digits to get zero.
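
A quick check of the factorial claim, as a sketch:

    from math import factorial

    # n! is a multiple of 360 for every n >= 6, so sin(n! degrees) = 0.
    print(all(factorial(n) % 360 == 0 for n in range(6, 50)))  # True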

If we disallow transcendental functions, we block Gorobets’ trick and we have functions whose length we can objectively measure in a programming language.

Related posts

[1] Harun Šiljak. Soviet Street Mathematics: Landau’s License Plate Game. Mathematical Intelligencer, https://doi.org/10.1007/s00283-017-9743-9

[2] In addition to having to test an exponentially growing set of programs, there’s the problem of knowing if even one program will complete.

If you run a program and see it complete, then you know it completes! If you don’t see it complete, it might just need more time, or it might never finish. The halting problem theorem says that there’s no program that can tell whether another arbitrary program will terminate. You still might be able to determine whether a particular program terminates, but you might not.

Photo credit CC BY-SA 3.0 via Wikimedia Commons
Russian caption: rear automobile license plate of the USSR, 1936 pattern

Asteroids named after mathematicians

asteroid image

This evening I stumbled on the fact that John von Neumann and Fibonacci both have asteroids named after them. Then I wondered how many more famous mathematicians have asteroids named after them. As it turns out, most of them: Euclid, Gauss, Cauchy, Noether, Gödel, Ramanujan, … It’s easier to look at the names that are not assigned to asteroids.

I wrote a little script to search for the 100 greatest mathematicians (according to James Allen’s list) and look for names missing from a list of named asteroids. Here are the ones that were missing, in order of Allen’s list.

So of the top 100 list of mathematicians, 80 have asteroids named after them and the 20 above do not.

Alexandre Grothendieck is the greatest mathematician—11th on James Allen’s list—not to have an asteroid named in his honor.

Sine of a googol

How do you evaluate the sine of a large number in floating point arithmetic? What does the result even mean?

Sine of a trillion

Let’s start by finding the sine of a trillion (10^12) using floating point arithmetic. There are a couple ways to think about this. The floating point number t = 1.0e12 can only be represented to 15 or 16 significant decimal figures (to be precise, 53 bits [1]), and so you could think of it as a representative of the interval of numbers that all have the same floating point representation. Any result that is the sine of a number in that interval should be considered correct.

Another way to look at this is to say that t can be represented exactly—its binary representation requires 40 bits and we have 53 bits of significand precision available—and so we expect sin(t) to return the sine of exactly one trillion, correct to full precision.

It turns out that IEEE arithmetic does the latter, computing sin(1e12) correctly to 16 digits.

Here’s the result in Python

    >>> sin(1.0e12)
    -0.6112387023768895

and verified by Mathematica by computing the value to 20 decimal places

    In:= N[Sin[10^12], 20]
    Out= -0.61123870237688949819

Range reduction

Note that the result above is not what you’d get if you were first to take the remainder when dividing by 2π and then taking the sine.

    >>> sin(1.0e12 % (2*pi))
    -0.6112078499756778

This makes sense. The result of dividing a trillion by the floating point representation of 2π, 159154943091.89536, is correct to full floating point precision. But most of that precision is in the integer part; the fractional part only has about five digits of precision, and so we should expect the result above to be correct to at most five digits. In fact it’s accurate to four digits.

When your processor computes sin(1e12) it does not naively take the remainder by 2π and then compute the sine. If it did, you’d get four significant figures rather than 16.

We started out by saying that there were two ways of looking at the problem, and according to the first one, returning only four significant figures would be quite reasonable. If we think of the value t as a measured quantity, measured to the precision of floating point arithmetic (though hardly anything can be measured that accurately), then four significant figures would be all we should expect. But the people who designed the sine function implementation on your processor did more than they might be expected to do, finding the sine of a trillion to full precision. They do this using a range reduction algorithm that retains far more precision than naively doing a division. (I worked on this problem a long time ago.)
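
One way to see this for yourself is to redo the range reduction in extended precision. Here’s a minimal sketch using the third-party mpmath library (assuming it is installed):

    from math import sin
    import mpmath

    mpmath.mp.dps = 50            # work with 50 decimal digits
    x = mpmath.mpf(10)**12
    r = x % (2*mpmath.pi)         # high-precision range reduction
    print(mpmath.sin(r))          # -0.61123870237688949...
    print(sin(1.0e12))            # -0.6112387023768895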

Sine of a googol?

What if we try to take the sine of a ridiculously large number like a googol (10^100)? This won’t work because a googol cannot be represented exactly as a floating point number (i.e. as an IEEE 754 double). It’s not that it’s too big; floating point numbers can be as large as around 10^308. The problem is that a number that big cannot be represented to full precision. But we can represent 2^333 exactly, and a googol is between 2^332 and 2^333. And amazingly, Python’s sine function (or rather the sine function built into the AMD processor on my computer) returns the correct result to full precision.

    >>> sin(2**333)
    0.9731320373846353

How do we know this is right? I verified it in Mathematica:

    In:= Sin[2.0^333]
    Out= 0.9731320373846353

How do we know Mathematica is right? We can do naive range reduction using extended precision, say 200 decimal places.

    In:= p = N[Pi, 200]
    In:= x = 2^333
    In:= theta = x - IntegerPart[x/(2 p)] 2 p
    Out= 1.8031286178334699559384196689…

and verify that the sine of the range reduced value is correct.

    In:= Sin[theta]
    Out= 0.97313203738463526…

Interval arithmetic interpretation

Because floating point numbers have 53 bits of precision, all real numbers between 2^56 - 2^2 and 2^56 + 2^2 are represented as the floating point number 2^56. This is a range of width 8, wider than 2π, and so the sines of the numbers in this range cover the possible values of sine, i.e. [-1, 1]. So if you think of a floating point number as a sample from the set of all real numbers with the same binary representation, every possible value of sine is a valid return value for numbers bigger than 2^56. But if you think of a floating point number as representing itself exactly, then it makes sense to ask for the sine of numbers like 2^333 and larger, up to the limits of floating point representation [2].
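
For example, here’s a quick illustration in a Python REPL that distinct integers near 2^56 collapse to the same floating point number:

    >>> float(2**56 + 3) == float(2**56)
    True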

Related posts

[1] An IEEE 754 double has 52 explicit significand bits, but these can represent 53 bits of precision because the leading bit of a normalized significand is always 1, and so it does not need to be stored. See more here.

[2] The largest exponent of an IEEE double is 1023, and the largest significand is 2 - 2^-52 (i.e. all 1s), so the largest possible value of a double is

(2^53 - 1) × 2^(1024-53)

and in fact the Python expression sin((2**53 - 1)*2**(1024-53)) returns the correct value to 16 significant figures.

Sine sum

Sam Walters posted something interesting on Twitter yesterday that I hadn’t seen before:

If for some reason your browser doesn’t render the embedded tweet, he points out that

-\frac{1}{7} < \sum_{n=1}^N \sin(n) < 2

for all positive integers N.

Here’s a little Python script to illustrate the sum in his tweet.

    import numpy as np
    import matplotlib.pyplot as plt

    # partial sums sin(1) + sin(2) + ... + sin(n) for n up to 99
    x = np.arange(1, 100)
    y = np.sin(x).cumsum()
    plt.plot(x, y)
    # dashed lines at the bounds -1/7 and 2
    plt.plot((1, 100), (-1/7, -1/7), "g--")
    plt.plot((1, 100), (2, 2), "g--")
    plt.savefig("sinsum.svg")

Exponential sums

Exponential sums are interesting because crude application of the triangle inequality won’t get you anywhere. All it will let you conclude is that the sum is between –N and N.

(Why is this an exponential sum? Because it’s the imaginary part of the sum of e^{in} over n.)

For more on exponential sums, you might like the book Uniform Distribution of Sequences.

Also, I have a page that displays the plot of a different exponential sum each day, the coefficients in the sum

\sum_{n=0}^N \exp\left( 2\pi i \left( \frac{n}{m} + \frac{n^2}{d} + \frac{n^3}{y} \right ) \right )

being taken from the day’s date. Because consecutive numbers have very different number theoretic properties, the images vary quite a bit from day to day.
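
Here’s a rough sketch of how such a plot could be generated. Taking m, d, and y to be the month, day, and two-digit year is my reading of the formula, not necessarily the code behind that page.

    import numpy as np
    import matplotlib.pyplot as plt

    m, d, y = 12, 25, 19   # hypothetical date: December 25 of a year ending in 19
    n = np.arange(1000)
    z = np.exp(2j*np.pi*(n/m + n**2/d + n**3/y))
    s = z.cumsum()         # the partial sums trace out the curve
    plt.plot(s.real, s.imag)
    plt.axis("equal")
    plt.savefig("expsum.svg")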

Here’s a sneak peek at what the exponential sum for Christmas will be this year.

Geometry of an oblate spheroid

We all live on an oblate spheroid [1], so it could be handy to know a little about oblate spheroids.

Eccentricity

Conventional notation uses a for the equatorial radius and c for the polar radius. Oblate means a > c. The eccentricity e is defined by

e = \sqrt{1 - \frac{c^2}{a^2}}

For a perfect sphere, a = c and so e = 0. The eccentricity for earth is small, e = 0.08. The eccentricity increases as the polar radius becomes smaller relative to the equatorial radius. For Saturn, the polar radius is 10% less than the equatorial radius, and e = 0.43.

Volume

The volume of an oblate spheroid is simple:

V = \frac{4}{3}\pi a^2 c

Clearly we recover the volume of a sphere when a = c.

Surface area

The surface area is more interesting. A spheroid is a surface of revolution, and so its area can easily be derived. It works out to be

A = 2\pi a^2 + \pi \frac{c^2}{e} \log\left(\frac{1 +e}{1 - e} \right)

It’s not immediately clear that we get the area of a sphere as c approaches a, but it becomes clear when we expand the log term in a Taylor series.

A = 2\pi a^2 + 2\pi c^2\left( 1 + \frac{e^2}{3} + \frac{e^4}{5} + \frac{e^6}{7} + \cdots \right)

This suggests that an oblate spheroid has approximately the same area as a sphere with radius √((a² + c²)/2), with error on the order of e².

If we set a = 1 and let c vary from 0 to 1, we can plot how the surface area varies.

plot c vs area

Here’s the corresponding plot where we use the eccentricity e as our independent variable rather than the polar radius c.

plot of area as a function of eccentricity
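
Here’s a rough sketch of how plots like these could be produced; this is my own code, not necessarily what generated the figures above.

    import numpy as np
    import matplotlib.pyplot as plt

    def area(a, c):
        """Surface area of an oblate spheroid with equatorial radius a >= polar radius c."""
        e = np.sqrt(1 - c**2/a**2)
        return 2*np.pi*a**2 + np.pi*(c**2/e)*np.log((1 + e)/(1 - e))

    a = 1.0
    c = np.linspace(0.01, 0.99, 200)  # stop short of c = a, where e = 0 makes the formula 0/0
    plt.plot(c, area(a, c))
    plt.xlabel("polar radius c")
    plt.ylabel("surface area")
    plt.savefig("spheroid_area.svg")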

Related posts

[1] Not in a yellow submarine as some have suggested.