How big is the monster?

Voyager 1 photo of Jupiter. Jan. 6, 1979

Symmetries are captured by mathematical groups. And just as you can combine kinds symmetry to form new symmetries, you can combine groups to form new groups.

So-called simple groups are the building blocks of groups as prime numbers are the building blocks of integers [1].

Finite simple groups have been fully classified, and they fall into several families, with 26 exceptions that fall outside any of these families. The largest of these exceptional groups is called the monster.

The monster is very large, containing approximately 8 × 1053 elements. I saw a video by Richard Boucherds where he mentioned in passing that the number of elements in the group is a few orders of magnitude larger than the number of atoms in earth.

I tried to find a comparison that is closer to 8 × 1053 and settled on the number of atoms in Jupiter.

The mass of Jupiter is about 2 × 1027 kg. The planet is roughly 3/4 hydrogen and 1/4 helium by mass, and from that you can calculate that Jupiter contains about 1054 atoms.

Before doing my calculation with Jupiter, I first looked at lengths such as the diameter of the Milky Way in angstroms. But even the diameter of the observable universe in angstroms is far too small, only about 9 × 1036.

More on finite simple groups

[1] The way you build up groups from simple groups is more complicated than the way you build integers out of primes, but there’s still an analogy there.

The photo at the top of the post was taken by Voyager 1 on January 6, 1979.

Group symmetry of copula operations

You don’t often see references to group theory in a statistics book. Not that there aren’t symmetries in statistics that could be described in terms of groups, but this isn’t often pointed out.

Here’s an example from An Introduction to Copulas by Roger Nelsen.

Show that under composition the set of operations of forming the survival copula, the dual of a copula, and the co-copula of a given copula, along with the identity (i.e., ^, ~, *, and i) yields the dihedral group.

Nelsen gives the following multiplication table for copula operations.

    o | i ^ ~ *
    i | i ^ ~ *
    ^ | ^ i * ~
    ~ | ~ * i ^
    * | * ~ ^ i

The rest of this post explains what a copula is and what the operations above are.

What is a copula?

At a high level, a copula is a mathematical device for modeling the dependence between random variables. Sklar’s theorem says you can express the joint distribution of a set of random variables in terms of their marginal distributions and a copula. If the distribution functions are continuous, the copula is unique.

The precise definition of a copula is technical. We’ll limit ourselves to copulas in two dimensions to make things a little simpler.

Let I be the unit interval [0, 1]. Then a (two-dimensional) copula is a function from I × I to I  that satisfies

\begin{align*} C(0, v) &= 0\\ C(u, 0) &= 0\\ C(u, 1) &= u\\ C(1, v) &= v \end{align*}

and is 2-increasing.

The idea of a 2-increasing function is that “gradients point northeast.” Specifically, for all points (x1, y1) and (x2, y2) with x1x2 and y1y2, we have

C(x_2, y_2) - C(x_2, y_1) - C(x_1, y_2) + C(x_1, y_1) \,\geq\, 0

The definition of copula makes no mention of probability, but the 2-increasing condition says that C acts like the joint CDF of two random variables.

Survival copula, dual copula, co-copula

For a given copula C, the corresponding survival copula, dual copula, and co-copula are defined by

\begin{align*} \hat{C}(u, v) &= u + v - 1 + C(1-u, 1-v) \\ \tilde{C}(u, v) &= u + v - C(u,v) \\ C^*(u,v) &= 1 - C(1-u, 1-v) \end{align*}


The reason for the name “survival” has to do with a survival function, i.e. complementary CDF of a random variable. The survival copula is another copula, but the dual copula and co-copulas aren’t actually copulas.

This post hasn’t said much too about motivation or application—that would take a lot more than a short blog post—but it has included enough that you could verify that the operations do compose as advertised.

Update: See this post for more algebraic structure for copulas, a sort of convolution product.

Sharing secrets with polynomials

This post will present a couple ways to share secrets using polynomials. We have a group of n people who want to share a secret between them so that k of them will have to cooperate in order to unlock the secret. For example, maybe a committee of n = 5 wants to require the cooperation of at least k = 3 members.

Shamir’s method

Adi Shamir came up with the idea of using polynomials to share secrets as follows. First, encode the secret you want to share as an integer a0. Next, generate m = k-1 other random integers a1 through am and use these as coefficients of a polynomial f of degree m:

f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_mx^m

A trusted party generates n random integers values of x and gives each person an x and the corresponding value of f(x). Since m+1 points completely determine a mth degree polynomial, if k = m+1 people share their data, they can recover f, and thus recover the secret number a0. This can be efficiently, for example, by using Lagrange interpolation. But with fewer than k data points, the polynomial remains undetermined.

In practice we’d work over the integer modulo a large prime p. While fewer than k data points will not let someone completely determine the polynomial f, it will narrow down the possible coefficients if we’re working over the integers. Working modulo a large prime instead reveals less information.

Verifiable secret sharing

There’s a possible problem with Shamir’s method. Maybe the trusted party made a mistake. Or maybe the trusted party was dishonest and shouldn’t have been trusted. How can the parties verify that they have been given valid data without unlocking the secret? Seems we’re at a logical impasse since you’d have to recover the polynomial to know if your points are on the polynomial.

Paul Feldman came up with a way to assure the participants that the secret can be unlocked without giving them the information to unlock it. The trick is to give everyone data that in principle would let them determine the polynomial, but in practice would not.

We choose a large prime p such that p-1 has a large prime factor q [1]. Then the multiplicative group of non-zero integers mod p has a subgroup of order q. Let g be a generator of that group. The idea is to let everyone verify that

y_i = f(x_i)

for their given (xi, yi) by letting them verify that

g^{y_i} = g^{f(x_i)}

where all calculations are carried out mod p. Our trusted party does this by computing

A_i \equiv g^{a_i}\pmod{p}

for each coefficient ai and letting everyone know g and each of the Ai‘s.

In principle, anyone could solve for a0 if they know A0. But in practice, provided q is large enough, this would not be possible because doing so would require solving the discrete logarithm problem, which is computationally difficult. It’s possible to compute discrete logarithms for small q, but the difficulty goes up quickly as q gets larger.

How do the Ai‘s let everyone verify that their (xi, yi) data is correct?

Each person can verify that

g^{y_i} = \prod_{j=0}^m A_j^{x_i^j} 

using the public data and their personal data, and so they can verify that

g^{y_i} = \prod_{j=0}^m A_j^{x_i^j} = \prod_{j=0}^m g^{a_j x_i^j} = g^{f(x_i)} 

Related posts

[1] Conceptually you pick p‘s until you find one so that p-1 has a large prime factor q. In practice, you’d do it the other way around: search for large primes q until you find one such that, say, 2q + 1 is also prime.

Addition on Curve1174

I’ve written about elliptic curve and alluded to the fact that there’s a special kind of addition for points on the curve. But I haven’t gone into details because it’s more complicated than I wanted to get into.

However, there’s a special case where the details are not complicated, the so called Edwards curves. I’ll look briefly at Edwards curves in general, then focus on Curve1174, a particular Edwards curve used in cryptography.

The example here could be used in an introductory group theory course with no reference to elliptic curves. Just think of it as a funny way to add pairs of integers.

Addition on Edwards curves

For a particular class of elliptic curve, Edwards curves, the addition formula is simpler than usual. As mentioned a few days ago, an Edwards curve has the form

x² + y² = 1 + d x² y²

where d is not 0 or 1 in the underlying finite field. Then addition on the curve is given by

(x_1,y_1) + (x_2,y_2) = \left( \frac{x_1 y_2 + x_2 y_1}{1 + dx_1 x_2 y_1 y_2}, \frac{y_1 y_2 - x_1 x_2}{1 - dx_1 x_2 y_1 y_2} \right)

When d is a square, there are some exceptions. When d is not a square, as will be the case in our application, the denominators are never zero, and so the formula above is all there is to the addition rule.

Note that the division in the formula above is division in the underlying finite field, i.e. multiplication by the multiplicative inverse.


We’re interested in Curve1174, a particular elliptic curve used in cryptography. The underlying field is GF(p), the integers modulo the prime p = 2251 – 9. Also, d = -1174, from whence the curve takes its name.

Plotting Curve1174 over the reals

The plot above shows what Curve1174 looks like over the real numbers, though we’re interested in the curve over the integers mod p. (By the way, if d > 0 you get a curve that looks like a squircle.)

We consider the pairs of integers (x, y) that lie on the curve, i.e. those that satisfy

x² + y² = 1 + d x² y² mod p.

You can show that the sum of two points on the curve is another point on the curve, if you define addition with the formula above. The identity element for addition is the pair (0, 1). The additive inverse of a point (xy) is the point (-xy). So we have a group. Addition is commutative, and so in fact we have an Abelian group.

Python code

We can implement addition on Curve1174 in a few lines of Python.

from sympy import mod_inverse

def divide(a, b, p):
    "Compute a/b in GF(p)"
    return (a*mod_inverse(b, p))%p

def group_add(x1, y1, x2, y2, p, d):
    x3 = divide(x1*y2 + x2*y1, 1 + d*x1*x2*y1*y2, p)
    y3 = divide(y1*y2 - x1*x2, 1 - d*x1*x2*y1*y2, p)
    return (x3, y3)

The only thing we needed SymPy for was the mod_inverse function. It wouldn’t take much work to write your own mod_inverse function from scratch using the method outlined here using a variation on the Euclidean algorithm.

It’s clear that (1, 0) is a point on the curve, and so we can add it to itself with the code

p = 2**251 - 9
d = -1174
print(group_add(1, 0, 1, 0, p, d))

and find that it equals

(0, 3618502788666131106986593281521497120414687020801267626233049500247285301238),

which may come as a bit of a surprise. Arithmetic here is not intuitive; it scrambles up points well, which hints at why the curve is useful in cryptography.

Let’s find another point on the curve. Let’s set x = 2019 and see what y is. When we come up with the equation y must satisfy, the Jacobi symbol shows there is no solution.

When x = 2025 there is a solution, and we can compute it using sqrt_mod from sympy.ntheory.

x = 2025
k = divide(1 - x*x, 1 - d*x*x, p)
y = sqrt_mod(k, p)

This says the point

(2025, 588747530266665079407582947937120321357732884331117971504880828350684014295)

is on Curve1174. And since x and y only appear as squares in the equation defining the curve, once we find an (x, y) pair on the curve, the points (±x, ±y) are also on the curve.

Just for grins, let’s double the point (xy) above, i.e. add it to itself. This works out to

(2795920935947049934301363619759082573282734750667150130900931245990107374027,  2577351770662637935098262284237063829290291047539093190165388036658162531660).

Number of points on Curve1174

In general it can be hard to compute how many point lie on an elliptic curve, but in the case of Curve 1174 the number of points is known. Bernstein et al computed that the number of points on Curve1174 is p + 1 – t where t is a big number, but much smaller than p, on the order of the square root of p. Specifically,

t = 45330879683285730139092453152713398836.

Why not just absorb the 1 into t? This was done to match the notation in Hasse’s theorem. See the footnote here.

Elliptic curve cryptography (ECC)

What does all this have to do with cryptography? Cryptographers like to find problems that can be computed easily but that are hard to reverse. Most public key cryptography methods depend on the difficulty of undoing one of three things:

  • multiplication,
  • modular exponentiation, or
  • multiplication over an elliptic curve.

RSA encryption, for example, depends on the difficulty of factoring the product of two large primes.

The elliptic curve discrete logarithm problem (ECDLP) is the problem of undoing multiplication over an elliptic curve. If n is an integer and P is a point on the curve, we can compute QnP easily. If n is large, we don’t just add P to itself n times. Instead we double it log2n times and add the necessary intermediate results, analogous to fast exponentiation.

It’s easy to compute Q given n and P, but it’s hard to compute n given P and Q. This is the elliptic curve discrete logarithm problem that EEC protocols rely on for their security.

More elliptic curve cryptography posts

Groups of order 2019

How many groups have 2019 elements? What are these groups?

2019 is a semiprime, i.e. the product of two primes, 3 and 673. For every semiprime s, there are either one or two distinct groups of order s.

As explained here, if spq with pq, all groups of order s are isomorphic if q is not a factor of p-1. If q does divide p-1 then there are exactly two non-isomorphic groups of order s, one abelian and one non-abelian. In our case, 3 does divide 672, so there are two groups of order 2019. The first is easy: the cyclic group of order 2019. The second is more complex.

You could take the direct product of the cyclic groups of order 3 and 673, but that turns out to be isomorphic to the cyclic group of order 2019. But if you take the semidirect product you get the other group of order 2019, the non-abelian one.

Semidirect products

Starting with two groups G and H, the direct product G × H is the Cartesian product of G and H with multiplication defined componentwise. The semidirect product of G and H, written [1]

G \rtimes H

also starts with the Cartesian product of G and H but defines multiplication differently.

In our application, G will be the integers mod 673 with addition and H will be a three-element subgroup of the integers mod 673 with multiplication [2]. Let H be the set {1, 255, 417} with respect to multiplication [3]. Note that 1 is its own inverse and 255 and 417 are inverses of each other.


Now the group product of (g1, h1) and (g2, h2) is defined to be

(g1 + h1-1g2, h1 h2)

So, for example, the product of (5, 255) and (334, 417) is (5 + 417*334, 255*417) which reduces to (645, 1) working mod 673.

(We haven’t defined the semidirect product in general, but the procedure above suffices to define the semidirect product for any two groups of prime order, and so it is sufficient to find all groups of semiprime order.)

Note that our group is non-abelian. For example, if we reverse the order of multiplication above we get (263, 1).


The identity element is just the pair consisting of the identity elements from G and H. In our case, this is (0, 1) because 0 is the additive identity and 1 is the multiplicative identity.


The inverse of an element (gh) is given by


So, for example, the inverse of (600, 255) is (444, 417).

Python code

The goal of this post is to be concrete rather than general.

So to make everything perfectly explicit, we’ll write a little Python code implementing the product and inverse.

    def hinv(h):
        if h == 255:
            return 417
        if h == 417:
            return 255
        if h == 1:
            return 1
    def prod(x, y):
        g1, h1 = x
        g2, h2 = y
        g3 = (g1 + hinv(h1)*g2) % 673
        h3 = (h1*h2) % 673
        return (g3, h3)
    def group_inv(x):
        g, h = x
        return ((-g*h)%673, hinv(h))
    x = (5, 255)
    y = (334, 417)
    print(prod(x, y))
    print(prod(y, x))
    print(group_inv((600, 255)))

The following code verifies that our group satisfies the axioms of a group.

    from itertools import product
    identity = (0, 1)
    h_list = [1, 255, 417]
    def elem(x):
        g, h = x
        g_ok = (0 <= g <= 672)
        h_ok = (h in h_list)
        return (g_ok and h_ok)
    group = product(range(673), h_list)
    assert(len([g for g in group]) == 2019)
    # closed under multiplication    
    for x in group:
        for y in group:
            assert( elem(prod(x, y)) )
    # multiplication is associative
    for x in group:
        for y in group:
            for z in group:
                xy_z = prod(prod(x, y),z)
                x_yz = prod(x, prod(y,z))
                assert(xy_z == x_yz)
    # identity acts like it's supposed to
    for x in group:
        assert( prod(x, identity) == x )
        assert( prod(identity, x) == x )
    # every element has an inverse
    for x in group:
        ginv = group_inv(x)
        assert( elem(ginv) )
        assert( prod(x, ginv) == identity )
        assert( prod(ginv, x) == identity )

Related posts

[1] The symbol for semidirect product is ⋊. It’s U+22CA in Unicode and \rtimes in LaTeX.

[2] In general, the semidirect product depends on a choice of an action of the group H on the group G. Here the action is multiplication by an element of H. Different actions can result in different groups. Sometimes the particular choice of action is made explicit as a subscript on the ⋊ symbol.

[3] How did I find these numbers? There are 672 non-zero numbers mod 673, so I picked a number, it happened to be 5, and raised it to the powers 672/3 and 2*672/3.


Groups of semiprime order

For each prime p, there is only one group with p elements, the cyclic group with that many elements. It would be plausible to think there is only one group of order n if and only if n is prime, but this isn’t the case.

If p and q are primes, then there are ostensibly at least two groups of order pq: the cyclic group Zpq, and Zp + Zq, the direct sum of the cyclic groups of orders p and q. However, there may just be one group of order pq after all because the two groups above could be isomorphic.

If pq = 2, then Z4 and Z2 + Z2 are not isomorphic. But the groups Z15 and Z3 + Z5 are isomorphic. That is, there is only one group of order 15, even though 15 is composite. This is the smallest such example.

Let p and q be primes with pq. If q does not divide p-1, then there is only one group of order pq. That is, all groups of order pq are isomorphic to the cyclic group Zpq. So when p = 5 and q = 3, there is only one group of order 15 because 3 does not evenly divide 5-1 = 4. The same reasoning shows, for example, that there must only be one group with 77 elements because 7 does not divide 10.

Now if q does divide p-1, then there are two distinct groups of order pq. One is the cyclic group with pq elements. But the other is non-Abelian, and so it cannot be Zp + Zq. So once again Zpq is isomorphic to Zp + Zq, but there’s a new possibility, a non-Abelian group.

Note that this does not contradict our earlier statement that Z4 and Z2 + Z2 are different groups, because we assumed p > q. If pq, then Zpq is not isomorphic to Zp + Zq.

Related posts


Number of groups of prime power order

John Baez left a comment on my post on group statistics saying

It’s known that the number of groups of order p^n for prime p is p^{2n^3/27+O(n^(8/3))}. It might be fun to compare this to what Mathematica says.

Here goes. First let’s let p = 2. Mathematica’s FiniteGroupCount function tops out at n = 10. And for the range of values Mathematica does support, 2^{2n^3/27} grossly overestimates the number of groups.

    Table[{FiniteGroupCount[2^n], 2^((2./27) n^3)}, {n, 1, 11}]

    {{1, 1.05269}, {2, 1.50795}, {5, 4.}, {14, 26.7365}, 
    {51, 612.794}, {267, 65536.}, {2328, 4.45033*10^7}, 
    {56092, 2.61121*10^11}, {10494213, 1.80144*10^16}, 
    {49487365422, 1.98847*10^22}, {FiniteGroupCount[2048], 4.7789*10^29}}

OEIS has entries for the sizes of various groups, and the entry for powers of 2 confirms the asymptotic formula above. Maybe it’s not a good approximation until n is very large. (Update: Here’s a reference for the asymptotic expression for all p.)

The OEIS entry for number of groups of order powers of 5 has an interesting result:

For a prime p >= 5, the number of groups of order p^n begins 1, 1, 2, 5, 15,
61 + 2*p + 2*gcd (p – 1, 3) + gcd (p – 1, 4),
3*p^2 + 39*p + 344 + 24*gcd(p – 1, 3) + 11*gcd(p – 1, 4) + 2*gcd(p – 1, 5), …

We can duplicate this with Mathematica.

    Table[FiniteGroupCount[5^n], {n, 0, 6}]

    {1, 1, 2, 5, 15, 77, 684}

and the last two numbers match the calculations given in OEIS.

There’s something interesting going on with Mathematica. It doesn’t seem to know, or agree with, the formula above for groups of order p4. For example,

    Table[FiniteGroupCount[7^n], {n, 0, 6}]

    {1, 1, 2, 5, FiniteGroupCount[2401], 83, 860}

I get similar results when I use larger primes: it can’t handle the fourth power.

    Table[FiniteGroupCount[389^n], {n, 0, 6}]

    {1, 1, 2, 5, FiniteGroupCount[22898045041], 845, 469548}

The results for n = 5 and 6 agree with OEIS.

Is OEIS wrong about the number of groups of order  p4 or should Mathematica simply return 15 but there’s a gap in the software?

Also, does anybody know why the agreement with the asymptotic formula above is so bad? It’s just as bad or worse for other primes that I tried.

More algebra posts

Group statistics

I just ran across FiniteGroupData and related functions in Mathematica. That would have made some of my earlier posts easier to write had I used this instead of writing my own code.

Here’s something I find interesting. For each n, look at the groups of order at most n and count how many are Abelian versus non-Abelian. At first there are more Abelian groups, but the non-Abelian groups soon become more numerous. Also, the number of Abelian groups grows smoothly, while the number of non-Abelian groups has big jumps, particularly at powers of 2.

Counting Abelian and non-Abelian groups

Here’s the Mathematica code:

    fgc = FoldList[Plus, 0, Table[FiniteGroupCount[n], {n, 1, 300}]]
    fga = FoldList[Plus, 0, Table[FiniteAbelianGroupCount[n], {n, 1, 300}]]
    ListLogPlot[ {fgc - fga, fga }, 
        PlotLegends -> {"Non-Abelian", "Abelian"}, 
        Joined -> True, 
        AxesLabel -> {"order", "count"}]

I see the plot legend on my screen, but when saving the plot to a file the legend wasn’t included. Don’t know why. (Update: See footnote [1]). The jagged blue curve is the number of non-Abelian groups of size up to n. The smooth gold curve is the corresponding curve for Abelian groups.

Here’s the same plot carried out further to show the jumps at 512 and 1024.

Counting Abelian and non-Abelian groups

More group theory posts

[1] Someone from Wolfram Research saw this post and sent me a fix:

pl = ListLogPlot[...]
Export["~/Desktop/img.png", pl]

Overlap in the classification of finite simple groups

The previous post defined the groups PSL(nq) where n is a positive integer and q is a prime power. These are finite simple groups for n ≥ 2 except for PSL(2, 2) and PSL(2, 3).

Overlap among PSL(nq)

There are a couple instances where different values of n and q lead to isomorphic groups: PSL(2, 4) and PSL(2, 5) are isomorphic, and PSL(2, 7) and PSL(3, 2) are isomorphic. These are the only instances [1].

With the exceptions stated above, distinct values of n and q lead to distinct groups. Is it possible for different choices of n and q to lead to groups of the same size, even though the groups are not isomorphic to each other? Yes, PSL(3, 4) and PSL(4, 2) both have order 20160, but the groups are not isomorphic. This is the only example [2].

Overlap between PSL and alternating groups

The first post in this series mentioned that for n ≥ 5, the alternating group An, the group of even permutations on a set of n elements, is a simple group. Three of the alternating groups are isomorphic to PSL groups:

  • PSL(2, 4) = PSL(2, 5) = A5
  • PSL(2, 9) = A6
  • PSL(4, 2) = A8

Here “=” really means isomorphic. We mentioned PSL(4, 2) above. It has the same order as PSL(3, 4). This means that A8 and PSL(3, 4) have the same order but are not isomorphic.

I suspect that with a small number of exceptions, the order of a finite simple group determines the group. I haven’t proven that, but numerical exploration suggests its true. This page lists non-Abelian finite simple groups of order less than 10 billion, and there are only seven orders that correspond to more than 1 group, the largest example being order 25,920.

One last overlap

There is only one other duplication in the lists of groups in the CFSG theorem, and that is PSU(4, 2) = PSp(4, 3). I haven’t written about these groups yet.


[1] See The Finite Simple Groups by Robert A. Wilson

[2] In fact, aside from the groups mentioned in this post, the orders of all the finite simple groups are unique except for two non-isomorphic families that have orders: PΩ2n+1(q) and PSp2n(q) for n ≥ 3 and odd prime powers q. See discussion on Math Overflow.

Orders of finite simple groups

Simple groups are to groups as prime numbers are to numbers.A simple group has no non-trivial normal subgroups, just as a prime number has no non-trivial factors.


Finite simple groups have been classified into five broad categories:

  1. Cyclic groups of prime order
  2. Alternating groups
  3. Classical groups
  4. Exceptional groups of Lie type
  5. Sporadic groups.

Three of these categories are easy to describe.

The cyclic groups of prime order are simply the integers mod p where p is prime. These are the only Abelian finite simple groups.

The alternating groups are even-order permutations of a set. These groups are simple if the set it permutes has at least 5 elements.

The sporadic groups are a list of 26 groups that don’t fit anywhere else. The other categories are infinite families of groups, but the sporadic groups are just individual groups.

The classical groups and exceptional Lie groups are harder to describe. I’d like to write about them in some detail down the road. For now, I’ll be deliberately vague.

This post is a broad overview, and may be the first of more posts in a series. This post just looks at the sizes (orders) of the groups.

Update: Here’s a follow-on post that looks at the groups denoted An(q).

Smallest group in each family

You can find a list of the families of finite simple groups on Wikipedia, along with their orders (the number of elements in each group). We can use this to determine the smallest group in each family, just to get an idea of how these families spread out.

The classical groups and exceptional Lie groups that I glossed over depend on a parameter n and/or a parameter q where q is a prime power. Even for very small values of n and q, the smallest ones for which the groups are simple, some of these groups are BIG.

| Family           |               Order of smallest simple group |
| Cyclic(p)        |                                            2 |
| Alternating(n)   |                                           60 |
| A_n(q)           |                                           60 |
| Sporadic         |                                         7920 |
| B_n(q)           |                                        25920 |
| Suzuki(2^(2n+1)) |                                        29120 |
| 2A_n(q^2)        |                                       126000 |
| C_n(q)           |                                      1251520 |
| G_2(q)           |                                      4245696 |
| 2F_4(q)'         |                                     17971200 |
| D_n(q)           |                                    174182400 |
| 3D_n(q^2)        |                                    197406720 |
| 3D_4(q^3)        |                                    211341312 |
| 2G_2(q)          |                                  10073444472 |
| F_4(q)           |                             3311126603366400 |
| 2E_6(q^2)        |                      76532479683774853939200 |
| E_6(q)           |                     214841575522005575270400 |
| 2F_4(q)          |                     264905352699586176614400 |
| E_7(q)           |     7997476042075799759100487262680802918400 |
| E_8(q)           | 33780475314363480626138819061408559507999169 |

Never mind the cryptic family names for now; I may get into these in future posts. My point here is that for some of these families, even the smallest member is quite large.

Interestingly, the smallest sporadic group has a modest size of 7920. But the largest sporadic group, “The Monster,” has order nearly 1054. You could think of each sporadic group of being a lonely family of one, so I’ll list their orders here. (There are groupings within the sporadic groups, but I’m not clear how much these are grouped together for historical reasons (i.e. who discovered them) or mathematical reasons. I expect there’s a blurry line between the historical and mathematical since the groups discovered by an individual were amenable to that person’s techniques.)

| Group |                         Order of smallest simple group |
| M_11  |                                                   7920 |
| M_12  |                                                  95040 |
| J_1   |                                                 175560 |
| M_22  |                                                 443520 |
| J_2   |                                                 604800 |
| M_23  |                                               10200960 |
| HS    |                                               44352000 |
| J_3   |                                               50232960 |
| M_24  |                                              244823040 |
| McL   |                                              898128000 |
| He    |                                             4030387200 |
| Ru    |                                           145926144000 |
| Suz   |                                           448345497600 |
| O'N   |                                           460815505920 |
| Co_3  |                                           495766656000 |
| Co_2  |                                         42305421312000 |
| Fi_22 |                                         64561751654400 |
| HN    |                                        273030912000000 |
| Ly    |                                      51765179004000000 |
| Th    |                                      90745943887872000 |
| Fi_23 |                                    4089470473293004800 |
| Co_1  |                                    4157776806543360000 |
| J_4   |                                   86775571046077562880 |
| Fi_24 |                              1255205709190661721292800 |
| B     |                     4154781481226426191177580544000000 |
| M     | 808017424794512875886459904961710757005754368000000000 |

Groups of order less than a million

In 1972, Marshall Hall published a list of the (non-Abelian) finite simple groups of order less than one million. Hall said that there were 56 such groups known, and now that the classification theorem has been completed we know he wasn’t missing any groups.

There are 78,498 primes less than one million, so there are that many cyclic (Abelian) finite groups of order less than one million. In the range of orders one to a million, Abelian simple groups outnumber non-Abelian simple groups by over 1400 to 1. Of the 56 non-Abelian orders, 46 belong to groups of the form An(q). Most of the families in the table above don’t make an appearance since their smallest representatives have order much larger than one million.

There can be multiple non-isomorphic groups with the same order, especially for small orders. This is another detail I may get into in future posts.

Growth of group orders

Now let’s look at plots to see how the size of the groups grow. Because these numbers quickly get very large, all plots are on a log scale.

In case you have difficulty seeing the color differences, the legends are in the same vertical order as the plots.

Some of the captions list two or three groups. That is because the curves corresponding to the separate groups are the same at the resolution of the image.

Here are groups that only depend on a parameter n.

Here are groups that only depend on a parameter q. The q‘s vary over prime powers: 2, 3, 4, 5, 7, 8, 9, 11, etc. Note that the horizontal axis is not q itself but the index of q, i.e. i on the horizontal scale corresponds to the ith prime power.

For the groups that depend on n and q, we vary these separately. First we hold n fixed at 5 and let q vary.

Finally, we now fix q at 5 and let n vary.

Looking up group by size

Imagine writing a program that takes the size n of a finite simple group and tells you what groups it could be. What would such a program look like?

Since the vast majority of finite simple groups below a given size are cyclic, we’d check whether n is prime. If it is, then the group is the integers mod n. What about the non-Abelian groups? These are so thinly spread out that the simplest thing to do would be to make a table of the possible values less than the maximum program input size N. If we set N = 10,000,000,000, then David Madore has already been done here. There are only 491 possible orders for non-Abelian simple groups of order less than 10 billion.

We said above that Abelian groups outnumbered non-Abelian groups by over 1400 to 1 for simple groups of order less than one million. What about for simple groups of order less than 10 billion? The number of primes less than 10 billion is 455,052,511 and so Abelian simple groups outnumber non-Abelian simple groups by a little less than a million to one.


This post began by saying simple groups are analogous to prime numbers. The Abelian finite simple groups are exactly integers modulo prime numbers. The orders of the non-Abelian finite simple groups are spread out much more sparsely than prime numbers.

Also, the way you combine simple groups to make general groups is more complicated than the way you multiply prime numbers to get all integers. There was speculation that group theory would die once finite simple groups were classified, but that has not been the case. Unfortunately (or fortunately if you’re a professional group theorist) the classification theorem doesn’t come close to telling us everything we’d want to know about groups.