The original Room square

A few days ago I wrote about Room squares, squares named after Thomas Room. This post will be about Room’s original square.

You could think of a Room square as a tournament design in which the rows represent locations and the columns represent rounds (or vice versa). Every team plays every other team exactly once, and only one pair of teams can play each other at the same location in the same round.

Here is an image creating from Room’s original square using the visualization approach from the earlier post.

This is not a trivial variation on the square in the earlier post. Notice that this square has a block of four contiguous squares in the middle row and middle column. The previous image does not.

Here is Room’s original square, presented as in his half-page announcement in [1] with the rows and columns labeled with binary numbers.

A plain text version of the image above is available below [2].

Room points out an interesting pattern: a cell is filled in if and only if the “dot product” of its coordinate is odd. Think of each coordinate number as three dimensional vector and take the dot product of the vector representing the row and the vector representing the column.

Room didn’t refer to dot products. In his words

(αβγ, λμν) are the coordinates of one of the points marked rs in and only if αλ + βμ + γν is odd.

Related posts

[1] T. G. Room. A new type of magic square. Mathematical Gazette. Volume 39 (1955), p. 307.

[2] Here is Room’s square as a plain text (org-mode) table.

|-----+-----+-----+-----+-----+-----+-----+-----|
| 001 |  12 |     |  34 |     |  56 |     |  78 |
| 010 |     |  37 |  25 |     |     |  48 |  16 |
| 011 |  47 |  15 |     |     |  38 |  26 |     |
| 100 |     |     |     |  68 |  14 |  57 |  23 |
| 101 |  58 |     |  67 |  24 |     |  13 |     |
| 110 |     |  46 |  18 |  35 |  27 |     |     |
| 111 |  36 |  28 |     |  17 |     |     |  45 |
|-----+-----+-----+-----+-----+-----+-----+-----|
|     | 001 | 010 | 011 | 100 | 101 | 110 | 111 |
|-----+-----+-----+-----+-----+-----+-----+-----|

Costas arrays

The famous n queens problem is to find a way to position n queens on a n×n chessboard so that no queen attacks any other. That is, no two queens can be in the same row, the same column, or on the same diagonal. Here’s an example solution:

Costas arrays

In this post we’re going to look at a similar problem, weakening one requirement and adding another. First we’re going to remove the requirement that no two pieces can be on the same diagonal. This turns the n queens problem into the n rooks problem because rooks cannot move diagonally.

Next, imagine running stiff wires from every rook to every other rook. We will require that no two wires have the same length and run in the same direction. in more mathematical terms, we require that the displacement vectors between all the rooks are unique.

A solution to this problem is called a Costas array.  In 1965, J. P. Costas invented what are now called Costas arrays to solve a problem in radar signal processing.

Why do we call a solution a Costas array rather than a Costas matrix? Because a matrix solution can be described by recording for each column the row number of the occupied cell. For example, we could describe the eight queens solution above as

(2, 4, 6, 8, 3, 1, 7, 5).

Here I’m numbering rows from the bottom up, like points in the first quadrant, rather than top to bottom as one does with matrices.

Note that the solution to the eight queens problem above is not a Costas array because some of the displacement vectors between queens are the same. For example, if we number queens from left to right, the displacement vectors between the first and second queens is the same as the displacement vector between the second and third queens.

Visualizing a Costas array

Here’s a visualization of a Costas array of order 5.

It’s clear from the plot that no two dots are in the same row or column, but it’s not obvious that all the connecting lines are different. To see that the lines are different, we move all the tails of the connecting lines to the origin, keeping their length and direction the same.

There are 10 colored lines in the first plot, but at first you may only see 9 in the second plot. That’s because two of the lines have the same direction but different length. If you look closely, you’ll see that there’s a short purple line on top of the long red line. That is, one line runs from the origin to (1, -1) and another from the origin to (4, -4).

Here is a visualization of a Costas array of order 9.

And here are its displacement vectors translated to the origin.

Here is a visualization of a Costas array of order 15.

And here are its displacement vectors translated to the origin.

Generating Costas arrays

There are algorithms for generating some Costas arrays, but not all. Every known algorithm [1] leaves out some solutions, and it is not know whether Costas arrays exist for some values of n.

The Costas arrays above were generated using the Lempel construction algorithm. Given a prime p [2] and a primitive root mod p [3], the following Python code will generate a Costas array of order p – 2.


    p = 11 # prime
    a = 2 # primitive element

    # verify a is a primitive element mod p
    s = {a**i % p for i in range(p)}
    assert( len(s) == p-1 )

    n = p-2
    dots = []

    for i in range(1, n+1):
        for j in range(1, n+1):
            if (pow(a, i, p) + pow(a, j, p)) % p == 1:
                dots.append((i,j))
                break

Related posts

[1] Strictly speaking, no scalable algorithm will enumerate all Costas arrays. You could enumerate all permutation matrices of order n and test whether each is a Costas array, but this requires generating and testing n! matrices and so is completely impractical for moderately large n.

[2] More generally, the Lempel algorithm can generate a solution of order q-2 where q is a prime power. The code above only works for primes, not prime powers. For prime powers, you have to work with finite fields of order q and the code would be more complicated.

[3] A primitive root for a finite field of order q is a generator of the multiplicative group of the field, i.e. an element x such that every non-zero element of the field is some power of x.

Balanced tournament designs

Suppose you have an even number of teams that you’d like to schedule in a Round Robin tournament. This means each team plays every other team exactly once.

Denote the number of teams as 2n. You’d like each team to play in each round, so you need n locations for the games to be played.

Each game will choose 2 teams from the set of 2n teams, so the number of games will be n(2n – 1). And this means you’ll need 2n – 1 rounds. A tournament design will be a grid with n rows, one for each location, and 2n-1 columns, one for each round.

Let’s pause there for just a second and make sure this is possible. We can certainly assign n(2n – 1) games to a grid of size n by 2n-1. But can we do it in such a way that no team needs to be in two locations at the same time? It turns out we can.

Now we’d like to introduce one more requirement: we’d like to balance the games over the locations. By that we mean we’d like a team to play no more than twice in the same location. A team has to play at least once in each location, but not every team can play twice in each location. If every team played once in each location, we’d have n² games, and if every team played twice in each location we’d have 2n² games, but

n² < n(2n – 1) < 2n²

for n greater than 1. If n = 1, this is all trivial, because in that case we would have two teams. They simply play each other and that’s the tournament!

We can’t have each team play exactly the same number of times in each location, but can we have each team play either once or twice in each location? If so, we call that a balanced tournament design.

Now the question becomes for which values of n can we have a balanced tournament design involving 2n teams. This is called a BTD(n) design.

We said above that this is possible if n = 1: two teams just play each other somewhere. For n = 2, it is not possible to have a balanced tournament design: some team will have to play all three games in the same location.

It is possible to have a balanced tournament design for n = 3. Here’s a solution:

{{af, ce, bc, de, bd}, {be, df, ad, ac, cf}, {cd, ab, ef, bf, ae}}

In fact this is the solution aside from relabeling the teams. That is, given any balanced tournament design involving six teams, there is a way to label the teams so that you have the design above. Said another way, there is one design in BTD(3) up to isomorphism.

There are balanced tournament designs for all positive n except for n = 2. And in general there are a lot of them. There are 47 designs in BTD(4) up to isomorphism [1].

Related posts

[1] The CRC Handbook of Combinatorial Designs. Colbourn and Dinitz editors. CRC Press, 1996.

 

Generating functions for polynomial sequences

The previous post looked at a generating function for a specific polynomial sequence. This post will look at generating functions for polynomial sequences in general. (There’s an alternating term in the previous post that isn’t polynomial, but we’ll address that too.)

The starting point for this post is a simple observation:

x\left(\frac{d}{dx} x^n \right) = n x^n

If we let xD be the operator that differentiates a function then multiplies the result by x, we have

(xD) x^n = n x^n

We can apply xD m times, each time multiplying xn by a factor of n.

(xD)^m x^n = n^m x^n

And more generally, for any polynomial p(x) we have

p(xD) x^n = p(n) x^n

Now let S be a set of integers and form a generating function F(x) by summing xn over n in S.

p(xD) x^n = p(n) x^n

Then we have

\begin{align*} p(xD)F(x) &= p(xD)\sum_S x^n \\ &= \sum_S p(xD)x^n \\ &= \sum_S p(n)x^n \end{align*}

In words, multiplying the nth term of a generating function by p(n) is the same as operating on the generating function by p(xD).

Example

The previous post computed the generating function of

Z_n = \frac{(-1)^n(3n+6) + 2n^3 + 12n^2 + 25n - 6}{12}

using Mathematica. Here we will compute the generating function again using the result derived below.

Before we computed

g(x) = \sum_{n=1}^\infty Z_nx^n

by summing over the positive integers. But Zn is not quite a polynomial function of n. Aside from the alternating term it is a cubic polynomial in n. The alternating term is a polynomial in n if we restrict ourselves to even values of n, and it is another polynomial if we restrict ourselves to odd values of n.

Define

\begin{align*} F_1(x) &= \frac{2n^3 + 12n^2 + 25n - 6}{12} \\ F_2(x) &= \frac{(3n+6)}{12} \\ F_3(x) &= -\frac{(3n+6)}{12} \\ \end{align*}

Then we have

\sum_{n} Z_nx^n = \sum_{n} F_1(n)x^n + \sum_{n \text{ even}} F_2(n)x^n + \sum_{n \text{ odd}} F_3(n)x^n

for positive integer n, splitting our original generating function into three generating functions, each summed over a different set of integers.

Define

\begin{align*} g_1(x) &= \sum_{n > 1} x^n = \frac{x}{1-x}\\ g_2(x) &= \sum_{n > 1,\, n \text{ even}}^\infty x^n = \frac{x^2}{1 - x^2}\\ g_3(x) &= \sum_{n > 1,\, n \text{ odd}}^\infty x^n = \frac{x}{1 - x^2}\\ \end{align*}

Then

g(x) = F_1(xD)g_1(x) + F_3(xD)g_3(x) + F_3(xD)g_3(x)

If we expand the line above, we should get the same expression for g(x) as in the previous post.

Sampling with replacement until you’ve seen everything

Suppose you have a standard deck of 52 cards. You pull out a card, put it back in the deck, shuffle, and pull out another card. How long would you expect to do this until you’ve seen every card?

Here’s a variation on the same problem. Suppose you’re a park ranger keeping data on tagged wolves in a national park. Each day you catch a wolf at random, collect some data, and release it. If there are 100 wolves in the park, how long until you’ve seen every wolf at least once?

We’ll solve this problem via simulation, then exactly.

Simulation

The Python code to solve the problem via simulation is straight forward. Here we’ll simulate our ranger catching and releasing one of 100 wolves. We’ll repeat our simulation five times to get an idea how much the results vary.

    import numpy as np
    from numpy.random import default_rng

    rng = default_rng()
    num_runs = 5
    N = 100 # number of unique items

    for run in range(num_runs):
        tally = np.zeros(N,dtype=int)
        trials = 0
        while min(tally) == 0:
            tally[rng.integers(N)] += 1
            trials += 1
        print(trials)

When I ran this, I got 846, 842, 676, 398, and 420.

Exact solution

Suppose we have a desk of N cards, and we sample the deck with replacement M times. We can select the first card N ways. Ditto for the second, third, and all the rest of the cards. so there are NM possible sequences of card draws.

How many of these sequences include each card at least once? To answer the question, we look at Richard Stanley’s twelvefold way. We’re looking for the number of ways to allocate M balls into N urns such that each urn contains at least one ball. The answer is

N! S(M, N)

where S(M, N) is the Stirling number of the second kind with parameters M and N [1].

So the probability of having seen each card at least once after selecting M cards randomly with replacement is

N! S(M, N) / NM.

Computing Stirling numbers

We’ve reduced our problem to the problem of computing Stirling numbers (of the second kind), so how do we do that?

We can compute Stirling numbers in terms of binomial coefficients as follows.

S(n,k) = \frac{1}{k!}\sum_{i = 0}^k (-1)^i {k \choose i} (k-i)^n

Now we have a practical problem if we want to use this formula for larger values of n and k. If we work with floating point numbers, we’re likely to run into overflow. This is the perennial with probability calculations. You may need to compute a moderate-sized number as the ratio of two enormous numbers. Even though the final result is between 0 and 1, the intermediate results may be too large to store in a floating point number. And even if we don’t overflow, we may lose precision because we’re working with an alternating sum.

The usual way to deal with overflow is to work on a log scale. But that won’t work for Stirling numbers because we’re adding terms together, not multiplying them.

One way to solve our problem—not the most efficient way but a way that will work—is to work with integers. This is easy in Python because integers by default can be arbitrarily large.

There are two ways to compute binomial coefficients in Python: scipy.special.binom and math.comb. The former uses floating point operations and the latter uses integers, so we need to use the latter.

We can compute the numerator of our probability with

    from math import comb
    def k_factorial_stirling(n, k):
        return sum((-1)**i * comb(k, i)*(k-i)**n for i in range(k+1))    

Then our probability is

    k_factorial_stirling(M, N) / N**M

If we compute the probability that our ranger has seen all 100 wolves after catching and releasing 500 times, we get 0.5116. So the median number of catch and release cycles is somewhere around 500, which is consistent with our simulation above.

Note that the numerator and denominator in our calculation are both on the order of 101000 while the largest floating point number is on the order of 10308, so we would have overflowed if we’d used binom instead of comb.

Related posts

[1] It’s unfortunate that these were called the “second kind” because they’re the kind that come up most often in application.

Image: “Fastening a Collar” by USFWS Mountain Prairie is licensed under CC BY 2.0 .

Self-Orthogonal Latin Squares

The other day I wrote about orthogonal Latin squares. Two Latin squares are orthogonal if the list of pairs of corresponding elements in the two squares contains no repeats.

A self-orthogonal Latin square (SOLS) is a Latin square that is orthogonal to its transpose.

Here’s an example of a self-orthogonal Latin square:

    1 7 6 5 4 3 2
    3 2 1 7 6 5 4
    5 4 3 2 1 7 6
    7 6 5 4 3 2 1
    2 1 7 6 5 4 3
    4 3 2 1 7 6 5
    6 5 4 3 2 1 7

To see that this Latin square is orthogonal to itself, we’ll concatenate the square and its transpose.

    11 73 65 57 42 34 26 
    37 22 14 76 61 53 45 
    56 41 33 25 17 72 64 
    75 67 52 44 36 21 13 
    24 16 71 63 55 47 32 
    43 35 27 12 74 66 51 
    62 54 46 31 23 15 77

Each pair of digits is unique.

Magic squares

As we discussed in the earlier post, you can make a magic square out of a pair of orthogonal Latin squares. If we interpret the pairs of digits above as base 10 numbers, we have a magic square because each row, column, and diagonal has the digits 1 through 7 in the 1s place and the same set of digits in the 10s place.

Typically an n × n magic square is filled with the numbers 1 through n². We can make such a magic square with a few adjustments.

First, we subtract 1 from every element of our original square before we combine it with its transpose. This gives us the following.

    00 62 54 46 31 23 15 
    26 11 03 65 50 42 34 
    45 30 22 14 06 61 53 
    64 56 41 33 25 10 02 
    13 05 60 52 44 36 21 
    32 24 16 01 63 55 40 
    51 43 35 20 12 04 66 

If we interpret the elements above as base 7 numbers, then the entries are the numbers 0 through 66seven which equals 48ten. If we add 1 to every entry we get all the numbers from 1 through 100seven = 49ten. Written in base 10, this is the following.

     1 45 40 35 23 18 13 
    21  9  4 48 36 31 26 
    34 22 17 12  7 44 39 
    47 42 30 25 20  8  3 
    11  6 43 38 33 28 16 
    24 19 14  2 46 41 29 
    37 32 27 15 10  5 49 

Possible sizes

It’s surprising that orthogonal Latin squares exist as often as they do. Self-orthogonal Latin squares are more rare, and yet they exist for all sizes except n = 2, 3, or 6. (Source: The CRC Handbook of Combinatorial Designs.)

It’s hard to prove that self-orthogonal Latin squares exist, but it’s easy to verify the sizes where they don’t exist. There is no self-orthogonal Latin square of size 6 because there is no pair of orthogonal Latin squares of size 6. The latter is a hard theorem to prove, but given it, the former is trivial.

There are orthogonal Latin squares of size 3, but no self-orthogonal Latin squares of size 3. And it’s not hard to see why. Suppose there were such a square. Without loss of generality we can label the top row with 1, 2, and 3. Let x be the entry directly below 1.

    1 2 3
    x * *
    * * *

What could x be? Not 1, because elements in a column of a Latin square are unique. It can’t be 2 either, because otherwise when you join the square and its transpose would have two 22s. So x must be 3, and the element below x must be 2. But then when you join the square and its transpose you have two 32s. There’s not enough wiggle room for a 3 × 3 Latin square to be self-orthogonal.

Greco-Latin squares and magic squares

Suppose you create an n × n Latin square filled with the first n letters of the Roman alphabet. This means that each letter appears exactly once in each row and in each column.

You could repeat the same exercise only using the Greek alphabet.

Is it possible to find two n × n Latin squares, one filled with Roman letters and the other filled with Greek letters, so that when you combine corresponding entries, each combination of Roman and Greek letters appears exactly once? If so, the combination is called a Greco-Latin square.

Whether Greco-Latin squares of size n exist depends on n. But before we explore that, let’s give an example.

Here are two Latin squares, one filled with the first four Roman letters and the other filled with the first four Greek letters.

    A B C D   α β γ δ
    D C B A   γ δ α β
    B A D C   δ γ β α
    C D A B   β α δ γ

If we concatenate the two matrices, we get

    Aα Bβ Cγ Dδ   
    Dγ Cδ Bα Aβ   
    Bδ Aγ Dβ Cα   
    Cβ Dα Aδ Bγ   

and each of the two-letter entries is unique. So we’ve shown that a Greco-Latin square exists for n = 4.

Mutually Orthogonal Latin Squares (MOLS)

The more modern name for Greco-Latin squares is “mutually orthogonal Latin squares,” abbreviated MOLS. We say two Latin squares M and N are mutually orthogonal if the list of pairs (Mij, Nij,) contains no duplicates.

Euler’s work

Euler showed how to construct Greco-Latin squares when n is odd and when n is a multiple of 4. He conjectured that no Greco-Latin squares exist when n = 4k + 2.

His conjecture was incorrect. For example, there is a Greco-Latin square of size 10. And in fact, Greco-Latin squares exist for all n except n = 2 and n = 6.

Magic squares

Suppose you have two orthogonal Latin squares M and N of size n, and suppose that the diagonal elements of both squares contain no duplicates.

Use the numbers 0 through n-1 rather than Greek or Latin letters to fill the squares and interpret the Latin squares as matrices. Then the matrix

nM + N

is a magic square. This is equivalent to taking the corresponding Greco-Latin square and interpreting the entries as base n numbers.

For example, using the Latin squares above, replace A and α with 0, B and β with 1, C and γ with 2, and D and γ with 3. Then the Greco-Roman square

    Aα Bβ Cγ Dδ   
    Dγ Cδ Bα Aβ   
    Bδ Aγ Dβ Cα   
    Cβ Dα Aδ Bγ   

becomes

    00 11 22 33   
    32 23 10 01   
    13 02 31 20   
    21 30 03 12   

with the entries being base 4 numbers. The rows and columns clearly have the same sum because they all have the same set of numbers in the 1s place and in the 4s place. Written in base 10, the magic square above is

     0  5 10 15   
    14 11  4  1   
     7  2 13  8   
     9 12  3  6   

It’s conventional for magic squares to be filled with consecutive numbers starting with 1, so you could add 1 to every entry above if you’d like.

Related posts

Latin squares and 3D chess

In a n×n Latin square, each of the numbers 1 through n appears exactly once in each row and column. For example, the 5 × 5 square below is a Latin square.

12345, 21534, 34251, 45123, 53412

If we placed a rook on each square numbered 1, the rooks would not attack each other since no two rooks would be in the same row or the same column. The same would be true if we moved all the rooks to squares numbered 2, or 3, etc.

Now imagine a n×n×n chess cube, a stack of n chessboards, each board being n×n. For example, we could have a stack of eight standard 8×8 boards.

In our 3D chess cube, rooks can move any number of squares in either the x or y direction within a board, and they can also move vertically in the z direction.

Put a rook on every square of the bottom board. Then take the rooks on squares numbered 2 and move them to the second level. Take the rooks on squares numbered 3 and move them to the third level, and so forth.

Now we have n×n rooks, and none is in a position to attack the other. To see this, pick a particular level k. None of the rooks on level k are attacking each other because level k is a Latin square. And none of the rooks can attack vertically because we started with all the rooks on the bottom level and lifted them up by various amounts; there’s only one rook in each vertical column.

Next let’s suppose we have n×n rooks arranged in 3D so that none is attacking any of the others. Label the rooks on level k with a k. Now push all the rooks straight down vertically to the first level. There can only be one rook on each square because no rooks were attacking each other vertically.

Number each square by the number of its rook. I claim the result is a Latin square. There can only be one k in each row and column because all the ks started out on level k, and none were attacking each other in the x or y direction.

Related posts

How many Latin squares are there?

12345, 21534, 34251, 45123, 53412

A Latin square is an n × n grid with a number from 1 to n in each cell, such that no number appears twice in a row or twice in a column.

It’s not obvious that Latin squares exist for all n, but they do, and in fact there are a lot of them. The exact number is known only for n ≤ 11. See [1] for the known values. There are upper and lower bounds for all n, and this post will look at how good the bounds are.

A particularly simple result is that number of Latin squares of size n is bounded below by the superfactorial of n [2]. That is, if Ln is the number of Latin squares of size n and the superfactorial of n is defined by

S(n) = 1! × 2! × 3! × … × n!

then

LnS(n).

A reduced Latin square is a Latin square in which the elements of the first row and first column are in numerical order. The image at the top of the post is a reduced Latin square. If Rn is the number of reduced Latin squares of size n then

Ln = n! (n-1)! Rn

and so we can divide the lower bound on Sn by n! (n-1)! to get a lower bound on Rn:

RnS(n-2).

Ronald Alter [2] gives the following upper bound on Rn:

R_n \leq \prod_{k=1}^{n-1} (n-k)^{k-1}(n-k)!

Here’s now the lower bound, exact value, and upper bound compare for n up to 6.

    |---+-------+-------+---------|
    | n | lower | exact |   upper |
    |---+-------+-------+---------|
    | 1 |     1 |     1 |       1 |
    | 2 |     1 |     1 |       1 |
    | 3 |     1 |     1 |       2 |
    | 4 |     2 |     4 |      24 |
    | 5 |    12 |    56 |    3456 |
    | 6 |   288 |  9408 | 9553280 |
    |---+-------+-------+---------|

The numbers get very big for larger n. so let’s switch over to a log scale.

Here’s a plot corresponding to the logarithms of the values above, including all known values of Rn.

The lower and upper bounds are remarkably symmetric about the exact value, which suggests that their average would make a good estimate. Let’s look at a plot.

Indeed, the average of the logs of the bounds is very close to the log of the actual value. This says the number of reduced Latin squares of size n is approximately the geometric mean of its upper and lower bounds, at least for n up to 11.

We can factor S(n-2) out of the upper bound on Rn when computing the geometric mean, and this gives us the approximation

R_n \approx S(n-2)\sqrt{(n-1)!} \,\,\prod_{k=1}^{n-1} (n-k)^{(k-1)/2}

References

[1] OEIS A000315

[2] Ronald Alter. The American Mathematical Monthly. Vol. 82, No. 6 (Jun. – Jul., 1975), pp. 632-634

Change of basis and Stirling numbers

Polynomials form a vector space—the sum of two polynomials is a polynomial etc.—and the most natural basis for this vector space is powers of x:

1, x, x², x³, …

But the power basis is not the only possible basis, and often not the most useful basis in application.

Falling powers

In some applications the falling powers of x are a more useful basis. For positive integers n, the nth falling power of x is defined to be

x^{\underbar{\small{\emph{n}}}) = x(x-1)(x-2)\cdots(x-n+1)

Falling powers come up in combinatorics, in the calculus of finite differences, and in hypergeometric functions.

Change of basis

Since we have two bases for the vector space of polynomials, we can ask about the matrices that represent the change of basis from one to the other, and here’s where we see an interesting connection.

The entries of these matrices are numbers that come up in other applications, namely the Stirling numbers. You can think of Stirling numbers as variations on binomial coefficients. More on Stirling numbers here.

In summation notation, we have

\begin{align*} x^{\underbar{\small{\emph{n}}}} &= \sum_{k=0}^n S_1(n,k)x^{\text{\small{\emph{k}}}} \\ x^{\text{\small{\emph{n}}}} &= \sum_{k=0}^n S_2(n,k)x^{\underbar{\small\emph{n}}} \\ \end{align*}

where the S1 are the (signed) Stirling numbers of the 1st kind, and the S2 are the Stirling numbers of the 2nd kind.

(There are two conventions for defining Stirling numbers of the 1st kind, differing by a factor of (-1)n-k.)

Matrix form

This means the (ij)th element of matrix representing the change of basis from the power basis to the falling power basis is S1(i, j) and the (i, j)th entry of the matrix for the opposite change of basis is S2(i, j). These are lower triangular matrices because S1(i, j) and S2(i, j) are zero for j > i.

These are infinite matrices since there’s no limit to the degree of a polynomial. But if we limit our attention to polynomials of degree less than m, we take the upper left m by m submatrix of the infinite matrix. For example, if we look at polynomials of degree 4 or less, we have

\begin{bmatrix} x^\underbar{\tiny{0}} \\ x^\underbar{\tiny{1}} \\ x^\underbar{\tiny{2}} \\ x^\underbar{\tiny{3}} \\ x^\underbar{\tiny{4}} \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 \\ 0 & 2 & -3 & 1 & 0 \\ 0 & -6 & 11 & -6 & 1 \\ \end{bmatrix} \begin{bmatrix} x^\text{\tiny{0}} \\ x^\text{\tiny{1}} \\ x^\text{\tiny{2}} \\ x^\text{\tiny{3}} \\ x^\text{\tiny{4}} \\ \end{bmatrix}

to convert from powers to falling powers, and

\begin{bmatrix} x^\text{\tiny{0}} \\ x^\text{\tiny{1}} \\ x^\text{\tiny{2}} \\ x^\text{\tiny{3}} \\ x^\text{\tiny{4}} \\ \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 1 & 3 & 1 & 0 \\ 0 & 1 & 7 & 6 & 1 \\ \end{bmatrix} \begin{bmatrix} x^\underbar{\tiny{0}} \\ x^\underbar{\tiny{1}} \\ x^\underbar{\tiny{2}} \\ x^\underbar{\tiny{3}} \\ x^\underbar{\tiny{4}} \\ \end{bmatrix}

going from falling powers to powers.

Incidentally, if we filled a matrix with unsigned Stirling numbers of the 1st kind, we would have the change of basis matrix going from the power basis to rising powers defined by

x^{\overline{n}} = x(x+1)(x+2)\cdots(x+n+1)

It may be hard to see, but there’s a bar on top of the exponent n for rising powers whereas before we had a bar under the n for falling powers.

Related posts