Cover time of a graph: cliques, chains, and lollipops

Cover time

The cover time of a graph is the expected time it takes a simple random walk on the graph to visit every node. A simple random walk starts at some node, then at each step moves to one of the adjacent nodes, chosen with equal probability. The cover time is defined to be the maximum over all starting points of the expected time to visit all nodes.

It seems plausible that adding edges to a graph would decrease its cover time. Sometimes this is the case and sometimes it isn’t, as we’ll demonstrate below.

Chains, complete graphs, and lollipops

This post will look at three kinds of graphs

  1. a chain of n vertices
  2. a complete graph on n vertices
  3. a “lollipop” with n vertices

A chain simply connects vertices in a straight line. A complete graph connects every node to every other node.

A lollipop graph of order (5, 5)

A lollipop with n vertices takes a chain with n/2 vertices and a complete graph on n/2 vertices and joins them together. The image above is a lollipop graph of order 10. The complete graph becomes a clique in the new graph.

The image looks more like a kite than a lollipop because that’s the way my software (networkx) drew it by default. If you’d like, imagine the complete graph more round and the chain more straight so that the graph more closely resembles a lollipop.

Random walks on graphs

Chains have a long cover time. Complete graphs have a short cover time. What about lollipops?

A plausible answer would be that a lollipop graph would have a moderate cover time, somewhere between that of a complete graph and a chain. But that’s not the case. In fact, the lollipop has a longer cover time than either the complete graph or the chain.

You could think of a lollipop graph with n vertices as a chain of length n with extra edges added. This shows that adding edges does not always decrease the cover time.

The cover times for the three graphs we consider here are of different orders: O(n log n) for the complete graph, O(n²) for the chain, and O(n³) for the lollipop.

Now these are only asymptotic results. Big-O notation leaves out proportionality constants. We know that for sufficiently large n the cover times are ordered so that the complete graph is the smallest, then the chain, then the lollipop. But does n have to be astronomically large before this happens? What happens with a modest size n?

There’s a theoretical result that says the cover time is bounded by 2m(n-1) where m is the number of edges and n the number of vertices. This tells us that when we say the cover time for the chain is O(n²) we can say more, that it’s less than 2n², so at least in this case the proportionality constant isn’t large.

We’ll look at the cover times with simulation results below for n = 10 and n = 30, each based on 10,000 runs.
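
Here’s a minimal sketch of the kind of simulation involved, using NetworkX to construct the graphs. The fixed starting node is a simplification: strictly speaking the cover time takes the maximum of the expected time over all starting nodes.

    import random
    import networkx as nx

    def estimate_cover_time(G, start, num_reps=10_000):
        # Average number of steps for a walk from `start` to visit every node
        nodes = set(G.nodes())
        total = 0
        for _ in range(num_reps):
            loc, visited, steps = start, {start}, 0
            while visited != nodes:
                loc = random.choice(list(G.neighbors(loc)))
                visited.add(loc)
                steps += 1
            total += steps
        return total / num_reps

    n = 10
    graphs = {
        "complete": nx.complete_graph(n),
        "chain": nx.path_graph(n),
        "lollipop": nx.lollipop_graph(n // 2, n // 2),
    }
    for name, G in graphs.items():
        print(name, estimate_cover_time(G, 0))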

With 10 nodes each, the cover times for the complete graph, chain, and lollipop were 23.30, 82.25, and 131.08 respectively.

With 30 nodes the corresponding cover times were 114.37, 845.16, and 3480.95.

So the cover times were ordered as the asymptotic theory would suggest.

Rapidly mixing random walks on graphs

Random walks mix faster on some graphs than on others. Rapid mixing is one of the ways to characterize expander graphs.

By rapid mixing we mean that a random walk approaches its limiting (stationary) probability distribution quickly relative to random walks on other graphs.

Here’s a graph that supports rapid mixing. Pick a prime p and label nodes 0, 1, 2, 3, …, p-1. Create a circular graph by connecting each node k to k+1 (mod p). Then add an edge between each non-zero node k and its multiplicative inverse (mod p). If an element is its own inverse, add a loop connecting the node to itself. For the purpose of this construction, consider 0 to be its own inverse. This construction comes from [1].
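
Here’s a sketch of how the graph could be constructed with NetworkX. Computing the inverses with Python’s three-argument pow is my own shortcut, not part of the construction in [1].

    import networkx as nx
    import matplotlib.pyplot as plt

    p = 23
    G = nx.Graph()  # nx.Graph allows self-loops

    for k in range(p):
        # circular edges: k connected to k+1 (mod p)
        G.add_edge(k, (k + 1) % p)
        # inverse edges: k connected to its multiplicative inverse (mod p),
        # with 0 treated as its own inverse
        inv = pow(k, p - 2, p) if k != 0 else 0
        G.add_edge(k, inv)

    nx.draw_circular(G, with_labels=True)
    plt.show()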

Here’s what the graph looks like with p = 23.

Cayley graph of order 23

This graph doesn’t show the three loops from a node to itself, at nodes 0, 1, and 22.

At each node in the random walk, we choose an edge uniformly and follow it. The stationary distribution for a random walk on this graph is uniform. That is, in the long run, you’re equally likely to be at any particular node.

If we pick an arbitrary starting node, 13, and take 30 steps of the random walk, the simulated probability distribution is nearly flat.

By contrast, consider a variation on the random walk above: from each node, we move left, right, or stay in place with equal probability. This walk is not as well mixed after 100 steps as the original walk is after only 30 steps.

You can tell from the graph that the walk started around 13. In the previous graph, evidence of the starting point had vanished.

To investigate how quickly a walk on this graph converges to its limiting distribution, we can run code like the following, which was used to produce the graphs above.

    from random import random
    import numpy as np
    import matplotlib.pyplot as plt
    m = 23

    # Multiplicative inverses mod 23
    inverse = [0, 1, 12, 8, 6, 14, 4, 10, 3, 18, 7, 21, 2, 16, 5, 20, 13, 19, 9, 17, 15, 11, 22]

    # Verify that the inverses are correct
    assert(len(inverse) == m)
    for i in range(1, m):
        assert (i*inverse[i] % m == 1)

    # Simulation
    num_reps = 100000
    num_steps = 30

    def take_cayley_step(loc):
        u = random() * 3
        if u < 1: # move left
            newloc = (loc + m - 1) % m
        elif u < 2: # move right
            newloc = (loc + 1) % m
        else:
            newloc = inverse[loc] # or newloc = loc
        return newloc

    stats = [0]*m
    for i in range(num_reps):
        loc = 13 # arbitrary fixed starting point
        for j in range(num_steps):
            loc = take_cayley_step(loc)
        stats[loc] += 1

    normed = np.array(stats)/num_reps
    plt.plot(normed)
    plt.xlim(1, m)
    plt.ylim(0,2/m)
    plt.show()

* * *

[1] Shlomo Hoory, Nathan Linial, Avi Wigderson. “Expander graphs and their applications.” Bulletin of the American Mathematical Society, Volume 43, Number 4, October 2006, pages 439–561.

Ten spectral graph theory posts

Erdős-Rényi graph

Here are 10 blog posts I wrote earlier this year about spectral graph theory, studying graphs via the eigenvalues of matrices associated with the graphs.

Personal growth and discrete harmonic functions

“You are the average of the five people you spend the most time with.”

A Google search says this quote is by Jim Rohn. I think other people have said similar things. I’ve heard it quoted many times. The implication is usually that you can improve your life by hanging around better people.

Here are three things it makes me think of.

  1. It sounds approximately true.
  2. It can’t be literally true.
  3. It reminds me of harmonic functions.

There are numbers to back up the assertion that things like your income and even your weight approximately match the average of the people around you. Cause and effect go both ways: people like to hang out with people like themselves, and we become like the people we hang around.

It’s an aphorism, not meant to be taken literally. But a moment’s thought shows that it can’t be literally true for everybody. In any social network, someone has the most money, for example. That person’s net worth cannot be the average of the net worth of others in the group, unless everyone has the exact same net worth. The same applies to the poorest person in the network.

The reason I say that this reminds me of harmonic functions is the mean value theorem. If a function satisfies Laplace’s equation in some region, then at any point in the interior of the region the value of the function equals the average of the function over any sphere centered at that point that fits inside the region. But this is only true in the interior. On the boundary, you might have a maximum or minimum. If the region is bounded, you will have a maximum and a minimum on the boundary, provided the function extends continuously to the boundary.

I think of the continuous case first because I spent years thinking about such things. But there’s a discrete analog of harmonic functions that applies directly to the example above. If you have some network, such as a social network, and assign a number to each node, you have a discrete harmonic function if the value at every node is the average of the values at its neighboring nodes. For a finite network, a function cannot be harmonic at every node unless it is constant, for reasons given above. But a function could be harmonic at all but two nodes of a graph, or approximately harmonic at all nodes.

Graphs and square roots modulo a prime

Imagine a clock with a prime number of hours. So instead of being divided into 12 hours, it might be divided into 11 or 13, for example. You can add numbers on this clock the way you’d add numbers on an ordinary clock: add the two numbers as ordinary integers and take the remainder by p, the number of hours on the clock. You can multiply them the same way: multiply as ordinary integers, then take the remainder by p. This is called arithmetic modulo p, or just mod p.

Which numbers have square roots mod p? That is, for which numbers x does there exist a number y such that y² = x mod p? It turns out that of the non-zero numbers, half have a square root and half don’t. The numbers that have square roots are called quadratic residues and the ones that do not have square roots are called quadratic nonresidues. Zero is usually left out of the conversation. For example, if you square the numbers 1 through 6 and take the remainder mod 7, you get 1, 4, 2, 2, 4, and 1. So mod 7 the quadratic residues are 1, 2, and 4, and the nonresidues are 3, 5, and 6.

A simple, brute force way to tell whether a number is a quadratic residue mod p is to square all the numbers from 1 to p-1 and see if any of the results equals your number. There are more efficient algorithms, but this is the easiest to understand.

At first it seems that the quadratic residues and nonresidues are scattered randomly. For example, here are the quadratic residues mod 23:

[1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18]
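
One of the more efficient algorithms is Euler’s criterion: for an odd prime p and x not divisible by p, x is a quadratic residue mod p if and only if x^((p-1)/2) = 1 mod p. Here’s a quick check against the list above.

    def is_residue_euler(x, p):
        # Euler's criterion for an odd prime p and x not divisible by p
        return pow(x, (p - 1) // 2, p) == 1

    print([x for x in range(1, 23) if is_residue_euler(x, 23)])
    # [1, 2, 3, 4, 6, 8, 9, 12, 13, 16, 18]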

But there are patterns, such as the famous quadratic reciprocity theorem. We can see some pattern in the residues by visualizing them on a graph. We make the numbers mod p vertices of a graph, and join two vertices a and b with an edge if a − b is a quadratic residue mod p.

Here’s the resulting graph mod 13:

And here’s the corresponding graph for p = 17:

And here’s Python code to produce these graphs:

    import networkx as nx
    import matplotlib.pyplot as plt

    # Brute force test of whether x is a quadratic residue mod p
    def residue(x, p):
        for i in range(1, p):
            if (i*i - x) % p == 0:
                return True
        return False

    p = 17
    G = nx.Graph()
    for i in range(p):
        for j in range(i+1, p):
            if residue(i-j, p):
                G.add_edge(i, j)

    nx.draw_circular(G)
    plt.show()

However, this isn’t the appropriate graph for all values of p. The graphs above are undirected graphs. We’ve implicitly assumed that if there’s an edge between a and b then there should be an edge between b and a. And that’s true for p = 13 or 17, but not, for example, for p = 11. Why’s that?

If p leaves a remainder of 1 when divided by 4, i.e. p = 1 mod 4, then x is a quadratic residue mod p if and only if −x is a quadratic residue mod p. So if a − b is a residue, so is b − a. This means an undirected graph is appropriate. But if p = 3 mod 4, x is a residue mod p if and only if −x is a non-residue mod p. This means we should use directed graphs if p = 3 mod 4. Here’s an example for p = 7.

The code for creating the directed graphs is a little different:

    G = nx.DiGraph()
    for i in range(p):
        for j in range(p):
            if residue(i-j, p):
                G.add_edge(i, j)

Unfortunately, the way NetworkX draws directed graphs is disappointing. A directed edge from a to b is drawn with a heavy line segment on the b end. Any suggestions for a Python package that draws more attractive directed graphs?

The idea of creating graphs this way came from Roger Cook’s chapter on number theory applications in Graph Connections. (I’m not related to Roger Cook as far as I know.)

You could look at quadratic residues modulo composite numbers too. And I imagine you could also make some interesting graphs using Legendre symbols.

Fourier transform of a function on a graph

What is a Fourier transform at its core? An expansion of a function in terms of eigenfunctions of the Laplacian. For a function on the real line, the Laplacian is simply the second derivative. The functions mapped to multiples of themselves by taking second derivatives are sines and cosines of various frequencies. A Fourier series is a change of basis, using as basis vectors those functions that behave most simply under the second derivative.

The Fourier transform of a function on a graph is also a change of basis, expanding a discrete function in terms of eigenvectors of the Laplacian, in this case the graph Laplacian.

The Fourier transform of a function f, evaluated at a frequency ω, is the inner product of f with the eigenfunction exp(2πiωt).

\hat{f}(\omega) = \langle f, \exp(2\pi i \omega t) \rangle = \int_{-\infty}^\infty f(t) \exp(-2\pi i \omega t) \, dt

The inner product of two complex functions f and g is the integral of the product of f and the conjugate of g. Conjugation is why exp(2πiωt) became exp(-2πiωt).

The Fourier transform of a discrete function f on a graph, evaluated at an eigenvalue λi, is the inner product of f (i.e. the vector of values of f at each node) with the eigenvector associated with λi.

\hat{f}(\lambda_i) = \langle f, v^*_i \rangle = \sum_{j=1}^N f(j) v_i^*(j)

Here the inner product is a discrete sum rather than an integral. As before, we take the complex conjugate of the second item in the product.

The eigenvectors associated with the smallest eigenvalues of the graph Laplacian are analogous to low frequency sines and cosines. The eigenvector components corresponding to nearby vertices in a graph should be close together. This analogy explains why spectral coordinates work so well.
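
Here’s a small sketch of this computation. The cycle graph and the signal below are just illustrations; since the graph Laplacian is real and symmetric, its eigenvectors can be taken to be real and no conjugation is needed.

    import numpy as np
    import networkx as nx
    from scipy.linalg import eigh

    G = nx.cycle_graph(8)
    L = nx.laplacian_matrix(G).toarray().astype(float)
    w, V = eigh(L)  # eigenvalues w, orthonormal eigenvectors in the columns of V

    f = np.sin(2 * np.pi * np.arange(8) / 8)  # a sample signal, one value per node

    f_hat = V.T @ f     # Fourier coefficients: inner products with each eigenvector
    f_back = V @ f_hat  # the inverse transform recovers the original signal

    print(np.allclose(f, f_back))  # True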

Discrete harmonic functions

A (continuous) harmonic function f on an open set is a function that is twice differentiable and satisfies Laplace’s equation: ∇² f = 0. Such functions have a mean value property: at a point x interior to the set, f(x) equals the average value of f over any ball around x that fits inside the region. It turns out that the converse is true: a locally-integrable function satisfying the mean value property is harmonic.

We can take the mean value property from the continuous case as the definition in the discrete case. A function on a graph is harmonic at a node x if f(x) equals the average of f‘s values on the nodes connected to x. A function is said to have a singularity at points of a graph where it is not harmonic.

A non-constant function on a finite connected graph cannot be harmonic at every node. It has to have at least two singularities, namely the points where it takes on its maximum and its minimum. This is analogous to Liouville’s theorem for (continuous) harmonic functions: a bounded harmonic function on all of Euclidean space must be constant. Some infinite graphs have non-constant functions that are harmonic everywhere. For example, linear functions on the integers are harmonic.

In the continuous case, we want to find solutions to Laplace’s equation satisfying a boundary condition. We want to find a function that has the specified values on the boundary and is harmonic in the interior. With the right regularity conditions, this solution exists and is unique.

We can do the same for discrete harmonic functions. If we specify the values of a function on some non-empty subset of nodes, there is a unique discrete harmonic function that extends the function to the rest of a graph.

Discrete harmonic functions come up in applications such as random walks and electrical networks.
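
Here’s a sketch of computing such an extension: pin down the values on a couple of nodes and solve a linear system that forces every other node’s value to equal the average of its neighbors’ values. The path graph and boundary values below are just an illustration.

    import numpy as np
    import networkx as nx

    G = nx.path_graph(6)          # nodes 0..5 in a chain
    boundary = {0: 1.0, 5: 0.0}   # specified values
    interior = [v for v in G if v not in boundary]
    index = {v: i for i, v in enumerate(interior)}

    # For each interior node v: f(v) minus the average of f over v's neighbors is 0
    A = np.zeros((len(interior), len(interior)))
    b = np.zeros(len(interior))
    for v in interior:
        nbrs = list(G.neighbors(v))
        A[index[v], index[v]] = 1.0
        for u in nbrs:
            if u in boundary:
                b[index[v]] += boundary[u] / len(nbrs)
            else:
                A[index[v], index[u]] -= 1.0 / len(nbrs)

    f = np.linalg.solve(A, b)
    print(dict(zip(interior, np.round(f, 3))))  # values decline linearly from 1 to 0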

Can you hear the shape of a network?

Mark Kac asked in 1966 whether you can hear the shape of a drum. In other words, can two drums, made of the same material, produce the exact same sound but have different shapes? More formally, Kac asked whether the eigenvalues of Laplace’s equation with zero boundary conditions uniquely determine the shape of a region in the plane. The question remained open until 1992. No, you can’t always hear the shape of a drum. But often you can. With some restrictions on the regions, the shape is uniquely determined by the sound, i.e., the Laplace spectrum.

Ten years before Kac asked about hearing the shape of a drum, Günthard and Primas asked the analogous question about graphs. Can you hear the shape of a graph? That is, can two different graphs have the same eigenvalues? At the time, the answer was believed to be yes, but a year later it was found to be no, not always [1]. So the next natural question is when can you hear the shape of a graph, i.e. under what conditions is a graph determined by its eigenvalues? It depends on which matrix you’re taking the eigenvalues of, but under some conditions some matrix spectra uniquely determine graphs. We don’t know in general how common it is for spectra to uniquely determine graphs.

Here are two graphs that have the same adjacency matrix spectra, first published in [2]:

Both have adjacency spectra [-2, 0, 0, 0, 2]. But the graphs are not cospectral as far as the Laplacian is concerned. Their Laplace spectra are [0, 0, 2, 2, 4] and [0, 1, 1, 1, 5] respectively. We could tell that the Laplace spectra would be different before computing them because the second smallest Laplace eigenvalue is positive if and only if a graph is connected.
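
The standard small example with this adjacency spectrum is the star K1,4 and the disjoint union of a square and an isolated vertex. Assuming that’s the pair pictured above, here’s a sketch checking both spectra.

    import numpy as np
    import networkx as nx

    G1 = nx.disjoint_union(nx.cycle_graph(4), nx.empty_graph(1))  # square plus isolated vertex
    G2 = nx.star_graph(4)                                         # the star K_{1,4}

    for G in (G1, G2):
        A = nx.adjacency_matrix(G).toarray().astype(float)
        L = nx.laplacian_matrix(G).toarray().astype(float)
        print(np.round(np.linalg.eigvalsh(A), 4), np.round(np.linalg.eigvalsh(L), 4))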

The graphs below are cospectral for the adjacency, Laplacian, and signless Laplacian matrices.

Both graphs have the same number of nodes and edges, and every node has degree 4 in both graphs. But the graph on the left contains more triangles than the one on the right, so they cannot be isomorphic.

One way to test whether two graphs are isomorphic is to compute their spectra. If the spectra are different, the graphs are not isomorphic. So spectral analysis gives a way to show that two graphs are not isomorphic in polynomial time, though the test may be inconclusive.

If two graphs do have the same spectra, what is the probability that they are isomorphic? In [1] the authors answer this question empirically for graphs of order up to 11. Very roughly, there’s about an 80% chance graphs with the same adjacency matrix spectrum are isomorphic. The chances go up to 90% for the Laplacian and 95% for the signless Laplacian.

[1] Edwin R. van Dam, Willem H. Haemers. Which graphs are determined by their spectrum? Linear Algebra and its Applications 373 (2003) 241–272

[2] D. M. Cvetković. Graphs and their spectra. Univ. Beograd. Publ. Elektrotehn. Fak. Ser. Mat. Fiz. 354–356 (1971) 1–50.

Bounding a graph’s diameter by its spectrum

You can get upper and lower bounds on the diameter of a connected graph G from its spectrum.

If G has r distinct eigenvalues—whether of the adjacency matrix A, Laplacian L, or signless Laplacian Q—then its diameter d is at most r-1. (Definitions of these matrices here.)

If G has n vertices then we have a lower bound on the diameter: d ≥ 4/(nλ2), where λ2 is the algebraic connectivity, the second eigenvalue of the graph Laplacian.

Let Δ be the maximum degree of G. Then we also have the following upper bound:

d \leq 2 \left\lceil \sqrt{\frac{2\Delta}{\lambda_2}} \log_2 n \right\rceil

How well do these bounds work? Let’s look at a random graph with 50 vertices.

The diameter of this graph is 3. The highest vertex degree is 22.

Each of the three matrices A, L, and Q has 50 distinct eigenvalues. So that says the diameter should be at most 49, which it certainly is.

The lower bound based on algebraic connectivity is 0.0129, and the upper bound is 32.

Altogether, we estimated the diameter, whose true value is 3, to be between 0.0129 and 32. Which is correct, but not very tight.
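
Here’s a sketch of the computation behind these numbers; since the graph is random, the exact values will vary from run to run.

    import numpy as np
    import networkx as nx

    # generate a connected random graph on 50 vertices
    while True:
        G = nx.erdos_renyi_graph(50, 0.1)
        if nx.is_connected(G):
            break

    n = G.number_of_nodes()
    max_degree = max(d for _, d in G.degree())

    L = nx.laplacian_matrix(G).toarray().astype(float)
    lambda2 = np.linalg.eigvalsh(L)[1]  # algebraic connectivity

    lower = 4 / (n * lambda2)
    upper = 2 * np.ceil(np.sqrt(2 * max_degree / lambda2) * np.log2(n))
    print(nx.diameter(G), lower, upper)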

Maybe the bounds weren’t very good because the diameter was small. Let’s look at a graph with a much bigger diameter, a cycle with 50 nodes.

Now this graph has diameter 25. The maximum node degree is 2, because every node has degree 2.

Each of the three matrices A, L, and Q has 26 distinct eigenvalues, which predicts that the graph diameter will be no more than 25. In this case, the upper bound is exact.

The algebraic connectivity gives a lower bound of 5.0727 for the diameter and an upper bound of 180. So while these bounds are correct, they are not especially tight either.

Source: Handbook of Linear Algebra

Graph regularity and Laplace eigenvalues

You can tell whether a graph is regular, or nearly regular, by looking at its eigenvalues.

Let G be a graph with n vertices and m edges and let L be the Laplacian matrix of G. Then the sum of the eigenvalues of L is 2m. (The diagonal of L contains the degrees of each node, so its trace is twice the number of edges. And the trace of a matrix is the sum of its eigenvalues.)

Let λi be the eigenvalues of L. Then we have

\sum \lambda_i = 2m \leq \sqrt{n \sum \lambda_i(\lambda_i - 1)}

with equality if and only if G is regular. [reference]

We can try this out by starting with a dodecahedron graph. It’s a regular graph, since each vertex has three edges. There are a total of 30 edges and the eigenvalues sum to 60. The right-most expression is 60 as expected.

Next remove one of the edges from the graph. The result bears a remarkable resemblance to Iron Man.

Now there are 29 edges, so 2m = 58. The right-most expression above is now 58.31. It’s bigger than 58 because the new graph is not regular. But as expected, it’s close to 58 since the graph is close to regular.
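
Here’s a sketch of that calculation with NetworkX and NumPy.

    import numpy as np
    import networkx as nx

    def regularity_check(G):
        # compare 2m with sqrt(n * sum of lambda_i*(lambda_i - 1))
        n, m = G.number_of_nodes(), G.number_of_edges()
        lam = np.linalg.eigvalsh(nx.laplacian_matrix(G).toarray().astype(float))
        return 2 * m, np.sqrt(n * np.sum(lam * (lam - 1)))

    G = nx.dodecahedral_graph()
    print(regularity_check(G))   # 60 and 60: equality because the graph is regular

    G.remove_edge(*next(iter(G.edges())))
    print(regularity_check(G))   # 58 and about 58.3: nearly regular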

Understanding a graph by peeling away nodes

One way to get some understanding of a graph is to peel away nodes, starting with the least connected. The k-core of a graph is the maximal subgraph such that every vertex has degree at least k. The k-shell is the set of vertices that are part of the k-core but not part of the (k+1)-core.

For example, the 0-core of a graph is simply the entire graph since every vertex has at least zero edges; a vertex can’t have a negative number of edges. The 1-core of a graph contains the vertices that are connected to other vertices. You form the 1-core by throwing away the 0-shell, the set of isolated vertices. You move from the 1-core to the 2-core by removing nodes of degree 1 until everything that’s left has degree at least 2.

NB: You do not form the k-core simply by discarding nodes of degree less than k. For example, consider the star graph below.

star graph with five peripheral nodes

If you peel off the nodes of degree 1, eventually you have nothing left. There is no 2-core, even though the original graph had a node of degree 5. Or going back to the definition, there is no subgraph such that every vertex has degree at least 2, and so no maximal such subgraph. The node of degree 5 only has degree 5 because it is connected to nodes of degree 1.

The k-cores of a complete graph in which every vertex has degree n are simply the entire graph, for k ≤ n. The k-shells are empty for k < n, since every vertex belongs to the n-shell.
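
NetworkX can compute core numbers directly. Here’s a sketch using the star graph above and an Erdős-Rényi random graph like the ones discussed below.

    from collections import Counter
    import networkx as nx

    star = nx.star_graph(5)        # one hub connected to 5 leaves
    print(nx.core_number(star))    # every node has core number 1: there is no 2-core

    G = nx.erdos_renyi_graph(1000, 0.01)
    core = nx.core_number(G)
    print(Counter(core.values()))  # k-shell sizes
    print(max(core.values()))      # the degeneracy of the graph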

Keyword overlap

Let’s see what the k-cores and k-shells look like for some more complex graphs. First, let’s go back to the post looking at the overlap in keywords. The nodes of the graph are the 200 search terms that have led visitors to the site at least twice. We connect two search terms if they share a word. The 56-core of the graph has 57 vertices, and the 57-core is empty. So the 56-core is a complete graph. Here’s the distribution of the k-shell sizes.

Random graphs

Next let’s look at a random graph. As in the post on spectra of random graphs, we look at an Erdős-Rényi graph Gn,p. First we start with n = 1000 and p = 0.01. So we start with 1000 nodes, and there’s a 0.01 probability that any particular pair of nodes is connected. Here are the core sizes for one such random graph:

[1000 1000  999  999  990  970  914  793]

So the 7-core has 793 nodes, then the 8-core is empty. I ran this ten times and the 7-core sizes ranged from size 709 to 849. But every time the 8-core was empty. The largest value of k such that the k-core is non-empty is called the degeneracy of the graph, so we could say the degeneracy was 7 every time.

I reran with n = 2000. The degeneracy was 14 every time. The 14-cores were in the range 1723 to 1810.

Finally I set n = 4000 and ran again. The degeneracy was 29 three times and 30 seven times. The final core sizes ranged from 3451 to 3714.

The experiments above suggest that the k-core size abruptly drops to zero, at a predictable value of k, and with a fairly predictable core size. There are papers on k-core size of random graphs, but I have not yet read any. (I have one in my to-read queue that someone let me know about after posting this article.)

Bipartite graphs and the signless Laplacian

The vertices of a bipartite graph can be divided into two sets where edges only go from one set to the other; there are no edges within each set. For example, in the graph below edges connect blue to green nodes, but not blue to blue or green to green.

In this example there are two connected bipartite components. There is a variation on the graph Laplacian called the signless Laplacian whose eigenvalues can tell you how many bipartite components a graph has.

The graph Laplacian is L = D − A, where D is the diagonal matrix whose entries are the degrees of each node and A is the adjacency matrix. The signless Laplacian is Q = D + A. Since D and A have only non-negative entries, so does Q.

The smallest eigenvalue of the Laplacian L is always zero, and the second smallest eigenvalue is positive if and only if the graph is connected. The smallest eigenvalue of the signless Laplacian Q is positive if and only if the graph is connected and bipartite. The multiplicity of zero as an eigenvalue equals the number of connected bipartite components.
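
Here’s a sketch checking this numerically on a small example, with a square and a path as two connected bipartite components.

    import numpy as np
    import networkx as nx

    G = nx.disjoint_union(nx.cycle_graph(4), nx.path_graph(3))  # two bipartite components

    A = nx.adjacency_matrix(G).toarray().astype(float)
    D = np.diag(A.sum(axis=1))
    Q = D + A  # the signless Laplacian

    eigenvalues = np.linalg.eigvalsh(Q)
    print(np.round(eigenvalues, 4))
    print(np.sum(np.isclose(eigenvalues, 0)))  # number of connected bipartite components: 2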

The graph above has two connected bipartite components. Zero is a double eigenvalue of the signless Laplacian, and the next eigenvalue is 0.2603.

Next we connect two of the blue nodes. Now the graph is connected and bipartite.

Zero is now a single eigenvalue, and the second eigenvalue is 0.0968. This second eigenvalue is positive because there is only one component now, but it is small because the graph can almost be separated into two bipartite components.

Next we remove the edge connecting the two components, but we draw an edge between two of the green vertices.

Now there are two components, but only one of them is bipartite. The smallest eigenvalue is zero, with multiplicity 1, and the second eigenvalue is 0.2015.

Next we repeat the examples above with more edges. Now each component of the graph is a complete bipartite graph.

The smallest eigenvalue is 0, with multiplicity 2. But now the next eigenvalue is 3. This value is larger than before because the components are better connected.

As before, we now join two of the blue nodes.

Now there is one bipartite component. The smallest eigenvalue is 0 with multiplicity 1. The second eigenvalue is 0.2103.

Finally, we remove the edge connecting the two components and connect two of the green vertices.

Now the smallest eigenvalue is 0 with multiplicity 1. While there are two components, there’s only one bipartite component. The next eigenvalue is 0.4812.

Spectra of random graphs

Create a graph by first making a set of n vertices and then connecting nodes i and j, i > j, with probability p. This is known as the Erdős-Rényi graph Gn,p. Here’s such a graph with n = 100 and p = 0.1.

Erdős-Rényi graph

Then the spectrum of the adjacency matrix A of the graph bounces around the spectrum of the expected value of A. This post will illustrate this to give an idea how close the two spectra are.

The adjacency matrix A for our random graph has zeros along the diagonal because the model doesn’t allow loops connecting a vertex with itself. The rest of the entries are 1 with probability p and 0 with probability 1-p. The entries above the diagonal are independent of each other, but the entries below the diagonal are determined by the entries above by symmetry. (An edge between i and j is also an edge between j and i.)

The expected value of A is a matrix with 0’s along the diagonal and p’s everywhere else. This matrix has one eigenvalue of p(n − 1) and (n − 1) eigenvalues equal to −p. Let’s see how close the spectra of a dozen random graphs come to the spectrum of the expected adjacency matrix. As above, n = 100 and p = 0.1.

Because the eigenvalues are close together, I jittered each sample horizontally to make the values easier to see. Notice that the largest eigenvalue in each sample is around 10. The expected value is p(n – 1) = 9.9, so the values are close to their expected value.

The other eigenvalues seem to be centered around 0. This is consistent with the expected value of −p = −0.1. The most obvious thing about the smaller eigenvalues is not their mean but their range. How far might the eigenvalues be from their expected values? Fan Chung and Mary Radcliffe give the following estimate in their paper On the Spectra of General Random Graphs.

For Gn,p, if p > 8 log(√2 n)/(9n), then with probability at least 1 − 1/n we have

|\lambda_i(A) - \lambda_i(p(J-I))| \leq \sqrt{8np \log(\sqrt{2}n)}

Here J is the matrix of all 1’s and I is the identity matrix. In our case the theorem says that with probability at least 0.99, we expect the eigenvalues of our random graph to be within 19.903 of their expected value. This is a pessimistic bound since no value is much more than 5 away from its expected value.
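
Here’s a sketch of the computation, comparing sampled spectra to the expected values and computing the bound above.

    import numpy as np
    import networkx as nx

    n, p = 100, 0.1

    for _ in range(3):
        G = nx.erdos_renyi_graph(n, p)
        A = nx.adjacency_matrix(G).toarray().astype(float)
        eig = np.sort(np.linalg.eigvalsh(A))
        # largest eigenvalue should be near p(n-1) = 9.9, the rest centered near -p
        print(round(eig[-1], 3), round(eig[:-1].mean(), 3))

    bound = np.sqrt(8 * n * p * np.log(np.sqrt(2) * n))
    print(round(bound, 3))  # the deviation bound, about 19.9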

Visualizing search keyword overlap

The other day someone asked me to look at the search data for Debug Pest Control, a pest management company based in Rhode Island. One of the things I wanted to visualize was how the search terms overlapped with each other.

To do this, I created a graph where the nodes are the keywords and edges join nodes that share a word. (A “keyword” is actually a set of words, whatever someone types into a search.) The result was a solid gray blob because the nodes are so densely connected.

I thinned the graph by first only looking at keywords that have occurred at least twice. Then I made my connection criterion a little stronger: I take the union of the individual words in two keywords and create a connection if at least 1/3 of these words appear in both keywords.
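
Here’s a sketch of that connection criterion with a few made-up keywords.

    import networkx as nx

    keywords = ["pest control", "pest control ri", "plants that keep bees away"]

    G = nx.Graph()
    G.add_nodes_from(keywords)
    for i, a in enumerate(keywords):
        for b in keywords[i+1:]:
            words_a, words_b = set(a.split()), set(b.split())
            # connect if at least 1/3 of the words in the union appear in both keywords
            if len(words_a & words_b) >= len(words_a | words_b) / 3:
                G.add_edge(a, b)

    print(list(G.edges()))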

The area of each red dot is proportional to the number of times someone has come to the site using that keyword. You can see that there’s long-tail behavior: while some dots are much larger than others, there are a lot of little dots.

In case you’re curious, these are the top keywords:

  1. pest control
  2. ri
  3. pest control ri
  4. pest control rhode island,
  5. plants that keep bees away
  6. poisonous snakes in ri
  7. plants that keep bees and wasps away
  8. plants to keep bees away
  9. plants that keep wasps away
  10. pest control companies

 

The keywords were filtered a little before making the graph. Some of the top keywords contained “debug.” These may have been people using the term to refer to eliminating pests from their property, but more likely they were people who knew the company name. By removing “debug” from each of the keywords, the results focus on people searching for the company’s services rather than the company itself.

Spectral coordinates in Python

A graph doesn’t have any geometric structure unless we add it. The vertices don’t come with any position in space. The same graph can look very different when arranged different ways.

Spectral coordinates are a natural way to draw a graph because they are determined by the properties of the graph, not arbitrary aesthetic choices. Construct the Laplacian matrix and let x and y be the eigenvectors associated with the second and third eigenvalues. (The smallest eigenvalue is always zero and has an eigenvector of all 1’s. The second and third eigenvalues and eigenvectors are the first to contain information about a graph.) The spectral coordinates of the ith node are the ith components of x and y.

We illustrate this with a graph constructed from a dodecahedron, a regular solid with twenty vertices and twelve pentagonal faces. You can make a dodecahedron from a soccer ball by connecting the centers of all the white hexagons. Here’s one I made from Zometool pieces for a previous post:

Although we’re thinking of this graph as sitting in three dimensions, the nodes being the corners of pentagons etc., the graph simply says which vertices are connected to each other. But from this information, we can construct the graph Laplacian and use it to assign plane coordinates to each point. And fortunately, this produces a nice picture:

Here’s how that image was created using Python’s NetworkX library.

    import networkx as nx
    import matplotlib.pyplot as plt
    from scipy.linalg import eigh

    # Build the dodecahedron graph and compute its Laplacian L.
    # (Any graph read in as an edge list would work the same way.)
    G = nx.dodecahedral_graph()
    n = G.number_of_nodes()
    L = nx.laplacian_matrix(G).toarray().astype(float)

    # Laplacian matrices are real and symmetric, so we can use eigh,
    # the variation on eig specialized for Hermitian matrices.
    w, v = eigh(L)  # w = eigenvalues, v = eigenvectors

    x = v[:, 1]
    y = v[:, 2]
    spectral_coordinates = {i: (x[i], y[i]) for i in range(n)}

    nx.draw(G, pos=spectral_coordinates)
    plt.show()

Update: After posting this I discovered that NetworkX has a method draw_spectral that will compute the spectral coordinates for you.
