Effective graph resistance

I’ve mentioned the Moore-Penrose pseudoinverse of a matrix a few times, most recently last week. This post will give an application of the pseudoinverse: computing effective graph resistance.

Given a graph G, imagine replacing each edge with a one-ohm resistor. The effective resistance between two nodes in G is the electrical resistance between those two nodes.

Calculating graph resistance requires inverting the graph Laplacian, but the graph Laplacian isn’t invertible. We’ll resolve this paradox shortly, but in the meantime this sounds like a job for the pseudoinverse, and indeed it is.

The graph Laplacian of a graph G is the matrix L = D − A where D is the diagonal matrix whose entries are the degrees of each node and A is the adjacency matrix. The vector of all 1’s

1 = (1, 1, 1, …, 1)

is in the null space of L, but if G is connected then the null space is one-dimensional. More generally, the dimension of the null space equals the number of connected components of G.

Calculating graph resistance requires solving the linear system

Lw = v

where v and w are orthogonal to 1. If the graph G is connected then the system has a unique solution. The graph Laplacian L is not invertible over the entire space, but it is invertible on the orthogonal complement of the null space.

The Moore-Penrose pseudoinverse L+ gives us just what we need. The graph resistance between two nodes a and b is given by

R_{ab} = (e_a - e_b)^\intercal \, L^+ \, (e_a - e_b)

where ei is the vector whose entries are all zero except for a 1 in the ith slot.
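Here is a short NumPy sketch of the calculation, using a cycle on four nodes as an example; the adjacency matrix and function name are my own choices for illustration. The resistance between adjacent nodes should be a 1 ohm path in parallel with a 3 ohm path, i.e. 3/4 ohms.

import numpy as np

# Adjacency matrix of a cycle on four nodes
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
L = np.diag(A.sum(axis=0)) - A  # graph Laplacian L = D - A
Lplus = np.linalg.pinv(L)       # Moore-Penrose pseudoinverse

def resistance(a, b):
    e = np.zeros(len(A))
    e[a], e[b] = 1, -1          # e_a - e_b
    return e @ Lplus @ e

print(resistance(0, 1))         # 0.75, up to floating point error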

For more on graph resistance see Effective graph resistance by Wendy Ellens et al.

More spectral graph theory posts

Multiplying a matrix by its transpose

An earlier post claimed that there are practical advantages to partitioning a matrix, i.e. thinking of the matrix as a matrix of matrices. This post will give an example.

Let M be a square matrix and suppose we need to multiply M by its transpose Mᵀ. We can compute this product faster than multiplying two arbitrary matrices of the same size by exploiting the fact that MMᵀ will be a symmetric matrix.

We start by partitioning M into four blocks

M = \begin{bmatrix}A & B \\ C & D \end{bmatrix}

Then

M^\intercal = \begin{bmatrix}A^\intercal & C^\intercal \\ B^\intercal & D^\intercal \end{bmatrix}

and

MM^\intercal = \begin{bmatrix} AA^\intercal  + BB^\intercal & AC^\intercal  + BD^\intercal \\CA^\intercal + DB^\intercal & CC^\intercal + DD^\intercal \end{bmatrix}

Now for the first clever part: We don’t have to compute both of the off-diagonal blocks because each is the transpose of the other. So we can reduce our calculation by 25% by not calculating one of the blocks, say the lower left block.

And now for the second clever part: apply the same procedure recursively. The diagonal blocks in MMᵀ involve a matrix times its own transpose. That is, we can partition A and use the same idea to compute AAᵀ, and do the same for BBᵀ, CCᵀ, and DDᵀ. The off-diagonal blocks require general matrix multiplications.

The net result is that we can compute MMᵀ in about 2/3 the time it would take to multiply two arbitrary matrices of the same size.
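Here is a minimal NumPy sketch of the recursion, assuming M is square; the function name and cutoff are arbitrary, and matrices that are small or of odd dimension fall back to a direct product. This illustrates the structure of the algorithm rather than a tuned implementation.

import numpy as np

def gram(M, cutoff=64):
    # Compute M @ M.T, skipping the redundant lower left block.
    n = M.shape[0]
    if n <= cutoff or n % 2:
        return M @ M.T
    h = n // 2
    A, B = M[:h, :h], M[:h, h:]
    C, D = M[h:, :h], M[h:, h:]
    upper_left  = gram(A, cutoff) + gram(B, cutoff)  # AAᵀ + BBᵀ, recursive
    lower_right = gram(C, cutoff) + gram(D, cutoff)  # CCᵀ + DDᵀ, recursive
    upper_right = A @ C.T + B @ D.T                  # general multiplications
    return np.block([[upper_left,    upper_right],
                     [upper_right.T, lower_right]])  # lower left is a transpose

M = np.random.default_rng(0).standard_normal((256, 256))
print(np.allclose(gram(M), M @ M.T))                 # True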

Recently a group of researchers found a way to take this idea even further, partitioning a matrix into a 4 × 4 matrix of 16 blocks and doing some clever tricks. The RXTX algorithm can compute MMᵀ in about 26/41 the time required to multiply arbitrary matrices, a savings of about 5% over the recursive method above. A 5% improvement may be significant if it appears in the inner loop of a heavy computation. According to the authors, “The algorithm was discovered by combining Machine Learning-based search methods with Combinatorial Optimization.”

Related posts

A bit-twiddling marvel

Pop quiz: what does the following code do?

bool is_leap_year_fast(uint32_t y) {
    return ((y * 1073750999) & 3221352463) <= 126976;
}

It determines whether the year y is a leap year in the Gregorian calendar, of course. :)

It’s very efficient, though I don’t imagine that would ever matter. But it’s very clever.

Gregorian vs Julian calendar

A year is a leap year in the Julian calendar if and only if it is a multiple of 4. But the Julian year is a bit too long on average to match the earth’s orbit around the sun. You get a much better fit if you add the complication that years divisible by 100 but not by 400 are not leap years.

Presumably no one reading this recalls 1900 not being a leap year, but some of you may need to know some day that 2100 is not a leap year either.

C code

The cryptic function above comes from the recent post A leap year check in three instructions by Falk Hüffner. The function is correct for roughly the next 100 thousand years (more precisely, through 102499) and correct if we anachronistically extend the Gregorian calendar back to the year 0.

The following C code shows empirically that Hüffner’s code is correct, but you’ll have to read his post to understand why it is correct.

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

bool is_leap_year_fast(uint32_t y) {
    return ((y * 1073750999) & 3221352463) <= 126976;
}

bool is_leap_year(uint32_t y) {
    if ((y % 4) != 0) return false;
    if ((y % 100) != 0) return true;
    if ((y % 400) == 0) return true;
    return false;
}

int main() {
    for (uint32_t y = 0; y <= 102499; y++)
        if (is_leap_year_fast(y) != is_leap_year(y))
            printf("Exception: %d\n", y);
    return 0;
}

Related posts

Matrices of Matrices

It’s often useful to partition a matrix, thinking of it as a matrix whose entries are matrices. For example, you could look at the 6 × 6 matrix

M = \begin{bmatrix} 3 & 1 & 4 & 0 & 0 & 0 \\ 2 & 7 & 1 & 0 & 0 & 0 \\ 5 & 0 & 6 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 4 & 8 \\ 0 & 0 & 0 & 1 & 7 & 6\\ 0 & 0 & 0 & 3 & 5 & 1 \\ \end{bmatrix}

as a 2 × 2 matrix whose entries are 3 × 3 matrices.

\begin{bmatrix} \begin{bmatrix} 3 & 1 & 4 \\ 2 & 7 & 1 \\ 5 & 0 & 6 \\ \end{bmatrix} & \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ \end{bmatrix} \\ & \\ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ \end{bmatrix} & \begin{bmatrix} 2 & 4 & 8 \\ 1 & 7 & 6\\ 3 & 5 & 1 \\ \end{bmatrix} \end{bmatrix}

M is not a diagonal matrix of real numbers, but viewed as a matrix of blocks it is diagonal. It might be possible in some application to take advantage of the fact that the off-diagonal blocks are zero matrices.

You can multiply two partitioned matrices by multiplying the blocks as if they were numbers. The only requirement is that the blocks be of compatible sizes for multiplication: the block widths on the left have to match the corresponding block heights on the right.

For example, if you have two partitioned matrices

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
and
B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}

then their product is the partitioned matrix

C = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}

The only requirement is that the matrix products above make sense. The number of columns in A11 must equal the number of rows in B11, and the number of columns in A12 must equal the number of rows in B21. (The rest of the compatibility conditions follow because the blocks in a column of a partitioned matrix have the same number of columns, and the blocks in a row of a partitioned matrix have the same number of rows.)

You can partition matrices into any number of rows and columns of blocks; the examples above have been 2 × 2 arrays of blocks only for convenience.
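Here is a quick numerical check that block multiplication agrees with ordinary multiplication; the block sizes below are arbitrary illustrative choices.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 7))
B = rng.standard_normal((7, 4))

# Split A's columns and B's rows at the same place (3) for compatibility
A11, A12 = A[:2, :3], A[:2, 3:]
A21, A22 = A[2:, :3], A[2:, 3:]
B11, B12 = B[:3, :2], B[:3, 2:]
B21, B22 = B[3:, :2], B[3:, 2:]

C = np.block([[A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
              [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22]])

print(np.allclose(C, A @ B))  # True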

It’s natural to oscillate a bit between thinking “Of course you can do that” and “Really? That’s incredible.” If you haven’t gone through at least one cycle perhaps you haven’t thought about it long enough.

What could go wrong?

Manipulation of partitioned matrices works so smoothly that you have to wonder where things could go wrong.

For the most part, if a calculation makes logical sense then it’s likely to be correct. If you add two blocks, they have to be compatible for addition. If you “divide” by a block, i.e., multiply by its inverse, then the block must have an inverse. Etc.

One subtle pitfall is that matrix multiplication does not commute in general, and so you have to watch out for implicitly assuming that AB = BA where A and B are blocks. For example, theorems about the determinant of a partitioned matrix are not as simple as you might hope. The determinant of the partitioned matrix

\begin{vmatrix} A & B \\ C & D \end{vmatrix}

is not simply det(AD − BC) in general, though it is true if all the blocks are square matrices of the same size and the blocks in one row or one column commute, i.e. if AB = BA, CD = DC, AC = CA, or BD = DB.
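Here is a numerical sanity check of the commuting case. Taking all four blocks to be polynomials in a single matrix guarantees that every pair of blocks commutes; the particular polynomials are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((3, 3))
I = np.eye(3)

# Polynomials in S, so A, B, C, D all commute with one another
A, B = S @ S + I, 2*S - I
C, D = S + 3*I, S @ S @ S

M = np.block([[A, B], [C, D]])
print(np.allclose(np.linalg.det(M), np.linalg.det(A @ D - B @ C)))  # True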

More linear algebra posts

Trinomial Coefficients and Kings

The term trinomial coefficient is used a couple of different ways. The most natural use of the term is a generalization of binomial coefficients to three variables, i.e. numbers of the form

\frac{n!}{i! \, j! \, k!}

where i + j + k = n. These numbers are the coefficients of xⁱyʲzᵏ in the expansion of (x + y + z)ⁿ.

But there’s another use of the term trinomial coefficient, the one that we are interested in for this post, and that is the coefficients of xᵏ in the expansion of (1 + x + 1/x)ⁿ. These numbers can be visualized in a triangle analogous to Pascal’s triangle.

In Pascal’s triangle, each entry is the sum of the two entries above it. In the trinomial triangle, each entry is the sum of the three entries above it.

If you’ve been reading this blog regularly you know I’ve been interested in chess puzzles lately. The thing that made me interested in trinomial coefficients is that they relate to moves of a king on a chessboard.

If you count the number of paths a king can take to a square using the minimal number of moves, you get the trinomial triangle. The reason is simple: the number of ways to get to a square equals the number of ways to get there via the square up and to the left, plus the number of ways to get there via the square above, plus the number of ways to get there via the square up and to the right.
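Here is a short Python sketch that builds the triangle from that recurrence; the function name and the row convention (row n lists the coefficients of x⁻ⁿ through xⁿ) are my own choices.

def trinomial_triangle(rows):
    # Each entry is the sum of the three entries above it,
    # treating entries off the edge of the triangle as zero.
    triangle = [[1]]
    for _ in range(rows - 1):
        padded = [0, 0] + triangle[-1] + [0, 0]
        triangle.append([padded[i] + padded[i+1] + padded[i+2]
                         for i in range(len(padded) - 2)])
    return triangle

for row in trinomial_triangle(5):
    print(row)

# [1]
# [1, 1, 1]
# [1, 2, 3, 2, 1]
# [1, 3, 6, 7, 6, 3, 1]
# [1, 4, 10, 16, 19, 16, 10, 4, 1]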

Related posts

Riff on an integration bee integral

I saw an integral posted online that came from this year’s MIT Integration Bee.

\int_0^1 x^{2024} (1 - x^{2025})^{2025}\, dx

My thoughts on seeing this were, in order:

  1. It looks like a beta function.
  2. The answer is a small number.
  3. You can evaluate the integral using the substitution u = 1 − x²⁰²⁵.

I imagine most students’ reactions would be roughly the opposite. They’d first see the substitution. Some might think of the value being small, and none would think of beta functions.
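For the record, here is the substitution worked out. With u = 1 − x²⁰²⁵ we have du = −2025 x²⁰²⁴ dx, and so

\int_0^1 x^{2024} (1 - x^{2025})^{2025}\, dx = \frac{1}{2025} \int_0^1 u^{2025}\, du = \frac{1}{2025 \cdot 2026}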

Size estimate

The value of the integral is small because you have numbers between 0 and 1 being raised to large powers. In real applications, being able to estimate the size of an integral may be more valuable than being able to compute the integral.

But on a quiz, like the Integration Bee, knowing that the result is small would be worthless, except possibly to check a result. You know from the context of the problem being on a timed quiz that there must be a trick [1].

Incidentally, if you changed the problem slightly, say by replacing one of the instances of 2025 with 2025.03, the integration problem becomes much harder, but you could still estimate that the value is small.

Beta functions

The first time I saw someone evaluate an integral by recognizing a beta function, it seemed like he was pulling a rabbit out of a hat. I was completely surprised that he treated as common knowledge something I had never seen before.

Years later I went to work in biostatistics at MDACC and worked with the beta function and beta random variables all the time. If you look through my publications, you’ll see the word “beta” appears several times.

The beta function is defined as follows. You could take the first equality as the definition and the second as a theorem.

B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1} \,dt = \frac{\Gamma(x) \, \Gamma(y)}{\Gamma(x+y)}

The integrand in the definition of the beta function is proportional to the probability density function of a beta(x, y) random variable, and B(x, y) is the normalizing constant that makes the density integrate to 1.

The integral in the Integration Bee question is not a beta function. I thought maybe it could be turned into a beta function, but the u substitution is simpler.

If you change the integrand above to

\int_0^1 x^{2024} (1 - x)^{2025}\, dx

then you do have a beta function. The value of this integral is B(2025, 2026), which equals 2024! 2025! / 4050!, an extremely small number.

Numerical estimation

The integral in the previous section is roughly the same as that of x²⁰²⁴ (1 − x)²⁰²⁴. This function has its maximum at ½, where it equals 2⁻⁴⁰⁴⁸ ≈ 10⁻¹²¹⁹, and so the integral is less than 2⁻⁴⁰⁴⁸.

If we want to evaluate 2024! 2025! / 4050! numerically we’ll need to be indirect about it because 2024! would overflow and the final result would underflow. We could calculate the log base 10 of the value using Python as follows.

>>> from scipy.special import gammaln
>>> from math import log
>>> (gammaln(2025) + gammaln(2026) - gammaln(4051))/log(10)
-1220.576093208249

So our upper bound got us to within an order of magnitude or two of the result.

 

[1] When you see the current year used in a math problem, the quiz writer is winking at you. The year is often meant to suggest that a number is large but somewhat arbitrary. Here you suspect that the key to the question would be the same if 2024 and 2025 were replaced with 1997 and 1998. Also, the year is often a hint that you don’t want to use brute force. In this case, it’s a warning not to expand the integrand using the binomial theorem.

Multiplying by quaternions on the left and right

The map that takes a quaternion x to the quaternion qx is linear, so it can be represented as multiplication by a matrix. The same is true of the map that takes x to xq, but the two matrices are not the same because quaternion multiplication does not commute.

Let q = a + bi + cj + dk and let qM be the matrix that represents multiplication on the left by q. Then

_qM = \begin{bmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \\ \end{bmatrix}

Now let Mq be the matrix that represents multiplication on the right by q. Then

M_q = \begin{bmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \\ \end{bmatrix}

We can prove both matrix representations are correct by showing that they do the right thing when q = 1, i, j, and k. The rest follows by linearity.

You might speculate that the matrix representation for multiplying on the right by q might be the transpose of the matrix representation for multiplying on the left by q. You can look at the matrices above and see that’s not the case.
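Here is a small NumPy check of both representations, assuming quaternions are stored as arrays [a, b, c, d] representing a + bi + cj + dk; the function names are mine.

import numpy as np

def quat_mult(p, q):
    # Quaternion product pq from the rules i² = j² = k² = ijk = −1
    a, b, c, d = p
    w, x, y, z = q
    return np.array([a*w - b*x - c*y - d*z,
                     a*x + b*w + c*z - d*y,
                     a*y - b*z + c*w + d*x,
                     a*z + b*y - c*x + d*w])

def left_matrix(q):   # represents x -> qx
    a, b, c, d = q
    return np.array([[a, -b, -c, -d],
                     [b,  a, -d,  c],
                     [c,  d,  a, -b],
                     [d, -c,  b,  a]])

def right_matrix(q):  # represents x -> xq
    a, b, c, d = q
    return np.array([[a, -b, -c, -d],
                     [b,  a,  d, -c],
                     [c, -d,  a,  b],
                     [d,  c, -b,  a]])

rng = np.random.default_rng(0)
q, x = rng.standard_normal(4), rng.standard_normal(4)
print(np.allclose(left_matrix(q) @ x, quat_mult(q, x)))   # True
print(np.allclose(right_matrix(q) @ x, quat_mult(x, q)))  # True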

In this post I talk about how to represent rotations with quaternions, and in this post I give an equation for the equivalent rotation matrix for a rotation described by a quaternion. You can prove that the matrix representation is correct by multiplying out qM and Mq*. Keep in mind that q in that case is a unit quaternion, so the squares of its components sum to 1.

Related posts

Embeddings, Projections, and Inverses

I just revised a post from a week ago about rotations. The revision makes explicit the process of embedding a 3D vector into the quaternions, then pulling it back out.

The 3D vector is embedded in the quaternions by making it the vector part of a quaternion with zero real part:

(p₁, p₂, p₃) → (0, p₁, p₂, p₃)

and the quaternion is returned to 3D by cutting off the real part:

(p₀, p₁, p₂, p₃) → (p₁, p₂, p₃).

To give names to the process of moving to and from quaternions, we have an embedding E from ℝ³ to ℝ⁴ and a projection P from ℝ⁴ to ℝ³.

We can represent E as a 4 × 3 matrix

E = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}

and P by a 3 × 4 matrix

P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}

We’d like to say E and P are inverses. Surely P undoes what E does, so they’re inverses in that sense. But E cannot undo P, because you lose information projecting from ℝ⁴ to ℝ³, and E cannot recover information that was lost.

The rest of this post will look at three generalizations of inverses and how E and P relate to each.

Left and right inverse

Neither matrix is invertible, but PE equals the identity matrix on ℝ³, and so P is a left inverse of E and E is a right inverse of P.

PE = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix}

On the other hand, EP is not an identity matrix, and so E is not a left inverse of P, nor is P a right inverse of E.

EP = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix}

Adjoint

P is the transpose of E, which means it is the adjoint of E. Adjoints are another way to generalize the idea of an inverse. More on that here.

Pseudo-inverse

The Moore-Penrose pseudo-inverse acts a lot like an inverse, which is somewhat uncanny because all matrices have a pseudo-inverse, even rectangular matrices.

The pseudo-inverse relation is symmetric: if A+ is the pseudo-inverse of A, then A is the pseudo-inverse of A+.

Given an m by n matrix A, the Moore-Penrose pseudoinverse A+ is the unique n by m matrix satisfying four conditions:

  1. A A+ A = A
  2. A+ A A+ = A+
  3. (A A+)* = A A+
  4. (A+ A)* = A+ A

To show that E+ = P we have to establish

  1. EPE = E
  2. PEP = P
  3. (EP)* = EP
  4. (PE)* = PE

We calculated EP and PE above, and both are real and symmetric, so properties 3 and 4 hold.

We can also compute

 EPE = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} = E

and

PEP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{bmatrix} = P

showing that properties 1 and 2 hold as well.
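Here is a quick NumPy confirmation of all of the above; np.linalg.pinv computes the Moore-Penrose pseudo-inverse numerically.

import numpy as np

E = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])
P = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

print(np.allclose(P @ E, np.eye(3)))      # True: P is a left inverse of E
print(np.allclose(np.linalg.pinv(E), P))  # True: P is the pseudo-inverse of E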

Related posts