Maximum principles and bounds for initial value problems

The previous post looked at using maximum principles to bound the solution to a boundary value problem. This is a similar post, focusing on an initial value problem. As before we start with the differential operator

L[u] = u″ + g(x)u′ + h(x)u

where g and h are bounded and h is non-positive. We are interested in solutions to the equation L[u] = f, but this time we want to specify the initial conditions u(0) = u₀ and u′(0) = u₀′. We will use the same maximum principle as before, but a slightly different approach to constructing upper and lower bounds on solutions. Our conclusion will also be a little different: this time we’ll get upper and lower bounds not only on the solution u but also on its derivative u′.

We look for a function z(x) satisfying the differential inequality

L[z] ≥ f

and the two initial inequalities z(0) ≥ u₀ and z′(0) ≥ u₀′. We also look for a function w(x) satisfying the same inequalities in the opposite direction, i.e. replacing ≥ with ≤. (Note that replacing equality with inequality in the differential equation and initial conditions makes it much easier to find solutions.)

The maximum principle tells us that z and w provide upper and lower bounds respectively on u, and their derivatives provide bounds on the derivative of u. That is,

w(x) ≤ u(x) ≤ z(x)

and

w′(x) ≤ u′(x) ≤ z′(x).

We illustrate this technique on a variation on Bessel’s equation:

u″(x) + u′(x)/x − u(x) = 0

with initial conditions u(0) = 1 and u′(0) = 0. We wish to bound the solution u on the interval [0, 1].

As candidates, we try z = c₁x² + 1 and w = c₂x² + 1. (Why not a general quadratic polynomial? The initial conditions tell us that the linear coefficient will be 0 and the constant term will be 1.)

Since

L[cx² + 1] = c(4 − x²) − 1

it suffices to pick c₁ = 1/3 and c₂ = 1/4: on [0, 1] we have L[x²/3 + 1] = (1 − x²)/3 ≥ 0 and L[x²/4 + 1] = −x²/4 ≤ 0. This tells us that

x²/4 + 1 ≤ u(x) ≤ x²/3 + 1.

To see how well these bounds work, we compare them with the exact solution I₀, the so-called modified Bessel function of the first kind.

The lower bound is very good, though a comparably good upper bound would take a more careful choice of z.
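Here’s a quick numerical check of these bounds, using SciPy’s implementation of I₀ for the exact solution:

import numpy as np
from scipy.special import i0

x = np.linspace(0, 1, 101)
lower = x**2/4 + 1
upper = x**2/3 + 1
u = i0(x)  # exact solution

print(np.all(lower <= u) and np.all(u <= upper))  # True
print((u - lower).max(), (upper - u).max())       # the gap is smaller for the lower bound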

Source: Maximum Principles in Differential Equations by Protter and Weinberger


Musical pitch notation

How can you convert the frequency of a sound to musical notation? I wrote in an earlier post about how to calculate how many half steps a frequency is above or below middle C. It would be useful to go further and have code to output musical pitch notation.

In scientific pitch notation, the C near the threshold of hearing, around 16 Hz, is called C0. The C an octave higher is C1, the next C2, etc. Octaves begin with C; other notes use the octave number of the closest C below.

C4, middle C

The lowest note on a piano is A0, a major sixth up from C0. Middle C is C4 because it’s 4 octaves above C0. The highest note on a piano is C8.

Math

A4, the A above middle C, has a frequency of 440 Hz. This is nine half steps above C4, so the pitch of C4 is 440 · 2^(-9/12). C0 is four octaves lower, so its frequency is 2^(-4) = 1/16 that of C4. (Details for this calculation and the one below are given here.)

For a pitch P, the number of half steps from C0 to P is

h = 12 log2(P / C0).

Software

The Python code below calculates the number of half steps h from C0 up to a pitch, then computes the corresponding pitch notation.

from math import log2, pow

A4 = 440                # concert pitch: frequency of the A above middle C, in Hz
C0 = A4*pow(2, -4.75)   # A4 is 57 half steps, i.e. 4.75 octaves, above C0
name = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch(freq):
    h = round(12*log2(freq/C0))  # half steps above C0
    octave = h // 12
    n = h % 12                   # note within the octave
    return name[n] + str(octave)
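For example, a couple of sanity checks on pitches mentioned above:

print(pitch(261.63))  # C4, middle C
print(pitch(440))     # A4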

The pitch for A4 is its own variable in case you’d like to modify the code for a different tuning. While 440 Hz is common, tunings were often lower in the past, and you’ll sometimes see higher values like 444 today.

If you’d like to port this code to a language that doesn’t have a log2 function, you can use log(x)/log(2) for log2(x).

Powers of 2

When scientific pitch notation was first introduced, C0 was defined to be exactly 16 Hz, whereas now it works out to around 16.35 Hz. The advantage of the original system is that every C has a frequency that is a power of 2, i.e. Cn has frequency 2^(n+4) Hz. The formula above for the number of half steps a pitch is above C0 simplifies to

h = 12 log2(P) − 48.

If C0 has frequency 16 Hz, the A above middle C has frequency 2^8.75 ≈ 430.54 Hz, a little flat compared to A 440. But using the A 440 standard, C0 = 16 Hz is a convenient and fairly accurate approximation.


Fourier transform of a function on a graph

What is a Fourier transform at its core? An expansion of a function in terms of eigenfunctions of the Laplacian. For a function on the real line, the Laplacian is simply the second derivative. The functions mapped to multiples of themselves by the second derivative are sines and cosines of various frequencies. A Fourier series is a change of basis, using as basis vectors those functions that behave the simplest under the second derivative.

The Fourier transform of a function on a graph is also a change of basis, expanding a discrete function in terms of eigenfunctions of the Laplacian, in this case the graph Laplacian.

The Fourier transform of a function f, evaluated at a frequency ω, is the inner product of f with the eigenfunction exp(2πiωt).

\hat{f}(\omega) = \langle f, \exp(2\pi i \omega t) \rangle = \int_{-\infty}^\infty f(t) \exp(-2\pi i \omega t) \, dt

The inner product of two complex functions f and g is the integral of the product of f and the conjugate of g. Conjugation is why exp(2πiωt) became exp(-2πiωt).

The Fourier transform of a discrete function f on a graph, evaluated at an eigenvalue λi, is the inner product of f (i.e. the vector of values of f at each node) with the eigenvector associated with λi.

\hat{f}(\lambda_i) = \langle f, v_i \rangle = \sum_{j=1}^N f(j) v_i^*(j)

Here the inner product is a discrete sum rather than an integral. As before, we take the complex conjugate of the second item in the product.

The eigenvectors associated with the smallest eigenvalues of the graph Laplacian are analogous to low frequency sines and cosines. For these eigenvectors, the components corresponding to neighboring vertices in a graph should be close together. This analogy explains why spectral coordinates work so well.
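Here’s a minimal sketch of the graph Fourier transform in Python, using a path graph as a hypothetical example. The columns of the eigenvector matrix of the graph Laplacian play the role that complex exponentials play on the real line:

import numpy as np

# adjacency matrix of a path graph on five nodes
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

L = np.diag(A.sum(axis=1)) - A      # graph Laplacian
eigenvalues, V = np.linalg.eigh(L)  # columns of V are eigenvectors

f = np.array([1.0, 2.0, 0.0, -1.0, 3.0])  # a function on the nodes
f_hat = V.T @ f  # inner products with eigenvectors (real case, no conjugation needed)

print(f_hat)
print(V @ f_hat)  # the inverse transform recovers f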


Connection between hypergeometric distribution and series

What’s the connection between the hypergeometric distributions, hypergeometric functions, and hypergeometric series?

The hypergeometric distribution is a probability distribution with parameters N, M, and n. Suppose you have an urn containing N balls, M of them red and the rest, N − M of them, blue, and you select n balls at a time. The hypergeometric distribution gives the probability of selecting k red balls.

The probability generating function for a discrete distribution is the series formed by summing the product of the probability of each outcome k with x^k. So the probability generating function for a hypergeometric distribution is given by

f(x) = \sum \frac{{M\choose k}{N-M \choose n-k}}{{N \choose n}} x^k

The summation is over all integers, but the terms are only non-zero for k between 0 and M inclusive. (This may be more general than the definition of binomial coefficients you’ve seen before. If so, see these notes on the general definition of binomial coefficients.)

It turns out that f is a hypergeometric function of x because it can be written as a hypergeometric series. (Strictly speaking, f is a constant multiple of a hypergeometric function. More on that in a moment.)

A hypergeometric function is defined by a pattern in its power series coefficients. The hypergeometric function F(a, b; c; x) has the power series

F(a, b; c; x) = \sum_{k=0}^\infty \frac{(a)_k (b)_k}{(c)_k} \frac{x^k}{k!}

where (n)_k is the kth rising power of n. It’s a sort of opposite of factorial: start with n and multiply consecutive increasing integers for k terms. (n)_0 is an empty product, so it is 1. (n)_1 = n, (n)_2 = n(n+1), etc.

If the ratio of the (k+1)st term to the kth term in a power series is a rational function of k, then the series is (a multiple of) a hypergeometric series, and you can read the parameters of the hypergeometric series off that rational function. This ratio for our probability generating function works out to be

\frac{P(X=k+1)}{P(X=k)} = \frac{(k-M)(k-n)}{(k+N-M-n+1)(k+1)}

and so the corresponding hypergeometric function is F(−M, −n; N − M − n + 1; x). The constant term of a hypergeometric function is always 1, so evaluating our probability generating function at 0 tells us what constant multiplies F(−M, −n; N − M − n + 1; x). Now

f(0) = P(X = 0) = \frac{{N-M \choose n}}{{N \choose n}}

and so

f(x) = \frac{{N-M \choose n}}{{N \choose n}} F(-M, -n; N - M - n + 1; x)
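Here’s a numerical check of this identity in Python, with hypothetical parameter values. (Note that scipy.stats.hypergeom takes its arguments in the order total, successes, draws, i.e. N, M, n in the notation above.)

from scipy.special import hyp2f1
from scipy.stats import hypergeom

N, M, n = 20, 7, 5  # urn size, red balls, number drawn
x = 0.3

# PGF computed directly from the probability mass function
pgf = sum(hypergeom.pmf(k, N, M, n) * x**k for k in range(n + 1))

# PGF computed via the hypergeometric function identity above
pgf2 = hypergeom.pmf(0, N, M, n) * hyp2f1(-M, -n, N - M - n + 1, x)

print(pgf, pgf2)  # the two values agree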

The hypergeometric series above gives the original hypergeometric function as defined by Gauss, and may be the most common form in application. But the definition has been extended to allow any number of rising powers in the numerator and denominator of the coefficients. The classical hypergeometric function of Gauss is denoted 2F1 because it has two rising powers on top and one on bottom. In general, the hypergeometric function pFq has p rising powers in the numerator and q rising powers in the denominator.

The CDF of a hypergeometric distribution turns out to be a more general hypergeometric function:

P(X \leq k) = 1 - \frac{{n\choose k+1}{N-n\choose M-k-1}}{{N\choose M}} \phantom{ }_3F_2(1, k+1-M, k+1-n; k+2, N+k+2-M-n; 1)

where a = 1, b = k+1−M, c = k+1−n, d = k+2, and e = N+k+2−M−n.

Thanks to Jan Galkowski for suggesting this topic via a comment on an earlier post, Hypergeometric bootstrapping.

Structure in jazz and math

Last night I went to a concert by the Branford Marsalis Quartet. One of the things that impressed me about the quartet was how creative they are while also being squarely within a tradition. People who are not familiar with jazz may not realize how structured it is and how much it respects tradition. The spontaneous and creative aspects of jazz are more obvious than the structure. In some ways jazz is more tightly structured than classical music. To use Francis Schaeffer’s phrase, there is form and freedom, freedom within form.

Every field has its own structure, its tropes, its traditions. Someone unfamiliar with the field can be overwhelmed, not having the framework that an insider has to understand things. They may think something is completely original when in fact the original portion is small.

In college I used to browse the journals in the math library and be completely overwhelmed. I didn’t learn until later that usually very little in a journal article is original, and even the original part isn’t that original. There’s a typical structure for a paper in PDEs, for example, just as there are typical structures for romantic comedies, symphonies, or space operas. A paper in partial differential equations might look like this:

  1. Motivation / previous work
  2. Weak formulation of PDE
  3. Craft function spaces and PDE as operator
  4. A priori estimates imply operator properties
  5. Well posedness results
  6. Regularity

An expert knows these structures. They know what’s boilerplate, what’s new, and just how new the new part is. When I wrote something up for my PhD advisor I remember him saying “You know what I find most interesting?” and pointing to one inequality. The part he found interesting, the only part he found interesting, was not that special from my perspective. It was all hard work for me, but only one part of it stood out as slightly original to him. An expert in partial differential equations sees a PDE paper the way a professional musician listens to another or the way a chess master sees a chess board.

While a math journal article may look totally incomprehensible, an expert in that specialization might see 10% of it as somewhat new. An interesting contrast to this is the “abc conjecture.” Three and a half years ago Shinichi Mochizuki proposed a proof of this conjecture. But his approach is so entirely idiosyncratic that nobody has been able to understand it. Even after a recent conference held for the sole purpose of penetrating this proof, nobody but Mochizuki really understands it. So even though most original research is not that original, once in a while something really new comes out.


Consulting for consultants

They say that doctors make terrible patients, but in my experience consultants make great consulting clients. The best are confident in their own specialization and respect you in yours. They get going quickly and pay quickly. (I’ve only worked for consultants who have small companies. I imagine large consulting companies are as slow as other companies the same size.)

Sometimes consultants working in software development will ask me to help out with some mathematical part of their projects. And sometimes math/stat folks will ask me to help out with some computational part of their projects.

I started my consulting business three years ago. Since then I’ve gotten to know a few other consultants well. This lets me offer a broader range of services to a client by bringing in other people, and sometimes it helps me find projects.

If you’re a consultant and interested in working together, please send me an email introducing yourself. I’m most interested in meeting consultants who have some overlap with what I do but who also have complementary strengths.

Dilogarithm, polylogarithm, and related functions

The functions dilogarithm, trilogarithm, and more generally polylogarithm are meant to be generalizations of the logarithm. I first came across the dilogarithm in college when I was evaluating some integral with Mathematica, and they’ve paid a visit occasionally ever since.

Unfortunately polylogarithms are defined in several slightly different and incompatible ways. I’ll start by following An Atlas of Functions and then mention differences in A&S, SciPy, and Mathematica.

Polylogarithms

According to Atlas,

Polylogarithms are themselves special cases of Lerch’s function. Also known as Jonquière’s functions (Ernest Jean Philippe Fauque de Jonquières, 1820–1901, French naval officer and mathematician), they appear in the Feynman diagrams of particle physics.

The idea is to introduce an extra parameter ν in the power series for natural log:

\mbox{polyln}_\nu(x) = -\sum_{j=1}^\infty \frac{(1-x)^j}{j^\nu}

When ν = 1 we get the ordinary logarithm, i.e. polyln₁ = log. Then polyln₂ is the dilogarithm diln, and polyln₃ is the trilogarithm triln.

One advantage of the definition in Atlas is that the logarithm is a special case of the polylogarithm. Other conventions don’t have this property.

Other conventions

The venerable A&S defines dilogarithms in a way that’s equivalent to the negative of the definition above and does not define polylogarithms of any other order. SciPy’s special function library follows A&S. SciPy uses the name spence for the dilogarithm for reasons we’ll get to shortly.

Mathematica has the function PolyLog[ν, x] that evaluates to

\mbox{Li}_\nu(x) = \sum_{j=1}^\infty \frac{x^j}{j^\nu}

So polylnν(x) above corresponds to -PolyLog[ν, 1-x] in Mathematica. Matlab’s polylog is the same as Mathematica’s PolyLog.
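As a sanity check, here’s the conversion in Python using mpmath, whose polylog follows the Li convention. (The name polyln here is just our label for the Atlas definition.)

import mpmath

def polyln(nu, x):
    # Atlas convention in terms of Li: polyln_nu(x) = -Li_nu(1 - x)
    return -mpmath.polylog(nu, 1 - x)

print(polyln(1, 0.5))  # equals log(0.5)
print(mpmath.log(0.5))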

Relation to other functions

Spence’s integral is the function of x given by the integral

\int_1^x \frac{\log t}{t-1}\, dt

and equals diln(x). Note that the SciPy function spence returns the negative of the integral above.
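Here’s a quick numerical illustration, evaluating Spence’s integral at an arbitrarily chosen point:

from math import log
from scipy.integrate import quad
from scipy.special import spence

x = 3.0
diln, _ = quad(lambda t: log(t)/(t - 1), 1, x)  # Spence's integral
print(diln, -spence(x))  # the two values agree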

The Lerch function mentioned above is named for Mathias Lerch (1860–1922) and is defined by the integral

\Phi(x, \nu, u) = \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{t^{\nu-1} \exp(-ut)}{1 - x\exp(-t)} \, dt

The connection with polylogarithms is easier to see from the series expansion:

\Phi(x, \nu, u) = \sum_{j=0}^\infty \frac{x^j}{(j+u)^\nu}

The connection with polylogarithms is then

\Phi(x, \nu,1) = -\frac{1}{x} \mbox{polyln}_\nu(1-x)

Note that the Lerch function also generalizes the Hurwitz zeta function, which in turn generalizes the Riemann zeta function. When x = 1, the Lerch function reduces to ζ(ν, u).

Regular expression to match any chemical element

Here’s a frivolous exercise in regular expressions: Write a regex to match any chemical element symbol.

Here’s one solution.

A[cglmrstu]|B[aehikr]?|C[adeflmnorsu]?|D[bsy]|E[rsu]|F[elmr]?|G[ade]|H[efgos]?|I[nr]?|Kr?|L[airuv]|M[dgnot]|N[abdeiop]?|Os?|P[abdmortu]?|R[abefghnu]|S[bcegimnr]?|T[abcehilm]|U(u[opst])?|V|W|Xe|Yb?|Z[nr]

Making it more readable

Here’s the same expression in more readable form:

/
A[cglmrstu]     | 
B[aehikr]?      | 
C[adeflmnorsu]? | 
D[bsy]          | 
E[rsu]          | 
F[elmr]?        | 
G[ade]          | 
H[efgos]?       | 
I[nr]?          | 
Kr?             | 
L[airuv]        | 
M[dgnot]        | 
N[abdeiop]?     | 
Os?             | 
P[abdmortu]?    | 
R[abefghnu]     | 
S[bcegimnr]?    | 
T[abcehilm]     | 
U(u[opst])?     | 
V               | 
W               | 
Xe              | 
Yb?             | 
Z[nr]
/x

The /x option in Perl says to ignore white space. Other regular expression implementations have something similar. Python has two such options, X for similarity with Perl, and VERBOSE for readability. Both have the same behavior.

Regex syntax

The regular expression says that a chemical element symbol may start with A, followed by c, g, l, m, r, s, t, or u; or a B, optionally followed by a, e, h, i, k, or r; or …

The most complicated part of the regex is the part for symbols starting with U. There’s Uranium, whose symbol is simply U, and there are the elements that have temporary names based on their atomic numbers: Ununtrium, Ununpentium, Ununseptium, and Ununoctium. These are just Latin for one-one-three, one-one-five, one-one-seven, and one-one-eight. The symbols are U, Uut, Uup, Uus, and Uuo. The regex U(u[opst])? can be read “U, optionally followed by u and one of o, p, s, or t.”

Note that the regex will match any string that contains a chemical element symbol, but it could match more. For example, it would match “I’ve never been to Boston in the fall” because that string contains B, the symbol for boron. Exercise: Modify the regex to only match chemical element symbols.
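One approach to the exercise, sketched in Python: anchor the regex to the whole string with fullmatch rather than searching for a match anywhere.

import re

symbol = re.compile(
    r"A[cglmrstu]|B[aehikr]?|C[adeflmnorsu]?|D[bsy]|E[rsu]|F[elmr]?|"
    r"G[ade]|H[efgos]?|I[nr]?|Kr?|L[airuv]|M[dgnot]|N[abdeiop]?|Os?|"
    r"P[abdmortu]?|R[abefghnu]|S[bcegimnr]?|T[abcehilm]|U(u[opst])?|"
    r"V|W|Xe|Yb?|Z[nr]")

print(bool(symbol.search("I've never been to Boston in the fall")))  # True
print(bool(symbol.fullmatch("Boston")))  # False: not an element symbol
print(bool(symbol.fullmatch("B")))       # True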

Regex golf

There may be clever ways to use fewer characters at the cost of being more obfuscated. But this is for fun anyway, so we’ll indulge in a little regex golf.

There are five elements whose symbols start with I or Z: I, In, Ir, Zn, and Zr. You could write [IZ][nr] to match four of these. The regex I|[IZ][nr] would represent all five with 10 characters, while I[nr]?|Z[nr] uses 12. Two characters saved! Can you cut out any more?


Discrete harmonic functions

A (continuous) harmonic function f on an open set is a function that is twice differentiable and satisfies Laplace’s equation: ∇² f = 0. Such functions have a mean value property: at a point x interior to the set, f(x) equals the average value of f over any ball around x that fits inside the region. It turns out that the converse is true: a locally integrable function satisfying the mean value property is harmonic.

We can take the mean value property from the continuous case as the definition in the discrete case. A function on a graph is harmonic at a node x if f(x) equals the average of f’s values on the nodes connected to x. A function is said to have a singularity at points of a graph where it is not harmonic.

A non-constant function on a finite connected graph cannot be harmonic at every node. It has to have at least two singularities, namely the points where it takes on its maximum and its minimum. This is analogous to Liouville’s theorem for (continuous) harmonic functions: a bounded harmonic function on all of Euclidean space must be constant. Some infinite graphs have non-constant functions that are harmonic everywhere. For example, linear functions on the integers are harmonic.

In the continuous case, we want to find solutions to Laplace’s equation satisfying a boundary condition. We want to find a function that has the specified values on the boundary and is harmonic in the interior. With the right regularity conditions, this solution exists and is unique.

We can do the same for discrete harmonic functions. If we specify the values of a function on some non-empty subset of nodes, there is a unique discrete harmonic function that extends the function to the rest of the graph.

Discrete harmonic functions come up in applications such as random walks and electrical networks.
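Here’s a small sketch in Python of computing such an extension on a hypothetical path graph, with values fixed at the two end nodes. The harmonic condition at each free node gives one equation of a linear system:

import numpy as np

# adjacency matrix of the path graph 0-1-2-3-4
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

fixed = {0: 1.0, 4: 5.0}  # specified values at the "boundary" nodes
free = [i for i in range(5) if i not in fixed]

# At each free node i: deg(i) f(i) - sum of neighbors' f values = 0
deg = A.sum(axis=1)
M = np.diag(deg[free]) - A[np.ix_(free, free)]
b = sum(A[free, j] * v for j, v in fixed.items())
f_free = np.linalg.solve(M, b)
print(f_free)  # [2. 3. 4.]: the linear interpolation, as expected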

Related: Can you hear the shape of a network?


Mathematics of medical plastics

In this interview, I talk with Ray Rilling about applying mathematics to manufacturing medical plastics.

Photo of Ray Rilling

JC: Ray, could you start by saying a little about yourself?

RR: Sure. My name is Ray Rilling, and I am the Director of Technology at Putnam Plastics.

My initial training was cellular biology with an emphasis on epidemiology, but I decided to move to a more applied (and lucrative) field. I’ve been in medical plastics for the last 23 years.

When I saw that epidemiology was a crowded field, I pursued substantial amounts of additional coursework that allowed me to get into the medical engineering plastics space. In lots of ways I think this pick-and-choose approach was much better than a traditional bio-medical engineering degree would have been. It allowed me to focus on exactly what I needed to learn.

JC: Thanks. And tell me a bit about where you work.

RR: Putnam Plastics is a 30-year-old medical plastics manufacturer, specializing in polymer-based medical components (i.e. plastic tubing for medical usage).

JC: I did some consulting work for a company that manufactures stents, so I imagine your companies have a few things in common. What is your day-to-day mathematical toolkit?

RR: We cross a lot of boundaries and there are many layers of computational needs. The majority of the computations are basic algebraic formulas to calculate simple things like annular tooling draw down ratios or the ultimate tensile of a tube. From there, you may find that calculating the burst pressure of a composite may require solving a linear system.

The most challenging computations that we perform are for our complex polymer extrusion tools. These are based on Computational Fluid Dynamics (CFD) and Finite or Boundary Element Methods (FEM/BEM). Many complexities emerge because polymer flow is compressible, non-isothermal, and viscoelastic in nature, which does not allow us to use the conventional Navier–Stokes equations or other common equations. In all reality, we rely on proprietary equations to point us in the right direction. No model has ever provided a perfect solution. We combine our experience and expertise with the applied calculations.

The most complex of the computations are performed with third party software packages and often require the use of complex meshing, materials testing for inputs, and can even require us to create modified code to suit the specific application. We work with some very creative minds and there is never a shortage of new challenges.

Aside from the design engineering mathematics, we think statistically about our results. This can be to analyze the process capability, equivalency, and even product risk analysis. This statistical analysis can be intertwined with the design engineering aspects and deal with a range of biological and medical constants (i.e. devices might need to handle liquid nitrogen or remain in bodies for years or decades).

One of my mentors used to say, “The best equation is teamwork.” Teamwork is a sort of expectation, a crowd-sourced form of expert opinion. It allows you to bring a focus on what the real variables are.

Some calculations can take weeks, so having a good appreciation of how to approach the challenge is important. You can get away with not knowing what’s under the hood. But that’s dangerous. It is much better to get things done more quickly, and with better understanding. Especially when a client is waiting on results.

JC: What is some math you wish you learned more of?

RR: Vector and tensor-based physics. As the bar in the medical device industry increases, so do the expectations of the products that we make. Being able to take a large number of variables or inputs and transform them into a workable model is extremely valuable. My dream is to be able to reliably calculate extrusion tooling, product failure modes, and performance properties before I start cutting steel. But in our time and age, we still rely on some development iterations and calculate what we can.

I also wish I had more applied materials science. When I am developing something in the lab, sometimes the product does not perform the way you want it to. Every time I start learning something new, like applied surface chemistry or the effects of radiation on polymers, I think of 100 things that I could have done in previous projects.

JC: Did you learn any math you haven’t had much use for yet? I’ve used most of the math I learned, though sometimes there’s been a gap of twenty years between learning something and applying it.

RR: I actually use pretty much all of the math I’ve learned. But that’s likely because I was able to pick and choose because I went back and did supplemental studies. Even the epidemiology is useful.

JC: I wanted to pick up on what you said earlier about physical models. Could you say more about the interplay of physical and mathematical models at Putnam?

RR: The models are never completely right. We use them as a tool. They often fail in three key ways:

  1. We find additional variables we need to account for or constrain. In many cases our first attempt failed because the product did not perform properly in an unaccounted for way. At the end of the day it might take many iterations before we have all of the performance properties that are needed.
  2. We are translating soft or subjective numbers from physicians or engineers. A model will provide concrete numbers. It is often important to create a baseline product that can take the customer’s subjective description and correlate it to a model.
  3. We need to additionally constrain for cost effectiveness (i.e. minimize the amount of expensive wire in a catheter).

Those three trade offs mean that we are never able to just take a model print out and run with it. There always has to be some back and forth.

JC: It’s interesting that you consider shortcomings besides inaccuracy as failures. Someone new to applying math in the real world might think of your first point but not the other two. The latter two aren’t failures from an academic perspective, but they certainly are from a business perspective.


Hypergeometric bootstrapping: implement one, get seven free

Suppose you have software to compute one hypergeometric function. How many more hypergeometric functions can you compute from it?

Hypergeometric functions satisfy a lot of identities, so you can bootstrap one such function into many more. That’s one reason they’re so useful in applications. For this post, I want to focus on just three formulas, the so-called linear transformation formulas that relate one hypergeometric function to another. (There are more linear formulas that relate one hypergeometric function to a combination of two others. I may consider those in a future post, but not today.)

Linear transformations

The classical hypergeometric function has three parameters. The linear transformations discussed above relate the function with parameters (a, b; c) to those with parameters

  • (c − a, c − b; c)
  • (a, c − b; c)
  • (b, c − a; c)

Here are the linear transformations in detail:

\begin{eqnarray*} F(a, b; c; z) &=& (1-z)^{c -a-b} F(c-a, c-b; c; z) \\ &=& (1-z)^{-a} F\left(a, c-b; c; \frac{z}{z-1}\right) \\ &=& (1-z)^{-b} F\left(b, c-a; c; \frac{z}{z-1}\right) \end{eqnarray*}

How many more?

The three transformations above are the ones listed in Handbook of Mathematical Functions by Abramowitz and Stegun. How many more transformations can we create by combining them? To answer this, we represent each of the transformations with a matrix. The transformations correspond to multiplying the matrix by the column vector (a, b, c).

\begin{eqnarray*} A &=& \left( \begin{array}{rrr} -1 & 0 & \phantom{-}1 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{array} \right) \\ B &=& \left( \begin{array}{rrr} \phantom{-}1 & 0 & 0 \\ 0 & -1 & \phantom{-}1 \\ 0 & 0 & 1 \end{array} \right) \\ C &=& \left( \begin{array}{rrr} 0 & \phantom{-}1 & 0 \\ -1 & 0 & \phantom{-}1 \\ 0 & 0 & 1 \end{array} \right) \end{eqnarray*}

The question of how many transformations we can come up with can be recast as asking for the order of the group generated by A, B, and C. (I’m jumping ahead a little by presuming these matrices generate a group. Conceivably they don’t have inverses, but in fact they do: the inverses of A, B, and C are A, B, and AC respectively.)

A little experimentation reveals that there are eight transformations: I, A, B, C, D = AB, E = AC, F = BC, and G = CB. So in addition to the identity I, the do-nothing transformation, we have found four more: D, E, F, and G. These last four correspond to taking (a, b, c) to

  • (c − a, b, c)
  • (c − b, a, c)
  • (b, a, c)
  • (c − b, c − a, c)

This means that if we have software to compute the hypergeometric function with one fixed set of parameters, we can bootstrap that into computing the hypergeometric function with up to seven more sets of parameters. (Some of the seven new combinations could be repetitive if, for example, a = b.)
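Here’s a short Python sketch that enumerates the group generated by A, B, and C, multiplying elements until nothing new appears:

import numpy as np

A = np.array([[-1, 0, 1], [0, -1, 1], [0, 0, 1]])
B = np.array([[ 1, 0, 0], [0, -1, 1], [0, 0, 1]])
C = np.array([[ 0, 1, 0], [-1, 0, 1], [0, 0, 1]])

elements = {tuple(np.eye(3, dtype=int).ravel())}
frontier = [np.eye(3, dtype=int)]
while frontier:
    M = frontier.pop()
    for G in (A, B, C):
        P = G @ M
        key = tuple(P.ravel())
        if key not in elements:
            elements.add(key)
            frontier.append(P)

print(len(elements))  # 8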

Identifying the group

We have uncovered a group of order 8, and there are only 5 groups that size. We should be able to find a familiar group isomorphic to our group of transformations.

The five groups of order 8 are Z8 (cyclic group), Z4×Z2 (direct product), E8 (elementary Abelian group), D8 (dihedral group), and the quaternion group. The first three are Abelian, and our group is not, so our group must be isomorphic to either the quaternion group or the dihedral group. The quaternion group has only one element of order 2, while the dihedral group has five. Our group has five elements of order 2 (A, B, D, F, and G) and so it must be isomorphic to the dihedral group of order 8, the rotations and reflections of a square. (Some authors call this D8, because the group has eight elements. Others call it D4, because it is the group based on a 4-sided figure.)

Dihedral group of order 8


Can you hear the shape of a network?

Mark Kac asked in 1966 whether you can hear the shape of a drum. In other words, can two drums made of the same material produce the exact same sound but have different shapes? More formally, Kac asked whether the eigenvalues of Laplace’s equation with zero boundary conditions uniquely determine the shape of a region in the plane. The question remained open until 1992. No, you can’t always hear the shape of a drum. But often you can: with some restrictions on the regions, the shape is uniquely determined by the sound, i.e., the Laplace spectrum.

Ten years before Kac asked about hearing the shape of a drum, Günthard and Primas asked the analogous question about graphs. Can you hear the shape of a graph? That is, can two different graphs have the same eigenvalues? At the time, the answer was believed to be yes, but a year later it was found to be no, not always [1]. So the next natural question is when can you hear the shape of a graph, i.e. under what conditions is a graph determined by its eigenvalues? It depends on which matrix you’re taking the eigenvalues of, but under some conditions some matrix spectra uniquely determine graphs. We don’t know in general how common it is for spectra to uniquely determine graphs.

Here are two graphs that have the same adjacency matrix spectra, first published in [2]:

Both have adjacency spectra [-2, 0, 0, 0, 2]. But the graphs are not cospectral as far as the Laplacian is concerned. Their Laplace spectra are [0, 0, 2, 2, 4] and [0, 1, 1, 1, 5] respectively. We could tell that the Laplace spectra would be different before computing them because the second smallest Laplace eigenvalue is positive if and only if a graph is connected.
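From the description, the pair appears to be the square plus an isolated point and the star K1,4. Assuming that identification, here’s a sketch that checks both spectra with NetworkX:

import numpy as np
import networkx as nx

G1 = nx.disjoint_union(nx.cycle_graph(4), nx.empty_graph(1))  # square plus a point
G2 = nx.star_graph(4)                                         # K_{1,4}

for G in (G1, G2):
    A = nx.adjacency_matrix(G).todense()
    L = nx.laplacian_matrix(G).todense()
    print(np.round(np.linalg.eigvalsh(A), 6),
          np.round(np.linalg.eigvalsh(L), 6))
# same adjacency spectrum, different Laplace spectra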

The graphs below are cospectral for the adjacency, Laplacian, and signless Laplacian matrices.

Both graphs have the same number of nodes and edges, and every node has degree 4 in both graphs. But the graph on the left contains more triangles than the one on the right, so they cannot be isomorphic.

One way to test whether two graphs are isomorphic is to compute their spectra. If the spectra are different, the graphs are not isomorphic. So spectral analysis gives a way to show that two graphs are not isomorphic in polynomial time, though the test may be inconclusive.

If two graphs do have the same spectra, what is the probability that they are isomorphic? In [1] the authors answer this question empirically for graphs of order up to 11. Very roughly, there’s about an 80% chance graphs with the same adjacency matrix spectrum are isomorphic. The chances go up to 90% for the Laplacian and 95% for the signless Laplacian.

* * *

[1] Edwin R. van Dam, Willem H. Haemers. Which graphs are determined by their spectrum? Linear Algebra and its Applications 373 (2003) 241–272

[2] D. M. Cvetković. Graphs and their spectra. Univ. Beograd. Publ. Elektrotehn. Fak. Ser. Mat. Fiz. 354–356 (1971) 1–50.

Discrete Laplace transform

The relationship between the discrete Laplace transform and discrete Fourier transform is not quite the same as that between their continuous counterparts.

Continuous Fourier and Laplace transforms

The continuous versions of the Fourier and Laplace transforms are given as follows.

Fourier transform:

{\cal F}(f)(\omega) = \int_{-\infty}^\infty \exp(-i\omega x) f(x)\, dx

Laplace transform:

{\cal L}(f)(s) = \int_0^\infty \exp(-s x) f(x)\, dx

The Fourier transform is defined several ways, and I actually prefer the convention that puts a factor of 2π in the exponential, but the convention above makes the analogy with the Laplace transform simpler. There are two differences between the Fourier and Laplace transforms. The Laplace transform integrates over only half the real line, compared to the entire real line for Fourier. But a variation on the Laplace transform, the bilateral Laplace transform, integrates over the entire real line. The bilateral Laplace transform at s is simply the Fourier transform at −is. And of course the same is true for the (one-sided) Laplace transform if the function f is only non-zero for positive values.

I’ve encountered the Fourier transform more in application, and the Laplace transform more in teaching. This is not to say the Laplace transform isn’t used in practice; it certainly is. But the two transforms serve similar purposes, and the Laplace transform is easier to teach. Because the factor exp(-sx) decays rapidly, the integral defining the Laplace transform converges for functions where the integral defining the Fourier transform would not. Such functions may still have Fourier transforms, but the transforms require distribution theory, whereas the Laplace transforms can be computed using basic calculus.

Discrete Fourier and Laplace Transforms

There’s more difference between the discrete versions of the Fourier and Laplace transforms than between the continuous versions.

The discrete Fourier transform (DFT) approximates the integral defining the (continuous) Fourier transform with a finite sum. It discretizes the integral and truncates its domain. The discrete Laplace transform is an infinite sum. It discretizes the integral defining the Laplace transform, but it does not truncate the domain. Given a step size η > 0, the discrete Laplace transform of f is

{\cal L}_\eta(f)(s) = \eta \sum_{n=0}^\infty \exp(-sn\eta) f(n\eta)

The discrete Laplace transform isn’t “as discrete” as the discrete Fourier transform. The latter takes a finite sequence and returns a finite sequence. The former evaluates a function at an infinite number of points and produces a continuous function.
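Here’s a quick numerical sketch: as η shrinks, the discrete Laplace transform of f(x) = exp(−x) approaches the continuous Laplace transform, which is 1/(s + 1).

import numpy as np

def discrete_laplace(f, s, eta, terms=100_000):
    n = np.arange(terms)
    return eta * np.sum(np.exp(-s*n*eta) * f(n*eta))

f = lambda x: np.exp(-x)
s = 2.0
for eta in [1.0, 0.1, 0.01]:
    print(eta, discrete_laplace(f, s, eta))  # converges to 1/3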

The discrete Laplace transform is used in applications such as signal processing, as well as in the theory of analytic functions.

Connection with the z-transform and generating functions

If η = 1 and z = exp(s), the discrete Laplace transform becomes the z-transform of the values of f at non-negative integers. And if we replace z with 1/z, or equivalently set z = exp(-s) instead of z = exp(s), we get the generating function of the values of f at non-negative integers.

z-transforms are common in digital signal processing, while generating functions are common in combinatorics. They are essentially the same thing.