Queueing theory and regular expressions

Queueing theory is the study of waiting in line. That may not sound very interesting, but the subject is full of surprises. For example, when a server is near capacity, adding a second server can cut backlog not just in half but by an order of magnitude or more. More on that here.
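To make that concrete, here's a quick sketch using the standard M/M/s waiting-time formulas (Erlang C). The arrival rate 0.9 and service rate 1 are made-up numbers for illustration, not from any particular example:

```python
from math import factorial

def erlang_c(s, a):
    """Probability that an arrival must wait in an M/M/s queue, offered load a = lam/mu."""
    rho = a / s
    top = a**s / (factorial(s) * (1 - rho))
    return top / (sum(a**k / factorial(k) for k in range(s)) + top)

def mean_wait(s, lam, mu):
    """Average time spent waiting in queue for M/M/s."""
    return erlang_c(s, lam / mu) / (s * mu - lam)

print(mean_wait(1, 0.9, 1.0))   # ≈ 9: one server at 90% utilization
print(mean_wait(2, 0.9, 1.0))   # ≈ 0.25: two servers sharing the same workload
```

With one server the average wait is 9 time units; with a second server it drops to about 0.25, a factor of roughly 35, not merely a factor of 2.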

In this post, I’m not going to look at the content of queueing theory but just the name. When seeing the word queueing, many people assume it is misspelled and want to replace it with queuing. As I write this, my spell checker is recommending that change. But the large majority of people in the field call their subject queueing theory.

Queueing has five consecutive vowels. Are there any other English words with so many vowels in a row? To find out, I did an experiment like the classic examples from Kernighan and Pike demonstrating regular expressions by searching a dictionary for words with various patterns.

The first obstacle I ran into is that apparently Ubuntu doesn’t come with a dictionary file. However this post explains how to install one.

The second obstacle was that queueing wasn’t in the dictionary. I first searched the American English dictionary, and it didn’t include this particular word. But then I searched the British English dictionary and found it.

Now, to find all words with five consecutive vowels, I ran this:

    egrep '[aeiou]{5}' british-english

and got back only one result: queueing. So at least in this dictionary of 101,825 words, the only word with five (or more) consecutive vowels is queueing.

Out of curiosity I checked how many words have at least four consecutive vowels. The command

    grep -E "[aeiou]{4}" british-english | wc -l

returns 40, and the corresponding command with the file american-english returns 39. So both dictionaries contain 39 words with exactly four consecutive vowels. But are they the same 39 words? In fact they are. And one of the words is queuing, so apparently that is an acceptable spelling in both American and British English. But as a technical term in mathematics, the accepted spelling is queueing.

Some of the 39 words are plurals or possessives of other words. For example, the list begins with Hawaiian, Hawaiian’s, and Hawaiians. (Unix sorts capital letters before lower case letters.) After removing plurals and possessives, we have 22 words with four consecutive vowels:

  • Hawaiian
  • Iroquoian
  • Kauai
  • Kilauea
  • Louie
  • Montesquieu
  • Quaoar
  • Rouault
  • aqueous
  • gooier
  • gooiest
  • obsequious
  • obsequiously
  • obsequiousness
  • onomatopoeia
  • pharmacopoeia
  • plateaued
  • plateauing
  • queue
  • queued
  • queuing
  • sequoia

It’s interesting that three of the words are related to Hawaii: Kauai is a Hawaiian island and Kilauea is a Hawaiian volcano.

I’d never heard of Quaoar before. It’s a Kuiper belt object about half the size of Pluto.

Hyperexponential and hypoexponential distributions

There are a couple different ways to combine random variables into a new random variable: means and mixtures. To take the mean of X and Y you average their values. To take the mixture of X and Y you average their densities. The former makes the tails thinner. The latter makes the tails thicker. When X and Y are exponential random variables, the mean has a hypoexponential distribution and the mixture has a hyperexponential distribution.

Hypoexponential distributions

Suppose X and Y are exponentially distributed with mean μ. Then their sum X + Y has a gamma distribution with shape 2 and scale μ. The sum has mean 2μ and variance 2μ². The coefficient of variation, the ratio of the standard deviation to the mean, is 1/√2. The hypoexponential distribution is so-called because its coefficient of variation is less than 1, whereas an exponential distribution has coefficient of variation 1 because the mean and standard deviation are always the same.

The means of X and Y don’t have to be the same. When they’re different, the sum does not have a gamma distribution, and so hypoexponential distributions are more general than gamma distributions.

A hypoexponential random variable can also be the sum of more than two exponentials. If it is the sum of n exponentials, each with the same mean, then the coefficient of variation is 1/√n. In general, the coefficient of variation for a hypoexponential distribution is the square root of the sum of the squared means divided by the sum of the means [1].

In the introduction we talked about means rather than sums, but it makes no difference to the coefficient of variation because going from sum to mean divides the mean and the standard deviation by the same amount.

Hyperexponential distributions

Hyperexponential random variables are constructed as a mixture of exponentials rather than an average. Instead of selecting a value from X and a value from Y, we select a value from X or a value from Y. Given a mixture probability p, we choose a sample from X with probability p and a sample from Y with probability 1 – p.

The density function for a mixture is an average of the densities of the two components. So if X and Y have density functions fX and fY, then the mixture has density

p fX + (1 – p) fY

If you have more than two random variables, the distribution of their mixture is a convex combination of their individual densities. The coefficients in the convex combination are the probabilities of selecting each random variable.

If X and Y are exponential with means μX and μY, and we have a mixture that selects X with probability p, then the mean of the mixture is the mixture of the means

μ = p μX + (1 – p) μY

which you might expect, but the variance

σ² = p μX² + (1 – p) μY² + p(1 – p)(μX – μY)²

is not quite analogous because of the extra p(1 – p)(μX – μY)² term at the end. If μX = μY this last term drops out and the coefficient of variation is 1: mixing two identically distributed random variables doesn’t do anything to the distribution. But when the means are different, the coefficient of variation is greater than 1 because of the extra term in the variance of the mixture.

Example

Suppose μX = 2 and μY = 8. Then the average of X and Y has mean 5, and so does an equal mixture of X and Y.

The average of X and Y has standard deviation √17, and so the coefficient of variation is √17/5 ≈ 0.8246.

An exponential distribution with mean 5 would have standard deviation 5, and so its coefficient of variation is 1.

An equal mixture of X and Y has standard deviation √43, and so the coefficient of variation is √43/5 ≈ 1.3114.
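These numbers are easy to confirm by simulation. Here's a sketch; the sample size and seed are arbitrary:

```python
import random

random.seed(42)
N = 200_000
xs = [random.expovariate(1/2) for _ in range(N)]   # exponential with mean 2
ys = [random.expovariate(1/8) for _ in range(N)]   # exponential with mean 8

def cv(samples):
    """Coefficient of variation: standard deviation divided by mean."""
    m = sum(samples) / len(samples)
    var = sum((s - m)**2 for s in samples) / len(samples)
    return var**0.5 / m

averages = [(x + y) / 2 for x, y in zip(xs, ys)]                       # hypoexponential
mixture = [x if random.random() < 0.5 else y for x, y in zip(xs, ys)]  # hyperexponential

print(cv(averages))   # ≈ 0.82
print(cv(mixture))    # ≈ 1.31
```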

[1] If exponential random variables Xi have means μi, then the coefficient of variation of their sum (or average) is

√(μ1² + μ2² + … + μn²) / (μ1 + μ2 + … + μn)

Primes that don’t look like primes

Primes usually look odd. They’re literally odd [1], but they typically don’t look like they have a pattern, because a pattern would often imply a way to factor the number.

However, 12345678910987654321 is prime!

I posted this on Twitter, and someone with the handle lagomoof replied that the analogous numbers are prime for bases 2, 3, 4, 6, 9, 10, 16, and 40. So, for example, the hexadecimal number 123456789abcdef10fedcba987654321 would be prime. The following Python code proves this is true.

    from sympy import isprime

    for b in range(2, 41):
        # n is 1 2 3 ... b ... 3 2 1 read as a base-b numeral,
        # the analogue of 12345678910987654321 in base 10
        n = 0
        for i in range(0, b):
            n += (i+1)*b**i
        for i in range(b+1, 2*b):
            n += (2*b-i)*b**i
        print(b, isprime(n))

By the way, some have suggested that these numbers are palindromes. They’re not quite because of the 10 in the middle. You can show that the only prime palindrome with an even number of digits is 11. This is an example of how a pattern in the digits leads to a way to factor the number.
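Here's a quick brute-force check of that palindrome fact for palindromes up to six digits (the divisibility follows from the alternating-sum test for 11):

```python
def is_palindrome(n):
    s = str(n)
    return s == s[::-1]

# Every palindrome with an even number of digits has alternating digit
# sum zero, so it is divisible by 11; hence 11 is the only such prime.
even_pals = [n for n in range(10, 10**6)
             if len(str(n)) % 2 == 0 and is_palindrome(n)]
print(all(n % 11 == 0 for n in even_pals))   # True
```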

[1] “All primes are odd except 2, which is the oddest of all.” — Concrete Mathematics

Computational survivalist

Some programmers and systems engineers try to do everything they can with basic command line tools on the grounds that someday they may be in an environment where that’s all they have. I think of this as a sort of computational survivalism.

I’m not much of a computational survivalist, but I’ve come to appreciate such a perspective. It’s an efficiency/robustness trade-off, and in general I’ve come to appreciate the robustness side of such trade-offs more over time. It especially makes sense for consultants who find themselves working on someone else’s computer with no ability to install software. I’m rarely in that position, but that’s kinda where I am on one project.

Example

I’m working on a project where all my work has to be done on the client’s laptop, and the laptop is locked down for security. I can’t install anything. I can request to have software installed, but it takes a long time to get approval. It’s a Windows box, and I requested a set of ports of basic Unix utilities at the beginning of the project, not knowing what I might need them for. That has turned out to be a fortunate choice on several occasions.

For example, today I needed to count how many times certain characters appear in a large text file. My first instinct was to write a Python script, but I don’t have Python. My next idea was to use grep -c, but that would count the number of lines containing a given character, not the number of occurrences of the character per se.

I did a quick search and found a Stack Overflow question “How can I use the UNIX shell to count the number of times a letter appears in a text file?” On the nose! The top answer said to use grep -o and pipe it to wc -l.

The -o option tells grep to output the regex matches, one per line. So counting the number of lines with wc -l gives the number of matches.
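For example, counting the letter e in a made-up bit of text:

```shell
printf 'queueing theory\n' | grep -o 'e' | wc -l
```

This prints 3: grep -o puts each of the three matches on its own line, and wc -l counts the lines.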

Computational minimalism

Computational minimalism is a variation on computational survivalism. Computational minimalists limit themselves to a small set of tools, maybe the same set of tools as a computational survivalist, but for different reasons.

I’m more sympathetic to minimalism than survivalism. You can be more productive by learning to use a small set of tools well than by hacking away with a large set of tools you hardly know how to use. I use a lot of different applications, but not as many as I once used.

What exactly is a day?

How many days are in a year? 365.

How many times does the earth rotate on its axis in a year? 366.

If you think a day is the time it takes for earth to rotate once around its axis, you’re approximately right, but off by about four minutes.

What we typically mean by “day” is the time it takes for earth to return to the same position relative to the sun, i.e. the time from one noon to the next. But because the earth is orbiting the sun as well as rotating on its axis, it has to complete a little more than one rotation around its axis to bring the sun back to the same position.

Since there are approximately 360 days in a year, the earth has to rotate about 361 degrees to bring the sun back into the same position. This extra rotation takes 1/360th of a day, or about 4 minutes. (We’ll be more precise below.)

Here’s another way to see that the number of rotations has to be more than the number of days in a year. To make the numbers simple, assume there are 360 days in a year. Suppose you’re looking up at the noon sun one day. Next suppose it is exactly 180 days later and the earth has rotated 180 times. Because the earth is on the opposite side of the sun, you would be on the opposite side of the sun too. For it to be noon again 180 days later, the earth would have to have made 180.5 rotations, making 361 rotations in a 360-day year.

Sidereal days

Imagine an observer looking straight down at the Earth’s north pole from several light years away. Imagine also an arrow from the center of the Earth to a fixed point on the equator. The time it takes for the observer to see that arrow return to the same angle is a sidereal day. The time it takes for that arrow to go from pointing at the sun to pointing at the sun again is a solar day, which is about four minutes longer [1].

If you assume a year has 365.25 (solar) days, but also assume Earth is in a circular orbit around the sun, you’ll calculate a sidereal day to be about 3 minutes and 57 seconds shorter than a solar day. Taking the elliptical shape of Earth’s orbit into account takes about a second off that amount.

[1] Technically this is an apparent solar day, a solar day from the perspective of an observer. The length of an apparent solar day varies through the year, but we won’t go into that here.

Harmonographs

In the previous post, I said that Lissajous curves are the result of plotting a curve whose x and y coordinates come from (undamped) harmonic oscillators. If we add a little bit of dampening, multiplying our cosine terms by negative exponentials, the resulting curve is called a harmonograph.

Here’s a bit of Mathematica code to draw harmonographs.

    harmonograph[f1_, f2_, ph1_, ph2_, d1_, d2_, T_] := 
        ParametricPlot[{
            Cos[f1 t + ph1] Exp[-d1 t], 
            Cos[f2 t + ph2] Exp[-d2 t]
        }, {t, 0, T}]

For example,

harmonograph[3, 4, 0, Sqrt[2], 0.01, 0.02, 100]

is a slight variation on one of the curves from the previous post, adding dampening terms exp(-0.01t) and exp(-0.02t). The result is the following.

Harmonograph
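If you'd rather experiment outside Mathematica, here's a rough Python equivalent that samples the same damped curve; pass the points to any plotting library to draw it:

```python
import math

def harmonograph(f1, f2, ph1, ph2, d1, d2, T, n=4000):
    """Sample x = cos(f1 t + ph1) e^(-d1 t), y = cos(f2 t + ph2) e^(-d2 t)."""
    points = []
    for i in range(n):
        t = T * i / (n - 1)
        x = math.cos(f1 * t + ph1) * math.exp(-d1 * t)
        y = math.cos(f2 * t + ph2) * math.exp(-d2 * t)
        points.append((x, y))
    return points

# same parameters as the Mathematica call above
points = harmonograph(3, 4, 0, math.sqrt(2), 0.01, 0.02, 100)
```

Something like matplotlib's `plot(*zip(*points))` then reproduces the figure.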

You can get a wide variety of graphs by playing with the parameters. A couple years ago I wrote about an attempt to reproduce the background of the image in the poster for Alfred Hitchcock’s movie Vertigo. The curve is another example of a harmonograph.

Vertigo harmonograph

Lissajous curves and knots

Suppose that over time the x and y coordinates of a point are both given by a harmonic oscillator, i.e.

x(t) = cos(nx t + φx)
y(t) = cos(ny t + φy)

Then the resulting path is called a Lissajous curve.

If you add a z coordinate also given by harmonic oscillator

z(t) = cos(nz t + φz)

then the result is a Lissajous knot. This means that if we project a Lissajous knot onto a coordinate plane, we get a Lissajous curve.

For example, here is a Lissajous knot. See the code at the bottom for parameter details.

Lissajous knot

Here are its projections onto the xy plane,

xy-plane projection

the yz plane,

yz plane projection

and the xz plane.

xz plane projection

If you’ve ever seen an oscilloscope, you may have seen Lissajous curves on its screen.

In order to get a knot, we require that the frequencies, the n’s, are pairwise relatively prime. We also require that the curve not intersect itself, which imposes a constraint on the phases, the φ’s, as well.

Last week I used torus knots as an example of a family of knots. The Lissajous knots are a distinct family of knots: no torus knot is also a Lissajous knot.

Here’s the Mathematica code to produce the images above.

    x[t_] := Cos[3 t]
    y[t_] := Cos[4 t + Sqrt[2]]
    z[t_] := Cos[5 t + Sqrt[3]]

    ParametricPlot3D[{x[t], y[t], z[t]}, {t, 0, 2 Pi}]

    ParametricPlot[{x[t], y[t]}, {t, 0, 2 Pi}]
    ParametricPlot[{y[t], z[t]}, {t, 0, 2 Pi}]
    ParametricPlot[{x[t], z[t]}, {t, 0, 2 Pi}]

Curvature of an ellipsoid

For an ellipsoid with equation

\left(\frac{x}{a}\right)^2 + \left(\frac{y}{b}\right)^2 + \left(\frac{z}{c}\right)^2 = 1

the Gaussian curvature at each point is given by

K(x,y,z) = \frac{1}{a^2 b^2 c^2 \left(\frac{x^2}{a^4} + \frac{y^2}{b^4} + \frac{z^2}{c^4} \right )^2}

Now suppose a ≥ b ≥ c > 0. Otherwise relabel the coordinate axes so that this is the case. Then the largest curvature occurs at (±a, 0, 0), and the smallest curvature occurs at (0, 0, ±c).

You could prove this using algebra by manipulating inequalities, or using calculus with Lagrange multipliers.

To see intuitively why this might be true, it helps to exaggerate the shape. First imagine that a is much larger than b or c. Then you have a cigar shape, and the greatest curvature is at the two ends. And if you imagine c being much smaller than a and b, you have sort of a pancake shape which is flat on top and bottom.

The maximum curvature is (a/bc)² and the minimum curvature is (c/ab)².
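Here's a quick numerical sanity check in Python, using an example ellipsoid with a = 3, b = 2, c = 1 (arbitrary values satisfying a ≥ b ≥ c):

```python
import math
import random

def K(x, y, z, a, b, c):
    """Gaussian curvature at a point (x, y, z) on the ellipsoid."""
    return 1 / (a**2 * b**2 * c**2 * (x**2/a**4 + y**2/b**4 + z**2/c**4)**2)

a, b, c = 3.0, 2.0, 1.0
random.seed(1)

curvatures = []
for _ in range(10_000):
    # pick a random direction and scale it to land on the ellipsoid
    u, v, w = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    s = 1 / math.sqrt((u/a)**2 + (v/b)**2 + (w/c)**2)
    curvatures.append(K(s*u, s*v, s*w, a, b, c))

print(K(a, 0, 0, a, b, c))   # (a/(bc))² = 2.25, the maximum
print(K(0, 0, c, a, b, c))   # (c/(ab))² ≈ 0.028, the minimum
print(min(curvatures), max(curvatures))   # strictly between the two extremes
```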

Fixed points

Take a calculator and enter any number. Then press the cosine key over and over. Eventually the numbers will stop changing. You will either see 0.99984774 or 0.73908513, depending on whether your calculator was in degree mode or radian mode. This is an example of a fixed point, a point that doesn’t change when you apply a function.

The example above is actually two examples, one for cosine of x degrees and one for cosine of x radians. These are two different functions, and they have different fixed points. Note that the two fixed points are not simply related to each other by converting between degrees and radians.
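You can carry out the calculator experiment in a few lines of Python. math.cos works in radians, so the degree version converts explicitly:

```python
import math

x = 1.0                     # any starting value works
for _ in range(100):
    x = math.cos(x)         # cosine in radians
print(x)                    # ≈ 0.7390851332

y = 1.0
for _ in range(100):
    y = math.cos(math.pi * y / 180)   # cosine in degrees
print(y)                    # ≈ 0.9998477
```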

Contraction mapping theorem

The functions

f(x) = cos(x)

and

g(x) = cos(πx/180)

are both contraction mappings, the former corresponding to radians and the latter to degrees. A function h(x) is a contraction mapping if for any two points x and y,

|h(x) – h(y)| ≤ c|x – y|

for some constant c < 1. You can use the mean value theorem to show that you can take c = sin(1) for the function f, and c = (π/180) sin(π/180) for the function g.

The contraction mapping theorem says that if a function h is a contraction mapping on a closed interval, then h has a unique fixed point. You can generalize this from working on a closed interval to working in any complete metric space.

Weak contraction mappings

The constant c above is important. Without it, you don’t necessarily have a fixed point. For example, let

f(x) = x + 1/x

on the interval [1, ∞). Then you can show that

|f(x) – f(y)| < |x – y|

for x ≠ y, but the function f has no fixed point. To see this, suppose f(x) = x. Then 1/x = 0, but the reciprocal of a positive number is positive.

If you start with a point x ≥ 1 and repeatedly apply f, you’ll get a sequence of points that move closer to each other, but not closer to any particular number. The sequence of iterates diverges.
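Here's a small numerical illustration: the gap between two iterates shrinks at every step, yet the iterates themselves grow without bound (the square of an iterate increases by roughly 2 per step):

```python
def f(x):
    return x + 1/x

x, y = 1.0, 2.0
gap = abs(x - y)
for _ in range(10_000):
    x, y = f(x), f(y)
    assert abs(x - y) < gap   # the two iterates get closer together every step
    gap = abs(x - y)

print(x)   # ≈ 141: the iterates themselves drift off to infinity
```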

A function satisfying the inequality above is called a weak contraction mapping. In general, a weak contraction mapping need not have a fixed point, as in the example above. But a weak contraction on a compact space will have a fixed point.

Commentary

As discussed in the link about Kepler’s equation below, something like the contraction mapping theorem was used in practice long before it was formalized in theory.

When I was in grad school, I studied nonlinear PDEs. After seeing a long list of existence theorems and getting lost in the proofs, I wondered what was at the heart of all these theorems. Under all the layers of operator and function space definitions, somewhere there had to be core theorems that said something exists, and I went back over my lecture notes to dig for what these theorems were. In every case it boiled down to a fixed point theorem, not the contraction mapping theorem per se but analogous theorems.

Number of real roots in an interval

Suppose you have a polynomial p(x) and an interval [a, b], and you want to know how many distinct real roots the polynomial has in the interval. You can answer this question using Sturm’s algorithm.

Let p0(x) = p(x) and let p1(x) be its derivative p′(x).

Then define a series of polynomials for i ≥ 1

pi+1(x) = – pi-1(x) mod pi(x)

until you reach a constant. Here f mod g means the remainder when f is divided by g.

This sequence of polynomials is called the Sturm series. Count the number of sign changes in the Sturm series at a and at b. Then the difference between these two counts is the number of distinct roots of p in the interval [a, b].

Example with Mathematica

As an example, let’s look at the number of real zeros for

p(x) = x5 – x – c

for some constant c. We’ll let Mathematica calculate our series.

    p0[x_] := x^5 - x - c
    p1[x_] := D[p0[x], x]
    p2[x_] := -PolynomialRemainder[p0[x], p1[x], x]
    p3[x_] := -PolynomialRemainder[p1[x], p2[x], x]

This works out to

p0(x) = x5 – x – c
p1(x) = 5x4 – 1
p2(x) = (4/5)x + c
p3(x) = 1 – 3125c4/256

Now suppose c = 3 and we want to find out how many zeros p has in the interval [0, 2].

Evaluating our series at 0 we get -3, -1, 3, -252869/256. So the pattern is – – + –, i.e. two sign changes.

Evaluating our series at 2 we get 27, 79, 23/5, -252869/256. So the pattern is + + + –, i.e. one sign change.

This says x5 – x – 3 has one real root between 0 and 2.

By the way, you can multiply the polynomials in the sequence by any positive constant you like if that makes calculations easier. This multiplies subsequent polynomials by the same amount and doesn’t change the signs.
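You can check the whole computation with sympy, which constructs the Sturm sequence directly (sympy may scale the polynomials differently than the hand calculation above, but scaling doesn't affect sign changes):

```python
from sympy import symbols, sturm

x = symbols('x')
seq = sturm(x**5 - x - 3)   # the Sturm series for c = 3

def sign_changes(values):
    """Count sign changes in a sequence, ignoring zeros."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

at0 = sign_changes([p.subs(x, 0) for p in seq])
at2 = sign_changes([p.subs(x, 2) for p in seq])
print(at0 - at2)   # 1: one distinct real root in [0, 2]
```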

Fine print

Note that the algorithm counts distinct roots; multiple roots are counted once.

You can let the end points of the interval be infinite, and so count all the real roots.

I first tried using Mathematica’s PolynomialMod function, but

   PolynomialMod[5 x^4 - 1, 4 x/5 + c]

gave the unexpected result 5x4 – 1. That’s because PolynomialMod does not let you specify what the variable is. It assumed that 4x/5 + c was a polynomial in c. PolynomialRemainder is explicit about the variable, and that’s why the calls to this function above have x as the last argument.
