Multiserver queues with general probability distributions

Textbook presentations of queueing theory start by assuming that the time between customer arrivals and the time to serve a customer are both exponentially distributed.

Maybe these assumptions are realistic enough, but maybe not. For example, maybe there’s a cutoff on service time. If someone takes too long, you tell them there’s no more you can do and send them on their merry way. Or maybe the tails on times between arrivals are longer than those of an exponential distribution.

The default model makes other simplifying assumptions than just the arrival and service distributions. It also makes other simplifying assumptions. For example, it assumes no balking: customers are infinitely patient, willing to wait as long as necessary without giving up. Obviously this isn’t realistic in general, and yet it might be adequate when wait times are short enough, depending on the application. We will remove the exponential probability distribution assumption but keep the other assumptions.

Assuming exponential gaps between arrivals and exponential service times has a big advantage: it leads to closed-form expressions for wait times and other statistics. This model is known as a M/M/s model where the two Ms stand for “Markov” and the s stands for the number of servers. If you assume a general probability distribution on arrival gaps and service times, this is known as a G/G/s model, G for “general.”

The M/M/s model isn’t just a nice tractable model. It’s also the basis of an approximation for more general models.

Let A be a random variable modeling the time between arrivals and let S be a random variable modeling service times.

The coefficient of variation of a random variable X is its standard deviation divided by its mean. Let cA and cS be the coefficients of variation for A and S respectively. Then the expected wait time for the G/G/s model is approximately proportional to the expected wait time for the corresponding M/M/s model, and the proportionality constant is given in terms of the coefficients of variation [1].

W_{G/G/s} \approx \frac{c_A^2 + c_S^2}{2}\, W_{M/M/s}

Note that the coefficient of variation for an exponential random variable is 1 and so the approximation is exact for exponential distributions.

How good is the approximation? You could do a simulation to check.

But if you have to do a simulation, what good is the approximation? Well, for one thing, it could help you test and debug your simulation. You might first calculate outputs for the M/M/s model, then the G/G/s approximation, and see whether your simulation results are close to these numbers. If they are, that’s a good sign.

You might use the approximation above for pencil-and-paper calculations to get a ballpark idea of what your value of s needs to be before you refine it based on simulation results. (If you refine it at all. Maybe the departure of your queueing system from M/M/s is less important than other modeling considerations.)


[1] See Probability, Statistics, and Queueing Theory by Arnold O. Allen.


Queueing and Economies of Scale

If a single server is being pushed to the limit, adding a second server can drastically reduce wait times. This is true whether the server is a human serving food or a computer serving web pages. I first wrote about this here and I give more details here.

What if you add an extra server but you also add more customers? The second server still reduces the wait time, though clearly not by as much as if the work load didn’t increase.

If you increase the number of servers in proportion to the traffic, wait times will go down. In theory, wait times will approach zero as the traffic and number of servers go off to infinity together. So under ideal assumptions, you could lower wait times by scaling your number of customers and your number of servers. Alternatively, you could scale your number of servers but not as fast as your number of customers, and keep wait times constant while reducing payroll.

In theory, the ideal scale is as large as possible. Economies of scale are real, but so are dis-economies of scale. Eventually the latter overtakes the former. But that’s a matter for another post. For this post we’re only looking at an idealized model.


As before we assume the time between customer arrivals and the time required to serve each customer is random. (Technically, a M/M/s model.)

Assume first that the ratio of the rate at which customers arrive to the rate at which they can be served is 0.8. Here’s what happens when we increase the traffic and the number of servers proportionally.

Next assume that the ratio of the arrival and service rates is 0.9. Scaling the traffic and servers proportionally reduces the number of customers in line, but the wait time declines more slowly than it did when each server wasn’t so close to capacity.

Each new server helps, but each helps less than the one before. You hit diminishing return immediately.


Let ρ be the ratio of the total arrival rate to the rate at which a single server can take care of a customer and let s be the number of servers. In the examples above, ρ was 0.8s and 0.9s. The equations below don’t assume any particular relation between ρ and s except that ρ/s must be less than 1. (If ρ/s were greater than 1, the system would not approach an equilibrium; lines would grow without bound over time.)

Define p0 as follows. (NB: that’s a Roman p, not a Greek ρ. They’re visually similar, but it’s conventional notation in queueing theory.)

\frac{1}{p_0} = \frac{\rho^s}{s!(1 - \rho/s)} + \sum_{n=0}^{s-1} \frac{\rho^n}{n!}
Then the expected wait time is

\frac{p_0 \,\rho^{s+1}}{s\,s!\,(1 - \rho/s)^2}

By the way, we’ve assumed the time between arrivals and the time to serve a customer are both exponentially distributed. What if they’re not? That is the subject of my next post.

More queueing theory posts

Linear logic arithmetic

Linear logic has connectives not used in classical logic. The connectives ⊗and & are conjunctions, ⊕ and ⅋ are disjunctions, and ! and ? are analogous to the modal operators ◻ and ◇ (necessity and possibility).

\begin{table}[] \begin{tabular}{lccll} & \phantom{i}add\phantom{i} & mult & & \\ \cline{2-3} \multicolumn{1}{l|}{conjunction} & \multicolumn{1}{c|}{\&} & \multicolumn{1}{c|}{\otimes} & & \\ \cline{2-3} \multicolumn{1}{r|}{disjunction} & \multicolumn{1}{c|}{\oplus} & \multicolumn{1}{c|}{\parr} & & \\ \cline{2-3} & & & & \end{tabular} \end{table}

Another way to classify the connectives is to say ⊕ and & are called additive, ⊗ and ⅋ are multiplicative, and ! and ? are called exponential.And still another classification says that ⊕, ⊗, and ! have positive polarity, while &, ⅋, and ? have negative polarity.

\begin{table}[] \begin{tabular}{lcccl} & add & mult & exp & \\ \cline{2-4} \multicolumn{1}{r|}{positive} & \multicolumn{1}{c|}{\oplus} & \multicolumn{1}{c|}{\otimes} & \multicolumn{1}{c|}{!} & \\ \cline{2-4} \multicolumn{1}{r|}{negative} & \multicolumn{1}{c|}{\&} & \multicolumn{1}{c|}{\parr} & \multicolumn{1}{c|}{?} & \\ \cline{2-4} \end{tabular} \end{table}

This post will show that these arithmetical names are justified by analogy.

For one thing, multiplication-like connectives distribute over addition-like connectives.
\begin{align*} A \otimes (B \oplus C) &\equiv (A \otimes B) \oplus (A \otimes C) \\ A \parr (B \,\&\, C) &\equiv (A \parr B) \,\&\, (A \parr C) \\ (A \oplus B) \otimes C &\equiv (A \otimes C) \oplus (B \otimes C) \\ (A \,\&\, B) \parr C &\equiv (A \parr C) \,\&\, (B \parr C) \end{align*}

Also, the exponential-like operators behave analogously to the equation

e^(a+b) = e^a e^b.

with respect to the addition-like and multiplication-like connectives.

\begin{align*} !(A \& B) &\equiv (!A) \otimes (!B) \\ ?(A \oplus B) &\equiv (?A) \parr (?B) \end{align*}

In both equations, if you apply an exponential-like operator to the result of applying an addition-like operator, you get a multiplication-like operator applied to the exponential-like operator applied to two addition-like arguments separately.

The term “polarity” is justified by the fact that linear negation flips the polarity of connectives.

In the following analogs of De Morgan’s laws, negation turns conjunctions into disjunctions and vice versa, and it reverses the polarity of connectives.

\begin{align*} (A \otimes B)^\bot &\equiv A^\bot \parr B^\bot \\ (A \oplus B)^\bot &\equiv A^\bot \,\&\,\, B^\bot \\ (A \parr B)^\bot &\equiv A^\bot \otimes B^\bot \\ (A \,\&\, B)^\bot &\equiv A^\bot \oplus B^\bot \\ \end{align*}

Negating exponential connectives also flips polarity.

\begin{align*} (!A)^\bot &\equiv ?(A^\bot) \\ (?A)^\bot &\equiv !(A^\bot) \end{align*}

These rules are analogous to the rules for negating quantifiers in classical logic.

\begin{align*} \neg(\forall x, p) &\iff \exists x, \neg p \\ \neg(\exists x, p) &\iff \forall x, \neg p \end{align*}

More logic posts

Self-Orthogonal Latin Squares

The other day I wrote about orthogonal Latin squares. Two Latin squares are orthogonal if the list of pairs of corresponding elements in the two squares contains no repeats.

A self-orthogonal Latin square (SOLS) is a Latin square that is orthogonal to its transpose.

Here’s an example of a self-orthogonal Latin square:

    1 7 6 5 4 3 2
    3 2 1 7 6 5 4
    5 4 3 2 1 7 6
    7 6 5 4 3 2 1
    2 1 7 6 5 4 3
    4 3 2 1 7 6 5
    6 5 4 3 2 1 7

To see that this Latin square is orthogonal to itself, we’ll concatenate the square and its transpose.

    11 73 65 57 42 34 26 
    37 22 14 76 61 53 45 
    56 41 33 25 17 72 64 
    75 67 52 44 36 21 13 
    24 16 71 63 55 47 32 
    43 35 27 12 74 66 51 
    62 54 46 31 23 15 77

Each pair of digits is unique.

Magic squares

As we discussed in the earlier post, you can make a magic square out of a pair of orthogonal Latin squares. If we interpret the pairs of digits above as base 10 numbers, we have a magic square because each row, column, and diagonal has the digits 1 through 7 in the 1s place and the same set of digits in the 10s place.

Typically an n × n magic square is filled with the numbers 1 through n². We can make such a magic square with a few adjustments.

First, we subtract 1 from every element of our original square before we combine it with its transpose. This gives us the following.

    00 62 54 46 31 23 15 
    26 11 03 65 50 42 34 
    45 30 22 14 06 61 53 
    64 56 41 33 25 10 02 
    13 05 60 52 44 36 21 
    32 24 16 01 63 55 40 
    51 43 35 20 12 04 66 

If we interpret the elements above as base 7 numbers, then the entries are the numbers 0 through 66seven which equals 48ten. If we add 1 to every entry we get all the numbers from 1 through 100seven = 49ten. Written in base 10, this is the following.

     1 45 40 35 23 18 13 
    21  9  4 48 36 31 26 
    34 22 17 12  7 44 39 
    47 42 30 25 20  8  3 
    11  6 43 38 33 28 16 
    24 19 14  2 46 41 29 
    37 32 27 15 10  5 49 

Possible sizes

It’s surprising that orthogonal Latin squares exist as often as they do. Self-orthogonal Latin squares are more rare, and yet they exist for all sizes except n = 2, 3, or 6. (Source: The CRC Handbook of Combinatorial Designs.)

It’s hard to prove that self-orthogonal Latin squares exist, but it’s easy to verify the sizes where they don’t exist. There is no self-orthogonal Latin square of size 6 because there is no pair of orthogonal Latin squares of size 6. The latter is a hard theorem to prove, but given it, the former is trivial.

There are orthogonal Latin squares of size 3, but no self-orthogonal Latin squares of size 3. And it’s not hard to see why. Suppose there were such a square. Without loss of generality we can label the top row with 1, 2, and 3. Let x be the entry directly below 1.

    1 2 3
    x * *
    * * *

What could x be? Not 1, because elements in a column of a Latin square are unique. It can’t be 2 either, because otherwise when you join the square and its transpose would have two 22s. So x must be 3, and the element below x must be 2. But then when you join the square and its transpose you have two 32s. There’s not enough wiggle room for a 3 × 3 Latin square to be self-orthogonal.

Better parts, worse system

There’s a rule of systems thinking that improving part of a system can often make the system as a whole worse.

One example of this is Braess’ Paradox that adding roads can make traffic worse, and closing roads can improve traffic.

Another example is the paradox of enrichment: Increasing the food available to an ecosystem may lead to instability, and even to extinction.

Richard Hamming put it this way in what he called the first rule of systems engineering:

If you optimize the components, you’ll probably ruin the system performance.

For more variations on this principle, see this Twitter thread.

I’ve been thinking about this lately in regard to software tools. If you’re just starting out and ask around for the best way to do this and that, you might get individual bits of good advice that add up to bad advice.

“You’re still using X? You should be using Y. Y’s better.”

When someone says Y is “better” they probably mean that it has more features. It may also be objectively better by several other criteria. But that doesn’t mean it’s better for you, at this point in time, given your experience, your temperament, and your project.

The Dreyfus model of skill acquisition says that beginners want context-free rules: Y is better than X. That’s an understandable desire, and people are all too willing to give context-free advice. But context matters. A lot.

Yesterday I needed to do a little search-and-replace text munging, join two data sets, and compute some basic statistics. I thought about how I could go off on tangents for each task, using the best tool for each task individually, and take forever to get my job done. Instead, I cobbled something together quickly at a command line, with each step being far from the most general solution.

If you choose the best (i.e. biggest) tool for each task, you’ll get affirmation, or at least avoid criticism, for each choice. And you’ll likely end up with an unwieldy hodgepodge of tools, none of which you’re confident in using. If instead you think carefully about your abilities and your needs, you’ll gradually assemble a collection of tools that work together for you,  making you more confident and productive.

Queueing theory equations

A blog post about queueing theory that I wrote back in 2008 continues to be popular. The post shows that when a system is close to capacity, adding another server dramatically reduces the wait time. In that example, going from one teller to two tellers doesn’t make service twice as fast but 93 times as fast.

That post starts out at a very high level and then goes into some details. This post goes into much more detail.

For a quick introduction to queueing theory, see this page.

Queueing theory notation

In Kendall’s notation, we’re talking about M/M/1 and M/M/2 queues. The first letter denotes the assumed distribution on times between customer arrivals, the second denotes the distribution on service times, and the number at the end gives the number of servers.

The simplest and most common models in queueing theory are M/M models. The M‘s stand for “Markov” but they really mean exponential. (Sorry. I didn’t invent the notation.) The time between arrivals is exponentially distributed, which means the arrival is a Poisson process. The time it takes to serve each customer is also exponentially distributed.

Let λ be the number of who arrive per unit time, and let μ be the number of customers served per unit time, on average. For the system to approach a steady state we must have λ/μ < 1. Otherwise the number of customers in the queue grows over time without bound. In the example I mentioned above, λ = 5.8 customers per hour, and μ = 6 customers per hour. Customers are coming in nearly as fast as they can be served, and so a long line develops.

M/M/1 equations

For a M/M/1 queue, the expected wait time, once the system reaches steady state, is

W = \frac{\lambda}{\mu(\mu - \lambda)}

and the expected number of customers waiting at any time is

L = \frac{\lambda^2}{\mu(\mu - \lambda)}

M/M/2 equations

The equations for M/M/k systems in general are significantly more complicated, but they’re not bad when k = 2. For an M/M/2 system at steady state, the expected wait time is

W = \frac{\lambda^2}{\mu(4\mu^2 - \lambda^2)}

and the expected number of customers waiting is

L = \frac{\lambda^3}{\mu(4\mu^2 - \lambda^2)}

You may have noticed that for both M/M/1 and M/M/2 the expected number of customers waiting in line is λ times the expected wait time, i.e. L = λW. This is true in general and is known as Little’s law.

More queueing theory

Swanson’s rule of thumb

Swanson’s rule of thumb [1] says that the mean of a moderately skewed probability distribution can be approximated by the weighted average of the 10th, 50th, and 90th percentile, with weights 0.3, 0.4, and 0.3 respectively. Because it is based on percentiles, the rule is robust to outliers. Swanson’s rule is used in the oil and gas industry, but I don’t believe it’s widely known outside of that sector.

This post will apply Swanson’s rule to three distributions to see how well it does. I’ll plot the bias in Swanson’s rule and skewness for three distributions families as the shape parameters vary.

First, here’s a plot for the gamma distribution with shape parameter k varying from 0.5 to 10.

The skewness of a gamma distribution is 1/√k. So unless k is close to 0, the skewness of a gamma distribution is small and the error in Swanson’s rule is very small.

For the log normal I let the shape parameter σ vary from 0.01 to 1; I didn’t use larger parameters because skewness for the log normal increases very rapidly as a function of σ. Over the range I plotted the percentage relative error in Swanson’s rule is roughly equal to the skewness.

The skewness of a Pareto distribution is undefined unless the shape parameter is greater than 3. The plot below is based on shape parameters between 3.1 and 4. The skewness decreases as the shape parameter increases, asymptotically approaching 2.

Based on these examples, Swanson’s rule does indeed work pretty well when skewness is small. It works well for the gamma distribution in general because skewness is always small (unless the shape parameter k is tiny). But the log normal distribution shows that the assumption of small skewness is important.

For the Pareto distribution, Swanson’s rule works well when the shape parameter is large and the skewness is small. But this was kind of a stress test for Swanson’s rule because often the Pareto distribution is used when the skewness is large if not infinite.


[1] Swanson’s 30-40-30 rule. A. Hurst, G. C. Brown, R. I. Swanson. AAPG Bulletin, December 2000

Random sampling to save money

I was stunned when my client said that a database query that I asked them to run would cost the company $100,000 per year. I had framed my question in the most natural way, not thinking that at the company’s scale it would be worth spending some time thinking about the query.

Things have somewhat come full circle. When computers were new, computer time was expensive and programmer time was dirt cheap by comparison. Someone might be scolded for using computer time to do what a programmer could do. Now of course programmer time is expensive and computer time is cheap. Usually.

But when dealing with gargantuan data sets, the cost of computer time might matter a great deal, especially when renting services from a cloud provider. Maybe you can find a more clever way to run an enormous query. Or even better, maybe you can find a way to avoid running an enormous query.

One way to save time and money is to base decisions on random samples rather than exhaustive queries. This requires a little preparation. How big a sample do you need? How exactly are you going to take a sample? What kind of uncertainty do you have in your result when you’re done? You can afford to think about these questions a long time if it saves tens of thousands of dollars per year.

Greco-Latin squares and magic squares

Suppose you create an n × n Latin square filled with the first n letters of the Roman alphabet. This means that each letter appears exactly once in each row and in each column.

You could repeat the same exercise only using the Greek alphabet.

Is it possible to find two n × n Latin squares, one filled with Roman letters and the other filled with Greek letters, so that when you combine corresponding entries, each combination of Roman and Greek letters appears exactly once? If so, the combination is called a Greco-Latin square.

Whether Greco-Latin squares of size n exist depends on n. But before we explore that, let’s give an example.

Here are two Latin squares, one filled with the first four Roman letters and the other filled with the first four Greek letters.

    A B C D   α β γ δ
    D C B A   γ δ α β
    B A D C   δ γ β α
    C D A B   β α δ γ

If we concatenate the two matrices, we get

    Aα Bβ Cγ Dδ   
    Dγ Cδ Bα Aβ   
    Bδ Aγ Dβ Cα   
    Cβ Dα Aδ Bγ   

and each of the two-letter entries is unique. So we’ve shown that a Greco-Latin square exists for n = 4.

Mutually Orthogonal Latin Squares (MOLS)

The more modern name for Greco-Latin squares is “mutually orthogonal Latin squares,” abbreviated MOLS. We say two Latin squares M and N are mutually orthogonal if the list of pairs (Mij, Nij,) contains no duplicates.

Euler’s work

Euler showed how to construct Greco-Latin squares when n is odd and when n is a multiple of 4. He conjectured that no Greco-Latin squares exist when n = 4k + 2.

His conjecture was incorrect. For example, there is a Greco-Latin square of size 10. And in fact, Greco-Latin squares exist for all n except n = 2 and n = 6.

Magic squares

Suppose you have two orthogonal Latin squares M and N of size n, and suppose that the diagonal elements of both squares contain no duplicates.

Use the numbers 0 through n-1 rather than Greek or Latin letters to fill the squares and interpret the Latin squares as matrices. Then the matrix

nM + N

is a magic square. This is equivalent to taking the corresponding Greco-Latin square and interpreting the entries as base n numbers.

For example, using the Latin squares above, replace A and α with 0, B and β with 1, C and γ with 2, and D and γ with 3. Then the Greco-Roman square

    Aα Bβ Cγ Dδ   
    Dγ Cδ Bα Aβ   
    Bδ Aγ Dβ Cα   
    Cβ Dα Aδ Bγ   


    00 11 22 33   
    32 23 10 01   
    13 02 31 20   
    21 30 03 12   

with the entries being base 4 numbers. The rows and columns clearly have the same sum because they all have the same set of numbers in the 1s place and in the 4s place. Written in base 10, the magic square above is

     0  5 10 15   
    14 11  4  1   
     7  2 13  8   
     9 12  3  6   

It’s conventional for magic squares to be filled with consecutive numbers starting with 1, so you could add 1 to every entry above if you’d like.

Related posts

Latin squares and 3D chess

In a n×n Latin square, each of the numbers 1 through n appears exactly once in each row and column. For example, the 5 × 5 square below is a Latin square.

12345, 21534, 34251, 45123, 53412

If we placed a rook on each square numbered 1, the rooks would not attack each other since no two rooks would be in the same row or the same column. The same would be true if we moved all the rooks to squares numbered 2, or 3, etc.

Now imagine a n×n×n chess cube, a stack of n chessboards, each board being n×n. For example, we could have a stack of eight standard 8×8 boards.

In our 3D chess cube, rooks can move any number of squares in either the x or y direction within a board, and they can also move vertically in the z direction.

Put a rook on every square of the bottom board. Then take the rooks on squares numbered 2 and move them to the second level. Take the rooks on squares numbered 3 and move them to the third level, and so forth.

Now we have n×n rooks, and none is in a position to attack the other. To see this, pick a particular level k. None of the rooks on level k are attacking each other because level k is a Latin square. And none of the rooks can attack vertically because we started with all the rooks on the bottom level and lifted them up by various amounts; there’s only one rook in each vertical column.

Next let’s suppose we have n×n rooks arranged in 3D so that none is attacking any of the others. Label the rooks on level k with a k. Now push all the rooks straight down vertically to the first level. There can only be one rook on each square because no rooks were attacking each other vertically.

Number each square by the number of its rook. I claim the result is a Latin square. There can only be one k in each row and column because all the ks started out on level k, and none were attacking each other in the x or y direction.

Related posts