Server utilization: Joel on queueing

Joel Spolsky of Joel on Software fame brings up queueing theory in the latest StackOverflow podcast. He mentions a rule of thumb that wait times go up quickly once server utilization exceeds 80%. The same principle applies whether you’re talking about computer servers or human servers. I hadn’t heard of that rule of thumb, though I knew that you don’t want to get anywhere near 100% utilization. Here’s the graph that justifies Joel’s remark.

[Graph: wait time as a function of utilization, with the service time held fixed]

The vertical axis is wait time. The horizontal axis is utilization.

Here are the details. Basic queueing theory assumes customer arrivals are Poisson distributed with rate λ and service times are exponentially distributed with rate μ. The ratio ρ = λ/μ is called the utilization. If this ratio is greater than 1, customers are arriving faster than they can be served, and so the line will grow without bound. If the ratio is less than 1, the line will settle into a steady state on average.

The average time a customer spends in the system (waiting plus being served) is W = 1/(μ − λ). Now assume the service rate μ is fixed and the arrival rate is λ = ρμ. Then W = 1/(μ(1 − ρ)), so the wait time is proportional to 1/(1 − ρ). As the utilization ρ approaches 1, the wait time goes to infinity. The graph above plots 1/(1 − ρ). As Joel said, the curve does go up quickly when ρ exceeds 0.8.
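To make the numbers concrete, here’s a small Python sketch of the formula above (my own illustration, not from the podcast). It fixes the service rate at μ = 1, so W is measured in multiples of the mean service time, and prints the mean time in system at several utilization levels.

    # M/M/1 mean time in system, W = 1/(mu - lambda).
    # With the service rate mu fixed, this equals 1/(mu * (1 - rho)).
    def mean_time_in_system(rho, mu=1.0):
        """Mean time in system for an M/M/1 queue at utilization rho < 1."""
        if rho >= 1:
            raise ValueError("utilization must be less than 1 for a steady state")
        lam = rho * mu              # arrival rate
        return 1.0 / (mu - lam)     # same as 1 / (mu * (1 - rho))

    for rho in [0.5, 0.8, 0.9, 0.95, 0.99]:
        print(f"rho = {rho:.2f}   W = {mean_time_in_system(rho):6.1f}")

At 50% utilization the average time in system is 2 service times; at 80% it is 5; at 95% it is 20; at 99% it is 100.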

Related post: What happens when you add a new teller?

Whatever happened to system administration?

You hardly hear anyone speak of “system administration” anymore. Now everything is “IT” (information technology). System administrators are now “IT professionals.” Why the change? Has it become politically incorrect to call the people who administer computer systems “system administrators”?

I’ve never liked the term “information technology.” For one thing, it has a confusing abbreviation. For another, it’s terribly vague. It would seem to include programming, but I think  programming is about the only thing it doesn’t include.

Exascale computing

In this podcast, Peter Kogge explains what it would take to increase our computing capacity by a factor of 1000. We can’t just do more of what we’re doing now and scale linearly. For example, one limitation is power consumption. Data centers now consume 1.5% of the electricity on the U.S. power grid. Nevertheless, Kogge outlines ways we could achieve a 1000-fold increase in computing capacity by 2015.

Six degrees of Paul Erdős

The goal of the game Six Degrees of Kevin Bacon is to connect actors to Kevin Bacon in as few steps as possible. For example, Elvis Presley can be linked to Kevin Bacon in two steps: Elvis Presley made a movie with Ed Asner, and Ed Asner made a movie with Kevin Bacon. The game could be played with other actors, but the advantage of using Kevin Bacon is that he’s made movies with a lot of other actors.

Mathematicians have a similar and older game, connecting each other to Paul Erdős via a chain of papers. Like Kevin Bacon, Erdős was a social hub. He wrote more math papers than anyone else and usually collaborated with other authors.

The length of the shortest chain connecting someone to Paul Erdős is called that person’s Erdős number. Erdős has Erdős number 0. Anyone who wrote a paper with Erdős has Erdős number 1, etc. My Erdős number is 4 because I wrote a paper with Ralph Showalter, who wrote a paper with Charles Radin, who wrote a paper with John Horton Conway, who wrote a paper with Erdős. I’ve written papers with other folks with Erdős number 3, but not with anyone with a lower number, so my number is 4. I calculated my Erdős number after learning about this website from Dave Richeson’s six degrees of separation blog post.
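Computing an Erdős number amounts to a shortest-path search in the coauthorship graph. Here’s a minimal breadth-first search sketch in Python; the graph below is a toy containing only the chain mentioned above, not real collaboration data.

    from collections import deque

    # Toy coauthorship graph for illustration: author -> set of coauthors.
    coauthors = {
        "Erdős":     {"Conway"},
        "Conway":    {"Erdős", "Radin"},
        "Radin":     {"Conway", "Showalter"},
        "Showalter": {"Radin", "Cook"},
        "Cook":      {"Showalter"},
    }

    def erdos_number(author, graph):
        """Length of the shortest coauthorship chain from author to Erdős."""
        distance = {author: 0}
        queue = deque([author])
        while queue:
            person = queue.popleft()
            if person == "Erdős":
                return distance[person]
            for coauthor in graph.get(person, ()):
                if coauthor not in distance:
                    distance[coauthor] = distance[person] + 1
                    queue.append(coauthor)
        return None  # no chain of papers leads to Erdős

    print(erdos_number("Cook", coauthors))  # prints 4 for this toy graph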

Paul Erdős was a pure mathematician. I wonder who might be the Paul Erdős of applied math, someone who has written a large number of applied math articles, especially with many collaborators. It would be hard to say since applied math is a fuzzy classification, blending into other disciplines. Still, any nominations?

Related post: Publish or perish

Cost-benefit analysis versus benefit-only analysis

Hardly anyone cares about statistics directly. People more often care about decisions they need to make with the help of statistics. This suggests that the statistics and decision-making process should be explicitly integrated. The name for this integrated approach is “decision theory.” Problems in decision theory are set up with the goal of maximizing “utility,” the benefit you expect to get from a decision. Equivalently, problems are set up to minimize expected cost. Cost may be a literal monetary cost, but it could be some other measure of something you want to avoid.

I was at a conference this morning where David Draper gave an excellent talk entitled Bayesian Decision Theory in Biostatistics: the Utility of Utility.  Draper presented an example of selecting variables for a statistical model. But instead of just selecting the most important variables in a purely statistical sense, he factored in the cost of collecting each variable. So if two variables make nearly equal contributions to a model, for example, the procedure would give preference to the variable that is cheaper to collect. In short, Draper recommended a cost-benefit analysis rather than the typical (statistical) benefit-only analysis. Very reasonable.
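Here’s a toy Python sketch of the general idea, not Draper’s actual procedure: score each candidate variable by its modeling benefit minus a weighted collection cost, so that of two variables with nearly equal benefit the cheaper one ranks higher. The variable names, benefits, costs, and weight are all made up for illustration.

    # Toy cost-benefit variable selection (illustrative numbers only).
    # benefit: a variable's statistical contribution to the model
    # cost:    what it costs to collect that variable
    candidates = {
        "blood_pressure": {"benefit": 0.30, "cost": 1.0},
        "genetic_assay":  {"benefit": 0.31, "cost": 50.0},
        "age":            {"benefit": 0.10, "cost": 0.1},
    }

    cost_weight = 0.002   # how much one unit of cost counts against benefit

    def utility(v):
        return v["benefit"] - cost_weight * v["cost"]

    ranked = sorted(candidates, key=lambda name: utility(candidates[name]), reverse=True)
    for name in ranked:
        print(name, round(utility(candidates[name]), 4))

In a purely statistical ranking the expensive assay would edge out blood pressure; once collection cost enters the utility, the cheaper variable wins.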

Why don’t people always take this approach? One reason is that it’s hard to assign utilities to outcomes. Dollar costs are often easy to account for, but it can be much harder to assign values to benefits. For example, you have to ask “Benefit for whom?” In a medical context, do you want to maximize the benefit to patients? Doctors? Insurance companies? Tax payers? Regulators? Statisticians? If you want to maximize some combination of these factors, how do you weight the interests of the various parties?

Assigning utilities is hard work, and you can never make everyone happy. No matter how good of a job you do, someone will criticize you. Nearly everyone agrees in the abstract that considering utilities is the way to go, but in practice it is hardly ever done. Anyone who proposes a way to quantify utility is immediately shot down by people who have a better idea. The net result is that rather than using a reasonable but  imperfect idea of utility, no utility is used at all. Or rather no explicit definition of utility is used. There is usually some implicit idea of utility, chosen for mathematical convenience, and that one wins by default. In general, people much prefer to leave utilities implicit.

In the Q&A after his talk, Draper said something to the effect that the status quo persists for a very good reason: thinking is hard work, and it opens you up to criticism.

Talent alone won’t pay the bills

Here’s a Washington Post article about an experiment. The newspaper asked world-famous violinist Joshua Bell to dress down and perform as a street musician at a DC metro stop. Hardly anyone paid attention. He earned $32.17 in tips. People routinely pay over $100 for a chance to listen to Joshua Bell in a concert hall, but he earned an average of about $0.03 per passer-by that day.

Suppose Bell were trying to support himself as a street musician. What might he say to himself after a disappointing day at the metro? That if only he had practiced a little harder he would have made more tips?

Two myths I learned in college: bathtub drains and airplane wings

Here are two things I was taught in college that were absolutely wrong.

  • The Coriolis effect explains why bathtubs drain clockwise in the northern hemisphere.
  • The Bernoulli effect explains how planes fly.

These are not things that scientists believed at the time but were later disproved. They’re simply myths. (No doubt I’ve passed on a few myths to students over the years. If I can think of any, I’ll retract them here.)

The Coriolis effect does explain why cyclones rotate one way in the northern hemisphere and the opposite way in the southern hemisphere. The rotation of the earth influences the rotation of large bodies of fluid, like weather systems. However, a bathtub would need to be maybe a thousand miles in diameter before the Coriolis effect would determine how it drains. Bathtubs drain clockwise and counterclockwise in both hemispheres. Random forces such as sound in the air have more influence than the Coriolis effect on such small bodies of water.

The explanation  I learned in college for how airplanes fly involves the Bernoulli principle. The shape of a wing is such that air has to travel further over the top of the wing than under the bottom of the wing.

[airfoil diagram]

Since air particles going across the top and bottom of the wing arrive at the trailing edge at the same time, the particles going over the top travel further and so are spread further apart. This results in lower pressure over the top of the wing and lifts the airplane.

There are a couple problems with this explanation. When the air particles split at the leading edge, why should the ones that go over the top and the ones that go under the bottom arrive at the trailing edge at the same time? In fact, they don’t. Also, while the Bernoulli effect does explain part of the lift on an airplane wing, the effect is an order of magnitude too small to account for why airplanes fly.

If the Bernoulli principle explains lift, how can an airplane fly upside-down? Wouldn’t the Bernoulli effect suck the plane down rather than lifting it up when you turn the wing over?

Here’s a set of recordings from a lecture that debunks the Bernoulli myth and explains the real reason airplanes fly: Why airplanes fly: a modern myth, Part I and Part II. A Part III has not yet been released.

Update: Apparently these links have gone away.

Apple Mac turns 25 today

The Mac came out 25 years ago today. Here’s the article. Hat tip @divbyzero.

I sold Macs for a little while. I worked at the UT computer store one summer. At the time, that store was the biggest distributor of Macs in the country.

I’ve never owned a Mac. I think it would be great fun, but it’s hard for me to justify the expense. If someone wanted to give me a Mac, I’d start blogging about Mac stuff. :-)

Splitting a convex set through its center

Three days ago I raised the question How unevenly can you split a convex set through its center? Scott had asked in the comments what kind of sets split most unevenly through their centers. I didn’t have an answer at the time, but now I have something.

Start with a triangle. Draw a line from the middle of the base up to the top. The center of mass is located 1/3 of the way up that line.

Now draw a horizontal line through the center of mass. The triangle above this line is similar to the entire triangle, but has 2/3 of the original height. By similarity it also has 2/3 of the base, and so it has (2/3)² = 4/9 of the area of the entire triangle. The portion below the center of mass has the remaining 5/9 of the area.

Now let’s go up a dimension and imagine a tetrahedron sitting on a table. Draw a line from the center of the base up to the top. The center of mass for the tetrahedron will be located along that line, 1/4 of the way up. Pass a plane through the center of mass parallel to the table. This forms a new tetrahedron above the cutting plane, similar to the original tetrahedron. The new tetrahedron has 3/4 of the height, and by similarity, it has (3/4)³ = 27/64 of the original volume. Of course the solid below the cutting plane then has 37/64 of the original volume.

Now consider the general case. The center of mass of an n-simplex is located 1/(n+1) of the way up along a line running from the center of the base to the top. Cut the n-simplex through the center of mass with a hyperplane parallel to the base. The n-simplex above the hyperplane has n/(n+1) of the original height and (n/(n+1))ⁿ of the original volume. As n goes to infinity, this expression converges to 1/e.
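A few lines of Python make the limit concrete: compute the volume fraction below the cutting hyperplane, 1 − (n/(n+1))ⁿ, in several dimensions and compare it with 1 − 1/e ≈ 0.632.

    from math import e

    def fraction_below(n):
        """Volume fraction of an n-simplex below the hyperplane through its centroid."""
        return 1 - (n / (n + 1)) ** n

    for n in [1, 2, 3, 10, 100, 1000]:
        print(n, round(fraction_below(n), 6))

    print("limit:", round(1 - 1 / e, 6))

The n = 2 and n = 3 lines reproduce the 5/9 and 37/64 fractions above.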

In summary, we’ve shown that a convex set in Rⁿ can have at least 1 − (n/(n+1))ⁿ of its volume on one side of a hyperplane through the center of mass by constructing such a set, namely any n-simplex. And as n goes to infinity, this volume approaches 1 − 1/e. We have not shown that an n-simplex is the worst case in each dimension n, though I suspect this is true. If you take a ball in Rⁿ, any hyperplane through the center divides the ball exactly in half. In some sense, an n-simplex is as far as a convex set can get from a ball. It’s the least-round convex shape.

Update: Here’s a paper on the topic: B. Grünbaum, “Partitions of mass-distributions and convex bodies by hyperplanes,” Pacific Journal of Mathematics 10 (1960), 1257–1261.