# Saved by symmetry

When I solve a problem by appealing to symmetry, students’ jaws drop. They look at me as if I’d pulled a rabbit out of a hat.

I used to think of these tricks as common knowledge, but now I think they’re common knowledge in some circles (e.g. physics) and not in others. The tricks are simple, but fewer people than I’d thought have been trained to spot opportunities to apply them.

Here’s an example.

Pop quiz 1: Evaluate the following intimidating integral.

Solution: Zero, by symmetry, because the integrand is odd.

The integrand is an odd function (i.e. f(−x) = −f(x)), and the integral of an odd function over an interval symmetric about zero is zero. This is because the region below the x-axis mirrors the region above the x-axis, as the following graph shows.
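As a quick numerical sanity check (the original integral isn’t reproduced here, so this sketch uses a hypothetical odd integrand), integrating an odd function over a symmetric interval gives essentially zero:

```python
from math import exp

# A hypothetical odd integrand: f(-x) = -f(x).
f = lambda x: x**3 * exp(-x*x)

# Midpoint rule on [-4, 4]; contributions at x and -x cancel pairwise.
n = 100_000
h = 8 / n
integral = sum(f(-4 + (i + 0.5) * h) for i in range(n)) * h

assert abs(integral) < 1e-8  # zero up to floating-point noise
```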

The example above is not artificial: similar calculations come up constantly in applications. For example, the Fourier series of an odd function contains only sine terms, no cosine terms. Why? The integrals to compute the Fourier coefficients for the cosine terms all involve odd functions integrated over an interval symmetric about zero.

Another common application of symmetry is evaluating derivatives. The derivative of an odd function is an even function and vice versa.

Pop quiz 2: What is the coefficient of x^5 in the Taylor series of cos(1 + x^2)?

Solution: Zero, by symmetry, because cos(1 + x^2) is an even function.

Odd functions of x have only odd powers of x in their Taylor series and even functions have only even powers of x in their Taylor series. Why? Because the coefficients come from derivatives evaluated at zero.

If f(x) is an odd function, all of its even derivatives are odd functions. These derivatives are zero at x = 0, and so all the coefficients of even powers of x are zero. A similar argument shows that even functions have only even powers of x in their Taylor series.
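A quick numerical check: substituting u = x^2 into the series for cos(1 + u) gives cos(1) − sin(1) x^2 − (cos(1)/2) x^4 + …, all even powers, consistent with the symmetry argument above. A sketch:

```python
from math import cos, sin, isclose

f = lambda x: cos(1 + x*x)

# Partial Taylor series of f: only even powers appear.
x = 0.1
partial = cos(1) - sin(1)*x**2 - (cos(1)/2)*x**4
assert isclose(f(x), partial, abs_tol=1e-6)  # next term is O(x^6)

# And evenness directly: f(x) = f(-x).
assert all(isclose(f(t), f(-t)) for t in (0.3, 0.7, 1.7))
```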

Symmetry tricks are obvious in hindsight. The hard part is learning to recognize when they apply. Symmetries are harder to recognize, but also more valuable, in complex situations. The key is to think about the problem you’re trying to solve before you dive into heads-down calculation.


# Arrogant ignorance

The following line continues the theme of appropriate scale from a few days ago.

We identify arrogant ignorance by its willingness to work on too large a scale, and thus put too much at risk.

This comes from the title essay of Wendell Berry’s collection of essays The Way of Ignorance. Berry mostly has in mind ecological risk, though of course the principle applies more generally.


# Software development and the myth of progress

The word myth brings up images of classical mythology. From there it can be generalized a couple ways. One is any story that is not true. Another is a story, whether true or not, that embodies a system of belief.

Sociologists use myth in the latter sense when they speak of “the myth of progress.” They are not suggesting that progress does not happen. Instead they are giving a name to the faith that things are always improving.

The myth of progress runs deep in software development. For example, when a software package says it requires XYZ version 4.6 or better, there’s the implicit assumption that later is always better, and better by all criteria. It’s hardly ever the case that a new version has more features, is easier to use, has fewer bugs, is more secure, requires less memory, runs faster, and costs less money. At best, the new software is better by the criteria that most users find most important.

The myth of progress relieves some of the burden of thinking. It’s easier to assume newer is better than to ask whether the new thing is better for me, now, for my work, etc.

Companies profit from the myth of progress by selling new versions. For this reason, the myth of progress may not be as strong in the open source ecosystem. For example, it seems Linux developers have more respect for old software than Windows developers do.

Consider the text-processing utility sed, written by Lee McMahon around 1973. Linux users today think nothing of using a utility over 30 years old. Windows developers, on the other hand, look down in disdain at anything three years old.

You could argue there’s no reason to use sed. AWK is more powerful than sed, and Perl is more powerful than AWK, so sed (and AWK) should die away. So why does sed still exist? Because AWK and Perl are not uniformly better. They have more features and are useful for a wider variety of problems, but sed does what it does very well. sed can do some common tasks in just a few keystrokes that would require more code in AWK and even more code in Perl.

In Windows development, there’s a sense that the old must die for the new to live. VB.NET has to replace VB 6, for example. But there’s no corresponding sense that sed has to go away for AWK or Perl to prosper. Instead, AWK and Perl carefully—I’d almost say lovingly—preserve sed syntax.

Maybe Windows developers have a stronger sense of progress because Windows really has seen more progress. Unix began on the high-end systems of its day and has worked its way down. Windows began on low-end hardware and has worked its way up. Unix was written for scientists and engineers and later became more of a consumer operating system. Windows was written for consumers first and more recently has tried to appeal to scientists and engineers. Given this history, it’s not surprising that the Unix community might want to preserve more of its early development. But it doesn’t follow from the fact that Windows NT was a huge improvement over Windows 3.1 that Visual Studio 2010 is necessarily an improvement over Visual Studio 2008.

Related post: Negative space in operating systems

# Suspicious definitions

I’ve long been suspicious of speeches that revolve around idiosyncratic definitions. I was pleased to find this evening that C. S. Lewis shared this suspicion.

But when we leave the dictionaries we must view all definitions with grave distrust. … The fact that they define it at all is itself a ground for skepticism. Unless we are writing a dictionary, or a text-book of some technical subject, we define our words only because we are in some measure departing from their real current sense. Otherwise there would be no purpose in doing so. This is especially true of negative definitions. Statements that honour, or freedom, or humour, or wealth, ‘does not mean’ this or that are proof that it was beginning to mean, or even had long meant, precisely this or that. … We do not warn pupils that coalbox does not mean a hippopotamus.

… A certain type of writer begins ‘The essence of poetry is’ or ‘All vulgarity may be defined as’, and then produces a definition which no one ever thought of since the world began, which conforms to no one’s actual usage, and which he himself will probably have forgotten by the end of the month.

From Studies in Words (ISBN 0521398312)

# Priorities

From Merlin Mann:

If you’ve got more than two priorities, you might as well think you have more than two arms.


# Three Firefox 4 tips

The latest version of Firefox, version 4.0, makes a few changes that take a little getting used to.

1. The navigation bar is below the tabs by default.
2. The “home” icon has moved from the left end of the navigation bar to the right.
3. The little icon on the right end of the address bar that lets you subscribe to RSS feeds is gone.

If you want the navigation bar back above the tabs, you can go to View -> Toolbars and uncheck “Tabs on Top.”

If you find the new home icon inconvenient, you could just learn the keyboard shortcut: Alt-Home. (By the way, you can navigate back and forward with Alt-Left Arrow and Alt-Right Arrow.)

# Appropriate scale

“Scale” became a popular buzz word a couple decades ago. Suddenly everyone was talking about how things scale. At first the term was used to describe how software behaved as problems became larger or smaller. Then the term became more widely used to describe how businesses and other things handle growth.

Now when people say something “won’t scale” they mean that it won’t perform well as things get larger. “Scale” most often means “scale up.” But years ago the usage was more symmetric. For example, someone might have said that a software package didn’t scale well because it took too long to solve small problems, too long relative to the problem size. We seldom use “scale” to discuss scaling down, except possibly in the context of moving something to smaller electronic devices.

This asymmetric view of scaling can be harmful. For example, little companies model themselves after big companies because they hope to scale (up). But running a small software business, for example, as a Microsoft in miniature is absurd. A small company’s procedures might not scale up well, but neither do a large company’s procedures scale down well.

I’ve been interested in the idea of appropriate scale lately, both professionally and personally.

I’ve realized that some of the software I’ve been using scales in a way that I don’t need it to scale. These applications scale up to handle problems I don’t have, but they’re overly complex for addressing the problems I do have. They scale up, but they don’t scale down. Or maybe they don’t scale up in the way I need them to.

I’m learning to make better use of fewer tools. This quote from Hugh MacLeod suggests that other people may come to the same point as they gain experience.

Actually, as the artist gets more into her thing, and gets more successful, the number of tools tends to go down.

On a more personal level, I think that much frustration in life comes from living at an inappropriate scale. Minimalism is gaining attention because minimalists are saying “Scale down!” while the rest of our culture is saying “Scale up!” Minimalists provide a valuable counterweight, but they can be a bit extreme. As Milton Glaser pointed out, less isn’t more, just enough is more. Instead of simply scaling up or down, we should find an appropriate scale.

How do you determine an appropriate scale? The following suggestion from Andrew Kern is a good starting point:

There is an appropriate scale to every human activity and it is the scale of personal responsibility.

Update: See the follow-up post Arrogant ignorance.


# Python for high performance computing

William Scullin’s talk from PyCon 2011: Python for high performance computing.

At least in our shop [Argonne National Laboratory] we have three accepted languages for scientific computing. In this order they are C/C++, Fortran in all its dialects, and Python. You’ll notice the absolute and total lack of Ruby, Perl, Java.

If you’re interested in Python and HPC, check out SciPyTip.

# Radiation units

Radiation units are confusing for three or four reasons.

1. There are different units depending on whether you’re measuring how much radiation is being emitted or measuring how much is being received.
2. There are different ways of quantifying the amount of radiation received depending on whether you’re doing physics or biology.
3. For each of these measurements there are traditional units and SI units.

If you’re not familiar with scientific units, a fourth source of confusion is the prefixes for various powers of 10: milli-, micro-, etc.

The amount of radioactivity emitted by a source is measured in becquerels or curies. The SI unit, the becquerel (Bq), is one decay per second. The traditional unit, the curie (Ci), is 3.7 × 10^10 Bq, about the radioactivity of a gram of radium.

The amount of radiation absorbed is measured in grays or rads. The SI unit, the gray (Gy), corresponds to one joule of energy absorbed by one kilogram of matter. The traditional unit, the rad, is 0.01 Gy.

The biological effect of radiation is measured in sieverts or rems. Biologically effective dose is the absorbed dose multiplied by the relative biological effectiveness (RBE) of the type of radiation. For x-rays the RBE is 1; for alpha radiation it is 20. The SI unit of effective dose is the sievert (Sv), which corresponds to the effect of one Gy of x-rays. A rem is 0.01 Sv.

Another unit of effect is the banana equivalent dose: 0.0001 mSv, roughly the effective dose of radiation from eating one banana.
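The conversions above are simple to encode. Here’s a minimal sketch in Python; the constant and function names are my own, not from any standard library:

```python
# Conversion factors from the definitions above.
BQ_PER_CURIE = 3.7e10   # 1 Ci = 3.7 x 10^10 decays per second
GY_PER_RAD = 0.01       # 1 rad = 0.01 Gy
SV_PER_REM = 0.01       # 1 rem = 0.01 Sv

def effective_dose_sv(absorbed_gy, rbe):
    """Effective dose = absorbed dose x relative biological effectiveness."""
    return absorbed_gy * rbe

assert 1 * BQ_PER_CURIE == 3.7e10
assert 100 * GY_PER_RAD == 1.0             # 100 rad = 1 Gy
assert effective_dose_sv(0.5, 20) == 10.0  # alpha radiation, RBE 20
```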

# Daily tip accounts broader than names imply

Some of my daily tip Twitter accounts are a little broader than their names imply. Account names need to be fairly short, so they can’t be too descriptive. Here are fuller descriptions of some of the accounts.

# A support one-liner

This morning I had a fun support request related to our software. The exchange took place over email but it could have fit into a couple Twitter messages. Would that all requests could be answered so succinctly.

Question:

Do you have R code to compute P(X > Y) where X ~ gamma(ax, bx) and Y ~ gamma(ay, by)?

Response:

ineq <- function(ax, bx, ay, by) pbeta(bx/(bx+by), ay, ax)

For more on the problem and the solution, see Exact calculation of inequality probabilities.
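As a sanity check of the one-liner: when ax = ay = 1 the gammas are exponentials, pbeta(bx/(bx+by), 1, 1) reduces to bx/(bx+by), and a Monte Carlo estimate agrees. A sketch using only Python’s standard library (gammavariate takes shape and scale):

```python
import random

random.seed(42)
ax, bx, ay, by = 1, 2.0, 1, 3.0   # shape 1: exponential special case
exact = bx / (bx + by)            # pbeta(x, 1, 1) is just x

# Estimate P(X > Y) by simulation.
N = 200_000
hits = sum(random.gammavariate(ax, bx) > random.gammavariate(ay, by)
           for _ in range(N))
estimate = hits / N

assert abs(estimate - exact) < 0.01
```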

# Work-life balance

From Nigel Marsh:

I stepped back from the workforce and I spent a year at home with my wife and four young children. But all I learned about work-life balance from that year was that I found it quite easy to balance work and life when I didn’t have any work.

# Algorithm used for world record pi calculations

The following algorithm is based on work of Ramanujan and has been used in several world-record calculations of pi.

Initialize a_0 = 6 − 4√2 and y_0 = √2 − 1. Then compute

y_{n+1} = (1 − (1 − y_n^4)^(1/4)) / (1 + (1 − y_n^4)^(1/4))

and

a_{n+1} = a_n (1 + y_{n+1})^4 − 2^(2n+3) y_{n+1} (1 + y_{n+1} + y_{n+1}^2).

The terms a_n form a sequence of approximations to 1/π. The error in each approximation is bounded by

a_n − 1/π ≤ 16 · 4^n exp(−2π · 4^n).

This says that the number of correct digits roughly quadruples after each step. To see this, note that the number of correct decimal places after the nth step is the negative of the logarithm base 10 of the error:

−log10(a_n − 1/π) ≥ (2π · 4^n − n log 4 − log 16) / log 10.

[The error term goes to zero so quickly that you cannot (in ordinary precision) compute the error bound and then take logarithms; the exp(−2π · 4^n) term will underflow to zero for n larger than 3. (See why here.) You have to take the log of the error term directly before evaluating numerically.]

The number of correct digits quadruples at each step because the leading term in the numerator above grows like 4^n.

To give a few specifics, a_1 is accurate to 9 digits, a_2 is accurate to 41 digits, and a_3 is accurate to 171 digits. After 10 steps, a_10 is accurate to over 2.8 million digits.
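The iteration is short to code; the recurrence below is the Borwein brothers’ quartically convergent algorithm, which matches the initialization a_0 = 6 − 4√2, y_0 = √2 − 1 given above. In double precision only the first step or two are meaningful; record calculations need arbitrary-precision arithmetic. A sketch:

```python
from math import pi, sqrt

a = 6 - 4 * sqrt(2)   # a_0
y = sqrt(2) - 1       # y_0

# Two steps; in exact arithmetic a_2 is accurate to 41 digits,
# but in floats rounding error dominates after the first step.
for n in range(2):
    r = (1 - y**4) ** 0.25
    y = (1 - r) / (1 + r)
    a = a * (1 + y)**4 - 2**(2*n + 3) * y * (1 + y + y*y)

# a approximates 1/pi to (roughly) machine precision.
assert abs(1/a - pi) < 1e-12
```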

The algorithm given here was state of the art as of 1989. It was used to set world records in 1986. I don’t know whether it has been used more recently. See more here.

According to this page, π has been calculated to 6.4 billion digits. You can beat this record if you can carry out 16 steps of the method described here. a_16 would be accurate to over 11 billion digits.

Update (18 April 2012): The algorithm used most recently for world record calculations for pi has been the Chudnovsky algorithm. As of October 2011, the record was over 10 trillion digits.


For a daily dose of computer science and related topics, follow @CompSciFact on Twitter.

# A Ramanujan series for calculating pi

Ramanujan discovered the following remarkable formula for computing π:

1/π = Σ_{n=0}^∞ C(2n, n)^3 (42n + 5) / 2^(12n+4),

where C(2n, n) denotes the central binomial coefficient.

This is not the most efficient series for computing π. My next post will give a more efficient method, also based on work of Ramanujan. But the series above is interesting for reasons explained below.

Notice that the denominator of each term in the sum above is a power of 2. This means the partial sums of the series can be represented exactly in binary. We’ll see that, once we’ve added enough terms, each additional term adds six bits of precision to the sum.

To understand how to use this series, we need to estimate the binomial coefficient term. Stirling’s approximation shows that

C(2n, n) ≈ 2^(2n) / √(πn).

This tells us that the nth term in the series is a rational number with numerator something like 2^(6n) and denominator 2^(12n+4). Therefore the nth term is on the order of 2^(−6n−4), and so the series converges quickly. The first three terms illustrate this:

5/2^4 + 376/2^16 + 19224/2^28 + …

The error in summing up a finite number of terms is approximately the first term in the remainder, so just a few terms leads to an accurate approximation for 1/π and hence an accurate approximation for π.

To be a little more precise, when we sum the series from n = 0 to n = N, the error in approximating 1/π is on the order of the next term, 2^(-6N-10). Said another way, summing up to the Nth term computes 1/π to 6N + 10 bits. For example, suppose we wanted to compute 1/π to 200 decimal places. That’s about 664 bits, and so 108 terms of the series should be enough.

We glossed over a detail above. We said that the nth term is on the order of 2^(−6n−4). That’s true, but only for sufficiently large n. How large? We need the denominator factor (πn)^(3/2) to be larger than the numerator factor 42n + 5, and this happens once n is at least 58.

One advantage of using this series to compute π is that you could compute later blocks of bits without having to compute the earlier blocks, provided you’re interested in the bits beyond those contributed by the 58th term in the series.
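Because every denominator is a power of 2, the partial sums can be computed exactly with rational arithmetic. A sketch in Python:

```python
from fractions import Fraction
from math import comb, pi

def partial_sum(N):
    """Sum the series from n = 0 to N in exact rational arithmetic."""
    return sum(Fraction(comb(2*n, n)**3 * (42*n + 5), 2**(12*n + 4))
               for n in range(N + 1))

s = partial_sum(4)  # five terms already approximate 1/pi closely
assert abs(1/float(s) - pi) < 1e-8
```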


Related post: Two useful asymptotic series