Pi seconds equals one nanocentury

Duff’s rule says that a nanocentury is about π seconds. Assuming a year is 365.25 days, there are 3,155,760,000 seconds in a century. So a nanocentury, one billionth of a century, is 3.15576 seconds, roughly π seconds.

This odd fact is surprisingly useful in back-of-the-envelope calculations. To determine whether something is computationally feasible, you have to go from how much work can be done in a second to how much work can be done on calendar time scales. For example, suppose some operation is going to take 10^15 steps and you can carry out 10^6 operations per second. How long would that take? Obviously 10^9 seconds, but how long is that in familiar units of time? According to Duff’s rule it’s about a third of a century, or roughly 30 years.
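
Here is a quick sketch of that calculation in Python, using the made-up numbers from the example above (10^15 steps at 10^6 operations per second):

    import math

    steps = 1e15                 # hypothetical total amount of work
    rate = 1e6                   # hypothetical operations per second
    seconds = steps / rate       # 1e9 seconds of computation

    # Duff's rule: one nanocentury is about pi seconds,
    # so a billion seconds is about 1/pi of a century.
    centuries = seconds / (math.pi * 1e9)
    print(centuries)             # 0.318...: about a third of a century
    print(centuries * 100)       # 31.8...: roughly 30 years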

Since the square root of 10 is approximately π, you could also say that a nanocentury is about √10 seconds. In fact, the square root of 10 is 3.162, which is closer to the number of seconds in a nanocentury (3.15576) than π is. The advantage of the square root of 10 is that it is exactly half an order of magnitude, 10^(1/2). So you could say a century is 9.5 orders of magnitude longer than a second, or a year is 7.5 orders of magnitude longer than a second. Or perhaps more memorably,

A year is about 30 megaseconds.

If you interpret “mega” as 2^20 rather than 10^6, this approximation gets even better. (Technically, this would be 30 mebiseconds. The SI prefix “mega” means 10^6 = 1,000,000, while the binary prefix “mebi” means 2^20 = 1,048,576, though the latter isn’t widely used. Most people have either never heard of the newer prefixes like “mebi” or think they sound silly and prefer the ambiguity of using “mega” to mean two slightly different things. See Kibi, mebi, gibi.)
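
A few lines of Python confirm how good these approximations are (a quick check, not from the original post):

    import math

    seconds_per_year = 365.25 * 24 * 3600          # 31,557,600
    nanocentury = 100 * seconds_per_year / 1e9     # 3.15576 seconds

    print(nanocentury, math.pi, math.sqrt(10))     # 3.15576, 3.14159..., 3.16227...

    print(30e6 / seconds_per_year)        # 0.9506...: the "mega" version is about 5% short
    print(30 * 2**20 / seconds_per_year)  # 0.9968...: the "mebi" version is off by only about 0.3%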

I wrote briefly about Duff’s rule a while back in the post Three rules of thumb. That post also includes a great video of Grace Hopper explaining to David Letterman her rule of thumb that light travels about one foot in a nanosecond.

There isn’t a googol of anything

Before Google, there was googol, the number 10^100, written as a 1 followed by 100 zeroes.

There are about 4 × 10^79 atoms in the universe. (Here’s a derivation of that number.) You could bump that number up a little by counting particles rather than atoms, but not by much. There’s not a googol of anything physical in the universe.

On the other hand, numbers larger than a googol routinely arise in applications. When you’re counting potential things rather than physical things, you can run into numbers much larger than a googol. This happens all the time in probability calculations, where inconceivably large numbers pop up in intermediate steps on the way to moderate-sized results.
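
For a concrete illustration (my example, not from the original post): the binomial coefficients in an ordinary coin-flipping problem blow past a googol long before you reach the final probability, which is a perfectly modest number.

    import math

    googol = 10**100
    print(math.factorial(70) > googol)   # True: 70! is already about 1.2 x 10^100

    # Probability of exactly 500 heads in 1000 fair coin flips.
    n, k = 1000, 500
    print(math.comb(n, k) > googol)      # True: C(1000, 500) is about 2.7 x 10^299
    print(math.comb(n, k) / 2**n)        # 0.0252...: the final answer is unremarkable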

Emacs calculator

Emacs has quite a sophisticated scientific calculator. Like many other things in Emacs, it is both powerful and idiosyncratic.

The calc module ships with Emacs as of version 22. The calc manual has full documentation, but it takes some work to understand. This post may make it easier to read the manual.

The manual says you can start calc with the command M-#. That did not work for me, but the command M-x calc did.

Calc has two modes: RPN (reverse Polish notation) and algebraic. RPN is the default mode and uses very terse commands, often one or two letters. Most people will find the algebraic mode syntax more familiar. Commands in algebraic mode begin with a quote and use longer, more descriptive names.

For example, suppose you want to find the cosine of 5. You can do this in RPN mode by typing 5 and pressing return, then typing shift-C. To do the same calculation in algebraic mode, type 'cos(5) and return.

I’ll step through the gamma function documentation as an example of how to interpret the manual. It says

The f g (calc-gamma) [gamma] command computes the Euler gamma function. …

This means that in RPN mode, you enter the argument of the gamma function, say 5, and then type fg, that is, the f key immediately followed by the g key. (You don’t type the space shown in the manual’s f g.) You could also enter 5 and then type M-x calc-gamma. In algebraic mode, the syntax would be 'gamma(5). So the pattern in the manual is

RPN keystrokes (lisp function) [algebraic syntax].

The calc module has an amazing range of functionality — symbolic calculation, matrix operations, graphics, etc. — though I don’t imagine I’d use much of it. In my mind, the benefit of calc is being able to do a quick calculation without leaving Emacs. Although one could do sophisticated calculations in calc, I doubt I would, for two reasons. First, if I have to look up how to do something in calc, I lose the benefit of not interrupting my workflow. Second, I don’t want to start a computation in calc, discover I need something that isn’t there, and then have to switch to another tool such as SciPy, Mathematica, or R.

Probability that a number is prime

The fastest ways to test whether a number is prime have some small probability of being wrong. Said another way, it’s easier to tell whether a number is “probably” prime than to tell with certainty that it’s prime. This post looks briefly at algorithms for primality testing, then focuses on what exactly it means to say a number is “probably prime.”

The simplest probabilistic method for testing primes is based on Fermat’s Little Theorem. This theorem says that if p is prime and a is not divisible by p, then a^(p-1) has a remainder of 1 when divided by p. So if the remainder when dividing a^(n-1) by n is anything other than 1, we know n is not prime. If the remainder is 1, we don’t necessarily have a prime, but the chances that the number is prime have gone up. (At this point, some statisticians are cringing at my wording. I’ll get to their objections in a minute.)

If I repeat the test for several different values of a, the probability that my number is prime keeps going up. In principle I could keep testing with new values of a until I’m sufficiently satisfied that my number is prime. That’s not quite true, however, since there are some numbers, known as Carmichael numbers, that will slip through the test for every value of a relatively prime to the number. The smallest Carmichael number is 561. But Carmichael numbers are sufficiently rare that you are extraordinarily unlikely to run into one by accident when testing large numbers for primality.
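
Here is a minimal sketch of the Fermat test described above, written in Python. It is meant only to illustrate the idea; a production primality test would use something stronger, such as Miller–Rabin.

    import random

    def fermat_test(n, trials=20):
        """Return False if n is certainly composite, True if n is 'probably prime'."""
        if n < 4:
            return n in (2, 3)
        for _ in range(trials):
            a = random.randrange(2, n - 1)
            if pow(a, n - 1, n) != 1:    # Fermat's little theorem violated: composite
                return False
        return True                      # passed every trial: probably prime

    print(fermat_test(2**89 - 1))   # True: 2^89 - 1 is a (Mersenne) prime
    print(fermat_test(2**89 + 1))   # almost certainly False: 2^89 + 1 is divisible by 3
    print(pow(2, 560, 561))         # 1, even though 561 = 3 * 11 * 17 is composite:
                                    # the Carmichael number fools the test for base 2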

Primality testing involves some very interesting mathematics. I may blog about that sometime, but in this post I want to look at what it means to say a number is “probably prime.” This issue illustrates the differences between two schools of statistical interpretation.

A strict frequentist statistician would argue as follows. Suppose you have a large number N, and you run it through a probabilistic primality test. If it fails, the probability of it being prime is 0 because the number is certainly composite. If it passes, the probability that it is prime is either 0 or 1, but we just don’t know which. If it is prime, the probability that it is prime is 1. If it’s not prime, the probability is 0. It’s not very helpful to ask for the probability of a specific number being prime because there’s nothing random going on.

However, just before you write off the statistician as being worthless, he gives a further explanation. He explains that you can indeed talk about the probability of a number being prime if you rephrase your question. Suppose you pick a number at random from some range. Then you can speak of the probability that you chose a prime number. The number’s properties are not random but your choice is random. And further you can ask about the conditional probability that your selection will be prime given that it passes a primality test.

Perhaps you don’t find the argument above satisfying. You still want to say “I’ve got a number here and it passed the primality tests. Surely I can say it’s probably prime.” A Bayesian statistician may agree with you. There are varieties of Bayesians, but one school of thought says that probabilities represent uncertainty. Randomness is one cause of uncertainty, but not the only one or even the most important one. E. T. Jaynes was a major proponent of this view. He argued that much of what people say about randomness is nonsense and that we should focus on information instead. As he once said “probabilities do not describe reality, only our information about reality.”

Of course a particular number is either prime or not, but our knowledge of whether it is prime is uncertain. So the probability refers to our mental state, not to the number. (In a sense this isn’t too different from the frequentist saying the probability applies to our process of choosing numbers, not the number we drew.)

One criticism of Bayesian statistics is that Bayesian analysis begins with a prior distribution. So our Bayesian statistician would ask “What is your probability that the number will be prime before you select it?” A sensible answer would be to use the Prime Number Theorem to come up with an initial guess. For example, if you’re picking a number with 200 digits, the theorem says that the proportion of 200-digit numbers that are prime is roughly 1/(200 log 10), where log is the natural logarithm, or about 1/460.
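
Here is a short sketch of the arithmetic, with an added step that the post only alludes to: updating the Prime Number Theorem prior after each passed round of a probabilistic test. The update assumes, optimistically ignoring Carmichael numbers, that a composite number survives each round with probability at most 1/2, so take the exact figures as illustrative.

    import math

    prior = 1 / (200 * math.log(10))     # about 1/460 for a 200-digit number
    print(1 / prior)                     # 460.5...

    def posterior(prior, rounds_passed, liar_rate=0.5):
        """P(prime | passed every round), by Bayes' theorem."""
        p_pass_composite = liar_rate ** rounds_passed
        return prior / (prior + (1 - prior) * p_pass_composite)

    for k in (1, 10, 30):
        print(k, posterior(prior, k))
    # 1 round: about 0.004, 10 rounds: about 0.69, 30 rounds: about 0.9999996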

But what if someone chooses a prior that isn’t as clever? They’ll get a different probability that a particular number is prime! But that’s to be expected. If a probability represents uncertainty, some people will have more uncertainty than others because people know different things. What if someone secretly created your 200-digit number by multiplying two 100-digit numbers? His prior probability that the product is prime will be zero because he has no uncertainty.

Leaving aside primality for a moment, suppose I flip a fair coin. Before I flip the coin, a frequentist and a Bayesian  agree that the probability that it will come up heads is 1/2. Now suppose the coin has fallen to the floor and only I can see the outcome.  My probability of heads is now either 0 or 1. The Bayesian would say that his probability is still 1/2 because his information hasn’t changed. A strict frequentist would say it is now improper to assign a probability to the outcome. The coin’s status has changed from random to fixed but unknown, and one may not use probabilities to quantify uncertain knowledge.

Going back to primality testing, recall we first said 1/460 was a good prior probability that a 200-digit number is prime. But we could improve this by saying the prior probability is 0 for even numbers and 1/230 for odd numbers, since we can easily rule out half of the 200-digit numbers. Our prior has become more refined because it now incorporates more knowledge, namely the fact that big even numbers are not prime. But why stop there?

We could rule out multiples of 3 as well and put all the positive probability on numbers not divisible by 2 or 3. We could continue, and at each step our prior distribution would change. Some people are disturbed by the thought of the prior changing like this, but it’s perfectly consonant with the idea of a prior distribution expressing prior knowledge. The more work you do up front, the less uncertainty you have.
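
Here is a sketch of that refinement, continuing the 200-digit example (my code, following the reasoning above): ruling out multiples of small primes shrinks the pool of candidates without removing any primes, so the prior probability goes up by the reciprocal of the surviving fraction.

    import math

    base_density = 1 / (200 * math.log(10))      # about 1/460

    def refined_prior(sieved_primes):
        """Prior that a 200-digit number is prime, given it has no factor in sieved_primes."""
        surviving = 1.0
        for p in sieved_primes:
            surviving *= 1 - 1 / p               # fraction of numbers not divisible by p
        return base_density / surviving

    print(1 / refined_prior([2]))                # about 230: odd numbers only
    print(1 / refined_prior([2, 3]))             # about 153
    print(1 / refined_prior([2, 3, 5, 7]))       # about 105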

We could even continue this process, in theory, until we know the primality status of every 200-digit number and there’s no uncertainty left. We’d be done before we got around to conducting our probabilistic primality test. In practice we’d never finish if we tried to test every 200-digit number for primality. There would be nowhere to store the results even if we could compute them. There are fewer than 10^80 atoms in the universe, so if we stored one result per atom, we could only store a table of primes up to about 80 digits.

Setting aside statistical fine points, suppose we are now quite certain that our number is prime. Say we report that the probability of our number not being prime is less than 1 chance in 10^50. Is that really believable? For such a tiny probability, isn’t it more likely that there was a bug in our computer program that produced the probability? Or if we assume the software is perfect, we have to admit that computer hardware has errors. These errors are rare, but are they as rare as one chance in 10^50? Or for that matter, what is the probability that radiation may have caused a bit to flip somewhere in the hardware as we were computing?

Suppose we want to know the primality of a number because we’re going to use the number in an encryption algorithm. Then our ultimate concern is whether the encryption keeps our data secure. If the probability of the primality algorithm failing is orders of magnitude less than the probability of other security threats, we should focus our attention elsewhere.

Casablanca and Einstein

Humphrey Bogart and Ingrid Bergman in Casablanca

If you’ve ever seen Casablanca, you’ve heard the song As Time Goes By, but only the chorus.

You must remember this
A kiss is just a kiss, a sigh is just a sigh.
The fundamental things apply
As time goes by.

Did you know the song includes references to relativity and four-dimensional geometry?

Here’s the first verse.

This day and age we’re living in
Gives cause for apprehension
With speed and new invention
And things like fourth dimension.

Yet we get a trifle weary
With Mr. Einstein’s theory.
So we must get down to earth at times
Relax, relieve the tension.

And no matter what the progress
Or what may yet be proved
The simple facts of life are such
They cannot be removed.

Here are the full lyrics.

Via Math Mutation podcast #134.

Probability and Statistics cheat sheet

Matthias Vallentin posted a comment on my post about a math/CS cheat sheet to say that he’s been working on a probability and statistics cheat sheet. Looks great, though at 24 pages it stretches the definition of “cheat sheet” even more than the computer science cheat sheet did.

Anybody know of other cool cheat sheets?

Serious lessons from Knuth’s joke

On June 30, 2010, Donald Knuth announced iTeX, the successor to TeX. His announcement was an extended parody of much of what people recommend as the “right” way to develop software.

TeX has been extremely successful. The vast majority of math and computer science is published using TeX. And yet Knuth implies that TeX would have been an obscure failure if he had developed it using trendy software development techniques.

Here’s the video of Knuth’s presentation.

Math/CS cheat sheet

Here’s something called a theoretical computer science cheat sheet. I don’t know whether I agree with the name, but it’s a nice cheat sheet.

The first two pages of the cheat sheet have to do with sums, combinatorics, and recurrence relations, the kinds of things you’d find in Concrete Mathematics and certainly useful in theoretical computer science. The chart mentions the “master method” that I blogged about here. But a large part of the cheat sheet is devoted to calculus and other general mathematics not as directly related to computer science.

There’s a nice summary of finite calculus on page 8, a useful little area of math that’s not very widely known.

Thanks to Bill the Lizard for pointing out the cheat sheet.