by John on January 13, 2008
From Saint Augustine:
Heaven forbid that God should hate in us that by which he made us superior to the animals! Heaven forbid that we should believe in such a way as not to accept or seek reasons, since we could not even believe if we did not possess rational souls.
by John on January 12, 2008
Jim Berger gives the following example illustrating the difference between frequentist and Bayesian approaches to inference in his book The Likelihood Principle.
Experiment 1:
A fine musician, specializing in classical works, tells us that he is able to distinguish if Haydn or Mozart composed some classical song. Small excerpts of the compositions of both authors are selected at random and the experiment consists of playing them for identification by the musician. The musician makes 10 correct guesses in exactly 10 trials.
Experiment 2:
A drunken man says he can correctly guess in a coin toss what face of the coin will fall down. Again, after 10 trials the man correctly guesses the outcomes of the 10 throws.
A frequentist statistician would have as much confidence in the musician’s ability to identify composers as in the drunk’s ability to predict coin tosses. In both cases the data are 10 successes out of 10 trials. But a Bayesian statistician would combine the data with a prior distribution. Presumably most people would be inclined a priori to have more confidence in the musician’s claim than the drunk’s claim. After applying Bayes’ theorem to analyze the data, the credibility of both claims will have increased, though the musician will continue to have more credibility than the drunk. On the other hand, if you start out believing that it is completely impossible for drunks to predict coin flips, then your posterior probability for the drunk’s claim will continue to be zero, no matter how much evidence you collect.
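As a rough illustration of the Bayesian update described above, the sketch below compares two hypotheses for each claimant: genuine ability versus pure guessing. The priors (0.5 for the musician, 0.001 for the drunk) and the 0.9 success rate under "genuine ability" are made-up numbers chosen for illustration, not values from Berger's book.

```python
def posterior_skill(prior, successes, p_skill=0.9, p_guess=0.5):
    """Posterior probability of genuine ability after a run of correct guesses.

    prior    -- prior probability that the claimant has genuine ability
    p_skill  -- assumed chance of a correct guess given genuine ability
    p_guess  -- chance of a correct guess when guessing at random
    All numeric defaults are illustrative assumptions.
    """
    like_skill = p_skill ** successes
    like_guess = p_guess ** successes
    evidence = prior * like_skill + (1 - prior) * like_guess
    return prior * like_skill / evidence


# Same data for everyone: 10 correct guesses in 10 trials.
musician = posterior_skill(prior=0.5, successes=10)
drunk = posterior_skill(prior=0.001, successes=10)
closed_mind = posterior_skill(prior=0.0, successes=10)

print(f"musician:    0.5   -> {musician:.3f}")
print(f"drunk:       0.001 -> {drunk:.3f}")
print(f"closed mind: 0     -> {closed_mind:.3f}")
```

Under these assumptions both posteriors are higher than their priors, the musician stays ahead of the drunk, and the zero prior stays at zero.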
Dennis Lindley coined the term “Cromwell’s rule” for the advice that nothing should have zero prior probability unless it is logically impossible. The name comes from a statement by Oliver Cromwell addressed to the Church of Scotland:
I beseech you, in the bowels of Christ, think it possible that you may be mistaken.
In probabilistic terms, “think it possible that you may be mistaken” corresponds to “don’t give anything zero prior probability.” If an event has zero prior probability, it will have zero posterior probability, no matter how much evidence is collected. If an event has tiny but non-zero prior probability, enough evidence can eventually increase the posterior probability to a large value.
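The standard statement of Bayes' theorem makes this concrete. Here H stands for the claim and E for the evidence; the notation is added for illustration and is not from Lindley.

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)}

\frac{P(H \mid E)}{P(\neg H \mid E)} = \frac{P(E \mid H)}{P(E \mid \neg H)} \cdot \frac{P(H)}{P(\neg H)}
```

If P(H) = 0, the numerator of the first expression is zero whatever E is. The second expression is the same theorem in odds form: each independent piece of evidence multiplies the odds by a likelihood ratio, so zero odds stay zero, while tiny positive odds can grow as large as the evidence warrants.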
The difference between a small positive prior probability and a zero prior probability is the difference between a skeptical mind and a closed mind.
by John on January 12, 2008
An estimator in statistics is a way of guessing a parameter based on data. An estimator is unbiased if, over the long run, your guesses average out to the thing you’re estimating. Sounds eminently reasonable. But it might not be.
Suppose you’re estimating something like the number of car accidents per week in Texas and you counted 308 the first week. What would you estimate is the probability of seeing no accidents over the next two weeks?
If you use a Poisson model for the number of car accidents, a very common assumption for such data, there is a unique unbiased estimator. And this estimator would estimate the probability of no accidents during those two weeks as 1. Worse, had you counted 307 accidents, the estimated probability would be -1! The estimator alternates between two ridiculous values, but in the long run these values average out to the true value. Exact in the limit, useless on the way there. A slightly biased estimator would be much more practical.
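A small numerical check may help here. The sketch below assumes the estimator in question is (-1)^X, where X is the observed count, as in Hardy's counterexample. Its expected value under a Poisson model with mean λ is e^(-2λ), which is the probability of no events over two periods as long as the one observed. The function names and the choice λ = 1.5 are mine; a small λ is used only because the alternating sum loses all its precision in floating point for a mean near 308.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def expected_value(estimator, lam, terms=100):
    """Approximate E[estimator(X)] for X ~ Poisson(lam) by a truncated series."""
    return sum(estimator(k) * poisson_pmf(k, lam) for k in range(terms))


def alternating(k):
    """The unbiased estimator: +1 for even counts, -1 for odd counts."""
    return (-1) ** k


lam = 1.5  # illustrative mean, chosen small for numerical stability
mean_estimate = expected_value(alternating, lam)
target = math.exp(-2 * lam)  # probability of no events in two periods

print(alternating(308), alternating(307))
print(mean_estimate, target)
```

Every individual estimate is either 1 or -1, yet the average over the Poisson distribution matches the target probability, which is all that unbiasedness promises.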
See Michael Hardy’s article “An Illuminating Counterexample” for more details (An_Illuminating_Counterexample.pdf).
by John on January 12, 2008
Here are three quotes on originality I’ve read recently. I’ll lay them out first then discuss how I think they relate to each other.
C. S. Lewis from The Weight of Glory, as quoted in a blog post by David Rogstad.
No man who values originality will ever be original. But try to tell the truth as you see it, try to do any bit of work as well as it can be done for the work’s sake, and what men call originality will come unsought.
Larry Wall, creator of Perl, in his talk Perl, the first postmodern programming language.
Modernism is also a Cult of Originality. It didn’t matter if the sculpture was hideous, as long as it was original. It didn’t matter if there was no music in the music. Plagiarism was the greatest sin. … The Cult of Originality shows up in computer science as well. For some reason, many languages that came out of academia suffer from this. Everything is reinvented from first principles (or in some cases, zeroeth principles), and nothing in the language resembles anything in any other language you’ve ever seen. And then the language designer wonders why the language never catches on. … In case you hadn’t noticed, Perl is not big on originality.
Paul Graham in the introduction to Founders at Work.
People like the idea of innovation in the abstract, but when you present them with any specific innovation, they tend to reject it because it doesn’t fit with what they already know. … As Howard Aiken said, “Don’t worry about people stealing your ideas. If your ideas are any good, you’ll have to ram them down people’s throats.”
If you strive to be original, you might achieve it in some technical sense, but end up with something nobody cares about. Strive for authenticity and excellence and you’re more likely to do something valuable. But originality isn’t appreciated as much in practice as it is in theory.
by John on January 11, 2008
It’s always hard for me to decide the opening line for a paper. Here’s an opening line I ran across recently in a statistics paper.
Imagine we own a factory that produces nuts and bolts.
You had me at hello!
Here’s another great line, taken from the preface to the third edition of Theory of Probability by Harold Jeffreys.
Some points in later chapters have been transferred to the first, in the hope that fewer critics will be misled into inferring what is not in the book from not finding it in the first chapter.
by John on January 11, 2008
I listened to a podcast recently interviewing Rob Page from Zope. At one point he talked about having SQL statements in your code vs. having accessor classes, and how as your code gets bigger there’s more need for OO design. No surprise. But then he said something interesting: if your project is smaller than $1M then straight SQL is OK, and over $1M you need accessors.
I don’t know whether I agree with the $1M cutoff, but I agree that there is a cutoff somewhere. I appreciate that Page was willing to take a stab at where the cutoff is. Also, I found it interesting that he measured size by dollars rather than, for example, lines of code. I’d like to see more pundits qualify their recommendations as a function of project budget.
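To make the distinction between inline SQL and accessor classes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the `CustomerStore` class, and its `name_for` method are hypothetical names invented for this example; nothing here comes from the podcast or from Zope.

```python
import sqlite3

# Throwaway in-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES (?)", ("Ada",))

# Style 1: the SQL statement sits directly in the calling code.
row = conn.execute("SELECT name FROM customers WHERE id = ?", (1,)).fetchone()
inline_name = row[0]


# Style 2: callers go through an accessor class and never see the SQL.
class CustomerStore:
    """Hypothetical accessor that hides the schema behind method calls."""

    def __init__(self, connection):
        self._conn = connection

    def name_for(self, customer_id):
        """Return the customer's name, or None if the id is unknown."""
        row = self._conn.execute(
            "SELECT name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


store = CustomerStore(conn)
accessor_name = store.name_for(1)
print(inline_name, accessor_name)
```

The inline version is shorter, which is its appeal on a small project. The accessor version costs more up front but gives a schema change one place to land.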
Almost all advice on software engineering is about scaling up: bigger code bases, more users, etc. No one talks about the problem of scaling down. The implicit assumption is that you should concentrate on scaling up because scaling down is easy. I disagree. Over the last few years I’ve managed hundreds of small software projects, and I know that scaling down presents unique challenges. Maybe “scaling down” is the wrong term. My projects have scaled up in a different way: more projects rather than bigger projects.
One challenge of small projects is that they may become large projects; the opposite never happens. Sometimes the transition is so gradual that the project becomes too big for its infrastructure before anyone notices. Having some rule like the $1M cutoff could serve as a prompt for reflection along the way: Hey, now that we’re a $1M project, maybe we should start to refactor to avoid a complete rewrite down the road.
by John on January 10, 2008
Edsger Dijkstra quipped that software testing can only prove the existence of bugs, not the absence of bugs. His research focused on formal techniques for proving the correctness of software, with the implicit assumption that proofs are infallible. But proofs are written by humans, just as software is, and are also subject to error. Donald Knuth had this in mind when he said “Beware of bugs in the above code; I have only proved it correct, not tried it.” The way to make progress is to shift from thinking about the possibility of error to thinking about the probability of error.
Testing software cannot prove the impossibility of bugs, but it can increase your confidence that there are no bugs, or at least lower your estimate of the probability of running into a bug. And while proofs can contain errors, they’re generally less error-prone than source code. (See a recent discussion by Mark Dominus about how reliable proofs have been.) At any rate, people tend to make different kinds of errors when proving theorems than when writing software. If software passes tests and has a formal proof of correctness, it’s more likely to be correct. And if theoretical results are accompanied by numerical demonstrations, they’re more believable.
Leslie Lamport wrote an article entitled How to Write a Proof where he addresses the problem of errors in proofs and recommends a pattern of writing proofs which increases the probability of the proof being valid. Interestingly, his proofs resemble programs. And while Lamport is urging people to make proofs more like programs, the literate programming folks are urging us to write programs that are more like prose. Both are advocating complementary modes of validation, adding machine-like validation to prosaic proofs and adding prosaic explanations to machine instructions.
by John on January 10, 2008
I heard yesterday that relative to their size, galaxies are much closer together than stars. I’d never heard that, so I looked into it. Just using orders of magnitude, the sun is 10^9 meters wide and the nearest star is 10^16 meters away. The Milky Way is 10^21 meters wide, and the Andromeda galaxy is 10^22 meters away. So stars are millions of diameters apart, but galaxies are tens of diameters apart.
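The arithmetic is easy to check. The sketch below uses only the order-of-magnitude figures quoted above; the variable names are mine.

```python
# Order-of-magnitude sizes and distances in meters, as quoted in the text.
sun_diameter = 10**9
nearest_star_distance = 10**16
milky_way_diameter = 10**21
andromeda_distance = 10**22

# Separation measured in units of the object's own diameter.
star_separation = nearest_star_distance // sun_diameter
galaxy_separation = andromeda_distance // milky_way_diameter

print(f"stars:    about {star_separation:,} diameters apart")
print(f"galaxies: about {galaxy_separation:,} diameters apart")
```

With these rounded inputs the star separation comes out at ten million diameters and the galaxy separation at ten.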
by John on January 9, 2008
When I first saw integration techniques in calculus, I thought they were a waste of time because software packages could do any integral I could do by hand. Besides, you can always just use Simpson’s rule to compute integrals numerically.
In short, I thought symbolic integration was useless and numerical integration was trivial. Of course I was wrong on both counts. I’ve solved numerous problems at work by being able to compute an integral in closed form, and I’ve had a lot of fun cracking challenging numerical integration problems.
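For reference, here is a minimal sketch of composite Simpson's rule, along with one small hint of why numerical integration is not always trivial. The function name `simpson`, the default of 100 subintervals, and the two test integrands are my choices for illustration.

```python
import math


def simpson(f, a, b, n=100):
    """Approximate the integral of f over [a, b] with composite Simpson's rule.

    n is the number of subintervals and must be a positive even integer.
    """
    if n <= 0 or n % 2:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# A smooth integrand: the integral of sin over [0, pi] is exactly 2.
smooth_error = abs(simpson(math.sin, 0.0, math.pi) - 2.0)

# A less friendly one: sqrt has an unbounded derivative at 0, and the
# integral over [0, 1] is exactly 2/3.
rough_error = abs(simpson(math.sqrt, 0.0, 1.0) - 2.0 / 3.0)

print(smooth_error, rough_error)
```

With the same number of subintervals the error for the square root is noticeably larger than for the sine, even though both integrands look harmless. Singularities, oscillation, and infinite ranges make matters worse.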
Many of the things I thought were impractical when I was in college have turned out to be very practical. And many things I thought would be supremely useful I have yet to apply. Paying too much attention to what is “practical” can be self-defeating. Pragmatism is impractical.
by John on January 9, 2008
I ran across an article recently comparing the performance of a 1986 Mac and a 2007 PC. Of course the new machine would totally blow away the old one on a number crunching benchmark, but when it comes to the most mundane benchmarks — time to boot, launch Microsoft Word, open a file, do a search and replace, etc. — the old Mac pulls ahead slightly. Software bloat has increased at roughly the same rate as Moore’s law, making a new machine with new software no better than an old machine with old software in some respects.
The comparisons in the article resonate with my experience. I expect administrative tasks to be quick and number crunching to be slow, and so I’m continually surprised how long routine tasks take and how quickly numerical software runs.