From the monthly archives:

November 2010

Object oriented vs. functional programming

by John on November 3, 2010

From Michael Feathers:

OO makes code understandable by encapsulating moving parts.
FP makes code understandable by minimizing moving parts.

This explains some of the tension between object oriented programming and functional programming. The former tries to control state behind object interfaces. The latter tries to minimize state by using pure functions as much as possible.

It’s understandable that programmers accustomed to object oriented programming would like to add functional programming on top of OO, but I believe you have to make more of an exclusive commitment to functional programming to get the most benefit. For example, pure functions are easier to debug and to execute in parallel due to their lack of side effects. But if your code is only semi-functional, you can’t have the same confidence in testing your code or in spreading it across processors.

James Hague argues that 100% functional purity is impractical and that one should aim for 85% purity. But the 15% impurity needs to be partitioned, not randomly scattered across your code base. A simple strategy for doing this is to use functional in the small and OO in the large. Clojure also has some very interesting ideas for isolating the stateful parts of a program.
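As a rough illustration of "functional in the small and OO in the large," here is a minimal Python sketch. The names (`apply_deposit`, `apply_withdrawal`, `Account`) are hypothetical and invented for this example; the point is only that the rules live in pure functions while a single object owns all the mutable state.

```python
from dataclasses import dataclass, field


def apply_deposit(balance: int, amount: int) -> int:
    """Pure: the result depends only on the arguments; nothing is mutated."""
    if amount <= 0:
        raise ValueError("deposit must be positive")
    return balance + amount


def apply_withdrawal(balance: int, amount: int) -> int:
    """Pure: returns a new balance or raises; no hidden state is touched."""
    if amount <= 0 or amount > balance:
        raise ValueError("invalid withdrawal")
    return balance - amount


@dataclass
class Account:
    """The only stateful piece: it stores the balance and a history."""
    balance: int = 0
    history: list = field(default_factory=list)

    def deposit(self, amount: int) -> None:
        self.balance = apply_deposit(self.balance, amount)
        self.history.append(("deposit", amount))

    def withdraw(self, amount: int) -> None:
        self.balance = apply_withdrawal(self.balance, amount)
        self.history.append(("withdraw", amount))


account = Account()
account.deposit(100)
account.withdraw(30)
print(account.balance)
```

The pure functions can be tested with plain input/output checks and no setup, while the impure part is confined to two short methods.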

Related post:

Pure functions have side effects


The snowball strategy says to pay off your smallest debt first, then the next smallest, and so on until you’re out of debt.

When I first heard of this I thought it was silly. Clearly the optimal strategy is to pay off the debt with the highest interest rate first. That assessment is mathematically correct, but psychologically wrong. The snowball strategy provides a sense of accomplishment and encouragement by reducing the number of debts as soon as possible. Ideally someone would be able to pay off at least one debt before their determination to get out of debt wanes.
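To make the trade-off concrete, here is a small simulation sketch in Python. The debts, interest rates, and monthly budget are made-up numbers, and the model is deliberately simplified: it assumes monthly compounding, a fixed total budget, and no minimum payments on the debts that are not being targeted.

```python
def simulate(debts, budget, pick):
    """Pay down debts month by month.

    debts  -- list of (balance, annual_rate) pairs (hypothetical figures)
    budget -- total amount paid each month
    pick   -- function choosing which open debt receives the money
    Returns (months, total_interest, month_of_first_payoff).
    """
    balances = [float(b) for b, _ in debts]
    rates = [r / 12 for _, r in debts]  # simple monthly rate
    months, interest_paid, first_payoff = 0, 0.0, None
    while any(b > 0 for b in balances):
        months += 1
        if months > 1200:
            raise RuntimeError("budget too small to ever clear the debts")
        # Accrue one month of interest on every open debt.
        for i, b in enumerate(balances):
            if b > 0:
                charge = b * rates[i]
                balances[i] += charge
                interest_paid += charge
        # Spend the budget on the chosen debt, rolling leftovers to the next.
        remaining = budget
        while remaining > 1e-9 and any(b > 0 for b in balances):
            open_ids = [i for i, b in enumerate(balances) if b > 0]
            i = pick(open_ids, balances, rates)
            payment = min(remaining, balances[i])
            balances[i] -= payment
            remaining -= payment
            if balances[i] < 1e-9:
                balances[i] = 0.0
                if first_payoff is None:
                    first_payoff = months
    return months, round(interest_paid, 2), first_payoff


def snowball(open_ids, balances, rates):
    """Smallest balance first."""
    return min(open_ids, key=lambda i: balances[i])


def avalanche(open_ids, balances, rates):
    """Highest interest rate first."""
    return max(open_ids, key=lambda i: rates[i])


DEBTS = [(500, 0.05), (3000, 0.22), (1500, 0.12)]
snow = simulate(DEBTS, 400, snowball)
aval = simulate(DEBTS, 400, avalanche)
print("snowball  (months, interest, first payoff):", snow)
print("avalanche (months, interest, first payoff):", aval)
```

With these illustrative numbers, the highest-rate-first strategy pays less total interest, while the snowball strategy eliminates its first debt much sooner, which is exactly the tension described above.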

My point here isn’t to give financial advice. I bring up the snowball strategy because it is an example of a problem with an obvious but naive solution. If someone is overwhelmed by debt, they need encouragement more than a mathematically optimal strategy. However, the snowball strategy may not be psychologically optimal for everyone. This further illustrates the idea that optimal real-life strategies are more complicated than mathematical models.

Many things that don’t look optimal are in fact optimal once you take the necessary constraints into account. For example, software that seems poorly designed may in fact have been brilliantly designed when you consider its economic and historical constraints. (This may even be the norm. Nobody complains about how badly obscure software was designed. We complain about software that has been successful enough to criticize.)

Related posts:

A little simplicity goes a long way
Acknowledging problems versus solving problems


Sledgehammer technique for trig integrals

by John on November 2, 2010

There’s a powerful integration trick that I don’t believe is too widely known. Some calculus books mention it in a footnote, but few emphasize it. This is unfortunate since this trick applies to more problems than many of the more ad hoc techniques that are commonly taught.

Karl Weierstrass (1815-1897) came up with the idea of using t = tan(x/2) to convert trig functions of x to rational functions of t. If t = tan(x/2), then

  • sin(x) = 2t/(1 + t²)
  • cos(x) = (1 - t²)/(1 + t²)
  • dx = 2 dt/(1 + t²).

This means that any integral of a rational function of sines and cosines can be converted to an integral of a rational function of t. And any rational function of t can be integrated in closed form by using partial fraction decomposition, though the partial fraction decomposition may need to be performed numerically.
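As a worked example (chosen here for illustration, not taken from the original post), consider the integral of 1/(2 + cos x). With t = tan(x/2), the denominator becomes 2 + cos x = (3 + t²)/(1 + t²), so dx/(2 + cos x) turns into 2 dt/(3 + t²), whose antiderivative is (2/√3) arctan(t/√3). The Python sketch below checks this numerically over [0, π/2] using a simple midpoint rule; `midpoint_integral` is a helper written for this example.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


# Check the substitution formulas for sin and cos at an arbitrary angle.
x0 = 0.9
t0 = math.tan(x0 / 2)
assert math.isclose(math.sin(x0), 2 * t0 / (1 + t0**2))
assert math.isclose(math.cos(x0), (1 - t0**2) / (1 + t0**2))

# Original integral in x over [0, pi/2].
in_x = midpoint_integral(lambda x: 1 / (2 + math.cos(x)), 0, math.pi / 2)

# Same integral after substituting; t runs from tan(0) = 0 to tan(pi/4) = 1.
in_t = midpoint_integral(lambda t: 2 / (3 + t**2), 0, 1)

# Closed form from the antiderivative (2/sqrt(3)) * arctan(t/sqrt(3)).
closed = (2 / math.sqrt(3)) * math.atan(1 / math.sqrt(3))

print(in_x, in_t, closed)
```

All three values should agree to many decimal places; the closed form simplifies to π/(3√3).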

I call this the sledgehammer technique because it’s overkill for the simplest trig integrals; other less general techniques are easier to apply in such problems. On the other hand, Weierstrass’ technique is very general and can evaluate integrals that look impossible at first glance.

Related posts:

Integration and pragmatism
What to make “u” in integration by parts
Numerical integration article posted


Bias and consistency

by John on November 1, 2010

Suppose you have two ways to estimate something you’re interested in. One is biased and one is unbiased. Surely the unbiased method is better, right? Not necessarily. Statistical bias is not as bad as it sounds.

Under ideal conditions, an unbiased estimator gives the correct answer on average, but each particular estimate may be ridiculous. Suppose you ask me to estimate how many dwarfs were in Snow White and the Seven Dwarfs. If I only ever guess 100 or -272, each guess will be wildly wrong. But if 75% of the time I guess 100 and 25% of the time I guess -272, my average guess will be 0.75 × 100 + 0.25 × (-272) = 7, and so my estimates will be unbiased. If instead I guess 8 half the time and 7 half the time, my average guess will be 7.5 and my process will be biased. However, each estimate will be far more accurate.
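The arithmetic behind the dwarf example can be checked exactly. The Python sketch below computes the bias and the root mean squared error (RMSE) of both guessing schemes; the names `wild` and `tame` are labels invented for this illustration.

```python
from fractions import Fraction
from math import sqrt

TRUE_VALUE = 7  # number of dwarfs


def bias_and_rmse(guesses):
    """guesses: list of (value, probability) pairs with probabilities summing to 1."""
    mean = sum(Fraction(v) * p for v, p in guesses)
    mse = sum((Fraction(v) - TRUE_VALUE) ** 2 * p for v, p in guesses)
    return mean - TRUE_VALUE, sqrt(mse)


wild = [(100, Fraction(3, 4)), (-272, Fraction(1, 4))]  # unbiased scheme
tame = [(8, Fraction(1, 2)), (7, Fraction(1, 2))]       # biased scheme

wild_bias, wild_rmse = bias_and_rmse(wild)
tame_bias, tame_rmse = bias_and_rmse(tame)

print(f"wild: bias = {wild_bias}, RMSE = {wild_rmse:.2f}")
print(f"tame: bias = {tame_bias}, RMSE = {tame_rmse:.2f}")
```

The unbiased scheme has zero bias but an RMSE of about 161, while the biased scheme has a bias of 1/2 and an RMSE of about 0.71.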

Consistency is a different kind of condition from unbiasedness, and strictly speaking neither implies the other. Consistency says that if you feed your method enough data generated from your assumed model, your estimates will converge to the correct value.
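A standard illustration of consistency (not from the original post) is the variance estimate that divides by n instead of n - 1. For independent samples its expected value is (n - 1)/n times the true variance, so it is biased for every finite n, yet it converges to the true variance as n grows. The Python sketch below simulates this with standard normal data; the seed and sample sizes are arbitrary choices.

```python
import random


def variance_mle(sample):
    """Variance estimate that divides by n (biased) rather than n - 1."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


rng = random.Random(2010)  # fixed seed so runs are repeatable
estimates = {}
for n in (5, 50, 500, 50_000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
    estimates[n] = variance_mle(sample)
    print(n, round(estimates[n], 3))
```

The estimates for small n can be far from 1, while the estimate for the largest sample should land very close to it.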

But if your model is not exactly correct (and it never is) will you get a reasonably good result? It’s possible for an inconsistent method to provide good results in practice and it’s possible that a consistent method may not.

In his blog post on cross validation, Rob Hyndman mentions a paper that shows one validation method is consistent and another is not. Rob concludes

Frankly, I don’t consider this is a very important result as there is never a true model. In reality, every model is wrong, so consistency is not really an interesting property.

In the context of his post, Rob argues that the most important test of a statistical method is how well it predicts future data. Some people have commented that this comes down too hard on consistency. But we’re talking about a blog post, and blogs don’t use the same kind of carefully qualified language that formal papers do. Perhaps in a more formal setting Rob might argue that a gross failure of consistency gives one reason to suspect a method won’t predict well, but a lack of complete consistency shouldn’t remove a method from consideration. Such language may be inoffensive, but it lacks the verve of his original statement.

Too often bias and consistency are seen as all-or-nothing properties. In theoretical statistics, one typically asks whether a method is biased, not how biased it is. The same is true of consistency. Bias and consistency are only two criteria by which methods can be evaluated. A small amount of bias or inconsistency may be an acceptable trade-off in exchange for better performance by other criteria such as efficiency or robustness.

Related posts:

The Titanic Effect
What distribution does my data have?
