From the monthly archives:

January 2012

Walking away from factory work

by John on January 17, 2012

From Shop Class as Soulcraft,

Given their likely acquaintance with such a cognitively rich world of work, it is hardly surprising that when Henry Ford introduced the assembly line in 1913, workers simply walked out. One of Ford’s biographers wrote, “So great was labor’s distaste for the new machine system that toward the close of 1913 every time the company wanted to add 100 men to its factory personnel, it was necessary to hire 963.”

A dozen years ago people would talk of building “software factories” to crank out software projects. Back then someone tried to get me excited about joining an effort to create such a factory. I told him I did not want to work in a factory. He tried to backpedal, saying that it’s not what it sounds like. But I’m sure it was exactly what it sounded like.

{ 6 comments }

Preparing for change, expressing intent

by John on January 17, 2012

Many good programming practices boil down to preparing for change or expressing intent. It seems to me that novices emphasize the former, experts the latter.

One of the first things you learn in programming is to use symbolic constants rather than magic numbers. For example, if you have a maximum of 12 items in a shopping cart, define a constant like MAX_ITEMS to be 12 and use that symbol rather than the number “12” throughout the code. That way if you have to increase the maximum to 25 some day, you can just make the change in one place. Symbolic constants prepare for change.

Sounds good, but then why define a constant for pi? It’s not going to change. But having a constant PI in source code conveys the intention of the number.

There are 3,628,800 seconds in six weeks. Coincidentally, this number also equals 10!. But constants like SECONDS_PER_SIX_WEEKS and TEN_FACTORIAL clearly convey where the numbers come from. That’s why it’s sometimes worthwhile to give one thing two names. The symbol SECONDS_PER_SIX_WEEKS looks like a conversion factor, while TEN_FACTORIAL makes you think somewhere there are 10 things being arranged. Using the symbols in the opposite context would be clever, but not in a good way.
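Here is a minimal sketch of the two-names idea in Python. The two helper functions are hypothetical, made up only to show each constant in the context where it belongs.

```python
import math

# Two names for the same value, each saying where the number comes from.
SECONDS_PER_SIX_WEEKS = 6 * 7 * 24 * 60 * 60  # weeks * days * hours * minutes * seconds
TEN_FACTORIAL = math.factorial(10)            # orderings of 10 distinct items


def six_week_periods(seconds):
    """Convert a duration in seconds to six-week periods (hypothetical helper)."""
    return seconds / SECONDS_PER_SIX_WEEKS


def orderings_of_ten_items():
    """Number of ways to arrange 10 distinct items (hypothetical helper)."""
    return TEN_FACTORIAL


# The values coincide, but the intent does not.
print(SECONDS_PER_SIX_WEEKS == TEN_FACTORIAL)
```

Swapping the constants would leave the program's behavior unchanged and its meaning obscured.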

Expressing intent is easier to justify than preparing for change. If you argue that some chunk of code should be pulled out into its own function in case it needs to change, someone may argue “But that’ll never change.” If you argue that the same chunk of code should be pulled out and given a name to express what it’s trying to do, you’re likely to get less resistance.

If you focus on making your intentions clear, your code will be easier to maintain. If you focus on maintainability alone, it might backfire. You might get lots of unneeded code, inserted with the intent of making future maintenance easier, that makes maintenance harder.

Related posts:

Why does software have to be maintained?
Holographic code
Bugs, features, and risk

{ 4 comments }

Six analysis and probability diagrams

by John on January 17, 2012

Here are a few diagrams I’ve created that summarize relationships in analysis and probability. Click on a thumbnail image to go to a page with the full image and explanatory text.

Special functions

Gamma and related functions

Probability distributions

Conjugate priors

Convergence theorems

Bessel functions

{ 3 comments }

The most dreadful conclusion

by John on January 16, 2012

In his book Heretics, G. K. Chesterton praises H. G. Wells for being able to change his mind.

He has abandoned the sensational theory with the same honourable gravity and simplicity with which he adopted it. Then he thought it was true; now he thinks it is not true. He has come to the most dreadful conclusion a literary man can come to, the conclusion that the ordinary view is the right one. It is only the last and wildest kind of courage that can stand on a tower before ten thousand people and tell them that twice two is four.

Emphasis added.

Related post: Three reasons expert predictions are often wrong

{ 3 comments }

Roman numeral puzzle

by John on January 14, 2012

I noticed an ad for Super Bowl XLVI on a pizza box this morning. The Roman numeral XLVI does not repeat any character. This brought up a couple of questions.

  • How many Roman numerals are possible if you’re not allowed to repeat a character?
  • Could you write a (reasonably short) regular expression to find all such numbers?

You can post your solutions to either question in the comments.

There has never been universal agreement on the rules for constructing Roman numerals, so your solution would depend on your choice of rules. For our purposes here, assume the valid characters are I, V, X, L, C, D, and M. Also, assume any character can be subtracted from a larger character. For example, you can assume IL is a valid representation of 49.

For a more challenging problem, you can use the more restrictive subtraction rules.

  1. I can be subtracted from V and X only.
  2. X can be subtracted from L and C only.
  3. C can be subtracted from D and M only.
  4. V, L, and D can never be subtracted.
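One way to explore the restrictive version by brute force is sketched below in Python. It assumes the usual convention of one canonical spelling per number and only the numbers 1 through 3999. The puzzle doesn't fix either assumption, so a different choice of rules would give a different count. The names `to_roman` and `has_no_repeats` are just for this sketch.

```python
# Canonical Roman numerals, which obey the restrictive subtraction rules.
# Assumption: one spelling per number, numbers limited to 1..3999.
SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Greedy conversion of an integer in 1..3999 to a Roman numeral."""
    parts = []
    for value, symbol in SYMBOLS:
        while n >= value:
            parts.append(symbol)
            n -= value
    return "".join(parts)


def has_no_repeats(numeral):
    """True if no character appears more than once."""
    return len(set(numeral)) == len(numeral)


non_repeating = [to_roman(n) for n in range(1, 4000) if has_no_repeats(to_roman(n))]
print(len(non_repeating))
```

Enumerating first and filtering second is slow compared to a counting argument, but it gives something to check a regular expression against.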

Other puzzle posts:

A Renaissance math puzzle
Technology history quiz
A log puzzle

{ 25 comments }

Interpreting statistics

by John on January 13, 2012

From Matt Briggs:

I challenge you to find me in any published statistical analysis, outside of an introductory textbook, a confidence interval given the correct interpretation. If you can find even one instance where the [frequentist] confidence interval is not interpreted as a [Bayesian] credible interval, then I will eat your hat.

Most statistical analysis is carried out by people who do not interpret their results correctly. They carry out frequentist procedures and then give the results a Bayesian interpretation. This is not simply a violation of an academic taboo. It means that people generally underestimate the uncertainty in their conclusions.

Related posts:

Most published research results are false
Classical statistics in a nutshell

{ 16 comments }

Bugs, features, and risk

by John on January 12, 2012

All software has bugs. Someone has estimated that production code has about one bug per 100 lines. Of course there’s some variation in this number. Some software is a lot worse, and some is a little better.

But bugs-per-line-of-code is not very useful for assessing risk. The risk of a bug is the probability of running into it multiplied by its impact. Some lines of code are far more likely to execute than others, and some bugs are far more consequential than others.
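As a rough sketch of that calculation in Python: the areas, probabilities, and impact figures below are invented for illustration, not measurements from any real project.

```python
# Hypothetical numbers: the chance a user hits a bug in each area, and the
# cost (in arbitrary units) if that happens.
areas = {
    "login": {"probability": 0.30, "impact": 8},
    "report_export": {"probability": 0.02, "impact": 3},
    "payment": {"probability": 0.05, "impact": 100},
}


def risk(area):
    """Risk of a bug: probability of running into it times its impact."""
    return area["probability"] * area["impact"]


# Areas ordered from highest risk to lowest.
ranked = sorted(areas, key=lambda name: risk(areas[name]), reverse=True)
for name in ranked:
    print(name, risk(areas[name]))
```

With these made-up numbers the rarely hit payment code outranks the frequently hit login code, because a failure there costs so much more.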

Devoting equal effort to testing all lines of code would be wasteful. You’re not going to find all the bugs anyway, so you should concentrate on the parts of the code that are most likely to run and that would produce the greatest harm if they were wrong.

However, here’s a complication. The probability of running into a bug can change over time as people use the software in new ways. For whatever reason, people want to use features that had not been exercised before. When they do so, they’re likely to uncover new bugs.

(This helps explain why everyone thinks his preferred software is more reliable than others. When you’re a typical user, you tread the well-tested paths. You also learn, often subconsciously, to avoid buggy paths. When you bring your expectations from an old piece of software to a new one, you’re more likely to uncover bugs.)

Even though usage patterns change, they don’t change arbitrarily. It’s still the case that some code is far more likely than other code to execute.

Good software developers think ahead. They solve more than they’re asked to solve. They think “I’m going to go ahead and include this other case while I’m at it in case they need it later.” They’re heroes when it turns out their guesses about future needs were correct.

But there’s a downside to this initiative. You pay for what you don’t use. Every speculative feature either has to be tested, incurring more expense up front, or delivered untested, incurring more risk. This suggests it’s better to disable unused features.

You cannot avoid speculation entirely. Writing maintainable software requires speculating well, anticipating and preparing for change. Good software developers place good bets, and these tend to be small bets, going to a little extra effort to make software much more flexible. As with bugs, you have to consider probabilities and consequences: how likely is this part of the software to change, and how much effort will it take to prepare for that change?

Developers learn from experience what aspects of software are likely to change and they prepare for that change. But then they get angry at a rookie who wastes a lot of time developing some unnecessary feature. They may not realize that the rookie is doing the same thing they are, but with a less informed idea of what’s likely to be needed in the future.

Disputes between developers often involve hidden assumptions about probabilities. Whether some aspect of the software is responsible preparation for maintenance or wasteful gold plating depends on your idea of what’s likely to happen in the future.

Related post: Why programmers write unneeded code

{ 11 comments }

Imploding my old office building

by John on January 11, 2012

I used to have an office in this building that was imploded on Sunday.

You can hear someone on the video say “Are we looking at the right building?” just before the building starts to collapse.

More on the implosion from the Houston Chronicle.

[If the video doesn't show up in your blog reader, go directly to my blog page or to the Houston Chronicle link.]

{ 2 comments }

Customizing conventional wisdom

by John on January 11, 2012

From Solitude and Leadership by William Deresiewicz:

I find for myself that my first thought is never my best thought. My first thought is always someone else’s; it’s always what I’ve already heard about the subject, always the conventional wisdom. It’s only by concentrating, sticking to the question, being patient, letting all the parts of my mind come into play, that I arrive at an original idea. By giving my brain a chance to make associations, draw connections, take me by surprise. And often even that idea doesn’t turn out to be very good. I need time to think about it, too, to make mistakes and recognize them, to make false starts and correct them, to outlast my impulses, to defeat my desire to declare the job done and move on to the next thing.

Conventional wisdom summarizes the experience of many people. As a result, it’s often a good starting point. But like a blurred photo, it has gone through a sort of averaging process, losing resolution along the way. It takes hard work to decide how, or even whether, conventional wisdom applies to your particular circumstances.

Bureaucracies are infuriating because they cannot deliberate on particulars the way Deresiewicz recommends. In order to scale up, they develop procedures that work well under common scenarios.

The context of Deresiewicz’s advice is a speech he gave at West Point. His audience will spend their careers in one of the largest and most bureaucratic organizations in the world. Deresiewicz is aware of this irony and gives advice for how to be a deep thinker while working within a bureaucracy.

Related posts:

John Cleese on creativity
Advanced or just obscure?
In defense of reinventing wheels
Small, local, old, and particular

{ 0 comments }

Holographic code

by John on January 9, 2012

In a hologram, information about each small area of the image is scattered throughout the hologram. You can’t say this little area of the hologram corresponds to this little area of the image. At least that’s what I’ve heard; I don’t really know how holograms work.

I thought about holograms the other day when someone was describing some source code with deeply nested templates. He told me “You can’t just read it. You can only step through the code with a debugger.” I’ve run into similar code. The execution sequence of the code at run time is almost unrelated to the sequence of lines in the source code. The run time behavior is scattered through the source code like image information in a hologram.

Holographic code is an advanced anti-pattern. It’s more likely to result from good practice taken to an extreme than from bad practice.

Somewhere along the way, programmers learn the “DRY” principle: Don’t Repeat Yourself. This is good advice, within reason. But if you wring every bit of redundancy out of your code, you end up with something like Huffman encoded source. In fact, DRY is very much a compression algorithm. In moderation, it makes code easier to maintain. But carried too far, it makes reading your code like reading a zip file. Sometimes a little redundancy makes code much easier to read and maintain.

Code is like wine: a little dryness is good, but too much is bitter or sour.

Note that functional-style code can be holographic just like conventional code. A pure function is self-contained in the sense that everything the function needs to know comes in as arguments, i.e. there is no dependence on external state. But that doesn’t mean that everything the programmer needs to know is in one contiguous chunk of code. If you have to jump all over your code base to understand what’s going on anywhere, you have holographic code, regardless of what style it was written in. However, I imagine functional programs would usually be less holographic.

Related post: Baklava code

{ 30 comments }

Pax Romana

by John on January 7, 2012

From A History of the English Speaking Peoples by Winston Churchill:

In our own fevered, changing, and precarious age, where all is in flux and nothing is accepted, we must survey with respect a period when, with only three hundred thousand soldiers, widespread peace in the entire known world was maintained from generation to generation, and when the first pristine impulse of Christianity lifted men’s souls to the contemplation of new and larger harmonies beyond the ordered world around them.

{ 11 comments }

Variable-length patents

by John on January 5, 2012

Alex Tabarrok brings up an interesting question: Why should all patents have the same length?

Pharmaceuticals are really the classic case of where the [ratio of] innovation-to-imitation costs are extraordinarily high. It costs about a billion dollars to create a new pharmaceutical. The first pill costs a billion dollars; the second pill costs 50 cents. So, that’s a classic case where imitation costs really are low. That’s the best case for patents, in a field like that.

But my question is: Why does every innovation deserve or require the same 20-year patent? Why do we have a system which gives a one billion dollar pharmaceutical–where there’s $1 billion in research and development costs–we give that a 20-year patent and one-click shopping gets the same 20-year patent? That makes no sense whatsoever.

So, what I suggest is a more flexible system. I’d like to have a 20-year patent, maybe a 15-year patent, maybe a 3-year patent. Something like that. And then we could say: You want to apply for a 3-year patent? We are going to get this through the system quickly; we won’t look at it so much. … You want a 20-year patent, though, you’d better show us that you really are deserving and put some costs in there.

Source: EconTalk

I don’t like software patents, though I don’t see them going away. But it might be possible to pass legislation to reduce the length of software patents.

See also this post about the tragedy of the anti-commons. The tragedy of the commons is misuse of a resource nobody owns. The tragedy of the anti-commons is the under-use of a resource that too many people own.

Building a DVD player requires using hundreds of patented inventions. No company could ever build a DVD player if it had to negotiate with all patent holders and obtain their unanimous consent. … Fortunately, the owners of the patents used in building DVD players have formed a single entity authorized to negotiate on their behalf. But if you’re creating something new that does not have an organized group of patent holders, there are real problems.

{ 6 comments }

Stigler’s law and Avogadro’s number

by John on January 5, 2012

Stigler’s law says that no scientific discovery is named after its original discoverer. Stigler attributed his law to Robert Merton, acknowledging that Stigler’s law obeys Stigler’s law.

Avogadro’s number may be an example of Stigler’s law, depending on your perspective. An episode of Engines of our Ingenuity on Josef Loschmidt explains.

The Italian, Romano Amadeo Carlo Avogadro, had suggested [in 1811] that all gases have the same number of molecules in a given volume. Loschmidt figured out [in 1865] how many molecules that would be.

You could argue that Avogadro’s constant should be named after Loschmidt, and some use the symbol L for the constant in honor of Loschmidt. Jean Perrin came up with more accurate estimates and proposed in 1909 that the constant should be named after Avogadro. Loschmidt made several important contributions to science that are now known by others’ names.

As I’d mentioned in an earlier post, there are some fun coincidences with Avogadro’s number.

  1. N_A is approximately 24! (i.e., 24 factorial).
  2. The mass of the earth is approximately 10 N_A kilograms.
  3. The number of stars in the observable universe is roughly 0.5 N_A.
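The first two coincidences are easy to check with a few lines of Python. The value of Avogadro’s number is the one fixed by the current SI definition, and the earth’s mass is an approximate figure. The star count is too rough an estimate to be worth checking this way.

```python
import math

AVOGADRO = 6.02214076e23   # per mole, exact under the current SI definition
EARTH_MASS_KG = 5.972e24   # approximate mass of the earth

# A ratio near 1 means the coincidence holds.
ratio_factorial = math.factorial(24) / AVOGADRO
ratio_earth = EARTH_MASS_KG / (10 * AVOGADRO)

print(ratio_factorial)
print(ratio_earth)
```

Both ratios come out within a few percent of 1.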

{ 5 comments }

double.Epsilon != DBL_EPSILON

by John on January 5, 2012

Here’s a pitfall in C# that keeps coming up. C# has a constant double.Epsilon that programmers coming from C naturally assume is the same as C’s DBL_EPSILON. It’s not. In fact, the former is hundreds of orders of magnitude smaller.

C#’s double.Epsilon is the smallest positive floating point number, the closest one to 0 without being 0. C’s DBL_EPSILON is the distance between 1 and the closest floating point number greater than 1. Roughly speaking, DBL_EPSILON is the smallest positive floating point number x such that 1 + x != 1, often called “machine epsilon.”

Typically double.Epsilon is on the order of 10^-324 and DBL_EPSILON is on the order of 10^-16. (These values could potentially change depending on the platform, but they hardly ever do.)
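The two quantities can be computed directly. The sketch below uses Python rather than C# or C, assuming the platform’s float is an IEEE 754 double with the default rounding mode, which is the usual case. The function names are made up for this example.

```python
import sys


def machine_epsilon():
    """Gap between 1.0 and the next larger double (C's DBL_EPSILON)."""
    eps = 1.0
    while 1.0 + eps / 2.0 != 1.0:
        eps /= 2.0
    return eps


def smallest_positive():
    """Smallest positive double (what C# calls double.Epsilon)."""
    x = 1.0
    while x / 2.0 > 0.0:
        x /= 2.0
    return x


print(machine_epsilon())         # same value as sys.float_info.epsilon
print(sys.float_info.epsilon)
print(smallest_positive())
```

Python exposes the first value as `sys.float_info.epsilon`, so the loop is only there to show where the number comes from.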

C# has no constant corresponding to DBL_EPSILON. This is unfortunate, since this constant appears frequently in numerical software. Why? Because it tells you, for example, when to stop adding series.

If DBL_EPSILON is on the order of 10^-16, that means that if you add two numbers that differ by more than 16 orders of magnitude, the result is just the larger number; the smaller one is lost entirely. If you’re summing a decreasing series of numbers, say in order to evaluate a Taylor approximation, you might as well stop once the next term is 16 orders of magnitude smaller than the sum. If you keep going past that point, you’ll burn CPU cycles but you won’t change your answer.
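Here is a minimal sketch of that stopping rule in Python, using the Taylor series for exp(x) as the example. The function name `exp_taylor` is made up for this sketch. It is meant for modest positive x; for negative x the alternating terms cancel and the result loses accuracy, so this is not a substitute for a library routine.

```python
import math
import sys


def exp_taylor(x):
    """Sum the Taylor series for exp(x) until terms stop mattering.

    Returns the sum and the index of the last term added.
    """
    total = 1.0
    term = 1.0
    n = 0
    # Stop once the latest term is below machine epsilon relative to the sum.
    while abs(term) > sys.float_info.epsilon * abs(total):
        n += 1
        term *= x / n
        total += term
    return total, n


approx, last_index = exp_taylor(1.0)
print(approx, math.exp(1.0), last_index)
```

Using the named constant `sys.float_info.epsilon` rather than a literal 1e-16 is the same point made about DBL_EPSILON: the name says why the loop stops.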

DBL_EPSILON is almost always about 10^-16. But by giving it a name, you avoid having 10^-16 as a mysterious constant throughout code. And if your code should ever move to an environment with different floating point resolution, your code will correctly adjust to the new platform.

Related links:

An introduction to numerical programming in C#
Anatomy of a floating point number

{ 8 comments }

Just what do you mean by ‘scale’?

by John on January 4, 2012

“Fancy algorithms are slow when n is small, and n is usually small.” — Rob Pike

Someone might object that Rob Pike’s observation is irrelevant. Everything is fast when the problem size n is small, so design your code to be efficient for large n and don’t worry about small n. But it’s not that simple.

Suppose you have two sorting algorithms, Simple Sort and Fancy Sort. Simple Sort is more efficient for lists with fewer than 50 elements and Fancy Sort is more efficient for lists with more than 50 elements.

You could say that Fancy Sort scales better. What if n is a billion? Fancy Sort could be a lot faster.

But there’s another way a problem could scale. Instead of sorting longer lists, you could sort more lists. What if you have a billion lists of size 40 to sort?

People toss around the term “scaling,” assuming everyone has the same notion of scaling. But projects could scale along different dimensions. Whether Simple Sort or Fancy Sort scales better depends on how the problem scales.
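The two dimensions can be made concrete with a small Python sketch. Everything here is a stand-in: insertion sort plays Simple Sort, the built-in `sorted` plays Fancy Sort, and the cutoff of 50 is taken from the example rather than measured. In real Python code the built-in sort would win at every size, so read this as a picture of the structure, not as a benchmark.

```python
SMALL_LIST_CUTOFF = 50  # crossover point assumed in the example


def simple_sort(items):
    """Insertion sort: little overhead, quadratic growth."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def fancy_sort(items):
    """Stand-in for an asymptotically faster algorithm."""
    return sorted(items)


def sort_list(items):
    """Scaling in list length: pick the algorithm by size."""
    if len(items) < SMALL_LIST_CUTOFF:
        return simple_sort(items)
    return fancy_sort(items)


def sort_many(lists):
    """Scaling in number of lists: every short list takes the simple path."""
    return [sort_list(items) for items in lists]


print(sort_many([[3, 1, 2], [9, 8, 7, 6]]))
```

With a billion lists of size 40, every call lands in `simple_sort`, and Fancy Sort’s better asymptotics never come into play.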

The sorting example just has two dimensions: the length of each list and the number of lists. Software trade-offs are often much more complex. The more dimensions a problem has, the more opportunities there are for competing solutions to each claim that it scales better.


{ 12 comments }