Well, F = ma.

Three or four very short stories on the difficulty of learning to use simple things. Depends whether you count the last section as a story.

* * *

When I was taking freshman physics and we were stuck on a problem, the professor would say “Well, F = ma.”

True, but absolutely useless. Yes, we know that F = ma. (Force equals mass times acceleration.) Nobody thought “Oh, that’s it. I was thinking F = ma². That explains everything.” Newton’s laws are simple (in a sense) but subtle to apply. The difficult part isn’t the abstract principles but their application to concrete problems.

* * *

The heart of Bayesian statistics is not much more complicated than F = ma. It’s the statement that

posterior ∝ likelihood × prior.

It takes a few years to learn how to apply that equation well. And when people try to help, their advice sounds about as useless as “Well, F = ma.”
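
As a minimal sketch of what the equation says mechanically, here is a grid approximation in Python for a coin’s unknown probability of heads. The data (7 heads in 10 flips), the uniform prior, and the function name posterior_on_grid are all made up for illustration. The arithmetic is the easy part; choosing a sensible prior and likelihood for a real problem is where the years go.

```python
# Grid approximation of posterior ∝ likelihood × prior for a coin's bias.
# The data and the uniform prior are illustrative assumptions.

def posterior_on_grid(heads, flips, grid_size=101):
    # Candidate values for the probability of heads, evenly spaced on [0, 1].
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Uniform prior: every candidate value is equally plausible beforehand.
    prior = [1.0] * grid_size
    # Binomial likelihood, dropping the constant factor that cancels on normalizing.
    likelihood = [t**heads * (1 - t)**(flips - heads) for t in thetas]
    # posterior ∝ likelihood × prior, then normalize so the weights sum to 1.
    unnormalized = [l * p for l, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return thetas, posterior

thetas, posterior = posterior_on_grid(heads=7, flips=10)
mode = thetas[posterior.index(max(posterior))]
mean = sum(t * p for t, p in zip(thetas, posterior))
print(f"posterior mode: {mode:.2f}, posterior mean: {mean:.3f}")
```

With a uniform prior the posterior mode lands on the observed frequency, 0.7, and the mean is pulled slightly toward 1/2, close to the 8/12 you would get from the exact Beta(8, 4) posterior.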

* * *

Learning to use Unix was hard. When I asked for help getting started, a lab assistant said “Go read the man pages.” That’s about as hostile as saying “Want to learn English? Read a dictionary.” Fortunately I knew other people who were helpful. One of them told me about the book.

But still, it took a while to get the gestalt of Unix. I knew how to use a handful of utilities, and kept thinking everything would be fine once I knew maybe 10x as many utilities. Then one day I was talking with a friend who seemed fluent working with Unix. I asked him how he did a few things and realized he used the same tools I did, but used them better. It was almost as if he’d said “I just use F = ma” except when he said it things clicked.

* * *

The motivation for this post, the thing that brought these stories to mind, was listening to a podcast. The show had some good advice, things that I know I need to do, but nothing I hadn’t heard many times before. The hard part is working out what the particulars mean for me personally.

It often takes someone else to help us see what’s right in front of us. I’m grateful for the people who have helped me work out the particulars of things I was convinced of but couldn’t see how to apply. Sometimes I have the pleasure of being able to do that for someone else.

If it were easy …

“If it were easy, someone would have done it.” Maybe not.

Maybe the thing is indeed easy, and has been done before. Then someone was the first to do it. The warning that it had been done before didn’t apply to this person, even though it would apply to the subsequent people with the same idea.

This reminds me of the story of two economists walking down the street. They notice a $20 bill on the sidewalk and the first asks “Aren’t you going to pick it up?” The second replies “No, it’s not really there. If it were, someone would have picked it up by now.”

Sometimes a solution is easy, but nobody has had the audacity to try it. Or maybe circumstances have changed so that something is easy now that hasn’t been before.

Sometimes a solution is easy for you, if not for many others. See how much less credible the opening sentence sounds with “for you” inserted: “If it were easy for you, someone would have done it.”

Structure in jazz and math

Last night I went to a concert by the Branford Marsalis Quartet. One of the things that impressed me about the quartet was how creative they are while also being squarely within a tradition. People who are not familiar with jazz may not realize how structured it is and how much it respects tradition. The spontaneous and creative aspects of jazz are more obvious than the structure. In some ways jazz is more tightly structured than classical music. To use Francis Schaeffer’s phrase, there is form and freedom, freedom within form.

Every field has its own structure, its tropes, its traditions. Someone unfamiliar with the field can be overwhelmed, not having the framework that an insider has to understand things. They may think something is completely original when in fact the original portion is small.

In college I used to browse the journals in the math library and be completely overwhelmed. I didn’t learn until later that usually very little in a journal article is original, and even the original part isn’t that original. There’s a typical structure for a paper in PDEs, for example, just as there are typical structures for romantic comedies, symphonies, or space operas. A paper in partial differential equations might look like this:

  1. Motivation / previous work
  2. Weak formulation of PDE
  3. Craft function spaces and PDE as operator
  4. A priori estimates imply operator properties
  5. Well posedness results
  6. Regularity

An expert knows these structures. They know what’s boilerplate, what’s new, and just how new the new part is. When I wrote something up for my PhD advisor I remember him saying “You know what I find most interesting?” and pointing to one inequality. The part he found interesting, the only part he found interesting, was not that special from my perspective. It was all hard work for me, but only one part of it stood out as slightly original to him. An expert in partial differential equations sees a PDE paper the way a professional musician listens to another or the way a chess master sees a chess board.

While a math journal article may look totally incomprehensible, an expert in that specialization might see 10% of it as somewhat new. An interesting contrast to this is the “abc conjecture.” Three and a half years ago Shinichi Mochizuki proposed a proof of this conjecture. But his approach is so entirely idiosyncratic that nobody has been able to understand it. Even after a recent conference held for the sole purpose of penetrating this proof, nobody but Mochizuki really understands it. So even though most original research is not that original, once in a while something really new comes out.



I was listening to a classical music station yesterday, and I heard the story of a professional pianist whose hand was injured in an accident. He then started learning trumpet and two years later he was a professional trumpeter. I didn’t catch the musician’s name.

I was not surprised that a professional in one instrument could become a professional in another, but I was surprised that he did it in only two years. It probably helped that he was no longer able to play piano; I imagine if he wanted to learn trumpet in addition to piano he would not have become so proficient so quickly.

If you go by the rule of thumb that it takes about 10 years to master anything, this professional pianist was 80% of the way to becoming a professional trumpeter before he touched a trumpet. Or to put it another way, 80% of being a professional musician who plays trumpet is becoming a professional musician.

Transferable skills are more difficult to acquire and more valuable than the context in which they’re exercised.

The Mozart Myth

I don’t know how many times I’ve heard about how Mozart would compose entire musical scores in his head and only write them down once they were finished. Even authors who stress that creativity requires false starts and hard work have said that Mozart may have been an exception. But maybe he wasn’t.

In his new book How to Fly a Horse, Kevin Ashton says that the Mozart story above is a myth based on a forged letter. According to Ashton,

Mozart’s real letters—to his father, to his sister, and to others—reveal his true creative process. He was exceptionally talented, but he did not write by magic. He sketched his compositions, revised them, and sometimes got stuck. He could not work without a piano or harpsichord. He would set work aside and return to it later. … Masterpieces did not come to him complete in uninterrupted streams of imagination, nor without an instrument, nor did he write them whole and unchanged. The letter is not only forged, it is false.


Looking ten years ahead

From Freeman Dyson:

Economic forecasting is useful for predicting the future up to about ten years ahead. Beyond ten years the quantitative changes which the forecast assesses are usually sidetracked or made irrelevant by qualitative changes in the rules of the game. Qualitative changes are produced by human cleverness … or by human stupidity … Neither cleverness nor stupidity are predictable.

Source: Infinite in All Directions, Chapter 10, Engineers’ Dreams.


Zig Ziglar said that if you increase your confidence, you increase your competence. I think that’s generally true. Of course you could be an idiot and become a more confident idiot. In that case confidence just makes things worse [1]. But otherwise when you have more confidence, you explore more options, and in effect become more competent.

There are some things you may need to learn not for the content itself but for the confidence boost. Maybe you need to learn them so you can confidently say you didn’t need to. Also, some things you need to learn before you can see uses for them.

I’ve learned several things backward in the sense of learning the advanced material before the elementary. For example, I studied PDEs in graduate school before having mastered the typical undergraduate differential equation curriculum. That nagged at me. I kept thinking I might find some use for the undergrad tricks. When I had a chance to teach the undergrad course a couple times, I increased my confidence. I also convinced myself that I didn’t need that material after all.

My experience with statistics was similar. I was writing research articles in statistics before I learned some of the introductory material. Once again the opportunity to teach the introductory material increased my confidence. The material wasn’t particularly useful, but the experience of having taught it was.

Related post: Psychological encapsulation

[1] See Yeats’ poem The Second Coming:

The best lack all conviction, while the worst
Are full of passionate intensity.


Prevent errors or fix errors

The other day I was driving by our veterinarian’s office and saw that the marquee said something like “Prevention is less expensive than treatment.” That’s sometimes true, but certainly not always.

This evening I ran across a couple lines from Ed Catmull that are more accurate than the vet’s quote.

Do not fall for the illusion that by preventing errors, you won’t have errors to fix. The truth is, the cost of preventing errors is often far greater than the cost of fixing them.

From Creativity, Inc.

What would Donald Knuth do?

I’ve seen exhortations to think like Leonardo da Vinci or Albert Einstein, but these leave me cold. I can’t imagine thinking like either of these men. But here are a few famous people I could imagine emulating when trying to solve a problem:

What would Donald Knuth do? Do a depth-first search on all technologies that might be relevant, and write a series of large, beautiful, well-written books about it all.

What would Alexander Grothendieck do? Develop a new field of mathematics that solves the problem as a trivial special case.

What would Richard Stallman do? Create a text editor so powerful that, although it doesn’t solve your problem, it does allow you to solve your problem by writing a macro and a few lines of Lisp.

What would Larry Wall do? Bang randomly on the keyboard and save the results to a file. Then write a language in which the file is a program that solves your problem.

What would you add to the list?


“I got the easy ones wrong”

This morning my daughter told me that she did well on a spelling test, but she got the easiest words wrong. Of course that’s not exactly true. The words that are hardest for her to spell are the ones she in fact did not spell correctly. She probably meant that she missed the words she felt should have been easy. Maybe they were short words. Children can be intimidated by long words, even though long words tend to be more regular and thus easier to spell.

Our perceptions of what is easy are often upside-down. We feel that some things should be easy even though our experience tells us otherwise.

Sometimes the trickiest parts of a subject come first, but we think that because they come first they should be easy. For example, free-body diagrams come at the beginning of an introductory physics class, but they can be hard to get right. Newton didn’t always get them right. More advanced physics, say celestial mechanics, is in some ways easier, or at least less error-prone.

“Elementary” and “easy” are not the same. Sometimes they’re opposites. Getting off the ground, so to speak, may be a lot harder than flying.

Remove noise, remove signal

Whenever you remove noise, you also remove at least some signal. Ideally you can remove a large portion of the noise and a small portion of the signal, but there’s always a trade-off between the two. Averaging things makes them more average.

Statistics has the related idea of bias-variance trade-off. An unfiltered signal has low bias but high variance. Filtering reduces the variance but introduces bias.

If you have a crackly recording, you want to remove the crackling and leave the music. If you do it well, you can remove most of the crackling effect and reveal the music, but the music signal will be slightly diminished. If you filter too aggressively, you’ll get rid of more noise, but create a dull version of the music. In the extreme, you get a single hum that’s the average of the entire recording.
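
To make the trade-off concrete, here is a rough sketch in Python. It is only an illustration: a sine wave stands in for the music, Gaussian noise for the crackle, and a simple moving average for the filter. The period, noise level, and window widths are all arbitrary choices of mine. Because a moving average is linear, the signal and the noise can be filtered separately to see how much of each survives.

```python
import math
import random

def moving_average(xs, width):
    """Centered moving average; the window is truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(xs)):
        window = xs[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def rms(xs):
    """Root mean square, used here as a measure of how much is left."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

random.seed(0)  # fixed seed so the run is repeatable
n = 2000
signal = [math.sin(2 * math.pi * i / 50) for i in range(n)]  # the "music"
noise = [random.gauss(0, 0.5) for _ in range(n)]             # the "crackle"

# Wider windows mean more aggressive filtering.
results = {}
for width in (1, 5, 25, 51):
    kept_signal = rms(moving_average(signal, width)) / rms(signal)
    kept_noise = rms(moving_average(noise, width)) / rms(noise)
    results[width] = (kept_signal, kept_noise)
    print(f"width {width:2d}: signal kept {kept_signal:.2f}, noise kept {kept_noise:.2f}")
```

A narrow window removes a good share of the noise and costs very little signal. Once the window is about as wide as one period of the sine wave, almost nothing of the signal survives: the filter returns something close to the average of the recording.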

This is a metaphor for life. If you only value your own opinion, you’re an idiot in the oldest sense of the word, someone in his or her own world. Your work may have a strong signal, but it also has a lot of noise. Getting even one outside opinion greatly cuts down on the noise. But it also cuts down on the signal to some extent. If you get too many opinions, the noise may be gone and the signal with it. Trying to please too many people leads to work that is offensively bland.

Related post: The cult of average

The difference between machines and tools

From “The Inheritance of Tools” by Scott Russell Sanders:

I had botched a great many pieces of wood before I mastered the right angle with a saw, botched even more before I learned to miter a joint. The knowledge of these things resides in my hands and eyes and the webwork of muscles, not in the tools. There are machines for sale—powered miter boxes and radial arm saws, for instance—that will enable any casual soul to cut proper angles in boards. The skill is invested in the gadget instead of the person who uses it, and this is what distinguishes a machine from a tool.

Related post: Software exoskeletons

The Jericho-Masada approach to mathematics

Pierre Cartier describing Alexander Grothendieck’s approach to mathematics:

Grothendieck’s favorite method is not unlike Joshua’s method for conquering Jericho. The thing was to patiently encircle the solid walls without actually doing anything: at a certain point, the walls fall flat without a fight. This was also the method used by the Romans when they conquered the natural desert fortress Masada, the last stronghold of the Jewish revolt, after spending months patiently building a ramp. Grothendieck was convinced that if one has a sufficiently unifying vision of mathematics, if one can sufficiently penetrate the essence of mathematics and the strategies of its concepts, then particular problems are nothing but a test; they do not need to be solved for their own sake.


Slabs of time

From Some Remarks: Essays and Other Writing by Neal Stephenson:

Writing novels is hard, and requires vast, unbroken slabs of time. Four quiet hours is a resource I can put to good use. Two slabs of time, each two hours long, might add up to the same four hours, but are not nearly as productive as an unbroken four. … Likewise, several consecutive days with four-hour time-slabs in them give me a stretch of time in which I can write a decent book chapter, but the same number of hours spread out across a few weeks, with interruptions in between them, are nearly useless.

I haven’t written a novel, and probably never will, but Stephenson’s remarks describe my experience doing math and especially developing software. I can do simple, routine work in short blocks of time, but I need larger blocks of time to work on complex projects or to be more creative.

Related post: Four hours of concentration