Structure in jazz and math

Last night I went to a concert by the Branford Marsalis Quartet. One of the things that impressed me about the quartet was how creative they are while also being squarely within a tradition. People who are not familiar with jazz may not realize how structured it is and how much it respects tradition. The spontaneous and creative aspects of jazz are more obvious than the structure. In some ways jazz is more tightly structured than classical music. To use Francis Schaeffer’s phrase, there is form and freedom, freedom within form.

Every field has its own structure, its tropes, its traditions. Someone unfamiliar with the field can be overwhelmed, not having the framework that an insider has to understand things. They may think something is completely original when in fact the original portion is small.

In college I used to browse the journals in the math library and be completely overwhelmed. I didn’t learn until later that usually very little in a journal article is original, and even the original part isn’t that original. There’s a typical structure for a paper in PDEs, for example, just as there are typical structures for romantic comedies, symphonies, or space operas. A paper in partial differential equations might look like this:

  1. Motivation / previous work
  2. Weak formulation of PDE
  3. Craft function spaces and PDE as operator
  4. A priori estimates imply operator properties
  5. Well posedness results
  6. Regularity

An expert knows these structures. They know what’s boilerplate, what’s new, and just how new the new part is. When I wrote something up for my PhD advisor I remember him saying “You know what I find most interesting?” and pointing to one inequality. The part he found interesting, the only part he found interesting, was not that special from my perspective. It was all hard work for me, but only one part of it stood out as slightly original to him. An expert in partial differential equations sees a PDE paper the way a professional musician listens to another or the way a chess master sees a chess board.

While a math journal article may look totally incomprehensible, an expert in that specialization might see 10% of it as somewhat new. An interesting contrast to this is the “abc conjecture.” Three and a half years ago Shinichi Mochizuki proposed a proof of this conjecture. But his approach is so entirely idiosyncratic that nobody has been able to understand it. Even after a recent conference held for the sole purpose of penetrating this proof, nobody but Mochizuki really understands it. So even though most original research is not that original, once in a while something really new comes out.

Retooling

I was listening to a classical music station yesterday, and I heard the story of a professional pianist whose hand was injured in an accident. He then started learning trumpet and two years later he was a professional trumpeter. I didn’t catch the musician’s name.

I was not surprised that a professional in one instrument could become a professional in another, but I was surprised that he did it in only two years. It probably helped that he was no longer able to play piano; I imagine if he wanted to learn trumpet in addition to piano he would not have become so proficient so quickly.

If you go by the rule of thumb that it takes about 10 years to master anything, this professional pianist was 80% of the way to becoming a professional trumpeter before he touched a trumpet. Or to put it another way, 80% of being a professional musician who plays trumpet is becoming a professional musician.

Transferable skills are more difficult to acquire and more valuable than the context in which they’re exercised.

The Mozart Myth

I don’t know how many times I’ve heard about how Mozart would compose entire musical scores in his head and only write them down once they were finished. Even authors who stress that creativity requires false starts and hard work have said that Mozart may have been an exception. But maybe he wasn’t.

In his new book How to Fly a Horse, Kevin Ashton says that the Mozart story above is a myth based on a forged letter. According to Ashton,

Mozart’s real letters—to his father, to his sister, and to others—reveal his true creative process. He was exceptionally talented, but he did not write by magic. He sketched his compositions, revised them, and sometimes got stuck. He could not work without a piano or harpsichord. He would set work aside and return to it later. … Masterpieces did not come to him complete in uninterrupted streams of imagination, nor without an instrument, nor did he write them whole and unchanged. The letter is not only forged, it is false.

Looking ten years ahead

From Freeman Dyson:

Economic forecasting is useful for predicting the future up to about ten years ahead. Beyond ten years the quantitative changes which the forecast assesses are usually sidetracked or made irrelevant by qualitative changes in the rules of the game. Qualitative changes are produced by human cleverness … or by human stupidity … Neither cleverness nor stupidity are predictable.

Source: Infinite in All Directions, Chapter 10, Engineers’ Dreams.

Confidence

Zig Ziglar said that if you increase your confidence, you increase your competence. I think that’s generally true. Of course you could be an idiot and become a more confident idiot. In that case confidence just makes things worse [1]. But otherwise when you have more confidence, you explore more options, and in effect become more competent.

There are some things you may need to learn not for the content itself but for the confidence boost. Maybe you need to learn them so you can confidently say you didn’t need to. Also, some things you need to learn before you can see uses for them. (More on that theme here.)

I’ve learned several things backward in the sense of learning the advanced material before the elementary. For example, I studied PDEs in graduate school before having mastered the typical undergraduate differential equation curriculum. That nagged at me. I kept thinking I might find some use for the undergrad tricks. When I had a chance to teach the undergrad course a couple times, I increased my confidence. I also convinced myself that I didn’t need that material after all.

My experience with statistics was similar. I was writing research articles in statistics before I learned some of the introductory material. Once again the opportunity to teach the introductory material increased my confidence. The material wasn’t particularly useful, but the experience of having taught it was.

Related post: Psychological encapsulation


[1] See Yeats’ poem The Second Coming:

The best lack all conviction, while the worst
Are full of passionate intensity.

 

Prevent errors or fix errors

The other day I was driving by our veterinarian’s office and saw that the marquee said something like “Prevention is less expensive than treatment.” That’s sometimes true, but certainly not always.

This evening I ran across a couple lines from Ed Catmull that are more accurate than the vet’s quote.

Do not fall for the illusion that by preventing errors, you won’t have errors to fix. The truth is, the cost of preventing errors is often far greater than the cost of fixing them.

From Creativity, Inc.

What would Donald Knuth do?

I’ve seen exhortations to think like Leonardo da Vinci or Albert Einstein, but these leave me cold. I can’t imagine thinking like either of these men. But here are a few famous people I could imagine emulating when trying to solve a problem:

What would Donald Knuth do? Do a depth-first search on all technologies that might be relevant, and write a series of large, beautiful, well-written books about it all.

What would Alexander Grothendieck do? Develop a new field of mathematics that solves the problem as a trivial special case.

What would Richard Stallman do? Create a text editor so powerful that, although it doesn’t solve your problem, it does allow you to solve your problem by writing a macro and a few lines of Lisp.

What would Larry Wall do? Bang randomly on the keyboard and save the results to a file. Then write a language in which the file is a program that solves your problem.

What would you add to the list?

 

“I got the easy ones wrong”

This morning my daughter told me that she did well on a spelling test, but she got the easiest words wrong. Of course that’s not exactly true. The words that are hardest for her to spell are the ones she in fact did not spell correctly. She probably meant that she missed the words she felt should have been easy. Maybe they were short words. Children can be intimidated by long words, even though long words tend to be more regular and thus easier to spell.

Our perceptions of what is easy are often upside-down. We feel that some things should be easy even though our experience tells us otherwise.

Sometimes the trickiest parts of a subject come first, but we think that because they come first they should be easy. For example, free-body diagrams come at the beginning of an introductory physics class, but they can be hard to get right. Newton didn’t always get them right. More advanced physics, say celestial mechanics, is in some ways easier, or at least less error-prone.

“Elementary” and “easy” are not the same. Sometimes they’re opposites. Getting off the ground, so to speak, may be a lot harder than flying.

Remove noise, remove signal

Whenever you remove noise, you also remove at least some signal. Ideally you can remove a large portion of the noise and a small portion of the signal, but there’s always a trade-off between the two. Averaging things makes them more average.

Statistics has the related idea of bias-variance trade-off. An unfiltered signal has low bias but high variance. Filtering reduces the variance but introduces bias.

If you have a crackly recording, you want to remove the crackling and leave the music. If you do it well, you can remove most of the crackling effect and reveal the music, but the music signal will be slightly diminished. If you filter too aggressively, you’ll get rid of more noise, but create a dull version of the music. In the extreme, you get a single hum that’s the average of the entire recording.
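To make the trade-off concrete, here is a minimal Python sketch. It uses a centered moving average as a stand-in filter (an assumption for illustration; real audio restoration uses more sophisticated filters), a sine wave as the "music," and Gaussian noise as the "crackle." The names moving_average and mean_square are made up for this example. Because averaging is linear, the filter can be applied to the signal and to the noise separately to see how much of each is lost.

```python
import math
import random

def moving_average(xs, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(xs)):
        lo = max(0, i - half)
        hi = min(len(xs), i + half + 1)
        out.append(sum(xs[lo:hi]) / (hi - lo))
    return out

def mean_square(xs):
    return sum(x * x for x in xs) / len(xs)

random.seed(0)
n = 2000
period = 80  # samples per cycle of the "music" (illustrative value)
signal = [math.sin(2 * math.pi * i / period) for i in range(n)]
noise = [random.gauss(0, 0.5) for _ in range(n)]

results = {}
for window in (1, 5, 21, 81):
    smoothed_signal = moving_average(signal, window)
    smoothed_noise = moving_average(noise, window)
    # Bias: how far the filter pulls the clean signal away from itself.
    bias = mean_square([s - t for s, t in zip(smoothed_signal, signal)])
    # Variance: how much noise power is left after filtering.
    variance = mean_square(smoothed_noise)
    results[window] = (bias, variance)
    print(f"window {window:3d}: signal distortion {bias:.5f}, "
          f"remaining noise {variance:.5f}")

# Extreme case: a window wider than the whole recording.
noisy = [s + e for s, e in zip(signal, noise)]
hum = moving_average(noisy, 2 * n)
print("overall average:", hum[0])
```

Widening the window shrinks the leftover noise but increasingly distorts the sine wave. A window wider than the whole recording returns the same average value at every sample, which is the single hum described above.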

This is a metaphor for life. If you only value your own opinion, you’re an idiot in the oldest sense of the word, someone in his or her own world. Your work may have a strong signal, but it also has a lot of noise. Getting even one outside opinion greatly cuts down on the noise. But it also cuts down on the signal to some extent. If you get too many opinions, the noise may be gone and the signal with it. Trying to please too many people leads to work that is offensively bland.

Related post: The cult of average

The difference between machines and tools

From “The Inheritance of Tools” by Scott Russell Sanders:

I had botched a great many pieces of wood before I mastered the right angle with a saw, botched even more before I learned to miter a joint. The knowledge of these things resides in my hands and eyes and the webwork of muscles, not in the tools. There are machines for sale—powered miter boxes and radial arm saws, for instance—that will enable any casual soul to cut proper angles in boards. The skill is invested in the gadget instead of the person who uses it, and this is what distinguishes a machine from a tool.

Related post: Software exoskeletons

The Jericho-Masada approach to mathematics

Pierre Cartier describing Alexander Grothendieck’s approach to mathematics:

Grothendieck’s favorite method is not unlike Joshua’s method for conquering Jericho. The thing was to patiently encircle the solid walls without actually doing anything: at a certain point, the walls fall flat without a fight. This was also the method used by the Romans when they conquered the natural desert fortress Masada, the last stronghold of the Jewish revolt, after spending months patiently building a ramp. Grothendieck was convinced that if one has a sufficiently unifying vision of mathematics, if one can sufficiently penetrate the essence of mathematics and the strategies of its concepts, then particular problems are nothing but a test; they do not need to be solved for their own sake.

Source

Slabs of time

From Some Remarks: Essays and Other Writing by Neal Stephenson:

Writing novels is hard, and requires vast, unbroken slabs of time. Four quiet hours is a resource I can put to good use. Two slabs of time, each two hours long, might add up to the same four hours, but are not nearly as productive as an unbroken four. … Likewise, several consecutive days with four-hour time-slabs in them give me a stretch of time in which I can write a decent book chapter, but the same number of hours spread out across a few weeks, with interruptions in between them, are nearly useless.

I haven’t written a novel, and probably never will, but Stephenson’s remarks describe my experience doing math and especially developing software. I can do simple, routine work in short blocks of time, but I need larger blocks of time to work on complex projects or to be more creative.

Related post: Four hours of concentration

Efficiency could land you in jail

A German postman recently faced criminal charges for devising more efficient routes to deliver the mail. His supervisor had informally tolerated his initiative, but could not officially sanction it since it violated procedure. He got into trouble when his suspicious peers reported him. Fortunately he was not fired, only reprimanded for not following rules.

The source I saw (thanks Tim) doesn’t give much more detail. Maybe the charges against him were not as ridiculous as they seem. Maybe he violated reasonable safety regulations, for example. But I find it quite plausible that he simply got into trouble for using his brain. Even if the incident were completely made up, it would make a good story. It’s symbolic of bureaucratic punishment of efficiency. It’s easy to find analogous examples.

If this mailman were working for a small courier company, the company might reward him and ask him for recommendations for improving other routes. Of course a small company might also fire him. But large organizations, public and private, are more likely to punish initiative. And I understand why: large organizations have to maintain consistency. The clever postman must be reprimanded for the good of the system, but it’s maddening when you’re the postman.

Beethoven, Beatles, and Beyoncé: more on the Lindy effect

This post is a set of footnotes to my previous post on the Lindy effect. This effect says that creative artifacts have lifetimes that follow a power law distribution, and hence the things that have been around the longest have the longest expected future.

Works of art

The previous post looked at technologies, but the Lindy effect would apply, for example, to books, music, or movies. This suggests the future will be something like a mirror of the present. People have listened to Beethoven for two centuries, the Beatles for about four decades, and Beyoncé for about a decade. So we might expect Beyoncé to fade into obscurity a decade from now, the Beatles four decades from now, and Beethoven a couple centuries from now.

Disclaimer

Lindy effect estimates are crude, only considering current survival time and no other information. And they’re probability statements. They shouldn’t be taken too seriously, but they’re still interesting.

Programming languages

Yesterday was the 25th birthday of the Perl programming language. The Go language was announced three years ago. The Lindy effect suggests there’s a good chance Perl will be around in 2037 and that Go will not. This goes against your intuition if you compare languages to mechanical or living things. If you look at a 25 year-old car and a 3 year-old car, you expect the latter to be around longer. The same is true for a 25 year-old accountant and a 3 year-old toddler.
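As a rough illustration, suppose language lifetimes follow the Pareto model described in the mathematical details at the end of this post, whose survival function is P(X > t) = (a/t)^c. Then the probability of lasting to age t, given survival to age b, is (b/t)^c, which depends only on the ratio of ages. The exponent values below are assumptions chosen for illustration, not estimates from data, and survival_probability is a hypothetical helper name.

```python
def survival_probability(age, extra_years, c):
    """P(lifetime > age + extra_years | lifetime > age) for a Pareto lifetime
    with exponent c. The starting time cancels out of the ratio."""
    return (age / (age + extra_years)) ** c

# Perl was 25 and Go was 3 when this was written; look 25 years ahead.
for c in (1.0, 1.5):  # assumed exponents, for illustration only
    perl = survival_probability(25, 25, c)
    go = survival_probability(3, 25, c)
    print(f"c = {c}: Perl {perl:.2f}, Go {go:.2f}")
```

With c = 1, Perl's chance of lasting another 25 years is one half, while Go's chance of reaching the same date is 3/28, about 11%.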

Life expectancy

Someone commented on the original post that for a British female, life expectancy is 81 years at birth, 82 years at age 20, and 85 years at age 65. Your life expectancy goes up as you age. But your expected additional years of life do not; they go down. By contrast, imagine a pop song that has a life expectancy of 1 year when it comes out. If it’s still popular a year later, we could expect it to be popular for another couple years. And if people are still listening to it 30 years after it came out, we might expect it to have another 30 years of popularity.

Mathematical details

In my original post I looked at a simplified version of the Pareto density:

f(t) = c / t^{c+1}

starting at t = 1. The more general Pareto density is

f(t) = c a^c / t^{c+1}

and starts at t = a. The Pareto family has a useful property: if a random variable X has a Pareto distribution with exponent c and starting time a, then the conditional distribution of X given that X is at least b (where b ≥ a) is another Pareto distribution, with the same exponent but starting time b. For c > 1, the expected value of X a priori is a c/(c-1), but conditional on having survived to time b, the expected value is b c/(c-1). That is, the expected value has gone up in proportion to the ratio of starting times, b/a.
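A quick simulation can check the conditional-distribution claim above. The sketch below draws Pareto samples with Python's random.paretovariate, which returns values starting at 1, and scales them by a. The values a = 1, b = 2, c = 3 and the sample size are arbitrary choices for illustration; c = 3 is used so that the variance is finite and the sample means settle down reasonably quickly.

```python
import random

random.seed(42)
a, b, c = 1.0, 2.0, 3.0   # illustrative parameters, not from the post
n = 200_000

# paretovariate(c) has minimum 1, so scaling by a gives starting time a.
samples = [a * random.paretovariate(c) for _ in range(n)]
survivors = [x for x in samples if x >= b]

prior_mean = sum(samples) / len(samples)
conditional_mean = sum(survivors) / len(survivors)
print(f"prior mean       {prior_mean:.3f}  (theory {a * c / (c - 1):.3f})")
print(f"conditional mean {conditional_mean:.3f}  (theory {b * c / (c - 1):.3f})")

# Same exponent: the fraction that at least doubles its starting time
# should be 2**(-c) both before and after conditioning.
frac_all = sum(x >= 2 * a for x in samples) / len(samples)
frac_surv = sum(x >= 2 * b for x in survivors) / len(survivors)
print(f"fraction doubling: all {frac_all:.3f}, survivors {frac_surv:.3f}, "
      f"theory {2 ** -c:.3f}")
```

The two fractions agreeing is the sample-level version of "same exponent, new starting time": survivors past time b look like a fresh Pareto population that starts at b.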