Random walk on a clock

Stand on a large clock, say on the 1. Now flip a coin and move ahead one hour if the coin turns up heads, and back one hour otherwise. Keep repeating the process until you’ve stood on all 12 numbers. How long, on average, will this random walk take? If you generalize to clocks with p positions, how does the expected time vary with p?

Here’s a little Python code to experiment with the problem.

from random import random

p = 12

def time_to_cover_circle():
    circle = [0 for i in range(p)]

    count, pos = 0, 0
    while True:
        count += 1
        pos += 1 if random() > 0.5 else -1
        pos %= p
        circle[pos] = 1
        if min(circle) == 1:
            return count

Examples bring a subject to life

Steve Awodey said of category theory

Material at this level of abstraction is simply incomprehensible without the applications and examples that bring it to life.

Michael Atiyah would say this is not unique to categories.

When someone tells me a general theorem I say that I want an example that is both simple and significant. It’s very easy to give simple examples that are not very interesting or interesting examples that are very difficult. If there isn’t a simple, interesting case, forget it. … Without knowing the examples, the whole thing is pointless. It is a mistake to focus on the techniques without constantly, not only at the beginning, knowing what the good examples are, where they came from, why they’re done. The abstractions and the examples have to go hand in hand.

More from Michael Aityah here.

Equation in stained glass

In the southwest corner of the Church of the Holy Spirit in Heidelberg there is a stained glass window containing Einstein’s equation E = mc2 near the bottom.

Most of the windows in the church are much lighter, both in color and in content. This window is not prominent; I only noticed it the second time I was in the church.

I think the text at the top of the window quotes 2 Peter 3:10 and Isaiah 54:10. The date is the day the atomic bomb was dropped on Hiroshima.

Interview with Sir Michael Atiyah

Yesterday I had the privilege of interviewing Sir Michael Atiyah. We talked about mathematical writing, the interplay of abstraction and application, and unified theories of physics. A shorter excerpt, focusing on mathematical exposition, appears on the HLF conference blog.

Update: I’ve posted the audio of the interview here.

John Cook interviewing Michael Atiyah at HLF 2013

Photo by Bernhard Kreutzer

JC: I ran across your expository papers when I was a student and very much enjoyed reading them. Two things stood out that I still remember. One was your comment about one-handed pottery, that the person creating it might be very proud of what he was able to do with only one hand, but the user of the pottery doesn’t care.

MA: That was my response to the kind of highly technical mathematics that showed you could do something under extremely minimal hypotheses, but who cares? Sometimes its useful, but often its for the benefit of the technicians and not the general user.

JC: The other thing I remember from your papers was your rule that exposition should proceed from the particular to the abstract.

MA: I have strong views on that. Too many people write papers that are very abstract and at the end they may give some examples. It should be the other way around. You should start with understanding the interesting examples and build up to explain what the general phenomena are. This was you progress from initial understanding to more understanding. This is both for expository purposes and for mathematical purposes. It should always guide you.

When someone tells me a general theorem I say that I want an example that is both simple and significant. It’s very easy to give simple examples that are not very interesting or interesting examples that are very difficult. If there isn’t a simple, interesting case, forget it.

One feature of mathematics is its abstraction. You can encompass many things under one heading. We talk about waves. For us a wave is not a water wave, it’s an oscillating function. It’s a very useful concept to extract things out.

Mathematics is built on abstractions. On the other hand, we have to have concrete realizations of them. Your brain has to operate on two levels. It has to have abstract hierarchies, but it also has to have concrete steps you can put your feet on.

Everything useful in mathematics has been devised for a purpose. Even if you don’t know it, the guy who did it first, he knew what he was doing. Banach didn’t just develop Banach spaces for the sake of it. He wanted to put many spaces under one heading. Without knowing the examples, the whole thing is pointless. It is a mistake to focus on the techniques without constantly, not only at the beginning, knowing what the good examples are, where they came from, why they’re done. The abstractions and the examples have to go hand in hand.

You can’t just do particularity. You can’t just pile one theorem on top of another and call it mathematics. You have to have some architecture, some structure. Poincaré said science is not just a collection of facts any more than a house is a collection of bricks. You have to put it together the right way. You can’t just list 101 theorems and say this is mathematics. That is not mathematics. That is categorization. Mathematics is not a dictionary.

JC: I think someone called that stamp collecting or bug collecting.

MA: Yes. Rutherford was very scathing about chemistry. He said there is Schrodinger’s equation, and everything else was cookery.

JC: I think people are more willing to give these examples verbally than in print.

MA: You’re quite right. It is kind of a reaction in recent times. People began to go beyond what they could prove and got a bit woolly and they got a bad reputation, like some of the Italian algebraic geometers. So you are taught to be rigorous, and you learn to be very, very rigorous. And then you think that your papers have to be ultra-rigorous, otherwise the guys behind you will find fault with what you’re saying. The need for rigor, it’s a bit like the people who are afraid of being sued by the lawyers. If you don’t put it down, 100%, someone will sue you! People cover themselves in advance, but the result is an unreadable paper.

But when you talk like this, you’re allowed to lower the level of rigor in order to increase the power of explanation. You can explain things using hand waving, use analogies, leaving out technical details, because you want to get the idea across. But when people write, particularly mathematics, and particularly in the last decades of the last century, people became very formalistic. Papers were rejected if they were not rigorous enough. People were reacting to the loose talk of the past. So we went to the other extreme. And most papers aren’t read. That’s a theorem! Someone said the average number of readers of a paper is one, and that’s the author.

JC: And with multiple authors, maybe it’s a fraction.

MA: That’s right. I used to tell people, make your introduction understandable to a general mathematician. Don’t get into the technicalities there. Say you’ll deal with them later. Don’t just say “Let X be a space” and jump into the details but that’s what people do. So you’re right. But people know when they give a lecture to behave differently, they can use all the tools of hand-waving, literally! [waves hands], to get idea across.

Some people write so well it’s as good as a lecture. Hermann Weyl is my great model. He used to write beautiful literature. Reading it was a joy because he put a lot of thought into it. Hermann Weyl wrote a book called The Classical Groups, very famous book, and at the same time a book appeared by Chevalley called Lie Groups. Chevalley’s book was compact, all the theorems are there, very precise, Bourbaki-style definitions. If you want to learn about it, it’s all there. Hermann Weyl’s book was the other way around. It is discursive, expansive, it’s a bigger book, it tells you much more than need to know. It’s the kind of book you read ten times. Chevalley’s book you read once. You go back to these other books. They’re full of throw-away lines, insights that are not fully relevant for going from A to B. He’s trying to give you a vision for C. Not many people write like that. He was an exception.

In the introduction he explains that he’s German, he’s writing in a foreign language, he apologizes saying he is writing in a language that was “not sung by the gods at my cradle.” You can’t get any more poetic than that. I’m a great admirer of his style, of his exposition and his mathematics. They go together.

JC: I ran across an old paper by Halmos recently where he refers to your index theorem as a pinnacle of abstraction, and I thought about the comparison with Shinichi Mochizuki’s claimed proof of the abc conjecture. The index theorem rests on a large number of abstractions, yet these are widely shared abstractions. Mochizuki’s proof also sits on top of many abstractions, but nobody understands it.

MA: These problems have a long history. The past is part of your common heritage. They generalize classical theorems. They go step by step through abstractions. Everybody knows what they’re for. So there’s a big pyramid. The base is very firm. The purpose of abstraction is that you can go back down to earth and apply it here or apply it there. You don’t just disappear in the clouds. You walk along the ridge and go down to this valley or that.

JC: Maybe the proof of the abc conjecture is more of a flag pole than a pyramid.

MA: And when you get to the top of a flag pole, what do you do? You could be like the hermit saints who would sit on a pillar. …

JC: I saw a video recently of a talk you gave a couple years ago in which you described yourself as a “mystic,” someone hoping for a unification of several diverse theories of physics.

MA: That’s right. I was saying that there are these different schools of thought, each with a prophet leading them. I wanted to say that I didn’t follow any one of them, that you can search for the truth through many different ways. The mystic thinks there’s a little bit of truth in everybody’s line. I think every interesting line has something to be said for it.

There are the string theorists, that’s the biggest school, and people like Roger Penrose, Alain Connes, … They all have strong ideas, and they attract a certain group of followers. These are serious people, there’s serious reason to believe what they’re doing, and there’s something in it. Bridges have been built between these theories. They’ve found unexpected links between them. So that’s already a bit of convergence. So I’m hoping there will be more convergence, using the best of each theory.

It would be sad if we’ve seen the last of the visions of the prophets. It would be sad if new fundamental insights were no more. There’s no reason it should stop. At the end of the 19th century people thought physics was finished. We know absolutely everything, but at the beginning of the 20th century it exploded. There’s no reason not to think there will be further revolutionary ideas emerging. Any day, any time, anywhere.

I’m an optimist. I believe in new ideas, in progress. It’s faith. I’ve recently been thinking about faith. If you’re a religious person, which I’m not, you believe God created the universe. That’s why it works, and you’re trying to understand God’s works. There are many scientists who work in that framework. Scientists, outside of religion, have their own faith. They believe the universe is rational. They’re trying to find the laws of nature. But why are there laws? That’s the article of faith for scientists. It’s not rational. It’s useful. It’s practical. There’s evidence in its favor: the sun does rise every day. But nevertheless, at the end of the day, it’s an article of faith.

I gave a talk about Pythagoras. Bertrand Russell said that Pythagoras was probably the most influential philosopher who ever existed, because he was the first person to emphasize two things, the role of numbers in general, and the role of numbers in music in particular. He was so struck by this that he believed numbers were the base of the universe. He was the first person to articulate the view that the universe is rational, built on rational principles.

JC: What do you think of category theory? Do you think it plays a role in this grand unification?

MA: Well, I’m a conservative in many ways. As you get older you get more conservative. And I don’t believe in things unless I’m convinced of their utility. Ordinary categories are a useful language. But it’s become now more formal. There are things called 2-categories, 3-categories, building categories on categories. I’m now partially converted. I think it may be necessary. Algebra was an abstraction when it first came along. Then you get into categories and higher levels of abstraction. So I’m broad minded about it, but I’d like to be persuaded.

In Heidelberg

I’m in Heidelberg this week at the Heidelberg Laureate Forum and having a great time. I’m writing for the conference blog and may not blog much here. We’ll see. Maybe later in the week I’ll have more time.

The network connection in my hotel is down most of the time, so I’m writing this quickly while I have a window of connectivity.

I’ve posted a few photos on Google+.

Posts for the conference so far:

Who was Nevanlinna and what is his prize?
Saxophone quartets and probability

I interviewed Sir Michael Atiyah this morning and so that will be posted sometime.

Least understood bit of basic math

Logarithms may be the least understood topic in basic math. In my experience, if an otherwise math-savvy person is missing something elementary, it’s usually logarithms.

For example, I have had conversations with people with advanced technical degrees where I’ve had to explain that logs in all bases are proportional to each other. For example, if one thing is proportional to the natural log of another, the former is also proportional to the log base 10 or log base anything else of the latter [1].

I’ve also noticed that quite often there’s a question on the front page of math.stackexchange of the form “How do I solve …” and the solution is invariably “take logarithms of both sides.” This seems to be a secret technique.

I suspect that more people understood logarithms when they had to use slide rules. A slide rule is essentially two sticks with log-scale markings. By moving one relative to the other, you’re adding lengths, which means adding logs, which does multiplication. If you do that for a while, it seems you’d have to get a feel for logs.

Log tables also make logs more tangible. At first it seems there’s no skill required to use a  table, but you often have to exercise a little bit of understanding. Because of the limitations of space, tables can’t be big enough to let you directly look up everything. You have to learn how to handle orders of magnitude and how to interpolate.

If the first time you see logs is when it’s time to learn to differentiate them, you have to learn two things at once. And that’s too much for many students. They make mistakes, such as assuming logs are linear functions, that they would not make if they had an intuitive feel for what they’re working with.

Maybe schools could have a retro math week each year when students can’t use calculators and have to use log tables and slide rules. I don’t think it would do as much good to just make tables or slide rules a topic in the curriculum. It’s when you have to use these things to accomplish something else, when they are not simply an isolated forgettable topic of their own, that the ideas sink in.

More on logarithms

[1] That is, loga(x) = loga(b) logb(x). This says loga(b) is the proportionality constant for converting between logs in base a and b. To prove the equation, raise a to the power of both sides.

To memorize this equation, notice the up-and-down pattern of the bases and arguments: a up to x = a up to b down to b up to x. The right side squeezes an up and down b in between a and x.

Is information theory misnamed

Interesting perspective on information theory:

To me, the subject of “information theory” is badly named. That discipline is devoted to finding ideal compression schemes for messages to be sent quickly and accurately across a noisy channel. It deliberately does not pay any attention to what the messages mean. To my mind this should be called compression theory or redundancy theory. Information is inherently meaningful—that is its purpose—any theory that is unconcerned with the meaning is not really studying information per se. The people who decide on speed limits for roads and highways may care about human health, but a study limited to deciding ideal speed limits should not be called “human health theory”.

Despite what was said above, Information theory has been extremely important in a diverse array of fields, including computer science but also in neuroscience and physics. I’m not trying to denigrate the field; I am only frustrated with its name.

From David Spivak, footnotes 13 and 14 here.

To err is human, to catch an error shows expertise

Experts are OK at avoiding mistakes, but they’re even better at recognizing and fixing mistakes.

If you ask an elementary student something like “How can we know that the answer to this problem cannot be 8769?” they might only be able to say “Because the correct answer is 8760.” That is, the only tool they have for checking an result is to compare it to a known quantity. A more sophisticated student might be able to say something like “We know the result must be even” or “The result cannot be more than 8764” for  reasons that come from context.

Experienced programmers write fewer bugs than rookies. But more importantly, the bugs they do write don’t live as long. The experienced programmer may realize almost immediately that a piece of code cannot be right, while the rookie may not realize there’s a problem until the output is obviously wrong. The more experienced programmer might notice that the vertical alignment of the code looks wrong, or that incompatible pieces are being used together, or even that the code “smells wrong” for reasons he or she can’t articulate.

An engineer might know that an answer is wrong because it has the wrong magnitude or the wrong dimensions. Or maybe a result violates a conservation law. Or maybe the engineer thinks “That’s not the way things are done, and I bet there’s a good reason why.”

“Be more careful” only goes so far. Experts are not that much more careful than novices. We can only lower our rate of mistakes so much. There’s much more potential for being able to recognize mistakes than to prevent them.

A major step in maturing as a programmer is accepting the fact that you’re going to make mistakes fairly often. Maybe you’ll introduce a bug for every 10 lines of code, at least one for every 100 lines. (Rookies write more bugs than this but think they write fewer.) Once you accept this, you begin to ask how you can write code to make bugs stand out. Given two approaches, which is more likely to fail visibly if is it’s wrong? How can I write code so that logic errors are more likely to show up as compile errors? How am I doing to debug this when it breaks?

Theory pays off in the long run. Abstractions that students dismiss as impractical probably are impractical at first. But in the long run, these abstractions may prevent or catch errors. I’ve come to see the practicality in many things that I used to dismiss as pedantic: dimensional analysis, tensor properties, strongly typed programming languages, category theory, etc. A few weeks into a chemistry class I learned the value of dimensional analysis. It has taken me much longer to appreciate category theory.

Related posts

The difference between machines and tools

From “The Inheritance of Tools” by Scott Russell Sanders:

I had botched a great many pieces of wood before I mastered the right angle with a saw, botched even more before I learned to miter a joint. The knowledge of these things resides in my hands and eyes and the webwork of muscles, not in the tools. There are machines for sale—powered miter boxes and radial arm saws, for instance—that will enable any casual soul to cut proper angles in boards. The skill is invested in the gadget instead of the person who uses it, and this is what distinguishes a machine from a tool.

Related post: Software exoskeletons