Closet Bayesian

When I was a grad student, a statistics postdoc confided to me that he was a “closet Bayesian.” This sounded absolutely bizarre. Why would someone be secretive about his preferred approach to statistics? I could not imagine someone whispering that although she’s doing her thesis in algebra, she’s secretly interested in differential equations.

I knew nothing about statistics at the time and was surprised to find that there was a bitter rivalry between two schools of statistics. The rivalry is still there, though it’s not as bitter as it once was.

I find it grating when someone asks “Are you a Bayesian?” It implies an inappropriate degree of commitment and exclusivity. Bayesian statistics is just a tool. Statistics itself is just a tool, one way of understanding the world.

My car has a manual transmission. I prefer manual transmissions. But if someone asked whether I was a manual transmissionist, I’d look at them like they’re crazy. I don’t have any moral objections to automatic transmissions.

I evaluate a car by how well it works. And for most purposes, I prefer the way a manual transmission works. But when I’m teaching one of my kids to drive, we go out in my wife’s car with an automatic transmission. Similarly, I evaluate a mathematical model (statistical or otherwise) by how it works for a given purpose. Sometimes a Bayesian and a frequentist approach lead to the same conclusions, but the latter is easier to understand or implement. Sometimes a Bayesian method leads to a better result because it can use more information or is easier to interpret. Sometimes it’s a toss-up and I use a Bayesian approach because it’s more familiar, just like my old car.

Fractured work

Vivek Haldar’s recent post Quantum of Work points out something obvious in retrospect: programming is intrinsically fractured. It does little good to tell a programmer to unplug and concentrate. He or she cannot work for more than a few minutes before needing to look something up online or interact with someone.

A quantum of work is the theoretical longest amount of time you can work purely on your own without needing to break out into looking up something on the web or your mail or needing input from another person. For most modern workers this quantum of work is measured in minutes.

At least that’s the default, the path of least resistance. But it’s not the only way to work. Software developer Joey Hess describes how he works:

[My home] is nicely remote, and off the grid, relying on solar power. I only get 50 amp-hours of juice on a sunny day, and often less than 15 amp-hours on a bad day. … I seem to live half the time out of range of broadband, and still use dialup since bouncing the Internet off a satellite has too much latency, and no better total aggregate bandwidth. So I’m fully adapted to asynchronous communication.

Joey Hess cannot possibly work the way Vivek Haldar describes. It sounds like his quantum of work is measured in hours if not days. That would not be optimal or even feasible for some kinds of work, but it does suggest that we may not need to be as connected as we are. Maybe your optimal quantum of work is somewhere between the extremes discussed above.

If your quantum of work is 10 minutes, maybe you could increase that. This would require making some changes. Keeping your same way of working but trying to ration your time online would be frustrating and counterproductive. I think it’s significant that Hess says he adapted to working asynchronously. For example, I assume he keeps reference material on his local hard drive that others would access online.

Even if working offline is less efficient, it’s a good idea to be prepared to work that way when necessary. I was reminded of that this weekend. I was using some desktop software that depends on a server component. There was a failure on the vendor’s server and nobody at work to fix it, so I was stuck.

What are some ways to increase your quantum of work and to work less synchronously?

Finding 2013 in pi

My youngest daughter asked me this morning whether you can find the number 2013 in the digits of pi. I said it must be possible, then wrote the following Python code to find where 2013 first appears.

    from mpmath import mp

    mp.dps = 100000               # pi to 100,000 significant figures
    digits = str(mp.pi)[2:]       # drop the leading "3."
    print(digits.find('2013'))    # zero-based index of the first occurrence

I use the multi-precision math package mpmath to get pi to 100,000 significant figures. I convert this to a string and cut off the “3.” at the beginning to leave just the string of digits after the decimal point.

The find call returns 6275, so “2013” starts at index 6275 in the string of digits. However, we usually count decimal places starting from 1 but count positions in a string starting from 0, so 2013 starts in the 6276th decimal place of pi.

So π = 3.14159…2013… where the first “…” represents 6,270 digits not shown.
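
As a quick sanity check, reusing the digits string from the code above, we can slice out the digits around that position and confirm that “2013” shows up where expected:

    # Decimal places 6271 through 6285; "2013" should begin at place 6276
    print(digits[6270:6285])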

* * *

Now we jump off into deeper mathematical water.

For some purposes, the digits of pi are random. The digits are obviously not random — there are algorithms for calculating them — and yet they behave randomly, and random is as random does.

If the digits of pi were random, then we could almost certainly find any sequence we want if we look long enough. Can we find any finite sequence of digits in the decimal expansion of pi? I would assume so, but I don’t know whether that has been proven.
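
Searching for other targets works the same way as the code above. Here is a minimal sketch, with find_in_pi and its default precision being my own choices rather than anything from the original code, that reports the decimal place where a given digit string first appears, or -1 if it is not found among the digits computed:

    from mpmath import mp

    def find_in_pi(target, dps=100000):
        """Return the 1-based decimal place where target first appears in pi,
        searching the first dps significant figures, or -1 if it is absent."""
        mp.dps = dps
        digits = str(mp.pi)[2:]        # digits after the decimal point
        index = digits.find(target)    # zero-based index, or -1
        return index + 1 if index >= 0 else -1

    print(find_in_pi("2013"))          # expected output: 6276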

You might expect that not only can you find 2013 in pi, but that if you split the digits of pi into blocks of 4, then 2013 and every other particular block would occur with the same frequency in the limit. That is, one would expect that the expansion of pi is uniform in base 10,000.
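
Out of curiosity, one could at least look at the block counts in the digits already computed. The sketch below is my own illustration, not evidence either way: it splits the first 100,000 decimal digits into non-overlapping blocks of 4 and tallies them. Note that counting fixed blocks is not the same as the substring search above, since a pattern can straddle a block boundary.

    from collections import Counter
    from mpmath import mp

    mp.dps = 100001                    # "3." plus 100,000 decimal digits
    digits = str(mp.pi)[2:2 + 100000]

    # Non-overlapping 4-digit blocks, i.e. the expansion read in base 10,000
    blocks = [digits[i:i+4] for i in range(0, 100000, 4)]
    counts = Counter(blocks)

    print(len(blocks))                 # 25,000 blocks
    print(counts["2013"])              # occurrences of the block "2013"
    print(counts.most_common(3))       # the most common blocks so far

With only 25,000 blocks spread over 10,000 possible values, these counts say nothing about the limit; they only show how one might start looking.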

More generally, you might conjecture that pi is a normal number, i.e. that its digits are uniformly distributed in every base. This has not been proven. In fact, no one has proved that any familiar constant, such as e or √2, is normal [reference]. However, we do know that almost all numbers are normal. That is, the set of non-normal numbers has Lebesgue measure zero.