Russian novel programming

One of the things that makes Russian novels hard to read, at least for Americans, is that characters have multiple names. For example, in The Brothers Karamazov, Alexei Fyodorovich Karamazov is also called Alyosha, Alyoshka, Alyoshenka, Alyoshechka, Alexeichik, Lyosha, and Lyoshenka.

Russian novel programming is the anti-pattern of one thing having many names. For a given program, you may have a location in version control, a location on your hard drive, a project name, a name for the program executable, etc. Each of these may contain slight differences in the same name. Or major differences. For historical reasons, the code for foo.exe is in a project named ‘bar’, under a path named …

I thought about this today when looking into a question about a program. A single number had different names in several different contexts: the text label on the desktop user interface, the name of the C# variable that captures the user input, the name of the corresponding C++ variable when the user input is passed to the back-end numeric code, and the name used in the XML file that serializes the variable when it travels between a database and a web server. Of course these should all be coordinated, but there were understandable historical reasons for how things got into this state.

Related posts:

Baklava code
Software to slice bread
Why Shakespeare is hard to read

The black T-shirt crowd

My previous post quotes Greg Jorgensen’s imaginary interview with Linus Torvalds. In the interview, Torvalds says “the black T-shirt crowd” has gotten bored with Linux because it has become too easy. Now Git gives them a new arcane product to explore and master. Jorgensen has Torvalds say

I didn’t really expect anyone to use [Git] because it’s so hard to use, but that turns out to be its big appeal. No technology can ever be too arcane or complicated for the black t-shirt crowd.

John Durden pointed out in the comments that I’m wearing a black T-shirt in the photo on my blog. Touché!

The T-shirt I was wearing at the time of the photo isn’t entirely black, though the portion showing in my mugshot is. Here’s the full photo:

I’ll admit to wearing a metaphorical black T-shirt, i.e. enjoying solving arcane problems, though not the particular problems mentioned in Jorgensen’s satire. I take no pleasure in troubleshooting operating system or version control problems, for example. I want such things to just work. But I do enjoy solving some kinds of problems that most people would rather not think about. It would be the pot calling the kettle black for me to poke too much fun at the programmers in Jorgensen’s article. Anyone with a PhD in partial differential equations should be cautious about calling someone else geeky.

(There should be a joke in there about black kettles and black T-shirts, but it’s late as I write this and I can’t pull it off.)

No technology can ever be too arcane

In this fake interview, Linux creator Linus Torvalds says Linux has gotten too easy to use and that’s why people use Git:

Git has taken over where Linux left off separating the geeks into know-nothings and know-it-alls. I didn’t really expect anyone to use it because it’s so hard to use, but that turns out to be its big appeal. No technology can ever be too arcane or complicated for the black t-shirt crowd.

Emphasis added to the last sentence.

Note: If you want to leave a comment saying Linux or Git really aren’t hard to use, lighten up. This is satire.

Related post: Advanced or just obscure?

Ramanujan's factorial approximation

Ramanujan came up with an approximation for factorial that resembles Stirling’s famous approximation but is much more accurate.

n! \sim \sqrt{\pi} \left(\frac{n}{e}\right)^n \sqrt[6]{8n^3 + 4n^2 + n + \frac{1}{30}}

As with Stirling’s approximation, the relative error in Ramanujan’s approximation decreases as n gets larger. Typically these approximations are not useful for small values of n. For n = 5, Stirling’s approximation gives 118.02 while the exact value is 120. But Ramanujan’s approximation gives 120.00015.

Here’s an implementation of the approximation in Python.

from math import sqrt, pi, e

def ramanujan(x):
    fact = sqrt(pi)*(x/e)**x
    # Horner form of the sixth root of 8x^3 + 4x^2 + x + 1/30
    fact *= (((8*x + 4)*x + 1)*x + 1/30.)**(1./6.)
    return fact
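
To check the figures quoted above, here’s a quick comparison against the exact value and against Stirling’s approximation. The stirling helper below is my own, written from the standard formula sqrt(2*pi*n)*(n/e)**n.

from math import sqrt, pi, e, factorial

def stirling(n):
    return sqrt(2*pi*n) * (n/e)**n

print(factorial(5))   # 120 (exact)
print(stirling(5))    # 118.019..., the 118.02 quoted above
print(ramanujan(5))   # 120.000147..., the 120.00015 quoted above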

For non-integer values of x, the function returns an approximation for Γ(x+1), an extension of factorial to real values. Here’s a plot of the accuracy of Ramanujan’s approximation.

[plot of the accuracy of Ramanujan's approximation]

For x = 50, Ramanujan’s approximation is good to nearly 10 significant figures, whereas Stirling’s approximation is good to about 7.
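
As a quick check of the claim about non-integer arguments, the standard library’s math.gamma gives the exact value of Γ(x+1) to compare against:

from math import gamma

print(ramanujan(4.5))  # about 52.3429
print(gamma(5.5))      # about 52.3428; ramanujan(x) approximates gamma(x + 1)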

Here’s a slightly trickier implementation.

def ramanujan2(x):
    fact = sqrt(pi)*(x/e)**x
    fact *= (((8*x + 4)*x + 1)*x + 1/30.)**(1./6.)
    if isinstance(x, int):
        # The approximation always errs high, so truncating toward zero
        # improves accuracy and gives exact values for small integers.
        fact = int(fact)
    return fact

This code gives the same value as before if x is not an integer. But it gives exact values of factorial for x = 0, 1, 2, … 10. If x is 11 or greater, the result is not exact, but the relative error is less than 10^-7 and decreases as x increases. Ramanujan’s approximation always errs on the high side, so rounding the result down improves the accuracy and makes it exact for small integer inputs.

The downside of this trick is that now, for example, ramanujan2(5) and ramanujan2(5.0) give different results. In some contexts, the improved accuracy may not be worth the inconsistency.
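
For example, using the two functions defined above:

print(ramanujan2(5))    # 120: integer input, result truncated to an exact value
print(ramanujan2(5.0))  # 120.000147...: float input, plain approximation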

Reference: On Gosper’s formula for the Gamma function

Related post: A Ramanujan series for calculating pi

Bad UI of the day

A friend of mine sent me the photo below of a Sears Craftsman reciprocating saw.

[photo of the reciprocating saw]

What do you suppose the yellow switch does? Since the positions are labeled ‘0’ and ‘1’, my first thought was that they were off and on respectively. But no!

The yellow switch controls blade operation: normal versus orbital. But which label corresponds to which operation? You can’t tell before you turn on the saw, and even while it’s running it’s not obvious. But you can tell if you watch the blade closely as it slows down after you turn the saw off.

The printed instructions that came with the saw say:

The first position is for normal blade operation, and the second position is for orbital blade operation.

But which is “first”, the position labeled ‘0’ or ‘1’? A normal human being might think that 1 and 1st go together. Being a programmer, I assumed — correctly — that 0 is the 1st position.

How many bullets does it take to cut down a tree?

According to Guesstimation 2.0, it would take about 10,000 bullets to cut down a tree with a 20 cm radius. The book doesn’t just announce the result but shows how you might come to this conclusion by computing how much energy it would take to fell the tree and how much energy bullets deliver. After going through a rough calculation, the book reports that Mythbusters actually cut down a tree, a smaller one, with about 2,000 bullets.

Guesstimation 2.0 explores about 100 questions with back-of-the-envelope solutions. (I started to add up the number of problems by looking at the table of contents, but it would be more in line with the spirit of the book to say that since it has on the order of 10 chapters and on the order of 10 problems per chapter, it has about 100 questions.)

Another question I found interesting was estimating the amount of fuel needed to transport food across the United States. If you lived in New York and all of your food came from California, it would take about 10 gallons of fuel to bring you your groceries. It may take more fuel to bring your groceries home from the grocery store than it takes to bring the food from around the country to the grocery store.

It's not the text editor, it's text

Vivek Haldar had a nice rant about editors a couple days ago. In response to complaints that some editors are ugly, he writes:

The primary factor in looking good should be the choice of a good font at a comfortable size, and a syntax coloring theme that you like. And that is not something specific to an editor. Editors like Emacs and vi have almost no UI!

To illustrate his point, here’s what my Emacs looks like without text:

There’s just not much there, not enough to say it’s pretty or ugly.

When people say that Emacs is not pretty, I think they mean that plain text is not pretty.

For better and for worse, everything in Emacs is text. The advantage of this approach is consistency. Everything uses the same commands for navigation and editing: source code, error messages, directory listings, … Everything is just text. The disadvantage is that you don’t have nicely designed special windows for each of these things, and it does get a little monotonous.

When people say they love their text editor, I think they love text. They prefer an environment that allows them to solve problems by editing text files rather than clicking buttons. And as Vivek says, that is not something specific to a particular editor.

Related posts:

Personal organization software
Complexity and unity

Math tools for the next 20 years

Igor Carron commented on his blog that

… the mathematical tools that we will use in the next 20 years are for the most part probably in our hands already.

He compares this to progress in treating leukemia: survival rates increased dramatically over the last 40 years, not primarily by developing new drugs, but by better applying drugs that were available in 1970.

I find that plausible. I’ve gotten a lot of mileage out of math that was well known 100 years ago.

Related post: Doing good work with bad tools

Cancer moon shots

M. D. Anderson Cancer Center announced a $3 billion research program today aimed at six specific forms of cancer.

  • Acute myeloid leukemia and myelodysplastic syndrome (AML and MDS)
  • Chronic lymphocytic leukemia (CLL)
  • Lung cancer
  • Melanoma
  • Prostate cancer
  • Triple negative breast and ovarian cancer

These special areas of research are being called “moon shots” by analogy with John F. Kennedy’s challenge to put a man on the moon. This isn’t a new idea. In fact, a few months after the first moon landing, there was a full-page ad in the Washington Post that began “Mr. Nixon: You can cure cancer.” The thinking was the familiar refrain “If we can put a man on the moon, we can …” President Nixon and other politicians were excited about the idea and announced a “war on cancer.” Scientists, however, were more skeptical. Sol Spiegelman said at the time

An all-out effort at this time would be like trying to land a man on the moon without knowing Newton’s laws of gravity.

The new moon shots are not a national attempt to “cure cancer” in the abstract. They are six initiatives at one institution to focus research on specific kinds of cancer. And while we do not yet know the analog of Newton’s laws for cancer, we do know far more about the basic biology of cancer than we did in the 1970’s.

There are results that suggest that there is some unity beyond the diversity of cancer, that ultimately there are a few common biological pathways involved in all cancers. Maybe some day we will be able to treat cancer in general, but for now it looks like the road forward is specialization. Perhaps specialized research programs will uncover some of these common patterns in all cancers.

Related links:

cancermoonshots.org
Ph.D. Comics on cancer
Bayesian clinical trials in one zip code

How long will there be computer science departments?

The first computer scientists resided in math departments. When universities began to form computer science departments, there was some discussion over how long computer science departments would exist. Some thought that after a few years, computer science departments would have served their purpose and computer science would be absorbed into other departments that applied it.

It looks like computer science departments are here to stay, but that doesn’t mean there are no territorial disputes. If other departments are not satisfied with the education their students are getting from the computer science department, they will start teaching their own computer science classes. This is happening now, to different extents in different places.

Some institutions have departments of bioinformatics. Will they always? Or will “bioinformatics” simply be “biology” in a few years?

Statisticians sometimes have their own departments, sometimes reside in mathematics departments, and sometimes are scattered to the four winds with de facto statisticians working in departments of education, political science, etc. It would be interesting to see which of these three options grows in the wake of “big data.” A fourth possibility is the formation of “data science” departments, essentially statistics departments with more respect for machine learning and with better marketing.

No doubt computer science, bioinformatics, and statistics will be hot areas for years to come, but the scope of academic departments by these names will change. At different institutions they may grow, shrink, or even disappear.

Academic departments argue that because their subject is important, their department is important. And any cut to their departmental budget is framed as a cut to the budget for their subject. But neither of these is necessarily true. Matt Briggs wrote about this yesterday in regard to philosophy. He argues that philosophy is important but that philosophy departments are not. He quotes Peter Kreeft:

Philosophy was not a “department” to its founders. They would have regarded the expression “philosophy department” as absurd as “love department.”

Love is important, but it doesn’t need to be a department. In fact, it’s so important that the idea of quarantining it to a department is absurd.

Computer science and statistics departments may shrink as their subjects diffuse throughout the academy. Their departments may not go away, but they may become more theoretical and more specialized. Already most statistics education takes place outside of statistics departments, and the same may be true of computer science soon if it isn’t already.

How do you justify that distribution?

Someone asked me yesterday how people justify probability distribution assumptions. Sometimes the most mystifying assumption is the first one: “Assume X is normally distributed …” Here are a few answers.

  1. Sometimes distribution assumptions are not justified.
  2. Sometimes distributions can be derived from fundamental principles. For example, there are axioms that uniquely specify a Poisson distribution.
  3. Sometimes distributions are justified on theoretical grounds. For example, large samples and the central limit theorem together may justify assuming that something is normally distributed (see the sketch after this list).
  4. Often the choice of distribution is somewhat arbitrary, chosen by intuition or for convenience, and then empirically shown to work well enough.
  5. Sometimes a distribution can be a bad fit and still work well, depending on what you’re asking of it.
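
As a toy illustration of point 3 (my own simulation, using only the standard library): averages of draws from a decidedly non-normal distribution are themselves approximately normal.

import random

# 1000 sample means, each the average of 100 draws from an exponential
# distribution, which is strongly skewed and far from normal.
means = [sum(random.expovariate(1.0) for _ in range(100)) / 100
         for _ in range(1000)]

# By the central limit theorem the means should be approximately normal,
# centered near 1 with standard deviation near 1/sqrt(100) = 0.1.
mu = sum(means) / len(means)
sd = (sum((m - mu)**2 for m in means) / len(means)) ** 0.5
print(mu, sd)  # roughly 1.0 and 0.1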

The last point is particularly interesting. It’s not hard to imagine that a poor fit would produce poor results. It’s surprising when a poor fit produces good results. Here’s an example of the latter.

Suppose you are testing a new drug and hoping that it improves how long patients live. You want to stop the clinical trial early if it looks like patients are living no longer than they would have on standard treatment. There is a Bayesian method for monitoring such experiments that assumes survival times have an exponential distribution. But survival times are not exponentially distributed, not even close.

The method works well because of the question being asked. The method is not being asked to accurately model the distribution of survival times for patients in the trial. It is only being asked to determine whether a trial should continue or stop, and it does a good job of doing so. As the simulations in this paper show, the method makes the right decision with high probability, even when the actual survival times are not exponentially distributed.
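
To make this concrete, here is a minimal sketch of an exponential-model stopping rule, using a conjugate gamma prior on the rate parameter. This is my own simplified illustration, not the method in the paper, and every number in it is made up.

from scipy.stats import gamma

def prob_worse(total_time, n_events, target_mean, a=1.0, b=1.0):
    """Posterior probability that mean survival is below target_mean,
    assuming exponential survival times and a Gamma(a, b) prior
    on the rate lambda."""
    posterior = gamma(a + n_events, scale=1.0/(b + total_time))
    # Mean survival 1/lambda < target_mean iff lambda > 1/target_mean.
    return posterior.sf(1.0/target_mean)

# Stop early if patients are probably living less than a historical
# mean survival of 12 months.
if prob_worse(total_time=80.0, n_events=10, target_mean=12.0) > 0.95:
    print("stop the trial")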

Related posts:

What distribution does my data have?
Four views of the negative binomial

Accuracy versus perceived accuracy

Commercial weather forecasters need to be accurate, but they also need to be perceived as being accurate, and sometimes the latter trumps the former.

For instance, the for-profit weather forecasters rarely predict exactly a 50% chance of rain, which might seem wishy-washy and indecisive to customers. Instead, they’ll flip a coin and round up to 60, or down to 40, even though this makes the forecasts both less accurate and less honest.
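
To put a number on “less accurate,” here is a toy calculation of the expected Brier score, a standard accuracy measure for probability forecasts. The numbers are my illustration, not from the book.

def expected_brier(true_p, reported_p):
    # E[(reported_p - outcome)^2] where outcome ~ Bernoulli(true_p)
    return true_p*(reported_p - 1)**2 + (1 - true_p)*reported_p**2

print(expected_brier(0.5, 0.5))  # 0.25: the honest 50% forecast
print(expected_brier(0.5, 0.6))  # 0.26: rounding up to 60% is strictly worse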

Forecasters also exaggerate small chances of rain, such as reporting 20% when they predict 5%.

People notice one type of mistake — the failure to predict rain — more than another kind, false alarms. If it rains when it isn’t supposed to, they curse the weatherman for ruining their picnic, whereas an unexpectedly sunny day is taken as a serendipitous bonus.

From The Signal and the Noise. The book gets some of its data from Eric Floehr of ForecastWatch. Read my interview with Eric here.

Robustness of simple rules

In his speech The dog and the frisbee, Andrew Haldane argues that simple models often outperform complex models in complex situations. He cites as examples sports prediction, diagnosing heart attacks, locating serial criminals, picking stocks, and understanding spending patterns. The gist of his argument is this:

Complex environments often instead call for simple decision rules. That is because these rules are more robust to ignorance.

And yet behind every complex set of rules is a paper showing that it outperforms simple rules, under conditions of its author’s choosing. That is, the person proposing the complex model picks the scenarios for comparison. Unfortunately, the world throws at us scenarios not of our choosing. Simpler methods may perform better when model assumptions are violated. And model assumptions are always violated, at least to some extent.
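
As a toy illustration of that last point (my own simulation, not from Haldane’s speech): when a flexible model’s assumptions are wrong, the simplest possible rule can beat it out of sample.

import numpy as np

rng = np.random.default_rng(0)

# The "world" is pure noise: y has no relationship to x at all.
x_train, y_train = rng.uniform(0, 1, 10), rng.normal(0, 1, 10)
x_test, y_test = rng.uniform(0, 1, 100), rng.normal(0, 1, 100)

simple = y_train.mean()                   # simple rule: always predict the mean
coeffs = np.polyfit(x_train, y_train, 9)  # complex rule: degree-9 polynomial
complex_pred = np.polyval(coeffs, x_test)

print(np.mean((y_test - simple) ** 2))        # close to 1
print(np.mean((y_test - complex_pred) ** 2))  # typically far larger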

Related posts:

More theoretical power, less real power
Advantages of crude models
Canonical examples from robust statistics

Working to change the world

I recently read that Google co-founder Sergey Brin asked an audience whether they are working to change the world. He said that for 99.9999% of humanity, the answer is no.

I really dislike that question. It invites arrogance. Say yes and you’re one in a million. You’re a better person than the vast majority of humanity.

Focusing on doing enormous good can make us feel justified in neglecting small acts of goodness. Many have professed a love for Humanity and shown contempt for individual humans. “I’m trying to end poverty, cure cancer, and make the world safe for democracy; I shouldn’t be held to the same petty standards as those who are wasting their lives.”

To paraphrase Thomas Sowell, we should judge people by their means, not their ends, because most people don’t achieve their ends and all we’re left with is their means [1].

In context, Brin implies that only grand technological innovation is worthwhile, obviously a rather narrow perspective. Did Anne Frank make the world a better place by keeping a diary? I think so.

The opposite of a technologist might be a medieval literature professor. If you wanted to “change the world,” the last thing you’d do might be to choose a career in medieval scholarship. And yet two of the most influential people of the 20th century — C. S. Lewis and J. R. R. Tolkien — were medieval literature professors.

It’s very hard to know what kind of impact you’re going to have in the world. The surest way to do great good is to focus first on doing good.

Related post: Here’s to the sane ones

***

[1] I think Thomas Sowell said something like this in the context of organizations rather than individuals, but I can’t find the quote.

The paper is too big

In response to the question “Why are default LaTeX margins so big?” Paul Stanley answers

It’s not that the margins are too wide. It’s that the paper is too big!

This sounds flippant, but he gives a compelling argument that paper really is too big for how it is now used.

As is surely by now well-known, the real question is the size of the text block. That is a really important factor in legibility. As others have noted, the optimum line length is broadly somewhere between 60 characters and 75 characters.

Given reasonable sizes of font which are comfortable for reading at the distance we want to read at (roughly 9 to 12 point), there are only so many line lengths that make sense. If you take a book off your shelf, especially a book that you would actually read for a prolonged period of time, and compare it to a LaTeX document in one of the standard classes, you’ll probably notice that the line length is pretty similar.

The real problem is with paper size. As it happens, we have ended up with paper sizes that were never designed or adapted for printing with 10-12 point proportionally spaced type. They were designed for handwriting (which is usually much bigger) or for typewriters. Typewriters produced 10 or 12 characters per inch: so on (say) 8.5 inch wide paper, with 1 inch margins, you had 6.5 inches of type, giving … around 65 to 78 characters: in other words something pretty close to ideal. But if you type in a standard proportionally spaced font (worse, in Times — which is rather condensed because it was designed to be used in narrow columns) at 12 point, you will get about 90 to 100 characters in the line.
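
The arithmetic in that last paragraph is easy to check. A quick sketch using the quote’s own numbers:

paper_width = 8.5                    # inches
text_width = paper_width - 2 * 1.0   # one-inch margin on each side

for cpi in (10, 12):                 # typewriter characters per inch
    print(cpi, "cpi:", text_width * cpi, "characters per line")
# 10 cpi -> 65.0 and 12 cpi -> 78.0, close to the ideal range quoted above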

He then gives six suggestions for what to do about this. You can see his answer for a full explanation. Here I’ll just summarize his points.

  1. Use smaller paper.
  2. Use long lines of text but extra space between lines.
  3. Use wide margins.
  4. Use margins for notes and illustrations.
  5. Use a two column format.
  6. Use large type.

Given these options, wide margins (as in #3 and #4) sound reasonable.