Simple legacy

Benoit Mandelbrot makes the following observation in The Fractal Geometry of Nature.

Many creative minds overrate their most baroque works, and underrate the simple ones. When history reverses such judgments, prolific writers come to be best remembered as authors of “lemmas,” of propositions they had felt “too simple” in themselves and had to be published solely as preludes to forgotten theorems.

If you’re not familiar with lemmas and theorems, think of a musician who is famous for a short prelude written as an introduction to a longer piece nobody remembers. For example, Rossini’s four-minute William Tell Overture is far more famous than the four-hour William Tell opera it introduces.

Returning to famous mathematicians, I remember as an undergraduate hearing of Schwarz’s lemma and waiting for the corresponding theorem that never came. The same applies to Poincaré’s lemma, Zorn’s lemma, and Fatou’s lemma.

We’re all naturally proud of things we work hard for. We expect other people to value our work in proportion to the amount of effort we put into it, but the world doesn’t work that way. It can be discouraging to focus on the big, complex projects we’ve worked on that haven’t been appreciated. On the other hand, it can be very encouraging to think of the potential impact of small projects and simple ideas.

Comparing Google and Yahoo automatic translation

I played around with Google’s translator a little after adding some notranslate directives as discussed in my previous post. Google did honor my requests to mark some sections as literal text to not be translated. Google’s translator was also able to recognize my name as a name without special markup. Yahoo, on the other hand, translated my name, turning “Cook” into “Cuisinier” in French.

Google treated text inside <code> tags as literals that should not be translated. That is, Google would leave my source code snippets alone and only translate the English prose surrounding the code. Yahoo, on the other hand, would translate everything, including source code. For example, I had some PowerShell code on my page with the keyword matches that Google left alone but Yahoo translated into “allumettes,” presumably good French prose but not a legal PowerShell keyword.

One puzzling thing about the Google translation engine was that it would change which text was hyperlinked. For example, the text “My résumé” was changed to “Mon CV,” with the link placed only on the translation of “my.” Yahoo produced what I expected, “Mon résumé.” There were several other instances in which Google produced odd links, such as hyperlinking the | marker between words that were linked before. For example, the footer of my website has these links:

Home | Sitemap | My blog | Search

Yahoo turned this into

Maison | Sitemap | Mon blog | Recherche

while Google produced

Accueil | Plan du site | Mon blog | Recherche

So Google incorporated the separator bars as part of words, and moved the last link from “Recherche” to the bar separating “blog” and “Recherche.”

One advantage of Google’s translation is that it lets you hover your mouse over a line of translated text and see the original text.

Giving hints to automatic translators

One problem with machine translation is that machines don’t know when to stop translating. For example, Yahoo’s Babel Fish translator translates my last name “Cook” literally to “Cocinero” in Spanish and “Cuisinier” in French.

Today Google announced a way to tell its translator that text should not be translated. Place such text inside a <span> tag with the attribute class="notranslate". I tried this on a web page that explained that a certain piece of code printed out “Hello world.” Since “Hello world” is literal output, it should be left untranslated, not turned into, for example, “Bonjour le monde.” The solution was to modify the HTML to say

The code above prints &ldquo;<span class="notranslate">Hello world</span>.&rdquo;

To prevent an entire page from being translated, add the following tag in the <head> section of the page.

<meta name="google" value="notranslate">

I suppose other machine translation efforts, such as those from Microsoft and Yahoo, will follow Google’s lead and support the class=notranslate directive.

API symmetry

Symmetric APIs are easier to use. I was reminded of this when doing some regular expression programming in Python and comparing it to Perl. Perl’s regular expression operators for search and replace are symmetric in a way that their Python counterparts are not.

Perl uses m/pattern/ for matching and s/pattern/replacement/ for substitution. Both apply to the first instance of a pattern in a string by default. The g option following a match or substitute operator causes the command to apply to all instances of the pattern. The i option after either a match or substitute command causes the pattern to apply in a case-insensitive manner. Matching and substitution are symmetric.

Python uses re.search() for matching and re.sub() for substitution. The search function can only apply to the first instance of a pattern; to match all instances of a pattern, use re.findall(). The function re.sub() applies to all instances by default, but it has a count parameter that can be set to limit the number of instances it applies to. To make a search pattern case-insensitive, pass in the re.IGNORECASE flag. To make a substitution case-insensitive, modify the regular expression itself by adding (?i); re.sub() gained a flags argument of its own only in Python 2.7 and 3.1.
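Here is a minimal sketch of the Python side of that comparison. The sample string and variable names are made up for illustration; the calls themselves are from the standard re module.

```python
import re

text = "Cat, cat, CAT"  # made-up sample string

# re.search() reports only the first match.
first_match = re.search(r"cat", text, re.IGNORECASE)
print(first_match.group())

# re.findall() is the separate function for finding every match.
all_matches = re.findall(r"cat", text, re.IGNORECASE)
print(all_matches)

# re.sub() replaces every match by default. Here case-insensitivity
# is requested inside the pattern itself with (?i).
replaced_all = re.sub(r"(?i)cat", "dog", text)
print(replaced_all)

# The count argument limits how many matches are replaced.
replaced_one = re.sub(r"(?i)cat", "dog", text, count=1)
print(replaced_one)
```

Matching defaults to the first instance and needs a different function to get them all, while substitution defaults to all instances and needs an argument to get fewer.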

In general, I find Python syntax much cleaner than Perl, but regular expressions are implemented more elegantly in Perl.

Diagram of probability distribution relationships

In 1986, Lawrence Leemis published a diagram illustrating the relationships between a couple dozen probability distributions. In 2008, he published a much larger diagram, available online.

I’ve created a diagram similar to the original Leemis diagram with 21 of the most common distributions. You can click on a distribution name to find out its parameterization, and you can click on an arrow to get the details of the relationship it represents. Here’s a small version of the diagram.

See Clickable chart of distribution relationships for the full diagram.

Watch what you name graphics files in LaTeX

A while back I was trying to include figures in a LaTeX document with names like foo_27.png and foo_32.2.png, putting a parameter value into the name of the plot. The former worked but the latter didn’t.

It turns out the \includegraphics command parses the file extension in a naive way to determine the file type. When it sees foo_27.png, it says “OK, a .png file”. But when it sees foo_32.2.png, it says “.2.png? I’ve never heard of that file type.”
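The difference is easy to model outside of LaTeX. The sketch below is not the code \includegraphics actually runs; it only assumes the behavior described above, namely that everything after the first dot is taken as the extension. Both function names are hypothetical.

```python
import os.path

def naive_extension(filename):
    """Hypothetical model: everything after the FIRST dot is the extension."""
    _, dot, rest = filename.partition(".")
    return dot + rest

def last_dot_extension(filename):
    """The usual convention: only the text after the LAST dot counts."""
    return os.path.splitext(filename)[1]

for name in ("foo_27.png", "foo_32.2.png"):
    print(name, naive_extension(name), last_dot_extension(name))
```

Under that assumption the two rules agree on foo_27.png and disagree on foo_32.2.png, which suggests keeping extra dots out of graphics file names, for example by writing the parameter value as 32_2 instead of 32.2.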

Related post: Including graphics in LaTeX documents

Unusual security behavior in TightVNC

A friend of mine told me recently about his adventures using TightVNC. When you install TightVNC it asks for two passwords.

Hmm. Wonder what that’s about.

Now it’s time to log in.

Surprise! The behavior of the software depends on what password you used to log in. Permissions are not tied to the user but to the password. If I’m John who came in with password Snoopy, I have one set of permissions, but if I’m John who came in with password Linus, I get another set of permissions.

I suppose this makes sense in isolation, but it’s completely contrary to convention. Yes, it could make sense for one person to have two sets of permissions. But this is nearly always done by having two accounts, not two passwords for the same account. Convention is to associate privilege with a user, not with how the user logged in. I see how it could be convenient to have two sets of privileges associated with one account, but there’s no indication in the login dialog that it matters which password you enter. A better solution would be to have someone log in with one password and then, if they have multiple privilege options, show radio buttons asking which set of privileges they want to exercise.

What’s good for you in red wine

I’ve heard two podcasts that contradict each other somewhat about what it is in red wine that’s good for you.

According to this Scientific American interview with Charles Bamforth, it’s really the alcohol in wine that’s good for your arteries, so you might as well drink beer; in fact, beer has some health advantages over red wine. Bamforth is the Anheuser-Busch Endowed Professor of Brewing Science at U.C. Davis. (Hmm …)

But according to the November 30, 2006 podcast from Nature, the procyanidins in red wine are particularly good for your arteries. While there may be health benefits from alcohol in general, red wine has unique benefits, especially red wines from certain regions.

Red wine contains the antioxidant resveratrol, which has received a lot of press. Another Nature podcast (November 2, 2006) reports research indicating that resveratrol has increased life spans in experiments with yeast, flies, worms, and mice, and there is reason to believe it might do the same in humans. However, the amount of resveratrol in red wine is insignificant: you’d have to drink hundreds of gallons of wine a day to get a beneficial dose of resveratrol.

Unlike resveratrol, procyanidins are present in a glass of red wine in large enough quantities to make a difference. This has been tested by extracting the procyanidins and testing them on cultured endothelial cells. Also, people live longer in the regions that produce wines especially high in procyanidins, namely Sardinia and southwestern France. Here’s the Nature article reporting the research.

It seems that it may not be the soil in certain areas but rather the winemaking traditions that increase the procyanidin levels. Vintners in these areas leave the crushed grape seeds in with the juice longer.
