Comparing Google and Yahoo automatic translation

I played around with Google’s translator a little after adding some notranslate directives as discussed in my previous post. Google did honor my requests to mark some sections as literal text to not be translated. Google’s translator was also able to recognize my name as a name without special markup. Yahoo, on the other hand, translated my name, turning “Cook” into “Cuisinier” in French.

Google treated text inside <code> tags as literals that should not be translated. That is, Google would leave my source code snippets alone and only translate the English prose surrounding the code. Yahoo, on the other hand, would translate everything, including source code. For example, I had some PowerShell code on my page with the keyword matches that Google left alone but Yahoo translated into “allumettes,” presumably good French prose but not a legal PowerShell keyword.
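The behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Google implements it: it assumes that text inside <code> elements and inside elements carrying the class "notranslate" is protected, and the names TranslatableText and translatable_segments are made up for this example. It collects only the text a translator would be allowed to touch.

```python
from html.parser import HTMLParser

# Elements with no closing tag; they must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TranslatableText(HTMLParser):
    """Collect text a translator may touch, skipping protected regions.

    Assumption for this sketch: <code> elements and elements with the
    class "notranslate" are protected, along with everything nested in them.
    """

    def __init__(self):
        super().__init__()
        self.segments = []  # text that would be sent for translation
        self.depth = 0      # > 0 while inside a protected element

    def _is_protected(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return tag == "code" or "notranslate" in classes

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a protected element, count every nested open tag
        # so the matching close tags bring the depth back to zero.
        if self.depth or self._is_protected(tag, attrs):
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth and data.strip():
            self.segments.append(data.strip())


def translatable_segments(html):
    """Return the text segments that lie outside protected elements."""
    parser = TranslatableText()
    parser.feed(html)
    parser.close()
    return parser.segments
```

Fed a paragraph that mixes prose, a <code> snippet, and a name wrapped in a notranslate span, the function returns only the surrounding prose, so a keyword like matches or a surname like Cook would never reach the translation step.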

One puzzling thing about the Google translation engine was that it would change which text was hyperlinked. For example, the text “My résumé” was changed to “Mon CV,” with the link placed only on the translation of “my.” Yahoo produced what I expected, “Mon résumé.” There were several other instances in which Google produced odd links, such as hyperlinking the | marker between words that were linked before. For example, the footer of my website has these links:

Home | Sitemap | My blog | Search

Yahoo turned this into

Maison | Sitemap | Mon blog | Recherche

while Google produced

Accueil | Plan du site | Mon blog | Recherche

So Google incorporated the separator bars as part of words, and moved the last link from “Recherche” to the bar separating “blog” and “Recherche.”

One advantage of Google’s translation is that it lets you hover your mouse over a line of translated text and see the original text.

2 thoughts on “Comparing Google and Yahoo automatic translation”

  1. Interesting — and odd — about Google’s moving the links, especially the one where it moved the link to the punctuation. Hm,

    Google definitely does better than Y! with translating in context. For instance, if you enter the German phrase “rot, weiss, und blau,” Y! will turn that into “red, knows, and blue,” considering “weiss” to be a conjugation of the verb “wissen” (to know), rather than a colour. Google will get it right (“red, white, and blue”), and will also turn “er weiss die blau” into the somewhat nonsensical “he knows the blue”, understanding that “er weiss” now makes it the verb.

    A German friend once pointed me to an item on German eBay that had obviously been machine-translated from a badly written English original. In the payment instructions, it said “Kein Kabeljau.” I’ll leave that one as an exercise for the reader.

  2. French is a funny language, inasmuch as they jealously guard the “Frenchness” of their vocabulary and try hard to discourage loanwords, especially from English. When I was studying it, some popular irritants to the Academy were “le weekend” and “le babysitting,” for example.

    This led to a whole new French vocabulary for computer terms. While almost every language uses “byte” or something quite similar, in French it is “octet”. This is a more precise term if you really mean eight bits and no other system specific number, but it is unusual. Actually, I wonder what the French would call a byte on a system where a byte is not eight bits. Fortunately these are rare.

    Fun with automatic translators:

    When I was a child I heard a story, almost certainly apocryphal, in which the military made a computer which could do automatic translation. When demonstrated to the brass, one of them asked it to translate “out of sight, out of mind” into Chinese. The machine output some Chinese characters, but none of those present could read Chinese. So they had the machine translate the result back into English. It output “invisible idiot”.

    It is interesting and illuminating to feed the result of automatic translation back into the translator, and the results can be quite amusing. This may help highlight places where the machine has trouble with context, idiom, or usage. Of course it is no guarantee that it got the first translation correct, even if it comes back correct when re-translated.

    Also note that it is possible and not too damaging to translate a web page into its original language. I’ll leave the application of this as another exercise for the reader.

    Finally, my all-time favorite web translator, which sadly disappeared long ago, was the “T’inator” which translated (pitied) a web page into the language of Mr. T. My favorite translation was “research” into “jibba-jabba”.
