# Unicode to LaTeX

I’ve run across a couple web sites that let you enter a LaTeX symbol and get back its Unicode value. But I didn’t find a site that does the reverse, going from Unicode to LaTeX, so I wrote my own.

Unicode / LaTeX Conversion

If you enter Unicode, it will return LaTeX. If you enter LaTeX, it will return Unicode. It interprets a string starting with “U+” as a Unicode code point, and a string starting with a backslash as a LaTeX command.

For example, the screenshot above shows what happens if you enter U+221E and click “convert.” You could also enter `\infty` and get back U+221E.
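The prefix rule is easy to sketch in Python. This is only an illustration, not the code behind the converter: the two-entry table, the `convert` function name, and the error strings are all assumptions made for the example.

```python
# Hypothetical two-entry table; a real tool would load a much larger mapping.
LATEX_BY_CODEPOINT = {
    0x221E: r"\infty",
    0x03A9: r"\Omega",
}
# Reverse lookup built from the same table.
CODEPOINT_BY_LATEX = {cmd: cp for cp, cmd in LATEX_BY_CODEPOINT.items()}


def convert(text: str) -> str:
    """Dispatch on the prefix: 'U+' means Unicode in, a backslash means LaTeX in."""
    text = text.strip()
    if text.upper().startswith("U+"):
        try:
            codepoint = int(text[2:], 16)  # hex digits after the "U+" prefix
        except ValueError:
            return "invalid code point"
        return LATEX_BY_CODEPOINT.get(codepoint, "unknown code point")
    if text.startswith("\\"):
        codepoint = CODEPOINT_BY_LATEX.get(text)
        if codepoint is None:
            return "unknown command"
        return f"U+{codepoint:04X}"
    return "input must start with 'U+' or a backslash"


print(convert("U+221E"))   # \infty
print(convert(r"\infty"))  # U+221E
```

Anything that starts with neither prefix is rejected here; that is a choice made for the sketch, and a real tool might be more forgiving.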

However, if you go from Unicode to LaTeX to Unicode, you won’t always end up where you started. There may be multiple Unicode values that map to a single LaTeX symbol. This is because Unicode is semantic and LaTeX is not. For example, Unicode distinguishes between the Greek letter Ω and the symbol Ω for ohms, the unit of electrical resistance, but LaTeX does not.
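Here is a small Python sketch of why the round trip can fail. The table is illustrative, and the rule that the first entry wins on the way back is an assumption of the example, not necessarily how any particular converter picks a code point.

```python
import unicodedata

# Illustrative table: both omegas map to the same LaTeX command.
LATEX_BY_CHAR = {
    "\u03a9": r"\Omega",  # GREEK CAPITAL LETTER OMEGA
    "\u2126": r"\Omega",  # OHM SIGN
}

# Inverting a many-to-one table can keep only one code point per command.
CHAR_BY_LATEX = {}
for char, command in LATEX_BY_CHAR.items():
    CHAR_BY_LATEX.setdefault(command, char)  # assumption: first entry wins


def round_trip(char: str) -> str:
    """Go Unicode -> LaTeX -> Unicode using the tables above."""
    return CHAR_BY_LATEX[LATEX_BY_CHAR[char]]


for char in LATEX_BY_CHAR:
    back = round_trip(char)
    print(f"U+{ord(char):04X} {unicodedata.name(char)} -> "
          f"{LATEX_BY_CHAR[char]} -> U+{ord(back):04X}")
```

With this table the ohm sign U+2126 comes back as U+03A9, the Greek letter, because the LaTeX command in the middle carries no record of which one you started from.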

## 14 thoughts on “Unicode to LaTeX”

1. Can’t you just use XeTeX or LuaTeX? If you use a math-enabled font, you should be able to get the same result, and your source will be more readable if you use a Unicode text editor.

2. Kyle: XeTeX and LuaTeX don’t run everywhere. Sometimes you need to use plain LaTeX.

Also, LaTeX commands are more memorable than Unicode code points. So when I’m writing LaTeX, I’d rather enter LaTeX commands than Unicode values.

3. It could also usefully allow you to type (and, particularly, paste) the actual Unicode character as well.

4. Hey, do you mind if I steal that data.js file?
I’d like to start adding stuff like this to a unicode lookup tool of mine…

6. Actually you are supposed to differentiate between ohm and omega, and there are a number of packages to do that, such as siunitx. For example, if you use `\Omega` for ohms it will be wrong, as you will get an italic omega instead of an upright omega.

7. I have to say that I’m not convinced about this omega/ohm thing. Do we use a different “m” as the symbol for metres? Nope, so I don’t see why we need to be prejudiced against Greek letters, either.

8. LaTeX was designed to put ink on paper. So if two concepts produce the same pattern on paper, there’s no need to distinguish them.

Unicode is meant to be more semantic, and so it makes a distinction between the capital ‘A’ of the English alphabet and the capital ‘A’ of the Greek alphabet. That distinction is useful in software. Maybe the software treats Greek text differently than English text. Maybe you want to search for a capital alpha in an English document containing countless English A’s.

Now LaTeX is being used to create online documents, usually PDFs. It would be nice if it made more semantic distinctions, but it wasn’t designed for that.
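To make the search example concrete, here is a quick Python sketch. The sample string is made up; the point is only that the two letters look alike on screen but are different code points, so software can tell them apart.

```python
import unicodedata

LATIN_A = "\u0041"      # LATIN CAPITAL LETTER A
GREEK_ALPHA = "\u0391"  # GREEK CAPITAL LETTER ALPHA

# Made-up sample: five Latin A's with one Greek capital alpha hidden among them.
text = "AAA" + GREEK_ALPHA + "AA"

print(unicodedata.name(LATIN_A))
print(unicodedata.name(GREEK_ALPHA))
print(LATIN_A == GREEK_ALPHA)   # False: same glyph shape, different characters
print(text.count(LATIN_A))      # counts only the Latin letters
print(text.find(GREEK_ALPHA))   # index of the lone Greek alpha
```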

9. John: Semantic LaTeX is a big thing these days. It had a lot of features supporting that back in the day, before anything else did (`\emph` instead of `\textit` when appropriate). My code is often criticized as being too low level and not using enough semantic macros.

10. “For example, Unicode distinguishes between the Greek letter Ω and the symbol Ω for ohms, the unit of electrical resistance, but LaTeX does not.”

Yes, it does. That’s actually the point of LaTeX in comparison to plain TeX (and one of the reasons for the lots and lots of packages). Everyone doing science and using LaTeX probably uses siunitx nowadays and it naturally has `\ohm`.

11. There are a lot of LaTeX users who don’t use any packages at all, and more who only use a package if they must.

12. “There are a lot of LaTeX users who don’t use any packages at all”

That must be all those users who know what they’re doing already. Because they’re invisible on the popular help sites like http://tex.stackexchange.com and http://latex-community.org/forum/. (The same is true for the three most popular German help sites.) There the vast majority use more packages than they should and (ironically) at the same time not the ones they should.

“and more who only use a package if they must.”

But why are they then using LaTeX in the first place rather than Plain TeX? Packages are the natural extension in LaTeX. Even if you have lots of your own code in your preamble, the LaTeX philosophy suggests you make your own package for it. (I’ll see if I can find David Carlisle’s quote on this…)