Unicode provides a way to distinguish symbols that look alike but have different meanings.
We can illustrate this with the following Python code.
import unicodedata as u for pair in [('K', 'K'), ('Ω', 'Ω'), ('ℵ', 'א')]: for c in pair: print(format(ord(c), '4X'), u.bidirectional(c), u.name(c))
4B L LATIN CAPITAL LETTER K 212A L KELVIN SIGN 3A9 L GREEK CAPITAL LETTER OMEGA 2126 L OHM SIGN 2135 L ALEF SYMBOL 5D0 R HEBREW LETTER ALEF
Even though K and K look similar, the former is a Roman letter and the latter is a symbol for temperature in Kelvin. Similarly, Ω and Ω are semantically different even though they look alike.
Or rather, they probably look similar. A font may or may not use different glyphs for different code points. The editor I’m using to write this post uses a font that makes no difference between ohm and omega. The letter K and the Kelvin symbol are slightly different if I look very closely. The two alefs appear substantially different.
Note that the mathematical symbol alef is a left-to-right character and the Hebrew latter alef is a right-to-left character. The former could be useful to tell a word processor “This isn’t really a Hebrew letter; it’s a symbol that looks like a Hebrew letter. Don’t change the direction of my text or switch any language-sensitive features like spell checking.”
These letter-like symbols can be used to provide semantic information, but they can also be used to deceive. For example, a malicious website could change a K in a URL to a Kelvin sign.
One thought on “Letter-like Unicode symbols”
I recommend reading the old debate about splitting off a new Unicode character for yogh (ȝ) from ezh (ʒ) for nice look at the process of untangling the semantics of glyphs from their appearances.