Shorter URLs by using Unicode

Tinyarro.ws is a URL-shortening service like tinyurl.com. Unlike similar services, however, Tinyarro.ws uses Unicode characters, which lets it pack more possibilities into each character. These sub-compact URLs may contain Chinese characters, for example, or other symbols unfamiliar to many users. They’re no good for reading aloud, say over the phone or on a podcast. But they’re ideal for Twitter because you only have to click on the link, not type it into a browser.
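To make the "more possibilities per character" point concrete, here is a rough sketch in Python. The alphabet sizes are assumptions for illustration only: 62 is the count of ASCII letters and digits, and 20,000 is a hypothetical figure for a Unicode-based alphabet. I don't know how many characters Tinyarro.ws actually draws from.

```python
def chars_needed(num_urls, alphabet_size):
    """Return the smallest n such that alphabet_size**n >= num_urls."""
    n = 0
    capacity = 1
    while capacity < num_urls:
        capacity *= alphabet_size
        n += 1
    return n

ASCII_ALPHABET = 62        # a-z, A-Z, 0-9
UNICODE_ALPHABET = 20_000  # hypothetical size, not Tinyarro.ws's real alphabet

one_billion = 1_000_000_000
print(chars_needed(one_billion, ASCII_ALPHABET))    # characters needed with ASCII
print(chars_needed(one_billion, UNICODE_ALPHABET))  # characters needed with the larger alphabet
```

Under these assumptions, distinguishing a billion URLs takes about half as many characters with the larger alphabet.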

Here’s a URL I got when I tried the Tinyarro.ws site:

Screenshot from tinyarro.ws

The resulting URL may not display correctly in your browser depending on what fonts you have installed: http://➡.ws/㣸.

I pasted the URL into Microsoft Word and used Alt-x to see what the Unicode characters were. (See Three ways to enter Unicode characters in Windows.) The arrow is code point U+27A1 and the final character is code point U+38F8. I have no idea what that character means. I would appreciate someone letting me know in the comments.
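If you don't have Word handy, the same lookup can be sketched in a few lines of Python using the standard `unicodedata` module. The helper name `describe_non_ascii` is my own, and the URL is written with escape sequences so the example doesn't depend on your fonts or file encoding.

```python
import unicodedata

# The short URL from above: arrow (U+27A1) as the domain, U+38F8 as the path.
url = "http://\u27a1.ws/\u38f8"

def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    result = []
    for ch in text:
        if ord(ch) > 127:
            # Fall back to a placeholder if the character has no name in
            # the Unicode database shipped with this Python version.
            name = unicodedata.name(ch, "<unnamed>")
            result.append((ch, f"U+{ord(ch):04X}", name))
    return result

for ch, code_point, name in describe_non_ascii(url):
    print(ch, code_point, name)
```

This reports the code point and the formal Unicode name of each character, but not what the character means.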

Unicode character U+38F8

Related post: How to insert graphics in Twitter messages

4 comments on “Shorter URLs by using Unicode”
  1. John says:

    Apparently U+38F8 is an obscure character. Two Chinese friends have told me they don’t recognize it. One suggested it may have something to do with work. I assume the character is at least inoffensive.

  2. John Venier says:

    OK, that does it — time to link to Hanzi Smatter!

  3. tpc says:

    I studied Chinese for 10 years and I don’t recognise that character either.

  4. Ah, the Unicode “ghost character” strikes again…

    Characters like these are known as “ghost characters”: they exist only because of a Unicode algorithm and are not used in any linguistic sense.

    As a matter of fact, the Unihan Grid Index has many of these:

    http://www.tian.cc/2009/04/ghost-character-u38f8.html