Non-English Languages & Whacky Domain Names

Asian language glyphs from the web site of James Holderness

While doing a little research for an upcoming article tonight, I revisited the web site of James Holderness. If the name looks familiar, it’s because I linked to him in January regarding detecting duplicate items within RSS feeds.

When I stumbled onto his site, I couldn’t believe the domain that he was using:

  • http://www.xn--8ws00zhy3a.com

It seemed so unmanageable for a normal person that, at the time, I thought James must have been participating in some obscure SEO challenge; today I realise that isn’t the case at all.

The image shown above is displayed on James' site beside his name. It turns out that those three glyphs somehow translate into the obscure domain listed earlier, as can be seen in the following screenshot from Google Search:

Non-English written characters or glyphs displayed within a Google Search result as the domain name

For those who are interested, Yahoo!, MSN and Live search all showed the English-alphabet version of the domain name and not the glyph-based version, though they were more than happy to display the glyphs within the title of the web site.

Does anyone know how a glyph is translated into the standard English alphabet and, more to the point, what within the domain name delineates one glyph from the next?

About Alistair

My name is Alistair Lattimore, I'm in my very early 30s and live on the sunny Gold Coast in Australia. I married my high school sweetheart & we've been together for longer than I can remember. Claire and I started our family in September 2008 when Hugo was born and added a gorgeous little girl named Evie in May 2010. You can find me online in the typical hangouts, Twitter, Facebook & Google.

One Response to Non-English Languages & Whacky Domain Names

  1. The algorithm for encoding an international domain name is fairly complicated and is documented in a series of RFCs – 3490, 3491 and 3492. The one that will probably be of most interest to you is RFC 3492, which describes the Punycode encoding format.
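As a rough illustration of what the comment above describes, here is a minimal Python sketch that decodes the domain from the post using the standard library's `punycode` and `idna` codecs. The helper name `decode_label` and the constant `ACE_PREFIX` are made up for this example; only the hostname comes from the post. Note that Python's built-in `idna` codec follows the older RFC 3490 rules, so treat this as a sketch rather than a full implementation of current IDN handling. On the question of what separates one glyph from the next: as far as I can tell from RFC 3492, nothing does. The `xn--` prefix flags the label as encoded, and the rest is a run of variable-length base-36 numbers, each of which tells the decoder which character to insert and where.

```python
# Hostname taken from the post; everything else here is illustrative.
hostname = "www.xn--8ws00zhy3a.com"

# Prefix that marks a label as Punycode-encoded (the "ACE prefix").
ACE_PREFIX = "xn--"


def decode_label(label: str) -> str:
    """Decode one dot-separated label if it carries the ACE prefix.

    Hypothetical helper for this example: it skips the Nameprep checks
    that a complete IDNA implementation would perform.
    """
    if label.lower().startswith(ACE_PREFIX):
        encoded = label[len(ACE_PREFIX):]
        return encoded.encode("ascii").decode("punycode")
    return label


# Decode each label separately; plain ASCII labels pass through untouched.
decoded = ".".join(decode_label(part) for part in hostname.split("."))
print(decoded)

# Show the code point behind each glyph of the encoded label.
for ch in decode_label("xn--8ws00zhy3a"):
    print(f"U+{ord(ch):04X}")

# Going the other way: the idna codec adds the prefix and encodes each label.
reencoded = decoded.encode("idna").decode("ascii")
print(reencoded)
```

Working the label through by hand gives the three characters U+8A79, U+59C6 and U+65AF, which matches the three glyphs mentioned in the post, and re-encoding returns the original hostname.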
