Non-English Languages & Whacky Domain Names

Asian language glyphs from the web site of James Holderness

While doing a little research for an upcoming article tonight, I revisited the web site of James Holderness. If the name looks familiar, it’s because I linked to him in January regarding detecting duplicate items within RSS feeds.

When I stumbled onto his site, I couldn’t believe the domain that he was using:

  • http://www.xn--8ws00zhy3a.com

as it seemed completely unmanageable for a normal person. At the time, I thought James must have been participating in some obscure SEO challenge; today I realise that isn’t the case at all.

The image shown above is displayed on James’s site beside his name. It turns out that those three glyphs somehow translate into the obscure domain listed earlier, as can be seen in the following screenshot from Google Search:

Non-English written characters or glyphs displayed within a Google Search result as the domain name

For those who are interested, Yahoo!, MSN and Live search all showed the encoded, plain-alphabet form of the domain name rather than the glyph-based version, though they were more than happy to display the glyphs within the title of the web site.

Does anyone know how a glyph is translated into the standard English alphabet and, more to the point, what within the domain name delineates one glyph from the next?

One thought on “Non-English Languages & Whacky Domain Names”

  1. The algorithm for encoding an international domain name is fairly complicated and is documented in a series of RFCs – 3490, 3491 and 3492. The one that will probably be of most interest to you is RFC 3492, which describes the Punycode encoding format.
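To make the Punycode step concrete, here is a minimal Python sketch of the decoding procedure from RFC 3492, applied to the label from this post. The function names `punycode_decode` and `to_unicode` are hypothetical helpers written for illustration, and the sketch skips the Nameprep normalisation (RFC 3491) and all error handling, so treat it as a reading aid rather than a complete IDNA implementation. It also suggests an answer to the "what delineates one glyph from the next" question: nothing marks the boundaries explicitly. Each glyph is stored as a variable-length base-36 number, and a number ends at the first digit that falls below a moving threshold.

```python
# Constants taken from RFC 3492, section 5.
BASE, TMIN, TMAX, SKEW, DAMP = 36, 1, 26, 38, 700
INITIAL_BIAS, INITIAL_N = 72, 128


def _digit(ch):
    """Map a-z to 0..25 and 0-9 to 26..35."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0") + 26
    return ord(ch.lower()) - ord("a")


def _adapt(delta, numpoints, first):
    """Bias adaptation: retunes the thresholds after each decoded glyph."""
    delta = delta // DAMP if first else delta // 2
    delta += delta // numpoints
    k = 0
    while delta > ((BASE - TMIN) * TMAX) // 2:
        delta //= BASE - TMIN
        k += BASE
    return k + ((BASE - TMIN + 1) * delta) // (delta + SKEW)


def punycode_decode(label):
    """Decode one Punycode label (without the 'xn--' prefix)."""
    # Plain ASCII characters, if any, come first and end at the last hyphen.
    basic, _, encoded = label.rpartition("-")
    output = list(basic)
    n, i, bias = INITIAL_N, 0, INITIAL_BIAS
    pos = 0
    while pos < len(encoded):
        old_i, w, k = i, 1, BASE
        while True:
            d = _digit(encoded[pos])
            pos += 1
            i += d * w
            t = min(max(k - bias, TMIN), TMAX)
            if d < t:  # a digit below the threshold ends this number
                break
            w *= BASE - t
            k += BASE
        size = len(output) + 1
        bias = _adapt(i - old_i, size, old_i == 0)
        n += i // size   # how far to advance the code point
        i %= size        # where in the output to insert it
        output.insert(i, chr(n))
        i += 1
    return "".join(output)


def to_unicode(domain):
    """Decode every 'xn--' label of a domain name; leave other labels alone."""
    labels = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            label = punycode_decode(label[4:])
        labels.append(label)
    return ".".join(labels)


if __name__ == "__main__":
    print(to_unicode("www.xn--8ws00zhy3a.com"))
    # Python's built-in codec should agree with the hand-written decoder.
    print(b"8ws00zhy3a".decode("punycode"))
```

Run against the domain above, the label should decode to the three characters 詹姆斯, which fits the three glyphs shown beside James's name. In everyday code the built-in `punycode` and `idna` codecs are the more sensible choice; the hand-written version is here only to show where one glyph's digits stop and the next begin.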
