Search engines still struggling with Internationalized Domain Names (IDNs)
Internationalized Domain Names (IDNs) are now live and in use by real websites. Unfortunately, the search engines still seem to be playing catch-up.
Introduction
Note: If you are unable to view the Arabic and Cyrillic letters in this page you may need to install the required fonts.
Now that the first Internationalized Domain Names (IDNs) have gone live and have had some time to get established, it seems like a good time to revisit the findings of my previous article on IDNs, “Can search engines handle Internationalized Domain Names (IDNs)?”.
IDNs went live initially for three countries, all using the Arabic alphabet: Egypt (مصر); Saudi Arabia (السعودية); and the United Arab Emirates (امارات). Russia’s new IDN (рф) went live a little later, adding the Cyrillic alphabet to the mix, and additional IDNs have been created for other countries and alphabets. For this article I’ll take a look at how search engines handle these four IDNs.
IDN site search
To get an idea of how extensively the search engines have indexed sites on these new IDNs, I’m going to use the “site:” operator. Although this operator is primarily used for finding pages on a particular website, e.g. [site:lbi.co.uk], it can also work all the way up to the TLD level, e.g. [site:uk].
Searching Google for [site:مصر], [site:السعودية], [site:امارات] and [site:рф] returns results from the IDNs for Egypt, Saudi Arabia, the UAE and Russia, as expected. Whilst the new IDN for Saudi Arabia had only 14 pages indexed when checked, the other IDNs each returned thousands of results.
Screenshot of a search for [site:рф] in Google:
Trying the same searches in Bing, however, does not return any results:
It appears that the site: operator does not work with these new IDNs in Bing (searching for other domains, e.g. [site:com], works as expected).
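One way to probe this further would be to retry the same queries using the ASCII ("xn--") form that every IDN label also has. The sketch below is illustrative only: it relies on Python's built-in `idna` codec (which follows the older IDNA 2003 rules), the helper names `to_ascii_label` and `site_queries` are hypothetical, and whether either search engine treats the two forms differently was not tested for this article.

```python
# Illustrative sketch: build "site:" queries for the four IDN TLDs in both
# their Unicode form and their ASCII-compatible ("xn--") form.
# Helper names are hypothetical; the built-in "idna" codec follows the
# IDNA 2003 rules.

IDN_TLDS = {
    "Egypt": "مصر",
    "Saudi Arabia": "السعودية",
    "United Arab Emirates": "امارات",
    "Russia": "рф",
}


def to_ascii_label(domain: str) -> str:
    """Return the ASCII-compatible encoding of a domain or TLD."""
    return domain.strip().encode("idna").decode("ascii")


def site_queries(domain: str) -> tuple[str, str]:
    """Return the site: query in Unicode form and in ASCII form."""
    domain = domain.strip()  # drop stray whitespace around the domain
    return f"site:{domain}", f"site:{to_ascii_label(domain)}"


if __name__ == "__main__":
    for country, tld in IDN_TLDS.items():
        unicode_query, ascii_query = site_queries(tld)
        print(f"{country}: [{unicode_query}] [{ascii_query}]")
```

For the Russian TLD, for example, the second query comes out as `site:xn--p1ai`. Trying both forms in each search engine would show whether the problem lies with the Unicode input or with the IDN itself.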
IDNs in search results?
The next area tested is whether the search engines will return these domains in their search results. To test this I picked out some random web pages on the new Egyptian IDN and tried searching for their title tags in both Google and Bing.
Searching both Google and Bing for the title of one web page, [مراكز التميز في البحث والتطوير - وزارة الإتصالات], brought up a number of web pages. The results from Google and Bing both contained a result from an IDN:
Google snippet featuring an IDN:
Bing snippet featuring an IDN:
More IDN bugs
Earlier I described how Bing’s site: operator does not yet work with IDNs. However, Google also has a number of IDN woes. Searching for [site:مصر] (the new IDN for Egypt) brings up the site سجل.مصر. However, clicking on the “Show more results from سجل.مصر” link in Google appears to list results from domains other than سجل.مصر. Additionally, the text of the “Show all results” link is percent-encoded rather than showing the site name in Arabic script.
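To illustrate what percent-encoding does to such a name, here is a minimal sketch using Python's standard `urllib.parse` module. It assumes the link text is simply the UTF-8 percent-encoding of the hostname, which is a guess based on how the link looks rather than anything Google has documented.

```python
# Illustrative sketch: the same hostname as readable Unicode and as
# percent-encoded UTF-8 (assumed to be the form shown in the link text).
from urllib.parse import quote, unquote

hostname = "سجل.مصر"

# Each UTF-8 byte of the Arabic letters becomes a %XX escape; the dot is kept.
percent_encoded = quote(hostname)

# Decoding the escapes restores the original Arabic script.
readable_again = unquote(percent_encoded)

if __name__ == "__main__":
    print(percent_encoded)
    print(readable_again)
```

Because decoding is this straightforward, showing the readable form in the link should only be a matter of decoding before display.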
In my previous look at how search engines handled IDNs I had found that Google’s links to “Translate this page” and “Cached” were broken for IDNs. Today it appears that Google has fixed the translation links – however, the cache links still do not appear to function.
Conclusion
The situation is much the same as it was back in February. The search engines can index websites which use IDNs – however, all of the major search engines still have bugs with their IDN support.
Given that the number of IDNs is set to grow and the number of websites using IDNs is likely to increase vastly in the near future, it’s vital that the search engines iron out the bugs in their IDN support. After all, if a search engine can’t handle websites from a particular country properly, people might decide to switch to a search engine that can.