Google Blogoscoped

Google International Domain Names Bug

Kirby Witmer [PersonRank 10]

Friday, April 7, 2006
18 years ago, 6,794 views

Hmmmm.... works for me.

David [PersonRank 0]

18 years ago #

and for me

Philipp Lenssen [PersonRank 10]

18 years ago #

OK, it seems only in Firefox will you get a 404. However even in IE you'll be directed to the wrong page... compare the page you see in IE after clicking on the Google result with the page you get when copy & pasting the URL displayed into the address bar.

Mike [PersonRank 0]

18 years ago #

Works fine with FF for me.

TOMHTML [PersonRank 10]

18 years ago #

Yes, when I click on the link, I get a 404 error page at http://g%c3%a1bor.20y.hu/

I have Firefox.
In IE, when I click, I am redirected to http://xn--gbor-5na.20y.hu/
Where does the "xn--" come from?

Tony Ruscoe [PersonRank 10]

18 years ago #

TOMHTML: the "xn--" characters are a result of encoding "foreign" characters in international domain names using Punycode.

You can read more here:
http://en.wikipedia.org/wiki/Internationalized_domain_names
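
For anyone who wants to try the conversion, here is a minimal Python sketch using the hostname from this thread. It assumes Python's built-in "idna" codec (which follows the older IDNA 2003 rules) is a reasonable stand-in for what the browser does; the function names are made up for illustration.

```python
# Sketch: convert a hostname between its Unicode form and its ASCII
# ("xn--" + Punycode) form.
# Assumption: Python's built-in "idna" codec (IDNA 2003) approximates
# the conversion a browser performs before the DNS lookup.

def to_ascii(hostname: str) -> str:
    """Encode every non-ASCII label as "xn--" followed by Punycode."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Decode "xn--" labels back into their Unicode form."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("gábor.20y.hu"))
    print(to_unicode("xn--gbor-5na.20y.hu"))
```

Only the label containing "á" changes; the plain ASCII labels "20y" and "hu" pass through untouched.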

Nono [PersonRank 0]

18 years ago #

It's the same for me here on IE.

Tony Ruscoe [PersonRank 10]

18 years ago #

BTW, on a (slightly) related note, it appears that Google doesn't yet recognise .eu as a TLD. For example, both of these (and more) will show a summary for the site/domain (i.e. cache, similar pages, links, etc.):

http://www.google.com/search?q=whois.com
http://www.google.com/search?q=whois.net

By contrast, this just returns a normal search results page:

http://www.google.com/search?q=whois.eu

KenWong [PersonRank 3]

18 years ago #

Clicked the link directly in IE6 and worked well.

David [PersonRank 0]

18 years ago #

Works fine in Firefox 1.5.

. [PersonRank 1]

18 years ago #

Firefox and works fine

TOMHTML [PersonRank 10]

18 years ago #

Thank you very much, Tony. By the way, I found out what "Punycode" means.

Matt Cutts recently admitted that Google has problems with IDNs, at least with PageRank:

Q: “Any results on why IDN Domains don’t show pagerank?”
A: I’ve seen a couple that do, but I’ll check into why most don’t. My guess is that there’s a normalization issue somewhere in the toolbar PageRank pathway.
> http://www.mattcutts.com/blog/q-a-thread-march-27-2006/

Fergus Macdonald [PersonRank 0]

18 years ago #

Works fine for me too...Firefox.

Datrio [PersonRank 1]

18 years ago #

It's worse with some other characters in URLs.

I don't know about you, but click here – http://www.google.com/search?client=opera&rls=en&q=Przestrze%C5%84+Wikipedia&sourceid=opera&ie=utf-8&oe=utf-8

The first result should be an article on Przestrzeń from the Polish Wikipedia. If you have Search History enabled, upon clicking on the link you'll get a "Bad Request" error (at least in Opera), because the Polish character ń will be encoded as %u0144 in the URL.

I so much hope Google will fix that soon...
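
To make the difference between the two encodings concrete, here is a rough Python sketch. `urllib.parse.quote` produces the standard UTF-8 percent-encoding seen in the search link above; `percent_u_encode` is a hypothetical helper that only mimics the non-standard `%uXXXX` form and is not taken from any Google or Opera code.

```python
from urllib.parse import quote


def percent_encode_utf8(text: str) -> str:
    """Standard percent-encoding: each UTF-8 byte becomes %XX."""
    return quote(text, safe="")


def percent_u_encode(text: str) -> str:
    """Hypothetical helper mimicking the non-standard %uXXXX form.

    ASCII letters and digits are kept; every other character is written
    as %u followed by its code point in hex (at least four digits).
    """
    return "".join(
        ch if ch.isascii() and ch.isalnum() else "%u{:04X}".format(ord(ch))
        for ch in text
    )


if __name__ == "__main__":
    word = "Przestrzeń"
    print(percent_encode_utf8(word))
    print(percent_u_encode(word))
```

A server that only understands `%XX` escapes has no defined way to read `%u0144`, which would be consistent with the "Bad Request" described above.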

Juha-Matti Laurio [PersonRank 10]

18 years ago #

Works fine in localized Firefox 1.5.0.1 too.

In Finland we have sites like http://www.säkylä.fi/ and http://www.viestintävirasto.fi/ (redirect in use) too.

Haochi [PersonRank 10]

18 years ago #

I didn't even know you could use special characters in a domain name... I thought you could only use a-z...
Interesting...

Matt Cutts [PersonRank 10]

18 years ago #

All these examples worked fine for me, except for the [whois.eu] search. I'll point that out in Google, and ask someone to look at this thread.

Philipp Lenssen [PersonRank 10]

18 years ago #

Very strange. The top Google result when clicked on brings me to
g%c3%a1bor.20y.hu/ which then shows a 404 (Win Firefox 1.5.0.1). Only by copying the URL as it is displayed below the result snippet can I go to the page...

/pd [PersonRank 10]

18 years ago #

The .eu TLD only came out of its sunrise period yesterday. Propagation across the root servers will take at least 48 hrs for full meshing... So it's OK that whois.eu will take some time to display results.

Or is my thought process wrong on this?

/pd [PersonRank 10]

18 years ago #

Sorry, I forgot to add the reference:

http://europa.eu.int/information_society/policy/doteu/index_en.htm

Andrew Hitchcock [PersonRank 10]

18 years ago #

Just to add some samples: all the links work in Safari.

Philipp Lenssen [PersonRank 10]

18 years ago #

OK, I figured out one reason why it seems to work and then not work: I signed out of my Google Account and then clicking on the result worked.

Being signed in, the click URL is something like this for me (Firefox):
http://www.google.com/url?sa=t&ct=res&cd=1&url=http%3A//g%E1bor.20y.hu/
(I removed an "ei" and a "sig2" parameter, but the URL works neither with nor without them)

Signed out, the URL is the working one:
http://gábor.20y.hu/
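
A quick Python sketch of one way such a click URL can go wrong: `%E1` is "á" as a single Latin-1 byte, while the 404 URL quoted earlier in the thread used the two-byte UTF-8 form `%c3%a1`. Which encoding each side of the redirect actually expects is an assumption here, not something confirmed in this thread.

```python
from urllib.parse import quote, unquote

host_label = "gábor"

# The same character percent-encoded under two different byte encodings.
latin1_form = quote(host_label, encoding="latin-1")  # "á" is one byte
utf8_form = quote(host_label, encoding="utf-8")      # "á" is two bytes

# Reading the Latin-1 form as if it were UTF-8 cannot recover "á":
# the lone 0xE1 byte is invalid UTF-8 and gets replaced with U+FFFD.
misread = unquote(latin1_form, encoding="utf-8", errors="replace")

# Reading it with the encoding it was written in works fine.
recovered = unquote(latin1_form, encoding="latin-1")

if __name__ == "__main__":
    print(latin1_form, utf8_form, repr(misread), recovered)
```

So if one step writes the URL as Latin-1 and the next reads it as UTF-8 (or the other way round), the hostname no longer matches the real one.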

Tony Ruscoe [PersonRank 10]

18 years ago #

Matt Cutts: Thanks for passing on the info.

/pd: There are already 47,000 .eu domains in Google's index:

http://www.google.com/search?q=site%3a%2eeu

Companies have been able to register these throughout the Sunrise period, so many are already in operation. The report you linked to explains that from yesterday, anyone can register a .eu domain without the need for documentation to prove it's your trademark, company name, etc.:

"Registration for the new Top Level web Domain .eu began on 7 December 2005 with a 4-month “sunrise” period. During this time only the holders of existing trademarks or other prior rights could register. Registrations for .eu is fully open to the public as from 7 April 2006."

Philipp Lenssen [PersonRank 10]

18 years ago #

German Spiegel says there were 900,000 registrations on Friday during the landrush phase... mostly from the UK (222,000), with Germany coming in second, and the Netherlands third.
http://www.spiegel.de/netzwelt/politik/0,1518,410376,00.html

Jamie [PersonRank 0]

18 years ago #

I've spent the last 5 days messing with i18n and encodings at work. I think you could almost get a degree in encodings and locales, there is so much to know. This really only applies to a global page like Google. We've already seen security issues caused by encoding issues. In fact I think it was only a few months ago that someone showed a Punycode-encoded domain name could be composed of homographs of a real domain such as paypal, yet another way to instigate phishing attacks. It's absolutely mind numbing. I have a strange interest in arcane information but character sets and encodings... do I sound frustrated to anyone? :)
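
The homograph trick can be sketched in a few lines of Python. The lookalike below swaps the Latin "a" for the Cyrillic "а" (U+0430); it is an illustrative hostname only, and the "idna" codec (IDNA 2003) is assumed to be close enough to what a browser does.

```python
import unicodedata

REAL = "paypal.com"
LOOKALIKE = "p\u0430ypal.com"  # second letter is Cyrillic, not Latin


def non_ascii_chars(hostname: str) -> list[str]:
    """Return the Unicode names of all non-ASCII characters in a hostname."""
    return [unicodedata.name(ch) for ch in hostname if ord(ch) > 127]


def ascii_form(hostname: str) -> str:
    """ASCII ("xn--") form of a hostname, via Python's IDNA 2003 codec."""
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(REAL == LOOKALIKE)           # the strings differ, whatever the font shows
    print(non_ascii_chars(LOOKALIKE))  # reveals the Cyrillic letter
    print(ascii_form(LOOKALIKE))       # an "xn--" name, not paypal.com
```

Showing the "xn--" form instead of the rendered one is one simple way to make the difference visible to a user.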

Fun fact: Unicode has 18 different space characters (15 of which render identically on my screen). Have fun parsing out individual words with a list of 18 (and counting) separators, coders!
http://www.cs.tut.fi/~jkorpela/chars/spaces.html
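
A small Python sketch of the word-splitting problem. It lists the characters Unicode itself classifies as space separators (category Zs); the exact count depends on the Unicode version bundled with your Python and on what you choose to call a "space", so it will not necessarily match the 18 on the page above.

```python
import sys
import unicodedata


def space_separators() -> list[str]:
    """All code points in Unicode general category Zs (space separator)."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Zs"
    ]


# EM SPACE and NO-BREAK SPACE count as whitespace for str.split();
# ZERO WIDTH SPACE (U+200B) is a format character, so it does not.
text = "one\u2003two\u00a0three\u200bfour"
words = text.split()

if __name__ == "__main__":
    for ch in space_separators():
        print("U+{:04X}".format(ord(ch)), unicodedata.name(ch))
    print(words)
```

Note that the last two "words" stay glued together, which is exactly the kind of surprise you run into when the separator list is incomplete.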

And yet, there's no way to differentiate Japanese Unicode characters from Chinese or other Han unified languages. That adds to the nightmare because you still have to consider the regions that haven't embraced Unicode for this and other reasons and still stick to their favorite two-or-three encodings and character sets.

Philipp Lenssen [PersonRank 10]

18 years ago #

> I think you could almost get a
> degree in encodings and locales,
> there is so much to know.

True, true. And if you add typographical conventions and local keyboards to the mix (I don't have the right key to type a correct quotation mark) it becomes mindboggling.

Caleb E [PersonRank 10]

18 years ago #

The problem has to do with their redirects, which allow the URL to be added to your search history. On an only slightly related note, Google changed the summary text for pages from "Results 1 – 1 of 1 for gábor.20y.hu" to "Showing web page information for gábor.20y.hu"
