Google Blogoscoped

Forum

Malformed Google Result URLs

Joseph Friedman [PersonRank 0]

Monday, January 12, 2009
10 months ago · 1,074 views

It should return NO results.

site:www means any site ending in a domain ".www" – which of course does not exist.
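A minimal sketch of that reading, assuming `site:` compares whole dot-separated labels from the right-hand end of the hostname. The function name `matches_site` and the sample hostnames are made up for illustration; this is not Google's actual matching logic.

```python
def matches_site(hostname: str, site: str) -> bool:
    """Return True if hostname equals site or ends with "." + site.

    Assumption: site: matches whole labels from the right, so "www"
    only matches a host whose last label is "www".
    """
    host = hostname.lower().rstrip(".")
    suffix = site.lower().strip(".")
    return host == suffix or host.endswith("." + suffix)


print(matches_site("www.example.com", "example.com"))  # True
print(matches_site("www.example.com", "www"))          # False
print(matches_site("intranet.www", "www"))             # True
```

Under this assumption, an ordinary host such as www.example.com never matches `site:www`, because "www" is its first label, not its last.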

JohnMu [PersonRank 10]

10 months ago #

It sure looks interesting :). Thanks for reporting it, I'll pass that on to some of the people here.

George R [PersonRank 10]

10 months ago #

[site:http] google.com/search?q=site:http

I might understand how these got into the index.
Google may have indexed a page that referred to one of these urls.

I do not understand why there is a cache link in the SERP entry.
Following that link produces a Google page that says "Your search – ... – did not match any documents."

noname [PersonRank 4]

10 months ago #

I think there are several problems:
1. Someone has to have backlinks to such links. As link: shows, it was the Russian Yandex google.cz/search?hl=cs&q=l ... and that's the reason why only Russian pages are in the index.

2. Such links had to pass through the internal link validation system. If something like www/ismm/ru went through, it means they have some problem there. Maybe they also index some internal links where http://www is the root of the "intranet", and the subpages from Yandex looked like pages in that system (it probably did not return a 404 but some 300-level response, so Google indexed them using the description from the source of the link, Yandex).

Zim [PersonRank 10]

10 months ago #

Also, searching for
site:www.*
brings weird results.

Juha-Matti Laurio [PersonRank 10]

10 months ago #

Are the URLs www.de and www.ee (@Zim) real at all?

Reinier Meijer [PersonRank 1]

10 months ago #

Today I got malformed results through Google Alerts too.

Check the results for:
google.com/search?hl=en&q= ...

You get a URL consisting of 2 URLs, separated by an encoded space, a hyphen and again an encoded space:
www.jjnet.dk%20-%20www.feltet.dk
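To see what that string decodes to, here is a small sketch using Python's standard `urllib.parse.unquote`. The helper `is_valid_hostname` is a hypothetical checker written for this example; it applies the usual letters-digits-hyphen rule for hostname labels and is only an approximation of what a crawler might check.

```python
import re
from urllib.parse import unquote

# One hostname label: letters, digits, hyphens; no leading or trailing hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(host: str) -> bool:
    """Check every dot-separated label against the letters-digits-hyphen rule."""
    if not host or len(host) > 253:
        return False
    return all(LABEL.fullmatch(label) for label in host.split("."))


raw = "www.jjnet.dk%20-%20www.feltet.dk"
decoded = unquote(raw)

print(decoded)                             # www.jjnet.dk - www.feltet.dk
print(is_valid_hostname(decoded))          # False (a label contains spaces)
print(is_valid_hostname("www.feltet.dk"))  # True
```

A check like this would reject the combined string as a hostname, since the label between the two site names contains spaces.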

JohnMu [PersonRank 10]

10 months ago #

Hi Reinier,
It looks like feltet.dk is using wildcard subdomains – anything technically resolves to the webserver like that. For example, blogoscoped.feltet.dk will work. This is generally something they need to fix on their end (feel free to send them a tip :-)).

I'll pass this on to the guys here too though. Thanks for bringing it up!
John
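The wildcard-subdomain behaviour John describes can be illustrated with a toy lookup. This is a deliberately simplified model, not real DNS resolution: the `resolve` function, the zone dictionary, and the address 192.0.2.10 (a documentation-only address) are all assumptions made for the example.

```python
from typing import Dict, Optional


def resolve(name: str, records: Dict[str, str]) -> Optional[str]:
    """Toy lookup: exact match first, then the nearest wildcard record.

    Simplified on purpose; real DNS wildcard rules have more cases.
    """
    name = name.lower().rstrip(".")
    if name in records:
        return records[name]
    labels = name.split(".")
    # Replace the leading labels with "*" one step at a time.
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in records:
            return records[wildcard]
    return None


# Hypothetical zone data for illustration only.
zone = {"feltet.dk": "192.0.2.10", "*.feltet.dk": "192.0.2.10"}

print(resolve("blogoscoped.feltet.dk", zone))         # 192.0.2.10
print(resolve("www.jjnet.dk - www.feltet.dk", zone))  # 192.0.2.10
print(resolve("www.jjnet.dk", zone))                  # None
```

In this model, anything that ends in .feltet.dk falls through to the wildcard record, including a name with spaces in it, which is why a server configured this way answers for malformed hosts too.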

[Link formatting corrected – Tony]

Reinier Meijer [PersonRank 1]

10 months ago #

Hi JohnMu,

Thanks! You're right. Never seen spaces in subdomains before.

Reinier

Selva Kumar [PersonRank 0]

10 months ago #

I think this problem has been taken care of. It no longer produces this error.

Antionestrife [PersonRank 0]

10 months ago #

Now this was an error I was unaware of. Either I didn't come across this problem or it was already fixed.

[URL removed – Tony]

Michael Flaster [PersonRank 0]

10 months ago #

Thanks for pointing this problem out!

Yes, this issue has now been resolved.

Matt Cutts [PersonRank 10]

10 months ago #

Michael, thanks for stopping by to talk about this (I can vouch that Michael is a fellow engineer at Google). :)

Marcel Sasik [PersonRank 0]

10 months ago #

I have the same problem as Reinier Meijer:

google.sk/search?hl=sk&q=p ...

www.predajdielo.sk,%20www.umeleckediela.sk/ – 36k

This thread is locked as it's old... but you can create a new thread in the forum. 
