Google Blogoscoped


GStatic, an addition to Google's robots.txt

Philipp Lenssen [PersonRank 10]

Thursday, October 16, 2008
16 years ago, 14,279 views

I believe this line has been added to Google's robots.txt (

Sitemap: http: //

(added blank to prevent auto-linking, but here's the link:

Ionut Alex. Chitu [PersonRank 10]

16 years ago #

A list of sitemaps for Google profiles. I wonder if Google will create a special search engine for those profiles.

Colin Colehour [PersonRank 10]

16 years ago #

Wow, kinda scary to see a giant list of profile links like that. They make it so convenient for anyone to grab a list of Google users with profiles.

@Ionut Alex. Chitu: They probably created the sitemap so the main Googlebot will find the profile URLs. Otherwise they might not have an easy way to include those pages in their index.
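As a rough illustration of how a crawler can use such a sitemap, here is a minimal sketch that pulls the listed URLs out of a sitemap document. It assumes the standard XML sitemap format from sitemaps.org; the format of the actual file isn't shown in this thread, and the sample document, the example.com URLs and the `extract_locs` name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample data, not the real Google file.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/profiles/alice</loc></url>
  <url><loc>http://example.com/profiles/bob</loc></url>
</urlset>"""


def extract_locs(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(sitemap_xml)
    # ".//sm:loc" matches <loc> at any depth, so the same function works
    # for a <urlset> and for a <sitemapindex> that lists other sitemaps.
    return [loc.text.strip() for loc in root.iterfind(".//sm:loc", NS) if loc.text]


if __name__ == "__main__":
    for url in extract_locs(SAMPLE):
        print(url)
```

A crawler would fetch each sitemap, run something like this over it, and queue the resulting URLs.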

Colin Colehour [PersonRank 10]

16 years ago #

Only 176 results so far for Google Profiles:

I bet this grows in the next month because they are going to be crawling profile URLs now with that sitemap.

Ionut Alex. Chitu [PersonRank 10]

16 years ago #

I know why they created the sitemaps, but Google Base would be more appropriate to search those profiles because they're structured.

The above 5 comments were made in the forum before this was blogged.

Richard Liriano [PersonRank 1]

16 years ago #

Good morning Philipp,

Great post.

Just wanted to share with you also that on your Google Profile site there is an option that allows you to put a badge on your profile site to tell people that you are voting on November 4th.

This is the image that is added when you enable it:

When you click on the image it directs you to:


P.S. I am sure Google will be adding more badges like this to Google Profiles in the future, no?

Amragi Temper [PersonRank 1]

16 years ago #

someone should crawl all these and make them searchable:

It will be approx. 1 GB of data, that should be possible, no?

An unofficial Google profile search up before Google's own profile search, that would be something *G

Philipp Lenssen [PersonRank 10]

16 years ago #

Amragi, coincidentally, my crawler has been running for an hour or so by now. Currently at 13,044 of 132,268...

And here are the ones indexed in Google so far (175 at this time, probably with old ones indexed through backlinks from elsewhere):

J [PersonRank 2]

16 years ago #


I found John McCain :

Wondering if it's really him.

Philipp Lenssen [PersonRank 10]

16 years ago #

J, that's "officially" him, but who knows if he set anything up himself; it's also probable his advisors helped. See:

NoddleGei [PersonRank 0]

16 years ago #

And Obama, too:

beussery [PersonRank 10]

16 years ago #

These voting features have been out for a week or so...

mrbene [PersonRank 10]

16 years ago #

Hey, this is nice: The profile isn't public by default.

Yay! Privacy-compliant!

Floris Fiedeldij Dop [PersonRank 2]

16 years ago #

Great, now when abusive users need to hack into the accounts of people who use super hard to guess passwords, all they have to do is go to their Google profile and see if they answered some of those 'where did you live, what do you do, etc.' questions. :rolleyes:

Omer [PersonRank 0]

16 years ago #

Omfg, Google is indeed a bloody social network now.

Adam Boulton [PersonRank 0]

16 years ago #

Matt Cutts has commented that the profile links will have no SEO benefits.

franz [PersonRank 1]

16 years ago #

>The Sitemap is hosted at (I wonder why?)

Simple: it's easier to roll out these files to the server. Google is a big, big, big data center network, so if you wanted to distribute these files to the Google domain you would have to wait hours, probably days, until they are the same on all data centers. If you just put them on a separate, smaller server farm, you can roll out these files much, much faster.

Philipp Lenssen [PersonRank 10]

16 years ago #

The Google Public Profile Search engine is finished. If you want it, you'd need to set up the PHP + SQL on your server; you can email me to grab the package...

Roger Browne [PersonRank 10]

16 years ago #

I presume you noticed the incorrectly-escaped "&" in the search result for Brian Ussery.

Philipp Lenssen [PersonRank 10]

16 years ago #

Yes I did, thanks. I could fix this. The reason is that while I "should" treat Google's profile pages as HTML and thus not escape them again when displaying snippets, I figured it might be safer to escape in any case, since I can't be too sure their HTML is correct. Guess what I'd need is an "escape only if needed" function :)

Roger Browne [PersonRank 10]

16 years ago #

The safest way is to "un-escape" the crawled HTML, then re-escape it before displaying it on your own website. That will be safe whether you get correct or bad HTML from Google.
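A minimal sketch of that un-escape-then-re-escape idea, written in Python rather than the PHP the search engine actually uses. The function name `normalize_snippet` is made up for illustration, and the sketch assumes the snippet should be shown as plain text, so any tags left in it are displayed literally rather than rendered.

```python
import html


def normalize_snippet(crawled: str) -> str:
    """Un-escape crawled text once, then escape it once for display."""
    # html.unescape turns "&amp;" into "&" and leaves a bare "&" untouched,
    # so correctly escaped and unescaped input end up in the same form.
    plain = html.unescape(crawled)
    # html.escape then encodes &, <, > and quotes exactly once.
    return html.escape(plain)


if __name__ == "__main__":
    print(normalize_snippet("Tom &amp; Jerry"))  # already escaped input
    print(normalize_snippet("Tom & Jerry"))      # bare ampersand input
```

Both calls print the same singly-escaped text, which is the "escape only if needed" behaviour asked for above. One caveat: input that was escaped twice at the source is only un-escaped one level, so it would still show a visible "&amp;".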

franz [PersonRank 1]

16 years ago #

Hi Philipp,

an easier way to access the google profile search is []

Philipp Lenssen [PersonRank 10]

16 years ago #

Franz why do you think that is easier than bookmarking ? Both URLs aren't that easy to remember (e.g. "s2"... or "saerch" instead of "search").
