Google Blogoscoped

Sunday, December 10, 2006

Does the Google Toolbar Index URLs?

I had a lil’ bet going with Googler Matt Cutts. The premise? I figured that once you installed the Google Toolbar, every URL you visited would slowly be added to the Google web search index... even otherwise unlinked URLs, as long as they’re public (and not excluded via robots.txt).

My reasoning is simple: the Toolbar includes an advanced “show PageRank” option which, when turned on, must send a page’s URL to a Google server so that the server can return the PageRank value. I think using this URL for other purposes as well would be mostly fair and square, too, because a) Google warns about the advanced feature, b) any page it would index would be public anyway, and c) it would help get some “deep web” crawling going, in tune with Google’s mission & motivation.

Well, it turns out my suspicion was dead wrong. Here’s the setup of the bet with Matt that was meant to test this, as far as possible:

  1. I hid an HTML page on my server on August 15th. By “hiding” I mean that I didn’t tell anyone about the page or link to it, and I didn’t make its URL easy to guess. It just happened to be technically public in the sense that there was no password protection.
  2. On that HTML page, I included the unique string blogoscoped55521384239 (such a word had never been indexed by Google before, as a Google web search returned 0 results at the time).
  3. I visited this page with the Firefox Google Toolbar (advanced options activated). Several times.
  4. Now to see the results of the experiment, all you had to do was wait some months, and then search Google for blogoscoped55521384239. If this showed the page in question, then the Toolbar must have been responsible for indexing it.
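
For anyone who wants to repeat the setup, here is a minimal Python sketch of steps 1, 2 and 4. It is not the code I used; the function names (make_marker, make_hidden_path, make_page, toolbar_indexed) are made up for illustration, and the search results in step 4 are assumed to be collected by hand, since no search API is assumed here.

```python
import secrets


def make_marker(prefix: str = "blogoscoped", digits: int = 11) -> str:
    """Return the prefix plus random decimal digits, to use as a unique search term."""
    return prefix + "".join(secrets.choice("0123456789") for _ in range(digits))


def make_hidden_path(nbytes: int = 16) -> str:
    """Return a hard-to-guess, URL-safe file name for the unlinked page."""
    return secrets.token_urlsafe(nbytes) + ".html"


def make_page(marker: str) -> str:
    """Return a minimal HTML page whose only notable content is the marker."""
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Toolbar test</title></head>\n"
        f"<body><p>{marker}</p></body></html>\n"
    )


def toolbar_indexed(result_urls: list[str], hidden_url: str) -> bool:
    """True if the hidden page itself appears among the results for the marker."""
    return hidden_url in result_urls


if __name__ == "__main__":
    marker = make_marker()
    path = make_hidden_path()
    print("Upload this page as:", path)
    print(make_page(marker))
    print("Months later, search for:", marker)
```

Before uploading, search for the marker once to confirm it returns 0 results. Months later, paste the result URLs for the marker into toolbar_indexed together with the hidden page's URL; pages that merely mention the marker (like a discussion of the bet) don't count.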

It was an illuminating experiment: after some months, a search for blogoscoped55521384239 returned only the page where Matt and I discussed the bet (that part was expected), but not the semi-hidden page on my server... which is located here. In other words, nope, the Google Toolbar won’t index pages, at least as far as this experiment could show. I can’t think of any setup that would yield final proof, though Matt could walk up to the Toolbar team and ask, of course. This is not to say that toolbars by other vendors act the same way... if anyone wants to continue the experiment on, say, the Yahoo toolbar, I’m curious about the results.

So congrats to Matt, and hope you or your friends enjoy the copy of my book! :)



