Some additions to Google's robots.txt file:
------------------------
Disallow: /hosted/images/
Disallow: /hosted/life/

User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot-Image
Allow: /hosted/images/
Allow: /hosted/life/
Allow: /searchhistory/
Allow: /news?output=xhtml
Allow: /news?btcid=
Allow: /news?btaid=
Allow: /s2/profiles
------------------------
Why would they disallow the image search bot for most services? For example, this search will return 0 results:
http://images.google.com/images?q=site%3Agroups.google.com
... unless they "crawl" images internally and then add them to the index in a different way (since they already have the files on their own servers, they don't strictly need to fetch them over external HTTP...). Not that that's necessarily the answer here, just saying...
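The Allow/Disallow interplay in the excerpt above can be checked with Python's stdlib robots.txt parser. One caveat for this sketch: urllib.robotparser applies rules in file order (first match wins), unlike Google's longest-path-match precedence, so the Allow lines are placed before the blanket Disallow here. The URLs are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt modelled on the quoted file; rule order is
# adjusted because Python's parser uses first-match, not longest-match.
robots_txt = """\
User-agent: Googlebot-Image
Allow: /hosted/images/
Allow: /hosted/life/
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Allowed: matches an Allow line before the catch-all Disallow.
print(rp.can_fetch("Googlebot-Image", "http://www.google.com/hosted/images/a.jpg"))
# Blocked: only "Disallow: /" matches.
print(rp.can_fetch("Googlebot-Image", "http://www.google.com/search?q=x"))
```

This matches what the comment above observes: with a blanket `Disallow: /` for Googlebot-Image, anything not explicitly allowed (such as groups.google.com-style search pages) stays out of the image index.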
If I see this right, this part ...

---------
User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot-Image
---------

has now been changed to ...

---------
User-agent: Googlebot
Disallow: /

User-agent: Googlebot
---------
... and after yet another change, those user-agent lines are now gone from the file completely.
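Removing a bot's own user-agent group has a concrete effect: a crawler that finds no group naming it falls back to the generic `*` rules. A minimal sketch with Python's stdlib parser (the file contents and URL are illustrative, not Google's actual current file):

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, agent, url):
    """Parse a robots.txt string and test one agent/URL pair."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)

# Specific group present: it overrides the permissive * group.
with_group = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow:
"""

# Specific group removed: the bot now falls back to the * group.
without_group = """\
User-agent: *
Disallow:
"""

url = "http://www.google.com/search?q=test"
print(allowed(with_group, "Googlebot-Image", url))     # blocked by its own group
print(allowed(without_group, "Googlebot-Image", url))  # permitted via fallback to *
```

So dropping those `User-agent: Googlebot/Googlebot-Image` lines isn't cosmetic: unless the paths are blocked elsewhere in the file, the bot inherits whatever the `*` group permits.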