Google Blogoscoped


How does the Google bot actually work in scanning directories?

David T [PersonRank 7]

Thursday, October 11, 2007
15 years ago | 3,778 views

Hi there Blogoscoped readers.

I have a problem, and thought that Philipp may know, or someone else who understands how the Google spiders work...

I have a hosting package that lets me host unlimited domains on one hosting account. This means I have the following setup on my hosting space: ==> automatically loads up and hides the fact it is on ==> automatically loads up and hides the fact it is on

The problem I have is that I am worried that Google may index my content for site2 and site3 twice, i.e:

Google dislikes duplicate content and I don't want it either!

My question is how I can deal with this. Do you think inserting a line in robots.txt on disallow: / would work to disallow Google referencing the pages on or would this line also disallow Google referencing Of course, on I could tell Google in my robots.txt to go ahead indexing the whole site.

It seems this is more a question about how the Google bot works! I asked my host, HostMonster, which is one of the biggest in the US, and they said "we do not support googles functions or bots, we have this query all the time, the truth is we can not support it as google may or may not allow it, you need to go off of their rules and regulations". So basically that's an "err, I dunno" from them!

Any ideas? Thanks so much,


Philipp Lenssen [PersonRank 10]

15 years ago

Do you have a permanent redirect HTTP header set up to point from domain 1 & 3 to domain 2 or...?
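
For anyone unsure what a permanent redirect means here: it is an HTTP 301 response whose Location header points at the one URL you want indexed. Below is a minimal sketch of that decision in Python, not anything taken from David's actual setup. The host names (main.example, site2.example, site3.example), the folder names, and the function permanent_redirect are all hypothetical.

```python
from typing import Optional, Tuple

# Hypothetical names: the main account domain and the folders that
# hold the addon sites. None of these come from the thread.
MAIN_HOST = "main.example"
ADDON_FOLDERS = {
    "/site2/": "site2.example",
    "/site3/": "site3.example",
}


def permanent_redirect(host: str, path: str) -> Optional[Tuple[int, str]]:
    """Return (301, Location) when a folder URL on the main host
    duplicates an addon domain's page; return None otherwise."""
    if host.lower() != MAIN_HOST:
        return None  # request already arrived on an addon domain
    for prefix, addon_host in ADDON_FOLDERS.items():
        if path.startswith(prefix):
            # Drop the folder prefix so the path is relative to the addon root.
            return 301, f"http://{addon_host}/{path[len(prefix):]}"
    return None  # ordinary page on the main site, serve it as usual


print(permanent_redirect("main.example", "/site2/page1.html"))
print(permanent_redirect("main.example", "/index.html"))
```

With a redirect like this in place, each page would be reachable at a single address, so there would be no second copy for a crawler to pick up.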

David T [PersonRank 7]

15 years ago

Hi Philipp, thanks for getting back about this mini-nightmare. Basically, the hosting package creates an .htaccess file to redirect the extra sites, which are folders in the main site. Does that make sense?

Zim [PersonRank 10]

15 years ago

David, if you put a rule in your robots.txt to disallow the subfolder, then anydomainyou.have/ won't be read.
I think it'll work if page1.html is visible from
I suggest you investigate parked domains vs. addon domains, what you can do with them, robots.txt instructions, and finally, the almighty .htaccess.
Good luck!
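
The robots.txt idea can be tried offline before relying on it. Here is a minimal sketch using Python's standard urllib.robotparser. It assumes the addon sites live in folders named /site2/ and /site3/ under a main domain called main.example, which are made-up names. It models the standard robots.txt matching rules as Python implements them, not Googlebot itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt placed in the web root of the main domain.
# The folder names are assumptions, not taken from the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /site2/
Disallow: /site3/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

main_page = "http://main.example/index.html"
subfolder_page = "http://main.example/site2/page1.html"

# The main site's own pages stay crawlable, the folder copies do not.
print(parser.can_fetch("Googlebot", main_page))
print(parser.can_fetch("Googlebot", subfolder_page))
```

One thing to keep in mind: crawlers request robots.txt separately for each host name, so these rules only cover URLs on the host that serves this file. If an addon domain maps to its folder, its robots.txt would be the file inside that folder, which can allow everything.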

David T [PersonRank 7]

15 years ago

Thanks for the advice Zim!
