Google Blogoscoped

Forum

Del.icio.us Doesn't Want to Get Indexed

/pd [PersonRank 10]

Saturday, October 7, 2006
17 years ago • 6,219 views

"they want to shine by being the only one to search through them."

That means they themselves (YHOO) will need to violate the robots meta noindex data??

Sohil [PersonRank 10]

17 years ago #

/pd is right. I think this is either a bug or Yahoo is doing this to disallow Google from searching it on purpose for obvious reasons.

Philipp Lenssen [PersonRank 10]

17 years ago #

> That means they themselves (YHOO) will need to
> violate the robots meta noindex data??

No, I mean they own the Del.icio.us databases themselves, so they don't need to crawl the web site like outsiders do... they can create a special search engine that digs straight into the data. That way, the robots.txt would be irrelevant to them (they respect it and yet get the stuff indexed), and they could still display the results e.g. in a Yahoo onebox.

/pd [PersonRank 10]

17 years ago #

But of what benefit would that be, Philipp??

Organic web indexes would then need to be merged with the "intranet"-generated indexes for onebox results.. I just think it's too much trouble for Yahoo to differentiate search techniques.. and by doing that, they also face the issue that ALL pages within their firewall/intranet could get indexed and displayed... that's also a risk..!!

Ramibotros [PersonRank 10]

17 years ago #

I think even if Yahoo violates the robots.txt it wouldn't get them in trouble as long as they own the website..

Sohil [PersonRank 10]

17 years ago #

/pd, I think Yahoo is willing to risk it to screw with Google.

Sergi [PersonRank 0]

17 years ago #

It's cloaking.

If you use Googlebot as user-agent, you will not see:

<meta name="robots" content="noarchive,nofollow,noindex"/>
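The server-side logic behind that kind of cloaking could look something like this (purely a hypothetical sketch to illustrate the idea — the function name and exact markup are my own, not del.icio.us's actual code):

```python
# Hypothetical sketch of user-agent cloaking: every client gets the
# noindex meta tag, except one identifying itself as Googlebot.

def robots_meta_for(user_agent: str) -> str:
    """Return the robots meta tag a cloaking server might emit."""
    if "Googlebot" in user_agent:
        return ""  # Googlebot sees no indexing restriction at all
    return '<meta name="robots" content="noarchive,nofollow,noindex"/>'

print(robots_meta_for("Mozilla/5.0 (compatible; Googlebot/2.1)"))
print(robots_meta_for("Mozilla/5.0 Firefox/1.5"))
```

This is also why switching your browser's user-agent string (as mentioned below) is enough to expose the trick.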

Philipp Lenssen [PersonRank 10]

17 years ago #

> But of what benefit would that be, Philipp??

First of all, I don't believe this is what happened; it was just an example of a context in which disallowing indexing of sub-pages makes some limited sense. But the benefit in hosting data, publishing it on the web, and then disallowing others to crawl it is that you create a search monopoly on this data.

Jojo [PersonRank 1]

17 years ago #

As Sergi said: It's cloaking. That was a topic in some SEO blogs a few days ago. For example at SEOmoz: http://www.seomoz.org/blogdetail.php?ID=1431

Pip [PersonRank 8]

17 years ago #

Completely right. Try the FF User Agent Switcher, pretend to be Googlebot, and you will see they totally agree with being indexed.

BUT: they nevertheless nofollow the outbound links, of course ;)

kellan [PersonRank 0]

17 years ago #

del.icio.us has done this since day one (3 years ago), long before they were acquired by Yahoo, and they've talked in public about why.

the point is to de-motivate spammers from pumping useless links into the database in order to inflate their page rank.

news this ain't.

Matt Inman [PersonRank 0]

17 years ago #

Hi, an even easier way to check what Googlebot sees is to check the cached snapshot of the del.icio.us site:
http://www.google.com/search?q=cache:EDe3-uxq45oJ:del.icio.us/+delicious&hl=en&gl=us&ct=clnk&cd=1&client=firefox-a

Philipp Lenssen [PersonRank 10]

17 years ago #

Matt, good idea, albeit it doesn't take into account very recent changes, as the snapshot is hours to days old.

Tadeusz Szewczyk [PersonRank 10]

17 years ago #

Anyways, whatever the reasons and the motivation, such steps kill the idea of the internet and of freedom of speech itself.
It is a symptom we can see in society in general. Freedoms that are abused by a few are taken away from all of us. I wonder why sites or pages that are bookmarked by hundreds or thousands of people shouldn't get some "link juice" accordingly. I can't believe that a service like del.icio.us is so easy to manipulate that you have to cripple the whole platform.

Philipp Lenssen [PersonRank 10]

17 years ago #

I don't mind much that Del.icio.us is doing this, but the same basic issue convinced the German Wikipedia to nofollow all outgoing links, and I think that's kinda nasty – they get millions of backlinks yet they don't send normal links to other sites? The spam issue is real and I'm sure a nofollow helps, but they could do it like it's done in this forum, for example: if the link isn't trashed after a couple of days, the nofollow is removed.
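That delayed-nofollow policy could be implemented roughly like this (a minimal sketch; the names and the two-day grace period are my assumptions, not this forum's actual code):

```python
# Sketch of a delayed-nofollow policy: outbound links carry
# rel="nofollow" only while they are young enough that moderators
# may still trash them as spam.

from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(days=2)  # assumed "couple of days" window

def render_link(url: str, posted_at: datetime, trashed: bool,
                now: datetime) -> str:
    """Render an outbound link with or without rel="nofollow"."""
    if trashed:
        return ""  # trashed (spam) links are not rendered at all
    if now - posted_at < GRACE_PERIOD:
        return f'<a href="{url}" rel="nofollow">{url}</a>'
    return f'<a href="{url}">{url}</a>'
```

Spammers gain nothing from links that are nofollowed while fresh and deleted before the grace period ends, while legitimate links eventually pass normal link value.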
