To save me from having to egogoogle myself all the time, I've had Google Alerts set up for ["tony ruscoe"] for ages. However, recently I've noticed something strange going on with Google's search results, Digg.com and, in particular, my name.
Searching for [site:digg.com "tony ruscoe"] on Google and Google Blog Search returns quite a few pages:
http://www.google.com/search?q=site:digg.com+%22tony+ruscoe%22
http://www.google.com/blogsearch?q=site:digg.com+%22tony+ruscoe%22
Some of the Web results returned are correct, since I've had my name mentioned in some Digg articles (although not for a while). But more recently, pages like this one have been popping into my Google Alerts for Web Search and my name doesn't appear in the content at all:
http://digg.com/political_opinion/House_Speaker_Pelosi_wants_to_hear_your_opinion_on_Bush_Cheney_impeachment
(If you view the cached copy, it just says "These terms only appear in links pointing to this page" although I don't believe that's the case.)
What's even weirder is that most of the Blog Search results don't mention my name anywhere. And unlike Google Web Search, I didn't think they were indexed based on inbound link text. So why does Google think my name appears in those items?
It also seems to be the case for other names like ["philipp lenssen"]:
http://www.google.com/search?q=site%3Adigg.com+%22philipp+lenssen%22
http://www.google.com/blogsearch?q=site%3Adigg.com+%22philipp+lenssen%22
Although, this isn't the case for ["test name"]:
http://www.google.com/blogsearch?q=site%3Adigg.com+%22test+name%22
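For anyone who wants to try the same pair of searches with a different name, here is a minimal Python sketch that builds the query URLs in the same form as the ones above. The function name `search_urls` is made up for illustration; the only library call used is the standard `urllib.parse.urlencode`.

```python
from urllib.parse import urlencode


def search_urls(site, name):
    """Build the Web Search and Blog Search URLs for [site:SITE "NAME"].

    search_urls is a hypothetical helper, not part of any Google API.
    """
    # The query as typed into the search box: a site: restriction plus
    # the name as an exact phrase in double quotes.
    query = f'site:{site} "{name}"'
    # urlencode escapes ':' as %3A, '"' as %22 and spaces as '+'.
    params = urlencode({"q": query})
    return (
        f"http://www.google.com/search?{params}",
        f"http://www.google.com/blogsearch?{params}",
    )


web_url, blog_url = search_urls("digg.com", "philipp lenssen")
print(web_url)
print(blog_url)
```

Swapping in another name or site only changes the two arguments; the quoting around the name is what makes it a phrase search.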
Can anyone explain why this might be? Does it happen when you search for your name?
And can anyone find any other websites it does this for? If not, could it be that Digg is doing something dodgy...? |
> And unlike Google Web Search, I didn't think they were indexed based on inbound link text.
Wouldn't think so either, but that almost seems like a possibility now. I'll send an email to alerts-feedbackgoogle.com to point them here... |
Tony – Did you digg any of the articles that are being returned in your blog search results? |
No. I didn't Digg any of those articles. And even if I did, I thought Google Blog Search indexed only what was in the feed, which wouldn't include the names of anyone who Dugg the articles.
<< Blog Search indexes blogs by their site feeds, which will be checked frequently for new content. >>
http://www.google.com/help/about_blogsearch.html#howworks |
Also, the Google Cache of the first result at the time I checked it outputs the usual "these keywords are only found in pages linking to this page..." etc. So Tony's name wasn't on the Digg page itself, according to Google... |
I think I may have found a clue...
www.mansbags.com
Lots and lots and lots of Digg articles are linked to from there, presumably automated, each saying "Original post by Philipp Lenssen" or occasionally "Original post by Tony Ruscoe" – which I guess is why it only seemed to work for our names. This isn't true though, so I can only assume there's a bug in the software being used to generate that "splog".
And here's what I believe is proof that Google Blog Search doesn't *just* index feeds based on the content of the feeds, but also uses the link text from other blogs linking to the feed items.
[site:mansbags.com "tony ruscoe"] http://www.google.com/blogsearch?q=site%3Amansbags.com+%22tony+ruscoe%22
Which returns links to pretty much the same Digg articles as:
[site:digg.com "tony ruscoe"] http://www.google.com/blogsearch?q=site%3Adigg.com+%22tony+ruscoe%22 |
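To make the hypothesis above concrete, here is a toy Python sketch of the difference between indexing a page by its own content only and also indexing it by the text of links pointing at it. This is purely illustrative and not how Google actually works: the page data, the `.example` hostnames, and the function names `build_index` and `search` are all made up, and it assumes the "Original post by" credit is used as the link text.

```python
from collections import defaultdict

# Hypothetical data: a story page that never mentions the name, and a
# scraper page that links to it with a wrong "Original post by" credit.
PAGES = {
    "digg.example/story": {
        "content": "House Speaker wants to hear your opinion",
        "links": [],
    },
    "splog.example/post": {
        "content": "Scraped story. Original post by Tony Ruscoe",
        "links": [("digg.example/story", "Original post by Tony Ruscoe")],
    },
}


def build_index(pages, use_anchor_text):
    """Collect the searchable text for each URL.

    With use_anchor_text=True, the text of every inbound link is also
    credited to the page being linked to.
    """
    indexed = defaultdict(list)
    for url, page in pages.items():
        indexed[url].append(page["content"].lower())
    if use_anchor_text:
        for page in pages.values():
            for target, anchor in page["links"]:
                indexed[target].append(anchor.lower())
    return indexed


def search(indexed, phrase):
    """Return the URLs whose indexed text contains the exact phrase."""
    phrase = phrase.lower()
    return sorted(
        url for url, texts in indexed.items()
        if any(phrase in text for text in texts)
    )


content_only = search(build_index(PAGES, False), "tony ruscoe")
with_anchors = search(build_index(PAGES, True), "tony ruscoe")
print(content_only)
print(with_anchors)
```

With content-only indexing, just the scraper page matches the name. Once inbound link text is counted, the story page matches too, even though the name appears nowhere in it, which is the pattern the two searches above seem to show.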
Tony – I see what you mean. I did a search for myself and received several posts on mansbags.com which have my name but it's on a page I have nothing to do with.
http://www.google.com/blogsearch?hl=en&ie=UTF-8&q=colin+colehour+blogurl:http://www.mansbags.com/&filter=0&sa=N
Could this be considered spam in Google Blog Search? |
Colin, I suspect that's because you've made a post to Google Blogoscoped too.
<< Could this be considered spam in Google Blog Search? >>
Perhaps. It's duplicated content but that's not technically spam.
What's more interesting to me is that Google Blog Search appears to be using link juice from other blogs to index feeds. Perhaps this is an entirely new area for SEO enthusiasts to investigate and abuse... |
The site scrapes RSS content from Digg and he has profiled Philipp as the target entity. That's why you can at times see the relationship between the two!! |