New York Times and Cloaking - Google Blogoscoped Forum

Seth Finkelstein	Monday, June 19, 2006 18 years ago • 8,995 views
How do we know that the New York Times didn't simply give Google a Times Select subscription?
/pd	18 years ago #
==>> "give Google a Times Select subscription? " And how would this benefit Google or the Times? If this subscription is used for the news section it would make sense, but then the NYT has the paid content section, which most users normally don't get to. I, for one, think the NYT would be smarter than to use cloaking as an SEO strategy. However, on the other hand, this could also mean a massive disinformation ploy, i.e. send this to Googlebot to index, but actually display that to the user. After all, all MSM have to be pro this or pro that.. :)-
/pd	18 years ago #
There's chatter happening in this forum too: http://www.highrankings.com/forum/index.php?showtopic=23234
Philipp Lenssen	18 years ago #
> How do we know that the New York Times > didn't simply give Google a Times Select subscription? Wouldn't that be a double standard if Google accepts such a subscription? I mean Google punishes smaller sites for cloaking (rightfully) but then larger sites could simply get away with it. Also, if Google manually accepts subscriptions this would seem like an indirect manual editing of search results, too, which Google claims to avoid. The point remains: if the NYT shows something else to Google (via a subscription or other means) it's against Google's cloaking guidelines: "Don't ... present different content to search engines than you display to users" http://www.google.com/support/webmasters/bin/answer.py?answer=35769
/pd	18 years ago #
Philipp, have you tried the same with Google News? The WSJ clearly marks the clip as "subscription required".
Utills	18 years ago #
I'm not sure about the specifics of the NYT but I do know that quite a few news sites employ a policy whereby they offer the article free for a few days and then make it subscription only. In such cases perhaps Google visited the link when the article was free and indexed the content of the article. However, in the meantime the article has become subscription only and thus cannot be read by a normal user.
mrbene	18 years ago #
If this is being done well, it's probably not only user agent verification, but also IP verification (either by netblock, based on ARIN registration to Google, or by an IP list based on RDNS). However, I personally don't think this is some intentionally nefarious scheme: the NYT articles for today are publicly accessible, it's only the archives that require login/registration. And, since the NYT is a news site, the crawling of new content is done on the ASAP model, which means the crawler gets the same unfettered access as the rest of us on the day of the crawl, and then just doesn't check back. 'Cause as you know, if you don't see it, it never happened :)
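A minimal sketch of the two checks mentioned above: a user agent test followed by a reverse DNS lookup that is confirmed with a forward lookup. The hostname suffixes and the function names are assumptions for illustration, not a description of what the NYT actually runs.

```python
import socket

# Assumed hostname suffixes for Google's crawlers (illustrative only).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def claims_googlebot(user_agent):
    """Cheap first check: does the User-Agent header mention Googlebot?"""
    return "googlebot" in user_agent.lower()


def verify_crawler_ip(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the hostname suffix, then confirm
    that the hostname resolves back to the same IP."""
    # The lookups are injectable so the logic can be tried without a network.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # a failed lookup means the visitor is not verified.
        return False
```

The user agent alone is trivial to fake, which is why the DNS round trip matters: a visitor only counts as the crawler when both `claims_googlebot(ua)` and `verify_crawler_ip(ip)` hold.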
Philipp Lenssen	18 years ago #
Good point Utills. Pd, Google News, IMO, is a different game. It's clearly a moderated site (manually picked sources) to start with, and I don't think they have any cloaking guidelines for Google News. Yeah, we often see "subscription" there, a disclaimer attached right to the Google snippet...
Elias KAI	18 years ago #
This problem occurs with most online journals. The question is: was it done for the purpose of getting more subscribers?
Brian M.	18 years ago #
http://blogoscoped.com/forum/12652.html Cloaking is the way Google Scholar gets most of its content, and Google has to coordinate with those content providers in order to get an account.
Kirby Witmer	18 years ago #
http://digg.com/technology/New_York_Times_and_Cloaking
Pete Warden	18 years ago #
This was one of the things that was driving me crazy with Google's results recently. I kept having to go through cycles of loading the pages, and then seeing if they actually contained the right terms, especially with news sites. For example, a search for petewarden throws up fifth and sixth results on SourceForge that I've never been able to even access. I actually ended up writing an applet to check Google's results for me, to try and save myself some time. If anyone else wants to look, I've put it up at http://www.petewarden.com/search/ It seems like the only real solution to this is going to be some kind of automatic client-side checking, though that'll cost bandwidth on the servers that are being checked...
Ryan	18 years ago #
It IS cloaking, but it's NOT something Google is going to ban anybody for. The NYT is too crucial a site to be removed from Google's index. Removing it would actually make Google less useful to the customer. Google is giving preferential treatment to the site quite simply because a large number of people expect to see NYT results in their Google searches. If they don't, they'll go to other search engines. There's a line between penalizing the bad guys and penalizing your users too.
Ilya Kniazeu	18 years ago #
It's a longstanding practice, and not only for the NY Times, I would guess. Google News bots have access to articles that require a subscription, and something similar is going on here. Google would not refuse additional valuable content even if an ordinary user cannot see it. And I don't believe this is a case of an article being freely available for some time and then moving into the subscription archive. I am sure Google's bots revisit NY Times articles often enough to update the index immediately after an article is gone.
mika	18 years ago #
Come on! You want nice search results, so Google gets them for you! Why are you complaining about the fact Google adds USEFUL data to their index?! Cloaking is forbidden for spam pages and other harmful uses. But as long as it helps Google to get/gain Users and they cooperate with the NYTimes they're lucky to have the content while Yahoo doesn't. Maybe they don't SEO for Yahoo, but it could be a special agreement with Google. And I think it IS useful to get a link to a nice NYTimes article, even if I have to pay for it.
Philipp Lenssen	18 years ago #
> You want nice search results, so Google gets them for you! Why are you complaining about the fact Google adds USEFUL data to their index?!
The exact same arguments are used by every cloaker in defense: "We're just trying to help the user by letting good stuff be indexed." IMO, the user is annoyed when she enters a search string and the result page says "please subscribe to actually see this". I certainly hate it, even on Google News. I think they should exclude sub-only sources there as well, or make them reachable via a certain interface option only. At least Google News is explicitly moderated (hand picked sources), whereas Google claims web search is not. As for Google Scholar, well, if that's what people expect there it's fine; I can't tell for myself as I don't use it. On web search, everyone should be equal, be it the New York Times, a small website, or BMW.de (which also got banned for cloaking). If Google tells webmasters they're against cloaking, then that should include all websites. But before we go any further with this, I'd first like to see if Google is indeed giving the NYT any special rights. Apart from what the NYT SEO suggested, I don't see any evidence for this. Cloaking on BMW.de also went unpunished for years before they got penalized. Let's wait and see.
/pd	18 years ago #
"Cloaking is forbidden for spam pages and other harmful uses. But as long as it helps Google to get/gain Users and they cooperate with " Yeah, so let's all cloak some stuff for the .cn SERP then, correct? :)- Will the site get banned or not? You can't toss a coin and expect it to sit on the fence. That's exactly what Google is doing: playing both sides when it suits their convenience!! Now we wait and see, eh? "While Google was burning, Larry was fiddling"? Read like: while Rome was burning, Nero was fiddling :)_
Elias KAI	18 years ago #
Philipp: "the user is annoyed when she enters a search string and the result page says "please subscribe to actually see this". I certainly hate it, even on Google News. I think they should exclude sub-only sources there as well" I totally agree, and I would add Google Scholar to Google News. You do some searching and you end up wasting 10 minutes subscribing or paying in order to get your articles, WHICH can be found for FREE in some other search engines with old data stored. Google should take this seriously and be cautious about these actions. You never know when Google might begin losing its actual added value, which is being a freely accessible and most relevant information resource.
/pd	18 years ago #
OK, does Google not have cached pages for the query that Philipp demoed? I want to visually verify the cached pages, but I don't see a link for that. Why is this?
jkb1	18 years ago #
A lot of people seem misinformed about what Times Select is. Times Select is a portion of the website that is subscription-only. It is NOT the same thing as their archives. There are certain articles that might appear in the print edition of the NYT today and that are on the NYT website today as well. However, these certain articles may be in the Times Select section, meaning that to read them online on the very day they came out, you must subscribe to Times Select. It is somewhat like a "premium" section of the website, but there is really nothing "premium" about it... simply a way to make people pay up rather than cancelling their print subscription and reading the entire paper online for free. Here is the question: don't you think Google SHOULD display these results in their search engine? Isn't their goal to make information readily available and indexed? Imagine you read an article by Friedman in yesterday's NYT at home over breakfast. Today at work you want to show it to your colleague... so you google for Friedman and some key words from the article. Would you not want those results to show the correct article? The problem is that when you click on it you discover it is under the Times Select portion of the site. The only mistake is that there should be a note that it is a subscription-only part of the NYT website. I just don't think it's fair to call it cloaking when Google is just trying to index the day's newspaper articles and the NYT is feeding it the CORRECT information; you're just getting redirected to what is essentially a sign-in page for the premium section of the website. If you were to sign in and see the Times Select article, then I would expect the exact language found in the Google search to be within the article. What you really need is a debate about how Google should handle subscription-only content. Should they just not index it? Should there be a different search method? Should it be clearly marked in the results?
Philipp Lenssen	18 years ago #
At least the NYT page that is being referred to has the following: <meta http-equiv="Pragma" content="no-cache"> <meta name="robots" content="noarchive">
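To check any page for the same directives, a rough sketch along these lines would do. It reads the `robots` meta tag out of raw HTML with Python's standard `html.parser`; the class and function names are made up for illustration. A `noarchive` directive is the usual reason a result shows no cached link.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the comma-separated directives of <meta name="robots">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        for token in (attr.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page_head = ('<meta http-equiv="Pragma" content="no-cache"> '
             '<meta name="robots" content="noarchive">')
print("noarchive" in robots_directives(page_head))
```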
/pd	18 years ago #
OK, then there's no cache.. but then there's the NYT's strategy too of posting online editions of news when it serves their own purpose.. like that of TW :)- http://www.nypost.com/news/regionalnews/im_doing_time__sex_babe_regionalnews_angela_montefinise_and_bill_sanderson.htm
/pd	18 years ago #
grrrrrrrrrr.. mea culpa.. Disregard the above URL, please.
Tadeusz Szewczyk	18 years ago #
Keep calm! Google certainly uses the BugMeNot extension for Firefox: http://roachfiend.com/archives/2005/02/07/bugmenot/ to see what regular subscribers see!
mrbene	18 years ago #
Tadeusz, BugMeNot doesn't get you into Times Select, just the rest of it. However, Google seems to be getting into Times Select!
maksee	18 years ago #
It was interesting to read all these comments. I have an idea (for Google Labs maybe). I assume it's no big problem for an automated program to introduce itself as a real-world agent rather than a search bot. So Google might add something like double loading. The first and main fetch for indexing comes with the general bot headers, but as a last step the same page might be checked with a "conventional" agent. If the results differ significantly, the bot marks the page record with a special flag that ends up as a special icon on the results page (some warning about possible cloaking). I know that this second agent would become widely known after some time, and there would be cloaking against it as well, so I myself am not sure whether it would be really useful.
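The double loading idea could be sketched roughly as follows, assuming a plain word-level similarity score is good enough to flag a page. The user agent strings, the 0.5 threshold and all function names are placeholders, not anything Google is known to use.

```python
import difflib
import urllib.request

# Placeholder user agent strings for the two fetches.
BOT_UA = "Googlebot/2.1 (+http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def fetch(url, user_agent):
    """Download a page while presenting the given User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def similarity(a, b):
    """Word-level similarity ratio between two documents, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def looks_cloaked(url, threshold=0.5, fetcher=fetch):
    """Load the page once as a bot and once as a browser, then flag it
    when the two versions differ more than the threshold allows."""
    as_bot = fetcher(url, BOT_UA)
    as_user = fetcher(url, BROWSER_UA)
    return similarity(as_bot, as_user) < threshold
```

As the post itself notes, a site that also checks IP addresses would see through the second fetch unless it came from an address not known to belong to the crawler, so this only catches cloaking keyed on the user agent.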
Philipp Lenssen	18 years ago #
Don't they do these background checks already? :)
