Google Blogoscoped

Monday, October 20, 2008

Google Now Allows Sites to Serve Content to Them While Showing a Registration Box to Non-Google Users

There once was a time when Google search tried to be a neutral bystander, watching the web without getting too actively involved. There once was a time when Google instructed webmasters to serve Googlebot the same thing served to a site’s human users. Now, Google is officially telling webmasters they can serve one thing to people coming from Google web search, and another thing to people coming from elsewhere. Think of it as Google inviting publishers to hand Google a special key to the publisher’s content. Google calls this “first click free” and says it does this in order “to help users find and access content that may require registration or a subscription”, to “include highly relevant content in Google’s search index” and “to provide a promotion and discovery opportunity for publishers with restricted content.”

What does this mean, exactly? It means that a user coming from a Google search result may see the content, but when the user arrives at the same page from elsewhere – like from a link in a news article – they might see a registration or payment box and no actual article or other content. Google explains how this can be implemented:

To include your restricted content in Google’s search index, our crawler needs to be able to access that content on your site. Keep in mind that Googlebot cannot access pages behind registration or login forms. You need to configure your website to serve the full text of each document when the request is identified as coming from Googlebot via the user-agent and IP-address. (...)

When users click a Google search result to access your content, your web server will need to check the “Referer” HTTP request-header field. When the referring URL is on a Google domain, like www.google.com or www.google.de, your site will need to display the full text version of the page instead of the protected version of the page that is otherwise shown.
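Put together, the two quoted requirements boil down to a small server-side check. Here is a minimal, hypothetical sketch in Python (the function name and domain list are made up for illustration; a real implementation would also verify Googlebot’s IP address and cover all country-specific Google domains):

from urllib.parse import urlparse

# Hypothetical allow-list of Google referrer hosts; a real setup would
# need to cover all country domains (www.google.de, www.google.fr, ...).
GOOGLE_HOSTS = ("www.google.com", "www.google.de")

def serve_full_text(user_agent: str, referer: str) -> bool:
    """Return True when the full article should be shown instead of the
    registration or payment box."""
    # Case 1: the request comes from the crawler itself. Google's
    # guidelines additionally ask publishers to verify the crawler's IP.
    if "Googlebot" in user_agent:
        return True
    # Case 2: the visitor clicked through from a Google search result,
    # identified by the "Referer" request header.
    if referer:
        host = urlparse(referer).netloc.lower()
        if host in GOOGLE_HOSTS or host.endswith(".google.com"):
            return True
    # Everyone else gets the restricted version of the page.
    return False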

Isn’t that “cloaking,” which Google previously penalized websites for? Google would argue it’s not, because you’re still serving the same page to Googlebot as you do to Google users. For reference, Google’s existing guidelines say that “Cloaking refers to the practice of presenting different content or URLs to users and search engines. Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index.” Note how the guidelines do not mention “Google users” specifically, but just “users”. Google continues to say they frown on practices “that provide content solely for the benefit of search engines”, warning that such sites may be removed from their index.

When a can of worms is opened, you never know the exact shape, quantity or destinations of the creatures crawling out – you just know it’s going to be somewhat slimy. For this particular can, the lid began to open when Google allowed the “first click free” policy in Google News, and now it’s officially off for web results as well. Here are some of the potential things this policy could cause, good and bad:

If not enough people care to implement FCF, it could also become one of those occasional sightings in Google results without any big implications for the web or for Google. And there’s also the possibility of a neutralizing, self-healing effect kicking in to suppress negative side effects of “first click free” (what does a global brain do with non-transmitting neurons?). When a site aims to promote its content by showing different things to different user agents, people may share its links less, lowering the site’s Google PageRank, which worsens its ranking, which finally decreases its traffic. Blogs and other sites might even try to maintain a blacklist of sites known to hide content from non-Google users. Say, if Example.com was found to be one such site, then the following link...

http://example.com/foo

... could, on publication of an article, be automatically transformed into a (perhaps nofollowed) “feeling lucky” redirect like this:

http://google.com/search?q=http%3A%2F%2Fexample.com%2Ffoo&btnI=1
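A blog engine could apply such a rewrite automatically. Here is a hypothetical Python sketch (the blacklist contents and helper name are made up for illustration):

from urllib.parse import quote, urlparse

# Hypothetical blacklist of sites known to hide content from non-Google visitors.
FCF_SITES = {"example.com"}

def rewrite_link(url: str) -> str:
    """Turn links to blacklisted sites into a Google "I'm Feeling Lucky"
    redirect (btnI=1), so that readers arrive with a Google referrer and
    see the full content; all other links pass through unchanged."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host in FCF_SITES:
        return "http://google.com/search?q=" + quote(url, safe="") + "&btnI=1"
    return url

# rewrite_link("http://example.com/foo") returns
# "http://google.com/search?q=http%3A%2F%2Fexample.com%2Ffoo&btnI=1"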

There once was a time when Google tried to offer a neutral view of the web without too much distortion. But for better or worse, times have changed a bit in this regard, and Google now offers webmasters a chance to treat Googlebot, along with Google users, as VIPs. Google’s organic results thus become not just any view of the web, but a special one. You may prefer this view – when using Google you’re being treated as a VIP, after all! – or dislike it. And it might force you to rely on Google even more than before if some publishers start creating one free website for Google users, and a restricted one for second-class web citizens.

Efforts to split the web into vendor-specific zones aren’t new, though the technologies and specific approaches involved vary greatly. In the 1990s, “Best Viewed with Netscape” or “Optimized for Internet Explorer” style buttons sprang up, and browser makers were working hard to deliver their users a “special” web with proprietary tags and more. Many of us strongly disliked such initiatives because they felt too much like lock-in: the web seems to fare better when it builds on cross-vendor standards rather than being optimized for this or that tool or (partly self-interested) corporation.

[Thanks WebSonic.nl! Illustration not by Google.]
