Google Blogoscoped

Reactions on Discontinuation of Google API

Seth Finkelstein [PersonRank 10]

Wednesday, December 20, 2006
17 years ago, 7,977 views

Here's another effort:

Cracking Google AJAX Search API

   Written by Matthew Wilkinson
   Monday 18 December 2006 20:20:09

   http://mattwilko.com/content/Cracking_Google_AJAX_Search_API

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

That's against the TOS.

Caydel [PersonRank 1]

17 years ago #

Interesting URL, Seth. I will be keeping an eye on that. I hope he posts some more today...

Philipp Lenssen [PersonRank 10]

17 years ago #

What is the legal situation on web ToS's anyway?

Do you have to agree to a site's ToS before you automatically pull content from its server? Wouldn't that mean the Googlebot automatically "agrees" to a site's ToS because it pulls content from their servers? Even when that ToS is (hypothetically speaking) idiotic?

Or do automated approaches only need to respect the server's robots.txt? For example, Google via their robots.txt disallows pulling of its SERPs. Is disrespecting a robots.txt illegal under US law?

(I won't even ask about the legal situation of e.g. a *German* server pulling *US* content.)
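
As an illustration of the robots.txt point above, here is a minimal Python sketch of how an automated client could check a rule before fetching a page. The rules and the bot name `ExampleBot` are assumptions made up for this example; they are not copied from Google's actual robots.txt, and the sketch says nothing about the legal question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only (not Google's real robots.txt).
EXAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /search",
]


def is_allowed(rules, user_agent, url):
    """Return True if the given robots.txt lines permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(rules)  # parses in-memory lines; no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A results-style path falls under the hypothetical Disallow rule.
    print(is_allowed(EXAMPLE_RULES, "ExampleBot", "http://www.google.com/search?q=test"))
    # Any other path is not covered by the rule.
    print(is_allowed(EXAMPLE_RULES, "ExampleBot", "http://www.google.com/about.html"))
```

Whether a client performs this check is up to its author, which is exactly the open question in the post above.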

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

You can only pull search results from AJAX Search API through the provided interface. You're not allowed to reorder search results, add something in-between and mostly you're not allowed to do anything.

You must accept the TOS to get a key, Philipp.

Andrew [PersonRank 0]

17 years ago #

It would be pretty trivial to write a proxy script in PHP or whatever language to access their AJAX API. These URLs return results in JSON format and are used as the src for <script> tags appended to the document <head> tag when using the AJAX API interface:

- Web: http://www.google.com/uds/GwebSearch?callback=GwebSearch.RawCompletion&context=0&lstkp=0&rsz=small&hl=en&sig=8656f49c146c5220e273d16b4b6978b2&q=Google%20AJAX%20Search%20API&key=internal-documentation&v=1.0
- Video: http://www.google.com/uds/GvideoSearch?callback=GvideoSearch.RawCompletion&context=0&lstkp=0&rsz=small&hl=en&sig=8656f49c146c5220e273d16b4b6978b2&q=Google%20AJAX%20Search%20API&key=internal-documentation&v=1.0
- Blog: http://www.google.com/uds/GblogSearch?callback=GblogSearch.RawCompletion&context=0&lstkp=0&rsz=small&hl=en&sig=8656f49c146c5220e273d16b4b6978b2&q=Google%20AJAX%20Search%20API&key=internal-documentation&v=1.0
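
The three URLs above differ only in the searcher name, so their shape can be sketched in a few lines of Python (used here instead of PHP). This is an assumption-laden illustration: the function name `build_search_url` is hypothetical, the parameter names and values are copied from the URLs quoted in this post, and their meaning is not documented anywhere in this thread. The sketch only builds a string and sends no request.

```python
from urllib.parse import urlencode, quote

BASE = "http://www.google.com/uds/"


def build_search_url(searcher, query, key, sig):
    """Build a URL shaped like the ones quoted in the thread.

    searcher is one of the names seen above, e.g. "GwebSearch",
    "GvideoSearch" or "GblogSearch" (hypothetical helper, not a Google API).
    """
    params = [
        ("callback", searcher + ".RawCompletion"),
        ("context", "0"),
        ("lstkp", "0"),
        ("rsz", "small"),
        ("hl", "en"),
        ("sig", sig),
        ("q", query),
        ("key", key),
        ("v", "1.0"),
    ]
    # quote_via=quote encodes spaces as %20, matching the quoted URLs.
    return BASE + searcher + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(build_search_url(
        "GwebSearch",
        "Google AJAX Search API",
        key="internal-documentation",
        sig="8656f49c146c5220e273d16b4b6978b2",
    ))
```

A proxy script would fetch such a URL server-side and strip the callback wrapper from the response; the exact response format is not shown in this thread, so it is left out here.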

Philipp Lenssen [PersonRank 10]

17 years ago #

> You must accept the TOS to get a key, Philipp.

OK, but what about the normal web search results (the HTML)?
I am really curious about the legal situation for those.

Tyler [PersonRank 0]

17 years ago #

Andrew,

The problem with writing a proxy for the AJAX API is that it doesn't return nearly as much information as the SOAP API did. Currently, you can't even ask for a specific number of results.

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

The API is limited to allowing You to host and display Google Search Results on your site, and does not provide You with the ability to access other underlying Google Services or data.

You agree that you will not, and you will not permit your users or other third parties to: (a) modify or replace the text, images, or other content of the Google Search Results, including by (i) changing the order in which the Google Search Results appear, (ii) intermixing Search Results from sources other than Google, or (iii) intermixing other content such that it appears to be part of the Google Search Results; or (b) modify, replace or otherwise disable the functioning of links to Google or third party websites provided in the Google Search Results.

Can I scrape the search results from the Google AJAX Search API if the API doesn't meet my needs?
Sorry, but no; the AJAX Search API is the only permissible way to publish Google AJAX Search API results on your site. We'll block your application if it accesses search results outside of the API.

http://code.google.com/apis/ajaxsearch/terms.html
http://code.google.com/apis/ajaxsearch/faq.html

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

<< No Automated Querying

You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that "sending automated queries" includes, among other things:

   * using any software which sends queries to Google to determine how a website or webpage "ranks" on Google for various queries;
   * "meta-searching" Google; and
   * performing "offline" searches on Google. >>

http://www.google.com/terms_of_service.html

Philipp Lenssen [PersonRank 10]

17 years ago #

Andrew, it's also trivial to write a PHP script for their HTML serps... especially with PHP5. If you just want the top URL you only need a couple of lines. Or take a look at this:
http://blogoscoped.com/archive/2004_06_24_index.html#108809520320587461
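
To give a rough idea of the "couple of lines" claim, here is a sketch in Python rather than PHP that pulls the first absolute link out of an HTML string. The sample markup and the names `FirstLinkParser` and `first_external_link` are invented for this example; the markup is not Google's actual result HTML, which would need its own selection logic.

```python
from html.parser import HTMLParser


class FirstLinkParser(HTMLParser):
    """Remembers the first <a href> that points to an absolute http(s) URL."""

    def __init__(self):
        super().__init__()
        self.first_url = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.first_url is None:
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.first_url = href


def first_external_link(html):
    """Return the first absolute link in html, or None if there is none."""
    parser = FirstLinkParser()
    parser.feed(html)
    return parser.first_url


# Made-up markup standing in for a downloaded page.
SAMPLE_HTML = (
    '<div><a href="/preferences">Settings</a>'
    '<a href="http://example.com/">Example</a>'
    '<a href="http://example.org/">Other</a></div>'
)

if __name__ == "__main__":
    print(first_external_link(SAMPLE_HTML))
```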

Philipp Lenssen [PersonRank 10]

17 years ago #

> No Automated Querying

Yes Ionut, that's their ToS.... but my question is not what Google's ToS reads but whether or not ignoring a ToS is illegal by definition (in particular when you use automated tools).

Andrew [PersonRank 0]

17 years ago #

Yeah, but saying JSON makes me sound so much cooler, doesn't it? :)

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

I don't think it's illegal, but don't be surprised if Google decides to block access to their services.

Philipp Lenssen [PersonRank 10]

17 years ago #

Yes, absolutely, that's one of the main risks of screenscraping. I have some "information" though that you can actually do a heckuva lot of screenscraping of Google search results before they block your IP :)
But yes, that definitely happens if you poll too many pages in too short a time...

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

You know, Google encourages screenscraping. Look at Google Gadgets API.

http://www.google.com/apis/gadgets/remote-content.html#Fetch_text

Philipp Lenssen [PersonRank 10]

17 years ago #

Excellent point.

Matthew Wilkinson [PersonRank 1]

17 years ago #

Hi, I'm Matthew, the guy who wrote the script that's at the top of the page. I read the Google AJAX Search API terms and conditions extensively, and my script doesn't technically break them; at least, it doesn't break them any more than Google's own JavaScript code breaks them. By the way, Caydel, I'm still writing that article and I'll hopefully be putting more up in the next few days.

Matthew Wilkinson [PersonRank 1]

17 years ago #

Also, @Ionut, the clause you posted from the ToS doesn't apply here, as the script does not technically "modify" the "Google Search Results", nor does it intermix them or change the order. Also, the Google AJAX Search API itself is an "automated query", so the normal ToS cannot apply here.

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

Keyphrase:
<<the AJAX Search API is the only permissible way to publish Google AJAX Search API results on your site>>

API = a bunch of functions and classes

Matthew Wilkinson [PersonRank 1]

17 years ago #

Yes, but that's in the FAQ, not the ToS, and the FAQ is not a legal document. Besides, the script could be turned into functions and classes.

Reto Meier [PersonRank 10]

17 years ago #

Not to weigh in on the argument, but an API is an Application Programming Interface. It's a definition of how to interact with their service.

While that's often 'a bunch of functions and classes', it can be a lot broader than that.

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

No, I didn't give a definition of an API. I should've said AJAX Search API = a bunch of functions and classes provided by Google.

Of course you can do whatever you want. I was just trying to say that the AJAX thingie is much more limited than the SOAP one, at least if we play by Google's rules.

Notice they reworded the text from the SOAP page:

<<Depending on your application, the AJAX Search API may be a better choice for you instead. It tends to be better suited for search-based web applications and supports additional features like Video, News, Maps, and Blog search results.>>

http://code.google.com/apis/soapsearch/

Reto Meier [PersonRank 10]

17 years ago #

Ionut, you're absolutely right, there's no comparison between the AJAX API and the SOAP API.

The AJAX one is great / better if you want a custom Google search box / results on your web page, but useless if you want to *do something* with the search results. I'm personally surprised they would even suggest using the AJAX tool as a replacement when it so clearly isn't.

I'd personally been hoping they'd expand the SOAP API to better handle onebox results and 'special' queries like the music / book / movie times results – killing it seems short-sighted, I just hope they replace it (soon!) with a gData feed like they offer for Code search.

Tony Ruscoe [PersonRank 10]

17 years ago #

<< Depending on your application, the AJAX Search API may be a better choice for you instead. >>

How can they say something like that without offering a real alternative? What if I'm a developer without a SOAP API key and AJAX Search API isn't a better choice for my application?

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

I can give you a key and the SDK.

(It's interesting to note that the SDK is 666 KB.)

Tony Ruscoe [PersonRank 10]

17 years ago #

Thanks, I have one already – it was just a hypothetical question. ;-)

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

You can still get keys. Just log out and visit this page:
http://api.google.com/createkey

Here's a link to the SDK:
http://debian.fmi.uni-sofia.bg/~hiena/googleapi.zip

Philipp Lenssen [PersonRank 10]

17 years ago #

The official response is in, but there's nothing new:

http://google-code-updates.blogspot.com/2006/12/beyond-soap-search-api.html

(Google says "you can continue to execute queries" but I can't; it's too flaky.)

Tadeusz Szewczyk [PersonRank 10]

17 years ago #

What does this move tell us about Google? You shouldn't set up a business model based on Google services. This and the end of Google Answers mean: your income should not depend on Google. Google can revoke your right to use and earn money on any service at any time.

Did I mention that there are lots of new search engines out there? My new favourite is Wink: http://wink.com

JohnMu [PersonRank 10]

17 years ago #

Maybe it's just telling us that Google is concentrating its efforts on the things that *really* make a difference.

Of all the uses for the SOAP API, how many of them were actually useful to people outside of the SEO-mindset? I have probably created two dozen tools using the SOAP API, most of them for myself. Out of those tools, there is only one which I think might interest the average webmaster who's not just senselessly tracking (irrelevant) rankings for (irrelevant) keywords. And even that tool is so obscure that most average webmasters can't understand what it does :-).

Scraping results from Google and breaking the ToS is nothing new: every tool that displays PageRank does it. Theoretically, every non-Google PageRank tool goes against the ToS. How many of you have a non-Google PR display in your browser? On your website? They're against the ToS.

I expect more people to go against the ToS on purpose, Google to fight back a bit more, people moving tools over to other platforms or getting more creative with getting results back to a server (eg client side javascript with server post-backs). But will it change anything?

No. SEOs will still want their (irrelevant) numbers for their reports (or egos). If they want the numbers badly enough, they'll shell out more money for a more complicated tool that will be able to gather them automatically.

What will it change for Google? They won't have to support their API; they well know that number-freaks will still get their fix somehow, only Google doesn't have to provide for them any more.

Philipp Lenssen [PersonRank 10]

17 years ago #

> Of all the uses for the SOAP API, how many of them
> were actually useful to people outside of the
> SEO-mindset?

I use or used the Google API (rarely, screenscraping) in a lot of non-SEO contexts:

- When you click on your name in the forum, I want to present the first Google search result for your nick name to save members having to enter their signature URL

- At FindForward.com, I'm displaying search results with thumbnails, among a lot of other search-based hacks (like Search Grid or Centuryshare, which have been featured in the book Google Hacks)

- The Google API was integrated at Upwarded.com, a non-SEO site (which is now open source, so you can have a look if you want)

- I've used the Google API as the search engine for this blog, although I had to replace it a while ago due to the API's constant downtimes (of all of the samples so far, plain site search is the only one where Google offers good replacements, so switching was easy)

- I've used the Google API (rarely, screenscraping) in dozens of different tools like the Word Popularity Colorizer (http://blogoscoped.com/text-color/), Google Years (http://blogoscoped.com/google-years/), the Categorizer (http://blogoscoped.com/categorizer/), Moviebot (http://blogoscoped.com/moviebot/), Neighborsearch (http://blogoscoped.com/neighborsearch/), the Auto-linker (http://blogoscoped.com/yahoo/autolinker.php5)

- The CoverBrowser.com search engine's keyword base is partly powered by the Google API

That's a lot of different needs, so by-and-by I presented those projects to the Google API team and they gave me 100,000 requests per day. Most of these tools will now be unreliable due to the flaky Google API. I don't blame this on Google; they never made any promises. Clearly, I bet on the wrong horse. My bad. A screenscraping framework may be the better alternative for future uses.

JohnMu [PersonRank 10]

17 years ago #

Nice use of the APIs, Philipp. Neat stuff :-)
