Amazon’s Alexa, which uses the web archive we know from Archive.org’s Wayback Machine, is releasing the Alexa Web Search Platform (AWSP), as John Battelle points out in his scoop article.
As opposed to the Google API and the Yahoo API, this seems to be a much more in-depth exposure of the data. You have access to HTML pages, images, music files, and even video content. You can search by content type, specific types of URL, size, file header information, language or HTTP response codes. Alexa calls this the “Define” step. You can then apply an operation to the search set, a step Alexa terms “Process,” which consists of a compiled program accessing their C API. The final step is called “Publish” and allows you to download your results, or publish them as part of either a private or a public Alexa search engine. For more information, you can check the AWSP help.
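To make the three steps a bit more concrete, here is a minimal sketch of the Define / Process / Publish flow in Python. This is not the AWSP interface (which, per the above, is a compiled program against a C API); every name here (`CrawlRecord`, `define`, `process`, `publish`) and all the sample records are made up for illustration, and the filter fields simply mirror a few of the search criteria listed above.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class CrawlRecord:
    """Hypothetical stand-in for one archived document (not an AWSP type)."""
    url: str
    content_type: str
    language: str
    status: int  # HTTP response code
    size: int    # size in bytes


def define(records: Iterable[CrawlRecord], *,
           content_type: Optional[str] = None,
           language: Optional[str] = None,
           status: Optional[int] = None,
           max_size: Optional[int] = None) -> Iterator[CrawlRecord]:
    """'Define': narrow the crawl down to a search set; None means 'any'."""
    for record in records:
        if content_type is not None and record.content_type != content_type:
            continue
        if language is not None and record.language != language:
            continue
        if status is not None and record.status != status:
            continue
        if max_size is not None and record.size > max_size:
            continue
        yield record


def process(records: Iterable[CrawlRecord],
            operation: Callable[[CrawlRecord], dict]) -> List[dict]:
    """'Process': apply one operation to every record in the search set."""
    return [operation(record) for record in records]


def publish(results: List[dict]) -> str:
    """'Publish': serialize the results so they can be downloaded."""
    return json.dumps(results, indent=2)


# Made-up sample data, purely for illustration.
SAMPLE = [
    CrawlRecord("http://example.com/", "text/html", "en", 200, 1200),
    CrawlRecord("http://example.com/logo.png", "image/png", "", 200, 5400),
    CrawlRecord("http://example.org/de", "text/html", "de", 200, 900),
    CrawlRecord("http://example.net/gone", "text/html", "en", 404, 300),
]

selected = define(SAMPLE, content_type="text/html", language="en", status=200)
results = process(selected, lambda r: {"url": r.url, "size": r.size})
output = publish(results)
print(output)
```

The point of the sketch is only the shape of the workflow: you pick a subset once, run your own code over all of it, and then take the whole result away, rather than paging through a handful of hits per query.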
This whole development process seems a little more “hardcore” than any of the web-service-based SOAP or even REST APIs, which are truly easy to play around with. Then again, if you ever wished you had a larger set of data and were not restricted to a small number of results per query, this might be for you. John Battelle says, “Alexa and Amazon are turning the index inside out, and offering it as a web service that anyone can mashup to their hearts content. Entrepreneurs can use Alexa’s crawl, Alexa’s processors, Alexa’s server farm....the whole nine yards.”
Also, AWSP seems to have a REST API of its own, as the Java code sample on this page shows. (Just as with Amazon’s Mechanical Turk, the AWSP pages are painfully scattered all over the place in different formats and layouts, instead of being bundled on just one homepage.)
Now I’m sure many developers would be happy to pay Google or Yahoo for a larger number of hits so they could put their projects on a commercial footing. And indeed the Alexa search API power comes at a price; the first 10,000 requests per month are free, but any subsequent requests cost $0.00025 each (e.g. 25 bucks for 100,000 additional hits). Note that the AWSP FAQ lists different criteria for calculating payment: CPU hours, bandwidth, and storage. With the Amazon Mechanical Turk API, payment also meant you needed to have a US bank account, effectively closing the service to international developers; I’m not yet sure how Alexa handles payment, but their sign-up form was already assuming US applicants only (there was no “country” field, but a “state” field instead).
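For a quick feel of the per-request pricing quoted above, here is a small Python sketch. It assumes the 10,000 free requests are simply subtracted from each month’s total and that nothing else is billed; since the AWSP FAQ mentions CPU hours, bandwidth, and storage, the real bill may well be calculated differently. The function and constant names are my own.

```python
from decimal import Decimal

# Figures quoted in the post; the billing model itself is an assumption.
FREE_REQUESTS_PER_MONTH = 10_000
PRICE_PER_REQUEST = Decimal("0.00025")  # US dollars


def monthly_cost(requests: int) -> Decimal:
    """Return the dollar cost for one month's worth of requests."""
    if requests < 0:
        raise ValueError("requests must not be negative")
    # Only requests beyond the free allowance are billed.
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable * PRICE_PER_REQUEST


for total in (10_000, 110_000, 1_010_000):
    print(f"{total:>9,} requests: ${monthly_cost(total):.2f}")
```

Under these assumptions, 110,000 requests in a month (that is, 100,000 beyond the free allowance) come to the 25 bucks mentioned above, and a million billable requests would be $250. `Decimal` is used instead of floats so that the tiny per-request price does not pick up rounding noise.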