Google's Categories of Random Query-Evaluation

Sunday, June 5, 2005

SearchBistro has published Google’s General Guidelines on Random Query-Evaluation from 2003 (written by Natasha Doubson, according to SearchBistro’s Henk van Ess). In the PDF, Google sorts the queries that raters are to evaluate into three groups:

A navigational query is one where the user expects to be taken to a specific homepage, as when searching for “United Airlines”. In other words, the query has only one valid result.
An informational query can have more than one correct result, with the different “valid” sites varying only in authority and relevancy. Google lists the example queries “renaissance paintings” and “What is a quark?”.
A transactional query can have many right results as well. Unlike with an informational query, here the searcher wants to order or download some kind of product. (Google admits the line between transactional and informational queries is often hard to draw.)

Ratings

Raters are presented with randomly chosen search queries, which are taken from real-life usage (only a little is filtered out, like pornographic results). For each result, several rating categories are available. They are:

Vital (like the search “OfficeMax” resulting in “www.officemax.com”)
Useful (for results which are “as good as it gets;” like when the user enters “West Nile Virus” and finds www.cdc.gov/ncidod/dvbid/westnile/)
Relevant (results which aren’t too good but still OK, like an amateurish homepage of a band the user searched for)
Not Relevant (results somewhat connected to the topic but not helpful; the relationship between the query and the result is subtle at best)
Off Topic (when there’s no relation at all between the search query and the specific result page offered)
Offensive (like “uninvited” pornographic results, or web spam pages which display cheating techniques)
Erroneous
Didn’t Load
Foreign Language
Unrated
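
As a rough illustration, the rating labels above could be modeled as a simple enumeration. This is only a sketch: the `Rating` enum and the `tally` helper are hypothetical names made up for this example, and nothing here reflects how Google actually stores or aggregates ratings.

```python
from collections import Counter
from enum import Enum


class Rating(Enum):
    """The ten labels listed in the post; the enum itself is hypothetical."""
    VITAL = "Vital"
    USEFUL = "Useful"
    RELEVANT = "Relevant"
    NOT_RELEVANT = "Not Relevant"
    OFF_TOPIC = "Off Topic"
    OFFENSIVE = "Offensive"
    ERRONEOUS = "Erroneous"
    DIDNT_LOAD = "Didn't Load"
    FOREIGN_LANGUAGE = "Foreign Language"
    UNRATED = "Unrated"


def tally(ratings):
    """Count how often each label was assigned (hypothetical helper)."""
    return Counter(ratings)


# Assumed scenario: three raters judge the same query/result pair.
judgments = [Rating.USEFUL, Rating.USEFUL, Rating.RELEVANT]
counts = tally(judgments)
most_common_label, votes = counts.most_common(1)[0]
print(most_common_label.value, votes)  # prints "Useful 2"
```

Taking the most frequent label is just one way to combine several judgments; the guidelines as summarized here do not say how ratings are combined.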

Comparisons

Also available to raters are more straightforward result comparisons. Here, the rater needs to adjust a slider to say which of two complete results shown is the more relevant. (One can assume that two different ranking algorithm settings are being compared here, e.g. when Google experiments with applying a new spam filter.) The different ratings available are:

Much better
Better
Slightly Better
About the same
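
The slider could be pictured as a small signed scale. The sketch below assumes a symmetric range from -3 to 3, where negative values favor the left result and positive values favor the right one; that range, the left/right naming, and the `describe` function are assumptions for illustration and are not taken from the guidelines.

```python
# Labels as listed in the post, ordered by strength of preference.
LABELS = ["About the same", "Slightly Better", "Better", "Much better"]


def describe(position: int) -> str:
    """Map an assumed slider position in [-3, 3] to a readable verdict.

    Negative positions favor the left result, positive ones the right.
    """
    if not -3 <= position <= 3:
        raise ValueError("position must be between -3 and 3")
    if position == 0:
        return LABELS[0]
    side = "left" if position < 0 else "right"
    return f"{side}: {LABELS[abs(position)]}"


print(describe(-3))  # prints "left: Much better"
print(describe(0))   # prints "About the same"
```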

By Philipp Lenssen
