Google Blogoscoped

Forum

"More data usually beats better algorithms"

Ianf [PersonRank 10]

Saturday, April 5, 2008
16 years ago · 3,223 views

[...] Most people think Google's success is due to their brilliant algorithms, especially PageRank. In reality, the two big innovations that Larry and Sergey introduced, that really took search to the next level in 1998, were:

• The recognition that hyperlinks were an important measure of popularity – a link to a webpage counts as a vote for it.

• The use of anchortext (the text of hyperlinks) in the web index, giving it a weight close to the page title.

http://3quarksdaily.blogs.com/photos/uncategorized/2008/04/05/blue_data.gif

First generation search engines had used only the text of the web pages themselves. The addition of these two additional data sets – hyperlinks and anchortext – took Google's search to the next level. The PageRank algorithm itself is a minor detail – any halfway decent algorithm that exploited this additional data would have produced roughly comparable results.
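To make the "a link counts as a vote" idea concrete, here is a minimal sketch, not Google's actual implementation. It ranks a tiny made-up link graph two ways: by raw inbound-link count, and by a simplified PageRank computed with power iteration. The graph `LINKS`, the function names, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
# Hypothetical toy link graph: page -> pages it links to.
# Assumption: every page has at least one outgoing link and
# every link target is itself a key in the dictionary.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}


def inlink_counts(links):
    """Count each inbound link as one vote for the target page."""
    counts = {page: 0 for page in links}
    for targets in links.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank via power iteration (no dangling-page handling)."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            # A page splits its current rank evenly among its outlinks.
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def top_page(scores):
    """Return the page with the highest score."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print("top by inlink count:", top_page(inlink_counts(LINKS)))
    print("top by PageRank:   ", top_page(pagerank(LINKS)))
```

On this four-page graph both methods put the same page first. That only illustrates the claim being made; a toy graph says nothing about how the two would compare on the real web, where spam and link farms matter.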

The same principle also holds true for another area of great success for Google: the AdWords keyword auction model. [...]

http://anand.typepad.com/datawocky/2008/03/more-data-usual.html

[via http://3quarksdaily.blogs.com/]

Ionut Alex. Chitu [PersonRank 10]

16 years ago

The conclusion is good, but the two examples don't support it. Links and anchor text aren't "more data"; they are part of the same data used by search engines. PageRank is an algorithm that used the same data in a clever way. The CTR from AdWords isn't "more data" either: it's implicit data that GoTo/Overture could have used as well. A good example of "more data beats better algorithms" is Google's translation system: http://www.youtube.com/watch?v=nU8DcBF-qo4

TOMHTML [PersonRank 10]

16 years ago

Algorithms without data are useless. Moreover, you can't design accurate algorithms without looking at the data first. So I don't understand the title of this thread :-/
