Google Blogoscoped

Monday, July 31, 2006

If Only We Knew Google’s Secret Ranking Algo

Wouldn't it be nice to see Google's top-secret ranking formula? The piece of source code that makes a web page stand or fall in the rankings, rumored to contain over 100 different factors? A piece of source code like...

function getPagerank(url)
{
    // start off with a random low PR
    var pagerank = randomNumber(0, 3);

    if ( pageHostedOn(url, 'google.com') ) {
        pagerank++;
    }
    else if ( pageHostedOn(url, 'microsoft.com') ) {
        pagerank--;
    }
    
    if ( pageValidates(url) ) {
        pagerank *= .5;
    }
    
    // score per HTML tag found on the page
    var tag_value = {};
    tag_value['b'] = 1;
    tag_value['h2'] = 2;
    tag_value['h1'] = 3;
    tag_value['strong'] = -1; // W3C sux!
    pagerank = calculateTagsPr(tag_value, pagerank);

    // Sergey said good news sites have
    // lots of nested tables
    tablesOnPage = getTagCount('table');
    if (tablesOnPage >= 50) {
        pagerank += 2;
    }

    if (pagerank >= 5) {
        pagerank = 4; // helps selling AdWords
    }

    if ( linksFrom('mattcutts.com', url) >= 4 ) {
        // I link to "clean" sites only
        // – Matt, Feb 2006
        pagerank += 2;
    }

    pagerank += countBacklinks(url) / 10000;

    blacklist1 = getList('government.cn/censored.txt');
    blacklist2 = getList('c:larry-page-hatelist.txt');
    if ( inArray(blacklist1, url) ||
            inArray(blacklist2, url) ) {
        pagerank = 0;
    }

    var d = dashesInUrl(url);
    pagerank = (d >= 3) ? pagerank - 1 : pagerank + 1;

    if ( inString(url, "how to build a bomb") ) {
        // added on request. 2004-12-01.
        recipient = "peter@homelandsecurity.gov";
        subject = "You might wanna check this...";
        sendMailTo(recipient, subject, url);

        // page might still be relevant
        pagerank++;
    }

    if ( month() == "June" || month() == "October" ) {
        // makes people talk about
        // PR updates, good publicity
        pagerank -= randomNumber(1,3);
    }    

    if ( linkCol(url) == WHITE &&
            pageCol(url) == WHITE ) {
        // spammer!! Googleaxe it!!
        pagerank = 0;
    }

    if (url == "http://www.nytimes.com") {
        // just testing, pls remove tomorrow
        // – Frank, June 2003
        pagerank = 10;
    }

    return pagerank;
}

[Thanks Alek for the idea!]

