Google Blogoscoped

Thursday, September 16, 2004

Spam Map

This world map shows where spam originates, and if it’s accurate, I’m sitting right in the European epicenter. [Via Waxy.]

New PageRank Checksum Algorithm

The recently changed PageRank checksum calculation has been hacked. The following PHP source is in the public domain. [Thanks Google Blog Dirson.]

<?php
/*
        Written and contributed by
        Alex Stapleton,
        Andy Doctorow,
        Tarakan,
        Bill Zeller,
        Vijay "Cyberax" Bhatter
        traB
    This code is released into the public domain
*/
//header("Content-Type: text/plain; charset=utf-8");
define('GOOGLE_MAGIC', 0xE6359A60);

function obtainPR($data)
{
     $PageRank = null; // stays null when the response has no RK element

     $parser = xml_parser_create();
     xml_parser_set_option($parser,XML_OPTION_CASE_FOLDING,0);
     xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1);
     xml_parse_into_struct($parser,$data,$values,$tags);
     xml_parser_free($parser);

     $hash_stack = array();

     foreach ($values as $key => $val)
     {
         switch ($val['type'])
         {
           case 'complete':
               array_push($hash_stack, $val['tag']);
               $type = implode("][", $hash_stack);
               if ($type == "RK")
               {
                   $PageRank = $val['value'];
               }
               array_pop($hash_stack);
           break;
         }//switch
     }//foreach
     
     return $PageRank;
}//obtainPR


//unsigned shift right (assumes 32-bit PHP integers)
function zeroFill($a, $b)
{
    $z = 0x80000000; // bit 31, the sign bit of a 32-bit integer
        if ($z & $a)
        {
            $a = ($a>>1);
            $a &= (~$z);
            $a |= 0x40000000;
            $a = ($a>>($b-1));
        }
        else
        {
            $a = ($a>>$b);
        }
        return $a;
}


function mix($a,$b,$c) {
  $a -= $b; $a -= $c; $a ^= (zeroFill($c,13));
  $b -= $c; $b -= $a; $b ^= ($a<<8);
  $c -= $a; $c -= $b; $c ^= (zeroFill($b,13));
  $a -= $b; $a -= $c; $a ^= (zeroFill($c,12));
  $b -= $c; $b -= $a; $b ^= ($a<<16);
  $c -= $a; $c -= $b; $c ^= (zeroFill($b,5));
  $a -= $b; $a -= $c; $a ^= (zeroFill($c,3));  
  $b -= $c; $b -= $a; $b ^= ($a<<10);
  $c -= $a; $c -= $b; $c ^= (zeroFill($b,15));
  
  return array($a,$b,$c);
}

function GoogleCH($url, $length=null, $init=GOOGLE_MAGIC) {
    if(is_null($length)) {
        $length = sizeof($url);
    }
    $a = $b = 0x9E3779B9;
    $c = $init;
    $k = 0;
    $len = $length;
    while($len >= 12) {
        $a += ($url[$k+0] +($url[$k+1]<<8) +($url[$k+2]<<16) +($url[$k+3]<<24));
        $b += ($url[$k+4] +($url[$k+5]<<8) +($url[$k+6]<<16) +($url[$k+7]<<24));
        $c += ($url[$k+8] +($url[$k+9]<<8) +($url[$k+10]<<16)+($url[$k+11]<<24));
        $mix = mix($a,$b,$c);
        $a = $mix[0]; $b = $mix[1]; $c = $mix[2];
        $k += 12;
        $len -= 12;
    }

    $c += $length;
    switch($len)              /* all the case statements fall through */
    {
        case 11: $c+=($url[$k+10]<<24);
        case 10: $c+=($url[$k+9]<<16);
        case 9 : $c+=($url[$k+8]<<8);
          /* the first byte of c is reserved for the length */
        case 8 : $b+=($url[$k+7]<<24);
        case 7 : $b+=($url[$k+6]<<16);
        case 6 : $b+=($url[$k+5]<<8);
        case 5 : $b+=($url[$k+4]);
        case 4 : $a+=($url[$k+3]<<24);
        case 3 : $a+=($url[$k+2]<<16);
        case 2 : $a+=($url[$k+1]<<8);
        case 1 : $a+=($url[$k+0]);
         /* case 0: nothing left to add */
    }
    $mix = mix($a,$b,$c);
    /*-------------------------------------------- report the result */
    return $mix[2];
}

//converts a string into an array of integers containing the numeric value of the char
function strord($string) {
    $result = array();
    for($i=0;$i<strlen($string);$i++) {
        $result[$i] = ord($string[$i]);
    }
    return $result;
}


// converts an array of 32 bit integers into an array with 8 bit values. Equivalent to (BYTE *)arr32

function c32to8bit($arr32) {
    $arr8 = array();
    for($i=0;$i<count($arr32);$i++) {
        for ($bitOrder=$i*4;$bitOrder<=$i*4+3;$bitOrder++) {
            $arr8[$bitOrder]=$arr32[$i]&255;
            $arr32[$i]=zeroFill($arr32[$i], 8);
        }    
    }
    return $arr8;
}


// http://www.example.com/ - Checksum: 6540747202

// $url is expected to be set before this point (the listing does not show where)
print("<b>URL .... $url</b>\n");
$url = 'info:' . $url;
$ch = GoogleCH(strord($url));
$url_to_parse = sprintf ("http://toolbarqueries.google.com/search?client=navclient-auto&ch=6%u&ie=UTF-8&oe=UTF-8&q=%s", $ch, $url);
$value = obtainPR(file_get_contents($url_to_parse));
printf("<li> <u>Checksum <2.0.114:</u> ..... 6%u ...... <A href=$url_to_parse>link</A> .... PR = $value\n",$ch);

$ch=sprintf("%u", $ch);
// new since Toolbar 2.0.114

$ch = ((($ch/7) << 2) | (((int)fmod($ch,13))&7));

$prbuf = array();
$prbuf[0] = $ch;
for($i = 1; $i < 20; $i++) {
      $prbuf[$i] = $prbuf[$i-1]-9;
}
$ch = GoogleCH(c32to8bit($prbuf), 80);
$url_to_parse = sprintf ("http://toolbarqueries.google.com/search?client=navclient-auto&ch=6%u&ie=UTF-8&oe=UTF-8&q=%s", $ch, $url);
$value = obtainPR(file_get_contents($url_to_parse));
//
printf("<li> <u>Checksum >=2.0.114:</u> ..... 6%u ...... <A href=$url_to_parse>link</A> .... PR = $value\n",$ch);

?>
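
The listing leans on 32-bit integer arithmetic: zeroFill() treats bit 31 as the sign bit, and the subtractions in mix() are expected to wrap around. Here is a minimal sketch of how the same steps might be emulated by masking to 32 bits, assuming a 64-bit build of PHP 7 or later. The names u32, shr32 and mix32 are my own and hypothetical; they are not part of the contributed code, and I have not checked the results against Google's servers.

```php
<?php
// Hypothetical helpers, assuming a 64-bit PHP build (PHP_INT_SIZE === 8).

// Keep only the low 32 bits, emulating unsigned 32-bit wraparound.
function u32(int $x): int
{
    return $x & 0xFFFFFFFF;
}

// Unsigned (zero-filling) right shift of a 32-bit value.
function shr32(int $a, int $b): int
{
    return u32($a) >> $b;
}

// The same nine steps as mix() in the listing, with every
// intermediate result masked back to 32 bits.
function mix32(int $a, int $b, int $c): array
{
    $a = u32($a - $b - $c) ^ shr32($c, 13);
    $b = u32($b - $c - $a) ^ u32($a << 8);
    $c = u32($c - $a - $b) ^ shr32($b, 13);
    $a = u32($a - $b - $c) ^ shr32($c, 12);
    $b = u32($b - $c - $a) ^ u32($a << 16);
    $c = u32($c - $a - $b) ^ shr32($b, 5);
    $a = u32($a - $b - $c) ^ shr32($c, 3);
    $b = u32($b - $c - $a) ^ u32($a << 10);
    $c = u32($c - $a - $b) ^ shr32($b, 15);
    return array($a, $b, $c);
}
```

With helpers like these, the inputs are assumed to already lie between 0 and 2^32 - 1, so no intermediate value overflows a 64-bit integer.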

Update: the first version of the code as posted above wasn't working – you are now seeing the second, updated version which is tested (and definitely works). Thanks D. for the help.
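
The step marked "new since Toolbar 2.0.114" in the code can be traced on its own: the first checksum is divided by 7 and shifted left by 2, the low three bits are filled from the checksum modulo 13, and that seed starts a run of 20 words, each 9 less than the one before. The words are then split into 80 bytes and hashed again. A minimal sketch of that buffer construction follows, assuming 64-bit PHP 7 or later; the function name secondStageBytes and the sample input are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper: builds the 80-byte buffer that the listing feeds
// into its second GoogleCH() call. Assumes 64-bit PHP 7+ and 0 <= $ch < 2^32.
function secondStageBytes(int $ch): array
{
    // ((ch / 7) << 2) | ((ch mod 13) & 7), kept to 32 bits.
    $seed = ((intdiv($ch, 7) << 2) | (($ch % 13) & 7)) & 0xFFFFFFFF;

    // 20 words, each 9 less than the previous one, wrapping at 32 bits.
    $words = array($seed);
    for ($i = 1; $i < 20; $i++) {
        $words[$i] = ($words[$i - 1] - 9) & 0xFFFFFFFF;
    }

    // Split every word into 4 bytes, lowest byte first, like c32to8bit().
    $bytes = array();
    foreach ($words as $word) {
        for ($shift = 0; $shift < 32; $shift += 8) {
            $bytes[] = ($word >> $shift) & 0xFF;
        }
    }
    return $bytes;
}

$bytes = secondStageBytes(540747202); // arbitrary sample input
echo count($bytes), "\n";             // 80, the $length passed to GoogleCH()
```

In the listing the equivalent array goes straight into GoogleCH(c32to8bit($prbuf), 80).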

Butler Jeeves Missing

InsideGoogle reports Butler Jeeves of Ask.com is currently missing. The official statement is he’s on a secret “worldwide quest” to “strengthen and improve Ask.com”. According to the same page, Jeeves went from the Himalayas to the salt flats of Utah. Other people, however, report this is a cover-up and that Jeeves has been kidnapped. At the Ask.com site, he’s even been erased from the Quantum Leap cartoon – and every other place (except for Ask Jeeves Kids – possibly more than just a coincidence). But the image of him on this Ask.com page might be a subtle visual clue: we can see a missing butler, the words “Help Jeeves”, and a rope tied to his hands. Then again, this might be a double charade, and the butler’s just on vacation in the Bahamas.

This site unofficially covers Google™ and more with some rights reserved. Join our forum!