Google Blogoscoped

Wednesday, January 21, 2004

Chatfind

Chatfind is a new tool here that searches IRC chat logs. For example, you can enter “Google” to find out what people are saying about Google.

How many seconds in a day?

Did you know that Google knows the answers to all these questions via the Google Calculator?

And finally ...

1 millennium = 3.1556926 × 10^16 microseconds

Re: Bloggers Ignoring Link Semantics

In response to Bloggers Ignoring Link Semantics:

“I think you make a valid point, but you carry it too far. The Web is not meant for search engines, it is primarily meant for humans who view it directly, and we should keep this in mind at all times: insofar as possible, Web authors should make the information they provide pleasant for humans to read _and_ easy for automated systems (search engines being the most important, but not the only ones) to digest, but when it is not possible to satisfy both, humans should be the priority.

Unfortunately what it all mostly boils down to is that HTML is semantically very limited, and while efforts have been made to enhance its presentational aspect, none has been made to make it semantically richer. For example, there is no way at all I can include a word in a Web page while telling all search engines “please do not index this word, it is definitely *not* a keyword for this page, it is simply cited as an example” or something of the sort. Similarly, there is no way to associate fine semantics with a link (well, actually, there is: the “rel” and “rev” attributes, and RDF profiles; but in practice it just doesn’t work).

In writing a blog, if I wish to refer to something I already said, I’m likely to write something like “as I previously mentioned”, and then I’ll make the two words “previously mentioned” a link to the entry in question; there simply isn’t much else that I can do: the English (or whatever language) text is the primary matter, and links should be added to it without changing the text, and I’m not going to write “as I previously mentioned, in my first entry dated 2004-01-21” just so I can make “first entry dated 2004-01-21” a link.

Instead, if the link text is insufficient to describe exactly what the link is pointing to, HTML offers a very good mechanism: the “title” attribute. (...)
Since so few people use it, I believe search engines will also ignore it. This is also a shame.

But given the present state of affairs, I think the small cost of having Web search engines (or only Google, in fact) record some useless link texts such as “previously mentioned”, is insufficient to justify heavy rewriting of valid English text. On the other hand, I entirely agree that “click here” is almost always a stupid choice for a link text, and “#” is even worse.”
David A. Madore, via Email, January 20, 2004

Htaccess For Friendly URLs

Here's a nice trick to turn the URL
http://authorama.com/book.php?title=foo&part=2
into the more search engine and reader-friendly
http://authorama.com/foo-2.html

On Apache servers, write the following into an ".htaccess" file and put it in the root (it uses regular expressions, which you need to adapt according to your needs):

RewriteEngine on
Options +FollowSymlinks
RewriteBase /
# Map /foo-2.html to show.php?title=foo&part=2 (dot escaped, pattern anchored)
RewriteRule ^(.*)-([0-9]+)\.html$ show.php?title=$1&part=$2 [L]

Finally, make sure the old URL sends a 301 "Moved Permanently" header so that Google picks up the new address.
However, I ran into problems with the following PHP ("book.php") on my machine:

<?php

// ... Check title and part (removed) ...

// Announce a permanent redirect, then point to the friendly URL.
header("HTTP/1.1 301 Moved Permanently");
header("Location: http://www.authorama.com/"
        . $title . "-" . $part . ".html");
exit;
?>

Only if I remove the first header() call does it work correctly on my Authorama site. It then sends a 302 status for a temporary move, which is sub-optimal (especially concerning Googlebot's reaction). The error message is not visible due to my server settings.
Your help is greatly appreciated, especially with doing the above directly within the .htaccess file (because I believe the PHP is right, and it's just my server that's misconfigured)!
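
One possible way to issue the permanent redirect from within .htaccess is sketched below. It is untested on the Authorama server and rests on a few assumptions: mod_rewrite is enabled, the old links carry exactly the query string title=...&part=..., and show.php (not book.php) serves the friendly URLs, so the redirect cannot loop back onto itself.

```apache
RewriteEngine on
RewriteBase /

# Assumption: old links look like /book.php?title=foo&part=2
# %1 and %2 refer to the groups captured by the RewriteCond below.
RewriteCond %{QUERY_STRING} ^title=([^&]+)&part=([0-9]+)$
# The trailing "?" drops the old query string from the redirect target.
RewriteRule ^book\.php$ http://www.authorama.com/%1-%2.html? [R=301,L]
```

With a rule like this, the server itself answers requests for the old address with a 301 status, so the PHP script no longer needs to send the header. If the parameters can arrive in a different order, the RewriteCond pattern would have to be adapted.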

Heavy Traffic and Other News

While Booble is “taking a poke at Google” with their adult search engine, and someone else accuses Google of ignoring PageRank, keywords, and pretty much everything else, I’ve had more traffic to Search.CSS and my other tools than I can handle. As a result, all my Google Web API tools won’t work for today (and might stop working sometime during tomorrow, as I only have a limited number of requests per day for my API key). Also, my new Chatfind tool will have to wait until the issue is resolved.

In the meantime, you might want to visit the new Yahoo! Lab (similar to the Google Labs, only there’s nothing there except PDFs at the moment). Yahoo is expected to drop Google for their search results very soon and switch to Inktomi in the upcoming search engine wars. Or if you have a Google AdSense account, check out the AdSense charts generator. And then there’s RSS Alerts by PubSub. It's a pity that Phlog.net, which I tried out, doesn’t seem to handle my Nokia 6600 photos and displays garbage. However, their photobloggers by location directory is nice if you want to travel (without actually travelling) or just see what people near you are doing.
