Google Blogoscoped

Monday, June 23, 2003

Google as ASCII-Art


Made with a little help from ASCII-O-Mat, which turns your images into text. Also see the HTML source for the above file.

Counting Google-Indexed Pages

I wanted to know how many of my pages Google has indexed in total. The following query shows the result, and you can adapt it to fit your own sites: OR OR OR OR

The Google result count for the above query is 3,970.

I will test how the above query works together with Google Alert (which notifies me of changes). If it works well, I’d get an email whenever Google indexes a new page of mine.
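An OR-joined query like the one mentioned above can be assembled from a list of domains. This is only a minimal sketch: it assumes each term is a `site:` restriction, and the domains and the `build_site_query` helper are hypothetical placeholders, not the actual sites or tooling used here.

```python
def build_site_query(domains):
    """Join one site: restriction per domain with OR (assumed query shape)."""
    # Drop empty entries and normalize case so the query stays tidy.
    cleaned = [d.strip().lower() for d in domains if d.strip()]
    if not cleaned:
        raise ValueError("at least one domain is required")
    return " OR ".join(f"site:{d}" for d in cleaned)


# Placeholder domains for illustration only.
query = build_site_query(["example.com", "example.org", "example.net"])
print(query)
```

The resulting string can be pasted into the search box; the result count Google reports for it is an estimate of the total indexed pages across the listed sites.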

The Google page count includes dynamically generated pages. For example, on Authorama*, every book I re-format and publish consists of a single HTML page, and there are about 40 books. However, using PHP, I link to a “chapterized” version of those HTML pages (I simply split the files along the H2 headers of the original HTML). So the page count for Authorama becomes 755. Of course, the chapterization was done to help visitors read by splitting each book into easy chunks, but it also helps with the 101K indexing limit (which was actually 150K the last time I tested).
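The splitting along H2 headers described above could look roughly like this. The site itself uses PHP; this is a Python sketch under the assumption that every chapter begins at an `<h2>` tag and that a simple regular expression is good enough for the markup. The function names and the sample HTML are hypothetical.

```python
import re

# Matches the start of any opening <h2> tag, regardless of case or attributes.
H2_OPEN = re.compile(r"<h2\b", re.IGNORECASE)


def split_into_chapters(html):
    """Return (front_matter, chapters); each chapter starts at an <h2> tag."""
    starts = [m.start() for m in H2_OPEN.finditer(html)]
    if not starts:
        return html, []
    front_matter = html[:starts[0]]
    bounds = starts + [len(html)]
    # Note: any closing </body></html> tags end up in the last chapter.
    chapters = [html[bounds[i]:bounds[i + 1]] for i in range(len(starts))]
    return front_matter, chapters


def over_limit(chapters, limit_bytes=101 * 1024):
    """Return indices of chapters larger than the assumed indexing limit."""
    return [i for i, c in enumerate(chapters)
            if len(c.encode("utf-8")) > limit_bytes]


book = "<h1>Title</h1><p>Preface</p><h2>One</h2><p>a</p><H2>Two</H2><p>b</p>"
front_matter, chapters = split_into_chapters(book)
print(len(chapters), over_limit(chapters))
```

Each chunk can then be served as its own page, and `over_limit` gives a quick check of whether any single chapter would still exceed the size limit mentioned above.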

*Since the books published on Authorama are often also featured on other sites (because they’re public domain books), I sometimes fall into the “duplicate content removal” of Google’s SERPs. However, sometimes I also rank OK, even though the pages display a PageRank of zero. Authorama is one of my sites where ranking is important to me because of its nature; I re-publish books in a format that I feel reflects best on their contents (easy to read, easy to navigate, no clutter or weird colors/fonts), but I believe Google does not really analyze the layout to make decisions about rankings. is a typical page which is mostly helpful to people coming from Google and, for example, looking for the context of a specific phrase from a book. (It is not the kind of page one would bookmark and return to every few days.)




This site unofficially covers Google™ and more with some rights reserved.