<<We've known it for a long time: the web is big. The first Google index in 1998 already had 26 million pages, and by 2000 the Google index reached the one billion mark. Over the last eight years, we've seen a lot of big numbers about how much content is really out there. Recently, even our search engineers stopped in awe about just how big the web is these days – when our systems that process links on the web to find new content hit a milestone: 1 trillion (as in 1,000,000,000,000) unique URLs on the web at once!>>
http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html |
Thanks, Blogger, for making this possible :) |
The title is not accurate. |
URLs != sites.
>> the number of individual web pages out there is growing by several billion pages per day.
:O, that's interesting.
2/01: 1.3 billion
2/02: 2 billion
2/03: 3 billion
2/04: 4.28 billion
11/04: 8 billion
...
7/08: 1 trillion!!!!
It has come a (not really) long way. I am going to attribute the exponential growth to Google's recent discovery of the dark web and my useless content generator. :) |
Actually Google doesn't index 10^12 pages.
<< We don't index every one of those trillion pages – many of them are similar to each other, or represent auto-generated content similar to the calendar example that isn't very useful to searchers. >> |
These are 10^12 URLs, not 10^12 sites. They may not even be 10^12 distinct pages, and they are not all indexed.
Below is a scheme for a US quadrillion (i.e. 10^15) URLs at the blogoscoped.com site.
http://blogoscoped.com/search/?q=1 (q=1)
http://blogoscoped.com/search/?q=2 (q=2)
http://blogoscoped.com/search/?q=3 (q=3)
...
http://blogoscoped.com/search/?q=1000000000000000 (q=1000000000000000)
If you are not worried about URL length, this scheme could be extended to a googol URLs. ... http://blogoscoped.com/search/?q=10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 (q= 1 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000 0000000000)
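The scheme above can be sketched in a few lines of Python: a single query parameter is enough to mint arbitrarily many unique URLs, which is exactly why "unique URLs seen" is not a meaningful measure of the web's size. This is just an illustration of the counting trick described in the comment (a lazy generator, nothing is fetched); the base URL is the one from the example.

```python
from itertools import islice

def enumerate_urls(base="http://blogoscoped.com/search/?q=", limit=10**15):
    """Yield `limit` unique URLs by counting the q parameter upward."""
    for n in range(1, limit + 1):
        yield f"{base}{n}"

# Peek at the first few of the quadrillion URLs. The generator is lazy,
# so the full 10^15 are never materialized in memory.
for url in islice(enumerate_urls(), 3):
    print(url)
```

Since the generator never builds the whole list, asking for a quadrillion (or a googol) URLs costs nothing until you actually iterate them, mirroring how a crawler can discover an effectively unbounded URL space on a single site.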
|