How Many Google Books Pages?
jon ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | Tuesday, December 6, 2005 18 years ago |
Well, actually the idea was that you could see all the pages (as opposed to just two or three) in a book by searching with that query. Kind of like a hack! |
Philipp Lenssen ![[PersonRank 10] [PersonRank 10]](image/postrank/10.gif) | 18 years ago # |
But that part didn't work for me... I could only see the first three pages in a particular book, at least when I tried scrolling through it with the Next button Google offers... then it would give me a little note on "Why this is copyrighted..." |
jon ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | 18 years ago # |
Well, I just searched for a book and clicked on one of the books in the results page. Then I would type that query and hit "Search this book". So it's not searching for books with that query but searching inside a book with that query, which will give you all the pages. Hope they (the Google people) don't find out about this :) |
viggen ![[PersonRank 1] [PersonRank 1]](image/postrank/1.gif) | 18 years ago # |
Adding the Italian "si" gives me 140 million:
a | the | and | but | of | from | der | die | das | le | si | la |
cheers viggen |
jake ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | 18 years ago # |
(0 OR 1 OR 3) 156 mil |
viggen ![[PersonRank 1] [PersonRank 1]](image/postrank/1.gif) | 18 years ago # |
1|0| a | the | and | but | of | from | der | die | das | le | la 191000000 pages
hehe that is fun... :)
cheers viggen |
jake ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | 18 years ago # |
1|0|3| a | the | and | but | of | from | der | die | das | le | si | la
226 mil |
none ![[PersonRank 1] [PersonRank 1]](image/postrank/1.gif) | 18 years ago # |
On google.com you can enter something like [-amkmwienawwwawewwnjwi], but this doesn't work with print.google.com. |
lramirez ![[PersonRank 1] [PersonRank 1]](image/postrank/1.gif) | 18 years ago # |
They want my Google account. They'll probably charge me for the book if I exceed the 3 pages, lol
|
mrnibz ![[PersonRank 1] [PersonRank 1]](image/postrank/1.gif) | 18 years ago # |
a | the | and | but | of | from | der | die | das | le | la | an | do | sa | de | si | 0 | 1 | 2 | go
a | the | an | but | of | from | der | die | das | le | la | . | , | sa | de | si | 0 | 1 | 2 | go |!
produces anywhere from 219 million to 235 million... it isn't consistent though. |
jake ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | 18 years ago # |
by|provided|http|1|0|3| a | the | and | but | of | from | der | die | das | le | si | la
236 mil |
James Bradbury ![[PersonRank 5] [PersonRank 5]](image/postrank/5.gif) | 18 years ago # |
Huh. Adding the Spanish word el actually makes the total decrease from 236 million to 230 million. Adding another OR term should never shrink the result set, so this shows an error in the OR algorithm. |
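To illustrate the point about OR: with an exact count, adding a term can only keep the total the same or raise it. Here is a minimal sketch on a made-up toy corpus (the `pages` list and the `or_count` helper are hypothetical, purely for illustration, and say nothing about how Google actually computes its totals):

```python
def or_count(pages, terms):
    """Count pages that contain at least one of the terms (an exact OR)."""
    wanted = set(terms)
    return sum(1 for words in pages if wanted & set(words))


# Toy corpus: each page is just a list of words (made-up data).
pages = [
    ["the", "book"],
    ["el", "libro"],
    ["der", "die", "das"],
    ["le", "livre"],
]

base = ["the", "der", "le"]
before = or_count(pages, base)
after = or_count(pages, base + ["el"])

# With an exact count, widening the OR can never lower the total.
assert after >= before
print(before, after)
```

A reported total that drops after adding a term therefore cannot come from an exact count of the union.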
Brian M. ![[PersonRank 10] [PersonRank 10]](image/postrank/10.gif) | 18 years ago # |
247,000,000 with date:0000-9999|a|http (see: http://flickr.com/photos/breflection/71018274/). I noticed GBS (can we call it that from now on? I hate typing it out) returns pretty different results almost every time, and the number of results it reports also varies by quite a lot. Also, the 'oldest' book in GBS: date:0000-1012 |
Brian M. ![[PersonRank 10] [PersonRank 10]](image/postrank/10.gif) | 18 years ago # |
I'm pretty sure that date:1012-2020 will capture all the pages in GBS (from the 'oldest' to the 'newest'). I've gotten ~237 million with that query. Note that at this point all we are doing is fiddling with the parameters of Google's result estimation algorithms.
Another neat one is: -date:1012-2020 a
Google won't let you search for -date... on its own because it is too general, and it won't include "a" in your search because it is a stop word. But put them together and no doubt you reveal all the results (I've gotten ~242 million with that query). |
jake ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | 18 years ago # |
date:0000-9999|by|provided|http|1|0|3| a | the | and | but | of | from | der | die | das | le | si | la
~ 262-272 mil |
Brian M. ![[PersonRank 10] [PersonRank 10]](image/postrank/10.gif) | 18 years ago # |
I haven't gotten over 256 mil with that query (likely just different data centers). I got 257 mil by adding a French and a Chinese stop word to the end. Could have been random, but it didn't hurt:
date:0000-9999|by|provided|http|1|0|3| a | the | and | but | of | from | der | die | das | le | si | la | de | 了
Adding stop words from OCLC's distribution of languages in GBS might squeeze a lil' more out: http://www.dlib.org/dlib/september05/lavoie/09lavoie.html |
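For anyone who wants to try longer stop-word lists without retyping the pipes, here is a rough sketch that assembles this kind of OR query from a list. The `build_or_query` helper is hypothetical, and the " | " spacing is an assumption (the thread uses both spaced and unspaced pipes):

```python
def build_or_query(terms, prefix=()):
    """Join terms with the pipe (OR) operator, dropping blanks and duplicates."""
    seen = set()
    parts = []
    for term in list(prefix) + list(terms):
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            parts.append(term)
    return " | ".join(parts)


# Stop words taken from the queries posted in this thread.
stop_words = ["a", "the", "and", "but", "of", "from",
              "der", "die", "das", "le", "si", "la", "de", "了"]

query = build_or_query(
    stop_words,
    prefix=["date:0000-9999", "by", "provided", "http", "1", "0", "3"],
)
print(query)
```

Extending `stop_words` with words from other languages is then a one-line change; whether that actually raises the reported total is something you would have to check by hand.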
jake ![[PersonRank 0] [PersonRank 0]](image/postrank/0.gif) | 18 years ago # |
I get 265 mil with your date:0000-9999|a|http query, so that is the most elegant one at this point |
Brian M. ![[PersonRank 10] [PersonRank 10]](image/postrank/10.gif) | 18 years ago # |
I get the most with date:1012-2020|date:0000-9999 |
olivier ertzscheid ![[PersonRank 1] [PersonRank 1]](image/postrank/1.gif) | 18 years ago # |
Have a look here: http://affordance.typepad.com/mon_weblog/2005/11/la_bugbliothque.html It's a French post arguing that Google's counts are totally faked. |
Brian M. ![[PersonRank 10] [PersonRank 10]](image/postrank/10.gif) | 18 years ago # |
Looks like they fixed it so you can't search for two date ranges :) |