Crawler - Google Blogoscoped Forum


Crawler
pokemo	Tuesday, January 29, 2008 18 years ago • 4,423 views
Can a crawler crawl and index the contents of PDF, DOC, and similar files?
DPic	18 years ago #
Yeah, you can even search for those just by clicking on Advanced Search haha :)
Zim	18 years ago #
And you can even read them as plain HTML from Google :)
Tony Ruscoe	18 years ago #
[filetype:pdf] http://www.google.com/search?q=filetype%3Apdf
[filetype:doc] http://www.google.com/search?q=filetype%3Adoc
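For anyone who wants to build these links in a script, here is a minimal Python sketch. The helper name `filetype_search_url` and its `terms` parameter are made up for illustration; only the `filetype:` operator and the URL pattern come from the links above.

```python
from urllib.parse import urlencode


def filetype_search_url(extension: str, terms: str = "") -> str:
    """Build a Google search URL restricted to one file type.

    `extension` is e.g. "pdf" or "doc"; `terms` is an optional
    free-text query placed before the filetype: operator.
    """
    query = f"{terms} filetype:{extension}".strip()
    # urlencode percent-encodes the colon and turns spaces into '+'.
    return "http://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(filetype_search_url("pdf"))
    print(filetype_search_url("doc", "annual report"))
```

Passing an empty `terms` reproduces the bare links in the post; adding terms narrows the search to documents of that type that match the words.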
Ionut Alex. Chitu	18 years ago #
There are free and open source tools that convert PDF and DOC files to HTML or plain text.
http://www.google.com/search?hl=en&q=pdf2html&btnG=Search
http://www.google.com/search?hl=en&q=antiword&btnG=Search
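A rough sketch of how a crawler might hand files to such converters before indexing the text. It assumes `antiword` (mentioned above) and `pdftotext` from Poppler/Xpdf (my assumption, not named in this thread) are installed and on the PATH; the function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Map file extensions to the command line that prints plain text to stdout.
# pdftotext writes to stdout when the output file is given as "-";
# antiword prints to stdout by default.
CONVERTERS = {
    ".pdf": lambda path: ["pdftotext", path, "-"],
    ".doc": lambda path: ["antiword", path],
}


def build_command(path: str) -> list[str]:
    """Return the converter command for `path`, chosen by file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in CONVERTERS:
        raise ValueError(f"no converter configured for {suffix!r}")
    return CONVERTERS[suffix](path)


def extract_text(path: str) -> str:
    """Run the external converter and return the extracted plain text."""
    command = build_command(path)
    if shutil.which(command[0]) is None:
        raise RuntimeError(f"{command[0]} is not installed or not on PATH")
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

Keeping `build_command` separate from `extract_text` makes it easy to add another format later: one more entry in `CONVERTERS` is enough.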
Colin Colehour	18 years ago #
The crawler will only index searchable PDFs. So if the PDF is a scanned image that was not OCRed, it will not be indexable by the crawler.
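One simple way to guess whether a PDF is searchable: run it through a text extractor and see whether anything readable comes out. The sketch below is only a heuristic, assuming the text has already been extracted by some tool; the function name and the 20-character threshold are arbitrary choices for illustration, not how Google's crawler decides.

```python
def looks_searchable(extracted_text: str, min_chars: int = 20) -> bool:
    """Guess whether a PDF has a real text layer.

    `extracted_text` is whatever a text extractor returned for the file.
    A scanned, non-OCRed PDF usually yields nothing but whitespace and
    page breaks, so we count letters and digits only.
    """
    readable = sum(1 for ch in extracted_text if ch.isalnum())
    return readable >= min_chars


if __name__ == "__main__":
    print(looks_searchable("\f\n   \f"))
    print(looks_searchable("Annual report 2007: revenue and outlook"))
```

A file that fails this check would need OCR before its contents could be indexed as text.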
