Google Blogoscoped



pokemo [PersonRank 10]

Tuesday, January 29, 2008
16 years ago | 3,856 views

Can a crawler crawl and index the contents of PDF, DOC, and similar files?

DPic [PersonRank 10]

16 years ago #

Yeah, you can even search for those just by clicking on Advanced Search, haha :)

Zim [PersonRank 10]

16 years ago #

And even read them as plain HTML from Google :)

Tony Ruscoe [PersonRank 10]

16 years ago #



Ionut Alex. Chitu [PersonRank 10]

16 years ago #

There are free/open source tools that convert PDF and DOC files to HTML or plain text.
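As a rough illustration of how a crawler might call such tools, here is a minimal sketch. It assumes the command-line programs pdftotext (from Poppler) and antiword are installed; the post above does not name any specific tool, so these are only examples, and the function names build_command and extract_text are made up for this sketch.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed mapping from file extension to an external converter command.
# pdftotext writes to stdout when the output file is given as "-";
# antiword prints the text of a .doc file to stdout.
CONVERTERS = {
    ".pdf": lambda path: ["pdftotext", path, "-"],
    ".doc": lambda path: ["antiword", path],
}


def build_command(path: str) -> list[str]:
    """Return the converter command for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in CONVERTERS:
        raise ValueError(f"no converter configured for {suffix!r}")
    return CONVERTERS[suffix](path)


def extract_text(path: str) -> str:
    """Run the external converter and return the extracted plain text."""
    command = build_command(path)
    if shutil.which(command[0]) is None:
        raise RuntimeError(f"{command[0]} is not installed")
    result = subprocess.run(command, capture_output=True, check=True)
    # Undecodable bytes are replaced rather than aborting the crawl.
    return result.stdout.decode("utf-8", errors="replace")
```

A crawler could call extract_text("report.pdf") on each downloaded file and pass the result to its indexer. Other converters could be plugged in by adding entries to CONVERTERS.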

Colin Colehour [PersonRank 10]

16 years ago #

The crawler will only index searchable PDFs. If the PDF is a scanned image that was never OCRed, it contains no text layer, so the crawler cannot index it.
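The point about scanned PDFs can be sketched as a simple check on whatever text a converter returns: an image-only PDF yields nothing but whitespace, so there is nothing to index. This is only an illustration, not how Google's crawler actually decides; the function name has_indexable_text and the min_chars threshold are assumptions made for this sketch.

```python
def has_indexable_text(page_texts: list[str], min_chars: int = 1) -> bool:
    """Return True if the extracted pages contain any non-whitespace text.

    page_texts holds the text extracted from each page by some converter.
    A scanned PDF without OCR typically produces empty or whitespace-only pages.
    """
    # str.split() with no arguments drops all whitespace, including form feeds.
    total = sum(len("".join(text.split())) for text in page_texts)
    return total >= min_chars
```

A crawler could skip indexing, or hand the file to an OCR step, whenever this returns False.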


