No more cached copy of PDF files? - Google Blogoscoped Forum


No more cached copy of PDF files?
milivella	Wednesday, June 4, 2008 16 years ago • 3,966 views
It seems that Google no longer offers the option to view the cached (HTML) copy of PDF files. See e.g. http://www.google.com/search?q=filetype%3Apdf Why? I don't know whether this is old news.
Tony Ruscoe	16 years ago #
Weird. Some files still have the "View as HTML" link when you don't search using the filetype operator. There are some in these search results: e.g. http://www.google.com/search?q=site%3Aadobe.com+inurl%3Apdf
Ionut Alex. Chitu	16 years ago #
I'd say that PDF results that can't be viewed as HTML are quite rare: http://www.google.com/search?hl=en&q=filetype%3Apdf+hamlet http://www.google.com/search?q=filetype%3Apdf+confidential
Tony Ruscoe	16 years ago #
There's no good reason why they can't be viewed as HTML. From what I've seen of PDF-to-HTML converters, if Google can extract the text, it should be able to display the files as HTML.
milivella	16 years ago #
So some PDFs can be viewed as HTML, some cannot, and the reason is not copyright (if I'm not mistaken)...
Tony Ruscoe	16 years ago #
Correct. It doesn't seem to be related to copyright: http://www.google.com/search?q=filetype%3Apdf+%22Copyright+2000..2008%22+%22All+rights+reserved%22
Ianf	16 years ago #
I have a vague memory of this universal PDF-as-HTML policy being changed a year+ ago, making Google observe the no-copy-text flag in the docs, in line with Adobe Acrobat's requirements (this is different from the print flag).
Roger Browne	16 years ago #
I also remember Google announcing the policy change that Ianf mentions, although I'd say it was more like two years ago.
milivella	16 years ago #
Thanks for the feedback about the policy change (from a quick search, though, it doesn't seem easy to find information about it).
