Google Blogoscoped

Tuesday, October 19, 2004

Google Desktop Security

I find it funny that some people argue the Google Desktop search is risky because people at work often share a single account on a computer.

The argument goes like this: you might have forgotten to secure your files the way they’re meant to be secured (with an account that has a password), and now someone else digs them up in a search. But shouldn’t people cure the source of the problem, not the symptom?
Even the Windows search, as slow as it is, digs up files on one and the same computer if you don’t protect them. (Admittedly, it does so very badly, so finding something might take longer than the user’s patience lasts.) In any case, these security risks brought up against the Google Desktop are especially ironic because the boss in particular can already happily read along – he doesn’t need Google to do that, just a system administrator who will forward the right data to him.

Now there’s something left to discuss, which indeed is critical, and it applies to both the Google web search and the Google Desktop. It is a search application’s cache (which, in the case of Google Desktop, can possibly be cracked if it’s not already exposed in the user interface). It is critical for two reasons: it doesn’t respect copyright, and it doesn’t respect the fact that you deleted a file.

First, the copyright issue*. Have you ever wondered how Google web search can get away with copying millions of texts and images? Well, we’re glad they can, and apparently it’s legal (or at least, nobody has successfully sued them over it).
We’re glad because it’s so useful the web would be worse off without it. But Google is effectively storing a redundant copy of files, and unless you complain (there’s the “noarchive” meta tag, the “robots.txt” file, and the Digital Millennium Copyright Act, DMCA), that’s where people will see them.
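To make the opt-out side of this concrete: the “noarchive” tag is a line like `<meta name="robots" content="noarchive">` in a page’s head, and “robots.txt” is a plain text file of crawl rules. Here is a minimal sketch of how a well-behaved crawler might consult such rules, using Python’s standard `urllib.robotparser`. The rules, the `ExampleBot` name, and the URLs are made up for illustration, and this says nothing about how Google’s own crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(robots_txt, user_agent, url):
    """Return True if the given rules allow this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/notes.html"):
        print(url, may_crawl(ROBOTS_TXT, "ExampleBot", url))
```

Note that this only works if the crawler chooses to ask; nothing in the file itself can enforce the rule.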

*Of course, the copyright issue is a fuzzy one with a lot of implications. We can start with the browser itself, which is used to access sites: it will store a copy of the file in its own cache. It has to store it somewhere, because that’s what downloading means (and you need to download a site to view it): to make a redundant copy on your hard disk. If we treat Google as a neutral machine – an extension to the browser, which lets us access everything with robotic indifference – then that apparently makes the copyright issue fade away. The biggest reason most people won’t complain, however, is the fact that they upload something to a public server for the sole reason of having it be accessed by many people – and Google might bring them the most visitors.

Second, deleting a file. How does the Google Desktop search handle this? I tested this by creating a text file with a unique string in it. I then threw the file in the Windows trash can, which I emptied afterwards. And guess what, the file can now still be found, and Google Desktop offers me its full-text cache of the file even though it had been removed.
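The test above can be mimicked with a toy index. This is only a sketch of the general idea (a search tool that keeps its own text copy and never checks whether the original still exists); the `ToyDesktopIndex` class and its methods are invented for illustration and are not how Google Desktop actually works internally.

```python
import os
import tempfile


class ToyDesktopIndex:
    """Hypothetical stand-in for a desktop search index that keeps a text cache."""

    def __init__(self):
        self.cache = {}  # path -> full text captured at indexing time

    def index_file(self, path):
        with open(path, encoding="utf-8") as handle:
            self.cache[path] = handle.read()

    def search(self, term):
        # Looks only at cached copies; never checks whether the file still exists.
        return [path for path, text in self.cache.items() if term in text]

    def purge_missing(self):
        # What a "forgetful" index would have to do: drop entries for deleted files.
        for path in list(self.cache):
            if not os.path.exists(path):
                del self.cache[path]


def run_deletion_test():
    index = ToyDesktopIndex()
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "secret.txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("unique-string-xq7z")
        index.index_file(path)
        os.remove(path)  # gone for good, like emptying the trash
        found_after_delete = index.search("unique-string-xq7z")
        index.purge_missing()
        found_after_purge = index.search("unique-string-xq7z")
    return found_after_delete, found_after_purge


if __name__ == "__main__":
    print(run_deletion_test())
```

The first result still lists the deleted file, because the cache is a separate copy; only the explicit purge step makes the index forget it.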

While this isn’t a real problem, and shouldn’t be (in fact, it can be a life-saver if you accidentally deleted an important file), we also should be very aware of it. There’s no “noarchive” or “robots.txt” either, but it was of course our choice to install the Google Desktop application, and we can also tweak its indexing settings in a variety of ways.

Currently, Google Desktop allows us to enable or disable searching of:

In addition to this, we can provide one or more paths for Google Desktop to ignore. (Naturally this also got rid of the cache of the text file I used to test.)

And of course, we can choose to not install the Google Desktop. (Well, we installed the Windows Desktop, and trust it, or most of us do – and that’s Microsoft, for crying out loud.) If we do, we should know: computers are really bad at forgetting.

20th Century Word Popularity

Oxford University Press undertook a study of when words of the last century were first used. The results might not always be scientifically correct because the corpus is just a selection, but they are nevertheless interesting. Some examples:

[Via Hotlinks.]


“TV-B-Gone [is] a new universal remote that turns off almost any television. The device, which looks like an automobile remote, has just one button. When activated, it spends over a minute flashing out 209 different codes to turn off televisions, the most popular brands first.”
– Steven Bodzin, Inventor Rejoices as TVs Go Dark (Wired), Oct. 19 2004

Google Desktop Privacy

The privacy meme took off again, and once more it’s targeting Google:

“People who use public or workplace computers for e-mail, instant messaging and Web searching have a new privacy risk to worry about: Google’s free new tool that indexes a PC’s contents for quickly locating data.

If it’s installed on computers at libraries and Internet cafes, users could unwittingly allow people who follow them on the PCs, for example, to see sensitive information in e-mails they’ve exchanged. That could mean revealed passwords, conversations with doctors, or viewed Web pages detailing online purchases. (...)

Acknowledging the concerns, [Marissa Mayer, director of consumer Web products at Google Inc.] said managers of shared computers should think twice about installing the software until Google develops advanced features like password protection and multi-user support.”
– Anick Jesdanun, New Google tool for searching computers a privacy risk on shared PCs, October 19, 2004

Yahoo Job Offer

Zawodny of Yahoo is looking for people who know a bit about web services.

Do You Yahoo?

Possibly the most useless page on Yahoo. [Via SER.]

Working on the Google Browser?

Battelle finds this piece by Markoff of The New York Times:

“If you drive by the Google buildings in the evening,” said a person who has detailed knowledge of the company’s business, “the lights that are still on are the ones on the floor where they are working on the browser.”

Gmail Cartoon

Via Markus comes a pointer to this Gmail strip at UserFriendly.

Charlene Li’s Blog

Steve Rubel alerted me to Charlene Li’s new blog on technology, marketing, and search. It looks relevant and very interesting, and I added it to my search channels overview.

