Google Blogoscoped

Monday, May 15, 2006

10 Things You Might Not Know About Censorware

This article is written by Seth Finkelstein as part of the Blog Swap with Philipp Lenssen – Philipp’s article on 10 Things You Might Not Know About Google can be found at Seth’s blog.

The Net interprets censorship as damage, and routes around it.
– John Gilmore (famous quote)

What if censorship is in the router?
Seth Finkelstein

1. Censorware isn’t just for kids

Censorware (often misleadingly called “filtering”) is software designed and optimized for use by an authority to prevent another person from sending or receiving information. In the United States, it was first discussed in connection with parents who wanted to prevent children from seeing sex-related information. But the issues are by no means restricted to that situation. The very same software that parents use at home is legally imposed on libraries (those which accept certain government funds) and employed by dictatorial governments, e.g. “The Great Firewall Of China”.

2. Programmers have been sued for publishing reverse-engineering of censorware

In March of 2000, programmers Eddy L O Jansson and Matthew Skala published a report “The Breaking of Cyber Patrol 4” (“Several attacks are presented on the “sophisticated anti-hacker security” features of Cyber Patrol 4, a “censorware” product intended to prevent users from accessing Internet content considered harmful ... Excerpts from the list of blocked sites are presented and commented upon. A package of source code and binaries implementing the attacks is included.”)

They were quickly sued by CyberPatrol, for copyright infringement, breach of licensing agreement, theft of trade secrets, and other charges. The case was settled out of court, as the programmers could not afford the costs of a legal defense. As Skala wrote:

What I found out was that those organizations, through no fault of their own, were able to give me a lot of sympathy and not enough of anything else, particularly money, to bring my personal risk of tragic consequences down to an acceptable level, despite, incredibly, the fact that what I had done was legal. Ultimately, I couldn’t rely on anybody to deal with my problems but myself.

Some people learn that lesson a bit less impressively than I had to.

Disclosure note: The author of this article, Seth Finkelstein, had done similar reverse-engineering of CyberPatrol much earlier, but had not published a report about it, and remained anonymous for years when providing information about censorware, out of fear of a lawsuit.

3. Censorware often blacklists language translation sites, as a LOOPHOLE

Too much of the discussion about censorware takes place in terms of the misnomer “filtering”. That conjures up an image of removing evil, yucky, even toxic material, while leaving a purified result. The constant chant of “porn, pornography, harmful to minors, obscenity, child porn, pr0n, porno, PORN ...” often keeps issues framed in these terms. People sometimes get the idea that censorware is intended to remove evil sites. No. It is designed to control what people are permitted to read. That is a very different problem. It implies that even if there were a perfect blacklist for sex or other prohibited material, censorware would still need to ban anonymity, privacy, language translation sites and more. Because all such sites, no matter how functional and useful they may be, have the capability to allow a reader to view any other site. They are a LOOPHOLE.
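To make the loophole concrete, here is a minimal sketch in Python. It assumes a naive checker that looks only at the hostname being requested; the hostnames, the `u` query parameter, and the function names are all hypothetical and are not taken from any real censorware product or translation service.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical blacklist of hostnames (illustration only).
BLACKLIST = {"banned.example"}


def is_blocked(url, blacklist=BLACKLIST):
    """Naive check: block only when the requested hostname itself is listed."""
    host = urlparse(url).hostname or ""
    return host in blacklist


def relay_target(url, param="u"):
    """Return the page a relay-style URL asks the relay to fetch, if any.

    The parameter name is an assumption; real services differ.
    """
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else None


direct = "http://banned.example/page"
via_relay = "http://translate.example/translate?u=http://banned.example/page"

print(is_blocked(direct))       # the banned host is requested directly
print(is_blocked(via_relay))    # the request goes to the relay's host instead
print(relay_target(via_relay))  # yet the relay will fetch the banned page
```

Under these assumptions the direct request is blocked while the relayed one is not, even though the reader ends up with the same page. A blacklist that wants to close the gap has little choice but to list the relay itself, however useful that relay is for everything else.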

4. Censorware often blacklists the Google cache

The issue of third-party sites allowing users to escape from censorware applies with particular force to Google’s cache. Since virtually all web pages indexed by Google can be retrieved from a cache, those caches become a way of circumventing censorship. As detailed in Google testimony:

The cause of the slowness and unreliability appears to be, in large measure, the extensive filtering performed by China’s licensed Internet Service Providers (ISPs). ... Each ISP is legally obligated to implement its own filtering mechanisms, leading to diverse and sometimes inconsistent outcomes across the network at any given moment. For example, some of Google’s services appear to be unavailable to Chinese users nearly always, including Google News, the Google cache (i.e., our service that maintains stored copies of web pages), and Blogspot (the site that hosts weblogs of Blogger customers). Other services, such as Google Image Search, can be reached about half the time. Still others, such as, Froogle, and Google Maps, are unavailable only around 10% of the time.

5. Censorware research has been one of the few successful DMCA exemptions

The U.S. Copyright Office conducts special hearings every three years for “Rulemaking on Exemptions from Prohibition on Circumvention of Technological Measures”, as part of the Digital Millennium Copyright Act (DMCA). This is a process where researchers or other interested parties may ask for an exemption from one aspect of that law, the prohibition on bypassing controls which restrict access to copyrighted works (note that an exemption is not perpetual; it applies only for a three-year period). One of the few exemptions granted (in the 2003 rulemaking language) was:

1. Compilations consisting of lists of Internet locations blocked by commercially marketed filtering software applications that are intended to prevent access to domains, websites or portions of websites, but not including lists of Internet locations blocked by software applications that operate exclusively to protect against damage to a computer or computer network or lists of Internet locations blocked by software applications that operate exclusively to prevent receipt of e-mail. For purposes of this exemption, “Internet locations” are defined to include “domains, uniform resource locators (URLs), numeric IP addresses or any combination thereof.”

6. Legal arguments over the effectiveness of censorware were the reason for the subpoena for data from Google and other search engines

Contrary to popular myth, the subpoena for Google search data had nothing to do with investigating child porn. It was driven by a longstanding legal argument regarding the effectiveness of censorware. As the declaration of expert witness Philip B Stark proposes:

3. Reviewing URLs available through search engines will help us understand what sites users can find using search engines, to estimate the prevalence of harmful-to-minors (HTM) materials among such sites, to characterize those sites, and to measure the effectiveness of content filters in screening HTM materials from those sites.

4. Reviewing user queries to search engines will help us understand the search behavior of current web users, to estimate how often web users encounter HTM materials through searches, and to measure the effectiveness of filters in screening those materials.

7. If censorware works for parents to control children in the US, it’ll work for governments to control citizens in e.g. China. Contrariwise, if censorware can’t work for governments to control citizens in e.g. China, it can’t work for parents to control children in the US.

Many discussions of censorware tend to revolve around statements of values, usually concerning which authorities have legitimate rights of control, in what contexts. Typically the values are that parents have a right to prohibit their children from reading certain materials, employers can control what employees view, but governments should not censor citizens’ ability to obtain information. However, the technical implications here are essentially identical, no matter what the social relationships.

So there’s a deep problem in efforts to bypass Internet censorship. If citizens can escape from government control, then children can escape from parents’ control. But if restricting information works on minors in the US, it’ll work on citizens under dictatorial governments. Either way, the results are problematic.

8. Nobody wants the “.XXX domain”, except people trying to make money from it.

One frequently seen proposal is to have a domain extension which would be specific to sexual material, a “.XXX” domain. While this is a very appealing proposal, which produces much pontification, almost every interest group involved in the debate thinks it’s a bad idea. Civil libertarians oppose it because they fear it will be used as a tool to marginalize a vague and broad range of speech. Censors oppose it because they fear it will be used as a tool to legitimize material which they want to criminalize. Many technical people oppose it because it’s a very poor implementation of a simple-minded ratings system. Many webmasters oppose it because they don’t want to change their existing domain names.

Essentially the only major group in favor of it is the group which wants to sell the .XXX domain names.

9. Nobody wants a kids-only domain, except politicians

Another frequently seen proposal is to have a domain extension which would be specific to “kids” material. This has been implemented as ””, which has been a spectacular failure. Years after creation, it’s had around “thirteen live sites available for use”. It’s another simple ratings system shoehorned into domain naming. Website owners don’t want to invest in new domain names. One of the creators has said:

“I never want to make enemies of people who may see the light, and I don’t think (the restrictions) are onerous. But what I do think it does is that if they have a similar dot-com site where they can market goods, they’d rather be there”

10. Censorware sex blacklists are overall very boring

There’s an unjustified mystique associated with censorware blacklists of sex sites, a salacious idea that they’re great collections of pornography. In reality, they’re mostly collections of junk. There’s little incentive to remove an entry from a blacklist, so they fill up with expired or unreachable sites. The expansive definition of sexual material means there are often plenty of items which only the most prudish would consider arousing (again, there’s an impetus to err on the side of more listings rather than fewer). Plus there are usually many duplications and redirects of the same site. If you want porn sites, go to a pornography blog.

