Google Blogoscoped

Thursday, July 5, 2007

Check Which Google Result Number is Censored

It’s hard to know exactly what is missing in a self-censored Google China result, because Google at the end of the page only discloses that something is missing (but not precisely which page). One way to find out more is to compare Google.cn results with those from the international Chinese Google search (and then to do a “site” operator search on Google.cn for each of the domains you see only in the international results). However, this methodology isn’t always correct because different Google sites sometimes show different results, even when it comes to two Chinese Google variants.
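The comparison step can be sketched roughly in Python. This is only an illustration, not a finished tool: it assumes you have already collected the result URLs from both Google sites by hand or by some other means, and the function names (domains, missing_domains, site_query) are made up for this example.

```python
from urllib.parse import urlsplit


def domains(urls):
    """Return the set of host names found in a list of result URLs."""
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname
        if host:  # skip anything that does not parse to a host name
            hosts.add(host)
    return hosts


def missing_domains(international_urls, cn_urls):
    """Domains that show up internationally but not on Google.cn (sorted)."""
    return sorted(domains(international_urls) - domains(cn_urls))


def site_query(domain):
    """Build the follow-up "site" operator query for one domain."""
    return f"site:{domain}"
```

Each domain returned by missing_domains is only a candidate. As noted above, the two sites can differ for reasons other than censorship, so the follow-up site search on Google.cn is still needed.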

Another method lets you find out exactly which position Google removed (removed, as we know, following requests from the Chinese government). First, set your browser’s language settings to Chinese; you can then visit Google.cn without being redirected away if you’re outside China. To know when you’ve hit a censored result, look for this italic message at the footer of the page:

据当地法律法规和政策,部分搜索结果未予显示。

When you see this message (which roughly says that due to local laws and policies, some results are missing), append “&num=1” to the Google URL. This will restrict the result set from 10 results to 1. The censorship message might now disappear, but you can then increase this number step by step (num=2, num=3 and so on). The first time you see the self-censorship disclaimer reappear, you know that this specific rank is missing; e.g. if the disclaimer first reappears at num=5, the fifth most relevant page for your query has been removed.
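The step-by-step probing can be sketched in Python as well. This is a minimal sketch under a few assumptions: the search URL layout (http://www.google.cn/search with q and num parameters) is my guess at what the address bar shows, fetch_page is a hypothetical function you supply yourself (it should return the page HTML and request it with a Chinese language setting, as described above), and the disclaimer is matched by its second half only so that punctuation differences don’t matter.

```python
from urllib.parse import urlencode

# Second half of the footer message; enough to recognize the disclaimer.
DISCLAIMER = "部分搜索结果未予显示"


def search_url(query, num):
    """Build a Google.cn search URL limited to `num` results (assumed layout)."""
    return "http://www.google.cn/search?" + urlencode({"q": query, "num": num})


def first_censored_rank(query, fetch_page, max_num=10):
    """Return the first `num` at which the disclaimer appears, or None.

    `fetch_page` is a caller-supplied (hypothetical) function that takes a
    URL and returns the page HTML as a string.
    """
    for num in range(1, max_num + 1):
        html = fetch_page(search_url(query, num))
        if DISCLAIMER in html:
            return num  # the result at this rank is the one left out
    return None  # no disclaimer seen within the first max_num results
```

Note that this only finds the first missing rank. Once the disclaimer is showing, it stays visible for every larger num, so further removed positions can’t be told apart this way.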

As a sample query, you can search Google.cn for 纯如.

Before going on with Google, I want to briefly explain what this search query means, and some of this background is disturbing. According to Wikipedia, that is the Chinese name of Iris Chang, who wrote The Rape of Nanking detailing the brutal events during the war between China and Japan. Amazon UK in their book review writes:

After fierce fighting in Shanghai, the Japanese occupied the old Chinese imperial city of Nanking on 13 December 1937. Over the next six weeks, the Japanese massacred more than 300,000 Chinese and raped more than 80,000 women. But these bare figures don’t begin to describe the atrocities. The Japanese indulged in execution contests to see who could behead the most civilians in the shortest time, they burned their victims, they buried them alive, they set dogs on them. No form of mutilation and torture was too extreme or bizarre and no one escaped. Men, women, children and babies were all butchered.

Iris Chang committed suicide in 2004 after a period of depression resulting from a nervous breakdown, as Wikipedia writes.

Using the methodology above, we can determine that the third result for Iris Chang’s Chinese name is missing (as well as the fifth result for her English name, and the fifth result for “The Rape of Nanking”; we do not know, however, whether the censorship relates directly to Iris Chang, as Google mostly works from a domain blacklist). That is at least what Google reports, because the method relies on what Google discloses. Google once said, “Chinese regulations will require us to remove some sensitive information from our search results. When we do so, we’ll disclose this to users”. However, it’s important to note that Google doesn’t always directly disclose their self-censorship on Google.cn. Compare the results for “The Rape of Nanking” on Google Book Search. The left side shows Google.cn, which, as Google discloses elsewhere on the site, only includes works from China mainland publishers. The right side shows results from Google.de, with a Chinese interface.

I think it’s worthwhile to separate two issues in this discussion. One is the discussion of whether or not Google’s censorship compromise was worth making (“It’s a decision that reasonable people ... might disagree on ... given the trade-offs,” Google’s CEO Eric Schmidt told reporters on June 19th this year). The other is the discussion of what exactly this censorship entails, which we can only find out by probing Google’s servers as above; Google does not communicate these things openly. In this specific case, by restricting themselves to mainland China publishers only – which is not a usability choice, as e.g. German Google Books results aren’t restricted to German publishers – Google self-censored about 672 books, from The Rape of Nanking itself to the numerous mentions this book received in other works.

