Google Blogoscoped

Forum

Dear Gmail, I don't speak Chinese. Or Korean. Or Farsi. Or any language but English.

Brian Mingus [PersonRank 10]

Wednesday, June 11, 2008
11 years ago | 4,154 views

Look guys, I know your machine learning filters are state of the art, that you put a lot of effort into fighting spam, and that you have one of the best spam filters on the market. I also know that some people do speak multiple languages, so monolingual users like me might be a bit confusing for your filters. But come on. I have never sent an e-mail in a language other than English. A legitimate e-mail has never been sent to me in a language other than English. This doesn't take state-of-the-art machine learning, just a rule-based filter. If the user obviously doesn't speak the language, <i>don't send e-mails to them that are written in other languages</i>.

I have been getting e-mails from "UMLChina" in my inbox <i>for over a year</i>. I have reported every single one of them as spam. The number one type of spam that lands in my inbox is foreign-language spam, often in character sets I don't even recognize. Come on Gmail, 99% of all of the characters in my inbox are in the tiny ASCII subset of Unicode. If someone sends me an e-mail loaded with non-ASCII Unicode characters, there is a 99% chance that it is spam. It's so not hard!
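
To make the rule concrete, here is a minimal Python sketch of that kind of check: measure what share of a message's characters fall outside ASCII and flag the message if the share is high. The function names and the 0.3 threshold are assumptions made up for illustration; nothing here reflects how Gmail's filters actually work.

```python
def non_ascii_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters outside ASCII."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if ord(c) > 127) / len(chars)


def looks_foreign(text: str, threshold: float = 0.3) -> bool:
    """Flag a message whose non-ASCII share exceeds the threshold.

    The 0.3 default is an arbitrary, hypothetical cutoff, not a tuned value.
    """
    return non_ascii_ratio(text) > threshold
```

A stray accented letter in an otherwise English message stays well under such a threshold, while a message written entirely in Chinese or Korean characters goes far over it. A real filter would also have to allow for users who do read those languages.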

Ramibotros [PersonRank 10]

11 years ago #

Try filtering e-mails from UMLChina so that they land in the trash.

Brian Mingus [PersonRank 10]

11 years ago #

The reductio ad absurdum on that approach is truly absurd.

Tony Ruscoe [PersonRank 10]

11 years ago #

Unfortunately, you can't automatically mark mails as spam, so I've created my own spam label and do this myself with this filter:

Has the words: lang:zh
Doesn't have: in:spam
Skip the Inbox (Archive it)
Apply the label: X-Spam

Of course, I could just delete them, but I'd like to check that my filter is working correctly. It's caught quite a few so far, and all have been spam.

(BTW, don't try using [-lang:en] as that doesn't work very well...)

kowach [PersonRank 0]

11 years ago #

This works:

-{lang:en}
