Google Blogoscoped

Forum

No comments possible for the latest blog by philipp

Rohit Srivastwa [PersonRank 10]

Thursday, July 19, 2007
17 years ago, 3,495 views

Philipp's latest blog entry, where he shows some special results for some special searches, was nice.

I tried posting a comment & it was blocked by Philipp's script, which said that there are *some* bad words in there, you know what I'm talking about ;)

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

The title contains *viagra*, so he should either change the title or the filter.

Ludwik Trammer [PersonRank 10]

17 years ago #

The filters are very annoying. Just yesterday I couldn't post a YouTube video because the video's ID included "h[remove-this]1". There really are better anti-spam solutions...

Roger Browne [PersonRank 10]

17 years ago #

Or, we could just post comments about the V* search results here.

I can get Uclue to be the first result by using the following search (after substituting the V-word where indicated):
V* amusement and shenanigans specifically at Microsoft
http://www.google.com/search?q=%76iagra+amusement+and+shenanigans+specifically+at+Microsoft

Of course, the Uclue page isn't _really_ about V* shenanigans at Microsoft!
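
The link above gets past the word filter because the first letter is percent-encoded: %76 is the hex code for "v". A minimal sketch of how the query decodes, using Python's standard library (the helper name `decode_query` is made up for illustration):

```python
from urllib.parse import unquote_plus


def decode_query(encoded: str) -> str:
    """Decode a URL query value: %XX escapes become characters, '+' becomes a space."""
    return unquote_plus(encoded)


encoded = "%76iagra+amusement+and+shenanigans+specifically+at+Microsoft"
# A plain substring filter looking for the V-word finds nothing in the encoded form,
# while the search engine sees the decoded text.
print(decode_query(encoded))
```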

Rohit Srivastwa [PersonRank 10]

17 years ago #

Hey Chitu,
why didn't your post get filtered?

Rohit Srivastwa [PersonRank 10]

17 years ago #

BTW, I wanted to post a comment saying that for me the Google results put Blogoscoped second; first is some mansbags.com.

Philipp Lenssen [PersonRank 10]

17 years ago #

Ooops. I removed "viagra" from the blacklist for now, sorry! Please post again what you wanted to comment...

> There really are better anti-spam solutions...

Hmm. I might try to show a captcha when you hit on a blacklisted word. I'm not a big fan of other anti-spam solutions (like math puzzles or captchas by default...).

Ludwik Trammer [PersonRank 10]

17 years ago #

> I'm not a big fan of other anti-spam solutions

There is no good anti-spam solution for big services, but smaller sites like Google Blogoscoped can fool bots very easily. You don't have to display complicated captchas; you can just ask the user in plain text to type the word "cat" or the result of 5+3. My problem with spam in comments stopped completely when I changed the HTML names of the inputs from English to Polish and started validating the provided e-mail address server-side. Spam bots just don't know which of the inputs is for e-mail, and fail.

You could even add a text input with the label "Don't type anything here", then hide both via CSS so users don't see anything new. Spam bots will then be the only ones that don't leave it blank.
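
A minimal sketch of that hidden-field check, assuming a Python backend that receives the submitted form as a dict. The field name `comment_extra` and the CSS class `trap` are hypothetical; the thread does not specify any names.

```python
# Hypothetical, deliberately neutral field name (an assumption, not from the thread).
HONEYPOT_FIELD = "comment_extra"


def render_trap_field() -> str:
    """HTML for the trap input and its label.

    The 'trap' class is assumed to be hidden with display:none in the site's stylesheet.
    """
    return (
        '<label class="trap" for="{0}">Don\'t type anything here</label>'
        '<input class="trap" type="text" name="{0}" id="{0}" value="">'
    ).format(HONEYPOT_FIELD)


def is_probably_bot(form: dict) -> bool:
    """Return True when the hidden trap field came back non-empty."""
    # Humans never see the field, so it stays empty; bots that fill every input do not.
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A submission such as `{"comment": "hi", "comment_extra": "buy now"}` would be flagged, while one with an empty or missing `comment_extra` would pass.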

Philipp Lenssen [PersonRank 10]

17 years ago #

> You don't have to display complicated captchas,
> you can just ask the user in plain text to type the word "cat" or the
> result of 5+3.

Yeah, but I also don't like those...
Of course, a word blacklist is worse if it always appears, but I hope the blacklist doesn't appear too often, and if it does there are often ways to amend it...

> You could even make a text-input with label "Don't
> type anything here", then hide both via CSS, so users
> don't even see anything new, and then SPAM bots will
> be the only ones that don't leave this blank.

I might want to try this, you're right...

Ludwik Trammer [PersonRank 10]

17 years ago #

> I might want to try this, you're right...

I think that's a great and (unlike captcha) very elegant solution, working even in text browsers. There are two variants of this – more and less aggressive.
In the first one you pick some neutral name for the hidden input. This will probably work, because bots like to fill in all the inputs, even those whose names they don't recognize (to minimize the chance of a "you have to fill in all the fields" error).
If this doesn't work (but I bet it will) you'll have to adopt the more aggressive approach and rename the input to something that bots will recognize, like "mail". This could, however, generate some false positives, due to browser add-ons that automatically fill out some basic fields for the user. So you would have to prepare a second step in which the user could send the post anyway (by proving they are human in some other way).
The less aggressive approach doesn't share this problem.

Niraj Sanghvi [PersonRank 10]

17 years ago #

Philipp, the hidden field has worked extremely well for me...I went from using filters and still getting 20-30 spam comments coming through per day, to getting literally zero spam comments per day. The only way spam comments ever show up now is if someone actually manually leaves one. Bots always get caught.

But there are more ways that are still invisible to the user. I highly recommend this article which not only mentions hidden fields, but also a spinner hash, timestamping, etc.
http://nedbatchelder.com/text/stopbots.html
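
A rough sketch of the timestamp and spinner ideas, loosely in the spirit of what the post mentions. The details are assumptions rather than the linked article's exact scheme: the secret, the time limits, the function names and the choice of HMAC-SHA256 are all made up for illustration.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"   # hypothetical server-side secret
MIN_SECONDS = 3         # assumed: a form submitted faster than this looks automated
MAX_SECONDS = 60 * 60   # assumed: older forms are treated as replays


def make_spinner(timestamp: int, client_ip: str, entry_id: str) -> str:
    """Hash the values sent out with the form so they cannot be changed unnoticed."""
    message = f"{timestamp}:{client_ip}:{entry_id}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def form_is_valid(timestamp, spinner, client_ip, entry_id, now=None) -> bool:
    """Check that the spinner matches and the form age is within the allowed window."""
    now = int(time.time()) if now is None else now
    expected = make_spinner(timestamp, client_ip, entry_id)
    # Compare as bytes so non-ASCII input from a client cannot raise an error.
    if not hmac.compare_digest(expected.encode(), str(spinner).encode()):
        return False
    return MIN_SECONDS <= now - timestamp <= MAX_SECONDS
```

The timestamp and spinner would travel as hidden form fields; the server recomputes the hash on submission, so a bot that replays an old form or posts from a different address fails the check.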

Hong Xiaowan [PersonRank 10]

17 years ago #

Although a blacklist sometimes causes trouble, I think it is a good way to prevent spam.

I love the style of this blog. No registration, no login, yet it keeps the value of real names and keeps the content in the forum pure.

I have been studying this forum for some time, and I am building a similar one.

Rohit Srivastwa [PersonRank 10]

17 years ago #

IMO a math captcha would be nice.
I have seen Drupal's nice math captcha, where it asks you to do some simple arithmetic & then write the answer.

Don't use complex mathematics, I'm not good at that ;)

James Xuan [PersonRank 10]

17 years ago #

If a=4 and b=27 in this equation, what is the value of y?
b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b*b(a)

Google Calc says: A lot

Rohit Srivastwa [PersonRank 10]

17 years ago #

One important thing:
Captcha (or the other methods we are discussing) will surely stop automated reply bots from adding garbage, but what about users who do this intentionally?

Some casual visitors post a lot of random stuff on the site, and that might spoil the spirit here. We all notice the many signatures stripped by Philipp/Tony. And mostly these are users with 0 PR or users with 2-3 posts.

I also remember one of my comments was blocked because it contained the words *good*job*. I was really impressed by the idea of blocking that combination (whose idea was that?).

I think Philipp & the team who help him will have to maintain a double filter:
a captcha for the bots & still a manual filter for these kinds of users.

Thoughts??

Colin Colehour [PersonRank 10]

17 years ago #

Philipp, have you ever thought about moderating a user's first post like Matt Cutts does on his blog? I've seen this done on several high profile blogs. It might cut out the random garbage that gets posted.

David Hetfield [PersonRank 10]

17 years ago #

Yeah Colin's offer could be real nice Philipp. :)

Philipp Lenssen [PersonRank 10]

17 years ago #

> I also remember one of my comments was
> blocked because it contained the words *good*job*.
> I was really impressed by the idea of blocking
> that combination (whose idea was that?).

This was put in to fight a certain type of spam which always goes like this (I'm paraphrasing, it's available in 100s of variants):

"Hey, i like your site, good job!"
"Nice article, well done!"
"Great job, good post!"
"Nice site design, I like it!"

Now, you might flatter yourself into thinking this is honest praise, but it always comes en masse, and always weeks after you posted something – and the praise is always so generic that it could apply to *every* post (it's very scalable). This is also one of the hardest to blacklist, because "good job" is indeed something that someone who's not a spammer might honestly say.

I haven't completely figured out the purpose of these spam posts, by the way, because they don't include a signature. It's possible it's testing-the-water spam (seeing if something remains up in a thread – I've often seen this kind of comment followed, after some time, by a bunch of plain spam comments), or it's just that this blog doesn't include the URL field which the bot would actually aim for.

> Philipp, have you ever thought about moderating a
> user's first post like Matt Cutts does on his blog?
> I've seen this done on several high profile blogs.
> It might cut out the random garbage that gets posted.

I'll keep that in mind once the forum becomes unmanageable due to spam...

Ludwik Trammer [PersonRank 10]

17 years ago #

> Captcha for the bots & still manual filter for these kind of users

There is no way to do automatic filters for real users!

> you might flatter yourself into
> thinking this is honest praise

I think one of the great things about this forum (not the most important one, though) is how easy it is to post – without registration, captchas, moderation... But by applying more and more filters you change this dramatically. Even right now it would be a lot easier to have one captcha every couple of days (using a cookie) than to have all those annoying filters. Filters are especially bad because you have to alter your post – I just hate this.

This forum deserves a full anti-bot solution. I'm a big fan of the hidden input one. It's just so elegant and transparent to the user.

Tony Ruscoe [PersonRank 10]

17 years ago #

<< Even right now it would be a lot easier to have one captcha every couple of days (using a cookie) than to have all those annoying filters. Filters are especially bad because you have to alter your post – I just hate this. >>

Either way, you'd have to think more and / or type something extra.

How many times do the filters actually affect you? I've probably only ever been stopped from submitting exactly what I wanted to submit a couple of times in 2-3 years.

I think a hidden / dummy field is definitely the way to go.

Philipp Lenssen [PersonRank 10]

17 years ago #

OK, there's now that hidden field, and on top of that, if you happen to enter something from the blacklist you can still get it through by answering a simple math question. Neither approach is meant to be 100% bot-proof; they're more like 80% solutions that may fend off many bots not specialized in this forum software. So thanks to all, and let's see how this goes!
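
The combined flow described above could look roughly like the sketch below. It is not the forum's actual code: the blacklist entries, the function names and the whole-word matching are assumptions (whole-word matching is chosen here to avoid hits inside longer strings such as video IDs).

```python
import random
import re

BLACKLIST = ["casino", "cheap pills"]  # hypothetical entries


def contains_blacklisted(text: str) -> bool:
    """True if any blacklisted term appears as a whole word or phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in BLACKLIST
    )


def new_challenge(rng=random):
    """Return a question and its answer; the answer must stay on the server."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"Small bot/human check, what's {a} + {b}?", a + b


def accept_post(text, trap_value="", answer=None, expected=None) -> str:
    """Return 'accepted', 'rejected' or 'challenge' for a submitted post."""
    if trap_value.strip():
        return "rejected"   # the hidden field was filled in
    if not contains_blacklisted(text):
        return "accepted"
    # Blacklisted word: let the post through only with a correct answer.
    if answer is not None and expected is not None and answer.strip() == str(expected):
        return "accepted"
    return "challenge"      # show the math question and ask again
```

In a real setup, `expected` would be kept server-side (for example in a session) rather than sent to the client along with the question.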

Ludwik Trammer [PersonRank 10]

17 years ago #

Great :) For the best anti-bot result it would probably be wiser to put display:none in the .css file, not right in the page, and to use some name other than "bottrap" for the field. But most of the bots (maybe even all, for now) are not that sophisticated.

For people who for some reason have to use a captcha, I'd like to recommend reCAPTCHA. If you don't have any alternative, that's the most useful captcha solution out there. It helps to digitize old books. It works like this:
First, books are scanned and a cutting-edge OCR program tries to turn them into text. Some words are unreadable for the OCR. Those words are put into captchas. By solving a captcha, people help to OCR the book. There are two words in every captcha: one word that the computer doesn't know yet and wants you to help it read, and another that the OCR couldn't read before but now knows, thanks to the help of other people solving captchas.

It's also an easy solution for webmasters, who don't have to make their own captcha system but get a pretty good one ready-made, with a sound component (for blind people). reCAPTCHA is supported by the people who originally came up with the whole captcha idea (the same people who came up with the idea behind Google Image Labeler).

It's http://recaptcha.net

PS: No, I'm not a bot advertising recaptcha.net ;)

Ionut Alex. Chitu [PersonRank 10]

17 years ago #

YouTube's blog needs some spam protection as well:
http://www.youtube.com/blog?entry=ojjc9w-lzyI

Ludwik Trammer [PersonRank 10]

17 years ago #

(I'm just checking what happens now when I try to add a post with "h1" in it).

semi-UPDATE: "Small bot/ human check, what's 6 + 3" and the text input below. "what's 6+3?" obviously isn't harder for a bot than "type '9' below"; computers are even better than humans at math. But it's possible that a mathematical question is easier for a human to understand. I mean, if someone asks us "what's 3+3" we answer without thinking, but we are not used to people/sites asking us to repeat what they just said.
So I don't really know which one is easier.
