Wednesday January 04, 2006

Battle With the Comment Spammers

Recently, this site has been hit with "comment spam". This is a phenomenon where commercial messages are pasted into comments on blogs. These comments are not related to the topic of the blog post, are typically for pharmaceutical products, and are posted by programs, not people.

One approach to this spam is to turn off comments altogether. This is the nuclear option, and it is pretty unattractive. This site doesn't get too many comments, but if somebody has a question on a technical topic I post, then email becomes the only way to ask it, and only the parties involved benefit from the results of the discussion.

Another approach is to moderate the comments: hide each comment until a human (me) can review it and either approve or discard it. This is a hassle, especially since the spammers are typically programs and can ramp up the volume pretty easily. I was getting 30-40 per day.
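To make the moderation idea concrete, here is a minimal sketch of a hold-until-reviewed queue. It is not the code this site runs; the names `ModerationQueue`, `submit`, `approve`, and `discard` are made up for illustration, and a real blog would keep the comments in a database rather than in memory.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Comment:
    author: str
    body: str


class ModerationQueue:
    """Holds new comments out of public view until a human reviews them."""

    def __init__(self):
        self._ids = count(1)   # hypothetical in-memory id generator
        self.pending = {}      # comment id -> Comment awaiting review
        self.published = []    # comments visible on the site

    def submit(self, author, body):
        # New comments are never shown right away; they wait in `pending`.
        comment_id = next(self._ids)
        self.pending[comment_id] = Comment(author, body)
        return comment_id

    def approve(self, comment_id):
        # The reviewer decided the comment is legitimate: publish it.
        self.published.append(self.pending.pop(comment_id))

    def discard(self, comment_id):
        # The reviewer decided the comment is spam: drop it.
        self.pending.pop(comment_id)
```

The cost is visible in the structure: every spam comment still lands in `pending` and needs a human decision, which is exactly the hassle at 30-40 per day.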

Another approach is to put in a "CAPTCHA" ("completely automated public Turing test to tell computers and humans apart"): a test that puts an image on the screen that can't be read by a computer and asks the commenter to key in the value shown in the image. Accessibility is the issue here, as those using screen readers can't participate in the commenting.

To prevent the message from providing value (most search engines treat links as votes and boost a site's ranking when other pages link to it), you could remove all the HTML or add the rel="nofollow" attribute to anchor tags, which tells participating search engines not to credit the link toward the target site's ranking. But once again, this can limit functionality for real users.
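As a rough illustration of the nofollow idea, the sketch below rebuilds a comment's HTML with Python's standard `html.parser` module and forces rel="nofollow" onto every anchor tag. The class name `NofollowRewriter` and the function `add_nofollow` are made up for this example, and it is not the code this site uses. It also does not whitelist tags, so it is no substitute for proper HTML sanitizing.

```python
from html import escape
from html.parser import HTMLParser


class NofollowRewriter(HTMLParser):
    """Rebuilds comment HTML, forcing rel="nofollow" onto every <a> tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Drop any rel value the commenter supplied, then add our own.
            attrs = [(name, value) for name, value in attrs if name != "rel"]
            attrs.append(("rel", "nofollow"))
        rendered = ""
        for name, value in attrs:
            safe_value = escape(value or "", quote=True)
            rendered += ' {}="{}"'.format(name, safe_value)
        self.parts.append("<{}{}>".format(tag, rendered))

    def handle_endtag(self, tag):
        self.parts.append("</{}>".format(tag))

    def handle_data(self, data):
        # Text was unescaped by the parser, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def add_nofollow(comment_html):
    """Return comment_html with rel="nofollow" on every anchor tag."""
    rewriter = NofollowRewriter()
    rewriter.feed(comment_html)
    rewriter.close()
    return "".join(rewriter.parts)
```

Real users can still post working links this way; the links just stop being worth anything to a spammer chasing search rankings.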

Another approach is to change the comment form in a subtle way that prevents the spammer's program from working properly. But much like security techniques, the smart spammer will eventually figure this out.
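One common form of this trick is a "honeypot" field: an extra input that is hidden from human visitors with CSS, so only a program that fills in every field will put a value in it. I'm not saying that is the change made here; the sketch below only shows the idea, and the field name `website_url` and the function `looks_automated` are hypothetical.

```python
# Hypothetical name for an input that is hidden from human visitors with CSS.
HONEYPOT_FIELD = "website_url"


def looks_automated(form):
    """Return True when the hidden honeypot field was filled in.

    `form` is assumed to be a dict of submitted field names to string values.
    """
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

A submission such as `{"comment": "Nice post", "website_url": ""}` passes, while one that fills in `website_url` gets rejected. As noted above, a spammer who inspects the form can learn to leave the field blank.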

I've implemented a combination of the above methods, and so far I've been able to thwart their attempts. I think their programs are still hitting the site, as my page views are abnormally high, but the spam actually being posted to the server has been eliminated.

What I'd really like to know, though, is 1) how did they find this site? 2) what makes them think that this site gets enough page views that spamming would have any success at reaching an audience (search engines or human eyeballs)? 3) does it really work?

I wonder if there is some sort of "confessions of a blog spammer" blog that could provide those insights.

Posted by markj at 12:40 pm  |  Last Edited by markj @ 01:02 pm
(Posted to: Technology)

About the Author

The author is a Web Application Designer working in the suburbs of Portland, Oregon.

He specializes in building user-centered, standards-based, easy-to-use applications with Oracle web technologies.

This blog will focus on the crossover of standards-based design and web application development with Oracle technology, with an occasional sprinkling of articles about his newly discovered "Entrepreneurial Spirit."

Copyright ©2004-2010 On The Mark Technology, LLC. All rights reserved.