Hmm, damn spammers

December 12th, 2005

OK, so I now get quite a lot of comment spam. It’s not exactly a torrent, and none of it ever gets through my 100% effective filtering system (i.e. me).

However, I did think about automating at least some of the process, and so at the weekend I started recording IP addresses and User-Agent strings, in a futile attempt to at least get some kind of handle on this.

What I’ve seen disturbs me. Effectively, the spammers are “spoofing” their User-Agents so well that, although it is possible to exclude them based on their User-Agent string, I would also be excluding a large proportion of my legitimate audience.

For example, one of the strings is “Mozilla/4.0 (compatible; MSIE 4.01; Windows NT Windows CE)”. However, looking this string up here (fifth entry down), you can’t really block it without also blocking genuine visitors. I’ve seen some examples of .htaccess files that try to limit some of these things, but even these get it wrong: in both my quoted examples, you will see that they completely exclude “Maxthon”, which is a legitimate tabbed shell around the Internet Explorer ActiveX control, and you can also see an exclusion for “AtHome021”, an extension added to the end of Internet Explorer’s User-Agent string by the “At Home” ISP.
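
To make the over-blocking problem concrete, here is a minimal sketch of a naive substring blocklist. The blocklist contents mirror the two tokens mentioned above; the `is_blocked` helper and the Maxthon User-Agent string are made up for illustration and are not taken from any real .htaccess file.

```python
# Hypothetical blocklist of User-Agent substrings, in the spirit of the
# .htaccess rules discussed above. The exact list is an assumption.
BLOCKED_SUBSTRINGS = ["Maxthon", "AtHome021"]


def is_blocked(user_agent: str) -> bool:
    """Return True if any blocked substring appears in the User-Agent."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in BLOCKED_SUBSTRINGS)


# An invented but plausible User-Agent for a genuine Maxthon user.
legit_maxthon_user = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Maxthon)"
# The string quoted above, as seen on spam submissions.
spammer = "Mozilla/4.0 (compatible; MSIE 4.01; Windows NT Windows CE)"

print(is_blocked(legit_maxthon_user))  # True: a real visitor is shut out
print(is_blocked(spammer))             # False: the spammer walks straight in
```

The rule blocks the genuine visitor and misses the spammer, which is exactly the wrong way round.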

I’m thinking about integrating some kind of pattern recogniser to work out the “routes” through my website that legitimate users take, and the routes that spammers take, in order to work out some kind of trail.
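
One very rough way such a recogniser might start, sketched under heavy assumptions: group requests by IP address into ordered routes, then flag any route that posts a comment without first fetching a page. The log format, the `/comment-post` endpoint name and the single rule used here are all hypothetical; a real recogniser would need far more than this.

```python
from collections import defaultdict

# Hypothetical request log: (ip, method, path) tuples in arrival order.
REQUESTS = [
    ("10.0.0.1", "GET", "/"),
    ("10.0.0.1", "GET", "/2005/12/some-post/"),
    ("10.0.0.1", "POST", "/comment-post"),
    ("10.0.0.2", "POST", "/comment-post"),
]

COMMENT_PATH = "/comment-post"  # assumed name of the comment endpoint


def routes_by_ip(requests):
    """Group requests into one ordered route per IP address."""
    routes = defaultdict(list)
    for ip, method, path in requests:
        routes[ip].append((method, path))
    return dict(routes)


def looks_like_spam(route):
    """Flag a route that POSTs a comment before any page has been fetched."""
    seen_page = False
    for method, path in route:
        if method == "GET" and path != COMMENT_PATH:
            seen_page = True
        elif method == "POST" and path == COMMENT_PATH and not seen_page:
            return True
    return False


suspects = sorted(
    ip for ip, route in routes_by_ip(REQUESTS).items() if looks_like_spam(route)
)
print(suspects)  # ['10.0.0.2']
```

Grouping by IP alone is a simplification, since several visitors can share one address and a spammer can rotate through many.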

I’ll keep you informed.
