I have a couple of urls that have become quite attractive to spammers as of late, for some stupid reason. Stupid in that most situations involving spam are stupid, as the inefficiencies would make anyone of any intelligence balk at the very concept. But still, many desperate and immoral thugs persist.

My urls that appear to make spambots salivate with misguided hope are those that allow anonymous users to add content that will be later displayed to others. Specifically, there are two:

  • anonymous trac ticket creation
  • wordpress comments

Both trac and WordPress have fantastic tools that fight spam (Akismet, for one, is priceless). These tools prevent tons of spam on my sites every day. But thanks to mindless bots, the spam, while pretty much always unsuccessful in creating tickets due to Akismet and captcha, can morph into the DoS category. I was getting 5 apache requests every second, 24×7.

I started using mod_evasive to stop the flood, which certainly helped. But it did not break the spambots to the point where they gave up. I was dealing with some seriously inept and overzealous spambotting – I don’t even have heavily trafficked sites. What recourse is left if you just. keep. getting. mindlessly. hammered!?
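For reference, a minimal mod_evasive setup in the apache conf looks something like this (the thresholds here are illustrative defaults, not necessarily what I ran with):

```apache
<IfModule mod_evasive20.c>
    # size of the hash table tracking per-client hit counts
    DOSHashTableSize    3097
    # block a client requesting the same page >5 times in 1 second
    DOSPageCount        5
    DOSPageInterval     1
    # block a client making >50 requests site-wide in 1 second
    DOSSiteCount        50
    DOSSiteInterval     1
    # how long (seconds) a blocked client gets 403s
    DOSBlockingPeriod   60
</IfModule>
```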

I got out the big gun and decided that, in the case of my trac ticket site, it was better to just move the whole damned url. The ticket site is a part of a larger site devoted to my music player project, and valid users should really navigate through the top site anyway. It took me a while to decide this was best. It’s certainly not optimal for supporting a site that might be heavily bookmarked by end users. It’s kind of out-of-the-box thinking. But in my case it was worth the cost.

For trac, it was just a matter of a couple of trac.ini and apache config changes, and then changing the referring websites.

trac.ini:

[trac]
base_url = https://mysite.com/mynewurl

apache conf:

WSGIScriptAlias /mynewurl /var/lib/trac/apache/trac.wsgi
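To make the abandoned location as cheap as possible to serve, one option (a sketch – /myoldurl is a placeholder for whatever your old path was) is to have apache answer it with a flat 410 via mod_alias:

```apache
# tell clients (and any well-behaved bots) the old trac url is gone for good
Redirect gone /myoldurl
```

That way every hammering request gets a tiny static response instead of waking up trac.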

I wrote about setting up trac for anonymous ticket creation here.

I upgraded to trac 1.0.1 and lost TracSpamFilter functionality. This post is about getting it working again. It also includes some warnings if you are using https.

Trac seems to be in motion and sometimes figuring out the latest “right” way to do something is confusing. To start, look at the default plugins in the base installation. Log in as admin and select the plugins page. Then click the arrow next to the [Trac ###] item to see available plugins.

TracSpamFilter is not there; it appears we’ll have to specifically (re-)install it as a plugin. But that’s easy:

easy_install TracSpamFilter
/etc/init.d/apache2 restart  

And it looks like all my configuration was left intact, whew. Weird that the plugin went away.

Recheck all your configuration settings under the Admin page’s Spam Filtering link. You should also only enable the parts you need, by hitting the sideways dropdown arrow next to TracSpamFilter under the Admin plugins page. Read more here.

Time to test! Testing requires that you set your spam filter scoring so that you know it will trigger a captcha, log out, create a new ticket, and hit submit. Only then will trac check your request to see if it is spam. When I tested things out, the need for a captcha was properly identified during ticket creation, and the ticket was reloaded under the /captcha url.
But the captcha failed to be displayed.
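To force the captcha to trigger during a test, I believe the relevant knob is the karma threshold in trac.ini (option name per the TracSpamFilter documentation – treat this as a sketch and verify against your version):

```ini
[spam-filter]
# raise the minimum karma required so nearly every anonymous
# submission scores as suspect and gets sent to the captcha step
min_karma = 50
```

Remember to dial it back down after testing, or real users will be doing math puzzles all day.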

So I turned on debugging by adding this to trac.ini (WARNING: trac will resave the trac.ini file stripping out any comments, boo):

[logging]
# MDM This places the log in /var/lib/trac/sites/mysite/log
log_file = trac.log
log_level = INFO
log_type = file

And here’s what I saw…

2013-04-25 14:40:39,070 Trac[env] INFO: Reloading environment due to configuration change
2013-04-25 14:40:42,478 Trac[env] INFO: -------------------------------- environment startup [Trac 1.0.1] --------------------------------
2013-04-25 14:40:42,514 Trac[env] WARNING: base_url option not set in configuration, generated links may be incorrect
2013-04-25 14:40:42,765 Trac[api] WARNING: Unable to find repository '(default)' for synchronization

Does any of that matter? I headed to irc, posted questions… and got nothing over two days of waiting. Let’s go back and analyze what we’re seeing. Viewing the source revealed that I was getting the captcha code in the html! Good deal.

Hitting F12 in google chrome gives you a nice set of tools ripped from the heart of Firefox’s Firebug. After poking around some, I realized that both the recaptcha and keycaptcha code snippets were using http, not https. Chrome is a bitch about this type of stuff. And after firing up Firefox and finding that everything worked fine, I confirmed that Chrome’s blocking was to blame. You suck, chrome! You can’t introduce new rules, even if they’re good, that break 50% of the web. Well, I can’t really argue; https should be https. But in this case, I can’t even figure out where the plugin code is to fix it. No time this time.

So for now, I’m falling back to the “Expression Captcha”, which sucks but is embedded in trac and therefore works everywhere.
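For the curious, the fallback amounts to one trac.ini change (option name per the TracSpamFilter documentation – verify it against your installed version):

```ini
[spam-filter]
# use trac's built-in math-expression captcha instead of the
# external recaptcha/keycaptcha javascript snippets
captcha = ExpressionCaptcha
```

Since the expression captcha is rendered server-side by trac itself, there’s no third-party http resource for Chrome to block.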

I’ll have to revisit this again, when I have time, to get trac to use https in its captcha code snippets. Or, with no irc response, is it time to move from trac to something else?

Trac is nice and simple and clean and “suits me well” for now. I was able to quickly add custom fields and use them on the front page of my tracker. And it supports the ability for anyone to anonymously add a new ticket, which is the Holy Grail of getting feedback. But getting feedback includes getting spammed. I was getting a few spam tickets every minute, making things unusable. How to fix?

TracSpam to the rescue. It’s built into trac, but it required some tweaking… (continued…)