I have a couple of urls that have become quite attractive to spammers as of late, for some stupid reason. Stupid in that most situations involving spam are stupid, as the inefficiencies would make anyone of any intelligence balk at the very concept. But still, many desperate and immoral thugs persist.

My urls that appear to make spambots salivate with misguided hope are those that allow anonymous users to add content that will be later displayed to others. Specifically, there are two:

  • anonymous trac ticket creation
  • wordpress comments

Both trac and WordPress have fantastic tools that fight spam (Akismet, for one, is priceless). These tools prevent tons of spam on my sites every day. But thanks to mindless bots, the spam attempts, while pretty much always unsuccessful at creating tickets due to Akismet and captcha, morph into a DOS problem of their own. I was getting 5 apache requests every second, 24×7.

I started using mod_evasive to stop the flood, which certainly helped. But it did not discourage the spambots to the point where they gave up. I was dealing with some seriously inept and overzealous spambotting – I don’t even have heavily trafficked sites. What recourse is left if you just. keep. getting. mindlessly. hammered!?

I got out the big gun and decided that, in the case of my trac ticket site, it was better to just move the whole damned url. The ticket site is part of a larger site devoted to my music player project, and valid users should really navigate through the top site anyway. It took me a while to decide this was best. It’s certainly not optimal for a site that end users might have heavily bookmarked. It’s kind of out-of-the-box thinking. But in my case it was worth the cost.

For trac, it was just a matter of a couple of trac.ini and apache config changes, plus updating the referring websites.

trac.ini:

[trac]
base_url = https://mysite.com/mynewurl

apache conf:

WSGIScriptAlias /mynewurl /var/lib/trac/apache/trac.wsgi
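
If you also want the old url to actively die on bots rather than quietly fall through to the rest of the site, mod_alias can mark it gone. Just a sketch; /myoldurl below is a placeholder for whatever path you retired, not my actual url:

# hypothetical retired path: answer with 410 Gone so bots (ideally) stop retrying
Redirect gone /myoldurl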

I get spammers slamming my little web sites all the time. A couple of my pages seem to have been added to some bot list, and there is no end of attempts to add comments to my WordPress blogs and tickets to my trac issue tracker. There are lots of ways to fight this, and both WordPress and trac have pretty decent built-in tools. One apache-based tool I recently added is mod_evasive, and it is so simple and elegant there’s really no reason not to use it. It’s small and appears to use an in-memory hash table for live state tracking, so it shouldn’t slow things down much. All it does is watch for rapid access from the same IP address and temporarily block that address. So as not to interfere with access by real people, it only takes action against obvious abuse. Here’s my configuration, with notes:

# MDM I thought about changing these to block 5 ticket requests in 60 seconds,
# BUT THAT'S TOO MUCH for any other part of my websites.
# This really isn't going to solve the trac problem... but I'll leave it for DOS attacks.
# I did make it a little tighter:
#   lowered page count from 5 to 3 (3 page requests within 1 second)
#   lowered site count from 100 to 50, upped interval from 2 to 10 (50 site requests in 10 seconds)
# MDM The only thing getting blocked is me, prolly due to HangTheDJ pings, doh.
# Forget this, set it back high again.  It's ONLY going to stop true DDOS attacks.
# We'll set up mod_qos or something else for trac ticket spammers.
# MDM Actually it looks like the spammers are dropping off...?  Didn't see any logging though... huh.
# I'll tighten up a LITTLE: site count from 100 to 30; page interval from 1 to 2; blocking period from 10 to 20.
# MDM OK, it seems to be working great now!
# But why limit the block to 20 seconds?  I'm upping it to 5 minutes.
DOSHashTableSize 3097
DOSPageCount 5
DOSSiteCount 30
DOSPageInterval 2
DOSSiteInterval 2
DOSBlockingPeriod 300
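
One more tweak worth noting, since the tighter settings were mostly catching me: mod_evasive can whitelist your own addresses, and optionally email you and keep a log when it blocks someone. The directives are standard mod_evasive options, but the addresses and path below are placeholders, not my real values:

# keep my own boxes (e.g. whatever is doing the HangTheDJ pings) from tripping the limits
DOSWhitelist 127.0.0.1
DOSWhitelist 192.168.1.*
# optional: email notification and a log directory for blocked addresses
DOSEmailNotify admin@mysite.com
DOSLogDir /var/log/mod_evasive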

I’m setting up a private wiki area in my trac wiki. There’s lots of old info out there regarding trac permissions, and it’s easy to get lost. The most current, clean way seems to be the AuthzPolicy plugin/module, which is available in the base install. There are some steps to go through to get set up, but it’s pretty straightforward.

Go to the admin plugins page, click the arrow next to Trac to open it, find AuthzPolicy, and click its arrow to open it for instructions. More instructions are here, but they may not be as up-to-date. When you’re ready, click the enable checkbox and save changes, then configure it. Here’s what I did:

easy_install configobj
emacs conf/trac.ini
  [trac]
  permission_policies = AuthzPolicy, DefaultPermissionPolicy, LegacyAttachmentPolicy
  [authz_policy]
  authz_file = conf/authzpolicy.conf
emacs conf/authzpolicy.conf
  [groups]
  devs = joe, jane
  [wiki:DevPage@*]
  @devs = WIKI_VIEW, WIKI_CREATE, WIKI_MODIFY
  * =
/etc/init.d/apache2 restart
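
As an aside, the authzpolicy.conf resource sections take glob patterns, so hiding a whole subtree of dev pages instead of a single page should look something like this (DevPage/* is just an illustrative pattern, not something from my actual config):

  [wiki:DevPage/*@*]
  @devs = WIKI_VIEW, WIKI_CREATE, WIKI_MODIFY
  * =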

The base Trac is pretty useless for managing users, so let’s get the plugin that fixes that (yes trac is still pretty much a hack 🙂 )…

easy_install TracAccountManager

Now let’s configure it. Same as before, pop it open in the trac admin page. More instructions are here.
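
For reference, once the plugin is enabled the trac.ini ends up with something roughly like the snippet below. This is a sketch based on the AccountManager docs rather than a dump of my config; the htpasswd path is a placeholder, and the option names can vary between plugin versions:

  [components]
  acct_mgr.* = enabled
  [account-manager]
  password_store = HtPasswdStore
  htpasswd_file = /var/lib/trac/sites/mysite/conf/users.htpasswd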

I wrote about setting up trac for anonymous ticket creation here.

I upgraded to trac 1.0.1 and lost TracSpamFilter functionality. This post is about getting it working again. It also includes some warnings if you are using https.

Trac seems to be perpetually in motion, and figuring out the latest “right” way to do something can be confusing. To start, look at the default plugins in the base installation. Log in as admin and select the plugins page. Then click the arrow next to the [Trac ###] item to see the available plugins.

TracSpamFilter is not there; it appears we’ll have to specifically (re-)install it as a plugin. But that’s easy:

easy_install TracSpamFilter
/etc/init.d/apache2 restart  

And it looks like all my configuration was left intact, whew. Weird that the plugin went away.

Recheck all your configuration settings under the Admin page’s Spam Filtering link. You should also only enable the parts you need, by hitting the sideways dropdown arrow next to TracSpamFilter under the Admin plugins page. Read more here.
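
If you’d rather see it in trac.ini than click through the admin UI, that page is just toggling entries under [components]. Something like the following should be equivalent (a sketch; individual filter and captcha pieces can be disabled the same way, but check the exact component names your version exposes):

[components]
tracspamfilter.* = enabled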

Time to test! Testing requires that you set your spam filter scoring so that you know it will trigger a captcha, then log out, create a new ticket, and hit submit. Only then will trac check your request to see if it is spam. When I tested things out, the need for a captcha was properly identified during ticket creation, and the ticket was reloaded under the /captcha url. But the captcha failed to be displayed.

So I turned on debugging by adding this to trac.ini (WARNING: trac will resave the trac.ini file stripping out any comments, boo):

[logging]
# MDM This places the log in /var/lib/trac/sites/mysite/log
log_file = trac.log
log_level = INFO
log_type = file
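
Then watch the log while re-submitting the test ticket (the path comes from the comment above):

tail -f /var/lib/trac/sites/mysite/log/trac.log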

And here’s what I saw…

2013-04-25 14:40:39,070 Trac[env] INFO: Reloading environment due to configuration change
2013-04-25 14:40:42,478 Trac[env] INFO: -------------------------------- environment startup [Trac 1.0.1] --------------------------------
2013-04-25 14:40:42,514 Trac[env] WARNING: base_url option not set in configuration, generated links may be incorrect
2013-04-25 14:40:42,765 Trac[api] WARNING: Unable to find repository '(default)' for synchronization

Does any of that matter? I headed to irc, posted questions… and got nothing over two days of waiting. So let’s go back and analyze what we’re seeing. Viewing the source revealed that I was getting the captcha code in the html! Good deal. Hitting F12 in Google Chrome gives you a nice set of tools ripped from the heart of Firefox’s Firebug. After poking around some, I realized that both the recaptcha and keycaptcha code snippets were using http, not https. Chrome is a bitch about this type of stuff, and after firing up Firefox and finding that everything worked fine there, I confirmed that Chrome’s mixed-content blocking was to blame. You suck, chrome! You can’t introduce new rules, even if they’re good, that break 50% of the web. Well, I can’t really argue, https should be https. But in this case, I can’t even figure out where the plugin code is to fix it. No time this time. So for now, I’m falling back to the “Expression Captcha”, which sucks but is embedded in trac and therefore works everywhere.
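
One apache-level workaround I haven’t tried yet: mod_substitute (plus mod_filter) can rewrite the outgoing html and bump the captcha script urls to https without touching the plugin code. This is only a sketch; the /trac path and the recaptcha url pattern are assumptions, so check what the plugin actually emits before using it:

# placeholder path: wherever your trac is mounted
<Location /trac>
    AddOutputFilterByType SUBSTITUTE text/html
    # rewrite http captcha embeds to https in the outgoing html (recaptcha url is assumed)
    Substitute "s|http://www.google.com/recaptcha|https://www.google.com/recaptcha|n"
</Location>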

I’ll have to revisit this again, when I have time, to get trac to use https in its captcha code snippets. Or, with no irc response, is it time to move from trac to something else?

Good lord things are getting out of hand in Python land.

I recently bumped up my entire server with the typical emerge --world and --depclean and revdep-rebuild. To get trac working again, I had to jump through some hoops (condensed into commands below the list)…

  • bump my entire machine, watch trac fail with “ImportError: No module named trac.web.main”
  • run python-updater (which didn’t seem to do enough?)
  • upgrade to unstable trac 1.0.1
  • re-emerge world to upgrade all six versions of python (either new versions were posted overnight, or installations were apparently damaged by --depclean?)
  • set active version of python back from 3.2 to 2.7 with eselect
  • re-emerge mod_wsgi
  • restart apache
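
In command form, the recovery was roughly this (a sketch from memory; adjust package atoms and the python version to your setup):

python-updater                       # rebuild python packages against the new interpreters
eselect python set python2.7        # trac and mod_wsgi still want 2.7, not 3.2
emerge --oneshot www-apache/mod_wsgi
/etc/init.d/apache2 restart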

Of course I did a LOT MORE than that to figure out that that’s what I had to do! 🙂

Ahh the price of fame.