Wordpress Spammer Bots - A Workaround that Works

Overestimate the quality of Wordpress code at your own peril.

We run a small sitewide multi-domain Wordpress installation for blogs and simple sites.  Wordpress (and before that Wordpress MU) is easy to install, manage, and hack, and it looks nice out of the box.

The only problem is that it is just not that well engineered, and I have done more than my fair share of double takes at how primitive the system is.

Here is a specific example and my method for working around a particular limitation (without patching the core). 

First, the problem.  Even with CAPTCHA, WP-Hashcash, and Apache no-referrer denials in place, spambots can still post to wp-comments-post.php and get their Viagra crap into the comment moderation queue.  How is this possible, you ask?  How would they get around Apache directives that require the request to carry a referrer from the same site?  Are they actually injecting a fake referrer in their bot?

Yes. Yes they are.  It is trivial to do with curl.  Here is an example:

curl -e "http://yoursite.com/2010/07/02/your-post-permalink/" -d "param1=value1&param2=value2" http://yoursite.com/wp-comments-post.php

-e is the post permalink (the string used as the referrer to gain access to wp-comments-post.php)

-d is the POST body: a set of form variables such as "author=", "comment=", and "comment_post_ID=", which is where you inject the actual comment.

The last argument is the destination, which would be your default Wordpress comment post handler.

All over the Internet I have read posts by admins who are under the false impression that the following Apache directives prevent direct access to wp-comments-post.php:

RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} ^/?wp-comments-post\.php.*
RewriteCond %{HTTP_REFERER} !.*yoursite.com.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F]

I can assure you they do not, and reliance on this technique shows the distinct lack of experience of some of our newer Apache admins.  Every single HTTP request is a direct request to the webserver.  Referrers are generated on the client side and can be ANYTHING, anything at all.  Remember, a good sysadmin never trusts his users or their input.

Every HTTP request is unvetted user input.
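
To make the point concrete, here is a minimal sketch that imitates the rewrite conditions above in bash.  The function name would_block is my own invention for illustration, and this is a hand-written approximation of the logic, not what Apache actually executes.  It shows that the only thing standing between a bot and a pass is a string the bot chooses itself.

```shell
#!/usr/bin/env bash
# Illustration only: a hand-written imitation of the rewrite conditions,
# not something Apache itself runs.

# Returns 0 if the request would be denied (403), 1 if it would be let through.
would_block() {
    local method=$1 uri=$2 referer=$3 user_agent=$4
    local uri_re='^/?wp-comments-post\.php'
    local site_re='yoursite.com'   # unescaped dot, as in the directives
    local referer_lc

    [[ $method == POST ]] || return 1
    [[ $uri =~ $uri_re ]] || return 1

    # [NC]: compare the referrer case-insensitively
    referer_lc=$(printf '%s' "$referer" | tr '[:upper:]' '[:lower:]')

    # Deny if the referrer does not mention the site OR the user agent is empty
    if [[ ! $referer_lc =~ $site_re ]] || [[ -z $user_agent ]]; then
        return 0
    fi
    return 1
}

# A bot that sends no referrer is denied...
would_block POST /wp-comments-post.php "" "Mozilla/5.0" \
    && echo "no referrer: denied"

# ...but the same bot with a forged referrer sails through.
would_block POST /wp-comments-post.php \
    "http://yoursite.com/2010/07/02/your-post-permalink/" "Mozilla/5.0" \
    || echo "forged referrer: allowed"
```

Both arguments that decide the outcome, the referrer and the user agent, come straight from the client, which is exactly why the check proves nothing.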

How could the developers have avoided putting any security into wp-comments-post.php?  How could they allow direct access to that file?  Why is there not some sort of token or hash that is generated from the comment form and passed via the POST?  It seems reasonable to me that WP-Hashcash could pass its approval (based on some hash) upon POST and wp-comments-post.php could accept that as part of the environment.
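
For the sake of argument, here is a rough sketch of the kind of token I have in mind, written in bash with the openssl command line tool.  None of this exists in Wordpress or WP-Hashcash: SECRET, make_token, and verify_token are hypothetical names, and the token layout (issue time plus an HMAC over the post ID and issue time) is just one plausible choice.  The form would embed the token as a hidden field, and the comment handler would refuse any POST whose token does not verify.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a form token; not part of Wordpress.

SECRET='replace-with-a-long-random-per-site-secret'   # assumption: known only to the server

# make_token POST_ID ISSUED_AT  ->  prints "ISSUED_AT:HMAC"
make_token() {
    local post_id=$1 issued=$2 mac
    # HMAC-SHA256 over "post_id:issued"; the hex digest is the last field
    # of the openssl dgst output
    mac=$(printf '%s:%s' "$post_id" "$issued" \
          | openssl dgst -sha256 -hmac "$SECRET" \
          | awk '{print $NF}')
    printf '%s:%s\n' "$issued" "$mac"
}

# verify_token POST_ID TOKEN NOW [MAX_AGE_SECONDS]
# Succeeds only if the token was issued for this post and has not expired.
verify_token() {
    local post_id=$1 token=$2 now=$3 max_age=${4:-3600}
    local issued=${token%%:*}
    local num_re='^[0-9]+$'

    [[ $issued =~ $num_re ]] || return 1
    (( now >= issued && now - issued <= max_age )) || return 1

    # Recompute and compare. A real implementation should use a
    # constant-time comparison; plain string equality is a simplification.
    [[ $(make_token "$post_id" "$issued") == "$token" ]]
}

# Example: issue a token for post 42, then check it five minutes later.
token=$(make_token 42 1278028800)
verify_token 42 "$token" 1278029100 && echo "token accepted"
```

A token like this can still be harvested by a bot that fetches the form first, so it raises the cost of spamming rather than eliminating it, but it would at least stop blind POSTs to the handler.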

Yet there is no hook and there are no template POST variables that plugins can extend without hacking wp-comments-post.php.  If anonymous comments are permitted on a post, wp-comments-post.php will happily process the remote request and pass it to your moderation queue.  You can inject any browser string, referrer, or comment you wish, all day, every day.

The solution, it turns out, is to take your fresh Wordpress installation and rename the following files to something unique to your installation (you can make it random if you wish):

  • wp-trackback.php
  • wp-comments-post.php
  • wp-signup.php

and then run

grep -rlF wp-comments-post.php wordpress/ | while IFS= read -r file
do
   # escape the dots so sed matches them literally
   sed -i 's/wp-comments-post\.php/hidden-comments-post.php/g' "$file"
done

and repeat that for each of the other files, wp-trackback.php and wp-signup.php (obviously modifying the targets appropriately).  You can automate the process with a little shell script and run it every time you update your Wordpress installation (which is how we do it).
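
A script along these lines would do it.  This is a sketch rather than our exact script: the function name rename_entry_points and its two parameters (the Wordpress directory and the replacement prefix) are made up for illustration, and it assumes GNU grep and GNU sed (for sed -i without a backup suffix).

```shell
#!/usr/bin/env bash
# Sketch: rename the three entry points and rewrite every reference to them.
# Assumes GNU sed and grep. The prefix replaces the leading "wp", so with
# prefix "hidden", wp-comments-post.php becomes hidden-comments-post.php.

rename_entry_points() {
    local wp_dir=$1 prefix=$2
    local old new pattern file

    for old in wp-comments-post.php wp-trackback.php wp-signup.php; do
        new="${prefix}${old#wp}"

        # Rename the file itself if it is present (e.g. after an update)
        if [ -f "$wp_dir/$old" ]; then
            mv "$wp_dir/$old" "$wp_dir/$new"
        fi

        # Escape the dots so sed treats them literally
        pattern=$(printf '%s' "$old" | sed 's/\./\\./g')

        # Rewrite references in every file that mentions the old name
        grep -rlF -- "$old" "$wp_dir" | while IFS= read -r file; do
            sed -i "s/${pattern}/${new}/g" "$file"
        done
    done
}

# Usage (not run here):
#   rename_entry_points wordpress/ hidden
```

Pick a prefix that is unique to your installation; "hidden" is only the example used above.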

The result: spammer bots no longer know the name of the comment post script in your Wordpress installation.  They would have to pull the name from your particular instance (by looking at your comment form) and modify their bots to target you directly after that.  That is too much work, and I suspect they would not bother.  Better to go after the lower-hanging fruit.

But still, I am left scratching my head.  How could direct access still be granted to wp-comments-post.php in 2010?  My solution is only a workaround based on obscurity and does not resolve the issue.  A true fix would involve wp-comments-post.php working in concert with the session of the user, JavaScript, and/or other parts of Wordpress.