The details of a Spam Injection

Exactly one month ago my blog was hit by a spam injection attack whose consequences are still with me today. Even though I am a web developer with years of experience and a sound approach to security, I was brought to my knees for days without even knowing it.

In this post I will explain what happened, what to look for and how to prevent it from happening to you.

Tell-tale signs

The very first sign that something was going on was that my feed, which I self-subscribe to, started showing a list of spammy keywords at the end of each post. They didn’t have any links, just the words. I checked my blog and everything seemed okay, so I thought maybe this was some sort of problem with Feedburner (which lately has had a lot of issues, so why not one more, I thought).

[Image: health ads]

A couple of days later, I started noticing some strange ads popping up on my blog. They had to do with health and pills, which seemed odd, as most of my content is centered around social media and technology. I was starting to get worried but had no clue what was going on: when I looked at the HTML source and my WordPress installation, nothing looked out of place. Again, I thought maybe some health company had purchased space on my blog (and unfortunately you’ll see the same ads in this post as well, as Google thinks this post is about that).

Because I was protected with Akismet for my comments, and had FTP turned off for my blog, I was 100% sure I wasn’t infected with anything strange.

The truth was that I was infected.

The Final Discovery

After some research, I found out about some clever software injections that are pushed via either templates or plugins downloaded from non-WordPress sites. I remembered I had downloaded a couple of plugins from external sites and went into panic mode.

The first recommended check was doing a site search with spammy keywords. I did, and was in for a rude surprise: they were all there.

[Image: site search results]

The funny thing was that if I clicked on the link to my site, I didn’t see them. The only way to see them was to go to the "Cached" version and then see the results in text mode.

I also went to my Google Webmaster Tools account (if you don't have one, you should set it up immediately) and saw all the spammy keywords in the content analysis:

[Image: spam keywords]

I was totally infected. This had been going on for days, if not weeks.

How the injection works

The spam keywords and links are hidden in chunks of code that are not human readable. They are usually encoded with Base64 (in PHP, the base64_encode function), which turns the HTML into a long string of letters, digits and a few symbols that can later be decoded with base64_decode.

But when are they decoded? This is the smart part: when you visit your own site, the spammy code reads your browser's user agent and renders nothing. But if the Google Bot or another bot is the one requesting the page, the code decodes the payload and prints out the spammy content.

Other times, it decodes it randomly, so only some users can see them.
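The real injected code is PHP and I am not reproducing it here, but the logic is easy to sketch. The following Python illustration is only a model of the behavior described above: the payload text, the list of bot markers and the function names are all made up for the example.

```python
import base64

# Hypothetical spam payload; real injections hide HTML like this inside PHP files.
SPAM_HTML = "<div>cheap pills online</div>"
# What you would actually find in the infected file: an unreadable string.
ENCODED_PAYLOAD = base64.b64encode(SPAM_HTML.encode("utf-8")).decode("ascii")

# Substrings that mark a request as coming from a crawler (illustrative list).
BOT_MARKERS = ("googlebot", "bingbot", "slurp")


def is_bot(user_agent: str) -> bool:
    """Return True when the User-Agent header looks like a search-engine crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def render_footer(user_agent: str) -> str:
    """Return the normal footer, plus the decoded payload only for crawlers."""
    footer = "<p>Thanks for reading.</p>"
    if is_bot(user_agent):
        footer += base64.b64decode(ENCODED_PAYLOAD).decode("utf-8")
    return footer


if __name__ == "__main__":
    print(ENCODED_PAYLOAD)
    print(render_footer("Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"))
    print(render_footer("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))
```

A regular browser gets the clean footer, while anything that identifies itself as a crawler gets the footer with the spam appended. That is why the site looks fine to its owner.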

One way to check how this is triggered is by crawling your site using cURL, a tool that’s available on most Linux installations. When I ran the following command, I could see the spam links in my footer section:

curl --no-sessionid --user-agent "Googlebot/2.1 (+http://www.googlebot.com/bot.html)" http://jungleg.com
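The same check can be scripted so that it compares what a browser sees with what a bot sees. This is a minimal sketch using only Python's standard library. The browser User-Agent string, the function names and the keyword list you pass in are my own assumptions, and the bot User-Agent is the one from the cURL command above.

```python
import urllib.request

# Illustrative User-Agent strings; the bot one is copied from the cURL command.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"
BOT_UA = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"


def fetch(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Download a page while announcing the given User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def cloaked_keywords(browser_html: str, bot_html: str, keywords: list[str]) -> list[str]:
    """Return the keywords that appear only in the page served to the bot."""
    browser_text = browser_html.lower()
    bot_text = bot_html.lower()
    return [k for k in keywords if k.lower() in bot_text and k.lower() not in browser_text]


def check_site(url: str, keywords: list[str]) -> list[str]:
    """Fetch the URL as a browser and as a bot, then compare the two responses."""
    return cloaked_keywords(fetch(url, BROWSER_UA), fetch(url, BOT_UA), keywords)
```

Calling `check_site("http://jungleg.com", ["pills", "casino"])` would return whichever of those words show up only in the bot's copy of the page. An empty list does not prove the site is clean, since some injections decode their payload at random.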

Steps to solve it

You can try to pinpoint which of the functions is triggering the spam links. In my case, I just backed up the blog database and installed a fresh WordPress folder from scratch, re-adding the plugins and templates from WordPress itself.
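If you do want to hunt for the offending code, a rough first pass is to search your PHP files for the functions that obfuscated payloads tend to rely on. The sketch below is an assumption about what is worth flagging, not a complete detector: legitimate plugins also call these functions, so every hit needs a manual look.

```python
import re
from pathlib import Path

# PHP functions often seen in obfuscated code. Illustrative, not exhaustive,
# and legitimate code uses them too.
SUSPICIOUS = re.compile(r"\b(base64_decode|eval|gzinflate|str_rot13)\s*\(", re.IGNORECASE)


def scan_php_files(root: str) -> dict[str, list[int]]:
    """Map each PHP file under root to the line numbers matching SUSPICIOUS."""
    hits: dict[str, list[int]] = {}
    for path in sorted(Path(root).rglob("*.php")):
        text = path.read_text(encoding="utf-8", errors="replace")
        lines = [
            number
            for number, line in enumerate(text.splitlines(), start=1)
            if SUSPICIOUS.search(line)
        ]
        if lines:
            hits[str(path)] = lines
    return hits


if __name__ == "__main__":
    # "wp-content" is an example path; point this at your own plugin and theme folders.
    for filename, line_numbers in scan_php_files("wp-content").items():
        print(filename, line_numbers)
```

Comparing the flagged files against a fresh copy of the same plugin or theme is usually the quickest way to tell an injection from normal code.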

It is very important to notify Google about the attack as soon as you can. For me it was too late: my PageRank had gone down from 3 to zero. I wrote a reconsideration request, and even though I haven’t heard back from them, my blog did get back to a PR 2, and most of the spammy content is gone, although I still see those pesky health ads every so often.

[Image: reconsideration request]

Monitoring: the hard part

In theory, we would all have to do this monitoring every day, hopefully before the Google bot hits our site. But who has time to run the cURL command or keep checking their own site's Google Search results? What if it's only one of your older posts?

As a developer, I thought this would be a good tool to write, and on Saturday I released version 1 of it to the blogosphere: it's called SpamCheckr.

SpamCheckr crawls your site acting as one of a handful of bots to surface spammy keywords, and shows you the text content the bot sees. Since Saturday 84 people have checked their sites, and at least 2 of them turned out to have some sort of spam content.
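To give an idea of what "the text content the bot sees" means, here is a small sketch that strips the markup from a page and counts keyword hits. This is not SpamCheckr's actual code. The class name, function names and keyword list are invented for the illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def text_content(html: str) -> str:
    """Return the page text as one space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def keyword_counts(html: str, keywords: list[str]) -> dict[str, int]:
    """Count case-insensitive occurrences of each keyword found in the page text."""
    text = text_content(html).lower()
    return {k: text.count(k.lower()) for k in keywords if k.lower() in text}
```

Note that text hidden with CSS still counts here, which is the point: a crawler reads it even though a visitor never sees it.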

As time permits, I will write a second version of the tool that will crawl your blog on a schedule and alert you via email or SMS if it finds spam, hopefully before Google indexes the content and ruins your hard-earned PageRank and ad revenue.

Taken from the original JungleG Blog Post The Aftermath of a Wordpress Spam Injection (and a Tool to Prevent it)