Monday, July 20, 2009

Silly spammers

Some of the spam that turns up in my traps stands out from the rest. Not only is it completely useless because it is spam, it is also completely useless to the spammers because they screwed up the settings in their spam software. These screw-ups give some insight into the workings of that software.

Over the past few months the abuse of .cn (China) domains has been at pandemic levels[1]. There are several different styles of spam containing these domains, differentiated by their template or structure. Some have just the raw domain name, others include a random page, and yet others have a random hostname and a page. While the text content of the email changes dramatically (randomly) between spam runs, the domain structure remains the same.

Spam that enters my traps is usually processed auto-magically by a parsing and checking program. Anything that doesn't fit a previously seen pattern is flagged so that I can have a look and see what has changed. Just the other day I got one that included the following curious-looking URI.

http://{symbol[4-6]}.{_2cndomains}/?/{SYMBOL[4-6]}.html

As you can see, this is not a valid URI but the raw template text from the spam software. From this we can see why all their URIs look the same. The {symbol[4-6]} placeholder would normally be replaced with a 4 to 6 letter word (in all the emails I have seen from this group/bot/template), even though it says "symbol". The two instances are evaluated at different times, so the two words are usually different. {_2cndomains} implies they have a list of .cn domains from which they choose one at random. Since this URI string occurs anywhere from once to 10+ times in a single email, we get to see multiple domains in every email.
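To make that concrete, here is a rough sketch in Python of how such a template might get filled in. This is guesswork based on the leaked string, not the spammers' actual code: the domain list is made up, the function name is mine, and I am assuming the upper-case {SYMBOL[...]} tag behaves the same as the lower-case one.

```python
import random
import re
import string

# Hypothetical stand-in for the spammer's domain list; the real list is unknown.
CN_DOMAINS = ["example-one.cn", "example-two.cn", "example-three.cn"]

# Matches {symbol[min-max]} (any case, an assumption) or {_2cndomains}.
TAG = re.compile(r"\{(symbol)\[(\d+)-(\d+)\]\}|\{_2cndomains\}", re.IGNORECASE)


def expand(template, rng=random):
    """Fill in each placeholder independently, as the post describes."""

    def fill(match):
        if match.group(1):
            # A "symbol" tag: a random run of letters of the given length range.
            length = rng.randint(int(match.group(2)), int(match.group(3)))
            return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        # Otherwise it is the domain tag: pick one domain at random.
        return rng.choice(CN_DOMAINS)

    return TAG.sub(fill, template)


TEMPLATE = "http://{symbol[4-6]}.{_2cndomains}/?/{SYMBOL[4-6]}.html"
print(expand(TEMPLATE))
```

Each tag is replaced on its own call, which would explain why the hostname and the page name usually differ within one URI.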

The end result of all this - if they're using a repeating pattern to send out millions (billions?) of spams every day, that pattern can be recreated and used to fight it. Even random characters are a pattern.
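As an illustration only, the template above could be turned back into a regular expression along these lines. The expression and the function name are my own assumptions built from that one leaked string (letters-only words, a single-label .cn domain), and a real filter would need more care than this.

```python
import re

# Assumed shape: 4-6 letter hostname, a .cn domain, then "/?/" and a
# 4-6 letter page name ending in .html.
URI_PATTERN = re.compile(
    r"http://[a-z]{4,6}\.([a-z0-9-]+\.cn)/\?/[a-z]{4,6}\.html",
    re.IGNORECASE,
)


def extract_domains(body):
    """Return the sorted, de-duplicated .cn domains found in matching URIs."""
    return sorted({m.group(1).lower() for m in URI_PATTERN.finditer(body)})


sample = (
    "Buy now http://abcd.example-one.cn/?/wxyz.html "
    "or here http://qwerty.example-two.cn/?/asdf.html"
)
print(extract_domains(sample))
```

Since one email can carry the URI many times over, a single message run through something like this can give up several domains at once.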

[1] http://garwarner.blogspot.com/2009/06/spam-crisis-in-china.html

Sunday, July 5, 2009

And the answer is .......

No. Yes, it's no. Not yes, but no. Yes, indeed, email addresses just sitting around in blogspot pages (well, at least this one) do not get picked up by the email harvesting spiders. The email addresses on the very poorly linked atarandomdotcom.com website were picked up within a few months, but none of the ones posted here were.

And now, a small update on hunting the best way to get spam. Warning - nonsensical semi-rant inbound.

Signing up for all the scams found by googling well-known anti-malware products, and clicking on the ads on Facebook, has given me a huge stream of "opt-out" & "CAN-SPAM compliant" spammers. They all dutifully provide unsubscribe links, but I only have 3 working traps from the hundred or so signups I did, so I'm loath to test whether any of the links work lest I lose all this wonderful spam (although there is always the possibility that unsubscribing would bring on even more). There are several distinct groups that have somehow ended up with the addresses, most likely through affiliate agreements that I was never given the opportunity to opt into, let alone opt out of.

Each group cycles their postal and main domain every couple days to several months, providing a seemingly endless supply of places I am required to opt-out. If I did start opting out, I'd have a lot of clicking to do. There are ethical qualms with getting spam this way - Is it really spam? Should I be trying to unsubscribe since I do not want this stuff (yet, I really do)?

Quite frankly, the crap I'm getting from those signups is completely and utterly useless. No one in their right mind would ever use any of the "products". Most seem to be either companies that made a poor decision on who to pay to market their product, some poor schmuck's affiliate program being abused, or further scams. If they were arriving on my normal email account, I'd be pissed off. Someone got my email address in good faith for something I was interested in, and now it's being flooded with things that I am not interested in. To that end, I declare it is spam. Not a legal definition, but that's where the line has to be drawn.

If I gather all the information I can from these email addresses and use that information to block emails going to real users, am I doing something wrong? Some user has been silly enough to enter their email address into part 1 of a 2-part web form (part 2 is where you find out that this is going to cost money and that you've just been signed up for the affiliate program with your previous click, because it's in the terms of service which you can read here) and they are now going to get flooded with all the same crap as my seeded addresses. Should that user have to try to unsubscribe from all the groups that are now going to be sending them crap? In a perfect world they should be able to go back to that original website and say that they no longer wish to participate, and that would be that. Don't laugh - that's how it should be. Alas, that user will now come to me and want it all to stop, thank you.

I don't unsubscribe because I want a current list of all the domains and servers in use by those spammers (I really do seem to have gone off trying to call these people anything but spammers, regardless of their actual status) so that the user above never has to come to me. When they sign up for whatever it is, they will never get the first email, nor any of the others. If someone really wants it, I can allow that group through to their email address only and still keep everyone else safe and sane.
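The block-by-default, allow-per-address idea could look something like the sketch below. It is only an illustration: the group names, domains, addresses and function names are all invented, and it is not a description of my actual mail server setup.

```python
# Hypothetical data: which sending domains belong to which spammer group.
GROUP_DOMAINS = {
    "group-a": {"mailer-one.example", "mailer-two.example"},
    "group-b": {"offers.example"},
}

# Hypothetical exceptions: (group, recipient) pairs that asked for the mail.
ALLOWED = {("group-a", "alice@users.example")}


def group_for(sender_domain):
    """Return the group a sending domain belongs to, or None if unknown."""
    for group, domains in GROUP_DOMAINS.items():
        if sender_domain.lower() in domains:
            return group
    return None


def should_block(sender_domain, recipient):
    """Block mail from known groups unless this recipient opted in."""
    group = group_for(sender_domain)
    if group is None:
        return False  # not one of the tracked groups; leave it to other checks
    return (group, recipient.lower()) not in ALLOWED


print(should_block("mailer-one.example", "bob@users.example"))
```

Keying the exception on the group rather than on a single domain means it would survive the group cycling to a new domain, as long as the trap addresses keep the domain list current.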

Is it spam? By definition probably not. Is it wanted? Hell no. Should I unsubscribe? No. I'm just collecting it all without rejecting it and collating information from it. Am I doing something wrong by blocking users from getting this crap in the first place? Hell no! Accessing my mail servers is not a right given to everyone on the planet. It is a privilege extended to those people trusted not to abuse it.