How do spambots work?

by rlb.usa   Last Updated July 11, 2019 08:04 AM

I have a forum that's getting hit a lot by forum spambots, and of course the best way to defeat something is to know thy enemy. I'll worry about defeating those spambots later, but right now I'd like to know more about them. Reading around, I was surprised by the lack of thorough information on the subject (or perhaps by my own inability to find the right search terms for better Google results).

I'm interested in learning all about spambots. I've asked on other forums and gotten brush-off answers like "Spambots are always users registering on your site."

  • How do forum spambots work?
  • How do they find the 'new user registration' page? (I'm especially surprised because some forums don't have a dedicated URL for this, e.g. www.forum.com/register.html, but instead use query strings or even other methods invisible to the URL bar.)
  • How do they know what to enter into each 'new user registration' field?
  • How do they determine what's a page they can spam / enter data into and what is not?
  • Do they even 'view' this page at all?
  • If not, then I'd assume they're communicating with the server directly. How is this possible? How do they do it?
  • Can forum spambots break CAPTCHAs? Can they solve logic questions (how?)? Math questions?
  • Do they reverse-engineer client-side anti-bot validation scripts? Server-side scripts?
  • What techniques are still valid to prevent them?
  • Where do spambots come from? Is someone sitting behind the computer snickering as they watch their bot destroy site after site? Or are they snickering as they simply 'release' it onto the internet somehow? Are spambots 'run' by an infected computer somewhere? Do they replicate themselves?
  • etc
Tags : spam botattack


Answers 4


How do forum spambots work?

Talented (if evil) programmers write them - there are probably as many different types of spambots as there are people writing them but, unfortunately, it only takes a few spambot authors sharing and selling their work to ruin life for administrators...

One popular forum spamming application is called "xrumer" - here's a video of xrumer in action (note that there is a fair amount of human intervention required to get it set up).

While I realize that this doesn't answer all of your questions, I think it bears mentioning that anything a bot can't do well (like solve complex non-static logic questions) can be done by a low-paid worker overseas. Spamming is a business much like any other and there is no shortage of cheap labor being plied toward putting spam messages out there.

danlefree
October 06, 2010 20:54 PM

How do they find the 'new user registration' page? (I'm especially surprised because some forums don't have a dedicated URL for this, e.g. www.forum.com/register.html, but instead use query strings or even other methods invisible to the URL bar.)

They find new sites by:

  • Crawling and looking for signatures of known software. Usually this is a snippet of text like a copyright or a meta tag but it could be any consistent identifier. This usually applies to blog and forum software.
  • Manual inclusion. Human beings, whose labor is cheap in many parts of the world, look for known software or forms that are easily exploitable and add them to a database. This usually applies to custom registration and contact forms.
  • Buying lists. Just as email addresses are sold by spammers, lists of known vulnerable or preferred target sites are sold as well.
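To make the "signature" idea concrete, here is a minimal sketch of how a crawler might recognise software from a meta tag. The page source and the product name "ExampleForum" are made up for illustration, and real tools presumably check many more identifiers than this one.

```python
from html.parser import HTMLParser


class GeneratorFinder(HTMLParser):
    """Collects the content of any <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "generator":
            self.generators.append(attr_map.get("content") or "")


def find_generators(html_text):
    """Return every generator signature found in the given HTML."""
    finder = GeneratorFinder()
    finder.feed(html_text)
    return finder.generators


# Hypothetical page source, for illustration only.
SAMPLE_PAGE = """
<html><head>
<meta name="generator" content="ExampleForum 2.1">
<title>My forum</title>
</head><body>Powered by ExampleForum</body></html>
"""

print(find_generators(SAMPLE_PAGE))
```

Removing or changing identifiers like this probably makes a site harder to find by signature, although it does nothing against manual inclusion or purchased lists.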

How do they know what to enter into each 'new user registration' field?

They know what to enter into each field by using the field names as a guide. The email address field is almost always named "email" or something containing the word "email", and you don't have to be a rocket scientist to work out that such a field is probably for an email address. Names, login IDs, addresses and so on work on the same principle.
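A rough sketch of that name-matching principle follows. The keyword table and the field names are assumptions made up for this example, not taken from any real tool. It is mainly useful for seeing which of your own fields such a filler would get right or wrong.

```python
# Hypothetical keyword -> value table. The first matching keyword wins.
GUESSES = [
    ("email", "someone@example.com"),
    ("mail", "someone@example.com"),
    ("user", "some_user"),
    ("login", "some_user"),
    ("name", "Some User"),
    ("url", "http://example.com"),
    ("website", "http://example.com"),
]


def guess_value(field_name, default="text"):
    """Pick a value for a field purely from keywords in its name."""
    lowered = field_name.lower()
    for keyword, value in GUESSES:
        if keyword in lowered:
            return value
    return default


def fill_form(field_names):
    """Map every field name to a guessed value."""
    return {name: guess_value(name) for name in field_names}


# "website" stands in for a hidden trap field, "q_captcha" for a question field.
filled = fill_form(["user_email", "username", "website", "q_captcha"])
for name, value in filled.items():
    print(name, "->", value)
```

Note that a filler like this also fills a hidden field with an attractive name, and gives a generic answer to a field it does not recognise. Both behaviours are things a form can check for.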

How do they determine what's a page they can spam / enter data into and what is not?

They don't care. Automated tools can try so many forms in such a short time, at virtually no cost, that trying every form possible is a no-brainer. When human labor is involved, they can be "script kiddies" who try the obvious stuff to see whether any response indicates the form is potentially vulnerable. Basically, any form is a potential target to them, as is any page that accepts user input.

How do forum spambots work?

Do they even 'view' this page at all? If not, then I'd assume they're communicating with the server directly. How is this possible? How do they do it?

Where do spambots come from? Is someone sitting behind the computer snickering as they watch their bot destroy site after site? Or are they snickering as they simply 'release' it onto the internet somehow? Are spambots 'run' by an infected computer somewhere? Do they replicate themselves?

It's all automated. Tools like xrumer are built and sold, and they can exploit software with known vulnerabilities. Anyone can buy one, and after setting it up it's more or less fire and forget. It goes to every forum in its list and tries to spam it to the best of its ability. Through sheer brute force it succeeds often enough to be worth it for the spammers. That's why they never stop. They barely have to lift a finger for it to work.
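On the "communicating with the server directly" part: a form submission is just an HTTP request, so no browser has to render the page. The sketch below only builds such a request and does not send it. The URL and field names are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_request(url, fields):
    """Build (but do not send) the POST request a browser would make."""
    body = urlencode(fields).encode("ascii")
    request = Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


# Placeholder URL and fields, for illustration only.
request = build_form_request(
    "https://forum.example/register",
    {"username": "some_user", "email": "someone@example.com"},
)
print(request.get_method(), request.full_url)
print(request.data)
```

This is why checks that run only in the browser can be skipped entirely, and why whatever validation you rely on has to happen on the server too.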

Can forum spambots break CAPTCHAs? Can they solve logic questions (how?)? Math questions?

Yes, but not always. It depends on how well the CAPTCHA is implemented. Many CAPTCHAs, including those offered by big companies, have been beaten and are effectively useless. That's why multiple forms of protection are required to stop them. Even then, humans can usually beat any system.

What techniques are still valid to prevent them?

From a previous answer: You could do several things (and should be doing more than one) including:

1) Put a fake field in the form that only bots will see. If that field comes back filled in, you can ignore the submission (and ban the sender if desired). You can also trap bad bots that follow a hidden link.

2) Use a CAPTCHA like reCAPTCHA.

3) Use a field that requires the user to answer a question like what is 5 + 3. Any human can answer it but a bot won't know what to do since it is auto-populating fields based on field names. So that field will be either incorrect or missing in which case the submission will be rejected.

4) Generate a token, store it in the session, and also add it to the form. If the token is not submitted with the form or doesn't match, the submission is automated and can be ignored.

5) Look for repeated submissions from the same IP address. If your form normally gets few requests but suddenly gets many, it is probably being hit by a bot and you should consider temporarily blocking the IP address.

6) Use Akismet. It is great at identifying spam.
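Items 1, 4 and 5 can be combined into one server-side check. This is only a sketch under some assumptions: `session` and `form` are plain dicts standing in for whatever your framework provides, the field name "website" and the limits are made up, and the in-memory counter would not survive a restart or work across several server processes.

```python
import hmac
import secrets
import time
from collections import defaultdict, deque

HONEYPOT_FIELD = "website"  # hypothetical name; hide it from humans with CSS
MAX_POSTS = 3               # assumed limit per window
WINDOW_SECONDS = 60         # assumed window length

_recent = defaultdict(deque)  # ip -> timestamps of recent submissions


def new_token(session):
    """Store a random token in the session and return it for the form."""
    session["form_token"] = secrets.token_hex(16)
    return session["form_token"]


def check_submission(form, session, ip, now=None):
    """Return None if the submission looks fine, else a rejection reason."""
    now = time.time() if now is None else now

    # 5) Too many submissions from one IP inside the window.
    stamps = _recent[ip]
    while stamps and now - stamps[0] > WINDOW_SECONDS:
        stamps.popleft()
    stamps.append(now)
    if len(stamps) > MAX_POSTS:
        return "rate limit"

    # 1) The honeypot field must come back empty.
    if form.get(HONEYPOT_FIELD, ""):
        return "honeypot filled"

    # 4) The token must be present and match the one in the session.
    expected = session.pop("form_token", None)
    supplied = form.get("token", "")
    if expected is None or not hmac.compare_digest(
        expected.encode(), supplied.encode()
    ):
        return "bad token"

    return None
```

A page handler would call `new_token(session)` when rendering the form and `check_submission(...)` when the form comes back, rejecting the post if anything other than `None` is returned.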

John Conde
October 13, 2010 16:11 PM

I made an anti-spam plugin for WordPress. It blocks spam pretty well without a CAPTCHA or anything else.

How it works: two extra fields are added to the comment form. The first asks for the current year. The second should stay empty. When a real user visits the site, the first field is filled in automatically by JavaScript, the second is left blank, and both fields are hidden from the user. When a spammer submits the comment form, the bot either gets the first field wrong or fills in the field that should be empty, and the spam comment is rejected. The user does not have to solve a CAPTCHA or do anything else to prove they are not a bot; everything is done by JavaScript.
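The server-side half of that idea could look roughly like the sketch below. It is written in Python for illustration and is not the plugin's actual code. The field names `year_answer` and `trap` are assumptions.

```python
from datetime import date


def looks_like_spam(form, today=None):
    """Apply the two hidden-field checks to a submitted form (a dict)."""
    today = today or date.today()
    year_answer = form.get("year_answer", "").strip()
    trap = form.get("trap", "")

    # The script on the page should have filled in the current year.
    if year_answer != str(today.year):
        return True
    # The second field must stay empty; bots tend to fill every field.
    if trap:
        return True
    return False


# Fixed date so the example does not depend on when it is run.
check_day = date(2012, 10, 25)
print(looks_like_spam({"year_answer": "2012", "trap": ""}, check_day))
print(looks_like_spam({"year_answer": "", "trap": ""}, check_day))
print(looks_like_spam({"year_answer": "2012", "trap": "buy now"}, check_day))
```

One thing to keep in mind with this approach is that visitors with JavaScript disabled would fail the first check, so it may be worth deciding what should happen to them.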

You may download the plugin and use the code to solve problem with spam on your site.

webvitaly
October 25, 2012 18:36 PM

When trying to defeat them, one thing I'd keep in mind is that their purpose is usually to post links to as many websites as possible for the black-hat SEO benefit.

They care about the number of sites they gain access to, not your site specifically. Someone wanting to spam only your site could simply sign up without using a robot.

As such, I'm pretty sure that a well-written bespoke test (eg questions your forum members will know the answer to) is almost always going to be more effective against robots than any pre-written one which robots are likely to be wise to.

For example, if a robot cracked reCAPTCHA then it would have access to millions of forms to spam. If it cracked a bespoke test, then it would only have access to one website, so no automated spambot is going to bother doing that.

https://www.projecthoneypot.org may provide some good data to use (e.g. keywords and IPs to block).

Richard B
July 21, 2014 18:55 PM
