
How can I prevent spam on sites which I control?

@XinRu657

Posted in: #Spam #SpamPrevention

This is a general, community wiki question to address all non-specific spam prevention questions.

If your question was closed as a duplicate of this question and you feel that the information provided here does not provide a sufficient answer, please open a discussion on Pro Webmasters Meta.



For purposes of this question, spam will include:


Any automated post
Manually-posted content which includes links to spammers' sites
Manually-posted content which includes instructions to visit a spammer's site


4 Comments


 

@Tiffany637

Please visit MediaWiki and search for Extension:Moderation. This extension will help you prevent spam on sites which you control.
I have used this MediaWiki extension many times, and it stops bots from creating spam articles.



 

@Si4351233

We recently eliminated the spam from our Contact Us form with a very simple implementation. We added an input that was labeled "URL:" in the HTML form and made it invisible to the real users. Then, in the form processor, we check to see if it has a value and act accordingly.

The spambots take the bait all the time; they put in a URL to some spammy site. Our script sees that and throws away the comment (actually, we recycle the bits because we're trying to be a greener eco-friendly sort of company). For a while, we'd still store the offending comment in a database table for review but would refuse to email the results anywhere. That's how we know it worked.

With this simple method we went from around 30+ spam "Contact Us" messages a day to ZERO.
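As a rough illustration of the form-processor check described above, here is a minimal sketch in Python. The field name "url", the class name, and the two in-memory lists (standing in for "email the message" and "store it in a review table") are all assumptions made for the example, not the actual script we run.

```python
from dataclasses import dataclass, field


@dataclass
class ContactFormProcessor:
    """Routes contact-form submissions based on a hidden honeypot field."""

    honeypot_field: str = "url"  # hypothetical name of the hidden input
    delivered: list = field(default_factory=list)    # stands in for "email it"
    quarantined: list = field(default_factory=list)  # stands in for the review table

    def process(self, form: dict) -> bool:
        """Return True if the message was delivered, False if treated as spam."""
        # Real visitors never see the field, so any non-blank value suggests a bot.
        if form.get(self.honeypot_field, "").strip():
            self.quarantined.append(form)
            return False
        self.delivered.append(form)
        return True


processor = ContactFormProcessor()
processor.process({"name": "Ann", "message": "Hello", "url": ""})
processor.process({"name": "Bot", "message": "Buy now", "url": "http://spam.example"})
```

Keeping the rejected submissions around for a while, instead of dropping them silently, is what makes it possible to confirm that only bots are being caught.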

Good luck with whatever you choose!



 

@Lengel546

I have a forum where I temporarily enabled anonymous posts. I didn't want to use CAPTCHA, since I often have difficulty reading them myself, and that can prevent people from commenting. To help prevent spam, I used Akismet to screen incoming messages. Akismet isn't bulletproof, but it did make my life a lot easier.

You do, however, have to be aware of false positives. So I created a "Spam Attribute" on my post object and set it to the return value of Akismet. If a post was marked as spam, I would send myself an email, after which I could decide whether it was spam or not.
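The flag-and-review flow might look roughly like the sketch below. It does not show any real Akismet client API: `classify` is a stand-in for whatever call returns the filter's verdict, `toy_classifier` is a deliberately naive placeholder, and `Post`, `submit_post`, and `review_queue` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Post:
    author: str
    body: str
    is_spam: bool = False  # the "spam attribute" set from the filter's verdict


def submit_post(
    post: Post,
    classify: Callable[[Post], bool],
    notify_admin: Callable[[Post], None],
) -> Post:
    """Record the filter's verdict on the post and alert the admin if flagged."""
    post.is_spam = classify(post)
    if post.is_spam:
        # Keep the post rather than deleting it, so a false positive can be restored.
        notify_admin(post)
    return post


def toy_classifier(post: Post) -> bool:
    # Placeholder only: a real deployment would call the spam-filtering service here.
    return "http://" in post.body or "https://" in post.body


review_queue: List[Post] = []
submit_post(Post("ann", "Great thread!"), toy_classifier, review_queue.append)
submit_post(Post("bot", "Visit http://spam.example"), toy_classifier, review_queue.append)
```

Storing the verdict on the post, instead of rejecting outright, is what lets a human overturn a false positive later.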



 

@XinRu657

The following list is organized by relative ease of implementation, maintenance cost, and effectiveness at spam prevention:

Disable all user-generated content

This is a scorched-earth solution which detracts from the growth of a user community around your site. However, it is also guaranteed to save you the time and effort of dealing with spam or spam prevention.

Short of disabling user-generated content, there is no guaranteed solution to prevent all spam (or other unwanted content) from appearing. However, a solution which deters most spammers should be sufficient if you also provide your site's visitors with the option to flag content as spam.

Outsource user-generated content management

Services like Disqus allow webmasters to outsource the screening, storage, and publication of user-generated comments. (Note: use of a third-party service requires extra configuration to ensure that comments will be indexed by search engines.)

CAPTCHA

Per Wikipedia, CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart". Any automated test designed to prevent a computer from posting content is a CAPTCHA: this includes forcing users to read letters, numbers, and words out of images, do simple word puzzles or math questions, or otherwise "prove" they are people.

The disadvantages of CAPTCHA are:


Most forms of CAPTCHA are annoying for users.
They are not 100% protective: many of these tests can be completed by computers if a competent programmer decides to invest enough time and effort in the problem.


Q&A CAPTCHA

The most effective CAPTCHA for small sites is the question and answer CAPTCHA. A Q&A CAPTCHA is a question that a website asks a user to answer. The question is something that anyone visiting the site would know, but that a generic spambot would not. An example question for a site about SEO would be "What does SEO stand for?". This question would be easy for the average reader of that site to answer, but a spambot that has not been programmed for that specific site is unlikely to answer it.

NOTE: Questions like "What is 1 + 1?" do not work well, because they are widely used and the people who build spambots program them to answer such questions correctly.

However, if your site gets a lot of traffic, spammers will program their robots to answer those questions automatically, and the Q&A CAPTCHA will no longer be effective.
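A Q&A check of this kind can be sketched in a few lines of Python. The question text, the set of accepted answers, and the function names below are assumptions chosen for the example. The normalization step is there so that harmless differences in case, punctuation, or spacing do not lock out real visitors.

```python
import re

# Hypothetical question and accepted answers for a site about SEO.
QUESTION = "What does SEO stand for?"
ACCEPTED_ANSWERS = {"search engine optimization", "search engine optimisation"}


def normalize(answer: str) -> str:
    """Lowercase the answer and reduce punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", answer.lower()).strip()


def passes_captcha(answer: str) -> bool:
    """Return True if the normalized answer matches one of the accepted answers."""
    return normalize(answer) in ACCEPTED_ANSWERS
```

For example, `passes_captcha("Search-Engine  Optimization!")` is accepted, while an empty answer or a stock reply such as "2" is rejected.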

Hidden Field

If you have a form and you don't want spammers to be able to use it, a good way to stop them is by using a hidden field. These are very simple to set up: add a redundant field to your form, hide it through CSS (or JavaScript), and reject any submission that enters a value into that field. Normal users cannot see the field and will leave it empty, but many computer programs employed by spammers will fill it in, because they do not process CSS or JavaScript. To beat spambots that do load CSS or JavaScript, you may add an additional visible field to the form with a request to leave it empty. Any human visitor will leave it empty, and you can easily block the bots that add data to the field. Keep in mind that a visible "leave this empty" field may make the site look unprofessional.

Traffic and Content Analysis

Spammers have a limited number of networks and machines to post from (which they will typically use until they no longer work). Traffic analysis solutions gather data from a large number of hosts to determine whether a post contains known spam content or comes from a known spammer's host or network.

There are a variety of third-party CAPTCHA and traffic analysis solutions which are free (or cheap) to use, and most open source content management software includes integrated modules for services like Akismet and reCAPTCHA.
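To make the "known spammer's host or network" idea concrete, here is a small sketch using Python's standard `ipaddress` module. The blocklist is hypothetical and uses documentation-only address ranges. A real traffic analysis service draws on data from many sites and does far more than this local lookup.

```python
import ipaddress

# Hypothetical blocklist using documentation-only address ranges (RFC 5737).
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_known_spam_host(remote_addr: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    try:
        address = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable addresses are not treated as matches in this sketch.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

With this list, `is_known_spam_host("192.0.2.44")` matches the first network, while an address outside both ranges does not.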

Block words commonly contained in spam

If you notice that spam on your website commonly contains words or phrases that legitimate users don't (or wouldn't) use (such as "free links to your site"), then blocking users from posting those words is an effective solution. If you are worried that users with a legitimate reason to use those words will have problems posting on your site, you can set the filter so that it ignores posts from established users.
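A phrase filter with an exemption for established users could be sketched as follows. The phrase list, the 20-post threshold, and the `Author` type are assumptions for illustration only. What counts as "established" will differ from site to site.

```python
from dataclasses import dataclass

# Hypothetical phrases and threshold; tune both to what actually appears on the site.
BLOCKED_PHRASES = ("free links to your site", "cheap pills")
ESTABLISHED_POST_COUNT = 20


@dataclass
class Author:
    name: str
    approved_posts: int = 0


def is_blocked(body: str, author: Author) -> bool:
    """Reject posts containing a blocked phrase unless the author is established."""
    if author.approved_posts >= ESTABLISHED_POST_COUNT:
        return False
    text = " ".join(body.lower().split())  # case-insensitive, whitespace-tolerant
    return any(phrase in text for phrase in BLOCKED_PHRASES)
```

Plain substring matching is easy to evade with creative spelling, so this works best alongside the other measures in this list.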

rel="nofollow"

Spammers tend to focus on sites which allow them to post links which search engines will follow (thus improving the search rank of the site they are advertising).

You can make your site less attractive to spammers by adding rel="nofollow" to any links included in user-generated content. However, this approach may not work: most spam is automated, and spammers have no way of knowing whether or not a site uses rel="nofollow" links.
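One way to apply rel="nofollow" is at render time, when user text is converted to HTML. The sketch below assumes comments are stored as plain text and that bare http/https URLs should become links. The function name and the URL pattern are illustrative, and the pattern is intentionally simple (for instance, it will keep trailing punctuation as part of the URL).

```python
import html
import re

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def render_comment(text: str) -> str:
    """Escape user text and turn bare URLs into rel="nofollow" links."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Escape the plain text between links so user-supplied markup is inert.
        parts.append(html.escape(text[last:match.start()]))
        url = html.escape(match.group(0), quote=True)
        parts.append(f'<a href="{url}" rel="nofollow">{url}</a>')
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

Because every link is generated by the site, not accepted as raw HTML from the user, there is no way for a poster to omit the attribute.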

Moderation by Users

Content can be posted by anyone. However, once the content is displayed on the site, it can also be flagged as spam and removed. (This option only works in practice if visitors perceive spam content to be relatively uncommon: if spam is allowed to outnumber useful comments, most visitors will not bother flagging spam.)
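A simple form of user flagging hides a comment once enough distinct visitors have flagged it. The threshold of three and the `Comment` class below are assumptions for the sketch. Many sites send flagged items to a moderator instead of hiding them automatically.

```python
from dataclasses import dataclass, field

FLAGS_TO_HIDE = 3  # hypothetical threshold


@dataclass
class Comment:
    body: str
    flagged_by: set = field(default_factory=set)

    @property
    def hidden(self) -> bool:
        """A comment is hidden once enough distinct users have flagged it."""
        return len(self.flagged_by) >= FLAGS_TO_HIDE

    def flag(self, user_id: str) -> None:
        # A set means repeat flags from one user count only once.
        self.flagged_by.add(user_id)
```

Counting distinct users, not raw clicks, keeps a single visitor from hiding content alone.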

Gamification

Gamification is a great way to motivate users to report spam. Consider adding a "flag weight" feature to your site: the more spam users report, the more points they get. This will make hunting down spam more fun, and give people who report spam bragging rights. That will, in turn, encourage users to report spam.

Moderation by Administrators

A human must review every item of content posted before it is published on the site. While this does not prevent spam from being posted, it does prevent spam from being displayed to the site's visitors (thus reducing the value of the site to human spammers).

User Registration

User registration is an improvement over CAPTCHA because users are only forced to prove that they are human once before being allowed to comment at their convenience. This is not technically a different form of spam prevention, though it does make it easier to remove spam created by a specific user or group of users (as identified by username, e-mail, IP address, or other identifying factor).

Moderate New Users

Instead of approving every post, an administrator can review new user registrations to determine whether or not to approve a user based upon whether or not the user's registration is consistent with identified spammers or automated spambots.

Limit New User Capabilities

Human spammers will rarely remember to return to accounts which they have created if they cannot post spam freely from them. Require new users to create a set number of posts (if the community has the ability to flag spam) and/or wait a set amount of time before restrictions on posting links or multiple posts are lifted.
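The restriction check might be sketched like this. The thresholds (5 posts, 3 days), the `Member` type, and the function name are hypothetical. This version requires both conditions, which is the stricter reading of "and/or" above. Swapping `and` for `or` gives the looser one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical thresholds.
MIN_POSTS = 5
MIN_ACCOUNT_AGE = timedelta(days=3)


@dataclass
class Member:
    joined_at: datetime
    post_count: int = 0


def may_post_links(member: Member, now: datetime) -> bool:
    """Lift the link restriction once the account is old and active enough."""
    old_enough = now - member.joined_at >= MIN_ACCOUNT_AGE
    active_enough = member.post_count >= MIN_POSTS
    return old_enough and active_enough
```

Passing `now` in explicitly keeps the check easy to test without depending on the system clock.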

Charge Users For Membership

If you charge for membership, even if the fee is small, spammers will be forced to weigh the cost of membership against the value of posting spam at your site (and pass over your site in favor of easier targets).

Invite Only

If you only allow people who have been invited by other users to register, this will severely cut down on spam (humans usually don't invite robots).

The following is from Project BOTCHA, Drupal.

HoneyPot

An implementation of a honeypot trap. The gist of it is that a field is added to the form with a certain initial value, which is then modified by JavaScript. Any form submission whose resulting value does not match the expected one is treated as spam.
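The server side of such a check could look roughly like the sketch below. This is not BOTCHA's actual algorithm: the transformation (reversing the string), the function names, and the idea of keeping the challenge in the session are all assumptions made to illustrate the general pattern of "issue a value, expect a script-modified value back".

```python
import hmac
import secrets


def issue_challenge() -> str:
    """Create the initial field value; the server keeps a copy in the session."""
    return secrets.token_hex(8)


def expected_response(challenge: str) -> str:
    """The transformation the page's script is assumed to apply (here: reversal)."""
    return challenge[::-1]


def is_spam(challenge: str, submitted: str) -> bool:
    """Treat the submission as spam unless the field holds the transformed value."""
    expected = expected_response(challenge).encode("utf-8")
    return not hmac.compare_digest(expected, submitted.encode("utf-8"))
```

A bot that posts the form without running the script sends back the untouched initial value (or nothing), so the comparison fails. Visitors with JavaScript disabled would also fail, which is a trade-off to weigh.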

HoneyPot2

The same as above, except that the calculation is based on data from CSS rather than on the value of a particular field.

ObscureUrl

Similar to HoneyPot2: a value constructed by JavaScript is compared with the expected one. The difference is that the initial value is passed through a GET parameter.

Conclusion

Most webmasters will find that a mix of the solutions listed above (with the exception of disallowing user-generated content) works best for their site, and at least one solution must be implemented to prevent automated spam from choking visitors' discussions.
