SpamAssassin Advanced Topics

How to filter spam, or modify filtering rules

To take advantage of this filtering, you can do any of the following:

  1. You can filter in your mail client, based on the modified headers. Most modern e-mail clients (Thunderbird, Mozilla, Netscape, Pine, Eudora, Outlook, the Mac OS mail client, and others) support this functionality. The header to match in these clients is X-Spam-Flag: YES. For more information, see the instructions for setting up mail clients to filter e-mail.
  2. You can also use procmail to file suspected spam in another folder. This is done by adding a rule to the file .procmailrc in your home directory. (See ~/.procmailrc.sample for examples and the procmailrc man page for further information.)
  3. In addition, you can configure SpamAssassin to alter the subject line of messages that are likely to be spam, making them easy to spot and delete. (See ~/.spamassassin/user_prefs.template for examples.)
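
As a rough illustration of options 2 and 3, the fragments below show what such rules might look like. They are sketches rather than the lab's actual configuration: the folder $HOME/mail/spam and the tag text are made-up values, and the sample files named above remain the authoritative starting point. A procmail recipe in ~/.procmailrc that files flagged messages in a separate folder could read:

      # Hypothetical folder; adjust to wherever your mail client keeps folders.
      :0:
      * ^X-Spam-Flag: YES
      $HOME/mail/spam

A subject-tagging line for ~/.spamassassin/user_prefs, assuming SpamAssassin 3.x (2.x releases used the rewrite_subject and subject_tag options instead), could read:

      # Prepend a marker to the Subject of messages judged to be spam.
      rewrite_header Subject *****SPAM*****

The trailing colon in :0: asks procmail to use a lock file while writing to the folder, which is the usual precaution when delivering to a mailbox file.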

Modify the filtering rules

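The file ~/.spamassassin/user_prefs.template contains examples of how to whitelist and blacklist sites or users for better filtering results. After making modifications to this file, copy it to ~/.spamassassin/user_prefs. SpamAssassin reads user_prefs, not the template, so changes have no effect until they are copied over.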

Preventing false positives with the whitelist

Users who find that some legitimate e-mails are being tagged as spam can add the senders' addresses to a whitelist, which will keep mail from those addresses from being tagged. The entries are made in the ~/.spamassassin/user_prefs.template file, under the user's home directory, and take effect once that file is copied to ~/.spamassassin/user_prefs.

Users can whitelist by address:

      whitelist_from *

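A whitelist entry can also require, in addition to a matching sender address, that the message was relayed by a host whose reverse DNS matches a given domain. Because a forged sender address alone no longer passes, this can provide more accurate spam detection in many cases, but it may not be as reliable for senders on some networks: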

      whitelist_from_rcvd *

Please see the SpamAssassin User Guide for more details.
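
As an illustration of the two directives above, entries in user_prefs might look like the sketch below. The addresses and domains are placeholders (example.com, example.org, example.net), not real senders. whitelist_from takes an address pattern in which * matches any run of characters. whitelist_from_rcvd takes the same kind of pattern, followed by the hostname (or domain part of the hostname) that the relaying host's reverse DNS must match.

      # Trust one specific sender (placeholder address).
      whitelist_from friend@example.com
      # Trust everyone at a domain (placeholder domain).
      whitelist_from *@example.org
      # Trust this list only when relayed by a host in example.net.
      whitelist_from_rcvd *@lists.example.net example.net

Note that a bare * pattern matches every sender, so patterns should be kept as narrow as possible.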

Lowering the hits score

The easiest way to tune SpamAssassin is to adjust the score threshold at which a message qualifies as spam (SpamAssassin's stock default is 5). Look at the X-Spam-Status and X-Spam-Level headers of the e-mail messages you're receiving to see how they scored. Decide on the lowest threshold you're comfortable with, and set the required_hits entry to that number in the file ~/.spamassassin/user_prefs. For example:

        # How many hits before a mail is considered spam.
        required_hits    3

The lower you set this number, the greater the chance that a legitimate message will be flagged as spam, so use caution. If you use a very low score (e.g., 1 or 2), you might want to filter your lower-score spam to a separate mailbox, and check through it before deleting.
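
One way to set up the separate mailbox mentioned above is a pair of procmail recipes keyed on the X-Spam-Level header, which carries one asterisk per whole point of score. This is a sketch under assumptions: the folder names are hypothetical, and the five-asterisk cutoff is only an example of a threshold you might trust without review.

      # Score of 5 or more (five or more asterisks): file as definite spam.
      :0:
      * ^X-Spam-Level: \*\*\*\*\*
      $HOME/mail/spam

      # Anything else flagged at your lower required_hits: keep for review.
      :0:
      * ^X-Spam-Flag: YES
      $HOME/mail/probably-spam

Recipes are tried in order, so a high-scoring message is delivered by the first recipe and never reaches the second.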

Enabling Bayes or Bayesian filtering

As a further refinement in the arms race against spam, SpamAssassin can use a pseudo-Bayesian algorithm to identify spam, in addition to its rules-based spam detection. This is a statistical method that learns from examples of your own mail and can greatly improve spam recognition. To use this capability, each user must explicitly enable it and occasionally refresh it. Note that this method requires ongoing attention, as it builds a database of what currently looks like spam and non-spam.
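
If the lab's installation leaves this choice to per-user preferences, enabling the Bayesian classifier might be a matter of one line in ~/.spamassassin/user_prefs. use_bayes is a standard SpamAssassin option, but whether it is the switch intended here is an assumption; the template file or the Lab Staff can confirm.

      # Turn on the Bayesian classifier for this account.
      use_bayes 1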

You teach the Bayesian filter what is and what isn't spam by running the sa-learn command on examples of each, saved in separate mailboxes. This must be repeated periodically to keep the filter current. For details on this procedure, see the sa-learn man page.
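
A training session might look like the following sketch. The mailbox paths are hypothetical; --mbox tells sa-learn that each file is an mbox-format folder containing many messages.

      # Learn from a folder of confirmed spam.
      sa-learn --spam --mbox ~/mail/spam
      # Learn from a folder of legitimate mail.
      sa-learn --ham --mbox ~/mail/saved-messages
      # Show how many spam and non-spam messages have been learned so far.
      sa-learn --dump magic

With default settings, SpamAssassin does not apply Bayesian scores until it has learned at least 200 spam and 200 non-spam messages, so the first few sessions may show no visible effect.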

If you have any questions, please contact the Lab Staff.