In order to take advantage of this filtering, you can do one of several things.
The ~/.spamassassin/user_prefs.template file contains examples of how to whitelist and blacklist sites or users for better filtering results. After making modifications to this file, copy it to ~/.spamassassin/user_prefs.
Users who find that some legitimate e-mails are being tagged as spam can add the senders' addresses to a whitelist, which will prevent mail from those addresses from being tagged as spam. The entries are made in the ~/.spamassassin/user_prefs.template file, located in the user's home directory.
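The copy step can be sketched as a small shell function. This is only an illustration: the function name install_prefs and the backup file name user_prefs.bak are assumptions made for this example and are not part of SpamAssassin.

```shell
# Copy the edited template into place as user_prefs.
# install_prefs and the .bak backup name are hypothetical, chosen for this sketch.
install_prefs() {
    dir="${1:-$HOME/.spamassassin}"

    # Nothing to do if the template is missing.
    if [ ! -f "$dir/user_prefs.template" ]; then
        echo "no user_prefs.template found in $dir" >&2
        return 1
    fi

    # Keep a copy of the current settings before overwriting them.
    if [ -f "$dir/user_prefs" ]; then
        cp -p "$dir/user_prefs" "$dir/user_prefs.bak"
    fi

    cp "$dir/user_prefs.template" "$dir/user_prefs"
}

install_prefs "$HOME/.spamassassin" || echo "nothing copied" >&2
```

Keeping a backup is a design choice of this sketch, so that a mistake in the template can be undone by copying user_prefs.bak back.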
Users can whitelist by address:
whitelist_from joe@example.com fred@example.com
whitelist_from *@example.com
A whitelist entry can also check the relay's reverse DNS in addition to the sender's address, which can provide more accurate spam detection in many cases, but may not be as reliable for users sending from some networks:
whitelist_from_rcvd joe@example.com example.com
whitelist_from_rcvd *@axkit.org sergeant.org
Please see the SpamAssassin User Guide for more details.
The easiest way to tune SpamAssassin is to adjust the score threshold (the number of "hits") at which a message qualifies as spam. Look at the X-Spam-Status and X-Spam-Level headers of the e-mail messages you're receiving. Decide on the lowest level you're comfortable with, and set the required_hits entry to that number in the file ~/.spamassassin/user_prefs. For example:
# How many hits before a mail is considered spam.
required_hits 3
The lower you set this number, the greater the chance that a legitimate message will be flagged as spam, so use caution. If you use a very low score (e.g., 1 or 2), you might want to filter your lower-score spam to a separate mailbox, and check through it before deleting.
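To help pick a threshold, the scores already recorded in your mail can be listed from the shell. The following is a minimal sketch, assuming your mail is stored in mbox format and that the header looks like "X-Spam-Status: Yes, hits=7.5 required=3.0" (some versions write "score=" instead of "hits="). The function names and the mailbox path are hypothetical.

```shell
# Print the score from each X-Spam-Status header in an mbox file, lowest first.
# Assumes a header of the form "X-Spam-Status: Yes, hits=7.5 required=3.0 ..."
# or the same with "score=" in place of "hits=".
spam_scores() {
    sed -nE 's/^X-Spam-Status: (Yes|No), (hits|score)=(-?[0-9.]+).*/\3/p' "$1" | sort -n
}

# Count how many of the scores read from standard input are at or above
# the threshold given as the first argument.
count_at_or_above() {
    awk -v t="$1" '$1 + 0 >= t + 0 { n++ } END { print n + 0 }'
}

mailbox="$HOME/mail/inbox"   # hypothetical location; adjust to your own mailbox
if [ -f "$mailbox" ]; then
    spam_scores "$mailbox"
    echo "messages at or above 3: $(spam_scores "$mailbox" | count_at_or_above 3)"
fi
```

Running the count with a few candidate values shows roughly how many messages each setting of required_hits would have tagged.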
As a further refinement in the arms race to defend against spam, SpamAssassin can now use a pseudo-Bayesian algorithm to identify spam, in addition to its rules-based spam detection. This is a statistical method that can greatly improve spam recognition. To use this capability, each user must explicitly enable it and occasionally refresh it. Note that this method does require ongoing attention, as it builds a database of what currently looks like spam and non-spam.
You teach the Bayesian filter what is and what isn't spam by running the sa-learn command. You must also be able to show it examples (saved in mailboxes) of spam and non-spam, and this must be repeated periodically to keep it current. Here are some links where you can find details on this procedure:
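As a rough illustration of the training step (not a substitute for the documentation referenced above), the sketch below feeds one mailbox of spam and one of non-spam to sa-learn. It assumes that Bayesian filtering has already been enabled and that the examples are saved in mbox format. The mailbox paths and the function name train_bayes are hypothetical. The --spam, --ham, --mbox and --dump magic options belong to sa-learn.

```shell
# Hypothetical mailbox locations; adjust to wherever you save your examples.
spam_mbox="$HOME/mail/spam"
ham_mbox="$HOME/mail/ham"

# Train the Bayesian filter from one spam mailbox ($1) and one non-spam mailbox ($2).
train_bayes() {
    if ! command -v sa-learn >/dev/null 2>&1; then
        echo "sa-learn not found in PATH" >&2
        return 1
    fi

    # Learn the saved spam examples.
    if [ -f "$1" ]; then
        sa-learn --spam --mbox "$1"
    fi

    # Learn the saved non-spam (ham) examples.
    if [ -f "$2" ]; then
        sa-learn --ham --mbox "$2"
    fi

    # Show summary counters for the Bayes database.
    sa-learn --dump magic
}

train_bayes "$spam_mbox" "$ham_mbox" || echo "training skipped" >&2
```

Because the filter must be refreshed periodically, a function like this could be rerun whenever new examples have been saved to the two mailboxes.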
If you have any questions please contact the Lab Staff.