Default Spam Filtering for User Accounts

As of 11-Aug-2009, the department has enabled Spam filtering by default on all new user accounts. This is an opt-out system: a new account starts with filtering turned on, and you may then opt out or adjust the settings however you like.

Briefly, e-mail messages that the mail system flags with a very high spam score are discarded outright. Remaining messages at or above a certain minimum spam score are filtered to a separate mail folder, and this folder is subject to an automatic three-month rotation.

The default setup

The Computer Science department's e-mail system employs various methods for determining and scoring Spam; please see our main Spam page for more details. The procmail system is used to perform various functions before an e-mail message ever hits your Inbox (main e-mail folder). You can adjust personal settings and add additional processing -- usually various sorts of filtering -- via the .procmailrc file in your home directory; this is your personal procmail configuration file.

We install, by default, a .procmailrc file in new accounts that will facilitate the filtering of suspected Spam to a separate e-mail folder. This spam folder will then be rotated -- that is, moved through a rotation of files -- automatically such that these messages are deleted two to three months after they are received. It is recommended that an account holder periodically examine their messages filtered as Spam, and make any needed adjustments to the filtering; you may of course ask the Lab Staff for assistance with this.

For the latest versions of the procmail configuration files maintained by the CS Lab, please see the directory: /usr/project/support/dotfiles/procmail/

Why?

Please note that this filtering is especially important if you forward your e-mail to another site such as Gmail: when too much Spam gets forwarded to a site, it can result in our site becoming black-listed or grey-listed; that is, any e-mail coming from cs.duke.edu becomes either blocked or considered suspect. This results in delivery problems to the other site. Please see our Forwarding E-mail & Filtering Spam with Procmail page for additional suggestions regarding forwarding to Gmail; note that this can be adapted for forwarding to other sites, too.

Additionally, when people don't filter their spam, Inboxes tend to grow gradually into huge files, and this degrades the performance of both their e-mail client and the CS mail system generally; i.e., other users are also affected.

How to make adjustments

To make adjustments to your Spam filtering, you will need to edit the .procmailrc file in your home directory. Please run "man procmailrc" for details of this file's cryptic syntax. This FAQ page is not a procmailrc tutorial. Please do not hesitate to contact the Lab Staff for assistance!

Please note that one feature of .procmailrc is the INCLUDERC variable or directive: it is used to interpolate another configuration file into the current one. We utilize this feature for the vacation program, and do similarly here for Spam filtering. In order to compartmentalize the Spam related definitions and rules, we have them in a separate file named .procmailrc.spam, and this is included in .procmailrc via an INCLUDERC assignment. This, of course, is optional; you can combine your Spam processing code into .procmailrc, or use a differently named file, or whatever; everything will still work as long as your .procmailrc is properly composed. For simplicity, we'll usually refer to your .procmailrc, though this will include .procmailrc.spam or wherever the appropriate code resides.
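
As a minimal sketch of that arrangement, the relevant line in .procmailrc might look like the following. The location of .procmailrc.spam directly in your home directory is an assumption here; check your own installed .procmailrc for the exact line.

  # Pull in the Spam related definitions and rules (path is assumed)
  INCLUDERC=$HOME/.procmailrc.spam

Because procmail reads the included file at the point where the assignment appears, any rules placed before this line in .procmailrc are tried before the Spam rules, and any placed after it are tried afterward.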

Spam filtering rule stanzas are included in the default .procmailrc files. Code such as this will provide the filtering:

  MAILDIR=$HOME/mail
  SPAM_1=spam-probable
  
  # spam with score 11 or above gets THROWN AWAY (i.e., sent to /dev/null)
  :0
    * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*
    /dev/null
  
  # after which, remaining spam score 3 or above is likely spam,
  # but will be saved in a separate folder
  :0:
    * ^X-Spam-Level: \*\*\*
    $SPAM_1

This means that by default you're discarding Spam with a score of 11 or more, and other messages deemed to be Spam (score 3 or more) will be saved (in ~/mail/spam-probable) and eventually deleted. Each "\*" in a selection line stands for one point of spam score, and because the pattern is not anchored at its end, a line with N of them matches any score of N or more. The scores acted upon can therefore be adjusted by altering the number of "\*" patterns in the selection lines. These patterns are matched against the header lines of each e-mail message. Additional fine tuning can be added -- such as saving Spam to multiple files based on score -- and the Lab Staff can help you with this.
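
For instance, saving Spam to two files based on score might be sketched as follows, with these rules placed after the /dev/null rule. The variable SPAM_2, the file name spam-likely, and the threshold of 6 are hypothetical choices for illustration; they are not part of the default configuration.

  MAILDIR=$HOME/mail
  SPAM_1=spam-probable
  SPAM_2=spam-likely

  # spam with score 6 or above goes to a second file (hypothetical)
  :0:
    * ^X-Spam-Level: \*\*\*\*\*\*
    $SPAM_2

  # remaining spam with score 3 or above goes to the usual file
  :0:
    * ^X-Spam-Level: \*\*\*
    $SPAM_1

The order matters here: procmail stops at the first rule that delivers a message, so the rule with more "\*" patterns must come first. Otherwise the three-star rule would catch every message with a score of 3 or more, and the second file would stay empty.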

The spam file (folder) rotation is implemented via a variable naming convention: A program run monthly by the Lab Staff will parse your .procmailrc, and any variables matching the pattern SPAM_[0-9]+ (i.e., "SPAM_" followed by one or more digits) will be considered to refer to spam files that should get the automatic rotation. In the sample code above, the line:

  SPAM_1=spam-probable

matches this pattern. Therefore, at the end of the first month, the file ~/mail/spam-probable would be moved (rotated) to ~/mail/spam-probable.1, and a new, empty ~/mail/spam-probable would be created. The next month, spam-probable.1 would move to spam-probable.2 and spam-probable would again move to spam-probable.1. In every month after that, the existing spam-probable.2 is discarded first, before the other files are moved.

A way to opt-out of the spam file rotation is to not use these variables to name your spam files, either by using different variable names (that do not match the pattern) or by not using variables at all. For example:

  MY_SPAM=spam-probable
  :0:
    * ^X-Spam-Level: \*\*\*
    $MY_SPAM

or just

  :0:
    * ^X-Spam-Level: \*\*\*
    spam-probable

Note that the actual name of the file does not matter for opt-in or opt-out; only the name of the variable determines whether a file is rotated.

Where to find additional information

Warning: Modify .procmailrc at your own risk!