Spam Filtering on the Departmental Mail Server
The departmental mail server runs SpamAssassin to identify spam and procmail to shunt both spam and spam backscatter (i.e. false mail bounces) out of your inbox and into special folders. Our standard configuration also filters out duplicate copies of messages (i.e. messages which have your address in the To: or Cc: lines more than once).
The spam, backscatter and duplicates are saved in these folders, respectively:
mail/rhpcs-mail-cleaner/spam
mail/rhpcs-mail-cleaner/bounce
mail/rhpcs-mail-cleaner/duplicate
These folders are archived and emptied daily, weekly or monthly (depending on your configuration file); see below for details.
Activating Spam Filtering for Your Account
Spam filtering is turned on by default for accounts created after April 1st, 2008 but not on older accounts.
You can turn mail filtering on by running the following command (from a terminal window) on ms:
activate-mail-filter
Alternatively, you can edit your .procmailrc file manually and add this line at the top:
INCLUDERC=$RHPCS_MAIL_CLEANER/procmailrc-clean
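As a rough sketch of how a manually edited .procmailrc might look, the fragment below puts the INCLUDERC line first and follows it with one personal recipe. Only the INCLUDERC line comes from this page; the mailing-list pattern and the folder name are made-up placeholders, so adapt or drop them.

```
# ~/.procmailrc (illustrative sketch)
# Run the departmental spam, backscatter and duplicate filters first
INCLUDERC=$RHPCS_MAIL_CLEANER/procmailrc-clean

# Hypothetical personal recipe: file one mailing list into its own folder.
# "example-list" and the folder name are placeholders, not real settings.
:0:
* ^List-Id:.*example-list
$HOME/mail/lists
```

Keeping the INCLUDERC line at the top means the filters see each message before any of your own recipes do.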
All filtering activity will be recorded in the file .procmail-log in your home directory.
Deactivating Spam Filtering
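You can deactivate spam filtering by removing the file .procmailrc from your home directory. If your .procmailrc contains other recipes that you want to keep, remove only the INCLUDERC line shown above instead.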
Missed or False Spam & Backscatter
Some spam and backscatter bounces will elude the filters and land in your inbox. You can help SpamAssassin to analyze and tune the filters by saving the missed messages to these folders:
mail/rhpcs-mail-cleaner/spam-missed
mail/rhpcs-mail-cleaner/bounce-missed
Similarly, some legitimate messages may be caught as spam or backscatter. These can be saved or copied to these folders for analysis:
mail/rhpcs-mail-cleaner/spam-false
mail/rhpcs-mail-cleaner/bounce-false
Performance will vary with position in a spam cycle and an individual's particular stream of mail and spam, but in general, one can expect less than 10% of spam to be missed and well under 0.1% false positives.
procmail, vacation and .forward
Note that if you plan to use procmail as described above to filter spam, you cannot have a .forward file and cannot use the standard vacation program, because both interfere with procmail. Instead, use procmail itself to handle both the forwarding and vacation functions; see the Email section of the FAQ page. Procmail is very powerful, but it has to work alone.
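For orientation only, forwarding from within procmail generally takes the shape sketched below; the FAQ remains the reference for this server. The destination address is a placeholder, and the sketch assumes that the included filter file delivers spam to its folders itself, so that only mail which passes the filters reaches the forwarding recipe.

```
# ~/.procmailrc (illustrative sketch)
INCLUDERC=$RHPCS_MAIL_CLEANER/procmailrc-clean

# Forward a copy of everything that reaches this point.
# The "c" flag keeps a local copy; the address is a placeholder.
:0 c
! you@example.org
```

Dropping the "c" flag would forward the message without keeping a copy on ms.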
Using Whitelisting to Prevent Spam False Positives
SpamAssassin will sometimes flag legitimate mail as spam. These false positives may be due to the way the sending computer identifies itself, or to the formatting (e.g. use of HTML, graphics, colours), or to the content (e.g. messages about drugs or cosmetic surgery), or - most likely - due to a combination of these factors. You can ensure that messages from specific addresses are not flagged as spam by putting whitelist entries in the file ~/.spamassassin/user_prefs. For example:
# whitelist an individual
whitelist_from John.Smith@somewhere.edu
# whitelist everyone at an institution
whitelist_from *@*.somewhere.edu
Archives of Filtered Mail
Once each day, week or month (depending on the value of "mail-cleaner-archive-frequency" in your .rhpcsrc file), the spam, backscatter and duplicate folders (and any -missed and -false folders) are compressed into zip files and archived in a folder in your home directory called .rhpcs/mail-cleaner/archive/yyyy/mm/dd/.
In order for your mail client to read the archived folders, you will first need to unzip them into your mail directory, a process automated by the command access-archived-mail. For example, to read the filtered-mail folders archived on May 28th, 2008, run the command ...
access-archived-mail 2008/05/28
... which would place the uncompressed folders in the mail folder rhpcs-mail-cleaner/tmp/2008/05/28. It might be necessary to restart your mail client in order to see the folders. The rhpcs-mail-cleaner/tmp folder will be deleted each night during the archive process.
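If you look at archives regularly, a small shell helper can build the yyyy/mm/dd argument for you. This is only a sketch: archive_date is a hypothetical function name, and it assumes GNU date (for the -d option), which may not match the date command on every system.

```shell
# Format a date as yyyy/mm/dd, the layout used for the archive folders.
# archive_date is a hypothetical helper; it assumes GNU date (-d option).
archive_date() {
    date -d "$1" +%Y/%m/%d
}

# Example use (left commented out so the sketch has no side effects):
# access-archived-mail "$(archive_date yesterday)"
```

With this in place, archive_date 2008-05-28 yields the same 2008/05/28 string used in the example above.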
Report of Mail-Filter Activity
The command archive-user-report will report basic statistics for the most recent archive and optionally give you a list of senders and subjects for messages caught by the spam filter. You can alternatively specify a date. For example:
archive-user-report
archive-user-report --date=2009/05/28 --summary | more
Archive Reports via Email
If you wish to receive a report of the mail-filter activity each time the folders are archived, add the following line to the file .rhpcsrc in your home directory:
mail-cleaner-report:yes
If you also wish to receive the list of messages filtered to the spam folder, add this line:
mail-cleaner-report-spam-summary:yes
You can choose to have the spam-message info sorted by subject, from or date by adding a line like:
mail-cleaner-report-summary-sort:subject
You can specify the value daily, weekly or monthly for the following setting, which determines how often the filter folders are archived and a report is sent:
mail-cleaner-archive-frequency:weekly
The default value (i.e. if no value is specified) is daily.
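Put together, a .rhpcsrc that uses all four settings might look like the sketch below. The keys are the ones described in this section; the particular values are just one possible combination. No comment lines are shown because this page does not say whether .rhpcsrc accepts them.

```
mail-cleaner-report:yes
mail-cleaner-report-spam-summary:yes
mail-cleaner-report-summary-sort:from
mail-cleaner-archive-frequency:weekly
```

With these values, the folders would be archived once a week and each report would include the list of spam messages sorted by sender.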
Generating Statistics
Basic statistics are stored in ~/.rhpcs/mail-cleaner/stats. You can generate reports for all past archive dates with this command:
archive-user-generate-stats
Spam Cycles
Even with a spam filter in place, some spam will make it to your inbox. You can expect the volume to increase and then drop off in fairly regular cycles: the increase comes as spammers learn to avoid the clues that the filter looks for, and the drop-offs happen when the SpamAssassin rules are updated.
Spam Filtering and Mail Forwarding
If you forward your mail using a .forward file, you will circumvent mail filtering on ms, which is probably fine, provided that you are happy with the mail filtering at the destination account. See this FAQ entry for instructions on forwarding with procmail.