Sanity check: Media Temple, SpamAssassin, IMAP, & the iPhone

There is no denying that Gmail spam filtering is top notch. On average, I was fortunate enough to deal with fewer than 10 spam emails per week that managed to get through the Gmail spam filter. A free email service that offers accurate spam filtering? What’s not to like?

From my own experiences over the last two months, there isn’t a whole lot to like about intermittent account access, whether via the web or an IMAP client. Previously, I preached the goodness of Gmail, often looking for ways to embrace and integrate the service into everyday use.

Gmail is still a great email option over the likes of Hotmail or Yahoo. In reality, I have been making very little use of my actual Gmail address for communication. I relied on the service as a central repository to index and manage multiple email accounts / aliases. I realized that if the "cloud" went down for maintenance or was disabled without warning, I would be SOL. Paranoia? Sure. But if you have the power to administer your own services, why wouldn’t you? Email habits and requirements change. Mine have changed so much that I’ve returned to self hosted email: Media Temple email + IMAP support.

Meet SpamAssassin, the frontline defense against internet scum

Back in August 2008, Media Temple deployed the Cloudmark Authority plugin for SpamAssassin.

The Cloudmark Authority plug-in for SpamAssassin increased Media Temple’s catch rate of spam, phishing and viruses by 80 percent and drove down false positives to almost immeasurable levels. Cloudmark’s high accuracy is due to a unique combination of Advanced Message Fingerprinting™ technology and real-time corroborated feedback from Cloudmark’s Global Threat Network™ system, the world’s largest live threat monitoring system.

Back when I initially transitioned 5thirtyone.com from Dreamhost to Media Temple (2006), I tried the self hosted email route. It was as if the spam floodgates from hell had been opened. I had no choice but to rely on SpamSieve to clean up where the server filtering fell short. Well short of expectations. Around that time, I had just begun hearing about Gmail and soon after adopted the service.

Fast forward to the present. Media Temple email is in top shape, IMAP support is working flawlessly, and all of my non-Gmail accounts reside on a server that I control. The improved SpamAssassin is working well enough for me to access my accounts via IMAP in Mail.app without the need for Junk Mail filtering or SpamSieve (server side filtering required! – see below). I may have just jinxed everything with that statement, but filters will continue to be trained to offset the increase of potential spam.

Server side filtering with procmail

A SpamAssassin installation that is properly identifying and tagging spam is of little help if tagged emails continue to land in your inbox. There are no automated filters out of the box with self hosted email. While you could set up a desktop filter to properly sort spam, what about mobile devices like the iPhone? One solution would be a desktop client running 24/7 and applying rules to incoming emails over IMAP. An obvious waste of resources. The second, more logical solution is server side filtering.

Russ shares the two step process for automatically filtering spam tagged by SpamAssassin [server side] with procmail. Shared here for my own personal archiving. You’ll need access to your DV account via SSH. Enable developer tools from within your Media Temple account (if you have not done so already).

The first step requires the creation of a .procmailrc file within your email user account. I prefer vim over vi so:

vim /var/qmail/mailnames/your-domain.com/your-user/.procmailrc

Insert the following. Take note of the .Spam directory that I’ve specified (change if necessary).

MAILDIR=/var/qmail/mailnames/your-domain.com/your-user/Maildir
DEFAULT=${MAILDIR}/
SPAMDIR=${MAILDIR}/.Spam/
LOGFILE=${MAILDIR}/procmail.log
LOG="- Logging ${LOGFILE} for ${LOGNAME} "
# All mail tagged as spam (eg. with a score higher than the set threshold)
# is moved to the designated spam folder
:0
* ^X-Spam-Status: Yes.*
${SPAMDIR}
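
Before trusting the recipe with live mail, it can help to check which saved messages it would catch. The following is only a rough sketch of that check, not part of procmail or of Russ's setup: is_tagged_spam is a hypothetical helper name, and it assumes procmail's default behavior of matching the condition against the headers only and without regard to case.

```shell
#!/usr/bin/env bash
# Sketch: report whether a saved message would match the recipe condition
# "^X-Spam-Status: Yes". is_tagged_spam is a hypothetical helper, not a
# procmail or SpamAssassin command.

is_tagged_spam() {
    local message_file="$1"
    # Only the headers count: stop reading at the first blank line,
    # then look for the tag that SpamAssassin adds to spam.
    awk '/^$/ { exit } { print }' "$message_file" \
        | grep -qi '^X-Spam-Status: Yes'
}
```

Called as is_tagged_spam path/to/message, it returns success (exit status 0) when the message carries the spam tag, so it can be combined with && echo "would be filed in .Spam" while testing.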

The second step requires editing the existing .qmail file for the user account:

vim /var/qmail/mailnames/your-domain.com/your-user/.qmail

The content of the file should look like this:

| /usr/local/psa/bin/psa-spamc accept
|preline /usr/bin/procmail -m -o .procmailrc

With server side filtering, you no longer have to worry about a barrage of junk mail making it to your inbox for manual sorting. A huge relief especially for mobile users.
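
To confirm that mail is actually being routed, the procmail.log named in .procmailrc can be inspected. Here is a small sketch that counts deliveries into the spam folder. It assumes procmail's default log abstract, in which each delivery is recorded on a line beginning with "Folder:" followed by the destination path. count_spam_deliveries is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count the deliveries procmail logged into the spam folder.
# count_spam_deliveries is a hypothetical helper; it assumes the default
# log abstract format ("  Folder: <destination> <size>").

count_spam_deliveries() {
    local log_file="$1"
    local spam_folder="${2:-.Spam}"   # folder name used in .procmailrc
    # awk strips the leading whitespace, so the first field is "Folder:"
    # and the second field is the destination path.
    awk -v dir="/${spam_folder}/" \
        '$1 == "Folder:" && index($2, dir) { n++ } END { print n + 0 }' \
        "$log_file"
}
```

For example, count_spam_deliveries /var/qmail/mailnames/your-domain.com/your-user/Maildir/procmail.log prints the number of messages filed into .Spam so far. If the number stays at zero while tagged spam keeps reaching the inbox, the .qmail or .procmailrc paths are the first thing to recheck.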

The price of self hosted email vs. hosted services like Gmail

Let’s all be honest. The reason why Gmail spam filtering is so awesome is because of volume. The sheer volume of users continually teaching Google which emails are spam, and which emails are not. When you’re running your own email server, there is quite a bit more work involved to teach SpamAssassin – Spam vs. Ham.

Like any other spam filter, there is some training required. For each IMAP enabled account, I’ve created unique mailboxes / folders for training SpamAssassin. sa-learn comes in handy for training the Bayesian classifier. Media Temple uses the Maildir format (source).

sa-learn --no-sync [--spam or --ham] [folder/{cur,new}]

In my case, since I’ve set up LearnSpam (for emails that slipped by SpamAssassin) and IsHam (for safe emails to learn):

sa-learn --no-sync --spam ~/Maildir/.LearnSpam/{cur,new}

and

sa-learn --no-sync --ham ~/Maildir/.IsHam/{cur,new}
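
Because --no-sync skips the slow database synchronization step, a batch run like this is normally finished with a single sa-learn --sync once every folder has been scanned. A minimal sketch that wraps the two commands and the final sync is below. train_spamassassin and the SA_LEARN override are hypothetical conveniences of mine; the folder names are the ones used in this post.

```shell
#!/usr/bin/env bash
# Sketch: train the Bayesian classifier from both training mailboxes,
# then synchronize once at the end. train_spamassassin and SA_LEARN are
# hypothetical names; adjust the folder names to match your own setup.

SA_LEARN="${SA_LEARN:-sa-learn}"

train_spamassassin() {
    local maildir="${1:-$HOME/Maildir}"
    # --no-sync defers the slow database sync until all folders are scanned.
    "$SA_LEARN" --no-sync --spam \
        "$maildir/.LearnSpam/cur" "$maildir/.LearnSpam/new" || return 1
    "$SA_LEARN" --no-sync --ham \
        "$maildir/.IsHam/cur" "$maildir/.IsHam/new" || return 1
    # One sync after everything has been learned.
    "$SA_LEARN" --sync
}
```

Running train_spamassassin with no argument trains from ~/Maildir. It could be run by hand after sorting mail into the two folders, or scheduled with cron if that suits your setup.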

Are you still using Gmail as your main email service? Or, are you utilizing your own hosted email, server side and desktop filtering?

Discuss - 5 Comments

  1. Michael says:

    My email is hosted on my Google Apps account, but I’ve thought about hosting it locally. I am very paranoid about having everything in ‘the cloud.’ If I’m backing up files, I have them locally, on my external hdd, on S3 and on my BingoDisk.

    The problem with hosting on MediaTemple is that you have to pay for backups, or do your own backups. If the drive on your server dies, you’re screwed. If there is some magic way to integrate my email with S3 or my WebDAV disk and do auto backups, I would be sold.

    Another thing is the fact that all my current mail is on Google servers. What do I do if I need to search through that mail? I don’t want to be logging into 2 accounts day after day to go back and grab new stuff.

    I’m on the fence right now. If I could easily pull all my mail over from Gmail and then easily back it up, I would switch immediately.

  2. Jonathan says:

    Sanity checks = spam protection? Haha, you’re amazing Derek. I wonder how much spam you get that you need this. Too bad we’re not all as popular as you!

    Personally, I’m still using Gmail. I’m currently satisfied with the protection it gives. My instant messengers, on the other hand, need some kind of spam protection. Maybe we should get back to old school telephones :)

  3. Derek says:

    @Michael: Have you checked out Paul’s tutorial: How To: Bulletproof Server Backups with Amazon S3? Also, moving emails off of Gmail and into your own hosted account requires a little drag ‘n drop. Nothing more. If you’re using a desktop client you can a) pull everything down to store in a local directory, b) pull everything down and upload to your personal server account, or c) leave emails where they are and let your mail client search everything. This is all assuming you have IMAP access across the board.

    @Jonathan: There is no safe haven from spam. You’ve never received a random sales / spam call via telephone before?

  4. Michael says:

    Bah, you win. I’ll get around to moving it eventually :P

  5. M. Holger says:

    In my case, I found that moving away from hosting my own email was the right choice. It’s all fairly circumstantial, and there are aspects of hosting my own that I miss, but in the end the amount of time I save by not having to maintain the proper SpamAssassin updates, constantly training and re-training, applying security fixes to the MTA, etc., etc., all adds up. The things I’m able to pursue whilst not worrying about my mailserver easily outweigh the benefits to self-hosting, for me. If my email were more critical, and less personal, or if I had other users to consider, that would change the equation fairly drastically — but since it’s just little ol’ me, I can afford to let my email be handled offsite, where it’s indexed and categorized so that the hosting firm (Google) can better target me in their advertising programs…lol.

    I, for one, welcome our omniscient Google overlords! ;)