Sunday, August 14, 2011

Only 0.01% Wrong?

As a computer hobbyist, the Jaded Consumer is interested in software projects and thus particularly fond of high-quality open-source software projects. Consequently, I was very interested to read that Daniel Hartmeier reported his automated spam solution had achieved a 99.5% accuracy rate in identifying spam, while only losing 0.01% of legitimate emails to false-positive assignment of "spam" status. As it happens, I am not currently running a mailserver outside my own LAN, so I have little current need for this sort of solution – but I have clients, and I read explanations like his to understand what sorts of solutions can be offered to people who don't want to spend thousands on commercial solutions.

Honestly, though, email is becoming like phone service: you don't run your own cable and tap out your own Morse any more, you pick one of a field of commodity vendors and use the service (including any offered spam control) until you decide you like another better. You usually get email service bundled with an Internet connection; there's really little purpose for many businesses to bother with their own email back-ends unless the business is large enough that in-house handling makes better economic sense for handling the business' particular security concerns. Not that all these big businesses get it right, mind you, but ....

But I didn't try to email Daniel Hartmeier about his spam solution. Instead, I emailed him about pf. The page that leads with the word "History" was last updated in 2006, and there's been a recent development in the availability of pf. As of the launch of MacOS X v.10.7 ("Lion"), Apple's operating system includes pf, the firewall Daniel Hartmeier originated following the removal of IPFilter from the OpenBSD code repository in May of 2001. I thought I'd point out that – although the fact hadn't been advertised – the world's highest-volume Unix distribution now contained the firewall Daniel began some ten years ago.

So I sent Daniel an email showing that pf-specific virtual devices were present in the default installation of MacOS X "Lion". What I got back wasn't what I expected:

"A message that you sent could not be delivered to one or more of its recipients. This is a permanent error. The following address(es) failed:
SMTP error from remote mail server after end of data:
host []: 554 5.7.1 Spam (score 3.5)"

I thought the problem might be that I was using a return email address that didn't come from the domain where the mailserver was located. I sent it again from another email address, using the mailserver of that address' own domain. I got:

"Google tried to deliver your message, but it was rejected by the recipient domain. We recommend contacting the other email provider for further information about the cause of this error. The error that the other server returned was: 554 554 5.7.1 Spam (score 3.5) (state 18)."

One of my other tries got a totally different failure. I even tried rewriting my email so that its text might be less apt to trip a Bayesian spam filter that had keyed on something in my prior email, but to no avail. I suppose the lesson is this: Daniel's excellent firewall tool may be everything one might dream (correctly coded, fast, feature-rich, elegant to configure, etc.) but this doesn't mean I can easily accept his claim that he's stopping 99.5% of the spam with a 0.01% false-positive rate. For information about stopping spam successfully, I'll be reading further.
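To put a number on that skepticism, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the claimed 0.01% false-positive rate applies independently to every legitimate message. Real filters score content, so resends of similar text are certainly not independent, and the function names below are hypothetical.

```python
# Back-of-the-envelope check on a claimed 0.01% false-positive rate.
# Assumption (not from the original post): each legitimate message is
# misclassified independently of the others. Resending similar text to
# the same filter violates this, so treat the result as a rough bound.

CLAIMED_FALSE_POSITIVE_RATE = 0.0001  # 0.01% expressed as a fraction


def prob_all_rejected(n_messages: int,
                      rate: float = CLAIMED_FALSE_POSITIVE_RATE) -> float:
    """Chance that n independent legitimate messages are all flagged as spam."""
    return rate ** n_messages


def expected_false_positives(n_legitimate: int,
                             rate: float = CLAIMED_FALSE_POSITIVE_RATE) -> float:
    """Expected number of legitimate messages lost out of n_legitimate."""
    return n_legitimate * rate


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} legitimate message(s), all rejected: {prob_all_rejected(n):.1e}")
    print(f"Expected losses per 10,000 legitimate messages: "
          f"{expected_false_positives(10_000):.2f}")
```

Under the independence assumption, two spam rejections in a row for one sender would be extraordinarily unlikely at the claimed rate. The more plausible reading is that the filter errs systematically on certain kinds of mail, which an average rate measured on one person's inbox would not reveal.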

But on that firewall -- congratulations, Daniel Hartmeier, on an excellent product. It's rocked for years, and I'm pleased it's come to my own desktop. Thanks for everything.
