funsec mailing list archives

Re: Im lovin google spam filter


From: Rich Kulawiec <rsk () gsp org>
Date: Thu, 7 Apr 2011 11:56:51 -0400

On Thu, Apr 07, 2011 at 10:04:49AM -0400, Patrick Laverty wrote:
> I just checked my spam box for gmail and see 1500 messages.  A quick scan of
> the "From" and I saw zero false positive.

Alternatively: "I looked in my own back yard and there's no paper
or plastic blowing around, therefore nobody litters."

Meaningful tests of false positive (FP) and false negative (FN) rates require large sample sets (in
the sense of number of messages and number of accounts); moreover, they
require careful attention to the composition of those sample sets, both in
terms of how the addresses are actively used, and how they're passively
used (by spammers).  They also require far more than a single snapshot;
one day's sample is meaningless.  They require more than casual analysis:
human eyeballs are far too unreliable to accurately process that much
data.  And so on: this isn't an easy or quick measurement to make, even
for those of us who have been studying the problem for a very long time.
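
As a minimal sketch of the arithmetic behind that point (not the method
used for the measurements mentioned in this post), the Python below
computes FP and FN rates from hand-labelled counts and attaches a Wilson
score interval to each.  The definitions (FP rate = legitimate messages
filed as spam / all legitimate messages; FN rate = spam delivered to the
inbox / all spam), the function names, and every count are illustrative
assumptions.

```python
import math


def wilson_interval(errors, total, z=1.96):
    """Wilson score interval for the proportion errors/total.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= errors <= total:
        raise ValueError("errors must be between 0 and total")
    p = errors / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def error_rates(ham_total, ham_flagged, spam_total, spam_missed):
    """FP and FN rates with intervals, from hand-labelled counts.

    ham_flagged: legitimate messages the filter filed as spam (FP).
    spam_missed: spam messages the filter delivered to the inbox (FN).
    """
    return {
        "fp_rate": ham_flagged / ham_total,
        "fp_interval": wilson_interval(ham_flagged, ham_total),
        "fn_rate": spam_missed / spam_total,
        "fn_interval": wilson_interval(spam_missed, spam_total),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(error_rates(ham_total=4000, ham_flagged=12,
                      spam_total=1500, spam_missed=45))
    # Zero errors seen in 1500 labelled messages still leaves a nonzero
    # upper bound on the underlying rate.
    print(wilson_interval(0, 1500))
```

Even zero errors observed in 1500 labelled messages leaves an upper bound
of roughly a quarter of a percent, and that covers sampling error only.
It says nothing about whether the accounts, address usage, or time window
were representative.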

I've done all that, which is how I know that Gmail's FP (and FN,
incidentally) classification performance is mediocre.

---rsk
