Full Disclosure mailing list archives

Re: Spam with PGP


From: "Jonathan A. Zdziarski" <jonathan () nuclearelephant com>
Date: Wed, 08 Oct 2003 08:54:29 -0400


You've done better than us. How have you managed to train your users to 
forward the email as the full email, incl all headers, etc? We've found 
most forwarded messages do not include all headers, and therefore 
forwarded messages train the spam database with semi legit emails (i.e. 
headers are legit because they are forwarded).

Full headers are not necessary; DSPAM creates a binary signature of
every message that goes through its system and hangs onto it for
[default] 14 days.  Depending on how the software is configured, this
binary signature will either be attached to each message (and forwarded
on automatically) or [default] kept on the server, with a very small
signature embedded into text portions of every email message that goes
out.  Regrettably, this signature must be visible or some email clients
will break it, but I find a majority of users can get used to it in the
same way they get used to in-line PGP signatures.  When they forward the
message, the signature gets forwarded just like any other part of the
message...DSPAM then looks up the binary signature on the server and
reverses the tokens.  This obviously uses some additional disk space on
the server-side, but not as much as you'd think.  Since attachments are
ignored, and tokens are deduplicated, there are only about 200-300
tokens in the average legitimate email.  These are stored as CRC64
values on the server, which consume 8 bytes per token (roughly 1.6 to
2.4 KB per message).
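
To make the signature mechanism above concrete, here is a minimal
sketch in Python.  It is not DSPAM's code: the class and function names
are hypothetical, the tokenizer is a toy, and the CRC-64 polynomial
(ECMA-182) is an assumption that may differ from the one DSPAM uses.
It only illustrates the idea of keeping a deduplicated set of 8-byte
token hashes per message for 14 days and reversing them on request.

```python
import re
import secrets
import time
from collections import Counter

# Assumption: ECMA-182 polynomial; DSPAM's own CRC64 may use another one.
CRC64_POLY = 0x42F0E1EBA9EA3693
MASK64 = 0xFFFFFFFFFFFFFFFF
RETENTION_SECONDS = 14 * 24 * 3600  # the 14-day default mentioned above


def crc64(data: bytes) -> int:
    """Bitwise MSB-first CRC-64 with zero initial value and no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ CRC64_POLY) & MASK64
            else:
                crc = (crc << 1) & MASK64
    return crc


def tokenize(text: str) -> frozenset:
    """Return the deduplicated set of 64-bit token hashes for a message."""
    words = re.findall(r"[a-z0-9$'!-]+", text.lower())
    return frozenset(crc64(w.encode("utf-8")) for w in words)


class SignatureStore:
    """Hypothetical server-side store: signature id -> (timestamp, tokens)."""

    def __init__(self):
        self._sigs = {}
        self.spam = Counter()
        self.innocent = Counter()

    def record_innocent(self, text, now=None):
        """Count a delivered message as innocent and return its signature id."""
        now = time.time() if now is None else now
        tokens = tokenize(text)
        self.innocent.update(tokens)
        sig_id = secrets.token_hex(8)  # the short id embedded in the message
        self._sigs[sig_id] = (now, tokens)
        return sig_id

    def reverse(self, sig_id, now=None):
        """Move a message's tokens from innocent to spam; False if unknown or expired."""
        now = time.time() if now is None else now
        entry = self._sigs.pop(sig_id, None)
        if entry is None or now - entry[0] > RETENTION_SECONDS:
            return False
        for tok in entry[1]:
            self.innocent[tok] -= 1
            self.spam[tok] += 1
        return True
```

When a user forwards a message, only the short id needs to survive the
trip; the server looks it up and calls something like `reverse()`.  At
8 bytes per token, a 300-token message costs about 2400 bytes of
storage for the retention period.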

It sounds like you've moved about 8 steps beyond us, with some kind of a 
spam button interface. IMHO that's what SMTP really needs - a 'feedback 
loop' protocol to teach the server. Such a protocol would be similar to 
POP3 in the reverse direction (in particular, provide some form of 
authentication and then push a message), so that you could push a button 
and teach a central server (by whatever mechanism it chooses to learn 
by) that a message is SPAM. Nevertheless, we have not found a way to 
train our users to forward messages appropriately; they usually don't 
include the full headers, and therefore we miss the majority of the spam 
data.

If they use Outlook, tools like SpamSource should do the trick for you -
it puts a little button in Outlook that you click, and it automatically
forwards the entire original message.  One thing you should be concerned
about though - you definitely don't want to be paying any attention to
the user's own headers when they forward.  We did this about 20 versions
ago and it started spam-damning any emails that were replies to messages
they had sent out.
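
As a rough illustration of that last point, the sketch below uses
Python's standard `email` package to pull training text out of a
forwarded report while ignoring the forwarding user's own headers.
The function name `training_text` is hypothetical and this is not how
DSPAM itself does it; it only shows the principle of never tokenizing
the outer From/To/Subject lines.

```python
from email import message_from_string


def training_text(forwarded_raw: str) -> str:
    """Return only text/plain body content from a forwarded report.

    Headers of the forwarding message are never included, so the
    reporting user's own address and subject line are not learned as spam.
    """
    msg = message_from_string(forwarded_raw)
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue  # skips containers and attachments
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)
```

For example, given a report with `From: user@example.com` in its outer
headers, the returned text contains the body of the message but not the
reporter's address.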

I love it, increasing spam protection is great. My perspective is that 
filtering 90% of spam for 1000 users (via SA, or whatever) is better 
than filtering 99% of spam for 1 user. Yes, the individual number is 
better in terms of percentage, however by doing the whole group of 
users, we block several hundred to a few thousand spam messages a day. 
It remains a difficult problem.

I'm not opposed to using both - but it ought to be set up in a way that
anything SA catches gets fed into your Bayesian tool as a guilty corpus
- this way your Bayesian tool doesn't suffer by not receiving that 90%
of email...and will shortly be able to filter the other 10% out for you.
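
A minimal sketch of that arrangement, in Python, might look like the
following.  Everything here is hypothetical: `rule_filter_flags` is a
stand-in for a rule-based filter such as SA (it is not SpamAssassin's
API), and `TokenCorpus` is a toy counter rather than a real Bayesian
engine.  The point is only that whatever the first filter blocks is
still fed to the second as guilty corpus.

```python
from collections import Counter


class TokenCorpus:
    """Toy per-token counters standing in for a Bayesian filter's database."""

    def __init__(self):
        self.spam = Counter()
        self.innocent = Counter()

    def train(self, text: str, is_spam: bool) -> None:
        target = self.spam if is_spam else self.innocent
        target.update(set(text.lower().split()))  # deduplicate per message


def rule_filter_flags(text: str) -> bool:
    """Hypothetical stand-in for a rule-based filter's verdict."""
    return "viagra" in text.lower()


def process(text: str, corpus: TokenCorpus) -> str:
    """Block what the rule filter catches, and train on it as spam."""
    if rule_filter_flags(text):
        corpus.train(text, is_spam=True)
        return "blocked"
    return "delivered"
```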

Jonathan


_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html

