Full Disclosure mailing list archives

Re: Spam with PGP


From: Kiko Piris <fulldisclosure () pirispons net>
Date: Tue, 7 Oct 2003 22:51:58 +0200

On 07/10/2003 at 14:45, Jonathan A. Zdziarski wrote:

Actually the way SA does it weakens filtering.  SA's bayesian filtering
is only a very small piece of SA, and unfortunately not much attention
has been given to it.  The filter's final calculation is only a small
percentage of the actual final score.  Because true Bayesian filtering
performs a huge majority of the same tests that SA performs, SA's own
ruleset easily waters down any bayesian findings whenever there are
opposing values between the two.

IMHO, bayesian filters are no panacea right now; many of the spams I get
end like this:

---8<---
</body></html>ahdmf uvhuex qnzysthoa
r
 xdgmeqxqyawg
--->8---

These nonsense "words" fool bayesian filters. Such spams also do what
Brian Dinello pointed out.
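
To illustrate why padding a spam with random strings can matter, here is a
minimal sketch of a naive Bayes combination of per-token spam
probabilities. It is not SpamAssassin's or bogofilter's actual code. The
value 0.4 for a never-seen token and the choice to combine every token
(rather than only the most extreme ones) are assumptions made for the
example.

```python
from math import prod

# Assumed spam probability given to a token the filter has never seen.
UNKNOWN_TOKEN_PROB = 0.4


def combine(probs):
    """Combine per-token spam probabilities under the naive Bayes assumption."""
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


# Three strongly spammy tokens on their own.
spammy = [0.99, 0.99, 0.99]

# The same tokens followed by 40 random "words" the filter has never seen.
padded = spammy + [UNKNOWN_TOKEN_PROB] * 40

print("without padding:", combine(spammy))
print("with padding:   ", combine(padded))
```

In this toy model each unknown token pulls the combined result slightly
toward "not spam", so enough of them outweigh the few strong spam tokens.
A filter that only combines the most extreme tokens would be affected
less.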

For example, a pine MUA...SA thinks a pine MUA suggests an innocent
message, but a majority of the emails with a pine MUA my wife receives
are spams.  In this case, the hard-coded MUA rule will unfortunately
water down the score, even if Bayes thinks a pine MUA is spam.
Obviously the pine MUA is just a small rule, but if you apply this to
the other rules, you get the same results.  

Rules can easily be deactivated or "reinforced" in
/etc/spamassassin/local.cf or ~/.spamassassin/user_prefs if the defaults
do not suit your needs.

For example, right now, my SA (2.60-1 / debian sid) assigns no points to
mails having pine headers (if it did assign any points to them, it would
be very easy to configure it not to do so).
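
The two points above (a hard-coded rule watering down the Bayes result,
and the user being able to switch that rule off) can be sketched with a
toy additive scorer. The rule names, scores and the threshold of 5.0 are
made up for illustration and are not SpamAssassin's real defaults. In
SpamAssassin itself the override would be a "score" line in local.cf or
user_prefs rather than a Python dict.

```python
# Hypothetical rule names and scores, for illustration only.
DEFAULT_SCORES = {
    "BAYES_SAYS_SPAM": 3.5,      # Bayes classifier thinks the mail is spam
    "HTML_ONLY_BODY": 2.0,       # heuristic rule that adds points
    "MUA_LOOKS_INNOCENT": -1.5,  # hard-coded "trusted mail client" rule
}

THRESHOLD = 5.0  # assumed score at which a mail is flagged as spam


def total_score(hits, overrides=None):
    """Sum the scores of all rules that fired, applying user overrides."""
    scores = {**DEFAULT_SCORES, **(overrides or {})}
    return sum(scores[name] for name in hits)


def is_spam(hits, overrides=None):
    return total_score(hits, overrides) >= THRESHOLD


# A spam sent with a mail client the static rule considers innocent.
hits = ["BAYES_SAYS_SPAM", "HTML_ONLY_BODY", "MUA_LOOKS_INNOCENT"]

print(total_score(hits), is_spam(hits))

# The user zeroes the rule, as one would in a local configuration file.
user_overrides = {"MUA_LOOKS_INNOCENT": 0.0}
print(total_score(hits, user_overrides), is_spam(hits, user_overrides))
```

With the defaults the negative rule keeps the total under the threshold.
With the rule zeroed, the same mail is flagged.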

What's worse is that last time I looked (this may have changed), SA's
bayesian filter did not appear to have a mechanism for learning, but was
just a static dictionary.  If users got spam there was no way for the
user to forward their spams into the system for processing.  Again, this
may have changed and if it has, that's great.

It has one (sa-learn). And with mutt and its macros, teaching SA from
its own errors is just a matter of a keypress.
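
As a rough picture of what "learning" means here, compared with a static
dictionary, the sketch below keeps per-token counts and updates them each
time the user submits a message as spam or ham. It is only an
illustration and does not reflect how sa-learn is implemented. The class
name, the whitespace tokenizer and the default of 0.4 for unseen tokens
are all assumptions.

```python
from collections import Counter


class TokenStats:
    """Toy trainable token database (hypothetical, for illustration)."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spam mails containing it
        self.ham = Counter()   # token -> number of ham mails containing it
        self.n_spam = 0
        self.n_ham = 0

    def learn(self, text, is_spam):
        """Update the counts from one message the user has classified."""
        tokens = set(text.lower().split())
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def spam_probability(self, token, unknown=0.4):
        """Estimate how spammy a token is from the messages seen so far."""
        s, h = self.spam[token], self.ham[token]
        if s + h == 0:
            return unknown  # never seen: fall back to the assumed default
        s_rate = s / max(self.n_spam, 1)
        h_rate = h / max(self.n_ham, 1)
        return s_rate / (s_rate + h_rate)


stats = TokenStats()
print(stats.spam_probability("pills"))  # unseen token, default value

# The user reports one missed spam and one legitimate mail.
stats.learn("cheap pills online", is_spam=True)
stats.learn("meeting notes attached", is_spam=False)

print(stats.spam_probability("pills"))
print(stats.spam_probability("meeting"))
```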

The product of Bayesian filtering includes all the heuristic tests as
well, so having both _hurts_ you, and is not something you benefit
from.  It is much better to focus on creating a strong probability-based
filter IMHO...and I think the statistics agree with me.

Of course, SpamAssassin does bayesian filtering as well.
heuristic + bayesian is better than either alone, IMHO.

I agree with this: rulebased+bayesian (SA) works better (at
least for me) than bayesian alone (bogofilter). However, I must say that
bogofilter is the only bayesian filter I have tried (and I uninstalled it
some months ago when I switched to SA).

As I said before, I think that bayesian filters are not perfect
(spammers use tricks to circumvent them). And I also think that
rulebased ones aren't perfect either (there are also tricks to fool them,
like the PGP one pointed out by whoever started this thread).

So I think that a combination of both is better.

Just my 2 cents...

Greetings to all

-- 
Kiko

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html

