Full Disclosure mailing list archives

RE: spam with anti-bayesian parts


From: "Bojan Zdrnja" <Bojan.Zdrnja () LSS hr>
Date: Tue, 13 Jan 2004 09:58:25 +1300

 

-----Original Message-----
From: full-disclosure-admin () lists netsys com 
[mailto:full-disclosure-admin () lists netsys com] On Behalf Of 
Suresh Ponnusami
Sent: Tuesday, 13 January 2004 12:30 a.m.
To: vogt () hansenet com; full-disclosure () lists netsys com
Subject: Re: [Full-disclosure] spam with anti-bayesian parts

Actually most of the spammers use automated tools that contain some
scriptable plugins to evade the spam filters. Since they spam thousands
of users at a time, picking something real might be a bit slow and
require extra processing. Even if they create a template for all the
mails, that'll take up some time which they may not want to waste. Also,
introducing random gibberish noise might be able to get through bayesian
filters because that particular gibberish junk may not be in the database.

That shouldn't help them if the spam-marking software is written properly
(and much of it isn't). If something is not in the Bayesian database, it will
be given a neutral value (like a 0.5 probability of being spam). However,
their marketing words (viagra or whatever) will get a high probability (more
than 0.9). As the software should take only the 6 or so most spammy tokens
into account, all that gibberish won't matter.
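
To make that concrete, here is a minimal Python sketch of such a scorer. It
is an illustration only, not the code of any particular filter: the names
spam_probability, token_db, NEUTRAL and TOP_N are made up for this example,
the 0.5 and 6 are simply the figures mentioned above, and ranking tokens by
their distance from the neutral value is an assumption borrowed from the
approach Paul Graham describes.

```python
from math import prod

NEUTRAL = 0.5   # assumed value for tokens that are not in the database
TOP_N = 6       # assumed cutoff: "6 or so" tokens are taken into account


def spam_probability(tokens, token_db, top_n=TOP_N, neutral=NEUTRAL):
    """Combine per-token spam probabilities into one score (naive Bayes)."""
    # Unknown tokens get the neutral value; clamp so no token is 0 or 1.
    probs = [min(0.99, max(0.01, token_db.get(t, neutral)))
             for t in set(tokens)]
    # Keep only the tokens farthest from neutral. Unknown gibberish sits
    # exactly at neutral, so any known word outranks it.
    probs.sort(key=lambda p: abs(p - neutral), reverse=True)
    top = probs[:top_n]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)


# Hypothetical database and message, for illustration only.
token_db = {"viagra": 0.99, "free": 0.95, "click": 0.92}
plain = ["viagra", "free", "click"]
padded = plain + ["xq%dzv" % i for i in range(20)]  # random-looking noise

score_plain = spam_probability(plain, token_db)
score_padded = spam_probability(padded, token_db)
```

In this sketch the padded message scores the same as the plain one, because
any neutral tokens that make it into the top 6 contribute equally to both
sides of the ratio and cancel out.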

One case where it will help them is when they include only a single IMG SRC
in an HTML e-mail, and that link wasn't in the database before. That way the
Bayesian classifier won't have a chance to know what's going on, as possibly
everything could get neutral values. In that case, other rules should help
(like RDBs and so on).
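
A filter could detect that situation and hand the message to its other
rules. The Python sketch below is only an assumption about how one might do
it: needs_other_rules, token_db and the margin parameter are hypothetical
names and values, and the other rules themselves are not shown.

```python
NEUTRAL = 0.5  # assumed value for tokens that are not in the database


def needs_other_rules(tokens, token_db, neutral=NEUTRAL, margin=0.1):
    """Return True when no token is far enough from neutral to count as
    evidence, so the Bayesian score says nothing useful."""
    return all(abs(token_db.get(t, neutral) - neutral) < margin
               for t in tokens)


# Hypothetical database and messages, for illustration only.
token_db = {"viagra": 0.99}
image_only = ["http://example.invalid/x7q.gif"]  # link never seen before
with_words = image_only + ["viagra"]

defer_image_only = needs_other_rules(image_only, token_db)
defer_with_words = needs_other_rules(with_words, token_db)
```

Here the image-only message is flagged for the other rules, while the one
containing a known marketing word is left to the Bayesian score.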

For more info I'd suggest checking Paul Graham's wonderful Web page at:

http://www.paulgraham.com/antispam.html


Also, Jonathan's DSPAM has some nice documents on the Web (I'm pretty sure
he'll reply to this as well). You can find them at:

http://www.nuclearelephant.com/projects/dspam/


Cheers,

Bojan

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html

