funsec mailing list archives

Re: 95% of User Generated Content is spam or malicious

From: Rich Kulawiec <rsk () gsp org>
Date: Sun, 14 Feb 2010 16:50:12 -0500

On Wed, Feb 10, 2010 at 10:24:27PM -0500, Dave Paris wrote:

> Where the trick (to the extent it's a trick, I suppose) lies here is
> what it takes to knock down this volume.

I use my firewalls/routers, beginning with the DROP list, followed by
a large number of blocks: by country, by so-called ISP/webhost (spammer
fronts, e.g., Eonix), and by so-called ESP (spammers, e.g., iContact, Uptilt).
And I use passive OS recognition to treat anything that's running
Windows differently -- because the odds are in the 10e4-10e6
to 1 range that it's not a real mail server, depending on how I
construct the metric.
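
A minimal sketch of the network-block idea, written in Python purely to
show the logic; the setup described above does this in firewalls/routers,
not in application code. The networks listed are reserved documentation
ranges standing in for real entries, and the names BLOCKED_NETWORKS and
is_blocked are hypothetical.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges,
# not actual DROP-list, country, or provider entries.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(addr: str, networks=BLOCKED_NETWORKS) -> bool:
    """Return True if addr falls inside any blocked network."""
    ip = ipaddress.ip_address(addr)
    # Compare only against networks of the same IP version.
    return any(ip.version == net.version and ip in net for net in networks)
```

A linear scan like this is fine for a handful of entries; a packet filter
would normally use its own table or set structure for large lists.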

Then I enforce DNS/rDNS existence and consistency checks on the connecting
host and the HELO parameter.  Then I use a large set of rDNS patterns
that's been very carefully developed to match non-mail-sending
hosts (e.g. end-user systems) and refuse everything from them outright:
real mail systems have real names, not generic ones.
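
The two checks in this step can be sketched roughly as follows. This is
an illustration under stated assumptions, not the pattern set described
above: the two regular expressions are hypothetical examples of "generic
name" patterns, and the resolver functions are passed in as parameters so
the logic can be exercised without live DNS. Only the connecting host is
covered here; the same idea would apply to the HELO parameter.

```python
import re
import socket

# Hypothetical patterns for generic (end-user style) rDNS names. A real
# set would be much larger and tuned against local mail logs.
GENERIC_RDNS = [
    # Four numeric groups in the name, e.g. an embedded IPv4 address.
    re.compile(r"\d{1,3}[-.]\d{1,3}[-.]\d{1,3}[-.]\d{1,3}"),
    # Common access-network keywords appearing as a label or label prefix.
    re.compile(
        r"(^|[.-])(dsl|dhcp|dynamic|dyn|dialup|pool|ppp|cable)([.-]|\d)",
        re.IGNORECASE,
    ),
]


def looks_generic(hostname: str) -> bool:
    """Return True if the rDNS name matches any generic-host pattern."""
    return any(p.search(hostname) for p in GENERIC_RDNS)


def default_ptr(ip: str) -> str:
    """Look up the PTR name for ip using the system resolver."""
    return socket.gethostbyaddr(ip)[0]


def default_addrs(name: str) -> set:
    """Look up all addresses for name using the system resolver."""
    return {info[4][0] for info in socket.getaddrinfo(name, None)}


def fcrdns_ok(ip: str, ptr_lookup=default_ptr, addr_lookup=default_addrs) -> bool:
    """rDNS must exist, and the name must resolve back to the same ip."""
    try:
        name = ptr_lookup(ip)
        # Plain string comparison; IPv6 text forms may need normalizing.
        return ip in addr_lookup(name)
    except OSError:  # socket.herror and socket.gaierror are subclasses
        return False
```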

Then I use a local blacklist of domains, sender LHS, senders, hosts,
networks, etc.  Then several DNSBLs.  And a few other things.
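
For the DNSBL step, the usual query convention is to reverse the address
(octets for IPv4, nibbles for IPv6), append the list's zone, and look the
result up as an ordinary A record. A minimal sketch, assuming a
hypothetical zone name "dnsbl.example" and treating any answer in
127.0.0.0/8 as "listed":

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query for ip in the given DNSBL zone."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        labels = reversed(str(addr).split("."))
    else:
        # IPv6 uses one label per hex nibble, in reverse order.
        labels = reversed(addr.exploded.replace(":", ""))
    return ".".join(labels) + "." + zone


def is_listed(ip: str, zone: str, resolve=socket.gethostbyname) -> bool:
    """Return True if the zone answers with a 127.x.x.x address for ip."""
    try:
        answer = resolve(dnsbl_query_name(ip, zone))
    except OSError:  # no such name (or lookup failure): treat as not listed
        return False
    return answer.startswith("127.")
```

The resolver is a parameter so the function can be tried without network
access; individual lists assign their own meanings to specific return
codes, which this sketch does not interpret.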

This is an extremely effective, efficient and very accurate setup.
It's effective because it doesn't waste time trying to figure out if
the same abusers who sent spam yesterday are sending some more today:
of course they are.  It's efficient because it rejects/accepts outright
-- and when it rejects, it does so before seeing the message-body.
It's accurate (a) because it's based solely on deterministic criteria,
(b) because I've been doing this for a very long time (sadly) and
have learned a few things by now, and (c) because it's correlated against
my own mail logs, a necessary but seldom-performed step that helps me
understand what the spam and non-spam components of my mail stream are.
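
The "reject outright, before the message body" property can be sketched
as an ordered chain of deterministic checks that see only
connection-level data and stop at the first rejection. The Connection
fields, the two example checks, and their reason strings are all
hypothetical, chosen only to show the shape of the decision.

```python
from typing import Callable, List, NamedTuple, Optional


class Connection(NamedTuple):
    """Data available before any message body is received."""
    ip: str
    rdns: Optional[str]
    helo: str


# A check returns a rejection reason, or None to let the connection pass.
Check = Callable[[Connection], Optional[str]]


def require_rdns(conn: Connection) -> Optional[str]:
    return None if conn.rdns else "no rDNS for connecting host"


def require_fqdn_helo(conn: Connection) -> Optional[str]:
    return None if "." in conn.helo else "HELO is not a fully qualified name"


CHECKS: List[Check] = [require_rdns, require_fqdn_helo]


def first_rejection(conn: Connection, checks: List[Check] = CHECKS) -> Optional[str]:
    """Run checks in order; return the first rejection reason, else None."""
    for check in checks:
        reason = check(conn)
        if reason is not None:
            return reason
    return None
```

Because every check is a pure function of the connection data, the same
input always yields the same verdict, which is what makes the results
easy to correlate against logs afterwards.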

I've gone into considerably more detail about this on mailop, if you
want the extended writeup, but the bottom line is that this kind of
approach beats the pants off more complex/trendier ones in terms
of performance, simplicity, resistance to attack, FP rate, FN rate,
maintainability and predictability.  But it does require pretty
good knowledge of what *your* spam/not-spam mail mix looks like:
you've got to understand it well before deploying something like this.

---Rsk