funsec mailing list archives

Re: 95% of User Generated Content is spam or malicious


From: Rich Kulawiec <rsk () gsp org>
Date: Mon, 22 Feb 2010 08:23:09 -0500

On Mon, Feb 22, 2010 at 07:34:56AM -0500, Dan Kaminsky wrote:
> All I know is that I have a couple of email accounts that get
> negligible amounts of spam.  Oh, they're *sent* huge amounts, but they
> receive almost none.

But this is not the only metric with which to evaluate mail defenses.
Vendors like to talk about their TP (true positive) rates because of
course it is trivially easy, even for the incompetent, to put up
gaudy numbers.

Thorough evaluation of mail defenses includes FP (false positive) and
FN (false negative) rates, scalability,
resource consumption (including in turn bandwidth, CPU, memory, disk),
resistance to attack, resistance to gaming, predictability, performance
under load/duress/attack, initial cost, maintenance cost, ability to
handle previously-not-seen threats/attacks, ability to function in
the absence of external resources, minimized end-user involvement,
and other factors.
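
To make the first two of those concrete: here is a minimal sketch in
Python, using an invented labeled corpus (nothing below is any vendor's
API), of how one might compute TP, FP, and FN rates from a filter's
verdicts -- and of how a gaudy TP rate can coexist with an unacceptable
FP rate:

    # Sketch only: rates for a mail filter, given ground-truth labels
    # and the filter's verdicts.  Inputs are hypothetical.
    def confusion_rates(labels, verdicts):
        """labels/verdicts: equal-length sequences of 'spam' or 'ham'."""
        tp = fp = fn = tn = 0
        for truth, verdict in zip(labels, verdicts):
            if truth == 'spam' and verdict == 'spam':
                tp += 1                      # spam caught
            elif truth == 'ham' and verdict == 'spam':
                fp += 1                      # ham wrongly blocked
            elif truth == 'spam' and verdict == 'ham':
                fn += 1                      # spam missed
            else:
                tn += 1                      # ham delivered
        spam, ham = tp + fn, fp + tn
        return {'TP rate': tp / spam if spam else 0.0,
                'FN rate': fn / spam if spam else 0.0,
                'FP rate': fp / ham if ham else 0.0}

    # 95 spam + 5 ham; the filter catches all spam but eats one ham:
    print(confusion_rates(['spam'] * 95 + ['ham'] * 5,
                          ['spam'] * 96 + ['ham'] * 4))
    # {'TP rate': 1.0, 'FN rate': 0.0, 'FP rate': 0.2}

A 100% TP rate alongside a 20% FP rate: gaudy number, unusable filter.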

Moreover, that evaluation has to take place in context: we've known for
years that some spammers have expended considerable effort in identifying
mail addresses, mail servers, domains, networks, ASNs, etc. that belong
to particular people and organizations.  Sometimes they target these;
sometimes they avoid them (see "listwashing").  This can have a marked
effect on both the quantity and the type of SMTP abuse that's directed
at them, and that in turn can skew observable data.

As a trivial example of this: I moved one of my test domains to a /22
a few networks over from where it was, leaving everything else about the
setup unchanged.  Incoming spam tripled, and no, it was not a temporary
statistical artifact.  In this *particular* case, I don't know why, despite
expending considerable time researching that question.  However, having
conducted many experiments of this nature over many years, I can report
that this is not uncommon, and that one of the things it's shown me is
that some of the more competent spammers are investing considerable effort
in studying their adversaries, cataloging their assets, characterizing
their defenses.  *They* apparently read Sun Tzu, much to their credit.
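
For anyone wanting to rule out a blip in an experiment like that, a
quick sanity check on the daily counts suffices.  A minimal sketch,
with invented numbers, using a pooled two-sample test for a change in
a Poisson rate:

    import math

    def rate_change_z(before, after):
        """z-statistic for a change in daily spam rate.
        before/after: lists of per-day spam counts."""
        c1, n1 = sum(before), len(before)
        c2, n2 = sum(after), len(after)
        pooled = (c1 + c2) / (n1 + n2)        # rate if nothing changed
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        return (c2 / n2 - c1 / n1) / se

    before = [40, 38, 45, 41, 39, 44, 42]          # week before the move
    after = [118, 130, 121, 127, 125, 119, 133]    # week after
    print(f"z = {rate_change_z(before, after):.1f}")   # z = 17.1

With |z| that far past 2, "temporary statistical artifact" is safely
off the table.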

Anyway, one of the direct consequences of this reality is that testing
methodologies need to be very carefully constructed.  Anyone who
just plugs boxes from vendors X, Y, and Z into their network and does a
head-to-head comparison is not going to get a true picture of how those
systems really compare: they're only going to get a limited picture of how
those systems compare at the moment on their network(s) on their ASN(s)
with their domain(s).  Now, for their own use, that limited picture
*might* be adequate; but it's not sufficient to extrapolate to others,
and it *may not*, unless they're very knowledgeable, even extrapolate
to their own needs a month or a year down the road.

This is (one reason) why I've been telling people that they cannot
possibly hope to understand their spam problem, and thus address it,
unless they understand their mail traffic.  Thorough statistical
analysis is mandatory prior to crafting defenses: otherwise how can
one know what to defend against?  And there are such *huge* variations
across -- as above -- mail addresses, mail servers, domains, networks,
ASNs, etc. that one-size-fits-all solutions mostly don't.
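
A first pass at that kind of analysis doesn't require anything fancy.
A minimal sketch in Python, assuming the envelope data has already been
pulled out of the MTA logs into a CSV (the filename and column names
here are invented for illustration, not any particular MTA's format):

    import csv
    from collections import Counter

    def profile(path, top=10):
        """Crude traffic profile: who connects, claiming to be whom,
        with what result."""
        by_ip, by_domain, by_result = Counter(), Counter(), Counter()
        with open(path, newline='') as f:
            # columns: client_ip, sender_domain, disposition
            for row in csv.DictReader(f):
                by_ip[row['client_ip']] += 1
                by_domain[row['sender_domain']] += 1
                by_result[row['disposition']] += 1
        print("top client IPs:    ", by_ip.most_common(top))
        print("top sender domains:", by_domain.most_common(top))
        print("dispositions:      ", by_result.most_common())

    profile('smtp-envelopes.csv')

Even a tally this crude starts to show what a given domain's traffic
actually looks like -- and, per the above, that picture varies
enormously from one domain, network, or ASN to the next.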

---Rsk
_______________________________________________
Fun and Misc security discussion for OT posts.
https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
Note: funsec is a public and open mailing list.

