Firewall Wizards mailing list archives

Re: Will data security technology benefit from Homeland Security?


From: "Stephen P. Berry" <spb () meshuggeneh net>
Date: Fri, 21 Jun 2002 13:05:48 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Marcus J. Ranum writes:

> Neural nets are a common starting point for IDS research. They don't work
> very well, for lots of reasons.
> Mostly, it's because a neural net can't _explain_ anything. In fact, all
> of the mathematical approaches for IDS I've seen appear to share this
> problem.

Interestingly, I tend to have precisely the opposite impression:  signature
methods are good for categorising things, but lousy at explaining them;
and statistical methods are good at explaining things, but lousy at
coming up with neat categorisations.  Part of this is obviously just
a semantic quibble---the lifeblood of security mailing lists---but I think
there's actually a larger contention that's worth expanding.

A signature (and here I include things like simple protocol reassembly
and suchlike) is generally a simple check for the presence or absence of
some characteristic in the traffic under consideration.  I.e., the analyst
gets exactly one bit of information out of running a signature check.  Clearly,
much more information can be gained from context, from multiple signatures,
and so on---and that's where the value of signature systems as IDS tools lies.
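To make the one-bit point concrete, here is a minimal sketch of a signature
check. The Packet record and the choice of a SYN+FIN check are assumptions
made purely for illustration; they are not taken from any particular IDS.

```python
from dataclasses import dataclass

# TCP flag bits as laid out in the TCP header (RFC 793)
FIN = 0x01
SYN = 0x02


@dataclass
class Packet:
    # Hypothetical, pared-down packet record, for illustration only
    src: str
    tcp_flags: int


def syn_fin_signature(pkt: Packet) -> bool:
    """Return True if both SYN and FIN are set.

    The analyst gets exactly one bit out of this: match or no match.
    """
    return (pkt.tcp_flags & (SYN | FIN)) == (SYN | FIN)


packets = [Packet("10.0.0.1", SYN), Packet("10.0.0.2", SYN | FIN)]
hits = [p.src for p in packets if syn_fin_signature(p)]
```

Anything beyond that single bit (which host, how often, alongside which other
signatures) has to come from the context the checks are run in.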

Statistical methods aren't like this.  The result of a typical statistical
NIDS widget probably doesn't offer a binary decision---it probably hands
the analyst a probability, a distribution, or possibly a helpless shrug
of the shoulders.  In other words, the amount of information an analyst
gets out of a statistical method is not constant.  Part of this is probably
what you're pointing to when you observe that statistical methods don't
`explain' anything.  If a signature comes back positive, you know that
the packet under examination has an illegal offset frag, some interesting
combination of TCP flags, a particular IP ID, or something like that.  If
a statistical test comes back indicating that the TCP ISNs from two dozen
hosts are homeomorphic to a random distribution with a probability less
than 0.01 percent...well, that doesn't point to a particular packet and
indicate what's `nonrandom' about its th_seq.  It -does- give you
a good estimate of how likely it is that such a set of ISNs would result
from random chance.  This is as much an explanation as what a signature
check tells you, and, indeed, it looks (to my mind, anyway) much more like an
explanation in that it is a description of a behaviour, rather than merely
an observation about the structure of individual packets.
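As a rough sketch of the kind of ISN test described above: the choice of a
Kolmogorov-Smirnov test against a uniform distribution is an assumption (the
text does not name a specific test), as are the function name and the sample
ISNs. The p-value uses the standard asymptotic approximation, so treat it as
an estimate rather than an exact figure.

```python
import math


def ks_uniform_pvalue(isns, modulus=2**32):
    """Kolmogorov-Smirnov test of ISNs against uniform on [0, modulus).

    Returns (d, p): the KS statistic and an approximate p-value, i.e. an
    estimate of how likely a gap this large would be under random chance.
    """
    xs = sorted(x / modulus for x in isns)
    n = len(xs)
    # Largest gap between the empirical CDF and the uniform CDF F(x) = x
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Asymptotic Kolmogorov distribution with a small-sample correction
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical sample: ISNs from two dozen hosts that all sit in one
# narrow band of the 32-bit space, stepping by a fixed increment.
observed = [1000 + 64000 * i for i in range(24)]
d, p = ks_uniform_pvalue(observed)
suspicious = p < 1e-4  # the 0.01 percent threshold
```

Note what comes back: a statistic and a probability describing the whole set
of ISNs, not a pointer to any one packet's th_seq.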

It is true that you don't get the same sorts of information out of a
statistical method that you get out of a signature.  That's fine---if they
both coughed up the same information, we wouldn't need both.

Further, I think your observation indicates one of the values of statistical
NIDS methods, rather than a weakness.  When we're looking for explanations,
we can only go so far with the terms we started out with.  A simple
explanation will probably be stated in terms very similar to the ones
the problem is stated in.  As we look deeper and deeper, looking for
the fundamental issues involved in the explanation, we eventually have
to start stating our explanations in other terms---because if we explain
things in the terms of the problem, we're just rearranging words.

An analogy:  studying chemistry, you spend your first couple years
explaining chemical processes in terms of other chemical processes.
Eventually, when you want to offer a deeper explanation of these processes,
you end up using atomic physics.  And if you keep going, you end up explaining
the physics in terms of quantum mechanics.  And if you look at your
original problem and a QM explanation for it, it probably looks pretty
loopy.

We're used to looking at incidents in terms of packets (and sometimes things
like sessions, or other small sets of packets).  We can do a -lot- with
explanations stated entirely in these terms.  As we look for explanations
of things on much larger scales, however, I think it's -inevitable- that
we end up stating things in different terms---terms like those used in
statistical NIDS, for example.


> I'm not very impressed at all with most of the mathematical approaches
> I've seen to IDS so far - and I've seen more than my share of them.

Okay, after my little philosophical fugue above, I have to agree with
you here.  I'm not very impressed with most of the mathematical NIDS
methods I've seen---and I've -written- more than my fair share of them.

If you interpret that to mean that right now, today, it makes sense to
rely more heavily on things like signature methods than on things like
statistical methods---sure.  I don't agree that this indicates anything
about statistical methods in general, or their ability to `explain' things.


- -Steve

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9E4cQG3kIaxeRZl8RArg7AJ0VbtIxaUXxbVCRmX42VHjgFfERowCfV0vL
nxd6bZ6/V1kFLx1rCyBThDw=
=g/Uo
-----END PGP SIGNATURE-----
_______________________________________________
firewall-wizards mailing list
firewall-wizards () nfr com
http://list.nfr.com/mailman/listinfo/firewall-wizards

