Firewall Wizards mailing list archives
Re: Will data security technology benefit from Homeland Security?
From: "Stephen P. Berry" <spb () meshuggeneh net>
Date: Fri, 21 Jun 2002 13:05:48 -0700
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Marcus J. Ranum writes:
Neural nets are a common starting point for IDS research. They don't work very well, for lots of reasons. Mostly, it's because a neural net can't _explain_ anything. In fact, all of the mathematical approaches for IDS I've seen appear to share this problem.
Interestingly, I tend to have precisely the opposite impression: signature methods are good for categorising things, but lousy at explaining them; and statistical methods are good at explaining things, but lousy at coming up with neat categorisations. Part of this is obviously just a semantic quibble---the lifeblood of security mailing lists---but I think there's actually a larger contention that's worth expanding.
A signature (and here I include things like simple protocol reassembly and suchlike) is generally a simple check for the presence or absence of some characteristic in the traffic under consideration. I.e., the analyst gets exactly one bit of information out of running a signature check. Clearly, much more information can be gained from context, from multiple signatures, and so on---and that's where the value of signature systems as IDS tools lies.
Statistical methods aren't like this. The result of a typical statistical NIDS widget probably doesn't offer a binary decision---it probably hands the analyst a probability, a distribution, or possibly a helpless shrug of the shoulders. In other words, the amount of information an analyst gets out of a statistical method is not constant.
Part of this is probably what you're pointing to when you observe that statistical methods don't `explain' anything. If a signature comes back positive, you know that the packet under examination has an illegal offset frag, some interesting combination of TCP flags, a particular IP ID, or something like that. If a statistical test comes back indicating that the TCP ISNs from two dozen hosts are consistent with a random distribution with a probability of less than 0.01 percent...well, that doesn't point to a particular packet and indicate what's `nonrandom' about its th_seq. It -does- give you a good estimate of how likely it is that such a set of ISNs would result from random chance.
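
To make the contrast concrete, here is a minimal sketch of the two kinds of check. Everything in it is an assumption made for illustration: the function names are hypothetical, the SYN+FIN test stands in for "some interesting combination of TCP flags", and the Kolmogorov-Smirnov test (with the usual asymptotic approximation for its p-value) is just one way of asking whether a set of ISNs looks uniformly random over the 32-bit sequence space. It is not the particular test referred to above.

```python
import math

FIN, SYN = 0x01, 0x02  # TCP flag bits as defined in RFC 793


def has_syn_fin(tcp_flags: int) -> bool:
    """Signature-style check: exactly one bit of information per packet."""
    return (tcp_flags & (SYN | FIN)) == (SYN | FIN)


def ks_uniform_pvalue(isns, space=2**32):
    """Statistical-style check: approximate p-value that the ISNs were
    drawn uniformly at random from [0, space), using a one-sample
    Kolmogorov-Smirnov test."""
    xs = sorted(isn / space for isn in isns)
    n = len(xs)
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    # Small-sample correction applied to the KS statistic.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The series below converges slowly here; the p-value is ~1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical data: ISNs from 24 hosts, each 64000 above the last.
clustered = [1_000_000 + 64_000 * i for i in range(24)]
print(has_syn_fin(0x03))             # a yes/no answer about one packet
print(ks_uniform_pvalue(clustered))  # a probability about the whole set
```

The first function can only say yes or no about a single packet. The second says nothing about any individual th_seq, but it returns an estimate of how likely the whole set would be under random chance.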
That estimate is as much an explanation as what a signature check tells you, and, indeed, it looks (to my mind, anyway) much more like an explanation in that it is a description of a behaviour, rather than merely an observation about the structure of individual packets.
It is true that you don't get the same sorts of information out of a statistical method that you get out of a signature. That's fine---if they both coughed up the same information, we wouldn't need both. Further, I think your observation indicates one of the values of statistical NIDS methods, rather than a weakness.
When we're looking for explanations, we can only go so far with the terms we started out with. A simple explanation will probably be stated in terms very similar to the ones the problem is stated in. As we look deeper and deeper, looking for the fundamental issues involved in the explanation, we eventually have to start stating our explanations in other terms---because if we explain things in the terms of the problem, we're just rearranging words.
An analogy: studying chemistry, you spend your first couple of years explaining chemical processes in terms of other chemical processes. Eventually, when you want to offer a deeper explanation of these processes, you end up using atomic physics. And if you keep going, you end up explaining the physics in terms of quantum mechanics. And if you look at your original problem and a QM explanation for it, it probably looks pretty loopy.
We're used to looking at incidents in terms of packets (and sometimes things like sessions, or other small sets of packets). We can do a -lot- with explanations stated entirely in these terms. As we look for explanations of things on much larger scales, however, I think it's -inevitable- that we end up stating things in different terms---terms like those used in statistical NIDS, for example.
I'm not very impressed at all with most of the mathematical approaches I've seen to IDS so far - and I've seen more than my share of them.
Okay, after my little philosophical fugue above, I have to agree with you here. I'm not very impressed with most of the mathematical NIDS methods I've seen---and I've -written- more than my fair share of them. If you interpret that to mean that right now, today, it makes sense to rely more heavily on things like signature methods than on things like statistical methods---sure. I don't agree that this indicates anything about statistical methods in general, or their ability to `explain' things.

- -Steve

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9E4cQG3kIaxeRZl8RArg7AJ0VbtIxaUXxbVCRmX42VHjgFfERowCfV0vL
nxd6bZ6/V1kFLx1rCyBThDw=
=g/Uo
-----END PGP SIGNATURE-----
_______________________________________________
firewall-wizards mailing list
firewall-wizards () nfr com
http://list.nfr.com/mailman/listinfo/firewall-wizards
Current thread:
- Re: Will data security technology benefit from Homeland Security? Marcus J. Ranum (Jun 16)
- Re: Will data security technology benefit from Homeland Security? Stephen P. Berry (Jun 22)