IDS mailing list archives

Re: IDS Analyst Levels


From: "Stephen P. Berry" <spb () meshuggeneh net>
Date: Wed, 25 Feb 2004 20:11:55 -0800



Andy Cuff writes:

> I was hoping some of you would be willing to share how you define the
> various levels of IDS analysis and the flow between them.

This is, of course, a nontrivial question.  I'll try to outline in
general terms the sort of model I've been using lately for my own
fiendish NIDS projects, but I expect that most of it is broadly applicable
to IDS methodologies in general.


The basic model I've been using is what is usually called either the
Boyd Cycle or the OODA Loop.  Devised by Colonel John Boyd, it's a
way of describing decision processes[0].  The `OODA' name derives
from the four stages of the model.  Very briefly:

        Observation
                Data are collected.
        Orientation
                The data previously collected are organised into
                a model of the current situation.
        Decision
                Based on the model derived above, a course of
                action is selected.
        Action
                The course of action is pursued.

So this, in simple terms, is an explicit enunciation of a process
by which you look at things, figure out what is going on, decide
what to do about it, and then do it.  Thinking of things in
these terms will help us work out a formal model for internal data
flow in IDSes.  Now we need a way of correlating the things our
IDS does with the steps of this process.

The model I've been using lately for this is a grammatical model.  I've
mentioned this before in this forum, but I'll give a quick _precis_ here.

Think of packets on the network as a raw stream of symbols.  Your
signature-matching logic sorts through this data stream and identifies
recognised groups of symbols in it.  I.e., `This packet matches
some signature foo' or `This line of /var/log/messages matches some
regex bar'.  Think of this as lexical analysis of the stream of symbols:
in other words, your signature logic is tokenising the input data stream.
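
To make the tokenising idea concrete, here's a toy sketch in Python.
The signature names, regexes, and log lines are all invented for
illustration;  real NIDS signature logic is rather more involved:

```python
import re

# Hypothetical signature set: each name maps to a regex, standing in
# for signature-matching logic.  A match lexically categorises the
# input;  it does not by itself say `an attack is in progress'.
SIGNATURES = {
    "FAILED_LOGIN": re.compile(
        r"Failed password for (?P<user>\S+) from (?P<src>\S+)"),
    "ACCEPTED_LOGIN": re.compile(
        r"Accepted password for (?P<user>\S+) from (?P<src>\S+)"),
}

def tokenise(lines):
    """Turn a raw stream of symbols (log lines) into a token stream."""
    for line in lines:
        for name, pattern in SIGNATURES.items():
            m = pattern.search(line)
            if m:
                # Each token is a lexical category plus captured
                # attributes---analogous to a signature match.
                yield (name, m.groupdict())

stream = [
    "Failed password for root from 10.0.0.5",
    "Accepted password for alice from 10.0.0.9",
]
tokens = list(tokenise(stream))
```

The same shape works whether the input symbols are log lines, packets,
or Nessus findings;  only the matching predicate changes.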

From there, we can apply grammatical rules to the token stream to evaluate
(or establish) relationships between the symbols.  These rules obviously
are not just testing individual packets or log lines for certain
characteristics;  they are predicated on relationships between multiple
tokens.  Instead of evaluating propositions like, `Does this packet
have some flag foo set?', here we are asking things like, `Is the source of a
packet matching signature foo also the destination of a packet matching
signature bar, and is it running OS baz, rev n?'
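
A toy grammatical rule in the same vein.  The token names and the
scan-then-exploit relationship are invented;  the point is only that
the rule is predicated on a relationship between multiple tokens,
rather than on any single token:

```python
# Tokens are (signature, attrs) pairs as produced by some earlier
# lexical stage.
def sources(tokens, sig):
    return {t[1]["src"] for t in tokens if t[0] == sig}

def destinations(tokens, sig):
    return {t[1]["dst"] for t in tokens if t[0] == sig}

def rule_scan_then_exploited(tokens):
    """Hosts that were the source of a SCAN token and later the
    destination of an EXPLOIT token."""
    return sources(tokens, "SCAN") & destinations(tokens, "EXPLOIT")

tokens = [
    ("SCAN",    {"src": "10.0.0.5", "dst": "192.168.1.1"}),
    ("EXPLOIT", {"src": "192.168.1.1", "dst": "10.0.0.5"}),
    ("SCAN",    {"src": "10.0.0.7", "dst": "192.168.1.2"}),
]
suspects = rule_scan_then_exploited(tokens)
```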

Going back to our OODA model, we can see how the lexical analysis
of the data stream fits nicely into the observation phase.  There's
a temptation to think of pulling packets off the wire as observation
and signature matching as orientation.  I would contend, however,
that we're really not getting a model of the situation out of signature
matching.

Without attempting a formal proof:  as long as all a signature is testing
for is the presence (or absence) of some characteristics in the data,
that is all the information a match of that signature can contain.  If
we have some signature which matches the first n bytes of the MyDoom
payload, the signature match tells us `these bytes are present' and
not `an attack is in progress' or even `this is MyDoom propagating
itself'.  The latter is a semantic evaluation (an evaluation of meaning),
and all our signature gets us is a lexical categorisation.

This is an important distinction.  We may choose to -infer- meaning from
a signature match, but the signature match by itself -cannot- convey
meaning.


Anyway, with this all in mind, our OODA loop looks something like:

        Observation
                Tokenising the raw packet stream

                This is what most NIDSes do:  applying signatures
                to packets to get matches.  This applies equally
                well to grepping through /var/log/messages for particular
                strings, or running something like Nessus against a
                host to get a list of vulnerabilities.

                Call the outputs of this phase `events'.

        Orientation
                Parsing the tokens to evaluate the semantic
                content of the stream

                This is where the grammatical model discussed above
                comes in---it plays connect-the-dots between the
                events.

        Decision
                Applying semantic rules to the parsed stream to
                determine which pre-defined response(s) are
                indicated

                In the present case, this means that we've got rules expressed
                in terms of our grammatical model that associate
                semantic matches (e.g., `packet foo followed by packet bar
                when there's no packet baz from the same source') with
                some action (e.g., log a message).

        Action
                Conducting the actual response

                E.g., actually log that message.  Call that message
                an `alert'.

By default, we don't pester analysts with events, just alerts.
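
The four phases might be wired together like so.  This is a minimal
Python sketch;  the signatures, the group-by-source orientation step,
and the decision rule are all assumptions made up for illustration:

```python
def observe(packets, signatures):
    """Observation: tokenise the raw stream into events."""
    return [(name, pkt) for pkt in packets
            for name, test in signatures.items() if test(pkt)]

def orient(events):
    """Orientation: organise events into a model of the situation
    (here, simply grouped by source address)."""
    model = {}
    for name, pkt in events:
        model.setdefault(pkt["src"], []).append(name)
    return model

def decide(model):
    """Decision: semantic rules associating situations with responses."""
    return [("log", src) for src, names in model.items()
            if "FOO" in names and "BAR" in names]

def act(decisions):
    """Action: carry out the responses;  the outputs are alerts."""
    return [f"alert: {what} {src}" for what, src in decisions]

signatures = {"FOO": lambda p: p["flag"] == "foo",
              "BAR": lambda p: p["flag"] == "bar"}
packets = [{"src": "10.0.0.5", "flag": "foo"},
           {"src": "10.0.0.5", "flag": "bar"},
           {"src": "10.0.0.9", "flag": "foo"}]
alerts = act(decide(orient(observe(packets, signatures))))
```

Only the host that tripped both signatures produces an alert;  the
host that tripped one merely produces events.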

This process is repeated continuously, frequently recursively.  In other
words, we are always going through the loop:  looking at things, trying
to make sense of them, deciding on what to do, and then doing it.  Sometimes
our inputs (the things we're reacting to) are new (freshly arrived
packets, for example), and sometimes they will be the result of actions from
previous iterations of the loop.

In particular (in the NIDS case) this allows us to define actions which
involve writing new signatures which can be fed back into the system for
future observations.  So we might have a set of signatures that we're
checking every packet for, and then a whole slew of other signatures
that are created dynamically and applied to only specific groups
of packets[1].
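
A sketch of that feedback, again with invented details:  the scheme of
keying dynamic signatures to a source address is just one way of
scoping them to specific groups of packets:

```python
# Static signatures are checked against every packet;  dynamic
# signatures are created by actions and applied only to packets
# from a particular source.
static_sigs = {"PROBE": lambda p: p["dport"] == 23}
dynamic_sigs = {}   # src -> test

def process(pkt):
    alerts = []
    for name, test in static_sigs.items():
        if test(pkt):
            # Action from this iteration: write a new signature that
            # future iterations of the loop will observe with.
            dynamic_sigs[pkt["src"]] = lambda p: p["dport"] != 23
            alerts.append((name, pkt["src"]))
    watch = dynamic_sigs.get(pkt["src"])
    if watch and watch(pkt):
        alerts.append(("FOLLOWUP", pkt["src"]))
    return alerts

out = []
for pkt in [{"src": "10.0.0.5", "dport": 23},  # trips PROBE
            {"src": "10.0.0.5", "dport": 80},  # trips dynamic sig
            {"src": "10.0.0.9", "dport": 80}]: # trips nothing
    out.extend(process(pkt))
```

The dynamic signature also gives us the state transition mentioned in
the footnote:  a source moves from unwatched to watched as a result of
an action.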


Anyway, that's a quick synopsis of the IDS analysis model I've been
playing with lately, and a little bit about the data flow in it.
If you're interested, substantial hunks of this are already
implemented in shoki 0.3.0 [2].





-spb

-----
0       Originally combat control decisions, but it has since been
        widely used by many other disciplines.
1       Tracking signature generation and the matching of dynamic signatures
        also provides us with a pretty straightforward state
        transition model, which is exactly the sort of thing statistical
        analysis weenies like me like to see in an IDS data model.
2       And somewhat less substantial hunks are actually documented.

