Snort mailing list archives

Re: Understanding Snort Internals

From: Matthew Watchinski <mwatchinski () sourcefire com>
Date: Tue, 12 Jun 2007 10:50:25 -0400

Giorgio Moscardi wrote:

Hi,
I'm new to this mailing list, so first of all hello everybody :).

I'm trying to get a deeper understanding of how Snort works internally, for a
university project. I have already posted this message to the Snort forums,
but it seems nobody read it, so I'm posting it here again, hoping to get
some replies and not to upset anybody.
 
My main interest is in exactly how string matching (the "content"/"uricontent"
rule options, basically) is performed. I've taken a look at various
documents on the Net, but most are too vague or talk about earlier (and
slower) Snort versions.

I know that a multiple-string pattern matching algorithm is used (a
modification of Aho-Corasick, right? Or is it Wu-Manber? I know that
different algorithms can also be chosen at runtime). I also know that rules
are grouped first by protocol and then by some protocol parameters (i.e.
ports for TCP/UDP, type for ICMP), so that for every packet only a
subset of the rules is tested, while the other rules cannot possibly
match. I have also understood how the grouping treats "any" rules, so this
is clear enough for me.


A heavily modified version of Aho-Corasick is the default pattern
matcher.  Wu-Manber has been deprecated.

Rules are grouped essentially by port.
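
The port grouping described above can be sketched roughly as follows. This
is only an illustration, not Snort's actual code: the rule dictionaries, the
ANY marker and both function names are hypothetical, and folding the "any"
rules into every port group is an assumption based on the description in
this thread.

```python
from collections import defaultdict

ANY = "any"  # hypothetical marker for a rule that names no specific port


def build_port_groups(rules):
    """Return (groups, generic): per-port rule lists plus the 'any' rules."""
    generic = [r for r in rules if r["dst_port"] == ANY]
    groups = defaultdict(list)
    for r in rules:
        if r["dst_port"] != ANY:
            groups[r["dst_port"]].append(r)
    # Assumption: "any" rules have to be considered for every port,
    # so they are appended to each port-specific group.
    for port in groups:
        groups[port].extend(generic)
    return dict(groups), generic


def rules_for_port(groups, generic, port):
    """Rules worth testing for a packet sent to `port`."""
    # Ports without a dedicated group fall back to the "any" rules only.
    return groups.get(port, generic)


rules = [
    {"sid": 1, "dst_port": 80},
    {"sid": 2, "dst_port": 25},
    {"sid": 3, "dst_port": ANY},
]
groups, generic = build_port_groups(rules)
port_80_sids = [r["sid"] for r in rules_for_port(groups, generic, 80)]
other_sids = [r["sid"] for r in rules_for_port(groups, generic, 9999)]
```

With this layout a packet to port 80 is only ever tested against the port 80
rules plus the "any" rules, and the port 25 rule is never looked at.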


The basic Snort operation is as follows, as far as I can understand:

1. A packet arrives. Let's assume it contains IP+TCP


BPF filtering goes right here.

The Stream4/5 and frag3 preprocessors go here.

2. The patterns belonging to the proper subset of the TCP rules are tested
against the packet (all in a single pass).
3. For every matched pattern the rule it belongs to is identified. This set of
rules contains all possible match candidates, while all other rules can no
longer match at this point.


Rules that have no content match but belong to the correct port group are
evaluated here.

4. For every candidate rule the rest of the options are tested, to see whether 
the rule is fully matched or not.
5. Fully matched rules are added to the event queue.
6. The event queue is processed.
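
The detection steps above, including the extra handling of rules without a
content match, can be sketched roughly like this. It is a simplified
illustration under stated assumptions, not Snort's code: the rule layout
("fast_pattern", "other_checks") and the function name are hypothetical, a
plain substring test stands in for the Aho-Corasick search, and BPF and the
preprocessors are left out.

```python
def inspect_packet(packet, port_groups):
    """Return the sids of the rules that fully match `packet`."""
    payload = packet["payload"]
    group = port_groups.get(packet["dst_port"], [])

    # Steps 2-3: stand-in for the multi-pattern search. A real matcher scans
    # the payload once for all patterns; here each pattern is checked naively.
    candidates = [r for r in group
                  if r["fast_pattern"] is not None
                  and r["fast_pattern"] in payload]

    # Rules without a content pattern cannot be found by the pattern search,
    # so every such rule in the port group becomes a candidate as well.
    candidates += [r for r in group if r["fast_pattern"] is None]

    # Step 4: test the remaining options of each candidate rule.
    event_queue = []
    for rule in candidates:
        if all(check(packet) for check in rule["other_checks"]):
            event_queue.append(rule["sid"])  # Step 5: queue the event
    return event_queue


port_groups = {
    80: [
        {"sid": 100, "fast_pattern": b"/etc/passwd",
         "other_checks": [lambda p: b"GET" in p["payload"]]},
        {"sid": 101, "fast_pattern": b"cmd.exe", "other_checks": []},
        {"sid": 102, "fast_pattern": None,
         "other_checks": [lambda p: len(p["payload"]) > 1000]},
        {"sid": 103, "fast_pattern": None,
         "other_checks": [lambda p: p["dst_port"] == 80]},
    ]
}
packet = {"dst_port": 80, "payload": b"GET /../../etc/passwd HTTP/1.0"}
events = inspect_packet(packet, port_groups)
```

Here rule 101 is dropped by the pattern search alone, while rules 102 and
103 are evaluated even though they carry no pattern.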
 
So, am I right this far? I'd like to get all the possible corrections and/or 
details!


Pretty close. What I've added above isn't 100% correct, but it adds
a few important steps.

 
Another thing that is quite unclear to me is how rules with multiple "content"
options are dealt with. The code seems to suggest that only the longest
pattern is added to the algorithm. So the remaining patterns are tested one
at a time at step 4? But this would not benefit from a multiple-pattern
string matching algorithm, so I'm not sure.


The longest content per rule is selected for loading into Aho-Corasick;
additional contents are evaluated in order in the OTN trees.
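
As a rough sketch of that split (hypothetical function names, plain
substring tests instead of Aho-Corasick, and content modifiers such as
offset or depth ignored entirely), the idea looks something like this:

```python
def split_contents(contents):
    """Pick the longest content for the multi-pattern matcher.

    Returns (longest, rest), where `rest` keeps the rule's original order.
    On a tie the first of the longest contents is chosen (an assumption).
    """
    longest = max(contents, key=len)
    idx = contents.index(longest)
    rest = contents[:idx] + contents[idx + 1:]
    return longest, rest


def rule_matches(payload, contents):
    """True if every content of the rule occurs in `payload`."""
    longest, rest = split_contents(contents)
    # Stand-in for a hit from the multi-pattern search on the longest content.
    if longest not in payload:
        return False
    # The remaining contents are checked one at a time, in rule order.
    return all(c in payload for c in rest)


contents = [b"GET", b"/cgi-bin/phf", b"HTTP"]
selected, remaining = split_contents(contents)
hit = rule_matches(b"GET /cgi-bin/phf HTTP/1.0", contents)
miss = rule_matches(b"POST /cgi-bin/phf HTTP/1.0", contents)
```

The one-at-a-time checks only run for rules whose longest content already
hit, so most rules in a port group never reach them.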


OK, that's all for now. Thanks for taking the time to read and answer my
questions! There will surely be more ;).


Hopefully that helps a bit.

Cheers,
-matt

