tcpdump mailing list archives

Re: About pcap rules


From: Jefferson Ogata <Jefferson.Ogata () noaa gov>
Date: Fri, 25 Aug 2006 03:49:01 +0000

On 2006-08-21 15:47, Alexander Dupuy wrote:
> > > when given a rule consisting of a set of sub rules to pcap, if a
> > > packet matches the rule, how do I know which sub rule it matches?
> >
> > libpcap will not tell you that.  As far as it's concerned - and as far
> > as the kernel is concerned, on those platforms where the packet
> > filtering is done in the kernel - there are no subrules, there's just
> > one big program that either says "matches" or "doesn't match".
>
> If you're willing to dive below the libpcap interface and generate a
> custom BPF program, you may be able to distinguish subrules, since the
> final result is actually not just "matches" or "doesn't match" but
> rather how many bytes to capture, from 0 to 64K.
>
> If you know that all traffic of interest will be at least, say, 40
> bytes, you can have a BPF program that captures 38 bytes for one
> subrule and 39 bytes for another. Obviously, this won't work if you
> need to capture the entire packet, or if you see packets shorter than
> the lengths your BPF program returns. It's also a bit tricky to code,
> and if you are using Linux you may want to rely on the "any" interface
> so that a single BPF program works regardless of the actual NIC
> interface type.
>
> You can use tcpdump -d to see the BPF programs generated from pcap
> expressions, which helps, but this definitely qualifies as a very
> advanced libpcap hack, and unless the performance gains will be
> significant, this approach is probably unwise. I myself have
> considered this for a particular application, but have never actually
> implemented it.
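
The return-length trick quoted above can be sketched in a few lines of
Python. This is only an illustration, not libpcap: run_bpf is a toy
interpreter for three classic BPF opcodes, the packets are bare IPv4
headers with no link layer, and the subrules "src host 10.0.0.1" and
"dst host 10.0.0.2", like every helper name here, are assumptions made
up for the example.

```python
import ipaddress
import struct

# Classic BPF opcode values:
# BPF_LD|BPF_W|BPF_ABS, BPF_JMP|BPF_JEQ|BPF_K, BPF_RET|BPF_K.
LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06


def ip(addr):
    """Dotted-quad string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


# Hand-written program; each instruction is (code, jt, jf, k).
# Jump offsets are relative to the following instruction.
# Offsets 12 and 16 are the IPv4 source and destination addresses.
PROGRAM = [
    (LD_W_ABS, 0, 0, 12),            # 0: A = src address
    (JEQ_K, 2, 0, ip("10.0.0.1")),   # 1: match -> 4, else -> 2
    (LD_W_ABS, 0, 0, 16),            # 2: A = dst address
    (JEQ_K, 1, 2, ip("10.0.0.2")),   # 3: match -> 5, else -> 6
    (RET_K, 0, 0, 38),               # 4: subrule 1 matched, capture 38 bytes
    (RET_K, 0, 0, 39),               # 5: subrule 2 matched, capture 39 bytes
    (RET_K, 0, 0, 0),                # 6: no match, capture nothing
]

SUBRULE_BY_CAPLEN = {38: "src host 10.0.0.1", 39: "dst host 10.0.0.2"}


def run_bpf(prog, pkt):
    """Toy classic-BPF interpreter; returns the snapshot length (0 = reject)."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LD_W_ABS:
            if k + 4 > len(pkt):
                return 0  # an out-of-bounds load rejects the packet
            acc = struct.unpack_from("!I", pkt, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


def ipv4_packet(src, dst, payload_len=20):
    """Minimal 20-byte IPv4 header (only the addresses are filled in) plus payload."""
    header = bytes([0x45]) + bytes(11)
    header += ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return header + bytes(payload_len)


def classify(pkt):
    """Recover the subrule from the captured length, as the capture consumer would."""
    caplen = min(run_bpf(PROGRAM, pkt), len(pkt))  # capture never exceeds the packet
    return SUBRULE_BY_CAPLEN.get(caplen)
```

With 40-byte packets, classify() recovers the subrule from the captured
length. A 38-byte packet that matches the second subrule is captured at
38 bytes and is therefore reported as the first, which is the
short-packet caveat mentioned above.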

Some years ago, I actually implemented a mechanism in libpcap to handle
this sort of scenario. I can't share the code for various reasons, but I
bring it up to point out a couple of issues you might run into if you do
delve more deeply into it.

First of all, BPF short-circuits its tests when it finds a successful
match. Thus if you are using

    host foo or host bar

and you give it a packet sent from host foo to host bar, you will get a
match on one of these subrules (which subrule may depend on the
optimizer) and the other rule will never be examined, because, after
all, the packet already matches the overall rule. That is, using the
approach Alexander describes, you may be able to find out which subrule
the packet matched /first/, where order is determined in part by the
optimizer, but not which other subrules the packet might have matched if
you kept looking. While it may be useful to know that the packet matches
"host foo", in most cases where you care about subrules, you would also
like to know whether the packet matches "host bar", and you don't get to
know both facts. So if you are interested in knowing /all/ of the
subrules a packet matches, you're out of luck unless you take a
radically different approach.
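
To see the short-circuit concretely, here is a small Python sketch. It
is a toy model under stated assumptions, not libpcap: run_traced is a
made-up interpreter for three classic BPF opcodes that also records
which instructions ran, packets are bare IPv4 headers, and "host foo or
host bar" is simplified to "src host 10.0.0.1 or dst host 10.0.0.2".

```python
import ipaddress
import struct

LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06  # classic BPF opcode values


def ip(addr):
    return int(ipaddress.IPv4Address(addr))


FOO, BAR = ip("10.0.0.1"), ip("10.0.0.2")  # stand-ins for "foo" and "bar"

# Each instruction is (code, jt, jf, k); offsets 12/16 are IPv4 src/dst.
PROGRAM = [
    (LD_W_ABS, 0, 0, 12),   # 0: A = src
    (JEQ_K, 2, 0, FOO),     # 1: src == foo -> 4 (accept), else -> 2
    (LD_W_ABS, 0, 0, 16),   # 2: A = dst
    (JEQ_K, 0, 1, BAR),     # 3: dst == bar -> 4 (accept), else -> 5
    (RET_K, 0, 0, 65535),   # 4: accept
    (RET_K, 0, 0, 0),       # 5: reject
]


def run_traced(prog, pkt):
    """Run the program; return (result, indices of the instructions executed)."""
    acc, pc, trace = 0, 0, []
    while True:
        code, jt, jf, k = prog[pc]
        trace.append(pc)
        pc += 1
        if code == LD_W_ABS:
            if k + 4 > len(pkt):
                return 0, trace  # an out-of-bounds load rejects the packet
            acc = struct.unpack_from("!I", pkt, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k, trace
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


def ipv4_packet(src, dst, payload_len=20):
    """Minimal 20-byte IPv4 header (only the addresses are filled in) plus payload."""
    header = bytes([0x45]) + bytes(11)
    header += ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return header + bytes(payload_len)
```

For a packet from 10.0.0.1 to 10.0.0.2 the trace is [0, 1, 4]:
instructions 2 and 3, which hold the second test, never execute, so
nothing in the result can say whether that subrule would also have
matched.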

If you don't want to build BPF programs from scratch, you can add a
comma operator to the pcap compiler. Then when you want to evaluate
subrules, you use comma in place of "or". For example,

    host foo, host bar

would test "host foo" first, then continue to evaluate "host bar"
regardless of the outcome of the first test.

This doesn't buy you much without some method of communicating the
subrule results outside the BPF engine. The method of returning a
distinct value per subrule doesn't work when you want to detect multiple
subrule matches. One method that works is to add a callback opcode to
the BPF engine and an appropriate operator to the pcap compiler.
(Naturally, this only works if you execute the BPF engine in userland.)
So then you would have something like:

   host foo and call 1, host bar and call 2

meaning if the packet matches "host foo", the engine should then call
out to your callback handler with the argument 1, then proceed to test
"host bar" and call out with argument 2 if the packet matches the second
subrule. Because of the comma operator, if the packet matches both
subrules, you'll get both callbacks.
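
As a rough illustration of the comma-plus-callback idea, the following
Python sketch hand-translates a simplified version of that expression
into a toy BPF program. Everything specific here is an assumption: the
CALL_K opcode and its value are invented for the sketch and are not
part of classic BPF, the interpreter is not libpcap's, packets are bare
IPv4 headers, "host foo" and "host bar" are reduced to "src host
10.0.0.1" and "dst host 10.0.0.2", and the final return value is
arbitrary.

```python
import ipaddress
import struct

LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06  # classic BPF opcode values
CALL_K = 0xE7  # HYPOTHETICAL callback opcode; value chosen arbitrarily


def ip(addr):
    return int(ipaddress.IPv4Address(addr))


FOO, BAR = ip("10.0.0.1"), ip("10.0.0.2")  # stand-ins for "foo" and "bar"

# Hand translation of "src host foo and call 1, dst host bar and call 2".
# Each instruction is (code, jt, jf, k). Neither test jumps to a return,
# so the second subrule is always evaluated.
PROGRAM = [
    (LD_W_ABS, 0, 0, 12),   # 0: A = src
    (JEQ_K, 0, 1, FOO),     # 1: match -> 2, else -> 3
    (CALL_K, 0, 0, 1),      # 2: callback(1)
    (LD_W_ABS, 0, 0, 16),   # 3: A = dst
    (JEQ_K, 0, 1, BAR),     # 4: match -> 5, else -> 6
    (CALL_K, 0, 0, 2),      # 5: callback(2)
    (RET_K, 0, 0, 65535),   # 6: assumed return value for this sketch
]


def run_bpf(prog, pkt, on_call):
    """Toy userland interpreter; on_call(k) is invoked for each CALL_K reached."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LD_W_ABS:
            if k + 4 > len(pkt):
                return 0  # an out-of-bounds load rejects the packet
            acc = struct.unpack_from("!I", pkt, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == CALL_K:
            on_call(k)
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


def ipv4_packet(src, dst, payload_len=20):
    """Minimal 20-byte IPv4 header (only the addresses are filled in) plus payload."""
    header = bytes([0x45]) + bytes(11)
    header += ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return header + bytes(payload_len)


def matched_subrules(pkt):
    """Collect the argument of every callback fired for this packet."""
    hits = []
    run_bpf(PROGRAM, pkt, hits.append)
    return hits
```

A packet from 10.0.0.1 to 10.0.0.2 fires both callbacks, so
matched_subrules() returns [1, 2]; a packet matching only one subrule
yields just that subrule's number.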

An alternate way to communicate subrule results would be to use the
existing BPF instructions to store flags directly into the packet data,
for example in the Ethernet header, whose contents you might not care
about. This is way more kludgy, but you /might/ get away from the
BPF-in-userland requirement using this technique.

Yet another option, going back to userland, is to compile each of your
subrules independently, save the BPF program for each subrule, and then
execute every BPF program against each packet in turn. This is the most
programmatically clean way to do it, though much less efficient in most
cases because the optimizer can't factor out common code in all of the
subrules.
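
The one-program-per-subrule approach is easy to sketch. With real
libpcap you would presumably compile each subrule with pcap_compile()
and evaluate it in userland with something like pcap_offline_filter();
the Python below only models the idea. The toy interpreter, the
equal_test() helper, the bare-IPv4 packets, and the two example
subrules are all assumptions for illustration.

```python
import ipaddress
import struct

LD_W_ABS, JEQ_K, RET_K = 0x20, 0x15, 0x06  # classic BPF opcode values


def ip(addr):
    return int(ipaddress.IPv4Address(addr))


def equal_test(offset, value):
    """Tiny stand-alone program: accept if the 32-bit word at offset equals value."""
    return [
        (LD_W_ABS, 0, 0, offset),  # 0: A = word at offset
        (JEQ_K, 0, 1, value),      # 1: match -> 2, else -> 3
        (RET_K, 0, 0, 65535),      # 2: accept
        (RET_K, 0, 0, 0),          # 3: reject
    ]


# One independent program per subrule; nothing is shared between them.
SUBRULES = {
    "src host 10.0.0.1": equal_test(12, ip("10.0.0.1")),
    "dst host 10.0.0.2": equal_test(16, ip("10.0.0.2")),
}


def run_bpf(prog, pkt):
    """Toy classic-BPF interpreter; returns the snapshot length (0 = reject)."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LD_W_ABS:
            if k + 4 > len(pkt):
                return 0  # an out-of-bounds load rejects the packet
            acc = struct.unpack_from("!I", pkt, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


def ipv4_packet(src, dst, payload_len=20):
    """Minimal 20-byte IPv4 header (only the addresses are filled in) plus payload."""
    header = bytes([0x45]) + bytes(11)
    header += ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return header + bytes(payload_len)


def matching_subrules(pkt):
    """Run every subrule's program against the packet; return all that accept."""
    return [name for name, prog in SUBRULES.items() if run_bpf(prog, pkt) != 0]
```

Every packet costs one full program run per subrule, and each program
reloads and retests fields on its own, which is the inefficiency noted
above.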

My hack was to add comma and callback operators to the pcap compiler,
implement a callback opcode in the BPF engine, and do the packet
inspection in userland. If I did it again, I might do it differently,
but it works.

My main point in all this, however, is that when you start digging, the
question of "which subrule" is somewhat more subtle than it might seem
at first.

-- 
Jefferson Ogata <Jefferson.Ogata () noaa gov>
NOAA Computer Incident Response Team (N-CIRT) <ncirt () noaa gov>
"Never try to retrieve anything from a bear."--National Park Service
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.

