tcpdump mailing list archives

Re: Wenfei: how does tcpdump filter packets?


From: Guy Harris <guy () alum mit edu>
Date: Tue, 29 Jan 2013 13:21:51 -0800


On Jan 29, 2013, at 12:54 PM, Wenfei Wu <wenfeiwu () cs wisc edu> wrote:

 When using tcpdump capture trace, we can add filter expressions (  in a
form of  primitive [and/or primitive] ).
 I want to know how the packets are parsed and matched to this filter
expression. Is there some intermediate data structure for the filter
expression?

Yes.

libpcap/WinPcap compiles filter expressions into machine code for an accumulator-based pseudo-machine; interpreters 
(simulators) for that machine exist in libpcap/WinPcap, in several UN*Xes in kernel-mode code (*BSD, OS X, AIX, Tru64 
UNIX, sufficiently recent Linux kernels), and in the WinPcap kernel driver.  The kernel-mode version means that the 
capture mechanism libpcap/WinPcap uses can ignore "uninteresting" packets before copying them into a kernel-mode buffer 
or into the user address space.
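
To give a rough idea of what such an interpreter does, here is a minimal sketch in Python. It is not the libpcap or kernel code: it supports only four instructions, the tuple layout and the names `run_filter` and `IP_ONLY` are made up for illustration, and jump targets are written as absolute instruction numbers (the way disassembled programs are usually printed) rather than the relative offsets the real encoding uses.

```python
import struct

def run_filter(program, packet):
    """Run a tiny BPF-like program; return how many bytes to keep (0 = drop)."""
    a = 0    # the accumulator
    pc = 0   # index of the next instruction
    while True:
        op, k, jt, jf = program[pc]
        pc += 1
        try:
            if op == "ldh":      # load 2-byte big-endian halfword at offset k
                a = struct.unpack_from("!H", packet, k)[0]
            elif op == "ldb":    # load 1 byte at offset k
                a = packet[k]
            elif op == "jeq":    # compare accumulator with k, then branch
                pc = jt if a == k else jf
            elif op == "ret":    # finish, returning the snapshot length
                return k
            else:
                raise ValueError("unsupported instruction: " + op)
        except (struct.error, IndexError):
            return 0             # a load past the end of the packet rejects it

# Hypothetical program: accept a frame only if its Ethernet type is IPv4.
IP_ONLY = [
    ("ldh", 12, None, None),     # (000) load the Ethernet type
    ("jeq", 0x0800, 2, 3),       # (001) IPv4? go to 2, else go to 3
    ("ret", 65535, None, None),  # (002) accept
    ("ret", 0, None, None),      # (003) reject
]

ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
arp_frame = bytes(12) + b"\x08\x06" + bytes(28)
print(run_filter(IP_ONLY, ipv4_frame), run_filter(IP_ONLY, arp_frame))
```

The point of the design is that the filter is just data handed to a small, bounded interpreter, so the same program can be run in user space or in the kernel.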

Is the filter used as it is parsed on each layer of the headers
or used once after the packet is parsed completely?

The filter is compiled into a single program in BPF pseudo-machine code; the program does all the checks at all layers.
For example, a filter such as "tcp port 80" compiles, for Ethernet packets, into a program such as (with comments
added by me):

(000) ldh      [12]                             # load Ethernet type - 2 byte "h"alfword at an offset of 12
(001) jeq      #0x86dd          jt 2    jf 8    # if equal to 0x86dd for IPv6, go to 2, else go to 8
(002) ldb      [20]                             # load IPv6 "next header" value - 1 "b"yte at an offset of 20
(003) jeq      #0x6             jt 4    jf 19   # if equal to 6 for TCP, go to 4, else go to 19
(004) ldh      [54]                             # load TCP source port value - 2 byte halfword at an offset of 54
(005) jeq      #0x50            jt 18   jf 6    # if equal to 0x50 = 80, go to 18, else go to 6
(006) ldh      [56]                             # load TCP dest port value - 2 byte halfword at an offset of 56
(007) jeq      #0x50            jt 18   jf 19   # if equal to 0x50 = 80, go to 18, else go to 19

                                                # we got here from (001), so the accumulator has the Ethernet type
(008) jeq      #0x800           jt 9    jf 19   # if equal to 0x0800 for IPv4, go to 9, else go to 19
(009) ldb      [23]                             # load IPv4 protocol value - 1 byte at an offset of 23
(010) jeq      #0x6             jt 11   jf 19   # if equal to 6 for TCP, go to 11, else go to 19
(011) ldh      [20]                             # load fragment offset and flags from IPv4 header (2 bytes at 20)
(012) jset     #0x1fff          jt 19   jf 13   # if fragment offset is non-zero, go to 19, else go to 13
(013) ldxb     4*([14]&0xf)                     # get offset of TCP header, based on IPv4 header length
(014) ldh      [x + 14]                         # load TCP source port value
(015) jeq      #0x50            jt 18   jf 16   # if equal to 0x50 = 80, go to 18, else go to 16
(016) ldh      [x + 16]                         # load TCP destination port value
(017) jeq      #0x50            jt 18   jf 19   # if equal to 0x50 = 80, go to 18, else go to 19

(018) ret      #65535                           # success - return 65535, so we get up to 65535 bytes of packet

(019) ret      #0                               # failure - return 0, meaning "ignore this packet"
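
For readers who find the listing easier to follow as ordinary code, the same checks can be sketched as a hand translation into Python. This is only an illustration of the logic, assuming an Ethernet frame held in a `bytes` object; the names `matches_tcp_port_80` and `make_ipv4_tcp_frame` are invented here, and returning False on a too-short packet mirrors the way an out-of-range load makes a BPF program reject the packet.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
IPPROTO_TCP = 6

def matches_tcp_port_80(pkt, port=80):
    """Return True if the frame would reach instruction (018) in the listing."""
    def half(off):
        # 2-byte big-endian load, like "ldh [off]"
        return struct.unpack_from("!H", pkt, off)[0]
    try:
        ethertype = half(12)                                 # (000)
        if ethertype == ETHERTYPE_IPV6:                      # (001)
            if pkt[20] != IPPROTO_TCP:                       # (002)-(003)
                return False
            return half(54) == port or half(56) == port      # (004)-(007)
        if ethertype != ETHERTYPE_IPV4:                      # (008)
            return False
        if pkt[23] != IPPROTO_TCP:                           # (009)-(010)
            return False
        if half(20) & 0x1FFF:                                # (011)-(012)
            return False                                     # not the first fragment
        x = 4 * (pkt[14] & 0x0F)                             # (013) IPv4 header length
        return half(x + 14) == port or half(x + 16) == port  # (014)-(017)
    except (struct.error, IndexError):
        return False

def make_ipv4_tcp_frame(sport, dport, frag_field=0):
    """Build a minimal Ethernet + IPv4 + TCP frame for testing (checksums left 0)."""
    eth = bytes(12) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, frag_field,
                     64, IPPROTO_TCP, 0, bytes(4), bytes(4))
    tcp = struct.pack("!HH", sport, dport) + bytes(16)
    return eth + ip + tcp

print(matches_tcp_port_80(make_ipv4_tcp_frame(12345, 80)))
print(matches_tcp_port_80(make_ipv4_tcp_frame(12345, 443)))
```

Note that nothing here parses the packet layer by layer into a structure first; like the BPF program, it just loads values at computed offsets and compares them.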

This is from the OS X 10.8 tcpdump and libpcap; newer versions of libpcap generate IPv6 code that also checks for
fragments other than the first fragment, just as is done for IPv4. Only the first fragment has the TCP header, so
you can't check the TCP ports in the later fragments.
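
As a small illustration of the IPv4 check at (011)-(012), the 2-byte field loaded there holds 3 flag bits followed by a 13-bit fragment offset counted in 8-byte units, which is why the mask is 0x1fff. The sketch below (the function name `describe_ipv4_fragment` is made up) decodes that field and reports whether the fragment can contain the start of the TCP header.

```python
def describe_ipv4_fragment(field):
    """Decode the 16-bit IPv4 flags/fragment-offset field.

    Returns (more_fragments, offset_in_bytes, has_transport_header).
    """
    more_fragments = bool(field & 0x2000)   # "more fragments" flag
    offset_bytes = (field & 0x1FFF) * 8     # offset is stored in 8-byte units
    has_transport_header = offset_bytes == 0
    return more_fragments, offset_bytes, has_transport_header

for value in (0x0000, 0x2000, 0x20B9, 0x00B9):
    print(hex(value), describe_ipv4_fragment(value))
```

An unfragmented packet and the first fragment of a fragmented one both have an offset of zero, so both pass the `jset #0x1fff` test; any later fragment has a non-zero offset and is rejected.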

Is there some material about this?

Here's the paper on the Berkeley Packet Filter (BPF) mechanism, as used in *BSD and OS X (and, perhaps with some 
changes, in AIX and, I think, Solaris 11), which includes the machine-code interpreter:

        http://www.tcpdump.org/papers/bpf-usenix93.pdf

A lot of that only applies to *BSD and OS X, and some might also apply to AIX and/or Solaris 11.  The BPF filter 
language, however, applies to all of them, as well as to Tru64 UNIX, Linux (in kernel versions that have the "socket 
filter" mechanism), and WinPcap.
_______________________________________________
tcpdump-workers mailing list
tcpdump-workers () lists tcpdump org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers

