tcpdump mailing list archives

Re: Need help to determine capture file format.


From: Guy Harris <guy () alum mit edu>
Date: Sat, 6 Sep 2003 03:44:36 -0700

On Fri, Sep 05, 2003 at 06:06:06AM -0400, Mark Bednarczyk wrote:
      I'm working on a Java protocol analyzer
(http://jnetstream.sourceforge.net). I'm trying to understand the libpcap file
format for capture files. I cannot use the libpcap library to read the file
because I'm writing the file definitions in a proprietary language
(http://sourceforge.net/docman/index.php?group_id=89169).

Note that the current file formats supported by libpcap are not the only
formats that will ever be used.  At some point, we will probably
implement a new capture file format.

I've looked at the savefile.c, pcap.h and pcap-int.h source code to
determine the format and I'm having a little trouble with all of the
exceptions and some of the fields in the packet header of the capture file.

I understand and can read properly the initial file header in all cases as
far as I can tell. I also read the first 3 fields of the packet header, but
the rest of the fields are a mystery as to how really they are used.

struct pcap_sf_patched_pkthdr {
    struct pcap_timeval ts;   /* time stamp */
    bpf_u_int32 caplen;               /* length of portion present */
    bpf_u_int32 len;          /* length this packet (off wire) */
    int               index;
    unsigned short protocol;
    unsigned char pkt_type;
};

Note that this is *NOT* the format that libpcap from tcpdump.org uses to
write capture files.  It's a format that some versions of libpcap on
some systems use, but not one we ever used; libpcap can read that format
(at least in the versions that didn't use the standard libpcap magic
number for that format, and that used that header format with the new
magic number rather than using a header format with some debugging gunk
in it).

The format that tcpdump.org's (and LBL's original) libpcap writes, and
that most versions of tcpdump therefore write (and that is also the
default format for Ethereal, which has its own library for reading and
writing capture files), is the one in "pcap_sf_pkthdr", not the one
in "pcap_sf_patched_pkthdr".

Note also that our libpcap ignores those fields.

What are the "index", "protocol" and "pkt_type" fields, and when are they
used? When I dump these values from various files, I can't correlate
their meanings.

"index" is the index, in the list of interfaces on the machine on which
the packets were captured, of the interface on which the packet arrived.
It's completely meaningless if the packets weren't captured on the
machine on which you're reading the capture file.

"protocol" is either an Ethernet type value for the protocol running
atop the link layer, or a special value (whose Linux #define name I
don't remember) if the packet was an Ethernet packet with a length field
and 802.2 LLC header, or another special value (whose Linux #define name
I don't remember) if it was an Ethernet packet with a length field and
an IPX payload with no 802.2 LLC header.

"pkt_type" is one of the PACKET_ #defines, such as PACKET_HOST, in the
<linux/if_packet.h> or <netpacket/packet.h> header file on those Linux
systems that have one or the other or both of those header files.
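A minimal sketch of decoding "pkt_type" for display, assuming the numeric
values match the PACKET_ #defines in Linux's <linux/if_packet.h>.  The
enum names and the function pkt_type_name() are hypothetical, local to
this example, so that it builds on systems without that header:

```c
/* Local copies of the Linux PACKET_* values (assumed, see above). */
enum {
    SK_PACKET_HOST      = 0,  /* addressed to the capturing host */
    SK_PACKET_BROADCAST = 1,  /* link-layer broadcast */
    SK_PACKET_MULTICAST = 2,  /* link-layer multicast */
    SK_PACKET_OTHERHOST = 3,  /* addressed to some other host */
    SK_PACKET_OUTGOING  = 4   /* sent by the capturing host */
};

/* Map a pkt_type byte from the patched header to a printable name. */
const char *pkt_type_name(unsigned char pkt_type)
{
    switch (pkt_type) {
    case SK_PACKET_HOST:      return "PACKET_HOST";
    case SK_PACKET_BROADCAST: return "PACKET_BROADCAST";
    case SK_PACKET_MULTICAST: return "PACKET_MULTICAST";
    case SK_PACKET_OTHERHOST: return "PACKET_OTHERHOST";
    case SK_PACKET_OUTGOING:  return "PACKET_OUTGOING";
    default:                  return "unknown";
    }
}
```

This only applies to files that use the pcap_sf_patched_pkthdr header;
the standard header has no such field.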

The first dump below has stable values for index, protocol and type. So if this
were true across all capture files, I could probably figure this out. Below
are some printouts of the "packet headers" for 2 capture files. They are
using the same header structure. Both capture files say they are version 2.4
(but with different MAGIC numbers). The first file parses fine, the second
does not.

(File header in capture_file1.cap:)
PcapLittle2dot4:
PcapLittle2dot4:    magic =  0xa1b2cd34

That's PATCHED_TCPDUMP_MAGIC, as defined in "savefile.c".  That means
that (unless you were unlucky enough to have a capture from one of the
non-tcpdump.org and non-LBL versions of libpcap that used that magic
number but had a *different* header, with some additional debugging
junk) it has the pcap_sf_patched_pkthdr header.

Then in a second dump file, things go wrong starting at packet number 2.
Looks like the packet header size has changed:

(File header in atm_capture1.cap:)
PcapLittle2dot4:
PcapLittle2dot4:    magic =  0xa1b2c3d4

That's TCPDUMP_MAGIC, as defined in "savefile.c".  That means that
(unless you were unlucky enough to have a capture from one of the
non-tcpdump.org and non-LBL versions of libpcap that used
pcap_sf_patched_pkthdr as the header but used the standard 0xa1b2c3d4
magic number) it's a standard libpcap capture, using "pcap_sf_pkthdr" as
the per-packet header.

So, yes, the header size *is* different.
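As a rough sketch of how a reader could pick the per-packet header size
from the magic number: the two magic values are the ones quoted above,
while the function name pcap_pkthdr_len() and the enum names are
hypothetical.  The sizes assume 32-bit time stamp fields and the usual
4-byte struct alignment (which pads the patched header from 23 to 24
bytes), and the magic is assumed to have been converted to the reader's
byte order already:

```c
#include <stddef.h>
#include <stdint.h>

#define TCPDUMP_MAGIC         0xa1b2c3d4u
#define PATCHED_TCPDUMP_MAGIC 0xa1b2cd34u

enum {
    /* ts.tv_sec (4) + ts.tv_usec (4) + caplen (4) + len (4) */
    SF_PKTHDR_LEN = 16,
    /* the above + index (4) + protocol (2) + pkt_type (1) + padding (1) */
    SF_PATCHED_PKTHDR_LEN = 24
};

/*
 * Return the on-disk per-packet header length for a given magic
 * number, or 0 if the magic number is not one of the two above.
 */
size_t pcap_pkthdr_len(uint32_t magic)
{
    switch (magic) {
    case TCPDUMP_MAGIC:
        return SF_PKTHDR_LEN;
    case PATCHED_TCPDUMP_MAGIC:
        return SF_PATCHED_PKTHDR_LEN;
    default:
        return 0;
    }
}
```

This does not cover the unlucky cases mentioned above, where a
non-standard libpcap wrote a different header under one of these magic
numbers; those cannot be told apart from the magic number alone.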

Here is how I define both "file header" and "packet headers" in my language
(NPL).
The language is simple enough that you should understand it easily with this
help. (hex = hexadecimal output, little = LITTLE ENDIAN ENCODING,

Libpcap captures don't always use little-endian encoding.  They use the
encoding of the machine on which the capture was written, which isn't
necessarily a little-endian machine.

The magic number is written in host byte order, and can therefore be
used to determine the byte order of fields in the rest of the file
header, as well as in the per-packet header.  (Obviously, the byte order
of the fields in the actual packet data is the byte order they had on
the wire, *not* the byte order of the host on which the capture was
done.)
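A minimal sketch of that byte-order check, reading the first four bytes
of the file and trying both interpretations.  The helper names load32()
and pcap_magic_byte_order() are hypothetical, not libpcap functions:

```c
#include <stdint.h>

#define TCPDUMP_MAGIC         0xa1b2c3d4u
#define PATCHED_TCPDUMP_MAGIC 0xa1b2cd34u

/* Assemble four bytes into a 32-bit value in the given byte order. */
static uint32_t load32(const unsigned char *p, int little_endian)
{
    if (little_endian)
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Return 1 if the file was written on a little-endian machine, 0 if
 * it was written on a big-endian machine, and -1 if the first four
 * bytes are neither of the two known magic numbers.
 */
int pcap_magic_byte_order(const unsigned char magic[4])
{
    uint32_t le = load32(magic, 1);
    uint32_t be = load32(magic, 0);

    if (le == TCPDUMP_MAGIC || le == PATCHED_TCPDUMP_MAGIC)
        return 1;
    if (be == TCPDUMP_MAGIC || be == PATCHED_TCPDUMP_MAGIC)
        return 0;
    return -1;
}
```

Once the order is known, the same load32() call can decode every other
field in the file header and the per-packet headers, whatever the byte
order of the machine doing the reading.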

/**
 * Lib PCAP ver 2.4 packet file header.
 */
header PcapLittle2dot4PacketHeader {

  field little int secs;
  field little int nanos;
  field little int snaplen;

  field little int length;

  field little signed int index;
  field little short protocol;
  field little byte type;
  field little byte reserved;
};

That's the "patched tcpdump" packet file header, not the standard
libpcap 2.4 header.  The standard header lacks "index", "protocol",
"type", and "reserved".

BTW, note that the time stamp is in seconds and *microseconds*, not
seconds and nanoseconds (unless you're unlucky enough to be reading a
capture from IBM's tcpdump on AIX, where they made the timestamp be
seconds and nanoseconds, and used a different set of link-layer type
values, but didn't bother to change the magic number).
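A small sketch of handling the time stamp as seconds plus microseconds.
Both function names are hypothetical, and the plausibility check is only
a heuristic: a sub-second value of 1,000,000 or more cannot be
microseconds, so it hints at a nanosecond-style file or a byte-order
mistake, but a small value proves nothing either way:

```c
#include <stdint.h>

/* Combine the two 32-bit time stamp fields into total microseconds. */
uint64_t pcap_ts_to_usec(uint32_t tv_sec, uint32_t tv_usec)
{
    return (uint64_t)tv_sec * 1000000u + tv_usec;
}

/*
 * Return 1 if the sub-second field could be a microsecond count,
 * 0 if it is too large to be one.
 */
int pcap_ts_usec_plausible(uint32_t tv_usec)
{
    return tv_usec < 1000000u;
}
```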
-
This is the TCPDUMP workers list. It is archived at
http://www.tcpdump.org/lists/workers/index.html
To unsubscribe use mailto:tcpdump-workers-request () tcpdump org?body=unsubscribe

