tcpdump mailing list archives

Re: Fixing damaged cap file


From: Alexander Dupuy <alex.dupuy () mac com>
Date: Wed, 04 Apr 2007 13:41:49 -0400

Petr Novák wrote:


Please tcpdump, do not end, but run over damaged packets (move from
current position in cap file byte by byte) from cap file until You find
good header and packet,


Guy Harris replied:

The code *could* skip forward until it finds something that appears to have a caplen <= len <= 65535, but there's no guarantee that it would find a good header.

It could also check for a time stamp that looks "reasonable", for some definition of "reasonable", although there are some cases where, unfortunately, time stamps go backward in time.

Looking for a good packet is trickier, as it involves a lot of link-layer-type-dependent work.

If the file is damaged, the damage might occur in what appear to be "good" packets. Consider, for example, a common form of damage - FTPing a file between Windows and UN*X in ASCII mode - which can insert or remove bytes in packet data.

Furthermore, it might skip over a lot of packets, meaning that the data you get from your capture file might be incomplete.

In other words:

1) attempting to find a good packet by skipping file data is not easy, and is not guaranteed to find a real packet;

2) if you do that, you lose a bunch of data from the file, so you're really not successfully reading the capture, you're getting a partial sample of the packets in the capture.


This is true in general, but some common forms of corruption in tcpdump capture files can be recovered from, at least in most cases, without skipping file data. The most common one is caused by concatenating multiple pcap files using cat (or another non-pcap-aware file-level utility, e.g. dd). When an embedded file header is found where a packet header was expected, it causes a fatal "bogus savefile header" error. However, this can be identified by the file header "magic" and recovered from cleanly. Checking for the byte-swapped magic as well makes recovery possible even when the concatenated files were written with different endianness.
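As a rough illustration of the magic check described above, here is a minimal sketch that is separate from the attached patch. The helper names (classify_magic, swap32, enum magic_kind) are hypothetical; the two magic values are the standard and "Kuznetzov" savefile magic numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TCPDUMP_MAGIC           0xa1b2c3d4u
#define KUZNETZOV_TCPDUMP_MAGIC 0xa1b2cd34u

/* Hypothetical result type for this sketch. */
enum magic_kind { MAGIC_NONE = 0, MAGIC_NATIVE = 1, MAGIC_SWAPPED = 2 };

static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Inspect the first four bytes of what should have been a packet record
 * header and report whether they are really a savefile magic number,
 * in either byte order.
 */
static enum magic_kind classify_magic(const unsigned char *rec)
{
    uint32_t word;

    assert(rec != NULL);
    memcpy(&word, rec, sizeof word);    /* read in host byte order */

    if (word == TCPDUMP_MAGIC || word == KUZNETZOV_TCPDUMP_MAGIC)
        return MAGIC_NATIVE;            /* same byte order as this host */

    word = swap32(word);
    if (word == TCPDUMP_MAGIC || word == KUZNETZOV_TCPDUMP_MAGIC)
        return MAGIC_SWAPPED;           /* written on an opposite-endian host */

    return MAGIC_NONE;                  /* not an embedded file header */
}
```

A reader that gets MAGIC_NATIVE or MAGIC_SWAPPED at a record boundary could skip the rest of the embedded file header, update its byte-swap state, and carry on with the next record.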

Another fairly common form of breakage is for a capture file to have "gaps" due to failed writes (typically because of a full filesystem) - these gaps can cause a file header (from concatenation) or a packet header (when a capture application doesn't check for write errors, and writes new data once there is space) to appear within what should have been the packet data. In this case, rather than skipping forward looking for the next packet header, it's more effective to search backward in the existing buffered data looking for a savefile packet header that was treated as packet data as a result of such a gap. This search can look for the MSB of the current packet timestamp (allowing for out-of-order packets in almost all cases) and confirm validity with caplen <= len <= 65535. While it's possible to find something that looks like a valid savefile packet header (but isn't), it's very unlikely (except for a corrupted savefile containing traffic with savefile data). Even if you do mistakenly recover a packet header, it's even more improbable for there to be another mistaken packet header following the packet data of the mistaken one, and recovery will stop after only a few "garbage" packets are returned (by disabling recovery immediately after a recovered packet). Furthermore, by only searching backward through the previous packet's data, the overhead in cases of unrecoverable corruption is minimal (you don't spend minutes or hours searching in vain through a multi-gigabyte savefile full of NULs due to filesystem errors).
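The backward search can be sketched roughly as follows. This is a simplified illustration, not the code from the attached patch. It assumes a little-endian savefile with the standard 16-byte record header, and it only considers headers that lie entirely inside the previous packet's buffer (the patch also handles a header that straddles the end of the buffer). The names le32, plausible_header and find_header_backward are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REC_HDR_LEN 16u      /* ts_sec, ts_usec, caplen, len: 4 bytes each */
#define MAX_PKT_LEN 65535u

/* Decode a 32-bit little-endian value (assumption: little-endian savefile). */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 if the 16 bytes at p look like a record header that could
 * follow a packet whose timestamp was prev_sec. */
static int plausible_header(const unsigned char *p, uint32_t prev_sec)
{
    uint32_t sec = le32(p), usec = le32(p + 4);
    uint32_t caplen = le32(p + 8), len = le32(p + 12);
    unsigned prev_msb = prev_sec >> 24, msb = sec >> 24;

    /* Compare only the top timestamp byte (same value, or one more), so
     * slightly out-of-order packets still match. */
    if (msb != prev_msb && msb != ((prev_msb + 1) & 0xffu))
        return 0;
    return usec < 1000000u && caplen <= len && len <= MAX_PKT_LEN;
}

/* Scan backward through the previous packet's data; return the offset of
 * the last plausible record header, or -1 if none is found. */
static long find_header_backward(const unsigned char *data, size_t n,
                                 uint32_t prev_sec)
{
    assert(data != NULL);
    if (n < REC_HDR_LEN)
        return -1;
    for (size_t off = n - REC_HDR_LEN + 1; off-- > 0; ) {
        if (plausible_header(data + off, prev_sec))
            return (long)off;
    }
    return -1;
}
```

Because the scan is bounded by the size of one packet buffer, a file full of NULs costs at most one short pass per failed read.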

While this code was not "easy" to write, I have written it and have been using it successfully for several years; while it is not "guaranteed" to find a real packet, it recovers effectively and reliably from common causes of corruption, with minimal overhead and only a small probability of mistaken recovery (and applications that really care can treat the second case, of "gaps," as fatal); and by not skipping over data, it returns all the data in the capture, not just a "partial sample." It doesn't handle all possible causes of corruption (e.g. ASCII FTP transfer) but does cover the most common ones.

Attached is a patch for libpcap (based on CVS head) that will recover from these common corruption cases (if enabled during configure, with --enable-corrupt-savefile-recovery). If enabled, pcap_next_ex() returns -3 or -4 if libpcap has detected corruption and recovered from it (for -4, the previously-returned packet may have bogus packet data - and if so, would have had that bogus data even without this patch, due to "gaps"). Both pcap_loop() and pcap_dispatch() return the number of packets already read and/or these new error codes (on the next call, much like current breakloop handling). An application that is willing to continue processing can check for these error codes and recover from common forms of savefile corruption. I also include a patch for tcpdump (also based on CVS head) that adds a -k option to keep going after a recoverable error has been detected.
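To illustrate how an application might react to these codes, here is a minimal sketch of the decision logic only. It does not call libpcap. The -3 and -4 values exist only when libpcap is built with this patch and --enable-corrupt-savefile-recovery, and the enum names, action_for() and the keep_going flag (standing in for something like tcpdump's proposed -k option) are hypothetical.

```c
#include <assert.h>

/* Status values for pcap_next_ex() as described in this message;
 * -3 and -4 are returned only by a libpcap built with the patch. */
enum {
    NEXT_OK              =  1,
    NEXT_TIMEOUT         =  0,
    NEXT_ERROR           = -1,
    NEXT_EOF_OR_BREAK    = -2,
    NEXT_RECOVERED_CLEAN = -3,  /* embedded file header was skipped */
    NEXT_RECOVERED_GAP   = -4   /* previous packet may hold garbage data */
};

enum action { ACT_PROCESS, ACT_RETRY, ACT_WARN_AND_CONTINUE, ACT_STOP };

/* Decide what a read loop should do with a given status.  keep_going
 * must be 0 or 1 and says whether the application is willing to
 * continue after a recovered corruption. */
static enum action action_for(int status, int keep_going)
{
    assert(keep_going == 0 || keep_going == 1);

    switch (status) {
    case NEXT_OK:
        return ACT_PROCESS;
    case NEXT_TIMEOUT:
        return ACT_RETRY;
    case NEXT_RECOVERED_CLEAN:
    case NEXT_RECOVERED_GAP:
        /* Recovered: log the error text, then keep reading if allowed. */
        return keep_going ? ACT_WARN_AND_CONTINUE : ACT_STOP;
    default:
        /* -1, -2, and anything unknown: stop, as existing code does. */
        return ACT_STOP;
    }
}
```

In a real read loop, ACT_WARN_AND_CONTINUE would typically mean printing the message from pcap_geterr() and calling pcap_next_ex() again; anything that maps to ACT_STOP keeps the pre-patch behavior.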

There are compatibility issues with the introduction of these new error codes, as existing code won't expect them (one reason why the functionality must be explicitly enabled in configure). Code using pcap_next() doesn't get error codes (just packet or no packet), so will treat these as any other error. Older code that performs error testing with ret < 0 will treat -3/-4 much like the current -1 (error) and -2 (breakloop abort). Newer code that checks explicitly for -1 and -2 (like tcpdump) will generally abort on the new error codes (although it may treat them like a breakloop and fail to display the error). For code that uses pcap_loop() or pcap_dispatch() the read loop is aborted, and only if called again will post-recovery packets be passed to the callback function. This essentially preserves existing application behavior by default. For code using pcap_next_ex(), the packet data and header pointers are updated just in case the new codes are not recognized as errors, but as pcap_next_ex() has always returned -2 for EOF indication (as well as breakloop), I'd be surprised if any existing code continued processing on a negative return value from pcap_next_ex(). Code using pcap_next_ex() might fail to display the error, and in a worst case, would continue processing after the recovery (with the first post-recovery packet returned twice). While this doesn't preserve existing application behavior in all possible cases, it does generally do so, and this seems reasonable for a conditionally compiled feature.

At an implementation level of compatibility, virtually all recovery processing code changes are conditionally compiled - the only exceptions are the refactoring of the header swapping into a static function in savefile.c, and pcap_next_ex() handling of the new error codes from pcap_offline_read(). If corrupt savefile recovery is enabled, two unused fields in struct pcap (bp and cc) are used to track savefile recovery markers (so no structure size changes occur) and this is the only additional processing that occurs for non-corrupt savefiles (the overhead is minimal: pcap_open_offline() has one more assignment per call, and sf_next_packet() has two per call). Only in the event of a corrupted file is there significant additional processing to recover. At a purely source level, I also "renovated" the out-of-date and unused SFERR_XX return codes for sf_next_packet() to reflect the actual (and new) codes returned by this function, and modified the savefile.c source to use them (updating the out-of-date comments in the file accordingly). The patches also include updates to the pcap and tcpdump man pages reflecting the changes, and also documenting the current ambiguity of the -2 return code for pcap_next_ex() - historically used for EOF, but also used by pcap_breakloop().

If there are other improvements or changes to this that would be required for its acceptance and incorporation into the tcpdump.org CVS repository, I would be willing to make them if they are not excessive (I've spent a bit too much time cleaning up this private code for public use already).

Petr - I don't know if these patches will address the types of file corruption you are dealing with, although I suspect they will. However, you certainly will not be able to apply them to your 0.8.3 version of libpcap/tcpdump. You will need to download the "current tar files" from tcpdump.org (http://www.tcpdump.org/daily/tcpdump-current.tar.gz and libpcap-current.tar.gz) - these correspond to the CVS head release and the attached patches should apply cleanly to that version. If you then configure --enable-corrupt-savefile-recovery and build libpcap and tcpdump, you should have a tcpdump with a -k option that does pretty much what you want; you can use this with -r and -w to "clean up" capture files which you can then process with your existing tcpdump, if you don't want to use the new one due to changes in output formatting or whatever.

@alex
--
mailto:alex.dupuy () mac com


Index: config.h.in
===================================================================
RCS file: /tcpdump/master/libpcap/config.h.in,v
retrieving revision 1.26
diff -u -r1.26 config.h.in
--- config.h.in 20 Dec 2006 03:30:51 -0000      1.26
+++ config.h.in 4 Apr 2007 14:32:40 -0000
@@ -10,6 +10,9 @@
 /* Enable optimizer debugging */
 #undef BDEBUG
 
+/* Enable corrupt savefile recovery */
+#undef CORRUPT_SAVEFILE_RECOVERY
+
 /* define if you have the DAG API */
 #undef HAVE_DAG_API
 
Index: configure
===================================================================
RCS file: /tcpdump/master/libpcap/configure,v
retrieving revision 1.75
diff -u -r1.75 configure
--- configure   8 Feb 2007 06:03:03 -0000       1.75
+++ configure   4 Apr 2007 14:32:42 -0000
@@ -850,6 +850,7 @@
   --enable-ipv6           build IPv6-capable version
   --enable-optimizer-dbg  build optimizer debugging code
   --enable-yydebug        build parser debugging code
+  --enable-corrupt-savefile-recovery  build version capable of recovering from some corrupt savefiles
 
 Optional Packages:
   --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
@@ -5657,6 +5658,23 @@
 echo "$as_me:$LINENO: result: ${enable_yydebug-no}" >&5
 echo "${ECHO_T}${enable_yydebug-no}" >&6
 
+echo "$as_me:$LINENO: checking whether to enable corrupt savefile recovery code" >&5
+echo $ECHO_N "checking whether to enable corrupt savefile recovery code... $ECHO_C" >&6
+# Check whether --enable-corrupt-savefile-recovery or --disable-corrupt-savefile-recovery was given.
+if test "${enable_corrupt_savefile_recovery+set}" = set; then
+  enableval="$enable_corrupt_savefile_recovery"
+
+fi;
+if test "$enable_corrupt_savefile_recovery" = "yes"; then
+
+cat >>confdefs.h <<\_ACEOF
+#define CORRUPT_SAVEFILE_RECOVERY 1
+_ACEOF
+
+fi
+echo "$as_me:$LINENO: result: ${enable_corrupt_savefile_recovery-no}" >&5
+echo "${ECHO_T}${enable_corrupt_savefile_recovery-no}" >&6
+
 case "$V_PCAP" in
 
 dlpi)
Index: configure.in
===================================================================
RCS file: /tcpdump/master/libpcap/configure.in,v
retrieving revision 1.134
diff -u -r1.134 configure.in
--- configure.in        8 Feb 2007 06:02:42 -0000       1.134
+++ configure.in        4 Apr 2007 14:32:42 -0000
@@ -323,6 +323,13 @@
 fi
 AC_MSG_RESULT(${enable_yydebug-no})
 
+AC_MSG_CHECKING(whether to enable corrupt savefile recovery code)
+AC_ARG_ENABLE(corrupt-savefile-recovery, [  --enable-corrupt-savefile-recovery  build version capable of recovering from some corrupt savefiles])
+if test "$enable_corrupt_savefile_recovery" = "yes"; then
+       AC_DEFINE(CORRUPT_SAVEFILE_RECOVERY,1,[Enable corrupt savefile recovery])
+fi
+AC_MSG_RESULT(${enable_corrupt_savefile_recovery-no})
+
 case "$V_PCAP" in
 
 dlpi)
Index: pcap-int.h
===================================================================
RCS file: /tcpdump/master/libpcap/pcap-int.h,v
retrieving revision 1.80
diff -u -r1.80 pcap-int.h
--- pcap-int.h  11 Mar 2007 21:44:12 -0000      1.80
+++ pcap-int.h  4 Apr 2007 14:32:42 -0000
@@ -170,8 +170,8 @@
         */
        int bufsize;
        u_char *buffer;
-       u_char *bp;
-       int cc;
+       u_char *bp;             /* used only for CORRUPT_SAVEFILE_RECOVERY */
+       int cc;                 /* used only for CORRUPT_SAVEFILE_RECOVERY */
 
        /*
         * Place holder for pcap_next().
Index: pcap.3
===================================================================
RCS file: /tcpdump/master/libpcap/pcap.3,v
retrieving revision 1.74
diff -u -r1.74 pcap.3
--- pcap.3      12 Oct 2006 07:59:54 -0000      1.74
+++ pcap.3      4 Apr 2007 14:32:42 -0000
@@ -504,6 +504,21 @@
 checking for a return value < 0.
 .ft R
 .PP
+If corrupt ``savefile'' recovery has been enabled, two other errors may
+be returned when reading from a savefile: \-3, if a file header was
+found instead of a packet header (caused by appending multiple
+savefiles); \-4, if the previous packet data was incomplete (caused by
+lack of file space). In the second case, data for the previous packet
+contained ``garbage'' data from the next packet header.  In either
+case, further calls to
+.B pcap_dispatch()
+or
+.B pcap_loop()
+will return additional packets after recovery from the savefile
+corruption.  Not all corruption can be recovered from, and there is a
+very small possibility that the recovery will be incorrect (e.g. if
+packet data contained savefile data being transferred on the network).
+.PP
 .BR NOTE :
 when reading a live capture,
 .B pcap_dispatch()
@@ -548,6 +563,10 @@
 make sure that you explicitly check for \-1 and \-2, rather than just
 checking for a return value < 0.
 .ft R
+If corrupt ``savefile'' recovery has been enabled, \-3 or \-4 may be
+returned as noted for
+.B pcap_dispatch()
+above, and applications should check for those values as well.
 .PP
 .B pcap_next()
 reads the next packet (by calling
@@ -584,7 +603,21 @@
 .TP
 \-2
 packets are being read from a ``savefile'', and there are no more
-packets to read from the savefile.
+packets to read from the savefile;
+.I or
+.B pcap_breakloop()
+was called
+.TP
+\-3
+packets are being read from a corrupt ``savefile'' (file header was
+found instead of a packet header), but further calls will return
+recovered packets
+.TP
+\-4
+packets are being read from a corrupt ``savefile'' (packet data was
+incomplete); data for the previous packet contained ``garbage'' data
+from the next packet header but further calls will return recovered
+packets
 .RE
 .PP
 If the packet was read without problems, the pointer pointed to by the
Index: pcap.c
===================================================================
RCS file: /tcpdump/master/libpcap/pcap.c,v
retrieving revision 1.104
diff -u -r1.104 pcap.c
--- pcap.c      20 Dec 2006 03:30:32 -0000      1.104
+++ pcap.c      4 Apr 2007 14:32:42 -0000
@@ -180,18 +180,28 @@
                 * Return codes for pcap_offline_read() are:
                 *   -  0: EOF
                 *   - -1: error
+                *   - -2: pcap_breakloop() called
+                *   - -3/-4: recovered from corrupt savefile
                 *   - >1: OK
                 * The first one ('0') conflicts with the return code of
                 * 0 from pcap_read() meaning "no packets arrived before
                 * the timeout expired", so we map it to -2 so you can
                 * distinguish between an EOF from a savefile and a
                 * "no packets arrived before the timeout expired, try
-                * again" from a live capture.
+                * again" from a live capture. While these return codes
+                * don't distinguish between an EOF from a savefile and
+                * a call to pcap_breakloop, code calling pcap_breakloop
+                * can set other flags, or feof() can be used to test
+                * for true EOF.  In corrupt savefile recovery, set data
+                * (and header, already set above) in case application
+                * doesn't recognize -3/-4 as error codes.
                 */
                if (status == 0)
                        return (-2);
-               else
-                       return (status);
+               else if (status == -3 || status == -4)
+                       *pkt_data = p->buffer;
+
+               return (status);
        }
 
        /*
Index: savefile.c
===================================================================
RCS file: /tcpdump/master/libpcap/savefile.c,v
retrieving revision 1.152
diff -u -r1.152 savefile.c
--- savefile.c  3 Apr 2007 07:18:27 -0000       1.152
+++ savefile.c  4 Apr 2007 14:32:43 -0000
@@ -96,10 +96,11 @@
 #define        SWAPSHORT(y) \
        ( (((y)&0xff)<<8) | ((u_short)((y)&0xff00)>>8) )
 
-#define SFERR_TRUNC            1
-#define SFERR_BADVERSION       2
-#define SFERR_BADF             3
-#define SFERR_EOF              4 /* not really an error, just a status */
+#define SFERR_EOF              1
+#define SFERR_FATAL            -1
+#define SFERR_ABORT            -2
+#define SFERR_RECOVERCLEAN     -3
+#define SFERR_RECOVER          -4
 
 /*
  * Setting O_BINARY on DOS/Windows is a bit tricky
@@ -861,6 +862,36 @@
                free(p->sf.base);
 }
 
+static void
+swap_pkthdr(struct pcap_pkthdr *hdr, struct pcap_sf_patched_pkthdr *sf_hdr, int swap)
+{
+       if (swap) {
+               /* these were written in opposite byte order */
+               hdr->caplen = SWAPLONG(sf_hdr->caplen);
+               hdr->len = SWAPLONG(sf_hdr->len);
+               hdr->ts.tv_sec = SWAPLONG(sf_hdr->ts.tv_sec);
+               hdr->ts.tv_usec = SWAPLONG(sf_hdr->ts.tv_usec);
+       } else {
+               hdr->caplen = sf_hdr->caplen;
+               hdr->len = sf_hdr->len;
+               hdr->ts.tv_sec = sf_hdr->ts.tv_sec;
+               hdr->ts.tv_usec = sf_hdr->ts.tv_usec;
+       }
+}
+
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+static int
+validhdr(struct pcap_pkthdr *hdr, int minlen, int maxlen)
+{
+       return (hdr->ts.tv_usec >= 0 &&
+               hdr->ts.tv_usec < 1000000 &&
+               hdr->len <= 65536 &&
+               hdr->caplen > minlen &&
+               hdr->caplen <= hdr->len &&
+               hdr->caplen <= maxlen);
+}
+#endif /* CORRUPT_SAVEFILE_RECOVERY */
+
 pcap_t *
 pcap_open_offline(const char *fname, char *errbuf)
 {
@@ -1026,6 +1057,9 @@
        p->buffer = p->sf.base + BPF_ALIGNMENT - (linklen % BPF_ALIGNMENT);
        p->sf.version_major = hdr.version_major;
        p->sf.version_minor = hdr.version_minor;
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       p->bp = NULL;                           /* no recovery possible yet */
+#endif
 #ifdef PCAP_FDDIPAD
        /* Padding only needed for live capture fcode */
        p->fddipad = 0;
@@ -1089,9 +1123,14 @@
 }
 
 /*
- * Read sf_readfile and return the next packet.  Return the header in hdr
- * and the contents in buf.  Return 0 on success, SFERR_EOF if there were
- * no more packets, and SFERR_TRUNC if a partial packet was encountered.
+ * Read sf_readfile and return the next packet.  Return the header in hdr and
+ * the contents in buf.  Return 0 on success; SFERR_EOF if there were no more
+ * packets, SFERR_FATAL if only a partial packet was found, or an invalid
+ * header was detected; and SFERR_ABORT if the breakloop flag was set and
+ * processing should be aborted. If a corrupt savefile was detected and
+ * recovered from, SFERR_RECOVERCLEAN (for concatenated complete capture files
+ * where previous packet was OK) or SFERR_RECOVER (if previous packet may have
+ * bogus data) is returned.
  */
 static int
 sf_next_packet(pcap_t *p, struct pcap_pkthdr *hdr, u_char *buf, u_int buflen)
@@ -1100,6 +1139,49 @@
        FILE *fp = p->sf.rfile;
        size_t amt_read;
        bpf_u_int32 t;
+       int res = 0;
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       /* recover pointers into previous packet, if possible */
+       unsigned last_caplen = p->bp - buf;
+
+       if (last_caplen < 0 || last_caplen > buflen)
+               last_caplen = 0;                        /* can't recover */
+
+       /*
+        * The two most common cases of "bogus savefile header" corruption are
+        * due to errors when splicing savefiles together.  One type is
+        * "duplicate file header", when raw concatenation leaves a file header
+        * where the packet header should be - this is easily detected by magic
+        * number checks.  Another type is "appending to truncated file" when
+        * additional packets are appended to a file with an incomplete record
+        * (possibly due to lack of filesystem space).  It's possible for both
+        * errors to occur at once, but this can be handled in the same way as
+        * the second case, and isn't discussed further.  DEFCON 9 capture-the-
+        * flag data has several examples of the somewhat rare second case;
+        * examples of the first can be found anywhere there is a halfway-
+        * (in)competent network admin.
+        *
+        * Both cases are recoverable: the first by updating the savefile
+        * header info with the new file header, the second by scanning back
+        * into the previous packet buffer looking for the last plausible
+        * savefile packet header (only the last one can be used, as there is
+        * no portable way to push back data already read by fread).  Plausible
+        * is defined as having a timestamp later than the previous one, but
+        * not by more than 2^24 to 2^25 seconds (from half a year to a year),
+        * with usecs in the range [0...999999], and (caplen <= buflen &&
+        * caplen <= wirelen && wirelen <= 65536).  This is implemented by
+        * validhdr(), which also checks that caplen is large enough that no
+        * bytes already read are left-over after returning a packet.
+        *
+        * Recoverable errors are indicated by a return value of -3/-4 with a
+        * valid (recovered) hdr and buf, so as not to break code that checks
+        * explicitly for -1 and/or -2, although most code should abort on < 0,
+        * preserving current behavior.  New code can check for -3/-4, log the
+        * error, and process the returned packet.
+        */
+
+ readheader:
+#endif /* CORRUPT_SAVEFILE_RECOVERY */
 
        /*
         * Read the packet header; the structure we use as a buffer
@@ -1123,22 +1205,14 @@
                                return (-1);
                        }
                        /* EOF */
-                       return (1);
+                       return (SFERR_EOF);
                }
        }
 
-       if (p->sf.swapped) {
-               /* these were written in opposite byte order */
-               hdr->caplen = SWAPLONG(sf_hdr.caplen);
-               hdr->len = SWAPLONG(sf_hdr.len);
-               hdr->ts.tv_sec = SWAPLONG(sf_hdr.ts.tv_sec);
-               hdr->ts.tv_usec = SWAPLONG(sf_hdr.ts.tv_usec);
-       } else {
-               hdr->caplen = sf_hdr.caplen;
-               hdr->len = sf_hdr.len;
-               hdr->ts.tv_sec = sf_hdr.ts.tv_sec;
-               hdr->ts.tv_usec = sf_hdr.ts.tv_usec;
-       }
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+ gotheader:
+#endif
+       swap_pkthdr(hdr, &sf_hdr, p->sf.swapped);
        /* Swap the caplen and len fields, if necessary. */
        switch (p->sf.lengths_swapped) {
 
@@ -1162,6 +1236,24 @@
                break;
        }
 
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       /*
+        * In the "duplicate file header" case, hdr->caplen ought to be 0,
+        * since it corresponds to the (unused, zero) file header "thiszone".
+        *
+        * In the "appending to truncated file" case, hdr->caplen may be
+        * plausible if the file offset is only off by one; however, the MSB of
+        * the timestamp is not likely to be correct.
+        *
+        * Catch these cases (the latter only if recovery is possible) and set
+        * hdr->caplen to force recovery.
+        */
+       if ((hdr->caplen == 0 && buflen != 0) ||
+           (last_caplen && p->cc != (0xff & (hdr->ts.tv_sec >> 24))
+            && (u_char)(p->cc + 1) != (0xff & (hdr->ts.tv_sec >> 24))))
+               hdr->caplen = 65536;
+#endif
+
        if (hdr->caplen > buflen) {
                /*
                 * This can happen due to Solaris 2.3 systems tripping
@@ -1173,6 +1265,154 @@
                static size_t tsize = 0;
 
                if (hdr->caplen > 65535) {
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+                       /* following line depends on pcap_pkthdr layout */
+                       bpf_u_int32 magic = hdr->ts.tv_sec;
+                       int swapped = 0;
+                       int hdrsize = sizeof(struct pcap_sf_pkthdr);
+                       struct pcap_sf_patched_pkthdr hdr2;
+                       u_char *bp = NULL;
+                       int off = 0;                    /* little-end offset */
+                       int last_len = 0;
+
+                       if (magic != TCPDUMP_MAGIC &&
+                           magic != KUZNETZOV_TCPDUMP_MAGIC) {
+                               magic = SWAPLONG(magic);
+                               swapped = 1;
+                       }
+                       switch (magic) {
+                       case KUZNETZOV_TCPDUMP_MAGIC:
+                               hdrsize+=sizeof(struct pcap_sf_patched_pkthdr);
+                               /* FALLTHRU */
+                       case TCPDUMP_MAGIC:
+                               /*
+                                * Skip rest of file header, reset size/swap
+                                * - don't bother checking other fields - this
+                                * is best-effort recovery, not guaranteed.
+                                */
+                               snprintf(p->errbuf, PCAP_ERRBUF_SIZE,
+                                   "embedded savefile magic header");
+                               if (fread(&sf_hdr,
+                                         sizeof(struct pcap_file_header)
+                                         - p->sf.hdrsize, 1, fp) != 1)
+                                       return (-1);    /* just give up */ 
+                               
+                               p->sf.hdrsize = hdrsize;
+                               p->sf.swapped = swapped;
+                               res = SFERR_RECOVERCLEAN;/* recoverable error */
+                               goto readheader;
+                       }
+
+                       snprintf(p->errbuf, PCAP_ERRBUF_SIZE,
+                                "truncated packet capture - previous packet has garbage data");
+
+                       /* can't recover in face of BUFMOD hack or bad hdr */
+                       if (last_caplen > buflen)
+                               last_caplen = 0;
+
+                       res = SFERR_RECOVER;    /* if we can recover */
+
+                       /*
+                        * In the little-endian case, the magic timestamp byte
+                        * could also be in the first three bytes of the
+                        * header we just read.
+                        */
+                       if (last_caplen && p->sf.swapped == (htonl(1) == 1)) {
+                               for (; off < 3; off++) {
+                                       bp = (u_char *)&sf_hdr + off;
+                                       if (*bp != p->cc &&
+                                           *bp != (u_char)(p->cc + 1))
+                                               continue;
+
+                                       last_len = 3 - off;
+                                       memcpy(&hdr2,
+                                              &buf[last_caplen - last_len],
+                                              last_len);
+                                       memcpy(last_len + (char *)&hdr2,
+                                              &sf_hdr,
+                                              p->sf.hdrsize - last_len);
+
+                                       swap_pkthdr(hdr, &hdr2,
+                                                   p->sf.swapped);
+
+                                       if (validhdr(hdr, last_len, buflen))
+                                               break;  /* looks legit */
+                               }
+                               if (off < 3) {
+                                       memcpy(buf, p->sf.hdrsize - last_len +
+                                              (char *)&sf_hdr, last_len);
+                                       memcpy(&sf_hdr, &hdr2, p->sf.hdrsize);
+                                       /* read any remaining data for pkt */
+                                       if (last_len < hdr->caplen &&
+                                           fread(&buf[last_len],
+                                                 hdr->caplen - last_len, 1,
+                                                 fp) != 1)
+                                               return (-1);
+                                       goto recover;
+                               }
+                       }
+
+                       /*
+                        * Scan backwards in previous packet looking
+                        * for plausible pcap packet header.
+                        */
+                       bp = &buf[last_caplen];
+                       while (--bp >= &buf[off]) {
+                               if (*bp != p->cc && *bp != (u_char)(p->cc + 1))
+                                       continue;
+
+                               last_len = &buf[last_caplen] - (bp - off);
+                               if (last_len >= p->sf.hdrsize) {
+                                       memcpy(&hdr2, bp - off, p->sf.hdrsize);
+                               } else {
+                                       memcpy(&hdr2, bp - off, last_len);
+                                       memcpy(last_len + (char *)&hdr2,
+                                              &sf_hdr,
+                                              p->sf.hdrsize - last_len);
+                               }
+                               swap_pkthdr(hdr, &hdr2, p->sf.swapped);
+
+                               /* minimum caplen to consume read data */
+                               if (validhdr(hdr, last_len - p->sf.hdrsize,
+                                            buflen))
+                                       break;          /* looks legit */
+                       }
+                       if (bp >= &buf[off]) {
+                               /* did we already read some of next header? */
+                               if (last_len > hdr->caplen) {
+                                       last_len -= hdr->caplen;
+                                       if (last_len > p->sf.hdrsize)
+                                               return (-1);
+                                       /* overlap very likely here */
+                                       memmove(&sf_hdr,
+                                               last_len + (char *)&sf_hdr,
+                                               p->sf.hdrsize - last_len);
+                                       if (fread(last_len + (char *)&sf_hdr,
+                                                 last_len, 1, fp) != 1)
+                                               return (-1);
+
+                                       /* don't try again in this call */
+                                       last_caplen = 0;
+                                       goto gotheader;
+                               } else if (last_len > p->sf.hdrsize) {
+                                       /* overlap very very likely here */
+                                       memmove(buf, p->sf.hdrsize + bp - off,
+                                               last_len - p->sf.hdrsize);
+                                       memcpy(&buf[last_len - p->sf.hdrsize],
+                                              &sf_hdr, p->sf.hdrsize);
+                               } else {
+                                       memcpy(buf, p->sf.hdrsize - last_len +
+                                              (char *)&sf_hdr, last_len);
+                               }
+                               memcpy(&sf_hdr, &hdr2, p->sf.hdrsize);
+                               /* read any data from remainder of last pkt */
+                               if (last_len < hdr->caplen &&
+                                   fread(&buf[last_len],
+                                         hdr->caplen - last_len, 1, fp) != 1)
+                                       return (-1);
+                               goto recover;
+                       }
+#endif /* CORRUPT_SAVEFILE_RECOVERY */
                        snprintf(p->errbuf, PCAP_ERRBUF_SIZE,
                            "bogus savefile header");
                        return (-1);
@@ -1230,6 +1470,12 @@
                }
        }
 
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       p->cc = 0xff & (hdr->ts.tv_sec >> 24);
+ recover:
+       p->bp = &buf[hdr->caplen];              /* recovery now possible */
+#endif
+
        /*
         * The DLT_USB_LINUX header is in host byte order when capturing
         * (it's supplied directly from a memory-mapped buffer shared
@@ -1243,31 +1489,32 @@
                pcap_usb_header* uhdr = (pcap_usb_header*) buf;
                /*
                 * The URB id is a totally opaque value; do we really need to 
-                * converte it to the reading host's byte order???
+                * convert it to the reading host's byte order???
                 */
                if (hdr->caplen < 8)
-                       return 0;
+                       return (res);
                uhdr->id = SWAPLL(uhdr->id);
                if (hdr->caplen < 14)
-                       return 0;
+                       return (res);
                uhdr->bus_id = SWAPSHORT(uhdr->bus_id);
                if (hdr->caplen < 24)
-                       return 0;
+                       return (res);
                uhdr->ts_sec = SWAPLL(uhdr->ts_sec);
                if (hdr->caplen < 28)
-                       return 0;
+                       return (res);
                uhdr->ts_usec = SWAPLONG(uhdr->ts_usec);
                if (hdr->caplen < 32)
-                       return 0;
+                       return (res);
                uhdr->status = SWAPLONG(uhdr->status);
                if (hdr->caplen < 36)
-                       return 0;
+                       return (res);
                uhdr->urb_len = SWAPLONG(uhdr->urb_len);
                if (hdr->caplen < 40)
-                       return 0;
+                       return (res);
                uhdr->data_len = SWAPLONG(uhdr->data_len);
        }
-       return (0);
+
+       return (res);
 }
 
 /*
@@ -1293,19 +1540,47 @@
                 * out of the loop without having read any packets, and
                 * return the number of packets we've processed so far.
                 */
-               if (p->break_loop) {
+               if (p->break_loop > 0) {
                        if (n == 0) {
                                p->break_loop = 0;
-                               return (-2);
+                               return (SFERR_ABORT);
                        } else
                                return (n);
                }
 
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+               /*
+                * If corrupt savefile recovery is taking place (negative
+                * break_loop value), return cached special error status
+                * first, and on next call, process first recovered packet.
+                */
+               if (p->break_loop < 0) {
+                       if (p->break_loop != -1) {
+                               status = p->break_loop;
+                               p->break_loop = -1;
+                               return (status);
+                       }
+                       p->break_loop = 0;
+                       h = p->pcap_header;
+               } else {
+#endif
+
                status = sf_next_packet(p, &h, p->buffer, p->bufsize);
                if (status) {
-                       if (status == 1)
-                               return (0);
-                       return (status);
+                       /* if at EOF, return any count of packets first */
+                       if (status == SFERR_EOF)
+                               return (n);
+                       if (status == SFERR_FATAL)
+                               return (status);
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+                       /* status is SFERR_RECOVER/CLEAN; save next packet */
+                       p->break_loop = status;
+                       p->pcap_header = h;
+                       if (n)
+                               return (n);
+                       continue;
+               }
+#endif
                }
 
                if ((fcode = p->fcode.bf_insns) == NULL ||
Index: tcpdump.1
===================================================================
RCS file: /tcpdump/master/tcpdump/tcpdump.1,v
retrieving revision 1.183
diff -u -r1.183 tcpdump.1
--- tcpdump.1   11 Mar 2007 04:38:19 -0000      1.183
+++ tcpdump.1   4 Apr 2007 13:32:04 -0000
@@ -29,7 +29,7 @@
 .na
 .B tcpdump
 [
-.B \-AdDefKlLnNOpqRStuUvxX
+.B \-AdDefkKlLnNOpqRStuUvxX
 ] [
 .B \-c
 .I count
@@ -396,6 +396,12 @@
 .I interface
 argument.
 .TP
+.B \-k
+Continue reading savefile data if corruption is encountered and recovery is
+possible (requires libpcap with corrupt savefile recovery enabled).  This can
+allow more complete processing of concatenated savefiles or cases where some
+data was not written (e.g. due to a full filesystem).
+.TP
 .B \-K
 Don't attempt to verify TCP checksums.  This is useful for interfaces
 that perform the TCP checksum calculation in hardware; otherwise,
Index: tcpdump.c
===================================================================
RCS file: /tcpdump/master/tcpdump/tcpdump.c,v
retrieving revision 1.269
diff -u -r1.269 tcpdump.c
--- tcpdump.c   5 May 2006 23:13:01 -0000       1.269
+++ tcpdump.c   4 Apr 2007 13:32:04 -0000
@@ -489,6 +489,7 @@
        pcap_if_t *devpointer;
        int devnum;
 #endif
+       int kflag = 0;
        int status;
 #ifdef WIN32
        u_int UserBufferSize = 1000000;
@@ -523,7 +524,7 @@
 
        opterr = 0;
        while (
-           (op = getopt(argc, argv, "aA" B_FLAG "c:C:d" D_FLAG "eE:fF:G:i:KlLm:M:nNOpqr:Rs:StT:u" U_FLAG "vw:W:xXy:Yz:Z:")) != -1)
+           (op = getopt(argc, argv, "aA" B_FLAG "c:C:d" D_FLAG "eE:fF:G:i:kKlLm:M:nNOpqr:Rs:StT:u" U_FLAG "vw:W:xXy:Yz:Z:")) != -1)
                switch (op) {
 
                case 'a':
@@ -668,6 +669,10 @@
 #endif /* WIN32 */
                        break;
 
+               case 'k':
+                       ++kflag;
+                       break;
+
                case 'K':
                        ++Kflag;
                        break;
@@ -1110,36 +1115,38 @@
                (void)fflush(stderr);
        }
 #endif /* WIN32 */
-       status = pcap_loop(pd, cnt, callback, pcap_userdata);
-       if (WFileName == NULL) {
-               /*
-                * We're printing packets.  Flush the printed output,
-                * so it doesn't get intermingled with error output.
-                */
-               if (status == -2) {
+       do {
+               status = pcap_loop(pd, cnt, callback, pcap_userdata);
+               if (WFileName == NULL) {
                        /*
-                        * We got interrupted, so perhaps we didn't
-                        * manage to finish a line we were printing.
-                        * Print an extra newline, just in case.
+                        * We're printing packets.  Flush the printed output,
+                        * so it doesn't get intermingled with error output.
                         */
-                       putchar('\n');
+                       if (status == -2) {
+                               /*
+                                * We got interrupted, so perhaps we didn't
+                                * manage to finish a line we were printing.
+                                * Print an extra newline, just in case.
+                                */
+                               putchar('\n');
+                       }
+                       (void)fflush(stdout);
                }
-               (void)fflush(stdout);
-       }
-       if (status == -1) {
-               /*
-                * Error.  Report it.
-                */
-               (void)fprintf(stderr, "%s: pcap_loop: %s\n",
-                   program_name, pcap_geterr(pd));
-       }
-       if (RFileName == NULL) {
-               /*
-                * We're doing a live capture.  Report the capture
-                * statistics.
-                */
-               info(1);
-       }
+               if (status == -1 || status == -3 || status == -4) {
+                       /*
+                        * Error.  Report it.
+                        */
+                       (void)fprintf(stderr, "%s: pcap_loop: %s\n",
+                           program_name, pcap_geterr(pd));
+               }
+               if (RFileName == NULL) {
+                       /*
+                        * We're doing a live capture.  Report the capture
+                        * statistics.
+                        */
+                       info(1);
+               }
+       } while (kflag && status == -3 || status == -4);
        pcap_close(pd);
        exit(status == -1 ? 1 : 0);
 }
@@ -1571,7 +1578,7 @@
 #endif /* WIN32 */
 #endif /* HAVE_PCAP_LIB_VERSION */
        (void)fprintf(stderr,
-"Usage: %s [-aAd" D_FLAG "efKlLnNOpqRStu" U_FLAG "vxX]" B_FLAG_USAGE " [-c count] [ -C file_size ]\n", program_name);
+"Usage: %s [-aAd" D_FLAG "efkKlLnNOpqRStu" U_FLAG "vxX]" B_FLAG_USAGE " [-c count] [ -C file_size ]\n", program_name);
        (void)fprintf(stderr,
 "\t\t[ -E algo:secret ] [ -F file ] [ -G seconds ] [ -i interface ]\n");
        (void)fprintf(stderr,
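
For readers who want to see the concatenated-savefile case that -k targets in isolation, here is a minimal sketch of the magic-number test. It is not taken from the patch: classify_embedded_header() and the HDR_* names are hypothetical, and the only facts assumed are the standard libpcap savefile magic (0xa1b2c3d4) and its byte-swapped form.

```c
#include <stdint.h>
#include <string.h>

/* Standard libpcap savefile magic, and the same value byte-swapped. */
#define PCAP_MAGIC          0xa1b2c3d4u
#define PCAP_MAGIC_SWAPPED  0xd4c3b2a1u

/* Hypothetical result codes for this sketch only. */
enum embedded_hdr { HDR_NONE = 0, HDR_NATIVE = 1, HDR_SWAPPED = 2 };

/*
 * Hypothetical helper (not part of the patch): look at the first four
 * bytes found where a packet header was expected and report whether
 * they look like the start of an embedded savefile header, and if so
 * in which byte order relative to the reading host.
 */
static enum embedded_hdr
classify_embedded_header(const unsigned char *bytes, size_t len)
{
	uint32_t magic;

	if (len < sizeof(magic))
		return HDR_NONE;	/* too short to hold a magic number */

	/* memcpy avoids an unaligned 32-bit load from the byte buffer */
	memcpy(&magic, bytes, sizeof(magic));

	if (magic == PCAP_MAGIC)
		return HDR_NATIVE;	/* same byte order as the reader */
	if (magic == PCAP_MAGIC_SWAPPED)
		return HDR_SWAPPED;	/* opposite byte order */
	return HDR_NONE;
}
```

A reader that gets HDR_NATIVE or HDR_SWAPPED at a packet-header position would presumably skip the rest of the 24-byte file header, update its byte-swapping state, and resume reading packet headers. How the patch itself does this is in the savefile.c changes above.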
