tcpdump mailing list archives

Re: Fixing damaged cap file

From: Alexander Dupuy <alex.dupuy () mac com>
Date: Wed, 04 Apr 2007 13:41:49 -0400

Petr Novák wrote:

Please tcpdump, do not end, but run over damaged packets (move from
current position in cap file byte by byte) from cap file until You find
good header and packet,


Guy Harris replied

The code *could* skip forward until it finds something that appears to have a caplen <= len <= 65535, but there's no guarantee that it would find a good header.
It could also check for a time stamp that looks "reasonable", for some definition of "reasonable", although there are some cases where, unfortunately, time stamps go backward in time.
Looking for a good packet is trickier, as it involves a lot of link-layer-type-dependent work.
If the file is damaged, the damage might occur in what appear to be "good" packets. Consider, for example, a common form of damage - FTPing a file between Windows and UN*X in ASCII mode - which can insert or remove bytes in packet data.
Furthermore, it might skip over a lot of packets, meaning that the data you get from your capture file might be incomplete.
In other words:
1) attempting to find a good packet by skipping file data is not easy, and is not guaranteed to find a real packet;
2) if you do that, you lose a bunch of data from the file, so you're really not successfully reading the capture, you're getting a partial sample of the packets in the capture.

This is true in general, but there are some common forms of corruption in tcpdump capture files that can be recovered from, at least in most cases, without skipping file data. The most common one is caused by concatenating multiple pcap files using cat (or another non-pcap-aware file-level utility, e.g. dd). When an embedded file header is found where a packet header was expected, it causes a fatal "bogus savefile header" error. However, this can be identified by the file header "magic" and recovered from cleanly. Checking for the opposite-endianness magic on the concatenated files makes recovery possible even when the files were written with different byte orders.
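As a rough illustration of the magic check (a simplified sketch, not the code from the attached patch), the first 32-bit word read where a packet header was expected can be compared against the two savefile magic numbers in both byte orders. The names check_embedded_magic, swap32, and the enum are hypothetical; the magic values are those libpcap uses for the classic and "Kuznetzov" patched formats.

```c
#include <stddef.h>
#include <stdint.h>

#define PCAP_MAGIC         0xa1b2c3d4u  /* classic savefile magic */
#define PCAP_MAGIC_PATCHED 0xa1b2cd34u  /* "Kuznetzov" patched-format magic */

enum magic_kind { MAGIC_NONE = 0, MAGIC_NATIVE, MAGIC_SWAPPED };

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
           ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * first_word is the first 32 bits of what should have been a packet
 * header, in the reader's byte order.  Returns MAGIC_NONE if it is not
 * a file magic; otherwise reports whether the embedded file was written
 * in the reader's byte order or the opposite one.  If patched is not
 * NULL, it is set to 1 for the patched format and 0 for the classic one.
 */
static enum magic_kind check_embedded_magic(uint32_t first_word, int *patched)
{
    uint32_t m = first_word;
    enum magic_kind kind = MAGIC_NATIVE;

    if (m != PCAP_MAGIC && m != PCAP_MAGIC_PATCHED) {
        m = swap32(m);
        kind = MAGIC_SWAPPED;
    }
    if (m != PCAP_MAGIC && m != PCAP_MAGIC_PATCHED)
        return MAGIC_NONE;
    if (patched != NULL)
        *patched = (m == PCAP_MAGIC_PATCHED);
    return kind;
}
```

A reader that gets MAGIC_NATIVE or MAGIC_SWAPPED would then skip the rest of the embedded file header and update its byte-swapping state before reading the next packet header.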

Another form of breakage that is fairly common is for a capture file to have "gaps" due to failed writes (typically because of a full filesystem) - these gaps can cause a file header (from concatenation) or a packet header (when a capture application doesn't check for write errors, and writes new data once there is space) to appear within what should have been the packet data. In this case, rather than skipping forward looking for the next packet header, it's more effective to search backward in the existing buffered data looking for a savefile packet header that was treated as packet data as a result of such a gap. This searching can look for the MSB of the current packet timestamp (allowing for out-of-order packets in almost all cases) and confirm validity with caplen <= len <= 65535. While it's possible to find something that looks like a valid savefile packet header (but isn't), it's very unlikely (except for a corrupted savefile containing traffic with savefile data). Even if you do mistakenly recover a packet header, it's even more improbable for there to be another mistaken packet header following the packet data of the mistaken one, and recovery will stop after only a few "garbage" packets are returned (by disabling recovery immediately after a recovered packet). Furthermore, by only searching backward through the previous packet's data, the overhead in cases of unrecoverable corruption is minimal (you don't spend minutes or hours searching in vain through a multi-gigabyte savefile full of NULs due to filesystem errors).
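The backward search can be sketched roughly as follows. This is a simplified illustration rather than the patch code: it assumes the header is in the reader's byte order and lies entirely within the previous packet's buffer, whereas the attached patch also handles byte-swapped files and a header that straddles the end of the buffer. The names rec_hdr, plausible, and find_last_header are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Classic per-packet record header as stored in a savefile (16 bytes). */
struct rec_hdr {
    uint32_t ts_sec;
    uint32_t ts_usec;
    uint32_t caplen;
    uint32_t len;
};

/*
 * A header is plausible if the most significant byte of its timestamp
 * matches that of the previous good packet (or is one greater), the
 * microseconds are in range, and caplen <= len <= 65535 with caplen no
 * larger than the snapshot length.
 */
static int plausible(const struct rec_hdr *h, uint32_t prev_sec,
                     uint32_t snaplen)
{
    uint8_t prev_msb = (uint8_t)(prev_sec >> 24);
    uint8_t msb = (uint8_t)(h->ts_sec >> 24);

    return (msb == prev_msb || msb == (uint8_t)(prev_msb + 1)) &&
           h->ts_usec < 1000000u &&
           h->caplen <= h->len &&
           h->len <= 65535u &&
           h->caplen <= snaplen;
}

/*
 * Scan backward through data[0..n) and return the offset of the last
 * position holding a plausible record header, or -1 if there is none.
 */
static long find_last_header(const unsigned char *data, size_t n,
                             uint32_t prev_sec, uint32_t snaplen)
{
    struct rec_hdr h;
    size_t off;

    if (n < sizeof h)
        return -1;
    for (off = n - sizeof h + 1; off-- > 0; ) {
        memcpy(&h, data + off, sizeof h);   /* avoid unaligned access */
        if (plausible(&h, prev_sec, snaplen))
            return (long)off;
    }
    return -1;
}
```

Because the scan is bounded by the previous packet's captured length, the cost of a failed search is at most one snapshot's worth of comparisons.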

While this code was not "easy" to write, I have written it and have been using it successfully for several years; while it is not "guaranteed" to find a real packet, it recovers effectively and reliably from common causes of corruption, with minimal overhead and only a small probability of mistaken recovery (and applications that really care can treat the second case, of "gaps," as fatal); and by not skipping over data, it returns all the data in the capture, not just a "partial sample." It doesn't handle all possible causes of corruption (e.g. ASCII FTP transfer) but does cover the most common ones.

Attached is a patch for libpcap (based on CVS head) that will recover from these common corruption cases (if enabled during configure, with --enable-corrupt-savefile-recovery). If enabled, pcap_next_ex() returns -3 or -4 if libpcap has detected corruption and recovered from it (for -4, the previously-returned packet may have bogus packet data - and if so, would have had that bogus data even without this patch, due to "gaps"). Both pcap_loop() and pcap_dispatch() return the number of packets already read and/or these new error codes (on the next call, much like current breakloop handling). An application that is willing to continue processing can check for these error codes and recover from common forms of savefile corruption. I also include a patch for tcpdump (also based on CVS head) that adds a -k option to keep going after a recoverable error has been detected.
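As a sketch of how an application might treat the return value of pcap_next_ex() once the patch is enabled (the -3 and -4 codes exist only with this patch; the enum and function names here are hypothetical and not part of libpcap):

```c
/* What a read loop should do with a pcap_next_ex() return value. */
enum next_action {
    ACT_PROCESS,        /*  1: a packet was read */
    ACT_RETRY,          /*  0: live-capture timeout, no packet */
    ACT_FAIL,           /* -1: error, see pcap_geterr() */
    ACT_STOP,           /* -2: savefile EOF or pcap_breakloop() */
    ACT_RECOVERED,      /* -3: embedded file header skipped; previous
                               packet was fine (patch only) */
    ACT_RECOVERED_GAP   /* -4: recovered after a gap; previous packet
                               may contain garbage (patch only) */
};

/*
 * Map a status code to an action.  With keep_going zero, the two
 * recovery codes are treated as fatal, as unpatched callers would do;
 * with it nonzero, the caller is expected to log the condition and
 * call pcap_next_ex() again.
 */
static enum next_action classify_status(int status, int keep_going)
{
    switch (status) {
    case 1:
        return ACT_PROCESS;
    case 0:
        return ACT_RETRY;
    case -2:
        return ACT_STOP;
    case -3:
        return keep_going ? ACT_RECOVERED : ACT_FAIL;
    case -4:
        return keep_going ? ACT_RECOVERED_GAP : ACT_FAIL;
    default:
        return ACT_FAIL;    /* -1 and anything unexpected */
    }
}
```

The keep_going flag plays the same role as the -k option in the tcpdump patch.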

There are compatibility issues with the introduction of these new error codes, as existing code won't expect them (one reason why the functionality must be explicitly enabled in configure). Code using pcap_next() doesn't get error codes (just packet or no packet), so will treat these as any other error. Older code that performs error testing with ret < 0 will treat -3/-4 much like current -1 (error) and -2 (breakloop abort). Newer code that checks explicitly for -1 and -2 (like tcpdump) will generally abort on the new error codes (although they may treat it like a breakloop and fail to display the error). For code that uses pcap_loop() or pcap_dispatch() the read loop is aborted, and only if called again will post-recovery packets be passed to the callback function. This essentially preserves existing application behavior by default. For code using pcap_next_ex(), the packet data and header pointers are updated just in case the new codes are not recognized as errors, but as pcap_next_ex() has always returned -2 for EOF indication (as well as breakloop), I'd be surprised if any existing code continued processing on a negative value return from pcap_next_ex(). Code using pcap_next_ex() might fail to display the error, and in a worst case, would continue processing after the recovery (with the first post-recovery packet returned twice). While this doesn't preserve existing application behavior in all possible cases, it does generally do so, and this seems reasonable for a conditionally compiled feature.

At an implementation level of compatibility, virtually all recovery processing code changes are conditionally compiled - the only exceptions are the refactoring of the header swapping into a static function in savefile.c, and pcap_next_ex() handling of the new error codes from pcap_offline_read(). If corrupt savefile recovery is enabled, two unused fields in struct pcap (bp and cc) are used to track savefile recovery markers (so no structure size changes occur) and this is the only additional processing that occurs for non-corrupt savefiles (the overhead is minimal: pcap_open_offline() has one more assignment per call, and sf_next_packet() has two per call). Only in the event of a corrupted file is there significant additional processing to recover. At a purely source level, I also "renovated" the out-of-date and unused SFERR_XX return codes for sf_next_packet() to reflect the actual (and new) codes returned by this function, and modified the savefile.c source to use them (updating the out-of-date comments in the file accordingly). The patches also include updates to the pcap and tcpdump man pages reflecting the changes, and also documenting the current ambiguity of the -2 return code for pcap_next_ex() - historically used for EOF, but also used by pcap_breakloop().

If there are other improvements or changes to this that would be required for its acceptance and incorporation into the tcpdump.org CVS repository, I would be willing to make them if they are not excessive (I've spent a bit too much time cleaning up this private code for public use already).

Petr - I don't know if these patches will address the types of file corruption you are dealing with, although I suspect they will. However, you certainly will not be able to apply them to your 0.8.3 version of libpcap/tcpdump. You will need to download the "current tar files" from tcpdump.org (http://www.tcpdump.org/daily/tcpdump-current.tar.gz and libpcap-current.tar.gz) - these correspond to the CVS head release and the attached patches should apply cleanly to that version. If you then configure --enable-corrupt-savefile-recovery and build libpcap and tcpdump, you should have a tcpdump with a -k option that does pretty much what you want; you can use this with -r and -w to "clean up" capture files which you can then process with your existing tcpdump, if you don't want to use the new one due to changes in output formatting or whatever.


@alex
--
mailto:alex.dupuy () mac com

Index: config.h.in
===================================================================
RCS file: /tcpdump/master/libpcap/config.h.in,v
retrieving revision 1.26
diff -u -r1.26 config.h.in
--- config.h.in 20 Dec 2006 03:30:51 -0000      1.26
+++ config.h.in 4 Apr 2007 14:32:40 -0000
@@ -10,6 +10,9 @@
 /* Enable optimizer debugging */
 #undef BDEBUG
 
+/* Enable corrupt savefile recovery */
+#undef CORRUPT_SAVEFILE_RECOVERY
+
 /* define if you have the DAG API */
 #undef HAVE_DAG_API
 
Index: configure
===================================================================
RCS file: /tcpdump/master/libpcap/configure,v
retrieving revision 1.75
diff -u -r1.75 configure
--- configure   8 Feb 2007 06:03:03 -0000       1.75
+++ configure   4 Apr 2007 14:32:42 -0000
@@ -850,6 +850,7 @@
   --enable-ipv6           build IPv6-capable version
   --enable-optimizer-dbg  build optimizer debugging code
   --enable-yydebug        build parser debugging code
+  --enable-corrupt-savefile-recovery  build version capable of recovering from some corrupt savefiles
 
 Optional Packages:
   --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
@@ -5657,6 +5658,23 @@
 echo "$as_me:$LINENO: result: ${enable_yydebug-no}" >&5
 echo "${ECHO_T}${enable_yydebug-no}" >&6
 
+echo "$as_me:$LINENO: checking whether to enable corrupt savefile recovery code" >&5
+echo $ECHO_N "checking whether to enable corrupt savefile recovery code... $ECHO_C" >&6
+# Check whether --enable-corrupt-savefile-recovery or --disable-corrupt-savefile-recovery was given.
+if test "${enable_corrupt_savefile_recovery+set}" = set; then
+  enableval="$enable_corrupt_savefile_recovery"
+
+fi;
+if test "$enable_corrupt_savefile_recovery" = "yes"; then
+
+cat >>confdefs.h <<\_ACEOF
+#define CORRUPT_SAVEFILE_RECOVERY 1
+_ACEOF
+
+fi
+echo "$as_me:$LINENO: result: ${enable_corrupt_savefile_recovery-no}" >&5
+echo "${ECHO_T}${enable_corrupt_savefile_recovery-no}" >&6
+
 case "$V_PCAP" in
 
 dlpi)
Index: configure.in
===================================================================
RCS file: /tcpdump/master/libpcap/configure.in,v
retrieving revision 1.134
diff -u -r1.134 configure.in
--- configure.in        8 Feb 2007 06:02:42 -0000       1.134
+++ configure.in        4 Apr 2007 14:32:42 -0000
@@ -323,6 +323,13 @@
 fi
 AC_MSG_RESULT(${enable_yydebug-no})
 
+AC_MSG_CHECKING(whether to enable corrupt savefile recovery code)
+AC_ARG_ENABLE(corrupt-savefile-recovery, [  --enable-corrupt-savefile-recovery  build version capable of recovering from some corrupt savefiles])
+if test "$enable_corrupt_savefile_recovery" = "yes"; then
+       AC_DEFINE(CORRUPT_SAVEFILE_RECOVERY,1,[Enable corrupt savefile recovery])
+fi
+AC_MSG_RESULT(${enable_corrupt_savefile_recovery-no})
+
 case "$V_PCAP" in
 
 dlpi)
Index: pcap-int.h
===================================================================
RCS file: /tcpdump/master/libpcap/pcap-int.h,v
retrieving revision 1.80
diff -u -r1.80 pcap-int.h
--- pcap-int.h  11 Mar 2007 21:44:12 -0000      1.80
+++ pcap-int.h  4 Apr 2007 14:32:42 -0000
@@ -170,8 +170,8 @@
         */
        int bufsize;
        u_char *buffer;
-       u_char *bp;
-       int cc;
+       u_char *bp;             /* used only for CORRUPT_SAVEFILE_RECOVERY */
+       int cc;                 /* used only for CORRUPT_SAVEFILE_RECOVERY */
 
        /*
         * Place holder for pcap_next().
Index: pcap.3
===================================================================
RCS file: /tcpdump/master/libpcap/pcap.3,v
retrieving revision 1.74
diff -u -r1.74 pcap.3
--- pcap.3      12 Oct 2006 07:59:54 -0000      1.74
+++ pcap.3      4 Apr 2007 14:32:42 -0000
@@ -504,6 +504,21 @@
 checking for a return value < 0.
 .ft R
 .PP
+If corrupt ``savefile'' recovery has been enabled, two other errors may
+be returned when reading from a savefile: \-3, if a file header was
+found instead of a packet header (caused by appending multiple
+savefiles); \-4, if the previous packet data was incomplete (caused by
+lack of file space). In the second case, data for the previous packet
+contained ``garbage'' data from the next packet header.  In either
+case, further calls to
+.B pcap_dispatch()
+or
+.B pcap_loop()
+will return additional packets after recovery from the savefile
+corruption.  Not all corruption can be recovered from, and there is a
+very small possibility that the recovery will be incorrect (e.g. if
+packet data contained savefile data being transferred on the network).
+.PP
 .BR NOTE :
 when reading a live capture,
 .B pcap_dispatch()
@@ -548,6 +563,10 @@
 make sure that you explicitly check for \-1 and \-2, rather than just
 checking for a return value < 0.
 .ft R
+If corrupt ``savefile'' recovery has been enabled, \-3 or \-4 may be
+returned as noted for
+.B pcap_dispatch()
+above, and applications should check for those values as well.
 .PP
 .B pcap_next()
 reads the next packet (by calling
@@ -584,7 +603,21 @@
 .TP
 \-2
 packets are being read from a ``savefile'', and there are no more
-packets to read from the savefile.
+packets to read from the savefile;
+.I or
+.B pcap_breakloop()
+was called
+.TP
+\-3
+packets are being read from a corrupt ``savefile'' (file header was
+found instead of a packet header), but further calls will return
+recovered packets
+.TP
+\-4
+packets are being read from a corrupt ``savefile'' (packet data was
+incomplete); data for the previous packet contained ``garbage'' data
+from the next packet header but further calls will return recovered
+packets
 .RE
 .PP
 If the packet was read without problems, the pointer pointed to by the
Index: pcap.c
===================================================================
RCS file: /tcpdump/master/libpcap/pcap.c,v
retrieving revision 1.104
diff -u -r1.104 pcap.c
--- pcap.c      20 Dec 2006 03:30:32 -0000      1.104
+++ pcap.c      4 Apr 2007 14:32:42 -0000
@@ -180,18 +180,28 @@
                 * Return codes for pcap_offline_read() are:
                 *   -  0: EOF
                 *   - -1: error
+                *   - -2: pcap_breakloop() called
+                *   - -3/-4: recovered from corrupt savefile
                 *   - >1: OK
                 * The first one ('0') conflicts with the return code of
                 * 0 from pcap_read() meaning "no packets arrived before
                 * the timeout expired", so we map it to -2 so you can
                 * distinguish between an EOF from a savefile and a
                 * "no packets arrived before the timeout expired, try
-                * again" from a live capture.
+                * again" from a live capture. While these return codes
+                * don't distinguish between an EOF from a savefile and
+                * a call to pcap_breakloop, code calling pcap_breakloop
+                * can set other flags, or feof() can be used to test
+                * for true EOF.  In corrupt savefile recovery, set data
+                * (and header, already set above) in case application
+                * doesn't recognize -3/-4 as error codes.
                 */
                if (status == 0)
                        return (-2);
-               else
-                       return (status);
+               else if (status == -3 || status == -4)
+                       *pkt_data = p->buffer;
+
+               return (status);
        }
 
        /*
Index: savefile.c
===================================================================
RCS file: /tcpdump/master/libpcap/savefile.c,v
retrieving revision 1.152
diff -u -r1.152 savefile.c
--- savefile.c  3 Apr 2007 07:18:27 -0000       1.152
+++ savefile.c  4 Apr 2007 14:32:43 -0000
@@ -96,10 +96,11 @@
 #define        SWAPSHORT(y) \
        ( (((y)&0xff)<<8) | ((u_short)((y)&0xff00)>>8) )
 
-#define SFERR_TRUNC            1
-#define SFERR_BADVERSION       2
-#define SFERR_BADF             3
-#define SFERR_EOF              4 /* not really an error, just a status */
+#define SFERR_EOF              1
+#define SFERR_FATAL            -1
+#define SFERR_ABORT            -2
+#define SFERR_RECOVERCLEAN     -3
+#define SFERR_RECOVER          -4
 
 /*
  * Setting O_BINARY on DOS/Windows is a bit tricky
@@ -861,6 +862,36 @@
                free(p->sf.base);
 }
 
+static void
+swap_pkthdr(struct pcap_pkthdr *hdr, struct pcap_sf_patched_pkthdr *sf_hdr, int swap)
+{
+       if (swap) {
+               /* these were written in opposite byte order */
+               hdr->caplen = SWAPLONG(sf_hdr->caplen);
+               hdr->len = SWAPLONG(sf_hdr->len);
+               hdr->ts.tv_sec = SWAPLONG(sf_hdr->ts.tv_sec);
+               hdr->ts.tv_usec = SWAPLONG(sf_hdr->ts.tv_usec);
+       } else {
+               hdr->caplen = sf_hdr->caplen;
+               hdr->len = sf_hdr->len;
+               hdr->ts.tv_sec = sf_hdr->ts.tv_sec;
+               hdr->ts.tv_usec = sf_hdr->ts.tv_usec;
+       }
+}
+
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+static int
+validhdr(struct pcap_pkthdr *hdr, int minlen, int maxlen)
+{
+       return (hdr->ts.tv_usec >= 0 &&
+               hdr->ts.tv_usec < 1000000 &&
+               hdr->len <= 65536 &&
+               hdr->caplen > minlen &&
+               hdr->caplen <= hdr->len &&
+               hdr->caplen <= maxlen);
+}
+#endif /* CORRUPT_SAVEFILE_RECOVERY */
+
 pcap_t *
 pcap_open_offline(const char *fname, char *errbuf)
 {
@@ -1026,6 +1057,9 @@
        p->buffer = p->sf.base + BPF_ALIGNMENT - (linklen % BPF_ALIGNMENT);
        p->sf.version_major = hdr.version_major;
        p->sf.version_minor = hdr.version_minor;
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       p->bp = NULL;                           /* no recovery possible yet */
+#endif
 #ifdef PCAP_FDDIPAD
        /* Padding only needed for live capture fcode */
        p->fddipad = 0;
@@ -1089,9 +1123,14 @@
 }
 
 /*
- * Read sf_readfile and return the next packet.  Return the header in hdr
- * and the contents in buf.  Return 0 on success, SFERR_EOF if there were
- * no more packets, and SFERR_TRUNC if a partial packet was encountered.
+ * Read sf_readfile and return the next packet.  Return the header in hdr and
+ * the contents in buf.  Return 0 on success; SFERR_EOF if there were no more
+ * packets, SFERR_FATAL if only a partial packet was found, or an invalid
+ * header was detected; and SFERR_ABORT if the breakloop flag was set and
+ * processing should be aborted. If a corrupt savefile was detected and
+ * recovered from, SFERR_RECOVERCLEAN (for concatenated complete capture files
+ * where previous packet was OK) or SFERR_RECOVER (if previous packet may have
+ * bogus data) is returned.
  */
 static int
 sf_next_packet(pcap_t *p, struct pcap_pkthdr *hdr, u_char *buf, u_int buflen)
@@ -1100,6 +1139,49 @@
        FILE *fp = p->sf.rfile;
        size_t amt_read;
        bpf_u_int32 t;
+       int res = 0;
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       /* recover pointers into previous packet, if possible */
+       unsigned last_caplen = p->bp - buf;
+
+       if (last_caplen < 0 || last_caplen > buflen)
+               last_caplen = 0;                        /* can't recover */
+
+       /*
+        * The two most common cases of "bogus savefile header" corruption are
+        * due to errors when splicing savefiles together.  One type is
+        * "duplicate file header", when raw concatenation leaves a file header
+        * where the packet header should be - this is easily detected by magic
+        * number checks.  Another type is "appending to truncated file" when
+        * additional packets are appended to a file with an incomplete record
+        * (possibly due to lack of filesystem space).  It's possible for both
+        * errors to occur at once, but this can be handled in the same way as
+        * the second case, and isn't discussed further.  DEFCON 9 capture-the-
+        * flag data has several examples of the somewhat rare second case;
+        * examples of the first can be found anywhere there is a halfway-
+        * (in)competent network admin.
+        *
+        * Both cases are recoverable: the first by updating the savefile
+        * header info with the new file header, the second by scanning back
+        * into the previous packet buffer looking for the last plausible
+        * savefile packet header (only the last one can be used, as there is
+        * no portable way to push back data already read by fread).  Plausible
+        * is defined as having a timestamp later than the previous one, but
+        * not by more than 2^24 to 2^25 seconds (from half a year to a year),
+        * with usecs in the range [0...999999], and (caplen <= buflen &&
+        * caplen <= wirelen && wirelen <= 65536).  This is implemented by
+        * validhdr(), which also checks that caplen is large enough that no
+        * bytes already read are left-over after returning a packet.
+        *
+        * Recoverable errors are indicated by a return value of -3/-4 with a
+        * valid (recovered) hdr and buf, so as not to break code that checks
+        * explicitly for -1 and/or -2, although most code should abort on < 0,
+        * preserving current behavior.  New code can check for -3/-4, log the
+        * error, and process the returned packet.
+        */
+
+ readheader:
+#endif /* CORRUPT_SAVEFILE_RECOVERY */
 
        /*
         * Read the packet header; the structure we use as a buffer
@@ -1123,22 +1205,14 @@
                                return (-1);
                        }
                        /* EOF */
-                       return (1);
+                       return (SFERR_EOF);
                }
        }
 
-       if (p->sf.swapped) {
-               /* these were written in opposite byte order */
-               hdr->caplen = SWAPLONG(sf_hdr.caplen);
-               hdr->len = SWAPLONG(sf_hdr.len);
-               hdr->ts.tv_sec = SWAPLONG(sf_hdr.ts.tv_sec);
-               hdr->ts.tv_usec = SWAPLONG(sf_hdr.ts.tv_usec);
-       } else {
-               hdr->caplen = sf_hdr.caplen;
-               hdr->len = sf_hdr.len;
-               hdr->ts.tv_sec = sf_hdr.ts.tv_sec;
-               hdr->ts.tv_usec = sf_hdr.ts.tv_usec;
-       }
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+ gotheader:
+#endif
+       swap_pkthdr(hdr, &sf_hdr, p->sf.swapped);
        /* Swap the caplen and len fields, if necessary. */
        switch (p->sf.lengths_swapped) {
 
@@ -1162,6 +1236,24 @@
                break;
        }
 
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       /*
+        * In the "duplicate file header" case, hdr->caplen ought to be 0,
+        * since it corresponds to the (unused, zero) file header "thiszone".
+        *
+        * In the "appending to truncated file" case, hdr->caplen may be
+        * plausible if the file offset is only off by one; however, the MSB of
+        * the timestamp is not likely to be correct.
+        *
+        * Catch these cases (the latter only if recovery is possible) and set
+        * hdr->caplen to force recovery.
+        */
+       if ((hdr->caplen == 0 && buflen != 0) ||
+           (last_caplen && p->cc != (0xff & (hdr->ts.tv_sec >> 24))
+            && (u_char)(p->cc + 1) != (0xff & (hdr->ts.tv_sec >> 24))))
+               hdr->caplen = 65536;
+#endif
+
        if (hdr->caplen > buflen) {
                /*
                 * This can happen due to Solaris 2.3 systems tripping
@@ -1173,6 +1265,154 @@
                static size_t tsize = 0;
 
                if (hdr->caplen > 65535) {
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+                       /* following line depends on pcap_pkthdr layout */
+                       bpf_u_int32 magic = hdr->ts.tv_sec;
+                       int swapped = 0;
+                       int hdrsize = sizeof(struct pcap_sf_pkthdr);
+                       struct pcap_sf_patched_pkthdr hdr2;
+                       u_char *bp = NULL;
+                       int off = 0;                    /* little-end offset */
+                       int last_len = 0;
+
+                       if (magic != TCPDUMP_MAGIC &&
+                           magic != KUZNETZOV_TCPDUMP_MAGIC) {
+                               magic = SWAPLONG(magic);
+                               swapped = 1;
+                       }
+                       switch (magic) {
+                       case KUZNETZOV_TCPDUMP_MAGIC:
+                               hdrsize+=sizeof(struct pcap_sf_patched_pkthdr);
+                               /* FALLTHRU */
+                       case TCPDUMP_MAGIC:
+                               /*
+                                * Skip rest of file header, reset size/swap
+                                * - don't bother checking other fields - this
+                                * is best-effort recovery, not guaranteed.
+                                */
+                               snprintf(p->errbuf, PCAP_ERRBUF_SIZE,
+                                   "embedded savefile magic header");
+                               if (fread(&sf_hdr,
+                                         sizeof(struct pcap_file_header)
+                                         - p->sf.hdrsize, 1, fp) != 1)
+                                       return (-1);    /* just give up */ 
+                               
+                               p->sf.hdrsize = hdrsize;
+                               p->sf.swapped = swapped;
+                               res = SFERR_RECOVERCLEAN;/* recoverable error */
+                               goto readheader;
+                       }
+
+                       snprintf(p->errbuf, PCAP_ERRBUF_SIZE,
+                                "truncated packet capture - previous packet has garbage data");
+
+                       /* can't recover in face of BUFMOD hack or bad hdr */
+                       if (last_caplen > buflen)
+                               last_caplen = 0;
+
+                       res = SFERR_RECOVER;    /* if we can recover */
+
+                       /*
+                        * In the little-endian case, the magic timestamp byte
+                        * could also be in the first three bytes of the
+                        * header we just read.
+                        */
+                       if (last_caplen && p->sf.swapped == (htonl(1) == 1)) {
+                               for (; off < 3; off++) {
+                                       bp = (u_char *)&sf_hdr + off;
+                                       if (*bp != p->cc &&
+                                           *bp != (u_char)(p->cc + 1))
+                                               continue;
+
+                                       last_len = 3 - off;
+                                       memcpy(&hdr2,
+                                              &buf[last_caplen - last_len],
+                                              last_len);
+                                       memcpy(last_len + (char *)&hdr2,
+                                              &sf_hdr,
+                                              p->sf.hdrsize - last_len);
+
+                                       swap_pkthdr(hdr, &hdr2,
+                                                   p->sf.swapped);
+
+                                       if (validhdr(hdr, last_len, buflen))
+                                               break;  /* looks legit */
+                               }
+                               if (off < 3) {
+                                       memcpy(buf, p->sf.hdrsize - last_len +
+                                              (char *)&sf_hdr, last_len);
+                                       memcpy(&sf_hdr, &hdr2, p->sf.hdrsize);
+                                       /* read any remaining data for pkt */
+                                       if (last_len < hdr->caplen &&
+                                           fread(&buf[last_len],
+                                                 hdr->caplen - last_len, 1,
+                                                 fp) != 1)
+                                               return (-1);
+                                       goto recover;
+                               }
+                       }
+
+                       /*
+                        * Scan backwards in previous packet looking
+                        * for plausible pcap packet header.
+                        */
+                       bp = &buf[last_caplen];
+                       while (--bp >= &buf[off]) {
+                               if (*bp != p->cc && *bp != (u_char)(p->cc + 1))
+                                       continue;
+
+                               last_len = &buf[last_caplen] - (bp - off);
+                               if (last_len >= p->sf.hdrsize) {
+                                       memcpy(&hdr2, bp - off, p->sf.hdrsize);
+                               } else {
+                                       memcpy(&hdr2, bp - off, last_len);
+                                       memcpy(last_len + (char *)&hdr2,
+                                              &sf_hdr,
+                                              p->sf.hdrsize - last_len);
+                               }
+                               swap_pkthdr(hdr, &hdr2, p->sf.swapped);
+
+                               /* minimum caplen to consume read data */
+                               if (validhdr(hdr, last_len - p->sf.hdrsize,
+                                            buflen))
+                                       break;          /* looks legit */
+                       }
+                       if (bp >= &buf[off]) {
+                               /* did we already read some of next header? */
+                               if (last_len > hdr->caplen) {
+                                       last_len -= hdr->caplen;
+                                       if (last_len > p->sf.hdrsize)
+                                               return (-1);
+                                       /* overlap very likely here */
+                                       memmove(&sf_hdr,
+                                               last_len + (char *)&sf_hdr,
+                                               p->sf.hdrsize - last_len);
+                                       if (fread(last_len + (char *)&sf_hdr,
+                                                 last_len, 1, fp) != 1)
+                                               return (-1);
+
+                                       /* don't try again in this call */
+                                       last_caplen = 0;
+                                       goto gotheader;
+                               } else if (last_len > p->sf.hdrsize) {
+                                       /* overlap very very likely here */
+                                       memmove(buf, p->sf.hdrsize + bp - off,
+                                               last_len - p->sf.hdrsize);
+                                       memcpy(&buf[last_len - p->sf.hdrsize],
+                                              &sf_hdr, p->sf.hdrsize);
+                               } else {
+                                       memcpy(buf, p->sf.hdrsize - last_len +
+                                              (char *)&sf_hdr, last_len);
+                               }
+                               memcpy(&sf_hdr, &hdr2, p->sf.hdrsize);
+                               /* read any data from remainder of last pkt */
+                               if (last_len < hdr->caplen &&
+                                   fread(&buf[last_len],
+                                         hdr->caplen - last_len, 1, fp) != 1)
+                                       return (-1);
+                               goto recover;
+                       }
+#endif /* CORRUPT_SAVEFILE_RECOVERY */
                        snprintf(p->errbuf, PCAP_ERRBUF_SIZE,
                            "bogus savefile header");
                        return (-1);
@@ -1230,6 +1470,12 @@
                }
        }
 
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+       p->cc = 0xff & (hdr->ts.tv_sec >> 24);
+ recover:
+       p->bp = &buf[hdr->caplen];              /* recovery now possible */
+#endif
+
        /*
         * The DLT_USB_LINUX header is in host byte order when capturing
         * (it's supplied directly from a memory-mapped buffer shared
@@ -1243,31 +1489,32 @@
                pcap_usb_header* uhdr = (pcap_usb_header*) buf;
                /*
                 * The URB id is a totally opaque value; do we really need to 
-                * converte it to the reading host's byte order???
+                * convert it to the reading host's byte order???
                 */
                if (hdr->caplen < 8)
-                       return 0;
+                       return (res);
                uhdr->id = SWAPLL(uhdr->id);
                if (hdr->caplen < 14)
-                       return 0;
+                       return (res);
                uhdr->bus_id = SWAPSHORT(uhdr->bus_id);
                if (hdr->caplen < 24)
-                       return 0;
+                       return (res);
                uhdr->ts_sec = SWAPLL(uhdr->ts_sec);
                if (hdr->caplen < 28)
-                       return 0;
+                       return (res);
                uhdr->ts_usec = SWAPLONG(uhdr->ts_usec);
                if (hdr->caplen < 32)
-                       return 0;
+                       return (res);
                uhdr->status = SWAPLONG(uhdr->status);
                if (hdr->caplen < 36)
-                       return 0;
+                       return (res);
                uhdr->urb_len = SWAPLONG(uhdr->urb_len);
                if (hdr->caplen < 40)
-                       return 0;
+                       return (res);
                uhdr->data_len = SWAPLONG(uhdr->data_len);
        }
-       return (0);
+
+       return (res);
 }
 
 /*
@@ -1293,19 +1540,47 @@
                 * out of the loop without having read any packets, and
                 * return the number of packets we've processed so far.
                 */
-               if (p->break_loop) {
+               if (p->break_loop > 0) {
                        if (n == 0) {
                                p->break_loop = 0;
-                               return (-2);
+                               return (SFERR_ABORT);
                        } else
                                return (n);
                }
 
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+               /*
+                * If corrupt savefile recovery is taking place (negative
+                * break_loop value), return cached special error status
+                * first, and on next call, process first recovered packet.
+                */
+               if (p->break_loop < 0) {
+                       if (p->break_loop != -1) {
+                               status = p->break_loop;
+                               p->break_loop = -1;
+                               return (status);
+                       }
+                       p->break_loop = 0;
+                       h = p->pcap_header;
+               } else {
+#endif
+
                status = sf_next_packet(p, &h, p->buffer, p->bufsize);
                if (status) {
-                       if (status == 1)
-                               return (0);
-                       return (status);
+                       /* if at EOF, return any count of packets first */
+                       if (status == SFERR_EOF)
+                               return (n);
+                       if (status == SFERR_FATAL)
+                               return (status);
+#ifdef CORRUPT_SAVEFILE_RECOVERY
+                       /* status is SFERR_RECOVER/CLEAN; save next packet */
+                       p->break_loop = status;
+                       p->pcap_header = h;
+                       if (n)
+                               return (n);
+                       continue;
+               }
+#endif
                }
 
                if ((fcode = p->fcode.bf_insns) == NULL ||
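
The deferred-status handling added to pcap_offline_read() above can be illustrated in isolation. The following is a minimal sketch, not the libpcap code: toy_reader, toy_next(), toy_read() and the TOY_* values are hypothetical names invented for this illustration, and the `pending` field plays the role that negative break_loop values play in the patch. It shows the same ordering: packets already counted are returned first, the recovery status is returned by the next call, and the packet found during recovery is delivered by the call after that.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch only (not libpcap's values). */
enum { TOY_OK = 0, TOY_EOF = 1, TOY_RECOVERED = -3 };

/*
 * Hypothetical input: each item is a packet id (> 0); a negative id
 * means "this packet was only found after recovering from damage".
 */
struct toy_reader {
    const int *items;
    size_t count;
    size_t pos;
    int pending;        /* 0: nothing deferred; -1: status already reported,
                           saved packet still waiting; any other negative
                           value: status still to be reported */
    int saved_packet;   /* packet found during recovery */
};

static int toy_next(struct toy_reader *r, int *pkt)
{
    int item;

    if (r->pos >= r->count)
        return TOY_EOF;
    item = r->items[r->pos++];
    if (item < 0) {
        *pkt = -item;
        return TOY_RECOVERED;
    }
    *pkt = item;
    return TOY_OK;
}

/*
 * Store up to max packets in out[].  Returns the number stored, or a
 * negative status when a deferred recovery status is due.
 */
int toy_read(struct toy_reader *r, int *out, int max)
{
    int n = 0;

    while (n < max) {
        int pkt = 0, status;

        if (r->pending < 0) {
            if (r->pending != -1) {
                /* report the cached status exactly once */
                status = r->pending;
                r->pending = -1;
                return status;
            }
            /* status was reported; now deliver the saved packet */
            r->pending = 0;
            pkt = r->saved_packet;
        } else {
            status = toy_next(r, &pkt);
            if (status == TOY_EOF)
                return n;
            if (status != TOY_OK) {
                /* cache status and packet; hand back any count first */
                r->pending = status;
                r->saved_packet = pkt;
                if (n)
                    return n;
                continue;
            }
        }
        out[n++] = pkt;
    }
    return n;
}
```

With the input {1, 2, -3, 4}, successive toy_read() calls return 2, then TOY_RECOVERED, then 2 (packets 3 and 4), then 0 at end of input. Because -1 is used as the "status already reported" marker, the cached status itself must never be -1; in this sketch it never is.
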

Index: tcpdump.1
===================================================================
RCS file: /tcpdump/master/tcpdump/tcpdump.1,v
retrieving revision 1.183
diff -u -r1.183 tcpdump.1
--- tcpdump.1   11 Mar 2007 04:38:19 -0000      1.183
+++ tcpdump.1   4 Apr 2007 13:32:04 -0000
@@ -29,7 +29,7 @@
 .na
 .B tcpdump
 [
-.B \-AdDefKlLnNOpqRStuUvxX
+.B \-AdDefkKlLnNOpqRStuUvxX
 ] [
 .B \-c
 .I count
@@ -396,6 +396,12 @@
 .I interface
 argument.
 .TP
+.B \-k
+Continue reading savefile data if corruption is encountered and recovery is
+possible (requires libpcap with corrupt savefile recovery enabled).  This can
+allow more complete processing of concatenated savefiles, or of savefiles where
+some data was not written (e.g. because the filesystem filled up).
+.TP
 .B \-K
 Don't attempt to verify TCP checksums.  This is useful for interfaces
 that perform the TCP checksum calculation in hardware; otherwise,
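
For the concatenated-savefile case mentioned in the -k description above, the check on the file header "magic" can be sketched as follows. This is a minimal illustration, not the code in the patch: check_embedded_magic() and the enum names are hypothetical. The only facts it relies on are that a classic pcap file begins with the 32-bit value 0xa1b2c3d4, and that a file written on a host of the opposite byte order shows up as 0xd4c3b2a1 when read natively. The first 32-bit word of the record that was expected to be a packet header (normally the timestamp seconds) is compared against both.

```c
#include <stdint.h>
#include <string.h>

#define PCAP_MAGIC         0xa1b2c3d4u  /* classic pcap file magic */
#define PCAP_MAGIC_SWAPPED 0xd4c3b2a1u  /* same magic, opposite byte order */

/* Hypothetical result type for this sketch. */
enum embedded_magic {
    MAGIC_NONE = 0,     /* ordinary packet header, or unknown damage */
    MAGIC_NATIVE = 1,   /* embedded file header in host byte order */
    MAGIC_SWAPPED = 2   /* embedded file header in the other byte order */
};

/*
 * rec points at the bytes that were read where a packet header was
 * expected; at least 4 bytes must be available.
 */
enum embedded_magic check_embedded_magic(const unsigned char *rec)
{
    uint32_t word;

    /* memcpy avoids alignment problems with arbitrary buffers */
    memcpy(&word, rec, sizeof word);

    if (word == PCAP_MAGIC)
        return MAGIC_NATIVE;
    if (word == PCAP_MAGIC_SWAPPED)
        return MAGIC_SWAPPED;
    return MAGIC_NONE;
}
```

A reader that gets MAGIC_NATIVE or MAGIC_SWAPPED would then treat the record as the start of a new file header and adjust its byte-swapping state before reading the next packet header. Interpreted as timestamps, both magic values lie decades in the future, so a false match on a genuine packet header is unlikely for current captures.
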
Index: tcpdump.c
===================================================================
RCS file: /tcpdump/master/tcpdump/tcpdump.c,v
retrieving revision 1.269
diff -u -r1.269 tcpdump.c
--- tcpdump.c   5 May 2006 23:13:01 -0000       1.269
+++ tcpdump.c   4 Apr 2007 13:32:04 -0000
@@ -489,6 +489,7 @@
        pcap_if_t *devpointer;
        int devnum;
 #endif
+       int kflag = 0;
        int status;
 #ifdef WIN32
        u_int UserBufferSize = 1000000;
@@ -523,7 +524,7 @@
 
        opterr = 0;
        while (
-           (op = getopt(argc, argv, "aA" B_FLAG "c:C:d" D_FLAG "eE:fF:G:i:KlLm:M:nNOpqr:Rs:StT:u" U_FLAG "vw:W:xXy:Yz:Z:")) != -1)
+           (op = getopt(argc, argv, "aA" B_FLAG "c:C:d" D_FLAG "eE:fF:G:i:kKlLm:M:nNOpqr:Rs:StT:u" U_FLAG "vw:W:xXy:Yz:Z:")) != -1)
                switch (op) {
 
                case 'a':
@@ -668,6 +669,10 @@
 #endif /* WIN32 */
                        break;
 
+               case 'k':
+                       ++kflag;
+                       break;
+
                case 'K':
                        ++Kflag;
                        break;
@@ -1110,36 +1115,38 @@
                (void)fflush(stderr);
        }
 #endif /* WIN32 */
-       status = pcap_loop(pd, cnt, callback, pcap_userdata);
-       if (WFileName == NULL) {
-               /*
-                * We're printing packets.  Flush the printed output,
-                * so it doesn't get intermingled with error output.
-                */
-               if (status == -2) {
+       do {
+               status = pcap_loop(pd, cnt, callback, pcap_userdata);
+               if (WFileName == NULL) {
                        /*
-                        * We got interrupted, so perhaps we didn't
-                        * manage to finish a line we were printing.
-                        * Print an extra newline, just in case.
+                        * We're printing packets.  Flush the printed output,
+                        * so it doesn't get intermingled with error output.
                         */
-                       putchar('\n');
+                       if (status == -2) {
+                               /*
+                                * We got interrupted, so perhaps we didn't
+                                * manage to finish a line we were printing.
+                                * Print an extra newline, just in case.
+                                */
+                               putchar('\n');
+                       }
+                       (void)fflush(stdout);
                }
-               (void)fflush(stdout);
-       }
-       if (status == -1) {
-               /*
-                * Error.  Report it.
-                */
-               (void)fprintf(stderr, "%s: pcap_loop: %s\n",
-                   program_name, pcap_geterr(pd));
-       }
-       if (RFileName == NULL) {
-               /*
-                * We're doing a live capture.  Report the capture
-                * statistics.
-                */
-               info(1);
-       }
+               if (status == -1 || status == -3 || status == -4) {
+                       /*
+                        * Error.  Report it.
+                        */
+                       (void)fprintf(stderr, "%s: pcap_loop: %s\n",
+                           program_name, pcap_geterr(pd));
+               }
+               if (RFileName == NULL) {
+                       /*
+                        * We're doing a live capture.  Report the capture
+                        * statistics.
+                        */
+                       info(1);
+               }
+       } while ((kflag && status == -3) || status == -4);
        pcap_close(pd);
        exit(status == -1 ? 1 : 0);
 }
@@ -1571,7 +1578,7 @@
 #endif /* WIN32 */
 #endif /* HAVE_PCAP_LIB_VERSION */
        (void)fprintf(stderr,
-"Usage: %s [-aAd" D_FLAG "efKlLnNOpqRStu" U_FLAG "vxX]" B_FLAG_USAGE " [-c count] [ -C file_size ]\n", program_name);
+"Usage: %s [-aAd" D_FLAG "efkKlLnNOpqRStu" U_FLAG "vxX]" B_FLAG_USAGE " [-c count] [ -C file_size ]\n", program_name);
        (void)fprintf(stderr,
 "\t\t[ -E algo:secret ] [ -F file ] [ -G seconds ] [ -i interface ]\n");
        (void)fprintf(stderr,
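
A note on the do/while condition at the end of main() in the tcpdump.c diff above: in C, && binds more tightly than ||, so the condition is evaluated as (kflag && status == -3) || status == -4. In other words, status -4 restarts pcap_loop() whether or not -k was given, while status -3 does so only with -k. The sketch below spells out that reading next to the alternative in which -k gates both statuses. The function names are hypothetical, and which of the two behaviours is wanted is a design choice rather than something this sketch can settle.

```c
/*
 * How the patch's loop condition is parsed by the compiler:
 * status -4 always retries, status -3 retries only with -k.
 */
int retry_as_parsed(int kflag, int status)
{
    return (kflag && status == -3) || status == -4;
}

/*
 * Alternative reading: -k is required before retrying on either
 * recovery status.
 */
int retry_gated_by_k(int kflag, int status)
{
    return kflag && (status == -3 || status == -4);
}
```

The two functions differ only for kflag == 0 and status == -4: the first returns 1 and the second returns 0.
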

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.
