Bugtraq mailing list archives
Subtle data corruption of TCP streams
From: wietse () PORCUPINE ORG (Wietse Venema)
Date: Wed, 22 Mar 2000 21:17:57 -0500
This note is about a subtle data corruption problem with TCP data streams that may bite people as more and more (LINUX) systems are sending network traffic with TCP-level options turned on.

Last week, several Postfix users reported mail delivery failures because sequences of control characters (for example, ^A^A^H) were being inserted into their SMTP connections, resulting in SMTP protocol errors and non-delivery of email.

These data corruption problems are not host specific: they are observed with both Linux and BSD/OS systems, and with mail sent to and/or received from systems running Postfix, Sendmail and qmail.

Over the weekend of March 18, 2000, a few people left tcpdump running on their machines, in order to record some of these corrupted SMTP sessions. This note is based on an analysis of that data.

The corruption appears to be caused by a buggy traffic manipulation scheme that plays games with TCP acknowledgements.

It sounds like a great argument for more deployment of IPSEC, which is designed to prevent modification or insertion of traffic in transit; but it also illustrates the conflict that some have with IPSEC, because it prevents them from doing any traffic manipulation at all. See also draft-ietf-pilc-pep-02.txt (performance enhancing proxies) for a discussion of well-intended TCP traffic manipulation techniques.

	Wietse

The problem in a nutshell
=========================

The problem is with "extra" ACK packets that are generated by some helpful intermediate routers. Under some conditions involving retransmission and/or packets arriving out of order, such routers copy a real ACK packet from an end system, turn the copied ACK around by swapping source and destination etc. fields, and send it off.

The problem happens when, by mistake, TCP option bytes from the original ACK packet are sent as DATA bytes in the copied ACK packet. This corrupts the TCP data stream, because the bogus data is sent in a packet with correct IP and TCP header checksums. The fact that the next TCP data will overlay the bogus data does not prevent the bogus data from being passed up to the application.

Example of data corruption
==========================

What follows is a fragment of a corrupted SMTP session, one of several dozen sessions that were recorded at both endpoints of the connections. The recordings are available via FTP (see pointers at the end).

The first figure shows an ACK packet sent by the SMTP server. The figure shows one line of tcpdump output (folded for readability), followed by an annotated version of the packet. The annotation identifies 20 bytes of IP header fields, 20 bytes of TCP header fields, and 12 bytes of TCP header options.

    12:28:37.051883 195.52.11.4.25 > 194.25.134.80.1730: . ack 86
        win 32120 <nop,nop,timestamp 1105397 766737219> (DF)

    IP_HDR   45  00  00  34  52  2f  40  00  40  06
             vhl tos len len id  id  off off ttl pro
    IP_HDR   d1  f2  c3  34  0b  04  c2  19  86  50
             sum sum src src src src dst dst dst dst
    TCP_HDR  00  19  06  c2  f5  22  60  dd  f4  ce
             src src dst dst seq seq seq seq ack ack
    TCP_HDR  fc  e1  80  10  7d  78  0d  1a  00  00
             ack ack off flg win win sum sum urp urp
    TCP_OPT  01  01  08  0a  00  10  dd  f5  2d  b3
             opt opt opt opt opt opt opt opt opt opt
    TCP_OPT  7b  43
             opt opt

The second figure shows an "extra ACK" packet that was generated by an intermediate router, not by an end system (it shows up only in the tcpdump recording of the receiving system). Note that the "extra ACK" has the same 0x522f IDENT field in the IP header as the preceding packet.

The "extra ACK" has the same 12 bytes of TCP options as the preceding packet. However, the TCP options are by mistake sent as data, so they are read by the application as ^A^A^H...

    12:28:37.056438 194.25.134.80.1730 > 195.52.11.4.25: . 86:98(12)
        ack 112 win 2920 (DF)

    IP_HDR   45  00  00  34  52  2f  40  00  3c  06
             vhl tos len len id  id  off off ttl pro
    IP_HDR   d5  f2  c2  19  86  50  c3  34  0b  04
             sum sum src src src src dst dst dst dst
    TCP_HDR  06  c2  00  19  f4  ce  fc  e1  f5  22
             src src dst dst seq seq seq seq ack ack
    TCP_HDR  60  d5  50  10  0b  68  af  32  00  00
             ack ack off flg win win sum sum urp urp
    DATA     01  01  08  0a  00  10  dd  f5  2d  b3
             ^A  ^A  ^H  ^J  ^@  ^P  dd  f5  -   b3
    DATA     7b  43
             {   C

Note that the ACK with bogus data is sent towards the host that sent the original ACK with TCP option bytes.

Turning off TCP options would prevent this corruption from happening. However, turning off TCP options in the local system would solve only half the problem. When a remote system connects to the local system, and the remote system has TCP options turned on, the connection can still suffer from the type of corruption shown above.

Packet arrival time analysis
============================

As discussed above, some intermediate systems generate an "extra ACK" by cloning a real ACK packet. They modify the cloned ACK by swapping source and destination fields etc., then send it off.

By measuring the time differences between sending the original ACK and receiving the cloned ACK it is possible to narrow down the router responsible for the data corruption.

By playing games with tools such as traceroute, ping and mtr (http://www.bitwizard.nl/mtr/) it is possible to further identify the source of a problem. Getting the problem fixed is another matter, of course.

More details
============

A more extensive version of this note, with tcpdump recordings of corrupted SMTP sessions, and with tools used for the analysis of those recordings is available via FTP:

    ftp://ftp.porcupine.org/pub/debugging/ack-corruption.tar.gz
    ftp://ftp.porcupine.org/pub/debugging/ack-corruption.tar.gz.sig
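
The two packets in the example can be cross-checked by hand or with a few lines of code. What follows is a minimal Python sketch, not one of the tools from the FTP archive. It takes the bytes exactly as transcribed in the two figures, verifies the IP and TCP checksums of both packets, and compares the data bytes of the "extra ACK" with the option bytes of the original ACK. The names ones_complement_sum, split_packet and checksums_ok are made up for this illustration, and the sketch assumes plain IPv4/TCP packets without link-layer headers.

```python
import struct

# Bytes transcribed from the two annotated figures (IP header, TCP header, rest).
ORIGINAL_ACK = bytes.fromhex(
    "45 00 00 34 52 2f 40 00 40 06 d1 f2 c3 34 0b 04 c2 19 86 50 "
    "00 19 06 c2 f5 22 60 dd f4 ce fc e1 80 10 7d 78 0d 1a 00 00 "
    "01 01 08 0a 00 10 dd f5 2d b3 7b 43"
)
EXTRA_ACK = bytes.fromhex(
    "45 00 00 34 52 2f 40 00 3c 06 d5 f2 c2 19 86 50 c3 34 0b 04 "
    "06 c2 00 19 f4 ce fc e1 f5 22 60 d5 50 10 0b 68 af 32 00 00 "
    "01 01 08 0a 00 10 dd f5 2d b3 7b 43"
)


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum as used by the IP and TCP checksums."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def split_packet(pkt: bytes):
    """Return (ip_header, tcp_segment, tcp_options, tcp_payload)."""
    ihl = (pkt[0] & 0x0F) * 4          # IP header length in bytes
    ip, tcp = pkt[:ihl], pkt[ihl:]
    data_offset = (tcp[12] >> 4) * 4   # TCP header length, options included
    return ip, tcp, tcp[20:data_offset], tcp[data_offset:]


def checksums_ok(pkt: bytes) -> bool:
    """True when both the IP header and the TCP checksum verify."""
    ip, tcp, _, _ = split_packet(pkt)
    # TCP pseudo-header: source, destination, zero, protocol, TCP length.
    pseudo = ip[12:20] + struct.pack("!BBH", 0, ip[9], len(tcp))
    return (ones_complement_sum(ip) == 0xFFFF
            and ones_complement_sum(pseudo + tcp) == 0xFFFF)


_, _, orig_options, orig_payload = split_packet(ORIGINAL_ACK)
_, _, extra_options, extra_payload = split_packet(EXTRA_ACK)

print("original ACK checksums valid:", checksums_ok(ORIGINAL_ACK))
print("extra ACK checksums valid:   ", checksums_ok(EXTRA_ACK))
print("extra ACK data equals original options:", extra_payload == orig_options)
```

With the bytes as shown in the figures, both packets pass both checksum tests, and the 12 data bytes of the "extra ACK" are identical to the 12 option bytes of the original ACK. This is why the receiving TCP has no reason to discard the bogus data.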