tcpdump mailing list archives

Re: [long] "bad cksum 0!" on AIX 5 over loopback


From: Guy Harris <guy () alum mit edu>
Date: Wed, 2 Jul 2003 16:33:29 -0700


On Wednesday, July 2, 2003, at 3:09 PM, alex medvedev wrote:

lo0 was reported as a non-valid interface type, so i
added/modified the following libpcap code to include
the loopback interface type on AIX:
----------------------------------------------------------
pcap-bpf.c in pcap_open_live() added:
        case IFT_LOOP: /* to define interface type */
                v = DLT_LOOP;
                break;

That would be the correct thing to do only if the link-layer header on AIX loopback devices was the same as the link-layer header on OpenBSD loopback devices - i.e., that every packet start with a 4-byte big-endian PF_ protocol family value - because DLT_LOOP is for use with OpenBSD loopback devices.
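To make the distinction concrete, here is a minimal sketch of reading the 4-byte family word both ways. The function names are made up for illustration and are not libpcap or tcpdump code; the only assumptions are that DLT_LOOP headers are always big-endian and that DLT_NULL headers are in the byte order of the capturing host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the family word as big-endian (DLT_LOOP layout). */
static uint32_t family_big_endian(const unsigned char *pkt)
{
    assert(pkt != NULL);
    return ((uint32_t)pkt[0] << 24) | ((uint32_t)pkt[1] << 16) |
           ((uint32_t)pkt[2] << 8)  |  (uint32_t)pkt[3];
}

/* Hypothetical helper: read the family word in host byte order (DLT_NULL layout). */
static uint32_t family_host_order(const unsigned char *pkt)
{
    uint32_t v;

    assert(pkt != NULL);
    memcpy(&v, pkt, sizeof v);  /* memcpy avoids an unaligned load */
    return v;
}
```

On a big-endian machine the two reads give the same value; on a little-endian machine they differ, and that difference is what separates the two DLT_ values.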

gencode.c in init_linktype() modified:
        case DLT_NULL:
        case DLT_LOOP: /* to define link type */
                /* off_linktype = 0; */ /* should be -1 in my case */
                off_linktype = -1; /* because there's no encapsulation */

That would be the correct thing to do only if BSD loopback devices on all the BSDs supplied no link-layer header whatsoever; as that's not true on any of the BSDs that I know of, that's not the correct thing to do, and breaks libpcap on BSDs when capturing on loopback devices.

pcap-bpf.c in pcap_read() added:
        case EFAULT:  /* to keep it from dying after few packets */
                goto again;

That would be the correct thing to do only if you were adding AIX support to a version of libpcap other than the current CVS version, as the current CVS version already does that. It also does a number of *other* things for AIX that you haven't mentioned in your patch, so if you're going to work with libpcap on AIX, you should be using the current CVS version, available via anonymous CVS (see the instructions on http://www.tcpdump.org/) or in the "Current Tar files" downloadable from http://www.tcpdump.org/ (those are produced nightly from the CVS repository).

The current CVS version of libpcap maps IFT_LOOP to DLT_NULL. If that doesn't work correctly, then it needs to map IFT_LOOP to something else - the "off_linktype" for DLT_NULL and DLT_LOOP should *NOT* be changed as it breaks the generation of BPF filter code for the BSDs.

what i noticed immediately is that tcp checksums are not calculated
correctly with ipv4.
on linux it works fine, but linux has diff loopback encapsulation.
here is a sample session on AIX 5.2 with latest fixes [see cksums]:

# ./tcpdump -i lo0 -v
tcpdump: listening on lo0

SIMPLE PING:
15:48:34.207455395 localhost > localhost:
icmp: echo request (ttl 255, id 1112, len 84, bad cksum 0!)
15:48:34.207522333 localhost > localhost:
icmp: echo reply (ttl 255, id 1113, len 84, bad cksum 0!)

IPV6 PING:
15:48:47.102841773 ::1 > ::1: icmp6: echo request (len 64, hlim 255)
15:48:47.102931617 ::1 > ::1: icmp6: echo reply (len 64, hlim 255)

TELNET SESSION (FRAGMENT):
16:09:57.748625107 localhost.telnet >
localhost.33010: F [bad tcp cksum a26c!] 79:79(0) ack 47
win 33688 <nop,nop,timestamp 1057604598 1057604598> (ttl 60, id 3819, len
52, bad cksum 0!)
16:09:57.748736461 localhost.33010 >
localhost.telnet: . [bad tcp cksum a286!] ack 80 win 33662
<nop,nop,timestamp 1057604598 1057604598> [tos 0x10]  (ttl 60, id 3820,
len 52, bad cksum 0!)
16:09:57.751140007 localhost.33010 >
localhost.telnet: F [bad tcp cksum a285!] 47:47(0) ack 80
win 33662 <nop,nop,timestamp 1057604598 1057604598> [tos 0x10] (ttl 60,
id 3821, len 52, bad cksum 0!)
16:09:57.751240534 localhost.telnet >
localhost.33010: . [bad tcp cksum a26b!] ack 48 win 33688
<nop,nop,timestamp 1057604598 1057604598> (ttl 60, id 3822, len 52, bad
cksum 0!)

questions:
- what can be done about fixing the cksums?
  apparently i am not doing smth correctly with incorporating loopback;
  maybe off_nl/off_nl_nosnap numbers wrong? although i doubt that;

Changing those values will make no difference whatsoever to the checksums. The checksums are checked by tcpdump, which does not use the off_ values - it has its *own* code to dissect link-layer headers and skip past link-layer and network-layer headers.

If the link-layer header for AIX loopback devices weren't 4 bytes long, then you would have to use some other DLT_ type - perhaps by adding a new type, if the link-layer header doesn't match an existing link-layer type - and change both libpcap *AND* tcpdump to handle it.

However, if the link-layer header for AIX loopback devices weren't 4 bytes long, tcpdump would probably not correctly parse the IP or TCP headers; given that it's printing "localhost" for source and destination IP addresses, and "telnet" for the destination port, it probably *is* parsing the IP and TCP headers correctly, and therefore the AIX loopback device link-layer header is 4 bytes long.
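The same reasoning can be turned into a quick sanity check on captured bytes. This is only a sketch: EX_LOOP_HDRLEN and ipv4_starts_at() are hypothetical names, not anything in tcpdump, and the check looks only at the first byte of the presumed IPv4 header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constant: the loopback link-layer header length being tested. */
#define EX_LOOP_HDRLEN 4

/*
 * Return 1 if the bytes after a link-layer header of hdrlen bytes look like
 * the start of an IPv4 header (version 4, header length of at least 5 words).
 */
static int ipv4_starts_at(const unsigned char *pkt, size_t caplen, size_t hdrlen)
{
    unsigned version, ihl;

    assert(pkt != NULL);
    if (caplen < hdrlen + 20)   /* not enough data for a minimal IPv4 header */
        return 0;
    version = pkt[hdrlen] >> 4;
    ihl = pkt[hdrlen] & 0x0f;
    return version == 4 && ihl >= 5;
}
```

If ipv4_starts_at(pkt, caplen, EX_LOOP_HDRLEN) holds for the captured packets and fails for other offsets, that is consistent with a 4-byte header, although it doesn't say what those 4 bytes contain.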

If that header contains a PF_ type value, then DLT_NULL is the right link-layer type, and off_linktype should be 0, not -1, just as it's 0 for DLT_NULL on BSD. (If the RS/6000 version of AIX runs on any little-endian machines as well, and the PF_ type were big-endian on those machines, DLT_LOOP would be the correct type, but I don't think they're supporting that particular version of AIX on any little-endian machines.)

I suspect the problem might be that:

        IBM decided that checksumming packets on a loopback device is pointless (it might detect some bugs in the networking code that corrupt packets, but loopback packets don't actually get sent over any network wire, so it's not as if there's anything to check there);

        AIX supports offloading of IP and TCP checksum generation and checking;

so they decided to skip checksumming for the loopback device by marking it as a device that does offloading and not actually having the driver generate or check the checksums, in which case the checksums will be bogus (just as they're bogus when capturing packets being transmitted by the machine running the packet capture program if they're being transmitted on, for example, a gigabit Ethernet interface that's doing checksum offloading).
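For reference, here is a sketch of the standard Internet checksum (the RFC 1071 one's-complement sum) as a checker would apply it to an IPv4 header. The function names are made up for illustration and this is not tcpdump's code. It shows why a header whose checksum field was never filled in fails verification, which would fit the "bad cksum 0!" output above if the value printed there is the checksum field found in the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over len bytes, taken as big-endian 16-bit words. */
static uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)                        /* pad a trailing odd byte with zero */
        sum += (uint32_t)data[len - 1] << 8;
    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* A header that carries a correct checksum sums to zero, checksum field included. */
static int ipv4_header_checksum_ok(const unsigned char *hdr, size_t hlen)
{
    return inet_checksum(hdr, hlen) == 0;
}
```

If the sender leaves the checksum field at 0 and nothing fills it in later, ipv4_header_checksum_ok() returns 0 for that header even though the packet is otherwise intact.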

- why does pcap_read() receive EFAULT while reading from bpf?

As far as I know, nobody outside of IBM has figured that out; if anybody has figured it out, they haven't told us.

  is it fixable?

It's probably fixable if you have AIX kernel source and sufficient time, or the power to get somebody from IBM to look at fixing it.

I don't know whether anybody's figured out something to make it go away completely. Check out the comment in the current CVS version of "pcap-bpf.c", right after the "case EFAULT:" code, and the comment on the "memset()" call to which it refers (look for EFAULT to find both comments).

When this was last discussed on tcpdump-workers, I think I speculated about the cause. A "memset()" of the newly-allocated buffer into which the packets will be copied makes the problem go away either completely or partially, which suggests it *might* be a bug wherein, if the page into which the packet is being copied isn't in memory, the attempt to copy the packet gets an EFAULT rather than causing the page to be faulted in and copied into.
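The general shape of that workaround can be sketched as follows. This is an illustration under assumptions, not the code in "pcap-bpf.c": the function names are made up, and the max_retries bound is an addition here to keep the loop from spinning forever, where the libpcap code is described above as simply doing "goto again".

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Allocate a capture buffer and write to every byte so its pages have been touched. */
static unsigned char *alloc_touched_buffer(size_t size)
{
    unsigned char *buf = malloc(size);

    if (buf != NULL)
        memset(buf, 0, size);
    return buf;
}

/* read() that retries when it fails with EFAULT, at most max_retries times. */
static ssize_t read_retrying_efault(int fd, void *buf, size_t size, int max_retries)
{
    ssize_t n;

    assert(buf != NULL);
    for (;;) {
        n = read(fd, buf, size);
        if (n >= 0)
            return n;           /* got data, or end of file */
        if (errno == EFAULT && max_retries-- > 0)
            continue;           /* the "goto again" case */
        return -1;              /* any other error: errno is left for the caller */
    }
}
```

Whether touching the buffer first actually keeps its pages resident on AIX is exactly the part that is speculation.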

  do i miss any packets when doing "goto again" when receiving EFAULT?

I don't know.

-
This is the TCPDUMP workers list. It is archived at
http://www.tcpdump.org/lists/workers/index.html
To unsubscribe use mailto:tcpdump-workers-request () tcpdump org?body=unsubscribe

