tcpdump mailing list archives

Re: DLT value for IP over IB (Infiniband)


From: Darren Reed <darren.reed () oracle com>
Date: Thu, 14 Jul 2011 14:23:22 +0200

Some more follow up on this...

Looks are deceiving - there is no RFC 4391/4392 header being prepended to the IP packet:
/*
 * In order to transmit the datagram to correct destination, an extra
 * header including destination address is required. IB does not provide an
 * interface for sending a link layer header directly to the IB link and the
 * link layer header received from the IB link is missing information that
 * GLDv3 requires. So mac_ib plugin defines a "soft" header as below.
 */
(From a header file on Solaris)
The above probably explains this output on Linux:

# tcpdump -i ib0 -vv -e -c 5
tcpdump: WARNING: arptype 32 not supported by libpcap - falling back to cooked socket
tcpdump: listening on ib0, link-type LINUX_SLL (Linux cooked), capture size 96 bytes
02:29:13.106468 In ethertype IPv4 (0x0800), length 100: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: ICMP (1), length: 84) 10.0.0.201 > 10.0.0.203: ICMP echo request, id 30252, seq 1, length 64
02:29:13.204757 Out ethertype IPv4 (0x0800), length 100: (tos 0x0, ttl 64, id 32408, offset 0, flags [none], proto: ICMP (1), length: 84) 10.0.0.203 > 10.0.0.201: ICMP echo reply, id 30252, seq 1, length 64
02:29:14.106472 In ethertype IPv4 (0x0800), length 100: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: ICMP (1), length: 84) 10.0.0.201 > 10.0.0.203: ICMP echo request, id 30252, seq 2, length 64

... and which ultimately means there is not likely to ever be a real link layer header presented for outbound packets. For inbound, it would seem that it is implementation dependent.
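
For reference, the "Linux cooked" (LINUX_SLL) pseudo-header that libpcap falls back to is 16 bytes: packet type, ARPHRD type, address length, up to 8 bytes of address, and protocol. The sketch below decodes one; the function name parse_sll and the sample bytes are made up for illustration and are not taken from the capture above.

```python
import struct

ARPHRD_INFINIBAND = 32   # the "arptype 32" in the tcpdump warning
SLL_HDR_LEN = 16         # size of the cooked-capture pseudo-header
SLL_ADDR_LEN = 8         # the address field holds at most 8 bytes


def parse_sll(buf):
    """Decode a LINUX_SLL pseudo-header (hypothetical helper)."""
    pkttype, hatype, halen = struct.unpack_from("!HHH", buf, 0)
    addr = buf[6:6 + min(halen, SLL_ADDR_LEN)]
    (protocol,) = struct.unpack_from("!H", buf, 14)
    return {
        "pkttype": pkttype,      # 0 = sent to us, 4 = sent by us
        "hatype": hatype,        # ARPHRD_* value of the interface
        "halen": halen,
        "addr": addr,
        "protocol": protocol,    # ethertype of the payload
        "payload": buf[SLL_HDR_LEN:],
    }


# Synthetic header: inbound packet, InfiniBand arptype, zeroed address, IPv4.
sample = struct.pack("!HHH8sH", 0, ARPHRD_INFINIBAND, 8, bytes(8), 0x0800)
info = parse_sll(sample + b"\x45\x00")
print(info["hatype"], hex(info["protocol"]), len(info["payload"]))
```

A 20-byte IPoIB address cannot fit in the 8-byte address field, so the cooked header cannot carry the full IB link-layer address either.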

What's important to note here is that the frame delivered to tcpdump on both Solaris and Linux is not a copy of that which is transmitted.

In summary, using the DLT_USER# symbols seems like the only reasonable solution.

Darren

Darren Reed wrote:
Some time ago I requested a DLT value for the IP over IB format of IB frames.

The interfaces that I'm using on Solaris appear to be compliant with RFCs 4391
and 4392.

Currently we're using DLT_USER15 internally, but given the alignment with the
RFCs, I feel that this should be changed before I submit changes to libpcap
and tcpdump.

A typical IB frame that I'm currently getting looks like this:

       0x0000:  badd cafe badd cafe badd cafe badd cafe
       0x0010:  badd cafe 8000 0049 fe80 0000 0000 0000
       0x0020:  0021 2800 01a1 1d45 0800 0000 4500 002c
       0x0030:  6741 0000 0101 8732 c0a8 2502 c0a8 250b
       0x0040:  0800 4970 3903 3c4d 0017 d85f cd2f 8164
       0x0050:  0000 1234 ffff ffff

... which I take to mean that the LRH and BTH fields are being
filled in by the adapter and not the stack, as an ARP message
looks a bit more sane:

       0x0000:  0000 0049 fe80 0000 0000 0000 0021 2800
       0x0010:  01a1 1d45 00ff ffff ff10 401b 0000 0000
       0x0020:  0000 0000 ffff ffff 0806 0000 0020 0800
       0x0030:  1404 0002 8000 0049 fe80 0000 0000 0000
       0x0040:  0021 2800 01a1 1d45 c0a8 250c 8000 004c
       0x0050:  fe80 0000 0000 0000 0021 2800 01a1 1d7d
       0x0060:  c0a8 2501

but I'm still not 100% sure that it is 100% filled out.
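
In both dumps above the ethertype sits at byte offset 40 (0x28), preceded by two 20-byte fields and followed by two zero bytes. The sketch below splits the frames on that basis. The layout is only inferred from these two dumps, and the name split_soft_header is made up for illustration; it is not the Solaris structure definition.

```python
import struct

# Hex dumps copied from the message (offset column dropped).
ICMP_FRAME = bytes.fromhex(
    "badd cafe badd cafe badd cafe badd cafe "
    "badd cafe 8000 0049 fe80 0000 0000 0000 "
    "0021 2800 01a1 1d45 0800 0000 4500 002c "
    "6741 0000 0101 8732 c0a8 2502 c0a8 250b "
    "0800 4970 3903 3c4d 0017 d85f cd2f 8164 "
    "0000 1234 ffff ffff"
)
ARP_FRAME = bytes.fromhex(
    "0000 0049 fe80 0000 0000 0000 0021 2800 "
    "01a1 1d45 00ff ffff ff10 401b 0000 0000 "
    "0000 0000 ffff ffff 0806 0000 0020 0800 "
    "1404 0002 8000 0049 fe80 0000 0000 0000 "
    "0021 2800 01a1 1d45 c0a8 250c 8000 004c "
    "fe80 0000 0000 0000 0021 2800 01a1 1d7d "
    "c0a8 2501"
)

ADDR_LEN = 20                      # assumed: one 20-byte IPoIB address per field
TYPE_OFFSET = 2 * ADDR_LEN         # ethertype observed at offset 40 in both dumps
PAYLOAD_OFFSET = TYPE_OFFSET + 4   # 2-byte ethertype + 2 zero bytes


def split_soft_header(frame):
    """Split a frame into two address fields, ethertype, reserved, payload."""
    field_a = frame[:ADDR_LEN]
    field_b = frame[ADDR_LEN:TYPE_OFFSET]
    ethertype, reserved = struct.unpack_from("!HH", frame, TYPE_OFFSET)
    return field_a, field_b, ethertype, reserved, frame[PAYLOAD_OFFSET:]


for name, frame in (("ICMP", ICMP_FRAME), ("ARP", ARP_FRAME)):
    a, b, etype, _, payload = split_soft_header(frame)
    print(f"{name}: ethertype=0x{etype:04x}, payload={len(payload)} bytes")
    print(f"  first 20 bytes:  {a.hex()}")
    print(f"  second 20 bytes: {b.hex()}")
```

In the ICMP frame the first field is entirely the badd cafe filler pattern, whereas both fields of the ARP frame hold address-like data.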

The printing out of packet data by snoop (with internal changes)
ignores everything prior to the ethertype field:

IPIB:  ----- IPIB Header -----
IPIB:
IPIB:  Packet 1 arrived at 10:32:21.60
IPIB:  Packet size = 48 bytes
IPIB:  Ethertype = 0800 (IP)
IPIB:
IP:   ----- IP Header -----
IP:
IP:   Version = 4
IP:   Header length = 20 bytes
IP:   Type of service = 0x00
IP:         xxx. .... = 0 (precedence)
IP:         ...0 .... = normal delay
IP:         .... 0... = normal throughput
IP:         .... .0.. = normal reliability
IP:         .... ..0. = not ECN capable transport
IP:         .... ...0 = no ECN congestion experienced
IP:   Total length = 44 bytes
IP:   Identification = 48929
IP:   Flags = 0x0
IP:         .0.. .... = may fragment
IP:         ..0. .... = last fragment
IP:   Fragment offset = 0 bytes
IP:   Time to live = 1 seconds/hops
IP:   Protocol = 1 (ICMP)
IP:   Header checksum = 34fa
IP:   Source address = 192.168.37.12, 192.168.37.12
IP:   Destination address = 224.0.0.1, all-systems.mcast.net
IP:   No options
IP:
ICMP:  ----- ICMP Header -----
ICMP:
ICMP:  Type = 8 (Echo request)
ICMP:  Code = 0 (ID: 28163 Sequence number: 15497)
ICMP:  Checksum = 7555
ICMP:


          0: 0800 0000 4500 002c bf21 0000 0101 34fa    ....E..,.!....4.
         16: c0a8 250c e000 0001 0800 7555 6e03 3c89    ..%.......uUn.<.
         32: 0000 0000 91e4 efc0 0000 5678 ffff ffff    ..........Vx....
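
As a sanity check on the snoop dump above, the sketch below treats the first four bytes as ethertype plus two zero bytes, then verifies the IPv4 header checksum and the ICMP checksum over the remaining bytes. The helper name ones_complement_sum is made up for illustration; the offsets are read off this one dump only.

```python
import struct

# The 48 bytes shown by snoop (offset and ASCII columns dropped).
SNOOP_DUMP = bytes.fromhex(
    "0800 0000 4500 002c bf21 0000 0101 34fa "
    "c0a8 250c e000 0001 0800 7555 6e03 3c89 "
    "0000 0000 91e4 efc0 0000 5678 ffff ffff"
)


def ones_complement_sum(data):
    """16-bit one's-complement sum, as used by the IP and ICMP checksums."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


(ethertype,) = struct.unpack_from("!H", SNOOP_DUMP, 0)
ip_header = SNOOP_DUMP[4:24]     # 20-byte IPv4 header, no options
icmp_message = SNOOP_DUMP[24:]   # remainder of the 44-byte datagram

# A header or message that includes a correct checksum sums to 0xffff.
ip_ok = ones_complement_sum(ip_header) == 0xFFFF
icmp_ok = ones_complement_sum(icmp_message) == 0xFFFF
print(hex(ethertype), ip_ok, icmp_ok)
```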

Darren

-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.
