tcpdump mailing list archives

Re: Request for a new LINKTYPE_/DLT_ type.


From: "Dave Barach (dbarach)" <dbarach () cisco com>
Date: Sun, 6 Jan 2019 20:00:35 +0000

Good points. I've updated the spec. It will take a bit of time to propagate, so I've appended the current .md text 
below.

-----Original Message-----
From: Guy Harris <gharris () sonic net> 
Sent: Saturday, January 5, 2019 11:39 PM
To: Dave Barach (dbarach) <dbarach () cisco com>
Cc: tcpdump-workers <tcpdump-workers () lists tcpdump org>
Subject: Re: [tcpdump-workers] Request for a new LINKTYPE_/DLT_ type.

On Dec 29, 2018, at 4:50 AM, Dave Barach (dbarach) <dbarach () cisco com> wrote:

The same packet - with [traced] metadata changes - will appear multiple times as the packet traverses the vpp 
forwarding graph.

The description of the format should probably warn about that, because protocol analyzers that maintain state between 
packets might get confused if multiple instances of the same packet appear in a capture.

Simple example: from the driver layer, an ip4 transit packet will visit ethernet-input, ip4-input[-no-checksum], 
ip4-lookup, ip4-rewrite, interface-output, and the device driver TX node. Each of those visits results in a trace 
record. The dispatch framework traces vectors of packets, so one sees N x trace records from ethernet-input, then N x 
trace records from ip4-input, and so on. Folks typically filter by buffer-index in wireshark, to see what happens to 
one packet in a convenient sequential view.

So an analyzer *could*, in theory, work around this by, for example, treating each node name(?) as a separate flow, 
with a copy of a packet that visited one node as not being related to packets that visited different nodes, so a 
dissector would treat all of the copies of the IPv4 transit packet listed above as separate packets rather than as, for 
example, retransmissions of the same packet, and so that a request at one layer isn't matched with all of the copies of 
a reply that show up.

Limiting stateful analysis to one graph node - "ethernet-input" - ought to "just work..." 

I suppose that you could also suppress all dissection past the IP or maybe transport layer, although if you see 
multiple instances of a TCP segment, the TCP dissector will interpret that as a retransmission unless it knows that 
they're just multiple appearances of the same packet.

The problem here is that a VPP trace is significantly different from a regular network capture, in that it seems to be mainly 
tracing the flow of a packet through the packet processing code on a single machine rather than tracing its flow on a 
network; packet analyzers are more oriented towards the latter.

You don't need to give details of *how* an analyzer should deal with this - different analyzers might choose to do so 
in different ways; just note that this is significantly different from the sort of network traces one might be used to.

------------------------

Graph Dispatcher Pcap Tracing
-----------------------------

The vpp graph dispatcher knows how to capture vectors of packets in pcap
format as they're dispatched. Each pcap record is laid out as follows:

```
    VPP graph dispatch trace record description:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Major Version | Minor Version | NStrings      | ProtoHint     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Buffer index (big endian)                                     |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | VPP graph node name ...     ...               | NULL octet    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Buffer Metadata ... ...                       | NULL octet    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Buffer Opaque ... ...                         | NULL octet    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Buffer Opaque 2 ... ...                       | NULL octet    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | VPP ASCII packet trace (if NStrings > 4)      | NULL octet    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Packet data (up to 16K)                                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

Graph dispatch records comprise a version stamp, an indication of how
many NULL-terminated strings will follow the record header and precede
packet data, and a protocol hint.

The buffer index is an opaque 32-bit cookie which allows consumers of
these data to easily filter/track single packets as they traverse the
forwarding graph.
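
The record layout above can be decoded roughly as in the following sketch. It is illustrative only, not the vpp or Wireshark implementation: the type `vpp_dispatch_record_t` and the function `vpp_dispatch_parse` are hypothetical names, and the code assumes the strings are packed back to back with no padding between them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for one decoded record; names are illustrative. */
typedef struct {
    uint8_t major_version;
    uint8_t minor_version;
    uint8_t n_strings;
    uint8_t proto_hint;
    uint32_t buffer_index;      /* opaque cookie, decoded from big endian */
    const uint8_t *strings;     /* start of the first terminated string */
    const uint8_t *packet_data; /* first octet after the last string */
    size_t packet_len;
} vpp_dispatch_record_t;

/* Returns 0 on success, -1 if the record is truncated or a string
 * is missing its terminating zero octet. */
int
vpp_dispatch_parse(const uint8_t *rec, size_t len, vpp_dispatch_record_t *out)
{
    size_t off = 8; /* 4 header octets + 4 buffer-index octets */
    unsigned i;

    if (len < off)
        return -1;

    out->major_version = rec[0];
    out->minor_version = rec[1];
    out->n_strings = rec[2];
    out->proto_hint = rec[3];
    out->buffer_index = ((uint32_t)rec[4] << 24) | ((uint32_t)rec[5] << 16) |
                        ((uint32_t)rec[6] << 8) | (uint32_t)rec[7];
    out->strings = rec + off;

    /* Skip the claimed number of strings (assumed unpadded). */
    for (i = 0; i < out->n_strings; i++) {
        const uint8_t *nul = memchr(rec + off, 0, len - off);
        if (nul == NULL)
            return -1;
        off = (size_t)(nul - rec) + 1;
    }

    out->packet_data = rec + off;
    out->packet_len = len - off;
    return 0;
}
```

A consumer would call this once per pcap record and then use `buffer_index` as the key when following one packet through the graph.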

Multiple records per packet are normal, and to be expected. Packets
will appear multiple times as they traverse the vpp forwarding
graph. In this way, vpp graph dispatch traces are significantly
different from regular network packet captures from an end-station.
This property complicates stateful packet analysis.

Restricting stateful analysis to records from a single vpp graph node
such as "ethernet-input" seems likely to improve the situation.
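
One way an analyzer might apply that restriction is sketched below. The struct `vpp_record_summary_t` and both functions are hypothetical; the sketch assumes the consumer has already extracted the buffer index and the graph node name from each record.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-record summary kept by a consumer. */
typedef struct {
    uint32_t buffer_index;
    const char *node_name; /* first string of the record */
} vpp_record_summary_t;

/* Nonzero if this record comes from the one node chosen for
 * stateful analysis; all other copies of the packet are skipped. */
int
vpp_record_wants_stateful_analysis(const vpp_record_summary_t *rec,
                                   const char *chosen_node)
{
    return strcmp(rec->node_name, chosen_node) == 0;
}

/* Counts how many records would reach the stateful dissectors. */
size_t
vpp_count_stateful_records(const vpp_record_summary_t *recs, size_t n,
                           const char *chosen_node)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (vpp_record_wants_stateful_analysis(&recs[i], chosen_node))
            count++;
    return count;
}
```

With "ethernet-input" as the chosen node, each packet that enters through that node contributes exactly one record to stateful analysis, however many other nodes it visits afterwards.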

As of this writing: major version = 1, minor version = 0. NStrings
SHOULD be 4 or 5. Consumers SHOULD be wary of values less than 4 or
greater than 5. They MAY attempt to display the claimed number of
strings, or they MAY treat the condition as an error.
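
A consumer's header sanity check could look something like the sketch below. The macro names, flag names, and `vpp_dispatch_check_header` are hypothetical, and flagging any version other than 1.0 as unknown is an assumption of this example, not something the text above requires.

```c
#include <stdint.h>

/* Values taken from the text above; the names are hypothetical. */
#define VPP_DISPATCH_MAJOR_VERSION 1
#define VPP_DISPATCH_MINOR_VERSION 0
#define VPP_DISPATCH_MIN_STRINGS 4
#define VPP_DISPATCH_MAX_STRINGS 5

/* Bit flags describing what looked unusual in a record header. */
typedef enum {
    VPP_HEADER_OK = 0,
    VPP_HEADER_UNKNOWN_VERSION = 1 << 0, /* assumption: not 1.0 */
    VPP_HEADER_ODD_NSTRINGS = 1 << 1     /* fewer than 4 or more than 5 */
} vpp_header_flag_t;

/* Returns VPP_HEADER_OK or a bitwise OR of the flags above. The caller
 * decides whether to display the record anyway or report an error. */
unsigned
vpp_dispatch_check_header(uint8_t major, uint8_t minor, uint8_t n_strings)
{
    unsigned flags = VPP_HEADER_OK;

    if (major != VPP_DISPATCH_MAJOR_VERSION ||
        minor != VPP_DISPATCH_MINOR_VERSION)
        flags |= VPP_HEADER_UNKNOWN_VERSION;

    if (n_strings < VPP_DISPATCH_MIN_STRINGS ||
        n_strings > VPP_DISPATCH_MAX_STRINGS)
        flags |= VPP_HEADER_ODD_NSTRINGS;

    return flags;
}
```

Returning flags rather than failing outright leaves both permitted behaviors open: a lenient consumer can still try to display the claimed number of strings, while a strict one can reject the record.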

Here is the current set of protocol hints:

```c
    typedef enum
      {
        VLIB_NODE_PROTO_HINT_NONE = 0,
        VLIB_NODE_PROTO_HINT_ETHERNET,
        VLIB_NODE_PROTO_HINT_IP4,
        VLIB_NODE_PROTO_HINT_IP6,
        VLIB_NODE_PROTO_HINT_TCP,
        VLIB_NODE_PROTO_HINT_UDP,
        VLIB_NODE_N_PROTO_HINTS,
      } vlib_node_proto_hint_t;
```

Example: VLIB_NODE_PROTO_HINT_IP6 means that packet data SHOULD begin
an IPv6 packet header, so the high-order four bits of the first octet
SHOULD be 6 (the octet is 0x60 when the upper traffic-class bits are
zero).

Downstream consumers of these data SHOULD pay attention to the
protocol hint. They MUST tolerate inaccurate hints, which MAY occur
from time to time.
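
As a rough illustration of tolerating inaccurate hints, a consumer might cross-check the hint against the first octet of packet data before trusting it. The function `vpp_effective_hint` is hypothetical, the enum is repeated only to keep the sketch self-contained, and falling back to "no hint" on a mismatch is one possible policy among several.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum
  {
    VLIB_NODE_PROTO_HINT_NONE = 0,
    VLIB_NODE_PROTO_HINT_ETHERNET,
    VLIB_NODE_PROTO_HINT_IP4,
    VLIB_NODE_PROTO_HINT_IP6,
    VLIB_NODE_PROTO_HINT_TCP,
    VLIB_NODE_PROTO_HINT_UDP,
    VLIB_NODE_N_PROTO_HINTS,
  } vlib_node_proto_hint_t;

/* Returns the hint to act on. Unknown hint values, empty packet data,
 * and IPv4/IPv6 hints contradicted by the version nibble all fall back
 * to VLIB_NODE_PROTO_HINT_NONE (an assumed policy). */
vlib_node_proto_hint_t
vpp_effective_hint(unsigned hint, const uint8_t *data, size_t len)
{
    if (hint >= VLIB_NODE_N_PROTO_HINTS || data == NULL || len == 0)
        return VLIB_NODE_PROTO_HINT_NONE;

    switch (hint) {
    case VLIB_NODE_PROTO_HINT_IP4:
        /* IPv4 headers start with version nibble 4. */
        return (data[0] >> 4) == 4 ? VLIB_NODE_PROTO_HINT_IP4
                                   : VLIB_NODE_PROTO_HINT_NONE;
    case VLIB_NODE_PROTO_HINT_IP6:
        /* IPv6 headers start with version nibble 6. */
        return (data[0] >> 4) == 6 ? VLIB_NODE_PROTO_HINT_IP6
                                   : VLIB_NODE_PROTO_HINT_NONE;
    default:
        /* No cheap check for the remaining hints; accept as given. */
        return (vlib_node_proto_hint_t)hint;
    }
}
```

Ethernet, TCP, and UDP have no fixed leading octet to test, so the sketch passes those hints through unchanged and leaves any further validation to the dissector for that protocol.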
_______________________________________________
tcpdump-workers mailing list
tcpdump-workers () lists tcpdump org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
