Re: Legacy Linux kernel support


From: Guy Harris via tcpdump-workers <tcpdump-workers () lists tcpdump org>
Date: Tue, 22 Oct 2019 14:06:02 -0400 (EDT)

--- Begin Message ---
From: Guy Harris <gharris () sonic net>
Date: Tue, 22 Oct 2019 11:08:10 -0700

On Oct 21, 2019, at 5:59 PM, Mario Rugiero via tcpdump-workers <tcpdump-workers () lists tcpdump org> wrote:

I think it's time to summarize, and to propose one last idea.
I'm following the thread again to try and be as accurate as possible,
but of course any objections are welcome.

- The oldest officially supported kernel is 3.16, as this is the
oldest LTS according to kernel.org.
- Users of the library must be properly informed when their
environment is unsupported, and told the last version that supported it.
 This should be done both at compile time and at run time.
- SOCK_PACKET goes away. This is already done in master.
- TPACKET_V1 goes away.

This includes the hack to handle 32-bit userland running on top of a 64-bit kernel; TPACKET_V2 eliminated that problem 
by making the flags field a 32-bit integer, even on 64-bit architectures, in the data structures shared between the 
kernel and userland.

I.e., we also remove the internal "TPACKET_V1_64" support.
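
To illustrate why the V1 layout needs that hack, here is a small Linux-only sketch. It assumes the definitions in <linux/if_packet.h>, where struct tpacket_hdr declares its tp_status field as unsigned long and struct tpacket2_hdr declares it as __u32. The helper names are made up for this example and are not libpcap code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <linux/if_packet.h>

/* Width in bytes of the status word in a TPACKET_V1 frame header.
 * The field is an unsigned long, so its size follows the ABI of the
 * code that was compiled: 4 bytes for 32-bit, 8 bytes for 64-bit. */
size_t v1_status_size(void)
{
    return sizeof(((struct tpacket_hdr *)0)->tp_status);
}

/* Width in bytes of the status word in a TPACKET_V2 frame header.
 * The field is a __u32, so it is 4 bytes for every ABI. */
size_t v2_status_size(void)
{
    return sizeof(((struct tpacket2_hdr *)0)->tp_status);
}

/* Print both widths as seen by this build. */
void print_status_sizes(void)
{
    /* The V1 width always tracks unsigned long for the compiling ABI. */
    assert(v1_status_size() == sizeof(unsigned long));
    printf("TPACKET_V1 tp_status: %zu bytes\n", v1_status_size());
    printf("TPACKET_V2 tp_status: %zu bytes\n", v2_status_size());
}
```

Built as 32-bit userland, the V1 field is 4 bytes while a 64-bit kernel writes 8, which is the mismatch the "TPACKET_V1_64" code works around; the V2 field is the same width on both sides.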

- TPACKET_V2 stays for immediate-mode support.
 - As a side-effect, RHEL6 remains supported.

So RHEL6's kernel is pre-3.16 and thus doesn't support TPACKET_V3?

 - The idea of exploring non-memory-mapped sockets for this was
proposed, and it would be interesting to follow up.
    For this, I was supposed to check whether that makes a difference
regarding how the kernel implements it.
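
As a rough sketch of the policy in the TPACKET_V2 item above, with TPACKET_V1 gone: pick_tpacket_version() is a hypothetical helper, not the function libpcap actually uses, and kernel_has_v3 stands in for whatever probing the real code does.

```c
#include <stdbool.h>
#include <linux/if_packet.h>

/* Hypothetical helper: choose the ring version to request.
 * TPACKET_V3 is used only when the caller did not ask for immediate
 * mode and the kernel offers it; every other case falls back to
 * TPACKET_V2. There is no TPACKET_V1 fallback. */
int pick_tpacket_version(bool immediate_mode, bool kernel_has_v3)
{
    if (!immediate_mode && kernel_has_v3)
        return TPACKET_V3;
    return TPACKET_V2;
}
```

Under this policy a kernel without TPACKET_V3 still works, because every path ends at TPACKET_V2.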
- The workaround for TPACKET_V3's bug stays, as the fix was only
introduced in 3.19.
- We should explore reaching a solution to immediate-mode that doesn't
require TPACKET_V2.
 - It has to be noted, tho, that any changes to allow that are
unlikely to be back-ported to older kernels, so we'd still need
TPACKET_V2 for the time being. It'd be a bet for the future.

So you're talking about a TPACKET_V5, or changes to TPACKET_V3 or TPACKET_V4, to support immediate mode in 
memory-mapped capture, as opposed to using non-memory-mapped sockets?

- Just to acknowledge it, it was proposed to research whether
support for AF_XDP makes sense. I think that belongs in its own
discussion, tho.

Yes, that's a different mechanism from AF_PACKET.

Does it allow receiving copies of packets that are also handed either to the kernel networking stack or to other AF_XDP 
sockets for regular input processing? That would be needed to allow it to be used for packet sniffing.

Now, the idea goes along with the last item.
I was thinking of proposing a new option for TPACKET_V3 sockets to set
a deadline.
I haven't completely decided on the details, but basically it would
behave somewhat like this:
- The deadline can take three types of value:
 - (-1): no deadline, wait until a block is full before marking as
ready for user space. This would be the default, so no existing
programs change their behavior.

That sounds like the behavior with a timeout set to 0 (with PF_PACKET sockets, BPF devices, and Solaris DLPI).

 - (0): expose packets as soon as they arrive. This would act more or
less as previous AF_PACKET versions work.

That sounds like immediate mode.

 - A positive integer: this would be how long to wait

If you mean "deliver a block if it's full or if the timeout expires", that sounds like the behavior with a non-zero 
timeout.

So how does this differ from the regular timeout mechanism?

(I haven't decided on the unit, I'm guessing microseconds should work).

The units are milliseconds in pcap_set_timeout() and with BPF devices.

   I'm not sure what the deadline should be measured relative to, but I'm
thinking since the first packet in the block arrived.

At least for BPF devices, it's "since a read or select was done", rather than "since the first packet in the block 
arrived"; I think it might be "since the first packet in the block arrived" with Solaris DLPI.
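
To make the three proposed deadline values concrete, here is a toy model of the "is this block ready for user space?" decision. Everything in it is an assumption made for illustration: the names, the microsecond unit (only a guess in the proposal), and the choice to start the clock at the arrival of the first packet in the block. It is not kernel or libpcap code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEADLINE_NONE      (-1) /* hypothetical: hand over only full blocks */
#define DEADLINE_IMMEDIATE 0    /* hypothetical: hand over on first packet  */

/* Minimal state of one block for this model. */
struct block_state {
    bool    has_packets;      /* at least one packet is stored            */
    bool    full;             /* no room for another packet               */
    int64_t first_packet_us;  /* arrival time of the first stored packet  */
};

/* Return true if the block should be marked ready for user space.
 * deadline_us is -1, 0, or a positive number of microseconds. */
bool block_ready(const struct block_state *b, int64_t deadline_us,
                 int64_t now_us)
{
    assert(deadline_us >= DEADLINE_NONE);
    if (!b->has_packets)
        return false;                 /* nothing to deliver yet         */
    if (b->full)
        return true;                  /* a full block is always ready   */
    if (deadline_us == DEADLINE_NONE)
        return false;                 /* wait until the block fills     */
    if (deadline_us == DEADLINE_IMMEDIATE)
        return true;                  /* expose packets as they arrive  */
    /* Positive deadline: clock starts at the first packet in the block. */
    return now_us - b->first_packet_us >= deadline_us;
}
```

Modeling the BPF behavior instead would mean replacing first_packet_us with the time the read or select was issued; the rest of the decision would stay the same.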

Any ideas on this?
How should we keep the discussion in sync between the two lists?
Should I CC the participants on this list on the RFC on the kernel list?

"The two lists" being this list and some Linux list?

Is the Linux list linux-netdev?

--- End Message ---
