tcpdump mailing list archives
Re: libpcap for linux, to_ms redefined
From: Phil Wood <cpw () lanl gov>
Date: Mon, 7 Oct 2002 16:59:47 -0600
On Wed, Sep 18, 2002 at 01:01:30PM -0700, Guy Harris wrote:
On Thu, Mar 28, 2002 at 09:45:07PM -0700, Phil Wood wrote:

With the advent of memory-mapped ring buffers developed by Alexey Kuznetsov, this function could be accommodated. I treat the value of 'to_ms' in the following manner:

    if (to_ms == 0)
        return;   // if no packet immediately available then return
                  // to calling program it will poll (good for old
                  // versions of NFR or programs that have other
                  // things to do besides capture packets)

And bad for compatibility with other platforms, on which a "to_ms" value of 0 means "if no packet immediately available, block, and wait as long as necessary for enough packets to arrive to fill up a chunk".
Oops. I sure got that one wrong. Well, I just changed my code so that to_ms == 0 will block as long as is necessary.

I still like to have the ability to return to the caller if there are no packets available. How bad would it be to use a negative value (or just -1) to mean "if there are no packets this instant in time, return to the calling program"? I guess the answer depends on how many libpcap programs use a negative value for to_ms.

For what it's worth, my Linux mmap'd pcap behaves as follows:

A. With a positive timeout (initialized by the to_ms value on each call to pcap_read), a "read" will return if either
   1) enough polls have been called to exhaust the timeout value, or
   2) the timeout expires, even if no packets have been received.

B. With a zero timeout, a "pcap_read" will never return. The timeout is considered infinite. Of course, callbacks will continue for each packet that arrives.

C. With a negative value, "pcap_read" will return if either
   1) there are no packets on the ring or
   2) the packets that have been queued on the ring have all been processed.

Basically, the only system call that comes into play while in pcap_read is the poll system call, and that is only for cases A and B.
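To make cases A, B and C concrete, here is a minimal sketch of how a to_ms value could be turned into a poll() call. It is not the actual mmap'd pcap code: the names read_action, classify_to_ms and wait_for_packets are hypothetical, and the sketch simply hands a positive to_ms to poll() as its timeout rather than counting polls.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical names; none of these identifiers come from libpcap. */
enum read_action {
    READ_POLL_TIMED,    /* case A: poll() with a finite timeout      */
    READ_POLL_FOREVER,  /* case B: poll() with no timeout            */
    READ_DRAIN_ONLY     /* case C: never poll(), just drain the ring */
};

/* Map a to_ms value onto one of the three cases described above. */
static enum read_action classify_to_ms(int to_ms)
{
    if (to_ms > 0)
        return READ_POLL_TIMED;
    if (to_ms == 0)
        return READ_POLL_FOREVER;
    return READ_DRAIN_ONLY;
}

/*
 * Wait for the capture descriptor once the ring has been found empty.
 * Returns 1 if the descriptor became readable, 0 if the caller should
 * return to the application (timeout expired, or case C), -1 if
 * poll() failed.
 */
static int wait_for_packets(int fd, int to_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    assert(fd >= 0);

    if (classify_to_ms(to_ms) == READ_DRAIN_ONLY)
        return 0;               /* case C: no system call at all */

    /* poll() treats a negative timeout as "wait forever" (case B). */
    rc = poll(&pfd, 1, to_ms > 0 ? to_ms : -1);
    return rc > 0 ? 1 : rc;
}
```

With this mapping a negative to_ms never enters the kernel, which matches the remark that poll is only used in cases A and B.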
There are several timeout behaviors that can be provided by various platforms' native packet capture mechanisms that support timeouts:

BSD:

With a non-zero timeout, a read will return if either
   1) enough data arrives to fill up the buffer or
   2) the timeout expires, even if no data has arrived.

With a zero timeout, the read will return only if enough data arrives to fill up the buffer, blocking as long as is necessary.

You can do BIOCIMMEDIATE to cause packets to be delivered as soon as they arrive; if combined with a timeout, that *probably* means that a read will return if either
   1) a packet arrives or
   2) the timeout expires, even if no data has arrived.

Digital UNIX:

With a positive timeout, at least as I read the man page, it might be the case that a read will return if either
   1) a packet arrives or
   2) the timeout expires, even if no data has arrived
so that batching is presumably done only if packets are arriving faster than the application can read them one at a time - i.e., if, before the read wakes up and copies data to userland, more packets arrive. I don't know whether that's the case, however.

With a zero timeout, the read will return if a packet arrives, blocking as long as is necessary.

With a negative timeout, the read will return immediately, even if no packet is available; the value of the timeout is, I infer, ignored.

They say that BIOCIMMEDIATE has no effect as immediate mode is always on, which is why I infer that batching isn't done BSD-style.

Windows with WinPcap:

With a positive timeout, a read will return if either
   1) enough data arrives to fill up the buffer or
   2) the timeout expires, even if no data has arrived
at least as I read the current packet.dll documentation. The buffer size is set with "PacketSetMinToCopy()" on Windows NT (NT 4.0, W2K, WXP, W.NETServer); I'm guessing from what the documentation says about Windows OT (95/98/Me) that one packet is always enough to fill up the buffer on those OSes.
With a zero timeout, a read will return only if enough data arrives to fill up the buffer, blocking as long as is necessary.

With "PacketSetMinToCopy()" you can presumably get the equivalent of BIOCIMMEDIATE.

SunOS 5.x:

With a non-zero timeout, a read will return if either
   1) enough data arrives to fill up the buffer or
   2) the timeout expires *AND* at least one packet has arrived.
(Yes, this means that you can't use the timeout to break out of a loop and do something else while you're waiting. Such is life.)

With a zero timeout, a read will return as soon as a packet arrives.

With the timeout cleared, a read will return only if enough data arrives to fill up the buffer, blocking as long as is necessary. (libpcap treats a "to_ms" value of 0 as meaning "don't set the timeout", which means it's cleared, *not* as "return immediately".)

With a cl

SunOS 4.x:

I don't have the "bufmod" man page handy for SunOS 4.x, but I *suspect* it's similar to 5.x, as the 5.x "bufmod" is probably derived from the 4.x "bufmod". No guarantees, however.

SunOS 3.x:

*Probably* behaves like 4.x.

On OSes whose packet capture mechanism *doesn't* support timeouts, a read will return if a packet arrives, and will wait indefinitely for that to happen.

All this means that libpcap cannot, merely by using the underlying OS's mechanisms:

   guarantee that a read will always return within a certain timeout period (the Solaris timeout mechanism only sets the timeout for batching of packets);

   guarantee that packets will not be delivered ASAP (some OSes don't do batching, and others do only a limited amount of batching).

On most if not all of the OSes, however, you *can* do a "select()" or "poll()" or "WaitFor...()" call on the pcap device/socket/whatever, so that you *can* multiplex reading packets and doing other things. (On BSD, you may have to use a timeout in the "select()" or "poll()", plus non-blocking I/O, as "select()" or "poll()" on BPF devices doesn't always work correctly.)
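As an illustration of multiplexing with "select()", here is a rough sketch that waits on a capture descriptor and one other descriptor at the same time, with a timeout. The helper name wait_for_either and the READY_* flags are hypothetical, and the sketch assumes the capture descriptor has already been obtained (for example with pcap_fileno()) and that "select()" works on it, which, as noted above, is not a safe assumption for every BPF implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical result flags for this sketch. */
#define READY_CAPTURE 0x1
#define READY_OTHER   0x2

/*
 * Wait until the capture descriptor, the other descriptor, or both
 * are readable, or until timeout_ms milliseconds have passed.
 * Returns a mask of READY_* flags, 0 on timeout, -1 if select() failed.
 */
static int wait_for_either(int capture_fd, int other_fd, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int maxfd = capture_fd > other_fd ? capture_fd : other_fd;
    int ready = 0;

    assert(capture_fd >= 0 && other_fd >= 0 && timeout_ms >= 0);
    assert(maxfd < FD_SETSIZE);

    FD_ZERO(&readable);
    FD_SET(capture_fd, &readable);
    FD_SET(other_fd, &readable);

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(maxfd + 1, &readable, NULL, NULL, &tv) == -1)
        return -1;

    /* On timeout select() clears the set, so ready stays 0. */
    if (FD_ISSET(capture_fd, &readable))
        ready |= READY_CAPTURE;
    if (FD_ISSET(other_fd, &readable))
        ready |= READY_OTHER;
    return ready;
}
```

An application would call this in its main loop and only read packets when READY_CAPTURE is set, so the timeout lives in "select()" rather than in the capture mechanism.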
It would be possible, if people *really* insist on using the timeout for multiplexing rather than just batching, to make
   1) the platforms with no timeout in the OS (Linux, Irix, HP-UX, etc.) and
   2) the platforms where the timeout can't be used for multiplexing as the timer doesn't expire unless you have at least one packet (SunOS 5.x)
do a combination of non-blocking I/O and a "select()" or "poll()" with a timeout.

However
   1) that's overkill for applications that *don't* use the timeout for multiplexing and
   2) it means that people will start relying on it and having all sorts of weird problems when their application is run with an older version of libpcap
so I'm not strongly inclined to implement that (which is why the Linux version doesn't do it) and, if we do implement that, I'd want to do it only if
   1) we do it on *ALL* the platforms where the native OS timeout can't be used for multiplexing (not just for Linux) and
   2) we add a new API to enable that mode, so that applications that require that mode have to use the new API and thus won't build with older versions of libpcap (rather than merely hanging forever, on platforms such as Linux where there is no timeout and platforms such as SunOS 5.x where the timeout doesn't expire if no packets arrive, with older versions of libpcap).

I would also advocate adding a new API to get "immediate mode", that being the mode where a read completes as soon as a packet arrives, with no batching; that'd let us hide all the details of how to request immediate mode.
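The "non-blocking I/O plus poll() with a timeout" combination could look roughly like the sketch below. It works on a plain file descriptor rather than a pcap_t, and the helper names set_nonblocking and read_with_timeout are hypothetical; it is meant only to show the shape of the emulation, not how libpcap does or would implement it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: read from a non-blocking fd, but give up after
 * timeout_ms milliseconds if nothing has arrived.
 * Returns the number of bytes read, 0 if the timeout expired with no
 * data, or -1 on error. (On a pipe or socket 0 can also mean
 * end-of-file; a real implementation would tell the two apart, and
 * would probably retry when poll() fails with EINTR.)
 */
static ssize_t read_with_timeout(int fd, void *buf, size_t len,
                                 int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    ssize_t n;
    int rc;

    assert(buf != NULL && timeout_ms >= 0);

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;              /* 0: timed out, -1: poll() failed */

    n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;               /* readiness was spurious: no data */
    return n;
}
```

The non-blocking flag matters because a descriptor reported readable by "poll()" can still have nothing to deliver by the time "read()" runs; without it the read could block past the timeout, which is exactly the hang this mode is meant to avoid.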
-- Phil Wood, cpw () lanl gov