tcpdump mailing list archives

Re: libpcap for linux, to_ms redefined

From: Phil Wood <cpw () lanl gov>
Date: Mon, 7 Oct 2002 16:59:47 -0600


On Wed, Sep 18, 2002 at 01:01:30PM -0700, Guy Harris wrote:

On Thu, Mar 28, 2002 at 09:45:07PM -0700, Phil Wood wrote:

With the advent of memory mapped ring buffers developed by Alexey Kuznetsov,
this function could be accommodated.  I treat the value of 'to_ms' in the
following manner:

  if (to_ms == 0) return; // if no packet is immediately available, return
                  // to the calling program, which will poll (good for old
                  // versions of NFR or programs that have other
                  // things to do besides capture packets)


And bad for compatibility with other platforms, on which a "to_ms" value
of 0 means "if no packet immediately available, block, and wait as long
as necessary for enough packets to arrive to fill up a chunk".


Oops.  I sure got that one wrong.

Well, I just changed my code so to_ms == 0 will block as long as is
necessary.  I still like to have the ability to return to the caller if
there are no packets available.  How bad would it be to use a negative
value (or just -1) to mean "if there are no packets this instant in time,
return to the calling program"?  I guess the answer is related to how many
libpcap programs use a negative value for to_ms.

For what it's worth, my Linux mmap'd pcap behaves as follows:

     A. With a positive timeout (initialized by the to_ms value on each call
     to pcap_read), a "read" will return if either

        1) enough polls have been called to exhaust the timeout value.

     or

        2) the timeout expires even if no packets have been received.

     B. With a zero timeout, a "pcap_read" will never return.  The timeout
     is considered infinite.  Of course callbacks will continue for each
     packet that arrives.

     C. With a negative value, "pcap_read" will return if either

       1) there are no packets on the ring

     or

       2) the packets that have been queued on the ring have all been
          processed.

     Basically, the only system call that comes into play while in pcap_read
     is the poll system call.  And that is only for cases A and B.
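
A minimal sketch of how cases A, B and C above could map onto the timeout
argument of poll().  This is not the actual mmap'd pcap code; the names
"read_mode", "classify_to_ms" and "poll_timeout_for" are made up for
illustration.  Note that poll() itself treats a negative timeout as "wait
forever", which is the opposite of the negative "to_ms" proposed here, so
the value cannot simply be passed through.

```c
#include <assert.h>

/* Hypothetical names for illustration; none of this is libpcap API. */
enum read_mode {
    READ_WAIT_TIMED,   /* case A: to_ms > 0, wait at most to_ms ms   */
    READ_WAIT_FOREVER, /* case B: to_ms == 0, block until packets    */
    READ_NO_WAIT       /* case C: to_ms < 0, return when ring empty  */
};

/* Decide which of the three behaviors a given to_ms selects. */
enum read_mode classify_to_ms(int to_ms)
{
    if (to_ms > 0)
        return READ_WAIT_TIMED;
    if (to_ms == 0)
        return READ_WAIT_FOREVER;
    return READ_NO_WAIT;
}

/*
 * Translate to_ms into the timeout argument for poll().
 * Only cases A and B ever call poll(); case C must not get here.
 * poll() takes -1 to mean "no timeout", so to_ms == 0 becomes -1.
 */
int poll_timeout_for(int to_ms)
{
    assert(classify_to_ms(to_ms) != READ_NO_WAIT);
    return to_ms == 0 ? -1 : to_ms;
}
```

A read loop built on this would check classify_to_ms() first, skip poll()
entirely in the READ_NO_WAIT case, and otherwise pass the result of
poll_timeout_for() to poll().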


There are several timeout behaviors that can be provided by various
platforms' native packet capture mechanisms that support timeouts:

    BSD:

      With a non-zero timeout, a read will return if either

              1) enough data arrives to fill up the buffer

      or

              2) the timeout expires, even if no data has arrived.

      With a zero timeout, the read will return only if enough data
      arrives to fill up the buffer, blocking as long as is necessary.

      You can do BIOCIMMEDIATE to cause packets to be delivered as
      soon as they arrive; if combined with a timeout, that *probably*
      means that a read will return if either

              1) a packet arrives

      or

              2) the timeout expires, even if no data has arrived.

    Digital UNIX:

      With a positive timeout, at least as I read the man page, it
      might be the case that a read will return if either

              1) a packet arrives

      or

              2) the timeout expires, even if no data has arrived

      so that batching is presumably done only if packets are arriving
      faster than the application can read them one at a time - i.e.,
      if, before the read wakes up and copies data to userland, more
      packets arrive.  I don't know whether that's the case, however.

      With a zero timeout, the read will return if a packet arrives,
      blocking as long as is necessary.

      With a negative timeout, the read will return immediately, even
      if no packet is available; the value of the timeout is, I infer,
      ignored.

      They say that BIOCIMMEDIATE has no effect as immediate mode
      is always on, which is why I infer that batching isn't done
      BSD-style.

    Windows with WinPcap:

      With a positive timeout, a read will return if either

              1) enough data arrives to fill up the buffer

      or

              2) the timeout expires, even if no data has arrived

      at least as I read the current packet.dll documentation.  The
      buffer size is set with "PacketSetMinToCopy()" on Windows NT
      (NT 4.0, W2K, WXP, W.NETServer); I'm guessing from what the
      documentation says about Windows OT (95/98/Me) that one packet
      is always enough to fill up the buffer on those OSes.

      With a zero timeout, a read will return only if enough data
      arrives to fill up the buffer, blocking as long as is necessary.

      With "PacketSetMinToCopy()" you can presumably get the
      equivalent of BIOCIMMEDIATE.

    SunOS 5.x:

      With a non-zero timeout, a read will return if either

              1) enough data arrives to fill up the buffer

      or

              2) the timeout expires *AND* at least one packet has
                 arrived.  (Yes, this means that you can't use the
                 timeout to break out of a loop and do something else
                 while you're waiting.  Such is life.)

      With a zero timeout, a read will return as soon as a packet
      arrives.

      With the timeout cleared, a read will return only if enough data
      arrives to fill up the buffer, blocking as long as is necessary.
      (libpcap treats a "to_ms" value of 0 as meaning "don't set the
      timeout", which means it's cleared, *not* as "return
      immediately".)

      With a cl

    SunOS 4.x:

      I don't have the "bufmod" man page handy for SunOS 4.x, but I
      *suspect* it's similar to 5.x, as the 5.x "bufmod" is probably
      derived from the 4.x "bufmod".  No guarantees, however.

    SunOS 3.x:

      *Probably* behaves like 4.x.

On OSes whose packet capture mechanism *doesn't* support timeouts, a
read will return if a packet arrives, and will wait indefinitely for
that to happen.

All this means that libpcap cannot, merely by using the underlying OS's
mechanisms:

      guarantee that a read will always return within a certain
      timeout period (the Solaris timeout mechanism only sets the
      timeout for batching of packets);

      guarantee that packets will not be delivered ASAP (some OSes
      don't do batching, and others do only a limited amount of
      batching).
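
To make the first point concrete, here is a small sketch of the return
conditions for a read with a non-zero timeout on BSD and on SunOS 5.x, as
described in the list above.  The function names and parameters are
hypothetical; this models the documented behavior and is not taken from
any kernel or from libpcap.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * BSD, non-zero timeout: the read returns when the buffer fills or
 * when the timer expires, even if no data has arrived.
 */
bool bsd_read_returns(bool buffer_full, bool timer_expired,
                      int packets_buffered)
{
    assert(packets_buffered >= 0);
    (void)packets_buffered; /* irrelevant on BSD */
    return buffer_full || timer_expired;
}

/*
 * SunOS 5.x, non-zero timeout: the timer only counts once at least
 * one packet has arrived, so an idle network never wakes the reader.
 */
bool sunos5_read_returns(bool buffer_full, bool timer_expired,
                         int packets_buffered)
{
    assert(packets_buffered >= 0);
    return buffer_full || (timer_expired && packets_buffered > 0);
}
```

With an expired timer and no packets buffered, the BSD predicate is true
and the SunOS 5.x one is false, which is why the timeout alone cannot be
relied on to break out of a read.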

On most if not all of the OSes, however, you *can* do a "select()" or
"poll()" or "WaitFor...()" call on the pcap device/socket/whatever, so
that you *can* multiplex reading packets and doing other things.  (On
BSD, you may have to use a timeout in the "select()" or "poll()", plus
non-blocking I/O, as "select()" or "poll()" on BPF devices doesn't
always work correctly.)
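
A minimal sketch of the "select()" approach on a POSIX system, assuming
the descriptor comes from something like pcap_fileno() on the open pcap
handle.  The helper name "wait_readable" is made up for illustration, and
the code uses only plain POSIX calls so it can be tried on any descriptor.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error.
 * wait_readable is a hypothetical helper, not a libpcap function.
 */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    assert(timeout_ms >= 0);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

An application would call this in its main loop, hand the pcap handle to
pcap_dispatch() when it returns 1, and do its other work when it returns
0.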

It would be possible, if people *really* insist on using the timeout for
multiplexing rather than just batching, to make

      1) the platforms with no timeout in the OS (Linux, Irix, HP-UX,
         etc.)

and

      2) the platforms where the timeout can't be used for
         multiplexing as the timer doesn't expire unless you have at
         least one packet (SunOS 5.x)

do a combination of non-blocking I/O and a "select()" or "poll()" with a
timeout.  However

      1) that's overkill for applications that *don't* use the timeout
         for multiplexing

and

      2) it means that people will start relying on it and having all
         sorts of weird problems when their application is run with an
         older version of libpcap

so I'm not strongly inclined to implement that (which is why the Linux
version doesn't do it) and, if we do implement that, I'd want to do it
only if

      1) we do it on *ALL* the platforms where the native OS timeout
         can't be used for multiplexing (not just for Linux)

and

      2) we add a new API to enable that mode, so that applications
         that require that mode have to use the new API and thus won't
         build with older versions of libpcap (rather than merely
         hanging forever, on platforms such as Linux where there is no
         timeout and platforms such as SunOS 5.x where the timeout
         doesn't expire if no packets arrive, with older versions of
         libpcap).
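
For reference, the combination of non-blocking I/O and "poll()" with a
timeout mentioned above could look roughly like the following on a POSIX
system.  This is only a sketch: "set_nonblocking" and "read_with_timeout"
are hypothetical helpers, not existing or proposed libpcap functions, and
a real implementation would sit behind the new API discussed above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put fd in non-blocking mode; returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read with a hard upper bound on the wait.
 * Returns the number of bytes read (> 0), 0 if timeout_ms elapsed
 * with nothing to read (or the other end closed), -1 on error.
 */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    assert(timeout_ms >= 0);

    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    if (n == 0)
        return 0; /* timer expired, nothing arrived */

    ssize_t got = read(fd, buf, len);
    if (got < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0; /* nothing there after all; non-blocking read did not hang */
    return got;
}
```

The non-blocking flag is what keeps the read from hanging if "poll()"
reports readiness incorrectly, which is the BSD caveat noted earlier.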

I would also advocate adding a new API to get "immediate mode", that
being the mode where a read completes as soon as a packet arrives, with
no batching; that'd let us hide all the details of how to request
immediate mode.


-- 
Phil Wood, cpw () lanl gov

