tcpdump mailing list archives

Re: advice for heavy traffic capturing


From: "Fulvio Risso" <fulvio.risso () polito it>
Date: Mon, 9 Aug 2004 11:49:56 +0200

Hi Darren.

-----Original Message-----
From: Darren Reed [mailto:darrenr () reed wattle id au]
Sent: Monday, 9 August 2004 10:57
To: Fulvio Risso
Cc: tcpdump-workers () lists tcpdump org
Subject: Re: [tcpdump-workers] advice for heavy traffic capturing


>>   http://netgroup.polito.it/fulvio.risso/pubs/iscc01-wpcap.pdf

> When was it published?  There is no date...

Fulvio Risso, Loris Degioanni, An Architecture for High Performance Network
Analysis, Proceedings of the 6th IEEE Symposium on Computers and
Communications (ISCC 2001), pp. 686-693, Hammamet, Tunisia, July 2001.


> Winpcap appears, by design, to be the same as BPF.  If you reduced the
> number of buffers in the ring used with NPF to 2 buffers, I suspect it
> would be the same as BPF?

No, these are two different architectural choices.
The ring does not contain fixed-size buffers; it is just raw space for
packets, and each packet occupies exactly its own size.
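
To make the difference concrete, here is a minimal sketch of such a
byte-granular ring (my illustration for this thread, not the actual NPF
source; ring_put() and RING_SIZE are invented names):

/*
 * Byte-granular ring: each packet costs exactly a small header plus its
 * captured bytes; there are no fixed-size slots.  Consumer side omitted.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE (1 << 20)        /* 1 MB of raw space */

struct pkt_hdr {
    uint32_t caplen;               /* bytes actually stored for this packet */
};

static unsigned char ring[RING_SIZE];
static size_t head, tail;          /* producer / consumer byte offsets */

static size_t used(void)
{
    return (head + RING_SIZE - tail) % RING_SIZE;
}

/* Returns 0 on success, -1 if the packet does not fit (i.e. a drop). */
int ring_put(const unsigned char *pkt, uint32_t caplen)
{
    size_t need = sizeof(struct pkt_hdr) + caplen;
    if (RING_SIZE - used() <= need)        /* keep one byte free */
        return -1;

    struct pkt_hdr h = { caplen };
    const unsigned char *src[2] = { (const unsigned char *)&h, pkt };
    size_t len[2] = { sizeof h, caplen };

    for (int i = 0; i < 2; i++) {          /* copy header, then payload, */
        size_t tail_room = RING_SIZE - head;       /* wrapping if needed */
        size_t first = len[i] < tail_room ? len[i] : tail_room;
        memcpy(ring + head, src[i], first);
        memcpy(ring, src[i] + first, len[i] - first);
        head = (head + len[i]) % RING_SIZE;
    }
    return 0;
}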


> And because there is no date, I can say that references to the buffer
> size being 32Kbytes in recent BSD kernels are wrong.  Recent BSD kernels

In 2001 the buffer was 32KB.


> use 1MB or 2MB buffers, by default.  Although it then contradicts itself
> later by saying there are larger buffers but that pcap tunes it down to
> 32K... (page 2 vs. page 3.)

No, it does not contradict itself.
At that time, there was a sysctl option that allowed you to increase the
buffer size from the command line.
However, the libpcap code included a system call that reset the value of
the buffer to 32KB.
So, even if the user managed to get a bigger buffer through the command
line, it was impossible to use it without modifying the libpcap code.

This was true in 2001; I don't know about now.
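
For what it's worth, this is roughly what the workaround looks like if you
drive /dev/bpf by hand instead of patching libpcap (a sketch only, with
error handling trimmed; the point is that BIOCSBLEN must be issued before
BIOCSETIF binds the interface):

#include <sys/types.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <net/bpf.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

int open_bpf(const char *ifname, u_int bufsize)
{
    int fd = open("/dev/bpf0", O_RDONLY);
    if (fd < 0)
        return -1;

    if (ioctl(fd, BIOCSBLEN, &bufsize) < 0)   /* ask before binding */
        perror("BIOCSBLEN");                  /* kernel may clamp or refuse */

    struct ifreq ifr;
    memset(&ifr, 0, sizeof ifr);
    strlcpy(ifr.ifr_name, ifname, sizeof ifr.ifr_name);
    if (ioctl(fd, BIOCSETIF, &ifr) < 0)
        return -1;

    u_int got = 0;
    ioctl(fd, BIOCGBLEN, &got);               /* what we really ended up with */
    printf("bpf buffer: %u bytes\n", got);
    return fd;
}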



>> Hardware counts, but... we have been really careful to optimize the
>> whole path from the NIC card to the application.
>> See another article on this topic (it covers only Win32):
>>
>>    L. Degioanni, M. Baldi, F. Risso, G. Varenni
>>    Profiling and Optimization of Software-based Network Analysis
>>    Applications
>>    http://netgroup.polito.it/fulvio.risso/pubs/sbac03-winpcap.pdf

> No date on the paper, here, either.

Gianluca Varenni, Mario Baldi, Loris Degioanni, Fulvio Risso, Optimizing
Packet Capture on Symmetric Multiprocessing Machines, Proceedings of the
15th Symposium on Computer Architecture and High Performance Computing
(SBAC-PAD 2003), pp. 108-115, São Paulo, Brazil, November 2003.



>> Particularly, Figure 9 shows how much work has been done to reduce the
>> processing overhead.

> Interestingly, there are a few large areas for improvement: timestamp
> (1800 -> 270), tap processing (830 -> 560) and filtering (585 -> 109).

... and the NIC driver and operating system overhead which, as you can
see, accounts for more or less 50% of the total overhead.


>> And yes, NIC drivers and OS overheads are very important... but these
>> are the components that cannot be changed by normal users.

> I think that's what you're seeing with the 3Com GigE NIC for 100BT
> receiving.  Do you know what size the buffers on the card are?

A few KB; less than 10KB, if I remember correctly.


> The Intel 100 ProS has 128K for receive, as I recall, the same as
> the 1000MX card.  There wasn't much difference between these two that I
> was able to observe, except that the 100 ProS was slightly better.

The amount of memory you have on the NIC is not very significant.
I cannot give you numbers right now, but this is not the parameter that
changes your life.


> My biggest problem here is that you've expended effort to tune and make
> NPF fast (which is fine) and compared it with the existing BPF, almost
> to say that BPF is bad.  I suppose this is what researchers do, but I
> think it is unfair on BPF.  IMHO, you should have tested with the same
> buffer size for both, even if it meant hacking on libpcap.

No ;-)
The 2001 paper compared Win32 and BSD with the same buffer size. We
modified the libpcap code so that it would use a buffer size other than
the default, and then we ran the tests.

I would add some points to this discussion, quoting from the conclusions of
the second paper (which, incidentally, focuses entirely on NPF without even
mentioning BPF):

=========================================
A valuable result of this study is the quantitative conclusion that,
contrary to common belief, filtering and buffering are not the most critical
factors in determining packet capture performance. Optimization of these two
components, which have received most of the attention so far, is shown to
bring little improvement to the overall packet capture cost, particularly in
the case of short packets (or when a small snapshot length is needed). The
profiling done on a real system shows that the most important bottlenecks
lie in hidden places, like the device driver, the interaction between
application and OS, and the interaction between OS and hardware.
=========================================

We didn't want to say "NPF is good, BPF is bad".
What we said is: be careful, accurate tuning of the code is more important
than other exotic stuff such as improving filtering (with a Just-in-Time
compiler) and so on.

And, I would like to add, you need a global picture of where the
bottlenecks are before doing optimizations.
For instance, we're now working to decrease the 50% of the time spent by
each packet in the operating system.


> In the NetBSD emails, I think I pondered making changes to the buffering
> so that it is more ring-buffer like (similar to what exists within NPF,
> if I understand the diagram right.)

Eh, what you're saying is good, but... the double buffering in BPF has one
advantage: it is much simpler, and if you're not worried about memory
occupancy, it is a very good choice.
We didn't realize this in 2001; now we see the choice between a double
buffer and a ring buffer as less black and white...
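
For readers following along, the heart of the double-buffer scheme is just
a pointer swap; here is a simplified sketch (not the real bpf.c, which
juggles bd_sbuf/bd_hbuf/bd_fbuf):

#include <stddef.h>

struct dbuf {
    unsigned char *store;   /* kernel appends packets here */
    unsigned char *hold;    /* the reader consumes this one */
    size_t slen;            /* bytes used in store */
    size_t hlen;            /* bytes pending in hold (0 = drained) */
};

/* Called, with the capture lock held, when the store buffer fills up.
   Returns -1 when the reader is behind, i.e. packets get dropped. */
static int rotate(struct dbuf *d)
{
    if (d->hlen != 0)
        return -1;
    unsigned char *t = d->hold;    /* whole-buffer swap: no per-packet  */
    d->hold  = d->store;           /* bookkeeping, no wraparound; that  */
    d->store = t;                  /* is where the simplicity comes from */
    d->hlen  = d->slen;
    d->slen  = 0;
    return 0;
}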


> Is the JIT code easily ported to other platforms?

Yes, as long as the platform is Intel ;-)
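
The compiler emits raw IA-32 opcodes into memory and jumps to them, which
is exactly the portability problem. A toy illustration of the bare
mechanism (not the WinPcap JIT itself; POSIX version, on Win32 you would
use VirtualAlloc() instead of mmap()):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    unsigned char code[] = {
        0xb8, 0x60, 0x00, 0x00, 0x00,   /* mov eax, 96 ("accept 96 bytes") */
        0xc3                            /* ret */
    };

    /* Hardened systems may refuse writable+executable mappings. */
    void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 1;
    memcpy(mem, code, sizeof code);

    unsigned (*filter)(void) = (unsigned (*)(void))mem;
    printf("jitted filter returned %u\n", filter());   /* prints 96 */

    munmap(mem, sizeof code);
    return 0;
}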

        fulvio

-
This is the tcpdump-workers list.
Visit https://lists.sandelman.ca/ to unsubscribe.

