tcpdump mailing list archives
Re: advice for heavy traffic capturing
From: "Fulvio Risso" <fulvio.risso () polito it>
Date: Mon, 9 Aug 2004 13:13:00 +0200
Hi Darren.
> -----Original Message-----
> From: Darren Reed [mailto:darrenr () reed wattle id au]
> Sent: Monday, 9 August 2004 12:21
> To: Fulvio Risso
> Cc: tcpdump-workers () lists tcpdump org
> Subject: Re: [tcpdump-workers] advice for heavy traffic capturing
>
> Hi Fulvio,
>
> > Fulvio Risso, Loris Degioanni, An Architecture for High Performance
> > Network Analysis, Proceedings of the 6th IEEE Symposium on Computers
> > and Communications (ISCC 2001), pg. 686-693, Hammamet, Tunisia,
> > July 2001.
>
> Is there any way you can get this (and the other date info.) into those
> PDFs? It really helps put them in perspective.
No, because these papers are exact copies of the published ones. You can find the date information on my homepage: http://netgroup.polito.it/fulvio.risso/pubs/index.htm
> > > Winpcap appears, by design, to be the same as BPF. If you reduced
> > > the number of buffers in the ring used with NPF to 2 buffers, I
> > > suspect it would be the same as BPF?
> >
> > No, there are two different architectural choices. The ring does not
> > have buffers; it has just space for packets; space occupancy is
> > exactly the size of the packet.
>
> Ah, so you're using the buffers that the data is read into, off the
> NIC, to put into the ring? Or, to put it in BSD terms, the ring is made
> up of mbuf pointers?
No, data is copied into the ring buffer. This is what is usually called the "first copy". The "second copy" happens later, from the ring buffer (also called the "kernel buffer") into the application's address space. We're forced to do this by the Win32 driver model, which specifies that protocol drivers (such as npf.sys) must copy the packets they are interested in.
> You would have to be careful to not hold on to the buffers for too long
> (or too many of them) or else surely you would run out? That would make
> direct access to the buffers from user space (using mmap or similar)
> more involved.
>
> > Interestingly, there are a few large areas for improvement: timestamp
> > (1800 -> 270), Tap processing (830 -> 560) and filtering (585 -> 109).
> > ... and NIC drivers and Operating system overhead which, as you can
> > see, account for more or less 50% of the total overhead.
>
> Yup.
>
> The Intel 100 ProS have 128K for receive, as I recall, the same as the
> 1000MX card. There wasn't much between these two, that I was able to
> observe, except that the 100ProS was slightly better.
>
> > The amount of memory you have on the NIC is not very significant. I
> > cannot give you numbers right now, but this is not the parameter that
> > changes your life.
>
> Why not? Well I suppose your results (if the 3com really does only have
> 16 or 32k of buffer) would support this.
Packets are transferred to memory through bus mastering. The memory on the NIC is used just to hold packets in case the transfer cannot be initiated immediately. Usually, cards tend to transfer packets as soon as they are received; otherwise the latency for getting a packet can be unacceptably high, which means that the timestamping is not precise, and users complain that the latency of a PC in receiving packets from the network is too high. This is why "interrupt mitigation" and similar technologies must usually be turned on explicitly: they increase the latency in delivering packets to the applications. So, having 10KB or 100KB on the card does not (usually) matter.
> But maybe buffering is more important for BPF where you have the
> interrupt masked out for longer while the data is copied?
From this point of view, Win32 and BSD (ehm... older BSD, without device polling) are mostly the same. Win32 also masks the interrupts for a while.
> =========================================
> A valuable result of this study is the quantitative conclusion that,
> contrary to common belief, filtering and buffering are not the most
> critical factors in determining packet capture performance.
> Optimization of these two components, which received most attention so
> far, is shown to bring little improvement to the overall packet capture
> cost, particularly in the case of short packets (or when small snapshot
> lengths are needed). The profiling done on a real system shows that the
> most important bottlenecks lie in hidden places, like the device
> driver, the interaction between application and OS, and the interaction
> between OS and hardware.
> =========================================
>
> Hmmm, the testing I did would disagree with that, or at least so far as
> to say that there is a "sweet spot" for buffer sizes and data rates (at
> least with BPF.) The hardware does make some difference - one of our
> other test cards was a Netgear (FA-311?) and it was shit. My
> recollection was that with the data sample we were using, with 1MB
> captures enabled for BPF, at full speed, most reads were between 64k
> and 256k at a time. There were other changes to BPF, unrelated to what
> you've changed, that reduced packet loss from X% to 0%. I copied these,
> this year, from FreeBSD to NetBSD but I don't recall their vintage on
> FreeBSD.
It would be very helpful to have these results published somewhere. They would be very valuable to the scientific community (at least, to us ;-) )
> > And, I would like to say, you need a global picture of where the
> > bottlenecks are before doing optimizations.
>
> Oh, sure. And one of those limiting factors is PCI.
Yes.
> > For instance, we're now working to decrease the 50% of the time spent
> > by each packet in the operating system.
>
> You're still working with Windows?
We're implementing a prototype on Linux, just because we need some stuff at kernel level which was not available on FreeBSD (which, incidentally, is my favourite system). Then, we're planning to implement everything on Windows as well.
> > > In the NetBSD emails, I think I ponder making changes to the
> > > buffering so that it is more ring-buffer like (similar to what
> > > exists within NPF if I understand the diagram right.)
> >
> > Eh, what you're saying is good but... the double buffering in the BPF
> > has an advantage: it is much simpler, and if you're not interested in
> > memory occupancy, it is a very good choice.
>
> Yes.
>
> > We didn't realize it in 2001; now, we can see less black and white in
> > the choice between a double buffer and a ring buffer...
>
> What have you found that makes you say this? The simplicity in cpu
> cycle cost?
Two things:

1. Simplicity.

2. Swappable buffers are very helpful if you plan to compute statistics, not only capture packets. For instance, think about a system (like a NetFlow probe or something similar) that collects statistics and then returns data to the user every N minutes. If you have two buffers, you can put statistics in the first one while you read data from the second one, and swap the buffers every N minutes. If you have a ring buffer and your application wants to read data, you have to stop collecting stats, lock the ring, copy its content into another buffer, unlock the ring, read data from the second buffer, and restart computing statistics. So, depending on what you're planning to do, swappable buffers may be better.
> > > Is the JIT code easily ported to other platforms?
> >
> > Yes, as far as the platform is Intel ;-)
>
> That's fine with me :) Do you have a URL for this?
http://winpcap.polito.it

You'll find everything in the source pack.

Cheers,

	fulvio

-
This is the tcpdump-workers list.
Visit https://lists.sandelman.ca/ to unsubscribe.