tcpdump mailing list archives
Re: Libpcap on VMWare
From: "Mark Bednarczyk" <voytechs () yahoo com>
Date: Tue, 12 Jan 2010 20:59:57 -0500
Hi, I have been working with Vikram on this issue, so let me comment as well if I may. The tests we are running are both under jNetPcap and under a native C application without jNetPcap involved at all. My jNetPcap-based tests don't actually interact with Java at all during capture. The Java part is used to kick off the test, which runs completely in native land with an empty callback function that does no work except to keep a few statistics about the packet rate and some cumulative information from the pcap packet header delivered. More comments inline...
-----Original Message----- From: tcpdump-workers-owner () lists tcpdump org [mailto:tcpdump-workers-owner () lists tcpdump org] On Behalf Of Guy Harris
This is similar in nature to http://article.gmane.org/gmane.network.tcpdump.devel/4256 posting (which is unfortunately unsolved). We are using jnetpcap which is a wrapper over libpcap. Mark Bednarczyk posted the original query (4256).
--------------------------------------
We are experiencing massive packet drops in libpcap while working with Non Windows guests on VMWare ESXi Server. The same thing happens on VMplayer (Host OS - Windows). We have tested on Ubuntu 8.04, FC11 and Debian, the library seems to drop packets everywhere. The load being subjected to is not much but is constant (TCP packets of 1200 - 1500 bytes consistently).
The packet drops DO NOT occur on Windows Guest OSs (both via ESXi and VMPlayer). They only happen when we are working with non-Windows guests.
Do they happen if you're running with Linux on bare hardware, rather than under VMware?
No drops on NON-vmware platforms.
I.e., is there any reason to believe that this is a problem with libpcap on VMware, rather than, for example, libpcap on Linux?
Yes, I think there is. Serious packet drops occur on VMware Linux-based platforms, while there are no packet drops for the same traffic loads (even up to the 96Kpps that I've tested) on non-VMware Linux platforms.
Libpcap version from Ubuntu:- Libpcap (by dpkg) : ii libpcap0.8 0.9.8-2 System interface for user-level packet capture.
That means you're using a version of libpcap based on the 0.9.8 release. The package
As a temporary measure, we initially thought we could need to increase the socket receive buffer size as someone did here http://www.winpcap.org/pipermail/winpcap-users/2006-October/001521.html .
We tried configuration given in the link and it reduced packet drops substantially. To about 2% from over 20% earlier but still not to zero. Being new to Libpcap (and Linux), we are still struggling with some basic understanding and would be grateful if someone could set us on track.
1. What we did with these commands
sysctl -w net.core.rmem_max=4194304
sysctl -w net.core.rmem_default=4194304
was to increase the Linux socket size so that when libpcap opens a socket to the BPF device
There are no BPF devices on Linux. libpcap opens a PF_PACKET socket and later binds it to a *networking* device.
it uses this size (of 4M here). Is this understanding correct?
From a quick look at the Linux 2.6.29 kernel, rmem_default will be used as the default receive buffer size when any socket is created; this includes PF_PACKET sockets, as well as PF_INET sockets, and....
So changing the socket size seems to have alleviated the problem a bit. Is there a correlation there?
If so, how do we configure it from outside so that we can increase its size also?
...it's irrelevant to the problem you're having. The problem is probably that libpcap, and your program, aren't reading packets fast enough, so, given that the socket buffer has a finite size, that buffer can eventually fill up, at which point any more packets that arrive will be dropped. Making the socket buffer bigger will help there *IF* the program+libpcap is capable, on average, of reading and processing packets as fast as, or faster than, they arrive - the buffer only helps if the inability to process packets at full speed is temporary (program gets temporarily slowed down by, for example, having to write the packets to a file, or a short burst of packets arrives too fast) and the program can later catch up.
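The point about finite buffers can be illustrated with a small model. The following is only a sketch under stated assumptions, not how the kernel accounts for socket memory: the function simulate_drops and its tick-based model are hypothetical, and the buffer is counted in packets rather than bytes. Each tick, some packets arrive, anything that does not fit is dropped, and then the reader removes a fixed number of packets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a finite receive buffer (not kernel code).
 * capacity       - how many packets the buffer can hold
 * arrivals[t]    - packets arriving during tick t
 * drain_per_tick - packets the reader can remove per tick
 * Returns the total number of packets dropped because the buffer was full. */
unsigned long simulate_drops(unsigned long capacity,
                             const unsigned long *arrivals,
                             size_t ticks,
                             unsigned long drain_per_tick)
{
    unsigned long queued = 0;
    unsigned long dropped = 0;

    for (size_t t = 0; t < ticks; t++) {
        unsigned long room = capacity - queued;

        if (arrivals[t] > room) {
            /* Buffer is full: the excess packets are lost. */
            dropped += arrivals[t] - room;
            queued = capacity;
        } else {
            queued += arrivals[t];
        }

        /* The reader takes out what it can during this tick. */
        queued -= (queued < drain_per_tick) ? queued : drain_per_tick;
    }
    return dropped;
}
```

With a single burst of 10 packets and a reader that handles 3 per tick, a 4-packet buffer loses 6 packets while a 16-packet buffer loses none. With a sustained 5 packets per tick against the same reader, both sizes keep dropping; the larger buffer only delays the first drop.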
Our test applications do not do any work with the received data, and it's all native handler processing. I can comment on the way pcap_dispatch and pcap_loop seem to behave. I have tried my tests with both functions and both drop packets. What is surprising, even at high packet rates, is that pcap_dispatch does not seem to buffer more than 1 or 2 packets before exiting the call. I would think that at a higher packet rate, libpcap would buffer as many packets as the ring buffer allows before exiting the pcap_dispatch call. This means that in our test application, the outer loop around the pcap_dispatch call has to call pcap_dispatch frequently, since very few packets are processed per loop iteration.
When testing with pcap_loop, the behaviour is a bit more as expected. I have mapped out where pcap_stats indicates packet drops in a loop around the pcap_loop call. That is: capture 10K packets, 10 times. Consistently, the packet drops appear (as reported by pcap_stats) within the first 1K packets captured by each new pcap_loop call. Once the first 1000 packets have been dispatched, there don't appear to be any more drops until the pcap_loop call exits and is called again for another 10K packets. The same test using pcap_dispatch shows packets being dropped all over.
It seems that packet drops occur shortly in between consecutive pcap_dispatch/pcap_loop calls. At least in the pcap_loop case there seems to be some catching up going on, and then things stabilize. Since pcap_dispatch exits much more frequently than pcap_loop, the packet drops appear more frequently throughout the test loops. Let me reiterate that on non-VMware platforms I see no packet drops during packet capture, even at close to 100Kpps.
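Locating drops per batch, as in the test described above, comes down to differencing successive pcap_stats readings, since the pcap_stats man page documents its counters as running from the start of the capture. The sketch below shows only that bookkeeping. It is a guess at how such a test might be tallied, not the actual test code: struct stat_snapshot is a hypothetical local stand-in for the ps_recv and ps_drop fields of struct pcap_stat, and batch_drops is an invented helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the ps_recv / ps_drop fields of libpcap's
 * struct pcap_stat, so this sketch compiles without libpcap. */
struct stat_snapshot {
    unsigned int recv;  /* packets received since capture start */
    unsigned int drop;  /* packets dropped since capture start  */
};

/* snaps[0] is taken before the first batch, snaps[i] after batch i.
 * Stores the drops that occurred during batch i in per_batch[i - 1]
 * (per_batch must have room for nsnaps - 1 entries).
 * Returns the 1-based number of the batch with the most drops,
 * or 0 if no batch saw any drops. */
size_t batch_drops(const struct stat_snapshot *snaps, size_t nsnaps,
                   unsigned int *per_batch)
{
    size_t worst = 0;
    unsigned int worst_drops = 0;

    for (size_t i = 1; i < nsnaps; i++) {
        /* Counters are cumulative, so the difference is this batch's share. */
        unsigned int d = snaps[i].drop - snaps[i - 1].drop;

        per_batch[i - 1] = d;
        if (d > worst_drops) {
            worst_drops = d;
            worst = i;
        }
    }
    return worst;
}
```

In a real test, each snapshot would be filled from a pcap_stats call made between consecutive pcap_loop calls; the same differencing applies around pcap_dispatch calls.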
The buffer in libpcap only has to be big enough for the chunk of packets libpcap reads - and, in versions of libpcap prior to 1.0.0, it does a recvfrom() on a PF_PACKET socket, and gets one packet at a time, so the buffer in libpcap only needs to be big enough for one packet.
We got this link http://public.lanl.gov/cpw/README.ring.html which talks about various environment variables (PCAP_FRAMES to be precise) that can be used to configure libpcap but I am not sure if this gentleman compiled his own libpcap version or this is applicable to standard distro as well.
It's his own version, so those environment variables don't apply to the standard version. *HOWEVER*, the main thing that his version of libpcap does is support Linux's zero-copy (memory-mapped) capture mechanism. Using that mechanism (or the zero-copy mechanism in FreeBSD 8.0 and later) means that there is a buffer that's in both the kernel's address space and the application's address space, so that data doesn't need to be copied from a kernel-mode buffer to a user-mode buffer. Packets *are* still copied from the skbuff (Linux) or mbuf (FreeBSD) into the shared buffer, so it's really more like "one-copy", but that's still one fewer copy, so that could reduce the CPU time required to receive captured packets. In addition, on Linux, that means that, at least in theory, when the application wakes up as packets arrive, it might be able to receive more than one packet per wakeup - libpcap will take packets from the shared buffer as long as there are packets available. Processing more than one packet per wakeup can also speed up packet processing, so that the application might drop fewer packets. (With BPF - except on AIX - even *without* the zero-copy capture mechanism, more than one packet can be delivered per wakeup, so, whilst the zero-copy mechanism in FreeBSD 8.0 and later will avoid one copy, it shouldn't increase the number of packets delivered per wakeup. In addition, the capture mechanism WinPcap provides on Windows also delivers more than one packet per wakeup.)
Libpcap 1.0.0 and later also support Linux's (and FreeBSD 8.0 and later's) zero-copy capture mechanism, so if you were using libpcap 1.0.0 or later, rather than libpcap 0.9.6, you might drop fewer packets. (As per Dustin Spicuzza's e-mail, "later" is better than "1.0.0"; "later" currently means "top of Git tree".)
May we also know what is this ring buffer people keep talking about?
There's the ring buffer provided by newer versions of the standard Linux kernel; that's what Phil Wood is referring to in the link you mention above. There's also Luca Deri's PF_RING: http://www.ntop.org/PF_RING.html which requires modifications to libpcap to use.
Does libpcap standard distro have a ring buffer (related to the question above)?
Versions of libpcap before 1.0.0 don't support the Linux zero-copy capture mechanism; libpcap 1.0.0 and later do.
And can PCAP_MEMORY or PCAP_FRAMES environment variable help increase it (as in the link above and here http://seclists.org/snort/2009/q1/209)?
Only Phil Wood's libpcap supports those environment variables. However, libpcap 1.0.0 and later have an API that lets an application set the buffer size, on platforms where the buffer size can be set; tcpdump 4.0.0 and later support that API with the "-B" flag. I don't know whether jnetpcap supports the new APIs yet, however.
- This is the tcpdump-workers list. Visit https://cod.sandelman.ca/ to unsubscribe.
This is good background information for me. jNetPcap as a wrapper also supports zero-copy up to the Java stack. We avoid any packet data copies by wrapping around the native memory location and reading packet data directly out of that location. There are packet copy functions as well, for users who have to keep the packet around longer (such as on a queue) and can't process the packet immediately in the packet handler/Java callback function, but what to do with the packets once they are received is something the programmer decides. As to the set_buffer method, it is currently supported by jNetPcap on win32 platforms; I haven't added it to the more general API for the remaining platforms yet. However, all new functions (i.e. pcap_setdirection, etc...) will be added soon and fully supported by the jNetPcap API, with the newer libpcap version prerequisites for each platform. Lastly, I'd be happy to provide access to my build lab, with access to various VMware-based platforms, for any troubleshooting. Cheers, mark... http://jnetpcap.com