tcpdump mailing list archives

Re: mmap consumes more CPU


From: Guy Harris <guy () alum mit edu>
Date: Mon, 26 Nov 2012 14:24:13 -0800


On Nov 26, 2012, at 12:58 PM, abhinav narain <abhinavnarain10 () gmail com> wrote:

@Guy,
Basically, I was adding my own header (instead of radiotap) in the kernel and
processing it in userland with my own code. In effect, I wrote my own pcap
for that.

For your own radio header, what you'd need would be:

        your own ARPHRD_ value (which you'd need the Linux kernel developers to assign - *DO NOT* just pick one and use 
it yourself unless the Linux kernel has a "private use" range of ARPHRD_ values, in which case use one of those but 
don't expect the official libpcap, tcpdump, or Wireshark releases to support it);

        your own LINKTYPE_/DLT_ value to which to map that ARPHRD_ value (which you'd need the libpcap/tcpdump 
developers to assign, unless you choose to use one of the "private use" values DLT_USER0 through DLT_USER15, in which 
case don't expect the official libpcap, tcpdump, or Wireshark releases to use that value);

        a version of libpcap with a pcap-linux.c that maps from your ARPHRD_ value to your DLT_ value.
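
As a rough illustration of the third item, the mapping amounts to one more case in a switch on the ARPHRD_ value. The sketch below is not the actual pcap-linux.c code; the function name and ARPHRD_MYRADIO_PLACEHOLDER (and its value) are hypothetical and exist only to show the shape of the change. The other three numeric values are, as far as I know, the ones in the Linux if_arp.h and pcap/dlt.h headers.

```c
/* Values believed to match <linux/if_arp.h> and <pcap/dlt.h>. */
#ifndef ARPHRD_IEEE80211_RADIOTAP
#define ARPHRD_IEEE80211_RADIOTAP 803
#endif
#ifndef DLT_IEEE802_11_RADIO
#define DLT_IEEE802_11_RADIO 127
#endif
#ifndef DLT_USER0
#define DLT_USER0 147   /* first of the "private use" DLT_ values */
#endif

/*
 * HYPOTHETICAL placeholder, NOT an assigned value.  A real ARPHRD_ value
 * has to come from the Linux kernel developers (or from a private-use
 * range, if the kernel has one).
 */
#define ARPHRD_MYRADIO_PLACEHOLDER 0xFFF0

/*
 * Hypothetical stand-in for the mapping done in pcap-linux.c.
 * Returns the DLT_ value for an ARPHRD_ value, or -1 if this sketch
 * has no mapping for it.
 */
int example_arphrd_to_dlt(int arptype)
{
    switch (arptype) {
    case ARPHRD_IEEE80211_RADIOTAP:
        return DLT_IEEE802_11_RADIO;    /* 802.11 plus radiotap header */
    case ARPHRD_MYRADIO_PLACEHOLDER:
        return DLT_USER0;               /* private header, private DLT_ */
    default:
        return -1;
    }
}
```

A program reading such a capture would then have to check for DLT_USER0 itself and parse the private header on its own, since the official tools attach no meaning to that value.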

Since I did not get the performance, I have now added extra fields in
radiotap.

Note that, unless those extra fields are listed in

        http://www.radiotap.org

the official tcpdump and Wireshark releases will never support them, and, if some other extra fields get officially 
assigned the same "presence bit" values, tcpdump and Wireshark will interpret those bits as indicating the 
officially assigned fields, not your fields.  If you plan to add extra fields to radiotap, 
you should follow the official procedure for standardizing them, as indicated on that page.
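
To make the "presence bit" point concrete: a radiotap parser only sees a bit number in the it_present bitmap, so whoever officially owns that bit number decides how the corresponding bytes are interpreted. The following is a minimal sketch, assuming the standard 8-byte fixed part of the header (version, pad, little-endian length, little-endian first it_present word); the function name is made up for this example, and it deliberately ignores the extended bitmaps that follow when bit 31 is set.

```c
#include <stddef.h>
#include <stdint.h>

/* version (1) + pad (1) + length (2) + first it_present word (4) */
#define RADIOTAP_FIXED_LEN 8u

/* Read a little-endian 32-bit value byte by byte; no alignment assumed. */
static uint32_t le32_at(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: returns 1 if presence bit "bit" is set in the
 * first it_present word, 0 if it is clear, and -1 if the buffer is too
 * short, the version is not 0, or the bit number is out of range.
 */
int radiotap_first_word_bit(const uint8_t *buf, size_t len, unsigned bit)
{
    if (buf == NULL || len < RADIOTAP_FIXED_LEN || bit > 31)
        return -1;
    if (buf[0] != 0)            /* it_version is currently always 0 */
        return -1;
    return (int)((le32_at(buf + 4) >> bit) & 1u);
}
```

Nothing in the header says who defined a given bit, which is why a privately chosen bit collides silently once the same number is officially assigned to something else.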

But I still see high CPU usage.

So you're getting high CPU usage with regular libpcap?  Are you getting higher CPU usage if libpcap is using the 
memory-mapped mechanism than if libpcap is built from the same source, but with the memory-mapped mechanism 
artificially compiled out, and therefore is *not* using the memory-mapped mechanism?

Have you built profiled versions of libpcap, and a profiled version of whatever program you're using, and gotten the 
result of profiling, to see where the CPU time is being spent?

It's interesting that you point out there are more errors during mmap calls.

No, I don't.

What I point out is that *if* recv() is being called a lot *in the standard libpcap mmap-on-Linux code*, *then* you are 
getting a lot of errors; I am *not* saying that you would be getting *more* errors from that code than from anything 
else, such as the non-mmap code.  For the non-mmap-on-Linux code, the recvfrom() call will return both packets and 
error indications, so you wouldn't be making more system calls if you have more errors; for the mmap-on-Linux code, the 
only system calls made in the non-error case are the poll() calls that wait for a new packet to arrive.

Is this anything to do with alignment of frames?

No.  The errors come from the Linux kernel, which just processes raw skbuffs, and, in that code path, doesn't know 
what's part of the radiotap header and what isn't.

*IF* you are getting errors from recv() calls in the memory-mapped code path, then you need to find out *what those 
errors are* - i.e., what errno value is being returned - to have any chance of being able to figure out why the errors 
are occurring.
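
One way to find out is to log errno right after the failing call. This is only a sketch, assuming you can add a few lines of your own around the socket in question; report_recv_error() is a hypothetical helper, not a libpcap function, and it peeks without blocking so that it does not consume data or stall.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: peek at the socket and, if recv() fails, print
 * and return the errno value.  Returns 0 if recv() did not fail.
 */
int report_recv_error(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, sizeof c, MSG_PEEK | MSG_DONTWAIT);

    if (n >= 0)
        return 0;

    int err = errno;    /* save it before any other call can change it */
    fprintf(stderr, "recv() failed: errno %d (%s)\n", err, strerror(err));
    return err;
}
```

Note that EAGAIN (or EWOULDBLOCK) from this helper just means nothing was pending on the socket; any other value is the one worth looking up.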

@Dave: I am running this code on a Netgear router running OpenWrt, so I am
not sure if there is a profiler that can help me out.

The "profiler" is a combination of:

        1) the kernel and libc support for profiling;

        2) the support for the "-pg" flag in the compiler and linker you're using;

        3) the tools to process the files written out by a profiled program after it executes (e.g., the gprof command).

The only part that needs to be present on the machine running the profiled program is 1); if you're doing 
cross-compiling, 2) needs to be present in the cross-development tools on the machine on which you're cross-compiling, 
and 3) needs to be present on some machine in a form that can handle the files in the format in which they're written 
on the machine running the profiled program, e.g., that can handle the endianness of the files as written on the 
machine running the profiled program.  According to

        http://www.sourceware.org/binutils/docs-2.10/gprof_9.html#SEC26

the files in question identify the byte order of the file and gprof automatically handles that.

_______________________________________________
tcpdump-workers mailing list
tcpdump-workers () lists tcpdump org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers

