tcpdump mailing list archives

AIX BPF Problems


From: Shaun <delius () progsoc uts edu au>
Date: Wed, 12 Feb 2003 18:01:18 +1100 (EST)


>> That's the bit that bothers me, on a decent system I could easily motor
>> through processing gobs of data in a second, it's the syscall and sleep
>> overhead that will kill me.
>
> Are you certain that the system call and sleep overhead will
> *definitely* make it impossible?  If so, then, unless you can coax AIX
> into giving you a bigger buffer, you might be out of luck.

Well, I guess impossible is a bit of a relative thing ;) Still working on
impact testing to see how we go.

On a related note I am having some very weird issues with BPF on AIX and
was wondering if anyone might be able to give me some hints.

Basically, it has been discussed previously that occasionally AIX BPF
returns EFAULT to a read(). At the time the running hypothesis was that
this was related in some way to the kernel dropping packets and wasn't
really an issue since reads could continue.

During stress testing I've found that when we receive EFAULT it's not
merely informational: it means we've just lost an entire buffer full of data.
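
For anyone wanting to measure this in their own stress tests, here is a
minimal sketch of a read loop that counts each EFAULT as a lost buffer
instead of ignoring it. The names read_fn, struct read_stats,
drain_counting_efault() and scripted_read() are hypothetical and introduced
only for illustration (none of them come from libpcap). In real use the
callback would wrap read() on the BPF descriptor; the scripted reader is
just a stand-in so the loop can be exercised without a capture device.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback: in real use this would wrap read(fd, buf, len)
 * on the BPF descriptor. It must set errno on failure, like read(). */
typedef ssize_t (*read_fn)(void *ctx, void *buf, size_t len);

/* Hypothetical counters for one capture run. */
struct read_stats {
    unsigned long buffers_ok;   /* reads that returned data */
    unsigned long bytes_ok;     /* total bytes returned */
    unsigned long timeouts;     /* reads that returned 0 */
    unsigned long lost_buffers; /* reads that failed with EFAULT */
};

/*
 * Issue up to max_reads reads. EFAULT is counted as one lost buffer and
 * reading continues; EINTR is skipped (it still uses up one attempt);
 * any other error stops the loop.
 * Returns 0 on completion, -1 on a hard error with errno left set.
 */
int drain_counting_efault(read_fn rd, void *ctx, void *buf, size_t len,
                          unsigned long max_reads, struct read_stats *st)
{
    unsigned long i;

    for (i = 0; i < max_reads; i++) {
        ssize_t n = rd(ctx, buf, len);

        if (n > 0) {
            st->buffers_ok++;
            st->bytes_ok += (unsigned long)n;
        } else if (n == 0) {
            st->timeouts++;
        } else if (errno == EFAULT) {
            st->lost_buffers++;
        } else if (errno != EINTR) {
            return -1;
        }
    }
    return 0;
}

/* Scripted stand-in for the kernel: each step is a byte count to return,
 * or a negated errno value to fail with. */
struct script {
    const int *steps;
    size_t count;
    size_t next;
};

ssize_t scripted_read(void *ctx, void *buf, size_t len)
{
    struct script *s = ctx;
    int step;

    (void)buf;                      /* the sketch never inspects the data */
    if (s->next >= s->count)
        return 0;                   /* script exhausted: behave like a timeout */
    step = s->steps[s->next++];
    if (step < 0) {
        errno = -step;
        return -1;
    }
    return (size_t)step < len ? (ssize_t)step : (ssize_t)len;
}
```

Comparing lost_buffers against the kernel's own drop counters after a run
should show whether each EFAULT really corresponds to a full buffer of
missing packets.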

The question (for me at least) was then why AIX's native tcpdump never
seems to get EFAULT from a read().

Here's where it gets really weird; I hope I can explain the exact symptoms
of the problem clearly.

AIX's tcpdump is based on an older version of libpcap that doesn't do the
magic BIOCSBLEN processing. It simply lets the system run with its
existing buffer size (which defaults to 4096). If I modify the new libpcap
to not do the magic BIOCSBLEN processing, we get the same results, i.e. no
EFAULTs and all data returned as expected.
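
To make the "magic BIOCSBLEN processing" concrete, here is a rough sketch
of what I understand it to do: start at some size and keep halving until
the kernel accepts one. The names try_size_fn, pick_buffer_size() and
fake_kernel_try() are hypothetical and exist only for illustration; they
are not libpcap functions. In real code the callback would issue the
BIOCSBLEN ioctl and then try to bind the interface, and the assumption
that a too-large size is reported as ENOBUFS is mine.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: ask the kernel for a buffer of `size` bytes.
 * Returns 0 if the size was accepted, -1 with errno set otherwise. */
typedef int (*try_size_fn)(void *ctx, unsigned int size);

/*
 * Start at `start` and halve until the callback accepts a size.
 * Returns the accepted size, or 0 if no size was accepted or the
 * callback failed with something other than ENOBUFS (errno is left
 * as the callback set it).
 */
unsigned int pick_buffer_size(unsigned int start, try_size_fn try_size,
                              void *ctx)
{
    unsigned int v;

    for (v = start; v != 0; v >>= 1) {
        if (try_size(ctx, v) == 0)
            return v;
        if (errno != ENOBUFS)
            return 0;   /* assumed: only ENOBUFS means "too big, try smaller" */
    }
    return 0;
}

/* Stand-in for the kernel: accepts any size up to the limit in *ctx. */
int fake_kernel_try(void *ctx, unsigned int size)
{
    const unsigned int *limit = ctx;

    if (size <= *limit)
        return 0;
    errno = ENOBUFS;
    return -1;
}
```

For example, starting at 32768 against a stand-in kernel that tops out at
16384 settles on 16384, while starting at 4096 simply keeps 4096, which is
the case where the selection never moves off the default size.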

As discussed earlier, I'd prefer to have as large a kernel buffer as
humanly possible, so I experimented to see what was causing the EFAULTs:
        - I found that if I didn't use pcap_findalldevs() at all and
simply opened the device immediately, everything worked perfectly (i.e. no
EFAULTs), with the magic buffer size selection settling on 16384
        - If I listed all the interfaces first using pcap_findalldevs
(opening and doing the magic buffer selection of 16384 on them), it
doesn't work properly
        - If I list all of the interfaces first but don't do the magic
buffer size selection (i.e. choose 4096 as the start of the magic buffer
selection code), it seems to work properly
        - If we do the same as above but start buffer size selection at
8k, it doesn't work
        - If we stop pcap_open_live() from working on any interface except
en0, i.e. rejecting them before making any attempt to process them, it
doesn't work
        - If we open and close the en0 device a number of times (straight
off, no list of devices), it works perfectly with magic buffer size
selection of 16384
        - Disabling the pcap_open_live(name, 68, 0, 0, errbuf) call in
add_or_find_if() causes it all to work fine, with a magic discovered buffer
size of 16384
                - Disabling pcap_open_live for any device other than en0
and re-enabling the pcap_open_live() call in add_or_find_if() does NOT
help
        - If I call pcap_open_live(sInterface, 68, 0, 0, acErrBuf) on the
interface before listing devices it works ok
        - Disabling or enabling timeout processing in pcap_open_live() has
no noticeable effect
        - Disabling the reconfiguration of the driver in bpf_open() had no
effect at all

So... I can't see much logic in this madness. Any ideas?

Thanks,
Shaun


-
This is the TCPDUMP workers list. It is archived at
http://www.tcpdump.org/lists/workers/index.html
To unsubscribe use mailto:tcpdump-workers-request () tcpdump org?body=unsubscribe

