Nmap Development mailing list archives

Re: OS X 10.6 diagnosis: pcap timeout and bpf device access


From: Walt Scrivens <walts () gate net>
Date: Sat, 7 Nov 2009 14:57:29 -0500

David,
Thanks for sticking with this. You've done an impressive bit of analysis work. Your explanation is so good that even I begin to understand what's going wrong, although I suppose the chances of Apple ever doing anything about it are slim to none.

Since the problem doesn't happen in the released version 5, the problems you've uncovered are specific to 5.05BETA1. Do we know why those changes were made, and what the impact of reversing them would be?

Walt

On Nov 7, 2009, at 1:01 PM, David Fifield wrote:

On Thu, Nov 05, 2009 at 09:10:51PM -0700, David Fifield wrote:
On Thu, Oct 15, 2009 at 07:10:49PM -0500, Tom Sellers wrote:
sudo nmap -sP -d9 scanme.nmap.org   chokes here until I kill it:

**********************************************************************
Starting Nmap 5.05BETA1 ( http://nmap.org ) at 2009-10-15 19:04 CDT
The max # of sockets we are using is: 0
--------------- Timing report ---------------
 hostgroups: min 1, max 100000
 rtt-timeouts: init 1000, min 100, max 10000
 max-scan-delay: TCP 1000, UDP 1000, SCTP 1000
 parallelism: min 0, max 0
 max-retries: 10, host-timeout: 0
 min-rate: 0, max-rate: 0
---------------------------------------------
Initiating Ping Scan at 19:04
Scanning 64.13.134.52 [4 ports]
Pcap filter: dst host 192.168.200.77 and (icmp or ((tcp or udp or sctp) and (src host 64.13.134.52)))
Packet capture filter (device en1): dst host 192.168.200.77 and (icmp or ((tcp or udp or sctp) and (src host 64.13.134.52)))
SENT (0.0040s) ICMP 192.168.200.77 > 64.13.134.52 echo request (type=8/code=0) ttl=51 id=6159 iplen=28
SENT (0.0040s) TCP 192.168.200.77:48278 > 64.13.134.52:443 S ttl=50 id=14485 iplen=44 seq=1618135438 win=3072 <mss 1460>
SENT (0.0040s) TCP 192.168.200.77:48278 > 64.13.134.52:80 A ttl=53 id=3056 iplen=40 seq=1618135438 win=2048 ack=3449250024
SENT (0.0040s) ICMP 192.168.200.77 > 64.13.134.52 Timestamp request (type=13/code=0) ttl=39 id=29241 iplen=40
**TIMING STATS** (0.0040s): IP, probes active/freshportsleft/retry_stack/outstanding/retranwait/onbench, cwnd/ssthresh/delay, timeout/srtt/rttvar/
  Groupstats (1/1 incomplete): 4/*/*/*/*/* 10.00/75/* 1000000/-1/-1
  64.13.134.52: 4/0/0/4/0/0 10.00/75/0 1000000/-1/-1
Current sending rates: 3988.04 packets / s, 151545.36 bytes / s.
Overall sending rates: 3988.04 packets / s, 151545.36 bytes / s.

I can reproduce this now with 10.6 (upgraded from an existing 10.5
installation) and the Xcode that comes with it. However, Nmap runs fine
if I have a tcpdump or Wireshark capture running at the same time. Is it
the same for you? Any ideas why that might be?

I have been looking into this problem, and I think I have found the
cause, or rather causes, both of which appear to be Apple bugs. The
first is that setting timeouts for read events doesn't work unless the
timeout is at least 1000 milliseconds. The second is that opening a
/dev/bpf? device in O_WRONLY mode and binding it to an interface causes
all other listeners on the interface to see only outgoing traffic. I
don't know of a nice quick fix for these problems.

As for the timeout problem, Nmap opens the pcap device like this:

USI->pd = my_pcap_open_live(Targets[0]->deviceName(), 100, (o.spoofsource)? 1 : 0, pcap_selectable_fd_valid()? 200 : 2);

The last parameter is the timeout. pcap_selectable_fd_valid() is false
on OS X, so we're requesting a timeout of 2 ms. I wrote a small test
program that just reads packets and prints them. The minimum timeout I
found I could use was 1000; 999 or less caused reads to block forever. I
think this is a bug caused by a switch to 64-bit code in OS X 10.6.
Wireshark made a change in their dumpcap program to use a timeout of
1000 rather than 250 on 64-bit Apple platforms.

http://www.mail-archive.com/wireshark-dev@wireshark.org/msg15101.html
http://anonsvn.wireshark.org/viewvc/trunk/dumpcap.c?r1=29591&r2=29641

Increasing the timeout to 1000 keeps pcap_next from blocking forever,
but Nmap still doesn't work because of the problem I'll describe next.
Even when that is worked around, waiting up to 1 s for each pcap read is
unacceptable.
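
A minimal sketch of the timeout workaround described above, in C. The
200 and 2 ms values come from the my_pcap_open_live call quoted earlier
and the 1000 ms floor is the smallest value that was observed to work.
The helper names, and the assumption that 64-bit Apple builds are the
affected ones, are hypothetical and are not taken from Nmap or
Wireshark source.

```c
#include <assert.h>

/* 200 and 2 are the values in the call quoted above; 1000 ms is the
 * smallest timeout that was observed to work on OS X 10.6. */
enum {
    TIMEOUT_SELECTABLE_MS = 200,
    TIMEOUT_NOT_SELECTABLE_MS = 2,
    TIMEOUT_FLOOR_MS = 1000
};

/* Hypothetical check, not Nmap code: assume that 64-bit Apple builds
 * are the ones where short read timeouts block forever. */
int short_timeouts_broken(void)
{
#if defined(__APPLE__) && defined(__LP64__)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: choose the timeout argument for the pcap open
 * call, raising it to the floor when short timeouts are known broken. */
int choose_pcap_timeout_ms(int selectable_fd_valid, int timeouts_broken)
{
    int timeout_ms = selectable_fd_valid ? TIMEOUT_SELECTABLE_MS
                                         : TIMEOUT_NOT_SELECTABLE_MS;

    if (timeouts_broken && timeout_ms < TIMEOUT_FLOOR_MS)
        timeout_ms = TIMEOUT_FLOOR_MS;

    assert(timeout_ms > 0);
    return timeout_ms;
}
```

A caller would pass choose_pcap_timeout_ms(pcap_selectable_fd_valid(),
short_timeouts_broken()) as the last argument. This only avoids the
indefinite block; it does nothing about the 1 s latency per read.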

The second problem is with bpf devices. I noticed that --packet-trace
was reporting only outgoing traffic, even if I disabled the BPF filter
entirely. As I noted above, if I started a tcpdump capture before
starting Nmap, I saw traffic in both directions. But if I started Nmap
first, paused it with ctrl-Z, then started tcpdump second, tcpdump would
see only outgoing traffic!

I isolated the line in Nmap at which this bizarre behavior begins, and
it is when eth_open is called. That function is defined in
libdnet-stripped/src/eth-bsd.c, and does this in part:

        for (i = 0; i < 128; i++) {
                snprintf(file, sizeof(file), "/dev/bpf%d", i);
                e->fd = open(file, O_WRONLY);
                if (e->fd != -1 || errno != EBUSY)
                        break;
        }
        strlcpy(ifr.ifr_name, device, sizeof(ifr.ifr_name));
        if (ioctl(e->fd, BIOCSETIF, (char *)&ifr) < 0)
                return (eth_close(e));

Somehow, whoever first opens a bpf device and binds it to an interface
with BIOCSETIF controls access for all other users of the interface.
libdnet opens with O_WRONLY because it only intends to use the interface
to send, but in doing so prevents incoming traffic from being read, even
for different bpf devices, and even in different processes! The anomaly
lasts until the last process that has a bpf handle closes it, at which
point it can be opened with different access.

Increasing the timeout to 1000, and changing the access mode to O_RDWR,
allows Nmap to run, albeit slowly.
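
The access-mode change can be sketched as follows. This is an
illustration modeled on the eth_open loop quoted above, not the actual
libdnet patch: the function name open_first_free_device and its
parameters are hypothetical, and the BIOCSETIF binding step is left out
so that the sketch stays portable.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: try prefix0, prefix1, ... and return the first
 * descriptor that opens, or -1. As in the quoted loop, only EBUSY
 * (device already in use) moves on to the next device; any other error
 * stops the search. The one behavioral change is O_RDWR in place of
 * O_WRONLY. */
int open_first_free_device(const char *prefix, int max_devices)
{
    char file[64];
    int fd = -1;
    int i;

    assert(prefix != NULL);

    for (i = 0; i < max_devices; i++) {
        snprintf(file, sizeof(file), "%s%d", prefix, i);
        fd = open(file, O_RDWR);
        if (fd != -1 || errno != EBUSY)
            break;
    }
    return fd;
}
```

On OS X the call would be open_first_free_device("/dev/bpf", 128),
followed by the same BIOCSETIF ioctl as before.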

How to handle this? The O_RDWR change, while ugly, is pretty innocuous.
We use a short timeout for pcap reads on OS X because you can't select
on a pcap file descriptor. However, apparently poll works for pcap
descriptors in 10.6, where it didn't work before. So doing some kind of
configuration detection and using poll when appropriate is an option.
These tests were all with a stock installation of 10.6. I'll try
updating to whatever the latest version is and see if anything is
different.
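
A rough sketch of the poll option, assuming a descriptor on which poll
actually works. In Nmap that descriptor would presumably come from
libpcap's pcap_get_selectable_fd(), but the wiring is not shown here,
and the names wait_readable, read_with_timeout and READ_TIMED_OUT are
hypothetical. The code uses only POSIX calls, so it can be tried on any
descriptor, such as a pipe.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define READ_TIMED_OUT (-2)

/* Wait until fd has data to read.
 * Returns 1 if readable, 0 if timeout_ms elapsed, -1 on error.
 * A signal interruption restarts the wait with the full timeout. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n == -1 && errno == EINTR);

    if (n <= 0)
        return n;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Read up to len bytes, giving up after timeout_ms.
 * Returns the read() result, READ_TIMED_OUT on timeout, -1 on error. */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_ms)
{
    int ready = wait_readable(fd, timeout_ms);

    if (ready == 0)
        return READ_TIMED_OUT;
    if (ready < 0)
        return -1;
    return read(fd, buf, len);
}
```

With this approach the timeout is enforced by poll rather than by the
pcap read timeout, so the short-timeout bug would no longer matter.
Whether poll behaves correctly on bpf descriptors for a given OS X
release is exactly what the configuration detection would have to
establish.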

David Fifield

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

