Nmap Development mailing list archives

Re: [RFC] Mass rDNS performance tweak


From: David Fifield <david () bamsoftware com>
Date: Thu, 22 Jan 2009 12:01:15 -0700

On Wed, Jan 14, 2009 at 09:40:13AM +0000, jah wrote:
> I've found that when performing reverse DNS resolution using nmap's
> async resolver and my ISP's DNS servers, the accuracy of the results is
> about 65-75% depending on the number of DNS servers being used and that
> somewhere between 2 and 3 requests are sent per target.  Here's a
> typical stat:
>
> 250 IPs took 13.70s. Mode: Async
> [#: 1, OK: 171, NX: 0, DR: 79, TO: 363, SF: 0, TR: 534, CN: 0]
>
> Here you can see I'm querying a single (#: 1) server,
> we got OK: 171 PTR record responses,
> NX: 0 No such name responses,
> we dropped DR: 79 targets because we tried and failed 3 times to get a
> PTR record,
> TO: 363 requests timed out (a stat not currently implemented in nmap),
> SF: 0 server fail responses,
> we transmitted a total of TR: 534 requests
> and got CN: 0 CNAME responses.
>
> The same 250 target IPs, which had been generated with:
> nmap -sL -iR 1100 --system-dns | grep "[0-9])" | awk '{ print $3 }' \
> | sed 's/(\([^)]*\))/\1/g' | sort -n | uniq | head -n 250 > ptr_list
>
> (all having PTR records) were resolved using nmap --system-dns in about
> 50 seconds.
>
> So whilst the async resolving is quick, you can see that we failed to
> get the PTR records for 79 targets and that 534 requests were sent.

Thanks for the detailed description. This is the way development should
be done, with documented procedures and measurable, repeatable results.
I created a ptr_list according to your instructions and ran a test three
times, with and without the patch, with and without --system-dns.

nmap parallel
13.31s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, SF: 0, TR: 428, CN: 0]
14.58s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, SF: 0, TR: 425, CN: 0]
10.29s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, SF: 0, TR: 365, CN: 0]

nmap system
15.61s. Mode: System [OK: 250, ??: 0]
14.37s. Mode: System [OK: 250, ??: 0]
14.77s. Mode: System [OK: 250, ??: 0]

nmap_dns.cc.patch parallel
 9.13s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, TO: 3, SF: 0, TR: 253, CN: 0]
10.02s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, TO: 1, SF: 0, TR: 251, CN: 0]
 9.48s. Mode: Async [#: 2, OK: 250, NX: 0, DR: 0, TO: 2, SF: 0, TR: 252, CN: 0]

nmap_dns.cc.patch system
14.67s. Mode: System [OK: 250, ??: 0]
14.96s. Mode: System [OK: 250, ??: 0]
15.30s. Mode: System [OK: 250, ??: 0]

So as you can see, I didn't have a problem with accuracy in any case,
but the patched parallel resolver was the fastest and sent the fewest
probes, with the average being very close to 1 probe per host.

(By the way, this is an easy test that anyone can do. I just made a
separate checkout of nmap in an nmap-temp directory next to my usual
checkout, and applied the patch there. Then I made a shell script with
the contents
        ./nmap/nmap -sL -d -iL ptr_list | grep ^DNS
        ./nmap/nmap -sL -d -iL ptr_list --system-dns | grep ^DNS
        ./nmap-temp/nmap -sL -d -iL ptr_list | grep ^DNS
        ./nmap-temp/nmap -sL -d -iL ptr_list --system-dns | grep ^DNS
and ran it three times. Please do the same and share your results.)

The patch works great for me, at least in the case where all hosts have
PTR records. I have a few questions and observations about the patch.

> For the DNS servers I have use of, this algorithm is too aggressive and
> something between me and the server drops requests - I imagine when they
> are sent too quickly.
>
> Right now, I've had the best results by doing the following:
>
> We start with a capacity of 2 and don't increase this value until the
> read timeout for the first request has elapsed (4s if using one DNS
> server and 2.5 seconds if using more than one).
> Reset the timer after the first capacity increase and allow a maximum of
> 50 capacity increases during that period (again the read timeout).  This
> repeats until resolution is complete.
> Capacity increases (a maximum of 0.1) and decreases are linked to ratio
> of responses to timeouts:
> capacity -= (float) drop_ratio
> capacity += 1 / (100 * MAX(drop_ratio, 0.1))
> CAPACITY_MAJOR_DOWN_STEP is no longer performed because I feel that a
> request that does not complete is much less likely to be capacity
> related now that the algorithm is less aggressive.
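
To check that I'm reading the gating right, here is the scheme as I
understand it, in rough code. The names, types, and the floor at the
end are mine, not from your patch, so please correct anything I've
mangled:

        #include <algorithm>
        #include <cstddef>
        #include <sys/time.h>

        static double capacity = 2.0;         /* start at 2 */
        static double drop_ratio = 0.0;       /* updated elsewhere; see my question below */
        static double read_timeout = 2.5;     /* 4 s with one server, 2.5 s with more */
        static int increases_left = 0;        /* no increases until the first read timeout */
        static struct timeval period_start;   /* set when the first request goes out */

        static double seconds_since(const struct timeval &t) {
          struct timeval now;
          gettimeofday(&now, NULL);
          return (now.tv_sec - t.tv_sec) + (now.tv_usec - t.tv_usec) / 1e6;
        }

        /* On each response: at most 50 increases of at most 0.1 each per
           read-timeout period, i.e. at most +5 per period. */
        static void on_response(void) {
          if (seconds_since(period_start) >= read_timeout) {
            gettimeofday(&period_start, NULL);  /* reset the timer... */
            increases_left = 50;                /* ...and the increase budget */
          }
          if (increases_left > 0) {
            capacity += 1.0 / (100.0 * std::max(drop_ratio, 0.1));
            increases_left--;
          }
        }

        /* On each timeout: no CAPACITY_MAJOR_DOWN_STEP any more. */
        static void on_timeout(void) {
          capacity -= drop_ratio;
          capacity = std::max(capacity, 1.0);   /* my addition: keep at least one outstanding */
        }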

I don't understand the meaning of drop_ratio and how it affects
capacity. It's defined as
        drop_ratio = reqs_timedout / reqs_completed;
Shouldn't it rather be
        drop_ratio = reqs_timedout / (reqs_timedout + reqs_completed);
to make it a ratio strictly between 0 and 1? As it is I don't know how
to interpret it, except that drop_ratio > 1 means that more than 50% of
requests timed out.
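
For concreteness, here are the two definitions side by side, using the
6-timeout/5-completion numbers from below (standalone arithmetic, not
patch code; everything except reqs_timedout and reqs_completed is made
up for illustration):

        #include <cstdio>

        int main(void) {
          int reqs_timedout = 6, reqs_completed = 5;
          double as_in_patch = (double) reqs_timedout / reqs_completed;
          double bounded = (double) reqs_timedout
                           / (reqs_timedout + reqs_completed);
          /* prints 1.200 (can exceed 1) and 0.545 (stays between 0 and 1) */
          printf("patch definition:   %.3f\n", as_in_patch);
          printf("bounded definition: %.3f\n", bounded);
          return 0;
        }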

Likewise, what does this mean?
        capacity -= (float) drop_ratio
More drops means a greater reduction in capacity, everything else being
equal. But shouldn't drop ratios (defined as in the patch) of 6 / 5 and
600 / 500 cause different absolute changes in capacity? In the first
case it seems we should reduce capacity by about 6; in the second, by
about 600. The code in the patch will change it by -1.2 in both cases.

For the linear growth on capacity increase, you may be able to do
        capacity += 1.0 / capacity;
(Possibly with a different constant in place of the 1.0.) This is what
TCP does during congestion avoidance mode; see below.
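
Back-of-the-envelope for why that gives linear growth (my arithmetic;
it assumes roughly `capacity` responses arrive in the time one request
takes to complete):

        /* Each response adds 1.0 / capacity, and roughly `capacity`
           responses arrive in the time one request takes to complete,
           so over that interval the total increase is about
           capacity * (1.0 / capacity) = 1, i.e. roughly +1 per
           interval no matter how large capacity already is. */
        capacity += 1.0 / capacity;   /* applied once per response */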

Going back to your description of the current DNS code,

> When a response is received from a server, its capacity is increased by
> CAPACITY_UP_STEP (currently 2).

This algorithm leads to exponential growth of the capacity, just like in
TCP's slow start mode. Because the DNS code doesn't have anything like
TCP's linear congestion avoidance mode, it's like it's perpetually in
slow start. This does sound like a recipe for big fluctuations.

(While incrementing by a fixed amount may look like linear growth, the
rate at which those increments happen is proportional to the current
capacity, so the increase per unit time is proportional to 2 * capacity.
It's exponential growth when the rate of increase is proportional to the
current size.)
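
To make the difference concrete, here is a toy calculation (not nmap
code; it just assumes about `capacity` responses arrive per unit of
time and ignores timeouts and decreases):

        #include <cstdio>

        int main(void) {
          double cap_current = 2.0;     /* +CAPACITY_UP_STEP (2) per response */
          double cap_suggested = 2.0;   /* +1/capacity per response */

          for (int step = 1; step <= 5; step++) {
            /* about `capacity` responses arrive during this step */
            cap_current += 2.0 * cap_current;                        /* exponential */
            cap_suggested += cap_suggested * (1.0 / cap_suggested);  /* about +1: linear */
            printf("step %d: current %7.1f   suggested %4.1f\n",
                   step, cap_current, cap_suggested);
          }
          return 0;
        }

After five steps the first column has blown up past 400 while the
second has crept from 2 to 7, which is the slow-start versus
congestion-avoidance contrast in miniature.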

> What this means is, depending on the drop ratio, we'll increase capacity
> by a maximum of 5 during any timeslot, which allows timed-out
> requests to balance the capacity with decreases.

If I understand correctly, this is linear growth, just like TCP's
congestion avoidance mode. In TCP, the congestion window grows by 1
every RTT; for you the capacity grows by 5 every timeslot.

Starting and remaining in "congestion avoidance" mode rather than "slow
start" mode strikes me as reasonable.

David Fifield

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org

