Nmap Development mailing list archives

Re: Tudor's Status Report - #13 of 17


From: Daniel Miller <bonsaiviking () gmail com>
Date: Sun, 31 Jul 2016 08:08:19 -0500

Tudor,

This makes a lot of sense. I have a few suggestions, though:

1. We must be absolutely sure that these two checks are doing the same
thing if we are to rely on one to satisfy the other. The obvious fix would
be to have a single function perform the task, but that might require some
refactoring of data structures and the function signature. That would be a
worthwhile effort to ensure correctness.

2. If the problem is a linear lookup of one address in a set of others, we
have a data structure for doing this in nbase called "addrset" which we use
for --exclude-hosts lookups. Unfortunately, it also uses a linear lookup. We
have had a goal of changing it to some other data structure, since Brandon,
among others, has complained that it kills performance for scans with many
exclusions. Ideally, the data structure would compress contiguous blocks of
addresses and function a bit like a router, making a simple set membership
decision. One proposal was a Binary Decision Diagram, but anything we use
would have to be evaluated for correctness and performance.
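As a rough illustration of the "compress contiguous blocks" idea, here is a
sketch of a range-based set with O(log n) membership. This is hypothetical
code, not nbase's actual addrset API: it keys on host-order IPv4 addresses
only, and the struct and method names are made up for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch (not nbase's addrset API): store excluded IPv4
// addresses as sorted, merged [start, end] ranges and answer membership
// queries with binary search instead of a linear walk.
struct RangeSet {
    // Each pair is an inclusive range of host-order IPv4 addresses.
    std::vector<std::pair<uint32_t, uint32_t>> ranges;

    // Insert a CIDR block, e.g. add_cidr(0x0A000000, 8) for 10.0.0.0/8.
    void add_cidr(uint32_t net, int prefix) {
        uint32_t mask = prefix == 0 ? 0 : ~uint32_t(0) << (32 - prefix);
        ranges.emplace_back(net & mask, (net & mask) | ~mask);
    }

    // Sort and merge overlapping/adjacent ranges; call once after inserts.
    void finalize() {
        std::sort(ranges.begin(), ranges.end());
        std::vector<std::pair<uint32_t, uint32_t>> merged;
        for (auto &r : ranges) {
            if (!merged.empty() && merged.back().second != ~uint32_t(0) &&
                r.first <= merged.back().second + 1)
                merged.back().second = std::max(merged.back().second, r.second);
            else
                merged.push_back(r);
        }
        ranges.swap(merged);
    }

    // O(log n) membership test: like a router's lookup, but reduced to a
    // simple yes/no decision.
    bool contains(uint32_t addr) const {
        auto it = std::upper_bound(ranges.begin(), ranges.end(),
                                   std::make_pair(addr, ~uint32_t(0)));
        return it != ranges.begin() && addr <= std::prev(it)->second;
    }
};
```

A trie or BDD would handle sparse, scattered exclusions better; the point
here is only that contiguous blocks collapse to one entry each, so the
per-target cost no longer grows with the number of excluded addresses.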

Making ping_group_sz at least as big as the minimum hostgroup makes sense
as a simple optimization.

Thanks for this important work!

Dan

On Sun, Jul 31, 2016 at 4:38 AM, Tudor-Emil COMAN <
tudor_emil.coman () cti pub ro> wrote:

Dan,


Well, it's like you said: that check is already done for batches of
4096 targets in targets.cc::refresh_hostbatch().

You only need to call target_needs_new_hostgroup in nmap.cc if you are
combining targets from different batches.


Let's say you are scanning 5000 hosts, the first 1000 are down, and the
rest are up. You specified --min-hostgroup 5000.

From the first batch you get 3096 Targets in your hostgroup (the first
1000 are down). You only actually need to do the check for that hostgroup
for the next 904 targets, which come from the second batch.
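The arithmetic above can be sketched as a toy model. The helper name is
made up for illustration, and it assumes all hosts beyond the first batch
are up (as in the example); the real logic lives in
targets.cc::refresh_hostbatch() and nmap.cc.

```cpp
#include <algorithm>

// Toy model of the example above (hypothetical helper, not Nmap code).
// Hosts are discovered in batches of PING_GROUP_SZ (4096); duplicates
// are already resolved *within* a batch, so the per-hostgroup check in
// nmap.cc only matters for targets pulled from a second (or later) batch.
const int PING_GROUP_SZ = 4096;

int cross_batch_targets(int total_hosts, int down_in_first_batch,
                        int min_hostgroup) {
    int first_batch = std::min(total_hosts, PING_GROUP_SZ);
    int up_in_first = first_batch - down_in_first_batch;
    if (min_hostgroup <= up_in_first)
        return 0;  // the whole group fits in one batch: no check needed
    // Targets that must come from later batches and thus need the check.
    int remaining = total_hosts - first_batch;
    return std::min(min_hostgroup - up_in_first, remaining);
}
```

For the example's numbers (5000 hosts, 1000 down, --min-hostgroup 5000)
this yields 904, matching the count above.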


For a -Pn scan (all hosts are considered up), having o.ping_group_sz
match the size of the hostgroup means that calling
target_needs_new_hostgroup in nmap.cc is redundant.

I'm currently trying to see if modifying that value is safe.


The slow part of target_needs_new_hostgroup is iterating through all the
targets currently in the hostgroup and checking whether one of them has the
same IP address as the one you are trying to add; this seems to make a lot
of difference for large scans.


Increasing the size of the host discovery hostgroup to match a potentially
bigger hostgroup value given by the user should make the scan faster anyway.


Thanks,

Tudor
------------------------------
*From:* Daniel Miller <bonsaiviking () gmail com>
*Sent:* Tuesday, July 26, 2016 4:41:11 PM
*To:* Tudor-Emil COMAN
*Cc:* dev () nmap org
*Subject:* Re: Tudor's Status Report - #13 of 17

Tudor,

I have some questions regarding these performance improvements.

- Removed some unnecessary calls to target_needs_new_hostgroup() in
nmap.cc. This one was really a performance killer, and I made it so that
for hostgroups smaller than PING_GROUP_SZ (currently 4096) it doesn't get
called at all.

Are we sure this can be taken out? I see two sides to this: first, the
call is important because if a target gets into the wrong hostgroup, it
could result in sending packets out the wrong interface or with the wrong
source address, etc. The other side of it is that this sorting should have
already been done in the host discovery ("ping scanning") phase by the
function of the same name in targets.cc. I guess the code might be easier
to understand directly, but could you explain a bit more why this
improvement is possible?

Performance gains are visible for really high packet rates.

For something like: ./nmap  54.239.156.68/16 -sS -Pn -p 80 -T5 -n --open
--min-rate 130000 --min-hostgroup 16384

Without these optimizations the scan took about 9.6 seconds, with these
optimizations it takes about 8.5 seconds.


Are you also checking smaller scans (CIDR /24 is typical) and normal
packet rates to be sure that we're not regressing in more typical cases?


Thanks for your hard work!
Dan


_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
