Nmap Development mailing list archives

Timing race cars with a sundial (-sV match performance)


From: Brandon Enright <bmenrigh () ucsd edu>
Date: Fri, 8 May 2009 22:07:03 +0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Developers,

A few weeks ago I embarked on a (mostly) futile attempt to measure the
performance of applying a PCRE match expression in our
nmap-service-probes file to various network services.

To do this, I created a SVN branch:

nmap-exp/brandon/nmap-service-perf

Sadly, probably the most useful aspect of the branch is that I added a
new output type called "log statistics" that can be turned on with '-OT'
and used via log_write(LOG_STATS, ...).

To measure the cost of applying a match, I first made all matches fail
so that Nmap would try all of them.  Second, I used the libc clock()
routine to measure the CPU time (in clock ticks) that PCRE spends
testing the match.  Unfortunately, at least on my system, clock() has a
maximum resolution of 10,000 ticks.  On a modern super-scalar processor
with big caches, a heck of a lot of work can be done in 10,000 ticks.
In fact, barring (unpredictable) cache misses and pipeline stalls, it
seems PCRE can almost always test a match against a network service in
less than 10,000 ticks, thereby giving me a total time spent of "0",
hence the subject line.
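
As a rough illustration of the resolution problem (sketched in Python
rather than the C used in the branch; the 10 ms tick, the function names,
the probe and the banner are all assumptions made up for this example), a
single match attempt timed with a coarse clock almost always reads as
zero, while timing a batch of attempts and dividing gives a usable
average:

```python
import re
import time

TICK = 0.010  # assumed timer granularity in seconds (illustrative, not measured)


def coarse_cpu_time():
    """CPU time rounded down to a multiple of TICK, imitating a coarse clock."""
    return (time.process_time() // TICK) * TICK


def time_once(pattern, data):
    """Time one match attempt with the coarse clock; this usually returns 0.0."""
    start = coarse_cpu_time()
    pattern.search(data)
    return coarse_cpu_time() - start


def time_batched(pattern, data, repeats=2000):
    """Average CPU time per attempt, measured across many repeated attempts."""
    start = time.process_time()
    for _ in range(repeats):
        pattern.search(data)
    return (time.process_time() - start) / repeats


if __name__ == "__main__":
    probe = re.compile(rb"^220 ([-.\w]+) ESMTP")  # hypothetical match expression
    banner = b"SSH-2.0-OpenSSH_5.1\r\n"           # hypothetical non-matching reply
    print("single attempt:", time_once(probe, banner))
    print("per attempt, batched:", time_batched(probe, banner))
```

Batching hides per-attempt variation such as cache misses, so it only
gives an average, and Python's re engine is not PCRE, so the numbers are
not comparable to Nmap's.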

Realizing that all but a few match lines were nearly always going to
return 0 time spent, I decided to collect a large amount of data to try
to see which matches had a higher propensity for taking a measurable
amount of time to complete.

The result:

We have one match that is *much* slower than all the rest:
match printer m|([-.\w]+): lpd: Your host does not have line printer access\n| p|BSD/Linux lpd| h|$1| i|hostname denied|

One service that is almost as slow as the previous:
match ser2net m|^.*\r\nser2net port \d+ device (/dev/[-\w_]+) \[\d+ \w+\] \(Debian GNU/Linux\)\r\n|s p/serial to network proxy/ i/Debian; serial port $1/ o/Linux/

And 3 matches that stand out as being pretty slow:
match telnet m|^\xff\xfb\x01.*\n\rWelcome to the Xylan PizzaSwitch! Version (\d[-.\w]+)\n\rlogin   : |s p/Xylan PizzaSwitch telnetd/ v/$1/ d/switch/
match mupdate m|\* OK MUPDATE \"([-.\w]+)\" \"Cyrus Murder\" \"v([-.\w]+)\" \"mupdate://([-.\w]+)\"\r\n| p/Cyrus Murder Slave/ h/$1/ v/$2/ i/Master: $3/
match mupdate m|\* OK MUPDATE \"([-.\w]+)\" \"Cyrus Murder\" \"v([-.\w]+)\" \"\(master\)\"\r\n| p/Cyrus Murder Master/ h/$1/ v/$2/

All the other services pretty much fall in line and optimizing them
would be an exercise in futility.

As for optimizing these matches, for the LPD match we really need to
add an anchor.  I did some checking, and every UCSD host that matches
this service still matches with '^' added.
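
To show why the anchor matters, here is a small sketch using Python's re
module as a stand-in for PCRE (the two engines differ in detail, and the
sample replies and helper names are invented for illustration).  Without
'^', the engine retries the match at every offset of a non-matching
response, and each retry lets '[-.\w]+' run across the remaining word
characters before giving up.  With '^', only offset 0 is tried:

```python
import re
import time

# Match body copied from the LPD line above; the anchored variant only adds '^'.
LPD_BODY = r"([-.\w]+): lpd: Your host does not have line printer access\n"
UNANCHORED = re.compile(LPD_BODY)
ANCHORED = re.compile("^" + LPD_BODY)


def hostname(pattern, response):
    """Return the captured hostname ($1 in the match line), or None on no match."""
    m = pattern.search(response)
    return m.group(1) if m else None


def seconds_to_search(pattern, response):
    """Wall-clock time for one search attempt (noisy; for rough comparison only)."""
    start = time.perf_counter()
    pattern.search(response)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical LPD reply and a hypothetical reply from an unrelated service.
    lpd_reply = "printhost.example.com: lpd: Your host does not have line printer access\n"
    other_reply = "a" * 2000
    print(hostname(UNANCHORED, lpd_reply), hostname(ANCHORED, lpd_reply))
    print("unanchored fail:", seconds_to_search(UNANCHORED, other_reply))
    print("anchored fail:  ", seconds_to_search(ANCHORED, other_reply))
```

Both variants capture the same hostname from the sample reply, so the
anchor only changes how quickly a non-matching response is rejected.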

The '^.*' usage in the ser2net match is counter-productive.  I don't
have any matching services so I don't know what can be done to
improve on '^.*' besides just removing it.  We might think about
commenting out the match so that hopefully we get a submission or two
that will help us make a better match.

The Xylan PizzaSwitch telnetd match is pretty zealous in its use
of .* early in the match.  Telnet services often match the start and
then print a large amount of data (banners, abuse warnings, etc).  '.*'
is consuming all of that data on all telnet services and then
backtracking a byte at a time.  We should make the '.*' lazy by changing
it to '.*?'.  Even better would be to add a few more bytes to
match the telnet control bytes before using '.*', but we may not have
enough data to do this.
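
Before changing the quantifier it is worth checking that the lazy form
still captures the same version string.  A minimal sketch, again using
Python's re as a stand-in for PCRE, with re.DOTALL playing the role of
the 's' flag; the sample banners and the helper name are hypothetical,
since I have no real PizzaSwitch output:

```python
import re

# Pattern text taken from the Xylan match line above; LAZY only swaps '.*' for '.*?'.
GREEDY = re.compile(
    rb"^\xff\xfb\x01.*\n\rWelcome to the Xylan PizzaSwitch! "
    rb"Version (\d[-.\w]+)\n\rlogin   : ",
    re.DOTALL,
)
LAZY = re.compile(
    rb"^\xff\xfb\x01.*?\n\rWelcome to the Xylan PizzaSwitch! "
    rb"Version (\d[-.\w]+)\n\rlogin   : ",
    re.DOTALL,
)


def extract_version(pattern, banner):
    """Return the captured version bytes ($1 in the match line), or None."""
    m = pattern.search(banner)
    return m.group(1) if m else None


if __name__ == "__main__":
    # Hypothetical banners: one shaped like the match expects, one unrelated telnet reply.
    xylan = (b"\xff\xfb\x01\r\nAuthorized use only\r\n"
             b"\n\rWelcome to the Xylan PizzaSwitch! Version 3.2.1\n\rlogin   : ")
    other = b"\xff\xfb\x01" + b"Unauthorized access is prohibited.\r\n" * 50
    print(extract_version(GREEDY, xylan), extract_version(LAZY, xylan))
    print(extract_version(GREEDY, other), extract_version(LAZY, other))
```

The greedy form first swallows the whole banner and backs off one byte
at a time, while the lazy form grows from the telnet control bytes
forward, so it reaches an early "Welcome" line with far fewer steps.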

It looks like a '^' can be added to the Cyrus Murder matches.  The
protocol looks like IMAP, and it is safe to anchor the \* in IMAP with
'^'.  UCSD doesn't have any Cyrus Murder installs for me to test.  I'd
suggest we add the anchor and then wait for new submissions if it
doesn't match.

I'm happy to submit a patch that does all of the above if it sounds
reasonable.

We might also think about adding Nmap internal performance statistics
logging to Nmap proper, similar to my addition of
log_write(LOG_STATS, ...) in this branch.  I feel like sometimes using
-d3 or more is too much data when all you want to do is measure
performance stats.

Brandon

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.11 (GNU/Linux)

iEYEARECAAYFAkoErQ0ACgkQqaGPzAsl94KibgCePgi8hGgWJI7AAP8yY9bvAf+E
fwQAmwTmcAFgEIPXkVdsw/OgH7UFReYi
=r0Mq
-----END PGP SIGNATURE-----

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org

