Nmap Development mailing list archives

Re: design of nmap

From: doug () hcsw org
Date: Fri, 4 May 2007 20:29:12 -0700

Hi nmap-dev!

Kaushik Das wrote:

> nmap is a single threaded application. How does it transmit and receive packets simultaneously?


This very question was actually the one that caused me to look at the
nmap source code for the first time too. How can a program create and process
so many packets that you could never create them through the networking
interface the operating system provides?

Eddie's reply is excellent and accurate: Most of nmap's scans use special
state-machine data structures to maintain the states of different probes
so they can be looked up when an appropriate event occurs.
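
To make the idea concrete, here is a minimal sketch of such a lookup
structure. It is written in Python rather than nmap's C++, and every name in
it (ProbeState, Probe, ProbeTable, the "SA"/"RA" flag strings) is made up for
illustration; nmap's real data structures are considerably more involved.

```python
from dataclasses import dataclass
from enum import Enum


class ProbeState(Enum):
    SENT = "sent"
    OPEN = "open"
    CLOSED = "closed"


@dataclass
class Probe:
    host: str
    port: int
    state: ProbeState = ProbeState.SENT


class ProbeTable:
    """Outstanding probes, keyed by the fields a reply will carry."""

    def __init__(self):
        self._probes = {}

    def record_sent(self, host, port):
        self._probes[(host, port)] = Probe(host, port)

    def on_reply(self, host, port, tcp_flags):
        """Look up the matching probe and advance its state."""
        probe = self._probes.get((host, port))
        if probe is None:
            return None  # unsolicited packet: nothing to match it against
        # Hypothetical encoding: "SA" is a SYN/ACK, anything else counts as closed.
        probe.state = ProbeState.OPEN if tcp_flags == "SA" else ProbeState.CLOSED
        return probe


table = ProbeTable()
table.record_sent("192.0.2.1", 22)
table.record_sent("192.0.2.1", 23)
print(table.on_reply("192.0.2.1", 22, "SA").state)
print(table.on_reply("192.0.2.1", 23, "RA").state)
```

The point is only that no thread sits waiting on any one probe: the state
lives in the table, and whichever reply shows up next finds its probe there.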

Nmap really is an amazingly efficient port scanner and this is due to
Fyodor expertly tuning the many pieces of scanning code. However, nmap's
layout is actually fairly simple and easy to hack on. Here's a lightning tour:


nmap.cc:nmap_main()
(i.e., the function nmap_main in the file nmap.cc)

This is the real main() function. It processes all the arguments,
determines which scans to do and in which order, then calls functions to
print the results.

There is a global nmap object called "o" to hold all the options data:

NmapOps o;

Most nmap subsystems that use raw packets use the following function
to receive packets:

tcpip.cc:readip_pcap()

Most subsystems shoot packets out with the send functions in tcpip.cc,
record what they did, call readip_pcap to wait for replies and process
replies individually once they get them.
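
The send / record / wait / process cycle described above can be sketched as a
single-threaded loop. This is a toy in Python, not nmap code: the network is
replaced by an in-memory queue, and send_probe, read_reply, OPEN_PORTS and
batch_size are all hypothetical stand-ins (read_reply plays the role that
readip_pcap plays in nmap).

```python
from collections import deque

# Hypothetical target: which ports answer with SYN/ACK. Stands in for the network.
OPEN_PORTS = {22, 80}
wire = deque()  # replies "in flight", filled by send_probe()


def send_probe(port):
    """Stand-in for a raw send: sending never waits for the reply."""
    wire.append((port, "SA" if port in OPEN_PORTS else "RA"))


def read_reply():
    """Stand-in for the receive call: return one reply, or None on timeout."""
    return wire.popleft() if wire else None


def scan(ports, batch_size=3):
    outstanding = {}   # port -> state, the record of what was sent
    results = {}
    todo = deque(ports)
    while todo or outstanding:
        # 1. Shoot out probes (up to batch_size in flight) and record them.
        while todo and len(outstanding) < batch_size:
            port = todo.popleft()
            send_probe(port)
            outstanding[port] = "sent"
        # 2. Wait for one reply and process it on its own.
        reply = read_reply()
        if reply is None:
            # Timeout: give up on whatever is still unanswered.
            for port in outstanding:
                results[port] = "filtered"
            outstanding.clear()
            continue
        port, flags = reply
        if outstanding.pop(port, None) is not None:
            results[port] = "open" if flags == "SA" else "closed"
    return results


print(scan([21, 22, 23, 80]))
```

Sending and receiving are interleaved rather than truly simultaneous; with
several probes in flight at once, one thread is enough.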

Here are some subsystems that use this interface:

ultra_scan (scan_engine.cc, handles almost all of Nmap's -s* scans)
massping (targets.cc, hopefully migrated to ultra_scan soon)
osscan 1 and 2 (osscan*.cc)
traceroute (traceroute.cc)


But not all subsystems use this raw-packet state-machine interface.
Some use a socket-based callback engine called nsock (nsock/src/*):

Version scan (service_scan.cc)
Reverse DNS (nmap_dns.cc)
NSE (nse_nsock.cc)

Of all these scans, NSE is special. It lets a programmer write nsock-based
scripts directly without worrying about state and event demultiplexing.

How does it do this? It uses something called a co-routine. A co-routine
is a data structure holding a "frozen evaluation". Technically what this
means is that NSE keeps a copy of the control stack for every running script.
An instance of an NSE script can then keep the state of its pending events
as local variables in this control stack instead of in a complex,
purpose-built state-machine data structure.
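
As an illustration of state living in local variables, here is a small sketch
using Python generators in place of Lua co-routines. It is not NSE's API: the
request tuples ("connect", "read_line", "write_then_read") and the run()
driver are invented for the example, and the answers are canned rather than
read from a network.

```python
def banner_script(host):
    """One script instance. Each `yield` hands a request to the engine and
    freezes here; locals such as `host` and `banner` survive the pause."""
    yield ("connect", host)
    banner = yield ("read_line", host)
    if banner.startswith("SSH-"):
        reply = yield ("write_then_read", "SSH-2.0-sketch\r\n")
        return f"{host}: ssh, banner {banner!r}, peer said {reply!r}"
    return f"{host}: unknown service, banner {banner!r}"


def run(script, answers):
    """Toy driver: feed canned answers in where a real engine would
    resume the script from an I/O callback."""
    answers = iter(answers)
    try:
        script.send(None)               # run to the first yield
        while True:
            script.send(next(answers))  # resume with the result of the request
    except StopIteration as done:
        return done.value


print(run(banner_script("192.0.2.1"), [None, "SSH-2.0-OpenSSH_4.3", "ok"]))
```

Notice that banner_script reads top to bottom like blocking code. The
"which step am I on?" bookkeeping that a state machine would store explicitly
is simply the position in the function plus its local variables.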

But isn't keeping a copy of the control stack for every script inefficient?
It turns out that, yes, co-routines are less efficient than equivalent
state-machine designs but not for the reason you might expect.

If you consider how co-routines are implemented you will realise that most
of the control "stack" (really more of a directed graph) of an evaluation is
actually shared with other co-routines! Because of this, co-routines are
vastly more time/space efficient than threads and probably only slightly
worse than hand-tuned state-machines.

The biggest efficiency problem with co-routines is that they require an
environment to pass a value called "the current continuation" along with
every function call. This value can be stored in case anyone else ever wants
to return to the caller again even after the original function returns.

If this sounds confusing, don't worry, it really is. If you haven't been
confused by co-routines/continuations you haven't understood them yet. :)

(This is in fact one of the only real differences between Scheme and Common
Lisp. Scheme (also Lua) requires continuations everywhere but CL lets you
choose. In CL you can have only certain functions carry continuations.)
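
The "current continuation" passed along with every call can be made explicit
by writing in continuation-passing style. The sketch below does this by hand
in Python, which has no built-in continuations; the functions add_cps,
fetch_cps and lookup_then_add and the `saved` list are hypothetical, chosen
only to show a continuation being stored and used after the original call has
returned.

```python
saved = []  # continuations kept for later use


def add_cps(a, b, k):
    """Instead of returning, hand the result to the continuation k."""
    k(a + b)


def fetch_cps(key, k):
    """Pretend the answer is not ready yet: store k and return at once."""
    saved.append((key, k))


def lookup_then_add(key, k):
    # "The rest of the computation" after the fetch is this lambda.
    fetch_cps(key, lambda value: add_cps(value, 1, k))


results = []
lookup_then_add("ttl", results.append)
print(results)          # nothing has run yet: fetch_cps only stored k

key, k = saved.pop()
k(63)                   # an event arrives later; resume the stored continuation
print(results)
```

Every function here carries an extra argument k, which is the cost mentioned
above: an environment that supports this everywhere pays for it on every call.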

Perhaps the largest advantage of co-routines is that we can run all sorts
of diverse tasks simultaneously. For instance, NSE can run different types
of scans against different hosts all in parallel. NSE and nsock handle the
demultiplexing so the NSE script programmer doesn't have to.
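
A toy version of that arrangement, again with Python generators standing in
for Lua co-routines: two different kinds of task run interleaved against
different hosts, and a small scheduler does the demultiplexing. The task
functions, the request tuples and fake_event() are all made up; a real engine
would resume each task from an I/O callback rather than answer immediately.

```python
from collections import deque


def port_check(host, port):
    answer = yield ("probe", host, port)
    return f"{host}:{port} {answer}"


def name_lookup(host):
    name = yield ("ptr", host)
    return f"{host} is {name}"


def fake_event(request):
    """Stand-in for the event engine delivering the result of a request."""
    if request[0] == "probe":
        return "open" if request[2] == 80 else "closed"
    return "host-" + request[1].rsplit(".", 1)[1] + ".example"


def run_all(tasks):
    """Round-robin scheduler: each task runs until its next yield."""
    finished = []
    ready = deque((task, None) for task in tasks)
    while ready:
        task, value = ready.popleft()
        try:
            request = task.send(value)
        except StopIteration as done:
            finished.append(done.value)
            continue
        # Queue the task again together with the answer it is waiting for.
        ready.append((task, fake_event(request)))
    return finished


for line in run_all([port_check("192.0.2.1", 80),
                     name_lookup("192.0.2.2"),
                     port_check("192.0.2.3", 25)]):
    print(line)
```

The task authors never see the scheduler; they just yield a request and get
the answer back as the value of the yield expression.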

Unfortunately, lots of nmap functionality will probably never be parallelised.
For instance, reverse DNS lookups will probably never be done in parallel with
host discovery or traceroute because combining two complex state-machines can
be very difficult. It could be possible to pipeline nmap scans over stages
using small hostgroups and unix pipes although I don't know of any generic
infrastructure for doing this.

Marek's NSE pcap patch looks very good and I think NSE should definitely
offer a way to get raw packet access. However, I do have some concerns
about running pcap-based scans in parallel with nsock-based scans, and about
scans that use both pcap and nsock. Consider blocking in nsock and thus not
being able to respond to pcap events, and vice versa. On most systems you can
use pcap descriptors as select()able file descriptors but this is tricky to
implement, differs across most platforms, and AFAIK is not possible on Windows
without pipe()+threads tricks.
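
The select() approach amounts to waiting on both event sources in one call,
so that neither can block the other. Here is a minimal sketch using Python's
standard selectors module. Two socketpairs stand in for the real descriptors
(one playing the pcap descriptor, one playing a socket owned by nsock); this
shows the shape of the loop only and says nothing about how pcap descriptors
behave on any particular platform.

```python
import selectors
import socket

# Two socketpairs stand in for the two event sources. In the real problem one
# would be a pcap descriptor and the other a socket owned by nsock.
pcap_rx, pcap_tx = socket.socketpair()
sock_rx, sock_tx = socket.socketpair()

sel = selectors.DefaultSelector()
sel.register(pcap_rx, selectors.EVENT_READ, data="pcap")
sel.register(sock_rx, selectors.EVENT_READ, data="nsock")

# Make both sources readable, as if a raw reply and a socket reply had arrived.
pcap_tx.send(b"raw reply")
sock_tx.send(b"banner")

seen = {}
while len(seen) < 2:
    # One wait covers both descriptors, so neither source can starve the other.
    for key, _events in sel.select(timeout=1.0):
        seen[key.data] = key.fileobj.recv(64)

print(seen)

sel.close()
for s in (pcap_rx, pcap_tx, sock_rx, sock_tx):
    s.close()
```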

In a program I'm working on, nuff (http://hcsw.org/nuff), we have tried to use
continuations for demultiplexing everywhere. Raw sockets, regular sockets,
pcap descriptors, stdio, DNS requests, signals, everything can be done in parallel.

The Scheme continuations nuff uses are similar to but slightly more powerful
than Lua co-routines. With a co-routine you can resume a given suspended
evaluation only once but with continuations there is no such restriction. Here are some more
details on nuff's implementation:

Continuations, super-select, and asynchronicity
http://hcsw.org/nuff/language.html#section.3.1

mapasync
http://hcsw.org/nuff/language.html#section.3.5
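
The resume-once restriction mentioned above is easy to see with a Python
generator, which behaves like a co-routine in this respect. Python has no
re-entrant continuations, so the second half of this sketch fakes one with an
ordinary closure; greeter, greeter_cps and resume are made-up names and this
is not how nuff or Scheme implement anything.

```python
def greeter():
    name = yield "who?"
    return f"hello {name}"


# A generator is one-shot: once resumed past a yield, that point is gone.
co = greeter()
next(co)
try:
    co.send("alice")
except StopIteration as done:
    first = done.value
try:
    co.send("bob")           # cannot re-enter the same suspended point
    second = "resumed"
except StopIteration:
    second = "exhausted"


# A continuation written as a plain closure can be invoked any number of times.
def greeter_cps(k):
    return lambda name: k(f"hello {name}")


resume = greeter_cps(str.upper)
print(first, second)
print(resume("alice"), resume("bob"))
```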

And it turns out that you can use continuations for all sorts of strange,
wonderful things that have nothing to do with network demultiplexing or
multi-programming. For example, in Scheme there is no "return" keyword
for returning a value from a function immediately, but you can easily add
one using continuations (and why only be able to return from functions?).
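
To give a flavour of building "return" out of something more general, here is
a sketch of an escape-only continuation in Python. Python has no call/cc, so
this emulates the idea with an exception and only supports jumping outward;
call_with_escape, ret and first_negative are hypothetical names invented for
the example.

```python
def call_with_escape(body):
    """Call body(ret). Calling ret(v) anywhere inside makes this call yield v."""
    class Escape(Exception):   # fresh class per call so nested uses don't collide
        pass

    def ret(value):
        raise Escape(value)

    try:
        return body(ret)
    except Escape as escape:
        return escape.args[0]


def first_negative(numbers):
    def body(ret):
        for n in numbers:
            if n < 0:
                ret(n)        # leaves the loop and the function immediately
        return None
    return call_with_escape(body)


print(first_negative([3, 1, -4, 1, -5]))
print(first_negative([1, 2, 3]))
```

Because ret is an ordinary value, it can be handed to helper functions, which
can then "return" from a caller several frames up, something a built-in
return keyword cannot do.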

The best discussion of continuations I know of is Paul Graham's incredible
"On Lisp" e-book. If you read and understand it I guarantee it will change how
you think about programming:

http://www.paulgraham.com/onlisp.html

Best,

Doug
