Nmap Development mailing list archives

[NSE] whois.nse

From: jah <jah () zadkiel plus com>
Date: Wed, 06 Aug 2008 03:20:35 +0100

Hello,

Attached is whois.tar.gz which contains whois.nse and whois_dep.nse. 
The former will work out of the box whereas the latter depends on the
version of ipOps.lua posted in [1].


A rundown of what's changed since the last version [2]:

    * Mutex: Enforces exclusive access to each of the defined whois
      services so that only one connection to each service is allowed at
      a time.  This enables the use of a results cache and reduces the
      number of queries made when scanning ranges of targets.
    * IANA Assignments data: Data is downloaded from IANA and cached
      locally.  Each thread then performs a lookup against the data, for
      its target, to determine which service to query.  Again, this
      results in fewer queries being made, especially to ARIN.
    * Works for both IPv4 and IPv6 targets.
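
As a rough illustration of the first two points (a Python sketch, not the
Lua code in the attached script): the assignments table below is a small
hand-written subset rather than the downloaded IANA data, and the names
service_for, query_exclusively and do_query, as well as the fallback to
"arin", are assumptions made for the example.

```python
import ipaddress
import threading

# Illustrative, hand-written subset of /8 assignments. The script itself
# downloads and caches the full IANA data instead of hard-coding it.
ASSIGNMENTS = {
    ipaddress.ip_network("24.0.0.0/8"): "arin",
    ipaddress.ip_network("41.0.0.0/8"): "afrinic",
    ipaddress.ip_network("193.0.0.0/8"): "ripe",
    ipaddress.ip_network("202.0.0.0/8"): "apnic",
}

# One lock per whois service, so that at most one connection to each
# service is open at any time.
SERVICE_LOCKS = {name: threading.Lock() for name in set(ASSIGNMENTS.values())}


def service_for(target, default="arin"):
    """Pick the whois service for a target from the assignments table.

    The fallback to `default` is an assumption made for this sketch.
    """
    addr = ipaddress.ip_address(target)
    for net, service in ASSIGNMENTS.items():
        if addr.version == net.version and addr in net:
            return service
    return default


def query_exclusively(target, do_query):
    """Run do_query(service, target) while holding that service's lock.

    do_query is a hypothetical callable standing in for the real
    network query.
    """
    service = service_for(target)
    with SERVICE_LOCKS[service]:
        return do_query(service, target)
```

Because every thread that needs a given service has to take the same
lock, a thread that was waiting can check the results cache once it gets
the lock and skip its query if an earlier thread already answered it.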



Some notes:


There's a side-effect to caching results (in order to reduce the number
of queries sent to a whois service) which comes about because of the
method of determining whether a result in the cache applies to the
current target.

When a result is cached, the range of IP addresses to which the result
applies (taken from the inetnum/netrange field) is stored in the cache
for use as a lookup.  Each thread checks whether its target IP address
is within the range stored in any cache entry and, if it is, the cached
record is returned rather than performing a query.  The effect of this
is that the cached entry may be less specific than a record held in a
whois database.
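
The range check described above can be sketched as follows.  This is a
minimal Python illustration, not the script's Lua implementation; the
class name RangeCache and its methods are hypothetical.

```python
import ipaddress


class RangeCache:
    """Cache of whois records keyed by the address range they cover."""

    def __init__(self):
        self._entries = []  # list of (first, last, record) tuples

    def store(self, first, last, record):
        """Remember a record together with its inetnum/netrange bounds."""
        self._entries.append(
            (ipaddress.ip_address(first), ipaddress.ip_address(last), record)
        )

    def find(self, target):
        """Return the first cached record whose range contains target."""
        addr = ipaddress.ip_address(target)
        for first, last, record in self._entries:
            # Compare only addresses of the same IP version.
            if first.version == addr.version and first <= addr <= last:
                return record
        return None
```

With an entry stored for 41.0.0.0-41.255.255.255, find("41.128.1.2")
returns that entry without any query being made, which is exactly the
loss of specificity discussed here.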

For instance, if we cache an entry for target 41.0.0.0 which finds
[41.0.0.0-41.255.255.255] then a target IP address of 41.128.1.2 would
return the cached entry rather than performing its own query, which
might have resulted in [41.128.0.0-41.128.255.255] - a more specific
record.  To mitigate this, a record that applies to a /8 range (or /32
for IPv6) will have a much smaller range stored in the cache for target
IP lookups and thus would not affect all targets in the /8 range.  This
is a trade-off between accuracy and number of queries made.

What if we cache a range of [41.128.0.0-41.128.255.255]?  Targets for
which there is an assignment of [41.128.8.0-41.128.15.255] would return
the cached record rather than finding their more specific assignment
record.  We can go on like this right down to the smallest assignment -
which for IPv4 is /32.
This is clearly not good because, when scanning ranges of targets and
caching results, we cannot guarantee the most specific assignment
information will be found - which was the basis for the script in the
first place!

With this in mind, I've implemented "nocache" (--script-args
whodb=nocache).  The name is a misnomer: it doesn't stop results being
cached, but it limits the range of addresses to which any cache entry
may apply to no more than a /29 for IPv4 (8 addresses) or a /32 for IPv6
(2^96 addresses).
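
One way to picture the limit is as follows (a Python sketch, not the
script's own code).  The function name cache_range is hypothetical, and
clamping the block to the bounds of the record itself is an assumption
about how records smaller than the limit would be handled.

```python
import ipaddress

# Largest block a cache entry may cover under "nocache", per IP version.
NOCACHE_PREFIX = {4: 29, 6: 32}


def cache_range(target, first, last):
    """Return the (first, last) addresses to store for cache lookups.

    The stored range is the /29 (IPv4) or /32 (IPv6) block containing
    the target, clamped to the range of the whois record itself.
    """
    addr = ipaddress.ip_address(target)
    lo = ipaddress.ip_address(first)
    hi = ipaddress.ip_address(last)
    block = ipaddress.ip_network(
        f"{addr}/{NOCACHE_PREFIX[addr.version]}", strict=False
    )
    return max(lo, block.network_address), min(hi, block.broadcast_address)
```

For a target of 41.0.0.0 and a record covering 41.0.0.0-41.255.255.255,
this stores only 41.0.0.0-41.0.0.7 for lookups.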

So a target 41.0.0.0 which finds [41.0.0.0-41.255.255.255] will cache
the record, but only targets in 41.0.0.0/29 will blindly accept the
cached result.  41.0.0.8 would perform its own query, as would .16, .24
and so on.  If a query response is the same as one already cached, the thread
will still print a pointer to the original full output rather than
repeating the same output - helping to keep the host-script results
content to a minimum.

I chose /29 for IPv4 because, after some basic analysis of assignment
numbers, roughly 99% of the allocated address space appears to be within
assignments of this size or larger.  In practice too, this seems to be
the magic number.  There isn't the same quantity of data for IPv6, but
/32 or larger assignments cover at least 99% of the address space.

Obviously this means a lot more queries and the potential for getting
banned, but it helps to discover the most specific assignment
information.



The host-script results are no longer being manipulated to control which
thread outputs the full result and which show pointers to that result
for multiple targets from a single assignment.  This was being done
purely for aesthetic reasons, but wasn't implemented very well.

For instance, if the targets were specified on the command line in
ascending order, the first host-script result would show the full
output.  If the targets were specified in descending order, then the
last host-script result would show the full output.  If the order
specified on the command line wasn't the order in which the script
threads executed then a random result would display the full output.  If
the targets were specified in a random order, well, you get the picture.

I decided it wasn't worth the effort to do this properly and that the
script really doesn't need any more complexity.  So this means that the
first thread to cache a result will always be the one that prints the
full output.
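
The resulting first-come behaviour amounts to something like the
following Python sketch.  The class name ResultLog and the wording of
the pointer line are hypothetical and are not taken from the script.

```python
import threading


class ResultLog:
    """Decide which target prints a record in full and which point to it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = {}  # record text -> first target that cached it

    def render(self, target, record):
        """Return the full record for its first target, else a pointer."""
        with self._lock:
            # setdefault keeps the existing owner if the record is known.
            owner = self._owner.setdefault(record, target)
        if owner == target:
            return record
        return f"See the result for {owner}."
```

Whichever thread reaches render() first for a given record becomes its
owner, regardless of the order of targets on the command line.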



The script is documented for nsedoc, but may need altering if future
changes to nsedoc allow me to get rid of the HTML in the comments at
the head of the script.


Regards,

jah


[1] http://seclists.org/nmap-dev/2008/q3/0226.html
[2] http://seclists.org/nmap-dev/2008/q2/0148.html

Attachment: whois.tar.gz

