Nmap Development mailing list archives
Re: [NSE] Script submission: targets-ipv6-wordy
From: Everardo Padilla Saca <everardo.padilla.saca () gmail com>
Date: Tue, 1 Apr 2014 01:32:24 -0600
On Mon, Mar 31, 2014 at 7:53 AM, Raul Fuentes <ra.fuentess.sam () gmail com> wrote:

> > The file hex-wordy-en.lst contains:
> >
> >     c001
> >     50fa
> >
> > The generated addresses will be:
> >
> >     0000:0000:0000:0000:0000:0000:50fa:50fa
> >     0000:0000:0000:0000:0000:0000:c001:50fa
> >     0000:0000:0000:0000:0000:0000:50fa:c001
> >     0000:0000:0000:0000:0000:0000:c001:c001
> >
> > If the wordlist has N entries and the number of segments is M, the generated addresses will be N^M.
>
> Hello, Everardo,
>
> I didn't implement that part in Lua because it is not strictly necessary to recreate it on each execution (and it would take too much time and too many resources, something the IPv6 scripts already have as a weak point). I was thinking of using a Ruby script to do something similar to the N^M operation, writing the result to a final DB (a mere text file) kept separate from Nmap; the NSE script would then read that DB. However, I'm not sure if you are planning to do this for the wordy script (i.e., have it read from a DB).
>
> Atte.
> Raul Fuentes
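[The N^M expansion described in the quoted example can be sketched as a cartesian product of the wordlist over the chosen segments. This is a Python sketch purely for illustration; the actual script is written in Lua for NSE, and the function name here is hypothetical.]

```python
from itertools import product

def wordy_addresses(words, segments=2, total_groups=8):
    """Expand a hex wordlist into IPv6 addresses by filling the last
    `segments` 16-bit groups with every combination of words (N^M)."""
    prefix = ["0000"] * (total_groups - segments)
    for combo in product(words, repeat=segments):
        yield ":".join(prefix + [w.zfill(4) for w in combo])

addrs = list(wordy_addresses(["c001", "50fa"], segments=2))
# 2 words over 2 segments -> 2^2 = 4 addresses, matching the example above
```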
Hi Raul,

You raise a good point, and I also gave it some thought while creating this script.

For the biggest data structures, the script's space complexity is f(n) = 2n = O(n), where n is the length of the wordlist. The first n comes from loading the wordlist into memory as an array in order to generate the addresses. The second n is for storing a change-trigger for every word.

Because the script employs hex-based words of at most 4 characters (a word is used to "wordify" any 4-nibble segment of the IPv6 address), the maximum number of unique words in a wordlist for this script is 69904 (which is n in the worst case). This number is obtained like so:

    Digits = 10 (zero to nine)
    Letters = 6 (a, b, c, d, e, f)
    Domain size = Digits + Letters = 16

An IPv6 address can have "words" of 1 to 4 characters in each 4-nibble segment, which means a file can contain as many as 16^1 + 16^2 + 16^3 + 16^4 = 69904 possible hex-based words (counting even the character combinations that are not remotely wordy). Given this, I don't think the space/memory requirements are much of a problem: in the worst case they amount to 69904 * 2 = 139808 entries.

For the most expensive operations, the script's time complexity is f(n) = 2n + n^m = O(n^m), where n is the size of the wordlist and m is the number of segments chosen to "wordify". The first n is for reading the whole wordlist from disk into memory. The second n is for populating the word-change trigger array (one entry per word from the wordlist). The n^m term is for generating all the possible addresses for the chosen segments and handing them to Nmap (as soon as an address is generated, it is added as a target). Since there are n^m possible addresses in total, in my opinion there is no way around that number if we want Nmap to scan all of them.
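[As a quick sanity check on the arithmetic above — again a Python sketch purely for illustration, not part of the NSE script:]

```python
from itertools import product

# Worst-case number of distinct hex "words" of length 1 to 4,
# over a 16-symbol domain (10 digits + 6 letters a-f).
max_words = sum(16**k for k in range(1, 5))
# 16 + 256 + 4096 + 65536 = 69904
assert max_words == 69904
# Worst-case memory: the wordlist array plus the change-trigger array.
assert 2 * max_words == 139808

# The n^m term: the cartesian product of n words over m segments
# yields exactly n**m candidate addresses.
n, m = 3, 4
targets = sum(1 for _ in product(range(n), repeat=m))
assert targets == n**m == 81
```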
The execution time will depend on how ambitious the user is (how many 4-nibble segments were chosen) and on the quality of the wordlist (if it has 69904 entries, it is essentially a brute-force approach that tries every possible hex combination). Perhaps there is room for improvement in that second n (or elsewhere, for that matter), if there is a better way to tell the script when to stop using a word.

Even if we were to use a pre-generated database as you suggest, the order of magnitude of the time complexity would still be O(n^m), because Nmap would still have to go through all n^m addresses from that file for scanning. It is true that the underlying address-generation operations would no longer run inside NSE, but I believe that is a fair trade-off for convenience. Dealing with a separate generator program and multiple databases/files could, in my opinion, be cumbersome for the end user (e.g. "which file had all these words but not this one?", "should I run the generator again, or do I already have that address list around here?", "let me grep that 150^4-address file for this word", etc.). I find it easier and more reliable to let NSE generate the addresses to the user's liking on the fly, even if the space cost (in the worst case) is two arrays of 69904 entries each.

That said, it would be very interesting to explore your suggestion from a research point of view, for example by having a huge pre-computed table inside Nmap's code with all possible wordy addresses to remove that 2n from the time complexity. I'll suggest we try that at ITESM's labs.

Excuse the long email; I should have covered most of these topics in the original post.

Regards.

_______________________________________________
Sent through the dev mailing list
http://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
Current thread:
- Re: [NSE] Script submission: targets-ipv6-wordy Everardo Padilla Saca (Apr 01)
- Re: [NSE] Script submission: targets-ipv6-wordy Raul Fuentes (Apr 01)
- Re: [NSE] Script submission: targets-ipv6-wordy Everardo Padilla Saca (Apr 08)
- <Possible follow-ups>
- Re: [NSE] Script submission: targets-ipv6-wordy Everardo Padilla Saca (Apr 01)
- Re: [NSE] Script submission: targets-ipv6-wordy Raul Fuentes (Apr 01)