Nmap Development mailing list archives

Re: [GSOC] Nmap exclude list implementation


From: Harshil Lodhi <lodhi.harshil () gmail com>
Date: Thu, 05 Mar 2015 00:49:05 +0000

Just to convince myself that a simple TRIE would be good enough for the
implementation of addrset. I hacked the addrset code. I have attached the
diff and the modified files. The current pre-prototype implementation works
only for exclude list in which IPs are mentioned in x.x.x.x format. I have
created a TRIE of strings( IP address in form of strings) . The code can be
compiled sucessfully without any modifications to any makefile.

I got the time down from 23 seconds to 0.2 seconds = 100x FAST . (This
involves no scanning)

Waiting for your comments.

On Thu, Mar 5, 2015 at 4:45 AM Harshil Lodhi <lodhi.harshil () gmail com>
wrote:

Some of my findings while profiling nmap.

I created a exclude.list which contained all the 172.16.0.0/16 IPs i.e
256*256 IPs .

I started nmap -T4 --excludefile exclude.list 172.16.-.-
It took around 23 seconds to complete and no targets were found.
I profiled the code using callgrind for 172.16.27.- . You can see the
attached output using text editor or Kcachegrind (GUI tool) .
Most of the time( 90% ) is spent in addrset_contains function in
nbase_addrset.c file. It is a linear time lookup.

On Thu, Mar 5, 2015 at 3:26 AM Harshil Lodhi <lodhi.harshil () gmail com>
wrote:

I am starting to profile nmap for exclude list implementation. I wanted
to know does the order in which the addresses are stored in the exlude list
file ,matter in the lookup. I mean if they are sorted, does it speed up the
process. Does anyone know after what size of the exclude list, does it
start to slow down.

On Wed, Mar 4, 2015 at 3:27 AM Harshil Lodhi <lodhi.harshil () gmail com>
wrote:

Hi Jacek,
Definitely. Performance measurement is a must before optimizing random
parts. I always remember Donald Knuth's quote  "Premature optimization is
the root of all evil". From the experience that I have in programming in
C/C++, these advanced data structures are useful only when the size of data
becomes big. A normal array/ linkedlist would be fast for a small sized
list but when the size becomes big, the asymptotic nature starts to
dominate. I first plan to profile the nmap using gprof to see the exact
part of the code that is taking time and then can decide on the
optimizations.
The problem is mentioned in the gsoc page of Nmap (http://nmap.org/soc/)
. I discussed its implementation with "bonsaiviking" on irc and came to
this conclusion. I first plan to profile nmap to see.
Also I would love to work on more optimizations for Nmap apart from this
if need exists.



On Wed, Mar 4, 2015 at 3:08 AM Jacek Wielemborek <d33tah () gmail com>
wrote:

W dniu 03.03.2015 o 20:00, Harshil Lodhi pisze:
Hi everyone,

I was going through the GSOC ideas "Performance/Optimization
Specialist"
one. In their its mentioned that current implementation of exclude
list is
not a good one. It takes order of the size of the list. David
particularly
mentioned about implementing the lookup using BDD.


The link for the paper doesn't seem to work for me. I searched for the
topic and read about it. One of the alternatives that we can consider
is
Patricia/Radix trees. They seemed to be more widely used than BDDs.
Following is a highly cited research paper for the same.
http://ece.ut.ac.ir/classpages/F83/Advanced%20Computer%20Net
works/PAPERS/LOOKUP/routing.pdf

I am very much interested in this position and in the script developer
position. I have been using the proxy module of Nmap for the past 3
years
to find working proxies inside my university campus during nighttime
when
the main internet is shut down.

Waiting for your feedback on this.



_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/


Hello,

I like this idea, but just to make sure - how did you pick this
particular problem? I'm asking to make sure you plan to make some
performance measurements before you start optimising random parts of
Nmap functionality.

Cheers,
Jacek


Attachment: diff
Description:

Attachment: nbase_addrset.c
Description:

Attachment: nbase_addrset.h
Description:

_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/

Current thread: