Nmap Development mailing list archives

[PATCH] Remove duplicate hosts


From: Ron <iago () valhallalegends com>
Date: Sun, 11 Dec 2005 13:40:20 -0600

I don't know if anybody else has this problem, but I'm often handed a huge pile of hostnames and asked to scan them, only to find out that a bunch of them point to the same IP. This patch adds the --nodup option, which eliminates duplicate hosts based on their IP address.

I was inspired when my friend was complaining about a bug because he ran
nmap localhost -sV -p 1-1024 localhost
and got the same output repeated twice.

It only works within each group of 2048 hosts (or whatever PING_GROUP_SZ happens to be). It quicksorts the group by IP address string (O(n log n) time on average) and then eliminates duplicate IPs in a single linear pass. It also deletes the Target object for each duplicate.
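
For readers who want the idea without the surrounding Nmap code, here is a minimal standalone sketch of the same sort-then-compact approach. It is not the patch itself: Host is a hypothetical stand-in for Nmap's Target class, remove_duplicate_ips is an illustrative name, and it uses std::sort and std::unique on a std::vector instead of qsort on a raw pointer array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Nmap's Target class; only the IP string matters.
struct Host {
  std::string name;
  std::string ip;
};

// Sort by IP string so equal IPs become adjacent, then keep one host per run
// of equal IPs. Returns the number of hosts removed. std::sort is not stable,
// so which of several duplicates survives is unspecified.
std::size_t remove_duplicate_ips(std::vector<Host> &hosts) {
  const std::size_t before = hosts.size();
  std::sort(hosts.begin(), hosts.end(),
            [](const Host &a, const Host &b) { return a.ip < b.ip; });
  auto last = std::unique(
      hosts.begin(), hosts.end(),
      [](const Host &a, const Host &b) { return a.ip == b.ip; });
  hosts.erase(last, hosts.end());
  assert(hosts.size() <= before);
  return before - hosts.size();
}
```

As in the patch, the comparison is on the textual form of the address, which is enough to bring equal addresses together even though the resulting order is not numeric.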

Although working in groups of 2048 will miss some duplicates if they are a long way apart in the list, I think it's probably better overall, because sorting 2**24 IPs would take a long time.
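
To make the trade-off concrete, the following sketch deduplicates a list one batch at a time and shows how a duplicate that falls in a different batch survives. The function name dedup_per_batch and the use of plain strings are assumptions made for illustration; the real patch works on Target pointers inside HostGroupState.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Deduplicate each consecutive batch of `batch_size` IP strings on its own,
// as a per-group pass would, and return everything that is left.
std::vector<std::string> dedup_per_batch(const std::vector<std::string> &ips,
                                         std::size_t batch_size) {
  assert(batch_size > 0);
  std::vector<std::string> out;
  for (std::size_t start = 0; start < ips.size(); start += batch_size) {
    const std::size_t end = std::min(start + batch_size, ips.size());
    std::vector<std::string> batch(
        ips.begin() + static_cast<std::ptrdiff_t>(start),
        ips.begin() + static_cast<std::ptrdiff_t>(end));
    std::sort(batch.begin(), batch.end());
    batch.erase(std::unique(batch.begin(), batch.end()), batch.end());
    out.insert(out.end(), batch.begin(), batch.end());
  }
  return out;
}
```

With the input 10.0.0.1, 10.0.0.1, 10.0.0.2, 10.0.0.1 and a batch size of 2, three addresses remain because the last 10.0.0.1 sits in the second batch; with a batch size of 4, only two remain.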

It eliminates the duplicates immediately before randomizing, so the randomization will still work.

Hope this helps somebody besides me :)

(in case the patch attachment gets ripped off, http://www.javaop.com/~iago/nmap-3.95-nodup.patch )
diff -rub nmap-3.95.nodup/NmapOps.h nmap-3.95/NmapOps.h
--- nmap-3.95.nodup/NmapOps.h   2005-09-07 03:26:57.000000000 -0500
+++ nmap-3.95/NmapOps.h 2005-12-11 12:40:14.000000000 -0600
@@ -169,6 +169,7 @@
   void setVersionTrace(bool vt) { vTrace = vt;  }
   int verbose;
   int randomize_hosts;
+  int nodup;
   int spoofsource; /* -S used */
   char device[64];
   int interactivemode;
diff -rub nmap-3.95.nodup/TargetGroup.cc nmap-3.95/TargetGroup.cc
--- nmap-3.95.nodup/TargetGroup.cc      2005-10-01 18:50:38.000000000 -0500
+++ nmap-3.95/TargetGroup.cc    2005-12-11 12:53:13.000000000 -0600
@@ -489,7 +489,7 @@
    The target_expressions array MUST REMAIN VALID IN MEMORY as long as
    this class instance is used -- the array is NOT copied.
  */
-HostGroupState::HostGroupState(int lookahead, int rnd, 
+HostGroupState::HostGroupState(int lookahead, int rnd, int ndp,
                               char *expr[], int numexpr) {
   assert(lookahead > 0);
   hostbatch = (Target **) safe_zalloc(sizeof(Target *) * lookahead);
@@ -497,6 +497,7 @@
   current_batch_sz = 0;
   next_batch_no = 0;
   randomize = rnd;
+  nodup = ndp;
   target_expressions = expr;
   num_expressions = numexpr;
   next_expression = 0;
diff -rub nmap-3.95.nodup/TargetGroup.h nmap-3.95/TargetGroup.h
--- nmap-3.95.nodup/TargetGroup.h       2005-10-01 18:50:38.000000000 -0500
+++ nmap-3.95/TargetGroup.h     2005-12-11 12:59:16.000000000 -0600
@@ -164,7 +164,7 @@
 
 class HostGroupState {
  public:
-  HostGroupState(int lookahead, int randomize, char *target_expressions[],
+  HostGroupState(int lookahead, int randomize, int nodup, char *target_expressions[],
                 int num_expressions);
   ~HostGroupState();
   Target **hostbatch;
@@ -175,6 +175,11 @@
   int randomize; /* Whether each batch should be "shuffled" prior to the ping 
                    scan (they will also be out of order when given back one
                    at a time to the client program */
+  int nodup; /* Whether to scan for and eliminate duplicate IPs before 
+                   starting the scan.  Note that this will not necessarily eliminate
+            them all, since I believe that small groups of hosts are taken at
+            a time (I was getting 2048 at a time with a /16 scan) */
+                  
   char **target_expressions; /* An array of target expression strings, passed
                                to us by the client (client is also in charge
                                of deleting it AFTER it is done with the 
diff -rub nmap-3.95.nodup/docs/nmap.1 nmap-3.95/docs/nmap.1
--- nmap-3.95.nodup/docs/nmap.1 2005-12-11 13:30:08.000000000 -0600
+++ nmap-3.95/docs/nmap.1       2005-12-08 02:20:29.000000000 -0600
@@ -933,11 +933,6 @@
 and recompile. An alternative solution is to generate the target IP list with a list scan (\fB\-sL \-n \-oN \fR\fB\fIfilename\fR\fR), randomize it with a Perl script, then provide the whole list to Nmap with
 \fB\-iL\fR.
 .TP
-\fB\-\-nodup\fR (Eliminate duplicate hosts)
-Tells Nmap to eliminate duplicates from each group of up to 8096 hosts before it scans them. If you want to eliminate duplicated hosts over larger group sizes, increase PING_GROUP_SZ in 
-\fInmap.h\fR
-and recompile.  An alternative solution is to sort the host list before splitting it into the PING_GROUP_SZ blocks, and maybe even remove duplicates before splitting it up.
-.TP
 \fB\-\-spoof_mac <mac address, prefix, or vendor name>\fR (Spoof MAC address)
 Asks Nmap to use the given MAC address for all of the raw ethernet frames it sends. This option implies
 \fB\-\-send_eth\fR
diff -rub nmap-3.95.nodup/nmap.cc nmap-3.95/nmap.cc
--- nmap-3.95.nodup/nmap.cc     2005-12-06 16:26:05.000000000 -0600
+++ nmap-3.95/nmap.cc   2005-12-11 13:05:12.000000000 -0600
@@ -276,6 +276,7 @@
       {"sI", required_argument, 0, 0},  
       {"source_port", required_argument, 0, 'g'},
       {"randomize_hosts", no_argument, 0, 0},
+      {"nodup", no_argument, 0, 0},
       {"osscan_limit", no_argument, 0, 0}, /* skip OSScan if no open ports */
       {"osscan_guess", no_argument, 0, 0}, /* More guessing flexability */
       {"fuzzy", no_argument, 0, 0}, /* Alias for osscan_guess */
@@ -422,6 +423,8 @@
                 || strcmp(long_options[option_index].name, "rH") == 0) {
        o.randomize_hosts = 1;
        o.ping_group_sz = PING_GROUP_SZ * 4;
+      } else if (strcmp(long_options[option_index].name, "nodup") == 0) {
+    o.nodup = 1;
       } else if (strcmp(long_options[option_index].name, "osscan_limit")  == 0) {
        o.osscan_limit = 1;
       } else if (strcmp(long_options[option_index].name, "osscan_guess")  == 0
@@ -1059,7 +1062,7 @@
 
   if (num_host_exp_groups == 0)
     fatal("No target machines/networks specified!");
-  hstate = new HostGroupState(o.ping_group_sz, o.randomize_hosts,
+  hstate = new HostGroupState(o.ping_group_sz, o.randomize_hosts, o.nodup,
                              host_exp_group, num_host_exp_groups);
 
   do {
@@ -1082,7 +1085,7 @@
        if (num_host_exp_groups == 0)
          break;
        delete hstate;
-       hstate = new HostGroupState(o.ping_group_sz, o.randomize_hosts,
+       hstate = new HostGroupState(o.ping_group_sz, o.randomize_hosts, o.nodup, 
                                    host_exp_group, num_host_exp_groups);
       
        /* Try one last time -- with new expressions */
diff -rub nmap-3.95.nodup/targets.cc nmap-3.95/targets.cc
--- nmap-3.95.nodup/targets.cc  2005-11-27 19:34:09.000000000 -0600
+++ nmap-3.95/targets.cc        2005-12-11 12:57:25.000000000 -0600
@@ -281,6 +281,38 @@
   return;
 }
 
+int hostsort(const void *a, const void*b)
+{
+  Target *aa = *((Target **) a);
+  Target *bb = *((Target **) b);
+
+  return strcmp( aa->targetipstr(), bb->targetipstr() );
+}
+
+void hoststructdup(Target *hostbatch[], int *nelem) {
+  int read = 0;
+  int write = 0;
+
+  /* Sort the hosts (that's why this has to go before randomize) */
+  qsort(hostbatch, *nelem, sizeof(Target *), hostsort);
+
+  /* Loop through the second and onwards.  Whenever we find a new one, overwrite an 
+   * old one */
+  for(read = 1; read < *nelem; read++) {
+    if(strcmp(hostbatch[read]->targetipstr(), hostbatch[write]->targetipstr())) {
+      write++;
+      hostbatch[write] = hostbatch[read];
+    }
+    else
+    {
+      delete(hostbatch[read]);
+    }
+  }
+  *nelem = write + 1;
+
+  return;
+}
+
 /* Returns the last host obtained by nexthost.  It will be given again the next
    time you call nexthost(). */
 void returnhost(HostGroupState *hs) {
@@ -390,6 +422,12 @@
 if (hs->current_batch_sz == 0)
   return NULL;
 
+/* Eliminate duplicate hosts (Note: this has to be done BEFORE randomizing
+ * them, because it sorts them first */
+if (hs->nodup) {
+ hoststructdup(hs->hostbatch, &hs->current_batch_sz);
+}
+
 /* OK, now we have our complete batch of entries.  The next step is to
    randomize them (if requested) */
 if (hs->randomize) {


_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
