Nmap Development mailing list archives

Re: [PATCH] Allow comments in exclusion file


From: David Fifield <david () bamsoftware com>
Date: Sun, 23 Aug 2009 20:25:06 -0600

On Fri, Aug 21, 2009 at 03:52:13PM -0500, Tom Sellers wrote:
David Fifield wrote:
On Wed, Jul 29, 2009 at 05:40:29PM -0500, Tom Sellers wrote:
    The attached patch modifies targets.cc so that comments will be allowed
in the files specified by the --exculsionfile parameter.

If the patch is accepted the following comment styles would be permitted:

1.  Lines beginning with a '#', for example:

    #This IP address is for the server with the broken app,
    #input validation is your friend...

2.  Comments prefixed with '#' that occur after the IP address or network specification.
    For example:

    196.168.1.1     #home router with limited space on the firewall state table

This functionality will make keeping track of an ongoing and/or
lengthy exclusion list feasible.

Thanks, Tom, this is a good idea. I'd like to merge the patch with just
a couple of changes. First, please update the --excludefile
documentation in docs/refguide.xml. You don't have to make sure the
DocBook builds correctly, just add some text following the format of the
file. Second, for "strncmp(pc, "#", 1) != 0", just write "*pc != '#'",
but really that bit of code is better written

        /* Determine if the next token starts with a '#', if so, quit processing this line. */
        if (*pc == '#')
            break;
        if(excludelist[i].parse_expr(pc,o.af()) == 0) {
            if (o.debugging > 1)
                error("Loaded exclude target of: %s", pc);
            ++i;
        }
        pc=strtok(NULL, "\t\n ");

Is it really required to treat lines beginning with '#' as a special
case? It appears that this would be handled properly by the in-loop
code. If not, then the special case also needs to allow for whitespace
preceding the initial '#'.

The whole load_exclude function is really two pairs of symmetric parsing
loops, the first loop to count the number of elements and the second
loop to fill in the newly allocated array. Really, the first counting
loop should be aware of comments too, but by being ignorant of comments
it can only overestimate the amount of storage required so it's safe not
to make it aware. In fact, I prefer that it not be made aware of
comments, because having to remember to update both loops is a bad
design and this will encourage me to rewrite the function.

While working on this I came across a couple of things that might
prevent these changes (comments in the targets and exclude files) from
working.  Before I continue, I would like to get some feedback...

1.  The changes to the exclude file would break the specification for
    the file format which says that the file contains excluded targets
    delimited by newline, space or tab.  The section of the code
    (targets.cc, starting line 310) reads a line at time.  My changes
    stop processing the line if a # is found at the beginning of a
    line or token.  Stopping at the beginning of the token lets lines
    that start with an IP and followed by a #comment be processed.

    Since I stop processing the line at the #, any IPs in the same
    line after # are considered comments and are not added to the list
    of excluded addresses.

This is exactly the way I would expect it to work. Breaking the
specification for the file is not a big deal to me; we'll just change it
to "targets delimited by whitespace, after removing # comments that
extend to the end of a line."

    I expect that there are people who have exclude files that are
    just a series of addresses separated by spaces.  This would break
    these files if a comment was included anywhere on a line except at
    the end.

What would such a comment look like?

1.2.3.4 # comment 1.2.3.5

To me, that line contains just one address after stripping comments,
1.2.3.4.

2.  The parsing of the target source list (-iL) is handled in an
    entirely different manner - character by character (nmap.cc,
    starting line 420).  This presents its own challenges.  I cannot
    add the ability to comment file entries without both breaking the
    spec and re-engineering the way that target validation is
    performed. Neither of which I think is wise.

Reading character by character is a better design than reading line by
line. I would rather have both functions work this way. It would make
load_exclude much simpler. load_exclude, in fact, should be broken into
two sub-functions, one that reads a command-line argument and one that
reads from a file, and not try to do double duty.

I don't understand what incompatibility exists here. If the exclude
input file can have comments, then the target input file should probably
allow comments too. Ideally they should be handled by the same parsing
function. But that doesn't stop you from implementing comments for just
one of them now; if the function is sufficiently general then it can be
adapted for both.

BTW, is there code somewhere that simply validates data to determine
if it is a valid nmap target specification (i.e. only contain legal
characters and ranges)?  I have seen a few places in the code where
separating this function from the error message would be useful.  In
particular, while validating the exclude file entries.  Certain
characters generate an error message, but there is no indication that
it is because of issues in the exclusion list, not the targets list.

I don't think there's currently a way to do this that doesn't exit the
program on failure. TargetGroup::parse_expr is the method that checks
validity, but it calls fatal on error.

Example:
Invalid character in host specification.  Note in particular that
square brackets [] are no longer allowed.  They were redundant and can
simply be removed.

If that the address validation function were available the logic could
be changed to be something like this

      If !valid_address(string) {
              do something useful
      }
      else {
              address this somehow
              error("Invalid exclusion %s %s",string,reason)
      }

This would be a good design, but it has to provide enough error
reporting to make the problem clear to the user. (Your "reason"
variable.)

I faced this same problem with the address set parsing code in Ncat
(ncat_hostmatch.c). I had the functions print a message with loguser and
return an error code.

David Fifield

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org


Current thread: