Nmap Development mailing list archives

Re: [RFC][patch] XML structured script output (evaluation of nse-structured3 patch)


From: David Fifield <david () bamsoftware com>
Date: Thu, 19 Jul 2012 14:29:49 -0700

On Fri, Jun 29, 2012 at 10:04:23PM +0100, Rob Nicholls wrote:
> I'm slowly warming to proposal beta. I dislike seeing so many dict and elem
> tags (as it makes it harder to read, and would make XPath queries slightly
> longer), but having thought about it I suspect they're required if (and that
> could be a very big "if") we're trying to keep things generic. I presume one
> good reason for using dict, elem etc. is to keep the output simple to avoid
> having to update the DTD file (https://svn.nmap.org/nmap/docs/nmap.dtd)
> every time there's a new script that adds something new (e.g. if ssl-cert
> created something like <subject><commonName>secwiki.org</commonName></subject>
> instead of those dict and elem keys). The alternative would be to allow
> anything within the script tags without strictly defining it in the DTD, but
> I suspect that'd be a very bad idea and would be very difficult to impose
> stricter definitions in the future.

Creating ad-hoc XML elements is something we cannot do. We even listed
it as requirement 2: "Scripts can't just make up their own new XML
elements--it has to be possible to validate XML documents against a
predefined DTD as always." Not only would it result in invalid XML,
there could be bad ambiguities if scripts output elements that already
have a meaning elsewhere, like <host>.

> Is the intention that all scripts get their output automatically converted
> (consistently) to XML, or will scripts need special treatment (you mention
> the XML structure could be "opt-in", and I think that's what's been coded so
> far)?

The plan is that scripts will be able to return a table instead of a
string, to indicate that they are creating structured output. This table
will be serialized as XML in the obvious way, using some embedding
scheme like the <dict>/<elem> one that has been proposed. This table
will also be converted automatically into text for normal output, with
indentation and colons, like what stdnse.format_output does now.
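
To make the two automatic conversions concrete, here is a minimal sketch in
Python (not NSE Lua, and not the patch itself). The key="..." attribute, the
function names, and the two-space indentation are assumptions made for
illustration; only the <dict>/<elem> element names come from the proposal
under discussion.

```python
from xml.sax.saxutils import escape, quoteattr

def to_xml_lines(value, key=None, depth=0):
    """Serialize nested dicts and scalars with generic <dict>/<elem> tags.

    The key="..." attribute is an assumed way of carrying table keys.
    """
    pad = "  " * depth
    attr = "" if key is None else " key=" + quoteattr(str(key))
    if isinstance(value, dict):
        lines = [f"{pad}<dict{attr}>"]
        for k, v in value.items():
            lines += to_xml_lines(v, k, depth + 1)
        lines.append(f"{pad}</dict>")
        return lines
    # Anything that is not a dict is treated as a leaf value.
    return [f"{pad}<elem{attr}>{escape(str(value))}</elem>"]

def to_text_lines(table, depth=0):
    """Render the same table as indented "key: value" lines for normal output."""
    pad = "  " * depth
    lines = []
    for k, v in table.items():
        if isinstance(v, dict):
            lines.append(f"{pad}{k}:")
            lines += to_text_lines(v, depth + 1)
        else:
            lines.append(f"{pad}{k}: {v}")
    return lines

# Illustrative table, reusing the ssl-cert example from the quoted message.
result = {"subject": {"commonName": "secwiki.org"}}
xml_out = "\n".join(to_xml_lines(result))
text_out = "\n".join(to_text_lines(result))
```

Because every script's table maps onto the same two element names, the DTD
only has to describe <dict> and <elem> once, no matter what keys a script
chooses.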

However, it is my opinion that automatic serialization is not sufficient
for some scripts, like nfs-ls. What can (and should) be presented to
humans as one line,
        drwxr-xr-x  1000  100   4096     2010-06-17 12:28 /mnt/nfs/files
would become something ridiculous under automatic serialization:
        perm: 1755
        uid: 1000
        gid: 100
        size: 4096
        time: 2010-06-17T12:28Z
        name: /mnt/nfs/files
(And this for every file!)

This is why we believe it is necessary to allow the script author to
control both the structured table output and the textual normal output.
(But optionally--if automatic serialization is good enough for a script,
then it's good enough.) I think scripts should just be able to return a
string in addition to the table, and Daniel prefers annotating the table
with a function that does custom serialization.
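
The two options can be compared with a rough Python sketch. Everything in it
is hypothetical: AnnotatedTable, normal_output, ls_line, and the field values
are names and data invented for illustration, and a real NSE implementation
would be written in Lua (where the annotation might live in a metatable).

```python
class AnnotatedTable(dict):
    """A table carrying an optional custom text formatter (hypothetical)."""
    def __init__(self, data, formatter=None):
        super().__init__(data)
        self.formatter = formatter

def default_text(table):
    """Automatic serialization: one "key: value" line per field."""
    return "\n".join(f"{k}: {v}" for k, v in table.items())

def normal_output(result):
    """Choose the text shown in normal output for a script result."""
    if isinstance(result, tuple):
        # Option A: the script returned (table, string); use the string.
        _table, text = result
        return text
    formatter = getattr(result, "formatter", None)
    if formatter is not None:
        # Option B: the table is annotated with its own serializer.
        return formatter(result)
    # Neither was supplied: fall back to automatic serialization.
    return default_text(result)

def ls_line(entry):
    """Format one directory entry on a single line, ls-style."""
    return (f'{entry["perm"]}  {entry["uid"]}  {entry["gid"]}  '
            f'{entry["size"]}  {entry["time"]} {entry["name"]}')

# Illustrative entry; the values are placeholders, not real nfs-ls output.
entry = {"perm": "drwxr-xr-x", "uid": 1000, "gid": 100, "size": 4096,
         "time": "2010-06-17 12:28", "name": "/mnt/nfs/files"}

one_line_a = normal_output((entry, ls_line(entry)))         # option A
one_line_b = normal_output(AnnotatedTable(entry, ls_line))  # option B
six_lines = normal_output(entry)                            # automatic
```

In both options the structured table is still what gets serialized to XML;
only the normal-output text is overridden.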

David Fifield
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

