Nmap Development mailing list archives

Re: [RFC][patch] XML structured script output (evaluation of nse-structured3 patch)


From: Daniel Miller <bonsaiviking () gmail com>
Date: Sat, 30 Jun 2012 08:41:24 -0500

On Sat, Jun 30, 2012 at 2:54 AM, Patrick Donnelly <batrick () batbytes com> wrote:
Hi Rob and others,

On Fri, Jun 29, 2012 at 5:04 PM, Rob Nicholls <robert () robnicholls co uk> wrote:
My vote would probably be for a single representation, which
automatically generates XML for all scripts (in a consistent manner, so we
don't have to worry too much about making it "backwards-compatible", and
preventing any opt-in or opt-out problems),

I personally would like to see that scripts produce some ScriptOutput
object which is returned to NSE. This object can be used to produce
XML/normal output. Said differently, let the script encode the output
as a regular Lua object (table) rather than as a string. Encoding the
script output as a string seems like unnecessary complication to me.

This is how the current implementation does things. Any return value
of any type is stringified for normal output (using the __tostring
metamethod to choose between a couple sane algorithms for tables) and
this string is also output as the "output" attribute of the <script>
tag. The return value itself is stored in a table in the Lua Registry,
indexed by the "topointer" value of the coroutine that returned it.
When XML output is generated, a recursive algorithm "dumps" the table
in XML format. A check is made first to see if the return object is a
"string," and if so, no XML output is done.
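As a rough illustration of the recursive dump described above, here is a minimal Python sketch (NSE itself is written in Lua and C). The names dump_xml and script_xml, and the use of <dict>/<elem> tags borrowed from the example later in this thread, are assumptions for illustration and not the code from the patch.

```python
from xml.sax.saxutils import escape, quoteattr


def dump_xml(value, key=None, indent=0):
    """Recursively render a value as a list of <dict>/<elem> lines."""
    pad = "  " * indent
    attr = "" if key is None else " key=" + quoteattr(str(key))
    if isinstance(value, dict):
        lines = [f"{pad}<dict{attr}>"]
        for k, v in value.items():
            lines.extend(dump_xml(v, k, indent + 1))
        lines.append(f"{pad}</dict>")
        return lines
    if isinstance(value, (list, tuple)):
        # Integer-indexed entries carry no key of their own.
        lines = [f"{pad}<dict{attr}>"]
        for v in value:
            lines.extend(dump_xml(v, None, indent + 1))
        lines.append(f"{pad}</dict>")
        return lines
    return [f"{pad}<elem{attr}>{escape(str(value))}</elem>"]


def script_xml(script_id, result, stringify=str):
    """Emit a <script> tag; plain strings get no structured children."""
    head = (f"<script id={quoteattr(script_id)} "
            f"output={quoteattr(stringify(result))}")
    if isinstance(result, str):
        # "Old-style" output: only the output attribute is written.
        return head + "/>"
    items = result.items() if isinstance(result, dict) else [(None, result)]
    body = [line for k, v in items for line in dump_xml(v, k, 1)]
    return "\n".join([head + ">"] + body + ["</script>"])
```

The stringify parameter stands in for the __tostring-based conversion; a string return skips the structured part entirely, matching the check described above.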

A regular string returned to NSE would be coerced into a ScriptOutput
using some type of sane conversion.

My opinion is that a sane conversion from string to XML tags does not
exist. As described above, though, string returns are perfectly fine,
and treated as "unstructured" or "old-style" output.

The only time I'd consider having different output in the XML file is if it
holds additional information that is known but isn't displayed on screen
(e.g. the TTL is in the XML output for a port, but not normal output); but
scripts currently tailor what's returned based on verbosity settings, and
anything known locally by the script that's not returned is presumably lost
forever, so we'd probably need to reconsider how NSE produces output (and
rewrite a lot of scripts). That might be feasible if we do it while the
number of scripts is still low enough to make it practical.

This seems really important to me. In fact, I think the natural way to
allow scripts to include additional information in the XML while
allowing the XML to be used to produce regular normal output is to add
a "verbosity" attribute to each script output tag. The scripts can
associate a "value" with each piece of information and, based on the
Nmap invocation or Zenmap reloading the script output or whatever, the
normal script output can be regenerated. Granted, we still have the
script@output attribute, maybe that can be what was printed at the
time Nmap was run.

This is an interesting idea. In my implementation, I'm trying to walk
the line between giving script authors lots of options for output, and
not overwhelming them. I've hidden the "metatable" concept behind
functions that do the work, for instance. My current problem is how to
limit the string output to certain fields when necessary, which I
think is closely related to the verbosity problem. My first thought is
to attach a list of fields to the object's metatable, which would also
solve the ordering problem (of non-integer table keys), but I'll have
to see how that plays out. It may not scale to more than 2 levels of
verbosity.
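To make the field-list idea concrete, here is a small Python sketch. It assumes a hypothetical ScriptOutput class and a per-verbosity mapping of field lists; in NSE the list would live in the Lua table's metatable, and none of these names come from the patch.

```python
class ScriptOutput(dict):
    """Dict-like result whose string form is limited to chosen fields."""

    def __init__(self, data, fields_by_verbosity):
        super().__init__(data)
        # Hypothetical stand-in for a field list kept in a metatable:
        # maps a verbosity level to an ordered list of field names.
        self.fields_by_verbosity = fields_by_verbosity

    def render(self, verbosity=0):
        # Use the field list of the highest level not exceeding `verbosity`.
        levels = [lvl for lvl in self.fields_by_verbosity if lvl <= verbosity]
        fields = self.fields_by_verbosity[max(levels)] if levels else []
        # The list fixes both which fields appear and in what order.
        return "\n".join(
            f"{name}: {self[name]}" for name in fields if name in self
        )


out = ScriptOutput(
    {"ttl": 64, "state": "open", "reason": "syn-ack"},
    {0: ["state"], 1: ["state", "reason", "ttl"]},
)
print(out.render(0))
print(out.render(1))
```

All fields stay in the object for XML output; only the string rendering is filtered. Each extra verbosity level needs its own list, which is one way to see why this may not scale well.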

If we do that, I'd also like to see a nice/consistent way of reporting
errors. Most scripts return something like "ERROR: Something bad happened."
as part of the normal output, but if we had a specific way of returning an
error message as an error then we could also store that information within
an error tag in the structured XML output, making it easy to count or
identify when an error occurred. It might also be possible to modify Nmap to
only display errors in the normal output when the verbosity is raised? I
believe smtp-brute (and maybe some other scripts) will return an error
message if it can't connect no matter what the verbosity is, so users will
always see the error message. This might save developers from having to
check the verbosity before deciding if an error message should be returned
as part of the normal output.

Slightly off-topic, I'd really like to see a generic error messaging
system for NSE that doesn't require the script to return from the
action function (i.e. a real error) but doesn't cause NSE to start
printing stack traces. Not all errors are bad. Not all errors should
freak the user out and result in a bug report to nmap-dev.

Although this could delay the whole structured XML output, is it worth
creating a better API for returning script output that helps with creating
structured XML? I'd like to see a script return something like:

 - Output (normal output, what's displayed is based on verbosity)
 - Details (all of the information that the script can determine, no matter
what verbosity the user selected)
 - Errors (this will probably be blank most of the time)

I agree.
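The three-part return value proposed above could look roughly like the following Python sketch. The ScriptResult name, its fields, and the rule that errors only appear at raised verbosity are assumptions drawn from the proposal, not an existing NSE API.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptResult:
    """Hypothetical container for the Output / Details / Errors split."""
    output: str = ""                              # normal, verbosity-tailored text
    details: dict = field(default_factory=dict)   # everything the script knows
    errors: list = field(default_factory=list)    # usually empty

    def normal_output(self, verbosity=0):
        lines = [self.output] if self.output else []
        # Assumption: errors are shown only when verbosity is raised.
        if verbosity > 0:
            lines += [f"ERROR: {msg}" for msg in self.errors]
        return "\n".join(lines)


res = ScriptResult(output="VULNERABLE", errors=["Connection timed out"])
print(res.normal_output(0))
print(res.normal_output(1))
```

Keeping errors in their own list means an XML writer could emit them in a dedicated tag and count them, without scripts checking verbosity themselves.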

The Output section could then include whatever's currently generated by the
scripts (they may require a small tweak), and any scripts that return error
messages could be modified to return the error in the Errors section. The
Output section would be the normal/usual block of text that we see (for
scripts using the vulnerability library, possibly without vuln.extra_info,
unless verbosity is raised?). Details would be all of the information known
(e.g. vuln.extra_info) and returned by the script, converted to a
nice structure (e.g. proposal beta). For a script using the vulnerability
library, we presumably might see something like (apologies if I've made any
mistakes, I've done this by hand):

<script id="something-vuln-cve2012-nnnn" output=" VULNERABLE:
  Authentication bypass in something... (etc.)">
  <dict key="details">
    <elem key="title">Title</elem>
    <elem key="state">VULNERABLE</elem>
    <elem key="description">Some big long description</elem>
    <dict key="IDS">
      <elem key="CVE">CVE-2012-nnn1</elem>
      <elem key="CVE">CVE-2012-nnn2</elem>
    </dict>
    <dict key="dates">
      <dict key="disclosure">
        <elem key="year">2012</elem>
        <elem key="month">nn</elem>
        <elem key="day">n</elem>
      </dict>
    </dict>
    <dict key="references">
      <elem key="reference">http://example.com</elem>
      <elem key="reference">http://anotherexample.com</elem>
    </dict>
    <dict key="extra_info">
      <!-- proposal beta/gamma magic goes here as appropriate -->
    </dict>
  </dict>
</script>

I would see each tag taking a single line in normal script output. Such as:

something-vuln-cve2012-nnnn:
details:
 title: Title
 state: VULNERABLE
 IDS:
   CVE: CVE-2012-nnn1

etc.

This is almost exactly what my default formatting function does.
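For readers who want to see the one-line-per-tag layout spelled out, here is a minimal Python sketch of such a formatter. The function name format_lines and the two-space indent step are assumptions; this is not the default formatting function from the patch.

```python
def format_lines(value, indent=0):
    """Render a nested dict as indented 'key: value' lines."""
    lines = []
    pad = " " * indent
    for key, item in value.items():
        if isinstance(item, dict):
            # A nested table becomes a heading line plus indented children.
            lines.append(f"{pad}{key}:")
            lines.extend(format_lines(item, indent + 2))
        else:
            lines.append(f"{pad}{key}: {item}")
    return lines


result = {
    "details": {
        "title": "Title",
        "state": "VULNERABLE",
        "IDS": {"CVE": "CVE-2012-nnn1"},
    }
}
print("\n".join(format_lines(result)))
```

Python dicts preserve insertion order, so the ordering problem for non-integer keys mentioned earlier in the thread does not show up here; a Lua version would need an explicit key order.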

I would like to see it possible to add information to an element based
on some verbosity factor. How this would work with a verbosity
attribute, I'm not sure. Maybe concatenating certain elements when
doing XML -> Normal? e.g.:

<elem key="foo" verbosity="0">Some stuff</elem><elem
verbosity="1">(some more details)</elem>

would be automatically concatenated into:

foo: Some stuff (some more details)

at verbosity 1.
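A minimal Python sketch of that XML -> Normal concatenation follows. It assumes that an <elem> without a key continues the most recently emitted line and that a missing verbosity attribute means 0; render_elems is a hypothetical helper, not part of Nmap or Zenmap.

```python
import xml.etree.ElementTree as ET


def render_elems(xml_fragment, verbosity):
    """Rebuild normal output from <elem> tags up to a verbosity level."""
    # Wrap the fragment so it parses as a single well-formed document.
    root = ET.fromstring(f"<script>{xml_fragment}</script>")
    lines = []
    for elem in root.iter("elem"):
        if int(elem.get("verbosity", "0")) > verbosity:
            continue  # too detailed for the requested level
        text = elem.text or ""
        key = elem.get("key")
        if key is not None:
            lines.append(f"{key}: {text}")
        elif lines:
            # Assumption: a keyless elem extends the last emitted line.
            lines[-1] += " " + text
        else:
            lines.append(text)
    return "\n".join(lines)


fragment = (
    '<elem key="foo" verbosity="0">Some stuff</elem>'
    '<elem verbosity="1">(some more details)</elem>'
)
print(render_elems(fragment, 0))
print(render_elems(fragment, 1))
```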


Hope something useful comes out of this late night rambling!

--
- Patrick Donnelly

Late night rambling is the best. An alert brain sometimes rejects good ideas!

Dan
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

