Nmap Development mailing list archives
RE: [RFC][patch] XML structured script output (evaluation of nse-structured3 patch)
From: "Rob Nicholls" <robert () robnicholls co uk>
Date: Fri, 29 Jun 2012 22:04:23 +0100
-----Original Message-----
From: nmap-dev-bounces () insecure org [mailto:nmap-dev-bounces () insecure org] On Behalf Of Daniel Miller
Sent: 14 June 2012 12:43
To: Daniel Miller; Nmap Dev
Subject: Re: [RFC][patch] XML structured script output (evaluation of nse-structured3 patch)
<snip>
I'll try to use the wiki page to expand on these ideas, but I may not get to much of it today. This has been a long time coming, so I feel it's more important to get it right than to rush into it.

To the nmap-dev list at large: Please join the discussion if you have opinions or suggestions. The outcome will affect everyone, and I know there are smarter people than me reading this.

Hi Dan,

Your recent Tweet prompted me to finish my draft email with my thoughts! If I'm completely honest, I'm still not entirely happy with the proposals listed so far on the wiki: https://secwiki.org/w/Nmap/Structured_Script_Output

I spotted proposal gamma on the wiki yesterday. There seem to be some formatting issues on the wiki that I haven't looked into too deeply (table tags aren't encoded and it's outside of the pre tags?). It's certainly cleaner, but it looks like some structure/information might have been lost (e.g. the "subject" and "issuer" bit is lost in the example output for ssl-cert), so I prefer proposal beta.

I'm slowly warming to proposal beta. I dislike seeing so many dict and elem tags (they make the output harder to read, and would make XPath queries slightly longer), but having thought about it I suspect they're required if (and that could be a very big "if") we're trying to keep things generic. I presume one good reason for using dict, elem etc. is to keep the output simple and avoid having to update the DTD file (https://svn.nmap.org/nmap/docs/nmap.dtd) every time a new script adds something new (e.g. if ssl-cert created something like <subject><commonName>secwiki.org</commonName></subject> instead of those dict and elem keys). The alternative would be to allow anything within the script tags without strictly defining it in the DTD, but I suspect that'd be a very bad idea, and it would be very difficult to impose stricter definitions in the future.

Is the intention that all scripts get their output automatically converted (consistently) to XML, or will scripts need special treatment (you mention the XML structure could be "opt-in", and I think that's what's been coded so far)? I've spotted that this is one of the outstanding questions listed on the wiki.

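As a rough illustration of the XPath point above, the sketch below compares a query against generic dict/elem tags with a query against script-specific tags. Both XML fragments are hand-written assumptions modelled on the ssl-cert example in this email, not real Nmap output, and Python's xml.etree.ElementTree only supports a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Assumed proposal-beta style output: generic tags, names carried in "key".
GENERIC = """
<script id="ssl-cert">
  <dict key="subject">
    <elem key="commonName">secwiki.org</elem>
  </dict>
</script>
"""

# Assumed script-specific tags, which would need matching DTD entries.
SPECIFIC = """
<script id="ssl-cert">
  <subject>
    <commonName>secwiki.org</commonName>
  </subject>
</script>
"""

# The generic form needs an attribute predicate at every level.
generic_query = "./dict[@key='subject']/elem[@key='commonName']"
specific_query = "./subject/commonName"

generic_cn = ET.fromstring(GENERIC).find(generic_query).text
specific_cn = ET.fromstring(SPECIFIC).find(specific_query).text

print(generic_cn, specific_cn)
print(len(generic_query), len(specific_query))
```

Both queries return the same value; the generic one is longer, but it keeps working for any script without a DTD change.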
My vote would probably be for a single representation that automatically generates XML for all scripts (in a consistent manner, so we don't have to worry too much about making it "backwards-compatible", and avoiding any opt-in or opt-out problems), with XML and normal output containing exactly the same information (normal output is stored as a value in the script's existing "output" attribute, and the same information is additionally stored in a structured format). I know Nmap contains some additional information in the XML file that's not displayed on screen, but that's generally the exception rather than the norm. I think I'd prefer if everything was converted into the structured XML format without having to opt in (I might regret that once I see the output from some scripts).

The ssl-cert output shown at the wiki seems to magically convert the string "Not valid before" to "notBefore" (I haven't looked at the code yet to see how this is done, but I assume something's hardcoded in an updated ssl-cert script). I presume that means other scripts (e.g. smb-os-discovery) wouldn't automatically produce nice structured XML output using the current code until someone adds the same sort of opt-in information (e.g. "Computer name" to "computerName")? Is it possible to automatically create the keys (e.g. camelCase)?

The only time I'd consider having different output in the XML file is if it holds additional information that is known but isn't displayed on screen (e.g. the TTL is in the XML output for a port, but not normal output); but scripts currently tailor what's returned based on verbosity settings, and anything known locally by the script that's not returned is presumably lost forever, so we'd probably need to reconsider how NSE produces output (and rewrite a lot of scripts). That might be possible if we do it while the number of scripts is still low enough to be practical. If we do that, I'd also like to see a nice/consistent way of reporting errors.

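On the question of creating keys automatically, a minimal sketch of a label-to-camelCase conversion might look like the following. The function name to_camel_case is hypothetical and is not part of NSE or the nse-structured3 patch; it is written in Python purely for illustration.

```python
import re

def to_camel_case(label: str) -> str:
    """Turn a human-readable label into a camelCase key (illustrative only)."""
    # Split on anything that is not a letter or digit.
    words = re.findall(r"[A-Za-z0-9]+", label)
    if not words:
        return ""
    first, rest = words[0], words[1:]
    # Lowercase the first word, capitalise the first letter of the others.
    return first.lower() + "".join(w[:1].upper() + w[1:].lower() for w in rest)

print(to_camel_case("Computer name"))     # computerName
print(to_camel_case("Not valid before"))  # notValidBefore
```

A purely mechanical conversion gives "notValidBefore" rather than the "notBefore" shown on the wiki, which is consistent with the guess above that the ssl-cert key comes from an explicit mapping rather than an automatic one.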
Most scripts return something like "ERROR: Something bad happened." as part of the normal output, but if we had a specific way of returning an error message as an error then we could also store that information within an error tag in the structured XML output, making it easy to count or identify when an error occurred. It might also be possible to modify Nmap to only display errors in the normal output when the verbosity is raised? I believe smtp-brute (and maybe some other scripts) will return an error message if it can't connect no matter what the verbosity is, so users will always see the error message. This might save developers from having to check the verbosity before deciding if an error message should be returned as part of the normal output.

Although this could delay the whole structured XML output, is it worth creating a better API for returning script output that helps with creating structured XML? I'd like to see a script return something like:

- Output (normal output, what's displayed is based on verbosity)
- Details (all of the information that the script can determine, no matter what verbosity the user selected)
- Errors (this will probably be blank most of the time)

The Output section could then include whatever's currently generated by the scripts (they may require a small tweak), and any scripts that return error messages could be modified to return the error in the Errors section. The Output section would be the normal/usual block of text that we see (for scripts using the vulnerability library, possibly without vuln.extra_info, unless verbosity is raised?). Details would be all of the information known (e.g. vuln.extra_info) and returned by the script, converted to a nice structure (e.g. proposal beta).

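The three-part return value described above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not NSE code: ScriptResult, add_value, to_xml and has_errors are hypothetical names, and the dict/elem layout simply follows proposal beta as described in this email (lists become repeated elem tags sharing one key).

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class ScriptResult:
    output: str = ""                              # normal, verbosity-dependent text
    details: dict = field(default_factory=dict)   # everything the script knows
    errors: list = field(default_factory=list)    # usually empty

def add_value(parent, key, value):
    """Append value under parent using generic dict/elem tags."""
    if isinstance(value, dict):
        node = ET.SubElement(parent, "dict", key=key)
        for child_key, child_value in value.items():
            add_value(node, child_key, child_value)
    elif isinstance(value, (list, tuple)):
        # Repeat the same key once per item.
        for item in value:
            add_value(parent, key, item)
    else:
        ET.SubElement(parent, "elem", key=key).text = str(value)

def to_xml(script_id, result):
    """Build a <script> element from a ScriptResult."""
    script = ET.Element("script", id=script_id, output=result.output)
    if result.details:
        add_value(script, "details", result.details)
    if result.errors:
        add_value(script, "errors", {"error": result.errors})
    return script

def has_errors(script):
    """True if the <script> element carries an errors section."""
    return script.find("./dict[@key='errors']") is not None

results = {
    "an-error": ScriptResult(errors=["ERROR: Failed to connect to SMTP server."]),
    "something-vuln-cve2012-nnnn": ScriptResult(
        output="VULNERABLE: ...",
        details={"state": "VULNERABLE",
                 "IDS": {"CVE": ["CVE-2012-nnn1", "CVE-2012-nnn2"]}},
    ),
}
scripts = [to_xml(script_id, result) for script_id, result in results.items()]
failed = [s.get("id") for s in scripts if has_errors(s)]
print(failed)  # ['an-error']
```

Keeping errors in their own section means that counting failed scripts becomes a single query instead of a search for "ERROR:" in free text.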
For a script using the vulnerability library, we presumably might see something like (apologies if I've made any mistakes, I've done this by hand):

<script id="something-vuln-cve2012-nnnn" output=" VULNERABLE: Authentication bypass in something... (etc.)">
  <dict key="details">
    <elem key="title">Title</elem>
    <elem key="state">VULNERABLE</elem>
    <elem key="description">Some big long description</elem>
    <dict key="IDS">
      <elem key="CVE">CVE-2012-nnn1</elem>
      <elem key="CVE">CVE-2012-nnn2</elem>
    </dict>
    <dict key="dates">
      <dict key="disclosure">
        <elem key="year">2012</elem>
        <elem key="month">nn</elem>
        <elem key="day">n</elem>
      </dict>
    </dict>
    <dict key="references">
      <elem key="reference">http://example.com</elem>
      <elem key="reference">http://anotherexample.com</elem>
    </dict>
    <dict key="extra_info">
      <proposal beta/gamma magic goes here as appropriate>
    </dict>
  </dict>
</script>

Other scripts might do something like (then we can more easily determine which scripts have errors and which scripts return useful results):

<script id="an-error" output="">
  <dict key="errors">
    <elem key="error">ERROR: Failed to connect to SMTP server.</elem>
  </dict>
</script>

Obviously, if we go with proposal gamma, we'd have table instead of dict tags in my examples above. I'd started thinking about this before I posted proposal gamma or caught up with all of the emails on the list. Apologies if I've covered something that has already been discussed in the other emails. I've literally just spotted in a much earlier email that you suggested a "WARNINGS" (similar to my Errors) section that's only displayed if debugging is enabled, so it sounds like we're both coming to similar conclusions.

Sorry for the lengthy email, I hope that was useful. You did (foolishly) ask for feedback! :)

Rob

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/