Nmap Development mailing list archives

Re: Inconsistency in nmap XML output

From: Dual Mobius <dualmobius () comcast net>
Date: Wed, 10 Nov 2004 20:09:38 -0700

Matt wrote:

> How many people interested in this thread and getting the host down
> added to the XML output are using windows to try and figure this stuff
> out (keep reading i'm not just windows bashing, windows can do it all
> too)?


I am currently using XML output on Linux and not on Windows.

> Seriously, if you're using linux why would you spend all the time
> building XML parsers when you can just run 'awk'.  I do nmap scans
> regularly and have yet to use the XML output.  Just -oN and -oG for
> me, thx.

Basically because when you need to keep very close records of exactly what was done, when, and how, the XML output is a lot easier to extract these details out of (especially when using version detection). Run


$ nmap -vv -sV 127.0.0.1 -oA nmap_comparison

and compare for yourself. My comments below will focus on comparing the .xml output with the .gnmap output.


For example, in a non-version detecting scan:

XML format for open port (newlines inserted to avoid badly placed line wraps)
------------------------
<port protocol="tcp" portid="22">
  <state state="open" />
  <service name="ssh" method="table" conf="3" />
</port>

Grepable format for open port
------------------------------
22/open/tcp//ssh///

In this case, I will gladly admit that the grepable output is easier to handle with sed/awk or your choice of scripting language -- and I frequently do just that.
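
As a rough illustration of how little work the grepable entry needs, here is a minimal Python sketch. The helper name parse_gnmap_port is hypothetical, and the field positions (port, state, protocol, owner, service, RPC info, version info) are assumed from the slash-separated sample above.

```python
def parse_gnmap_port(entry):
    """Split one grepable port entry such as '22/open/tcp//ssh///'."""
    # Assumed field order: port/state/protocol/owner/service/rpc/version/
    fields = entry.strip().split("/")
    return {
        "port": int(fields[0]),
        "state": fields[1],
        "protocol": fields[2],
        "service": fields[4],
        "version": fields[6],  # empty unless version detection was used
    }


print(parse_gnmap_port("22/open/tcp//ssh///"))
```

The version field comes back as one unsplit string, which is fine here because it is empty in a scan without version detection.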



Now on to a version detection scan:

XML format for open port (newlines inserted to avoid badly placed line wraps)
------------------------
<port protocol="tcp" portid="22">
  <state state="open" />
  <service name="ssh" product="OpenSSH" version="3.8.1p1"
   extrainfo="protocol 2.0" method="probed" conf="10" />
</port>

Grepable format for open port
------------------------------
22/open/tcp//ssh//OpenSSH 3.8.1p1 (protocol 2.0)/

In this case, the XML output is a godsend for parsing the detected product, version, etc. into a spreadsheet or database. Nmap has the protocol knowledge embedded in it to give the correct values for the correct parts -- so why not make use of that instead of figuring out how to reliably split the various formats of product names and strings that are encountered into product/version/extra tuples (as illustrated below)?


  "OpenSSH 3.8.1p1 (protocol 2.0)"
  "Samba smbd 3.X (workgroup: XXXX)"
  "CUPS 1.1"
  "OpenLDAP 2.1.X"
  "Squid webproxy 2.5.STABLE7"
  "Apache httpd"

Some have version numbers, some don't. Some products are multiple words, some are single words, etc. Quite a while back, I spent an afternoon trying to reliably split these strings, and just when I thought I had it, I found a new service enabled somewhere that messed it up again. I switched to parsing the XML format, which has had a lot fewer problems since then.
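
For comparison, a minimal Python sketch of getting the same product/version/extra tuple from the XML, using the standard library's xml.etree.ElementTree. The helper name service_tuple is hypothetical, and the sample is the port element shown earlier in this message rather than a full scan file.

```python
import xml.etree.ElementTree as ET

# The version-detection <port> element from the example above.
PORT_XML = """
<port protocol="tcp" portid="22">
  <state state="open" />
  <service name="ssh" product="OpenSSH" version="3.8.1p1"
   extrainfo="protocol 2.0" method="probed" conf="10" />
</port>
"""


def service_tuple(port_elem):
    """Return (product, version, extrainfo); missing parts become ''."""
    service = port_elem.find("service")
    if service is None:
        return ("", "", "")
    return (
        service.get("product", ""),
        service.get("version", ""),
        service.get("extrainfo", ""),
    )


port = ET.fromstring(PORT_XML)
print(port.get("portid"), service_tuple(port))
```

No string splitting is involved, so a multi-word product name or a missing version number does not need any special handling.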

Another bonus of the XML format is when you need to log the command line used as well as the start and finish times for the scan run.

In the grepable output, you have to parse this out of the first and last comment lines in the output.

# nmap 3.75 scan initiated Wed Nov 10 19:11:16 2004 as: nmap -vv -sV -oA nmap_comparison 127.0.0.1

...

# Nmap run completed at Wed Nov 10 19:11:37 2004 -- 1 IP address (1 host up) scanned in 21.263 seconds

While this is again doable with sed/awk, I find it easier with an XML parser. You just ask for the elements and attributes by name to get exactly the data you want (most current scripting languages come with very simple XML parsers).


<nmaprun scanner="nmap" args="nmap -vv -sV -oA nmap_comparison 127.0.0.1"
 start="1100139076" version="3.75" xmloutputversion="1.01">
...
<runstats>
  <finished time="1100139097" />
  <hosts up="1" down="0" total="1" />
</runstats></nmaprun>
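
A minimal Python sketch of pulling the command line and the start and finish times out of that XML. The helper name run_metadata is hypothetical, and the sample is the abbreviated fragment above with the elided host section left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Abbreviated <nmaprun> fragment from above (host section omitted).
RUN_XML = """
<nmaprun scanner="nmap" args="nmap -vv -sV -oA nmap_comparison 127.0.0.1"
 start="1100139076" version="3.75" xmloutputversion="1.01">
<runstats>
  <finished time="1100139097" />
  <hosts up="1" down="0" total="1" />
</runstats></nmaprun>
"""


def run_metadata(root):
    """Collect the command line and scan times from an <nmaprun> root."""
    start = int(root.get("start"))
    end = int(root.find("runstats/finished").get("time"))
    return {
        "args": root.get("args"),
        # Both timestamps are seconds since the Unix epoch.
        "start": datetime.fromtimestamp(start, tz=timezone.utc),
        "finished": datetime.fromtimestamp(end, tz=timezone.utc),
        "elapsed_seconds": end - start,
    }


meta = run_metadata(ET.fromstring(RUN_XML))
print(meta["args"])
print(meta["start"].isoformat(), meta["finished"].isoformat())
print(meta["elapsed_seconds"])
```

For a real scan file, the root would come from ET.parse("nmap_comparison.xml").getroot() instead of the inline string.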

I'm NOT saying that just about all of this data can't be extracted with combinations of sed, awk, cut, grep, and friends. It's just that when it is all put together, it is often easier to just parse the XML.

Not to mention speed issues. (Focusing on just line-oriented data sets for the moment.) I've run across multiple instances where shell scripts using sed, awk, cut, and grep will take over 10 minutes to process a block of data while an equivalent perl/python/ruby script will do the exact same job in about 20 seconds -- basically just from the overhead of spawning new processes and piping data between sed, awk, cut, etc.


> So who needs XML?

I do.

> I don't consider nmap to be an end all be all to
> build a report from; it's just a middle step.

I completely agree. However, there is nothing wrong with making things easier for the downstream work as long as it doesn't mess up the tool. Why make 100 people implement almost the same thing downstream, if it is comparatively simple to add to the upstream data source?


> So I'm interested in
> the output not making a report.  And i can search through the -oN much
> quicker with awk than going through the XML any other way.  Maybe i've
> got a very limited view of nmap, but it has served me well for what
> i've been using it for.
