Nmap Development mailing list archives
Re: SAX versus DOM: The umit nmap xml parsing benchmark
From: "Adriano Monteiro" <py.adriano () gmail com>
Date: Wed, 7 Jun 2006 09:30:34 -0300
Before any comments: I know that the -sV option is redundant here, since the -A option I used already enables version detection.

Cheers!

On 6/7/06, Adriano Monteiro <py.adriano () gmail com> wrote:
Hi folks,

Yesterday I finished umit's new parser. Umit has been using DOM to parse the nmap xml output since the beginning. But, as most of you probably know already, DOM may not work well with large xml files, because it loads the entire file into memory before you can manipulate it. As Umit is intended to serve network administrators who frequently scan loads of hosts, this parsing must be fast.

The answer to this problem is (I hope ;-) SAX parsing. SAX doesn't load the entire XML file into memory. Instead, it reads the tags as it goes and fires events (callbacks) to handle them. The difference is shown in the following benchmark.

I tested an nmap xml output file with 4000 hosts.

The nmap options used for this scan: "-A -sV -v -v -v -d -d -p80,22"

The xml file size: 5.0M

The machine (from /proc/cpuinfo):

    vendor_id   : GenuineIntel
    model name  : Mobile Intel(R) Celeron(R) CPU 1.80GHz
    cpu MHz     : 1794.364
    cache size  : 256 KB
    bogomips    : 3595.62

Memory (from "free -m"):

             total   used   free   shared   buffers   cached
    Mem:       503    227    275        0         3       77

The benchmark:

I used python's "timeit" module to measure the execution time. Each parser was tested only once, so the times shown below are what it took to parse the given file a single time with each method.

Result:

    SAX: 10.8011291027 seconds
    DOM: 61.6646518707 seconds

Quite a difference, isn't it?

Feel free to send any comments, suggestions, questions, etc. I'm committing this version to the repository right now, and by Monday there will (hopefully) be a testing version of UMIT available with this new parser and some changes to the nmap output display.

Cheeeeeers!

--
Adriano Monteiro Marques
http://www.globalred.com.br
http://umit.sourceforge.net
py.adriano () gmail com

"Free software is a matter of liberty not price." (PYTHON powered)
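For readers who want to see the two approaches side by side, here is a minimal sketch. It is not Umit's actual parser. It uses Python's standard xml.sax and xml.dom.minidom modules, and the sample document, the OpenPortHandler class, and the parse_with_sax / parse_with_dom / benchmark names are all made up for illustration. The sample is a tiny hand-written file in the general shape of nmap XML output, assuming one address element per host, so the timings it prints say nothing about the 4000-host figures above.

```python
import timeit
import xml.sax
from xml.dom import minidom

# Tiny hand-written sample in the general shape of nmap XML output.
# Illustrative only: this is not real scan data.
SAMPLE_XML = """<?xml version="1.0"?>
<nmaprun>
  <host>
    <address addr="192.0.2.1" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"><state state="open"/></port>
      <port protocol="tcp" portid="80"><state state="closed"/></port>
    </ports>
  </host>
  <host>
    <address addr="192.0.2.2" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80"><state state="open"/></port>
    </ports>
  </host>
</nmaprun>
"""


class OpenPortHandler(xml.sax.ContentHandler):
    """Hypothetical handler: builds {address: [open port ids]} from tag events."""

    def __init__(self):
        super().__init__()
        self.results = {}
        self._addr = None  # address of the host currently being read
        self._port = None  # port currently being read

    def startElement(self, name, attrs):
        # SAX calls this once per opening tag; no tree is ever built.
        if name == "address":
            self._addr = attrs.get("addr")
            self.results[self._addr] = []
        elif name == "port":
            self._port = int(attrs.get("portid"))
        elif name == "state" and attrs.get("state") == "open":
            self.results[self._addr].append(self._port)


def parse_with_sax(text):
    handler = OpenPortHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.results


def parse_with_dom(text):
    # DOM builds the whole document tree in memory first, then we walk it.
    doc = minidom.parseString(text)
    results = {}
    for host in doc.getElementsByTagName("host"):
        addr = host.getElementsByTagName("address")[0].getAttribute("addr")
        results[addr] = [
            int(port.getAttribute("portid"))
            for port in host.getElementsByTagName("port")
            if port.getElementsByTagName("state")[0].getAttribute("state") == "open"
        ]
    return results


def benchmark():
    # Each parser is run once, as in the benchmark described above.
    for label, func in (("SAX", parse_with_sax), ("DOM", parse_with_dom)):
        seconds = timeit.timeit(lambda: func(SAMPLE_XML), number=1)
        print("%s: %.6f seconds" % (label, seconds))


if __name__ == "__main__":
    benchmark()
```

Both functions return the same dictionary for the sample. The difference is only in how they get there: the SAX handler keeps just the current address and port while the tags stream past, whereas the DOM version holds the complete tree before any lookup happens.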