Nmap Development mailing list archives

[NSE] XML Parser RFC


From: Patrick Donnelly <batrick () batbytes com>
Date: Wed, 29 Jun 2011 16:32:07 -0400

Hi list,

I'm working on an XML parser for NSE, written in lpeg [1], that should
be done pretty soon. I'm admittedly not an XML savant and rarely use
XML myself, so I'm relying heavily on the standards and their
description of the syntax, which fortunately maps easily to an lpeg
grammar.

Most parsers I've seen just decompose the XML file into (nested)
tables using basic pattern matching. The new lpeg module would require
XML well-formedness, at least at the level of syntax if not
necessarily semantics (e.g. a closing tag might not be required to
match its opening tag). Like those parsers, I'm thinking of just
extracting the information into nested tables.
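
To make the syntax-versus-semantics distinction concrete, here is a
rough sketch. It is written in Python with a regular expression
rather than in Lua with lpeg, so it is not the proposed grammar, only
an illustration. The names TOKEN, tokenize and tags_balanced are made
up for this example, and it handles only elements, double-quoted
attributes and text (no comments, CDATA, processing instructions or
entities). The tokenizer rejects input that is not syntactically
valid; whether closing tags match their opening tags is a separate
check.

```python
import re

# Hypothetical token pattern: a closing tag, an opening (or
# self-closing) tag with double-quoted attributes, or a run of text.
TOKEN = re.compile(
    r'</([\w.:-]+)\s*>'
    r'|<([\w.:-]+)((?:\s+[\w.:-]+\s*=\s*"[^"<]*")*)\s*(/?)>'
    r'|([^<]+)'
)

def tokenize(xml_text):
    """Split xml_text into (kind, value) tokens; raise on a syntax error."""
    pos = 0
    tokens = []
    while pos < len(xml_text):
        m = TOKEN.match(xml_text, pos)
        if m is None:
            raise ValueError(f"syntax error at offset {pos}")
        close, open_, _attrs, selfclose, text = m.groups()
        if close:
            tokens.append(("close", close))
        elif open_:
            tokens.append(("open", open_))
            if selfclose:
                # <a/> is treated as <a></a>.
                tokens.append(("close", open_))
        elif text.strip():
            # Whitespace-only text between tags is dropped.
            tokens.append(("text", text.strip()))
        pos = m.end()
    return tokens

def tags_balanced(tokens):
    """Semantic check: every closing tag matches the most recent open tag."""
    stack = []
    for kind, value in tokens:
        if kind == "open":
            stack.append(value)
        elif kind == "close":
            if not stack or stack.pop() != value:
                return False
    return not stack
```

With this sketch, tokenize('<a><b>hi</c></a>') succeeds even though
</c> does not match <b>, and only tags_balanced() reports the
mismatch. Truncated input such as '<a><b' fails in tokenize() itself.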

For example, a simple XML parser [2], given this input:

<methodCall kind="xuxu">
  <methodName>examples.getStateName</methodName>
  <params>
    <param>
      <value><i4>41</i4></value>
    </param>
  </params>
</methodCall>

would produce this table (printed for humans):

[1] => table
    (
       [1] => table
           (
              [1] => examples.getStateName
              [xarg] => table
                  (
                  )
              [label] => methodName
           )
       [2] => table
           (
              [1] => table
                  (
                     [1] => table
                         (
                            [1] => table
                                (
                                   [1] => 41
                                   [xarg] => table
                                       (
                                       )
                                   [label] => i4
                                )
                            [xarg] => table
                                (
                                )
                            [label] => value
                         )
                     [xarg] => table
                         (
                         )
                     [label] => param
                  )
              [xarg] => table
                  (
                  )
              [label] => params
           )
       [xarg] => table
           (
              [kind] => xuxu
           )
       [label] => methodCall
    )
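
For anyone who wants to experiment with this layout, here is a
minimal sketch that builds a comparable structure. It uses Python's
standard xml.etree.ElementTree module as a stand-in, not the lpeg
parser or the Lua code from [2]. The function names to_table and
parse are hypothetical, and child nodes are collected in a "children"
list instead of under integer keys as in the Lua table above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<methodCall kind="xuxu">
  <methodName>examples.getStateName</methodName>
  <params>
    <param>
      <value><i4>41</i4></value>
    </param>
  </params>
</methodCall>"""

def to_table(elem):
    """Convert an Element into a dict mirroring the label/xarg layout."""
    node = {"label": elem.tag, "xarg": dict(elem.attrib), "children": []}
    # Text directly inside the element comes first, unless it is
    # only whitespace (indentation between tags).
    if elem.text and elem.text.strip():
        node["children"].append(elem.text.strip())
    for child in elem:
        node["children"].append(to_table(child))
        # Text that follows a child element belongs to the parent.
        if child.tail and child.tail.strip():
            node["children"].append(child.tail.strip())
    return node

def parse(xml_text):
    """Parse a complete XML document string into nested dicts."""
    return to_table(ET.fromstring(xml_text))
```

Calling parse(SAMPLE) gives a dict whose "label" is "methodCall" and
whose "xarg" is {"kind": "xuxu"}, with methodName and params as its
two children.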

I'm planning to have similar output but also account for all of XML's
various oddities.

I'm hoping the list can provide feedback on what they would like to
see come out of the parser and in particular what they would like to
see supported.

[1] http://www.inf.puc-rio.br/~roberto/lpeg/lpeg.html
[2] http://lua-users.org/wiki/LuaXml

-- 
- Patrick Donnelly
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

