Nmap Development mailing list archives

Re: Lua bugfixes and a new buffering feature


From: doug () hcsw org
Date: Wed, 4 Jul 2007 22:44:49 -0700

Hi nmap-dev,

On Wed, Jul 04, 2007 at 01:04:57AM -0700 or thereabouts, Fyodor wrote:
> But I'm not 100% sold on the idea
> of caching compiled regular expressions in the NSE registry as a
> general rule.  Before we recommend that, I'd love to see some
> empirical data that it makes a material performance difference.

I agree that caching compiled regexes usually brings no performance
advantage, but since it is essentially the same effort either way,
we may as well do it. It's just a matter of where you put the
pcre:new lines: in action() or elsewhere.

I can also foresee situations where regex compilation could become
a factor. In my IRC script I use at least a dozen regexps. What if
I ran that against 1000 different open ports on a host? Across a
large scan that could save hundreds of thousands of PCRE compilations,
just by placing the same Lua lines in a different spot in the file!
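A minimal sketch of the placement difference, runnable in plain Lua. The
compile() function here is a hypothetical stand-in for pcre:new that only
counts calls, and the pattern is made up for illustration. It assumes that
file-level code runs once while action() runs once per port:

```lua
-- Hypothetical stand-in for pcre:new: counts how often "compilation" happens.
local compile_count = 0
local function compile(pattern)
  compile_count = compile_count + 1
  return { pattern = pattern }
end

-- Compiled once, at file level, when the script is loaded.
local nick_re = compile("^:([^!]+)!")

-- Variant A: action() reuses the file-level object.
local function action_cached(host, port)
  return nick_re
end

-- Variant B: action() recompiles on every invocation.
local function action_uncached(host, port)
  return compile("^:([^!]+)!")
end

for port = 1, 1000 do action_cached("host", port) end
local cached_compiles = compile_count      -- one compile in total

compile_count = 0
for port = 1, 1000 do action_uncached("host", port) end
local uncached_compiles = compile_count    -- one compile per call

print(cached_compiles, uncached_compiles)
```

With a dozen patterns instead of one, the gap grows by the same factor.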

Plus, if I'm understanding what Stoiko says, then this is solved:

On Mon, Jul 02, 2007 at 09:36:05PM +0200, Stoiko Ivanov wrote:
> Actually variables defined (either as local or not) outside of functions in
> a nse-script keep their value during multiple invocations of the script.

--

It's very interesting to see Marek's nsock buffer patch. As usual
with Marek's patches, this one looks to be of impeccable quality. If I'd
known about it I would've used it instead of adding a new Lua-based buffer.

However, it is interesting to see the complexity difference in the
2 patches (1 vs 20 pages, roughly). This is because I used NSE to
demultiplex nsock sockets instead of doing it myself. After all, isn't
this why we embedded Lua in the first place?

make_buffer can also do some very interesting things as-is and with
minimal extension of the code could do much more. Such is the power of
closing over variables.

For example, say you connected to an HTTP server with the sd socket.
If you run this code:

buf = stdnse.make_buffer(sd, "\r?\n\r?\n")

The first time you call buf it will return the contents of the
HTTP headers, no matter how many reads it took to receive them. Neat, huh?
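To illustrate the closure idea outside of Nmap, here is a rough sketch in
plain Lua. It is not the stdnse implementation: fake_socket and
make_buffer_sketch are hypothetical names, the socket is a canned stand-in
whose receive() returns a status and a chunk, and the separator is assumed
to be a Lua pattern:

```lua
-- Hypothetical stand-in for an NSE socket: receive() hands back one canned
-- chunk per call, then reports failure as a closed connection would.
local function fake_socket(chunks)
  local i = 0
  return {
    receive = function(self)
      i = i + 1
      if chunks[i] then return true, chunks[i] end
      return false, "EOF"
    end,
  }
end

-- Sketch of a closure-based buffer. The returned function yields the data
-- up to (not including) the next match of the Lua pattern `sep`, reading
-- from the socket as many times as needed.
local function make_buffer_sketch(sd, sep)
  local buf = ""   -- closed-over state, private to the returned function
  return function()
    while true do
      local s, e = buf:find(sep)
      if s then
        local piece = buf:sub(1, s - 1)
        buf = buf:sub(e + 1)
        return piece
      end
      local ok, data = sd:receive()
      if not ok then
        if #buf > 0 then        -- hand back whatever is left over
          local rest = buf
          buf = ""
          return rest
        end
        return nil, data
      end
      buf = buf .. data
    end
  end
end

-- The headers arrive split across three reads.
local sd = fake_socket({ "HTTP/1.0 200 OK\r\nSer", "ver: demo\r\n", "\r\nbody" })
local next_piece = make_buffer_sketch(sd, "\r?\n\r?\n")
local headers = next_piece()
local body = next_piece()
print(headers)
```

The caller never sees how many reads were needed; that bookkeeping lives
entirely in the closed-over buf variable.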

Marek also has some interesting notes in his patch re: performance:

+   TODO: This implementation uses 'memcpy'. It is possible to
+   create implementation without copying the same data many times.
+   Maybe I'll code it in the future.

A good observation! make_buffer also does a lot of memory copying/allocation
that could be unnecessary. As Marek suggests, this could be written
as a special-purpose C structure, but it could just as easily be
written as a slightly smarter closure implementation. I have done this
for Lisp before: it avoids concatenating strings until it absolutely
needs to, and by then it has a lot of them together so it can do a single
allocation/copy. There are also the shared strings I added for nuff that avoid
copying completely. See note 3 in: http://hcsw.org/nuff/language.html#section.4
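The deferred-concatenation idea might look roughly like this in Lua. This
is only a sketch: make_accumulator is a hypothetical helper, not part of
stdnse or of either patch. Chunks are kept in a table and joined with a
single table.concat call only when the caller asks for the result:

```lua
-- Sketch of a closure that defers string concatenation. Chunks are stored
-- in a table and joined once, instead of building a new string per read.
local function make_accumulator()
  local parts, n = {}, 0   -- closed-over state
  return {
    -- Remember a chunk without copying it into a bigger string.
    add = function(chunk)
      n = n + 1
      parts[n] = chunk
    end,
    -- Join everything collected so far in one pass, then reset.
    flush = function()
      local joined = table.concat(parts, "", 1, n)
      parts, n = {}, 0
      return joined
    end,
  }
end

local acc = make_accumulator()
acc.add("HTTP/1.0 200 OK\r\n")
acc.add("Server: demo\r\n")
acc.add("\r\n")
local joined = acc.flush()
local empty = acc.flush()
print(#joined)
```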

Anyways I guess what I'm trying to state is this perhaps surprising opinion:

USING LUA IS ALMOST ALWAYS BETTER THAN C. (at least for use in NSE scripts)

Because Lua does the I/O demultiplexing for you, using Lua almost always means
less work for you and more possibilities for Nmap optimisation (letting NSE
do other things at the same time). Of course additions to nsock that help
NSE scripts (like Marek's patch) work great too, so long as you know about them!

Best,

Doug



_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
