Nmap Development mailing list archives

Re: [NSE] A Lua implementation of NSE--detailed review


From: David Fifield <david () bamsoftware com>
Date: Sun, 18 Jan 2009 23:03:18 -0700

On Sat, Jan 17, 2009 at 10:28:02AM -0700, Patrick Donnelly wrote:
On Fri, Jan 16, 2009 at 10:03 PM, David Fifield <david () bamsoftware com> wrote:
NSE is started via the open_nse (and closed using close_nse) procedure
in nse_main.cc. This function begins by opening all standard Lua
libraries and adding all Nmap standard C libraries to package.preload
(to be required in nse_main.lua).

I was confused by all the places the standard NSE library is loaded.
(I mean pcre, bit, openssl, etc.) It happens
 * at the top of nse_main.lua,
 * in the C function init_main, and
 * twice in the C function script_updatedb (once through
   set_nmap_libraries and once inside the Lua code embedded in a
   character array).
Is it necessary at each of these places? Will they all need to be
updated when a library is added or removed? What does script_updatedb
need, for example, with the pcre and openssl modules?

Loading each library function in the package.preload table allows
libraries to be reloaded if needed and manually calling require in C
will not yield any real benefits over doing it in Lua.
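
As a rough illustration of the package.preload mechanism mentioned above, here is a minimal sketch in plain Lua (no Nmap needed). The module name "demo_bin" and its contents are hypothetical stand-ins for a C library opener; nothing here is taken from nse_main.cc.

```lua
-- Hypothetical loader standing in for a C library opener (e.g. for bin or pcre).
local load_count = 0
package.preload["demo_bin"] = function(name)
  load_count = load_count + 1
  return { name = name }
end

local m1 = require "demo_bin"     -- runs the loader once
local m2 = require "demo_bin"     -- served from package.loaded, loader not rerun

-- Clearing the cache entry lets the library be reloaded on the next require.
package.loaded["demo_bin"] = nil
local m3 = require "demo_bin"     -- loader runs a second time
```

The loader only runs when something actually calls require, which is why registering the libraries in package.preload costs little even if a given run never uses them.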

The script database loads the libraries because when a script is
loaded to inspect its category table it may require these libraries
(many do).

Maybe I misunderstand. Don't the scripts bring in those libraries
already? I deleted those "require" lines from script_updatedb and
nse_main.lua (attached nse-lua-norequire.diff). Then ran
"./nmap --datadir=. --script-updatedb":

NSE: Updating rule database.
NSE: error while updating Script Database:
[string "local nse = ......"]:15: ./nselib/base64.lua:93: attempt to index global 'bin' (a nil value)
stack traceback:
        [C]: in function 'assert'
        [string "local nse = ......"]:15: in main chunk

However the error appears to be in base64.lua, which doesn't require the
bin library even though it uses it. Adding a require line to base64.lua
and trying to update the database again gives

NSE: Updating rule database.
NSE script database updated successfully.

With those changes, there's only one piece of code that needs to be
changed when a new C library is developed, which is the function
set_nmap_libraries.
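
The base64.lua failure above comes down to a module using a library it never required. A minimal sketch of the difference, in plain Lua, with a hypothetical "demo_bin" module standing in for bin (the pack function here is invented for illustration and is not the real bin API):

```lua
-- Hypothetical stand-in for the bin library, registered the preload way.
package.preload["demo_bin"] = function()
  return { pack = function(s) return s end }
end

-- Relies on some other code having created a global named demo_bin.
local function implicit_encode(s)
  return demo_bin.pack(s)          -- fails if that global was never set
end

-- Declares its own dependency, so load order no longer matters.
local function explicit_encode(s)
  local bin = require "demo_bin"
  return bin.pack(s)
end

local ok_implicit = pcall(implicit_encode, "abc")
local ok_explicit, out = pcall(explicit_encode, "abc")
```

Here the implicit version fails with an "attempt to index" error of the kind shown in the log above, while the explicit version succeeds regardless of what was loaded before it.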

Next nse_main.lua is loaded and called with the private nse library
(different from the one discussed in Section IV) and the array of
rules (--script) as arguments. The private nse library is
used to access some functionality needed through C such as
keyWasPressed() or nsock_loop().

When NSE (the function returned by nse_main.lua) is used to scan a
host group, a new nse library is created for scripts (replacing the
old one).

This is confusing, to have one "nse" library with the functions
 { fetchfile_absolute, dump_dir, nsock_loop, key_was_pressed, ref,
   unref, updatedb, scan_progress_meter }
and an unrelated "nse" library containing the functions
 { new_thread, push_handler, pop_handler, get_host, get_port }.
The internal library needs a different name.

I agree. How about "cnse" or similar?

I would give it an ugly name like nse_internal to emphasize that it is a
purely internal, glue binding to certain Nmap functions. "cnse" is fine
too.

There are only two places (out of ten) that I found where nse_prepare_yield
is called with charge set to false. Would it be possible to just stop
the timeout clock in C++ code in those two places? Then always make sure
the clock is restarted when a coroutine is resumed. The management of
script coroutines is justifiably complex; adding timeout management to
those parts of the code makes it harder to understand.

A host no longer needing to be charged (the thread blocking on a mutex)
does not necessarily mean the timeout clock should be stopped. This
determination can only be made in the Lua mainloop. Moving this logic
to C++ would be very difficult (in both the current and Lua
implementations).

I see now why the determination can't be made at the level of an
individual thread; each host may have more than one thread running and
its timeout clock is stopped only when none of its threads are being
charged time. (Are these the threads created with nse.new_thread?)
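
The per-host rule described above (stop the clock only when none of the host's threads are being charged) could be sketched roughly as follows. All names here (new_host, set_charge, clock_running) are hypothetical and are not taken from nse_main.lua.

```lua
-- Hypothetical per-host bookkeeping: which threads are currently charged.
local function new_host()
  return { charged = {}, count = 0, clock_running = false }
end

-- Record whether a thread is being charged; the host clock runs
-- if and only if at least one of its threads is charged.
local function set_charge(host, thread_id, charge)
  local was_charged = host.charged[thread_id] == true
  if charge and not was_charged then
    host.charged[thread_id] = true
    host.count = host.count + 1
  elseif not charge and was_charged then
    host.charged[thread_id] = nil
    host.count = host.count - 1
  end
  host.clock_running = host.count > 0
end

local host = new_host()
set_charge(host, "script A", true)
set_charge(host, "script B", true)
set_charge(host, "script A", false)      -- B is still charged
local running_with_one = host.clock_running
set_charge(host, "script B", false)      -- nobody is charged now
local running_with_none = host.clock_running
```

The decision depends on the state of every thread for the host, which is why it cannot be made by any single thread on its own.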

== nse.new_thread(func, ...) ==
A coroutine run manually by a script thread will propagate the yield
through the parent back to NSE. To avoid this, you may use
nse.new_thread to create a thread which is autonomous from the parent.

The function launches a new thread (coroutine) that is managed by NSE.
It inherits the host and port of the parent thread but these values
are not passed to func. Instead the extra parameters to new_thread are
passed. All errors are ignored and return values discarded by NSE.

Does this mean that if I just do a plain yield in a coroutine from a
script it gets propagated to NSE (with the NSE_YIELD mechanism)?

No, a yield in a child coroutine (that is, a coroutine being resumed
by an NSE thread), will only propagate to NSE if NSE yielded the
thread.
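
For reference, this is how a plain yield behaves in standard Lua, independent of NSE: it returns control to whoever resumed the coroutine and goes no further. A small sketch:

```lua
local log = {}

local child = coroutine.create(function()
  log[#log + 1] = "child: before yield"
  coroutine.yield("from child")
  log[#log + 1] = "child: after yield"
end)

local parent = coroutine.create(function()
  -- The child's yield lands here, in the parent, not in the parent's resumer.
  local _, value = coroutine.resume(child)
  log[#log + 1] = "parent got: " .. value
  coroutine.resume(child)
  return "finished"
end)

-- The parent itself never yields, so this single resume runs it to completion.
local ok, result = coroutine.resume(parent)
local parent_status = coroutine.status(parent)
```

The outer resume sees the parent run straight through to its return value; the child's yield is invisible at that level.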

I thought that only happened with nsock/mutex-type operations.

Right.

Okay, cool.

I think I would really prefer it to be that way. If it's not, Lua
programmers will have to relearn some concepts: an NSE coroutine is not a
normal Lua coroutine; if you want a normal coroutine, you have to use
nse.new_thread.

No, this is not correct. I'm sorry for the confusion. nse.new_thread
actually spawns a thread that is _not_ managed by the NSE thread which
created it. NSE will resume the new thread just like any other NSE
level thread. The difference is all results and errors are ignored
(nothing is reported).

So it's just like running another script, except that it can't produce
any output; presumably it's doing work for whatever script created it.
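
To make that reading concrete, here is a toy scheduler in plain Lua. It is only a sketch of the behavior described above (the scheduler resumes the new thread like any other, but discards its results and errors); the names add_thread, run_all and reported are hypothetical and this is not how nse_main.lua is actually written.

```lua
local unpack = table.unpack or unpack   -- Lua 5.2+ or Lua 5.1
local queue, reported = {}, {}

-- Queue a coroutine; worker threads have their results and errors dropped.
local function add_thread(func, is_worker, ...)
  queue[#queue + 1] = {
    co = coroutine.create(func),
    worker = is_worker,
    args = { n = select("#", ...), ... },
  }
end

-- Hypothetical analogue of nse.new_thread: extra arguments go to func.
local function new_thread(func, ...)
  add_thread(func, true, ...)
end

local function run_all()
  while #queue > 0 do
    local t = table.remove(queue, 1)
    local ok, result = coroutine.resume(t.co, unpack(t.args, 1, t.args.n))
    if coroutine.status(t.co) == "suspended" then
      t.args = { n = 0 }
      queue[#queue + 1] = t             -- not finished, schedule again
    elseif not t.worker then
      reported[#reported + 1] = ok and tostring(result) or ("error: " .. tostring(result))
    end
  end
end

-- A "script" that spawns a failing worker and then returns its own output.
add_thread(function()
  new_thread(function(x) error("worker failed on " .. x) end, "task")
  return "script output"
end, false)
run_all()
```

Only the script's own return value ends up in reported; the worker's error is silently dropped.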

How do you see this facility being used?

== nse.push_handler(func) ==
This function pushes func on a handler stack. If the thread ends
normally or aborts due to an error, all functions on the handler stack
are called from the top of the stack down.
== nse.pop_handler() ==
This function pops a function from the thread's handler stack.
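
A minimal sketch of the handler-stack behavior described above, in plain Lua. The wrapper run_with_handlers and the api table are hypothetical; only the push/pop semantics and the top-down call order come from the description.

```lua
-- Run body with a private handler stack; on normal return or on error,
-- call whatever handlers remain, from the top of the stack down.
local function run_with_handlers(body)
  local handlers = {}
  local api = {
    push_handler = function(f) handlers[#handlers + 1] = f end,
    pop_handler = function() return table.remove(handlers) end,
  }
  local ok, err = pcall(body, api)
  for i = #handlers, 1, -1 do
    pcall(handlers[i])                 -- a failing handler does not stop the rest
  end
  return ok, err
end

local order = {}
local ok = run_with_handlers(function(api)
  api.push_handler(function() order[#order + 1] = "release lock" end)
  api.push_handler(function() order[#order + 1] = "close socket" end)
  api.push_handler(function() order[#order + 1] = "never runs" end)
  api.pop_handler()                    -- removes the most recent handler
  error("script aborted")
end)
```

After the simulated abort, the two remaining handlers run in reverse order of registration, which is the property that makes them usable for releasing resources such as a held lock.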

These function names are too generic. Maybe name them after C's atexit,
or something else to convey that they are exit handlers.

I was using pthread cleanup handlers for inspiration on this one.
Perhaps I should have used nse.cleanup_push and nse.cleanup_pop like
they did [1]. I wonder if I should also add the execute option to the
pop handler like they have for pthreads; it would certainly be
convenient.

Yeah, those are better names. Don't worry about adding the execute
parameter until there's a demonstrated need for it. Ron mentioned these
would be useful to clean up mutexes. Did you have another use in mind
for them?

I'll follow up on the host/port userdata thing in a different thread.

David Fifield

Attachment: nse-lua-norequire.diff

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org