Nmap Development mailing list archives

[NSE] Child Coroutine Patch with Explanation

From: Patrick Donnelly <batrick () batbytes com>
Date: Thu, 4 Jun 2009 17:46:53 -0600

This thread addresses a problem that came up at the NSE IRC meeting:
how to handle a script whose coroutines make blocking calls on network
objects. This e-mail gives a detailed overview of the problem, some
potential solutions, and example use cases for a proposed solution.

====

First some terminology:

o A thread (lowercase) is a Lua coroutine that runs script code.

o A Thread (uppercase) is a "class" representing a Script thread and
all its data in the script engine (nse_main.lua).

o I refer to the "base thread" as the thread which NSE assigns to the
Script when run. This is the coroutine we make when we call
Script:new_thread(...).


When we run the base thread, remember that it may create its own
coroutines that it may then run. If this coroutine were to make an
nsock function call or a blocking call on a mutex, the yield,
initiated by NSE, would incorrectly go to the base coroutine and not
NSE itself. Observe here (nse_main.lua):

    for co, thread in pairs(running) do
      current, running[co] = thread, nil;
      cnse.startTimeOutClock(thread.host);

      local s, result = resume(co, unpack(thread.args, 1, thread.args.n));

We resume (start) the coroutine 'co'. We are blocked until this
coroutine, the base thread, yields or returns normally. Now, one wonders
why we must resume control when the base thread's child coroutine is
yielded by NSE. Perhaps we should allow the Script to continue running
and resume the child coroutine later. We do want to encourage
parallelism, right?

Well, we run into the problem of having a script not knowing when to
resume the child again, that is, when will NSE finish whatever
operation caused the child coroutine to yield? The script could
unknowingly resume the child coroutine itself and thereby erroneously
complete whatever NSE function the child coroutine had called.
Consider a slightly modified version of the script I had made in
Section III of my original nse-lua post:

<file = "cotest.nse">

author = "patrick"
description = "coroutine test!"
categories = {"default"}

require "shortport"

portrule = shortport.port_or_service(22, "ssh")

function a (host, port)
 local try = nmap.new_try();
 local socket = nmap.new_socket();
 try(socket:connect(host.ip, port.number));
 return "connected!";
end

function action (host, port)
 local co = coroutine.create(a);
 print(coroutine.resume(co, host, port));
 print(coroutine.resume(co, false, "some random error"));
 return "done";
end
</file>

Notice that the action function resumes the child coroutine twice.
With the current implementation, this produces the following output:

batrick@batbytes:~/nmap/svn/nmap$ ./nmap --script cotest localhost

Starting Nmap 4.85BETA9 ( http://nmap.org ) at 2009-06-03 05:31 MDT
true
false   some random error
Interesting ports on localhost (127.0.0.1):
Not shown: 998 closed ports
PORT   STATE SERVICE
22/tcp open  ssh
|_ cotest: done
80/tcp open  http

Nmap done: 1 IP address (1 host up) scanned in 0.08 seconds

As I noted in my original post about this, the problem is very subtle.
Note that a C function, such as nsock:connect() yields AND returns:

  return lua_yield(L, 0);

What does this mean to us? (Please be familiar with the coroutine
resume and yield functions in the Lua manual). Well, the function will
yield its results which are retrieved by the coroutine which initiated
the resume (usually, NSE). Note that in this case, we yield 0 values.
The base thread, in the action function, thus prints simply "true"
(the first return value of coroutine.resume is the status).

Normally, NSE will resume the thread with the return values of
nsock:connect(); resuming the thread causes the values passed to
resume _to be the return values of nsock:connect()_. However, the
script resumes the child coroutine itself. You'll notice that the
script passes two values to the child coroutine, false and "some random
error". This means that these two values will be the return values of
nsock:connect() in function 'a'. This will cause the try function to
raise an error; the false return from nsock:connect will be interpreted
as a failure, with "some random error" as the error message. Thus you
see "false some random error" as the printed return values from
coroutine.resume.
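
The mechanics above can be reproduced without Nmap at all. The
following is a minimal sketch in plain Lua; fake_connect and this
simplified try are stand-ins I made up for illustration, not the real
nsock or nmap.new_try code:

```lua
-- Stand-in for a C function that does "return lua_yield(L, 0)":
-- whatever the resumer passes to coroutine.resume becomes its return values.
local function fake_connect()
  return coroutine.yield()
end

-- Simplified stand-in for the function returned by nmap.new_try():
-- raise the second value as an error when the first is false or nil.
local function try(status, ...)
  if not status then error((...), 0) end
  return ...
end

local function a()
  try(fake_connect())
  return "connected!"
end

local co = coroutine.create(a)
-- First resume: the child yields zero values, so only the status comes back.
local first = {coroutine.resume(co)}
-- Second resume: the script, not the engine, supplies connect()'s "results".
local second = {coroutine.resume(co, false, "some random error")}

print(first[1], #first)
print(second[1], second[2])
```

The second resume makes the child fail even though no connection was
ever attempted, which is the same thing that happens in cotest.nse.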


So I hope I've illustrated the problem. We have two (AFAICT) options
to consider:

o NSE adds any coroutines that it yields to the waiting queue and
shall run them later. We can return some special value indicating to
the script that NSE has yielded its worker coroutine.

o Propagate the yield up to the base thread. NSE may then resume
operation and add the base thread to the waiting queue. NSE will later
resume the base thread. This base thread will resume the entire chain
of threads until the thread calling the NSE function is resumed. See
lines 161 to 297 of my patch.
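
To make the second option concrete, here is a rough sketch of yield
propagation in plain Lua. It is not the code from my patch: the
NSE_YIELD marker, blocking_call, and propagating_resume are
hypothetical names used only for this illustration, and the "engine"
is just the last two lines.

```lua
local unpack = table.unpack or unpack
local function pack(...) return {n = select("#", ...), ...} end

-- Hypothetical marker that the engine's blocking calls yield with.
local NSE_YIELD = {}

-- Stand-in for a blocking engine call (for example a socket connect).
local function blocking_call()
  return coroutine.yield(NSE_YIELD)
end

-- Used by script code in place of coroutine.resume: an engine yield from the
-- child is passed up to our own resumer, and the values we are later resumed
-- with are passed back down to the child.
local function propagating_resume(co, ...)
  local results = pack(coroutine.resume(co, ...))
  while results[1] and results[2] == NSE_YIELD do
    results = pack(coroutine.resume(co, coroutine.yield(NSE_YIELD)))
  end
  return unpack(results, 1, results.n)
end

local function child()
  local ok, err = blocking_call()
  return ok and "connected!" or ("failed: " .. tostring(err))
end

local function base()
  local co = coroutine.create(child)
  local status, result = propagating_resume(co)
  return result
end

-- Toy engine: the base thread yields on behalf of its child, and the engine
-- later resumes the base thread with the result of the blocking operation.
local base_co = coroutine.create(base)
local s1, v1 = coroutine.resume(base_co)
local s2, v2 = coroutine.resume(base_co, true)
```

The base thread never sees the engine's yield as a normal result of
its child; it only sees the child's real return value once the engine
has completed the operation.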

The question is: should NSE just add anything that yields to the
waiting queue? If so, any of these coroutines that are yielded will be
eventually resumed by NSE, just like any base thread. However, how
will a script _know_ when to resume the child coroutine? There are a
couple of approaches to doing this; here are two that come to mind: (1) the
base thread could use nmap.sleep and check periodically or (2) the
base thread can use a condition variable (currently unimplemented) and
wait until the child coroutine signals completion. Both of these come
with a "deadlock" problem: if the child coroutine ends (in error) for
some reason, the base thread waits indefinitely for a predicate
that never becomes true. Ultimately, allowing child
coroutines to be resumed in the background appears as magic from the
point of view of the script writer and places a heavy burden on the
script writer to manage these child coroutines. I feel this is
definitely NOT the way to go.

Instead, I prefer the second option as it allows NSE to properly
assume control once again as it does generally. To script writers,
there is no change necessary for their scripts and no "checking"
(overhead) involved; scripts never notice the yielding of control when
making socket operations and this would still be the case.

Now for an example of why I think propagating the NSE initiated yield
is a good thing to add to NSE.

Consider the html spider NSE script. We want to parse a tree of
documents and files hosted on a web server. Usually, we will
recursively descend as we find more html documents, possibly with links
to more files. Here is a model of such a function:

function parse_html (file_string)
  for link in links(file_string) do
    if link_is_html_file(link) then
      parse_html (http.get(host, port, link));
    else
      -- add the link to the class of objects we are gathering
      -- information on, e.g. jpegs
    end
  end
end

Here's the problem for a flexible NSE script: how do we suddenly stop
parsing once we have determined we are done? We have a complicated
chain of function calls to go through. We want to jump back to the
start and return our results to NSE (like a longjmp in C). We can
accomplish this by using a coroutine to parse the chain of html files.
When parse_html, at the end of a long chain of such calls, has
determined we are done working on the web server, we yield back to the
caller with some arbitrary return status. The action function, or base
thread, would then return some engineered results for the user.

With the current NSE system, we cannot do this because this script's
coroutine cannot make calls on sockets. I believe this is a
sufficiently good example of a use case (one I intend to use later
when working on the spider script with Joao).
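
As a rough illustration of the "jump back to the start" idea, here is
a sketch in plain Lua. The in-memory site table and the crawl function
are hypothetical; in a real script the recursion would call http.get,
which is exactly the blocking call that does not work inside a child
coroutine today.

```lua
-- Hypothetical in-memory "site": each page lists the links it contains.
local site = {
  ["/"]       = {"/logo.jpg", "/a.html"},
  ["/a.html"] = {"/photo.jpg", "/b.html"},
  ["/b.html"] = {"/banner.jpg"},
}

-- Collect non-html links depth-first, stopping as soon as 'limit' are found.
local function crawl(limit)
  local found = {}
  local function parse_html(path)
    for _, link in ipairs(site[path] or {}) do
      if link:match("%.html$") then
        parse_html(link)
      else
        found[#found + 1] = link
        if #found >= limit then
          -- Leave the whole chain of parse_html calls in one step.
          coroutine.yield("limit reached")
        end
      end
    end
  end
  local co = coroutine.create(function()
    parse_html("/")
    return "finished"
  end)
  local ok, reason = coroutine.resume(co)
  return found, reason
end

local found, reason = crawl(2)
```

The yield abandons every pending parse_html frame at once, so the
caller gets its partial results without each level of the recursion
having to check a "done" flag.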

I also thought of a second use case. We have the function
socket:receive_buf [1]. This function currently has a complex C
implementation that manages the state of the buffer and iteration in
data structures kept in the Lua registry. The overhead of tracking the
buffer makes the code overly complex. Similar to the producer and
consumer problem which is a classic use case for coroutines, this
could be solved in a remarkably simple way in Lua. (This is for
illustrative purposes and is untested. It does not handle the case
where the delimiter is a function or allows the buffer to be used in
the other socket functions, although this is not too hard to fix.):

function receive_buf (socket, delimiter, keeppattern)
  -- get the socket's environment table
  local socket_env = debug.getfenv(socket);
  if not socket_env.receive_buf_reader then
    local function iterate (buf)
      -- append more data to whatever is left over in the buffer
      buf = buf .. assert(socket:receive_bytes(64));
      while true do
        local i, j = string.find(buf, delimiter);
        if not i then return iterate(buf) end -- need more data
        if keeppattern then
          coroutine.yield(string.sub(buf, 1, j));
        else
          coroutine.yield(string.sub(buf, 1, i-1));
        end
        buf = string.sub(buf, j+1);
      end
    end
    socket_env.receive_buf_reader = coroutine.create(iterate);
  end
  return coroutine.resume(socket_env.receive_buf_reader, "");
end

While this might not necessarily be preferable to our current
implementation, you should note that it illustrates how these
coroutines can be used as producers/consumers by our scripts.
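
For anyone who wants to try the producer/consumer idea outside of
NSE, here is a small self-contained sketch. The make_reader function
and the next_chunk callback are names I made up; next_chunk stands in
for socket:receive_bytes and simply hands out canned strings.

```lua
-- Returns a function that produces one delimited record per call.
-- next_chunk is a hypothetical stand-in for socket:receive_bytes().
local function make_reader(next_chunk, delimiter, keeppattern)
  return coroutine.wrap(function()
    local buf = ""
    while true do
      local i, j = string.find(buf, delimiter)
      if i then
        -- Hand one record to the consumer, then keep the remainder.
        coroutine.yield(string.sub(buf, 1, keeppattern and j or i - 1))
        buf = string.sub(buf, j + 1)
      else
        local chunk = next_chunk()
        if not chunk then return nil end -- source exhausted
        buf = buf .. chunk
      end
    end
  end)
end

-- Canned data split at awkward places, as a network read might be.
local chunks = {"SSH-2.0-Op", "enSSH\r\nsecond li", "ne\r\ntail"}
local n = 0
local read = make_reader(function()
  n = n + 1
  return chunks[n]
end, "\r\n", false)

local line1 = read()
local line2 = read()
local line3 = read()
```

All buffer state lives in the coroutine's local variables, which is
the simplification over tracking it in the registry. In this sketch
the unterminated "tail" is dropped and the third read returns nil.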

I hope it is now clear why this coroutine fix should be applied.

====

Now for the separate problem of allowing scripts to do multiple socket
operations at once. This would be useful to brute-force scripts that
need to utilize multiple sockets simultaneously to finish in any
reasonable amount of time. I did create a system for this in nse-lua
that I believed to be sufficient:

== nse.new_thread(func, ...) ==
A coroutine run manually by a script thread will propagate the yield
through the parent back to NSE. To avoid this, you may use
nse.new_thread to create a thread which is autonomous from the parent.

The function launches a new thread (coroutine) that is managed by NSE.
The extra parameters to new_thread are passed to the function 'func'.
All errors are ignored and return values discarded by NSE. (<-- Very
important so the new thread is not interpreted as an actual script.)

Here was the implementation, which is very short and clear:

  function _G.nse.new_thread (func, ...)
    assert(type(func) == "function", "function expected");
    local co = create(function(...) func(...) end); -- do not return results
    print_debug(2, "%s spawning new thread (%s).",
        current.parent.info, tostring(co));
    total, pending[co] = total + 1, setmetatable({
      co = co,
      args = {n = select("#", ...), ...},
      host = current.host,
      port = current.port,
      parent = current.parent,
      d = function() end, -- output no debug information
    }, {__index = Thread});
    return co;
  end

This function adds to pending (which is a table of threads ready to be
moved to running) a new base thread that NSE will run. The main
difference between this new thread and any normal script thread is all
return values (such as the string indicating a result to be placed in
the host/port script results) are discarded and errors ignored (the
thread:d() function is a NOP function).
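
The behavior can be tried in isolation with a toy scheduler. This is
only a sketch under simplifying assumptions: the pending list,
run_all, and worker are hypothetical, every bare yield stands for a
blocking socket operation, and there is no host, port, or timeout
handling as in nse_main.lua.

```lua
local unpack = table.unpack or unpack
local pending = {}

-- Toy counterpart of nse.new_thread: queue a worker whose results are dropped.
local function new_thread(func, ...)
  assert(type(func) == "function", "function expected")
  local co = coroutine.create(function(...) func(...) end)
  pending[#pending + 1] = {co = co, args = {n = select("#", ...), ...}}
  return co
end

-- Toy engine loop: resume each queued worker in turn until all have ended.
local function run_all()
  while #pending > 0 do
    local thread = table.remove(pending, 1)
    local ok = coroutine.resume(thread.co, unpack(thread.args, 1, thread.args.n))
    thread.args = {n = 0} -- arguments are only passed on the first resume
    if ok and coroutine.status(thread.co) == "suspended" then
      pending[#pending + 1] = thread
    end -- errors are ignored and finished workers are dropped
  end
end

local log = {}
local function worker(name, attempts)
  for i = 1, attempts do
    log[#log + 1] = name .. i
    coroutine.yield() -- stands for waiting on a socket
  end
  return "discarded"
end

new_thread(worker, "a", 2)
new_thread(worker, "b", 2)
run_all()
```

The log ends up interleaved between the two workers, which is the
effect a brute-force script wants from several sockets at once.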

I believe this is the optimal way to allow parallel worker threads to
do work with multiple sockets.

====

[1] http://nmap.org/nsedoc/modules/nmap.html#receive_buf


-- 
-Patrick Donnelly

"Let all men know thee, but no man know thee thoroughly: Men freely
ford that see the shallows."

- Benjamin Franklin
