Nmap Development mailing list archives

[NSE] Parallel Worker Thread Patch

From: Patrick Donnelly <batrick () batbytes com>
Date: Sun, 8 Nov 2009 17:18:17 -0500

Here is a patch that adds worker threads to NSE for scripts to use.
Currently, a script can work on only one socket at a time. With this
patch, a script can create worker threads that work on sockets in
parallel with the parent script. It is
helpful to view these worker threads as another instantiation of your
script with its own custom action function (main function). The worker
threads, however, cannot generate script output. As an example use
case, an HTTP spider script can use worker threads to perform many
requests in parallel instead of serially.

I have attached the NSEDoc which should adequately explain how to use
the new features.


thread, info = stdnse.new_thread(main [, ...])
--- This function allows you to create worker threads that may perform
-- network tasks in parallel with your script thread.
--
-- Any network task (e.g. <code>socket:connect(...)</code>) will cause the
-- running thread to yield to NSE. This allows network tasks to appear to be
-- blocking while being able to run multiple network tasks at once.
-- While this is useful for running multiple separate scripts, it is
-- unfortunately difficult for a script itself to perform network tasks in
-- parallel. In order to allow scripts to also have network tasks running in
-- parallel, we provide this function, <code>stdnse.new_thread</code>, to
-- create a new thread that can perform its own network related tasks
-- in parallel with the script.
--
-- The script launches the worker thread by calling the <code>new_thread</code>
-- function with the parameters:
-- * The main Lua function for the worker thread to execute, similar to the
--   script action function.
-- * A variable number of arguments to be passed to the worker's
--   main function.
--
-- The <code>stdnse.new_thread</code> function will return two results:
--  * The worker thread's base (main) coroutine (useful for tracking status).
--  * A status query function (described below).
--
-- The status query function shall return two values:
-- * The result of <code>coroutine.status</code> using the worker thread base coroutine.
-- * The error object thrown that ended the worker thread or
--   <code>nil</code> if no error was thrown. This is typically a string,
--   like most Lua errors.
--
-- Note that NSE discards all return values of the worker's main function. You
-- must use function upvalues or environments to communicate results.
--
-- You should use the condition variable (<code>nmap.condvar</code>)
-- and mutex (<code>nmap.mutex</code>) facilities to coordinate with your
-- worker threads. Keep in mind that Nmap is single-threaded, so there are
-- no (memory) synchronization issues to worry about; however, there <b>is</b>
-- resource contention. Your resources are usually network bandwidth, network
-- sockets, etc. You will need condition variables if the work for any single
-- thread is dynamic. For example, a web server spider script with a pool
-- of workers initially has a single root HTML document to fetch. Once the
-- root document is retrieved, the set of resources to be retrieved
-- (the workers' work) becomes very large, because each HTML document adds
-- many new hyperlinks (resources) to fetch.
--@name new_thread
--@class function
--@param main The main function of the worker thread.
--@param ... The arguments passed to the worker's main function.
--@return co The base coroutine of the worker thread.
--@return info A query function used to obtain status information of the worker.
--@usage
--local requests = {"/", "/index.html", --[[ long list of objects ]]}
--
--function thread_main (host, port, responses, ...)
--  local condvar = nmap.condvar(responses);
--  local what = {n = select("#", ...), ...};
--  local allReqs = nil;
--  for i = 1, what.n do
--    allReqs = http.pGet(host, port, what[i], nil, nil, allReqs);
--  end
--  local p = assert(http.pipeline(host, port, allReqs));
--  for i, response in ipairs(p) do responses[#responses+1] = response end
--  condvar "signal";
--end
--
--function many_requests (host, port)
--  local threads = {};
--  local responses = {};
--  local condvar = nmap.condvar(responses);
--  local i = 1;
--  repeat
--    local j = math.min(i+10, #requests);
--    local co = stdnse.new_thread(thread_main, host, port, responses,
--        unpack(requests, i, j));
--    threads[co] = true;
--    i = j+1;
--  until i > #requests;
--  repeat
--    condvar "wait";
--    for thread in pairs(threads) do
--      if coroutine.status(thread) == "dead" then threads[thread] = nil end
--    end
--  until next(threads) == nil;
--  return responses;
--end
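
To illustrate the two return values and the "results via upvalues" rule
described above, here is a minimal sketch that runs in a plain Lua
interpreter. The new_thread and run_workers functions below are
hypothetical stand-ins written for this sketch (they are not the code in
the patch); they only mimic the documented behavior: return the coroutine
and a status query function, and discard the return values of main.

```lua
local unpack = table.unpack or unpack  -- works on Lua 5.1 and 5.2+

-- Stand-in run queue. Inside Nmap, NSE schedules worker threads itself.
local workers = {}

-- Hypothetical stand-in for stdnse.new_thread: returns the worker
-- coroutine and a status query function, as described in the NSEDoc.
local function new_thread(main, ...)
  local worker = {
    co = coroutine.create(main),
    args = {n = select("#", ...), ...},
  }
  workers[#workers + 1] = worker
  local function info()
    return coroutine.status(worker.co), worker.err
  end
  return worker.co, info
end

-- Stand-in scheduler: run every queued worker until it is dead.
local function run_workers()
  for _, w in ipairs(workers) do
    local ok, err = coroutine.resume(w.co, unpack(w.args, 1, w.args.n))
    while ok and coroutine.status(w.co) ~= "dead" do
      ok, err = coroutine.resume(w.co)
    end
    if not ok then w.err = err end  -- remember the error object
  end
  workers = {}
end

-- Results are communicated through an upvalue, not a return value.
local results = {}

local function worker_main(path)
  if path == "/bad" then error("fetch failed: " .. path, 0) end
  results[#results + 1] = path
  return "ignored"  -- return values of main are discarded
end

local co1, info1 = new_thread(worker_main, "/")
local co2, info2 = new_thread(worker_main, "/bad")
run_workers()

local status1, err1 = info1()  -- finished normally
local status2, err2 = info2()  -- ended by an error
print(status1, err1)
print(status2, err2)
```

In a real script the scheduling is done by NSE, so only the calls to
new_thread and to the info function would remain.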

thread = stdnse.base()
--- Returns the base coroutine of the running script.
--
-- A script may be resuming multiple coroutines to facilitate its own
-- collaborative multithreading design. Because there is a "root" or "base"
-- coroutine that lets us determine whether the script is still active
-- (that is, the script did not end, possibly due to an error), we provide
-- this <code>stdnse.base</code> function that will retrieve the base
-- coroutine of the script. This base coroutine is the coroutine that runs
-- the action function.
--
-- The base coroutine is useful for many reasons but here are some common
-- uses:
-- * We want to attribute the ownership of an object (perhaps a
--   network socket) to a script.
-- * We want to identify if the script is still alive.
--@name base
--@class function
--@return coroutine Returns the base coroutine of the running script.
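
The "is the script still alive" use can be sketched in plain Lua as
follows. This is only an illustration under assumptions: make_worker is
a hypothetical helper, the coroutines are driven by hand instead of by
NSE, and coroutine.running() stands in for the call to stdnse.base()
that the script would make inside Nmap.

```lua
-- Hypothetical helper: build a worker that keeps going only while the
-- base coroutine it was given is still alive.
local function make_worker(base)
  return coroutine.create(function()
    local rounds = 0
    while coroutine.status(base) ~= "dead" do
      rounds = rounds + 1
      coroutine.yield(rounds)  -- pretend to do one unit of work
    end
    return "stopped after " .. rounds .. " rounds"
  end)
end

local worker

-- Stand-in for the script's base coroutine (the one running action).
local base = coroutine.create(function()
  -- Inside Nmap the script would obtain this with stdnse.base().
  worker = make_worker(coroutine.running())
  coroutine.resume(worker)  -- round 1, base is alive
  coroutine.yield()
  coroutine.resume(worker)  -- round 2, base is alive
end)

coroutine.resume(base)  -- base runs until its yield
coroutine.resume(base)  -- base finishes and is now dead

-- The worker notices that the base coroutine is dead and stops.
local ok, msg = coroutine.resume(worker)
print(ok, msg)
```

The worker receives the base coroutine from its parent as an argument,
so the check does not depend on which thread the lookup is made from.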

condvar = nmap.condvar(object)
--- Create a condition variable for an object.
--
-- This function returns a function that works as a condition variable for
-- the given object parameter. The object can be any Lua data type except
-- <code>nil</code>, Booleans, and Numbers. The returned function allows you
-- to wait, signal, and broadcast on the condition variable. The returned
-- function takes only one argument, which must be one of
-- * <code>"wait"</code>: Wait on the condition variable until another
--   thread wakes us.
-- * <code>"signal"</code>: Wake up a single thread from the waiting
--   set of threads for this condition variable.
-- * <code>"broadcast"</code>: Wake up all threads in the waiting set
--   of threads for this condition variable.
--
-- In NSE, condition variables are typically used to coordinate with threads
-- created using the <code>stdnse.new_thread</code> facility. The worker threads must
-- wait until work is available that the master thread (the actual running
-- script) will provide. Once work is created, the master thread will awaken
-- one or more workers so that the work can be done.
--
-- It is important to check the predicate (the test to see if your worker
-- thread should "wait" or not) BEFORE and AFTER the call to wait. Spurious
-- wakeups may occur: there is no guarantee that your thread will be awakened
-- only when another thread called <code>"signal"</code> or
-- <code>"broadcast"</code> on the condition variable.
-- One important check for your worker threads, before and after waiting,
-- should be to check that the master <b>script</b> thread is still alive.
-- (To check that the master script thread is alive, obtain the "base" thread
-- using <code>stdnse.base</code>.) You do not want your worker threads to
-- continue when the script has ended for reasons unknown to your worker thread.
-- <b>You are guaranteed that all threads waiting on a condition variable
-- will be awakened if any thread that has accessed the condition variable
-- via <code>nmap.condvar</code> ends for any reason.</b> This is essential
-- to prevent deadlock when threads are waiting to be awakened by another
-- thread that has ended unexpectedly.
-- @see stdnse.new_thread
-- @see stdnse.base
-- @param object Object to create a condition variable for.
-- @return ConditionVariable Condition variable function.
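
The advice to check the predicate before and after every wait can be
sketched as a loop around the wait call. The sketch below runs in a plain
Lua interpreter, so condvar, spawn and run are hypothetical stand-ins
built on coroutines (only "wait" and "broadcast" are modelled), and the
done flag stands in for "the master script has ended". It is not the
implementation in the patch.

```lua
-- Hypothetical stand-ins so the sketch runs outside Nmap.
local waiting = {}   -- coroutines currently blocked in "wait"
local runnable = {}  -- FIFO of coroutines ready to be resumed

local function condvar(op)
  if op == "wait" then
    waiting[coroutine.running()] = true
    coroutine.yield()
  elseif op == "broadcast" then
    for co in pairs(waiting) do runnable[#runnable + 1] = co end
    waiting = {}
  end
end

local function spawn(f)
  runnable[#runnable + 1] = coroutine.create(f)
end

local function run()
  while #runnable > 0 do
    local co = table.remove(runnable, 1)
    assert(coroutine.resume(co))
  end
end

local queue, done = {}, false
local processed, idle_wakeups = {}, 0

local function worker()
  while true do
    -- Predicate: wait only while there is no work and we are not done.
    -- It is tested before the first wait and again after every wakeup.
    while #queue == 0 and not done do
      condvar "wait"
      if #queue == 0 and not done then
        idle_wakeups = idle_wakeups + 1  -- woken, but nothing to do
      end
    end
    if #queue == 0 then return end  -- done and no work left
    processed[#processed + 1] = table.remove(queue, 1)
  end
end

spawn(worker)
spawn(worker)
run()                      -- both workers block in "wait"

queue = {"a", "b", "c"}    -- the master creates work
condvar "broadcast"
run()                      -- one worker drains the queue, the other
                           -- wakes up to an empty queue and waits again

done = true                -- the master is finished
condvar "broadcast"
run()                      -- both workers see done and return
print(#processed, idle_wakeups)
```

Because the predicate is re-tested after each wakeup, the worker that
finds the queue already drained simply goes back to waiting instead of
acting on work that is no longer there.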

Questions/comments welcome.

-- 
-Patrick Donnelly

"Let all men know thee, but no man know thee thoroughly: Men freely
ford that see the shallows."

- Benjamin Franklin

Attachment: workers.2.diff
Description:

