diff --git a/docs/scripting.xml b/docs/scripting.xml
index d450e8f49..23629ef81 100644
--- a/docs/scripting.xml
+++ b/docs/scripting.xml
@@ -1636,100 +1636,6 @@ socket:close()
-
- Thread Mutexes
- threads in NSE
- mutexes in NSE
-
- Each script execution thread (e.g. ftp-anon running against an FTP server on the target host) yields to other
- scripts whenever it makes a call on network objects (sending or receiving
- data). Some scripts require finer concurrency control over thread execution. An
- example is the whois script which queries
- whoiswhois
- servers for each target IP address. Because many concurrent queries often result in
- getting one's IP banned for abuse, and because a single query may return additional
- information for targets other threads are running against, it is useful
- to have other threads pause while one thread performs a query.
-
-
- To solve this problem, NSE includes a
- mutex function which provides a
- mutex
- (mutual exclusion object) usable by scripts. The mutex allows
- for only one thread to be working on an object. Competing threads
- waiting to work on this object are put in the waiting queue
- until they can get a "lock" on the mutex. A solution for
- the whois problem above is to have each
- thread block on a mutex using a common string, thus ensuring
- that only one thread is querying whois servers at once. That
- thread can store the results in the NSE registry before
- releasing unlocking the mutex. The next script in the waiting
- queue can then run. It will first check the registry and only
- query whois servers if the previous results were insufficient.
-
-
- The first step is to create a mutex object using a statement such as:
-
- mutexfn = nmap.mutex(object)
-
- The mutexfn returned is a function
- which works as a mutex for the object passed
- in. This object can be any
- Lua data
- type except nil,
- booleans, and numbers.
- The returned function allows you to lock, try to lock, and
- release the mutex. Its first and only parameter must be one of
- the following:
-
-
-
-
- "lock"
- Make a blocking lock on the mutex. If the mutex is busy (another thread has a lock on it), then the thread will yield and wait. The function returns with the mutex locked.
-
-
-
- "trylock"
- Makes a non-blocking lock on the mutex. If the mutex is
- busy then it immediately returns with a return value of
- false. Otherwise the mutex locks the
- mutex and returns true.
-
-
-
- "done"
- Releases the mutex and allows
- another thread to lock it. If the thread does not have a lock on the mutex, an
- error will be raised.
-
-
-
- "running"
- Returns the thread locked
- on the mutex or nil if the mutex is not
- locked. This should only be used for debugging as it
- interferes with garbage collection of finished threads.
-
-
-
-A simple example of using the API is provided in . For real-life examples, read the asn-query.nse and whois.nse scripts in the Nmap distribution.
-
-
- Mutex manipulation
-
-local mutex = nmap.mutex("My Script's Unique ID");
-function action(host, port)
- mutex "lock";
- -- Do critical section work - only one thread at a time executes this.
- mutex "done";
- return script_output;
-end
-
-
-
-
Exception Handling
exceptions in NSE
@@ -2365,6 +2271,547 @@ categories = {"discovery", "external"}
+
+ Script Parallelism in NSE
+
+ So far, we have only touched lightly on the steps NSE takes to allow
+ multiple scripts to execute in parallel. Usually, a script author need not
+ be concerned with how any of this is implemented; however, a few cases
+ warrant discussion, and we cover them in this section.
+ As a script writer, you may need to control how multiple scripts interact
+ in a library; you may require multiple threads to work in parallel; or
+ perhaps you need to serialize access to a remote resource.
+
+
+ The standard mechanism for parallel execution is a thread. A thread
+ encapsulates execution flow and data of a script using the Lua
+ thread or coroutine. A Lua thread
+ allows us to yield the current script at arbitrary points to continue
+ work on another script. Typically, these yield points are blocking calls
+ to the NSE Socket library. The yield back to NSE is also transparent; the
+ script is unaware of the transition and views each socket method as a
+ blocking call.
+
+
+ Let's go over some common terminology. A script is
+ analogous to a binary executable; it holds the information necessary to
+ execute our script. A thread (a Lua coroutine) is
+ analogous to a process; it runs a script against a host and possibly
+ port. We sometimes abuse our terminology throughout the book by referring
+ to a thread as a running script. We are really saying the "instantiation
+ of the script", in the same sense that a process is the instantiation of
+ an executable.
+
+
+ NSE provides the bare-bones essentials you need to expand your degree
+ of parallelism beyond the basic script thread: new independent threads,
+ Mutexes, and Condition Variables. We will go into depth on each of
+ these mechanisms in the following sections.
+
+
+ Worker Threads
+
+ There are several instances where a script needs finer control with
+ respect to parallel execution beyond what is offered by default with a
+ generic script. The common reason for this need is the inability of a
+ script to read from multiple sockets concurrently. For example, an HTTP
+ spidering script may want to have multiple Lua threads querying web
+ server resources in parallel. To solve this problem, NSE offers the
+ function stdnse.new_thread to create worker threads.
+ These worker threads have all the power of independent scripts with the
+ only restriction that they may not report Script Output.
+
+
+ Each worker thread launched by a script is given a main function and
+ a variable number of arguments to be passed to the main function by
+ NSE:
+
+
+ worker_thread, status_function = stdnse.new_thread(main, ...)
+
+
+ You are given back the Lua thread (coroutine) that uniquely identifies
+ your worker thread and a status query function that queries the status
+ of your new worker.
+
+
+ The status query function returns two values:
+
+
+ status, error_object = status_function()
+
+
+ The first return value, status, is simply the return
+ value of coroutine.status on the worker thread
+ coroutine (more precisely, on the base coroutine; see the section
+ The Base Thread for details). The second return value contains
+ the error object thrown that ended the worker thread or
+ nil if no error was thrown. This object is typically
+ a string, like most Lua errors. However, recall that any Lua type can
+ be an error object, even nil! You should
+ inspect the error object, the second return value, only if the status
+ of your worker is "dead".
+
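+ As a rough sketch of how the status function might be consumed, the
+ helper below classifies a worker from the two values it returns. The
+ name worker_state is hypothetical and not part of NSE; the status
+ function is passed in as a parameter, so the helper itself is plain Lua.
+
+```lua
+-- Hypothetical helper (not part of NSE): classify a worker using the
+-- status function returned by stdnse.new_thread.
+local function worker_state (status_function)
+  local status, err = status_function()
+  if status ~= "dead" then
+    return "running"      -- the error object is not meaningful yet
+  elseif err ~= nil then
+    return "failed", err  -- the worker ended by throwing err
+  end
+  -- Assumption: a nil error object is treated as a clean finish, so a
+  -- worker that called error(nil) cannot be told apart here.
+  return "finished"
+end
+```
+
+ A master thread could call such a helper for each worker after being
+ awakened, and stop early when one of them reports "failed".
+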
+
+ NSE discards all return values from the main function when the worker
+ thread finishes execution. You should communicate with your worker
+ through the use of main function parameters,
+ upvalues, or function environments. The Worker Thread Example below
+ shows how to do this.
+
+
+ Finally, when using worker threads you should always use Condition
+ Variables and Mutexes to coordinate with them. Keep in
+ mind that Nmap is single threaded, so there are no memory
+ synchronization issues to worry about; however, there is resource
+ contention. Your resources are usually network bandwidth, network
+ sockets, etc. Condition Variables are also useful if the work for any
+ single thread is dynamic. For example, a web server spider script with
+ a pool of workers will initially have a single root HTML document.
+ Following the retrieval of the root document, the set of resources to
+ be retrieved (the workers' work) will become very large, because each HTML
+ document adds many new hyperlinks (resources) to fetch.
+
+
+ Worker Thread Example
+
+local requests = {"/", "/index.html", --[[ long list of objects ]]}
+
+function thread_main (host, port, responses, ...)
+ local condvar = nmap.condvar(responses);
+ local what = {n = select("#", ...), ...};
+ local allReqs = nil;
+ for i = 1, what.n do
+ allReqs = http.pGet(host, port, what[i], nil, nil, allReqs);
+ end
+ local p = assert(http.pipeline(host, port, allReqs));
+ for i, response in ipairs(p) do responses[#responses+1] = response end
+ condvar "signal";
+end
+
+function many_requests (host, port)
+ local threads = {};
+ local responses = {};
+ local condvar = nmap.condvar(responses);
+ local i = 1;
+ repeat
+ local j = math.min(i+10, #requests);
+ local co = stdnse.new_thread(thread_main, host, port, responses,
+ unpack(requests, i, j));
+ threads[co] = true;
+ i = j+1;
+ until i > #requests;
+ repeat
+ condvar "wait";
+ for thread in pairs(threads) do
+ if coroutine.status(thread) == "dead" then threads[thread] = nil end
+ end
+ until next(threads) == nil;
+ return responses;
+end
+
+
+
+ For brevity, this example omits typical behavior of a traditional web
+ spider. The requests table is assumed to contain enough objects
+ (hundreds or thousands) to warrant the use of worker threads. Our
+ example dispatches a new thread for each batch of up to 11 relative
+ Uniform Resource Identifiers (URIs) until the
+ requests table is exhausted. Worker threads are very cheap, so we
+ are not afraid to create a lot of them. After we dispatch this large
+ number of threads, we wait on our Condition Variable until every thread
+ has finished, then finally return the responses table.
+
+
+ You may have noticed that we did not use the status function returned
+ by stdnse.new_thread. You will typically use this
+ for debugging or if your program must stop based on the error thrown by
+ one of your worker threads. Our simple example did not require this but
+ a fault-tolerant library may.
+
+
+
+ Thread Mutexes
+ threads in NSE
+ mutexes in NSE
+
+ Recall from the beginning of this section that each script execution
+ thread (e.g. ftp-anon running against an FTP server
+ on a target host) yields to other scripts whenever it makes a call
+ on network objects (sending or receiving data). Some scripts require
+ finer concurrency control over thread execution. An example is the
+ whois script, which queries
+ whois servers for each
+ target IP address. Because many concurrent queries often result in
+ getting one's IP banned for abuse, and because a single query may
+ return additional information for targets other threads are running
+ against, it is useful to have other threads pause while one thread
+ performs a query.
+
+
+ To solve this problem, NSE includes a mutex function
+ which provides a mutex
+ (mutual exclusion object) usable by scripts. The Mutex allows only
+ one thread at a time to work on an object. Competing threads waiting to
+ work on this object are put in a waiting queue until they can get a
+ "lock" on the Mutex. A solution for the whois
+ problem above is to have each thread block on a Mutex using a common
+ string, thus ensuring that only one thread is querying whois servers at
+ once. When finished querying the remote servers, the thread can store
+ results in the NSE registry and unlock the Mutex. Other scripts waiting
+ to query the remote server can then obtain a lock, check for usable
+ results retrieved from previous queries, make their own queries, and
+ unlock the Mutex. This is a good example of serializing access to a
+ remote resource.
+
+
+
+ The first step in using a Mutex is to create one via a call to the
+ nmap library:
+
+
+ mutexfn = nmap.mutex(object)
+
+
+ The mutexfn returned is a function which works as a
+ Mutex for the object passed in. This object can be
+ any Lua data
+ type except nil,
+ booleans, and numbers. The
+ returned function allows you to lock, try to lock, and release the
+ Mutex. Its first and only parameter must be one of the
+ following:
+
+
+
+
+ "lock"
+
+
+ Make a blocking lock on the Mutex. If the Mutex is busy (another
+ thread has a lock on it), then the thread will yield and
+ wait. The function returns with the Mutex locked.
+
+
+
+
+
+ "trylock"
+
+
+ Make a non-blocking lock on the Mutex. If the Mutex is busy then
+ it immediately returns with a return value of
+ false. Otherwise the function locks the Mutex and
+ returns true.
+
+
+
+
+
+ "done"
+
+
+ Releases the Mutex and allows another thread to lock it. If the
+ thread does not have a lock on the Mutex, an error will be
+ raised.
+
+
+
+
+
+ "running"
+
+
+ Returns the thread locked on the Mutex or nil
+ if the Mutex is not locked. This should only be used for
+ debugging as it interferes with garbage collection of finished
+ threads.
+
+
+
+
+
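+ To illustrate the difference between "lock" and
+ "trylock", here is a minimal sketch of a helper that runs
+ a critical section only when the Mutex is free. The helper name
+ try_exclusive and its work parameter are hypothetical;
+ the Mutex function is assumed to come from nmap.mutex.
+
+```lua
+-- Hypothetical helper: run work() only if the Mutex can be taken
+-- without blocking. "mutex" is a function returned by nmap.mutex.
+local function try_exclusive (mutex, work)
+  if not mutex "trylock" then
+    return false  -- another thread holds the lock; do not wait
+  end
+  -- Assumption: work() does not raise an error; if it did, the
+  -- "done" call below would never be reached by this helper.
+  work()
+  mutex "done"    -- release so other threads may lock the Mutex
+  return true
+end
+```
+
+ A script might use such a helper to skip optional work instead of
+ queuing behind other threads, falling back to "lock" when
+ the work is mandatory.
+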
+
+ NSE maintains a weak reference to the Mutex, so other calls to
+ nmap.mutex with the same object will return the same
+ function (Mutex). However, if you discard your reference to the Mutex,
+ it may be garbage collected, and subsequent calls to
+ nmap.mutex with the object will return a different
+ Mutex function. Thus you should save your Mutex to a (local) variable
+ that persists for as long as you need it.
+
+
+
+ A simple example of using the API is provided in the Mutex manipulation example below. For
+ real-life examples, read the asn-query.nse and
+ whois.nse scripts in the Nmap
+ distribution.
+
+
+
+ Mutex manipulation
+
+local mutex = nmap.mutex("My Script's Unique ID");
+function action(host, port)
+ mutex "lock";
+ -- Do critical section work - only one thread at a time executes this.
+ mutex "done";
+ return script_output;
+end
+
+
+
+
+ Condition Variables
+
+ Condition Variables arose out of a need to coordinate with worker
+ threads created using the stdnse.new_thread
+ function. A Condition Variable allows one or more threads to wait on
+ an object and one or more threads to awaken one or all threads waiting
+ on the object. Said differently, multiple threads may unconditionally
+ block on the Condition Variable by
+ waiting. Other threads may wake up one or all of
+ the waiting threads via signalling the Condition
+ Variable.
+
+
+
+ As an example, we may dispatch multiple worker threads that will
+ produce results for us to use, as in our earlier Worker Thread Example. Until all
+ the workers finish, our master thread must sleep. Note that we cannot
+ poll for results as with a traditional Operating
+ System thread because NSE does not preempt Lua threads. Instead,
+ we use a Condition Variable that the master thread
+ waits on until awakened by a worker. The master
+ will continually wait until all workers have terminated.
+
+
+
+ The first step in using a Condition Variable is to create one via a
+ call to the nmap library:
+
+
+ condvarfn = nmap.condvar(object)
+
+
+ The semantics for Condition Variables are similar to Mutexes. The
+ condvarfn returned is a function which works as a
+ Condition Variable for the object passed in. This
+ object can be any Lua data
+ type except nil,
+ booleans, and numbers. The
+ returned function allows you to wait, signal, and broadcast on the
+ Condition Variable. Its first and only parameter must be one of the
+ following:
+
+
+
+
+ "wait"
+
+
+ Wait on the Condition Variable. This adds your thread to the
+ waiting queue for the Condition Variable. You will resume
+ execution when another thread signals or broadcasts on the
+ Condition Variable.
+
+
+
+
+ "signal"
+
+
+ Signal the Condition Variable. A thread in the Condition
+ Variable's waiting queue will be resumed.
+
+
+
+
+ "broadcast"
+
+
+ Signal all threads in the Condition Variable's waiting
+ queue.
+
+
+
+
+
+
+ As with Mutexes, NSE maintains a weak reference to the Condition
+ Variable, so other calls to nmap.condvar with the
+ same object will return the same function (Condition Variable).
+ However, if you discard your reference to the Condition Variable,
+ it may be garbage collected, and subsequent calls to
+ nmap.condvar with the object will return a different
+ Condition Variable function. Thus you should save your Condition
+ Variable to a (local) variable that persists for as long as you
+ need it.
+
+
+
+ When using Condition Variables, it is important to check the predicate
+ before and after waiting. A predicate is a test on whether to continue
+ doing work within your worker or master thread. For your worker
+ threads, this will at the very least include a test to see if the
+ master thread is still alive. You do not want to continue doing work
+ when no thread will use your results. A typical test before waiting
+ may be: check whether the master is still running, and quit if it is not;
+ then check whether there is work to be done, and wait if there is none.
+
+
+
+ NSE does not guarantee that spurious wakeups will not occur; that is, there
+ is no guarantee your thread will not be awakened when no thread called
+ "signal" or "broadcast" on the
+ Condition Variable. The typical, but not only, reason for a spurious
+ wakeup is the termination of a thread using a Condition Variable. Waking
+ waiters in that case is an important guarantee NSE makes: it lets you avoid the deadlock
+ in which a worker or master waits to be woken by a thread that ended
+ without signaling the Condition Variable.
+
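+ Because wakeups may be spurious, the predicate is best re-checked in a
+ loop around every wait. The sketch below shows one way to write that
+ loop; the helper name wait_for and the
+ predicate parameter are hypothetical, and the
+ Condition Variable function is assumed to come from
+ nmap.condvar.
+
+```lua
+-- Hypothetical helper: block on a Condition Variable until predicate()
+-- holds. "condvar" is a function returned by nmap.condvar.
+local function wait_for (condvar, predicate)
+  -- Check before waiting, and again after every wakeup, since a
+  -- wakeup does not imply that the predicate is now true.
+  while not predicate() do
+    condvar "wait"
+  end
+end
+```
+
+ For a worker, the predicate would typically also return true when the
+ master is no longer running, so that the worker can quit instead of
+ waiting again.
+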
+
+
+ Collaborative Multithreading
+
+ One of Lua's least known features is collaborative multithreading
+ through coroutines. A coroutine provides an
+ independent execution stack that is resumable.
+ The standard coroutine library provides access to the
+ creation and manipulation of coroutines. The online first
+ edition of Programming in
+ Lua contains an excellent introduction to
+ coroutines. We will provide an overview of the
+ use of coroutines here for completeness, but this is no replacement for
+ reviewing PiL.
+
+
+
+ We have mentioned coroutines throughout this section as
+ threads. This is the type
+ (thread) of a coroutine in Lua. Users of NSE that
+ have any parallel programming experience with Operating System threads
+ may be confused by this. As a reminder, Nmap is single threaded. Lua
+ threads provide the basis for parallel scripting but only one thread is
+ ever running at a time.
+
+
+
+ A Lua function executes on top of a Lua
+ thread. The thread maintains a stack of active
+ functions, local variables, and the current instruction. We can switch
+ between coroutines by explicitly yielding the
+ running thread. The coroutine which resumed the
+ yielded thread resumes operation.
+
+ The Basic Coroutine Use example shows a brief use of coroutines to print numbers.
+
+
+ Basic Coroutine Use
+
+local function main ()
+ coroutine.yield(1)
+ coroutine.yield(2)
+ coroutine.yield(3)
+end
+local co = coroutine.create(main)
+for i = 1, 3 do
+ print(coroutine.resume(co))
+end
+--> true 1
+--> true 2
+--> true 3
+
+
+
+
+ What you should take from this example is the ability to transfer
+ between flows of control extremely easily through the use of
+ coroutine.yield. This is an extremely powerful
+ concept that enables NSE to run scripts in parallel. All scripts are
+ run as coroutines that yield whenever they make a blocking socket
+ function call. This enables NSE to run other scripts and later resume
+ the blocked script when its I/O operation has completed.
+
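+ Control and data move in both directions: values passed to
+ coroutine.yield come back from
+ coroutine.resume, and extra arguments to a later
+ coroutine.resume become the results of the pending
+ yield. The short plain-Lua sketch below, which does not depend on NSE,
+ also shows how coroutine.status changes over a
+ coroutine's lifetime.
+
+```lua
+local co = coroutine.create(function (a)
+  -- Hand a + 1 to the resumer, then receive b from the next resume.
+  local b = coroutine.yield(a + 1)
+  return a + b
+end)
+print(coroutine.status(co))      --> suspended
+print(coroutine.resume(co, 1))   --> true 2
+print(coroutine.resume(co, 10))  --> true 11
+print(coroutine.status(co))      --> dead
+```
+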
+
+
+ As a script writer, there are times when coroutines are the best
+ tool for a job. One common use in socket programming is to filter
+ data. You may produce a function that generates all the links from an
+ HTML document. An iterator using string.gmatch
+ only catches a single pattern. Because some complex matches may take
+ many different Lua patterns, it is more appropriate to use a
+ coroutine.
+
+ The Link Generator example shows how to do this.
+
+
+
+ Link Generator
+
+function links (html_document)
+ local function generate ()
+ for m in string.gmatch(html_document, "url%((.-)%)") do
+ coroutine.yield(m) -- css url
+ end
+ for m in string.gmatch(html_document, "href%s*=%s*\"(.-)\"") do
+ coroutine.yield(m) -- anchor link
+ end
+ for m in string.gmatch(html_document, "src%s*=%s*\"(.-)\"") do
+ coroutine.yield(m) -- img source
+ end
+ end
+ return coroutine.wrap(generate)
+end
+
+function action (host, port)
+ -- ... get HTML document and store in html_document local
+ local found = {}; -- collected links ("links" itself is the generator function)
+ for link in links(html_document) do
+ found[#found+1] = link; -- store it
+ end
+ -- ...
+end
+
+
+
+
+ There are many other instances where coroutines may provide an
+ easier solution to a problem. It takes experience from use to help
+ identify those cases.
+
+
+
+ The Base Thread
+
+ Because scripts may use coroutines for their own multithreading,
+ it is important to be able to identify an owner
+ of a resource or to establish whether the script is still alive.
+ NSE provides the function stdnse.base for this
+ purpose.
+
+
+ Particularly when writing a library that attributes
+ ownership of a cache or socket to a script, you may use the
+ base thread to establish whether the script is still running.
+ coroutine.status on the base thread will give
+ the current state of the script. In cases where the script is
+ "dead", you will want to release the resource.
+ Be careful with keeping references to these threads; NSE may
+ discard a script even though it has not finished executing. The
+ thread will still report a status of "suspended".
+ You should keep a weak reference to the thread in these cases
+ so that it may be collected.
+
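+ One possible way for a library to follow this advice is a table with
+ weak keys, indexed by the owning script's base thread. This is only a
+ sketch: the names cache_by_owner,
+ remember, and release_dead are
+ hypothetical, and the base thread is assumed to be obtained by the
+ caller through stdnse.base.
+
+```lua
+-- Hypothetical library-side cache keyed by a script's base thread.
+-- Weak keys let a discarded thread be collected with its entry.
+local cache_by_owner = setmetatable({}, {__mode = "k"})
+
+-- Record that base_thread (assumed to come from stdnse.base) owns
+-- the given resource.
+local function remember (base_thread, resource)
+  cache_by_owner[base_thread] = resource
+end
+
+-- Drop the resources of scripts whose base thread has finished.
+local function release_dead ()
+  for thread in pairs(cache_by_owner) do
+    if coroutine.status(thread) == "dead" then
+      cache_by_owner[thread] = nil
+    end
+  end
+end
+```
+
+ Entries for scripts that NSE discarded while still "suspended" are
+ never seen as "dead"; the weak keys are what allow those entries to
+ disappear once the thread is collected.
+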
+
+
+
+
Version Detection Using NSE
Nmap Scripting Engine (NSE)sample scripts