diff --git a/docs/scripting.xml b/docs/scripting.xml
index 9af924c72..f7520265d 100644
--- a/docs/scripting.xml
+++ b/docs/scripting.xml
@@ -3035,7 +3035,16 @@ end
Initialization Phase
- During its initialization stage, Nmap loads the Lua interpreter and its provided libraries. These libraries are fully documented in the Lua Reference Manual. Here is a summary of the libraries, listed alphabetically by their namespace name:
+ NSE is initialized before any scanning when Nmap first starts. We start
+ this initialization through a call to open_nse. This
+ procedure starts by creating a fresh Lua state that will persist for the
+ scans against all host groups. We next load the standard Lua libraries
+ and all statically compiled NSE libraries. The standard Lua libraries are
+ fully documented in the Lua Reference
+ Manual. Here is a summary of the libraries, listed
+ alphabetically by their namespace name:
+
@@ -3103,89 +3112,151 @@ end
+
+ The libraries included with NSE are documented in NSEDoc. They include:
+ nmap,
+ pcre,
+ bin,
+ bit,
+ ssl (if available), and
+ stdnse.c (C functions for the
+ stdnse library).
+
+
+ After loading the basic libraries, NSE loads the file
+ nse_main.lua. The majority of NSE is written in
+ Lua: Lua code manages scripts and sets up the appropriate
+ environment. In this situation Lua really shines as a glue language.
+ We use C to provide our network framework and low-level libraries.
+ We use Lua to structure our data, determine which scripts to load,
+ and, of course, schedule and execute our scripts.
+
-In addition to loading the libraries provided by Lua,
-the nmap namespace functions are loaded. The
-search paths are the same directories that Nmap searches for its data
-files, except that the nselib directory
-is appended to each. At this stage any provided script arguments are
-stored inside the registry.registry
-(NSE)
+
+ A key feature of Lua we use in NSE is coroutines. Coroutines allow for
+ collaborative multi-threading so that scripts can suspend themselves at
+ defined points and allow other coroutines to execute. Network I/O,
+ particularly waiting for responses from remote hosts, often involves
+ long wait times, so this is when scripts yield to others. Key functions
+ of the Nsock wrapper cause scripts to yield (pause). When Nsock
+ finishes processing such a request, it makes a callback which causes
+ the script to be pushed from the waiting queue back into the running
+ queue so it can resume operations when its turn comes up again. Keep
+ in mind that scripts must explicitly yield (usually within a network
+ function) to relinquish control. Yielding is never asynchronous.
+
+
+ When nse_main.lua is loaded, it sets up the Lua
+ environment to be ready for script scanning later on. Ultimately,
+ it will load all scripts the user has chosen and return a function
+ to nse_main.cc that can be executed to script
+ scan a host group.
+
-
- The next phase of NSE initialization is loading the selected
- scripts, based on the defaults or arguments provided to the
-
- option. The
- versionversion script category
- category scripts are loaded as well if version detection was enabled.
- NSE first tries to interpret each argument as a category.
- This is done with a Lua C function
- in nse_init.cc named entry based on data from
- the script.db script categorization database.script.db
- If the category is found, those scripts are loaded.
- Otherwise Nmap tries to interpret arguments as
- files or directories. If no files or directories with a given name are found in Nmap's search path,
- an error is raised and the Script Engine aborts.
-
-
-
- If a directory is specified, all of the .nse files inside it are
- loaded. Each loaded file is executed by Lua. If a
- portrule is present, it is saved in the
- porttests table with a portrule key and file
- closure value. Otherwise, if the script has a
- hostrule, it is saved in the hosttests table
- in the same manner.
+
+ We prepare the Lua environment by adding the nselib
+ directory to the Lua path. This allows NSE libraries to be required
+ by scripts. Next, NSE loads replacements for the standard
+ coroutine functions so yields initiated by NSE are caught and
+ propagated back to the NSE scheduler.
+
+
+
+ nse_main.lua next defines classes and functions
+ to be used during setup (we go over these later). Next, the script
+ arguments (--script-args) are loaded into
+ nmap.registry.args. This requires a custom parser
+ because Lua's patterns are insufficient for the task. After the
+ arguments are loaded, we create a new script database if one
+ does not already exist or a new one was requested.
+ Our final task during initialization is to load the scripts chosen on
+ the command line.
+
+
+
+ Our get_chosen_scripts function finds
+ the scripts a user chose via categories, file names, and directories.
+ These scripts are loaded into memory for later use. The
+ --script argument is translated into valid Lua code,
+ which is then executed. The generated code dynamically checks whether
+ a script in the database satisfies the boolean expression given on the
+ command line (recall that --script may take a
+ boolean expression). Simple categories and filenames match
+ immediately, causing the script to be "chosen". More complex expressions
+ are evaluated using Lua's boolean operators. Specifications
+ given using --script that do not match in this
+ way are instead checked to see whether they name a regular file or directory. If
+ the specification is a regular file, we load it. If the specification
+ is a directory, we load all the *.nse files it
+ contains. Otherwise, we throw an error.
+
+
+
+ get_chosen_scripts finishes by arranging the
+ scripts to run in order. We do this by sorting the scripts
+ into runlevels. These runlevels are determined by the dependencies
+ a script has. Scripts that have no dependencies run at level
+ 1, while a script that depends on a runlevel
+ 1 script runs at level 2.
+ When a script scan is run, each runlevel is run separately and in
+ order.
+
+
+
+ nse_main.lua defines two classes:
+ Script and Thread. These classes
+ represent NSE scripts and the script threads we
+ run. When a script is loaded, we call Script.new,
+ which creates a new Script object. The script file is loaded into Lua
+ and saved for later use. These classes and their methods are intended
+ to encapsulate the data needed for each script and its threads.
+ Script.new also contains sanity checks to ensure the
+ script has required fields such as the action
+ function.
+
+
+
+ Scanning a Host Group
+
+ When NSE runs a script scan, script_scan is called
+ in nse_main.cc with a vector of targets to scan.
+ These targets will be passed to our nse_main.lua
+ main function for scanning.
-
-
-
- Matching Scripts with Targets
-
- After initialization is finished, the
- hostrules“hostrule” script variable
- and portrules“portrule” script variable
- are evaluated for each host in the current
- target group.
- The rules of every chosen script is tested against every host and (in the case of service scripts) each openopen port state
-and open|filteredopen|filtered port state
-port on the hosts. The combination can grow quite large, so portrules should be kept as simple as possible. Save any heavy computation for the script's action.
-Next, a Lua thread is created for each of the matching script-target combinations. Each thread
-is stored with pertinent information such as its dependencies, target, target port (if applicable), host and port tables
-(passed to the action), and the script type (service or host script).
-The mainloop function then processes each runlevelrunlevel
-grouping of threads in order.
-
+
+ The main function for a script scan will generate a number of script
+ threads based on whether a hostrule or
+ portrule returns true for a given host and port.
+ The threads generated are stored in a list of runlevel lists. Each
+ runlevel list of threads (starting with runlevel 1)
+ will be passed to the run function. The
+ run function is the main worker function for
+ NSE where all the magic happens.
+
-
+
+ The run function's purpose is to run all the threads
+ in a runlevel until all finish executing. Before doing this, however,
+ the function starts by redefining some Lua registry values that
+ support the C code. One such value, the function
+ _R[WAITING_TO_RUNNING], allows the network library
+ binding written in C to move a thread from the waiting queue to the
+ running queue. After these functions are defined, we begin running
+ script threads. This involves running script threads until the
+ running and waiting queues are both empty. Threads that yield are
+ moved to the waiting queue; threads that are ready to continue
+ are moved back to the running queue. We continue this cycle of
+ moving a thread between the running and waiting queues until
+ the thread quits or ends in error. Note that a pending queue exists
+ as well. It serves as a temporary location for threads moving from
+ the waiting queue to the running queue before a new iteration of
+ the running queue begins.
+
-
- Script Execution
-
-
- Nmap performs NSE script scanning in
- parallelparallelismin NSE
- by taking advantage of Nmap's Nsock parallel I/O library and the Lua
- coroutines
- language feature. Coroutines offer collaborative multi-threading so that scripts can suspend themselves at defined points and allow other coroutines to execute. Network I/O, particularly waiting for responses from
- remote hosts, often involves long wait times, so
- this is when scripts yield to others.
- Key functions of the Nsock wrapper
- cause scripts to yield (pause). When Nsock finishes processing such a request, it makes a callback
- which causes the script to be pushed from the waiting queue back into the
- running queue so it can resume operations when its turn comes up again.
-
- The mainloop function moves threads between the waiting and running queues as needed.
- A thread which yields is moved from the running queue into the waiting list. Running threads execute until they either
- yield, complete, or fail with an error. Threads are made ready to run (placed in the running queue) by calling
- process_waiting2running. This process of scheduling running
- threads and moving threads between queues continues
- until no threads exist in either queue.
-
+