<sect1 id="nse-intro">
<title>Introduction</title>
<indexterm><primary>Nmap Scripting Engine</primary></indexterm>
<para>The Nmap Scripting Engine (NSE) is one of Nmap's most
powerful and flexible features. It allows users to write (and
share) simple scripts to automate a wide variety of networking
tasks. Those scripts are then executed in parallel with the speed
and efficiency you expect from Nmap. Users can rely on the
growing and diverse set of scripts distributed with Nmap, or write
their own to meet custom needs.</para>
<para>The Nmap project would like to thank Diman Todorov for
his excellent work building the initial NSE implementation and
writing much of this documentation. Stoiko Ivanov also
contributed greatly. The tasks we had in mind when
creating the system are:</para>
<variablelist>
<varlistentry>
<term>Network discovery</term>
<listitem>
<para>This is Nmap's bread and butter. Examples include
looking up whois data based on the target domain,
querying ARIN, RIPE, or APNIC for the target IP to determine ownership,
performing identd lookups on open ports, SNMP queries, and
listing available NFS/SMB/RPC shares and services.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>More sophisticated version detection</term>
<listitem>
<para>The Nmap version detection system (<xref linkend="vscan"/>)
is able to recognize thousands of different services through
its probe and regular expression based matching system, but it
cannot recognize everything. For example, identifying the Skype v2 service requires two
independent probes. Nmap could also recognize more SNMP services
if it tried a few hundred different community names by brute
force. Neither of these tasks is well suited to traditional
Nmap version detection, but both are easily accomplished with
NSE. For these reasons, version detection now calls NSE by
default to handle some tricky services. This is described in
<xref linkend="nse-vscan"/>.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Vulnerability detection</term>
<listitem>
<para>When a new vulnerability is discovered, you often want
to scan your networks quickly to identify vulnerable systems
before the bad guys do. While Nmap isn't a
comprehensive
<web>
<ulink url="http://sectools.org/vuln-scanners.html">vulnerability scanner</ulink>,
</web><print>
vulnerability scanner,
</print>
we plan to distribute scripts for some very severe or common vulnerabilities and misconfigurations.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Backdoor detection</term>
<listitem>
<para>
Many attackers and some automated worms leave
backdoors to enable later reentry. Some of these can be
detected by Nmap's regular expression based version detection.
For example, within hours of the MyDoom worm hitting the
Internet, Jay Moran posted an Nmap version detection probe and
signature so that others could quickly scan their networks.
For more complex worms and backdoors, NSE is needed
instead.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Vulnerability exploitation</term>
<listitem>
<para>
As a general scripting language, NSE could even
be used to exploit vulnerabilities rather than just find them.
The capability to add custom exploit scripts may be valuable
for some people (particularly penetration testers), though we aren't
planning to turn Nmap into an exploitation framework like
<ulink url="http://www.metasploit.com">Metasploit</ulink>.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
The listed items are just the initial script classes. It is
likely that Nmap users will come up with even more inventive
uses for NSE.
</para>
<para>
Scripts are written in the
embedded <ulink url="http://www.lua.org/">Lua programming language</ulink>.
The language itself is well documented in the books
<citetitle><ulink url="http://www.amazon.com/exec/obidos/ASIN/8590379825/secbks-20">Programming
in Lua, Second Edition</ulink></citetitle> and
<citetitle><ulink url="http://www.amazon.com/exec/obidos/ASIN/8590379825/secbks-20">Lua
5.1 Reference Manual</ulink></citetitle>. The reference manual is also
<ulink url="http://www.lua.org/manual/5.1/">freely available
online</ulink>, as is the
<ulink url="http://www.lua.org/pil/">first edition of Programming in
Lua</ulink>. Given the availability of these excellent general
Lua programming references, this document only covers aspects and
extensions specific to the Nmap implementation.
</para>
<para>
NSE is activated with the <option>-sC</option> option (or
<option>--script</option> if you wish to specify a custom set of
scripts) and results are integrated into Nmap normal and XML
output. Two types of scripts are supported: service and host
scripts. Service scripts relate to a certain open port
(service) on the target host, and any results they produce are included
next to that port in the Nmap output port table. Host scripts,
on the other hand, run no more than once against each target IP
and produce results below the port table. <xref
linkend="nse-ex1"/> shows a typical script scan. Examples of
service scripts producing output are <literal>Stealth SSH
Version</literal>, which tricks some SSH servers into divulging
version information without logging the attempt as they normally
would, <literal>Service Owner</literal>, which connects to open
ports, then performs a reverse-identd query to determine what
username it is running under, and <literal>HTML Title</literal>,
which simply grabs the title of the root path of any web servers
found. A sample host script is <literal>RIPE Query</literal>,
which looks up and reports target IP ownership information.
</para>
<example id="nse-ex1">
<title>Typical NSE Output</title>
<screen>
$ ./nmap -sC localhost -p 22,23,80,113
Starting Nmap 4.20ALPHA9-NSE ( http://insecure.org )
Interesting ports on localhost (127.0.0.1):
PORT STATE SERVICE
22/tcp open ssh
|_ Stealth SSH version: SSH-1.99-OpenSSH_4.2
|_ SSH protocol version 1: Server supports SSHv1
23/tcp closed telnet
80/tcp open http
|_ HTML title:Test Page for Apache Installation
113/tcp closed auth
Host script results:
|_ RIPE Query: IP belongs to: Internet Assigned Numbers Authority
Nmap finished: 1 IP address (1 host up) scanned in 0.907 seconds
</screen>
</example>
</sect1>
<sect1 id="nse-usage">
<title>Usage and Examples</title>
<para>
While NSE has a complex implementation for efficiency, it is
strikingly easy to use. Simply specify <option>-sC</option> to
enable the most common scripts. Or specify the
<option>--script</option> option to choose your own scripts to
execute by providing categories, script file names, or the names of
directories full of scripts you wish to execute. You can customize
some scripts by providing arguments to them via the
<option>--script-args</option> option. The two
remaining options, <option>--script-trace</option> and
<option>--script-updatedb</option>, are generally only used for
script debugging and development.
</para>
<sect2 id="nse-categories"><title>Script Categories</title>
<para>NSE scripts define a list of categories they belong to.
Currently defined categories are <literal>safe</literal>,
<literal>intrusive</literal>, <literal>malware</literal>,
<literal>version</literal>, <literal>discovery</literal> and
<literal>vulnerability</literal>. By default, Nmap runs all
scripts in either the <literal>safe</literal> or
<literal>intrusive</literal> categories. Categories are not
case sensitive. The following list describes each category.</para>
<variablelist>
<varlistentry>
<term>
<option>safe</option>
</term>
<listitem>
<para>Scripts
which weren't designed to crash services, use large
amounts of network bandwidth or other resources, or
exploit security holes. These are less likely to offend
remote sysadmins. Of course (as with all other Nmap
features) we cannot guarantee that they won't ever cause
adverse reactions. Most of these perform general
network discovery. Examples are echoTest (sends a string
to the UDP echo service) and showHTMLTitle (grabs the
title from a web page).</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>intrusive</option>
</term>
<listitem>
<para>These are not intended to
crash or damage anything, but are more likely to leave
suspicious logs or otherwise arouse sysadmin ire. Scripts
which attempt to log in to services with default passwords
fall into this class.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>malware</option>
</term>
<listitem>
<para>These scripts test if the target platform is
infected by malware or backdoors.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>version</option>
</term>
<listitem>
<para>This category cannot be selected explicitly. It is only
run if <option>-sV</option>
was supplied. The scripts in this category are an
extension to the version detection service. Their output
cannot be distinguished from version detection output
and they do not produce script scanning
output. </para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>discovery</option>
</term>
<listitem>
<para>These scripts try to actively learn more about the
network by querying public registries, SNMP-enabled
devices, directory services, and the like.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>vulnerability</option>
</term>
<listitem>
<para>These scripts check for a specific vulnerability and report results only if it is found.</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-args">
<title>Arguments to Scripts</title>
<para>
You can pass arguments to NSE scripts via the
<option>--script-args</option> option. Script arguments are generally
name-value pairs, which are provided to the script as a Lua table called
<literal>args</literal> inside the <literal><link
linkend="nse-api-registry">nmap.registry</link></literal> with
the names as keys for the corresponding values. The values can either be
strings or tables. Subtables can be used to pass arguments to
scripts with a finer granularity (e.g. pass different usernames for
different scripts). A typical nmap invocation with script arguments may
look like:
</para>
<para>
<userinput>
$ nmap -sC --script-args user=foo,pass=bar,anonFTP={pass=ftp@foobar.com}
</userinput>
</para>
<para>
which would result in the Lua table:
</para>
<programlisting>
{user="foo",pass="bar",anonFTP={pass="ftp@foobar.com"}}
</programlisting>
<para>You could therefore access the username (<literal>"foo"</literal>)
inside your script as <literal>local username = nmap.registry.args.user</literal>.
As a general rule the subtables used to override
options for scripts should be named after the script's
<literal>id</literal>, since otherwise scripts can't know where to
search for their arguments.
</para>
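<para>
As a rough illustration of this convention, the sketch below looks up an
argument first in a subtable named after the script and then at the top
level of the arguments table. The helper name <literal>get_arg</literal>
and the script name <literal>otherScript</literal> are hypothetical, and
the stub <literal>nmap</literal> table is an assumption that exists only
so the snippet can also run outside of Nmap. Inside a real script,
<literal>nmap.registry.args</literal> is filled in from
<option>--script-args</option>.
</para>
<programlisting>
-- Stand-in for the table Nmap builds from --script-args.
-- Assumption: only used when this snippet runs outside of Nmap.
local nmap = nmap or {
  registry = {
    args = {user="foo", pass="bar", anonFTP={pass="ftp@foobar.com"}}
  }
}

-- Hypothetical helper: prefer the script-specific subtable, then fall
-- back to the value given at the top level.
local function get_arg(script_id, name)
  local args = nmap.registry.args or {}
  local sub = args[script_id]
  if type(sub) == "table" and sub[name] ~= nil then
    return sub[name]
  end
  return args[name]
end

local ftp_pass = get_arg("anonFTP", "pass")        -- script-specific value
local other_pass = get_arg("otherScript", "pass")  -- top-level value
</programlisting>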
</sect2>
<sect2 id="nse-cmd-line-args">
<title>Command-line Arguments</title>
<para>
These are the five command line arguments specific to script-scanning:
</para>
<variablelist>
<varlistentry>
<term>
<option>-sC</option>
<indexterm>
<primary>-sC</primary>
</indexterm>
</term>
<listitem>
<para>Performs a script scan using the default set of scripts. It is
equivalent to
<option>--script=safe,intrusive</option>.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--script &lt;script-categories|directory|filename|all&gt;</option><indexterm><primary>--script</primary></indexterm></term>
<listitem>
<para>Runs a script scan (like <option>-sC</option>) with the scripts you have chosen rather than the defaults. Arguments can be script categories, single scripts, or directories of scripts to be run against the target hosts instead of the default set. Nmap first tries to interpret each argument as a category and then as a file or directory. Absolute paths are used as is; relative paths are searched for in the following places until found:
<filename>--datadir/</filename>;
<filename>$(NMAPDIR)/</filename>;
<filename>~user/nmap/</filename> (not searched on Windows);
<filename>NMAPDATADIR/</filename> or
<filename>./</filename>. A <filename>scripts/</filename> subdirectory is also tried in each of these. Give the argument <literal>all</literal> to execute all scripts in the Nmap script database.
</para>
<para>If a directory is specified and found, Nmap loads all NSE
scripts (any files whose names end with <literal>.nse</literal>) from that
directory. Nmap does not recurse into subdirectories to
find scripts. When individual file names are specified, the file
extension does not have to be <literal>nse</literal>.
</para>
<para>Nmap scripts are stored in a <filename>scripts</filename>
subdirectory of the Nmap data directory
<bookex>(see <xref linkend="data-files"/>)</bookex>
<notbook>(see the <ulink url="http://nmap.org/man/man-misc-options.html"><option>--datadir</option>
option</ulink>)</notbook>
by default. Scripts are indexed in a database stored in
<filename>scripts/script.db</filename>. The database lists all of the
scripts in each category. A single script may be in several
categories.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>--script-args</option>
<indexterm>
<primary>--script-args</primary>
</indexterm>
</term>
<listitem>
<para>Provides arguments to the scripts. See <xref
linkend="nse-args"/> for a detailed explanation.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>--script-trace</option>
<indexterm>
<primary>--script-trace</primary>
</indexterm>
</term>
<listitem>
<para>
This option is similar to
<option>--packet-trace</option>, but works at the
application level rather than packet by packet. If this
option is specified, all incoming and outgoing
communication performed by scripts is printed. The
displayed information includes the communication
protocol, source and target addresses, and the
transmitted data. If more than 5% of transmitted data is
unprintable, hex dumps are given instead.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>--script-updatedb</option>
<indexterm>
<primary>--script-updatedb</primary>
</indexterm>
</term>
<listitem>
<para>This option is only useful if you have added or
removed NSE scripts from the default
<literal>scripts</literal> directory, or if you have
changed any of the scripts' <literal>categories</literal>
fields. This field contains categories such as
<literal>safe</literal> and <literal>discovery</literal>
which the script belongs to. Categories may be
specified with the <option>--script</option> option. For
efficiency reasons, NSE generates a
<filename>script.db</filename> file which maps
categories to the scripts they contain. If you have changed
any <literal>categories</literal> fields or added or removed scripts, run
<command>nmap --script-updatedb</command>.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Some of the Nmap options have effects on script scans. The most
prominent of these is <option>-sV</option>. A version scan executes
the scripts in the <literal>version</literal> category. The scripts
in this category are slightly different from other scripts. Their
output blends in with the version scan and they do not produce any
script scan output.
</para>
<para>
Another option that affects the scripting engine is
<option>-A</option>. The aggressive mode of Nmap implies
the option <option>-sC</option>.
</para>
</sect2>
<sect2 id="nse-usage-examples">
<title>Usage Examples</title>
<para>
Simple script scan.
</para>
<para>
<userinput>
$ nmap -sC hostname
</userinput>
</para>
<para>
Tracing a specific script.
</para>
<para>
<userinput>
$ nmap --script=./showSSHVersion.nse --script-trace hostname
</userinput>
</para>
</sect2>
</sect1>
<sect1 id="nse-scripts">
<title>Script Format</title>
<para>NSE scripts consist of four descriptive fields, a port or host rule defining when the script should be executed, and an action block containing the actual script instructions. All six are Lua variables to which values are assigned. Their names must be lowercase as shown here.
</para>
<sect2 id="nse-format-id">
<title><literal>id</literal> Field</title>
<para>
The script's <literal>id</literal> field is displayed in the Nmap output
table if the script produces any output. It should be unique so users
can identify exactly which script file produced a message. IDs
should be kept short to conserve space in Nmap output, while
still being meaningful enough for users to recognize. Some
good examples are <literal>RIPE query</literal>, <literal>HTML
title</literal>, and <literal>Kibuv worm</literal>.
</para>
</sect2>
<sect2 id="nse-format-description">
<title><literal>description</literal> Field</title>
<para>
The description describes what the script is testing for and
any critical notes the user must be aware of. A good
example is this user-contributed recursive DNS script
description: <quote>Checks whether a nameserver on UDP port 53
allows queries for third party names. It is expected that
recursion will be enabled on your own internal
nameserver.</quote>
</para>
</sect2>
<sect2 id="nse-format-author">
<title><literal>author</literal> Field </title>
<para>
The <literal>author</literal> field contains the script author's name and contact information. If you are worried about spam, you might want to omit or obscure your email address, or give your home page URL instead. This optional field is not used by NSE, but is important for giving script authors due credit or blame.
</para>
</sect2>
<sect2 id="nse-format-license">
<title><literal>license</literal> Field </title>
<para>This field describes the license applied to the script. All scripts currently shipped with Nmap contain:</para>
<programlisting>
license = "Same as Nmap--See http://nmap.org/man/man-legal.html"
</programlisting>
<para>See <xref linkend="nse-license"/> for further details on contributing NSE scripts to Nmap.
</para>
</sect2>
<sect2 id="nse-format-runlevel">
<title><literal>runlevel</literal> Field</title>
<para>
This optional field determines script execution order. When
this field is absent the run level defaults to 1.0. A script
with the run level 1.0 is run before any scripts with <literal>runlevel</literal> set to
<literal>2.5</literal>, which in turn runs before any scripts
with <literal>runlevel 2.55</literal>. No particular order
is guaranteed for scripts with the same run level. One
application of run levels is allowing scripts to depend on
each other. If <literal>script A</literal> relies on some
information gathered by <literal>script B</literal>, give
<literal>B</literal> a lower run level than
<literal>A</literal>. <literal>Script B</literal> can store
information in the NSE registry for <literal>A</literal> to
retrieve later. For information on the NSE registry see
<xref linkend="nse-api-registry"/>.
</para>
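<para>
A minimal sketch of this pattern follows. The registry key
<literal>example_banner</literal>, the stored value, and both function
bodies are hypothetical. In practice each function would be assigned to
<literal>action</literal> in its own script file, with the
<literal>runlevel</literal> noted in the comment above it. The stub
<literal>nmap</literal> table is an assumption that only lets the
snippet run outside of Nmap.
</para>
<programlisting>
-- Stub registry so the sketch also runs outside of Nmap (assumption).
local nmap = nmap or {registry = {}}

-- In script B (runlevel = 1.0): store a value for later scripts.
local function action_b(host, port)
  nmap.registry.example_banner = "SSH-2.0-Example"
  return nil   -- B produces no output of its own
end

-- In script A (runlevel = 2.0): read what B stored, if anything.
local function action_a(host, port)
  local banner = nmap.registry.example_banner
  if banner then
    return "banner seen earlier: " .. banner
  end
  return nil
end
</programlisting>
<para>
Because A checks for a missing value, it simply produces no output
when B did not run or found nothing to store.
</para>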
</sect2>
<sect2 id="nse-format-rules">
<title>Port and Host Rules</title>
<para>
There are two types of rules: <emphasis>host rules</emphasis>
which run only once against a target IP and <emphasis>port
rules</emphasis> which run against individual ports on a
target. A rule is a Lua function which takes a host and a
port table as arguments and returns a boolean. If the rule
evaluates to <literal>true</literal>, the script action
is performed. Otherwise the action is skipped. Port rules are
only matched against TCP or UDP ports in the
<literal>open</literal>, <literal>open|filtered</literal> or
<literal>unfiltered</literal>
states. Host rules are matched exactly once against every
scanned host. The action, like the rule, is a Lua function,
which takes a host and port table as arguments. If the script is
matched using a host rule, then <literal>nil</literal> is passed instead of a port table. Example rules are shown in
<xref linkend="nse-tutorial-rule"/>.</para>
</sect2>
<sect2 id="nse-format-action"><title>Action</title>
<para>
The action is the heart of an NSE script. It contains all of
the instructions to be executed when the script's port or host
rule triggers. It is a Lua function which returns either
<literal>nil</literal> or a string. If a string is returned,
it is printed along with the script ID in (if it is a service
script) or below (if it is a host script) the Nmap port table.
If the script returns <literal>nil</literal>, no output is
produced. All variables in the
action and rule segments must be declared <literal>local</literal>. For an
example of an NSE action refer to <xref
linkend="nse-tutorial-action"/>.
</para>
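<para>
Putting the pieces of this section together, a complete service script
might look roughly like the following sketch. The
<literal>id</literal>, description, author, and message are made up for
illustration, and the port table fields used here
(<literal>number</literal>, <literal>protocol</literal>, and
<literal>state</literal>) are assumptions, since this section does not
list the contents of that table.
</para>
<programlisting>
-- Descriptive fields (all values here are hypothetical).
id = "Example web port"
description = "Reports that TCP port 80 is open."
author = "Jane Doe"
license = "Same as Nmap--See http://nmap.org/man/man-legal.html"
categories = {"safe"}
runlevel = 1.0

-- Port rule: decide whether the action should run for this port.
-- Assumes the port table carries number, protocol and state fields.
portrule = function(host, port)
  return port.number == 80
     and port.protocol == "tcp"
     and port.state == "open"
end

-- Action: return a string to print next to the port, or nil for no output.
action = function(host, port)
  local msg = "web port " .. port.number .. " is open"
  return msg
end
</programlisting>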
</sect2>
</sect1>
<sect1 id="nse-language">
<title>Script Language</title>
<para>
Nmap's scripting engine consists of three more or less distinct
parts. The largest part is the embeddable Lua interpreter. This
is a lightweight language designed for extensibility. It offers
a powerful and well documented API for interfacing with other
software (such as Nmap).
</para>
<para>
The second part of the Nmap scripting engine is the NSE library, which
connects Lua and Nmap. This layer
handles issues such as initialization of the Lua interpreter,
scheduling of parallel script execution, script retrieval and
more. It is also the heart of the NSE network I/O framework and the
exception handling mechanism.
</para>
<para>
Lua was designed with a small feature set to ease embedding. So
we have added extensions to support more specialized
functionality. These are essentially
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.3">Lua modules</ulink> written either in Lua itself or, where needed, in C. This
extension library is the third part of NSE.
</para>
<sect2 id="nse-lua">
<title>Lua Base Language</title>
<para>
The Nmap scripting language is an embedded <ulink
url="http://www.lua.org/">Lua</ulink> interpreter which was
extended with libraries for interfacing with Nmap. The Nmap
API is in the Lua namespace <literal>nmap</literal>. This
means that all calls to resources provided by Nmap have an
<literal>nmap</literal> prefix.
<literal>nmap.new_socket()</literal>, for example, returns a
new socket wrapper object. The Nmap library layer also takes
care of initializing the Lua context, scheduling parallel
scripts and collecting the output produced by completed
scripts.
</para>
<para>
During the planning stages, we considered several programming
languages as the basis for Nmap scripting. One option was to
implement a completely new programming language. The criteria
imposed on the options were strict: NSE needed to be easy to
use, small in size, compatible with the Nmap license,
scalable, fast, and parallelizable. Several earlier
efforts to design a security auditing language from scratch
resulted in well-known but awkward solutions, so it was
clear from the beginning that we would not go down this
road. For a while the Guile Scheme interpreter was considered,
but the preference drifted towards Elk because of its more
liberal license. Parallelizing Elk scripts would have been
difficult, however. In addition, the subset of Nmap users familiar with
functional programming was regarded as too small to consider
Scheme as an option. Larger interpreters like Perl, Python, or
Ruby are well known and loved, but are difficult to embed
efficiently. In the end, Lua excelled in all criteria for
NSE. It is small, distributed under the MIT license, has
coroutines which provide a sane method for parallel script
execution, was designed with embeddability in mind, has
excellent documentation, and is actively developed by a large
and committed community.
</para>
</sect2>
</sect1>
<sect1 id="nse-library">
<title>Lua Extensions</title>
<para>In addition to the significant built-in capabilities of
Lua, we have written or integrated several extensions to make
NSE scripts more powerful and convenient to write. These
<emphasis>modules</emphasis> are compiled and installed along with
Nmap. They have their own directory, <filename>nselib</filename>, which
is installed in the configured datadir. Scripts need only <literal>require</literal> the default modules in order to use them. The default modules are described in the following sections.
</para>
<sect2 id="nse-bitops">
<title>Bitwise Logical Operations</title>
<para>
Lua does not provide bitwise logical operations. Since they
are often useful for low-level network communication, Reuben
Thomas' bitwise operation library for Lua has been
integrated into NSE. The arguments to the bitwise operation
functions should be integers. The number of bits available
for logical operations depends on the data type used to
represent Lua numbers&mdash;this is typically 8-byte IEEE
floats, which give 53 bits (the size of the mantissa).
This implies that the bitwise operations won't work (as expected)
for numbers larger than 10<superscript>14</superscript>. You
can use them with 32-bit wide numbers without any problems. Operations
involving 64-bit wide numbers, however, may not return the expected
result.
The logical operations start with <quote>b</quote> (for <literal>bit</literal>) to avoid
clashing with reserved words; although <literal>xor</literal> isn't a
reserved word, it seemed better to use <literal>bxor</literal> for
consistency. In NSE the bitwise functions are in the <literal>bit</literal>
namespace.
<variablelist>
<varlistentry>
<term><option>bit.bnot(a)</option>
<indexterm><primary>bit.bnot(a)</primary></indexterm></term>
<listitem>
<para>
Returns the one's complement of <replaceable>a</replaceable>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.band(w1,...)</option>
<indexterm><primary>bit.band(w1,...)</primary></indexterm></term>
<listitem>
<para>
Returns the bitwise <literal>and</literal> of the
w's.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.bor(w1,...)</option>
<indexterm><primary>bit.bor(w1,...)</primary></indexterm></term>
<listitem>
<para>
Returns the bitwise <literal>or</literal> of the w's.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.bxor(w1,...)</option>
<indexterm><primary>bit.bxor(w1,...)</primary></indexterm></term>
<listitem>
<para>
Returns the bitwise <literal>xor</literal> of the
w's.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.lshift(a,b)</option>
<indexterm><primary>bit.lshift(a,b)</primary></indexterm></term>
<listitem>
<para>
Returns <replaceable>a</replaceable> shifted left <replaceable>b</replaceable> places, padded with zeros.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.rshift(a,b)</option>
<indexterm><primary>bit.rshift(a,b)</primary></indexterm></term>
<listitem>
<para>
Returns <replaceable>a</replaceable> shifted logically right <replaceable>b</replaceable> places.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.arshift(a,b)</option>
<indexterm><primary>bit.arshift(a,b)</primary></indexterm></term>
<listitem>
<para>
Returns <replaceable>a</replaceable> shifted arithmetically right <replaceable>b</replaceable> places.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bit.mod(a,b)</option>
<indexterm><primary>bit.mod(a,b)</primary></indexterm></term>
<listitem>
<para>
Returns the integer remainder of <replaceable>a</replaceable> divided by <replaceable>b</replaceable>.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
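<para>
As a small sketch of how these functions combine, the following splits
a 16-bit value into its high and low bytes and puts it back together,
as a script might do when building or parsing a packet field. The
helper names <literal>split16</literal> and <literal>join16</literal>
are hypothetical, and the snippet assumes the <literal>bit</literal>
module is available through <literal>require</literal>.
</para>
<programlisting>
local bit = require "bit"

-- Split a 16-bit value into its high and low bytes.
local function split16(value)
  local high = bit.band(bit.rshift(value, 8), 0xFF)
  local low = bit.band(value, 0xFF)
  return high, low
end

-- Reassemble the two bytes into one 16-bit value.
local function join16(high, low)
  return bit.bor(bit.lshift(high, 8), low)
end

local high, low = split16(8080)       -- 8080 is 0x1F90
local rebuilt = join16(high, low)
</programlisting>
<para>
All values here fit comfortably within 32 bits, so the precision limit
described above does not come into play.
</para>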
</sect2>
<sect2 id="nse-pcre">
<title>Perl Compatible Regular Expressions</title>
<para>
One of Lua's quirks is its string patterns. While they have
great performance and are tightly integrated into the Lua
interpreter, they are very different in syntax and not as
powerful as standard regular expressions. So we have
integrated Perl compatible regular expressions into Lua
using libPCRE and a modified version of the Lua PCRE library
written by Reuben Thomas and Shmuel Zeigerman. These are
the same sort of regular expressions used by Nmap version
detection. The main modification to their library is that
the NSE version only supports PCRE expressions instead of both
PCRE and POSIX patterns. In order to maintain a high script
execution speed, the library interfacing with libPCRE is
kept very thin. It is not integrated as seamlessly as the
Lua string pattern API. This allows script authors to decide
when to use PCRE expressions versus Lua patterns. PCRE
involves a separate pattern compilation step, which saves
execution time when patterns are reused. Compiled patterns
can be cached in the NSE registry and reused by other
scripts. The PCRE functions reside inside the <literal>pcre</literal>
namespace.
</para>
<warning><para>LibPCRE has a history of security vulnerabilities
allowing attackers who are able to compile arbitrary regular
expressions to execute arbitrary code. More such
vulnerabilities may be discovered in the future. These have
never affected Nmap because it doesn't give attackers any
control over the regular expressions it uses. Similarly, NSE
scripts should never build regular expressions with untrusted
network input. Matching hardcoded regular expressions
<emphasis>against</emphasis> the untrusted input is
fine.</para></warning>
<para>The following documentation is derived from that supplied by
the PCRE Lua lib.</para>
<variablelist>
<varlistentry>
<term><option>pcre.new(pattern, flags, locale)</option>
<indexterm><primary>pcre.new</primary></indexterm></term>
<listitem>
<para>
Returns a compiled regular expression. The first
argument is a string describing the pattern, such as
<literal>^foo$</literal>. The second
argument is a number describing which compilation
flags are set. The compilation flags are set
bitwise. If you want to set the 3rd (corresponding to
the number 4) and the 1st (corresponding to 1) bit
for example you would pass the number 5 as a second
argument. The compilation flags accepted are those
of the PCRE C library. These include flags for case
insensitive matching (1), matching line beginnings (^)
and endings ($) even in multiline strings (i.e. strings
containing <quote>\n</quote>) (2) and a flag for matching across
line boundaries (4). No compilation flags yield a default
value of 0. The third (optional) argument is a string
describing the locale which should be used to compile the
regular expression. The variable is a string which is
passed to the C standard library function
<function>setlocale</function>. For more
information on this argument refer to the
documentation of <function>setlocale</function>. The
resulting compiled regular expression is ready to be
matched against strings. Compiled regular
expressions are subject to Lua's garbage collection.
Generally speaking, <literal>my_regex = pcre.new("<replaceable>pcre-pattern</replaceable>",0,"C")</literal>
should do the job most of the time.
</para>
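<para>
For instance, rather than hardcoding the number 5, the flag values can
be looked up by name with <literal>pcre.flags()</literal> and combined.
This is only a sketch: the pattern and the sample banner are made up,
and it assumes that the <literal>pcre</literal> and
<literal>bit</literal> modules are available through
<literal>require</literal> and that the linked PCRE library exposes the
<literal>CASELESS</literal> and <literal>DOTALL</literal> flag names.
</para>
<programlisting>
local pcre = require "pcre"
local bit = require "bit"

-- Look up flag values by name instead of hardcoding them.
local flags = pcre.flags()

-- Case insensitive matching (1) combined with matching across
-- line boundaries (4) gives the value 5.
local opts = bit.bor(flags.CASELESS, flags.DOTALL)

-- Compile once; the compiled object can be reused for many strings.
local ssh_regex = pcre.new("^ssh-.*openssh", opts, "C")

-- Hypothetical input; match returns nil when nothing matches.
local first, last = ssh_regex:match("SSH-1.99-OpenSSH_4.2")
if first then
  -- the banner matched the compiled pattern
end
</programlisting>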
</listitem>
</varlistentry>
<varlistentry>
<term><option>pcre.flags()</option>
<indexterm><primary>pcre.flags</primary></indexterm></term>
<listitem>
<para>
Returns a table of the available PCRE option flags
(numbers) keyed by their names (strings). The available
flag names are listed in the documentation of the PCRE
library that Nmap is linked against. The key is the option
name in the manual minus the <literal>PCRE_</literal>
prefix. <literal>PCRE_CASELESS</literal> becomes
<literal>CASELESS</literal>, for example.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>pcre.version()</option>
<indexterm><primary>pcre.version</primary></indexterm></term>
<listitem>
<para>
Returns the version of the PCRE library in use as a
string. For example <literal>6.4 05-Sep-2005</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>pcre_obj:match(string, start, flags)</option>
<indexterm><primary>pcre.match</primary></indexterm></term>
<listitem>
<para>
Returns the start point and the end point of
the first match of the compiled regular expression
pcre_obj in the string. A third
returned value is a table which contains
<literal>false</literal> in the positions where the
pattern did not match. If named sub-patterns were
used the table also contains substring matches keyed
by their sub-pattern name. Should no match be found the
function returns <literal>nil</literal>.
The second and third arguments are optional. The second
argument is a number specifying where the engine should
start trying to apply the pattern. The third argument
specifies execution flags for the pattern.
If you want to see if a given string matches a certain expression
you could use:</para>
<programlisting>
s = pcre_obj:match("string to be searched", 0, 0)
if s then
  -- code to be done on match
end
</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term><option>pcre_obj:exec(string, start, flags)</option>
<indexterm><primary>pcre.exec</primary></indexterm></term>
<listitem>
<para>
This function is like <literal>match()</literal> except that the table returned as
the third result contains offsets of substring matches rather
than the substring matches themselves. That table will not
contain string keys, even if named sub-patterns are used. For
example, if the whole match is at offsets <literal>10, 20</literal> and substring
matches are at offsets <literal>12, 14</literal> and <literal>16, 19</literal>, then the function
returns the following: <literal>10, 20, {12,14,16,19}</literal>.
</para>
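<para>
The difference between <literal>match()</literal> and <literal>exec()</literal>
can be sketched as follows. This is an illustrative fragment only: it assumes the
<literal>pcre</literal> module is already available to the script, and the pattern
and sample string are made up for the example.
</para>
<programlisting>
-- Compile a pattern with two sub-patterns (major and minor version).
local version_re = pcre.new("([0-9]+)\\.([0-9]+)", 0, "C")
local banner = "server version 2.6 ready"

-- match(): the third result holds the matched substrings themselves.
local m_start, m_end, substrings = version_re:match(banner, 0, 0)

-- exec(): same start and end, but the third result holds offsets instead.
local e_start, e_end, offsets = version_re:exec(banner, 0, 0)
</programlisting>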
</listitem>
</varlistentry>
<varlistentry>
<term><option>pcre_obj:gmatch(string, func, n, ef)</option>
<indexterm><primary>pcre.gmatch</primary></indexterm></term>
<listitem>
<para>
Tries to match the regular expression <replaceable>pcre_obj</replaceable> against <replaceable>string</replaceable>
up to <replaceable>n</replaceable> times (or as many times as possible if <replaceable>n</replaceable> is either
not given or is not a positive number), subject to the
execution flags <replaceable>ef</replaceable>. Each time there is a match, <replaceable>func</replaceable>
is called as <replaceable>func(m, t)</replaceable>, where <replaceable>m</replaceable> is the matched
string and <replaceable>t</replaceable> is a table of substring matches. This
table contains <literal>false</literal> in the
positions where the corresponding sub-pattern did
not match. If named sub-patterns are used, then the
table also contains substring matches keyed by their
corresponding sub-pattern names (strings). If <replaceable>func</replaceable>
returns a <literal>true</literal> value, then <literal>gmatch</literal>
returns immediately. In either case, <literal>gmatch</literal> returns the number of
matches made.
</para>
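<para>
A minimal sketch of collecting every match with <literal>gmatch</literal> might
look like this. It assumes the <literal>pcre</literal> module is already
available to the script; the pattern, the sample text and the
<literal>words</literal> table are hypothetical.
</para>
<programlisting>
local word_re = pcre.new("[a-z]+", 0, "C")
local words = {}

-- The callback receives the matched string m and the table t of
-- substring matches. It returns nothing, so gmatch keeps going.
local count = word_re:gmatch("open filtered closed", function(m, t)
  words[#words + 1] = m
end)

-- count now holds the number of matches that were made.
</programlisting>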
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-ipOps">
<title>IP Operations</title>
<para>
The <literal>ipOps</literal> module provides some functions for
manipulating IPv4 addresses. The functions reside inside the
<literal>ipOps</literal> namespace.
</para>
<variablelist>
<varlistentry>
<term><option>bool = ipOps.isPrivate("ip-string")</option>
<indexterm><primary>isPrivate</primary></indexterm></term>
<listitem>
<para>
Checks whether an IP address, provided as a string in
dotted-quad notation, is part of the non-routed private IP address
space, as described in <ulink role="hidepdf" url="http://www.rfc-editor.org/rfc/rfc1918.txt">RFC 1918</ulink>. These addresses are the well-known
<literal>10.0.0.0/8</literal>, <literal>192.168.0.0/16</literal> and
<literal>172.16.0.0/12</literal> networks.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>DWORD = ipOps.todword("ip-string")</option>
<indexterm><primary>todword</primary></indexterm></term>
<listitem>
<para>
Returns the IP address as a DWORD value (i.e. the IP <replaceable>a.b.c.d</replaceable> becomes
<literal>(((a*256+b)*256+c)*256+d)</literal>).
</para>
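<para>
The arithmetic can be illustrated in plain Lua. The helper below is a
hypothetical stand-in written for this example, not the
<literal>ipOps</literal> implementation; it only demonstrates the formula given above.
</para>
<programlisting>
-- Hypothetical helper: converts a dotted-quad string to a number
-- using (((a*256+b)*256+c)*256+d). Returns nil for malformed input.
local function dotted_quad_to_dword(ip)
  local a, b, c, d = ip:match("^(%d+)%.(%d+)%.(%d+)%.(%d+)$")
  if not a then
    return nil
  end
  return ((tonumber(a) * 256 + tonumber(b)) * 256 + tonumber(c)) * 256
    + tonumber(d)
end

print(dotted_quad_to_dword("192.168.1.1"))  -- prints 3232235777
</programlisting>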
</listitem>
</varlistentry>
<varlistentry>
<term><option>a,b,c,d = ipOps.get_parts_as_number("ip-string")</option>
<indexterm><primary>get_parts_as_number</primary></indexterm></term>
<listitem>
<para>
Returns 4 numbers corresponding to the fields in dotted-quad notation.
For example, <literal>ipOps.get_parts_as_number("192.168.1.1")
</literal> returns <literal>192,168,1,1</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-shortport">
<title>Short Portrules</title>
<para>
Since portrules are mostly the same for many scripts, the
<literal>shortport</literal> module provides functions for the most common tests.
The arguments in brackets (<literal>[]</literal>) are optional. If no
<literal>proto</literal> is provided, <literal>tcp</literal> is used. The default
<literal>state</literal> is <literal>open</literal>.
</para>
<variablelist>
<varlistentry>
<term><option>shortport.portnumber(port,[proto],[state])</option>
<indexterm><primary>portnumber</primary></indexterm></term>
<listitem>
<para>
The port argument is either a number or a table of numbers which are
interpreted as port numbers, against which the script should run.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>shortport.service(service,[proto],[state])</option>
<indexterm><primary>service</primary></indexterm></term>
<listitem>
<para>
The service argument is either a string or a table
of strings which are interpreted as service names
(e.g. <literal>"http"</literal>, <literal>"https"</literal>, <literal>"smtp"</literal> or <literal>"ftp"</literal>) against which the
script should run. These service names are
determined by Nmap's version scan or (if no version
scan information is available) the service assigned
to the port in <filename>nmap-services</filename>
(e.g. "http" for TCP port 80).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>shortport.port_or_service(port,service,[proto],[state])</option>
<indexterm><primary>port_or_service</primary></indexterm></term>
<listitem>
<para>
This is a combination of the above functions, since many scripts
explicitly try to run against the well-known ports, but also
want to run against any other port which was discovered to run the
named service. A typical example for this function is:
<literal>portrule = shortport.port_or_service(22,"ssh")</literal>.
</para>
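<para>
The other two functions are used in the same way. The fragment below is a
sketch that assumes the module is loaded with <literal>require</literal>; the
port numbers and service names are only examples.
</para>
<programlisting>
require "shortport"

-- Run against TCP ports 80 and 8080 when they are open.
portrule = shortport.portnumber({80, 8080}, "tcp", "open")

-- Alternatively, select ports by service name. The protocol and state
-- are omitted here, so tcp and open are used.
-- portrule = shortport.service({"http", "https"})
</programlisting>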
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-listop">
<title>Functional Programming Style List Operations</title>
<para>
People used to programming in functional languages, such as Lisp or
Haskell, appreciate their handling of lists very much. The <literal>listop</literal> module tries to bring much of the functionality from
functional languages to Lua using Lua's central data structure, the table,
as a base for its list operations. Highlights include a <literal>map</literal>
function applying a given function to each element of a list.
</para>
<variablelist>
<varlistentry>
<term><option>bool = listop.is_empty(list)</option>
<indexterm><primary>is_empty</primary></indexterm></term>
<listitem>
<para>
Returns <literal>true</literal> if the given list is empty.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bool = listop.is_list(value)</option>
<indexterm><primary>is_list</primary></indexterm></term>
<listitem>
<para>
Returns <literal>true</literal> if the given value is a list (or rather a table).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>list = listop.map(function, list)</option>
<indexterm><primary>map</primary></indexterm></term>
<listitem>
<para>
The provided function is applied to each element of the list
separately. The returned list contains the results of each
function call. For example <literal>listop.map(tostring,{1,2,true})
</literal> returns <literal>{"1","2","true"}</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>value = listop.apply(function, list)</option>
<indexterm><primary>apply</primary></indexterm></term>
<listitem>
<para>
All of the elements in the list are passed to a call of <literal>
function</literal>. The result is then returned. For example
<literal>listop.apply(math.max,{1,5,6,7,50000})</literal>
yields <literal>50000</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>list = listop.filter(predicate, list)</option>
<indexterm><primary>filter</primary></indexterm></term>
<listitem>
<para>
Returns a list containing only those elements for which the predicate
returns true. The predicate has to be a function, which takes an
element of the list as argument and the result of which
is interpreted as boolean value. If it returns true (or rather
anything besides <literal>false</literal> and <literal>nil</literal>)
the argument is appended to the return value of <literal>filter</literal>.
For example: <literal>listop.filter(isnumber,{1,2,3,"foo",4,"bar"})</literal> returns
<literal>{1,2,3,4}</literal>.
</para>
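<para>
Note that <literal>isnumber</literal> is not a built-in Lua function, so a
script has to supply such a predicate itself. A possible sketch, assuming the
module is loaded with <literal>require</literal> and using a predicate
defined only for this example:
</para>
<programlisting>
require "listop"

-- Hypothetical predicate: true only for values of type number.
local function isnumber(value)
  return type(value) == "number"
end

-- Keep the numeric elements, then convert each of them to a string.
local numbers = listop.filter(isnumber, {1, 2, 3, "foo", 4, "bar"})
local strings = listop.map(tostring, numbers)
</programlisting>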
</listitem>
</varlistentry>
<varlistentry>
<term><option>list = listop.flatten(list)</option>
<indexterm><primary>flatten</primary></indexterm></term>
<listitem>
<para>
Since a list can itself contain lists as elements,
<literal>flatten</literal> returns a list which
only contains values that are not themselves
lists. For example:
<literal>listop.flatten({1,2,3,"foo",{4,5,{"bar"}}})</literal> returns
<literal>{1,2,3,"foo",4,5,"bar"}</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>list = listop.append(list1, list2)</option>
<indexterm><primary>append</primary></indexterm></term>
<listitem>
<para>
Returns a list containing all elements of <replaceable>list1</replaceable> followed by all
elements of <replaceable>list2</replaceable>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>list = listop.cons(value1, value2)</option>
<indexterm><primary>cons</primary></indexterm></term>
<listitem>
<para>
Returns a list containing <replaceable>value1</replaceable> appended by <replaceable>value2</replaceable>, which may be
of any type.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>list = listop.reverse(list)</option>
<indexterm><primary>reverse</primary></indexterm></term>
<listitem>
<para>
Returns a list containing all elements of the given list in inverted
order.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>value = listop.car(list)</option>
<indexterm><primary>car</primary></indexterm></term>
<listitem>
<para>
Returns the first element of the given list.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>value = listop.ncar(list,n)</option>
<indexterm><primary>ncar</primary></indexterm></term>
<listitem>
<para>
Returns the nth (or first if n is omitted) element of the given list.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>value = listop.cdr(list)</option>
<indexterm><primary>cdr</primary></indexterm></term>
<listitem>
<para>
Returns a list containing all elements but the first of the
given list.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>value = listop.ncdr(list, n)</option>
<indexterm><primary>ncdr</primary></indexterm></term>
<listitem>
<para>
Returns a list containing all elements but the first n of the
given list, where n is 2 if it is omitted.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-strbuf">
<title>String Buffer Operations</title>
<para>
Lua's string operations are very flexible and offer an easy-to-use way
to manipulate strings. Concatenation using the <literal>..</literal>
operator is such an operation. The drawback of the built-in API, however, is the way it handles
concatenation of many string values. Since strings in Lua are
immutable values, each time you concatenate two strings both get copied
into the result string. The <literal>strbuf</literal> module offers a
workaround for this problem, while maintaining the nice syntax. This
is accomplished by overloading the concatenation operator (<literal>..</literal>), the equality operator (<literal>==</literal>) and the
tostring operator. Thanks to these overloads,
the only extra work needed to use a string buffer instead
of a plain string is wrapping the first literal string assigned to a
variable inside a <literal>strbuf.new()</literal> call. Afterwards you can append to the string buffer, or compare
two string buffers for equality, just as you would with normal strings.
There are, however, some restrictions and oddities.
The concatenation operator requires its left-hand value to be a
string buffer. Therefore, if you want to prepend a string to a given
string buffer, you have to create a new string buffer out of the string
you want to prepend.
The string buffer's <literal>tostring</literal> operator concatenates the
strings inside the buffer using newlines by default, since this appears to
be the separator used most often.
</para>
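<para>
In practice this might look like the following sketch, which assumes the
module is loaded with <literal>require</literal> and uses made-up strings:
</para>
<programlisting>
require "strbuf"

-- Wrap the first string in a buffer, then append with the usual operator.
local buf = strbuf.new("first line")
buf = buf .. "second line"
buf = buf .. "third line"

-- To prepend, the left-hand side must itself be a string buffer.
local with_header = strbuf.new("header") .. buf

-- tostring() joins the stored strings with newlines; dump() takes
-- an explicit delimiter.
local text = tostring(buf)
local csv = strbuf.dump(buf, ", ")
</programlisting>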
<variablelist>
<varlistentry>
<term><option>buffer = strbuf.new("first-string")</option>
<indexterm><primary>new</primary></indexterm></term>
<listitem>
<para>
Creates a new string buffer. The argument is optional and is the
first string to be added to the buffer.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>buffer = strbuf.concat(strbuf1, value)</option>
<indexterm><primary>concat</primary></indexterm></term>
<listitem>
<para>
Concatenates the <literal>value</literal> (which has to be either
a string or a string buffer) to <literal>strbuf1</literal>. This
is also the function serving as the string buffer's concatenation operator.
The above function call can thus also be expressed as:
<literal>buffer = strbuf1 .. value</literal>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bool = strbuf.eqbuf(strbuf1, strbuf2)</option>
<indexterm><primary>eqbuf</primary></indexterm></term>
<listitem>
<para>
Compares <literal>strbuf1</literal> and <literal>strbuf2</literal>
for equality. For the function to return <literal>true</literal>, both values must be
string buffers containing exactly the same strings. The <literal>eqbuf</literal> function also serves as the string buffer's equality operator (<literal>==</literal>).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>strbuf.clear(strbuf)</option>
<indexterm><primary>clear</primary></indexterm></term>
<listitem>
<para>
Deletes all strings in <literal>strbuf</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>string = strbuf.dump(strbuf, "delimiter")</option>
<indexterm><primary>dump</primary></indexterm></term>
<listitem>
<para>
Dumps <literal>strbuf</literal>'s contents as a string. The second
parameter is used as a delimiter between the strings stored inside
<literal>strbuf</literal>. <literal>dump(strbuf, "\n")</literal> is
used as the <literal>tostring</literal> function of string buffers.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-url">
<title>URL Manipulation Functions</title>
<para>URL manipulation functions have obvious uses. Fortunately
there is already an implementation of URL generation functions
inside the Lua-socket package, which is fairly complete and
<ulink
url="http://www.cs.princeton.edu/~diego/professional/luasocket/old/luasocket-2.0-alpha/url.html">well
documented</ulink>. For NSE, Lua-socket's URL module was
extended with two functions:</para>
<variablelist>
<varlistentry>
<term><option>table = url.parse_query("query-string")</option>
<indexterm><primary>parse_query</primary></indexterm></term>
<listitem>
<para>
This function takes a <replaceable>query-string</replaceable> of the form <literal>name1=value1&amp;name2=value2...</literal> and returns a table
containing the name-value pairs, with the <literal>name</literal>
as the key and the <literal>value</literal> as its associated value.
The table corresponding to the above <replaceable>query-string</replaceable> would have two
entries: <literal>table["name1"]="value1"</literal> and
<literal>table["name2"]="value2"</literal>.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>query_string = url.build_query(table)</option>
<indexterm><primary>build_query</primary></indexterm></term>
<listitem>
<para>
This is the inverse function to <literal>parse_query()</literal>: it takes a
table of name-value pairs and returns the corresponding query string.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-match">
<title>Buffered Network I/O Helper Functions</title>
<para>
The <literal>match</literal> module was written to provide
functions which can be used for delimiting data received by the
<literal>receive_buf()</literal> function from the Network I/O API:
</para>
<variablelist>
<varlistentry>
<term><option>start,end = match.regex("regexpattern")</option>
<indexterm><primary>regex</primary></indexterm></term>
<listitem>
<para>
This is actually a wrapper around NSE's PCRE library <literal>exec</literal> function (see <xref linkend="nse-pcre"/>), thus
giving script developers the possibility to use regular expressions
for delimiting instead of Lua's string patterns. If you want to get
the data in chunks separated by <replaceable>regex</replaceable> (which has to be a valid
regular expression), you would write <literal>status, val =
sockobj:receive_buf(match.regex("regex"))</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>start,end = match.numbytes(number)</option>
<indexterm><primary>numbytes</primary></indexterm></term>
<listitem>
<para>
Takes a number as argument and returns that
many bytes. It can be used to get a buffered
version of
<literal>sockobj:receive_bytes(n)</literal> in
case a script requires more than one
fixed-size chunk, as the unbuffered version
may return more bytes than requested and thus
would require you to do the parsing on your
own.
</para>
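<para>
Both helpers can be combined on the same socket. The following is a sketch
only: <literal>read_header_and_line</literal> is a hypothetical helper, and
<literal>sockobj</literal> is assumed to be an already connected socket
from the Network I/O API.
</para>
<programlisting>
require "match"

local function read_header_and_line(sockobj)
  -- Ask for a fixed-size chunk of 4 bytes through the buffered interface.
  local status, header = sockobj:receive_buf(match.numbytes(4))
  if not status then
    return nil
  end

  -- Then read up to the next line ending, using a regular
  -- expression as the delimiter.
  local line
  status, line = sockobj:receive_buf(match.regex("\r?\n"))
  if not status then
    return nil
  end

  return header, line
end
</programlisting>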
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-http">
<title>HTTP Functions</title>
<para>
The <literal>http</literal> module provides functions for dealing with the client side of the HTTP protocol.
The functions reside inside the <literal>http</literal> namespace.
The return value of each function in this module is a table with the following keys:
<literal>status</literal>, <literal>header</literal> and <literal>body</literal>.
<literal>status</literal> is the status code returned for the HTTP request.
In case of an error, <literal>status</literal> is <literal>nil</literal>. <literal>header</literal>
is a table with the headers received from the server. The header names are
lower-cased and multiple headers of the same name are concatenated with commas.
<literal>body</literal> holds a string with the response body.
</para>
<variablelist>
<varlistentry>
<term><option>table = http.get(host,port,path,[options])</option>
<indexterm><primary>get</primary></indexterm></term>
<listitem>
<para>
Fetches a resource with a <literal>GET</literal> request.
The first argument is either a string with the hostname or a
table like the host table passed by Nmap. The second argument
is either the port number or a table like the port table passed
by Nmap. The third argument is the path of the resource. The fourth
argument is a table for further options. The table may have 2 keys:
<literal>timeout</literal> and <literal>header</literal>.
<literal>timeout</literal> is the timeout used for the socket
operations. <literal>header</literal> is a table with additional
headers to be used for the request.
The function builds the request and calls <literal>http.request</literal>.
</para>
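<para>
A script action using <literal>http.get</literal> could be sketched as
follows. This assumes the module is loaded with <literal>require</literal>;
the path and the output format are chosen only for illustration, and the
optional fourth argument is left out.
</para>
<programlisting>
require "http"

action = function(host, port)
  -- Pass the host and port tables straight through to the library.
  local response = http.get(host, port, "/")

  -- status is nil when the request failed.
  if response.status == nil then
    return nil
  end

  -- Header names are lower-cased by the library.
  local server = response.header["server"] or "unknown"
  return "Status " .. response.status .. ", server: " .. server
end
</programlisting>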
</listitem>
</varlistentry>
<varlistentry>
<term><option>table = http.request(host,port,request,[options])</option>
<indexterm><primary>request</primary></indexterm></term>
<listitem>
<para>
Sends <literal>request</literal> to <literal>host</literal>:<literal>port</literal>
and parses the answer.
The first argument is either a string with the hostname or a
table like the host table passed by Nmap. The second argument
is either the port number or a table like the port table passed
by Nmap. SSL is used for the request if either <literal>port.service</literal>
equals <literal>"https"</literal> or <literal>port.version.service_tunnel</literal>
equals <literal>"ssl"</literal>. The third argument is the request. The fourth
argument is a table for further options. You can specify a timeout
for the socket operations with the timeout key.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>table = http.get_url(url,[options])</option>
<indexterm><primary>get_url</primary></indexterm></term>
<listitem>
<para>
Parses <literal>url</literal> and calls <literal>http.get</literal>
with the result.
The second argument is a table for further options. The table may have 2 keys:
<literal>timeout</literal> and <literal>header</literal>.
<literal>timeout</literal> is the timeout used for the socket
operations. <literal>header</literal> is a table with additional
headers to be used for the request.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-datafiles">
<title>Data File Parsing Functions</title>
<para>
The <literal>datafiles</literal> module provides functions for reading and parsing
Nmap's data files (e.g. <filename>nmap-protocols</filename>, <filename>nmap-rpc</filename>,
etc.). These functions' return values are set up for use with exception handling via
<literal>nmap.new_try()</literal>.
</para>
<variablelist>
<varlistentry>
<term><option>bool, table = datafiles.parse_protocols()</option>
<indexterm><primary>parse_protocols</primary></indexterm></term>
<listitem>
<para>
This function reads and parses Nmap's <filename>nmap-protocols</filename>
file. <literal>bool</literal> is a boolean value indicating success.
If <literal>bool</literal> is true, then the second returned
value is a table with protocol numbers indexing the protocol
names. If <literal>bool</literal> is false, an error message
is returned as the second value instead of the table.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bool, table = datafiles.parse_rpc()</option>
<indexterm><primary>parse_rpc</primary></indexterm></term>
<listitem>
<para>
This function reads and parses Nmap's <filename>nmap-rpc</filename>
file. <literal>bool</literal> is a boolean value indicating success.
If <literal>bool</literal> is true, then the second returned
value is a table with RPC numbers indexing the RPC names. If
<literal>bool</literal> is false, an error message is returned
as the second value instead of the table.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>bool, table = datafiles.parse_services([protocol])</option>
<indexterm><primary>parse_services</primary></indexterm></term>
<listitem>
<para>
This function reads and parses Nmap's <filename>nmap-services</filename>
file. <literal>bool</literal> is a boolean value indicating success.
If <literal>bool</literal> is true, then the second returned
value is a table containing two other tables:
<literal>tcp{}</literal> and <literal>udp{}</literal>.
<literal>tcp{}</literal> contains services indexed by TCP port
numbers. <literal>udp{}</literal> is the same, but for UDP.
You can pass "tcp" or "udp" as an argument to
<literal>parse_services()</literal> to only get the corresponding
table. If <literal>bool</literal> is false, an error message is
returned as the second value instead of the table.
</para>
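<para>
The two return values can be checked by hand, as in this sketch. It assumes
the modules are loaded with <literal>require</literal>; the port number and
the debug message are examples only.
</para>
<programlisting>
require "datafiles"
require "stdnse"

-- Request only the TCP table.
local ok, services = datafiles.parse_services("tcp")

if ok then
  -- The table is indexed by port number.
  local name = services[80]
  stdnse.print_debug(1, "Service listed for TCP port 80: %s", tostring(name))
else
  -- On failure the second value is an error message.
  stdnse.print_debug(1, "Could not parse nmap-services: %s", tostring(services))
end
</programlisting>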
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-lib-stdnse">
<title>Various Utility Functions</title>
<para>
The <literal>stdnse</literal> library contains various handy
functions which are too small to justify modules of their own:
</para>
<variablelist>
<varlistentry>
<term><option>stdnse.print_debug(...)</option>
<indexterm><primary>print_debug</primary></indexterm></term>
<listitem>
<para>
Wrapper function around <literal>print_debug_unformatted()</literal>
in the <literal>nmap</literal> namespace. The first argument, if numeric, is
used as the necessary debug level to print the message (it defaults
to 1 if omitted). All remaining arguments are processed with
Lua's <literal>string.format()</literal> function, which provides a
C-style printf interface.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>table = stdnse.strsplit("delimiter","text")</option>
<indexterm><primary>strsplit</primary></indexterm></term>
<listitem>
<para>
This function will certainly be appreciated by Perl programmers.
It takes two strings as arguments and splits the second one around
all occurrences of the first one, returning a table, which contains
the substrings without the delimiting string.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>string = stdnse.strjoin("delimiter", table)</option>
<indexterm><primary>strjoin</primary></indexterm></term>
<listitem>
<para>
Inverse function to <literal>strsplit()</literal>. Basically this is
Lua's <literal>table.concat()</literal> function with the parameters
swapped for coherence.
</para>
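<para>
As a short sketch of the two functions together (assuming the module is
loaded with <literal>require</literal>; the strings are arbitrary):
</para>
<programlisting>
require "stdnse"

-- Split around each comma: the delimiter comes first, then the text.
local parts = stdnse.strsplit(",", "tcp,udp,icmp")

-- Join the pieces again, this time with a space between them.
local joined = stdnse.strjoin(" ", parts)
</programlisting>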
</listitem>
</varlistentry>
</variablelist>
</sect2>
</sect1>
<sect1 id="nse-api">
<title>Nmap API</title>
<para>
NSE scripts have access to several Nmap facilities for writing
flexible and elegant scripts. The API provides target host
details such as port states and version detection results. It
also offers an interface to the Nsock library for efficient
network I/O.
</para>
<sect2 id="nse-api-arguments">
<title>Information Passed to a Script</title>
<para>
An effective Nmap scripting engine requires more than just a
Lua interpreter. Users need easy access to the information
Nmap has learned about the target hosts. This data is passed
as arguments to the NSE <literal>action</literal> method. The
arguments, <literal>host</literal> and
<literal>port</literal>, are Lua tables which contain
information on the target against which the script is
executed. The following list describes each variable in the
<literal>host</literal> and <literal>port</literal> tables.
</para>
<para>
<variablelist>
<varlistentry>
<term><option>host</option>
<indexterm><primary>host</primary></indexterm></term>
<listitem>
<para>
This table is passed as a parameter to the rule and action
functions. It contains information on the operating system run by
the host (if the <option>-O</option> switch was supplied), the
IP address and the host name of the scanned target.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.os</option>
<indexterm><primary>host.os</primary></indexterm></term>
<listitem>
<para>
The <literal>os</literal> entry in the host table is
an array of strings. The strings (at most 8) are the
names of the operating systems the target is possibly
running. Strings are only entered in this array if the
target machine is a perfect match for one or more OS
database entries. If Nmap was run without the
<option>-O</option> option, then
<literal>host.os</literal> is <literal>nil</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.ip</option>
<indexterm><primary>host.ip</primary></indexterm></term>
<listitem>
<para>Contains a string representation of the IP address of the
target host. If the scan was run against a host name and the
DNS lookup returned more than one IP address, then the
same IP address is used as the one chosen for the scan.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.name</option>
<indexterm><primary>host.name</primary></indexterm></term>
<listitem>
<para>Contains the reverse DNS entry of the scanned target host
represented as a string. If the host has no reverse DNS entry,
the value of the field is an empty string.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.targetname</option>
<indexterm><primary>host.targetname</primary></indexterm></term>
<listitem>
<para>Contains the name of the host as specified on the command line.
If the target given on the command line contains a netmask or is an IP
address, the value of the field is <literal>nil</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.directly_connected</option>
<indexterm><primary>host.directly_connected</primary></indexterm></term>
<listitem>
<para> A boolean value indicating whether or not the target host is
directly connected (i.e. on the same network segment).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.mac_addr</option>
<indexterm><primary>host.mac_addr</primary></indexterm></term>
<listitem>
<para>MAC address of the destination host (6-byte long binary
string) or <literal>nil</literal>, if the host is not directly connected.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.mac_addr_src</option>
<indexterm><primary>host.mac_addr_src</primary></indexterm></term>
<listitem>
<para>Our own MAC address, which was used to connect to the
host (either our network card's or, with <option>--spoof-mac</option>, the spoofed address).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.interface</option>
<indexterm><primary>host.interface</primary></indexterm></term>
<listitem>
<para>A string containing the interface name (dnet-style) through
which packets to the host are sent.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.bin_ip</option>
<indexterm><primary>host.bin_ip</primary></indexterm></term>
<listitem>
<para>The target host's IP address as a 4-byte binary value.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>host.bin_ip_src</option>
<indexterm><primary>host.bin_ip_src</primary></indexterm></term>
<listitem>
<para>Our own host's IP address as a 4-byte binary value.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>port</option>
<indexterm><primary>port</primary></indexterm></term>
<listitem>
<para>
The port table is passed to the Lua script in the same
fashion as the host table. It contains information about the port
against which the script is running. If the script is run
according to a host rule, then no port table is passed to the
script. Port states on the target can still be requested from Nmap
using the <literal>nmap.get_port_state()</literal> call.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>port.number</option>
<indexterm><primary>port.number</primary></indexterm></term>
<listitem>
<para>
Contains the number of the currently scanned port.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>port.protocol</option>
<indexterm><primary>port.protocol</primary></indexterm></term>
<listitem>
<para>
Defines the protocol of the port. Valid values are
<literal>tcp</literal> and <literal>udp</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>port.service</option>
<indexterm><primary>port.service</primary></indexterm></term>
<listitem>
<para>
Contains a string representation of the service running on
<literal>port.number</literal> as detected by the Nmap service
detection. If the <literal>port.version</literal> field is
<literal>nil</literal> then Nmap has guessed the service based
only on the port number. Otherwise this field is equal to
<literal>port.version.name</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>port.version</option>
<indexterm><primary>port.version</primary></indexterm></term>
<listitem>
<para>
This entry is a table which contains information
retrieved by the Nmap version scanning engine. Some
of the values (like service name, service type
confidence, RPC related values) may be retrieved by
Nmap even if a version scan was not required. Values
which were not retrieved default to
<literal>nil</literal>. The meaning of each value is given in the following table:</para>
<table id="scripting-tbl-port-version-values">
<title><literal>port.version</literal> values</title>
<tgroup cols="2">
<colspec colwidth="2*" />
<colspec colwidth="5*" />
<thead><row>
<entry>Name</entry>
<entry>Description</entry>
</row></thead>
<tbody>
<row>
<entry><literal>name</literal></entry>
<entry>Contains the service name Nmap will use for the port.</entry>
</row>
<row>
<entry><literal>name_confidence</literal></entry>
<entry>Evaluates how confident the version detection is about the accuracy of <literal>name</literal>, from 1 (least confident) to 10.</entry>
</row>
<row>
<entry><literal>product</literal>, <literal>version</literal>, <literal>extrainfo</literal>, <literal>hostname</literal>, <literal>ostype</literal>, <literal>devicetype</literal></entry>
<entry>These six variables are described in
<bookex><xref linkend="vscan-versioninfo"/>.</bookex>
<notbook>the <ulink url="http://nmap.org/vscan/vscan-fileformat.html#vscan-versioninfo">versioninfo section</ulink> of our version scanning documentation.</notbook>
</entry>
</row>
<row>
<entry><literal>service_tunnel</literal></entry>
<entry>Contains the string <literal>none</literal> or <literal>ssl</literal> based on whether or not Nmap used SSL tunnelling to detect the service.</entry>
</row>
<row>
<entry><literal>service_fp</literal></entry>
<entry>The service fingerprint, if any, is provided in this value. This is described in
<bookex><xref linkend="vscan-community"/>.</bookex>
<notbook>our <ulink url="http://nmap.org/vscan/vscan-community.html">version detection documentation</ulink>.</notbook>
</entry>
</row>
<row>
<entry><literal>rpc_status</literal></entry>
<entry>Contains a string value of <literal>good_prog</literal> if
we were able to determine the program number of an RPC
service listening on the port, <literal>unknown</literal>
if the port appears to be RPC but we couldn't determine the
program number, <literal>not_rpc</literal> if the port
doesn't appear to be RPC, or <literal>untested</literal> if we
haven't checked for RPC status. The
<literal>rpc_program</literal>,
<literal>rpc_lowver</literal>, and
<literal>rpc_highver</literal> variables are <literal>nil</literal> unless
<literal>rpc_status</literal> is
<literal>good_prog</literal>.</entry>
</row>
<row>
<entry><literal>rpc_program</literal>, <literal>rpc_lowver</literal>, <literal>rpc_highver</literal></entry>
<entry>The detected RPC program number and the range of version
numbers supported by that program. These will be
<literal>nil</literal> if <literal>rpc_status</literal> is
anything other than <literal>good_prog</literal>.</entry>
</row>
</tbody></tgroup></table>
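<para>As a minimal sketch of how a script might consume these values,
the helper below builds a one-line summary from a port table and skips
every field that is <literal>nil</literal>. The name
<literal>describe_service</literal> is hypothetical and not part of the
NSE API:</para>
<programlisting>
-- Hypothetical helper, not part of the NSE API.
-- Fields that Nmap did not retrieve are nil, so each one is checked.
local function describe_service(port)
  local v = port.version
  local parts = { v.name or "unknown" }
  if v.product then parts[#parts + 1] = v.product end
  if v.version then parts[#parts + 1] = v.version end
  if v.service_tunnel == "ssl" then parts[#parts + 1] = "(over SSL)" end
  return table.concat(parts, " ")
end
</programlisting>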
</listitem>
</varlistentry>
<varlistentry>
<term><option>port.state</option>
<indexterm><primary>port.state</primary></indexterm></term>
<listitem>
<para>
Contains information on the state of the port.
Service scripts are only run against ports in the
<literal>open</literal> or
<literal>open|filtered</literal> states, so
<literal>port.state</literal> generally contains one
of those values. Other values might appear if the port
table is a result of the
<literal>get_port_state</literal> function. You can
adjust the port state using the
<literal>nmap.set_port_state()</literal> call. This is
normally done when an <literal>open|filtered</literal>
port is determined to be <literal>open</literal>.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
Scripts also have access to some of Nmap&rsquo;s functions and state
variables that are exposed through functions in the <literal>nmap</literal>
table.
<variablelist>
<varlistentry>
<term><option>nmap.debugging()</option>
<indexterm><primary>debugging</primary><secondary><literal>nmap.debugging</literal></secondary></indexterm></term>
<listitem>
<para>
Returns the debugging level as a non-negative integer. The
debugging level can be set with the <option>-d</option>
option<bookex> (see <xref linkend="port-scanning-options-output"/>)</bookex>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>nmap.have_ssl()</option>
<indexterm><primary>have_ssl</primary></indexterm></term>
<listitem>
<para>
Returns true if Nmap was compiled with SSL support, false
otherwise. This can be used to avoid sending SSL probes
when SSL is not available.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>nmap.verbosity()</option><indexterm><primary>verbosity</primary><secondary><literal>nmap.verbosity</literal></secondary></indexterm></term>
<listitem>
<para>
Returns the verbosity level as a non-negative integer. The
verbosity level can be set with the <option>-v</option>
option<bookex> (see <xref linkend="port-scanning-options-output"/>)</bookex>.
</para>
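<para>A script could use the two levels to decide how much detail to
report. The following is only a sketch; the helper name
<literal>detail_level</literal> and the thresholds are assumptions
chosen for illustration:</para>
<programlisting>
-- Hypothetical helper: pick how much detail a script result should carry.
local function detail_level()
  if nmap.debugging() > 0 then
    return "debug"
  elseif nmap.verbosity() > 1 then
    return "verbose"
  end
  return "normal"
end
</programlisting>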
</listitem>
</varlistentry>
<varlistentry>
<term>
<option>nmap.fetchfile(filename)</option>
<indexterm><primary>fetchfile</primary></indexterm>
</term>
<listitem>
<para>
Allows access to Nmap's data files. <literal>fetchfile()</literal>
searches for the specified file and returns a string containing
its path if it is found and readable (by the Nmap process). If the
file is not found, not readable, or is a directory,
<literal>nil</literal> is returned. The call
<programlisting>
nmap.fetchfile("nmap-rpc")
</programlisting>
will search for the data file <filename>nmap-rpc</filename> and,
assuming it's found (which it should be), return a location like
<filename>/usr/local/share/nmap/nmap-rpc</filename>.
</para>
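<para>Because <literal>fetchfile()</literal> returns only a path, the
script still has to open the file itself. A minimal sketch, assuming
the standard Lua <literal>io</literal> library is available to the
script (<literal>read_data_file</literal> is a hypothetical helper):</para>
<programlisting>
-- Hypothetical helper: return the contents of an Nmap data file,
-- or nil plus an error description.
local function read_data_file(name)
  local path = nmap.fetchfile(name)
  if not path then
    return nil, "data file not found: " .. name
  end
  local file, err = io.open(path, "r")
  if not file then
    return nil, err
  end
  local contents = file:read("*a")
  file:close()
  return contents
end
</programlisting>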
</listitem>
</varlistentry>
</variablelist>
</para>
</sect2>
<sect2 id="nse-api-portmethods">
<title>Target Information Retrieving by a Script</title>
<para>
Often the information passed to the script is not enough. Sometimes
a script might want to correct target information or set it in the
first place. The following API methods handle this.
</para>
<para>
<variablelist>
<varlistentry>
<term><option>nmap.get_port_state(host, port, protocol)</option>
<indexterm><primary>get_port_state</primary></indexterm></term>
<listitem>
<para>
The <literal>get_port_state()</literal> call takes a
host table, a port table and a protocol
(<literal>tcp</literal> or <literal>udp</literal>) and
returns a port table for the queried port. The host
and port table are similar in structure to the ones
passed to the rule and action functions. The host
table should have an IP address field. The port table
needs a port number and a protocol field. A call could
look like this:
<programlisting>
nmap.get_port_state({ip="127.0.0.1"}, {number=80, protocol="tcp"})
</programlisting>
You can of course reuse the host and port tables
passed to the port rule function. The purpose of this
call is to let a script's rule depend on more than
one port. For example, if the target host has port 22
open and is also running identd, you can
write a script which fires only when port 22 is
open and an identification server is listening on port
113. While it is possible to specify an IP address
different from the currently scanned target, the result
will only be correct if that address is in the currently
scanned group of hosts.
</para>
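<para>The port 22 and identd scenario could be sketched as the port
rule below. It uses the two-argument form from the call shown above.
The <literal>nil</literal> check is a defensive assumption for the
case where no information about port 113 is available:</para>
<programlisting>
-- Sketch: fire on 22/tcp only when identd (113/tcp) is open as well.
portrule = function(host, port)
  if port.number ~= 22 or port.protocol ~= "tcp" then
    return false
  end
  local auth = nmap.get_port_state(host, {number = 113, protocol = "tcp"})
  -- Assumption: auth may be nil if nothing is known about port 113.
  return auth ~= nil and auth.state == "open"
end
</programlisting>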
</listitem>
</varlistentry>
<varlistentry>
<term><option>nmap.set_port_state(host, port, state)</option>
<indexterm><primary>set_port_state</primary></indexterm></term>
<listitem>
<para>
The <literal>set_port_state()</literal> call takes a host table,
a port table, and a port state (<literal>open</literal> or
<literal>closed</literal>). With this method the final port
state can be changed. This is useful when Nmap detects a port as
<literal>open|filtered</literal> but the script successfully connects to it. In this
case the port state can be set to <literal>open</literal>. Note
that the <literal>port.state</literal> value that was passed
to the script's <literal>action</literal> function is not changed
by this call.
</para>
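<para>A minimal sketch of that use case follows. The helper name
<literal>confirm_open</literal> and the <literal>probe</literal>
argument (whatever request the service is expected to answer) are
hypothetical. The port is promoted only after a reply has actually
been received:</para>
<programlisting>
-- Hypothetical helper: promote an open|filtered port to open once the
-- service has answered a probe.
local function confirm_open(host, port, probe)
  local socket = nmap.new_socket()
  local status = socket:connect(host.ip, port.number, port.protocol)
  if status then status = socket:send(probe) end
  if status then status = socket:receive() end
  socket:close()
  if status and port.state == "open|filtered" then
    nmap.set_port_state(host, port, "open")
  end
  return status
end
</programlisting>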
</listitem>
</varlistentry>
<varlistentry>
<term><option>nmap.set_port_version(host, port, probestate)</option>
<indexterm><primary>set_port_version</primary></indexterm></term>
<listitem>
<para>
To provide a flexible extension to Nmap's version
detection, NSE scripts can set the version and service
variables of a port.
The method takes a host and a port
table as arguments. The third argument describes the
state in which the script completed. It is a string
which is one of:
<literal>hardmatched</literal>,
<literal>softmatched</literal>,
<literal>nomatch</literal>,
<literal>tcpwrapped</literal>, or
<literal>incomplete</literal>.
A hard match will almost always be used, as it means
that the script was able to determine the protocol.
You can pass in <literal>nomatch</literal> if the
script fails to match the target port, but it is
probably already set that way anyway. One of the
other states should only be used if you know exactly
what you are doing.</para>
<para>The host and port arguments to this function
should either be the tables passed to the
<literal>action</literal> method or they should have
the same structure. The version detection fields this
function looks at are <literal>name</literal>,
<literal>product</literal>,
<literal>version</literal>,
<literal>extrainfo</literal>,
<literal>hostname</literal>,
<literal>ostype</literal>,
<literal>devicetype</literal>, and
<literal>service_tunnel</literal>. All values in this
table are optional. It is possible to pass a table in
which all these values are set to
<literal>nil</literal> or not to set the values at
all.
</para>
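<para>A typical call might look like the sketch below, which fills in
a few fields of the <literal>port.version</literal> table passed to
the script and then reports a hard match. The helper name
<literal>report_service</literal> is hypothetical:</para>
<programlisting>
-- Hypothetical helper: record what the script learned about a service.
local function report_service(host, port, name, product, version)
  port.version.name = name
  port.version.product = product
  port.version.version = version
  -- "hardmatched": the script was able to determine the protocol.
  nmap.set_port_version(host, port, "hardmatched")
end
</programlisting>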
</listitem>
</varlistentry>
</variablelist>
</para>
</sect2>
<sect2 id="nse-aux-raw-packet">
<title>Various Utility Functions for Raw Packet Support</title>
<para>
NSE has support for sending raw ethernet frames and capturing
packets. The following two functions may be handy in this context:
</para>
<variablelist>
<varlistentry>
<term><option>nmap.clock_ms()</option>
<indexterm><primary>nmap.clock_ms()</primary></indexterm></term>
<listitem>
<para>
Returns a number representing the current time as milliseconds
since the start of the epoch (on most systems this is 01/01/1970).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>nmap.get_interface_link("interface_name")</option>
<indexterm><primary>nmap.get_interface_link(interface_name)</primary></indexterm></term>
<listitem>
<para>
For the provided dnet-style
<literal>interface_name</literal>,
<literal>nmap.get_interface_link()</literal> returns
the kind of link-level hardware the interface
belongs to. Return values are:
<literal>ethernet</literal>,
<literal>loopback</literal> or
<literal>p2p</literal>. If the provided
<literal>interface_name</literal> is not one of
those types, <literal>nil</literal> is returned.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="nse-api-networkio">
<title>Network I/O API</title>
<para>
To allow for efficient and parallelizable network I/O, NSE
provides an interface to Nsock, the Nmap socket library. The
smart callback mechanism Nsock uses is fully transparent to
NSE scripts. The main benefit of Nmap-NSE sockets is that they
never block on I/O operations, allowing many scripts to be run in parallel.
The I/O parallelism is fully transparent to authors of NSE scripts.
In NSE you can either program as if you were using a single
non-blocking socket or you can program as if your connection is
blocking. Seemingly blocking I/O calls still return once a
specified timeout has been exceeded. Two flavors of Network I/O are
supported:
</para>
<sect3 id="nse-api-networkio-connect">
<title>Connect-style network I/O</title>
<para>This part of the network API should be suitable for most
classical network uses: Users create a socket, connect it to a
remote address, send and receive data and close the socket again.
Everything up to the Transport layer (which is either TCP, UDP or
SSL) is handled by the library. The following socket API methods
are supported:
</para>
<para>
<variablelist>
<varlistentry>
<term><option>nmap.new_socket()</option>
<indexterm><primary>nmap.new_socket()</primary></indexterm></term>
<listitem>
<para>
The <literal>new_socket()</literal> Nmap call returns an
Nmap-NSE socket object which is the recommended method for network
I/O. It provides facilities to perform communication using the
UDP, TCP and SSL protocols in a uniform manner.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, error = socket_object:connect(hostid, port, [protocol])</option>
<indexterm><primary>connect</primary></indexterm></term>
<listitem>
<para>
The connect method of Nmap-NSE socket objects will put
the socket in a state ready for communication. It
takes as arguments a host descriptor (either an IP
address or a host name), a port number and optionally
a protocol. The protocol must be one of
<literal>"tcp"</literal>, <literal>"udp"</literal> or
<literal>"ssl"</literal>. By default the connect call
will attempt to open a TCP connection. On success the
returned value of status is
<literal>true</literal>. If the connection attempt has
failed, the error value contains a description of the
error condition stored as a string.
Those strings are
taken from the <function>gai_strerror()</function>
C function. They are (with the error code in parentheses):</para>
<itemizedlist>
<listitem>
<para><quote>Address family for hostname not supported</quote> (<literal>EAI_ADDRFAMILY</literal>)</para>
</listitem>
<listitem>
<para><quote>Temporary failure in name resolution</quote> (<literal>EAI_AGAIN</literal>)</para>
</listitem>
<listitem>
<para><quote>Bad value for ai_flags</quote> (<literal>EAI_BADFLAGS</literal>)</para>
</listitem>
<listitem>
<para><quote>Non-recoverable failure in name resolution</quote> (<literal>EAI_FAIL</literal>)</para>
</listitem>
<listitem>
<para><quote>ai_family not supported</quote> (<literal>EAI_FAMILY</literal>)</para>
</listitem>
<listitem>
<para><quote>Memory allocation failure</quote> (<literal>EAI_MEMORY</literal>)</para>
</listitem>
<listitem>
<para><quote>No address associated with hostname</quote> (<literal>EAI_NODATA</literal>)</para>
</listitem>
<listitem>
<para><quote>Name or service not known</quote> (<literal>EAI_NONAME</literal>)</para>
</listitem>
<listitem>
<para><quote>Servname not supported for ai_socktype</quote> (<literal>EAI_SERVICE</literal>)</para>
</listitem>
<listitem>
<para><quote>ai_socktype not supported</quote> (<literal>EAI_SOCKTYPE</literal>)</para>
</listitem>
<listitem>
<para><quote>System error</quote> (<literal>EAI_SYSTEM</literal>)</para>
</listitem>
</itemizedlist>
<para>In addition to these standard system error messages, there are two NSE-specific errors:</para>
<itemizedlist>
<listitem>
<para><quote>Sorry, you don't have OpenSSL.</quote> occurs
if <literal>ssl</literal> is passed as third argument, but Nmap was compiled
without OpenSSL support.</para>
</listitem>
<listitem>
<para><quote>invalid connection method</quote> occurs if
the third parameter is not one of <literal>tcp</literal>, <literal>udp</literal>, or <literal>ssl</literal>.</para>
</listitem>
</itemizedlist>
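<para>The following sketch shows one way to check the values returned
by <literal>connect()</literal>. The helper name
<literal>open_connection</literal> is hypothetical. Choosing
<literal>"ssl"</literal> based on
<literal>port.version.service_tunnel</literal> is an assumption made
for illustration:</para>
<programlisting>
-- Hypothetical helper: connect to the scanned port, using SSL when
-- version detection saw an SSL tunnel and Nmap was built with SSL.
local function open_connection(host, port)
  local protocol = "tcp"
  if port.version.service_tunnel == "ssl" and nmap.have_ssl() then
    protocol = "ssl"
  end
  local socket = nmap.new_socket()
  local status, err = socket:connect(host.ip, port.number, protocol)
  if not status then
    socket:close()
    return nil, err   -- err holds one of the strings listed above
  end
  return socket
end
</programlisting>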
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, error = socket_object:send(data)</option>
<indexterm><primary>send</primary></indexterm></term>
<listitem>
<para>
The send method sends the data contained in the
<literal>data</literal> string through an open
connection. On success the returned value of status is
<literal>true</literal>. If the send operation
has failed, the error value contains a description of
the error condition stored as a string. The error strings are:
<itemizedlist>
<listitem>
<para><quote>Trying to send through a closed socket</quote>&mdash;if there was no
call to socket_object:connect before the send operation.</para>
</listitem>
<listitem>
<para><quote>TIMEOUT</quote>&mdash;if the operation took longer than the
specified timeout for the socket.</para>
</listitem>
<listitem>
<para><quote>ERROR</quote>&mdash;if an error occurred inside the underlying
Nsock library.</para>
</listitem>
<listitem>
<para><quote>CANCELLED</quote>&mdash;if the operation was cancelled.</para>
</listitem>
<listitem>
<para><quote>KILL</quote>&mdash;if for example the script scan is aborted due
to a faulty script.</para>
</listitem>
<listitem>
<para><quote>EOF</quote>&mdash;if an EOF was read&mdash;will probably not occur
for a send operation.</para>
</listitem>
</itemizedlist>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, value = socket_object:receive()</option>
<indexterm><primary>receive</primary></indexterm></term>
<listitem>
<para>
The receive method does a non-blocking receive operation on
an open socket. On success the returned value of
<literal>status</literal> is
<literal>true</literal> and the received data is stored in
<literal>value</literal>. If receiving data has failed,
<literal>value</literal> contains a description of the error
condition stored as a string. A failure occurs for example
if receive is called on a closed socket. The receive call
returns to the NSE script all the data currently stored
in the receive buffer of the socket. Error conditions
are the same as for the send operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, value = socket_object:receive_lines(n)</option>
<indexterm><primary>receive_lines</primary></indexterm></term>
<listitem>
<para>
Tries to receive at least <replaceable>n</replaceable>
lines from an open connection. A line is a string
delimited with <literal><quote>\n</quote></literal> characters. If
it was not possible to receive at least
<replaceable>n</replaceable> lines before the operation times
out a TIMEOUT error occurs. On the other hand, if more
than <replaceable>n</replaceable> lines were received, all are
returned, not just <replaceable>n</replaceable>. On success
the returned value of <replaceable>status</replaceable> is
<literal>true</literal> and the received data is
stored in <replaceable>value</replaceable>. If the receive
operation has failed, <replaceable>value</replaceable> contains
a description of the error condition stored as a string.
Error conditions are the same as for the send operation.
</para>
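<para>Since <literal>receive_lines()</literal> may return more than
the requested number of lines, callers usually need to split the
result themselves. A minimal sketch, using a hypothetical helper
<literal>first_line</literal> that sends a request on an already
connected socket and keeps only the first line of the reply:</para>
<programlisting>
-- Hypothetical helper: send a request and return the first reply line.
local function first_line(socket, request)
  local status, err = socket:send(request)
  if not status then
    return nil, err
  end
  local data
  status, data = socket:receive_lines(1)
  if not status then
    return nil, data   -- on failure the second value is the error string
  end
  -- Keep everything up to (but not including) the first line ending.
  return data:match("^[^\r\n]*")
end
</programlisting>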
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, value = socket_object:receive_bytes(n)</option>
<indexterm><primary>receive_bytes</primary></indexterm></term>
<listitem>
<para>
Tries to receive at least <replaceable>n</replaceable>
bytes from an open connection. On success the returned
value of <replaceable>status</replaceable> is <literal>true</literal> and the
received data is stored in
<replaceable>value</replaceable>. If the operation fails,
<replaceable>value</replaceable> contains a description of the
error condition stored as a string. As with
<literal>receive_lines()</literal>,
<replaceable>n</replaceable> is the minimum number of
characters we would like to receive. If more arrive,
we get all of them. If fewer than <replaceable>n</replaceable> characters arrive
before the operation times out, a TIMEOUT error occurs.
Other error conditions are the same as for the send operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, value = socket_object:receive_buf(func/"string", keeppattern)</option>
<indexterm><primary>receive_buf</primary></indexterm></term>
<listitem>
<para>
<literal>receive_buf</literal> tries to circumvent several
limitations in the other <literal>receive*</literal> functions.
<literal>receive_lines(n)</literal>, for example, tries to ensure that
there are at least <replaceable>n</replaceable> lines received and returns everything it has
already read from the connection (even though there may be much more
data than requested). It also leaves line-parsing to the user.</para>
<para><literal>receive_buf</literal> on the other hand returns only the
part of the received data until the first match of a delimiter,
with the rest being saved inside a buffer for later calls to
<literal>receive_buf</literal>. This buffer gets cleared on calls to
other functions inside the Network I/O API. Should the data not
contain the delimiter another read request is sent and the buffer is
checked again when more data is present.</para>
<para><literal>receive_buf</literal> takes two arguments.
The first one is either a string or a function. If it is
a string it gets passed to Lua's <literal><ulink url="http://www.lua.org/manual/5.1/manual.html#5.4">string.find</ulink></literal> function as the (second) pattern
parameter, with the buffer data being searched. If it is a function
it is expected to take exactly one parameter (the buffer) and its
return values have to be like those of <literal>string.find</literal>
(i.e. offsets of the start and the end of the delimiter inside the
buffer, or <literal>nil</literal>, if the delimiter is not found).</para>
<para>The second argument is a boolean value which indicates whether the
delimiting pattern should be returned along with the received data or
discarded.</para>
<para>The nselib module
<literal>match.lua</literal> (<xref linkend="nse-lib-match"/>) provides
functions for matching received data against regular expressions or
for receiving a defined number of bytes. The return values of <literal>receive_buf</literal> behave exactly like those of
the other <literal>receive*</literal> functions. Two values are returned (<literal>status, value</literal>):
the first indicates whether the request was successful, the second
contains the returned data (or, in the case of a failure, an error message).</para>
<para>Possible error messages are those of the other
<literal>receive*</literal> functions and, in addition, the following:
<itemizedlist>
<listitem>
<para><quote>Error inside splitting-function</quote>&mdash;if the first argument was
a function which caused an error while being called.
</para>
</listitem>
<listitem>
<para><quote>Error in <literal>string.find</literal> (<literal>nsockobj:receive_buf</literal>)!</quote>&mdash;if a string
was provided as the first argument, and string.find() yielded an
error while being called.</para>
</listitem>
<listitem>
<para><quote>Expected either a function or a string!</quote>&mdash;if the
first argument was neither a function nor a string.</para>
</listitem>
<listitem>
<para><quote>Delimiter has negative size!</quote>&mdash;if the returned start offset
is greater than the end offset.</para>
</listitem>
</itemizedlist>
</para>
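<para>Both forms of the first argument can be sketched as follows.
The names <literal>blank_line</literal> and
<literal>read_header</literal> are hypothetical. The delimiter
function simply forwards the result of <literal>string.find</literal>,
which already has the required shape (start offset and end offset, or
<literal>nil</literal>):</para>
<programlisting>
-- String form: read up to the first CRLF and discard the delimiter.
--   status, line = socket:receive_buf("\r\n", false)

-- Function form: the delimiter is an empty line (LF or CRLF endings).
local function blank_line(buffer)
  return string.find(buffer, "\r?\n\r?\n")
end

-- Hypothetical helper: read a header block, keeping the delimiter.
local function read_header(socket)
  return socket:receive_buf(blank_line, true)
end
</programlisting>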
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, err = socket_object:close()</option>
<indexterm><primary>close</primary></indexterm></term>
<listitem>
<para>
Closes an open connection. On success the returned value of
<literal>status</literal> is <literal>true</literal>. If closing
fails, <literal>err</literal> contains a description
of the error condition stored as a string. Currently the only error
message is: <quote>Trying to close a closed socket</quote>, which is issued if the socket
has already been closed. Sockets are subject to garbage collection.
Should you forget to close a socket, it will get closed before it gets
deleted (on the next occasion Lua's garbage collector is run).
However since garbage collection cycles are difficult to predict, it
is considered good practice to close opened sockets.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>status,localip,localport,remoteip,remoteport=socket_object:get_info()</option>
<indexterm><primary>get_info</primary></indexterm></term>
<listitem>
<para>
This function returns information about the socket
object. It returns 5 values. If an error occurred, the
first value is <literal>nil</literal> and the second
value describes the error condition. Otherwise the
first value describes the success of the operation and
the remaining 4 values describe both endpoints of the
TCP connection. If you put the call in a <literal>try()</literal> statement
the status value is consumed. The call can be used for example if
you want to query an authentication server.
</para>
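<para>For the authentication server case, a script needs the local
and remote port of an established connection. The sketch below builds
a query in the form used by the identification protocol (the port on
the server side followed by the port on the client side). The helper
name <literal>ident_query</literal> is hypothetical:</para>
<programlisting>
-- Hypothetical helper: build an identd query for a connected socket.
local function ident_query(socket)
  local status, localip, localport, remoteip, remoteport = socket:get_info()
  if not status then
    return nil, localip   -- on failure the second value is the error
  end
  return remoteport .. ", " .. localport .. "\r\n"
end
</programlisting>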
</listitem>
</varlistentry>
<varlistentry>
<term><option>socket_object:set_timeout(t)</option>
<indexterm><primary>set_timeout</primary></indexterm></term>
<listitem>
<para>
Sets the time, in milliseconds, after which input and
output operations on a socket should time out and
return. The default value is 30,000 (30 seconds). The lowest
allowed value is 10ms, since this is
the granularity of NSE network I/O.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</sect3>
<sect3 id="nse-api-networkio-raw">
<title>Raw packet network I/O</title>
<para>For those cases where the connection oriented approach is too inflexible,
NSE provides script developers with a more powerful option:
raw packet network I/O. The greater flexibility comes, however, at
the cost of a slightly more complex API. Receiving raw packets is
accomplished via a wrapper around Libpcap inside
the Nsock library. In order to keep the
capturing efficient, it works in a three-tiered approach: opening a
device for capturing, registering listeners to it and receiving
packets. With each call to <literal>pcap_open()</literal> you have
to provide a callback function, which receives the packet (along with
its layer 2 and 3 headers) and is used to compute a so-called
packet hash. Each call to <literal>pcap_register()</literal> takes a
binary string as argument. For every packet captured the computed
hash is matched against all registered strings.
Those scripts for which the comparison yields true are then provided
with the packet as a return value to <literal>pcap_receive()</literal>.
The more general the packet hash computing function is kept,
the more scripts may receive the packet and proceed with their
execution. To use the packet capturing inside your script you have to
create (and afterwards close) a socket with
<literal>nmap.new_socket()</literal>
(or <literal>socket_object:close()</literal> respectively)&mdash;just
like with the connection-based network I/O. A more detailed description
of the functions for packet capturing follows:
</para>
<para>
<variablelist>
<varlistentry>
<term><option>socket_object:pcap_open(device, snaplen, promisc,
test_function, bpf)</option>
<indexterm><primary>pcap_open</primary></indexterm></term>
<listitem>
<para>
The <literal>pcap_open()</literal> call opens the socket for
packet capturing. The parameters are:</para>
<itemizedlist>
<listitem><para><literal>device</literal>&mdash;the dnet-style interface name of the device you want to capture from.</para></listitem>
<listitem><para><literal>snaplen</literal>&mdash;defines the length of each packet you want to capture (similar to the <option>-s</option> option to <command>tcpdump</command>)</para></listitem>
<listitem><para><literal>promisc</literal>&mdash;should be set to <literal>1</literal> if the interface should activate promiscuous mode, and zero otherwise.</para></listitem>
<listitem><para><literal>test_function</literal>&mdash;callback function used to compute the <literal>packet-hash</literal></para></listitem>
<listitem><para><literal>bpf</literal>&mdash;a string describing a Berkeley packet filter expression (like those provided to <command>tcpdump</command>)</para></listitem>
</itemizedlist>
</listitem>
</varlistentry>
<varlistentry>
<term><option>socket_object:pcap_register(packet-hash)</option>
<indexterm><primary>pcap_register</primary></indexterm></term>
<listitem>
<para>
Starts listening for incoming packets. The provided
<literal>packet-hash</literal> is a binary string which has to
match the hash returned by the
<literal>test_function</literal> parameter provided to
<literal>pcap_open()</literal>. If you want to receive all
packets, just provide the empty string (<literal>""</literal>).
There has to be a call to <literal>pcap_register()</literal>
before a call to <literal>pcap_receive()</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>status, packet_len, l2_data, l3_data = socket_object:pcap_receive()</option>
<indexterm><primary>pcap_receive</primary></indexterm></term>
<listitem>
<para>
Receives a captured packet. If successful, the return values are:</para>
<itemizedlist>
<listitem><para><literal>status</literal>&mdash;a boolean with the value <literal>true</literal>.</para></listitem>
<listitem><para><literal>packet_len</literal>&mdash;the length of the captured packet (note, that you could have received less data if the snaplen parameter was smaller than the packet length)</para></listitem>
<listitem><para><literal>l2_data</literal>&mdash;data from the second OSI layer (e.g. ethernet headers)</para></listitem>
<listitem><para><literal>l3_data</literal>&mdash;data from the third OSI layer (e.g. IPv4 headers).</para></listitem>
</itemizedlist>
<para>Should an error or timeout occur while waiting for a packet, the
return values are: <literal>nil,error_message,nil,nil</literal>, where
error_message describes the error.</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>socket_object:pcap_close()</option>
<indexterm><primary>pcap_close()</primary></indexterm></term>
<listitem>
<para>Closes the pcap device.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
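<para>The open, register, receive sequence could be sketched as shown
below. The argument list of the callback is an assumption: this sketch
supposes it is called with the packet length, the layer 2 data and the
layer 3 data, in that order. The names <literal>source_hash</literal>
and <literal>wait_for_packet</literal>, the snaplen of 128 and the
<literal>"ip"</literal> filter are likewise chosen only for
illustration:</para>
<programlisting>
-- Hypothetical callback: hash each packet by its IPv4 source address,
-- which occupies bytes 13 to 16 of the layer 3 data.
-- Assumed arguments: packet length, layer 2 data, layer 3 data.
local function source_hash(packet_len, l2_data, l3_data)
  return string.sub(l3_data, 13, 16)
end

-- Hypothetical helper: wait for one IP packet from a given source.
-- source_bytes is the 4-byte binary form of the source address.
local function wait_for_packet(device, source_bytes)
  local socket = nmap.new_socket()
  socket:pcap_open(device, 128, 0, source_hash, "ip")
  socket:pcap_register(source_bytes)
  local status, len, l2_data, l3_data = socket:pcap_receive()
  socket:pcap_close()
  if not status then
    return nil, len   -- on failure the second value is the error message
  end
  return l3_data
end
</programlisting>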
<para>
Receiving raw packets is a great feature, but it is only half
the job. Now for sending raw packets: to accomplish this, NSE has
access to a wrapper around the <literal>dnet</literal> library.
Currently NSE has the ability to send raw ethernet frames via the
following API:
</para>
<para>
<variablelist>
<varlistentry>
<term><option>dnet_object=nmap.new_dnet()</option>
<indexterm><primary>new_dnet()</primary></indexterm></term>
<listitem>
<para>
Creates and returns a new dnet_object, which can be used to
send raw packets.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>dnet_object:ethernet_open(interface_name)</option>
<indexterm><primary>ethernet_open</primary></indexterm></term>
<listitem>
<para>Opens the interface defined by the provided
<replaceable>interface_name</replaceable> for sending ethernet frames
through it. An error (<quote>device is not valid ethernet
interface</quote>) is thrown in case the provided argument
is not valid.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>dnet_object:ethernet_send(packet)</option>
<indexterm><primary>ethernet_send</primary></indexterm></term>
<listitem>
<para>
Sends the provided data as ethernet frame across the previously
opened interface. Note that you have to provide the packet
including IP header and ethernet header. If there was no
previous valid call to <literal>ethernet_open()</literal> an
error is thrown (<quote>dnet is not valid opened ethernet
interface</quote>).
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>dnet_object:ethernet_close()</option>
<indexterm><primary>ethernet_close</primary></indexterm></term>
<listitem>
<para>Closes the interface. The only error which may be thrown
is the same as for the <literal>ethernet_send()</literal>
operation.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
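<para>Put together, sending a single frame might look like the sketch
below. The helper name <literal>send_frame</literal> is hypothetical,
and <literal>frame</literal> is assumed to be a complete frame that the
caller has already built, ethernet header included. Since the dnet
methods throw errors rather than returning a status, the calls are
wrapped in Lua's <literal>pcall</literal>:</para>
<programlisting>
-- Hypothetical helper: send one prebuilt ethernet frame.
local function send_frame(interface_name, frame)
  if nmap.get_interface_link(interface_name) ~= "ethernet" then
    return false, interface_name .. " is not an ethernet interface"
  end
  local dnet = nmap.new_dnet()
  -- pcall catches the errors thrown by the dnet methods.
  return pcall(function()
    dnet:ethernet_open(interface_name)
    dnet:ethernet_send(frame)
    dnet:ethernet_close()
  end)
end
</programlisting>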
</sect3>
</sect2>
<sect2 id="nse-exceptions">
<title>Exception Handling</title>
<para>
NSE provides an exception handling mechanism not present in
the plain Lua language. The exception handling is tailored
specifically for network I/O operations. The mechanism
follows a functional programming paradigm rather than an
object oriented programming paradigm. To create an exception
handler the <literal>nmap.new_try()</literal> API method is
used. This method returns a function which takes the return
values of another function call as its arguments. If those
return values signal an error, then the script execution is aborted and no
output is produced. Optionally you can pass a function to
the <literal>new_try()</literal> method which will be called
if an exception is caught. In this function you can perform
required clean up operations.</para>
<para>
<xref linkend="nse-exception-handling"/> shows cleanup
exception handling at work. A new function named
<literal>catch</literal> is defined to simply close the
newly created socket in case of an error. It is then used
to protect connection and communication attempts on that
socket. If no catch function is specified, execution of the
script aborts without further ado&mdash;open sockets
will remain open. If the verbosity level is at least one
or if the scan is performed in debugging mode a description
of the uncaught error condition is printed on standard output.
Note that it is currently not easily possible to group several
statements in one try block. It is also important to remember
that if the socket is not closed it will occupy memory
until the next run of Lua's garbage collector.
</para>
<example id="nse-exception-handling">
<title>Exception handling example</title>
<programlisting>
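local result, socket, try, catch
result = ""
socket = nmap.new_socket()
-- Cleanup handler: close the socket if any protected call fails.
catch = function()
  socket:close()
end
try = nmap.new_try(catch)
try(socket:connect(host.ip, port.number))
-- Read at least one line from the service and echo it back.
result = try(socket:receive_lines(1))
try(socket:send(result))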
</programlisting>
</example>
<para>
Writing a function which is treated properly by the
try/catch mechanism is straightforward. The function should
return multiple values. The first value should be a boolean
which is <literal>true</literal> upon successful completion of the function and
<literal>false</literal> otherwise. If the function completed successfully the try
construct consumes the indicator value and returns the
remaining values. If the function failed then the second
returned value must be a string describing the error
condition. Note that if the value is not <literal>nil</literal> it is
treated as <literal>true</literal>, so you can return your
value in the normal case and return <literal>nil, <replaceable>error description</replaceable></literal>
if an error occurs.
</para>
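<para>
As a rough illustration of this convention, the following standalone
sketch defines a function that follows it. The names
<literal>parse_port</literal> and <literal>mock_new_try</literal> are
made up for this example. <literal>mock_new_try</literal> only imitates
the behavior described above in plain Lua (using
<literal>error()</literal> to stop execution) so that the sketch can run
outside Nmap; it is not how <literal>nmap.new_try()</literal> is
implemented.
</para>
<programlisting>
-- Hypothetical helper following the convention: on success it returns
-- true plus the result, on failure false plus an error description.
local function parse_port(text)
  local number = tonumber(text)
  if number == nil then
    return false, "not a number: " .. tostring(text)
  end
  return true, number
end

-- Plain-Lua imitation of the try mechanism (an assumption, for
-- illustration only): consume the status, pass on the remaining values,
-- and call the catch function before aborting on failure.
local function mock_new_try(catch)
  return function(status, ...)
    if not status then
      if catch then catch() end
      error((...), 0)
    end
    return ...
  end
end

local cleaned_up = false
local try = mock_new_try(function() cleaned_up = true end)

local port = try(parse_port("113"))    -- succeeds, port is the number 113
local ok, message = pcall(function()
  return try(parse_port("ident"))      -- fails, so the catch function runs
end)
</programlisting>
<para>
In the failing case the catch function has run and
<literal>message</literal> holds the error description that
<literal>parse_port</literal> returned as its second value.
</para>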
</sect2>
<sect2 id="nse-api-registry">
<title>The Registry<indexterm><primary>registry</primary></indexterm></title>
<para>
The registry is a normal Lua table. What is special about it
is that it is visible to all scripts and it retains its state
between script executions. Nmap does not scan every host
specified on the command line at the same time. Instead it puts them
in smaller groups, and the hosts of each group are scanned in parallel. The
registry is rebuilt for every group, so information stored
there survives only until NSE finishes processing the
current target group. This implies of course that the registry
is transient: it is not stored between Nmap executions. Every
script can read the registry and write to it. If a script is
running after another script, it can read some information in
the registry which was left by the first script. This feature
is particularly powerful in combination with the run level
concept. A script with a higher run level can rely on entries
left behind for it by scripts with lower run levels. Remember
however that the registry can be written by all scripts
equally, so choose the keys for your entries wisely. The
registry is stored in <literal>nmap.registry</literal>. The
behavior of the registry allows caching of already calculated
data. The cache can be seen by all scripts until the registry
is rebuilt with the next target group. <!-- If for example you have
compiled a regular expression, you can store the compiled
expression in the registry so that scripts which need the same
pattern do not have to recompile it. -->
</para>
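<para>
A minimal sketch of this caching pattern follows. Inside Nmap the table
would be <literal>nmap.registry</literal>; here a plain local table named
<literal>registry</literal> stands in for it so that the example runs on
its own. The key <literal>"example-script"</literal> and the function
<literal>expensive_lookup</literal> are hypothetical.
</para>
<programlisting>
-- Stand-in for nmap.registry so that the sketch runs outside Nmap.
local registry = {}

local calls = 0
-- Hypothetical expensive computation whose result is worth caching.
local function expensive_lookup(name)
  calls = calls + 1
  return string.upper(name)
end

-- Keep all entries under one key named after the script to avoid
-- clashing with keys written by other scripts.
local function cached_lookup(name)
  registry["example-script"] = registry["example-script"] or {}
  local cache = registry["example-script"]
  if cache[name] == nil then
    cache[name] = expensive_lookup(name)
  end
  return cache[name]
end

local first = cached_lookup("ident")
local second = cached_lookup("ident")   -- served from the cache
</programlisting>
<para>
Grouping all entries under a single key that is unique to the script is
one way to follow the advice about choosing keys wisely.
</para>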
</sect2>
</sect1>
<sect1 id="nse-tutorial">
<title>Script Writing Tutorial</title>
<para>
Suppose that you are convinced of the power of NSE. How do you
go about writing your own script? Let's say
that you want to extract information from an identification
server. Nmap used to have this functionality but it was removed
because of inconsistencies in the code base. Fortunately, the
protocol identd uses is pretty simple. Unfortunately, it is too
complicated to be expressible in Nmap's version detection
language. Let's look at how the identification protocol
works. First you connect to the identification server. Next you
send a query of the form <literal><replaceable>port-on-server</replaceable>,
<replaceable>port-on-client</replaceable></literal> terminated with a new line
character. The server should then respond with a string of the
form <literal><replaceable>port-on-server</replaceable>, <replaceable>port-on-client</replaceable>:<replaceable>response-type</replaceable>:<replaceable>address-information</replaceable></literal>. In case of an error the address
information is omitted. This description is sufficient for our
purposes; for more details refer to <ulink role="hidepdf" url="http://www.rfc-editor.org/rfc/rfc1413.txt">RFC 1413</ulink>. The protocol cannot be modeled in Nmap's version
detection language for two reasons. The first is that you need
to know both the local and the remote port of a
connection. Version detection does not provide this data. The
second, more severe obstacle, is that you need two open
connections to the target&mdash;one to the identification server and
one to the port you want to query. Both obstacles are easily
overcome with NSE. </para>
<para>
The anatomy of a script is described in <xref linkend="nse-scripts"/>.
In this section we will show how the described structure is utilized.
</para>
<sect2 id="nse-tutorial-head">
<title>The Head</title>
<para>
The head of the script is essentially its meta
information. This includes the fields
<literal>id</literal>, <literal>description</literal>,
<literal>author</literal>, <literal>license</literal> and
<literal>categories</literal>. We are not going to change the
run level for now. The <literal>id</literal> of a script
should uniquely identify it. If it is absent, the path to the
script will be used as an id. We recommend choosing an id
which concisely identifies the purpose of the script, since
the ID is printed before the script's results in Nmap output.
</para>
<para>
<programlisting>
id = "Service Owner"
</programlisting>
</para>
<para>
The description field should contain a sentence or two describing what the script does. If anything about the script results might confuse or mislead users, and you can't eliminate the issue by improving the script or results text, it should be documented in the <literal>description</literal> string.
</para>
<para>
<programlisting>
description = "Opens a connection to the scanned port, opens a connection to \
port 113, queries the owner of the service on the scanned port and prints it."
</programlisting>
</para>
<para>
Users must tell the Lua interpreter that the string
continues on the following line by ending the line with a
backslash (&lsquo;<literal>\</literal>&rsquo;). They must also decide what
categories the script belongs to. This script is a good
example of a script which cannot be categorized clearly. It is
<literal>safe</literal> because we are not using the service
for anything it was not intended for. On the other hand, it
is <literal>intrusive</literal> because we connect to a
service on the target and therefore potentially give out
information about us. To solve this dilemma we will place our
script in two categories:
</para>
<programlisting>
categories = {"safe", "intrusive"}
</programlisting>
</sect2>
<sect2 id="nse-tutorial-rule">
<title>The Rule</title>
<para>
The rule section is a Lua method which decides when the
script's action should be performed and when it should be
skipped. Usually this decision is based on the host and port
information passed to the rule function. In the case of the
identification script it is slightly more complicated than
that. To decide whether to run the identification script on a
given port we need to know if there is an identification
server running on the target machine. Or more formally: the
script should be run if (and only if) the currently scanned TCP port is open and
TCP port 113 is also open. For now we will rely on the fact that
identification servers listen on TCP port 113. Unfortunately NSE
only gives us information about the currently scanned port.
To find out if port 113 is open we are going to use the
<literal>nmap.get_port_state()</literal> method. If the identd
port was not scanned, the <literal>get_port_state</literal>
function returns <literal>nil</literal>. So we need to make
sure that the table is not <literal>nil</literal>. We also
check if both ports are in the <literal>open</literal> state.
If this is the case, the action is executed, otherwise we skip
the action.
</para>
<para>
<programlisting>
portrule = function(host, port)
local identd, decision
local ident_port = { number=113, protocol="tcp" }
identd = nmap.get_port_state(host, ident_port)
if
identd ~= nil and identd.state == "open" and port.state == "open"
then
decision = true
else
decision = false
end
return decision
end
</programlisting>
</para>
<para>
This rule is <emphasis>almost</emphasis> correct, but still
slightly buggy. Can you find the bug? It is a pretty subtle
one. The problem is that this script fires on any kind of open
port, TCP or UDP. The <literal>connect()</literal> method on
the other hand assumes a TCP protocol unless it is explicitly
told to use another protocol. Since the identification service
is only defined for TCP connections, we need to narrow down
the range of ports which fire our script. Our new rule only
runs the script if the port is open, we are looking at a TCP
port, and TCP port 113 is open. Writing the new and
improved port rule is left as an exercise to the reader (or
peek at the script in the latest Nmap distribution).
</para>
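<para>
For readers who would like to compare notes, one possible version of the
improved rule is sketched below. The small <literal>nmap</literal> table
at the top is only a stub with made-up port data so that the sketch runs
outside Nmap. In a real script it must be left out, and the rule shipped
with Nmap may differ from this one.
</para>
<programlisting>
-- Stub for running outside Nmap: pretends that TCP port 113 is open.
-- Remove this in a real script, where NSE provides the nmap table.
local nmap = {
  get_port_state = function(host, port)
    if port.number == 113 and port.protocol == "tcp" then
      return { number = 113, protocol = "tcp", state = "open" }
    end
    return nil
  end
}

portrule = function(host, port)
  local ident_port = { number = 113, protocol = "tcp" }
  local identd = nmap.get_port_state(host, ident_port)
  -- Fire only for open TCP ports on hosts whose TCP port 113 is open.
  return identd ~= nil
    and identd.state == "open"
    and port.protocol == "tcp"
    and port.state == "open"
end
</programlisting>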
</sect2>
<sect2 id="nse-tutorial-action">
<title>The Mechanism</title>
<para>
At last we implement the actual functionality. The script will
first connect to the port on which we expect to find the
identification server, then it will connect to the port we
want information about. Afterward we construct a query string
and parse the response. If we received a satisfactory
response, we return the retrieved information.
</para>
<para>
First we need to create two socket objects. These objects
represent the sockets we are going to use. By using object methods
like
<literal>connect()</literal>,
<literal>close()</literal>,
<literal>send()</literal> or
<literal>receive()</literal> we can operate on the network
socket. To avoid excessive error checking code we use NSE's
exception handling mechanism. We create a function which will
be executed if an error occurs and call this function
<literal>catch</literal>. Using this function we generate
a <literal>try</literal> function. The <literal>try</literal>
function will call the <literal>catch</literal> function
whenever there is an error condition in the tried block. Note
that we could have ignored the last two return values
of <literal>client_service:get_info()</literal> like this:
<programlisting>
local localip, localport = try(client_service:get_info())
</programlisting>
This would have sufficed because we know that the remote port is
stored in <literal>port.number</literal>.</para>
<para>In this example we
prefer not to tell the user if the query resulted in an
error. To inform users of failed
identification queries, simply uncomment the corresponding
line. It is necessary that we assign the variable <literal>owner</literal>
a <literal>nil</literal> value because returning <literal>nil</literal>
is the only way to tell the script engine to suppress script output.
</para>
<para>
<programlisting>
action = function(host, port)
local owner = ""
local client_ident = nmap.new_socket()
local client_service = nmap.new_socket()
local catch = function()
client_ident:close()
client_service:close()
end
local try = nmap.new_try(catch)
try(client_ident:connect(host.ip, 113))
try(client_service:connect(host.ip, port.number))
local localip, localport, remoteip,
remoteport = try(client_service:get_info())
local request = port.number .. ", " .. localport .. "\n"
try(client_ident:send(request))
owner = try(client_ident:receive_lines(1))
if string.match(owner, "ERROR") then
owner = nil
-- owner = "Service owner could not be determined: " .. owner
else
owner = string.match(owner, "USERID : .+ : (.+)\n", 1)
end
try(client_ident:close())
try(client_service:close())
return owner
end
</programlisting>
</para>
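<para>
The response handling in the action can be tried in isolation. The
sketch below wraps the two <literal>string.match</literal> calls from
the listing in a hypothetical helper,
<literal>parse_ident_response</literal>, and feeds it sample replies
written in the format described at the start of this section. The sample
strings are invented for illustration.
</para>
<programlisting>
-- Hypothetical helper: returns the owner name, or nil for error replies
-- and for replies that do not match the expected format.
local function parse_ident_response(response)
  if string.match(response, "ERROR") then
    return nil
  end
  return string.match(response, "USERID : .+ : (.+)\n")
end

local owner = parse_ident_response("22, 51234 : USERID : UNIX : root\n")
local failed = parse_ident_response("22, 51234 : ERROR : NO-USER\n")
</programlisting>
<para>
With these inputs <literal>owner</literal> is the string
<literal>root</literal> and <literal>failed</literal> is
<literal>nil</literal>, which is the value the action returns to
suppress output.
</para>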
</sect2>
</sect1>
<sect1 id="nse-vscan">
<title>Version Detection using NSE</title>
<para>
The version detection system built into Nmap was designed to
efficiently recognize the vast majority of protocols with a
simple pattern matching syntax. Some protocols require a more
complex approach, and a generalized scripting language is
perfect for this. Skype2 is one such protocol. It pretends to
be an http server, requiring multiple queries to determine its
true nature. NSE has been integrated into Nmap's version
detection framework to handle these cases. The scripts which
extend the version scanner belong to the reserved category
<literal>version</literal>. This category cannot be run from
the command line. It is only executed if the user has requested a
version scan. The following listing shows a simple script which
demonstrates the use of the NSE version detection API. If either
the TCP port 80 is open or the service has been determined to be
http, the script is triggered. Although it could be extended to
recognize different http servers, its only purpose is to show off
the version detection API. It is not advisable to use NSE for
version detection in the simple case of http servers. The
version detection variables have been filled with dummy entries
to illustrate their effect on the Nmap output.</para>
<para>
<programlisting>
description = "Demonstration of a version detection NSE script. It checks \
and reports the version of a remote web server. For real life purposes it is \
better to use Nmap version detection (-sV)."
author = "Diman Todorov &lt;diman.todorov@gmail.com&gt;"
license = "See Nmap's COPYING for license"
id = "HTTP version"
categories = {"version"}
runlevel = 1.0
portrule = function(host, port)
if (port.number == 80
or port.service == "http" )
and port.protocol == "tcp"
then
return true
else
return false
end
end
action = function(host, port)
local query = "GET / HTTP/1.1\r\n"
query = query .. "Accept: */*\r\n"
query = query .. "Accept-Language: en\r\n"
query = query .. "User-Agent: Nmap NSE\r\n"
query = query .. "Host: " .. host.ip .. ":" .. port.number .. "\r\n\r\n"
local socket = nmap.new_socket()
local catch = function()
socket:close()
end
local try = nmap.new_try(catch)
try(socket:connect(host.ip, port.number))
try(socket:send(query))
local response = ""
local lines
local status
local value
while true do
status, lines = socket:receive_lines(1)
if not status or value then
break
end
response = response .. lines
value = string.match(response, "Server: (.-)\n")
end
try(socket:close())
if value then
port.version.name = "[Name]"
port.version.name_confidence = 10
port.version.product = "[Product]"
port.version.version = "[Version]"
port.version.extrainfo = "[ExtraInfo]"
port.version.hostname = "[HostName]"
port.version.ostype = "[OSType]"
port.version.devicetype = "[DeviceType]"
port.version.service_tunnel = "none"
port.version.fingerprint = nil
nmap.set_port_version(host, port, "hardmatched")
end
end
</programlisting>
</para>
<para>
This is what the output of this script looks like:
<screen>
$ ./nmap -sV localhost -p 80
Starting Nmap ( http://insecure.org )
Interesting ports on localhost (127.0.0.1):
PORT STATE SERVICE VERSION
80/tcp open [Name] [Product] [Version] ([ExtraInfo])
Service Info: Host: [HostName]; OS: [OSType]; Device: [DeviceType]
Nmap finished: 1 IP address (1 host up) scanned in 9.317 seconds
</screen>
</para>
<para>
The name variable denotes the detected protocol name.
The product, version and extrainfo variables are used
to produce a human readable description of the server
version. The remaining variables provide information deduced
from the output of the server concerning the target host.
</para>
</sect1>
<sect1 id="nse-example-scripts">
<title>Example Scripts</title>
<para>
<remark>
This section should probably provide 2&ndash;3 scripts
which show a diverse and interesting set of NSE features. Each
script should probably have its own sect2 containing a brief
description of the script and anything noteworthy about it,
followed by the script itself with annotations (lineannotation
tag) as you can see, for example, at
<ulink url="http://nmap.org/vscan/vscan-technique-demo.html"/>.
</remark>
</para>
<para>
<remark>
DT: perhaps include an optional version field
</remark>
</para>
<sect2 id="nse-example-script-finger">
<title>Finger-Test Script</title>
<para>The finger script (<filename>finger.nse</filename>) is a perfect
example of how short typical NSE scripts are.
</para>
<para>First the information fields are filled out. Note that the
<literal>id</literal> field is kept short. This is important since it is
printed in Nmap's output. A detailed description of what the script
actually does should go in the <literal>description</literal> field.</para>
<programlisting>
id="Finger Results"
description="attempts to get a list of usernames via the finger service"
author = "Eddie Bell &lt;ejlbell@gmail.com&gt;"
license = "See nmaps COPYING for licence"
</programlisting>
<para>The <literal>categories</literal> field is a table
containing all the categories the script belongs to. These are used for
script selection through the <option>--script</option> option.</para>
<programlisting>
categories = {"discovery"}
</programlisting>
<para>You can use the facilities provided by the nselib (<xref
linkend="nse-library"/>) by <literal>requiring</literal> them. Here
we want to use shorter port rules.</para>
<programlisting>
require "shortport"
</programlisting>
<para>We want to check whether the service behind the port is finger,
or whether it runs on finger's well known port 79. This lets us
use the information gathered during the version scan (if finger runs
on a non-standard port), while still running against the port where we
expect it when version detection information is not available.</para>
<programlisting>
portrule = shortport.port_or_service(79, "finger")
action = function(host, port)
local socket = nmap.new_socket()
local results = ""
local status = true
</programlisting>
<para>The function <literal>err_catch()</literal> will be called for
clean up, through NSE's exception handling mechanism. Here it only
closes the previously opened socket (which should be enough in most
cases).</para>
<programlisting>
local err_catch = function()
socket:close()
end
</programlisting>
<para>The clean up function gets registered for exception handling via
a call to <literal>nmap.new_try()</literal></para>
<programlisting>
local try = nmap.new_try(err_catch)
</programlisting>
<para>The script sets a timeout of 5000 milliseconds, which is equivalent to 5
seconds. Should any operation require more time we'll receive a
<literal>TIMEOUT</literal> error message.</para>
<programlisting>
socket:set_timeout(5000)
</programlisting>
<para>To actually use exception handling we need to wrap calls to
functions which may return an error inside
<literal>try()</literal>.</para>
<programlisting>
try(socket:connect(host.ip, port.number, port.protocol))
try(socket:send("\n\r"))
</programlisting>
<para>The call to <literal>receive_lines()</literal> is not wrapped in
<literal>try()</literal>, because we don't want to abort the script
just because we didn't receive the data we expected. Note that if there
is less data than requested (100 lines), we still receive it and the
status is <literal>true</literal>. Subsequent calls would yield
a <literal>false</literal> status.</para>
<programlisting>
status, results = socket:receive_lines(100)
socket:close()
</programlisting>
<para>The script returns a string only if we got the data we
wanted. Otherwise <literal>nil</literal> is returned (implicitly, since
a Lua function that ends without a <literal>return</literal> statement yields no value).</para>
<programlisting>
if status then
return results
end
end
</programlisting>
</sect2>
<sect2 id="nse-example-script-owner">
<title>Service Owner Lookup via Identd</title>
<para><filename>showOwner.nse</filename> demonstrates the flexibility
of the NSE, which is unmatched by other parts of Nmap. If the target
is running an <literal>identd</literal> daemon it connects to it for
each running service and tries to identify its owner.
</para>
<programlisting>
id = "Service owner"
description = "Opens a connection to the scanned port, opens a connection to \
port 113, queries the owner of the service on the scanned port and prints it."
author = "Diman Todorov &lt;diman.todorov@gmail.com&gt;"
license = "See nmaps COPYING for licence"
categories = {"safe"}
</programlisting>
<para>Portrules are not restricted to those provided by the
shortport module (<xref linkend="nse-lib-shortport"/>).
They can be any function taking a host table and a port table as arguments and
returning a boolean.
</para>
<programlisting>
portrule = function(host, port)
local identd, decision
</programlisting>
<para>In order to determine the state of a port which is not provided
as an argument, we just have to construct a table describing the port
(i.e. its number and the protocol it's using) and pass it to
<literal>nmap.get_port_state()</literal>, which returns a table filled
with the information Nmap has about the port.</para>
<programlisting>
local auth_port = { number=113, protocol="tcp" }
identd = nmap.get_port_state(host, auth_port)
if
identd ~= nil
and identd.state == "open"
then
decision = true
else
decision = false
end
return decision
end
action = function(host, port)
local owner = ""
</programlisting>
<para>Scripts can open any number of connections they want.</para>
<programlisting>
local client_ident = nmap.new_socket()
local client_service = nmap.new_socket()
local catch = function()
client_ident:close()
client_service:close()
end
local try = nmap.new_try(catch)
try(client_ident:connect(host.ip, 113))
try(client_service:connect(host.ip, port.number))
local localip,localport,remoteip,remoteport = try(client_service:get_info())
local request = port.number .. ", " .. localport .. "\n"
try(client_ident:send(request))
owner = try(client_ident:receive_lines(1))
if string.match(owner, "ERROR") then
owner = nil
else
owner = string.match(owner, "USERID : .+ : (.+)\n", 1)
end
try(client_ident:close())
try(client_service:close())
return owner
end
</programlisting>
</sect2>
</sect1>
<sect1 id="nse-implementation">
<title>Implementation</title>
<para>
<remark>
We don't need a dozen pages of low-level trivial
details, but it would be nice to have a few sections
describing notable aspects of the NSE implementation (maybe
things like how the parallelization algorithms work, how Lua
is embedded, performance related notes. Information which
might help script writers is particularly desirable. I tend
to think reasons for choosing Lua may be better suited to
<xref linkend="nse-lua"/>, but it could be placed here
instead.
</remark>
</para>
<para>
Now how does all this work? The following section describes
some interesting aspects of the NSE. While the focus primarily lies on
giving script writers a better feeling of what happens with scripts, it
should also provide a starting point for understanding (and extending) the
NSE sources.
</para>
<sect2 id="nse-implementation-init">
<title>Initialization Phase</title>
<para>
During its initialization stage, Nmap loads the Lua interpreter along with the
libraries it provides. These libraries are:</para>
<itemizedlist>
<listitem>
<para>The <emphasis>package</emphasis> library (namespace:
<literal>package</literal>)&mdash;Lua's
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.3">package-lib</ulink> provides (among other things) the <literal>require</literal> function, used to load modules from the
nselib.
</para>
</listitem>
<listitem>
<para>The <emphasis>table</emphasis> library (namespace:
<literal>table</literal>)&mdash;The
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.5">table manipulation library</ulink> contains many functions used
to operate on <literal>tables</literal>&mdash;Lua's central data
structure.
</para>
</listitem>
<listitem>
<para>The <emphasis>I/O</emphasis> library (namespace:
<literal>io</literal>)&mdash;The
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.7">Input/Output library</ulink> offers functions such as reading files and reading the output from programs you execute.
</para>
</listitem>
<listitem>
<para>The <emphasis>OS</emphasis> library (namespace:
<literal>os</literal>)&mdash;The
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.8">Operating System lib</ulink> provides facilities of the operating
system, including filesystem operations (renaming/removing files,
creating of temporary filenames) and access to the environment.
</para>
</listitem>
<listitem>
<para>The <emphasis>string</emphasis> library (namespace:
<literal>string</literal>)&mdash;The
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.4">
string library </ulink> helps you with functions used to manipulate
strings inside Lua. Functions include: printf-style
string formatting, pattern matching using Lua-style patterns,
substring extraction, etc.
</para>
</listitem>
<listitem>
<para>The <emphasis>math</emphasis> library (namespace:
<literal>math</literal>)&mdash;Since usually numbers in Lua correspond
to the <literal>double</literal> C-type, the
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.6">math library</ulink> gives you access to rounding functions,
trigonometric functions, random number generation, and many more.
</para>
</listitem>
<listitem>
<para>The <emphasis>debug</emphasis> library (namespace:
<literal>debug</literal>)&mdash;The
<ulink url="http://www.lua.org/manual/5.1/manual.html#5.9">debug library</ulink> provides you with a somewhat lower level API
to the Lua-interpreter. Through it you can access functions along
the execution stack, get function closures and object metatables,
etc.
</para>
</listitem>
</itemizedlist>
<para>In addition to loading the libraries provided with Lua, the functions
in the <literal>nmap</literal> namespace also get loaded, and the search
path for modules is set to the default one prepended by the nselib
directory (which is searched for in the locations where Nmap looks for its
data files and scripts). In this step the provided script arguments
also get stored inside the <literal>registry</literal>.</para>
<para>
The next phase of NSE initialization is loading the chosen
scripts, which are the arguments provided to the
<option>--script</option> option or <literal>safe,intrusive</literal>, in
case of a default script scan. The string <literal>version</literal>
is appended, if version detection was enabled.
Each of these arguments is first
interpreted as a script category. This is done via a short Lua function
hard-coded into <filename>nse_init.cc</filename> called <literal>Entry</literal>. If you take a look into the <filename>script.db</filename> you'll see that the <literal>Entry</literal> lines inside
it are Lua function calls with a table as argument.
Arguments that do not produce any filenames are then interpreted
as file or directory names themselves. If this also fails, the script scan is aborted.</para>
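<para>
To make the category lookup more concrete, here is a small standalone
sketch of the idea. It is not the code from
<filename>nse_init.cc</filename>. The sample database lines, the field
names <literal>category</literal> and <literal>filename</literal>, and
the helper <literal>files_for_category</literal> are assumptions made
for illustration.
</para>
<programlisting>
-- Sample database contents; real script.db entries may look different.
local db_text = [[
Entry{ category = "safe", filename = "showOwner.nse" }
Entry{ category = "intrusive", filename = "showOwner.nse" }
Entry{ category = "discovery", filename = "finger.nse" }
]]

-- Hypothetical helper: run the database as Lua code with an Entry
-- function that collects the filenames of one category.
local function files_for_category(wanted)
  local found = {}
  Entry = function(entry)      -- global, so that the loaded chunk can call it
    if entry.category == wanted then
      found[#found + 1] = entry.filename
    end
  end
  -- loadstring exists in Lua 5.1, load accepts strings in later versions.
  local chunk = assert((loadstring or load)(db_text))
  chunk()
  Entry = nil
  return found
end

local safe_files = files_for_category("safe")
local none = files_for_category("no-such-category")
</programlisting>
<para>
An argument that matches no category yields an empty list, which
corresponds to the case where the argument is then tried as a file or
directory name.
</para>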
<para>
In the next stage the found files are loaded as chunks, each with
its own environment, having read but not write access to the global
name space and saved inside two globally accessible Lua tables:
<literal>hosttests</literal> and <literal>porttests</literal>
depending on the type of script. Because scripts only get loaded once, values stored inside variables during a script's execution against one host or port can be accessed when the same script runs against another target. This can be used to save computation time when a script is run
against multiple targets. See <xref linkend="nse-example-persistent-locals"/>.
During this stage scripts are
also provided with a default <literal>runlevel</literal> (1.0) if they
don't specify one themselves, and a check is performed whether they
contain an <literal>action</literal> and a <literal>description</literal> field.
</para>
<example id="nse-example-persistent-locals">
<title>Using local variables to save data.</title>
<programlisting>
id="persistent locals example"
description="This sample script shows how data can be stored across \
several invocations of a script against multiple targets"
author="Stoiko Ivanov"
categories = {"safe"}
require "shortport"
portrule = shortport.portnumber(80)
-- we have to declare the variable in the script's global scope
-- because if we declare it inside the action it would get redefined
-- with each call to the action
local filecontent = nil
require "strbuf"
action= function(host, port)
if(filecontent == nil) then
filecontent = strbuf.new()
for line in io.lines("a_filename_we_want_to_read_from") do
filecontent = filecontent .. line
end
end
--rest of the script doing something with the filecontent, we just
--read
end
</programlisting>
</example>
</sect2>
<sect2 id="nse-implementation-match">
<title>Matching of Scripts to Targets</title>
<para>
After the initialization is finished the <literal>hostrules</literal> and
<literal>portrules</literal> are evaluated for each host in the current
target group. During this check a list is built which contains the combinations of scripts and the targets they will run against.
It should be noted that the rules of all chosen scripts are
checked against all hosts and their <literal>open</literal> and <literal>open|filtered</literal> ports.
Therefore it is advisable to keep the rules as simple as possible and
to do all the computation inside the <literal>action</literal>, since the action is only
executed for targets which the rule has matched. After the check, each script-target combination gets its own <ulink url="http://www.lua.org/manual/5.1/manual.html#2.11">Lua-thread</ulink> which is anchored in Lua's C-API <ulink url="http://www.lua.org/manual/5.1/manual.html#3.5">registry</ulink> to prevent its garbage collection. These <literal>thread_records</literal> are afterwards sorted by run level, and all script-target combinations of one run level are stored in one list, in order to ensure that scripts with a higher run level are run after those with a lower one.</para>
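<para>
The grouping by run level can be pictured with a short sketch. The
record layout below (tables with <literal>script</literal>,
<literal>target</literal> and <literal>runlevel</literal> fields) is
invented for this example and does not mirror NSE's internal data
structures.
</para>
<programlisting>
-- Invented script-target combinations for illustration.
local thread_records = {
  { script = "showOwner", target = "10.0.0.1:22", runlevel = 2 },
  { script = "finger",    target = "10.0.0.1:79", runlevel = 1 },
  { script = "showOwner", target = "10.0.0.1:80", runlevel = 2 },
}

-- Collect the records of each run level in their own list.
local by_level, levels = {}, {}
for _, record in ipairs(thread_records) do
  if by_level[record.runlevel] == nil then
    by_level[record.runlevel] = {}
    levels[#levels + 1] = record.runlevel
  end
  table.insert(by_level[record.runlevel], record)
end

-- Lower run levels come first; table.sort orders numbers ascending.
table.sort(levels)

-- Walk the lists level by level to obtain the execution order.
local order = {}
for _, level in ipairs(levels) do
  for _, record in ipairs(by_level[level]) do
    order[#order + 1] = record.script .. "@" .. record.target
  end
end
</programlisting>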
</sect2>
<sect2 id="nse-implementation-run">
<title>Running Scripts</title>
<para>
Now to the actual script scanning, and the way NSE accomplishes
parallelization. Lua, through its concept of
<ulink url="http://www.lua.org/manual/5.1/manual.html#2.11">coroutines
</ulink> offers collaborative multi-threading, which means that scripts
can suspend themselves, at defined points, and let other coroutines
execute. Since network I/O, especially waiting for responses from
remote hosts, is the part of a script which consumes the most time,
this is the point where scripts suspend themselves and let
others execute. Each call to some of the functions of the Nsock wrapper
causes the calling script to yield (pause). Once the request is processed by the Nsock library, the
callback causes the script to be pushed from the waiting queue to the
running queue, which will eventually let it resume its operation.
</para>
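<para>
The following standalone sketch illustrates the idea with plain Lua
coroutines. It is a simplified model, not NSE's scheduler: the function
<literal>fake_script</literal>, the queue handling and the simulated
network wait (a bare <literal>coroutine.yield()</literal>) are
assumptions made for illustration.
</para>
<programlisting>
local log = {}

-- Stand-in for a script: every yield marks a point where a real script
-- would be waiting for network I/O.
local function fake_script(name, requests)
  return coroutine.create(function()
    for i = 1, requests do
      log[#log + 1] = name .. " sends request " .. i
      coroutine.yield()        -- pause until the "response" arrives
    end
    log[#log + 1] = name .. " done"
  end)
end

local running = { fake_script("A", 2), fake_script("B", 1) }

-- Resume every runnable script in turn. A script that yielded is queued
-- again (here every wait is treated as finished by the next round);
-- finished scripts are dropped.
while #running ~= 0 do
  local next_round = {}
  for _, thread in ipairs(running) do
    assert(coroutine.resume(thread))
    if coroutine.status(thread) == "suspended" then
      next_round[#next_round + 1] = thread
    end
  end
  running = next_round
end
</programlisting>
<para>
The resulting log interleaves the two stand-in scripts: B finishes
while A is still waiting for its second response.
</para>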
</sect2>
<sect2 id="nse-implementation-c-modules">
<title>Adding C Modules to Nselib</title>
<para>
This section gives a short walk-through of adding
nselib modules written in C (or C++) to Nmap's build system, since
this has proven to be tedious at times. Writing C modules is
described at length in <ulink
url="http://www.amazon.com/exec/obidos/ASIN/8590379825/secbks-20">Programming
in Lua, Second Edition</ulink>. Basically C modules consist of the
functions they provide to Lua, which have to be of type <ulink url="http://www.lua.org/manual/5.1/manual.html#lua_CFunction">lua_CFunction</ulink>. Additionally they have to contain a function
which is used to actually open the module. By convention these function names are <literal>luaopen_<replaceable>modulename</replaceable></literal>.
A good starting point for writing such modules is provided with
<filename>bit.c</filename> and <filename>pcre.c</filename> inside
the <filename>nselib/</filename> subdirectory of Nmap's source tree,
which are two C modules already provided by the nselib. C modules
basically are shared libraries which get loaded at runtime by Lua.
</para>
<para>
The Unix build system uses <literal>libtool</literal> for
compilation in a platform independent way.
As long as the new module
does not depend on foreign libraries, you should only need to add
<literal><replaceable>modulename</replaceable>.so</literal> to the
<literal>all</literal> and <literal>clean</literal> targets in
<filename>Makefile.in</filename>
and copy and adapt the lines from <filename>bit.so</filename>.
If your module does have dependencies you will most probably have to
add checks for those libraries to <filename>configure.ac</filename>
and put the dependencies inside the <literal>libtool</literal>
commands in <filename>Makefile.in</filename>&mdash;here you should
take a look at how <literal>pcre.so</literal> handles this.
So much for the way it should work. Now for some pitfalls we've
come across so far: These are results from building shared libraries in
general and not really specific to nselib. Linking with
static libraries (e.g. <literal>libnbase</literal>) sometimes leads
to problems with exporting symbols on some platforms (in our case
this was an x86_64-linux platform). To our knowledge no such
problems occur when linking against already existing shared
libraries.
</para>
<para>
The Windows build system requires C module developers to create a
MS Visual Studio Project file for their module
(<filename>&lt;modulename&gt;.vcproj</filename>) inside the
<filename>nselib</filename> subdirectory. On Windows you have to
include the <filename>liblua/</filename> subdirectory as
an additional include path as well as a library search path. In addition
you have to tell the project to link against the
<filename>liblua.lib</filename> static library provided with Nmap.
Other properties of the project should be the same as for other
nselib C modules (e.g. see <filename>nse_bitlib.vcproj</filename>).
Afterwards you have to include the newly created project file in
Nmap's Visual Studio solution file
(<filename>mswin32\nmap.sln</filename>) and make sure that
<filename>nse_bitlib.vcproj</filename> depends on your project,
because that is where nselib modules get copied to their final destinations and are included in Nmap.
</para>
</sect2>
</sect1>
<sect1 id="nse-license">
<title>NSE Script License and Community Contributions</title>
<para>
<remark>Fyodor is working on this. The general idea is for scripts to be contributed
and distributed under the same license as Nmap, as described
in <ulink url="http://seclists.org/nmap-dev/2006/q3/0156.html">this
nmap-dev post</ulink>. We certainly welcome script contributions!</remark>
</para>
</sect1>