These codepoint references are now left intact. If necessary, it would be
a trivial change to replace them with corresponding UTF sequences.
Note that the previous code was decoding the character references recursively,
which was probably not the intent.
In most cases (e.g. any of the nmap.socket operations), functions can
take full host and port tables instead of just host.ip and port.number.
This makes for cleaner-looking code and easier extensibility if we
decide to check for a protocol on both TCP and UDP, for instance.
Using this regular expression, '\(\w*\)\s*=\s*\1\s*\.\.', found and
replaced many string concatenation-reassignments. These can cause
performance issues, since a new string gets allocated for each
reassignment. In many cases, the replacement is simply a single string,
wrapped across lines with the '\z' escape, which consumes a newline and
whitespace following it. In other cases, a table is used to hold the
substrings until the final string is built with a single table.concat
operation (same technique used in stdnse.strbuf).
Also, some string-building loops of this form:
s = ""
for i = 1, 100, 1 do
s = s .. "\0"
end
were replaced with this much faster and cleaner version:
s = string.rep("\0", 100)
parsing mostly. Response parsing is centralized, and fewer operations
are done on raw HTTP data.
The biggest user-visible change is that http.request goes back to
returning a parsed result table, not raw HTTP data. I believe this is
how the function worked in the past; it's what the NSEDoc for the
function says. The only thing that used http.request was citrixxml.lua,
and this commit alters it to match the new expectations.
The other change is that the http.pipeline function no longer accepts
the "raw" option. The only script that used that was sql-injection.nse,
and this commit modifies that script as well.