Nmap Development mailing list archives

Re: [RFC] Some NSE optimizations


From: Daniel Miller <bonsaiviking () gmail com>
Date: Wed, 11 Jul 2012 13:49:31 -0500

Hi List,

Just a final update on my NSE optimization work. Cleaned-up patch attached. Overall, this doesn't have a huge impact (about 5% speedup on scans of my /24 with 18 hosts on a Pentium M at 1.73 GHz), but it was a fun project. Details on the new changes:

4. datafiles.lua: get_array(). In the existing implementation, the get_assoc_array function loops over each line in the data files (16000+ lines for nmap-mac-prefixes), performing 2 string.match calls with essentially the same pattern. The get_array function, by contrast, is called when a single string (pattern) instead of a key-value pair is passed in the data_struct to parse_lines. By allowing patterns with 2 captures, or a function that returns 2 values, all the common files can be parsed with get_array() using just one string.match call per line, resulting in a 1.5x speedup. For both get_assoc_array and get_array, I moved some processing out of the inner loop and commented out an assertion that doesn't really matter (since you can't read anything but strings from a file). The savings don't scale very far, since most results are cached, but it does save a little.
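
To illustrate the idea in item 4, here is a minimal sketch in Python rather than Lua. It is not the code from the patch: the function names, the regular expressions, and the sample lines (written in the style of nmap-mac-prefixes) are all assumptions made for the example. It only shows how one pattern with 2 captures can replace two nearly identical matches per line.

```python
import re

# Hypothetical sample lines in the style of nmap-mac-prefixes
# (6 hex digits, whitespace, vendor name); not copied from the real file.
SAMPLE = [
    "# comment line",
    "000000 Xerox",
    "00000C Cisco Systems",
    "",
]

# Two patterns that differ only in which part they capture.
KEY_RE = re.compile(r"^([0-9A-Fa-f]{6})\s")
VALUE_RE = re.compile(r"^[0-9A-Fa-f]{6}\s+(.*)$")

# One pattern with two captures: key and value in a single match.
PAIR_RE = re.compile(r"^([0-9A-Fa-f]{6})\s+(.*)$")


def parse_two_matches(lines):
    """Build a key -> value table using two matches per line."""
    result = {}
    for line in lines:
        k = KEY_RE.match(line)
        v = VALUE_RE.match(line)
        if k and v:
            result[k.group(1)] = v.group(1)
    return result


def parse_one_match(lines):
    """Build the same table using one match with two captures per line."""
    result = {}
    for line in lines:
        m = PAIR_RE.match(line)
        if m:
            key, value = m.groups()
            result[key] = value
    return result
```

Both functions build the same table from the sample lines; the second simply does half as many pattern matches per line, which is where a saving of this kind would come from.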

Once again, I'm looking for a sanity check on this. If no one thinks it's worthwhile, I won't bother to commit, but I'd like some more testing before I do.

Dan

On 07/10/2012 05:36 PM, Daniel Miller wrote:
List,

"Premature optimization is the root of all evil," I know, but I thought I'd throw NSE at a Lua profiler [1] and see if I could squeeze some more speed out of it. Granted, too, that our bottleneck is the network, but the changes I propose are fairly minor. The times below include profiling overhead, which inflates them artificially, and I use a slow computer to start with. Patch attached.

1. nse_main.lua: tcopy(). This function gets called recursively on the host and port tables for each hostrule, portrule, and associated action function. In my simple scan of one host, I saw 1402 calls with a total time of 1.9 seconds. I implemented it in C++ using lua_rawset, and saw a roughly 10x speedup (494 Lua function calls, since the recursion is now done in C++).

2. http.lua: skip_lws(). This function is called for each line in an HTTP header. In my default scan of one HTTP service, it was called 712 times. I sped it up by using pattern repetition (* and +) instead of looping over repeated calls to string.match. This led to a 1.7x speedup, or 0.2 seconds on my 1-port scan.

3. http.lua: parse_header(). This function is called for each HTTP request (16 in my default scan), but took an astonishing 0.25 seconds per call. It got a small boost from skip_lws, which it calls repeatedly, but the primary speedup came from using a single pattern match for non-whitespace instead of an explicit loop with 2 string.match calls looking for whitespace on each character. This resulted in a 4.7x speedup, or about 3 seconds per HTTP service.
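
The change described in items 2 and 3 can be sketched as follows. This is Python rather than Lua, and it is not the code from the patch: the function names are hypothetical, whitespace is simplified to spaces and tabs, and the real parse_header handles more than this. The sketch only contrasts a per-character loop with a single repeated pattern.

```python
import re

WS_CHAR = re.compile(r"[ \t]")            # exactly one whitespace character
WS_RUN = re.compile(r"[ \t]*")            # zero or more, matched in one call
NON_WS_RUN = re.compile(r"[^ \t\r\n]+")   # one or more non-whitespace characters


def skip_ws_loop(s, pos):
    """Advance past whitespace with one pattern match per character."""
    while WS_CHAR.match(s, pos):
        pos += 1
    return pos


def skip_ws_single(s, pos):
    """Advance past whitespace with a single repeated-pattern match."""
    return WS_RUN.match(s, pos).end()


def read_word_loop(s, pos):
    """Collect non-whitespace characters by testing them one at a time."""
    start = pos
    while pos < len(s) and s[pos] not in " \t\r\n":
        pos += 1
    return s[start:pos], pos


def read_word_single(s, pos):
    """Collect the same characters with one pattern match."""
    m = NON_WS_RUN.match(s, pos)
    if m is None:
        return "", pos
    return m.group(0), m.end()
```

For a header line such as "Content-Type:   text/html; charset=utf-8", both versions of each function return the same position and word; the single-pattern versions just hand the scanning to the pattern engine instead of an explicit loop.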

I'll be running a more comprehensive scan tonight to look for other candidates, but these were by far the standouts in terms of percentage of total running time (18%, 4.8%, and 37%, respectively). Please let me know if I introduced bugs!

Dan

[1] I used LuaProfiler from http://luaprofiler.luaforge.net/, but had to modify it to work with Lua 5.2 (https://github.com/bonsaiviking/luaprofiler)


Attachment: optimize2.patch

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
