Nmap Development mailing list archives
Re: [RFC] Some NSE optimizations
From: Daniel Miller <bonsaiviking () gmail com>
Date: Wed, 11 Jul 2012 13:49:31 -0500
Hi List,

Just a final update on my NSE optimization work. A cleaned-up patch is attached. Overall, this doesn't have a huge impact (about a 5% speedup on scans of my /24 with 18 hosts, on a Pentium M at 1.73 GHz), but it was a fun project. Details on the new changes:
4. datafiles.lua: get_array(). In the existing implementation, the get_assoc_array function loops over each line in the datafiles (16000+ lines for nmap-mac-prefixes), performing 2 string.match calls with essentially the same pattern. get_array is called when a single string (pattern) instead of a key-value pair is passed in the data_struct to parse_lines. By allowing patterns with 2 captures or a function that returns 2 values, all the common files can be parsed with get_array() using just one string.match call, resulting in a 1.5x speedup. For both get_assoc_array and get_array, I moved some processing out of the inner loop, and commented out an assertion that doesn't really matter (since you can't read anything but strings from a file). This doesn't scale very far, since most results are cached, but it does save a little.
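
To make the single-match idea concrete, here is a minimal sketch. It is not the code from the patch: the sample lines, the patterns, and the function names parse_two_matches and parse_one_match are hypothetical stand-ins for the real datafiles.lua logic, chosen only to show how one pattern with two captures can replace two near-identical string.match calls per line.

```lua
-- Hypothetical sample lines in the style of nmap-mac-prefixes ("PREFIX Vendor").
local lines = {
  "# comment line",
  "000000 Xerox",
  "00000C Cisco Systems",
}

-- Two-call style: one string.match per field, with nearly the same pattern.
local function parse_two_matches(input)
  local result = {}
  for _, line in ipairs(input) do
    local key = line:match("^(%x+)%s")
    local value = line:match("^%x+%s+(.*)$")
    if key and value then
      result[key] = value
    end
  end
  return result
end

-- One-call style: a single pattern with two captures yields key and value.
local function parse_one_match(input)
  local result = {}
  for _, line in ipairs(input) do
    local key, value = line:match("^(%x+)%s+(.*)$")
    if key then
      result[key] = value
    end
  end
  return result
end

local a = parse_two_matches(lines)
local b = parse_one_match(lines)
```

Both functions build the same table for these sample lines; the second simply halves the number of pattern matches in the inner loop.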
Once again, I'm looking for a sanity check on this. If no one thinks it's worthwhile, I won't bother to commit, but I'd like some more testing before I do.
Dan

On 07/10/2012 05:36 PM, Daniel Miller wrote:
List,

"Premature optimization is the root of all evil," I know, but I thought I'd throw NSE at a Lua profiler [1] and see if I could squeeze some more speed out of it. Granted also that our bottleneck is the network, but the changes I propose are fairly minor. Times below include profiling, which inflates them artificially, and I use a slow computer to start with. Patch attached.

1. nse_main.lua: tcopy(). This function gets called recursively on the host and port tables for each hostrule, portrule, and associated action functions. In my simple scan of one host, I saw 1402 calls with a total time of 1.9 seconds. I implemented it in C++ using lua_rawset, and saw a roughly 10x speedup (494 Lua function calls, since the recursion is done in C++).

2. http.lua: skip_lws(). This function is called for each line in an HTTP header. In my default scan of one http service, it was called 712 times. I sped it up by using repeating pattern matching (* and +) instead of looping over repeated calls to string.match. This led to a 1.7x speedup, or 0.2 seconds on my 1-port scan.

3. http.lua: parse_header(). This function is called for each HTTP request (16 in my default scan), but took an astonishing 0.25 seconds per call. It got a small speedup boost from skip_lws, which it calls repeatedly, but the primary speedup was using a single pattern match for non-whitespace instead of an explicit loop with 2 string.match calls looking for whitespace on each character. This resulted in a 4.7x speedup, or about 3 seconds per http service.

I'll be running a more comprehensive scan tonight to look for other candidates, but these were by far the standouts in terms of %time running (18%, 4.8%, and 37%, respectively). Please let me know if I introduced bugs!

Dan

[1] I used LuaProfiler from http://luaprofiler.luaforge.net/, but had to modify it to work with Lua 5.2 (https://github.com/bonsaiviking/luaprofiler)
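
The pattern-matching change described in items 2 and 3 can be sketched roughly as follows. This is an illustration only, not the http.lua code: the function names skip_ws_loop, skip_ws_pattern, and read_token are hypothetical, and whitespace is simplified to spaces and tabs (the real skip_lws presumably also has to handle folded header lines).

```lua
-- Loop style: advance one character at a time, calling string.match each step.
local function skip_ws_loop(s, pos)
  while pos <= #s and s:sub(pos, pos):match("[ \t]") do
    pos = pos + 1
  end
  return pos
end

-- Pattern style: one call with a repeating class. The empty "()" capture
-- returns the position just past the run of whitespace.
local function skip_ws_pattern(s, pos)
  return s:match("^[ \t]*()", pos)
end

-- Same idea for reading a run of non-whitespace: one match with "+"
-- instead of testing every character for whitespace in a loop.
local function read_token(s, pos)
  local token, next_pos = s:match("^([^ \t]+)()", pos)
  return token, next_pos
end

local header = "  \tgzip, deflate"
local start = skip_ws_pattern(header, 1)
local token, after = read_token(header, start)
```

The "^" anchor combined with the init argument pins the match to the current position, so the pattern version never scans ahead past non-matching characters.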
Attachment: optimize2.patch
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/