Nmap Development mailing list archives

Re: NSE/Nsock segfault, script timeouts, NSE runlevel, etc


From: Patrick Donnelly <batrick () batbytes com>
Date: Fri, 1 May 2009 23:16:51 -0600

Hi Brandon,

On Fri, May 1, 2009 at 10:56 PM, Brandon Enright <bmenrigh () ucsd edu> wrote:

I have been running into a segfault in NSE/Nsock with the latest SVN
revision.  Patrick *may* already be aware of this and it *may* relate
to this comment:

"This is a symptom of the failing nsock library binding which has
plagued NSE with bugs for a while now. That it happened in
nse-lua-merge is merely happenstance. In the future I hope to clean it
up like I did with the nmap library."

Which Patrick said in:
http://seclists.org/nmap-dev/2009/q1/0798.html


Just in case the segfault isn't related, here is a bit more
information.

The final output before the crash was:

...
NSE Timing: About 95.83% done; ETC: 03:54 (0:01:10 remaining)
NSE Timing: About 95.83% done; ETC: 03:54 (0:01:12 remaining)
NSE Timing: About 95.83% done; ETC: 03:55 (0:01:13 remaining)
NSE Timing: About 95.83% done; ETC: 03:55 (0:01:14 remaining)
NSE Timing: About 95.83% done; ETC: 03:56 (0:01:16 remaining)
NSE: smb-brute target timed out
Completed NSE at 03:55, 1756.38s elapsed
NSE: Starting runlevel 2 scan
Initiating NSE at 03:55
NSE: NSE Script Threads (2) running:
NSE: Starting smb-check-vulns against x.y.10.148.
NSE: Starting p2p-conficker against x.y.10.148.
NSE: Conficker: Generating ports based on ip (0x940aXXYY) and seed (2051)
NSE: smb-check-vulns target timed out
NSE: p2p-conficker target timed out
Completed NSE at 03:55, 0.21s elapsed

A backtrace of the segfault is:

(gdb) bt
#0  0x00007f2041642e39 in lua_pushnil () from /usr/lib/liblua.so.5
#1  0x000000000046b645 in l_nsock_checkstatus (L=0x7823df0,
   nse=<value optimized out>) at nse_nsock.cc:443
#2  0x000000000046b6bc in l_nsock_send_handler (nsp=<value optimized out>,
   nse=0x66ac660, yield=0x67c0cf0) at nse_nsock.cc:571
#3  0x000000000047bcd0 in msevent_dispatch_and_delete (nsp=0x665e260,
   nse=0x66ac660, notify=<value optimized out>) at nsock_event.c:297
#4  0x000000000047bdfb in msevent_cancel (nsp=0x665e260, nse=0x66ac660,
   event_list=0x665e5d0, elem=0x665f3c8, notify=1) at nsock_event.c:271
#5  0x000000000047a850 in nsi_delete (nsockiod=0x78319b0, pending_response=1)
   at nsock_iod.c:181
#6  0x000000000046c824 in l_nsock_close (L=0x6651620) at nse_nsock.cc:794
#7  0x000000000046c8da in l_nsock_gc (L=0x6651620) at nse_nsock.cc:761
#8  0x00007f204164812b in ?? () from /usr/lib/liblua.so.5
#9  0x00007f2041648528 in ?? () from /usr/lib/liblua.so.5
#10 0x00007f2041649daf in ?? () from /usr/lib/liblua.so.5
#11 0x00007f2041649e68 in ?? () from /usr/lib/liblua.so.5
#12 0x00007f204164a2f0 in ?? () from /usr/lib/liblua.so.5
#13 0x00007f20416434f5 in lua_gc () from /usr/lib/liblua.so.5
#14 0x00007f20416564a8 in ?? () from /usr/lib/liblua.so.5
#15 0x00007f204164812b in ?? () from /usr/lib/liblua.so.5
#16 0x00007f2041652e31 in ?? () from /usr/lib/liblua.so.5
#17 0x00007f2041648585 in ?? () from /usr/lib/liblua.so.5
#18 0x00007f2041647d27 in ?? () from /usr/lib/liblua.so.5
#19 0x00007f2041647da5 in ?? () from /usr/lib/liblua.so.5
#20 0x00007f20416436b4 in lua_pcall () from /usr/lib/liblua.so.5
#21 0x0000000000469b39 in run_main (L=0x6651620) at nse_main.cc:457
#22 0x00007f204164812b in ?? () from /usr/lib/liblua.so.5
#23 0x00007f2041648528 in ?? () from /usr/lib/liblua.so.5
#24 0x00007f2041647d27 in ?? () from /usr/lib/liblua.so.5
#25 0x00007f2041647da5 in ?? () from /usr/lib/liblua.so.5
#26 0x00007f2041643657 in lua_cpcall () from /usr/lib/liblua.so.5
#27 0x00000000004699c5 in script_scan (targets=@0x7fff4a4ad5d0)
   at nse_main.cc:499
#28 0x000000000041cfa6 in nmap_main (argc=35, argv=0x7fff4a4b0868)
   at nmap.cc:1811
#29 0x0000000000418735 in main (argc=35, argv=0x7fff4a4b0868) at main.cc:215


One thing that jumps out at me as broken is that when a host has timed
out, higher run levels of NSE probably shouldn't be run against it.
There is no time left to run the scripts, so they appear to be created
and then immediately destroyed.  I suppose it may be too much work to
check whether all of the hosts in a hostgroup have timed out before
running a higher run level though...  Perhaps scripts simply shouldn't
be started on hosts that have already timed out, rather than skipping
the run level completely.

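The faulty assumption the nsock library binding makes is that the
thread will not be garbage collected before the callback runs. In your
backtrace, l_nsock_gc closes the socket, which cancels the pending send
event, and the handler then touches the thread's Lua state (lua_pushnil
in l_nsock_checkstatus). This will be corrected in the future (in the
forthcoming rework of nse_nsock.cc) by having each socket userdata
maintain a reference to the thread.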
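
As a rough illustration of the lifetime problem described above (this is
not the actual nse_nsock.cc code, nor the planned rework), here is a
self-contained C++ sketch. ScriptThread, Socket and
close_after_thread_dropped are hypothetical names, and std::shared_ptr
merely stands in for whatever reference a socket userdata would hold on
its Lua thread. The callback uses a weak_ptr so the sketch can detect a
dead thread safely instead of crashing.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Lua coroutine (an NSE script thread).
struct ScriptThread {
    std::string name;
    std::vector<std::string> results;  // values the callback would push
};

// Hypothetical stand-in for a socket userdata with one pending event.
struct Socket {
    // The proposed fix: an owning reference that keeps the thread alive
    // for as long as the socket exists.
    std::shared_ptr<ScriptThread> owner;
    // Callback the event loop invokes on completion or cancellation.
    std::function<void(const char *status)> pending;

    // Closing cancels the pending event, which still fires its callback.
    void close() {
        if (pending) {
            auto cb = std::move(pending);
            pending = nullptr;
            cb("CANCELLED");
        }
        owner.reset();  // release the thread only after the callback ran
    }
};

// Returns true if the cancellation callback found the thread alive.
bool close_after_thread_dropped(bool socket_holds_reference) {
    auto thread = std::make_shared<ScriptThread>();
    thread->name = "smb-check-vulns";
    std::weak_ptr<ScriptThread> observer = thread;

    Socket sock;
    if (socket_holds_reference) sock.owner = thread;

    bool delivered = false;
    sock.pending = [observer, &delivered](const char *status) {
        // Lock rather than dereference blindly, so this sketch never
        // touches freed memory even in the "broken" configuration.
        if (auto t = observer.lock()) {
            t->results.push_back(status);
            delivered = true;
        }
    };

    thread.reset();  // the script timed out; the scheduler drops it
    sock.close();    // later, finalization closes the socket
    return delivered;
}
```

Calling close_after_thread_dropped(true) reports that the callback
reached a live thread, while close_after_thread_dropped(false) reports
that the thread was already gone. In the real binding the second case is
not a failed lock but a write to a collected lua_State, which is the
crash in the backtrace.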

-- 
-Patrick Donnelly

"Let all men know thee, but no man know thee thoroughly: Men freely
ford that see the shallows."

- Benjamin Franklin

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
