Nmap Development mailing list archives

Re: Nsock SSL problem (r29134 explanations)


From: Daniel Miller <bonsaiviking () gmail com>
Date: Fri, 06 Jul 2012 11:26:46 -0500

On 07/06/2012 08:59 AM, Daniel Miller wrote:
On 07/06/2012 08:51 AM, Henri Doreau wrote:
Hello,

yesterday I fixed a bug in nsock, which was kind of flying under the
radar: only Daniel Miller reported it[1], and I personally never
managed to reproduce the stalled scan symptom he saw despite days of
debugging. Still, this problem probably affects many users, in one way
or another. I sent a quick description to the list yesterday[2] after
committing r29134 but here are the details again:


* Problem

Internal reconnection attempts can occur under certain conditions
described below:
nsock_core.c
"""
  /* SSLv3-only and TLSv1-only servers can't be connected to when the
   * SSL_OP_NO_SSLv2 option is not set, which is the case when the pool
   * was initialized with nsp_ssl_init_max_speed. Try reconnecting with
   * SSL_OP_NO_SSLv2. Never downgrade a NO_SSLv2 connection to one that
   * might use SSLv2. */
[...]
  close(iod->sd);
  nsock_connect_internal(ms, nse, [...]);
"""

The problem was that the close() statement removes the FD from the
epoll set, and that the new one (from nsock_connect_internal) wasn't
added in its place. Nsock therefore lost track of the events associated
with this IOD.
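
The behaviour described above can be reproduced outside nsock. What
follows is a minimal sketch, assuming Linux and using pipes instead of
sockets; mod_errno_after_reopen() is a hypothetical helper written for
this illustration, not nsock code. It registers a descriptor, closes it,
opens a replacement, and reports what EPOLL_CTL_MOD says about the
replacement.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Illustrative only (hypothetical helper, not nsock code).
 * Registers a pipe's read end with epfd, closes it, opens a replacement
 * descriptor, and tries EPOLL_CTL_MOD on the replacement.
 * Returns the errno of that MOD call, 0 if it unexpectedly succeeds,
 * or -1 if the setup itself fails. */
int mod_errno_after_reopen(int epfd)
{
    int old_pipe[2], new_pipe[2];
    struct epoll_event ev;
    int err;

    assert(epfd >= 0);
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;

    if (pipe(old_pipe) != 0)
        return -1;
    ev.data.fd = old_pipe[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, old_pipe[0], &ev) != 0)
        return -1;

    /* Closing the only descriptor for this open file makes the kernel
     * drop it from the epoll set, with no notification to the caller. */
    close(old_pipe[0]);

    /* The replacement usually reuses the same descriptor number, but it
     * is a different open file and is not registered with epfd. */
    if (pipe(new_pipe) != 0)
        return -1;
    ev.data.fd = new_pipe[0];
    err = (epoll_ctl(epfd, EPOLL_CTL_MOD, new_pipe[0], &ev) == 0) ? 0 : errno;

    close(old_pipe[1]);
    close(new_pipe[0]);
    close(new_pipe[1]);
    return err;
}
```

Called with a fresh epoll_create1(0) descriptor, the helper should
report ENOENT, which is the silent loss of registration that leaves the
IOD without events.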


* Fix
I committed a first fix to make epoll_iod_modify() call epoll_ctl() a
second time, with EPOLL_CTL_ADD, in case the modification attempt
failed with ENOENT (r29134).
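
The shape of such a fallback might look roughly like the sketch below.
This is not the r29134 code: epoll_mod_or_add() and mod_or_add_demo()
are hypothetical names, and the real epoll_iod_modify() works on nsock
structures that are not reproduced here. It only shows the
"MOD, then ADD on ENOENT" idea on a bare epoll descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: try to modify the registration, and add the
 * descriptor instead if the epoll set does not know it (ENOENT). */
int epoll_mod_or_add(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    assert(epfd >= 0 && fd >= 0);
    memset(&ev, 0, sizeof(ev));
    ev.events = events;
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0)
        return 0;
    if (errno != ENOENT)
        return -1;              /* a different error: report it */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Self-check: an unregistered descriptor ends up watched.
 * Returns 0 on success, -1 otherwise. */
int mod_or_add_demo(void)
{
    int epfd = epoll_create1(0);
    int p[2];
    struct epoll_event out;
    int ok;

    if (epfd < 0 || pipe(p) != 0)
        return -1;
    if (epoll_mod_or_add(epfd, p[0], EPOLLIN) != 0)  /* takes the ADD path */
        return -1;
    if (epoll_mod_or_add(epfd, p[0], EPOLLIN) != 0)  /* takes the MOD path */
        return -1;
    if (write(p[1], "x", 1) != 1)
        return -1;
    ok = (epoll_wait(epfd, &out, 1, 1000) == 1 && out.data.fd == p[0]);
    close(p[0]);
    close(p[1]);
    close(epfd);
    return ok ? 0 : -1;
}
```

The drawback of this approach is visible in the sketch: the recovery
logic lives inside the epoll-specific code path, so other engines would
each need their own equivalent.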


I would propose replacing this fix with the attached patch, which is
much nicer IMO, and has the advantage of not being engine-specific.
This new patch simply unregisters the IOD before the close() and
nsock_connect_internal() statements and registers the IOD again (with
the new FD) afterwards.
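
As a rough sketch of that ordering (the attached patch itself is not
reproduced here): engine_register(), engine_unregister() and
swap_descriptor() below are hypothetical stand-ins, implemented with
epoll purely so the example can run. The point is the sequence
unregister, close, register, which does not depend on how a given
engine tracks descriptors.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical stand-ins for an engine's register/unregister hooks,
 * implemented here with epoll purely for illustration. */
static int engine_register(int epfd, int fd)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

static int engine_unregister(int epfd, int fd)
{
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
}

/* Replace *fdp with replacement: unregister the old descriptor while it
 * is still open, close it, then register the new one. */
int swap_descriptor(int epfd, int *fdp, int replacement)
{
    assert(fdp != NULL);
    if (engine_unregister(epfd, *fdp) != 0)
        return -1;
    close(*fdp);
    *fdp = replacement;
    return engine_register(epfd, *fdp);
}

/* Self-check: events on the new descriptor are delivered after the swap.
 * Returns 0 on success, -1 otherwise. */
int swap_demo(void)
{
    int epfd = epoll_create1(0);
    int a[2], b[2], fd, ok;
    struct epoll_event out;

    if (epfd < 0 || pipe(a) != 0 || pipe(b) != 0)
        return -1;
    fd = a[0];
    if (engine_register(epfd, fd) != 0)
        return -1;
    if (swap_descriptor(epfd, &fd, b[0]) != 0)
        return -1;
    if (write(b[1], "x", 1) != 1)
        return -1;
    ok = (epoll_wait(epfd, &out, 1, 1000) == 1 && out.data.fd == b[0]);
    close(a[1]);
    close(b[0]);
    close(b[1]);
    close(epfd);
    return ok ? 0 : -1;
}
```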

I have also added a couple of statements to engine_select.c to make it
clear all FD sets on IOD unregistration. For some reason, the X set
wasn't touched. Unless I am missing something, this was a mistake.
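
To illustrate the select-side cleanup, here is a small sketch, assuming
an engine that keeps one fd_set per event type. The struct and function
names are hypothetical and are not taken from engine_select.c; the only
point is that unregistration clears the descriptor from the read, write
and exception sets alike.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical, simplified engine state: one master set per event type.
 * Field names are illustrative, not taken from engine_select.c. */
struct select_sets {
    fd_set fds_r;   /* read */
    fd_set fds_w;   /* write */
    fd_set fds_x;   /* exception */
};

/* Forget fd entirely: clear it from every set, including the X set. */
void sets_unregister(struct select_sets *s, int fd)
{
    assert(s != NULL && fd >= 0 && fd < FD_SETSIZE);
    FD_CLR(fd, &s->fds_r);
    FD_CLR(fd, &s->fds_w);
    FD_CLR(fd, &s->fds_x);
}

/* Returns nonzero if fd is still present in any of the three sets. */
int sets_contains(struct select_sets *s, int fd)
{
    return FD_ISSET(fd, &s->fds_r) || FD_ISSET(fd, &s->fds_w) ||
           FD_ISSET(fd, &s->fds_x);
}

/* Self-check: put fd 5 in all three sets, unregister it, and report
 * whether any trace is left (0 means none). */
int sets_demo(void)
{
    struct select_sets s;

    FD_ZERO(&s.fds_r);
    FD_ZERO(&s.fds_w);
    FD_ZERO(&s.fds_x);
    FD_SET(5, &s.fds_r);
    FD_SET(5, &s.fds_w);
    FD_SET(5, &s.fds_x);
    sets_unregister(&s, 5);
    return sets_contains(&s, 5);
}
```

If the exception set were skipped, a later select() call could still
report, or fail on, a descriptor number that has since been closed or
reused.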


Regards.


[1] http://seclists.org/nmap-dev/2012/q2/649
[2] http://seclists.org/nmap-dev/2012/q3/47



_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
Henri,

Thanks for all your hard work on this bug. Unfortunately, I think there may be a problem with the patch. I'm trying to reproduce it under a debugger, and will follow up with more info, but I had a scan crash last night during NSE scanning with this assertion error:

nmap: nsock_event.c:406: msevent_new: Assertion `msiod->state != NSIOD_STATE_DELETED' failed.

I'll post more information once I have reproduced the crash.

Dan

I got the above error with the patch that was committed. With the patch from Henri's latest message, I get this error:

Unable to update events for IOD #717: No such file or directory
QUITTING!
Trying to delete NSI, but could not find 1 of the purportedly pending events on that IOD.

QUITTING!


Running under GDB, I get a different error, a SIGPIPE during SSL_shutdown. This happens the same way with either patch. Backtrace:
Program received signal SIGPIPE, Broken pipe.
0x00132416 in __kernel_vsyscall ()
(gdb) bt
#0  0x00132416 in __kernel_vsyscall ()
#1  0x005b51d3 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
#2  0x002a8c44 in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#3  0x002a6564 in BIO_write () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#4  0x001c0511 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#5  0x001c0912 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#6  0x001c206f in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#7  0x001c0e24 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#8  0x001be39e in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#9  0x001d9921 in SSL_shutdown () from /lib/i386-linux-gnu/libssl.so.1.0.0
#10 0x082653d5 in nsi_delete (nsockiod=0x9ad5360, pending_response=1) at nsock_iod.c:231
#11 0x08245658 in l_close (L=0x99f5c50) at nse_nsock.cc:846
#12 0x08288a13 in luaD_precall (L=0x99f5c50, func=0x9d33128, nresults=0) at ldo.c:317
#13 0x082a73aa in luaV_execute (L=0x99f5c50) at lvm.c:710
#14 0x082893b0 in unroll (L=0x99f5c50, ud=0x0) at ldo.c:430
#15 0x082899ee in resume (L=0x99f5c50, ud=0x9d331c8) at ldo.c:517
#16 0x08287c04 in luaD_rawrunprotected (L=0x99f5c50, f=0x8289701 <resume>, ud=0x9d331c8) at ldo.c:131
#17 0x08289a9d in lua_resume (L=0x99f5c50, from=0x8c58f70, nargs=2) at ldo.c:530
#18 0x082b9f53 in auxresume (L=0x8c58f70, co=0x99f5c50, narg=2) at lcorolib.c:31
#19 0x082ba23f in luaB_coresume (L=0x8c58f70) at lcorolib.c:53
#20 0x08288a13 in luaD_precall (L=0x8c58f70, func=0x9bf0280, nresults=2) at ldo.c:317
#21 0x082a73aa in luaV_execute (L=0x8c58f70) at lvm.c:710
#22 0x082890ee in luaD_call (L=0x8c58f70, func=0x8c7fbd8, nResults=0, allowyield=0) at ldo.c:393
#23 0x082836cd in lua_callk (L=0x8c58f70, nargs=2, nresults=0, ctx=0, k=0) at lapi.c:902
#24 0x0823c703 in run_main (L=0x8c58f70) at nse_main.cc:418
#25 0x08288a13 in luaD_precall (L=0x8c58f70, func=0x8c7fbc8, nresults=0) at ldo.c:317
#26 0x082890a5 in luaD_call (L=0x8c58f70, func=0x8c7fbc8, nResults=0, allowyield=0) at ldo.c:392
#27 0x082837bc in f_call (L=0x8c58f70, ud=0xbfffeb78) at lapi.c:920
#28 0x08287c04 in luaD_rawrunprotected (L=0x8c58f70, f=0x8283769 <f_call>, ud=0xbfffeb78) at ldo.c:131
#29 0x08289e01 in luaD_pcall (L=0x8c58f70, func=0x8283769 <f_call>, u=0xbfffeb78, old_top=16, ef=8) at ldo.c:590
#30 0x082838e8 in lua_pcallk (L=0x8c58f70, nargs=1, nresults=0, errfunc=1, ctx=0, k=0) at lapi.c:946
#31 0x0823d502 in script_scan (targets=..., scantype=SCRIPT_SCAN) at nse_main.cc:571
#32 0x080cfcbb in nmap_main (argc=11, argv=0xbffff704) at nmap.cc:1993
#33 0x080c1391 in main (argc=11, argv=0xbffff704) at main.cc:198

Looking around the Internet, it appears that this is due to trying to gracefully shut down an SSL socket when the remote end has reset the underlying TCP connection (write() on a closed socket). A few solutions are suggested:

1. Call SSL_get_shutdown and check that the return value is >= 0 before calling SSL_shutdown. This does not seem to work for many people, and it doesn't make sense because there is no error (negative) return specified for SSL_get_shutdown.

2. Ignore SIGPIPE. This is leading to the errors above, since we ignore SIGPIPE in nmap.cc.
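
The effect of option 2 on the underlying write() can be shown without
OpenSSL. The sketch below is only an approximation of the situation in
the backtrace: it uses a local AF_UNIX socket pair rather than a reset
TCP connection, and write_ignoring_sigpipe() and peer_closed_demo() are
hypothetical names. With SIGPIPE ignored, the write no longer kills the
process; it fails with EPIPE, and the caller has to handle that error.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write to fd with SIGPIPE ignored.
 * Returns 0 on success, or the errno of the failed write(). */
int write_ignoring_sigpipe(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
    /* Process-wide setting; a program would normally do this once
     * at startup rather than before every write. */
    signal(SIGPIPE, SIG_IGN);
    return (write(fd, buf, len) < 0) ? errno : 0;
}

/* Self-check: the peer goes away before we write.
 * Returns the errno seen by the writer, or -1 if setup fails. */
int peer_closed_demo(void)
{
    int sv[2], err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    close(sv[1]);                   /* the "remote end" disappears */
    err = write_ignoring_sigpipe(sv[0], "x", 1);
    close(sv[0]);
    return err;
}
```

In this sketch peer_closed_demo() should return EPIPE. Any code path
that shuts down or deletes the socket afterwards then runs with a
connection that is already dead.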

I will continue to debug. If I can tell GDB to ignore SIGPIPE, I'll be able to backtrace from the assertion errors or fatal calls.

Dan

