Nmap Development mailing list archives

Re: Nsock SSL problem (r29134 explanations)

From: Daniel Miller <bonsaiviking () gmail com>
Date: Fri, 06 Jul 2012 11:26:46 -0500

On 07/06/2012 08:59 AM, Daniel Miller wrote:

On 07/06/2012 08:51 AM, Henri Doreau wrote:

Hello,

Yesterday I fixed a bug in nsock that had been flying under the
radar: only Daniel Miller reported it[1], and I personally never
managed to reproduce the stalled-scan symptom he saw despite days of
debugging. Still, this problem probably affects many users in one way
or another. I sent a quick description to the list yesterday[2] after
committing r29134, but here are the details again:


* Problem

Internal reconnection attempts can occur under certain conditions
described below:
nsock_core.c
"""
  /* SSLv3-only and TLSv1-only servers can't be connected to when the
   * SSL_OP_NO_SSLv2 option is not set, which is the case when the pool
   * was initialized with nsp_ssl_init_max_speed. Try reconnecting with
   * SSL_OP_NO_SSLv2. Never downgrade a NO_SSLv2 connection to one that
   * might use SSLv2. */
  [...]
  close(iod->sd);
  nsock_connect_internal(ms, nse, [...]);
"""

The problem was that the close() call removes the FD from the
epoll set, and the new one (from nsock_connect_internal) wasn't
added in its place. Nsock therefore lost track of the events
associated with this IOD.
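
To make the failure mode concrete, here is a minimal Linux-only sketch. It is not taken from nsock: the function name is hypothetical, and pipes stand in for the SSL socket. It registers a descriptor with epoll, closes it, opens a replacement, and then tries to modify the replacement's registration. Because closing the last reference to a descriptor drops it from the epoll set, the modification is expected to fail with ENOENT.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo, not nsock code. Returns the errno set by EPOLL_CTL_MOD
 * on a descriptor that replaced a closed, previously registered one.
 * Returns 0 if the call unexpectedly succeeds and -1 if setup fails. */
int mod_after_reopen_errno(void)
{
  struct epoll_event ev = { .events = EPOLLIN };
  int old_pipe[2], new_pipe[2];
  int result = -1;
  int epfd = epoll_create1(0);

  if (epfd < 0)
    return -1;
  if (pipe(old_pipe) < 0) {
    close(epfd);
    return -1;
  }

  ev.data.fd = old_pipe[0];
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, old_pipe[0], &ev) < 0) {
    close(old_pipe[0]);
    close(old_pipe[1]);
    close(epfd);
    return -1;
  }

  /* Closing the descriptor silently removes it from the epoll set. */
  close(old_pipe[0]);

  /* Stand-in for the reconnect: a fresh descriptor that nobody registered. */
  if (pipe(new_pipe) == 0) {
    ev.data.fd = new_pipe[0];
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, new_pipe[0], &ev) < 0)
      result = errno;   /* saved before any later call can overwrite it */
    else
      result = 0;
    close(new_pipe[0]);
    close(new_pipe[1]);
  }

  close(old_pipe[1]);
  close(epfd);
  return result;
}
```

On Linux, mod_after_reopen_errno() should return ENOENT, which matches the "No such file or directory" text that strerror() produces for that value.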


* Fix
I committed a first fix that makes epoll_iod_modify() call epoll_ctl() a
second time, with EPOLL_CTL_ADD, if the modification attempt
fails with ENOENT (r29134).
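
The fallback described here can be sketched as follows. This is a minimal illustration assuming Linux epoll, not the code committed in r29134; epoll_mod_or_add is a hypothetical name.

```c
#include <errno.h>
#include <sys/epoll.h>

/* Hypothetical helper, not the r29134 code. Tries to modify the registration
 * of fd; if epoll reports that fd is not registered (ENOENT), registers it
 * instead. Returns 0 on success and -1 on failure, with errno set by
 * epoll_ctl(). */
int epoll_mod_or_add(int epfd, int fd, struct epoll_event *ev)
{
  if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, ev) == 0)
    return 0;

  /* Any error other than "not registered" is a genuine failure. */
  if (errno != ENOENT)
    return -1;

  return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, ev);
}
```

The trade-off of this approach is that it is tied to epoll's error reporting, which is why it only helps one engine.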


I propose replacing this fix with the attached patch, which is
much nicer IMO and has the advantage of not being engine-specific.
This new patch simply unregisters the IOD before the close() and
nsock_connect_internal() statements and registers the IOD again (with
the new FD) afterwards.
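
The ordering described above (unregister, close, reconnect, register) can be sketched as below. The sketch uses Linux epoll and an eventfd only to stay concrete and self-contained; the patch itself is engine-independent and is not reproduced here. Both function names are hypothetical, and make_descriptor() merely stands in for the reconnect step.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for the reconnect step: returns a fresh descriptor. */
int make_descriptor(void)
{
  return eventfd(0, 0);
}

/* Hypothetical helper, not nsock code. Unregisters old_fd, closes it, obtains
 * a new descriptor from reconnect(), and registers that one with the same
 * event mask. Returns the new descriptor, or -1 on failure. */
int reregister_across_reconnect(int epfd, int old_fd, uint32_t events,
                                int (*reconnect)(void))
{
  struct epoll_event ev = { .events = events };
  int new_fd;

  /* 1. Unregister while old_fd is still a valid, registered descriptor. */
  if (epoll_ctl(epfd, EPOLL_CTL_DEL, old_fd, NULL) < 0)
    return -1;

  /* 2. Close the old descriptor and reconnect. */
  close(old_fd);
  new_fd = reconnect();
  if (new_fd < 0)
    return -1;

  /* 3. Register the new descriptor so its events are tracked again. */
  ev.data.fd = new_fd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, new_fd, &ev) < 0) {
    close(new_fd);
    return -1;
  }
  return new_fd;
}
```

Because the bookkeeping happens around the close rather than inside one engine's modify call, the same sequence applies to any engine that offers register and unregister operations.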

I have also added a couple of statements to engine_select.c to make it
clear all FD sets on IOD unregistration. For some reason, the X set
wasn't touched. Unless I am missing something, this was a mistake.
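
As an illustration of that select() cleanup, the sketch below clears a descriptor from the read, write, and exception sets together. The struct and function names are hypothetical and do not mirror the actual layout of engine_select.c.

```c
#include <sys/select.h>

/* Hypothetical container for the three master sets a select() engine keeps. */
struct fd_sets {
  fd_set r;  /* read set */
  fd_set w;  /* write set */
  fd_set x;  /* exception set */
};

/* Remove fd from every set, so no stale entry survives unregistration. */
void fd_sets_forget(struct fd_sets *s, int fd)
{
  FD_CLR(fd, &s->r);
  FD_CLR(fd, &s->w);
  FD_CLR(fd, &s->x);
}

/* Returns nonzero if fd is still present in any of the three sets. */
int fd_sets_tracks(struct fd_sets *s, int fd)
{
  return FD_ISSET(fd, &s->r) || FD_ISSET(fd, &s->w) || FD_ISSET(fd, &s->x);
}
```

If only the read and write sets were cleared, fd_sets_tracks() would still report a descriptor that had been placed in the exception set.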


Regards.


[1] http://seclists.org/nmap-dev/2012/q2/649
[2] http://seclists.org/nmap-dev/2012/q3/47



_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

Henri,

Thanks for all your hard work on this bug. Unfortunately, I think there may be a problem with the patch. I'm trying to duplicate it under a debugger, and will follow up with more info, but I had a scan crash last night during NSE scanning with this assertion error:

nmap: nsock_event.c:406: msevent_new: Assertion `msiod->state != NSIOD_STATE_DELETED' failed.


I'll post more information once I have reproduced the crash.

Dan

I got the above error with the patch that was committed. With the patch from Henri's latest message, I get this error:


Unable to update events for IOD #717: No such file or directory
QUITTING!

Trying to delete NSI, but could not find 1 of the purportedly pending events on that IOD.


QUITTING!

Running under GDB, I get a different error, a SIGPIPE during SSL_shutdown. This happens the same way with either patch. Backtrace:

Program received signal SIGPIPE, Broken pipe.
0x00132416 in __kernel_vsyscall ()
(gdb) bt
#0  0x00132416 in __kernel_vsyscall ()
#1  0x005b51d3 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
#2  0x002a8c44 in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#3  0x002a6564 in BIO_write () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#4  0x001c0511 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#5  0x001c0912 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#6  0x001c206f in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#7  0x001c0e24 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#8  0x001be39e in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#9  0x001d9921 in SSL_shutdown () from /lib/i386-linux-gnu/libssl.so.1.0.0
#10 0x082653d5 in nsi_delete (nsockiod=0x9ad5360, pending_response=1) at nsock_iod.c:231
#11 0x08245658 in l_close (L=0x99f5c50) at nse_nsock.cc:846
#12 0x08288a13 in luaD_precall (L=0x99f5c50, func=0x9d33128, nresults=0) at ldo.c:317
#13 0x082a73aa in luaV_execute (L=0x99f5c50) at lvm.c:710
#14 0x082893b0 in unroll (L=0x99f5c50, ud=0x0) at ldo.c:430
#15 0x082899ee in resume (L=0x99f5c50, ud=0x9d331c8) at ldo.c:517
#16 0x08287c04 in luaD_rawrunprotected (L=0x99f5c50, f=0x8289701 <resume>, ud=0x9d331c8) at ldo.c:131
#17 0x08289a9d in lua_resume (L=0x99f5c50, from=0x8c58f70, nargs=2) at ldo.c:530
#18 0x082b9f53 in auxresume (L=0x8c58f70, co=0x99f5c50, narg=2) at lcorolib.c:31
#19 0x082ba23f in luaB_coresume (L=0x8c58f70) at lcorolib.c:53
#20 0x08288a13 in luaD_precall (L=0x8c58f70, func=0x9bf0280, nresults=2) at ldo.c:317
#21 0x082a73aa in luaV_execute (L=0x8c58f70) at lvm.c:710
#22 0x082890ee in luaD_call (L=0x8c58f70, func=0x8c7fbd8, nResults=0, allowyield=0) at ldo.c:393
#23 0x082836cd in lua_callk (L=0x8c58f70, nargs=2, nresults=0, ctx=0, k=0) at lapi.c:902
#24 0x0823c703 in run_main (L=0x8c58f70) at nse_main.cc:418
#25 0x08288a13 in luaD_precall (L=0x8c58f70, func=0x8c7fbc8, nresults=0) at ldo.c:317
#26 0x082890a5 in luaD_call (L=0x8c58f70, func=0x8c7fbc8, nResults=0, allowyield=0) at ldo.c:392
#27 0x082837bc in f_call (L=0x8c58f70, ud=0xbfffeb78) at lapi.c:920
#28 0x08287c04 in luaD_rawrunprotected (L=0x8c58f70, f=0x8283769 <f_call>, ud=0xbfffeb78) at ldo.c:131
#29 0x08289e01 in luaD_pcall (L=0x8c58f70, func=0x8283769 <f_call>, u=0xbfffeb78, old_top=16, ef=8) at ldo.c:590
#30 0x082838e8 in lua_pcallk (L=0x8c58f70, nargs=1, nresults=0, errfunc=1, ctx=0, k=0) at lapi.c:946
#31 0x0823d502 in script_scan (targets=..., scantype=SCRIPT_SCAN) at nse_main.cc:571
#32 0x080cfcbb in nmap_main (argc=11, argv=0xbffff704) at nmap.cc:1993
#33 0x080c1391 in main (argc=11, argv=0xbffff704) at main.cc:198

Looking around the Internet, it appears that this is due to trying to gracefully shut down an SSL socket when the remote end has reset the underlying TCP connection (write() on a closed socket). A few solutions are suggested:

1. Call SSL_get_shutdown and check that the return value is >= 0 before calling SSL_shutdown. This does not seem to work for many people, and doesn't make sense because there is no error (negative) return specified for SSL_get_shutdown.

2. Ignore SIGPIPE. This is leading to the errors above, since we ignore SIGPIPE in nmap.cc.
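
For reference, the effect of the second option can be shown with a small POSIX sketch. With SIGPIPE ignored, a write to a connection whose peer is gone no longer kills the process; write() fails with EPIPE instead, and the caller has to handle that error. The function name is hypothetical, and a pipe with no reader stands in for the reset TCP connection.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo, not nmap code. Ignores SIGPIPE, writes to a pipe whose
 * read end is closed, and returns the resulting errno (0 if the write
 * unexpectedly succeeds, -1 if setup fails). The previous SIGPIPE
 * disposition is restored before returning. */
int write_to_closed_pipe_errno(void)
{
  int fds[2];
  int err;
  void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);

  if (old_handler == SIG_ERR)
    return -1;
  if (pipe(fds) < 0) {
    signal(SIGPIPE, old_handler);
    return -1;
  }

  close(fds[0]);  /* no reader left, like a peer that reset the connection */

  /* Without SIG_IGN this write would raise SIGPIPE and end the process. */
  err = (write(fds[1], "x", 1) < 0) ? errno : 0;

  close(fds[1]);
  signal(SIGPIPE, old_handler);
  return err;
}
```

Here write_to_closed_pipe_errno() should return EPIPE, so the failure surfaces as an error return from the write rather than as a signal.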

I will continue to debug. If I can tell GDB to ignore SIGPIPE, I'll be able to get a backtrace from the assertion errors or fatal calls.


Dan


