Nmap Development mailing list archives

Re: Help debugging hang with epoll_engine


From: Daniel Miller <bonsaiviking () gmail com>
Date: Thu, 05 Jul 2012 14:42:16 -0500

On 07/05/2012 09:47 AM, Henri Doreau wrote:
2012/6/9 Daniel Miller <bonsaiviking () gmail com>:
Henri,

It appears that the bug is epoll-specific. Earlier scans would produce
the hang within the first 2 hostgroups or so, within 20 minutes of
starting the scan. Now, with "--nsock-engine select" I've been running
for over 15 hours with no hangups. Unfortunately, I'm only about 2%
done with my scan...

Dan

Hi Dan,

Troubleshooting this bug is an intense (and tiring) adventure! But I think I
finally have some results. r29134 fixes a bug in nsock where
the epoll engine could lose track of IOD fds that were silently closed and
reopened. I guess this could have been the root cause of the problem you observed.
Can you try the latest revision?

Here's the associated log message:
"""
[NSOCK] Fixed an epoll-engine-specific bug. The engine didn't recognize FDs
that were internally closed and replaced by other ones. This happened during
reconnect attempts.

When reconnecting with SSL_OP_NO_SSLv2 (nsock_core.c:472), the library closes the
fd of the current IOD and replaces it with a new one.

The man page for epoll_ctl states that a close() on an fd removes it from
any epoll set it was in. Therefore, if epoll_ctl(EPOLL_CTL_MOD, ...) returns
ENOENT, we retry with EPOLL_CTL_ADD.
"""
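
The fallback described in the log message can be sketched roughly as
follows. This is not the code from r29134, only a minimal Linux-only
illustration of the idea. The helper name epoll_mod_or_add() and the small
demo function are hypothetical and do not exist in nsock; only epoll_ctl(),
EPOLL_CTL_MOD, EPOLL_CTL_ADD and ENOENT come from the message above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (not nsock code): set the event mask for fd.
 * If the kernel reports ENOENT, the fd is not (or no longer) in the
 * epoll set, so register it again with EPOLL_CTL_ADD. */
static int epoll_mod_or_add(int epfd, int fd, uint32_t events)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.events = events;
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev) == 0)
        return 0;
    if (errno != ENOENT)
        return -1;              /* a real error, report it to the caller */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical demo: simulate an fd that is closed and replaced behind
 * the engine's back, then check that events are still delivered. */
static int demo_reopened_fd(void)
{
    int p[2], q[2], rc;
    struct epoll_event out;
    int epfd = epoll_create1(0);

    if (epfd < 0 || pipe(p) < 0)
        return -1;
    /* First registration: MOD fails with ENOENT, so ADD is used. */
    if (epoll_mod_or_add(epfd, p[0], EPOLLIN) < 0)
        return -1;
    close(p[0]);                /* the kernel drops it from the epoll set */

    if (pipe(q) < 0)            /* q[0] often reuses the old fd number */
        return -1;
    /* A plain EPOLL_CTL_MOD would fail here; the helper re-adds the fd. */
    if (epoll_mod_or_add(epfd, q[0], EPOLLIN) < 0)
        return -1;

    if (write(q[1], "x", 1) != 1)
        return -1;
    rc = epoll_wait(epfd, &out, 1, 1000);
    assert(rc != 1 || out.data.fd == q[0]);

    close(p[1]); close(q[0]); close(q[1]); close(epfd);
    return rc == 1 ? 0 : -1;
}
```

Calling demo_reopened_fd() should return 0 on a Linux system. Without the
ENOENT fallback, the second registration would fail and the engine would
never be woken up for the new fd, which matches the hang described earlier
in the thread.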

Regards.

Henri,

While I don't have time to run the scan to completion (1M hosts is a lot!), with this patch I got farther than I ever have before, with no issues. Based on my memories from debugging this, and the location of the patch, I'd say this is the fix we need. Thanks so much for hunting down this bug and killing it!

Dan

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/

