Nmap Development mailing list archives
Re: Options to replace select in Nsock
From: Michael Pattrick <mpattrick () rhinovirus org>
Date: Sat, 20 Jun 2009 19:20:04 -0400
I agree with Doug, a statefull API should be used. He gives some excellent reasons, as an additional reason, only the latest version of windows supports Poll while CreateIoCompletionPort has been supported since windows 2000, so going this route will acutely be more compatible. In the meantime, couldn't Nsock be quickly adapted to use multiple calls to select with different FD_SET arrays when tracking > FD_SETSIZE items? Cheers, Michael On Sat, Jun 20, 2009 at 6:09 PM, <doug () hcsw org> wrote:
Hi David, On Fri, Jun 19, 2009 at 11:48:44PM -0600 or thereabouts, David Fifield wrote:poll is a function that is similar to select but it doesn't have a limit on the number of descriptors it can follow. poll is pretty portable, except it isn't available on Windows before Windows Vista.poll is also not available on old BSD derived systems (all modern BSDs support it now) and was not available on Mac OS X until 10.3 or so.poll has the same scalability problem as select, namely that you have to iterate over the entire descriptor list to find out which ones are ready, but that hasn't been a problem for us.The problem with select and poll is that the user-space code must inform the kernel of all the descriptors it is interested in EVERY SINGLE TIME it blocks to wait for the next event. Assuming that event frequency is roughly proportional to the number of descriptors the program is interested in, this means that the total CPU time cost of monitoring events with select and poll is roughly proportional to THE SQUARE of the number of descriptors being monitored. The goal of modern event APIs is to make the CPU time cost of monitoring events LINEAR to the number of events and UNCORRELATED to the number of descriptors being monitored. So the reason why we haven't noticed a scalability problem with select is because the time spent handling events grows roughly quadratically with the number of descriptors and until the number of descriptors gets significantly above 1000, the CPU time is not substantial relative to the cost of context switching to the kernel when calling select(). As you mention, the one nice thing about poll is that there is no fixed limit on the number of descriptors that can be monitored. Unfortunately, poll has worse scalability properties than select because each event you are interested in uses more memory than it would with select, causing more copying and higher cache pressure. See, the way select works is that the fd_set contains a bit array of 1024 bits. The first bit refers to descriptor 0, the second to descriptor 1, and so on. When you pass an fd_set to select, the kernel will loop through this bit array up to the value of the nfds parameter. This is why nfds must be the highest numbered descriptor in the set plus 1, NOT the total number of descriptors. So a read fd_set and a write fd_set can be stored in (1024/8)*2 = 256 bytes. Since each pollfd struct consumes 8 bytes, monitoring 1024 descriptors would consume 8K. Furthermore, a smart application can build up and scan bit arrays word-by-word -- not bit-by-bit -- for improved performance. The idea behind event APIs like kqueue and epoll is that you have the kernel remember which descriptors you are interested in between calls to fetch the next event. For this reason, these are called STATEFUL event APIs. I said above that modern event APIs try to make the time handling events linear with the number of events. This means that the CPU time should be uncorrelated to the number of descriptors the program is interested in. In the default modes of kqueue and epoll this is not entirely the case because they use "level triggering". This sounds complicated but is actually simple. All it means is that if the kernel tells you there is some new event on a socket and you don't process that event (ie you don't read all the data available) it's no big deal because the kernel will tell you again later that there is still data available on that descriptor. The problem with level triggering is that the kernel has to monitor the descriptor set continually because you may not have processed all events even though the kernel told you about them. This means that there is CPU activity proportional to the number of descriptors -- exactly what we want to avoid. The opposite of level triggering is called "edge triggering". This mode is supported by both kqueue and epoll. In this mode, the kernel will only notify you when the state of a descriptor changes from unready to ready. So when a packet comes in, the kernel will add it to the descriptor's input mbuf and queue a ready event for notification to user-space. The kernel's event API never has to re-visit that descriptor until the next event happens on it. The consequence is that processes must ensure that they read all the data from a descriptor because if they don't, the descriptor will appear unready even though there may still be data available to read. Typically this is done by using non-blocking sockets and always calling read() and write() until they return EAGAIN. So in summary, using poll is obviously preferable to having nmap crash when it reaches 1024 descriptors but if the goal is to have nmap perform well when it's handling in excess of 1024 descriptors, a stateful event API should be used. Hope this helps, Doug _______________________________________________ Sent through the nmap-dev mailing list http://cgi.insecure.org/mailman/listinfo/nmap-dev Archived at http://SecLists.Org
_______________________________________________ Sent through the nmap-dev mailing list http://cgi.insecure.org/mailman/listinfo/nmap-dev Archived at http://SecLists.Org
Current thread:
- Options to replace select in Nsock David Fifield (Jun 19)
- Re: Options to replace select in Nsock doug (Jun 20)
- Re: Options to replace select in Nsock Michael Pattrick (Jun 20)
- Re: Options to replace select in Nsock David Fifield (Jun 20)
- Re: Options to replace select in Nsock Michael Pattrick (Jun 20)
- Re: Options to replace select in Nsock Brandon Enright (Jun 20)
- Re: Options to replace select in Nsock doug (Jun 20)
- Re: Options to replace select in Nsock Brandon Enright (Jun 20)
- Re: Options to replace select in Nsock Michael Pattrick (Jun 20)
- Re: Options to replace select in Nsock doug (Jun 20)