Nmap Development mailing list archives
Re: Bug with nping sequence number
From: Daniel Miller <bonsaiviking () gmail com>
Date: Mon, 22 Aug 2016 16:11:31 -0500
Eric, Thanks, this is a really great analysis of a difficult problem. I'm not sure what I think about the solution, though: changing the core of the Nsock loop like this seems like it would have side effects that could cause problems. By adding an extra loop iteration after the timeout, you are essentially breaking the contract to honor the timeout for other applications. I've CC'd Henri Doreau, who has a lot of Nsock experience, to see if he can comment. I wonder if instead we could force something like a synchronous timed loop by having the timer event handler terminate the loop with nsock_loop_quit? So we set the loop timeout to something very large, like 10 minutes beyond o.getDelay(), and rely on the timer to end the loop so that there is only one end time; the timer event can't be missed, so it can't fire twice in one loop either. On the other hand, maybe this is as simple as an off-by-one error: There are several different cases in ProbeMode::start() where nsock loops are started with timers, and they each use different values for the different timeouts! See below: case TCP_CONNECT: event handler timeout: o.getDelay()+1 nsock_loop timeout: o.getDelay()+1 case UDP_UNPRIV: event handler timeout: o.getDelay() nsock_loop timeout: o.getDelay() case UDP, TCP, ICMP, ARP: event handler timeout: o.getDelay() nsock_loop timeout: o.getDelay()+1 I worry a bit about timing error propagation, too; if it takes slightly longer than the intended time for the event to fire, will the next packet be delayed by that much, too? What is the intent of the Nping program: interval delay or send delay? Sorry to answer you with so many questions, but we need to think seriously about the intent and effects of changes here. I hope you will continue to be involved in the discussion, since you have spent so much time considering the problem. Dan On Wed, Jul 27, 2016 at 11:55 PM, Éric Hoffman <ehoffman () videotron ca> wrote:
Hello This have been brought to this mailing list before, by another person, but no answer was given. I actually stumbled upon the same issue with nping, that is, the sequence number is non-linear. Ex: $ sudo nping --icmp 192.168.200.75 -c 10 --delay 100ms Starting Nping 0.7.12 ( https://nmap.org/nping ) at 2016-07-27 23:36 EDT SENT (0.0164s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=1] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.0179s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=1] IP [ttl=64 id=34536 iplen=28 ] SENT (0.1174s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=2] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.1190s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=2] IP [ttl=64 id=34539 iplen=28 ] SENT (0.2184s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=3] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.2199s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=3] IP [ttl=64 id=34541 iplen=28 ] SENT (0.3190s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=5] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.3203s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=5] IP [ttl=64 id=34546 iplen=28 ] SENT (0.4194s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=5] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.4209s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=5] IP [ttl=64 id=34553 iplen=28 ] SENT (0.5214s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=6] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.5227s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=6] IP [ttl=64 id=34556 iplen=28 ] SENT (0.6224s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=7] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.6238s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=7] IP [ttl=64 id=34560 iplen=28 ] SENT (0.7231s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=9] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.7246s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=9] IP [ttl=64 id=34570 iplen=28 ] SENT (0.8234s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=9] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.8250s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=9] IP [ttl=64 id=34578 iplen=28 ] SENT (0.9243s) ICMP [192.168.200.20 > 192.168.200.75 Echo request (type=8/code=0) id=42883 seq=10] IP [ttl=64 id=28064 iplen=28 ] RCVD (0.9254s) ICMP [192.168.200.75 > 192.168.200.20 Echo reply (type=0/code=0) id=42883 seq=10] IP [ttl=64 id=34579 iplen=28 ] Max rtt: 1.196ms | Min rtt: 0.928ms | Avg rtt: 1.063ms Raw packets sent: 10 (280B) | Rcvd: 10 (460B) | Lost: 0 (0.00%) Nping done: 1 IP address pinged in 0.94 seconds Look at the sequence number, it jumps from 3 to 5, and from 7 to 9, with the same sequence number repeated twice. I have found out this occur more likely is you set delay under 100ms, and less likely with 1-second delay. I actually looked in the code and found the issue! The way nping work, in the ProbeMode::start() function (nping/ProbeMode.cc), is that it fill a packet buffer (with the sequence number), then create a timer (which call a handler to send the packet at expiration), and call the event dispatch loop with a timeout 1ms later. So in theory, during a nping loop, let's suppose the delay is set to 100ms, then the packet is created and scheduled to be sent 100ms later. Then the event loop is started with a 101ms timeout (1 ms more than the timer trigger time). The reasoning behind this is that the timer is bound to always be triggered before the timeout. However, it is not the case, as explained bellow. In the nsock_loop() function (nsock/src/nsock_core.c), used to loop through events until timeout occurs, the function does basically this: 1 - Check if quit signal is set, and return if so. 2 - Check if there are still pending events, if not, return. 3 - Compute remaining time left before timeout, and if timeout has occurred, return. 4 - Call the engine loop, which is responsible to dispatch all events (including our timer), with the remaining time computed at point 3. 5 - Update the current time-of-day, used to compute timeout. The issue is that there is a race condition. It is well possible that in one loop, there are still a few milliseconds left for the timer to trigger. So, at points 3 and 4, nothing happens. And, at point 5, is it possible that the new time of day put us past the expiration timeout. If (when) this occurs, the point 3 above compute the new timeout, and return. Out timer does not get triggered (although it is still pending in the event pool). This behavior occurs because either of two reasons. It can be because the system is busy and take more than 1ms to do a loop (which is foreseeable on a serial tty for example, where just the fact of printing the SEND/RECV packet traces can take a few ms), or (and this is the most likely culprit) the system time returned by gettimeofday() is not contiguous. In other word, it does not increment by 1ms (nor anywhere the precision of the timespec type). The granularity of the gettimeofday() function is system-dependent, and depends upon such things as the interrupt rate, if there are high precision performance counter, etc. Typical values seen can be 10~20ms steps). All of this to say that when this occurs, the nsock_loop() function return, and the main ping function generate a new packet, and setup a new event. The next time the nsock_loop() function is called, then the last timer does trigger immediately. Then (possibly before the timeout) the new timer does trigger too. So, the correct *number* of ICMP packets do get sent. However, emphasis on the correct *number* of ICMP packets is made because the ICMP packet content itself is not. In fact, in the ProbeMode::start() function (nping/ProbeMode.cc), which generate the actual ICMP packets content, although there are 1024 sendpkt_t entries, there is actually only one 64KB packet buffer that get assigned and reused for all packets (u8 pkt[MAX_IP_PACKET_LEN]). And this buffer is assigned to each output packets. So, when the timeout issue occurs, and a new packet is generated, the next time nsock_loop() is called, the last packet (that was support to be sent during the last loop) data has been overwritten. So that is the reason why we see duplicate and non-linear sequence number. To fix this, either we have to fix the ProbeMode::start() function, taking into account that getting a timeout is a possibility, but this create the burden of holding more 64KB buffers. Clearly we don't wish to also have 1024 of those buffers (64MB). Not really embedded-system friendly. The other way to fix it is to make the nsock_loop() give one last chance to execute the event loop engine when timeout is detected. I implemented the second option and it does seem to fix the problem. --- nmap-master-old/nsock/src/nsock_core.c 2016-07-27 16:35:00.000000000 -0400 +++ nmap-master-new/nsock/src/nsock_core.c 2016-07-28 00:47:52.403081022 -0400 @@ -899,6 +899,7 @@ enum nsock_loopstatus nsock_loop(nsock_p int msecs_left; unsigned long loopnum = 0; enum nsock_loopstatus quitstatus = NSOCK_LOOP_ERROR; + int last_try = 0; gettimeofday(&nsock_tod, NULL); @@ -934,8 +935,12 @@ enum nsock_loopstatus nsock_loop(nsock_p if (msec_timeout >= 0) { msecs_left = MAX(0, TIMEVAL_MSEC_SUBTRACT(loop_timeout, nsock_tod)); if (msecs_left == 0 && loopnum > 0) { - quitstatus = NSOCK_LOOP_TIMEOUT; - break; + if (!last_try) { + last_try = 1; + } else { + quitstatus = NSOCK_LOOP_TIMEOUT; + break; + } } } Regards, Eric _______________________________________________ Sent through the dev mailing list https://nmap.org/mailman/listinfo/dev Archived at http://seclists.org/nmap-dev/
_______________________________________________ Sent through the dev mailing list https://nmap.org/mailman/listinfo/dev Archived at http://seclists.org/nmap-dev/
Current thread:
- Bug with nping sequence number Éric Hoffman (Jul 27)
- Re: Bug with nping sequence number Daniel Miller (Aug 22)
- Message not available
- Re: Bug with nping sequence number Éric Hoffman (Aug 24)