Nmap Development mailing list archives

Re: Ideas for rate limit detection?


From: "ithilgore.ryu.L () gmail com" <ithilgore.ryu.l () gmail com>
Date: Fri, 09 Jan 2009 00:54:50 +0200

David Fifield wrote:
Hi,

I plan to merge most of the changes from the nmap-perf branch into the
trunk soon. I'll write a mail about that when I'm done. But of interest
now is what I'm not merging.

When I began working on performance improvements, I started working on
making rate limit detection more robust, thinking that it would be easy.
Instead it was challenging and is not solved yet.

Rate limiting is when a target or firewall will only send back so many
replies per second. Linux, for example, by default will send only one
ICMP destination unreachable per second (/proc/sys/net/ipv4/icmp_ratelimit).
Mac OS X limits destination unreachables and RSTs to 250 per second.
When Nmap says
      Increasing send delay from 0 to 5 due to 215 out of 716 dropped probes since last increase.
it's trying to compensate for rate limiting.
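
A minimal sketch of what such a limiter does, in Python (the token-bucket
shape and all the names are illustrative assumptions, not the kernel's
actual implementation):

    import time

    class TokenBucket:
        """Toy reply limiter: allow at most `rate` replies per second."""
        def __init__(self, rate, burst=1):
            self.rate = rate             # tokens replenished per second
            self.tokens = float(burst)   # current reply budget
            self.burst = float(burst)    # maximum reply budget
            self.last = time.monotonic()

        def allow_reply(self):
            now = time.monotonic()
            self.tokens = min(self.burst,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return True              # the reply goes out
            return False                 # the reply is silently dropped

With rate=1 this behaves like the Linux default: probes arriving faster
than one per second simply get no destination unreachable back, which
looks exactly like packet loss to the scanner.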

What are the problems with Nmap's current rate limit detection? It is
susceptible to false positives: spuriously detected rate limiting slows
down a scan that congestion control alone should have handled. And it is
too coarse: once rate limiting has been detected, the current scan delay
system can never send faster than 200 packets per second, even if the
rate limit is higher than that, and the inferred limits can only take a
few discrete values. Brandon hit this against a machine with a rate
limit of about 4,000: http://seclists.org/nmap-dev/2008/q4/0604.html.
But even with these deficiencies, the way it is now is not too bad.
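
The arithmetic behind that 200 packets per second ceiling, assuming (as
the log message quoted above suggests) that the delay starts at 5 ms and
doubles on each increase:

    # Scan delay is a pause between probes, so the maximum send rate is
    # simply 1000 ms / delay. With a doubling ladder starting at 5 ms:
    for delay_ms in (5, 10, 20, 40, 80):
        print(f"{delay_ms:3d} ms delay -> at most "
              f"{1000 // delay_ms} packets per second")
    # Even the smallest delay, 5 ms, caps the scan at 200 pps, far below
    # a rate limit of about 4,000 like the one Brandon hit.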

Here is a list of goals for a rate limit detection system:

 * It must be able to distinguish rate limiting from congestion.
 * Against a rate-limited host, it must quickly limit sending to the
   limited rate.
 * Against a host that is not rate limited, it must either set no
   maximum rate at all or set one so high that it makes no difference.
 * If too low a maximum rate is set, the rate must be able to increase.

The really hard one of these is the first: how to know whether a series
of drops is due to rate limiting rather than network congestion. If
congestion is detected as rate limiting, setting a maximum rate keeps
the scan from speeding up when conditions improve. If limiting is
detected as congestion, Nmap's congestion control will tend to let the
scan go too fast and miss ports. But once you know rate limiting is in
effect, it's not hard to find what the limit is.

Why doesn't congestion control work against rate limiting? Network
congestion and rate limiting are two separate problems. Congestion
control has to do with how much data is in the network at once,
independent of how fast packets are sent. It's a function of how full
the buffers in routers and the remote host are. Rate limiting limits how
fast replies may be sent, independent of how full the network is.
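
One way to see the distinction is as two unrelated loss functions (a toy
model for illustration only; the buffer and cap parameters are made up):

    def congestion_drops(bytes_in_flight, buffer_bytes):
        # Congestion drops grow with how much data is sitting in router
        # buffers, regardless of the instantaneous send rate.
        return max(0, bytes_in_flight - buffer_bytes)

    def rate_limit_drops(replies_asked_per_s, reply_cap_per_s):
        # Rate-limit drops grow with how fast replies are solicited,
        # regardless of how empty the network is.
        return max(0, replies_asked_per_s - reply_cap_per_s)

Congestion control manipulates the first quantity (data in flight); it
has no handle at all on the second.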

Here is a list of things I tried.

  Counting drops. What Nmap currently does is detect rate limiting
  whenever drops make up more than 30% of all responses (40% with -T4).
  "215 out of 716 dropped probes since last increase." Even though this
  method doesn't even consider how fast we're sending or receiving, it
  was the most effective of anything I tried. The main problem with it
  is false positives. A little bit of congestion early in the scan could
  slow down the rest. Or one host scanned among many can have its
  responses crowded out, get limited, and be scanned too slowly even
  after the other hosts have finished.
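
  A sketch of that heuristic (the 30% and 40% thresholds come from the
  text above; the rest of the shape is illustrative, not Nmap's code):

      def rate_limit_suspected(dropped, total, aggressive=False):
          # Drop counting: suspect rate limiting when probes dropped
          # since the last delay increase exceed a fixed fraction of
          # the total.
          threshold = 0.40 if aggressive else 0.30   # -T4 uses 40%
          return total > 0 and dropped / total > threshold

      # The quoted message: 215 dropped out of 716 is a ratio of about
      # 0.3003, just over the default 30% bar, hence the delay increase.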

  Counting retransmits. Each probe carries with it a retransmission
  number (tryno). Whenever a response comes back for a higher tryno than
  any seen before, a variable called max_successful_tryno is incremented.
  Once max_successful_tryno is more than 3, scan delay increases every time
  it is incremented. Nmap waits for one second before each new tryno,
  hoping to exhaust any rate limit and detect probes that were dropped
  earlier. This turns out to be a pretty good heuristic; in a UDP scan I
  normally see it increase the scan delay before drop counting does. But
  again this suffers from false positives.
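
  A rough sketch of that bookkeeping (max_successful_tryno is named in
  the text; the doubling helper and the exact flow are guesses, not
  Nmap's actual code):

      def bump_scan_delay(delay_ms, max_delay_ms=1000):
          # Hypothetical delay ladder, per the "from 0 to 5" message.
          return min(max(2 * delay_ms, 5), max_delay_ms)

      max_successful_tryno = 0

      def on_response(tryno, scan_delay_ms):
          """Update state when a probe with retransmission number
          `tryno` gets an answer; return the (possibly bumped) delay."""
          global max_successful_tryno
          if tryno > max_successful_tryno:
              max_successful_tryno = tryno
              if max_successful_tryno > 3:
                  # Needing ever-higher trynos to get answers suggests
                  # earlier tries are being rate-limited away.
                  scan_delay_ms = bump_scan_delay(scan_delay_ms)
          return scan_delay_ms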

  Measuring the sending and receiving rates, and assuming rate limiting
  whenever the send rate exceeds the receive rate. The idea is that if
  we're sending 150 packets per second and receiving only 100, something
  is limiting the return rate to 100 packets per second. A major
  difficulty with this approach is filtered targets. When the target
  doesn't respond to every probe, I scaled the measured rate by the
  inverse of the host's responsiveness ratio. That's conceptually
  unsound, as against a filtered target most of the responses will be
  from timing pings, which are rate-limited by us on sending. However
  against filtered hosts we don't care much about rate limiting because
  there aren't that many replies to limit.
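
  A sketch of the comparison, including the responsiveness scaling
  described above (the function shape and the `slack` fudge factor are
  assumptions; only the idea is from the text):

      def rate_limited(sent_per_s, recv_per_s, responsiveness, slack=1.2):
          """Suspect rate limiting when the receive rate falls well below
          what the host's responsiveness says we should be getting."""
          if responsiveness <= 0:
              return False          # fully filtered: nothing to limit
          expected = sent_per_s * responsiveness
          return recv_per_s * slack < expected

      # The example from the text: 150 pps sent to a fully responsive
      # host, only 100 pps received -> a ~100 pps cap is suspected.
      print(rate_limited(150, 100, 1.0))   # True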

  Tracking changes in RTT. If I'm not mistaken, RTT is mostly a function
  of buffering delay, and during congestion the amount of data in
  buffers grows exponentially. Therefore the RTT should also grow
  exponentially, whereas in the case of rate limiting the RTT will be
  relatively flat. I tried keeping a short-term average and a long-term
  average of RTT, and a measurement of the derivative of the RTT
  function. Whenever the short-term average exceeded the long-term by a
  certain amount and the derivative was positive, I assumed congestion
  was the cause. This is somewhat similar to TCP Vegas, which compares
  the current RTT with an estimated "base" RTT to detect congestion
  early. I didn't give this technique much testing but early results
  were disappointing.
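
  A sketch of the short-term/long-term average idea (the smoothing
  constants and the 1.25 margin are guesses; the structure is mine, not
  the code that was tested):

      class RttTrend:
          """Fast and slow exponential averages of RTT plus a crude
          derivative; flags congestion when RTT is rising and the
          short-term average is well above the long-term baseline."""
          def __init__(self, fast_alpha=0.3, slow_alpha=0.05):
              self.fa, self.sa = fast_alpha, slow_alpha
              self.fast = self.slow = self.prev = None

          def looks_congested(self, rtt_ms):
              if self.fast is None:
                  self.fast = self.slow = self.prev = rtt_ms
              self.fast += self.fa * (rtt_ms - self.fast)
              self.slow += self.sa * (rtt_ms - self.slow)
              rising = rtt_ms > self.prev
              self.prev = rtt_ms
              # Queues filling up -> growing RTT -> congestion, whereas
              # under pure rate limiting RTT stays relatively flat.
              return rising and self.fast > 1.25 * self.slow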

  I tried searching arXiv and Google Scholar for research papers about
  rate limit detection, but didn't find anything useful. (There's a lot
  about congestion and about rate limiting to mitigate DDoS.) I found a
  report of a research project that was only partially successful:
  http://www-iepm.slac.stanford.edu/monitoring/limit/exec-summary.html.

  Something I didn't try but which might be worth looking into is what
  PortBunny apparently does
  (http://www.recurity-labs.com/content/pub/PortBunnyDefcon16.pdf):
  sending packets of different sizes and seeing if the loss rate remains
  the same. The idea is that rate limiting doesn't care about packet
  size but during congestion bigger packets are more likely to get
  dropped.
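
  A sketch of only the decision step of that test (the probe sizes,
  counts, and comparison margin are arbitrary choices):

      def size_sensitive_loss(small_lost, small_sent, big_lost, big_sent,
                              margin=1.5):
          """Compare loss ratios of small vs. big probes sent over the
          same interval. Congestion should punish big packets more; a
          rate limiter should not care about size."""
          if small_sent == 0 or big_sent == 0:
              return "unknown"
          small_rate = small_lost / small_sent
          big_rate = big_lost / big_sent
          if big_rate > small_rate * margin:
              return "congestion"     # size matters: queues are full
          return "rate limiting"      # size-blind loss: a reply cap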

It's not good science for me to think about this by myself; what counts
is finding the solution, not necessarily being the one to find it. What
are your ideas? Are you aware of any research, or other programs that do
this? I'm going to have to focus on some other things soon but I want to
leave this discussion open.

David Fifield

Very interesting subject. Quick braindumping follows:
In my view, one approach would be to take into account the fact that if
rate-limiting is in place, then it will *always* happen, meaning that
its effects will be visible during the whole scanning session. On the
other hand, network congestion is more likely to happen only at certain
times (though there is always a chance that the network is congested the
whole time depending on how long the session is).
Taking this into account, along with PortBunny's technique of sending
packets of variable size, Nmap could periodically send bursts of probes,
where:
a) if there are dropped packets every time, then chances are that
rate limiting is in place. The question is *when* to conclude that it
has happened often enough to be the default behaviour; I guess that is
a matter of practical testing.
b) if at one time there are dropped packets and at another time there
aren't as many, then network congestion is the more likely culprit
c) if both happen, then PortBunny's technique will help clear up which
is which, since big packets will be more likely to be dropped under
congestion

The above isn't very different from drop counting (which happens anyway,
since we still look at how many packets were dropped), though it takes a
slightly more aggressive approach; a rough sketch of the combined
decision follows below.
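
A sketch of that combined decision, assuming `rounds` holds one
(small-packet loss ratio, big-packet loss ratio) pair per periodic probe
burst (the 1.5 margin is an arbitrary assumption):

    def classify(rounds):
        lossy_every_round = all(small > 0 for small, big in rounds)
        size_sensitive = any(big > small * 1.5 for small, big in rounds)
        if lossy_every_round and not size_sensitive:
            return "rate limiting"   # (a): losses never let up, size-blind
        if not lossy_every_round:
            return "congestion"      # (b): losses come and go
        return "congestion"          # (c): big packets suffer more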

In addition, we could take a look into what kind of traffic is usually
rate-limited (maybe by observing best-practices/patterns on firewall
policies), so that we could mainly focus on it and thus extract more
accurate results. For example, if ICMP echo replies are usually
rate-limited, then for this particular kind of traffic Nmap should lean
towards suspecting rate limiting rather than network congestion.

Cheers,
ithilgore
sock-raw.org

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org

