nanog mailing list archives

Re: Westnet and Utah outage


From: Curtis Villamizar <curtis () ans net>
Date: Mon, 27 Nov 1995 15:56:55 -0500


In message <199511231608.IAA27230 () upeksa sdsc edu>, Hans-Werner Braun writes:
Question: Which RFC should I consult to determine acceptable delay and
packet loss?

RFCs are the result of IETF activities. The IETF is essentially a
protocol standardization group, not an operations group. I don't think
you perceive the IETF as "running" your network, do you? There may not
be much of an alternative, though, which to a large extent is the issue
at hand. Nobody is responsible (individually, or as a consortium, or
whatever) for this anarchically organized and largely uncoordinated (at
a systemic level) global operational environment. While the IETF/RFCs
could be utilized somehow, this is not really an issue of theirs. I
sure would not blame the IETF for not delivering here, as this is not
their mandate.

From other email I have seen, it seems the important issues are hard
for some to understand. I (and, I suspect, several others) don't really
care much about a specific tactical issue (be it an outage or whatever).
The issue is how to make the system work with predictable performance
and a fate-sharing attitude at a global level, in a commercial and
competitive environment that is still extremely young at that, and that
attempts to accommodate everything from mom'n'pop shops to multi-billion
dollar industry. It exhibits exponential growth in usage and ubiquity,
without the resources to upgrade quickly enough to satisfy all the
demands. It has no control over in-flows, and major disparities across
applications. And TCP flow control does not work that well: the
aggregation of transactions is very heavy, and the packet-per-transaction
count is so low on average that TCP may not be all that much better for
the network than UDP (in terms of adjusting to jitter in available
resources). Not to mention the age-old problem of routing table sizes
and routing table updates.


This belongs on the end2end-interest list or IPPM or elsewhere, but
I'll save a lot of people a trip through the archives.

In order to get bandwidth X on a given TCP flow you need an average
window size of X * RTT.  Expressed in TCP segments that is N = (X *
RTT) / MSS (or, more correctly, the segment size in use rather than
the MSS).  To sustain an average window of N segments, ideally you
reach a steady state where each loss cuts cwnd (the congestion window)
in half and linear growth then recovers it, so the window fluctuates
between 2/3 and 4/3 of the target size.  Since the window grows by one
segment per RTT, that means one drop every 2/3 N windows, i.e. a drop
interval of 2/3 N * RTT in time.  In practice, you rarely drop at the
perfect time, so the constant 2/3 (call it K) can be raised to 1-2.
In one RTT, on average X * RTT of data flows, so the interval between
drops in data is K * N * RTT * X; substituting N = (X * RTT) / MSS
gives DropRate = K * X * RTT * X * RTT / MSS.  The units are b/s * sec
* b/s * sec / b, or bits.  The DropRate expressed in bits can be
converted to seconds (divide by X) or packets (divide by MSS).  This
type of analysis is courtesy of the good folks at PSC (Matt, Jamshid,
et al.).
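
(For the curious, the arithmetic is trivial to mechanize.  Below is a
minimal sketch in Python; the function name, argument names, and the
K default are my own choices for illustration, not anything from the
PSC work.)

    def drop_interval(x_bps, rtt_sec, mss_bytes, k=1.0):
        """Interval between drops TCP can sustain at rate x_bps.

        Implements DropRate = K * X * RTT * X * RTT / MSS from the
        steady-state model above.  Returns the interval between
        drops as (bits, seconds, packets).
        """
        mss_bits = mss_bytes * 8
        bits = k * x_bps * rtt_sec * x_bps * rtt_sec / mss_bits
        return bits, bits / x_bps, bits / mss_bits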

For example, to get 40 Mb/s at 70 msec RTT with a 4096-byte MSS, you
can tolerate one error about every 6 seconds (K=1), or 1 in 7,300
packets.  If you look at 56 Kb/s with a 512-byte MSS you get a very
interesting result: at the same 70 msec RTT you can afford one error
only every 66 msec, or 1 error in 0.9 packets.  This gives a good
incentive to increase delay.  At 250 msec RTT, you get a result of one
error in 11.7 packets (much better!).
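
Running the sketch above reproduces these numbers (again, the 56 Kb/s
cases use a 70 msec and a 250 msec RTT respectively):

    for x, rtt, mss in [(40e6, 0.070, 4096),   # ~6 sec, ~1 in 7,300
                        (56e3, 0.070, 512),    # ~66 msec, ~1 in 0.9
                        (56e3, 0.250, 512)]:   # ~1 in 11.7
        bits, secs, pkts = drop_interval(x, rtt, mss)
        print("%g b/s, %g ms RTT: one drop per %.2f sec, %.1f packets"
              % (x, rtt * 1e3, secs, pkts))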

Another interesting point to note is that you need 3 duplicate ACKs
for TCP fast retransmit to work, so your window must be at least 4
segments (and should be more).  If you have a very large number of TCP
flows, where on average people get less than 1200 baud or so, the
delay you need to make TCP work well starts to exceed the magic 3
second boundary.  This was discussed ad nauseam on end2end-interest.
An important result is that you need more queueing than the
delay-bandwidth product for severely congested links.  Another is that
there is a limit to the number of active TCP flows that can be
supported per unit of bandwidth.  One suggestion to address the latter
problem is to further reduce the segment size if cwnd is less than 4
segments in size and/or when the estimated RTT gets into the seconds
range.
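
The 4-segment constraint is also easy to quantify: to keep at least 4
segments in flight, you need rate * RTT >= 4 * MSS, i.e. RTT >= 4 *
MSS / rate.  A sketch in the same vein (the segment sizes in the
examples are my own picks for illustration):

    def rtt_needed(rate_bps, mss_bytes, min_segments=4):
        """Minimum RTT at which rate_bps keeps min_segments in flight.

        Fast retransmit needs 3 duplicate ACKs, hence a window of at
        least 4 segments: rate * RTT >= min_segments * MSS.
        """
        return min_segments * mss_bytes * 8 / rate_bps

    rtt_needed(1200, 512)   # ~13.7 sec with a 512-byte MSS
    rtt_needed(1200, 128)   # ~3.4 sec even with a 128-byte MSS

Shrinking the segment size brings the needed RTT down, which is
exactly the motivation for the suggestion above.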

This analysis of how much loss is acceptable to TCP may not be outside
the bounds of an informational RFC, but so far none exists.

Curtis

