nanog mailing list archives

Re: PMTU-D: remember, your load balancer is broken


From: Valdis.Kletnieks () vt edu
Date: Wed, 14 Jun 2000 11:06:37 -0400

On Tue, 13 Jun 2000 22:36:08 MDT, Marc Slemko said:
I shouldn't get started here.  I have trouble buying into HP's
way of doing things (I was only aware that HPUX did this; but it seems
that AIX does too...).  If you run a high traffic DNS server

AIX started supporting PMTU-D for both TCP and UDP in 4.2.1.  The gotcha
was it being on by default in 4.3.3.

on an AIX box without disabling this "feature" then you must just
be spewing ICMP echo requests.  It could add up to more bytes
than your DNS responses...

Well, as I said, it was done in error, and yes, the bytes for the ICMP
*were* running almost as high as the actual NTP traffic... 

The surprising part is that it was broken for close to 3 months before
somebody noticed (yesterday, just a few hours before this discussion started,
in fact).

As noted, PMTU-D for TCP is a lot lighter weight, and has an actual chance
of winning sometimes. Does anybody know of a UDP-based application that is
able to *do* anything with PMTU-D?  A co-worker had heard of research at
PSC that dealt with TCP-friendly multicast, but that was all we could think of...

And, obviously, ICMP pings don't work too well much of the time
anyway.  And I'm concerned about the possibility of some nasty DoS
potential by exploiting this.  I haven't looked into this in depth, and
it depends on how it handles cache replacement, etc.

Except that, technically, you are not permitted to just blindly send 
segments of such size.  Well, you can but systems in the middle don't 
have to handle them.  No?

Hmm.. either I did a bad job of explaining or I haven't had enough caffeine
to parse what you said.  Given that you also suggest going to a 1460 MSS,
I suspect that we're actually violently in agreement here.

Now if I can remember why I chose 1396 for a default MSS.... ;)

It is also a concern that, in my experience, many of the links with
MTUs <1500 are also the links with greater packet loss, etc. so 
you really don't want fragmentation on them.

The worst part here is that I suspect that most of these links (just on
sheer numbers of shipped product) are the aforementioned Win98 576-MTU.

However, in this case, the fragmentation happens in a terminal server on
the last hop, and hopefully the case of a terminal server running out of
queueing buffers and having to drop one of the 2 remaining fragments of
a 1500->576 split after sending the first one is pretty rare....
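
For anyone who wants to check the fragment count in that 1500->576 case, here is a minimal sketch of the arithmetic. It assumes a plain 20-byte IPv4 header with no options; the function name fragment_sizes is made up for illustration and is not taken from any real stack.

```python
def fragment_sizes(total_len, mtu, ip_header=20):
    """Return the on-wire length (header + payload) of each IPv4 fragment.

    Assumes a fixed 20-byte IPv4 header with no options (an assumption,
    not something stated in the thread).
    """
    if total_len <= mtu:
        # Packet already fits the link; no fragmentation needed.
        return [total_len]
    payload = total_len - ip_header
    # Every fragment except the last must carry a multiple of 8 payload bytes,
    # because the fragment offset field counts in 8-byte units.
    max_payload = (mtu - ip_header) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(payload, max_payload)
        sizes.append(chunk + ip_header)
        payload -= chunk
    return sizes


print(fragment_sizes(1500, 576))  # [572, 572, 396]
```

Under those assumptions a 1500-byte packet turns into three fragments on a 576-MTU link, so losing either of the two that follow the first one costs the whole original packet.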

I seem to remember that the *original* motivation for slow-start and
all that was Van Jacobson's observation that the most common cause of
a TCP retransmit was that an *entire* packet had been silently dropped
due to queueing congestion, and could thus be treated identically to
an ICMP Source Quench.

Has this changed?  Has "fragmentation" become a Great Evil, rather than
an annoyance that some links have to deal with?

I think it is simply that the net is in a state of somewhat
amazing homogeneity right now.  I don't think this will continue,
but who knows.  I do think that PMTU-D is an important feature, and 
people should be encouraged to leave it enabled wherever possible,
so that one day if networks do change to make it more useful in the
general case, it will be there...

At least for TCP.  I'm still unconvinced for UDP ;)


-- 
                                Valdis Kletnieks
                                Operating Systems Analyst
                                Virginia Tech

