nanog mailing list archives

Re: PMTU and Broken Servers


From: "Stephen J. Wilcox" <steve () telecomplete co uk>
Date: Mon, 12 May 2003 21:03:46 +0100 (BST)



Okay, so we're not actually saying the TCP stack is broken, as I interpreted 
your previous email; we mean there are routers with broken (user) config on them, 
i.e. dropping the ICMP Can't Fragment messages. Sorry!

Steve


On Mon, 12 May 2003, Curtis Maurand wrote:


I had a problem where a NXNetworks VPN router didn't process the results 
properly.  I couldn't put my finger on exactly whose router was causing 
the trouble, but using a freeswan-to-freeswan tunnel I was able to test my 
theory: I gradually increased the MTU on my connection until I got a failure.  
One end of the VPN is on a RoadRunner connection and the other was on a 
Prexar connection.  The route in between is anyone's guess, but I think, 
at the time, Prexar was trying to push traffic over their Cable and 
Wireless connection.  Now that C&W is gone, I'll have to try it again.

Curtis
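
The "increase the MTU until it fails" test described above can be sketched as a
search over packet sizes. This is only an illustration, not the procedure
Curtis actually used: `probe` is a hypothetical callable that would send one
DF-marked packet of the given size across the path and report whether a reply
came back, and the 576 and 1500 byte bounds are assumptions. A binary search is
swapped in for the gradual increase, which is only valid if every size below
the limit works and every size above it fails.

```python
from typing import Callable


def largest_working_size(probe: Callable[[int], bool],
                         low: int = 576, high: int = 1500) -> int:
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe(low) succeeds and that success is monotonic in size.
    """
    if not probe(low):
        raise ValueError("even the smallest size fails; path is broken")
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always makes progress
        if probe(mid):
            low = mid                 # mid works, so the answer is mid or larger
        else:
            high = mid - 1            # mid fails, so the answer is below mid
    return low


# Stand-in for a real path: pretend anything above 1480 bytes is dropped.
def simulated_path(size: int) -> bool:
    return size <= 1480


limit = largest_working_size(simulated_path)
```

With a real probe, a silent drop and an ICMP error both count as a failure, so
the search finds the working size but does not by itself show which hop is at
fault.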

On Mon, 12 May 2003, Stephen J. Wilcox wrote:


You mean there are routers which get a large packet and silently drop it rather 
than return an ICMP error?

Curious to know which vendors? (read fundamentally broken!)

Steve

On Mon, 12 May 2003, Curtis Maurand wrote:



I've had the problem before.  Not all routers handle PMTU correctly.

Curtis

On Thu, 8 May 2003, Leo Bicknell wrote:


I've recently had the pleasure of troubleshooting a problem I don't
normally have to deal with, and the results don't quite make sense
to me.  I'm hoping someone can enlighten me as to what is going on.
A diagram:

server---internet---fw---tunnelbox1----tunnelbox2----user

The tunnel between the tunnelboxes has a lower (1480) MTU.  Originally
the user couldn't access some servers; it turns out the firewall was
filtering ICMP Can't Fragment messages, preventing PMTU from working
in the server->user direction (tunnelbox1 would generate Can't
Fragment, firewall would filter).
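
The decision tunnelbox1 faces for each packet can be sketched as follows. This
is a simplified model, not the behaviour of any particular product: the
`Packet` class and `forward` function are made up for illustration, and IPv4
headers are assumed to be 20 bytes with no options. The ICMP values are the
standard Destination Unreachable (type 3), Fragmentation Needed and DF Set
(code 4).

```python
from dataclasses import dataclass

ICMP_DEST_UNREACHABLE = 3   # ICMP type
ICMP_FRAG_NEEDED = 4        # ICMP code: fragmentation needed and DF set
IPV4_HEADER_LEN = 20        # assumption: no IP options


@dataclass
class Packet:
    total_length: int       # IPv4 total length in bytes, header included
    dont_fragment: bool     # state of the DF bit


def forward(packet: Packet, next_hop_mtu: int):
    """Model what a well-behaved router does with one IPv4 packet."""
    if packet.total_length <= next_hop_mtu:
        return ("forward", [packet.total_length])
    if packet.dont_fragment:
        # Too big and DF set: drop it and report the next-hop MTU to the
        # sender. If a firewall eats this message, the sender never learns.
        return ("icmp", (ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED, next_hop_mtu))
    # Too big and DF clear: fragment. Every fragment except the last must
    # carry a payload that is a multiple of 8 bytes.
    max_payload = (next_hop_mtu - IPV4_HEADER_LEN) // 8 * 8
    remaining = packet.total_length - IPV4_HEADER_LEN
    sizes = []
    while remaining > 0:
        chunk = min(max_payload, remaining)
        sizes.append(chunk + IPV4_HEADER_LEN)
        remaining -= chunk
    return ("fragment", sizes)


full_size_df = forward(Packet(1500, True), 1480)
full_size_no_df = forward(Packet(1500, False), 1480)
```

The first call models a full-size packet from the server hitting the 1480-byte
tunnel with DF set; the second shows the same packet with DF clear, which is
fragmented instead of being dropped.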

That's been corrected.  Going to a server I control I see good PMTU
in both directions between the server and the user.  However, there
are still a number of web servers for popular sites that behave
just like the firewall was still filtering Can't Fragments.  The
theory is that the servers are behind a firewall/load balancer that
is filtering them on the server side -- but I find it slightly
(emphasis on the slightly) surprising that someone would turn on PMTU
discovery, and then filter it out right in front of the boxes where they turned
it on.  Also, it seems to me most DSL users are behind PPPoE links
with lower MTU, and should get hit by the same problem.
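
For a sense of scale, the largest TCP payload that fits in one unfragmented
packet on each kind of link can be worked out directly. This is a minimal
sketch that assumes 20-byte IPv4 and 20-byte TCP headers with no options; the
1492-byte PPPoE figure is the usual Ethernet MTU minus 8 bytes of PPPoE and PPP
overhead, and the function and table names are made up for illustration.

```python
IPV4_HEADER_LEN = 20   # assumption: no IP options
TCP_HEADER_LEN = 20    # assumption: no TCP options


def max_tcp_payload(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_LEN - TCP_HEADER_LEN


LINK_MTUS = {
    "ethernet": 1500,
    "pppoe": 1492,
    "tunnel in the diagram": 1480,
}

payloads = {name: max_tcp_payload(mtu) for name, mtu in LINK_MTUS.items()}
```

A server that sends 1460-byte segments with DF set will overshoot both the
PPPoE link and the tunnel, which is why the two cases would be expected to fail
in the same way.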

The temporary hack is to have tunnelbox1 clear the DF bit on all
incoming packets, which just causes the packets to get fragmented
going down the tunnel.  A minor performance hit, but it works.
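
What "clear the DF bit" involves at the byte level can be sketched like this.
It is only an illustration of the header manipulation, not how tunnelbox1 or
any real router implements it; the function names are made up. DF is the
0x4000 bit of the 16-bit flags/fragment-offset word at byte offset 6, and
because the header checksum covers that word it has to be recomputed.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the header's 16-bit words.

    To compute a checksum, the checksum field must be zero. Run over a
    header that already carries a valid checksum, the result is 0.
    """
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def clear_df(header: bytes) -> bytes:
    """Return a copy of an IPv4 header with DF cleared and checksum fixed."""
    ihl = (header[0] & 0x0F) * 4             # header length in bytes
    hdr = bytearray(header[:ihl])
    hdr[6] &= 0xBF                           # DF is 0x40 in the high byte of the flags word
    hdr[10:12] = b"\x00\x00"                 # zero the checksum before recomputing
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


# Example header: 1500-byte TCP packet with DF set, documentation addresses.
example = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 0x1234, 0x4000,
                      64, 6, 0, bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))
cleared = clear_df(example)
```

Once DF is clear, a packet larger than the tunnel MTU is fragmented rather than
dropped, which matches the behaviour described above.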

This is a new problem to me, but I'm sure people have run into it
before.  Are the servers really that broken (PMTU enabled, ICMP
Can't Fragment filtered)?  Does the head end box of DSL services
generally do something to work around this (i.e., clear the DF bit)?
Am I just being an idiot and missing something obvious?








