nanog mailing list archives

Re: exchange point media


From: "Richard A. Steenbergen" <ras () e-gerbil net>
Date: Mon, 17 Jul 2000 17:16:31 -0400 (EDT)


On Mon, 17 Jul 2000, Mikael Abrahamsson wrote:

We had a discussion here a while back about exchange point media. The
outcome was that Gigabit ethernet vendors do support jumbo frames and
that the MTU disadvantage GE has could be overcome.

Now, imagine the following scenario:

We connect a router (router1) to this fictitious exchange point running
(gig)ethernet. This router does support jumbo frames and has an 8k MTU.

Somewhere else on the exchange point is another router (router2), also
connected to the same broadcast domain. This router does NOT support jumbo
frames but has the standard 1500 MTU.

What happens if router1 tries to send a packet to router2, which has a
1500 MTU? It thinks it's perfectly valid to send an 8k packet. (PMTUd
won't work here, we're talking layer2).

Correct: a silent L2 discard, counted as a giant frame...
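
The scenario above can be sketched as a toy model. This is only an
illustration, not how any particular router or switch is implemented: the
Port class, the deliver function, the giants counter, and the 8192-byte
value standing in for "8k" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Port:
    """Hypothetical router port on the shared exchange-point LAN."""
    name: str
    mtu: int          # largest IP packet the interface will accept, in bytes
    giants: int = 0   # oversized frames dropped on receive


def deliver(packet_len: int, sender: Port, receiver: Port) -> bool:
    """Model one packet crossing a pure L2 fabric between two ports.

    Returns True if the receiver accepts the packet. There is no L3 hop
    in between, so nothing fragments the packet and nothing sends an ICMP
    error back: an oversized frame is simply counted and dropped.
    """
    if packet_len > sender.mtu:
        raise ValueError("packet exceeds the sender's own MTU")
    if packet_len > receiver.mtu:
        receiver.giants += 1   # silent discard, the sender learns nothing
        return False
    return True


router1 = Port("router1", mtu=8192)   # jumbo-capable (assumed value for "8k")
router2 = Port("router2", mtu=1500)   # standard Ethernet MTU

small_ok = deliver(1500, router1, router2)
jumbo_ok = deliver(8000, router1, router2)
print(small_ok, jumbo_ok, router2.giants)
```

The only feedback in this model is the giants counter on the receiving
port, which is why the sender cannot discover the problem by itself.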

My other guess is that if the switch in between (we're probably not
talking point-to-point-links here because this is an exchange point,
right?) is layer3-aware (as most are today) it could/would fragment the
packet or send a "need to fragment" ICMP to the originator IP. Will any
switch today do this? What vendors do this? (I have been told that the old
DEC Gigaswitches will do this between FDDI and FastEth: they will fragment
the IP packet if necessary).

A Foundry BigIron doing L3 should, exactly as if it was a router and not a
switch, I believe. At that point there is no real technical distinction
between it and a router with lots of ethernet ports however. I'm not aware
of any exchanges doing L3...

A third solution would be that I think I saw somewhere that some OSes
support setting host routes where you could enter the MTU of certain
specific IPs. This could also rectify the problem by simply
configuring the switches for jumbo frames and then setting the default
MTU to 1500 on routers and then people who support jumbo frames could
include this in their peering announcements/agreements. If two parties
both support them, their equipment could use the larger frames when
talking to each other over this shared medium.

FreeBSD lets you set the MTU based on the route... You could do something
like this, enabling a larger MTU for specific targets, I suppose. I'm not
aware of anyone who is doing this (or probably anyone who would,
especially at L2, without a good reason). This assumes the exchange point
has a switch capable of it.
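
As a rough sketch of the per-target idea: keep 1500 as the default and
raise the MTU only for peers that have agreed to jumbo frames. The table,
the function name, the addresses (taken from the documentation range
192.0.2.0/24), and the 8192 value are hypothetical, and this models only
the lookup, not any operating system's routing table.

```python
import ipaddress

DEFAULT_MTU = 1500  # safe value for every router on the shared LAN

# Hypothetical table of peers that have confirmed jumbo-frame support,
# for example in their peering agreement.
JUMBO_PEERS = {
    ipaddress.ip_address("192.0.2.10"): 8192,
    ipaddress.ip_address("192.0.2.22"): 8192,
}


def mtu_for_next_hop(next_hop: str) -> int:
    """Return the MTU to use toward a peer, falling back to the default."""
    return JUMBO_PEERS.get(ipaddress.ip_address(next_hop), DEFAULT_MTU)


print(mtu_for_next_hop("192.0.2.10"))  # peer listed as jumbo-capable
print(mtu_for_next_hop("192.0.2.99"))  # unknown peer, gets the default
```

Defaulting to 1500 means a missing or stale entry costs only efficiency,
whereas defaulting to the larger value would cause silent drops.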
  
Another option would be to pick the other unit's MTU off of the TCP
session enabled for the (very probable) BGP peering. I seem to
remember that TCP involves a MTU negotiation between endpoints and
that would mean that you implicitly get to know the MTU of all your
peers (which are the ones you might send packets to). Any vendors
which do a "hack" like this? This would not work if the default MTU is
1500 though, it would rather mean you have to have a default MTU of 8k
(or so) and find out anyone who is not jumbo capable via the TCP
session involved with the BGP peering.

The TCP MSS is negotiated based off the MTU, so you cannot base the MTU off
the MSS; that is circular logic. I highly doubt you will ever get support
for jumbo frames auto-negotiated without first standardizing jumbo frames.

I for one would love to see an intelligent standard realizing that 1500 is
a remarkably stupid and limiting number, and enabling us to bring new life
to public exchange point peering.

-- 
Richard A Steenbergen <ras () e-gerbil net>   http://www.e-gerbil.net/humble
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)



