nanog mailing list archives

Re: exchange point media


From: "Richard A. Steenbergen" <ras () e-gerbil net>
Date: Mon, 17 Jul 2000 18:10:38 -0400 (EDT)


On Mon, 17 Jul 2000, Mikael Abrahamsson wrote:


On Mon, 17 Jul 2000, Richard A. Steenbergen wrote:

A Foundry BigIron doing L3 should, exactly as if it were a router with
interfaces and not a switch, I believe. At that point, however, there is no
real technical distinction between it and a router with lots of ethernet
ports. I'm not aware of any exchanges doing L3...

Well, the device should do no routing in the classical meaning of the
word, not on L3 anyway. To do fragmentation it has to be semi-L3-aware
though. It also needs an IP address to send the needtofrag-ICMPs from.

I'm not sure anyone is doing L3 solely for fragmentation when traversing
a switch with multiple interfaces and disparate MTUs in a modern
environment. You'd have to ask the switch vendors; my guess is that if it's
an L3 switch function it could be made to do so at decent levels of
performance, but this is still bad.
  
FreeBSD lets you set the MTU based on the route... You could do something
like this, enabling a larger MTU for specific targets, I suppose. I'm not
aware of anyone who is doing this (or probably anyone who would,
especially at L2, without a good reason). This assumes the exchange point
has a switch capable of it.

We're talking L3 here (routers). Normally the L3 MTU is derived from
the L2 MTU; here we would need to derive it from either static
configuration or from the below MSS/MTU mechanism (which I don't think
will happen as it has too much of a "hack" in it).

The OS model you're referring to lets you derive it from the routing entry
instead of just the physical interface, so you can specify a different MTU
for a different prefix, for whatever reason, even if it's on the same
physical interface. At this point you're talking about a grody, poorly
managed hack which assumes an awful lot and is probably going to break a
lot of things.
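
As a rough illustration of the per-route MTU model described above, here is
a minimal Python sketch. It is not FreeBSD's implementation; the table
layout, the interface name "em0", the documentation prefixes, and the
function name effective_mtu are all assumptions made up for the example.
The idea is only that the MTU comes from the most specific matching route
and falls back to the interface MTU when the route does not set one.

```python
import ipaddress

# Hypothetical interface table: the physical port is configured for 9000.
INTERFACE_MTU = {"em0": 9000}

# Hypothetical routing table: (prefix, interface, per-route MTU or None).
# None means "no per-route override, use the interface MTU".
ROUTES = [
    ("0.0.0.0/0", "em0", 1500),        # everyone else stays at 1500
    ("192.0.2.0/24", "em0", 9000),     # a peer known to handle jumbo frames
    ("198.51.100.0/24", "em0", None),  # no override on this route
]


def effective_mtu(destination):
    """Return the MTU for a destination using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, ifname, mtu in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, ifname, mtu)
    if best is None:
        raise LookupError("no route to " + destination)
    _, ifname, mtu = best
    # Fall back to the interface MTU when the route carries no override.
    return mtu if mtu is not None else INTERFACE_MTU[ifname]
```

Note that two prefixes on the same physical interface end up with different
MTUs, which is exactly the part that depends on the operator getting every
entry right.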

The TCP MSS is negotiated based off the MTU, so you cannot base the MTU off
the MSS; that is circular logic. I highly doubt you will ever get support
for jumbo frames auto-negotiated without first standardizing jumbo frames.

Yes, it can. Router1 should be able to figure out router2's MTU from
the MSS of its TCP session with router2. Router2 has no problem here
since its MTU is the lowest one anyway.
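
The arithmetic behind the MSS/MTU relationship discussed here can be
sketched in a few lines of Python. This assumes IPv4 with no IP or TCP
options (20 bytes of header each), and it assumes the peer derived its
advertised MSS directly from its interface MTU, which is the very
assumption under debate. The function names are made up for illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_from_mtu(mtu):
    """MSS a host would normally advertise for a given interface MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def mtu_from_mss(mss):
    """MTU implied by an advertised MSS, under the same assumptions."""
    return mss + IPV4_HEADER + TCP_HEADER
```

For example, an advertised MSS of 1460 implies an MTU of 1500, and an MSS
of 8960 implies an MTU of 9000. The inference only holds if nothing along
the way has clamped or rewritten the MSS.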
 
This could work in a very backwards fashion. Something like both sides
starting with an MTU of 1500, but configured to try to reach 9k if
possible. When a BGP peering comes up, Router1 lies and offers an MSS of
9k in its SYN, and if Router2 accepts and both sides complete a handshake
with an agreed-upon value, both physical MTUs would be increased. This is a
horrible hack; among other things, it assumes there must be a TCP-based BGP
peering in order for the sides to negotiate > 1500, and that the switch
will accept this. I can't imagine it working like this in any production
system.
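
To make the hack above concrete, here is a minimal Python sketch of the
decision each side would make. Nothing here corresponds to a real router
feature; the function name, parameters, and the 40-byte header overhead
(IPv4 and TCP without options) are assumptions for illustration. It also
ignores the switch in the middle entirely, which is one of the reasons the
scheme is fragile.

```python
TCP_IP_OVERHEAD = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def negotiate_mtu(local_max_mtu, peer_max_mtu, base_mtu=1500):
    """Return the MTU both sides would raise their interfaces to.

    Each side advertises an MSS derived from the largest MTU it is willing
    to try rather than from its current interface MTU (the "lie").
    """
    local_mss = local_max_mtu - TCP_IP_OVERHEAD
    peer_mss = peer_max_mtu - TCP_IP_OVERHEAD
    # TCP effectively uses the smaller of the two advertised values.
    agreed_mss = min(local_mss, peer_mss)
    # Never drop below the starting MTU both sides already use.
    return max(base_mtu, agreed_mss + TCP_IP_OVERHEAD)
```

With both sides willing to try 9000 the result is 9000; if the peer only
offers the MSS for 1500, both stay at 1500. The sketch has no way to learn
what the switch between them will forward, so a "successful" negotiation
could still black-hole large frames.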

I was under the impression that there is nothing magical about jumbo
frames and that there are no interoperational problems with them as
long as they're supported at all. Please correct me if I am wrong.   

There is no standardization on the maximum size of a jumbo frame, or any
guarantee that they will be supported at all. There is also no guarantee
that the ability to go > 1500 will not disappear in 10GigE.

I for one would love to see an intelligent standard realizing that 1500 is
a remarkably stupid and limiting number, and enabling us to bring new life
to public exchange point peering.

I think any new exchange point technology needs to have an MTU > 1500.

I would agree with that, but is the IEEE taking that into account or just
focusing on traditional desktop server and datacenter networking?

One of the most powerful advantages ethernet offers is the ability to be
cheaply and easily switched. Personally I'd love to have a cheap, effective,
high-density OC48 PoS switch (no flames please). :P

-- 
Richard A Steenbergen <ras () e-gerbil net>   http://www.e-gerbil.net/humble
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)



