nanog mailing list archives

Re: best practice for advertising peering fabric routes


From: "Patrick W. Gilmore" <patrick () ianai net>
Date: Tue, 14 Jan 2014 22:35:31 -0500

On Jan 14, 2014, at 22:20 , Leo Bicknell <bicknell () ufp org> wrote:
On Jan 14, 2014, at 7:55 PM, Eric A Louie <elouie () yahoo com> wrote:

I have a connection to a peering fabric and I'm not distributing the peering fabric routes into my network.

There's a two part problem lurking.

Problem #1 is how you handle your internal routing.  Most of the "big boys" will next-hop-self in iBGP all external 
routes.  However, depending on the size and configuration of your network there may be advantages to not using 
next-hop-self, or just putting it in your IGP.  Basically, you should be doing the same thing you do for a /30 from a 
peer or transit provider in your network.  There is one thing special about an exchange point though: for security 
reasons you probably want to add it to your "never accept" routing filter from peers/customers/transit providers.  
You don't need someone injecting a couple of more specifics to mess with your routing.
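
To make the "never accept" filter concrete, here is a minimal Python sketch of the matching logic only, not router configuration. The prefix 192.0.2.0/24 is a documentation range standing in for the exchange LAN, and the names IX_LAN and accept_route are hypothetical. The idea is to reject the exchange prefix itself and anything more specific inside it.

```python
import ipaddress

# Assumption: a documentation prefix standing in for the exchange LAN.
IX_LAN = ipaddress.ip_network("192.0.2.0/24")


def accept_route(prefix: str, never_accept=(IX_LAN,)) -> bool:
    """Return False when prefix equals, or is a more specific of, a protected network."""
    net = ipaddress.ip_network(prefix)
    for protected in never_accept:
        # subnet_of() needs both networks to be the same IP version.
        if net.version == protected.version and net.subnet_of(protected):
            return False
    return True


if __name__ == "__main__":
    for candidate in ("192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"):
        print(candidate, "accept" if accept_route(candidate) else "reject")
```

On a real router this would be a prefix list applied inbound on every external session; the sketch only shows which announcements such a list should catch.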

Problem #2 is your customers.  If you have customers that may operate default free, and they use one of the 
traceroute tools that not only finds the route, but then continues to probe it (like MTR, or Visual Traceroute) there 
can be an issue.  The initial traceroute probe may return an IP on the exchange of your peer's router, but then when 
they subsequently source ICMP Ping to that IP there will be no route in their network, and it will simply never 
respond.  Some call this a feature, some call this a problem.  There is also an extremely rare problem where the far 
end of the peering exchange steps down MTU, and thus PMTU discovery is invoked, but your customers use Unicast RPF.  
Since the exchange LAN isn't in their table, Unicast RPF may drop the PMTU packet-too-big message, causing a timeout.

If your customers have a default to you, all is well.  However, if they have a default to someone else and take a 
table from you to selectively override it, the same problem can occur for any routes they select through you that 
also traverse the exchange.
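
The Unicast RPF case described above can be modeled in a few lines of Python. This is a deliberately simplified sketch: it treats the check as "permit the packet only if some route in the table covers its source address", and real uRPF modes differ in how they treat the default route and the receiving interface. All prefixes are documentation ranges, and urpf_permits is a hypothetical name.

```python
import ipaddress


def urpf_permits(source: str, table) -> bool:
    """Simplified source check: permit only if some route in the table covers the source."""
    src = ipaddress.ip_address(source)
    return any(src in ipaddress.ip_network(prefix) for prefix in table)


# Assumption: 192.0.2.10 is the peer's router address on the exchange LAN,
# and the source of the ICMP packet-too-big message.
ICMP_SOURCE = "192.0.2.10"

# A default-free customer table that does not carry the exchange LAN.
default_free = ["198.51.100.0/24", "203.0.113.0/24"]

# The same table with a default route added.
with_default = default_free + ["0.0.0.0/0"]

if __name__ == "__main__":
    print("default-free:", urpf_permits(ICMP_SOURCE, default_free))
    print("with default:", urpf_permits(ICMP_SOURCE, with_default))
```

Under this model the default-free table drops the packet-too-big message because nothing covers the exchange LAN address, while the table with a default passes it.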

IMHO the best fix for #2 is that the exchange have an ASN, and announce the exchange LAN from that ASN, typically via 
the route server.  You should then peer with the route server to pick up that network.  That makes the announcement 
consistent, and makes it clear who operates that network, and your customers can then access it.  Many exchanges do 
not do this, and then the next best solution might be to originate it from your ASN and announce it to your customers 
only, with no-export set on the way out.

Various people will no doubt chime in and tell you the last two suggestions are either excellent and wonderful or the 
worst idea ever.  Safe to say I know of networks doing both and the world has not ended.  YMMV, some assembly 
required, batteries not included, actual conditions may affect product performance, do not taunt the happy fun ball, 
and consult a doctor if your network is up for more than four hours.

I've known Leo for .. well, let's just say a long time. And I have great respect for his networking abilities. But I 
fall into the second camp. As someone who owns & operates an IXP, and is on the board of a couple more, and helped 
start even more, I'm going to stick to my guns here.

As for knowing networks that do both, blah, blah, blah. I know lots of networks that allow spam, don't configure BCP38, 
have abusable name or NTP servers, etc. and the world has not come to an end. Doesn't mean you should. Lame excuse, 
Leo, and beneath you to even go there.

NEVER EVER EVER put an IX prefix into BGP, IGP, or even a static route. An IXP LAN should not be reachable from any 
device not directly attached to that LAN. Period.

If for no better reason, how about because it is not your prefix, and chances are the IXP does not want you to use the 
prefix. In fact, I challenge you to find a major IXP route server which is announcing the IXP block.

But because this is a teaching list, let's go through the problems Leo mentions. Anyone who steps down MTU on an IXP is 
far too broken to worry about your customer having RPF and not getting PMTU. Again, I challenge you to find someone 
doing this today; their network would be close to unusable. As for traceroute .... Seriously? You want to increase 
breakage on the Internet because it might cause 3 stars in a traceroute? Puh-LEEEZE. Sorry, neither of those pass the 
sniff test, IMHO.

So Just Don't Do It. Setting next-hop-self is not just for "big guys"; the crappiest, tiniest router that can do 
peering at an IXP has the same ability. Use it. Stop putting me and every one of your peers in danger because you are 
lazy.
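
For anyone who has not seen why next-hop-self removes the need to carry the IX prefix internally, here is a rough Python model, not router configuration. The addresses are made up (documentation and private ranges), and Route, next_hop_self, and resolvable are hypothetical names. A route learned at the exchange has a next hop on the IX LAN; rewriting it to the border router's loopback gives the rest of the network a next hop the IGP already knows.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str


def next_hop_self(route: Route, loopback: str) -> Route:
    """Rewrite the next hop to the border router's own loopback before sending into iBGP."""
    return replace(route, next_hop=loopback)


def resolvable(route: Route, igp_prefixes) -> bool:
    """An internal router can use the route only if the IGP covers its next hop."""
    nh = ipaddress.ip_address(route.next_hop)
    return any(nh in ipaddress.ip_network(p) for p in igp_prefixes)


# Assumptions: 192.0.2.10 is a peer on the exchange LAN, 10.0.0.1 is the
# border router loopback, and the IGP carries only internal loopbacks.
learned = Route(prefix="203.0.113.0/24", next_hop="192.0.2.10")
igp = ["10.0.0.0/24"]

if __name__ == "__main__":
    print("as learned:   ", resolvable(learned, igp))
    print("next-hop-self:", resolvable(next_hop_self(learned, "10.0.0.1"), igp))
```

The route as learned is unusable internally unless the IX LAN leaks into the IGP; with the next hop rewritten it resolves without the IX prefix appearing anywhere.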

-- 
TTFN,
patrick
