nanog mailing list archives

Re: BGP and convergence time


From: Seth Mattinen <sethm () rollernet us>
Date: Tue, 11 May 2010 13:59:09 -0700

On 5/11/2010 11:35, Jay Nakamura wrote:
So, we have two upstreams, both coming in on Ethernet.  One of our
switches crashed and rebooted itself.  Although we have other egress
paths out of the network, the router's Ethernet interface didn't go
down, so our router's BGP didn't realize the neighbor was down until
the default BGP hold timer expired.  Our upstream connectivity was out
for a couple of minutes.

I am looking for ways to detect a neighbor being down faster so traffic
can be re-routed sooner.  I can do BFD internally, but the issue is how
the upstream is going to detect the outage and stop routing our
traffic to the downed link.  I have asked both of my upstreams: one
said they don't do anything like that, and I am still waiting on an
answer from the second.

My question is, do other carriers do BFD or any other means to detect
the neighbor being down faster than normal BGP will allow?  (Both
upstreams are major telcos [AT&T and Qwest], so I think they are less
flexible than some others.)

Or, has anyone succeeded in getting something done with those two carriers?



In my experience this is a pretty common problem with carrier Ethernet
links where the interface is always "up" unless the directly connected
switch/mux fails. Even then, it may still keep the port up through
reboots. I like how Ethernet is cheap, but I hate how it lacks simple
things like "link is down if any segment of the L1 or L2 between
endpoints faults" that you get without silly tricks on a DSx or OC-x.
(Then again, I suppose you're paying for that capability if it's
important enough.)
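
As a rough illustration of why the outage above lasted a couple of
minutes, here is a minimal sketch of the worst-case detection
arithmetic. The 180-second hold time is an assumption (it is a common
vendor default, not something stated in the original post), and the
BFD interval and multiplier are purely illustrative values, not what
any particular carrier offers. The function names are hypothetical.

```python
def bgp_worst_case_detection(hold_time_s: float) -> float:
    """Worst case for plain BGP when the link stays 'up'.

    If the neighbor dies right after its last KEEPALIVE arrived,
    the session is only torn down once the full hold time elapses.
    """
    return hold_time_s


def bfd_detection_time(rx_interval_ms: float, detect_mult: int) -> float:
    """BFD detection time in seconds.

    Detection time is the negotiated receive interval multiplied by
    the detect multiplier (number of missed packets tolerated).
    """
    return rx_interval_ms * detect_mult / 1000.0


if __name__ == "__main__":
    # Assumed values, for illustration only.
    hold_time_s = 180      # assumed BGP hold time in seconds
    bfd_interval_ms = 300  # illustrative BFD interval in milliseconds
    bfd_multiplier = 3     # illustrative BFD detect multiplier

    print("BGP worst case (s):", bgp_worst_case_detection(hold_time_s))
    print("BFD detection (s):", bfd_detection_time(bfd_interval_ms, bfd_multiplier))
```

Either mechanism only helps on the inbound side if the upstream runs
it too, which is exactly the sticking point raised in the question.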

~Seth

