nanog mailing list archives

Re: BGP Failover Question


From: Owen DeLong <owen () delong com>
Date: Tue, 22 Feb 2011 11:20:12 -0800


On Feb 22, 2011, at 10:52 AM, Hammer wrote:

I agree. But swapping providers is not the default answer in some environments. I work in an enterprise with multiple 
GE circuits from multiple providers to the Internet. The lead time on calling up a different carrier and saying "I 
need a gigabit connection to the Internet" would probably be 90-120 days. And then you get to go thru the 
contracts/negotiations and MSAs. You don't just flip. In smaller operations I understand. But I was simply saying 
that it's not always that easy. If I went to my boss and said one of our carriers sucks and we should dump them he 
would just laugh and throw me out.
 
That depends on where you are. If you have a router in one or more of the many "carrier hotels" around the world, you 
can usually order a new Gig-E cross-connect with service in less than a week. If you need to have a circuit engineered, 
then, 30-90 days is probably about right. If you need to have facilities installed to provide said circuit, it can be 
as much as 180 days.

However, I don't think the point was "disconnect them tomorrow". I think the point was "If the impact is that severe, 
the sooner you start the new provider process, the sooner you get relief."

1. What are the SLAs with the carrier in question? Do you have them clearly defined? Are they out of SLA? If so, what 
compensation are you entitled to based on violation of said SLA?

99.99% of all SLAs are a pittance of money refunded IF you jump through extreme hoops to collect. They are rarely 
sufficient to resolve or even compensate for outages.

 
2. What trending are you doing to document the failures in SLA of the carrier in question? Do we have a documented 
pattern of poor performance by using that trending?
 
3. What are our contractual or legal options based on items 1 and 2?
 
4. Don't forget about the Layer8 (political) factor. If your telco manager is buddies with the carrier then you have 
to double your documentation against them. Some companies spend tens of millions a month on circuits. You better be 
ready to justify yourself. 

Yeah, this is usually the biggest problem.

Owen
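
The trending asked about in item 2 can be as simple as turning logged outage windows into a measured availability figure to hold against the contracted SLA. A minimal Python sketch, assuming outages are recorded as non-overlapping (start, end) timestamp pairs; the function name, the log format, and the sample dates are hypothetical and not from any carrier's tooling:

```python
from datetime import datetime, timedelta


def availability_percent(outages, period_start, period_end):
    """Return measured availability (0-100) over a reporting period.

    outages: iterable of (start, end) datetime pairs, assumed non-overlapping.
    Outages are clipped to the reporting period before being counted.
    """
    total = (period_end - period_start).total_seconds()
    if total <= 0:
        raise ValueError("period_end must be after period_start")

    down = 0.0
    for start, end in outages:
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()

    return 100.0 * (total - down) / total


# Hypothetical example: one 1-hour outage in a 100-hour window.
period_start = datetime(2011, 2, 1)
period_end = period_start + timedelta(hours=100)
outage_log = [(datetime(2011, 2, 2, 3, 0), datetime(2011, 2, 2, 4, 0))]
measured = availability_percent(outage_log, period_start, period_end)
```

Keeping one such figure per carrier per month is one way to build the documented pattern that items 3 and 4 depend on.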

 
 
 -Hammer-
 
"I was a normal American nerd."
-Jack Herer
 
 



On Tue, Feb 22, 2011 at 12:38 PM, Owen DeLong <owen () delong com> wrote:
Assuming that he has provider independent space (why run full BGP feeds if you
are not multihomed?), then, actually it's about on par and less disruptive in
general. Add new provider, wait a day or two, then disconnect old provider.

If he's using provider assigned space, then, the big hurdle is switching to provider
independent (requires a renumber), but, that's a good idea for a variety of reasons.

I would hardly call the type and frequency of outages described a "whim" when
using that as a reason to change providers. Sounds like he is suffering
severe impact to his business.

Owen

On Feb 22, 2011, at 10:15 AM, Hammer wrote:

I'm not arguing that at all. But it wasn't relevant to the question at
hand. And depending on the scale of your business, dumping providers is not
something done on a whim. It's not like you're fed up with DSL and want to
convert to Cable.


-Hammer-

"I was a normal American nerd."
-Jack Herer





On Tue, Feb 22, 2011 at 12:11 PM, Bret Clark <bclark () spectraaccess com> wrote:

On 02/22/2011 12:23 PM, Hammer wrote:

As Max stated, you can set triggers based on thresholds that are monitored
via multiple methods in Cisco IOS. That way you could force the route down
dynamically. There's always a risk when letting the machines do the thinking
but this would help in situations like this. Can't speak for other vendors
but I'm sure the features are similar.
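
To make the threshold-trigger idea concrete, here is a minimal Python sketch of the decision logic only. It is not Cisco IOS configuration, and the class name, field names, and default thresholds are all hypothetical. It assumes some external probe reports success or failure, pulls the route after a run of consecutive failures, and restores it only after a longer run of successes so that a flapping link does not toggle the route on every probe:

```python
from dataclasses import dataclass


@dataclass
class RouteTracker:
    """Hypothetical tracker that decides whether a route should stay up."""

    down_after: int = 3   # consecutive failed probes before forcing the route down
    up_after: int = 5     # consecutive good probes before restoring the route
    route_up: bool = True
    _fail_streak: int = 0
    _ok_streak: int = 0

    def record(self, probe_ok: bool) -> bool:
        """Feed in one probe result and return the resulting route state."""
        if probe_ok:
            self._ok_streak += 1
            self._fail_streak = 0
            if not self.route_up and self._ok_streak >= self.up_after:
                self.route_up = True
        else:
            self._fail_streak += 1
            self._ok_streak = 0
            if self.route_up and self._fail_streak >= self.down_after:
                self.route_up = False
        return self.route_up


# Hypothetical usage: two failures pull the route, two successes restore it.
tracker = RouteTracker(down_after=2, up_after=2)
states = [tracker.record(ok) for ok in (False, False, True, True)]
```

Requiring more evidence to restore the route than to pull it is one way to limit the risk of "letting the machines do the thinking" mentioned above.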

Well, as someone else stated, if an upstream provider can't provide BGP
reliably then it's time to give them the boot. Once in a year, okay, but
beyond that, then it's time to read the riot act to that provider.
Bret





