nanog mailing list archives

Re: Curiosity about AS3356 L3/CenturyLink network resiliency (in general)


From: Mike Hammett <nanog () ics-il net>
Date: Thu, 17 May 2018 08:24:53 -0500 (CDT)

I often question why and how people build networks the way they do. There's some industry hard-on with having a few 
ginormous routers instead of many smaller ones. I've learned that when building Internet Exchanges, the number of 
networks that don't have BGP edge routers in major markets where they have a presence is quite a bit larger than one 
would expect. I heard a podcast once (I forget if it was Packet Pushers or Network Collective) postulating that the 
reason why everything runs back to a few big ass routers is that someone decided to spend a crap-load of money on big 
ass routers for bragging rights, so now they have to run everything they can through them to A) "prove" their purchase 
wasn't foolish and B) because they now can't afford to buy anything else. 

There's no reason why Tampa doesn't have a direct L3 adjacency to Miami, Atlanta, Houston, and Charlotte over diverse 
infrastructure to all four. Obviously there's room to add to or drop from that list, but it gets the point across. 
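
To put a rough number on that, here is a minimal sketch that brute-forces how many link cuts it takes to isolate Tampa under two layouts: the two-route design described in the original message, and the four-adjacency design suggested above. The link lists are hypothetical, not 3356's actual topology. "Core" is a made-up stand-in for the rest of the backbone, and every link is assumed to ride physically diverse fiber.

```python
from itertools import combinations


def reachable(links, start):
    """Return the set of nodes reachable from start over undirected links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def min_cuts_to_isolate(links, site, core):
    """Smallest number of link cuts after which site cannot reach core.

    Brute force over every combination of links; fine for a handful of
    links, not for a real backbone.
    """
    for k in range(1, len(links) + 1):
        for cut in combinations(links, k):
            remaining = [link for link in links if link not in cut]
            if core not in reachable(remaining, site):
                return k
    return None


# Hypothetical layout 1: Tampa hangs off two long routes only.
two_route = [
    ("Tampa", "Miami"), ("Tampa", "Houston"),
    ("Miami", "Core"), ("Houston", "Core"),
]

# Hypothetical layout 2: Tampa has four direct adjacencies.
four_adjacency = [
    ("Tampa", "Miami"), ("Tampa", "Atlanta"),
    ("Tampa", "Houston"), ("Tampa", "Charlotte"),
    ("Miami", "Core"), ("Atlanta", "Core"),
    ("Houston", "Core"), ("Charlotte", "Core"),
]

print(min_cuts_to_isolate(two_route, "Tampa", "Core"))
print(min_cuts_to_isolate(four_adjacency, "Tampa", "Core"))
```

The count only means something if the links really are on separate physical paths. Two adjacencies sharing one conduit should be modeled as a single link.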



----- 
Mike Hammett 
Intelligent Computing Solutions 
http://www.ics-il.com 

Midwest-IX 
http://www.midwest-ix.com 

----- Original Message -----

From: "David Hubbard" <dhubbard () dino hostasaurus com> 
To: nanog () nanog org 
Sent: Wednesday, May 16, 2018 11:59:42 AM 
Subject: Curiosity about AS3356 L3/CenturyLink network resiliency (in general) 

I’m curious if anyone who’s used 3356 for transit has found shortcomings in how their peering and redundancy are 
configured, or what a normal expectation would be. The Tampa Bay market has been completely down for 3356 IP services 
twice so far this year, each for what I’d consider an unacceptable period of time (many hours). I’m learning that the 
entire market is served by just two fiber routes, through cities hundreds of miles away in either direction. So, 
basically, two fiber cuts, potentially 1000+ miles apart, take the entire region down. The most recent occurrence was a 
week or so ago when a Miami-area cut and an Orange, Texas cut (1287 driving miles apart) took IP services down for 
hours. It did not take point to point circuits to out of market locations down, so that suggests they even have the 
ability to be more redundant and simply choose not to. 

I feel like it’s not unreasonable to expect more redundancy, or a much smaller attack surface given a disgruntled 
lineman who knows the routes could take an entire region down with a planned cut four states apart. Maybe other regions 
are better designed? Or are my expectations unreasonable? I carry three peers in that market, so it hasn’t been 
outage-causing. I use 3356 in other markets too, and have plans for more, which makes me wonder if I just haven't 
had the pleasure of similar outages elsewhere yet and should factor that expectation into the design. It creates a 
problem for me in one location where I can only get them and Cogent, since Cogent can't be relied on for IPv6 service, 
which I need. 

Thanks 



