nanog mailing list archives

RE: few big monolithic PEs vs many small PEs


From: <adamv0025 () netconsultings com>
Date: Thu, 27 Jun 2019 12:47:11 +0100

From: James Bensley <jwbensley () gmail com>
Sent: Thursday, June 27, 2019 9:56 AM

One experience I have had is that when there is an outage on a large PE,
even when it still has spare capacity, the business impact can be too
much to handle (the support desk is overwhelmed, customers become irate
if you can't quickly tell them what all the impacted services are and when
service will be restored, the NMS has so many alarms it’s not clear what the
problem is or where it's coming from, etc.).

I see what you mean. My hope is to address these challenges with a "single source of truth" provisioning system 
that, among other things, holds the HW-to-customer/service mapping, so the Ops team will be able to say that if a 
particular LC X fails then customers/services X, Y, Z will be affected. 
But yes, I agree that with smaller PEs the fallout of any failure is reduced proportionally.
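
To make the HW-to-customer/service mapping idea concrete, here is a minimal sketch of the kind of lookup I have in mind. It is only an illustration: the PE names, line card names, customers, services and the record layout are all made up, and a real provisioning system would pull these from its own inventory rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical inventory records: (PE, line card, customer, service).
# In practice these would come from the provisioning system.
ATTACHMENTS = [
    ("pe1", "LC0", "cust-a", "l3vpn-100"),
    ("pe1", "LC0", "cust-b", "l2vpn-200"),
    ("pe1", "LC1", "cust-c", "l3vpn-300"),
    ("pe2", "LC0", "cust-a", "l3vpn-100"),
]


def build_index(attachments):
    """Index (customer, service) pairs by the (PE, line card) they attach to."""
    index = defaultdict(set)
    for pe, lc, customer, service in attachments:
        index[(pe, lc)].add((customer, service))
    return index


def impacted(index, pe, lc=None):
    """Return the sorted (customer, service) pairs hit by a failure.

    With lc given, only that line card is considered failed;
    with lc omitted, the whole PE is considered failed.
    """
    if lc is not None:
        return sorted(index.get((pe, lc), set()))
    hits = set()
    for (indexed_pe, _), services in index.items():
        if indexed_pe == pe:
            hits |= services
    return sorted(hits)


INDEX = build_index(ATTACHMENTS)
```

Calling impacted(INDEX, "pe1", "LC0") gives the support desk the list for a single line card failure, and impacted(INDEX, "pe1") the list for the whole box, which also shows how the list grows with the size of the PE.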
 

This doesn’t mean there isn’t a place for large routers. For example, in a
typical network, by the time we get to the P nodes layer in the core we tend
to have high levels of redundancy, i.e. any PE is dual-homed to two or more P
nodes and will have 100% redundant capacity.

Exactly. While the service edge topology might be dynamic as a result of horizontal scaling, the core on the other hand 
should, I'd say, be fairly static and scaled vertically. That is, I wouldn't want to scale core routers horizontally and 
as a result have the core topology change with every P scale-out iteration at any POP; that would be bad news for 
capacity planning and traffic engineering... 
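
As a rough illustration of the "100% redundant capacity" point for a dual-homed PE, the check below asks whether the remaining uplinks can still carry the peak load after any single P node is lost. The function name, the per-P uplink capacities and the peak load figure are assumptions for the example, not numbers from any real network.

```python
def survives_single_p_failure(uplink_gbps, peak_load_gbps):
    """Check that losing any one P node still leaves enough uplink capacity.

    uplink_gbps: dict of P node name -> uplink capacity from this PE in Gbps.
    peak_load_gbps: peak traffic this PE sends towards the core in Gbps.
    """
    if len(uplink_gbps) < 2:
        # A single-homed PE has no redundancy at all.
        return False
    total = sum(uplink_gbps.values())
    # Remove each P node in turn and compare what is left with the peak load.
    return all(total - cap >= peak_load_gbps for cap in uplink_gbps.values())
```

With a static, vertically scaled core the set of P nodes in uplink_gbps rarely changes, so a check like this stays valid between capacity planning rounds.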


I’ve tried to write some of my experiences here
(https://null.53bits.co.uk/index.php?page=few-larger-routers-vs.-many-smaller-routers).
The tl;dr version though is that there’s rarely a technical restriction to having
fewer large routers and it’s an operational/business impact problem.

I'll give it a read, cheers.

adam

