nanog mailing list archives

Re: CenturyLink RCA?


From: Eric Loos <eric () ipergy net>
Date: Mon, 31 Dec 2018 16:23:31 +0100

This seems entirely plausible: DWDM amplifiers and lasers are a complex analog system, and they need OOB to 
align. 

--
Eric

On 31 Dec 2018, at 16:06, Saku Ytti <saku () ytti fi> wrote:

Hey Steve,

I will continue to speculate, as that's all we have.

1.  Are you telling me that several line cards failed in multiple cities in the same way at the same time?  Don't 
think so unless the same software fault was propagated to all of them.  If the problem was that they needed to be 
reset, couldn't that be accomplished by simply reseating them?

L2 DCN/OOB, whole network shares single broadcast domain

2.  Do we believe that an OOB management card was able to generate so much traffic as to bring down the optical 
switching?  Very doubtful which means that the systems were actually broken due to trying to PROCESS the "invalid 
frames".  Seems like very poor control plane management if the system is attempting to process invalid data and 
bringing down the forwarding plane.

L2 loop. You will kill your JNPR/CSCO with enough trash on MGMT ETH.
However, it can be argued that an optical network should fail up in the
absence of control-plane, while an IP network has to fail down.

3.  In the cited document it was stated that the offending packet did not have source or destination information.  
If so, how did it get propagated throughout the network?

BPDU

My guess at the time and my current opinion (which has no real factual basis, just years of experience) is that a 
bad software package was propagated through their network.

Lots of possible reasons. I choose to believe that what they've communicated
is what the writer of the communication thought happened, but as
they likely are not an SME, it's broken radio communication. A BCAST storm
on an L2 DCN would plausibly fit the very ambiguous reason offered, and
building the DCN that way is something people actually do.

-- 
 ++ytti
