nanog mailing list archives

Re: Do you care about "gray" failures? Can we (network academics) help? A 10-min survey


From: Mark Tinka <mark@tinka.africa>
Date: Thu, 8 Jul 2021 14:56:27 +0200



On 7/8/21 14:29, Saku Ytti wrote:

> Network experiences gray failures all the time, and I almost never
> care, unless a customer does. If there is a network which does not
> experience these, then it's likely due to lack of visibility rather
> than issues not existing.
>
> Fixing these can take months of working with vendors and attempts to
> remedy will usually cause planned or unplanned outages. So it rarely
> makes sense to try to fix as they usually impact a trivial amount of
> traffic.
>
> Networks also routinely mangle packets in-memory which are not visible
> to FCS check.

I was going to say the exact same thing.

+1.

It's all par for the course, which is why we get up every day :-).

I'm currently dealing with an issue where the network forwards a customer's traffic to/from one /24, but not the rest of their IPv4 space, including the larger allocation the /24 was carved from. It was a gray issue while the customer was only partially activated, and it caused us to care once they tried to fully swing over.

Mark.

