nanog mailing list archives

Re: Router Suggestions


From: Warren Kumari <warren () kumari net>
Date: Wed, 17 Jun 2020 18:50:05 -0400

On Tue, Jun 16, 2020 at 5:28 PM Owen DeLong <owen () delong com> wrote:



On Jun 16, 2020, at 1:51 PM, Mark Tinka <mark.tinka () seacom mu> wrote:



On 16/Jun/20 22:43, Owen DeLong wrote:

Covering them all under vendor contract doesn’t necessarily guarantee that
the vendor does, either. In general, if you can cover 10% of your hardware
failing in the same 3-day period, you’re probably not going to do much better
with vendor support.

In my experience, our vendors have been able to abide by their
obligations when we've had successive failures in a short period of
time, as long as our subscription is up-to-date.

I am yet to be disappointed.


Count your blessings… I once faced a situation where a vendor had shipped
a batch of defective power supplies (10s of thousands of them). It wasn’t
just my network facing successive failures in this case, but widespread
across their entire customer base… By day 2, all of their depots were
depleted and day 3 involved mapping out “how non-redundant can we make the
power in our routers to cover the outages that we’re seeing without causing
more outages than we solve?”

It was a genuine nightmare.


Huh, was this in the early to mid 1990s?

I had an incident in the NYC area where one of the large (at the time)
datacenter/IXPs had a power outage, and their transfer switch failed to
switch over. Customers were annoyed, so they promised another test, which
also failed, dropping power to the facility again... now customers were
hopping mad...

The next test was *just* of the generator, but with all of the work they
had done they had (somehow) gotten the transfer switch *really* confused /
hardwired into an odd state. This resulted in the facility being powered by
both the street power and the generator (at least for a few seconds until
the generator went “Nope!”)

These were of course not synchronized, and so 120V equipment saw 0V, then
240V, then some weird harmonic, then other surprising values... Most
supplies kind of dealt with this OK, but one of the really common models of
router, from the largest vendor, up and died. This resulted in a few
hundred dead routers and way exceeded the vendor's spares strategies.

A number of customers (myself included) had 4 hour replacement contracts,
which the vendor really could not meet - so we agreed to take a new, much
larger/better model as a replacement.

W



I’ve had other situations involving early failures of just-released line
cards and such as well.

As I said, YMMV, but I’m betting your vendor doesn’t stock a second copy
of every piece of covered equipment in the local depot. They’re playing the
statistical probabilities just like anyone else stocking their own spares
pool. The biggest difference is that they’re spreading the risk across a
(potentially) much wider sample size which may better normalize the numbers.

Owen

--
I don't think the execution is relevant when it was obviously a bad idea in
the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair of
pants.
   ---maf
