nanog mailing list archives

Re: Router Suggestions


From: Shawn L via NANOG <nanog () nanog org>
Date: Wed, 17 Jun 2020 19:04:56 -0400 (EDT)


We _always_ have at least one spare, or something that could be (relatively) easily pressed into service as one. 
 
Even in the Midwest, we've had times where 'guaranteed next day replacement' is more like second or third day due to
weather conditions, the carrier routing it oddly, or plain 'the plane didn't come today' issues.  We generally laugh
when they try to offer us 4-hour contracts -- we know there's zero chance they can meet them, and they never want to
refund you when you need it and they can't.
 


-----Original Message-----
From: "Warren Kumari" <warren () kumari net>
Sent: Wednesday, June 17, 2020 6:50pm
To: "Owen DeLong" <owen () delong com>
Cc: nanog () nanog org
Subject: Re: Router Suggestions

On Tue, Jun 16, 2020 at 5:28 PM Owen DeLong <owen () delong com> wrote:

On Jun 16, 2020, at 1:51 PM, Mark Tinka <mark.tinka () seacom mu> wrote:



On 16/Jun/20 22:43, Owen DeLong wrote:

Covering them all under vendor contract doesn’t necessarily guarantee that
the vendor does, either. In general, if you can cover 10% of your hardware
failing in the same 3-day period, you’re probably not going to do much better
with vendor support.

In my experience, our vendors have been able to abide by their
obligations when we've had successive failures in a short period of
time, as long as our subscription is up-to-date.

I have yet to be disappointed.


Count your blessings… I once faced a situation where a vendor had shipped a batch of defective power supplies (tens of thousands of them). It wasn’t just my network facing successive failures in this case; they were widespread across their entire customer base… By day 2, all of their depots were depleted, and day 3 involved mapping out “how non-redundant can we make the power in our routers to cover the outages that we’re seeing without causing more outages than we solve?”

It was a genuine nightmare.

Huh, was this in the early-to-mid 1990s?

I had an incident in the NYC area where one of the large (at the time) datacenter/IXPs had a power outage, and their transfer switch failed to switch over. Customers were annoyed, so they promised another test, which also failed, dropping power to the facility again... now customers were hopping mad...

The next test was *just* of the generator, but with all of the work they had done they had (somehow) gotten the transfer switch *really* confused / hardwired into an odd state. This resulted in the facility being powered by both street power and the generator (at least for a few seconds, until the generator went “Nope!”).

These were of course not synchronized, so 120V equipment saw 0V, then 240V, then some weird harmonic, then other surprising values... Most supplies kind of dealt with this OK, but one of the really common router models from the largest vendor upped and died. This resulted in a few hundred dead routers and far exceeded the vendor's spares strategy.

A number of customers (myself included) had 4-hour replacement contracts, which the vendor really could not meet - so we agreed to take a new, much larger/better model as a replacement.
W

I’ve had other situations involving early failures of just-released line cards and such as well.

As I said, YMMV, but I’m betting your vendor doesn’t stock a second copy of every piece of covered equipment in the local depot. They’re playing the statistical probabilities just like anyone else stocking their own spares pool. The biggest difference is that they’re spreading the risk across a (potentially) much wider sample size, which may better normalize the numbers.

 Owen
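
Owen’s “wider sample size” point is just risk pooling. A minimal sketch of it (in Python), assuming each unit fails independently with the same probability over the replacement window -- the failure rate, fleet sizes, and spare counts below are illustrative assumptions, not figures from this thread:

# Minimal sketch (assumed binomial model, not from the thread): each unit
# fails independently with probability p_fail over the replacement window,
# and we ask how often failures exceed the spares on hand.

def p_stockout(units, spares, p_fail):
    """P(more than `spares` of `units` fail), computed with a running
    term so no huge binomial coefficients are needed."""
    term = (1 - p_fail) ** units              # P(exactly 0 failures)
    cdf = term
    for k in range(spares):
        # P(k+1 failures) = P(k) * (units - k) / (k + 1) * p / (1 - p)
        term *= (units - k) / (k + 1) * p_fail / (1 - p_fail)
        cdf += term
    return 1 - cdf

P_FAIL = 0.01         # assumed per-unit failure probability in the window
SELF_UNITS = 20       # one customer's routers, self-stocking 1 spare (5%)
POOL_UNITS = 20_000   # a depot pooling roughly 1,000 such customers
POOL_SPARES = 400     # only 2% of the pooled base

print(f"self-stocked, 1 spare for 20:  {p_stockout(SELF_UNITS, 1, P_FAIL):.6f}")
print(f"pooled depot, 400 for 20,000:  {p_stockout(POOL_UNITS, POOL_SPARES, P_FAIL):.6f}")
# The depot holds proportionally fewer spares yet effectively never runs out --
# until failures stop being independent (a defective PSU batch, a facility-wide
# power event), which is exactly when both models fall over, as described above.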

-- 

I don't think the execution is relevant when it was obviously a bad idea in the first place.
This is like putting rabid weasels in your pants, and later expressing regret at having chosen those particular rabid 
weasels and that pair of pants.
   ---maf
