nanog mailing list archives

Re: DHS letters for fuel and facility access


From: Ben Cannon <ben () 6by7 net>
Date: Wed, 18 Mar 2020 10:11:14 -0700

It flabbergasts me to no end that nobody simulated the actual incident they are guarding against.

But I guess that’s why we run telecom companies.

Diesel piston generators need to be run for 30min every 30 (absent engineer calcs permitting lower, but, why).

You should also consider a pull and re-strike on that breaker 3 times.

Most transmission level circuit breakers will auto-retry 3x then quit if they trip each time. 

Your ATS should smooth this, but that function needs to get tested too.
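
The retry-then-quit behaviour described above can be sketched as a small model, useful for writing down what sequence of power flaps a generator and ATS test should replay. This is an illustrative simulation only: the function name, the event labels and the default of 3 reclose attempts are assumptions made for the example, not settings taken from any real relay or breaker.

```python
def reclose_sequence(fault_present_on_attempt, max_reclosures=3):
    """Model a breaker that trips, then recloses up to max_reclosures times.

    fault_present_on_attempt: list of booleans, one per reclose attempt;
    True means the fault is still there when the breaker closes.
    Attempts beyond the end of the list are assumed to be still faulted.
    Returns the ordered list of events (hypothetical labels).
    """
    events = ["trip"]  # the initial trip that starts the sequence
    for attempt in range(max_reclosures):
        events.append("reclose")
        if attempt < len(fault_present_on_attempt):
            still_faulted = fault_present_on_attempt[attempt]
        else:
            still_faulted = True
        if not still_faulted:
            events.append("hold")  # fault cleared, breaker stays closed
            return events
        events.append("trip")
    events.append("lockout")  # gave up after max_reclosures attempts
    return events


if __name__ == "__main__":
    # A persistent fault: the load sees power flap three times, then stay off.
    print(reclose_sequence([True, True, True]))
    # A transient fault that clears on the second reclose.
    print(reclose_sequence([True, False]))
```

Each "reclose" followed by "trip" is a brief return of utility power that the ATS has to ride through without dropping the generator.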

Things you learn in heavy civil construction that you don’t necessarily learn in telecom even.

-Ben

On Mar 18, 2020, at 9:58 AM, Paul Nash <paul () nashnetworks ca> wrote:

You just have to make sure that you test the right thing.

In a former life I was an electrical engineer. My first job was with a consulting engineering firm; our biggest 
customer was the biggest supermarket chain in South Africa.  One of my tasks was to travel to one of their stores 
each Saturday after closing (those were the days when they closed at noon on a Saturday until Monday morning) and 
test their standby generators.

The manager’s idea was usually to press the start button, check that the big diesel started, then shut down and go 
home.  My idea was to pull the main incoming breaker.  9 times out of 10 on first visit, the diesel would start, and 
then die as soon as the load kicked in because of carbon buildup in the cylinders.

After discussions with the supermarket management, they decided to (a) have all the diesels serviced ASAP, and (b) 
adopt my protocol of start diesel, wait for it to come under load, run for at least 30 minutes to get up to heat and 
clear the carbon deposits.

I use a similar technique for failover tests on servers, routers, firewalls — pull the power cord and see what 
happens, pull the incoming network and see what happens.

This was stymied by a recent network outage where the ISP network was up and running, connected back to their local 
PoP and thence to their backbone, but connectivity from that network to the critical servers was down.  So now we 
test end-to-end that the server is reachable, and let the network fail over if not.
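
A minimal sketch of that kind of end-to-end check, in Python. It probes the server itself over TCP rather than the local link, and only recommends failover after several consecutive failures. The function names, the 3-second timeout and the threshold of 3 failed probes are assumptions for illustration, not details from the setup described above.

```python
import socket


def server_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def should_fail_over(probe_results, threshold=3):
    """Return True once the most recent `threshold` probes have all failed.

    probe_results: list of booleans, oldest first, True = server reachable.
    Requiring consecutive failures avoids failing over on a single lost probe.
    """
    if len(probe_results) < threshold:
        return False
    return not any(probe_results[-threshold:])


if __name__ == "__main__":
    # Example history: one success followed by three failed probes.
    history = [True, False, False, False]
    print(should_fail_over(history))
```

In use, a scheduler would append the result of server_reachable() to the history on each run and trigger the failover action when should_fail_over() returns True.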

   paul

On Mar 18, 2020, at 11:56 AM, Karl Auer <kauer () biplane com au> wrote:

An untested emergency system has to be regarded as a non-existent
emergency system.

No matter how painful it is to test, no matter how expensive it is to
test, the pain and the expense are nothing compared to the pain and
expense of having an actual emergency and discovering that the
emergency system doesn't work...

Multiplied by infinity if it costs lives.

Regards, K.

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karl Auer (kauer () biplane com au)
http://www.biplane.com.au/kauer
http://twitter.com/kauer389

GPG fingerprint: 2561 E9EC D868 E73C 8AF1 49CF EE50 4B1D CCA1 5170
Old fingerprint: 8D08 9CAA 649A AFEF E862 062A 2E97 42D4 A2A0 616D




