nanog mailing list archives

Re: Power cut if temps are too high


From: Warren Kumari <warren () kumari net>
Date: Tue, 28 May 2019 13:45:41 +0000

I used to work for a small, fairly crappy ISP -- the "datacenter" was
a converted brick garage / loading dock. In order to provide cooling,
they had chipped out a bunch of bricks, and mounted in 8 or so AC
units, all in a line.

We monitored everything with WhatsUp Gold[0] - one (hot) night I'm
oncall, and at 3:30AM I get an alert that the environmental sensors on
one of the routers think it's too hot. I'm tired and grumpy, and it's
only slightly too hot, so I ack it and go back to bed. A short while
later I get paged again - another router now thinks it is
uncomfortably warm. Still grumpy, so I ack that too, and back to bed.
Sure enough, 20 minutes later, another page.... Fine, I get dressed,
drive over to the location -- and realize that bricks / mortar are
strong in compression, but weak in tension - the AC window units have
been quietly vibrating for many years, and the entire row of bricks
above the AC units has popped out. All the AC units are lying outside
the building on the grass, still running.... :-) I stared at them for
a bit, unsure what to do -- so I turned them off, bumped up the
monitoring levels, and went back to bed... Next day we blocked up the
hole, installed some temporary chillers, and then finally installed
real cooling....

There isn't much point to this story, but I've got a cold, and wanted
to share... :-P

W
[0]: Wow, I just realized that WUG still exists... huh.

On Tue, May 28, 2019 at 9:13 AM Thomas Bellman <bellman () nsc liu se> wrote:

On 2019-05-27 18:18 +0000, Mel Beckman wrote:

Before the trigger temperature is reached, the NMS would have sent
various escalating alarms to on call staffers, who hopefully would
intervene before this point.

Would they actually have time to react and do something?  In our
datacenters, we reach our cut-off temperature in about 20 minutes
if cooling stops.


This system has triggered one time, successfully shutting down the data
center on a holiday weekend when people missed their notifications, and
undoubtedly saved a lot of hard drives. When we got to the room the
temperature was over 115°, but the power was cut at 95°.

Presumably that was °F, not °C.
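
For reference, a quick conversion of the figures in this thread. This is
only an illustrative sketch; the function names are made up for the
example.

```python
def f_to_c(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


def c_to_f(temp_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


# Figures quoted in the thread: power cut at 95°F, room found above 115°F,
# and the other site's room reaching 100°C in places.
for label, temp_f in [("cut-off", 95.0), ("room on arrival", 115.0)]:
    print(f"{label}: {temp_f:.0f}°F = {f_to_c(temp_f):.1f}°C")
print(f"hot spots elsewhere: 100°C = {c_to_f(100.0):.0f}°F")
```

Read as °F, the cut-off is 35°C and the room peaked at about 46°C.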

I have heard from people who did *not* have automatic cutting of the
power at high temperatures.  Their computer room reached 100°C in
places; some keyboards apparently looked like a certain Salvador Dali
painting afterwards...  (But I think they had very few actual servers
or disk drives breaking.)  The reason it didn't get even hotter was
that as the temperature rose, servers started overheating and shut
themselves down, lowering power dissipation more and more.


Our system for cutting power at high temperatures is part of the PLC
monitoring power and temperature in the computer rooms.  It sends a
signal to the large breakers connecting the power subcentrals (where
all the 16A fuses are) to the power rail feeding the room.  I believe
our PLCs are from Schneider Electric, but anyone who delivers PLCs
for controlling power and cooling in a datacenter should be capable
of programming their PLCs to do the same.  You just need to remember
to put it in the specifications when you contract the building. :-)
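
The escalate-then-cut-off behaviour discussed in this thread can be
sketched roughly as below. This is not how any particular PLC is
programmed (a real one would typically use ladder logic or structured
text, not Python). Only the 95° cut-off figure comes from the thread;
the warning and paging thresholds and all names are assumptions made up
for illustration.

```python
# Hypothetical thresholds in °F. Only CUTOFF_F (95) is taken from the
# thread; WARN_F and PAGE_F are invented for this example.
WARN_F = 80.0
PAGE_F = 88.0
CUTOFF_F = 95.0


def action_for(temp_f: float) -> str:
    """Return the action for a room temperature reading, most severe first."""
    if temp_f >= CUTOFF_F:
        return "cut_power"    # trip the breakers feeding the room
    if temp_f >= PAGE_F:
        return "page_oncall"  # escalate to a human
    if temp_f >= WARN_F:
        return "warn"         # low-priority alert
    return "ok"


for reading in (72.0, 82.5, 90.0, 96.0):
    print(reading, action_for(reading))
```

The point of checking the cut-off first is that it fires regardless of
whether anyone acknowledged the earlier alarms.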


        /Bellman



-- 
I don't think the execution is relevant when it was obviously a bad
idea in the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair
of pants.
   ---maf

