nanog mailing list archives

Re: "Hypothetical" Datacenter Overheating


From: Bryan Holloway <bryan () shout net>
Date: Mon, 15 Jan 2024 16:04:43 +0100

I think we're beyond "hypothetical" at this point, Mike ... ;)


On 1/15/24 15:49, Mike Hammett wrote:
Coincidence indeed....   ;-)



-----
Mike Hammett
Intelligent Computing Solutions <http://www.ics-il.com/>
Midwest Internet Exchange <http://www.midwest-ix.com/>
The Brothers WISP <http://www.thebrotherswisp.com/>
------------------------------------------------------------------------
*From: *"Clayton Zekelman" <clayton () MNSi Net>
*To: *"Mike Hammett" <nanog () ics-il net>, "NANOG" <nanog () nanog org>
*Sent: *Monday, January 15, 2024 8:23:37 AM
*Subject: *Re: "Hypothetical" Datacenter Overheating




At 09:08 AM 2024-01-15, Mike Hammett wrote:
 >Let's say that hypothetically, a datacenter you're in had a cooling
 >failure and escalated to an average of 120 degrees before
 >mitigations started having an effect. What are normal QA procedures
 >on your behalf? What is the facility likely to be doing?
 >What should be expected in the aftermath?

One would hope they would have had disaster recovery plans to bring
in outside cold air, and have executed on it quickly, rather than
hoping the chillers got repaired.

All our owned facilities have large outside air intakes, automatic
dampers and air mixing chambers in case of mechanical cooling
failure, because cooling systems are often not designed to run well
in extreme cold.  All of these can be manually run in case of controls
failure, but people tell me I'm a little obsessive over backup plans
for backup plans.

You will start to see premature failure of equipment over the coming
weeks/months/years.

Coincidentally, we have some gear in a data centre in the Chicago
area that is experiencing that sort of issue right now... :-(






