nanog mailing list archives

Re: "Hypothetical" Datacenter Overheating


From: Clayton Zekelman <clayton () MNSi Net>
Date: Mon, 15 Jan 2024 09:23:37 -0500




At 09:08 AM 2024-01-15, Mike Hammett wrote:
Let's say that hypothetically, a datacenter you're in had a cooling failure and escalated to an average of 120 degrees before mitigations started having an effect. What are normal QA procedures on your behalf? What is the facility likely to be doing? What should be expected in the aftermath?

One would hope they would have had disaster recovery plans to bring in outside cold air, and have executed on them quickly, rather than hoping the chillers got repaired.

All our owned facilities have large outside air intakes, automatic dampers and air mixing chambers in case of mechanical cooling failure, because cooling systems are often not designed to run well in extreme cold. All of these can be run manually in case of a controls failure, but people tell me I'm a little obsessive over backup plans for backup plans.

You will start to see premature failure of equipment over the coming weeks/months/years.

Coincidentally, we have some gear in a data centre in the Chicago area that is experiencing that sort of issue right now... :-(




