nanog mailing list archives

Re: "Hypothetical" Datacenter Overheating


From: William Herrin <bill () herrin us>
Date: Mon, 15 Jan 2024 06:55:22 -0800

On Mon, Jan 15, 2024 at 6:08 AM Mike Hammett <nanog () ics-il net> wrote:
> Let's say that hypothetically, a datacenter you're in had a cooling failure
> and escalated to an average of 120 degrees before mitigations started
> having an effect. What should be expected in the aftermath?

Hi Mike,

A decade or so ago I maintained a computer room with a single air
conditioner because the boss wouldn't go for n+1. It failed in exactly
this manner several times. After the overheat was detected by the
monitoring system, it would be brought under control with a
combination of a spot cooler and powering down to a minimal
configuration. But of course it takes time to get people there and set
up the mitigations, during which the heat continues to rise.

The main thing I noticed was a modest uptick in spinning drive
failures for the couple months that followed. If there was any other
consequence it was at a rate where I'd have had to be carefully
measuring before and after to detect it.

Regards,
Bill Herrin


-- 
William Herrin
bill () herrin us
https://bill.herrin.us/

