nanog mailing list archives

Re: UPS failure modes (was: fire at NAC)


From: Lars Erik Gullerud <lerik () nolink net>
Date: 29 May 2003 20:36:18 +0200


On Thu, 2003-05-29 at 18:31, Robert Boyle wrote:

I had a little 2000VA rackmount Liebert UPS catch fire in 1997 and another 
new and improved Liebert model almost catch fire about a year later. Both 
were operating well within specified input, load, and temperature 
parameters.  I haven't really trusted them since. I bought dual MGE UPSes 
for our datacenter in 2002. I figured if E****s can flip them on and off 
randomly and massively overload them all in an environment which is 95 
degrees F, then they should hold up nicely for us when lightly loaded at 65 
degrees F. :)

I am personally of the opposite opinion: we have never had any issues
with our Liebert UPSes, but we have had a few MGEs blow up. I
can't comment on their small UPS models though; I think the smallest MGE
or Liebert we have is 10 kVA.

The worst of the cases was an installation where we had dual 40 kVA MGE
UPSes installed, both of which failed critically within 48 hours of each
other. Despite all the fail-safe circuitry they were bought with, they
failed HARD (and yes, they had sparks and smoke coming out of them), and
not even the bypass features worked. Since the second failed before the
first one had been completely restored (it was being investigated to
find the root cause of this critical failure), things went very black,
and electricians had a pretty hectic time as they had to manually bypass
the UPSes completely and feed grid power directly to the facility.

Unfortunately these two were bought at the same time (and came from the
same production batch), so they had the same fault - which was
apparently a bad shipment of capacitors which started to leak fluid
after a period of time. Due to some unfortunate design choices in the
MGEs, these capacitors happened to be placed directly above the main
controller circuitry, and the leaky capacitors eventually caused the
whole thing to short, in a rather spectacular way I might add.

And yes, the bypass failed as well. The engineers from MGE explained the
reason to us, although I can't say I remember the details
(electricity really isn't my field :). That being said, after
the replacement of the fried components, the engineers from MGE came
on-site and rebuilt the entire bypass system in these two boxes some
time later, at no charge of course - and we have not had any problems
with them in the two years they have now been in operation after this
incident.

/leg


