nanog mailing list archives

Re: "They all suck!" Re: UPS failure modes (was: fire at NAC)

From: Sean Donelan <sean () donelan com>
Date: Thu, 29 May 2003 16:53:43 -0400 (EDT)


On Thu, 29 May 2003, Alex Rubenstein wrote:

Even in instances where 'High availability' is designed, in the case where
one of the units has a failure that causes a fire and FM200 dump, either
the FM200 will still trigger an EPO, or the fire department will.


Why do you think most telephone central offices don't have EPOs?  It is
possible to meet code without an EPO if you have a smart PE on the
project.

So, the second 'high available' unit will generally not prevent you from
dropping the critical load, but instead, will help you get back on line
quicker.


That's why you have geographic diversity: if one node goes down, the other
location may be unaffected.

A much cheaper and easier to implement external maintenance
make-before-break bypass will accomplish the same thing.


Pick two out of three.  The "Internet philosophy" has tended to be
lots of cheap equipment connected by diverse paths.  Designing for
failure also means defining "failure" in terms of the service, not
particular pieces of equipment.  I don't care how many 9's your switch
has, I just care if my packets get through.

I've heard many a story of the paralleling gear causing the problem in the
first place, as well...


Yep, tying together "redundant" systems with paralleling gear turns two
independent systems into one "co-dependent" system.  In a failure
situation, you want to compartmentalize the failure.  Losing half your
systems may be better than losing all your systems.
