nanog mailing list archives

Re: Followup British Telecom outage reason

From: Ian Duncan <Ian.Duncan () sympatico ca>
Date: Mon, 26 Nov 2001 10:46:49 -0500


Wandering off the subject of BT's misfortune ...

Sean Donelan wrote:

On Mon, 26 Nov 2001, Christian Kuhtz wrote:


[...]

Faults will happen.  And nothing matters as much as how you prepare for
when they do.


Mean Time To Repair is a bigger contributor to Availability calculations
than the Mean Time To Failure.  It would be great if things never failed.


And Mean Time To Fault Detected (Accurately) is usually the biggest
sub-contributor within Repair but that's kinda your point.
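
To put rough numbers on the exchange above, here is a minimal sketch using the usual steady-state formula, Availability = MTTF / (MTTF + MTTR). The figures (1000 hours between failures, 3 hours to detect, 1 hour to fix) are invented for illustration and do not come from BT or anyone in this thread, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8760.0

def availability(mttf_hours, mttr_hours):
    """Steady-state availability: A = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

def annual_downtime_hours(mttf_hours, mttr_hours):
    """Expected hours of downtime per year at the given MTTF and MTTR."""
    return (1.0 - availability(mttf_hours, mttr_hours)) * HOURS_PER_YEAR

# Illustrative assumption: repair time = time to detect the fault accurately
# plus time to actually fix it once it is understood.
detect_hours = 3.0
fix_hours = 1.0
mttr_hours = detect_hours + fix_hours
mttf_hours = 1000.0

baseline = annual_downtime_hours(mttf_hours, mttr_hours)

# Option 1: make the system fail half as often.
doubled_mttf = annual_downtime_hours(2.0 * mttf_hours, mttr_hours)

# Option 2: keep the failure rate, but halve the repair time.
halved_mttr = annual_downtime_hours(mttf_hours, mttr_hours / 2.0)

# Option 3: only improve detection (3 hours down to 1), leave the fix alone.
faster_detection = annual_downtime_hours(mttf_hours, 1.0 + fix_hours)

for label, hours in [
    ("baseline", baseline),
    ("MTTF doubled", doubled_mttf),
    ("MTTR halved", halved_mttr),
    ("detection 3h -> 1h", faster_detection),
]:
    print(f"{label:20s} {hours:6.2f} hours of downtime per year")
```

With these made-up inputs, halving the repair time buys the same downtime reduction as doubling the time between failures, and all of that gain here comes from detecting the fault sooner. Which of the two is cheaper to achieve in practice is a separate question the arithmetic does not answer.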


But some people are making their systems so complicated chasing the Holy
Grail of 100% uptime, they can't figure out what happened when it does
fail.


Similar people pursue the creation of a perpetuum mobile. A strange and somewhat
congruent example I stumbled into recently is:
http://www.sce.carleton.ca/netmanage/perpetum.shtml.

Overall simplicity of the system, including failure detection mechanisms, and real
redundancy are the most reliable tools for availability. Of course, popping just a
few layers out, profit and politics are elements of most systems.

Murphy's revenge: The more reliable you make a system, the longer it will
take you to figure out what's wrong when it breaks.


Hmm.
