nanog mailing list archives

RE: Quick question.


From: "Michel Py" <michel () arneill-py sacramento ca us>
Date: Sun, 1 Aug 2004 09:44:13 -0700


Michel Py wrote: 
Terminators are a thing of the past; as a matter of
fact, in California and especially in Sacramento
they're called governators now.

Erik Bais wrote:
You mean, they'll be back?

:-D

Only once, and this model is obsolete and can't be upgraded to
presidentor.


Paul Jakma wrote:
Intel doesn't do fault-tolerant SMP. Running SMP will
lower your MTBF just by the fact that you now have 2 CPUs.

True; this is like RAID-0 arrays: the more disks, the greater the
chance of failure. However, MTBF is not the name of the game here;
availability ratio is, and that depends on the time to restore service
as well as on the failure rate.

In other words, I don't really care if the second processor reduces the
MTBF from 200k hours to 60k hours, but I do care if the second processor
reduces the time to restore service from 24 hours to 20 minutes (7.5
minutes for SNMP to fail the query twice, 1.5 minutes for the tech to
find out that either it's frozen or there's a BSOD, 6 minutes to have
someone go there and reset, 5 minutes to reboot).

The dead processor still has to be replaced, but this is scheduled
maintenance, not outage. A little extra ammo when you have to hunt five
or six nines.
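
To put rough numbers on the trade-off above, here is a minimal sketch.
It assumes the usual steady-state formula, availability = MTBF / (MTBF +
MTTR), and treats the time to restore service as the MTTR; the function
names are made up for illustration, and the MTBF figures are just the
ones quoted above, not measurements.

```python
import math


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability, assuming MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def nines(avail: float) -> float:
    """Express availability as a count of nines (0.999 is about 3)."""
    return -math.log10(1.0 - avail)


# Restore steps from the paragraph above, in minutes:
# SNMP fails twice, tech diagnoses, someone resets, box reboots.
RESTORE_STEPS_MIN = [7.5, 1.5, 6.0, 5.0]
mttr_dual_hours = sum(RESTORE_STEPS_MIN) / 60.0

# Single CPU: 200k hours MTBF, 24 hours to restore service.
uni = availability(200_000, 24.0)
# Dual CPU: 60k hours MTBF, 20 minutes to restore service.
dual = availability(60_000, mttr_dual_hours)

print(f"uniprocessor: {uni:.6%} ({nines(uni):.2f} nines)")
print(f"dual CPU:     {dual:.6%} ({nines(dual):.2f} nines)")
```

Under those assumptions the single-CPU box lands a bit under four nines
and the dual-CPU box a bit over five, despite the much lower MTBF.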


Then there's the fact that you're far more likely to
hit bugs in the OS with SMP than uniproc.

Insignificant in my experience, and it does not outweigh what Alexei
mentioned yesterday: a duallie will keep the system up when a faulty
process hogs 100% CPU, because the second processor is still available.
That also increases the availability ratio.

Michel.

