nanog mailing list archives
Re: availability and resiliency
From: Jay Tribick <jay.tribick () carrier1 net>
Date: Sat, 30 Sep 2000 20:34:01 +0100
and retry the failing machine instruction on a hot-spare. That's after a reset-and-retry on the failing processor has proven it's a hard failure and not a soft one. The mind boggles.... ;).. and the concept of this happening on Wintel hardware running anything is sheer ludicrousy.

Whoever mentioned that SMP can help you get high uptime boxes is smoking heavy crack in most cases. Note that the big-end Alpha and Sun gear is NUMA, not SMP. Different kettle of fish there, and if you need an explanation as to why it's more likely to happen with NUMA and not SMP, there are lots of hardware books out there. :-)
If you're looking at implementing "5 9's", check out Sun's FT1800: very nice box (read: looks nice ;), easy to admin, and so far has been rock solid for us. Not that Solaris crashes much anymore anyway, but at least you no longer have to worry about hardware resilience with the FT. All you have to worry about then is disparate power and software stability.

--
Regards,
Jay Tribick
Senior Systems Engineer
Carrier1
Voice: +44 207 531 3874
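For anyone wanting to put numbers on "5 9's", here is a rough sketch of the arithmetic in Python. The function names are made up for illustration, and the per-component availability figures in the usage lines are purely hypothetical placeholders, not measurements of any real box. It assumes the hardware, power, and software are independent and all have to be up at once, so their availabilities multiply.

```python
import math

# Average year length in seconds (365.25 days), an assumption for this sketch.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def allowed_downtime_seconds(nines: int) -> float:
    """Yearly downtime budget for an availability of N nines (5 -> 99.999%)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10.0 ** -nines
    return SECONDS_PER_YEAR * unavailability


def series_availability(*components: float) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    for a in components:
        if not 0.0 <= a <= 1.0:
            raise ValueError("availability must be between 0 and 1")
    return math.prod(components)


if __name__ == "__main__":
    budget = allowed_downtime_seconds(5)
    print(f"five nines allows about {budget / 60:.2f} minutes of downtime per year")

    # Hypothetical figures: fault-tolerant hardware, power feed, software stack.
    overall = series_availability(0.99999, 0.9999, 0.999)
    print(f"overall availability of the chain: {overall:.5f}")
```

Five nines works out to roughly 5.26 minutes of downtime a year, and the chain as a whole can never do better than its weakest component, which is why power and software still matter once the hardware is fault tolerant.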