nanog mailing list archives

Re: FYI Netflix is down

From: George Herbert <george.herbert () gmail com>
Date: Mon, 2 Jul 2012 14:04:08 -0700

On Mon, Jul 2, 2012 at 12:43 PM, Greg D. Moore <mooregr () greenms com> wrote:

> At 03:08 PM 7/2/2012, George Herbert wrote:
>
>> If folks have not read it, I would suggest reading Normal Accidents by
>> Charles Perrow.
>
> The "it can't happen" is almost guaranteed to happen. ;-)  And when it does,
> it'll often interact in ways we can't predict or sometimes even understand.


Seconded.

There are also aerospace and nuclear and failure analysis books which
are good, but I often encourage people to start with that one.

> As for pulling the plug to test stuff: I recall a demo at NetApp in the
> early 00's.  They were talking about their fault tolerance and how great it
> was.  So I walked up to their demo array and said, "So, it shouldn't be a
> problem if I pulled this drive right here?"  Before I could, the salesperson
> or tech guy (I can't remember which) told me to stop.  He didn't want to
> risk it.
>
> That right there said loads about their confidence in their own system.


I worked for a Sun clone vendor (Axil) for a while and took some of
our systems and storage to Comdex one year in the 90s.  We had a RAID
unit (Mylex controller) we had just introduced.  Beforehand, I made
REALLY REALLY SURE that the pull-the-disk and pull-the-redundant-power
tricks worked.  And I showed them to people with the "Please keep in
mind that this voids the warranty, but here we *rip* go...".  All of
the other server vendors were giving me dirty looks for that one.
Apparently I sold a few systems that way.

You have to watch for connector wear-out and things like that, but ...

On all the clusters I've built, I've insisted on a plug-pull test at
burn-in time on all the major components.  We caught things with those
from time to time.  Especially with N+1: if it is really N+0 due to a
bug or flaw, you need to know that...
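
A toy model of that check, as a sketch only.  The component names,
capacities, and the `healthy` flag are all invented for illustration,
and the real test means physically pulling hardware, not running
Python.  It just shows why a nominally N+1 setup with one silently dead
unit fails the pull test on every unit that still works:

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    capacity: int          # units of load this component can carry
    healthy: bool = True   # False models a latent bug or flaw (hypothetical)


def survives_pull(components, pulled, required):
    """True if the remaining working components still cover the required load."""
    remaining = sum(
        c.capacity for c in components if c is not pulled and c.healthy
    )
    return remaining >= required


def plug_pull_test(components, required):
    """Pull each component in turn; return the names whose removal drops the load."""
    return [
        c.name for c in components
        if not survives_pull(components, c, required)
    ]


# Nominally N+1: two supplies needed, three fitted.
good = [Component("psu-a", 1), Component("psu-b", 1), Component("psu-c", 1)]

# Same design, but psu-c is silently dead, so it is really N+0.
flawed = [
    Component("psu-a", 1),
    Component("psu-b", 1),
    Component("psu-c", 1, healthy=False),
]

print(plug_pull_test(good, required=2))    # []
print(plug_pull_test(flawed, required=2))  # ['psu-a', 'psu-b']
```

An empty list means every single pull was survivable.  Anything else
is the list of parts you cannot actually afford to lose.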


-- 
-george william herbert
george.herbert () gmail com
