nanog mailing list archives

Re: FYI Netflix is down


From: Rodrick Brown <rodrick.brown () gmail com>
Date: Tue, 3 Jul 2012 11:35:06 -0400


On Jul 3, 2012, at 9:11 AM, "Dan Golding" <dgolding () ragingwire com> wrote:

-----Original Message-----
From: James Downs [mailto:egon () egon cc]


On Jul 2, 2012, at 7:19 PM, Rodrick Brown wrote:

People are acting as if Netflix is part of some critical service; they
stream movies, for Christ's sake. Some acceptable level of loss is fine
for 99.99% of Netflix's user base. Just as with cable, electricity, and
running water, I suffer a few hours of losses each year from those
services. It sucks, yes; is it the end of the world? No.

You missed the point.

And very publicly missed the point, too. The Netflix issues led to a
large discussion of downtime, testing, and fault tolerance that has been
very useful for the community and could lead to some good content for
NANOG conferences (/pokes PC). For Netflix (and all other similar
services) downtime is money and money is downtime. There is a
quantifiable cost for customer acquisition and a quantifiable churn
during each minute of downtime. Mature organizations actually calculate
and track this. The trick is to ensure that you have balanced the cost
of greater redundancy vs the cost of churn/customer acquisition. If you
are spending too much on redundancy, it's as big a mistake as spending
too little.
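
To make the balancing act above concrete, here is a rough sketch in
Python. Every figure in it (the three redundancy options, their yearly
cost, their expected downtime minutes, the churn per minute, and the
acquisition cost) is a made-up assumption for illustration, not data from
Netflix or anyone else, and the function names are hypothetical too. It
simply prices churn as customers lost per minute of downtime times the
cost of acquiring a replacement, then adds the redundancy spend.

```python
def annual_downtime_cost(downtime_minutes, churn_per_minute, acquisition_cost):
    """Yearly cost of replacing customers who leave during downtime."""
    return downtime_minutes * churn_per_minute * acquisition_cost


def best_option(options, churn_per_minute, acquisition_cost):
    """Pick the option with the lowest redundancy spend plus churn cost."""
    def total(opt):
        return opt["redundancy_cost"] + annual_downtime_cost(
            opt["downtime_minutes"], churn_per_minute, acquisition_cost
        )
    return min(options, key=total)


# Hypothetical inputs: all numbers below are assumptions, not measurements.
OPTIONS = [
    {"name": "single region", "redundancy_cost": 0, "downtime_minutes": 500},
    {"name": "multi-zone", "redundancy_cost": 200_000, "downtime_minutes": 120},
    {"name": "multi-region", "redundancy_cost": 900_000, "downtime_minutes": 30},
]
CHURN_PER_MINUTE = 20   # customers lost per minute of downtime (assumed)
ACQUISITION_COST = 50   # dollars to acquire one replacement customer (assumed)

best = best_option(OPTIONS, CHURN_PER_MINUTE, ACQUISITION_COST)
print(best["name"])  # prints "multi-zone"
```

With these invented numbers the middle option wins: the cheapest setup
loses more to churn than it saves, and the most redundant setup costs
more than the churn it prevents.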

I totally got the point, and the last bit of my post was just tongue-in-cheek.

As I stated in my original response, it's very unrealistic to plan for every possible failure scenario given the
constraints most businesses face when implementing BCP today. I doubt Amazon gave much thought to multiple site outages 
and clients not being able to dynamically redeploy their engines because of inaccessibility from ELB.



Also, I don't think there is an acceptable level of downtime for water.
Neither do water utilities. 

- Dan


