nanog mailing list archives

Re: FYI Netflix is down


From: jamie rishaw <j () arpa com>
Date: Sat, 30 Jun 2012 00:38:51 -0500

you know what's happening even more?

..Amazon not learning their lesson.

they just had an outage quite similar.. they "performed a full audit" on
electrical systems worldwide, according to the rfo/post mortem.

looks like they need to perform a "full and we mean it" audit, and like
I've been doing/participating in at dot coms for a decade plus: Actually Do
Regular Load tests..

Related/equally to blame: companies that rely heavily on one aws zone, or
arguably "one cloud" (period), are asking for it.

Please stop these crappy practices, people.  Do real-world DR testing.
 Play "What If This City Dropped Off The Map" games, because tonight, parts
of VA in fact did.

Down: Instagram, Pinterest, Netflix, Heroku, Woot, Pocket (Read It Later),
and on and on.  A bunch of OpenID sites.  A bunch of DNS sites (think
zoneedit et al).  In fact, probably nearly a /12 if not more of space..

Blame lies both with AWS (again) and with these service providers.

They all should know better.


-j
On Jun 29, 2012 11:22 PM, "Justin M. Streiner" <streiner () cluebyfour org>
wrote:

On Fri, 29 Jun 2012, Mike Lyon wrote:

 Whatever happened to UPSs and generators?


They can and do fail.  See list archives for numerous reports and examples
:)

Generators are capable of not starting.
ATSs can get into a situation where they don't transfer loads properly, or
they can't start the generator(s).
UPSs can fail, drain out, or be left in bypass.
Breakers can trip and need a manual reset.
etc...

jms

 On Fri, Jun 29, 2012 at 8:45 PM, Jason Baugher <jason () thebaughers com>
wrote:

 Nature is such a PITA.


On 6/29/2012 10:42 PM, James Laszko wrote:

 To further expand:

8:21 PM PDT We are investigating connectivity issues for a number of
instances in the US-EAST-1 Region.

 8:31 PM PDT We are investigating elevated errors rates for APIs in the
US-EAST-1 (Northern Virginia) region, as well as connectivity issues to
instances in a single availability zone.

 8:40 PM PDT We can confirm that a large number of instances in a single
Availability Zone have lost power due to electrical storms in the area. We
are actively working to restore power.

-----Original Message-----
From: Grant Ridder [mailto:shortdudey123 () gmail com]
Sent: Friday, June 29, 2012 8:42 PM
To: Jason Baugher
Cc: nanog () nanog org
Subject: Re: FYI Netflix is down

 From Amazon


Amazon Elastic Compute Cloud (N. Virginia) (http://status.aws.amazon.com/)
8:21 PM PDT We are investigating connectivity issues for a number of
instances in the US-EAST-1 Region.
8:31 PM PDT We are investigating elevated errors rates for APIs in the
US-EAST-1 (Northern Virginia) region, as well as connectivity issues to
instances in a single availability zone.

-Grant

On Fri, Jun 29, 2012 at 10:40 PM, Jason Baugher <jason () thebaughers com>
wrote:


 Seeing some reports of Pinterest and Instagram down as well. Amazon
cloud services being implicated.


On 6/29/2012 10:22 PM, Joe Blanchard wrote:

 Seems that they are unreachable at the moment. Called and there's a
recorded message stating they are aware of an issue, no details.

-Joe

--
Mike Lyon
408-621-4826
mike.lyon () gmail com

http://www.linkedin.com/in/mlyon




