nanog mailing list archives

RE: Data Center testing


From: Dylan Ebner <dylan.ebner () crlmed com>
Date: Wed, 26 Aug 2009 15:32:42 +0000

I would hope that the data center engineers built and ran a suite of tests to find failure points before the network 
infrastructure was put into production. That said, changes are made constantly to the infrastructure, and it can very 
quickly become difficult to know whether the failovers are still going to work. This is one place where the power and 
network in a datacenter diverge. The power systems may take on additional load over the life of the facility, but the 
transfer switches and generators do not get many changes made to them. Also, network infrastructure tests are not 
going to be zero impact if there is a config problem. Generator tests are much easier. You can start up the generator 
and do a load test. You can also load test the UPS systems. Then you can initiate your failover. Network tests are not 
going to be zero impact even if there isn't a problem. Say you wanted to power fail an edge router participating in 
BGP: it can take 30 seconds for that router's routes to be withdrawn from the BGP tables of the world. The other 
problem is that network failures always seem to come from "unexpected" issues. I always love it when I get an outage 
report from my ISPs or datacenter and they say an "unexpected issue" or "unforeseen issue" caused the problem.


Dylan
-----Original Message-----
From: Dan Snyder [mailto:sliplever () gmail com] 
Sent: Monday, August 24, 2009 8:39 AM
To: Ken Gilmour
Cc: NANOG list
Subject: Re: Data Center testing

We have done power tests before and had no problems.  I guess I am looking for someone who tests the network 
equipment beyond just power tests.  We had an outage due to a configuration mistake that only became apparent when a 
switch failed.  It didn't cause a problem, however, when we did a power test for the whole data center.

-Dan


On Mon, Aug 24, 2009 at 9:31 AM, Ken Gilmour <ken.gilmour () gmail com> wrote:

I know Peer1 in Vancouver regularly send out notifications of 
"non-impacting" generator load testing, like monthly. Also InterXion 
in Dublin, Ireland have occasionally sent me notifications that there 
was a power outage of less than a minute, but their backup 
successfully took the load.

I only remember one complete outage in Peer1 a few years ago... Never 
seen any outage in InterXion Dublin.

Also I don't ever remember any power failure at AiNet (Deepak will 
probably elaborate)

2009/8/24 Dan Snyder <sliplever () gmail com>:
Does anyone know of any data centers that do failure testing of 
their networking equipment regularly? I mean to verify that 
everything fails over properly after changes have been made over 
time.  Are there any best practice guides for doing this?

Thanks,
Dan




