nanog mailing list archives

Re: Operate until failure


From: David Lesher <wb8foz () nrk com>
Date: Mon, 8 Jan 2001 18:11:20 -0500 (EST)


Unnamed Administration sources reported that Sean Donelan said:


And what if you are not using APCs?

See the menu of systems listed at:

        http://www.exploits.org/nut/

One issue with highly redundant data centers is that the failure modes are
"interesting."  You don't want to shut down due to a single UPS failure, so
you don't use something simple like PowerChute Plus.  You most likely don't
want to shut down based on any automatic signal.  However, you do want a way
for an operator to gracefully shut down a lot of equipment quickly when
the decision is made.

For a server farm, with potentially thousands of individual systems, is
there any standard piece of software you can install on all of the systems
to act as a receiver of a signal to begin a graceful shutdown that does
not depend on a vendor's proprietary interface?  Preferably one which
does not involve running a lot of additional wires.

Good point; you'll likely need a box just to talk to UPSi and
control shutdowns. That, alas, adds a single point of failure.
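
As a rough illustration of the vendor-neutral receiver being asked about,
here is a minimal sketch, not a recommendation of any particular product.
It assumes the control box sends a signed UDP datagram over the existing
network (so no extra wires), and that each server runs a small listener.
The message format, port number, shared key, and function names are all
hypothetical, made up for this example; the only outside command used is
the ordinary Unix "shutdown -h now".

```python
import hashlib
import hmac
import socket
import subprocess
import time

SHARED_KEY = b"change-me"   # hypothetical pre-shared secret
LISTEN_PORT = 9999          # hypothetical UDP port
MAX_AGE_SECONDS = 30        # reject stale or replayed requests


def sign(timestamp: int, key: bytes) -> bytes:
    """Build a 'SHUTDOWN <timestamp> <hex hmac>' message (made-up format)."""
    body = b"SHUTDOWN %d" % timestamp
    mac = hmac.new(key, body, hashlib.sha256).hexdigest().encode()
    return body + b" " + mac


def is_valid(message: bytes, key: bytes, now: float) -> bool:
    """Accept only a fresh, correctly signed shutdown request."""
    message = message.strip()
    parts = message.split(b" ")
    if len(parts) != 3 or parts[0] != b"SHUTDOWN":
        return False
    try:
        timestamp = int(parts[1])
    except ValueError:
        return False
    if abs(now - timestamp) > MAX_AGE_SECONDS:
        return False
    # Constant-time comparison against the message we would have signed.
    return hmac.compare_digest(sign(timestamp, key), message)


def serve() -> None:
    """Wait for one valid request, then start an ordinary OS shutdown."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", LISTEN_PORT))
    while True:
        data, _addr = sock.recvfrom(1024)
        if is_valid(data, SHARED_KEY, time.time()):
            subprocess.run(["shutdown", "-h", "now"], check=False)
            break
```

The operator's side would send sign(int(time.time()), SHARED_KEY) to each
host (or to a broadcast address) once the decision is made. Nothing here
fires automatically off a UPS alarm, and the sending box is still the
single point of failure noted above.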

Again, this is only needed if people want a graceful shutdown.  If
you can live with a hard shutdown, you wouldn't require this.  If you
use ctrl-alt-del as a normal management practice, I suspect you don't
really require a graceful shutdown.

You really don't want to run all the UPS batteries flat. It will
lengthen the recovery time. (If graceful shutdown is your goal, then
when power is restored you want the UPS to FIRST recharge enough
that it can again gracefully shut down the load if the power turns
out to be back for just a minute or two. Thus you delay restarting
the load.)
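
The restart rule in that parenthetical can be written down as a small
check. This is only a sketch under stated assumptions: the function names
and the safety margin are hypothetical, and it assumes available runtime
scales linearly with battery charge, which real batteries only roughly do.

```python
def runtime_minutes(charge_fraction: float, full_runtime_min: float) -> float:
    """Estimated runtime at the current charge (assumes linear scaling)."""
    return charge_fraction * full_runtime_min


def safe_to_restart(charge_fraction: float,
                    full_runtime_min: float,
                    shutdown_min: float,
                    margin: float = 1.5) -> bool:
    """True once the UPS could carry one more graceful shutdown, with margin."""
    needed = shutdown_min * margin
    return runtime_minutes(charge_fraction, full_runtime_min) >= needed
```

For example, with a UPS good for 20 minutes at full charge and a shutdown
that takes 5 minutes, the load stays off at 10% charge and may come back
at 50%.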




-- 
A host is a host from coast to coast.................wb8foz () nrk com
& no one will talk to a host that's close........[v].(301) 56-LINUX
Unless the host (that isn't close).........................pob 1433
is busy, hung or dead....................................20915-1433

