nanog mailing list archives

Re: Reliable Cloud host ?


From: George Herbert <george.herbert () gmail com>
Date: Mon, 27 Feb 2012 11:19:27 -0800

On Mon, Feb 27, 2012 at 7:28 AM, William Herrin <bill () herrin us> wrote:
On Sun, Feb 26, 2012 at 7:02 PM, Randy Carpenter <rcarpen () network1 net> wrote:
On Feb 26, 2012, at 4:56 PM, Randy Carpenter wrote:
1. Full redundancy with instant failover to other hypervisor hosts
upon hardware failure (I thought this was a given!)

This is actually a much harder problem to solve than it sounds, and
gets progressively harder depending on what you mean by "failover".

At the very least, having two physical hosts capable of running your
VM requires that your VM be stored on some kind of SAN (usually
iSCSI based) storage system. Otherwise, two hosts have no way of
accessing your VM's data if one were to die. This makes things at
least an order of magnitude more expensive.

This does not have to be true at all.  Even having a fully fault-tolerant
SAN in addition to spare servers should not cost much more than
having separate RAID arrays inside each of the servers, when you
are talking about 1,000s of servers (which Rackspace certainly has).

Randy,

You're kidding, right?

SAN storage costs the better part of an order of magnitude more than
server storage, which itself is several times more expensive than
workstation storage. That's before you duplicate the SAN and set up
the replication process so that cabinet and room level failures don't
take you out.

This is clearly becoming a not-NANOG-ish thread, however...

Failing to have central shared storage (iSCSI, NAS, SAN, whatever you
prefer) fails the smell test on a local enterprise-grade
virtualization cluster, much less a shared cloud service.

Some people have done tricks with distributing the data using one of
the research-ish shared filesystems, rather than separate shared
storage.  That can be made to work if the host OS model and its
available shared filesystems work for you.  It doesn't work for VMware
vCenter / vMotion-ish stuff, as far as I know.

There are plenty of people doing non-enterprise-grade virtualization.
There's no mandate that you have the ability to migrate a virtual to
another node in realtime or restart it immediately on another node if
the first node dies suddenly.  But anyone saying "we have a cloud" and
not providing that type of service is in marketing, not engineering.
From a systems architecture point of view, you can't call it a cloud
without that.


-- 
-george william herbert
george.herbert () gmail com

