NANOG mailing list archives

Re: Reliable Cloud host ?


From: Randy Carpenter <rcarpen () network1 net>
Date: Sun, 26 Feb 2012 19:02:20 -0500 (EST)



----- Original Message -----

On Feb 26, 2012, at 4:56 PM, Randy Carpenter wrote:
We have been using Rackspace Cloud Servers. We just realized that
they have absolutely no redundancy or failover after experiencing
an outage that lasted more than 6 hours yesterday. I am appalled
that they would offer something called "cloud" without having any
failover at all.

Basic requirements:

1. Full redundancy with instant failover to other hypervisor hosts
upon hardware failure (I thought this was a given!)

This is actually a much harder problem to solve than it sounds, and
gets progressively harder depending on what you mean by "failover".

At the very least, having two physical hosts capable of running your
VM requires that your VM be stored on some kind of SAN (usually
iSCSI based) storage system. Otherwise, two hosts have no way of
accessing your VM's data if one were to die. This makes things an
order of magnitude (or more) more expensive.

This does not have to be true at all. Even having a fully fault-tolerant SAN in addition to spare servers should not
cost much more than having separate RAID arrays inside each of the servers, when you are talking about 1,000s of
servers (which Rackspace certainly has).

But then all you've really done is moved your single point of failure
to the SAN. Small SANs aren't economical, so you end up having tons
of customers on one SAN. If it dies, tons of VMs are suddenly down.
So you now need a redundant SAN capable of live-mirroring everyone's
data. These aren't cheap either, and they add a lot of complexity to
things. (How do you handle failover if it died mid-write? Who has the
most recent data after a total blackout? Etc.)

NetApp. HA heads. Done. Add a DR site with replication, and you can survive a site failure, and be back up and running 
in less than an hour. I would think that the big datacenter guys already have this type of thing set up.
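
To make the "who has the most recent data after a total blackout" problem concrete, here is a minimal Python sketch
of the promotion decision when both mirrors come back up, assuming each side stamps committed writes with a
monotonically increasing generation number. Real arrays (SnapMirror, DRBD, etc.) track much richer metadata; all
names here are illustrative only.

    from dataclasses import dataclass

    @dataclass
    class Mirror:
        name: str
        generation: int       # last committed write generation
        clean_shutdown: bool  # False if this side died mid-write

    def pick_primary(a, b):
        """Decide which side of the mirror to promote after a blackout."""
        if a.generation != b.generation:
            # More committed writes wins (generations advance only on commit).
            return a if a.generation > b.generation else b
        if a.clean_shutdown != b.clean_shutdown:
            # Same generation, but one side may hold a torn, half-applied write.
            return a if a.clean_shutdown else b
        return a  # true tie: either copy is usable

    print(pick_primary(Mirror("san-a", 1042, True),
                       Mirror("san-b", 1041, False)).name)  # -> san-a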

And this is really just saying "If hardware fails, I want my VM to
reboot on another host." If you're defining high availability to mean
"even if a physical host fails, I don't want a second of downtime, my
VM can't reboot," you want something like VMware Fault Tolerance,
where your VM is actually running on two hosts at once, in lock-step,
so if one fails the other takes over transparently. Licenses for this
are ridiculously expensive, and it requires some reasonably complex
networking and storage systems.

I don't need that kind of HA, and I understand that it is not going to be available. 15 minutes of downtime is fine.
6 hours is completely unacceptable, and it is false advertising to say you have a "Cloud" service and then discover
that you could have *indefinite* downtime.

And I still haven't touched on having to make sure both physical
hosts capable of running your VM are on totally independent
switches/power/etc., that the SAN has multiple interfaces so it's not
all going through one switch, and so on.

That is all just basic datacenter design. I have that level of redundancy with my extremely small datacenter. I only 
have 2 hypervisor hosts running around 12 VMs.

I also haven't run into anyone deploying a
high-availability/redundant system where they haven't accidentally
ended up with a split-brain scenario (network isolation causes the
backup node to think it's live, when the primary is still running).
Carefully synchronizing things to prevent this is hard and fragile.

I've never had it happen. It won't, if you properly set up failover (look at STONITH).
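
For reference, STONITH ("shoot the other node in the head") boils down to one rule: never promote the standby until
the old primary has been confirmed power-fenced over an out-of-band path. A minimal Python sketch of that rule,
with fence() standing in for a real IPMI or managed-PDU fencing agent:

    def fence(node):
        """Cut power to `node` out-of-band (IPMI / managed PDU); True on success."""
        print(f"power-cycling {node} via PDU")
        return True

    def promote_standby(primary, standby):
        # Lost heartbeats are not proof the primary is dead; it may only be
        # network-isolated. Fencing turns "probably dead" into "definitely
        # dead", which is what prevents split-brain.
        if not fence(primary):
            raise RuntimeError("fencing failed; refusing to promote standby")
        print(f"{standby} is now primary")

    promote_standby("node-a", "node-b")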

I'm not saying you can't have this feature, but it's not typical in
"reasonably priced" cloud services, and it's nearly unheard-of for it
to be applied automatically. Just moving your virtual machine from
local storage to iSCSI-backed storage drastically increases disk
latency and caps the whole physical host's disk speed at 1 Gb/s
No, it doesn't. Haven't you heard of multipath? Using four 1 Gb/s
paths gives me about the same I/O as a local RAID array, with the
added feature of failover if a link drops. Four 1 Gb/s ports are
ridiculously cheap. And 10 GbE is not nearly as expensive as it used
to be.
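
The back-of-the-envelope arithmetic behind that multipath claim, in Python (the 10% protocol-overhead figure is a
rough assumption, not a benchmark):

    paths, link_gbps, efficiency = 4, 1.0, 0.9  # ~10% TCP/iSCSI overhead
    mbytes_per_sec = paths * link_gbps * 1000 / 8 * efficiency
    print(f"~{mbytes_per_sec:.0f} MB/s aggregate")  # ~450 MB/s
    # Comparable to a small local SATA RAID array of the era, and losing one
    # path costs 25% of the bandwidth instead of the whole LUN.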

(not much deployment of 10GbE adapters among the low-priced VM
providers yet). Any provider who automatically provisions a virtual
machine this way will get complaints that their servers are slow,
which is true compared to someone selling VMs that use local storage.
The "running your VM on two hosts at once" system has such a
performance penalty, and costs so much in licensing, that you really
need to NEED it for it not to be a ridiculous waste of resources.

I don't follow what you mean by "running the VM on two hosts." I just want my single virtual machine to be booted up
on a spare hypervisor if there is a hypervisor failure. No license costs for that, and it should not have any
performance implications at all.
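
That restart-on-failure behavior is what cluster managers like Pacemaker provide. A minimal Python sketch of the
idea, assuming the VM disks live on shared storage; host_is_alive() and boot_vm_on() are hypothetical stand-ins for
a heartbeat check and a hypervisor API call:

    HOSTS = {"hv1": ["vm-a", "vm-b"], "hv2": []}  # hv2 is the idle spare

    def host_is_alive(host):
        # Stand-in for a heartbeat over a dedicated network; hv1 is "dead" here.
        return host != "hv1"

    def boot_vm_on(vm, host):
        print(f"booting {vm} on {host} from shared storage")

    def watchdog():
        for host, vms in list(HOSTS.items()):
            if vms and not host_is_alive(host):
                # A real system would fence the failed host first (see STONITH
                # above) before restarting its VMs elsewhere.
                spare = next(h for h in HOSTS if not HOSTS[h])
                for vm in vms:
                    boot_vm_on(vm, spare)
                HOSTS[spare], HOSTS[host] = vms, []

    watchdog()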

Amazon comes sort of close to this, in that their storage is mostly
(if not totally) separate from the hosts running your code. But they
have had failures knock out access to that storage, so it's still not
where I think you're saying you want to be.

The moral of the story is that just because it's "in the cloud", it
doesn't gain higher reliability unless you're specifically taking
steps to ensure it. Most people solve this by taking things that are
already distributable (like DNS) and setting up multiple DNS servers
in different places - that's where all this "cloud stuff" really
shines.

The funky problem with DNS specifically is that all the servers need to be up, or someone will get bad answers. The
lack of a preference system like the one MX records have has hurt in this regard. Anycast fixes this to a certain
degree, but anycast is another challenge for these hosting providers.
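
A quick simulation of that DNS point: resolvers pick among NS records with no preference, so one nameserver that is
up but serving bad data poisons a proportional share of queries. Pure Python, no real DNS involved:

    import random

    nameservers = {"ns1": "good", "ns2": "good", "ns3": "bad"}  # ns3 serving stale data

    bad = sum(1 for _ in range(10_000)
              if nameservers[random.choice(list(nameservers))] == "bad")
    print(f"{bad / 100:.1f}% of queries got a bad answer")  # ~33%, not 0%
    # An MX-style preference would let resolvers deprioritize ns3; with plain
    # NS records, only anycast (one IP, many instances) hides the sick node.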
 
(please no stories about how you were able to make a redundant
virtual machine run using 5-year-old servers in your basement; I'm
talking about something that's supportable at provider scale, and
isn't adding more single points of failure)

I have actually done this :-)

But I also have a fully redundant system at our main office using very few components. We also have a DR site,
connected with fiber. The challenge we have is when we run into routing issues upstream that are beyond our control.
Hence the need to have a few things also hosted externally, both geographically and routing-wise.

