nanog mailing list archives

Re: Facility wide DR/Continuity


From: gb10hkzo-nanog () yahoo co uk
Date: Wed, 3 Jun 2009 07:53:30 -0700 (PDT)


As with all things, there's no "right answer"; a lot of it depends on three things:

- what you are hoping to achieve
- what your budget is
- what you have at your disposal in terms of numbers of qualified staff available to both implement and support the 
chosen solution

Those are the main business-level factors.  At the technical level, two key factors (although, of course, there are many 
others to consider) are:

- whether you are after an active/active or active/passive solution
- what the underlying application(s) are (e.g. you might have other options such as anycast with DNS)


Anyway, there's a lot to consider.  And despite all the expertise on Nanog, I would still suggest the original poster 
do their fair share of homework. :)






----- Original Message ----
From: Jim Wise <jwise () draga com>
To: gb10hkzo-nanog () yahoo co uk
Cc: nanog () nanog org
Sent: Wednesday, 3 June, 2009 15:42:24
Subject: Re: Facility wide DR/Continuity

gb10hkzo-nanog () yahoo co uk writes:

On the subject of DNS GSLB, there's a fairly well known article on the
subject that anyone considering implementing it should read at least
once.... :)

http://www.tenereillo.com/GSLBPageOfShame.htm
and part 2
http://www.tenereillo.com/GSLBPageOfShameII.htm

Yes it was written in 2004.  But all the "food for thought" that it
provides is still very much applicable today.

One thing I've noticed about this paper in the past that kind of bugs me
is that in arguing that multiple A records are a better solution than a
single GSLB-managed A record, the paper assumes that browsers and other
common internet clients will actually cache multiple A records, and fail
between them if the earlier A records fail.  The first of the two
pages explicitly touts this as a high-availability solution.

However, I haven't observed this behavior from browsers, media players,
and similar programs `in the wild' -- as far as I've been able to tell,
most client software picks an A record from those returned (possibly,
but not usually, skipping those found to be unreachable), and then holds
onto that choice of IP address until the record times out of cache, and
a new request is made.

Have I been unlucky in my observations?  Are there client programs which
do failover between multiple A records returned for a single name --
presumably sticking with one IP for session-affinity purposes until a
failure is detected?
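
For concreteness, the failover behavior being asked about could look
roughly like the sketch below.  It is only an illustration, not a
description of what any real browser or media player does: the names
resolve_all and connect_with_failover are made up for this example, and
it assumes IPv4 (A records) and a plain TCP connection.  The client asks
the resolver for every address behind a name, tries them in the order
returned, and stays with the first one that accepts a connection.

```python
import socket
from typing import Callable, List, Tuple

Address = Tuple[str, int]


def resolve_all(host: str, port: int) -> List[Address]:
    """Return every distinct IPv4 address the resolver gives for host, in order."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses: List[Address] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        addr = (sockaddr[0], sockaddr[1])
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def connect_with_failover(
    addresses: List[Address],
    connect: Callable = socket.create_connection,
    timeout: float = 3.0,
):
    """Try each address in turn; return (address, connection) for the first success.

    The caller would keep using the returned address (session affinity)
    and call this again only after that connection fails.
    """
    last_error = None
    for addr in addresses:
        try:
            return addr, connect(addr, timeout)
        except OSError as exc:
            # Unreachable, refused, or timed out: move on to the next A record.
            last_error = exc
    raise OSError(f"all {len(addresses)} addresses failed") from last_error
```

A client that instead picks one address and holds it until the record
expires from cache corresponds to calling connect() on a single element
of the list and never entering the loop a second time.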

If clients do not behave this way, then the paper's observations about
GSLB for HA purposes don't seem to hold -- though in my limited
experience the paper's other point (that geographic dispatch is Hard)
seems much more accurate (making GSLB a better HA solution than it is a
load-sharing solution, again, at least in my experience).

Or am I missing something?

-- 
                Jim Wise
                jwise () draga com



 

