nanog mailing list archives

Re: ultradns reachability

From: Joe Abley <jabley () isc org>
Date: Fri, 2 Jul 2004 10:22:09 -0400



On 2 Jul 2004, at 00:18, Christopher L. Morrow wrote:

So, I thought of it like this:
1) Rodney/Centergate/UltraDNS knows where all their 35000 billion copies of
the 2 .org TLD boxes are, what network pieces they are connected to at
which bandwidths and the current utilization
2) Rodney/Centergate/UltraDNS knows which boxes in each location (there
could be multiple inside each pod, right?) are running their dns process
and answering at which rates
3) Rodney/Centergate/UltraDNS knows when processes die and locally stop
pushing requests to said system inside the pod
4) Rodney/Centergate/UltraDNS knows when a pod is completely down (no
systems responding inside the local pod) so they can stop routing the /24
from that pod's location
So, Rodney/Centergate/UltraDNS should know almost exactly when they have a
problem they can term 'critical'... I most probably left out some steps
above, like wedged processes or loss of outbound routing to prefixes
sending requests. I'm sure Paul/ISC has a fairly complete list of failure
modes for anycast DNS services.
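
As a rough sketch of steps 3) and 4) above: drop dead servers from the local rotation, and withdraw the anycast /24 only when nothing in the pod answers. The pod names, server names and the shape of the health data are all hypothetical; this is an illustration of the decision logic, not a description of how UltraDNS actually operates.

```python
# Hypothetical health data: pod name -> {server name -> answered last probe?}
# In practice the booleans would come from real DNS health probes.

def live_servers(pod_health):
    """Servers inside one pod that should keep receiving queries (step 3)."""
    return sorted(name for name, ok in pod_health.items() if ok)

def routing_plan(pods):
    """Announce the anycast /24 from a pod only if some server answers (step 4)."""
    plan = {}
    for pod_name, pod_health in pods.items():
        if live_servers(pod_health):
            plan[pod_name] = "announce"
        else:
            plan[pod_name] = "withdraw"
    return plan

pods = {
    "pod-a": {"ns1": True, "ns2": False},   # one dead process, pod still useful
    "pod-b": {"ns1": False, "ns2": False},  # nothing answering, pull the route
}
print(routing_plan(pods))
```

Note that this only covers failures visible from inside the pod. The wedged-process and outbound-routing cases mentioned above would need probes from outside.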

All the failure modes that ISC has seen with anycast nameserver instances can be avoided (for the authoritative DNS service as a whole) by including one or more non-anycast nameservers in the NS set.

This leaves the anycast servers providing all the optimisation that they are good for (local nameserver in topologically distant networks; distributed DDoS traffic sink; reduced transaction RTT) and provides a fall-back in case of effective reachability problems for the anycast nameservers.


This is so trivial, I continue to be amazed that PIR hasn't done it.
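
A minimal sketch of the reasoning, assuming a resolver can use a zone as long as at least one nameserver in the NS set answers from where it sits. The nameserver names and the "anycast"/"unicast" labels are hypothetical and are not the actual ORG NS set.

```python
def zone_resolvable(ns_set, reachable):
    """True if any nameserver in the NS set is reachable from this vantage point."""
    return any(ns in reachable for ns in ns_set)

def has_unicast_fallback(ns_set):
    """True if the NS set includes at least one non-anycast nameserver."""
    return any(kind == "unicast" for kind in ns_set.values())

# Hypothetical NS sets: nameserver name -> how its address is routed.
anycast_only = {"anycast1.example": "anycast", "anycast2.example": "anycast"}
mixed = dict(anycast_only, **{"unicast1.example": "unicast"})

# A vantage point whose routes to both anycast prefixes are broken,
# but which can still reach the unicast server.
reachable = {"unicast1.example"}

print(zone_resolvable(anycast_only, reachable))
print(zone_resolvable(mixed, reachable))
```

With the anycast-only set the zone is dead from that vantage point; with the mixed set the same routing failure only costs the anycast optimisations.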

The problem then becomes the "Hey, .org is dead!" From where is it dead? What pod are you seeing it dead from? Is it routing TO the pod from you?
FROM the pod to you? The pod itself? Stuck/stale routing information
somewhere on the path(s)? This is very complex, or seems to be to me :(

With the fix above, the problem becomes "hey, *some* of the nameservers for ORG are dead! We should fix that, but since not *all* of them are dead, at least ORG still works."

I think more failure modes will be investigated before that comes :)
fortunately lots of people are already investigating these, eh?

I don't know about lots, but I know of a few. None of the people I know of are using an entire production TLD as their test-bed, however.

Joe
