nanog mailing list archives

Re: Cogent --> Google Public DNS routing issue


From: David Miller <dmiller () tiggee com>
Date: Wed, 17 Aug 2011 12:01:48 -0400

On 8/17/2011 9:13 AM, Patrick W. Gilmore wrote:
On Aug 17, 2011, at 1:07 AM, Christopher Morrow wrote:
On Wed, Aug 17, 2011 at 12:09 AM, Robert Glover<robertg () garlic com>  wrote:
Hello,

We have noticed that from our Cogent link (as well as from ALL U.S. based
points we tested via the Cogent Looking Glass:
http://www.cogentco.com/en/network/looking-glass), traceroutes to 8.8.8.8
and 8.8.5.5 all seem to go over to Europe:
8.8.5.5 ain't the droids you are looking for...
In the traceroute appended to the original post, he did trace to 8.8.4.4.

While it did go all over, I don't see the problem - it got to the destination host.

Anycast is OK for some things, but it depends on BGP.  BGP has zero concept of latency, loss, or geography.  Expecting 
anycast to guarantee an optimal path or location is a grave error.

There are two basic types of anycast:
1. Simple anycast - announce an anycast prefix to whoever/wherever in more than one location.
2. Global anycast + careful configuration - announce an anycast prefix to particular providers at specific, geographically disparate locations and use other options to achieve geographic and/or performant inbound traffic distribution.

Perhaps we need a new term for 2.

Google is clearly attempting to implement 2 and not 1 for their resolving DNS service. Based on Google's claims of speed (and my testing of their response times), they have either found a way to exceed the speed of light with packets or they are managing to keep most of their traffic "local ish" to the requester.
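
To put rough numbers on the speed-of-light point, here is a minimal sketch of the propagation-delay floor on round-trip time. The velocity factor for fiber (about 2/3 of c) and both distances are assumptions chosen for illustration, not measurements of anyone's network.

```python
# Lower bound on round-trip time from propagation delay alone.
# Ignores queuing, serialization, and non-great-circle fiber routes,
# so real RTTs will always be higher than this floor.

C_VACUUM_KM_S = 299_792.458      # speed of light in vacuum
FIBER_VELOCITY_FACTOR = 2 / 3    # assumption: refractive index of roughly 1.5


def min_rtt_ms(one_way_km: float) -> float:
    """Return the minimum possible RTT in milliseconds for a one-way distance."""
    speed_km_s = C_VACUUM_KM_S * FIBER_VELOCITY_FACTOR
    return 2 * one_way_km / speed_km_s * 1000


# Hypothetical distances: a "local ish" resolver vs. one across the Atlantic.
for label, km in [("in-region, ~500 km", 500), ("transatlantic, ~6000 km", 6000)]:
    print(f"{label}: RTT >= {min_rtt_ms(km):.1f} ms")
```

Under these assumptions a resolver on the other side of the Atlantic cannot answer in much under 60 ms, so consistently low response times imply the traffic is staying reasonably close to the requester.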

To say that anycast "relies on BGP" and that expecting an optimal path is therefore an error is disingenuous (I want a better word, but this one will do). The internet as a whole "relies on BGP" and yet we expect mostly optimal paths. While it is true that BGP has no capacity to account for latency or loss, IGPs, which can take these factors into account, end at the borders of networks (where prefixes are passed using BGP). This is what makes up the "inter net".

If you were tracing from a host in Ashburn to a unicast host in NYC and your path passed through San Jose, then you would say that was an issue. The same would be true with an anycast destination address.

As to geography, IGPs don't have a concept of geography either. A router in NYC doesn't know or care that the router at the other end of a link is in CHI. All it knows is the prefixes that it gets from that router and metrics to choose a best path for them. BGP combined with "proper" (i.e. distributed) peering of networks does provide performant paths for traffic. In an anycast configuration the "careful configuration" is selecting providers to announce anycast prefixes to and communities that you put on the prefixes to control redistribution. Global anycast + careful configuration can and does provide mostly performant paths and a very high level of geographic fidelity - though, granted, not "guaranteed" (at least not guaranteed at a higher level than unicast prefixes).

You can't "guarantee" performant paths ever (regardless of anycast or unicast) if any path between the source and destination crosses the border between two networks because some networks will choose a "primary" upstream (single homed or heavily pref'ed) that only picks up a prefix in a particular area and sends all of the traffic there. The originator of the prefix can depref that provider to try to influence path selection, but some networks will doggedly prefer to send packets to that network despite the efforts of the originator. The only thing to do then is to ask why this network selected that particular upstream and then to explain to them why that might not have been the best choice, if they want performant paths...
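
The "doggedly prefer" case can be sketched as follows. This is a deliberately simplified model of BGP best-path selection (highest LOCAL_PREF, then shortest AS_PATH); real implementations have further tie-breakers (origin, MED, eBGP vs. iBGP, IGP cost, router ID). The AS numbers come from the documentation range, and the local-pref values and RTTs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: str
    local_pref: int   # set by the receiving network's own policy
    as_path: tuple    # AS numbers, nearest first
    rtt_ms: float     # known to us, but never seen by BGP


def best_path(routes):
    """Pick the route with the highest LOCAL_PREF, then the shortest AS_PATH.

    Note that rtt_ms is not part of the key: latency plays no role.
    """
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


ORIGIN = 64500  # hypothetical originator of the anycast prefix

candidates = [
    # Primary upstream only hears the prefix far away; the originator has
    # prepended its own AS twice to try to depref this path.
    Route("primary upstream", 200, (64496, ORIGIN, ORIGIN, ORIGIN), 150.0),
    # Backup upstream hears the prefix nearby over a short path.
    Route("backup upstream", 100, (64497, ORIGIN), 10.0),
]

chosen = best_path(candidates)
print(f"selected: {chosen.via} (AS_PATH length {len(chosen.as_path)}, RTT {chosen.rtt_ms} ms)")
```

Because LOCAL_PREF is evaluated before AS_PATH length, the originator's prepending has no effect on a network that has heavily pref'ed one upstream; only a policy change on that network's side moves the traffic.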

The possible reasons for this are nearly innumerable.  Perhaps Cogent <> Google is congested in the US so one or the 
other prefers EU?  Perhaps there is some IGP metric messed up inside Cogent that prefers the EU?  Perhaps more nefarious 
problems, such as Google de-peering Cogent in the US?  Etc., etc.

You may be able to find out if you look, and you may not (I didn't even try).  But even if you do figure out the answer, you 
can't fix it.  Only Cogent and/or Google can.

My traces show all the Cogent locations in the US that I traced from going to Telia in EU and then to Google.

My traces from Telia locations in the US all (properly) reach Google destinations in the US.

So, Cogent is only receiving/using/preferring these two prefixes from their peering(s) with Telia in EU.

As to the root cause of that... only the players in that game can say.


Moreover, you can see things like this with anycast even when there is no problem!


The OP believes that it is a problem. You *can* see this with anycast, but I would say that this *is* a "problem" (for my definition of "problem", which admittedly may be different from others'). There are many potential solutions to the problem, the most obvious of which is for the OP to stop preferring Cogent when sending traffic to these prefixes.

To the OP: I have to wonder what factors were used to decide "primary" vs "backup" provider. If "price", then you should expect issues with less performant routing. If "quality", then what measures were used to determine a "quality" ranking? I am also curious as to who the "backup" is (but that is just morbid curiosity).

-DMM


