nanog mailing list archives

Re: Single IP routing problems through Level3


From: Jon Wolberg <jon () defenderhosting com>
Date: Sun, 15 Jun 2008 11:56:24 -0400 (EDT)

I've seen the exact same symptoms before with another provider, and it turned out to be a Layer 3 port-channel that 
was not balanced properly because one link was down but had not been detected as such.  It was also causing very 
sporadic latency spikes and dropped packets.

Jon
----- Original Message -----
From: "Matt Palmer" <mpalmer () hezmatt org>
To: nanog () nanog org
Sent: Sunday, June 15, 2008 10:39:56 AM GMT -05:00 US/Canada Eastern
Subject: Re: Single IP routing problems through Level3

On Sun, Jun 15, 2008 at 11:12:25AM -0300, Rubens Kuhl Jr. wrote:
> 1) I've seen this behavior before; you are not alone in the universe.

Thank $DEITY for that.  <grin>

> 2) Most likely there is a balanced channel on the path, either L3 or
> L2, and one of the links in the bundle is dead but has not been
> detected as such.

A multiple-link bundle which is load balanced by source/destination pair
with an undetected dud link?  I hadn't thought of that, but it does make an
*awful* lot of sense.  (Although, not being a big-network transit kinda
person, I don't know if such a thing actually exists <grin>) I'll mention it
(or ask about it) as a possibility next time I talk to the relevant people,
though.

Thanks,
- Matt
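
The bundle theory above can be illustrated with a rough sketch.  It assumes
a bundle whose member link is chosen by hashing the source/destination
address pair, with one member silently dropping traffic.  Real equipment
uses vendor-specific hash functions and inputs; the CRC32 here is only a
stand-in, and the function names, the host addresses within the two
netblocks, and the four-link bundle are all made up for illustration.

```python
import ipaddress
import zlib


def member_for(src: str, dst: str, n_links: int) -> int:
    """Pick a bundle member from a hash of the source/destination pair.

    CRC32 is a stand-in; actual routers and switches use their own hashes.
    """
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return zlib.crc32(key) % n_links


def blackholed_pairs(sources, destinations, n_links, dead_link):
    """Return the (src, dst) pairs that hash onto the undetected dead link."""
    return [
        (s, d)
        for s in sources
        for d in destinations
        if member_for(s, d, n_links) == dead_link
    ]


if __name__ == "__main__":
    # Hypothetical hosts inside the two netblocks mentioned in the thread.
    ours = [f"74.201.254.{i}" for i in range(10, 14)]
    theirs = [f"8.12.35.{i}" for i in range(20, 24)]

    # Hypothetical 4-link bundle in which member 2 is dead but still "up".
    for src, dst in blackholed_pairs(ours, theirs, n_links=4, dead_link=2):
        print(f"{src} -> {dst} is dropped")
```

Because the member is chosen per pair, changing the address at either end
re-rolls the hash, so a neighbouring address in the same netblock can work
while the original pair stays broken.  A blacklist or a routing problem for
a whole block would not behave that way.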

On Sun, Jun 15, 2008 at 11:01 AM, Matt Palmer <mpalmer () hezmatt org> wrote:
We're seeing some really weird issues with connections that go through / to
Level3 IP space.  Basically, certain "pairs" of IPs (particular L3 IPs
coupled with particular IPs of ours) have dodgy/nonexistent connectivity,
but if you change the IP at either end everything's hunky dory.

I've sniffed (from both ends) pings going from a host in L3 space to our end
and seen the pings arrive at our end and head back in the direction of L3,
but they never get to their destination.  Traceroutes from L3 stop at the
next-to-last hop, while traceroutes back get to the hop before L3 space and
stop.

All of this behaviour is source/dest *pair* specific -- if I ping/traceroute
from another address (in the same netblock as the problematic IP, so all the
same equipment is involved) at either end, or to another address (again,
same netblock) at either end, it all works again.

I've got two questions:

1) Has anyone else seen similar behaviour from L3 (or other providers,
  even), so I know I'm not going mad?

2) What sort of configuration problem or software bug would cause this sort
  of problem to occur?  If it was an IP blacklist (or even a block routing
  issue) anywhere along the line, surely it wouldn't be sensitive to
  changing the other end's address to another one in the same /24?

Any insight/anecdotes/etc would be greatly appreciated, as it's starting to
do my head in.  Just knowing I'm not alone with this insanity would be nice
at this point.  <grin>

If it makes any difference, the blocks I'm working from at my end are
Internap, in 74.201.254.0/23 (we don't have all of it, just most of it),
while the far end is 8.12.35.0/24.


