nanog mailing list archives

Re: NTT communications horrible routing, unresponsive NOC


From: Jared Mauch <jared () puck nether net>
Date: Thu, 24 Mar 2016 08:12:10 -0400

Hello,

Could you send me some more details in private about who you spoke with including email addresses and ticket numbers 
with our NOC if any?

I'd like to understand what happened here as this is an uncharacteristic outcome. I know there was planned work in 
Seattle last night for a software upgrade combined with hardware work, but an operational issue like this should have 
been resolved. 

I'd like to understand the timeline and what broke down if anything. 

Thanks,

Jared Mauch

On Mar 23, 2016, at 7:39 PM, Paras Jha <paras () protrafsolutions com> wrote:

Hi all,

I've been trying to get this issue resolved for the entire day now, but NTT
has been pretty unreceptive here.

We're announcing a large prefix for a client across our network, and we
discovered some insanely high latency.

After tracking down the issue, we determined it to be something wrong with
NTT at their Seattle location. We anycast this prefix, but no matter where
in the world traffic is originating from, it's going to Seattle and then to
Atlanta. Example: Rotterdam in the Netherlands routes from Europe -> east
coast -> west coast Seattle -> Los Angeles -> Atlanta. The gist of it is
that something is seriously messed up at NTT in Seattle.

We contacted our transit provider to try and carry the issue upstream, and
what they told us was:

Sorry for the delay, I've asked NTT to clear the more specific for this one as
well.
The problem seems to be a bug on the NTT side which keeps stale routes in
the routing table for more specifics (at random).
If you have more routes affected please notify us of the routes and I will
ask them to clear the routing table for these routes.
NTT is working on this with their vendor to get this resolved as soon as
possible.


I had spoken to a sales rep for NTT a few weeks prior, and they assured me
that the NOC was top notch, and that all routes were redundant, and they
guaranteed less than 50ms in the US, and all kinds of marketing. However,
it looks like it's all marketing - for this entire day this router has been
causing tons of issues for our clients.

No-exporting it to NTT does not even solve the problem, as NTT's router in
Seattle apparently just decides to keep random more-specific prefixes in it,
causing traffic to go there.

At a loss as to what to do now, since their NOC isn't receptive. Anyone
have someone I can contact off-list to get this issue resolved? It's
especially frustrating because the problem absolutely cannot be resolved on
our end, even with a no-export since NTT is keeping the routes in their
router.
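
For readers following along: the reason a stale more-specific keeps pulling
traffic to one site, whatever is announced or withheld for the covering
prefix, is longest-prefix-match forwarding. The sketch below is only an
illustration under stated assumptions. The prefixes are documentation
addresses and the next-hop labels are made up; none of them come from this
thread, and this is not a model of NTT's routers.

```python
import ipaddress

def best_route(table, destination):
    """Return the (prefix, next_hop) pair with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # The longest prefix wins, regardless of how attractive the covering route is.
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical routing table (documentation prefixes, invented next-hop labels).
table = {
    # Covering anycast prefix, normally routed to the nearest site.
    ipaddress.ip_network("203.0.113.0/24"): "nearest-anycast-site",
    # Stale more-specific that was never withdrawn from the table.
    ipaddress.ip_network("203.0.113.0/25"): "stale-path-via-seattle",
}

if __name__ == "__main__":
    for dst in ("203.0.113.10", "203.0.113.200"):
        prefix, hop = best_route(table, dst)
        print(f"{dst} -> {prefix} via {hop}")
```

In this sketch, destinations inside the stale /25 follow the stale path even
though the /24 is present, and only clearing the /25 from the table restores
the expected path. That matches the upstream's remark that the more
specifics have to be cleared on the NTT side.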

