nanog mailing list archives

Re: AGIS Route Flaps Interrupting its Peering?


From: "Rebecca L. Nitzan" <nitzan () es net>
Date: Fri, 05 Jul 96 12:09:44 -0700

Peter et al:

    We too have had nothing but trouble with the NetEdge boxes
(to MAE-East and MAE-West).  They are particularly insidious
when they are "kind of working".  A couple of years ago, when traffic
loads were lower, they seemed to perform well.  Does anyone know if
MFS has plans to address this problem?
        
                        -- Becca

-----------------------------------------------------------------------
Rebecca L. Nitzan                        Lawrence Berkeley National Lab
Network Engineering Services Group       1 Cyclotron Rd, 50A/3101 MS 50C
ESnet - Energy Sciences Network          Berkeley, CA. 94720
phone: 510-486-6468 fax: 510-486-4300    nitzan () es net
-----------------------------------------------------------------------
 


Here's some background:

AGIS's router is not colocated at the MAE parking garage, but is in fact
colocated at WorldCom in downtown Washington DC.  Our bits get from there to
the MAE via a DS3, and that DS3 is terminated at each end with a device
called a NetEdge, which does the FDDI to DS3 ATM conversion.

These NetEdges seem to have three different possible operating states:
completely working (which doesn't happen often enough); broken (often, right
out of the box); and kind of working (which happens all too often).  This
third operating state results in some very interesting, possibly misleading,
and sometimes damaging behavior.  It looks quite similar to the kind of
behavior you get when you change the MAC layer device but keep the same IP
address at either of the MAEs: ARP caches get inconsistent, and BGP
sessions with other routers flop around, leading to routes getting flap
dampened by those running the appropriate code.
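
For readers unfamiliar with flap dampening, the following is a minimal
sketch of the general idea: each flap adds a penalty to a prefix, the
penalty decays exponentially, and the route is suppressed while the penalty
stays high.  The class name and all parameter values (half-life, penalty per
flap, suppress and reuse thresholds) are illustrative assumptions, not the
settings of any router mentioned in this thread.

```python
class DampenedRoute:
    """Penalty bookkeeping for one prefix (hypothetical, not vendor code)."""

    def __init__(self, half_life=900.0, flap_penalty=1000.0,
                 suppress_at=2000.0, reuse_at=750.0):
        # All defaults are assumed values chosen for illustration.
        self.half_life = half_life        # seconds for the penalty to halve
        self.flap_penalty = flap_penalty  # added on every flap
        self.suppress_at = suppress_at    # suppress at or above this penalty
        self.reuse_at = reuse_at          # re-advertise below this penalty
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay of the accumulated penalty since the last update.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now
        if self.suppressed and self.penalty < self.reuse_at:
            self.suppressed = False

    def flap(self, now):
        # Record one flap (withdraw/re-announce) at time `now`, in seconds.
        self._decay(now)
        self.penalty += self.flap_penalty
        if self.penalty >= self.suppress_at:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed
```

With these assumed values, three flaps one minute apart are enough to
suppress a prefix, and it stays suppressed for roughly half an hour after
the flapping stops, which is why a "kind of working" link can look like a
down peer from the outside.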


Here's what happened:

AGIS's connection to MAE-East experienced one of these kind-of-working
problems which resulted in the erratic behavior above.  Digex customers
wishing to reach AGIS customers called the Digex NOC, and the posting which
started this all was made to the Digex internal news group.  Similarly, AGIS
customers had problems, and we worked with MFS to get the problem resolved
(they must have a warehouse full of swapped-out NetEdges at this point).

In the interval, a short-on-facts bozo spit into the wind and got us and
Digex wet.  I'm in private correspondence with Ed Kern to postmortem the
situation.

 Peter


At 10:25 AM 7/5/96 -0400, Ed Kern wrote:

One key point is that we have not received any complaints or reports
of any sort concerning any perceived issues at mae-east from any
mae-east peers.  Digex made no attempt to contact us.  We were already
working with Advantis on the unreachable issue above, but the first we
heard of the "AGIS attacks mae-east" report was when a Digex customer
sent us a report similar to that forwarded to all of you by Cook.

Went into this in the last message...Digex will try to be more
proactive about pointing out AGIS flapping prefixes in the future.


An appropriate audience would have been the AGIS noc and the Digex
noc.  I think the Cook approach was inappropriate because the issue
was purely between Digex and AGIS until Cook distributed it to the
three widespread mailing lists.

I agree..


  How is the report flawed?

I see that Ed Kern has already replied indicating that the report was
indeed flawed.  I don't think that there is anything to be gained by
going into further detail.

What I was referring to was the internal circulation here...which I was
under the impression got to external customers....now I'm not so
sure...

The internal report was flawed because it relied too much on source
routes and came to some bad conclusions on the internal state of AGIS.


My key point is that nothing of interest happened.  This was a
non-issue until the misinformation was blasted around the Internet
technical universe.


I would argue that the external message that got sent around was
misinformation...It was correct information from what the people
could see at the time it was released...(lots of dampened prefixes and
a down peer)..


Ed




_____________________________________________________________________
Peter Kline  Senior Network Engineer|                    313-730-5151
AGIS - Internet Backbone Services   |                _Lucem Diffundo_
Post-Traumatic Success Disorder+    |
/////////////////////////////////////////////////////////////////////
You can pretend to care, but you can't pretend to be there.
