nanog mailing list archives

Re: Converged Networks Threat (Was: Level3 Outage)


From: Jared Mauch <jared () puck nether net>
Date: Wed, 25 Feb 2004 13:56:51 -0500


On Wed, Feb 25, 2004 at 10:34:55AM -0800, David Meyer wrote:
      Jared,

 Is your concern that carrying FR/ATM/TDM over a packet
 core (IP or MPLS or ..) will, via some mechanism, reduce
 the resilience of those services, of the packet core,
 of both, or something else?

   I'm saying that in the past, if a network had a FR/ATM/TDM failure,
it would be limited to just the FR/ATM/TDM network (well, aside from
any IP circuits riding that FR/ATM/TDM network).  We're now seeing a
change: instead of the TDM-based network being the underlying network,
the "IP/MPLS Core" is becoming that underlying network.

   What it means is that a failure of the IP portion of the network
that disrupts the underlying MPLS/GMPLS/whatnot core now transporting
these FR/ATM/TDM services does pose a risk.  Is the risk greater than
in the past, when relying on the TDM/WDM network?  I think that
there could be some more spectacular network failures to come.  Overall
I think people will learn from these to make the resulting networks
more reliable (e.g., a lot has been learned as a result of the
Northeast power outage last year).

      I think folks can almost certainly agree that when you
      share fate, well, you share fate. But maybe there is
      something else here. Many of these services have always
      shared fate at the transport level; that is, in most
      cases, I didn't have a separate fiber plant/DWDM
      infrastructure for FR/ATM/TDM, IP, Service X, etc,  so
      fate was already being/has always been shared in the
      transport infrastructure. 

      So maybe try this question: 

        Is it that sharing fate in the switching fabric (as
        opposed to say, in the transport fabric, or even
        conduit) reduces the resiliency of a given service (in
        this case FR/ATM/TDM), and as such poses the "danger"
        you describe?    

        I think the threat is that the switching fabric and
forwarding plane can be disrupted by more things than exist in a
pure TDM-based network.  This isn't to say that the packet (or even
label) network isn't the "future" of these services; it's just
that today there are some interesting problems that still exist as
the technology continues to mature.

      Is this an accurate characterization of your point? If
      so, why should sharing fate in the switching fabric
      necessarily reduce the resiliency of those services
      that share that fabric (i.e., why should this be so)? I
      have some ideas, but I'm interested in what ideas other
      folks have.   

        I believe that there still exist a number of cases where the
switching fabric can get out-of-sync with the control-plane.

        If events are not properly triggered back upstream (i.e., adjacencies
stay up, BGP remains fairly stable) and you end up dumping a lot of
traffic on the floor, it's sometimes a bit more difficult to diagnose
than loss of light on a physical path.
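
        To make the failure mode above concrete, here is a minimal
sketch in Python.  It assumes you can sample two things per link: whether
the control plane reports the adjacency as up, and how many test packets
actually made it across.  The names LinkObservation, loss_ratio and
classify, and the 50% threshold, are hypothetical and chosen only for
illustration; they do not come from any particular vendor or tool.

```python
from dataclasses import dataclass


@dataclass
class LinkObservation:
    """One sample of a link from two vantage points (hypothetical model)."""
    name: str
    adjacency_up: bool     # control plane: IGP/BGP session reported as up
    probes_sent: int       # data plane: test packets sent across the link
    probes_received: int   # data plane: test packets that arrived


def loss_ratio(obs: LinkObservation) -> float:
    """Fraction of probes lost; 0.0 when nothing was sent (no evidence of loss)."""
    if obs.probes_sent <= 0:
        return 0.0
    lost = obs.probes_sent - obs.probes_received
    return max(0.0, lost / obs.probes_sent)


def classify(obs: LinkObservation, loss_threshold: float = 0.5) -> str:
    """Label a link by comparing control-plane state with data-plane loss.

    The threshold is an assumed value for illustration, not a recommendation.
    """
    if not obs.adjacency_up:
        # The control plane noticed the failure, so routing can react to it.
        return "down"
    if loss_ratio(obs) >= loss_threshold:
        # Adjacency is up but traffic is being dropped: the hard-to-diagnose case.
        return "silent-drop"
    return "healthy"
```

        The "silent-drop" case is the one that is harder to find than
loss of light: nothing upstream has been told to route around it.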

        On the sunny side, I see this improving over time.  Software
bugs will be squashed.  Poorly designed networks will be reconfigured to
better handle these situations.

        - jared

-- 
Jared Mauch  | pgp key available via finger from jared () puck nether net
clue++;      | http://puck.nether.net/~jared/  My statements are only mine.

