nanog mailing list archives

Re: largest OSPF core


From: Christian Martin <christian.martin () teliris com>
Date: Thu, 2 Sep 2010 21:40:39 -0400



On Sep 2, 2010, at 7:35 PM, Randy Bush <randy () psg com> wrote:

The stability of the topology plays a most prominent role, but it
wouldn't surprise me if a OSPF network largely comprised of router
LSAs (no redistribution), using today's hardware, could easily scale
to 1000 nodes in an area.

i believe the original poster asked about actual operating deployment,
not theory.

and, i suspect one wants to know about full mesh under real load, i.e.
topology change, which can be exciting when one gets to a network of
significant size.

Randy,

Fair enough.  Seven years ago, I was privy to an OSPF backbone of 300 or so
routers supporting a BGP overlay.  No NBMA, passive broadcast subnetworks,
all running on systems without the capacity to offload adjacency maintenance
into linecards.  I'd argue that this type of network is also uninteresting
from a NANOG viewership POV.

I also operated a network that supported over 70 OSPF VRF instances on a
single PE.  CPU loads were higher, but we didn't observe intractable
workloads.  And this was with a 500-route limit per VRF, with who knows what
kinds of messiness running in those VRFs.  (And yeah, there were sham links
and router LSAs flying around!!) :)

There are many variables, and several studies have tried to derive a
formulaic approach, in terms of algorithms and computational complexity, to
determining the boundaries of OSPF network scalability.  Admittedly, these
approaches can be very approximate in nature.  But the point stands.

Stable topologies without large, frequently updated data sets can scale
extremely well.  Unstable topologies with lots of leaf data (20,000 type 5
LSAs, for example) don't.

The most interesting point to make, however, is how much legacy thinking in
this area continues to be stranded in a rut that emerged 15 years ago.  It is
not uncommon to hear network folks cringe at the thought of an OSPF area
exceeding 100 routers.  Really?  When simulations using testing tools show
that properly tuned OSPF implementations (with ISPF, PRC, etc.) in an area of
1000 routers can run full SPFs in 500 ms?
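
For anyone who wants a feel for what a full SPF over 1000 nodes involves, here is a rough sketch of a plain Dijkstra run in Python.  Everything in it is an assumption for illustration: the ring-plus-chords topology, the function names spf() and ring_with_chords(), and the uniform link cost of 10.  It is not how any router implements SPF, it does no ISPF or PRC, and its timing on a laptop says nothing about the 500 ms figure above.

```python
import heapq
import time


def spf(adjacency, root):
    """Full Dijkstra SPF over a dict {node: [(neighbor, cost), ...]}.

    Returns {node: total cost from root} for every reachable node.
    """
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, cost in adjacency.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def ring_with_chords(n, chord=7, cost=10):
    """Hypothetical topology: n routers in a ring, plus one chord link each."""
    adjacency = {i: [] for i in range(n)}
    for i in range(n):
        for j in ((i + 1) % n, (i + chord) % n):
            # Links are bidirectional with the same cost in both directions.
            adjacency[i].append((j, cost))
            adjacency[j].append((i, cost))
    return adjacency


if __name__ == "__main__":
    topology = ring_with_chords(1000)
    start = time.perf_counter()
    distances = spf(topology, root=0)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{len(distances)} routers reachable, full SPF took {elapsed_ms:.2f} ms")
```

The computation itself grows roughly as (nodes + links) * log(nodes), so the node count alone is rarely the limiting factor; how often the run is triggered, and how much leaf data has to be processed afterwards, matters more.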

That said, my experience, as stated above, is that 300 routers is completely workable.

Cheers
Chris


randy

