nanog mailing list archives

Re: scaling linux-based router hardware recommendations


From: Phil Bedard <bedard.phil () gmail com>
Date: Tue, 27 Jan 2015 13:16:11 -0500

There are some interesting ideas.  There are tricks to getting 128K LPM 
routes into a Trident2 device like you mentioned.  You can get the same 
type of devices from Cisco/Juniper for not a whole lot more; what you are 
really paying for is the mature control plane.

https://github.com/dbarrosop/sir is a project from David Barroso at 
Spotify.  The BGP daemon on your Internet-connected device may store all 
the routes in its RIB but usually doesn't need everything in the FIB.  A 
downstream BGP controller running on an x86 server would analyze the 
routes more closely and use something like NetFlow/sFlow (or really any 
criteria) to identify the routes you really care about, install those 
into the FIB on the device, and just use the default for everything else.
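
As a rough illustration of that selection step (not how SIR itself is 
implemented), the sketch below ranks RIB prefixes by sampled traffic and 
keeps only the busiest ones for the FIB.  The function name, the 
(destination, bytes) flow format and the linear longest-prefix match are 
all assumptions made for the example; a real controller would use a radix 
trie and real NetFlow/sFlow exports.

```python
import ipaddress
from collections import Counter


def select_fib_routes(rib_prefixes, flow_samples, max_routes):
    """Return up to max_routes RIB prefixes carrying the most sampled bytes.

    rib_prefixes: iterable of prefix strings, e.g. "192.0.2.0/24"
    flow_samples: iterable of (destination_ip, byte_count) tuples
                  (hypothetical format, stands in for NetFlow/sFlow data)
    """
    networks = [ipaddress.ip_network(p) for p in rib_prefixes]
    # Most-specific prefixes first, so the first hit is the longest match.
    networks.sort(key=lambda n: n.prefixlen, reverse=True)

    bytes_per_prefix = Counter()
    for dst_ip, nbytes in flow_samples:
        addr = ipaddress.ip_address(dst_ip)
        for net in networks:
            if addr.version == net.version and addr in net:
                bytes_per_prefix[str(net)] += nbytes
                break
        # Traffic matching no prefix is ignored here; on the device it
        # would simply follow the default route.

    return [prefix for prefix, _ in bytes_per_prefix.most_common(max_routes)]
```

Everything not returned by select_fib_routes() stays in the RIB only and 
is forwarded via the default.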

Metaswitch, a commercial control-plane company, had a similar idea using 
OpenFlow where the actual Internet-connected device just proxied BGP 
connections to a controller which then went back and programmed the 
upstream OpenFlow switches.  They call them "lean transit switches."

The major network vendors all have the ability to run their control planes 
in software on x86 these days just fine.  On many newer platforms they 
run the software as a VM anyway.  Pricing for bringing your own server is 
going to be cheaper, but not free.

As for open source type stuff, Contrail from Juniper was made open source 
and has a BGP implementation, MPLS (MPLSoUDP) implementation, and a Linux 
kernel module to do fairly high speed packet forwarding, what they call 
the vRouter in Contrail.  

Wind River is another vendor who has incorporated the Intel DPDK stuff 
into a Linux distribution, but it is commercial as well.  


Phil 



On 1/27/15, 11:56, "Baldur Norddahl" <baldur.norddahl () gmail com> wrote:

I propose the hybrid solution:

A device such as the ZTE 5960e with 24x 10G and 2x 40G will set you back
about USD 6000.

This thing can do MPLS and L3 equal-cost multipath (ECMP) routing. With
that you can load balance across as many software routers as you need.
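
To make the load-balancing idea concrete, here is a minimal sketch of
per-flow ECMP: hash the 5-tuple and use the result to pick one of the
software routers, so every packet of a flow takes the same path. The
function name and the use of SHA-256 are assumptions for illustration
only; switch hardware uses its own (typically much simpler) hash.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple onto one of several equal-cost next hops.

    flow: (src_ip, dst_ip, protocol, src_port, dst_port)
    next_hops: non-empty list of next-hop addresses (the software routers)
    """
    src_ip, dst_ip, proto, src_port, dst_port = flow
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    # Hashing the whole 5-tuple keeps a flow on one path (no reordering)
    # while spreading different flows across all next hops.
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]
```

Adding a software router is then just a matter of adding one more entry
to next_hops, although with plain modulo hashing that remaps many
existing flows.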

It also speaks BGP and can accept about 10k routes. So maybe you could
consider if the full table is really worth it.

It would be possible to have your software router speak BGP with the
neighbors and use next hop to direct the traffic directly to the switch.
Or use proxy ARP if the peer does not want to allow you to specify a
different next hop than the BGP speaker. This way your software router is
only moving outgoing packets. Inbound packets will never go through the
computer, but will instead be delivered directly to the correct
destination by hardware switching.

If you are an ISP, you will often have more inbound traffic, so this is
very useful. Also, the weak point of the software router is denial of
service attacks with small packets. The attacks are likely to come from
outside your network, so your software router will not need to route them.

We need someone to code a BGP daemon that will export the 5k most used
routes to the switch. This way you can have the switch deliver the
majority of the traffic directly to your peers.
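
Whether 5k routes really cover the majority of the traffic depends on
the network. A small sketch for checking that against your own
per-prefix byte counts follows; the function name and the input format
(a dict of prefix to bytes) are assumptions for the example.

```python
def routes_needed(bytes_per_prefix, target_share):
    """Return how many of the busiest prefixes cover target_share of traffic.

    bytes_per_prefix: dict mapping prefix -> observed byte count
    target_share: fraction of total traffic to cover, e.g. 0.8 for 80%
    """
    total = sum(bytes_per_prefix.values())
    if total == 0:
        return 0

    covered = 0
    busiest_first = sorted(bytes_per_prefix.values(), reverse=True)
    for count, nbytes in enumerate(busiest_first, start=1):
        covered += nbytes
        if covered >= target_share * total:
            return count
    return len(busiest_first)
```

If the answer fits within the route capacity of the switch, the hybrid
setup keeps most packets in hardware.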

If you are a service provider, much of your traffic is outbound. Put your
servers or multiple routers/firewalls on the same vlan as your transit.
Then add static host routes for next hop on all servers. This way you can
have as many servers as you need to deliver traffic directly. You can run
iBGP on all the servers, so every server knows how to route outbound by
itself. MPLS would also be useful for this instead of vlan, but there is
no good MPLS implementation for Linux.

Regards,

Baldur

