nanog mailing list archives

Re: External BGP Controller for L3 Switch BGP routing


From: joel jaeggli <joelja () bogus com>
Date: Mon, 16 Jan 2017 21:22:16 -0800

On 1/15/17 11:00 PM, Yucong Sun wrote:
In my setup, I use a BIRD instance to combine multiple Internet full
tables, and I use some filters to generate override routes to send to my L3
switch to do routing.  The L3 switch is configured with a default route
to the main transit provider; if BIRD is down, routing would be
unoptimized, but everything else remains operable until I fix that BIRD
instance.
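
As a rough illustration of the override idea described above (not the
poster's actual BIRD filter, which is not shown in the thread), here is a
minimal Python sketch. It merges several full tables, picks a best route per
prefix, and keeps only the routes whose next hop differs from the default
transit, capped at what the switch FIB can hold. The `Route` type, the
`select_overrides` function, and the use of AS-path length as the only
tie-breaker are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str       # e.g. "203.0.113.0/24"
    next_hop: str     # label for the upstream that announced the route
    as_path_len: int  # simplified stand-in for the full BGP decision process


def select_overrides(tables, default_next_hop, max_routes):
    """Return at most max_routes routes that should override the default route.

    tables is a list of route lists, one per upstream. On equal AS-path
    length the route from the earlier table wins, so list the default
    transit first to avoid pointless overrides.
    """
    best = {}
    for table in tables:
        for route in table:
            current = best.get(route.prefix)
            if current is None or route.as_path_len < current.as_path_len:
                best[route.prefix] = route

    # Prefixes whose best path already follows the default need no entry.
    overrides = [r for r in best.values() if r.next_hop != default_next_hop]

    # Arbitrary ranking for the sketch: shortest AS path first.
    overrides.sort(key=lambda r: (r.as_path_len, r.prefix))
    return overrides[:max_routes]
```

If the controller goes away and the overrides are withdrawn, traffic simply
follows the default route, which matches the failure behaviour described
above.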

I've asked around about why there isn't an L3 switch capable of handling
full tables; I really don't understand the difference/logic behind it.

In practice there are several merchant silicon implementations that
support the addition of external TCAMs. Building them accordingly
increases the COGS and imposes various performance and packaging limitations.

Arista 7280R and Cisco NCS 5500 are Broadcom Jericho based devices that
are packaged accordingly.

Ethernet merchant silicon is heavily biased towards doing most if not
all the IO on the same ASIC, with limitations driven by gate size, die
size, heat dissipation, pin count and so on.

There was a recent Packet Pushers episode with Pradeep Sindhu that
touched on some of these issues:

http://packetpushers.net/podcast/podcasts/show-315-future-networking-pradeep-sindhu/


On Sun, Jan 15, 2017 at 10:43 PM Tore Anderson <tore () fud no> wrote:

Hi Saku,


https://www.redpill-linpro.com/sysadvent/2016/12/09/slimming-routing-table.html

---
As described in a previous post, we’re testing an HPE Altoline 6920 in
our lab. The Altoline 6920 is, like other switches based on the
Broadcom Trident II chipset, able to handle up to 720 Gbps of
throughput, packing 48x10GbE + 6x40GbE ports in a compact 1RU chassis.
Its price is in all likelihood a single-digit percentage of the price
of a traditional Internet router with a comparable throughput rating.
---

This makes it sound like a small-FIB router is a single-digit percentage
of the cost of a full-FIB one.

Do you know of any traditional «Internet scale» router that can do ~720
Gbps of throughput for less than 10x the price of a Trident II box? Or
even <100kUSD? (Disregarding any volume discounts.)

Also having Trident on an Internet-facing interface may be suspect,
especially if you need to go from a fast interface to a slow or busy
interface, due to its very small packet buffers. This obviously won't be
much of a problem for inside-DC traffic.

Quite the opposite, changing between different interface speeds happens
very commonly inside the data centre (and most of the time it's done by
shallow-buffered switches using Trident II or similar chips).

One ubiquitous configuration has the servers and any external uplinks
attached with 10GE to leaf switches which in turn connect to a 40GE
spine layer. In this config server<->server and server<->Internet
packets will need to change speed twice:

[server]-10GE-(leafX)-40GE-(spine)-40GE-(leafY)-10GE-[server/internet]
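
To put a rough number on the shallow-buffer concern at the 40GE to 10GE
step, here is a back-of-the-envelope Python sketch. It estimates how long a
buffer can absorb a line-rate burst arriving on the faster link while
draining onto the slower one. The function name and the 12 MB buffer figure
in the usage note are illustrative assumptions, not numbers from this
thread, and real chips share the buffer across ports, so the usable amount
per port is typically smaller.

```python
def burst_absorb_seconds(buffer_bytes, in_gbps, out_gbps):
    """Seconds until the buffer fills when traffic arrives at in_gbps
    and drains at out_gbps (both in Gbit/s)."""
    if in_gbps <= out_gbps:
        # The egress keeps up, so the buffer never fills.
        return float("inf")
    fill_rate_bytes_per_s = (in_gbps - out_gbps) * 1e9 / 8
    return buffer_bytes / fill_rate_bytes_per_s
```

With an assumed 12 MB of buffer, `burst_absorb_seconds(12_000_000, 40, 10)`
comes to about 3.2 milliseconds of sustained 40GE burst before drops start.
The 10GE to 40GE direction never fills the buffer in this simple model.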

I suppose you could for example use a couple of MX240s or something as
a special-purpose leaf layer for external connectivity.
MPC5E-40G10G-IRB or something towards the 40GE spines and any regular
10GE MPC towards the exits. That way you'd only have one
shallow-buffered speed conversion remaining. But I'm very sceptical if
something like this makes sense after taking the cost/benefit ratio
into account.

Tore



