nanog mailing list archives

Re: [c-nsp] LDPv6 Census Check


From: Christian Meutes <christian () errxtx net>
Date: Sun, 14 Jun 2020 14:43:14 +0200

On Fri, Jun 12, 2020 at 10:22 PM David Sinn <dsinn () dsinn com> wrote:

Except that is actually the problem if you look at it in hardware. And to
be very specific, I'm talking about commodity hardware, not flexible
pipelines like you find in the MX and a number of the ASR's. I'm also
talking about the more recent approach of using Clos in PoP's instead of
"big iron" or chassis based systems.


TE gives you the most powerful traffic engineering tool kit available.
Naturally it has a bit more weight than just a single screwdriver. It lets
you build nearly any kind of multipath transport while that Clos thing is
just one architecture hunting for the cheapest implementation of
IP/LDP-style ECMP.

On those boxes, it's actually better to not do shared labels, as this
pushes the ECMP decision to the ingress node. That does mean you have to
enumerate every possible path (or some approximate) through the network,
however the action on the commodity gear is greatly reduced. It's a pure
label swap, so you don't run into any egress next-hop problems. You
definitely do on the ingress nodes. Very, very badly actually.
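
To put a rough number on "enumerate every possible path", here is a minimal sketch. It assumes a hypothetical fabric with a full mesh between adjacent stages; the stage sizes, node names and the count of egress leaves are made up for illustration and are not taken from any real deployment.

```python
from itertools import product

def enumerate_paths(stages):
    """Return every end-to-end path as a tuple holding one node per stage.

    Assumes every node in a stage connects to every node in the next stage.
    """
    return list(product(*stages))

# Hypothetical transit stages between an ingress leaf and an egress leaf.
stages = [
    [f"agg-up{i}" for i in range(4)],    # aggregation layer, ingress side
    [f"spine{i}" for i in range(8)],     # spine layer
    [f"agg-down{i}" for i in range(4)],  # aggregation layer, egress side
]

paths = enumerate_paths(stages)

# With unshared labels the ingress node holds one entry per path per egress
# leaf, while a transit node only needs a plain swap entry per path through it.
egress_leaves = 32  # hypothetical
ingress_entries = len(paths) * egress_leaves

print(len(paths), ingress_entries)
```

Under these assumptions the transit boxes stay simple, and the ingress state grows with the product of the per-stage fan-outs times the number of destinations.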


Actually shared links are not a swap but just a pop similar to SR. But
indeed this would shift your ECMP issue just to the headend. So for your
ECMP scaling there would still be an option left to use an implementation
which offers you a merge-point with a single label to all upstreams for a
certain equal-cost multipath downstream. This does exist, so it would
certainly fix your ECMP scaling problem. But advanced control-plane code is
certainly not cheap, so in the end, as was already said before, if a
simple and cheap platform can solve all your needs then it might be the
better one. Let's see what problems we need to solve in five years again.

What I'm getting at is that IP allows re-write sharing, in that what needs
to change on two IP frames taking the same path but ultimately reaching
different destinations is re-written (e.g. DMAC, egress-port) identically.
And, at least with IPIP, you are able to look at the inner-frame for ECMP
calculations. Depending on your MPLS design, that may not be the case. If
you have too deep of a label stack (3-5 depending on ASIC), you can't look
at the payload and you end up with polarization.
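
The label-depth effect can be illustrated with a small sketch. It assumes a hypothetical parser that reads at most max_parse_depth labels and only reaches the inner flow fields when the whole stack fits within that limit. The hash function, label values and flow tuples are made up for illustration and do not model any particular ASIC.

```python
import hashlib

def ecmp_pick(label_stack, inner_flow, n_paths, max_parse_depth):
    """Pick an ECMP member index the way a hypothetical parser-limited chip might."""
    # The parser only ever sees the top max_parse_depth labels.
    key = list(label_stack[:max_parse_depth])
    # The inner flow fields are reachable only if the whole stack was parsed.
    if len(label_stack) <= max_parse_depth:
        key.append(inner_flow)
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

# 64 hypothetical flows: (src, dst, protocol, sport, dport)
flows = [("10.0.0.1", f"10.0.1.{i}", 6, 1024 + i, 443) for i in range(64)]

# Two labels fit within a parse depth of three, so the payload adds entropy.
shallow = {ecmp_pick([100, 200], f, 4, 3) for f in flows}

# Four labels exceed the parse depth, so every flow hashes on the same labels.
deep = {ecmp_pick([100, 200, 300, 400], f, 4, 3) for f in flows}

print(len(shallow), len(deep))
```

In this sketch the deep stack always lands on a single ECMP member, because the hash key is identical for every flow. That is the polarization described above.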


Not really, as you are still forced to rewrite on imposition for the
simplest form of tunneling, and for TE as often as you need to go against
your SPT as well. It's just happening on IP (and IP rewrites are more
expensive than MPLS rewrites / forwarding operations).
