nanog mailing list archives

Re: Some doubts on large scale BGP/AS design and black hole routing risk


From: Christopher Morrow <morrowc.lists () gmail com>
Date: Mon, 4 Apr 2016 08:58:20 -0400

On Sun, Apr 3, 2016 at 6:17 PM, Mark Tinka <mark.tinka () seacom mu> wrote:



On 31/Mar/16 10:12, magicboiz () hotmail com wrote:



My questions are:

1. What could happen in the case of total failure in the redundant
leased lines? Black hole routing between POPs?

If you have redundant backhaul that completely fails, you've got real
problems.

However, if that does happen, any traffic coming into each individual
PoP destined for users in the other PoP will fail. Only traffic
terminating for customers at that PoP will succeed.


So (as Bill points out), plan to localize subnets to each PoP: do not
number customers in PoP 1 out of the same /24 as customers in PoP 2.
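
A quick way to audit the "no /24 shared between PoPs" rule is to group
customer addresses by their covering /24 and flag any /24 that shows up in
more than one PoP. This is only a minimal sketch: the PoP names, the
addresses (documentation ranges) and the helper shared_slash24s() are all
made up for illustration, and real data would come from your IPAM or
provisioning system.

```python
import ipaddress
from collections import defaultdict

def shared_slash24s(customers):
    """customers: iterable of (pop_name, ipv4_string).

    Returns {covering /24: set of PoP names} for every /24 that
    contains customers from more than one PoP.
    """
    pops_by_net = defaultdict(set)
    for pop, addr in customers:
        # strict=False masks the host bits, giving the covering /24.
        net = ipaddress.ip_network(f"{addr}/24", strict=False)
        pops_by_net[net].add(pop)
    return {net: pops for net, pops in pops_by_net.items() if len(pops) > 1}

# Hypothetical assignments, using documentation address space.
customers = [
    ("pop1", "192.0.2.10"),
    ("pop2", "192.0.2.200"),   # same /24 as the pop1 customer above
    ("pop2", "198.51.100.5"),
]

conflicts = shared_slash24s(customers)
for net, pops in conflicts.items():
    print(f"{net} is split across {sorted(pops)}")
```

Any /24 reported here cannot be announced as a PoP-specific route without
stranding some of its customers if the backhaul goes away.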





2. What are the best design methods to avoid this scenario?

Work on your backhaul.

Originate specific routes that cover customers present in each PoP, with
the aggregate as a backup route.
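
To see why specifics plus an aggregate behave this way, here is a rough
longest-prefix-match sketch in Python. It assumes a two-PoP network where
each PoP announces its own more-specific and both announce the aggregate;
the prefixes (from the 198.18.0.0/15 benchmarking range), the PoP names and
the helpers announced() and best_route() are placeholders, not anyone's
real config. It models only prefix length, not the rest of BGP path
selection.

```python
import ipaddress

AGG = ipaddress.ip_network("198.18.0.0/22")    # hypothetical aggregate
POP1 = ipaddress.ip_network("198.18.0.0/23")   # hypothetical pop1 customers
POP2 = ipaddress.ip_network("198.18.2.0/23")   # hypothetical pop2 customers

def announced(pops_up):
    """Routes the Internet sees, given the PoPs whose upstream sessions are up."""
    routes = []
    for pop, specific in (("pop1", POP1), ("pop2", POP2)):
        if pop in pops_up:
            routes.append((specific, pop))  # PoP-specific route
            routes.append((AGG, pop))       # aggregate as backup
    return routes

def best_route(routes, dest):
    """Longest-prefix match: most specific route covering dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, pop) for net, pop in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda r: r[0].prefixlen)

# Both PoPs up: traffic for a pop2 customer enters at pop2 via the /23.
print(best_route(announced({"pop1", "pop2"}), "198.18.2.9"))
# pop2's upstreams down: the /23 is withdrawn, and the aggregate pulls the
# traffic into pop1, which can only deliver it if the backhaul is up.
print(best_route(announced({"pop1"}), "198.18.2.9"))
```

The second case is the one to keep in mind: the aggregate only helps when
there is still some path between the PoPs.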

You can run a tunnel across the Internet to simulate a backbone between
both PoPs, using the IP addresses on your side of each upstream link as
the tunnel endpoints. Not elegant, but it keeps you up.


Be aware of GRE / IP-in-IP forwarding limitations.
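
One limitation that usually comes up with such tunnels is encapsulation
overhead eating into the MTU (whether that is the limitation meant above is
my assumption; platform forwarding performance is another candidate). A
back-of-the-envelope sketch, assuming an IPv4 outer header with no options,
a basic 4-byte GRE header (no key, sequence or checksum fields) and a
1500-byte path MTU between the tunnel endpoints:

```python
OUTER_IPV4 = 20   # outer IPv4 header, no options
GRE_BASE = 4      # basic GRE header, no optional fields

ENCAP_OVERHEAD = {
    "gre": OUTER_IPV4 + GRE_BASE,   # 24 bytes
    "ipip": OUTER_IPV4,             # 20 bytes
}

def tunnel_mtu(path_mtu, encap):
    """Largest inner packet that fits without fragmenting the outer packet."""
    return path_mtu - ENCAP_OVERHEAD[encap]

def tcp_mss(mtu):
    """IPv4 TCP MSS for a given MTU (20-byte IP + 20-byte TCP headers)."""
    return mtu - 40

for encap in ("gre", "ipip"):
    mtu = tunnel_mtu(1500, encap)
    print(f"{encap}: tunnel MTU {mtu}, TCP MSS {tcp_mss(mtu)}")
```

Full-size 1500-byte customer packets will not fit through either tunnel
unfragmented, so plan for MSS clamping or working path MTU discovery if you
go this route.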




   2.1: adding a third POP creating a triangle? What if a POP loses
connection with the other two POPs at the same time? Another black hole?

Your fixation on a complete backhaul outage is interesting.

Purchase backhaul from different service providers to increase your
chances of uptime.



Different providers, different entrance facilities in the building(s),
different conduits out of the area... and hope that somewhere along the
path providerA and B didn't share conduit or capacity-swap you to a single
path :)




   2.2: requesting another prefix and allocating 1:1 prefix:POP, so that in
this scenario each POP would announce only its own prefix to the upstreams?

See above re: originating more specific routes based on the customers
you have at each PoP.



   2.3: other?

Work harder on your backhaul.

Yes, bad things can happen, and they do happen. But more than likely, if
the PoPs in a 3-PoP network lose all connectivity to each other, I think
routing will be a much smaller problem to solve in the grand scheme of things.

Mark.



