nanog mailing list archives

Re: Soliciting your opinions on Internet routing: A survey on BGP convergence


From: Laurent Vanbever <lvanbever () ethz ch>
Date: Tue, 10 Jan 2017 08:13:51 +0100

Hi Joel,

On 10 Jan 2017, at 06:51, joel jaeggli <joelja () bogus com> wrote:

On 1/9/17 2:56 PM, Laurent Vanbever wrote:
Hi NANOG,

We often read that the Internet (i.e. BGP) is "slow to converge". But how slow
is it really? Do you care anyway? And can we (researchers) do anything about it?
Please help us find out by answering our short anonymous survey
(<10 minutes).

Survey URL: https://goo.gl/forms/JZd2CK0EFpCk0c272 <https://goo.gl/forms/WW7KX5kT45m6UUM82>


** Background:

While existing fast-reroute mechanisms enable sub-second convergence upon 
local outages (planned or not), they do not apply to remote outages happening 
further away from your AS as their detection and protection mechanisms only 
work locally.

Remote outages therefore mandate a "BGP-only" convergence which tends to be
slow, as long streams of BGP UPDATEs (up to hundreds of thousands of them) must
be propagated router-by-router. Our initial measurements indicate that it can
take state-of-the-art BGP routers dozens of seconds to process and propagate
these large streams of BGP UPDATEs. During this time, traffic for important
destinations can be lost.

One of the phenomena that is relatively easy to observe by withdrawing a
prefix entirely is the convergence towards longer and longer AS paths
until the route disappears entirely. That is, providers that are further
away will keep advertising the route, and in the interim their
neighbors will ingest the available path until they too process
the withdrawal. It can take a comically long time (like 5 minutes) to see
the prefix ultimately disappear from the Internet. When withdrawing a
prefix from a peer with which you have a single adjacency, this can
easily happen in miniature.
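
The path exploration described above can be sketched with a toy path-vector
simulation. This is only an illustration under loud assumptions: the four-AS
topology, the function names (`simulate_withdrawal`, `_drain`), FIFO message
delivery, a shortest-AS-path-only policy, and the absence of MRAI timers are
all hypothetical simplifications, not a model of any real router or of the
measurements discussed in this thread.

```python
from collections import deque


def _drain(neighbors, origin, rib_in, best, queue, history):
    """Deliver queued messages in FIFO order until the network is quiet."""
    while queue:
        sender, receiver, path = queue.popleft()
        if receiver == origin:
            continue  # the origin needs no route to its own prefix
        rib_in[receiver][sender] = path  # None models a withdrawal
        # Loop prevention: ignore paths that already contain the receiver.
        usable = [p for p in rib_in[receiver].values()
                  if p is not None and receiver not in p]
        # Assumed policy: shortest AS path, ties broken by tuple order.
        new_best = min(usable, key=lambda p: (len(p), p), default=None)
        if new_best == best[receiver]:
            continue
        best[receiver] = new_best
        history[receiver].append(new_best)
        # Re-advertise the new best path (or a withdrawal) to every neighbor.
        advert = None if new_best is None else (receiver,) + new_best
        for n in neighbors[receiver]:
            queue.append((receiver, n, advert))


def simulate_withdrawal(neighbors, origin):
    """Return (final best paths, per-AS best-path changes after the withdrawal)."""
    rib_in = {a: {n: None for n in neighbors[a]} for a in neighbors}
    best = {a: None for a in neighbors}

    # Phase 1: the origin announces the prefix and the network converges.
    queue = deque((origin, n, (origin,)) for n in neighbors[origin])
    _drain(neighbors, origin, rib_in, best, queue, {a: [] for a in neighbors})

    # Phase 2: the origin withdraws the prefix; record every best-path change.
    history = {a: [] for a in neighbors if a != origin}
    queue = deque((origin, n, None) for n in neighbors[origin])
    _drain(neighbors, origin, rib_in, best, queue, history)
    return best, history


if __name__ == "__main__":
    # Hypothetical topology: AS 0 originates the prefix behind AS 1;
    # ASes 1, 2 and 3 are fully meshed.
    topology = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
    final_best, changes = simulate_withdrawal(topology, origin=0)
    for asn, paths in sorted(changes.items()):
        print(asn, paths)
```

In this toy run, ASes 2 and 3 each fall back to a stale, longer path learned
from the other before they finally lose the route, which is the "longer and
longer AS paths" effect in miniature. Real networks add timers, policies and
far more alternate paths, so the exploration can last much longer.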

Thanks! Yes, definitely. This relates to the issue Baldur was raising in which a less-preferred prefix (or no prefix 
at all, in your case) has to take over from a more preferred one. That case is definitely bad for BGP convergence. 

Our survey/study is more geared towards cases where there is diversity available (alternate paths are there and at 
least partially visible). We are especially interested in finding out whether, even when you take all the precautionary 
measures required by the book, long BGP convergence can still bite you and… whether we can do anything about it.


Laurent

PS: 

Thanks so much to the 21 operators who have answered already! If you haven’t done so already, please help us find out 
about troublesome BGP convergence by answering our short anonymous survey (<10 minutes): 
https://goo.gl/forms/JZd2CK0EFpCk0c272


