nanog mailing list archives

Re: A survey on BGP MRAI timer values in practice


From: Adam Thompson <athompson () merlin mb ca>
Date: Tue, 8 Jun 2021 22:18:32 +0000

+1 to Saku's concerns - I simply ignored the survey because I wasn't sure what MRAI was, and I wasn't sure what my 
values would be.  But I have time to be interested right now, so a-spelunking I go...

The term "MRAI" does not appear anywhere in Arista's or Extreme's documentation.  Nor does this timer interval appear 
in any BGP-related "show" output, on any of my platforms, that I can see.

I've found out that "out-delay" (not "delay out") is synonymous with MRAI and seems widely used.

I found "out-delay" in an Arista technote, and now I know how to override it on Aristas, and the default is zero (0).  
Unfortunately, "show" commands on EOS will only show you the current out-delay if it is non-zero, which makes reporting 
it a bit difficult.

Extreme's MLXe platform doesn't appear to support an out-delay/MRAI knob at all, at least as far as I can tell.  I know 
there are several other current and former MLXe operators here, maybe one of them will know?

Based on my limited history with NANOG-L, I guess your initial email might have been seen by perhaps 20 people who 
immediately knew what you were talking about, and 2000 who didn't.  (I don't actually know subscriber numbers, that's 
totally a WAG.  And maybe more people have touched out-delay knobs than I think.  Dunno.)

<snide but hopefully not too much> This illustrates the gap between academia and industry - academics can research a 
narrow topic, and come at issues from a theoretical standpoint using unfamiliar terminology.  As a practitioner, I get 
told to add carrier X as a peer by end-of-day Friday, and "just make it work".  I wasn't even able to understand your 
initial question, because I haven't spent a semester understanding the intricacies of BGP propagation - I just know it 
usually works, knobs exist that I shouldn't fiddle with, and that's good enough for my job. </snide>

If your work results in actionable recommendations such as "don't use BGP out-delay timers to mitigate XYZ in 
circumstance LMNO, do ABC instead", that's fantastic.  Please keep us advised, and do post aggregated survey results 
here once you close the survey.

I am specifically interested in the answer to "Have you ever had to adjust BGP out-delay with any of your peers, and 
why?"  It would be great if we could derive that answer from the survey results, but anecdotal replies here would also 
be helpful.  All you larger(-than-me) network operators out there: when would I need to use out-delay?  Why?  What does 
it accomplish?
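
As a rough illustration of what a minimum-advertisement-interval timer does, here is a minimal Python sketch. It is based on the MinRouteAdvertisementIntervalTimer described in RFC 4271 (which suggests 30 seconds for eBGP sessions), not on any vendor's out-delay implementation, which may behave differently. The function name, the event format, and the route labels are all made up for the example; it models one prefix toward one peer.

```python
def advertise_with_mrai(events, mrai):
    """Simulate an MRAI-style timer for ONE prefix toward ONE peer.

    events: list of (time_in_seconds, route_state), sorted by time
            (hypothetical format, for illustration only).
    mrai:   minimum number of seconds between two advertisements.
    Returns the list of (send_time, route_state) actually advertised.
    """
    sent = []
    next_allowed = 0   # earliest time the next advertisement may go out
    pending = None     # newest state held back while the timer runs

    for t, state in events:
        # The timer expired before this event: flush what was held back.
        if pending is not None and next_allowed <= t:
            sent.append((next_allowed, pending))
            next_allowed += mrai
            pending = None

        if t >= next_allowed:
            # Timer is not running: advertise immediately and restart it.
            sent.append((t, state))
            next_allowed = t + mrai
        else:
            # Timer is running: keep only the newest state for later.
            pending = state

    # Anything still held back goes out when the timer finally expires.
    if pending is not None:
        sent.append((next_allowed, pending))
    return sent


# A prefix flaps three times in two seconds, then changes again at t=40.
events = [(0, "A"), (1, "B"), (2, "C"), (40, "D")]
print(advertise_with_mrai(events, 30))  # [(0, 'A'), (30, 'C'), (60, 'D')]
print(advertise_with_mrai(events, 0))   # every change is sent as it happens
```

With a 30-second interval the intermediate state "B" is never sent to the peer, which is the update-suppression benefit, but "C" and "D" each reach the peer well after they occurred, which is the convergence cost. With an interval of zero every change is advertised immediately.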

Good luck in reformulating your survey to get better engagement,
-Adam

Adam Thompson
Consultant, Infrastructure Services
100 - 135 Innovation Drive
Winnipeg, MB, R3T 6A8
(204) 977-6824 or 1-800-430-6404 (MB only)
athompson () merlin mb ca
www.merlin.mb.ca

________________________________
From: NANOG <nanog-bounces+athompson=merlin.mb.ca () nanog org> on behalf of Saku Ytti <saku () ytti fi>
Sent: June 8, 2021 01:06
To: shahrooz () cs umass edu <shahrooz () cs umass edu>
Cc: nanog list <nanog () nanog org>; Arun Venkataramani <arun () cs umass edu>
Subject: Re: A survey on BGP MRAI timer values in practice

On Mon, 7 Jun 2021 at 19:32, <shahrooz () cs umass edu> wrote:

> We often read that the Internet (i.e. BGP) has a long convergence delay.
> But why is it so slow? And can we (researchers) do anything about it?

Create business incentives to improve it. This is a non-technical
problem, we've long had technical tools to make it fast, there just
isn't incentive to make it fast. Customers are not asking operators
for better convergence speeds.

> Please help us out to find out by answering our short anonymous survey
> (<10 minutes).

Can you tell me what you have done so far? What are the default MRAI
values for each AFI/SAFI for IOS, IOS-XR, Junos, SROS, VRP and EOS?
Then people responding don't have to check what their NOS does, they
can refer to your table and tell the default value, since this is what
99% will be using.

Now your survey has built-in selection bias: the people who answer it
are the ones who know what MRAI is, are concerned about it, and have
changed it. This is not a representative group, and you will start
your work with very bad data.



--
  ++ytti
