nanog mailing list archives

Re: RPKI race


From: Rubens Kuhl <rubensk () gmail com>
Date: Tue, 16 Jun 2020 20:42:44 -0300

Do you have a default route pointing to an upstream that does not do ROV?
Do you receive the test prefix from more than one upstream, so that the
earlier test success could have been a function of upstream ROV?

Rubens


On Tue, Jun 16, 2020 at 8:35 PM Baldur Norddahl <baldur.norddahl () gmail com>
wrote:

Hello

I noticed that we regressed and started failing the test at
https://isbgpsafeyet.com/. Investigating, I found that we apparently had
some routes in the validation state "unknown" that should have been either
invalid or valid, including the test prefix, which was received via NL-IX
(and Cogent on IPv6).
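
For readers less familiar with the three validation states, here is a minimal Python sketch of route origin validation in the style of RFC 6811. It is an illustration only, not how Junos implements it; the VRP table, the ASNs and the validate() helper are all made up for the example (documentation prefixes and private-use ASNs).

```python
from ipaddress import ip_network

# Hypothetical VRPs as (prefix, max length, origin ASN); illustrative values only.
VRPS = [
    (ip_network("192.0.2.0/24"), 24, 64500),
    (ip_network("2001:db8::/32"), 48, 64501),
]

def validate(prefix, origin_asn, vrps):
    """Return 'valid', 'invalid' or 'unknown' for a route, RFC 6811 style."""
    route = ip_network(prefix)
    covered = False
    for vrp_prefix, max_len, asn in vrps:
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so check first.
        if route.version != vrp_prefix.version:
            continue
        if not route.subnet_of(vrp_prefix):
            continue
        covered = True  # at least one VRP covers this route
        if route.prefixlen <= max_len and asn == origin_asn:
            return "valid"
    # Covered but never matched is invalid; not covered at all is unknown.
    return "invalid" if covered else "unknown"

print(validate("192.0.2.0/24", 64500, VRPS))     # valid
print(validate("192.0.2.0/24", 64999, VRPS))     # invalid (wrong origin)
print(validate("198.51.100.0/24", 64500, VRPS))  # unknown (no covering VRP)
print(validate("192.0.2.0/24", 64999, []))       # unknown (empty VRP table)
```

The last call is the interesting one here: with an empty VRP table, every route comes out as "unknown", including ones that would otherwise be invalid.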

We do however have plenty of prefixes that are validated and received from
the same sources.

This is a Juniper MX204 router running 20.1R1.11. I tried a few things,
including "clear bgp neighbor xxx soft-inbound" (supposed to rerun the
import policy where the RPKI marking and check happen), which did not fix
it. Doing a "clear bgp neighbor xxx", which disconnects the peer and
reconnects after a slight delay, did however fix the issue. But I have to
do that for every peer we received the prefix from, and potentially we
could have trouble with every peer we have :-(

This router was software upgraded and rebooted two days ago. I suspect a
race condition. What if the router started BGP sessions before it was able
to communicate with the RPKI validation server or before the RPKI database
was synchronized?
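
The suspected race can be modelled with a small Python sketch. This is a toy model of the hypothesis above, not a description of what Junos actually does: the Rib class, its methods and the revalidate flag are invented for illustration, and the prefix and ASNs are documentation values.

```python
from ipaddress import ip_network

def validate(prefix, origin_asn, vrps):
    """RFC 6811 style check: 'valid', 'invalid' or 'unknown'."""
    route = ip_network(prefix)
    covered = False
    for vrp_prefix, max_len, asn in vrps:
        if route.version != vrp_prefix.version or not route.subnet_of(vrp_prefix):
            continue
        covered = True
        if route.prefixlen <= max_len and asn == origin_asn:
            return "valid"
    return "invalid" if covered else "unknown"

class Rib:
    """Toy routing table that stamps a validation state at import time."""

    def __init__(self):
        self.vrps = []    # empty until the validator session has synchronized
        self.routes = {}  # prefix -> (origin ASN, validation state)

    def import_route(self, prefix, origin_asn):
        state = validate(prefix, origin_asn, self.vrps)
        self.routes[prefix] = (origin_asn, state)

    def load_vrps(self, vrps, revalidate):
        self.vrps = vrps
        if revalidate:
            # Re-run validation for routes that were imported earlier.
            for prefix, (origin_asn, _) in list(self.routes.items()):
                self.import_route(prefix, origin_asn)

    def state(self, prefix):
        return self.routes[prefix][1]

vrps = [(ip_network("192.0.2.0/24"), 24, 64500)]

rib = Rib()
rib.import_route("192.0.2.0/24", 64999)  # BGP comes up before the VRPs arrive
rib.load_vrps(vrps, revalidate=False)    # VRPs arrive, old routes untouched
print(rib.state("192.0.2.0/24"))         # unknown (stale)
rib.import_route("192.0.2.0/24", 64999)  # session reset: route learned again
print(rib.state("192.0.2.0/24"))         # invalid
```

In this model, calling load_vrps() with revalidate=True gives the correct state without any session reset, which is the behaviour one would hope for from the router.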

I find it a bit disappointing that we ended up with a bad validation state
this easily, and that apparently there is little I can do about it except
walk through all our peers and reset their BGP sessions, which frankly is
an unacceptable disruption of traffic flow.

Regards,

Baldur

