nanog mailing list archives

2019-01-11 ARIN.NET DNSSEC Outage – Post-Mortem (was: Re: ARIN NS down?)


From: John Curran <jcurran () arin net>
Date: Fri, 11 Jan 2019 20:59:10 +0000

On 11 Jan 2019, at 10:39 AM, John Curran <jcurran () arin net> wrote:

On Fri, Jan 11, 2019 at 07:57:25PM +0530,
couldn't get address for 'ns1.arin.net': not found

Folks -

   This has been resolved - arin.net zone is again correctly signed.

Post-mortem forthcoming,

Folks -

The ARIN.NET zone on our public signed DNS servers is populated via an internal DNS server and
associated workflow.  As part of system maintenance near the end of 2018, the zone file used by the master internal DNS
server was updated incorrectly, resulting in an invalid zone file.  Since the zone file was invalid, the zone did not
reload on our internal master, and the associated workflow to DNSSEC sign and push this zone to the public servers did
not execute.  Our monitoring systems continued to report green until the signatures expired, because they presently
check that the SOA records match on the internal and external nameservers.
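
To illustrate why a serial-comparison check can stay green in this situation, here is a minimal sketch. It is not
ARIN's actual monitoring; the function names, serial number, dates, and seven-day threshold are all hypothetical. When
the internal master fails to reload, both sides keep serving the same old serial, so the comparison passes, whereas a
check on how recently the zone was signed and pushed would fail.

```python
from datetime import datetime, timedelta, timezone

def serials_match(internal_serial: int, external_serial: int) -> bool:
    """Serial-comparison check: do both sides serve the same SOA serial?"""
    return internal_serial == external_serial

def publication_is_fresh(last_signed: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness check: has the sign-and-push workflow run recently enough?"""
    return now - last_signed <= max_age

# Hypothetical values, for illustration only.
stale_serial = 2018122001
last_signed = datetime(2018, 12, 20, tzinfo=timezone.utc)
now = datetime(2019, 1, 11, 13, 30, tzinfo=timezone.utc)

print(serials_match(stale_serial, stale_serial))                   # True: check stays green
print(publication_is_fresh(last_signed, now, timedelta(days=7)))   # False: workflow has stalled
```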

At approximately 8:30 AM Eastern Time today (11 January 2019), ARIN operations started seeing issues within its
monitoring.  Initial review suggested the problem was DNSSEC-related due to expired signatures.  We pulled the DS
record from the zone so that DNSSEC validation would not be performed by those validating resolvers that had not
already cached our DS records.  Upon further investigation we determined that it was the result of human error in
editing a zone file that went undetected and resulted in interruption of our routine zone publication process.  The
issue was fixed and signed zones were then pushed out at 10:25 AM ET.  The DS record was reinstated in the parent at
10:30 AM ET.

As a result of this incident, we will add alerting to the zone loading process for any errors and monitor
zone signature lifetimes, with appropriate alerting for any potential expiration of DNSSEC signatures.
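
As a rough sketch of what a signature-lifetime check could look like (an illustration under stated assumptions, not
the implementation ARIN will deploy): the code below takes an RRSIG expiration field in the YYYYMMDDHHmmSS
presentation format defined in RFC 4034 and classifies it against a warning window. The function names and the
three-day default window are hypothetical, and fetching the RRSIG from a nameserver is left out.

```python
from datetime import datetime, timedelta, timezone

def parse_rrsig_time(stamp: str) -> datetime:
    """Parse an RRSIG timestamp in YYYYMMDDHHmmSS presentation format (always UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)

def signature_status(expiration: str, now: datetime,
                     warn_before: timedelta = timedelta(days=3)) -> str:
    """Return 'expired', 'warning', or 'ok' for one signature expiration field.

    warn_before is a hypothetical alerting window; choose it to leave enough
    time to re-sign and publish before validating resolvers start failing.
    """
    remaining = parse_rrsig_time(expiration) - now
    if remaining <= timedelta(0):
        return "expired"
    if remaining <= warn_before:
        return "warning"
    return "ok"
```

A monitor built this way would alert on "warning" well before validation failures begin, rather than only after
signatures have already lapsed.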

My apologies for this incident – while ARIN does have some fragility in our older systems (which we have been working 
aggressively to phase out via system refresh and replacements), it is not acceptable to have this situation with key 
infrastructure such as our DNS zones.  We will prioritize the necessary alerting and monitoring changes, and I will
report back to the community once that has been completed.

Thank you for your patience in this regard.
/John

John Curran
President and CEO
American Registry for Internet Numbers
