nanog mailing list archives

ARIN DNSSEC Monitoring enhancement deployed (was: 2019-01-11 ARIN.NET DNSSEC Outage – Post-Mortem)


From: John Curran <jcurran () arin net>
Date: Mon, 4 Feb 2019 18:44:21 +0000

On 11 Jan 2019, at 3:59 PM, John Curran <jcurran () arin net> wrote:
...
My apologies for this incident – while ARIN does have some fragility in our older systems (which we have been working 
aggressively to phase out via system refresh and replacements), it is not acceptable to have this situation with key 
infrastructure such as our DNS zones.   We will prioritize the necessary alert and monitor changes and I will report 
back to the community once that has been completed.

Folks -

I indicated that we would report back once appropriate DNSSEC monitoring is in place - this has now been completed 
(ref: the announcement forwarded below).

Thanks again for your patience in this matter,
/John

John Curran
President and CEO
American Registry for Internet Numbers

Begin forwarded message:

From: ARIN <info () arin net>
Subject: [arin-announce] DNSSEC Monitoring Enhancements
Date: 4 February 2019 at 11:32:25 AM EST
To: <arin-announce () arin net>

On 31 January, ARIN deployed DNSSEC monitoring enhancements, including proactive RRSIG expiration checking, zone syntax 
checking, and DNSSEC validation. We are monitoring from various disparate locations across the Internet with these 
checks. This effort was undertaken in response to the incident that occurred on 11 January, detailed in the incident 
report below.

Improved monitoring of DNSSEC and the arin.net zone will provide earlier alerts of any issues such as 
Resource Record Signature (RRSIG) expiration and any issues with DNSSEC validation. These enhancements will provide 
early warning of potential issues, prevent outages, and improve our ability to troubleshoot DNSSEC problems if they 
occur in the future.
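
As an illustration only (this is not ARIN's monitoring code, which the announcement does not describe), a proactive RRSIG expiration check could be sketched roughly as follows. The sketch assumes the RRSIG RDATA has already been fetched as presentation-format text (for example from resolver output); the function names, the three-day warning threshold, and the sample record are hypothetical. It does not handle 32-bit serial-number wraparound of the timestamp fields.

```python
from datetime import datetime, timedelta, timezone


def rrsig_expiration(rrsig_rdata: str) -> datetime:
    """Return the signature-expiration time from RRSIG RDATA (presentation format).

    Field order: type covered, algorithm, labels, original TTL,
    signature expiration, signature inception, key tag, signer name, signature.
    """
    expiration = rrsig_rdata.split()[4]
    if len(expiration) == 14:
        # YYYYMMDDHHmmSS form, always UTC
        return datetime.strptime(expiration, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    # Otherwise: unsigned decimal seconds since the Unix epoch
    return datetime.fromtimestamp(int(expiration), tz=timezone.utc)


def check_rrsig(rrsig_rdata: str, now: datetime,
                warn_within: timedelta = timedelta(days=3)) -> str:
    """Classify a signature as OK, WARNING (expiring soon), or CRITICAL (expired).

    The warn_within default is an arbitrary illustrative threshold.
    """
    remaining = rrsig_expiration(rrsig_rdata) - now
    if remaining <= timedelta(0):
        return "CRITICAL"
    if remaining <= warn_within:
        return "WARNING"
    return "OK"


# Hypothetical sample record; the key tag and signature are made up.
SAMPLE = "SOA 8 2 3600 20190125000000 20190111000000 12345 arin.net. AbCdEf=="

if __name__ == "__main__":
    print(check_rrsig(SAMPLE, datetime.now(timezone.utc)))
```

Running a check like this from several vantage points, with the warning threshold set well above the time needed to re-sign and reload the zone, would raise an alert before validating resolvers start to fail.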

Regards,
Mark Kosters
Chief Technology Officer
American Registry for Internet Numbers (ARIN)

Incident Report:

On 11 January 2019, at approximately 8:30 a.m. ET, ARIN monitoring systems alerted that some arin.net properties were 
unreachable. All users with validating DNS resolvers were unable to look up resources within arin.net and thus were 
unable to reach them. ARIN’s www.arin.net and ftp.arin.net sites and Whois, RPKI, and DNS services were affected for 
those users who use validating resolvers.

ARIN’s Engineering staff determined that DNSSEC validation for the arin.net zone was failing and 
temporarily unpublished Delegation Signer (DS) records with our registrar so that we could investigate the problem. 
Upon troubleshooting, ARIN staff discovered that the removal of a resource record had created a spurious record, which 
caused a script to fail to reload. New versions of the zone could not be loaded, and the zone file in use expired. 
After determining the cause of the problem, the offending file was removed and the zone was reloaded. Delegation Signer 
(DS) records were republished and the zone validated, restoring service at approximately 10:30 a.m. ET.

_______________________________________________
ARIN-Announce

