nanog mailing list archives
Re: Famous operational issues
From: Job Snijders via NANOG <nanog () nanog org>
Date: Tue, 16 Feb 2021 21:00:32 +0100
On Tue, Feb 16, 2021 at 01:37:35PM -0600, John Kristoff wrote:
I'd like to start a thread about the most famous and widespread Internet operational issues, outages or implementation incompatibilities you have seen. Which examples would make up your top three?
This was a fantastic outage; one could really feel the tremors into the far corners of the BGP default-free zone: https://labs.ripe.net/Members/erik/ripe-ncc-and-duke-university-bgp-experiment/

The experiment triggered a bug in some Cisco router models: affected Ciscos would corrupt this specific BGP announcement ** ON OUTBOUND **. Any peers of such Ciscos receiving this BGP update would (according to the then-current RFCs) consider the BGP UPDATE corrupted, and would subsequently tear down the BGP sessions with the Ciscos.

Because the corruption was not detected by the Ciscos themselves, whenever the sessions came back online they'd reannounce the corrupted update, causing another session teardown. Bounce ... Bounce ... Bounce ... at global scale in both IBGP and EBGP! :-)

Luckily the industry took this, and many other lessons, to heart: in 2015 the IETF published RFC 7606 ("Revised Error Handling for BGP UPDATE Messages"), which specifies far more robust behaviour for BGP speakers.

Kind regards,

Job
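As a rough illustration of the flap loop described above, here is a toy Python model; it is not real BGP code. The names `Update`, `receive` and `simulate`, the example prefixes, and the `treat_as_withdraw` switch are all made up for this sketch. It assumes one malformed UPDATE is re-sent on every session re-establishment, and compares resetting the session on a malformed UPDATE with the "treat-as-withdraw" approach from RFC 7606, simplified here to a single flag (the RFC chooses the handling per attribute and error type).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    prefix: str
    malformed: bool  # True models an attribute corrupted by the sender on outbound


def receive(updates, treat_as_withdraw):
    """Process one session's worth of updates.

    Returns (session_up, installed_prefixes).
    """
    installed = set()
    for update in updates:
        if update.malformed:
            if treat_as_withdraw:
                # Simplified RFC 7606 style: drop only the affected route.
                installed.discard(update.prefix)
                continue
            # Older behaviour: reset the session, losing every route learned on it.
            return False, set()
        installed.add(update.prefix)
    return True, installed


def simulate(rounds, treat_as_withdraw):
    """Re-establish the session up to `rounds` times; the sender never notices
    the corruption, so it replays the same updates each time."""
    updates = [
        Update("192.0.2.0/24", False),
        Update("198.51.100.0/24", True),   # hypothetical corrupted announcement
        Update("203.0.113.0/24", False),
    ]
    flaps = 0
    routes = set()
    for _ in range(rounds):
        session_up, routes = receive(updates, treat_as_withdraw)
        if session_up:
            break
        flaps += 1
    return flaps, routes


if __name__ == "__main__":
    print(simulate(5, treat_as_withdraw=False))
    print(simulate(5, treat_as_withdraw=True))
```

With the reset behaviour the session flaps on every attempt and no routes survive; with the simplified treat-as-withdraw handling the session stays up and only the damaged route is lost.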