nanog mailing list archives
Re: Did your BGP crash today?
From: "Kevin Oberman" <oberman () es net>
Date: Mon, 30 Aug 2010 09:40:10 -0700
> Date: Mon, 30 Aug 2010 10:55:03 -0500
> From: Jack Bates <jbates () brightok net>
>
> Florian Weimer wrote:
> > This whole thread is quite schizophrenic because the consensus appears
> > to be that (a) a *researcher is not to blame* for sending out a BGP
> > message which eventually leads to session resets, and (b) an
> > *implementor is to blame* for sending out a BGP message which
> > eventually leads to session resets. You really can't have it both ways.
>
> As good a place to break in on the thread as any, I guess. Randy and
> others believe more testing should have been done. I'm not completely
> sure they didn't test against XR. They very likely could have tested in
> a 1 on 1 connection and everything looked fine. I don't know the full
> details, but at what point did the corruption appear, and was it
> visible? We know that it was corrupt on the output which caused peer
> resets, but was it necessarily visible in the router itself?
>
> Do we require a researcher to set up a chain of every vendor BGP speaker
> in every possible configuration and order to verify a bug doesn't cause
> things to break? In this case, one very likely would need an XR
> receiving and transmitting updates to detect the failure, so no less
> than 3 routers with the XR in the middle.
>
> What about individual configurations? Perhaps the update is received
> and altered by one vendor due to specific configurations, sent to the
> next vendor, accepted and altered (due to the first alteration, whereas
> it wouldn't be altered if the original update had been received) which
> causes the next vendor to reset. Then we add to this that it may pass
> silently through several middle vendor routers without problems and we
> realize the scope of such problems and why connecting to the Internet
> is so unpredictable.

The only way they could have caught this one was to have tested against
a CRS which had another router to which it was announcing the attribute
in a malformed packet.

Worse, the resets should just keep happening, as the CRS would still
have the route with the unknown attribute, which would just generate
another malformed update and cause the session to reset again. While it
may be possible to recover from something like this, it sure would not
be easy.
--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: oberman () es net    Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
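
The repeating reset described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it does not parse or emit real BGP messages, and the names BuggyTransit, StrictPeer, Route and simulate are hypothetical, invented for illustration. The middle router keeps every route it has learned and corrupts the outgoing update for any route that carries an unknown attribute; the downstream peer tears the session down whenever it sees a malformed update.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Route:
    prefix: str
    has_unknown_attr: bool  # route carries an attribute the transit router does not know


@dataclass
class BuggyTransit:
    """Hypothetical middle router: stores routes, corrupts some on re-announcement."""
    rib: List[Route] = field(default_factory=list)

    def receive(self, route: Route) -> None:
        # The inbound update is accepted without complaint, so nothing looks wrong here.
        self.rib.append(route)

    def build_updates(self) -> List[Tuple[str, bool]]:
        # Assumption for this sketch: any route with an unknown attribute
        # is re-announced as a malformed update (second tuple element True).
        return [(r.prefix, r.has_unknown_attr) for r in self.rib]


@dataclass
class StrictPeer:
    """Hypothetical downstream peer: resets the session on any malformed update."""
    resets: int = 0

    def process(self, updates: List[Tuple[str, bool]]) -> bool:
        for _prefix, malformed in updates:
            if malformed:
                self.resets += 1
                return False  # session torn down
        return True  # session stays up


def simulate(transit: BuggyTransit, peer: StrictPeer, max_attempts: int) -> Optional[int]:
    """Return the attempt on which the session stays up, or None if it never does."""
    for attempt in range(1, max_attempts + 1):
        # On every re-establishment the transit router re-sends its whole table,
        # including the route that triggers the malformed update.
        if peer.process(transit.build_updates()):
            return attempt
    return None


if __name__ == "__main__":
    transit = BuggyTransit()
    transit.receive(Route("192.0.2.0/24", has_unknown_attr=True))
    peer = StrictPeer()
    print(simulate(transit, peer, max_attempts=5), peer.resets)
```

In this model the session never stabilizes while the offending route stays in the transit router's table, which is the point made above. A test with only two routers (the sender and the middle router) would show nothing wrong, because the failure only appears at the third hop.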