nanog mailing list archives
Re: link monitoring
From: "Lady Benjamin Cannon of Glencoe, ASCE" <lb () 6by7 net>
Date: Thu, 29 Apr 2021 14:37:03 -0700
We monitor light levels and FEC values on all links and have thresholds for early-warning and pre-failure analysis.

Short answer: yes, we see links lose packets before failing completely. For dozens of reasons that's still a good thing, but you need to monitor every part of a resilient network.

Ms. Lady Benjamin PD Cannon of Glencoe, ASCE
6x7 Networks & 6x7 Telecom, LLC
CEO
lb () 6by7 net
"The only fully end-to-end encrypted global telecommunications company in the world."
FCC License KJ6FJJ
Sent from my iPhone via RFC1149.
On Apr 29, 2021, at 2:32 PM, Eric Kuhnke <eric.kuhnke () gmail com> wrote:

The Junipers on both sides should have discrete SNMP OIDs that respond with a FEC stress value or FEC error value. See the blue highlighted part here about FEC. Depending on what version of JunOS you're running, the MIB for it may or may not exist.

https://kb.juniper.net/InfoCenter/index?page=content&id=KB36074&cat=MX2008&actp=LIST

In other equipment it's sometimes found in a sub-tree of SNMP adjacent to the optical DOM values. Once you can acquire and poll that value, set it up as a custom item to graph, and alert on certain threshold values in your choice of NMS.

Additionally, signs of a failing optic may show up in some of the optical DOM MIB items you can poll:

https://mibs.observium.org/mib/JUNIPER-DOM-MIB/

It helps if you have some similar, non-misbehaving linecards and optics that can be polled during custom graph/OID configuration, to establish a baseline 'no problem' value. Exceeding that baseline will then trigger whatever threshold value you set in your monitoring system.

On Thu, Apr 29, 2021 at 1:40 PM Baldur Norddahl <baldur.norddahl () gmail com> wrote:

Hello

We had a 100G link that started to misbehave, and customers noticed bad packet loss. The optical values were just fine, but we had packet loss and latency. The interface showed FEC errors on one end and carrier transitions on the other end. Otherwise the link stayed up, and our monitoring system completely failed to warn about the failure. We had to find the bad link by traceroute (mtr) and observe where the packet loss started.

The link was between a Juniper MX204 and a Juniper ACX5448. Link length is 2 meters, using 2 km single mode SFP modules.

What is the best practice for monitoring links to avoid this scenario? What options do we have for link monitoring? I am investigating BFD, but I am unsure whether it would have helped in this situation.

Thanks,

Baldur
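The baseline-and-threshold approach described above could be sketched roughly as follows. This is only an illustration in Python, not anything posted in the thread: the `Sample` type, the function names, and every threshold number are hypothetical, and it assumes your poller already delivers a cumulative FEC uncorrected-error counter with a timestamp (no SNMP calls or real OIDs are shown).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """One poll of a cumulative error counter (hypothetical structure)."""
    timestamp: float      # seconds, e.g. from time.time() at poll time
    fec_uncorrected: int  # cumulative FEC uncorrected-error counter


def error_rate(prev: Sample, curr: Sample) -> Optional[float]:
    """Return errors per second between two polls.

    Returns None when the counter went backwards, which is treated here
    as a counter reset (assumption), so the interval is skipped.
    """
    dt = curr.timestamp - prev.timestamp
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr.fec_uncorrected - prev.fec_uncorrected
    if delta < 0:
        return None
    return delta / dt


def classify(rate: Optional[float], baseline_rate: float,
             warn_floor: float = 1.0, crit_floor: float = 100.0,
             warn_factor: float = 10.0, crit_factor: float = 100.0) -> str:
    """Compare a measured rate with a baseline from known-good links.

    All floors and factors are illustrative defaults, not recommendations.
    The floors keep a zero baseline from alerting on any nonzero rate.
    """
    if rate is None:
        return "unknown"
    warn = max(warn_floor, baseline_rate * warn_factor)
    crit = max(crit_floor, baseline_rate * crit_factor)
    if rate >= crit:
        return "critical"
    if rate >= warn:
        return "warning"
    return "ok"


if __name__ == "__main__":
    previous = Sample(timestamp=0.0, fec_uncorrected=1000)
    current = Sample(timestamp=60.0, fec_uncorrected=1600)
    rate = error_rate(previous, current)
    print(rate, classify(rate, baseline_rate=0.0))
```

Alerting on the per-second rate rather than the raw counter is the design choice here: a link that stays up while dropping packets, as in the original report, shows up as a rising rate even though interface state and optical levels look normal.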