nanog mailing list archives

Re: SNMP - monitoring large number of devices


From: Robert Drake <rdrake () direcpath com>
Date: Tue, 29 Sep 2015 23:01:22 -0400

OpenNMS has a poller that will do what you want. The problem is figuring out what you wish to collect and how to use it. Most of the time it's not as simple as pointing at the modem and saying go.

I've added a few OIDs for some of the modems we support, just so I can get SNR on them. I don't usually add customer modems directly to monitoring unless I'm tracking a long-term problem and want to watch the SNR for that customer for weeks.

I monitor our CMTSes with a threshold system that alerts if the number of active modems decreases by around 20. This can cause false positives when modems migrate between cards, but if you tweak the numbers right it works okay.
We also have graphs for signal and other things on each CMTS.
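
As a rough illustration of that kind of threshold check, here is a minimal sketch. It assumes you already have per-card active-modem counts from two consecutive polls; the function name, the dict layout, and the exact threshold of 20 are illustrative assumptions, not how any particular NMS implements it.

```python
def cards_to_alert(previous, current, threshold=20):
    """Return cards whose active-modem count dropped by at least `threshold`.

    `previous` and `current` are hypothetical dicts mapping a card name to
    its active-modem count from two consecutive polls.
    """
    alerts = []
    for card, before in previous.items():
        after = current.get(card, 0)  # a card missing from the new poll counts as 0
        if before - after >= threshold:
            alerts.append(card)
    return sorted(alerts)
```

Because modems migrating between cards look like a drop on one card, the threshold is the knob you would tune to cut down the false positives.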

Now that I'm thinking about it, I believe I could get away with adding all our modems for SNR, then try to write something to add/remove them and keep it in sync with our provisioning system. I would need to make sure everything was in order so I don't get 400 emails when a site goes down, but it all should be possible. I'm not sure if the I/O would be worth it, but being able to aggregate some of the data and look at SNR across an entire plant would be nice. At one point I had a project to put modems at the tail end of each leg of a plant then monitor them. This is because we don't have monitor-able amplifiers. It never happened though.
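
The add/remove sync mentioned above could start out as a plain set difference. This is only a sketch: it assumes both the provisioning system and the monitoring system can give you a list of modem identifiers (MAC addresses, say), and the function name is made up for illustration.

```python
def plan_sync(provisioned, monitored):
    """Work out which modems to add to and remove from monitoring.

    Both arguments are hypothetical iterables of modem identifiers.
    Returns (to_add, to_remove) as sorted lists.
    """
    provisioned, monitored = set(provisioned), set(monitored)
    to_add = sorted(provisioned - monitored)     # provisioned but not yet monitored
    to_remove = sorted(monitored - provisioned)  # monitored but no longer provisioned
    return to_add, to_remove
```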

The truth is that balancing a plant is easy enough once you're used to it, and the extra metrics you might get from doing some of these things aren't worth the long-term I/O. We do have other (non-NMS) systems that will poll and get instantaneous results like this for entire plants. That has been very useful.


My guess is no matter what system you pick, you will either need to spend a couple of weeks hacking on it or pay someone to implement it. There isn't a turnkey system that does exactly what you want because 99% of network monitoring companies target systems rather than networks (the market is much larger..).

If you want to roll your own:

https://github.com/tobez/snmp-query-engine

I recently discovered this and wanted it years ago. I actually considered stripping the poller out of OpenNMS so there would be a bare-bones poller you could send OIDs to and get back results. The reason is that almost everyone who implements SNMP polling does a bad job of it, and the result is slow. So, don't start at the library layer and don't write your own thing (unless you have to..). You need asynchronous communication, bulk and gettable support, and you don't want to worry about max PDU size. That's what snmp-query-engine does (maybe.. I've just looked at the tin, I haven't used it).
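
To show what "asynchronous" means in practice, here is a minimal sketch of keeping many requests in flight at once with a cap on concurrency. It does not use snmp-query-engine or any real SNMP library: `fake_snmp_get` is a hypothetical stand-in for whatever asynchronous request call you end up with, and the concurrency limit and timeout are made-up numbers.

```python
import asyncio


async def fake_snmp_get(host, oids):
    # Hypothetical stand-in for a real asynchronous SNMP request.
    await asyncio.sleep(0)
    return {oid: f"{host}:{oid}" for oid in oids}


async def poll_all(hosts, oids, max_in_flight=500, timeout=2.0, fetch=fake_snmp_get):
    """Poll every host concurrently, with at most `max_in_flight` outstanding."""
    sem = asyncio.Semaphore(max_in_flight)

    async def poll_one(host):
        async with sem:
            try:
                return host, await asyncio.wait_for(fetch(host, oids), timeout)
            except (asyncio.TimeoutError, OSError):
                return host, None  # an unreachable modem must not stall the run

    return dict(await asyncio.gather(*(poll_one(h) for h in hosts)))


def run_poll(hosts, oids):
    """Synchronous entry point for the sketch above."""
    return asyncio.run(poll_all(hosts, oids))
```

The point is only that one slow or dead modem costs you a timeout on one slot rather than blocking the whole run; bulk requests and PDU sizing are the parts you would still want a real engine to handle.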

Second note about rolling your own: Skip whisper, rrdtool, mrtg, and any other single-system data collection. You want 1 million OIDs or more in 5 minutes? You need SSD for hardware and will probably want to distribute data writes eventually. Research things that make this easier, such as Cassandra-based storage, though nothing good is fully formed. You should still probably begin with OpenTSDB, InfluxDB or another established time series database rather than rolling your own. They have warts, but fixing the warts is better than creating new one-use TSDBs with their own flaws. See https://github.com/OpenNMS/newts/wiki/Comparison-of-TSDB
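
For a sense of scale, the arithmetic behind "1 million OIDs in 5 minutes" can be worked out as below, using the 50k modems and 20 OIDs from the original question. The function name is just for illustration.

```python
def required_rate(modems, oids_per_modem, window_seconds):
    """Return (total OIDs per cycle, OIDs per second needed to finish in time)."""
    total = modems * oids_per_modem
    return total, total / window_seconds


total, per_second = required_rate(50_000, 20, 5 * 60)
print(total, round(per_second))  # 1000000 3333
```

That is roughly 3,300 values per second, sustained, every polling cycle.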



On 9/29/2015 4:20 PM, Pavel Dimow wrote:
Hi all,

recently I have been tasked with an NMS project. The idea is to poll about
20 OIDs from 50k cable modems in less than 5 minutes (yes, I know that's
one million OIDs). Before you say check out some very professional and
expensive solutions, I would like to know whether there are any alternatives,
like an open source "snmp framework". To be more descriptive, many of you know
how big the mess with SNMP on cable modems is. You always first perform an SNMP
walk in order to discover interfaces and then read the values for those
interfaces. As a cable modem can bundle more DS channels, one time you can
have one and another time you can have N+1 DS channels = interfaces. All in
all, I don't believe that there is something perfect out there when it comes
to tracking a huge number of cable modems, so I would like to know whether
there is any "snmp framework" that can be extended, and how you did (or would)
solve this problem.

Thank you.


