nanog mailing list archives
Re: OS, Hardware, Network - Logging, Monitoring, and Alerting
From: Paul Armstrong <psa () otoh org>
Date: Fri, 27 Jun 2008 05:34:18 +0000
At 2008-06-26T02:22-0700, Rev. Jeffrey Paul wrote:
Other stuff we really need to keep an eye on is hardware - redundant PSU status in our 7204s and Dells, temperatures and voltages
Do yourself a favor and monitor temp in C. Most stuff only does C, and people burn routers when there's a mix of C and F ("I set the alarm to 90, why didn't it shut down?" Well, you should have set it to 30; the router only understands C).
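To put numbers on the mix-up: 90 F is roughly 32 C, so a threshold of 90 entered on a box that only reads Celsius will not trip until the hardware is far past safe limits. Below is a minimal sketch of normalizing thresholds before they reach a device. The helper names `fahrenheit_to_celsius` and `alarm_threshold_c` are made up for illustration and are not part of any router or monitoring API.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit value to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def alarm_threshold_c(value: float, unit: str) -> float:
    """Normalize an alarm threshold to Celsius.

    The unit has to be stated explicitly ("C" or "F"); a bare number
    is exactly the ambiguity that causes the mistake.
    """
    unit = unit.strip().upper()
    if unit == "C":
        return float(value)
    if unit == "F":
        return fahrenheit_to_celsius(value)
    raise ValueError(f"unknown temperature unit: {unit!r}")


if __name__ == "__main__":
    # The 90 F alarm from the example, in the unit the router understands.
    print(round(alarm_threshold_c(90, "F"), 1))
```

Forcing the caller to name the unit means a wrong value fails loudly instead of silently becoming a 90 C alarm.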
1) Is SNMP the best way to do this? Obviously some of the data (service checks) will need to be collected other ways.
Pretty much. Particularly with Net-SNMP, you can hook in external commands etc. Check out the "Arbitrary Extension Commands" section of http://www.net-snmp.org/docs/man/snmpd.conf.html

If you don't use SNMP for everything, you're going to be stuck with hooking SNMP into whatever you do use so that all your networking kit and environmental monitors can be monitored.
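As a rough illustration of hooking an external command into snmpd, here is a small script that prints one Celsius reading on stdout, which is the kind of thing the `extend` directive in snmpd.conf can run. This is a sketch under assumptions, not a tested deployment: the sensor path assumes a Linux host that exposes millidegrees Celsius in sysfs, and the extension name `cpu-temp` and install path in the comment are hypothetical.

```python
#!/usr/bin/env python3
"""Print one temperature reading in Celsius for snmpd to serve.

Assumed snmpd.conf hook (name and path are hypothetical):
    extend cpu-temp /usr/local/bin/cpu_temp.py
"""
import sys
from pathlib import Path

# Assumption: a Linux host that exposes millidegrees Celsius at this path.
SENSOR = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Turn a sysfs reading such as '42000' into degrees Celsius."""
    return int(raw.strip()) / 1000.0


def main() -> None:
    try:
        celsius = parse_millidegrees(SENSOR.read_text())
    except (OSError, ValueError) as exc:
        # Keep stdout empty on failure so no bogus value gets graphed.
        print(f"sensor error: {exc}", file=sys.stderr)
        return
    print(f"{celsius:.1f}")


if __name__ == "__main__":
    main()
```

With a line like the one in the docstring in place, the first line of the script's output should be readable under NET-SNMP-EXTEND-MIB::nsExtendOutput1Line, so the same poller that talks to the routers can collect it.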
2) Is there any good solution that does both logging/trending of this data and also notification/monitoring/alerting? I've used both Nagios and Cacti in the past, and, due to the number of individual things being monitored (3-5 items per OS instance, 5-10 items per physical server, 10-50 things per network device), setting them both up independently seems like a huge pain. Also, I've never really liked Nagios that much.
Take a look at OpenNMS....
There's got to be a better way. What do you guys use?
We wrote our own, but that's a company culture thing.

Paul
--
End dual-measurement, let's finish going metric!
http://gometric.us/
http://www.metric.org/