nanog mailing list archives

Re: Monitoring service that has a human component?


From: John Von Essen <john () essenz com>
Date: Wed, 5 Dec 2018 17:39:30 -0500

What's your budget?

The outsourced NOC firms tend to be expensive (I've looked at them for a project), and they are also not that fast. Don't expect someone to determine whether an alarm is valid within a few minutes. Instead, it goes into their queue and waits for a tech to pick it up, so it could be 30-60 mins.

In a perfect scenario, using freelancer/gig-economy people should get this done quickly, but it needs to be sizeable to start and will involve a lot of logistics, which means money.

To be honest, the best option may be to hire a developer to custom code really good logic that eliminates a good deal of the false positives so only a handful make it through.
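As a rough idea of what that kind of logic could look like, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names (CheckResult, AlertFilter), the three rules (consecutive-failure threshold, known maintenance windows, a "maintenance" marker in the page body), and the default values are all hypothetical, not taken from any particular monitoring product. Windows that cross midnight are not handled.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from typing import List, Tuple


@dataclass
class CheckResult:
    """Outcome of one probe of the monitored site (hypothetical shape)."""
    ok: bool
    body: str
    at: datetime


@dataclass
class AlertFilter:
    """Decides whether a failed check is worth a human's attention."""
    required_failures: int = 3  # consecutive failures before alerting
    maintenance_windows: List[Tuple[time, time]] = field(default_factory=list)
    maintenance_markers: Tuple[str, ...] = ("maintenance",)
    _streak: int = 0

    def _in_window(self, at: datetime) -> bool:
        # Same-day windows only, e.g. 03:00 to 06:00.
        now = at.time()
        return any(start <= now < end for start, end in self.maintenance_windows)

    def classify(self, result: CheckResult) -> str:
        """Return 'ok', 'suppress', 'pending', or 'alert'."""
        if result.ok:
            self._streak = 0
            return "ok"
        body = result.body.lower()
        # Planned work: inside a known window, or the page itself says so.
        if self._in_window(result.at) or any(m in body for m in self.maintenance_markers):
            return "suppress"
        self._streak += 1
        # A single blip is not enough; wait for a run of failures.
        if self._streak < self.required_failures:
            return "pending"
        return "alert"
```

Only the "alert" outcome would page anyone; "suppress" and "pending" could simply be logged for later review.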

-John

On 12/5/18 5:01 PM, David H wrote:

Hey all, was curious if anyone knows of a website monitoring service that has the option to incorporate a human component into the decision and escalation tree?  I’m trying to help a customer find a way around false positives bogging down their NOC staff, by having a human determine the difference between a real error, desired (but different) content, or something in between like “Hey it’s 3am and we’ve taken our website offline for maintenance, we’ll be back up by 6am.”  Automated systems tend to know only that if test A, or steps A through C, are failing, then the site is ‘down’ and the preconfigured action runs. That ends up needlessly taking NOC time if the customer themselves is performing work on their own site, or just changed it and whatever content was being watched is now gone.  So, the goal would be to have the end user be the first point of contact if it looks like more of a customer-side issue.  If they can’t be reached to confirm, THEN contact NOC, and unlike email alerts, keep contacting until a human acknowledges receipt of the alert.
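The escalation order described above (customer first, then NOC, repeating until someone acknowledges) could be sketched roughly as follows. This is only an illustration under assumptions: the function name escalate, the two notifier callables (each returning True when a human acknowledges), and the retry counts and interval are all hypothetical, and no real paging or telephony API is implied.

```python
import time
from typing import Callable, Optional


def escalate(
    alert: str,
    notify_customer: Callable[[str], bool],  # hypothetical: True if a human acked
    notify_noc: Callable[[str], bool],       # hypothetical: True if a human acked
    customer_attempts: int = 2,
    max_noc_attempts: Optional[int] = None,  # None means keep trying indefinitely
    interval: float = 300.0,                 # seconds between attempts
    wait: Callable[[float], None] = time.sleep,
) -> str:
    """Return who acknowledged: 'customer', 'noc', or 'unacknowledged'."""
    # Step 1: the end user is the first point of contact.
    for _ in range(customer_attempts):
        if notify_customer(alert):
            return "customer"
        wait(interval)

    # Step 2: customer unreachable, so contact the NOC and keep
    # contacting until a human acknowledges (or the optional cap is hit).
    attempts = 0
    while max_noc_attempts is None or attempts < max_noc_attempts:
        attempts += 1
        if notify_noc(alert):
            return "noc"
        wait(interval)
    return "unacknowledged"
```

Passing the notifiers and the wait function in as arguments keeps the sketch independent of any specific channel (SMS, voice call, chat) and makes it easy to exercise without actually sleeping.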

Thanks

