nanog mailing list archives

Re: Monitoring service that has a human component?


From: Karsten Elfenbein <karsten.elfenbein () gmail com>
Date: Tue, 11 Dec 2018 19:46:06 +0100

Hi,

you could have them insert a custom string into the maintenance page
(I hope they are not writing it on demand). The monitoring would then
be OK on a status code of 200-399 or when the custom string is found.
You could also use a different escalation chain when "maintenance" is
found on a 503 error. Other than that, it sounds like a nice AI
training field.
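
A rough sketch of that check in Python, in case it helps. The marker
string, the Verdict names and the probe() helper are all made up for
illustration; any monitoring tool with a "content match" option could
do the same thing natively.

```python
import urllib.error
import urllib.request
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    MAINTENANCE = "maintenance"   # route to the customer-facing chain
    DOWN = "down"                 # route to the NOC chain


# Hypothetical marker the customer agrees to embed in the maintenance page.
MAINTENANCE_MARKER = "X-PLANNED-MAINTENANCE"


def classify(status_code, body, marker=MAINTENANCE_MARKER):
    """Map one HTTP response to a verdict."""
    if 200 <= status_code <= 399:
        return Verdict.OK
    if marker in body:
        # Typically a 503 served by the maintenance page.
        return Verdict.MAINTENANCE
    return Verdict.DOWN


def probe(url, timeout=10.0):
    """Fetch the URL once and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx, but the error still carries the body.
        return classify(err.code, err.read().decode("utf-8", "replace"))
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, ...
        return Verdict.DOWN
```

Whether MAINTENANCE counts as plain "ok" or triggers a separate,
quieter escalation chain is then a policy decision for the caller.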


Karsten
On Wed, 5 Dec 2018 at 23:04, David H <ispcolohost () gmail com> wrote:

Hey all, I was curious whether anyone knows of a website monitoring service that has the option to incorporate a human
component into the decision and escalation tree. I’m trying to help a customer find a way around false positives
bogging down their NOC staff, by having a human determine the difference between a real error, desired (but
different) content, or something in between like “Hey it’s 3am and we’ve taken our website offline for maintenance,
we’ll be back up by 6am.” Automated systems tend to know only that if test A, or steps A through C, are failing, then
the site is ‘down’ and the preconfigured action should run. That ends up needlessly taking NOC time if the customer
is performing work on their own site, or has just changed it so that whatever content was being watched is now gone. So,
the goal would be to have the end user be the first point of contact if it looks like more of a customer-side issue.
If they can’t be reached to confirm, THEN contact the NOC and, unlike email alerts, keep contacting until a human
acknowledges receipt of the alert.
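
For illustration only, the escalation order described above could be sketched like this in Python. The notify callback, the contact names, the attempt count and the retry delay are all assumptions, not features of any particular monitoring product.

```python
import time


def escalate(notify, customer_attempts=3, retry_delay=300.0, sleep=time.sleep):
    """Contact the customer first, then the NOC until someone acknowledges.

    notify(contact) is a hypothetical callback that pages or calls the given
    contact and returns True only if a human acknowledged the alert.
    Returns the name of the contact who acknowledged.
    """
    # Step 1: the end user is the first point of contact, for a limited time.
    for _ in range(customer_attempts):
        if notify("customer"):
            return "customer"
        sleep(retry_delay)

    # Step 2: customer unreachable, so keep contacting the NOC until a
    # human acknowledges (unlike a fire-and-forget email alert).
    while True:
        if notify("noc"):
            return "noc"
        sleep(retry_delay)
```

The sleep parameter is injectable so the loop can be exercised in a test without real waiting.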



Thanks

