nanog mailing list archives
Re: Monitoring highly redundant operations
From: Simon Lockhart <simonl () rd bbc co uk>
Date: Wed, 24 Jan 2001 23:43:53 +0000
But he does raise an interesting problem. How do you know if your highly redundant, diverse, etc. system has a problem? With an ordinary system it's easy. It stops working. In a highly redundant system you can start losing critical components, but not be able to tell if your operation is in fact seriously compromised, because it continues to "work."
Indeed. We currently monitor each part of our operation from a monitoring station on our network. Under certain conditions, this can give us both false positives and false negatives:

- We've lost off-site routing. Our monitoring station can see all our nodes okay, so it thinks everything is fine, but no one else can see them.
- We've lost routing to just the part of our network with the monitoring station on it. It reports that everything is down, when in fact stuff is working fine for serving the rest of the internet.

One way we plan to overcome these issues is to locate monitoring stations on other ISPs' networks at random places on the internet. If you correlate the results from these multiple monitoring stations, then you get a better view of what the rest of the internet is seeing.

Simon
--
Simon Lockhart                       | Tel: +44 (0)1737 839676
Internet Engineering Manager         | Fax: +44 (0)1737 839516
BBC Internet Services                | Email: Simon.Lockhart () bbc co uk
Kingswood Warren,Tadworth,Surrey,UK  | URL: http://support.bbc.co.uk/
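
The correlation step described in the message could look roughly like the sketch below. It is not from the original post: the station names, the `classify` function, and the majority-vote rule are all assumptions made for illustration. Each station reports whether it could reach a node, and the on-network station's view is compared with the majority view of the external stations to tell the two failure cases apart.

```python
def classify(node_reports, internal_station="internal"):
    """Classify one node from per-station reachability reports.

    node_reports maps a station name to True if that station could
    reach the node. All names here are hypothetical.
    """
    external = [ok for name, ok in node_reports.items()
                if name != internal_station]
    if not external:
        raise ValueError("need at least one external station")

    # Assumed rule: the node counts as externally reachable only if a
    # strict majority of the external stations can see it.
    external_up = sum(external) * 2 > len(external)
    internal_up = node_reports.get(internal_station)

    if internal_up is None:
        # No on-network view available; trust the external majority.
        return "up" if external_up else "down"
    if internal_up and external_up:
        return "up"
    if not internal_up and not external_up:
        return "down"
    if internal_up and not external_up:
        # First case in the message: fine from inside, invisible outside.
        return "unreachable-from-outside"
    # Second case: only the monitoring station's own segment is cut off.
    return "internal-monitor-isolated"


if __name__ == "__main__":
    print(classify({"internal": True, "isp-a": False, "isp-b": False}))
    print(classify({"internal": False, "isp-a": True, "isp-b": True}))
```

A strict majority is only one possible rule. With few external stations, a single badly placed one can swing the result, so a real deployment would probably want more vantage points or a weighting scheme.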