nanog mailing list archives

Re: Ahoy, SLA boffins!


From: "Patrick W. Gilmore" <patrick () ianai net>
Date: Wed, 29 Jul 2009 00:42:42 -0400

On Jul 29, 2009, at 12:34 AM, Bill Woodcock wrote:

So I've embarked on the no-doubt-futile task of trying to interpret SLAs as empirically-verifiable technical specifications, rather than as marketing blather. And there's something that I'm finding particularly puzzling:

In most SLAs, there seem to be two separate guarantees proffered: one concerning "network availability" and one concerning "packet loss." Now, if I were to put my engineer hat on, and try to _imagine_ what the difference might be, I might imagine "network availability" to have something to do with layer-2 link status being presented as "up," while packet loss would be the percentage of packets dropped. But when I actually read SLAs, "network availability" is generally defined as the portion of the month that the path from the customer's local loop to the transit or peering routers was "available" to transmit packets. Packet loss, on the other hand, is generally defined as the portion of packets which are lost while crossing that exact same piece of network.

Now, what am I missing here? Is this one of those Heisenberg things, where "network availability" is the time the network _could have_ delivered a packet _when you weren't actually doing so_, while "packet loss" is the time the network _couldn't_ deliver a packet when you _were_ actually doing so?

Is "network availability" inherently unmeasurable on a network that's less than 100% utilized?

Am I over-thinking this?

Yes. Not because you are coming to strange conclusions, but because (as you say in your first sentence) you are trying to put empirical / objective meaning to marketing blather.

I had a simple way to fix this. I defined a network as "down" with more than X% packet loss (usually with X in the 2-5 range, depending on other deal parameters). IMHO, a network with 5% packet loss -is- down. I don't know about you, but none of my customers will use my service if they have 5% loss. TCP is finicky! This receives the strongest credit because you cannot use the service.

Below X, you are not "down", just degraded, and therefore the link has some utility, but not 100% utility. This receives a credit, but not as strong a credit as being unable to use a link.

Oh, and, of course, if there is no light on the fiber, then we are (obviously) "down" as well.
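
For anyone who wants to translate the scheme above into code, here is a minimal sketch in Python. It rests on several assumptions that are not part of any real SLA: loss is measured once per fixed interval, a sample of None stands for "no measurement possible" (for example, no light on the fiber), exactly X% loss counts as degraded rather than down, and any nonzero loss up to X counts as degraded. The names State, classify, and summarize are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, Optional


class State(Enum):
    OK = "ok"
    DEGRADED = "degraded"
    DOWN = "down"


def classify(loss_pct: Optional[float], down_threshold_pct: float = 5.0) -> State:
    """Classify one measurement interval by its packet loss percentage.

    loss_pct is None when no measurement was possible (assumed here to
    mean the link itself is dead, e.g. no light on the fiber).
    down_threshold_pct is the "X" from the text; 5.0 is only a default.
    """
    if loss_pct is None or loss_pct > down_threshold_pct:
        return State.DOWN        # unusable: strongest credit
    if loss_pct > 0:
        return State.DEGRADED    # some utility: weaker credit
    return State.OK


def summarize(samples: Iterable[Optional[float]],
              down_threshold_pct: float = 5.0) -> Dict[State, float]:
    """Return the fraction of intervals spent in each state."""
    states = [classify(s, down_threshold_pct) for s in samples]
    if not states:
        raise ValueError("need at least one sample")
    return {state: states.count(state) / len(states) for state in State}


if __name__ == "__main__":
    # Four equal-length intervals: clean, lossy, very lossy, dark fiber.
    report = summarize([0.0, 1.0, 6.0, None])
    for state, fraction in report.items():
        print(f"{state.value}: {fraction:.0%}")
```

With this framing, "availability" is simply one minus the fraction of intervals classified as down, and the degraded fraction is what earns the smaller credit. The credit amounts themselves are deal-specific and deliberately left out.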

Make sense?

Or am I over-thinking it? :)

--
TTFN,
patrick

P.S. Now you get to think about things like "packet loss to / from where?" and whether the last mile should count.


Seriously, though, I know there are people who don't consider SLAs to be fantasy-fiction, and some of them must not be innumerate, and some subset of those must be on NANOG, and the intersection set might be equal to or greater than one, right? Can anybody explain this to me in a way I can translate into code, while still taking myself seriously?

                               -Bill






