nanog mailing list archives

Re: Consistent asymetric latency on monitoring?

From: Perry Lorier <perry () coders net>
Date: Thu, 22 Oct 2009 19:10:29 +1300

Rick Ernst wrote:

Resent, since I responded from the wrong address:
---
The basic operation of IP SLA is as surmised; payload with timestamps
and other telemetry data is sent to a 'responder' which manipulates
the payload, including adding its own timestamps, and returns the
altered payload.


Yup :) It's the obvious way to do it :)

I had to do a mental walk-through, but I think I see how drift can
cause this. I'm going to generate some artificial data, graph it, and
see if it matches the general waveshape I'm seeing.

I purposefully have the traffic generators ntp syncing against the
responders. I thought that would keep the clocks more closely in sync.
I don't necessarily care if the time is 'right', just that it's the
same.

This causes major problems. What you're actually measuring here is how
well ntp can keep the clocks synced under asymmetric latency. ntp is
trying to do its own measurements of one-way delay, without the help of
clocks to measure clock drift as well. As you can see from your graphs,
ntp is not coping[1].
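To make the dependence on clock agreement concrete, here is a minimal sketch of how one-way delays fall out of the four timestamps in a generator/responder exchange. It is not Cisco's IP SLA implementation; the function name and the `responder_offset` parameter (how far the responder clock runs ahead of the generator clock, in seconds) are assumptions made for illustration.

```python
def measured_one_way_delays(true_fwd, true_rev, responder_offset,
                            processing=0.0, t1=0.0):
    """Return (fwd, rev, rtt) in seconds as computed from the timestamps.

    true_fwd / true_rev are the real one-way delays. responder_offset is
    a hypothetical parameter: responder clock minus generator clock.
    """
    # Real event times, expressed on the generator's clock.
    arrive = t1 + true_fwd          # probe reaches the responder
    depart = arrive + processing    # responder sends the reply
    t4 = depart + true_rev          # reply reaches the generator

    # The responder stamps the payload using its own (offset) clock.
    t2 = arrive + responder_offset
    t3 = depart + responder_offset

    fwd = t2 - t1                   # inflated by the offset
    rev = t4 - t3                   # deflated by the same offset
    rtt = (t4 - t1) - (t3 - t2)     # the offset cancels out here
    return fwd, rev, rtt


# Symmetric 20 ms path, responder clock 5 ms ahead of the generator.
fwd, rev, rtt = measured_one_way_delays(0.020, 0.020, 0.005)
print(f"fwd={fwd * 1e3:.1f} ms rev={rev * 1e3:.1f} ms rtt={rtt * 1e3:.1f} ms")
```

Any offset between the two clocks shows up as apparent asymmetry: it is added to one direction and subtracted from the other, while the round-trip time is unaffected.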

You are far better off having each end sync to a local stratum 1 or
stratum 2 ntp source, preferably one over a different link to the one
under test. If you don't have a local stratum 1/2 time source at each
end, you might be able to find one over a local exchange or other less
congested link. If this is very important to you then you should
consider running your own stratum 1 clocks at each end, synchronised
off something like GPS, CDMA or a T1 clock.

What kind of difference should I expect if I sync both
generators and responders against the same source, or not sync the
responder? I'm thinking that having one source with constant drift may
be better than both devices trying to walk/correct the time.

Most hardware clocks in PCs/routers/switches etc. have pretty atrocious
amounts of drift if left to free run[2], sometimes on the order of
seconds or occasionally minutes per week. To get useful numbers you
really do need to synchronise them to /something/. Synchronising them
to each other causes problems, as ntp I think (I could be wrong)
assumes mostly symmetrical latency, and if the latency isn't symmetric
assumes it's because one clock is running fast/slow and will alter the
clock's speed to account for it. The great thing about ntp stratum 1
servers is that by definition they have more or less the same time no
matter where they are, so synchronising each against a local ntp server
will be a much, much better solution. If possible you should consider
peering with at least 3, preferably 4(!)[3], upstream ntp servers.
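The symmetric-latency assumption can be illustrated with the standard NTP on-wire calculation (as given in RFC 5905): offset = ((T2 - T1) + (T3 - T4)) / 2 and delay = (T4 - T1) - (T3 - T2). The sketch below is only a model of a single exchange, not ntpd's full filtering and clock discipline; `simulate_exchange` and its parameters are hypothetical helpers introduced for this example.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP on-wire estimate of (offset, round-trip delay)."""
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def simulate_exchange(true_offset, fwd, rev, t1=0.0, server_hold=0.0):
    """Hypothetical helper: build the four timestamps of one exchange.

    true_offset is server clock minus client clock; fwd and rev are the
    real one-way delays client->server and server->client, in seconds.
    """
    t2 = t1 + fwd + true_offset       # server receive, on the server clock
    t3 = t2 + server_hold             # server transmit, on the server clock
    t4 = (t3 - true_offset) + rev     # client receive, on the client clock
    return t1, t2, t3, t4


# Clocks actually agree, but the path is 30 ms one way and 10 ms back.
est_offset, est_delay = ntp_offset_and_delay(
    *simulate_exchange(true_offset=0.0, fwd=0.030, rev=0.010))
print(f"estimated offset={est_offset * 1e3:.1f} ms "
      f"delay={est_delay * 1e3:.1f} ms")
```

With perfectly matched clocks the estimate is still off by half the difference between the two one-way delays, (fwd - rev) / 2, which is 10 ms in this example. A client syncing across the link under test would steer its clock toward that wrong value.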

[1]: To be fair it's a hard problem. Anything that involves time just
gets more and more complicated the more you look at it. ntp is
extremely clever and probably knows more about time than I'd ever want
to know, but you're making its job hard.

[2]: http://vancouver-webpages.com/time/ and
http://vancouver-webpages.com/time/ltmhist.png

[3]: http://twiki.ntp.org/bin/view/Support/SelectingOffsiteNTPServers#Section_5.3.3.
