Forum Discussion

Kevin_Foley
10 years ago

Monitor host time skew in relation to collector

How can I monitor host time variance from the collector's time?

I have 80+ Windows hosts, all of which have time sync enabled to time.windows.com or other NTP servers on my network. From time to time, the NTP client fails, somebody makes a manual time change to a host, or some application goes nuts and bends time.

Just as I want to be able to alert if a host's disk is under 5% free space, I want to alert if a host's clock has skewed 5 minutes from the collector's time.
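To make the check being asked for concrete, here is a minimal sketch in Python. It assumes the collector has already read the host's current UTC time by some means (for example a WMI query; that part is not shown), and the names `clock_skew`, `should_alert` and `ALERT_THRESHOLD` are hypothetical, not part of any product. The collector's clock is sampled before and after the query, and the midpoint is used as the reference so that a slow query does not show up as skew.

```python
from datetime import datetime, timedelta, timezone

# Threshold from the question: alert at 5 minutes of skew (assumed to apply in either direction).
ALERT_THRESHOLD = timedelta(minutes=5)


def clock_skew(host_utc, sent_utc, received_utc):
    """Return host time minus collector time as a timedelta.

    sent_utc and received_utc are the collector's clock readings taken just
    before and just after querying the host. Their midpoint approximates the
    collector time at the moment the host read its own clock.
    """
    midpoint = sent_utc + (received_utc - sent_utc) / 2
    return host_utc - midpoint


def should_alert(skew, threshold=ALERT_THRESHOLD):
    """True when the host is ahead of or behind the collector by the threshold or more."""
    return abs(skew) >= threshold


if __name__ == "__main__":
    # Illustrative values only; a real check would take these from the collector and the host.
    sent = datetime.now(timezone.utc)
    received = sent + timedelta(seconds=2)
    host = sent + timedelta(minutes=7)
    skew = clock_skew(host, sent, received)
    print(skew.total_seconds(), should_alert(skew))
```

The result is signed, so a datapoint built from it can show whether the host is running fast or slow, while the alert uses the absolute value.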

  • OK, this was a good idea. Our standard answer to this has been "Windows will log events when NTP sync fails - use event logging." But as event logs tend to be noisy, that's hard.

    I had thought there should be a better way to do it than comparing to the collector time. I had hoped to use w32tm to report the difference from the configured time source, but not all monitored hosts will be running the time service. Where they are, on a large domain w32tm /monitor can take literally minutes to return, and there are remote execution issues.

    So, I did just what you suggest.

    I'll email you a datasource that does this - it would be great if you could import it and let me know your feedback before I put it in our core repository.

    Thanks

  • I searched for this today because I found the above method is not sufficient. Why? Because the collector is (generally) required to be in the domain, and with Windows, time flows through the domain from the PDC emulator. Does this check work? Yes, it tells you if the collector is skewed from the monitored server. Does it tell you if the time is correct? Nope. If the PDC emulator is not synced, this check will happily tell you your offset is low or zero. What is really needed is a way to check time against an independent source. We ran into this yesterday when we were asked why a server showed skew. The answer was that the target in this case was accurate and not in the domain, and the domain was skewed. I suppose a workaround would be to ensure the collector is getting time from NTP, not the domain, but this may not be feasible depending on group policy architecture.

    Sorry, one more caveat -- forcing the collector to sync to NTP could break its domain membership since Kerberos will get wedged (or could).  So really, this needs to be a different datasource that compares time not from the collector, but from one or more NTP sources.
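As a rough illustration of comparing against an independent NTP source without changing what the collector syncs to, the sketch below sends a single SNTP client request and computes the standard clock offset, ((T2 - T1) + (T3 - T4)) / 2, from the reply. It is only a sketch under assumptions: the function names are hypothetical, it measures the clock of whatever machine runs it, and it skips the validation a real check would need (reply mode, stratum, kiss-of-death packets, averaging over several servers).

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """48-byte SNTP request: LI=0, version 3, mode 3 (client); all other fields zero."""
    return b"\x1b" + 47 * b"\0"


def _ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix seconds."""
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def offset_from_response(packet, t1, t4):
    """Return the server-minus-local clock offset in seconds.

    t1 is the local time the request was sent and t4 the local time the reply
    arrived (both Unix seconds). A positive result means the local clock is behind.
    """
    if len(packet) < 48:
        raise ValueError("NTP reply too short")
    words = struct.unpack("!12I", packet[:48])
    t2 = _ntp_to_unix(words[8], words[9])    # server receive timestamp
    t3 = _ntp_to_unix(words[10], words[11])  # server transmit timestamp
    return ((t2 - t1) + (t3 - t4)) / 2


def query_offset(server, timeout=5.0):
    """Query one NTP server over UDP port 123 and return the offset in seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    return offset_from_response(packet, t1, t4)


if __name__ == "__main__":
    # time.windows.com is the source named in the original question; any reachable NTP server works.
    print(query_offset("time.windows.com"))
```

Run on the collector, this only reads the NTP server's time, so it shows whether the domain's time has drifted without touching the collector's own sync configuration or its Kerberos-dependent domain membership. Adding that offset to the host-versus-collector skew would give an estimate of each host's skew from the NTP source.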