Forum Discussion

James_Lee
8 years ago

Simple Truth About Complex Datapoints

LogicMonitor can handle cases that, at first glance, look too complicated to solve without some magical programming powers.

A chat that we had recently is summed up below.

Quote

One of our servers has an application that maxes out the CPU at 100% and brings down everything. This results in subsequent polls returning No Data or NaN (Not a Number). We have alerts on high CPU usage and we have alerts on No Data, but no combination of the two.
If we see a No Data alert, we have to look at the CPU data from before the NaN values started and determine whether it was at 100% before the failure. We would like an alert when the CPU is at 100% and the next poll is NaN (the system being unresponsive due to the high CPU usage).

At first glance this may look like something LogicMonitor cannot support natively except via a complex script-based datasource. In fact, it can be done with just a complex datapoint. Complex datapoints use mathematical formulas to derive the required metric from other, regular datapoints in the same datasource. For example:

if(un(counterA),0,counterA)

This expression returns zero if the value of counterA is NaN (not a number: for example, absent data, data in a non-numerical format, or infinity). If the value of counterA is anything other than NaN, that number is returned unchanged.
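To make the semantics concrete, here is a rough Python sketch of how that expression behaves. The helpers `un`, `if_` and `counter_or_zero` are hypothetical stand-ins written for illustration, not LogicMonitor APIs, and treating `None` and infinity as "unknown" is an assumption based on the description above.

```python
import math

def un(value):
    # Hypothetical stand-in for un(): True when the value is missing,
    # NaN, or infinite (assumed from the description in the post).
    return value is None or math.isnan(value) or math.isinf(value)

def if_(condition, when_true, when_false):
    # Stand-in for the three-argument if() used in the expression.
    return when_true if condition else when_false

def counter_or_zero(counter_a):
    # Mirrors if(un(counterA),0,counterA).
    return if_(un(counter_a), 0, counter_a)

print(counter_or_zero(42.0))          # a normal reading passes through
print(counter_or_zero(float("nan")))  # NaN is replaced by 0
print(counter_or_zero(None))          # missing data is replaced by 0
```

The point of the pattern is that downstream thresholds always see a number, so a missing poll can be compared like any other value.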

So if the CPU percentage datapoint is called cpupercent, we take it as the counter to watch and create a complex datapoint, "deltacpupercent", with the expression

if(un(cpupercent),-1,cpupercent)

Basically, this means: if cpupercent is a number, return that number; otherwise return -1.

If the CPU polls at 85, deltacpupercent would be 85.
If the CPU polls at 100, deltacpupercent would be 100.
If the CPU polls NaN, deltacpupercent would be -1.

So setting an alert on an absolute delta of >100 would fire when the system went unresponsive due to a 100% CPU condition, because going from 100 to -1 in one poll is a change of 101. However, there is an even easier method. From the alert threshold wizard help page:

absolute NaNDelta - works the same as absolute delta, but treats NaN values as 0

Setting this to alert on 100 would alert if the CPU went from 100 to NaN, but I prefer my complex datapoint solution, as it leaves room to alert on a variety of other things, such as keeping the current CPU usage alerts.
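As a rough illustration of the two approaches, the Python sketch below walks a made-up poll sequence through both the -1 sentinel and the NaN-as-zero treatment and computes the poll-to-poll absolute change. The function names and the sample polls are hypothetical, and the delta calculation is a simplified model of how the thresholds are evaluated, not LogicMonitor's actual implementation.

```python
import math

NAN = float("nan")

def delta_cpu_percent(cpu_percent):
    # Mirrors if(un(cpupercent),-1,cpupercent); NaN stands in for "no data".
    return -1.0 if math.isnan(cpu_percent) else cpu_percent

def nan_as_zero(cpu_percent):
    # Assumed reading of "treats NaN values as 0" from the help text.
    return 0.0 if math.isnan(cpu_percent) else cpu_percent

def absolute_deltas(values):
    # Absolute change between each poll and the one before it.
    return [abs(curr - prev) for prev, curr in zip(values, values[1:])]

# Hypothetical poll sequence: CPU climbs to 100%, then the host stops answering.
polls = [85.0, 100.0, NAN, NAN]

sentinel_deltas = absolute_deltas([delta_cpu_percent(v) for v in polls])
nan_deltas = absolute_deltas([nan_as_zero(v) for v in polls])

print(sentinel_deltas)                      # changes seen by the complex datapoint
print(nan_deltas)                           # changes seen with NaN treated as 0
print([d > 100 for d in sentinel_deltas])   # which polls would trip a >100 delta
```

In this simplified model the step from 100 to NaN is the only one where the sentinel delta exceeds 100 (it is 101), while the NaN-as-zero delta for the same step is exactly 100, the same figure a genuine drop from 100 to 0 would produce.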

     ---   Content contributed by David Lee, TSE UK team