Forum Discussion

Cody_Herzog
6 years ago

Best way to create complex alert logic based on data history.

Hello.

We're in the process of migrating some complex alerts from Zabbix to LogicMonitor, and I'd like to be sure we're on the right track. Here's an example of some complex alert logic using the built-in Zabbix expressions and operators:

( {-MyTemplate:MyDataItem.last()} >= 10 ) or
( ( {-MyTemplate:MyDataItem.last()} > {-MyTemplate:MyDataItem.prev()} ) and
  ( {-MyTemplate:MyDataItem.delta(3600)} >= 3 ) and
  ( {-MyTemplate:MyDataItem.delta(1800,1800)} >= 1 ) and
  ( {-MyTemplate:MyDataItem.last()} = {-MyTemplate:MyDataItem.max(3600)} ) )

At a high level, this is analyzing various metrics across the recent history of a certain data point.
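
To make the intent of the expression concrete, here is a rough Python sketch of the same logic evaluated over a list of (timestamp, value) samples. It assumes the Zabbix semantics of delta(sec, time_shift) as max minus min over the window, with the second argument shifting the window back in time. The names window, delta and should_alert are made up for illustration and are not part of Zabbix or LogicMonitor.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, value)


def window(samples: Sequence[Sample], now: float, period: float, shift: float = 0) -> List[float]:
    """Values whose timestamp falls in (now - shift - period, now - shift]."""
    end = now - shift
    start = end - period
    return [value for ts, value in samples if start < ts <= end]


def delta(samples: Sequence[Sample], now: float, period: float, shift: float = 0) -> float:
    """Assumed equivalent of Zabbix delta(): max minus min over the window."""
    values = window(samples, now, period, shift)
    return max(values) - min(values) if values else 0.0


def should_alert(samples: Sequence[Sample], now: float) -> bool:
    """Hypothetical port of the trigger expression shown above."""
    ordered = sorted(samples)
    if not ordered:
        return False
    last = ordered[-1][1]
    if last >= 10:                      # last() >= 10
        return True
    if len(ordered) < 2:
        return False
    prev = ordered[-2][1]
    last_hour = window(ordered, now, 3600)
    return (
        last > prev                                  # last() > prev()
        and delta(ordered, now, 3600) >= 3           # delta(3600) >= 3
        and delta(ordered, now, 1800, 1800) >= 1     # delta(1800,1800) >= 1
        and bool(last_hour)
        and last == max(last_hour)                   # last() = max(3600)
    )
```

Whatever mechanism ends up supplying the history (REST API or something else), the evaluation itself only needs the last hour of samples for the data point.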

My understanding is that the best way to achieve this in LM is to create a custom scripted data source that uses the REST API to access the raw data for the desired time window, and then have the custom script process the data as desired.

Is that correct? If so, what type of scripting is recommended for this? Groovy, PowerShell, or external?

Most of the REST API examples I've come across use Python, so is that recommended? If not, are there any examples of using Groovy or PowerShell to access the REST API?

Regarding the best API endpoint(s) to use, I suspect it would be one of these:

https://www.logicmonitor.com/support/rest-api-developers-guide/v1/data/get-data/

I started playing around with those, and the one thing that's not obvious to me is the best way to deal with the device data source ID. I know that it can be obtained through the following API:

https://www.logicmonitor.com/support/rest-api-developers-guide/v1/datasources/get-datasources-associated-with-a-device/

However, I was expecting to be able to pass the device data source ID in to my script as a parameter, or to have it available as a built-in variable within a Groovy script, similar to ##system.deviceId##. Is that not possible? If not, would I have to make two REST API calls in my script: the first to get the device data source ID, and the second to get the actual data?

I assume there is no dedicated REST API data source collection mechanism which would take care of all the authentication and other boilerplate stuff. Is it correct that there's no way for a scripted data item which uses the REST API to implicitly inherit the authentication context from which the script is being called?
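
For reference, this is roughly the authentication boilerplate in question, as a Python sketch of the LMv1 token scheme described in the REST API v1 docs (HMAC-SHA256 over verb + epoch + body + resource path, hex-encoded, then base64-encoded). The function names are hypothetical, and the resource path format is my reading of the "get data" page linked above, so treat both as assumptions to verify against the docs.

```python
import base64
import hashlib
import hmac
import time


def data_resource_path(device_id: int, device_datasource_id: int) -> str:
    # Assumed path for the v1 "get data" endpoint; check against the linked docs.
    return f"/device/devices/{device_id}/devicedatasources/{device_datasource_id}/data"


def lmv1_auth_header(access_id: str, access_key: str, http_verb: str,
                     resource_path: str, data: str = "", epoch_ms: int = None) -> str:
    """Build the value of the Authorization header for one request."""
    if epoch_ms is None:
        epoch_ms = int(time.time() * 1000)  # milliseconds since the epoch
    # The signed string is verb + epoch + request body + resource path
    # (as I understand it, without query parameters).
    message = f"{http_verb}{epoch_ms}{data}{resource_path}"
    digest_hex = hmac.new(access_key.encode(), message.encode(), hashlib.sha256).hexdigest()
    signature = base64.b64encode(digest_hex.encode()).decode()
    return f"LMv1 {access_id}:{signature}:{epoch_ms}"
```

The returned string would be sent as the Authorization header on a GET to the account's REST base URL plus the resource path. The access ID and key have to be supplied to the script explicitly, for example as device properties.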

Thanks very much.

  • I generally don't like to have LM DataSources attempt to communicate back to LM itself, I personally find it a bit hacky and avoid it unless there isn't another solution. That said, I do have some DataSources that do exactly this. In this case I think it would be overkill, and I would not suggest it if you are just looking for somewhat more complex thresholds in LogicMonitor. Try looking at Complex DataPoints rather than having a DataSource that attempts to re-process its own data.

    I'm not familiar with Zabbix, but based on the expression you provided and the references to last() and prev(), I assume you may need to play with delta thresholds (section 2) in LM. Can you provide more details on what DataSource you're using and what the Zabbix expression is attempting to do? A bigger-picture view is needed, as I think a whole different method would work better here than re-processing data.

    A trick you might be able to use is to have multiple DataPoints that return the same value, with a different threshold set on each. For example, if you have a Temperature DataPoint and you want thresholds of < 0 and > 30, you can have two DataPoints, "CurrentTemp_Low" with a threshold of < 0 and "CurrentTemp_High" with a threshold of > 30, where both DataPoints return the same current temperature value. You can implement the same thing using Complex DataPoints, but doing it this way means you don't need to hard-code your thresholds in DataSources globally.

    In general I personally suggest using Groovy for DataSources, as it's supported on both Linux and Windows collectors out of the box. But I would use PowerShell if I know it's only for Windows collectors and it's something PowerShell has better support for, like checking O365 or Exchange. I'm not sure you can use Python or other languages in DataSources, or you would at least need to install those languages on all your collectors. There is a section of the LM APIv1 docs with different language examples at https://www.logicmonitor.com/support/rest-api-developers-guide/v1/rest-api-v1-examples, you will also find various examples in the community and on Mike Suding's Blog (this is a great resource).

    LM seems to be more of a Mac shop to me, so they seem to prefer Python for scripts that run outside of LM. I'm more of a PowerShell guy myself, but as it's all REST calls you can use any language you are comfortable with.

  • Thanks very much, Mike.

    > I generally don't like to have LM DataSources attempt to communicate back to LM itself, I personally find it a bit hacky and avoid it unless there isn't another solution.

    Agreed. It feels icky to me too.

    > Try looking at Complex DataPoints rather than having a DataSource that attempts to re-process its own data.

    My impression is that complex data points cannot go back in time to look at older samples. If that's true, then I don't think they will meet my needs in this instance. The alert logic needs to look at the last hour of history and make decisions based on statistics/trends over that hour. Is there any way to do that without using the REST API?

    > I'm not sure you can use Python or other languages in DataSources, or you would at least need to install those languages on all your collectors.

    Yes. I ran into problems trying to use Python for external script data collection shortly after sending my first message.

    > There is a section of the LM APIv1 docs with different language examples at https://www.logicmonitor.com/support/rest-api-developers-guide/v1/rest-api-v1-examples, you will also find various examples in the community and on Mike Suding's Blog (this is a great resource).

    Awesome. That's just what I was looking for.

    I'll tinker around with doing it in Groovy, but I'm still hoping there's a better way which doesn't require LM communicating with itself with the REST API.

    Here are some alternatives for which I've searched, but could not find:

    1.) Complex data points being able to access older samples.
    2.) Complex trigger expressions which support logic such as min/max over certain time ranges.

    Thanks again.

  • > My impression is that complex data points cannot go back in time to look at older samples. If that's true, then I don't think they will meet my needs in this instance. The alert logic needs to look at the last hour of history and make decisions based on statistics/trends over that hour. Is there any way to do that without using the REST API?

    That kinda sounds like dynamic thresholds, which LM has a blog post about: https://www.logicmonitor.com/blog/dynamic-thresholds-the-new-old-buzzword/ They kinda seem to disagree with monitoring that way. :)

    You can use the delta thresholds to compare the previous value and the latest value, which I assume is the equivalent of your prev() and last(), but they can't look at data going back any further.

    If you really need to look at more history to set up your thresholds, then I would just write the DataSource to independently store that data yourself rather than attempt to extract past data from LM. For example, you can write a file on the collector that stores the values for the past hour. You are already getting the data directly from the device, so it would be much easier to store the data in a log/cache file than to call various LM APIs.

    I would still like to get a more specific idea of what you are attempting to implement, as it might help. What device are you attempting to monitor? What DataSource(s) are you using?
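
The collector-side cache suggested above could look roughly like the following. This is a Python sketch only (a real DataSource script would more likely be Groovy or PowerShell, as discussed), and the function name update_cache, the JSON file format and the one-hour retention are all assumptions for illustration.

```python
import json
import os
import tempfile
import time


def update_cache(path, value, now=None, max_age=3600):
    """Append the newest sample to a JSON cache file and drop samples older than max_age seconds.

    Returns the retained samples as a list of [timestamp, value] pairs.
    """
    now = time.time() if now is None else now
    samples = []
    if os.path.exists(path):
        try:
            with open(path) as fh:
                samples = json.load(fh)
        except (OSError, ValueError):
            samples = []  # unreadable or corrupt cache: start over
    samples.append([now, value])
    samples = [s for s in samples if now - s[0] <= max_age]
    # Write to a temporary file first, then rename, so a crash mid-write
    # does not leave a half-written cache behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(samples, fh)
    os.replace(tmp_path, path)
    return samples
```

Each collection run would call update_cache with the value just read from the device, then compute whatever statistics are needed (min, max, delta over the hour) from the returned list and output them as ordinary DataPoints with static thresholds. One cache file per device and instance would be needed to keep the histories separate.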