Forum Discussion

sean_flynn
5 years ago

Cisco Temp Tracking

Hi Team,

We are looking for better tracking and alerting for temperature increases over time. We tried dynamic thresholds, but they did not work correctly in the sandbox: they were not accurate and caused "false" alerts, per se.
Here is what we are dealing with, and how many different temperatures we monitor on a single piece of equipment.

[Three inline screenshots: inline-898310222.png, inline873211408.png, inline-1347265499.png]

Has anyone been able to create a datasource for this? We need it to be accurate.

We deal with MEs, CBR8s, ASR9ks, and various other Cisco equipment.

Any further assistance would be appreciated.


5 Replies

  • Anonymous

    Do you need a whole datasource to do this? What timeframe are you looking at for your threshold window? The alert trigger interval does just what you're looking for, if I understand your problem. It can be set to a maximum of 60 poll cycles. If you're polling at 5-minute intervals, that gives you a window of up to 5 hours. Since you're not as interested in immediate alerts, you could also increase the poll interval, which would lengthen that threshold window.
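The window arithmetic above is easy to check: 60 poll cycles at a 5-minute interval is 300 minutes, or 5 hours. The sketch below also models the idea behind a trigger interval, namely that an alert fires only after the threshold has been exceeded for N consecutive polls. It is written in Python rather than the Groovy a LogicMonitor script would use, and the function names, the "strictly above" comparison, and the sample readings are all assumptions made for illustration, not LogicMonitor's implementation.

```python
def trigger_window_hours(poll_interval_minutes, trigger_interval_polls):
    """Hours a value must stay over threshold before an alert fires."""
    return poll_interval_minutes * trigger_interval_polls / 60


def first_alert_index(readings, threshold, trigger_interval_polls):
    """Return the index of the poll at which an alert would fire, or None.

    Assumed model: the alert fires once `trigger_interval_polls`
    consecutive readings are strictly above `threshold`.
    """
    consecutive = 0
    for i, value in enumerate(readings):
        # Any reading at or below the threshold resets the streak.
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= trigger_interval_polls:
            return i
    return None


print(trigger_window_hours(5, 60))  # 5.0

# A single spike (46) is ignored; three consecutive high polls alert.
temps = [40, 46, 41, 46, 47, 48, 49]
print(first_alert_index(temps, threshold=45, trigger_interval_polls=3))  # 5
```

A brief spike never reaches the required streak, which is why a longer trigger interval suppresses the kind of "false" alerts described in the original post.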

    Alternatively, you could construct a Groovy-based complex datapoint that calls the LM API to get the value from X polls (or X minutes/hours/days) ago and stores it. Then you could add a second complex datapoint that calculates the average rate of temperature increase as the slope of the line through those two points.
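The slope calculation described here is just rise over run between an older reading and the current one. A minimal Python sketch follows (a real LogicMonitor complex datapoint would be written in Groovy). The function name, the use of epoch seconds, and the choice of degrees per hour as the unit are assumptions for illustration.

```python
def temp_slope_per_hour(t0_epoch_s, temp0, t1_epoch_s, temp1):
    """Average temperature change per hour between two readings."""
    elapsed_hours = (t1_epoch_s - t0_epoch_s) / 3600
    if elapsed_hours == 0:
        raise ValueError("the two readings must have different timestamps")
    return (temp1 - temp0) / elapsed_hours


# 40.0 degrees six hours ago, 43.0 degrees now: +0.5 degrees per hour.
print(temp_slope_per_hour(0, 40.0, 6 * 3600, 43.0))  # 0.5
```

A threshold on this value alerts on how fast the temperature is rising rather than on its absolute level.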

  • 3 hours ago, Stuart Weenig said:

    Alternatively, you could construct a Groovy-based complex datapoint that calls the LM API to get the value from X polls (or X minutes/hours/days) ago and stores it. Then you could add a second complex datapoint that calculates the average rate of temperature increase as the slope of the line through those two points.

    That last piece is needed as a primitive in LM. Going to "the API" from a script is a PITA without library support.

  • Anonymous

    Agreed. It's not optimal. However, it is how I would build it were I requested to build it. I was asked to build something similar when a customer wanted to see the same instance's datapoint for the same time of day but one week earlier. The right answer is to use the alert trigger interval.

    I'm working internally to get the product team to build simple LM_API_GET functions into the Groovy libraries, so that we can ask for data simply by calling "lm_api_get('device/devices')" or "lm_api_get('alert/alerts')" and get the data back in a map. Ideally, it would remove the need to create API tokens (since the collector already has access through the front door) and would also avoid all the error handling related to authentication and HTTP status codes.
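To show the kind of boilerplate such a helper would hide, here is a hedged Python sketch of building a token-based Authorization header. It follows the LMv1 signing scheme as I recall it from LogicMonitor's REST API documentation (HMAC-SHA256 over verb + epoch + body + resource path, hex digest, then base64); verify it against the official docs before relying on it. The function name and the sample credentials are hypothetical, and a real script would still need the HTTP call and status-code handling on top of this.

```python
import base64
import hashlib
import hmac
import time


def lmv1_auth_header(access_id, access_key, http_verb, resource_path,
                     data="", epoch_ms=None):
    """Build an 'LMv1' Authorization header value (assumed scheme)."""
    if epoch_ms is None:
        epoch_ms = int(time.time() * 1000)  # milliseconds since the epoch
    epoch = str(epoch_ms)

    # Signed string: verb + timestamp + request body + resource path.
    request_vars = http_verb + epoch + data + resource_path
    digest_hex = hmac.new(access_key.encode(), request_vars.encode(),
                          hashlib.sha256).hexdigest()

    # The hex digest itself is base64-encoded to form the signature.
    signature = base64.b64encode(digest_hex.encode()).decode()
    return f"LMv1 {access_id}:{signature}:{epoch}"


# Hypothetical credentials; epoch_ms is fixed so the output is repeatable.
header = lmv1_auth_header("myAccessId", "myAccessKey", "GET",
                          "/device/devices", epoch_ms=1600000000000)
print(header)
```

Every script that talks to the API has to repeat this signing step, which is the overhead a built-in lm_api_get would remove.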

  • Thank you for all this information. I am blown away and highly impressed by the community response to this.

    We are looking for this to be an alert tuning change, used with the existing datasource for Cisco temperature sensors.

    Let me try this out and I'll check back on this. I'll close this for now. Thank you so, so, so much.

  • Anonymous

    Coming back to this one with an update:

    The question is whether you want to 1) change how alerts fire, or 2) actually calculate and display the hourly/daily/weekly averages.

    If the desire is #1, the alert trigger interval is the way to go.

    If the desire is to do #2, it would have to be a scripted datasource, but you could use Collector Script Caching, which carries data forward from the last collection run to the current collection run. Doing that in every run would allow you to calculate, in the script, the running average of temperature.
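The running-average idea can be sketched as follows, in Python rather than the Groovy a scripted datasource would use. A plain dict stands in for the collector's script cache; the real Collector Script Caching API is not shown here, and the function name and key layout are assumptions for illustration. Only the count and the current average are carried forward, so no history has to be stored.

```python
def update_running_average(cache, key, new_value):
    """Fold one new reading into the average stored in `cache`.

    `cache` is a dict standing in for state carried between
    collection runs; only the count and the average are kept.
    """
    count = cache.get(key + ".count", 0) + 1
    previous_avg = cache.get(key + ".avg", 0.0)

    # Incremental mean: shift the old average toward the new value.
    avg = previous_avg + (new_value - previous_avg) / count

    cache[key + ".count"] = count
    cache[key + ".avg"] = avg
    return avg


cache = {}
for reading in [40.0, 42.0, 44.0]:  # three simulated collection runs
    avg = update_running_average(cache, "sensor1.temp", reading)
print(avg)  # 42.0
```

This computes an average over all runs since the cache was first populated. An hourly, daily, or weekly average would also need the window to be reset or aged out, which this sketch leaves out.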