Forum Discussion

Michael_Dieter
2 years ago
Solved

Customizing HostStatus datasource to alert on "No Data"?

Can anyone share some feedback on customizing default global definition behavior for datasource HostStatus (HR6FND)?

More specifically, I'm looking for ideas about changing the behavior of either the heartbeat or idleInterval datapoints: by default, neither triggers an alert if there is No Data. At some point in the past I thought about changing this so that an alert would be triggered, but I seem to recall there were reasons this might produce unintended consequences, so I needed to consider more carefully whether it would be a good idea. Unfortunately, I never returned to that thought exercise, and I didn't keep good notes, so I don't remember what those reasons were.

Fast-forward to last week: a monitored resource suffered a failure that we would have caught much sooner if we had been alerting (to a chain with notification delivery) on "No Data". The resource did not stop responding to Ping, but nearly all of its SNMP-based datasources/datapoints started returning No Data; without the alert and notification, we did not realize that the resource had stopped providing service.

So I'm thinking about this again now. Any and all ideas are welcome.

3 Replies

  • The HostStatus datasource should always have data, even if the device is not responding. We have a >300 threshold on the idleInterval datapoint.

    I've thought about making an SNMP troubleshooter, but that would be so similar to the SNMP_Host_Uptime DS that I then thought about just putting a No Data alert on the Uptime datapoint there. I never actually did it, but I may in the future. It should be the canary for SNMP monitoring.

  • On 9/14/2022 at 11:01 AM, Stuart Weenig said:

    The HostStatus datasource should always have data, even if the device is not responding. We have a >300 threshold on the idleInterval datapoint.

    This is probably what I could not remember that I had realized in the past: there is never "No Data" for the HostStatus datasource.

    And so thanks for the suggestion of SNMP_Host_Uptime within the Host Status datasource group as a recommended candidate for triggering a "No Data" alert (The other on the short list is TCP UDP Stats).  I have an older/deprecated SNMP Uptime DS still in use so this is a good opportunity to review the release notes and replace it.

    thanks.

  • After a little further review, the newest SNMP_Host_Uptime datasource is probably the best choice. It seems to be a universal hit in terms of both "AppliesTo" matching and returning valid data to polling/collection. snmpTCPUDP (TCP UDP Stats) also seems universal in terms of "AppliesTo", but some of our devices return "No Data" for it 24/7. In addition, I have not been able to positively identify a historical situation where an snmpTCPUDP alert triggered on "No Data" would have produced the desired result, i.e., making us aware of the situation when other SNMP datasources were not doing so.
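
For anyone who wants to reason through the thread's conclusion, here is a rough Python sketch of the logic. It is not LogicMonitor code and does not call any LogicMonitor API. It only models a poll result as a number or `None` (standing in for "No Data"). The function names, the `consecutive` parameter, and the sample values are all hypothetical, and the scenario assumes idleInterval stays low because the device still answers Ping.

```python
from typing import Optional, Sequence

# None models a poll that returned "No Data". All names here are illustrative
# and are not part of any LogicMonitor API.
Sample = Optional[float]


def threshold_alert(value: Sample, limit: float = 300.0) -> bool:
    """Value-based alert: fires only when a real value exceeds the limit."""
    return value is not None and value > limit


def no_data_alert(samples: Sequence[Sample], consecutive: int = 3) -> bool:
    """Fires when the most recent `consecutive` polls all returned No Data."""
    if consecutive <= 0 or len(samples) < consecutive:
        return False
    return all(s is None for s in samples[-consecutive:])


def usable_as_canary(history: Sequence[Sample]) -> bool:
    """A datapoint that has never returned data cannot signal a change."""
    return any(s is not None for s in history)


# Scenario from the thread (values are made up): the device still answers
# Ping, so idleInterval is assumed to stay low, but SNMP polling has stopped
# returning values.
idle_interval = [12.0, 15.0, 11.0, 14.0]
uptime = [86400.0, None, None, None]

print(threshold_alert(idle_interval[-1]))  # >300 threshold on idleInterval
print(no_data_alert(uptime))               # No Data check on Uptime
```

The `usable_as_canary` check reflects the snmpTCPUDP observation above: a datapoint that returns No Data around the clock on some devices would alert permanently there, so it is a weaker candidate than one that normally returns valid data.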