alert rule recommendations within modules

One thing I have noticed over time is how often there is a datapoint somewhere that really deserves to be covered by an alert rule, but you don't find out until after you get bitten. This issue is orthogonal to threshold severity, at least in some of the modules I have seen.

An example fresh from today: loss of power in an environment, with no indication that it had happened. After some checking, I found that the Cisco FRU Power DS had received an update, and after the update it showed when power loss (and other related issues) happened. Whoever wrote this one decided that each class of issue would alert at warning level only, even though the DP classes themselves are warning, error and critical, each grouping different conditions.

What I took away from this is that LM itself should have a diagnostic capability to (among other things) recommend which datasources represent important things that ought to have alert rules but do not (or that route to NoEscalation). I am not sure yet how this ought to be represented in the system, but some indication of "this one is important and should route to an alert!" on each datapoint would be a good start. There may be more metadata that deserves to be included, but nothing else comes to mind right now.
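To make the idea a bit more concrete, here is a rough Python sketch of what such a diagnostic check could look like. Everything in it is hypothetical: the `important` flag is the per-datapoint marker suggested above, and the `Datapoint` and `AlertRule` classes, the glob matching, the priority ordering, and the sample datasource and datapoint names are all made up for illustration. It does not call any LM API and is not how LM represents rules internally.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

NO_ESCALATION = "NoEscalation"


@dataclass(frozen=True)
class Datapoint:
    datasource: str
    name: str
    important: bool  # hypothetical "should route to an alert" marker


@dataclass(frozen=True)
class AlertRule:
    priority: int          # assumption: lower number is evaluated first
    datasource_glob: str   # assumption: simple glob matching on names
    datapoint_glob: str
    escalation_chain: str


def first_matching_rule(dp: Datapoint, rules: List[AlertRule]) -> Optional[AlertRule]:
    """Return the first rule (in priority order) whose globs match the datapoint."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(dp.datasource, rule.datasource_glob) and fnmatchcase(
            dp.name, rule.datapoint_glob
        ):
            return rule
    return None


def recommend(datapoints: List[Datapoint], rules: List[AlertRule]) -> List[Tuple[str, str, str]]:
    """List important datapoints that no rule covers or that go to NoEscalation."""
    findings = []
    for dp in datapoints:
        if not dp.important:
            continue
        rule = first_matching_rule(dp, rules)
        if rule is None:
            findings.append((dp.datasource, dp.name, "no matching alert rule"))
        elif rule.escalation_chain == NO_ESCALATION:
            findings.append((dp.datasource, dp.name, "routes to NoEscalation"))
    return findings


# Made-up sample data, only to exercise the check.
datapoints = [
    Datapoint("Cisco_FRU_Power", "PowerCritical", True),
    Datapoint("Cisco_FRU_Power", "PowerWarning", False),
    Datapoint("Ping", "PingLossPercent", True),
    Datapoint("SNMP_Uptime", "Uptime", True),
]
rules = [
    AlertRule(10, "Ping", "*", "NOC"),
    AlertRule(100, "SNMP_*", "*", NO_ESCALATION),
]

findings = recommend(datapoints, rules)
for datasource, datapoint, reason in findings:
    print(f"{datasource}/{datapoint}: {reason}")
```

With this sample data the check flags the important power datapoint that no rule covers and the important uptime datapoint that lands in NoEscalation, and it leaves the unflagged datapoint and the properly routed one alone. A real version would also have to account for severity levels and resource groups, which this sketch ignores.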
