Forum Discussion

thanh_rodke
11 years ago

Show when failure FIRST started not just when alert triggered

Scenario

Say you have a datasource that polls every two minutes, and you set the alert to trigger after five intervals. The condition starts at 10:00 AM, so it persists for ten minutes (five intervals of two minutes) and the alert fires at 10:10 AM. The alert email arrives as expected, at maybe 10:11 or 10:12 AM. The alert says that the start time is 10:10 AM and the duration is 0h 0m, but the error actually started ten minutes earlier and has already been going on for ten minutes.
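The arithmetic in this scenario can be sketched in a few lines of Python. This is only an illustration: `estimate_condition_start` is a hypothetical helper, not part of any product API, the calendar date is made up, and it assumes the alert fires exactly when the configured number of polling intervals has elapsed.

```python
from datetime import datetime, timedelta


def estimate_condition_start(alert_start: datetime,
                             poll_interval: timedelta,
                             intervals: int) -> datetime:
    """Back-date the alert start by the time spent waiting for the alert to trigger."""
    return alert_start - poll_interval * intervals


# Values from the scenario: 2-minute polling, alert after 5 intervals,
# alert start reported as 10:10 AM (the date itself is arbitrary).
alert_start = datetime(2024, 1, 1, 10, 10)
condition_start = estimate_condition_start(alert_start, timedelta(minutes=2), 5)

print(condition_start.strftime("%H:%M"))   # 10:00
print(alert_start - condition_start)       # 0:10:00
```

The ten-minute gap between the two timestamps is the part of the outage that the reported duration of 0h 0m leaves out.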

Business case:

We have lots of items that we need historical data on but that are not necessarily "drop right now and fix" items, so we may not alert for 10-30 minutes, some even longer. This has also put a huge dampener on false positives. Having accurate times listed in the initial and subsequent alerts can be helpful for remediation, because you don't have to calculate each time how far back to start digging in the server logs based on that particular alert. Without them, it's more difficult to determine whether you have misconfigured a datasource (the quantity of alerts suddenly goes up, but you can't tell how many intervals or how much time has elapsed since the error actually began). It's just inaccurate and can be confusing to our MSP customers.

  • Thanks for the feedback, this makes complete sense. As we move to our new data storage platform this type of feature will be easier to accommodate. Would distinguishing between alert duration (as is now) and condition duration satisfy your needs?

  • I cannot think of a reason why I would need alert duration, but I could see where a helpdesk team that is evaluated based upon response time would find it useful.  That said, having both allows the most flexibility for all of your customers!

  • Hi Thanh,

    We're still working to get the new data storage engine out. Thanks for the reminder on this though, we'll keep it in mind as the backend conversion takes place.

    -Annie

  • I would like to have conditionStartTime in addition to the existing alert start time. This would enable us to correlate it with end-user-reported incidents in our ITSM system. For example, an end user spots a problem before the monitoring does; by comparing the condition start time with the date/time the end user's ticket was opened, we can determine whether we need to tune our monitoring thresholds.
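A rough sketch of the comparison described in the last reply, in Python. Everything here is assumed for illustration: the `Incident` record, its field names, and the sample timestamps are hypothetical, and a real version would read these values from the monitoring and ITSM systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    condition_start: datetime  # when the failing condition first appeared
    alert_start: datetime      # when the alert actually fired
    ticket_opened: datetime    # when the end user opened a ticket


def reported_before_alert(incident: Incident) -> bool:
    """True if the end user noticed the problem before the alert fired."""
    return incident.ticket_opened < incident.alert_start


def user_detection_delay(incident: Incident) -> timedelta:
    """Time between the condition starting and the end user's ticket."""
    return incident.ticket_opened - incident.condition_start


# Hypothetical incident: condition at 10:00, ticket at 10:04, alert at 10:10.
incident = Incident(
    condition_start=datetime(2024, 1, 1, 10, 0),
    alert_start=datetime(2024, 1, 1, 10, 10),
    ticket_opened=datetime(2024, 1, 1, 10, 4),
)

if reported_before_alert(incident):
    print("User reported first; consider tuning the threshold")
print(user_detection_delay(incident))  # 0:04:00
```

Incidents where the ticket precedes the alert but follows the condition start are the ones that suggest a shorter alert delay or a tighter threshold.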