Forum Discussion

rob_risetto
12 years ago

Collapse similar alerts into one alert email

Hi,

If you get tens or hundreds of error events in the Windows event log within a short period, and they are similar in event ID and/or event text, you generally get multiple alerts and alert emails, even with rate limiting taken into account.

Could you consider a feature where you can specify an alert or event pattern together with a period of x minutes or x hours, so that if events or alerts match the pattern, only one alert email is generated for all of the events in that period? The pattern could be any one or a combination of event ID, data source, event/alert text, or string match.

thanks

Rob
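To make the request concrete, here is a minimal sketch of the collapsing behaviour described above. It is only an illustration: `CollapseRule`, `collapse`, and the event dictionary keys (`time`, `event_id`, `source`, `text`) are hypothetical names, not part of any product's actual API, and times are assumed to be plain seconds.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CollapseRule:
    """Hypothetical rule: a pattern plus a period in seconds."""
    window_seconds: int
    event_id: Optional[int] = None       # match on event ID, if set
    source: Optional[str] = None         # match on data source, if set
    text_pattern: Optional[str] = None   # regex matched against event text, if set

    def matches(self, event: dict) -> bool:
        # Every criterion that is set must match (a "combination" of fields).
        if self.event_id is not None and event["event_id"] != self.event_id:
            return False
        if self.source is not None and event["source"] != self.source:
            return False
        if self.text_pattern is not None and not re.search(self.text_pattern, event["text"]):
            return False
        return True


def collapse(events: list, rule: CollapseRule) -> list:
    """Group matching events into batches; each batch would produce one email."""
    batches = []
    window_start = None
    for event in sorted(events, key=lambda e: e["time"]):
        if not rule.matches(event):
            continue
        # Start a new batch when no window is open or the current one has expired.
        if window_start is None or event["time"] - window_start >= rule.window_seconds:
            batches.append([])
            window_start = event["time"]
        batches[-1].append(event)
    return batches
```

With a 10-minute rule on event ID 3, three matching events inside the first 10 minutes would end up in one batch (one email), and a later event would open a new batch.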

  • What we're planning is to filter out the same alert ID from a host while that host is in alert for that alert ID.

    That is, if host A has an event with alert ID 3, and is thus in alert for that ID for the next 60 minutes (which is configurable per event source), then further occurrences of alert ID 3 on host A during those 60 minutes will be ignored. Once that first alert clears, a future event ID 3 will trigger another alert.

    The rationale is that if you've been told alert ID 3 is occurring, there's no point in telling you again. You have however long the alert duration is to investigate.

    Does that make sense from your point of view?
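As a rough sketch of the suppression logic described in this reply (not the actual implementation), the idea could look like the following. `AlertSuppressor` and `should_alert` are hypothetical names, and a single alert duration is assumed for simplicity, whereas the reply says it would be configurable per event source.

```python
class AlertSuppressor:
    """Ignore repeats of an alert ID on a host while that host is in alert for it."""

    def __init__(self, alert_duration_seconds: int = 3600):
        # Assumption: one duration for all event sources (60 minutes by default).
        self.alert_duration = alert_duration_seconds
        self._active_until = {}  # (host, alert_id) -> time at which the alert clears

    def should_alert(self, host: str, alert_id: int, now: float) -> bool:
        key = (host, alert_id)
        expiry = self._active_until.get(key)
        if expiry is not None and now < expiry:
            # Host is still in alert for this ID: suppress the repeat.
            return False
        # First occurrence, or the previous alert has cleared: alert again.
        self._active_until[key] = now + self.alert_duration
        return True
```

Suppression here is keyed on the (host, alert ID) pair, so the same alert ID on a different host, or a different alert ID on the same host, still triggers its own alert.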

  • Another take on this is group alerts.

    If I have a horizontal application with 100 servers of a particular component and 20 fail, I'd prefer not to receive 20 emails, nor do I want to suppress 19. A single email alert would be triggered, with its associated escalation chain and repeat interval. However, the body of the email would be as follows:

    1. Support full HTML and/or CSS to allow visual customization of the elements below.
    2. The top section would list the 20 hosts in tabular form, indicating the host(s) impacted, their state, etc. The columns would be customizable in terms of what information one wants to present. Could be in red.
    3. The lower section would show, in tabular form, the current status of any hosts that are part of the group but are not alerting. Could be in green.

    This allows one to see very quickly, on any of the alert emails, the state of the impact, and it becomes much more informative as more servers fail or recover. This is achievable in other monitoring systems such as SolarWinds NPM.

    Best,
    Andrew
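For illustration only, the group alert email body described above could be assembled roughly like this. `render_group_alert` and the host dictionary keys (`name`, `state`) are hypothetical, the columns are fixed to host and state rather than customizable, and the string `"OK"` is assumed to mean a host is not alerting.

```python
from html import escape


def render_group_alert(group_name: str, hosts: list) -> str:
    """Build one HTML email body: alerting hosts on top (red), healthy ones below (green)."""
    failing = [h for h in hosts if h["state"] != "OK"]
    healthy = [h for h in hosts if h["state"] == "OK"]

    def table(rows: list, color: str) -> str:
        # One row per host; values are escaped so host names cannot break the markup.
        body = "".join(
            f"<tr><td>{escape(h['name'])}</td><td>{escape(h['state'])}</td></tr>"
            for h in rows
        )
        return (
            f'<table style="color:{color}">'
            "<tr><th>Host</th><th>State</th></tr>"
            f"{body}</table>"
        )

    heading = (
        f"<h2>{escape(group_name)}: "
        f"{len(failing)} of {len(hosts)} hosts alerting</h2>"
    )
    return heading + table(failing, "red") + table(healthy, "green")
```

Re-rendering the same body on each repeat interval would let every email show the current split between failing and recovered hosts.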