event alerts based on time period
We have been asked to alert when a particular Windows event occurs N times within an H-hour period. This is a perfectly reasonable request, but we cannot do it, because LM can only alert immediately, each time an event triggers. We can fake it somewhat by tying alerts to an escalation chain with one or more initial null stages, but this still does not provide the requested behavior.
What is needed is a basic event correlation method that can both roll up repeated events (see, for example, /topic/1590-snmp-trap-event-consolidation/) and suppress alerts unless basic conditions are met, such as a minimum count over a time period. Ideally, this would work something like SEC (https://simple-evcorr.github.io/), which we have used very successfully elsewhere; that specific tool is not required, just at least some of its features. Others have suggested linking into a more sophisticated log handler, like ELK or Graylog, and I agree that would be awesome, but having to provide an ELK infrastructure instead of just collector instances would be a non-starter in many environments. This functionality belongs within LM.
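To make the requested behavior concrete, here is a minimal Python sketch of the kind of count-over-time suppression I mean. It is not LM functionality and not SEC code; the class name `ThresholdWindow` and its parameters are made up for illustration, and it assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class ThresholdWindow:
    """Hypothetical helper: signal only when `min_count` events fall within `window`."""

    def __init__(self, min_count: int, window: timedelta):
        self.min_count = min_count
        self.window = window
        self._times = deque()  # timestamps of events still inside the window

    def observe(self, ts: datetime) -> bool:
        """Record one event at `ts`; return True if an alert should fire."""
        self._times.append(ts)
        # Drop events that have aged out of the sliding window.
        while self._times and ts - self._times[0] >= self.window:
            self._times.popleft()
        if len(self._times) >= self.min_count:
            # Reset so one burst rolls up into a single alert.
            self._times.clear()
            return True
        return False


# Example: alert on 3 events within 1 hour.
w = ThresholdWindow(min_count=3, window=timedelta(hours=1))
start = datetime(2024, 1, 1, 12, 0)
print([w.observe(start + timedelta(minutes=m)) for m in (0, 10, 90, 100, 110)])
# prints [False, False, False, False, True]
```

The first two events age out before a third arrives, so nothing fires; the three events at 90, 100, and 110 minutes fall within one hour and produce a single alert. Clearing the window after firing is one possible roll-up policy; a real implementation might instead suppress further alerts for a configurable period.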
Thanks,
Mark