Forum Discussion

Kelemvor (Expert)
2 months ago

Any way to change the severity of an alert based on duration?

Hi,

I'm wondering if it's possible to do something for a website check that would be like:

  • The alert triggers on the normal schedule: check every 3 minutes, alert after 3 failures. Set the alert as a Warning.
  • If the alert is still active after 15 minutes, change it to an Error and re-notify.
  • If the alert is still active after another 15 minutes, change it to a Critical and re-notify.

Can anything like that be done?

Thanks.

  • SuzanneShaw (Community Manager)

    Hey Kelemvor - I am polling some experts for you, but the initial response is:
    - In theory, yes.
    - The thinking is that we could monitor the same metric at the same threshold and set up different severity levels based on the number of occurrences over a period of time. That can be done on a device, but we are still looking into whether it works for a web check.

    I will follow back up with more details after they test it out
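
To make the arithmetic behind that idea concrete, here is a minimal Python sketch that maps a count of consecutive failed checks to the severities asked for in the original post. It only illustrates the timing (3-minute checks, alert after 3 failures, a step up every 15 minutes). The function and constant names are hypothetical, and nothing here is a confirmed product capability.

```python
from typing import Optional

# Values taken from the original question; the names are hypothetical.
CHECK_INTERVAL_MIN = 3      # the website check runs every 3 minutes
FAILURES_BEFORE_ALERT = 3   # the alert opens on the 3rd consecutive failure
STEP_MIN = 15               # desired time between severity bumps


def severity_for(consecutive_failures: int) -> Optional[str]:
    """Return the desired severity for a run of consecutive failed checks.

    Returns None while the alert has not opened yet.
    """
    if consecutive_failures < FAILURES_BEFORE_ALERT:
        return None
    # Minutes elapsed since the alert first opened.
    minutes_active = (consecutive_failures - FAILURES_BEFORE_ALERT) * CHECK_INTERVAL_MIN
    if minutes_active >= 2 * STEP_MIN:
        return "critical"
    if minutes_active >= STEP_MIN:
        return "error"
    return "warning"
```

With these numbers the alert would open as a Warning on the 3rd failure, become an Error on the 8th (15 minutes later), and become Critical on the 13th (30 minutes later).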

  • SuzanneShaw (Community Manager)

    Kelemvor, our product leader said he doesn't think you can change the severity of an alert based on time like that. You can, however, resend or escalate after a given amount of time, and you can have the website check run every 3 minutes and alert only after 3 failures. Let me know if we can help.

  • Technically, there is no way to accomplish this.

    What you can do, though, is build an escalation interval into your escalation chain and alert rules. Once you reach the point where you want a higher severity, point the escalation chain to a different integration, where you can adjust the severity/impact for wherever you are sending your alerts.

    For example, say disk at 90% is an Error.

    The escalation interval is 120 minutes and the chain has 4 stages. The first 3 stages use your normal integration. The 4th uses the adjusted integration, which then triggers a higher-level alert in your ticketing or chat system.
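
To show how that example plays out over time, here is a rough Python sketch that models the 4-stage chain with a 120-minute escalation interval. It is a simulation only, not the product's API. The integration names, the `Stage` type, and the rule that the chain stays on its last stage once it gets there are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    integration: str        # hypothetical integration name
    outgoing_severity: str  # severity the receiving system is told to use


ESCALATION_INTERVAL_MIN = 120  # from the example above

# Stages 1-3 use the normal integration; stage 4 uses the adjusted one.
CHAIN = [
    Stage("ticketing-normal", "error"),
    Stage("ticketing-normal", "error"),
    Stage("ticketing-normal", "error"),
    Stage("ticketing-high", "critical"),
]


def stage_for(minutes_active: int) -> Stage:
    """Return the stage that would be notified after the alert has been
    active for the given number of minutes."""
    if minutes_active < 0:
        raise ValueError("minutes_active must be non-negative")
    # Assumption: once the last stage is reached, the chain stays on it.
    index = min(minutes_active // ESCALATION_INTERVAL_MIN, len(CHAIN) - 1)
    return CHAIN[index]
```

In this model the alert itself stays an Error the whole time. Only the integration receiving the notification changes, so after 360 minutes the ticketing or chat system sees the higher severity.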