Forum Discussion

anil_perera
7 years ago

Alert if there are no healthy instances in our load balancers after 45 minutes

I want to set up an alert in LogicMonitor that notifies me if there are no healthy instances in our load balancers after 45 minutes.
The instances behind the ELB can be terminated and recycled at any time if they become unhealthy, in which case new instances are spun up.
It is fine if servers become unhealthy and new ones come online. However, I want to be notified if no new instances come online.
So, basically, I want to monitor the ELB itself to verify that at least one instance is healthy 45 minutes after the instances become unhealthy and are terminated.
This is an AWS environment.
Can this be done? If so, how do I proceed?

1 Reply

  • Sarah_Terry
    Product Manager

    Hi Anil,

    There's a default threshold on the UnHealthyHostCount datapoint for Load Balancers (>0 triggers a warning), but you can always set a threshold of <1 on the HealthyHostCount datapoint to alert on fewer than one healthy instance. As for the 45-minute condition, there are a couple of ways to handle it:

    1. Adjust the Alert Trigger Interval in the Load Balancer DataSource definition: the alert trigger interval in the datapoint definition controls how many consecutive polling intervals the threshold condition must hold before an alert is triggered. This means that with a threshold of <1 for HealthyHostCount, an Alert Trigger Interval of 30 (or 60), and a collection interval of 1 minute, there would have to be fewer than one healthy load balancer host for 30 minutes (or 60 minutes) before an alert triggered. If the polling interval is set to every 2 minutes, you can get closer to the 45 min mark by setting the trigger interval to 20 or 24 (about 40 or 48 minutes).
    2. Make sure the alert is active for 45 mins before routing: you could set up an escalation chain where stage 1 is empty or a non-disruptive destination like a chat tool, stage 2 is set to email or text you, and the escalation interval is 45 minutes. The result of that setup would be this: when HealthyHostCount drops below 1, a warning alert is triggered and sent to stage 1. After 45 minutes, if the condition is still true (HealthyHostCount still less than 1), the alert escalates to stage 2 and is sent to you.
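
The arithmetic behind option 1 can be sketched in a few lines of Python. This is only a rough model, assuming the delay before an alert is the collection interval multiplied by the Alert Trigger Interval. The function name `minutes_until_alert` is made up for illustration and is not part of LogicMonitor.

```python
def minutes_until_alert(collection_interval_min: int, trigger_interval: int) -> int:
    """Approximate minutes a threshold must stay breached before an alert fires.

    Assumption: the delay is modeled as (consecutive polls) x (minutes per poll).
    """
    return collection_interval_min * trigger_interval


# Combinations mentioned in the reply: (collection interval in minutes, trigger interval)
for collection, trigger in [(1, 30), (1, 60), (2, 20), (2, 24)]:
    delay = minutes_until_alert(collection, trigger)
    print(f"{collection}-min polling x {trigger} polls = about {delay} min")
```

Under this model, with 2-minute polling, trigger intervals of 20 and 24 bracket the 45-minute target at roughly 40 and 48 minutes.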