Forum Discussion

Kelemvor
3 months ago

Does anyone monitor their iDracs with LM? Tons of duplicate alerts

Hi,

I just added some of our iDracs into LM.  While the monitoring seems to be working fine, the alerting system seems a bit wacky.

I have a server with a bad CMOS battery.  This causes me to get 4 alerts from LM for the iDrac:

The 3rd one is for the battery.  The other three are "System" alerts that seemingly fire any time anything has a problem.  So it looks like any time something has a problem, I'll get 4 alerts.  I don't want that.  ;)

I'm wondering if I should just disable all the "System" and "Chassis" alerts so I just get the alerts for the actual component that's having the problem.

Anyone else run into this?

Thanks.

  • I'm not sure about this specific situation, but I would generally rather get too many simultaneous alerts for the same thing than take a chance on missing something.

    I would suggest attempting to 100% verify that it's not possible for a system or chassis alert to occur without having a more specific alert also occur. It seems kinda safer to do the opposite (only have system) but then you don't get specifics in monitoring.

  • This is normal for a lot of monitoring of this type. Generally, you would tune down all but the more "global" thresholds. Think ESXi monitoring: something specific, like the CMOS battery, triggers a red, but then it triggers the overall board health as well, and that's another red.

    The issue then is the concern over missing something that triggers a minor alert but wouldn't trigger an overall one.

    So we leave it as is in this case.
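
One way to do the "verify before disabling" check suggested above is to export your alert history and look for System or Chassis alerts that fired without any component alert on the same device around the same time. This is only a rough sketch: the record fields (`device`, `source`, `time`), the `ROLLUP_PREFIXES` names, and the 10-minute window are all assumptions for illustration, not anything from LM's API or export format.

```python
from datetime import datetime, timedelta

# Hypothetical name prefixes for the "rollup" alerts; adjust to your own naming.
ROLLUP_PREFIXES = ("System", "Chassis")


def is_rollup(alert):
    """True if the alert looks like an overall System/Chassis status alert."""
    return alert["source"].startswith(ROLLUP_PREFIXES)


def orphan_rollups(alerts, window=timedelta(minutes=10)):
    """Return rollup alerts with no component alert on the same device within `window`."""
    components = [a for a in alerts if not is_rollup(a)]
    orphans = []
    for a in alerts:
        if not is_rollup(a):
            continue
        matched = any(
            c["device"] == a["device"] and abs(c["time"] - a["time"]) <= window
            for c in components
        )
        if not matched:
            orphans.append(a)
    return orphans


# Made-up sample data, just to show the shape of the input.
t0 = datetime(2024, 1, 1, 12, 0)
sample_alerts = [
    {"device": "idrac-01", "source": "System Status", "time": t0},
    {"device": "idrac-01", "source": "Battery", "time": t0 + timedelta(minutes=1)},
    {"device": "idrac-02", "source": "Chassis Status", "time": t0},
]
result = orphan_rollups(sample_alerts)
for alert in result:
    print(alert["device"], alert["source"])
```

If this ever returns anything against real history, the System/Chassis alerts are catching something the component alerts miss, and disabling them would not be safe.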