Forum Discussion

Kelemvor
Expert
27 days ago

What do you alert on for CPU on Linux?

Hi,

For Windows, we use the standard CPU Percent to alert when a server is running hot.  We do the same for Linux, but we also have those MinLoadPerCore alerts that we get all the time on various machines.

When we get the MinLoad alerts, we look at CPU and usually find that it's not running especially high, so we ignore them. On some noisy machines we just keep raising the threshold from 1 to 1.2 to 1.5 and so on until the alerts stop. That seems kind of pointless, and I'm leaning towards turning off alerting on the MinLoad datapoints completely.

So my main question is, what do you all alert on?  Do you find the MinLoadPerCore alerts to be valuable?  When you get one, do you take steps to up the CPU count on those machines even if the CPU usage isn't super high?

Just looking to see what everyone else does.

Thanks

1 Reply

  • I use both, but I do at least try to focus more on MinLoadPerCore since it's more indicative of the work the server is doing than raw CPU usage alone. That being said, it does tend to require individual tuning depending on what that server does, and I don't have a lot of Linux systems to monitor either. If you're not familiar with what load is in Linux, I suggest reading up on it. It's not exclusively CPU-related: on Linux, load also counts tasks stuck in uninterruptible sleep (typically waiting on disk I/O), so things like memory pressure and disk latency can push it up even when CPU % looks fine.

    I would generally throw together a dashboard with CPU %, load and such, and review the data for the past week or even month to get a better understanding of what is normal for each server. Then set thresholds based on that. I've had some web servers hit a 5-minute load of 30-40 (without CPU % getting much higher), and I would want to know about it. But if you wanted to set a baseline of something like 1.5-2 across multiple servers, that may help in your case.
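
To make the "review the history, then set thresholds" idea concrete, here is a minimal Python sketch. It assumes you can export past per-core load samples from your monitoring tool as a plain list of numbers. The function names, the 95th-percentile choice, and the 1.25 headroom factor are illustrative assumptions, not anything the monitoring product defines.

```python
import os
import statistics


def load_per_core(load, cores):
    """Normalize a load average by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return load / cores


def suggest_threshold(samples, percentile=95, headroom=1.25):
    """Suggest an alert threshold from historical per-core load samples.

    Assumption: alert when load exceeds the chosen percentile of the
    history by a headroom factor. Both values are arbitrary defaults
    meant to be tuned per server.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cut_points = statistics.quantiles(samples, n=100, method="inclusive")
    return cut_points[percentile - 1] * headroom


def current_load_per_core():
    """Read the 5-minute load average on a Unix host and normalize it."""
    _one_min, five_min, _fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    return load_per_core(five_min, cores)
```

For example, a 5-minute load of 8.0 on a 4-core machine gives `load_per_core(8.0, 4) == 2.0`. Feeding a week or a month of such values into `suggest_threshold` gives a per-server starting point instead of nudging the threshold up by hand after every alert.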