Can someone explain the CPU 5 minute load average thing to me?

Question

We have a server currently alerting us because the 5MinLoadPerCore field is > 1. I'm trying to understand why that is.

I found this page that says if that number is > 1, it means there are things queued up waiting for the CPU and there's a backlog: https://www.logicmonitor.com/blog/what-the-heck-is-cpu-load-on-a-linux-machine-and-why-do-i-care

However, the server in question has the CPUs currently running at around 60%. I would think that if it were backed up, it should be cranking at 99% trying to catch up. I would think the 5MinLoadPerCore alert and a CPU usage percentage alert would come as a pair, but they don't.

I'm just trying to figure out if there's anything that can be done when we get the 5-minute load alerts, or if they're more informational and can be ignored. They only come in as a Warning anyway, so if it's just informational, then it's just noise, and maybe we'll just turn them off.

Just looking for other opinions. ;)

Thanks.

mike_moniz · Answer

The 5-minute load average (along with the 1-minute and 15-minute ones) is a Linux/Unix concept rather than something specific to LogicMonitor. There are various pages online that explain it, such as https://www.scoutapm.com/understanding-load-averages/. LogicMonitor also has an old page about it at https://www.logicmonitor.com/blog/what-the-heck-is-cpu-load-on-a-linux-machine-and-why-do-i-care.

joe_williams · Answer

The Linux load average represents the average number of processes either running on the CPU or waiting for resources (CPU or I/O). These averages are given as three numbers: 1-minute, 5-minute, and 15-minute averages.
A general rule of thumb is that your load average should not exceed the total number of CPU cores in your system. For example, if your server has 8 cores, a load average of 8 means all cores are fully utilized, and anything higher indicates tasks are queuing.
Short bursts above this limit, reflected in the 1-minute average, can be normal depending on workload. However, sustained overloads in the 5- or 15-minute averages may point to performance bottlenecks.
To find the number of CPU cores (logical CPUs), you can use:
grep -c '^processor' /proc/cpuinfo
Or you can run nproc, or use something like top or htop.
Keep in mind that high load averages can also result from I/O bottlenecks, not just CPU saturation.
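
As a rough illustration of the per-core rule of thumb above, here is a minimal Python sketch. The function names `load_per_core` and `five_minute_load_per_core` are hypothetical and chosen for illustration only; this is not necessarily how LogicMonitor computes 5MinLoadPerCore.

```python
import os


def load_per_core(load_avg: float, cores: int) -> float:
    """Divide a load average by the number of logical CPUs."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_avg / cores


def five_minute_load_per_core() -> float:
    """Read the current 5-minute load average and normalize it per CPU.

    os.getloadavg() returns the 1-, 5-, and 15-minute averages and is
    only available on Unix-like systems.
    """
    _one, five, _fifteen = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return load_per_core(five, cores)
```

Calling `five_minute_load_per_core()` on the server gives a value you can compare against 1.0. By this rule of thumb, a load of 12 on 8 cores works out to 1.5 per core, meaning roughly four tasks are waiting at any given moment.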

mike_rodrigues · Answer

I would fire up `iotop` and see what processes are using lots of IO when the load average is high.
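
If `iotop` isn't installed, a quick way to check whether I/O waits are inflating the load is to count process states under /proc. On Linux, the load average includes tasks in uninterruptible sleep (state D, usually waiting on disk or network storage) as well as runnable tasks (state R), which is how load can exceed 1 per core while CPU usage sits around 60%. The sketch below is an approximation under stated assumptions: the function names are hypothetical, and it looks only at processes rather than individual threads.

```python
import os
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Return the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    after_comm = stat_line[stat_line.rfind(")") + 1:]
    return after_comm.split()[0]


def count_states(proc_dir: str = "/proc") -> Counter:
    """Count processes by state (R = runnable, D = uninterruptible sleep, ...)."""
    counts = Counter()
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/cpuinfo
        try:
            with open(os.path.join(proc_dir, entry, "stat"), errors="replace") as f:
                counts[parse_state(f.read())] += 1
        except (OSError, IndexError):
            continue  # the process exited while we were reading it
    return counts
```

Running `count_states()` a few times while the alert is firing and seeing a consistently high `D` count would suggest the load is coming from I/O waits rather than CPU saturation.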
