Monitor Linux Processes (via SSH)

Question

Hello,

In our current monitoring tool we monitor Linux process profiles using a regular expression that matches one or more processes.
For example, we have a profile that looks for processes that contain /.*OpswiseAgent.*/ in their cmdline path.

Once those are running, the probe picks them up automatically and monitors their state (without relying on the PID, because that might change).
In LM we would also not rely on the PID as the wildvalue, since it might change. There can also be more than one process (with different PIDs) running with the exact same cmdline, so each of those needs to be picked up as a separate instance.
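For illustration, the regex matching described above could be sketched as follows. This is a minimal sketch, assuming the process list comes from something like `ps -eo pid,args` run over SSH; the function name `match_processes`, the sample output, and the install paths are hypothetical and not taken from any existing probe or DataSource.

```python
import re

def match_processes(ps_output, pattern):
    """Return (pid, cmdline) pairs whose command line matches the regex."""
    regex = re.compile(pattern)
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        pid, cmdline = int(parts[0]), parts[1]
        if regex.search(cmdline):
            matches.append((pid, cmdline))
    return matches

# Hypothetical output of `ps -eo pid,args` (paths are made up).
SAMPLE = """\
  PID COMMAND
  101 /opt/agent/bin/OpswiseAgent --conf a
  202 /usr/sbin/sshd -D
  303 /opt/agent/bin/OpswiseAgent --conf a
"""

for pid, cmdline in match_processes(SAMPLE, r".*OpswiseAgent.*"):
    print(pid, cmdline)
```

Note that the two matches share an identical cmdline, so the PID is the only field that tells them apart.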

I'm just unsure how to build a working solution with all of this in mind (unique wildvalue and wildalias).
Can anyone assist here?

Regards,

Answer

Have you looked at mine? https://github.com/sweenig/lmcommunity/tree/master/ProcessMonitoring/Linux_SSH_Processes_Select

Duplicate processes with the same command line and the same name will be a problem if you're ignoring the PID. If you were doing this manually, how would you differentiate between the two across sessions? Say you logged in once and saw process A and process B. If you logged in again one minute later and saw the same list of processes, how would you know which one you had previously called A and which one you had previously called B?

The answer is, of course, to run them in separate containers, but that's a different discussion.

vitor_santos · Answer

In reply to Stuart Weenig's answer above:


Yeah, I looked into your example, but it uses the PID as the wildvalue.
That's exactly the tricky part of this, because today our probe is able to catch those processes (even with the same name) and alarm if they get stopped. I just don't know how to replicate this in LM.
I was wondering if anyone might come up with a workaround that I'm not seeing.
I've tried to come up with something that takes the cmdline and then enumerates the duplicates as cmdline#1, cmdline#2, etc. But if there are 3 instances of the process running now and only 2 later, the third instance will raise an alarm (because we don't want to delete it, since we want the historical data).
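The enumeration idea and its weakness can be sketched as follows. This is a minimal sketch, assuming duplicates are numbered by ascending PID; the function name `enumerate_instances` and the `agent` cmdline are hypothetical, and the `#N` suffix simply mirrors the notation used above.

```python
from collections import defaultdict

def enumerate_instances(processes):
    """Assign cmdline#N wildvalues, numbering duplicates by ascending PID."""
    by_cmdline = defaultdict(list)
    for pid, cmdline in processes:
        by_cmdline[cmdline].append(pid)
    instances = {}
    for cmdline, pids in by_cmdline.items():
        for index, pid in enumerate(sorted(pids), start=1):
            instances[f"{cmdline}#{index}"] = pid
    return instances

# Three copies running, then the copy with the lowest PID exits.
before = enumerate_instances([(101, "agent"), (202, "agent"), (303, "agent")])
after = enumerate_instances([(202, "agent"), (303, "agent")])
print(before)
print(after)
```

After the first copy exits, `agent#3` is the instance that disappears, and `agent#1` now points at a different process than before, so the numbered instances do not track individual processes between polls.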

I guess our only solution would be asking the client how many processes should be running with that cmdline and alerting if the count is lower than expected.
But that would be a downgrade from the monitoring we provide for them today.

Answer

The duplicates are the only real problem, because you can set the wildvalue to anything. It doesn't have to be numeric. Just avoid or strip out special characters.
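Stripping special characters out of a cmdline could look something like this. It is a minimal sketch: the function name `to_wildvalue` is hypothetical, and the set of characters kept here (letters, digits, underscore, dot, hyphen) is an assumption chosen to be conservative, not a list taken from the LM documentation.

```python
import re

def to_wildvalue(cmdline):
    """Collapse each run of characters outside [A-Za-z0-9_.-] into one underscore."""
    # Assumption: only these characters are treated as safe in a wildvalue.
    return re.sub(r"[^A-Za-z0-9_.-]+", "_", cmdline).strip("_")

print(to_wildvalue("/opt/agent/bin/OpswiseAgent --conf a"))
```

Two processes with the same cmdline still produce the same wildvalue, so this only addresses the character problem, not the duplicate problem.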

The real problem is keeping the duplicates straight between polls without something that uniquely identifies each one.

You could consider building a datasource that counts processes by name. Each name would be an instance, and you would count the number of running processes with that name. Either set a static threshold that alerts when a process that should have multiple copies has fewer, or set a property detailing how many copies of the same process should be running and compare that to how many actually are.
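The counting approach could be sketched as follows. This is a minimal sketch, assuming the process list has already been collected as (pid, cmdline) pairs; the function names and the `expected` mapping (standing in for a property that states how many copies should run) are hypothetical.

```python
from collections import Counter

def count_by_cmdline(processes):
    """Count running processes per command line, ignoring the PID."""
    return Counter(cmdline for _pid, cmdline in processes)

def shortfall(counts, expected):
    """Return {cmdline: (running, wanted)} where fewer copies run than expected."""
    return {
        name: (counts.get(name, 0), wanted)
        for name, wanted in expected.items()
        if counts.get(name, 0) < wanted
    }

running = count_by_cmdline([(101, "agent"), (202, "agent"), (303, "sshd")])
expected = {"agent": 3, "sshd": 1, "cron": 1}  # hypothetical expected counts
print(shortfall(running, expected))
```

Because each cmdline is a single instance with a count, PIDs can change freely between polls without creating or orphaning instances.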

Answer

This could be a situation where dynamic thresholds really help as well. Let LM learn how many processes of a given name are running and alert you when that number changes.

vitor_santos · Answer

Yeah, I really think that's the most reliable option to pursue.

However, I've coded a DS just to see how it behaves (WinProcessStats_Responsiveness).
Just in case you want to have a look. The only problem I'm having with that DS is that Active Discovery doesn't delete the instances, so they stay there alarming if the process is no longer running (which is kind of what we want, but not perpetually).

This is why we're really leaning towards just expecting a number of processes and alarming if the count is lower than that.
