How do you monitor Running Services on Linux boxes?
Hi,

There seem to be a few different options for monitoring services on Linux machines. We had been using one that relies on SNMP, but it has been giving us trouble, with some machines showing No Data every so often. We've also seen the check start ignoring services when they stop and then remove the instance. We recently started trying an SSH-based check, which seems to work. However, it depends on setting a property on every machine or group to tell it which services to monitor. I'm curious which module other people use to monitor services like this with the most reliable results.

Thanks.

70 Views, 3 likes, 4 Comments

LinuxNewProcesses DataSource -- Auto discovery and key off of HOST-RESOURCES-MIB::hrSWRunName
Hello all!

I just wanted to share my edits. I never could get LinuxNewProcesses to work for my needs, but we really wanted it to also have auto discovery and automatically add a list of toolsets that we have deployed across the board. I did this LONG ago and my wildvalue was the PID, but that's dangerous: I ended up creating thousands of entries in the LM database because my processes (thousands of them) were always changing. This takes a different approach and keys off of the process name.

#1 You just need to have a property defined with a comma-separated list. These names need to be from "HOST-RESOURCES-MIB::hrSWRunName".

#2 My polling is every minute, but I don't alert unless it's been down for an hour. For my scenario, I do this on purpose because some of my applications run for about 5 minutes and then aren't kicked off again for another 10, so adjust as needed :)

The status is under a security review right now. I'll post the lmLocator if it makes it! Otherwise here's the autodiscovery. The collection script won't work and you'll have to modify it.

    import com.santaba.agent.groovyapi.snmp.Snmp;

    // hrSWRunName column of HOST-RESOURCES-MIB
    def OID_NAME = ".1.3.6.1.2.1.25.4.2.1.2";
    def host = hostProps.get("system.hostname");
    // Comma-separated list of process names from the "linux.services" property
    def services = hostProps.get("linux.services").split(',');

    Map<String, String> result = Snmp.walkAsMap(host, OID_NAME, null)
    result.forEach({ index, value ->
        for (service in services) {
            // Each list entry is treated as a regex that must match the whole process name
            if (value ==~ /${service}/) {
                // hrSWRunPath entry for the same table index
                def CMD_OID = ".1.3.6.1.2.1.25.4.2.1.4." + index;
                def service_cmd = Snmp.get(host, CMD_OID);
                def desc = index + " | " + service_cmd;
                // wildvalue ## display name ## description
                out.println value + "##" + value + "##" + desc
            }
        }
    })

Script: Line 89: if ("${name}" == "${processPath}") {

145 Views, 19 likes, 3 Comments

SQL Server Services Status
Hi,

We have a table widget set up to show SQL Server Service Status. The columns seem to be:

RunningStatus
State
Status

What is the difference between them? All of them show a '1' at the moment. Also, can you manipulate the values to show 'Running', 'Stopped', 'Disabled'? (I'm assuming these map to 1, 2, 3 respectively.)

Thanks

Solved
182 Views, 4 likes, 2 Comments

Datasource to monitor Windows Services/Processes automatically?
Hello,

We recently cloned 2 LogicMonitor out-of-the-box datasources (names: WinService- and WinProcessStats-) in order to enable the 'Active Discovery' feature on them. We did this because we need to discover services/processes automatically, since we don't have an exact list of which services/processes we should monitor (due to the number of clients [+100] and the different services/solutions across them).

After enabling this, it works fine and does what we expect (discovers all the services/processes running on each box). We further added some filters to the Active Discovery for the services in order to exclude common 'noisy' services and grab only the ones set to start automatically with the system.

Our problem arises when these 2 datasources start to impact collector performance. Due to the huge number of WMI queries, CPU usage sits at almost 100% all the time, which degrades collector performance and data collection (resulting in request timeouts and full WMI queues).

We also thought about creating 2 datasources (services/processes) for each client (with filters to grab the critical/wanted processes/services for the client in question), but that's a nightmare (especially when you have clients installing applications without any notice and expecting us to automatically grab and monitor those).

Example of one of our scenarios (one of our clients):

- Collector is a Windows VM (VMware) with 8 GB of RAM and 4 allocated virtual processors (host processor is an Intel Xeon E5-2698 v3 @ 2.30 GHz)
- Currently, it monitors 78 Windows servers (not including the collector), and those 2 datasources are creating 12,700 instances (4513 services | 8187 processes); examples below

This results in approx. 15 requests per second
This results in approx. 45 requests per second

According to the collector capacity document (ref. Medium Collector), we are below the limits (for WMI); however, those 2 datasources are contributing A LOT to filling the queues. We're seeing errors on a regular basis; example below

To sum this up, we are looking for another way of doing the same thing without consuming so many resources on the collector end (due to the number of simultaneous WMI queries). Not sure if that's possible, though. Has anyone had this need in the past and been able to come up with a different, less resource-intensive solution? We're struggling here mainly because we come from an agent-based solution, which didn't face this problem because the load was distributed across the individual agents (one per device).

Appreciate the help in advance!

Thanks,

1.3K Views, 13 likes, 37 Comments
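As a rough back-of-the-envelope check on the request rates quoted in the post above, the sketch below divides an instance count by a polling interval. It assumes one WMI request per instance per poll. The 300 s and 180 s intervals, and the pairing of each interval with a datasource, are assumptions chosen only because they land near the quoted 15 and 45 requests per second; the post does not state the actual intervals. The function and constant names are hypothetical.

```python
def requests_per_second(instances: int, interval_seconds: float) -> float:
    """Average request rate, assuming one request per instance per poll."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return instances / interval_seconds


# Instance counts come from the post; the intervals are assumed, not stated there.
SERVICE_INSTANCES = 4513
PROCESS_INSTANCES = 8187
ASSUMED_SERVICE_INTERVAL_S = 300
ASSUMED_PROCESS_INTERVAL_S = 180

service_rate = requests_per_second(SERVICE_INSTANCES, ASSUMED_SERVICE_INTERVAL_S)
process_rate = requests_per_second(PROCESS_INSTANCES, ASSUMED_PROCESS_INTERVAL_S)

print(f"services:  {service_rate:.1f} req/s")
print(f"processes: {process_rate:.1f} req/s")
print(f"combined:  {service_rate + process_rate:.1f} req/s")
```

Under this simple model the average rate scales linearly, so halving the number of discovered instances or doubling the polling interval halves the load on the collector.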