Recent Discussions
Issue in auto refreshing EC2 properties
I’m encountering an issue with my Windows ASG group EC2 instances. Whenever a new instance is added, certain properties from a dynamic group, most notably “wmi.user” and “wmi.pass”, are applied. However, by the time the instance is registered in LogicMonitor, WMI isn’t immediately available because some automations are still configuring the WMI credentials on the host. A few minutes later, WMI starts working on the host, and I can successfully test it using wbemtest from the local collector. However, the LogicMonitor portal still shows that WMI is not responding.

Interestingly, when I use the collector’s debug console and explicitly specify the WMI credentials, it pulls the information successfully. But if I don’t specify the credentials manually, it fails to work. The only way to resolve this is by manually running “refresh properties,” after which WMI starts working without making any changes. I’m trying to figure out if there’s a way to automatically force a properties refresh every 15 minutes to ensure everything works as expected without manual intervention.

Naveenkumar | 2 days ago | Neophyte | 3 Views | 0 likes | 0 Comments

User defined "host dead" status
There are two ideas that I need help maturing before talking to LM about them. Both have to do with how LM uses server-side logic to declare a device dead. We need the ability to designate what metric declares a device as dead/undead and when.

What

We have several customers who have devices, usually UPSes, at remote sites, all connected to a Meraki switch. The collector is not at the remote site, but connects over a VPN tunnel, which may be torn down due to inactivity or could be flaky for any other reason. When the VPN tunnel goes down, the devices alert that they have gone down. We have added monitoring to the tunnel and also get alerted when it goes down. However, we'd like to prevent the host down alerts when the only problem is that the VPN tunnel is down. RCA (or recently renamed DAM?) would likely solve this, but defining that mapping manually or through a topologysource is not scalable (plus visibility into the RCA logic has never been good).

Luckily, Meraki has an API where we can query the status of devices connected to the switch. During a tunnel outage, this API data shows that the device is still connected to the switch and online. Since it's a UPS, that's sufficient. We've built the datasource required to monitor the devices via the Meraki API. However, since it's a scripted datasource, it doesn't reset the idleInterval. (Insert link here to a really good training or support doc explaining how idleInterval works.) Since none of the qualifying datasources are working on the UPS during the VPN outage, the idleInterval eventually climbs high enough to trigger a host down alert. When the host is declared down, other alerts, like the alerts from this new Meraki Client Device Status DS, are suppressed.

How can this be remedied?

So, we need the supported and documented ability to use the successful execution of a collection script to reset the idleInterval. I know this is possible today as I've seen it in several of LM's modules.
However, I've never seen official documentation on how to do it. LM's probably worried someone will add it to all their scripts, which wouldn't be the right thing to do.

When

I know I'm not the only one. I need control over the server-side logic that determines when the idleInterval declares a device dead. In the example above, we get a slew of host down alerts when the VPN tunnel goes down. However, usually within a few minutes, the VPN gets reestablished, the collector reestablishes connectivity to the device, and the idleInterval resets, thus clearing the alerts.

With a normal datapoint, I'd just lengthen the alert trigger interval for the idleInterval datapoint. This would mean that the device would have to be down for 15 minutes, 20 minutes, or however long I want before generating the alert. What's great is that now we can do that on the group level, so I can target these devices specifically and not alert on them unless they've been down for a truly unacceptable amount of time (i.e. not just a VPN going down and coming right back up).

However, the idleInterval datapoint is an odd one. Two things happen. One happens when you surpass the threshold defined on the datapoint. I can't remember what the default is, but in my portal, that's > 300, or 5 minutes. At 6 minutes, server-side logic, which has been inspecting the idleInterval, decides that the device is down, which has implications for suppressing other alerts on the device. As far as I can tell, lengthening the alert trigger interval on the idleInterval datapoint has no effect if the window would exceed the 6 minutes that the server-side logic uses to declare the device down.

What do we need?

We need the ability to set the amount of time that the server-side logic uses to declare the device down. We need to be able to set that for some devices and not others. So we need to be able to set it globally, on the group level, and on the device level.
Preferably this could be set in the alert trigger interval on the idleInterval datapoint, since this mechanism already exists globally and on the group and device levels. Since that could be a confusing way of defining it (it's measured in poll cycles, not minutes/seconds), it could alternatively be done as a special property on the group(s)/device(s).

I'm interested in hearing your thoughts, even if you are an LMer.

Anonymous | 2 days ago | 93 Views | 8 likes | 6 Comments

Bug Report: Editing Alert Rules broken
To reproduce:
1. Create an alert rule at priority 1 that ONLY filters on a resource property: "change.me" = "true", sending to NoEscalationChain.
2. Edit it, and change the name of the filtered property to "i.am.changed", again with the value "true".
3. Observe that the UI message lets you know that the change has succeeded.

Expected behaviour: Editing the Alert Rule again, the property name should be "i.am.changed".

Actual behaviour: The property name is still "change.me".

David_Bond | 2 days ago | Professor | 9 Views | 0 likes | 0 Comments

Dynamic Dashboard Filters for Text Widget
My apologies if this has been covered already, but I have been searching the forum and haven't seen this topic. I am building a text widget that performs an API call to display data from another system. This is important as I am attempting to put all information for a facility into a single pane of glass. Is there a way to pass the Filter value at the top of the dashboard to the text widget for me to use in my JavaScript call?

billbianco | 2 days ago | Neophyte | 16 Views | 0 likes | 0 Comments

API v3 Python Patch on user 403 forbidden
I have some old Python code (that I didn't write) that uses v1 of the API and does a patch with a super minimal patch data block, just the value needed. Personally, I have Groovy code that does some patching using the retrieved user in a Map object, and I make a minimal map with just the stuff v3 requires that I set (which v1 didn't), like roles and password, etc., and I managed to get code working with Groovy v3.

But I'm in a circumstance where I have to use Python and patch on a user, and for the life of me, I keep getting a 403 forbidden error. I've found several examples online, and basically I believe I've got everything set up correctly; the code is mostly similar to the old working v1 code except it has the changes that v3 needs. But I get a 403 error (forbidden), even though I know the API token has rights to update the user (it is still attached to a user with an administrator role, but I don't think that's the issue; I have Groovy code using the same API token).

I hate to ask people to look at code, but is there anything obviously wrong here that I'm missing for Python and v3?
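For readers following along, here is a minimal standalone sketch of the signing steps that the snippet in this post performs, pulled into one function so it can be checked without sending a request. It mirrors the post's own steps and does not diagnose the 403. The function name `lmv1_auth_header`, the `epoch_ms` parameter, and the example ID, key, and path are assumptions made for illustration only.

```python
import base64
import hashlib
import hmac
import time


def lmv1_auth_header(access_id, access_key, http_verb, resource_path,
                     body="", epoch_ms=None):
    """Build an LMv1 Authorization header value, following the post's steps."""
    if epoch_ms is None:
        epoch_ms = int(time.time() * 1000)
    epoch = str(epoch_ms)
    # Signed string: verb + timestamp + request body + resource path.
    # As in the post, the query string is appended to the URL only and
    # is not part of the signed string.
    request_vars = http_verb + epoch + body + resource_path
    digest = hmac.new(access_key.encode(),
                      msg=request_vars.encode(),
                      digestmod=hashlib.sha256).hexdigest()
    # The hex digest is base64-encoded, again as the post does it.
    signature = base64.b64encode(digest.encode()).decode()
    return "LMv1 " + access_id + ":" + signature + ":" + epoch


# Hypothetical values; a fixed timestamp makes the result repeatable.
header = lmv1_auth_header("exampleId", "exampleKey", "PATCH",
                          "/setting/admins/42",
                          body='{"apionly": true}',
                          epoch_ms=1700000000000)
```

Passing a fixed `epoch_ms` makes it possible to compare the header produced by two implementations (for example a working Groovy one and a failing Python one) for the same inputs.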
    http_verb = 'PATCH'
    resource_path = '/setting/admins/' + str(user_id)
    patch_data = '{"roles":[{"name":"myrole"}],"email":"what@whatever.what","username":"blah","password":"meh","apionly": true}'
    queryParam = '?changePassword=false&validationOnly=false'
    url = 'https://' + Company + '.logicmonitor.com/santaba/rest' + resource_path + queryParam
    epoch = str(int(time.time() * 1000))
    requestVars = http_verb + epoch + patch_data + resource_path
    hmac1 = hmac.new(AccessKey.encode(), msg=requestVars.encode(), digestmod=hashlib.sha256).hexdigest()
    signature = base64.b64encode(hmac1.encode())
    auth = 'LMv1 ' + AccessId + ':' + signature.decode() + ':' + epoch
    headers = {'Content-Type': 'application/json', 'Authorization': auth, 'X-Version': '3'}
    response_patch = lm_session.patch(url, data=patch_data, headers=headers)
    return response_patch

Thanks!

Solved | Lewis_Beard | 2 days ago | Expert | 27 Views | 0 likes | 2 Comments

Creating a Custom Module based on OIDs?
Hi,

We have some IBM MQ devices that we want to monitor. We found some MQ items in the LM Repository, but those are for monitoring the MQ application. We also need to monitor the device for CPU and Memory. I was given the following information and told we should see about monitoring them. I'm not sure if I can modify something to get started or if we'd have to create something from scratch. I'm not familiar with creating anything like this and am hoping someone can point me to something similar I can use and modify, or help me figure out how to create this from scratch.

Thanks!

Below are the numeric OIDs and their respective full details. They show the CPU usage and the load average over the last three intervals: 1 min, 5 min, and 15 min.

.1.3.6.1.4.1.14685.4.1.521.1.0 = Gauge32: 2 %
.1.3.6.1.4.1.14685.4.1.521.2.0 = STRING: 0.14 %
.1.3.6.1.4.1.14685.4.1.521.3.0 = STRING: 0.31 %
.1.3.6.1.4.1.14685.4.1.521.4.0 = STRING: 0.34 %

IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusSystemCpuStatusCpuUsage.0 = Gauge32: 3 %
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusSystemCpuStatusCpuLoadAvg1.0 = STRING: 0.27 %
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusSystemCpuStatusCpuLoadAvg5.0 = STRING: 0.34 %
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusSystemCpuStatusCpuLoadAvg15.0 = STRING: 0.34 %

And below are the Filesystem Monitors. Can we set alerts if the Free is less than 50% of Total for all three different readings?
- Total Encrypted / Free Encrypted
- Total Temp / Free Temp
- Total Internal / Free Internal

.1.3.6.1.4.1.14685.4.1.29.1.0 = Gauge32: 23223 Mbytes
.1.3.6.1.4.1.14685.4.1.29.2.0 = Gauge32: 29857 Mbytes
.1.3.6.1.4.1.14685.4.1.29.5.0 = Gauge32: 4036 Mbytes
.1.3.6.1.4.1.14685.4.1.29.6.0 = Gauge32: 4096 Mbytes
.1.3.6.1.4.1.14685.4.1.29.7.0 = Gauge32: 3071 Mbytes
.1.3.6.1.4.1.14685.4.1.29.8.0 = Gauge32: 3072 Mbytes

IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusFilesystemStatusFreeEncrypted.0 = Gauge32: 23223 Mbytes
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusFilesystemStatusTotalEncrypted.0 = Gauge32: 29857 Mbytes
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusFilesystemStatusFreeTemporary.0 = Gauge32: 4036 Mbytes
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusFilesystemStatusTotalTemporary.0 = Gauge32: 4096 Mbytes
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusFilesystemStatusFreeInternal.0 = Gauge32: 3071 Mbytes
IBM-MQ-APPLIANCE-STATUS-MIB::mqStatusFilesystemStatusTotalInternal.0 = Gauge32: 3072 Mbytes

Kelemvor | 3 days ago | Expert | 25 Views | 0 likes | 3 Comments

Any way to automate tasks via alerts?
Hi,

I know LM doesn't support taking any action when there's an alert. However, I'm wondering if anyone has any neat ideas on how to accomplish something by way of some other automation program. Maybe Power Automate?

Here's what I'm thinking. I have a module that monitors a service to see if it's running or not. If it's not, it generates an alert. This seems like the most basic thing to automate because all I'd want to do is run the start-service command to start it back up. So, I'm wondering if I can have the alert send an email to a certain address. The automation could then watch that address for a certain email to arrive, parse out the server name and service name, and put them into a start-service -computername xxx -name yyy.

Has anyone looked into doing anything like that, and did you have any success?

Thanks.

72 Views | 0 likes | 9 Comments

dell hosts and idrac combined
tie idrac monitoring to its host, not a separate device | LogicMonitor - 5620

saw this old post and have a similar ask. i'm trying to figure out how to get the idrac alerts combined with the host. from what i can tell, dell OME monitoring is simply accepting idrac SNMP traps. i can probably copy those same SNMP traps to logicmonitor collectors.

for tying it together with the host, i'm wondering if there is a way to add the idrac IP as an additional IP to the host itself? will that mess up other things logicmonitor is trying to do? will that provide enough info for logicmonitor to tie an idrac SNMP trap for hardware failures to the host itself?

gdavid16 | 3 days ago | Neophyte | 14 Views | 0 likes | 0 Comments

Reporting on Alerts and SDTs
Hi all,

I am having an issue trying to generate a report of alerts that are generated outside of SDT. I have found that the 'In SDT' field is only populated while the alert is outstanding, so any alerts that are cleared do not have the 'In SDT' field set to Y. As a result, I am finding it impossible to screen out alerts generated during SDT. My assumption (incorrect) was that 'In SDT' would do this.

My requirement is to be able to generate trend data on alerts that are relevant to particular teams/escalation chains, to say (for example) the Linux team had 10 Critical alerts last month, vs 50 the previous month. But unless I can screen out the alerts that are generated during SDT, these may all have been expected and require no action, so the data is meaningless.

I was pointed towards the Alert dashboard, as this has fields for alert suppression type, but this does not seem to be populated either, or is similarly cleared when the alert clears. Has anyone else found an appropriate way of reporting on Alerts that screens these out?

Tim_OShea | 3 days ago | Neophyte | 20 Views | 0 likes | 2 Comments
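
One way to approach the screening described above, if the alert start times and the SDT windows can both be exported, is to compare them outside the portal. The following is a rough sketch of that comparison only. The `Alert` and `SdtWindow` records, their field names, and the sample values are all hypothetical and do not reflect any LogicMonitor export format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    resource: str   # hypothetical: resource the alert fired on
    severity: str   # e.g. "critical", "error", "warning"
    started: int    # epoch seconds when the alert began


@dataclass(frozen=True)
class SdtWindow:
    resource: str   # hypothetical: resource the SDT applied to
    start: int      # epoch seconds, inclusive
    end: int        # epoch seconds, exclusive


def began_in_sdt(alert, windows):
    """True if the alert started inside any SDT window for its resource."""
    return any(w.resource == alert.resource and w.start <= alert.started < w.end
               for w in windows)


def count_outside_sdt(alerts, windows):
    """Count alerts per severity, skipping those that began during SDT."""
    return Counter(a.severity for a in alerts if not began_in_sdt(a, windows))


# Made-up sample data: one of the three alerts began during an SDT window.
alerts = [
    Alert("linux01", "critical", 1000),
    Alert("linux01", "critical", 5000),
    Alert("linux02", "critical", 1000),
]
windows = [SdtWindow("linux01", 900, 2000)]
counts = count_outside_sdt(alerts, windows)
```

This sketch matches SDT windows by resource only. Group-level or datasource-level SDTs would need extra matching rules, and an alert that began before an SDT window but continued into it is still counted.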