Recent Discussions
Dell ECS System Level Statistics Data Sources
In version 3.6 of its ECS offering, Dell removed system-level statistics such as CPU, memory and network from the dashboard API, and Flux queries must now be used to retrieve that data. Below are two pairs of discovery and collection scripts, one for CPU and memory and one for network-level statistics, that use the Flux query API to retrieve the relevant metrics. Important note: this applies to all versions of the Dell EMC ECS solution above 3.6; all earlier versions are fully supported by the existing LogicMonitor out-of-the-box packages.

CPU and Memory Statistics: The following discovery and collection scripts retrieve the CPU and memory statistics from the Flux query API. I recommend keeping the collection frequency at 5 minutes.

Discovery Script:

/*******************************************************************************
 * Dell ECS Flux Query CPU and Memory Discovery Script
 ******************************************************************************/
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import groovy.json.JsonBuilder
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import com.santaba.agent.groovyapi.http.*;
import com.santaba.agent.util.Settings

hostname = hostProps.get("system.hostname")
user = hostProps.get("ecs.user")
pass = hostProps.get("ecs.pass")
collectorplatform = hostProps.get("system.collectorplatform")
debug = false
def success = false
def token = login()
// End Templines

if (token) {
    // Retrieve data for all nodes for CPU and Memory
    def encoded_instance_props_array = []
    // Use the flux call for getting the CPU to retrieve the node information.
    // Future enhancement: find a call for just the nodes rather than the metrics call.
    def CPUresponse = getNode(token)
    if (debug) println "CPU Response: " + CPUresponse
    // Work through the table to retrieve the node name and id to build out the instance level properties.
    // Internal note: used the methods in "Arista Campus PSU Collection Script"
    def CPUJson = new JsonSlurper().parseText(CPUresponse)
    if (debug) println "\n\n CPU Values: " + CPUJson.Series.Values[0]
    CPUJson.Series.Values[0].each { nodeEntry ->
        if (debug) println "In Table"
        if (debug) println "Node Data " + nodeEntry
        def nodeId = nodeEntry[9]
        def nodeName = nodeEntry[8]
        wildvalue = nodeId
        wildalias = nodeName
        description = "${nodeId}/${nodeName}"
        def instance_props = [
            "auto.node.id"  : nodeId,
            "auto.node.name": nodeName
        ]
        encoded_instance_props_array = instance_props.collect() { property, value ->
            URLEncoder.encode(property.toString()) + "=" + URLEncoder.encode(value.toString())
        }
        println "${wildvalue}##${wildalias}##${description}####${encoded_instance_props_array.join("&")}"
    }
} else if (debug) {
    println "Bad API response: ${response}"
}
return 0

def login() {
    if (debug) println "in login"
    // Fetch a new token using Basic authentication and return it
    if (debug) println "Checking provided ${user} creds at /login.json..."
    def userCredentials = "${user}:${pass}"
    def basicAuthStringEnc = new String(Base64.getEncoder().encode(userCredentials.getBytes()))
    def loginUrl = "https://${hostname}:4443/login.json".toURL()
    def loginConnection = loginUrl.openConnection()
    loginConnection.setRequestProperty("Authorization", "Basic " + basicAuthStringEnc)
    def loginResponseBody = loginConnection.getInputStream()?.text
    def loginResponseCode = loginConnection.getResponseCode()
    def loginResponseToken = loginConnection.getHeaderField("X-SDS-AUTH-TOKEN")
    if (debug) println loginResponseCode
    if (loginResponseCode == 200 && loginResponseToken) {
        if (debug) println "Retrieved token: ${loginResponseToken}"
        return loginResponseToken
    } else {
        println "STATUS CODE:\n${loginResponseCode}\n\nRESPONSE:\n${loginResponseBody}"
        println "Unable to fetch token with ${user} creds at /login.json"
    }
    println "Something unknown went wrong when logging in"
}

def getNode(token) {
    def slurper = new JsonSlurper()
    def dataUrl = "https://" + hostname + ":4443/flux/api/external/v2/query"
    if (debug) println "Trying to fetch data from ${dataUrl}"
    //def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"cpu\" and r.cpu == \"cpu-total\" and r._field == \"usage_idle\" and r.host == \"' + hostname + '\")'
    def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"cpu\" and r.cpu == \"cpu-total\" and r._field == \"usage_idle\")'
    def jsonBody = groovy.json.JsonOutput.toJson(["query": flux])
    if (debug) println "Raw JSON Body " + jsonBody
    if (debug) println "Json Body " + JsonOutput.prettyPrint(jsonBody) + " Type " + jsonBody.getClass()
    def dataHeader = ["X-SDS-AUTH-TOKEN": token, "Content-Type": "application/json", "Accept": "application/json"]
    if (debug) println("Sent Header: " + dataHeader)
    // Now we can retrieve the data.
    def httpClient = Client.open(hostname, 4443);
    httpClient.post(dataUrl, jsonBody, dataHeader);
    if (!(httpClient.getStatusCode() =~ /200/)) {
        println "Failed to retrieve data " + httpClient.getStatusCode()
        println "Header: " + httpClient.getHeader
        return (1)
    }
    String dataContent = httpClient.getResponseBody()
    if (debug) println "Status Code " + httpClient.getStatusCode()
    if (debug) println "Data in response Body " + dataContent
    //return slurper.parseText(dataContent)
    return dataContent
}

Collection Script:

/*******************************************************************************
 * Dell ECS Flux Query CPU and Memory
 ******************************************************************************/
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import groovy.json.JsonBuilder
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import com.santaba.agent.groovyapi.http.*;
import com.santaba.agent.util.Settings

hostname = hostProps.get("system.hostname")
user = hostProps.get("ecs.user")
pass = hostProps.get("ecs.pass")
collectorplatform = hostProps.get("system.collectorplatform")
debug = true
def success = false
def token = login()
// End Templines

if (token) {
    // Retrieve data for all nodes for CPU and Memory
    def encoded_instance_props_array = []
    def CPUresponse = getCPU(token)
    def MEMresponse = getMemory(token)
    if (debug) println "CPU Response: " + CPUresponse
    if (debug) println "Mem Response: " + MEMresponse
    // Work through the table to retrieve the node name and id to build out the instance level properties.
    // Internal note: used the methods in "Arista Campus PSU Collection Script"

    // Process CPU metrics
    def CPUJson = new JsonSlurper().parseText(CPUresponse)
    if (debug) println "\n\n CPU Values: " + CPUJson.Series.Values[0]
    CPUJson.Series.Values[0].each { nodeEntry ->
        if (debug) println "Node Data " + nodeEntry
        def idleCPU = Float.valueOf(nodeEntry[4])
        def usedCPU = 100 - idleCPU
        def nodeId = nodeEntry[9]
        def nodeName = nodeEntry[8]
        wildvalue = nodeId
        wildalias = nodeName
        description = "${nodeId}/${nodeName}"
        println "${wildvalue}.idle_cpu=${idleCPU}"
        println "${wildvalue}.used_cpu=${usedCPU}"
    }

    // Process memory metrics
    def MEMJson = new JsonSlurper().parseText(MEMresponse)
    if (debug) println "\n\n Mem Values: " + MEMJson.Series.Values[0]
    MEMJson.Series.Values[0].each { nodeEntry ->
        def fieldValue = nodeEntry[4]
        def fieldName = nodeEntry[5]
        def nodeId = nodeEntry[8]
        def nodeName = nodeEntry[7]
        wildvalue = nodeId
        wildalias = nodeName
        description = "${nodeId}/${nodeName}"
        println "${wildvalue}.${fieldName}=${fieldValue}"
    }
} else if (debug) {
    println "Bad API response: ${response}"
}
return 0

def login() {
    if (debug) println "in login"
    // Fetch a new token using Basic authentication and return it
    if (debug) println "Checking provided ${user} creds at /login.json..."
    def userCredentials = "${user}:${pass}"
    def basicAuthStringEnc = new String(Base64.getEncoder().encode(userCredentials.getBytes()))
    def loginUrl = "https://${hostname}:4443/login.json".toURL()
    def loginConnection = loginUrl.openConnection()
    loginConnection.setRequestProperty("Authorization", "Basic " + basicAuthStringEnc)
    def loginResponseBody = loginConnection.getInputStream()?.text
    def loginResponseCode = loginConnection.getResponseCode()
    def loginResponseToken = loginConnection.getHeaderField("X-SDS-AUTH-TOKEN")
    if (debug) println loginResponseCode
    if (loginResponseCode == 200 && loginResponseToken) {
        if (debug) println "Retrieved token: ${loginResponseToken}"
        return loginResponseToken
    } else {
        println "STATUS CODE:\n${loginResponseCode}\n\nRESPONSE:\n${loginResponseBody}"
        println "Unable to fetch token with ${user} creds at /login.json"
    }
    println "Something unknown went wrong when logging in"
}

def getCPU(token) {
    def slurper = new JsonSlurper()
    def dataUrl = "https://" + hostname + ":4443/flux/api/external/v2/query"
    if (debug) println "Trying to fetch data from ${dataUrl}"
    def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"cpu\" and r.cpu == \"cpu-total\" and r._field == \"usage_idle\")'
    def jsonBody = groovy.json.JsonOutput.toJson(["query": flux])
    if (debug) println "Raw JSON Body " + jsonBody
    if (debug) println "Json Body " + JsonOutput.prettyPrint(jsonBody) + " Type " + jsonBody.getClass()
    def dataHeader = ["X-SDS-AUTH-TOKEN": token, "Content-Type": "application/json", "Accept": "application/json"]
    if (debug) println("Sent Header: " + dataHeader)
    // Now we can retrieve the data.
    def httpClient = Client.open(hostname, 4443);
    httpClient.post(dataUrl, jsonBody, dataHeader);
    if (!(httpClient.getStatusCode() =~ /200/)) {
        println "Failed to retrieve data " + httpClient.getStatusCode()
        println "Header: " + httpClient.getHeader
        return (1)
    }
    String dataContent = httpClient.getResponseBody()
    if (debug) println "Status Code " + httpClient.getStatusCode()
    if (debug) println "Data in response Body " + dataContent
    return dataContent
}

def getMemory(token) {
    def slurper = new JsonSlurper()
    def dataUrl = "https://" + hostname + ":4443/flux/api/external/v2/query"
    if (debug) println "Trying to fetch data from ${dataUrl}"
    //def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"mem\" and r._field == \"available_percent\")'
    def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"mem\")'
    def jsonBody = groovy.json.JsonOutput.toJson(["query": flux])
    if (debug) println "Raw JSON Body " + jsonBody
    if (debug) println "Json Body " + JsonOutput.prettyPrint(jsonBody) + " Type " + jsonBody.getClass()
    def dataHeader = ["X-SDS-AUTH-TOKEN": token, "Content-Type": "application/json", "Accept": "application/json"]
    if (debug) println("Sent Header: " + dataHeader)
    // Now we can retrieve the data.
    def httpClient = Client.open(hostname, 4443);
    httpClient.post(dataUrl, jsonBody, dataHeader);
    if (!(httpClient.getStatusCode() =~ /200/)) {
        println "Failed to retrieve data " + httpClient.getStatusCode()
        println "Header: " + httpClient.getHeader
        return (1)
    }
    String dataContent = httpClient.getResponseBody()
    if (debug) println "Status Code " + httpClient.getStatusCode()
    if (debug) println "Data in response Body " + dataContent
    return dataContent
}

Networking Statistics: See https://community.logicmonitor.com/discussions/lm-exchange/dell-ecs-network-statistics-version-3-6-flux-query/19817 for the network statistics, as I ran out of space in this post.

Additional comments and notes: One of the biggest challenges in moving from the dashboard API to the Flux API was that I was initially receiving an HTTP 401. At first I thought the Flux query was the cause, but it turned out to be the saving of the token to a file, as done in the original data sources; once I removed that and authenticated the same way as my Python script, which worked without issue, the problem was resolved. I have an additional request for the latency statistics; I will share those in a separate post once done. Hope this helps.

SteveBamford, 30 days ago
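For reference, the parsing above assumes the Flux endpoint returns a Series/Values table in which, for the CPU query, column index 4 holds the metric value, index 8 the node name and index 9 the node id (the memory query is read with a different column order, so confirm the layout from the debug output on your own ECS). A minimal, runnable illustration of that assumed shape, with a hypothetical sample payload rather than a real ECS response, is:

import groovy.json.JsonSlurper

// Hypothetical response shape assumed by the scripts above; the sample values are
// invented for illustration, so verify the real column order with the debug output.
def sample = '''
{
  "Series": [
    {
      "Values": [
        ["2024-01-01T00:00:00Z", "2024-01-01T00:05:00Z", "2024-01-01T00:04:30Z",
         "placeholder", 93.7, "usage_idle", "cpu", "cpu-total",
         "ecs-node-1.example.local", "node-id-1"]
      ]
    }
  ]
}
'''

def json = new JsonSlurper().parseText(sample)
json.Series.Values[0].each { row ->
    def idle = Float.valueOf(row[4].toString())      // index 4 = metric value
    println "${row[9]}.idle_cpu=${idle}"             // index 9 = node id
    println "${row[9]}.used_cpu=${100 - idle}"
}

Running this prints the same wildvalue.datapoint=value lines the collection script emits, which makes it easier to sanity-check the index positions before wiring the module up.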
Dell ECS Network Statistics Version 3.6+ Flux Query

In version 3.6 of its ECS offering, Dell removed system-level statistics such as CPU, memory and network from the dashboard API, and Flux queries must now be used to retrieve that data. Below are the discovery and collection scripts for network-level statistics that use the Flux query API to retrieve the relevant metrics (the CPU and memory pair is in the companion post). Important note: this applies to all versions of the Dell EMC ECS solution above 3.6; all earlier versions are fully supported by the existing LogicMonitor out-of-the-box packages. The network statistics will return "0" values from time to time and I am still troubleshooting this; as with the previous script, I have found a minimum collection interval of 5 minutes works best. Due to the 20,000 character limit on a post, the CPU and memory stats can be found here.

Discovery Script:

/*******************************************************************************
 * Dell ECS Network Interface Discovery script.
 ******************************************************************************/
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import groovy.json.JsonBuilder
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import com.santaba.agent.groovyapi.http.*;
import com.santaba.agent.util.Settings

hostname = hostProps.get("system.hostname")
user = hostProps.get("ecs.user")
pass = hostProps.get("ecs.pass")
collectorplatform = hostProps.get("system.collectorplatform")
debug = false
def success = false
def token = login()

if (token) {
    // Retrieve data for all nodes
    def encoded_instance_props_array = []
    def NETresponse = getNetwork(token)
    // Work through the table to retrieve the node name and id to build out the instance level properties.
    // Internal note: used the methods in "Arista Campus PSU Collection Script"
    // Process network statistics
    if (debug) println "Net Response: " + NETresponse
    def NETJson = new JsonSlurper().parseText(NETresponse)
    if (debug) println "\n\n Network Values: " + NETJson.Series.Values[0]
    NETJson.Series.Values[0].each { ifaceEntry ->
        def nodeId = ifaceEntry[9]
        def nodeName = ifaceEntry[7]
        def nodeIfaceName = ifaceEntry[8]
        wildvalue = "${nodeId}-${nodeIfaceName}"
        wildalias = "${nodeName}-${nodeIfaceName}"
        description = "${nodeId}/${nodeIfaceName}"
        def instance_props = [
            "auto.node.id"             : nodeId,
            "auto.node.name"           : nodeName,
            "auto.node.interface"      : nodeIfaceName,
            "auto.node.interface.speed": ifaceEntry[4]
        ]
        encoded_instance_props_array = instance_props.collect() { property, value ->
            URLEncoder.encode(property.toString()) + "=" + URLEncoder.encode(value.toString())
        }
        println "${wildvalue}##${wildalias}##${description}####${encoded_instance_props_array.join("&")}"
    }
} else if (debug) {
    println "Bad API response: ${response}"
}
return 0

def login() {
    if (debug) println "in login"
    // Fetch a new token using Basic authentication and return it
    if (debug) println "Checking provided ${user} creds at /login.json..."
    def userCredentials = "${user}:${pass}"
    def basicAuthStringEnc = new String(Base64.getEncoder().encode(userCredentials.getBytes()))
    def loginUrl = "https://${hostname}:4443/login.json".toURL()
    def loginConnection = loginUrl.openConnection()
    loginConnection.setRequestProperty("Authorization", "Basic " + basicAuthStringEnc)
    def loginResponseBody = loginConnection.getInputStream()?.text
    def loginResponseCode = loginConnection.getResponseCode()
    def loginResponseToken = loginConnection.getHeaderField("X-SDS-AUTH-TOKEN")
    if (debug) println loginResponseCode
    if (loginResponseCode == 200 && loginResponseToken) {
        if (debug) println "Retrieved token: ${loginResponseToken}"
        return loginResponseToken
    } else {
        println "STATUS CODE:\n${loginResponseCode}\n\nRESPONSE:\n${loginResponseBody}"
        println "Unable to fetch token with ${user} creds at /login.json"
    }
    println "Something unknown went wrong when logging in"
}

def getNetwork(token) {
    def slurper = new JsonSlurper()
    def dataUrl = "https://" + hostname + ":4443/flux/api/external/v2/query"
    if (debug) println "Trying to fetch data from ${dataUrl}"
    def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"net\" and r._field == \"speed\")'
    def jsonBody = groovy.json.JsonOutput.toJson(["query": flux])
    if (debug) println "Raw JSON Body " + jsonBody
    if (debug) println "Json Body " + JsonOutput.prettyPrint(jsonBody) + " Type " + jsonBody.getClass()
    def dataHeader = ["X-SDS-AUTH-TOKEN": token, "Content-Type": "application/json", "Accept": "application/json"]
    if (debug) println("Sent Header: " + dataHeader)
    // Now we can retrieve the data.
    def httpClient = Client.open(hostname, 4443);
    httpClient.post(dataUrl, jsonBody, dataHeader);
    if (!(httpClient.getStatusCode() =~ /200/)) {
        println "Failed to retrieve data " + httpClient.getStatusCode()
        println "Header: " + httpClient.getHeader
        return (1)
    }
    String dataContent = httpClient.getResponseBody()
    if (debug) println "Status Code " + httpClient.getStatusCode()
    if (debug) println "Data in response Body " + dataContent
    return dataContent
}

Collection Script:

/*******************************************************************************
 * Dell ECS Flux Query Network Statistics
 ******************************************************************************/
import groovy.json.JsonSlurper
import groovy.json.JsonOutput
import groovy.json.JsonBuilder
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import com.santaba.agent.groovyapi.http.*;
import com.santaba.agent.util.Settings

hostname = hostProps.get("system.hostname")
user = hostProps.get("ecs.user")
pass = hostProps.get("ecs.pass")
collectorplatform = hostProps.get("system.collectorplatform")
debug = false
def success = false
def token = login()

if (token) {
    // Retrieve data for all nodes
    def encoded_instance_props_array = []
    def NETresponse = getNetwork(token)
    // Work through the table to retrieve the node name and id.
    // Internal note: used the methods in "Arista Campus PSU Collection Script"
    // Process network statistics
    if (debug) println "Net Response: " + NETresponse
    def NETJson = new JsonSlurper().parseText(NETresponse)
    if (debug) println "\n\n Net Values: " + NETJson.Series.Values[0]
    NETJson.Series.Values[0].each { ifaceEntry ->
        def nodeId = ifaceEntry[9]
        def nodeName = ifaceEntry[7]
        def nodeIfaceName = ifaceEntry[8]
        // Get the _field value so we know which metric we are collecting.
        def fieldName = ifaceEntry[5]
        def fieldValue = ifaceEntry[4]
        wildvalue = "${nodeId}-${nodeIfaceName}"
        wildalias = "${nodeName}-${nodeIfaceName}"
        description = "${nodeName}/${nodeIfaceName}"
        println "${wildvalue}.${fieldName}=${fieldValue}"
    }
} else if (debug) {
    println "Bad API response: ${response}"
}
return 0

def login() {
    if (debug) println "in login"
    // Fetch a new token using Basic authentication and return it
    if (debug) println "Checking provided ${user} creds at /login.json..."
    def userCredentials = "${user}:${pass}"
    def basicAuthStringEnc = new String(Base64.getEncoder().encode(userCredentials.getBytes()))
    def loginUrl = "https://${hostname}:4443/login.json".toURL()
    def loginConnection = loginUrl.openConnection()
    loginConnection.setRequestProperty("Authorization", "Basic " + basicAuthStringEnc)
    def loginResponseBody = loginConnection.getInputStream()?.text
    def loginResponseCode = loginConnection.getResponseCode()
    def loginResponseToken = loginConnection.getHeaderField("X-SDS-AUTH-TOKEN")
    if (debug) println loginResponseCode
    if (loginResponseCode == 200 && loginResponseToken) {
        if (debug) println "Retrieved token: ${loginResponseToken}"
        return loginResponseToken
    } else {
        println "STATUS CODE:\n${loginResponseCode}\n\nRESPONSE:\n${loginResponseBody}"
        println "Unable to fetch token with ${user} creds at /login.json"
    }
    println "Something unknown went wrong when logging in"
}

def getNetwork(token) {
    def slurper = new JsonSlurper()
    def dataUrl = "https://" + hostname + ":4443/flux/api/external/v2/query"
    if (debug) println "Trying to fetch data from ${dataUrl}"
    def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) |> filter(fn: (r) => r._measurement == \"net\")'
    def jsonBody = groovy.json.JsonOutput.toJson(["query": flux])
    if (debug) println "Raw JSON Body " + jsonBody
    if (debug) println "Json Body " + JsonOutput.prettyPrint(jsonBody) + " Type " + jsonBody.getClass()
    def dataHeader = ["X-SDS-AUTH-TOKEN": token, "Content-Type": "application/json", "Accept": "application/json"]
    if (debug) println("Sent Header: " + dataHeader)
    // Now we can retrieve the data.
    def httpClient = Client.open(hostname, 4443);
    httpClient.post(dataUrl, jsonBody, dataHeader);
    if (!(httpClient.getStatusCode() =~ /200/)) {
        println "Failed to retrieve data " + httpClient.getStatusCode()
        println "Header: " + httpClient.getHeader
        return (1)
    }
    String dataContent = httpClient.getResponseBody()
    if (debug) println "Status Code " + httpClient.getStatusCode()
    if (debug) println "Data in response Body " + dataContent
    return dataContent
}

Data Source Configuration: In both cases the "Applies To" setting is hasCategory("EMC_ECS_Cluster"), the discovery schedule is daily, the collection schedule, as stated, is every 5 minutes, and collection is configured as a batch script in both.

Additional comments and notes: One of the biggest challenges in moving from the dashboard API to the Flux API was that I was initially receiving an HTTP 401. At first I thought the Flux query was the cause, but it turned out to be the saving of the token to a file, as done in the original data sources; once I removed that and authenticated the same way as my Python script, which worked without issue, the problem was resolved. I have an additional request for the latency statistics; I will share those in a separate post once done. Hope this helps.

SteveBamford, 30 days ago
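On the intermittent "0" values: one thing worth checking is whether the 5-minute range is returning more than one point per series while only the first table is read. A possible tweak, offered purely as a hedged suggestion rather than part of the original script, is to append Flux's last() so each series contributes only its most recent row:

import groovy.json.JsonOutput

// Hedged tweak: last() keeps only the most recent point per series, so an unfiltered
// 5-minute window cannot hand back older or duplicate rows. Whether this removes the
// intermittent "0" values on ECS is an assumption that still needs to be tested.
def flux = 'from(bucket:\"monitoring_op\") ' +
           '|> range(start: -5m) ' +
           '|> filter(fn: (r) => r._measurement == \"net\") ' +
           '|> last()'
println JsonOutput.toJson(["query": flux])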
How to Create a Dashboard Widget for "Sensitive" Windows Servers?

Hi Community, I'm looking for best practices for creating a dashboard widget that highlights Windows servers which are more "problematic" or sensitive, for example servers that frequently trigger CPU, memory, or disk alerts. Goal: identify servers with high alert frequency or severe resource issues, and display them in a widget so they stand out for quick troubleshooting.
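If the built-in Alert and Top N widgets don't give the per-server rollup you want, one hedged starting point is to tally alerts per device yourself via the REST API and surface the top offenders in a report or as a property a widget can rank on. The portal name, token, filter, and the monitorObjectName field in this sketch are assumptions to verify against the current API documentation:

import groovy.json.JsonSlurper

// Rough sketch: count active alerts per device via the LogicMonitor REST API so the
// "noisiest" Windows servers can be listed. ACCOUNT and API_TOKEN are placeholders;
// bearer-token auth and the monitorObjectName field are assumptions to confirm.
def ACCOUNT   = "yourportal"
def API_TOKEN = "xxxxxxxx"
def url = "https://${ACCOUNT}.logicmonitor.com/santaba/rest/alert/alerts?size=1000".toURL()
def conn = url.openConnection()
conn.setRequestProperty("Authorization", "Bearer " + API_TOKEN)
conn.setRequestProperty("X-Version", "3")
def body = conn.getInputStream().text

def counts = [:].withDefault { 0 }
new JsonSlurper().parseText(body).items.each { alert ->
    counts[alert.monitorObjectName] += 1            // tally alerts per device
}
counts.sort { -it.value }.take(10).each { device, total ->
    println "${device}: ${total} active alerts"
}

A date-range filter (or a scheduled run that archives the counts) would turn the same idea into "alert frequency over the last N days" rather than a point-in-time snapshot.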
Checkpoint Power Supplies - 6000-XL

Not sure where else to post this or how to get my update into the repo. To support the 6000-XL and some other variants, on the datasource Checkpoint Power Supplies, modify the datapoint 'PowerSupplyStatus' to: Up|Present

MonitoringLife, 2 months ago
Best Practices for API Calls in a datasource

Hi all, possibly the most random question of the week: when building datasources that rely on API calls, what would you say is the maximum number of calls to make in a single datasource? Typically I have worked with one data-retrieval call per datasource.

Why the question? Dell have withdrawn a number of fields from their dashboard API in Dell ECS, which means metrics such as CPU and memory now need to be retrieved from the Flux API, along with a few other metrics which I may or may not need to provide to our infrastructure team. To do this it looks like I may need to run at least two Flux queries, one for CPU and one for memory, which results in two API calls. So would you create a separate datasource for each metric, or make both calls within one datasource so you have a global stats datasource for this sort of information? Thanks in advance for your input.

SteveBamford, 2 months ago
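For the ECS case specifically, it may be possible to stay at one data-retrieval call per datasource by letting a single Flux query match both measurements and then splitting the rows while parsing, as the existing scripts already do per field. A hedged sketch of the query body is below; the combined filter is standard Flux syntax, but whether the ECS Flux endpoint accepts it is an assumption to verify:

import groovy.json.JsonOutput

// Sketch: one Flux query covering both cpu and mem, so a combined "global stats"
// datasource needs a single call; rows are then split on the _measurement and
// _field columns during parsing.
def flux = 'from(bucket:\"monitoring_op\") |> range(start: -5m) ' +
           '|> filter(fn: (r) => (r._measurement == \"cpu\" and r.cpu == \"cpu-total\" and r._field == \"usage_idle\") or r._measurement == \"mem\")'
println JsonOutput.toJson(["query": flux])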
Seeking feedback on Nutanix monitoring

We are starting to monitor Nutanix environments in our datacenter, and I've downloaded all the LM modules, so they are ready to use. I'm looking for any success stories and feedback from users, because as of now I can get SNMP for system stats but nothing from the Nutanix modules themselves. Within Prism we added an SNMP user, and the v3 creds are set on the LM resource. It appears the SNMP service needs to be restarted after configuring a user. This is a reference we've used so far: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA0600000008bAECAY#Heading_B

JaredM, 2 months ago
How to handle unnecessary active alerts

Dear LM community, I'm looking for the best practice for handling unnecessary active alerts in LogicMonitor. As far as I understand, we can acknowledge, put into SDT, escalate, adjust alert thresholds (instance thresholds), or group instances with custom alerting rules. However, it doesn't seem possible to simply remove an active alert once it's triggered; please correct me if I am mistaken. Each of these approaches has some downsides. For example, grouping interfaces to suppress alerts may cause us to miss new alerts later if a port becomes active again. What is the recommended way to deal with such unnecessary alerts, in this case inactive network interfaces that are alerting but are expected to stay down? Thank you in advance for your input!

Clark_Kent, 2 months ago
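For interfaces that are expected to stay down indefinitely, one further option is to disable alerting on just those instances, so they keep collecting data (and you would still see them come back up) but never alert. This can be done per instance in the UI or in bulk via the REST API; the sketch below is a rough illustration only, and the v3 PATCH endpoint, the disableAlerting field, the placeholder IDs, and the availability of the Java 11 HttpClient are all assumptions to confirm against the current documentation:

import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Rough sketch (assumed endpoint and field names - verify before use): patch a single
// datasource instance so it keeps collecting but never alerts.
def ACCOUNT    = "yourportal"   // hypothetical portal name
def API_TOKEN  = "xxxxxxxx"     // hypothetical bearer token
def deviceId   = 123            // placeholder IDs - look these up first
def hdsId      = 456            // deviceDataSource id
def instanceId = 789

def url = "https://${ACCOUNT}.logicmonitor.com/santaba/rest/device/devices/${deviceId}/devicedatasources/${hdsId}/instances/${instanceId}"
def request = HttpRequest.newBuilder(URI.create(url))
    .header("Authorization", "Bearer " + API_TOKEN)
    .header("Content-Type", "application/json")
    .header("X-Version", "3")
    .method("PATCH", HttpRequest.BodyPublishers.ofString('{"disableAlerting": true}'))
    .build()
def response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
println "HTTP ${response.statusCode()}: ${response.body()}"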
Meraki Switch Stack vs Cisco Switch Stack

I apologize if this topic has already been addressed; I was unable to locate any relevant discussions. I'm encountering a challenge with how LogicMonitor Topology represents Meraki stacked switches, particularly in contrast to its handling of Cisco stacked switches. When LogicMonitor discovers Cisco switches configured in a stack, it identifies the stack as a single logical entity, aggregating multiple serial numbers and hardware components. This behavior aligns with Cisco IOS, which presents the stack as a unified system. As a result, LogicMonitor's topology mapping treats the stack as a single node, simplifying both visualization and monitoring. Meraki, however, takes a different approach. The Meraki cloud platform recognizes individual switches as members of a stack, and because of this (I believe) LogicMonitor treats each switch as a distinct device. Consequently, topology maps generated by LogicMonitor show individual connections between each switch in a stack rather than representing the stack as a cohesive unit. This leads to fragmented and often impractical topology views. Manual topology mapping is not a viable option in my environment. Has anyone found a method or workaround to reconcile this issue?

billbianco, 2 months ago
Example scripts

Hi community, I'm running into a limitation with reporting on Scheduled Downtime (SDT) in LogicMonitor. Right now I'm able to pull alerts that occurred during SDTs, but I cannot generate a single report that shows all historical SDTs across all my resources/devices. Is there any way to generate such a historical SDT report? Does anyone have a script or code to share that does this through the API? Thanks in advance!

Admine, 3 months ago
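I don't have a full historical report to share, but as a hedged starting point the REST API does expose an SDT listing endpoint. My understanding is that /sdt/sdts returns currently defined (active and future) SDTs rather than expired ones, so a true history would likely mean polling it on a schedule and archiving the output yourself, or continuing to rely on alert records flagged as in-SDT. The portal name, token, and field names in this sketch are placeholders and assumptions:

import groovy.json.JsonSlurper

// Hedged sketch: dump the SDTs currently defined in the portal. For history, run this
// on a schedule (cron or a collector script) and append the output to your own store.
def ACCOUNT   = "yourportal"   // hypothetical portal name
def API_TOKEN = "xxxxxxxx"     // hypothetical API-only bearer token
def url = "https://${ACCOUNT}.logicmonitor.com/santaba/rest/sdt/sdts?size=1000".toURL()
def conn = url.openConnection()
conn.setRequestProperty("Authorization", "Bearer " + API_TOKEN)
conn.setRequestProperty("X-Version", "3")

def sdts = new JsonSlurper().parseText(conn.getInputStream().text).items
sdts.each { sdt ->
    // Field names are assumptions - print the raw map once to confirm them first.
    println "${sdt.type}\t${sdt.admin}\t${new Date(sdt.startDateTime as long)}\t${new Date(sdt.endDateTime as long)}\t${sdt.comment}"
}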
Alert Tsunami: Why the Huge Delay and Flood of Post-Resolution Power Alerts?

Hello LM Exchange community and LogicMonitor team,

We recently experienced an issue that is causing significant frustration and making our alerting system less reliable. We had a couple of anticipated power cable pull-outs (testing/maintenance), which were quickly resolved. However, we then received a massive backlog of LogicMonitor alerts for this event hours after the issue was fixed and the system logs were clear.

The Problem
Massive alert delay: The initial power loss events occurred and were resolved around 7:00 PM and 8:00 PM (based on the Lifecycle Log). However, we started getting a huge flood of critical alerts via email at 9:13 PM, 9:43 PM, 10:13 PM, and 10:43 PM, hours after the issue had been mitigated and redundancy was restored.
Excessive alert volume: We received dozens of separate critical alerts (e.g., LME205086576, LME205086578, etc.) for a single, contained event, all arriving en masse hours later.
Past "fix" is a concern: The last time this occurred, the only way I could stop the flood of delayed emails was to turn off alerting for the device and then turn it back on. This is not a scalable or sustainable solution for a reliable monitoring platform.

Key Questions for the LogicMonitor Team
1. What is causing this significant delay in alert processing and delivery? It appears the system is holding a large backlog of alerts and then releasing them all at once, hours later.
2. What is the recommended, official way to clear an alert backlog without having to resort to manually disabling and re-enabling alerting?
3. Is there a known configuration or polling issue that would cause a single event (like a brief power loss) to generate dozens of unique critical alerts over a short period, and how can we consolidate these into a single, actionable notification?

Data for Review
LogicMonitor email log (image 1): shows critical alerts arriving long after the issue was resolved (9:13 PM to 10:43 PM).
Device lifecycle log (image 2): shows the power events (PSU0003, RDU0012) occurring and being resolved between 8:01 PM and 9:22 PM.

Any insight or official guidance on how to prevent this "alert tsunami" would be greatly appreciated. We rely on timely and accurate alerting, and this behavior significantly undermines that trust.

B1llw, 3 months ago