alerts

84 Topics

New UI Impact Series - Alert Tuning Improvements
LogicMonitor has rolled out significant Alert Tuning Improvements designed to streamline the often complex task of maintaining effective alert thresholds. These improvements directly address the challenges IT teams face in managing alert configurations across their monitoring environment, offering more granular control and deeper insight into alert patterns.

First, we have the new Alert Health Check report, which provides clear visibility into alert patterns and potential noise sources within your environment. This pre-configured report breaks down alert distribution by severity, resource, LogicModule, and datapoint, helping teams identify where to focus their alert tuning efforts. For instance, if a specific device generates 25% of all alerts, the Alert Health Check reveals this immediately, so teams can fine-tune thresholds where they'll have the most impact. The report also includes temporal analysis of alert patterns and alert routing information, providing essential context for optimization efforts.

In addition to reporting updates, LogicMonitor has introduced enhanced configuration flexibility, allowing teams to customize alert trigger intervals and no-data alert settings at the group and device levels, regardless of global definitions. Previously, threshold settings for multi-instance DataSources could only be updated at the global level. These DataSources can now be configured at the individual level, with current and future instances automatically inheriting these thresholds.

What does this mean for your alert-tuning experience? These enhancements represent a significant shift in how IT teams approach alert management. By providing tools that identify alert noise sources and offering more granular control over threshold configurations, LM Envision helps reduce the operational overhead traditionally associated with alert maintenance. Teams can now make more informed decisions about threshold adjustments, ensure consistent alerting across their infrastructure, and maintain an optimal signal-to-noise ratio in their monitoring environment. This approach to alert tuning helps IT professionals create and maintain more effective monitoring systems while reducing the time and effort required to do so.

Want to know more about Alert Tuning Improvements? Check out these support documentation articles:
- Alert HealthCheck Report
- Tuning Static Thresholds for Datapoints
- Enabling Dynamic Thresholds for Datapoints
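As a rough illustration of the "one device generates 25% of all alerts" breakdown mentioned in the post (this is not how the Alert Health Check report itself is implemented), the share of alerts per resource can be computed from any exported alert list. The record layout and the `resource` field name below are made up for the example.

```python
from collections import Counter

def alert_share_by_resource(alerts):
    """Return {resource: fraction of all alerts}, largest share first."""
    counts = Counter(a["resource"] for a in alerts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {res: n / total for res, n in counts.most_common()}

# Hypothetical sample data: 8 alerts, 2 of them from "core-sw-01".
sample = (
    [{"resource": "core-sw-01"}] * 2
    + [{"resource": "web-01"}] * 3
    + [{"resource": "db-01"}] * 3
)
shares = alert_share_by_resource(sample)
```

Here `shares["core-sw-01"]` comes out as 0.25, the kind of figure that points to where threshold tuning would pay off first.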
skydonnell
6 months ago Place Tech Talk
129 Views
9 likes
1 Comment
Programmatic Ping Alert
We currently lack the ability to whitelist domain names on our firewall, so I have to do everything via IP. Recently I’ve come across an issue where a company won’t give me their external IPs because they can change, or so they say. For several weeks I’ve pinged the IPs and it has always been 1 of 4 IPs. Has anyone created some kind of ping alert that does something like “ping easypost.com and api.easypost.com; if the IPs returned are not in 169.62.110.130-169.62.110.133, alert me”? I’m not much of a programmer myself, so I’d need something pretty “plug and play”. TIA!
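The check described in the question comes down to "resolve the name, then test each address against an inclusive range". A minimal Python sketch of that logic follows. The function names are hypothetical, it only looks at IPv4 answers from the local resolver, and to alert from LogicMonitor it would still need to be ported into a scripted DataSource that reports the count of unexpected addresses as a datapoint.

```python
import ipaddress
import socket

def resolve_ipv4(hostname):
    """Return the set of IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return {info[4][0] for info in infos}

def unexpected_ips(resolved, first, last):
    """Return, sorted, the addresses in `resolved` outside the inclusive range first..last."""
    lo = ipaddress.IPv4Address(first)
    hi = ipaddress.IPv4Address(last)
    outside = [ip for ip in map(ipaddress.IPv4Address, resolved) if not lo <= ip <= hi]
    return [str(ip) for ip in sorted(outside)]

def check_hosts(hosts, first, last, resolver=resolve_ipv4):
    """Map each hostname to its list of out-of-range addresses (empty list means OK)."""
    return {host: unexpected_ips(resolver(host), first, last) for host in hosts}
```

Calling `check_hosts(["easypost.com", "api.easypost.com"], "169.62.110.130", "169.62.110.133")` returns an empty list for each host while the answers stay inside the range; any non-empty list is the condition to alert on.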
Solved
Kirby_Timm
9 months ago Place LM Exchange
147 Views
8 likes
4 Comments
Customer Story - GCISD
Industry: School Districts

Challenge
With an average of 40,000 devices online in the district at any given time, GCISD needed to reduce tool sprawl and find an observability platform that provided full visibility, robust alerting, and forecasting to solve problems before they impacted teachers and students.

Solution
LogicMonitor’s single pane of glass and robust dashboards provided visibility across their entire infrastructure within one platform. Features like Automatic Discovery and intelligent alerting saved the team valuable time, while historical data and AI predictions helped anticipate future problems.

Business Outcomes
- Reduced the overall number of help desk tickets with proactive monitoring
- Instant alerting allowed for minimal downtime after the district experienced an outage
- Single source of truth increased visibility and insights across multiple teams

“As we moved through the pandemic and the complete school system went online, we realized that even a second of downtime or service interruption would severely impact instruction. Having a tool like LogicMonitor is so important to ensure that our technology is supporting our staff and students 24/7” – Kyle Berger, Chief Technology Officer

Interested in sharing a story about your infrastructure monitoring, process improvements, or any other successes since implementing LogicMonitor? We’d love to hear it! Feel free to comment below or reach out to us at innercircle@logicmonitor.com to share your voice.
ashlyn_warblow
2 years ago Place Inner Circle News
89 Views
21 likes
1 Comment
Alert List Dashboard, not a Widget on a Dashboard
Does anyone else have a Dashboard that contains just a single widget, an Alert List widget, to show every alert? It’s really annoying having the double scrollbar. We should have a special Dashboard that is an Alert List to avoid this problem.
ldoodle
2 years ago Place Product Discussions
92 Views
6 likes
10 Comments
Example script for automated alert actions via External Alerting
Below is a PowerShell script that's a handy starting point if you want to trigger actions based on specific alert types. In a nutshell, it takes a number of parameters from each alert and has a section of if/else statements where you can specify what to do based on the alert. It leverages LogicMonitor's External Alerting feature, so the script runs local to whatever Collector(s) you configure it on. I included a couple of example actions for pinging a device and for restarting a service. It also includes some handy (optional) functions for logging as well as attaching a note to the alert in LogicMonitor.

NOTE: this script is provided as-is and you will need to customize it to suit your needs. Automated actions are something that must be approached with careful planning and caution!! LogicMonitor cannot be responsible for inadvertent consequences of using this script.

If you want to try it out, here's how to get started:

1. Update the variables in the appropriate section near the top of the script with optional API credentials and/or log settings. Also change any of the if/elseif statements (in the "CUSTOMIZE THE FOLLOWING" section near the bottom of the script) to suit your needs.
2. Save the script onto your Collector server. I named the file "alert_central.ps1" but feel free to call it something else. Make note of its full path (ex: “C:\scripts\alert_central.ps1”). NOTE: it’s not recommended to place it under the Collector's agent/lib directory (typically "C:\Program Files (x86)\LogicMonitor\Agent\lib") since that location can be overwritten by Collector upgrades.
3. In your LogicMonitor portal go to Settings, then External Alerting.
4. Click the Add button.
5. Set the 'Groups' field as needed to limit the actions to alerts from any appropriate group of resources. (Be sure the group's devices would be reachable from the Collector running the script.)
6. Choose the appropriate Collector in the Collector field.
7. Set Delivery Mechanism to "Script".
8. Enter the name you saved the script as (in step #2) in the Script field (ex. "alert_central.ps1").
9. Paste the following into the Script Command Line field (NOTE: if you add other parameters here then be sure to also add them to the 'Param' line at the top of the script):
   "##ALERTID##" "##ALERTSTATUS##" "##LEVEL##" "##HOSTNAME##" "##SYSTEM.SYSNAME##" "##DSNAME##" "##INSTANCE##" "##DATAPOINT##" "##VALUE##" "##ALERTDETAILURL##" "##DPDESCRIPTION##"
   (Screenshot: Example of the completed Add External Alerting dialog)
10. Click Save.

This uses LogicMonitor's External Alerting feature, so there are some things to be aware of:

- Since the script is called for every alert, the section of if/elseif statements at the bottom of the script is important for filtering what specific alerts you want to take action on.
- The Collector(s) oversee the running of the script, so be conscious of any additional overhead the script actions may cause.
- It could take up to 60 seconds for the script to trigger from the time the alert comes in.
- This example is a PowerShell script so it is best suited for Windows-based Collectors, but it could certainly be re-written as a shell script for Linux-based Collectors.

Here's a screenshot of a cleared alert where the script auto-restarted a Windows service and attached a note based on its actions.

(Screenshot: Example note the script added to the alert reflecting the automated action that was taken)

Below is the PowerShell script:

# ----
# This PowerShell script can be used as a starting template for enabling
# automated remediation for alerts coming from LogicMonitor.
# In LogicMonitor, you can use the External Alerting feature to pass all alerts
# (or for a specific group of resources) to this script.
# ----
# To use this script:
# 1. Update the variables in the appropriate section below with optional API and log settings.
# 2. Save this script onto your Collector server (the agent/lib directory is not recommended,
#    since Collector upgrades can overwrite it).
# 3. In your LogicMonitor portal go to Settings, then click External Alerting.
# 4. Click the Add button.
# 5. Set the 'Groups' field as needed to limit the actions to a specific group of resources.
# 6. Choose the appropriate Collector in the 'Collector' field.
# 7. Set 'Delivery Mechanism' to "Script"
# 8. Enter "alert_central.ps1" in the 'Script' field.
# 9. Paste the following into the 'Script Command Line' field:
#    "##ALERTID##" "##ALERTSTATUS##" "##LEVEL##" "##HOSTNAME##" "##SYSTEM.SYSNAME##" "##DSNAME##" "##INSTANCE##" "##DATAPOINT##" "##VALUE##" "##ALERTDETAILURL##" "##DPDESCRIPTION##"
# 10. Click Save.

# The following line captures alert information passed from LogicMonitor (defined in step #9 above)...
Param ($alertID = "", $alertStatus = "", $severity = "", $hostName = "", $sysName = "", $dsName = "", $instance = "", $datapoint = "", $metricValue = "", $alertURL = "", $dpDescription = "")

###--- SET THE FOLLOWING VARIABLES AS APPROPRIATE ---###

# OPTIONAL: LogicMonitor API info for updating alert notes (the API user will need "Acknowledge" permissions)...
$accessId = ''
$accessKey = ''
$company = ''

# OPTIONAL: Set a filename in the following variable if you want specific alerts logged. (example: "C:\lm_alert_central.log")...
$logFile = ''

# OPTIONAL: Destination for syslog alerts...
$syslogServer = ''

###############################################################
## HELPER FUNCTIONS (you likely won't need to change these) ##

# Function for logging the alert to a local text file if one was specified in the $logFile variable above...
Function LogWrite ($logstring = "") {
    if ($logFile -ne "") {
        $tmpDate = Get-Date -Format "dddd MM/dd/yyyy HH:mm:ss"

        # Using a mutex to handle file locking if multiple instances of this script trigger at once...
        $LogMutex = New-Object System.Threading.Mutex($false, "LogMutex")
        $LogMutex.WaitOne()|out-null
        "$tmpDate, $logstring" | out-file -FilePath $logFile -Append
        $LogMutex.ReleaseMutex()|out-null
    }
}

# Function for attaching a note to the alert...
function AddNoteToAlert ($alertID = "", $note = "") {
    # Only execute this if the appropriate API information has been set above...
    if ($accessId -ne '' -and $accessKey -ne '' -and $company -ne '') {
        # Encode the note...
        $encodedNote = $note | ConvertTo-Json

        # API and URL request details...
        $httpVerb = 'POST'
        $resourcePath = '/alert/alerts/' + $alertID + '/note'
        $url = 'https://' + $company + '.logicmonitor.com/santaba/rest' + $resourcePath
        $data = '{"ackComment":' + $encodedNote + '}'

        # Get current time in milliseconds...
        $epoch = [Math]::Round((New-TimeSpan -start (Get-Date -Date "1/1/1970") -end (Get-Date).ToUniversalTime()).TotalMilliseconds)

        # Concatenate general request details...
        $requestVars_00 = $httpVerb + $epoch + $data + $resourcePath

        # Construct signature...
        $hmac = New-Object System.Security.Cryptography.HMACSHA256
        $hmac.Key = [Text.Encoding]::UTF8.GetBytes($accessKey)
        $signatureBytes = $hmac.ComputeHash([Text.Encoding]::UTF8.GetBytes($requestVars_00))
        $signatureHex = [System.BitConverter]::ToString($signatureBytes) -replace '-'
        $signature = [System.Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($signatureHex.ToLower()))

        # Construct headers...
        $auth = 'LMv1 ' + $accessId + ':' + $signature + ':' + $epoch
        $headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
        $headers.Add("Authorization",$auth)
        $headers.Add("Content-Type",'application/json')

        # Make request to add note..
        $response = Invoke-RestMethod -Uri $url -Method $httpVerb -Body $data -Headers $headers

        # Change the following if you want to capture API errors somewhere...
        # LogWrite "API call response: $response"
    }
}

# Function for sending a BSD-style (RFC 3164) syslog message over UDP...
function SendTo-SysLog ($IP = "", $Facility = "local7", $Severity = "notice", $Content = "Your payload...", $SourceHostname = $env:computername, $Tag = "LogicMonitor", $Port = 514) {
    switch -regex ($Facility) {
        'kern'     {$Facility = 0 * 8 ; break }
        'user'     {$Facility = 1 * 8 ; break }
        'mail'     {$Facility = 2 * 8 ; break }
        'system'   {$Facility = 3 * 8 ; break }
        '^auth$'   {$Facility = 4 * 8 ; break } # Anchored so that "authpriv" is not caught here
        'syslog'   {$Facility = 5 * 8 ; break }
        'lpr'      {$Facility = 6 * 8 ; break }
        'news'     {$Facility = 7 * 8 ; break }
        'uucp'     {$Facility = 8 * 8 ; break }
        'cron'     {$Facility = 9 * 8 ; break }
        'authpriv' {$Facility = 10 * 8 ; break }
        'ftp'      {$Facility = 11 * 8 ; break }
        'ntp'      {$Facility = 12 * 8 ; break }
        'logaudit' {$Facility = 13 * 8 ; break }
        'logalert' {$Facility = 14 * 8 ; break }
        'clock'    {$Facility = 15 * 8 ; break }
        'local0'   {$Facility = 16 * 8 ; break }
        'local1'   {$Facility = 17 * 8 ; break }
        'local2'   {$Facility = 18 * 8 ; break }
        'local3'   {$Facility = 19 * 8 ; break }
        'local4'   {$Facility = 20 * 8 ; break }
        'local5'   {$Facility = 21 * 8 ; break }
        'local6'   {$Facility = 22 * 8 ; break }
        'local7'   {$Facility = 23 * 8 ; break }
        default    {$Facility = 23 * 8 } #Default is local7
    }

    switch -regex ($Severity) {
        '^(ac|up)' {$Severity = 1 ; break } # LogicMonitor "active", "ack" or "update"
        '^em'      {$Severity = 0 ; break } #Emergency
        '^a'       {$Severity = 1 ; break } #Alert
        '^c'       {$Severity = 2 ; break } #Critical
        '^er'      {$Severity = 3 ; break } #Error
        '^w'       {$Severity = 4 ; break } #Warning
        '^n'       {$Severity = 5 ; break } #Notice
        '^i'       {$Severity = 6 ; break } #Informational
        '^d'       {$Severity = 7 ; break } #Debug
        default    {$Severity = 5 } #Default is Notice
    }

    $pri = "<" + ($Facility + $Severity) + ">"

    # Note that the timestamp is local time on the originating computer, not UTC.
    if ($(get-date).day -lt 10) {
        $timestamp = $(get-date).tostring("MMM d HH:mm:ss")
    } else {
        $timestamp = $(get-date).tostring("MMM dd HH:mm:ss")
    }

    # Hostname does not have to be in lowercase, and it shouldn't have spaces anyway, but lowercase is more traditional.
    # The name should be the simple hostname, not a fully-qualified domain name, but the script doesn't enforce this.
    $header = $timestamp + " " + $sourcehostname.tolower().replace(" ","").trim() + " "

    #Cannot have non-alphanumerics in the TAG field or have it be longer than 32 characters.
    if ($tag -match '[^a-z0-9]') { $tag = $tag -replace '[^a-z0-9]','' } #Simply delete the non-alphanumerics
    if ($tag.length -gt 32) { $tag = $tag.substring(0,32) } #and truncate at 32 characters.

    $msg = $pri + $header + $tag + ": " + $content

    # Convert message to array of ASCII bytes.
    $bytearray = $([System.Text.Encoding]::ASCII).getbytes($msg)

    # RFC3164 Section 4.1: "The total length of the packet MUST be 1024 bytes or less."
    # "Packet" is not "PRI + HEADER + MSG", and IP header = 20, UDP header = 8, hence:
    if ($bytearray.count -gt 996) { $bytearray = $bytearray[0..995] }

    # Send the message...
    $UdpClient = New-Object System.Net.Sockets.UdpClient
    $UdpClient.Connect($IP,$Port)
    $UdpClient.Send($ByteArray, $ByteArray.length) | out-null
}

# Empty placeholder for capturing any note we might want to attach back to the alert...
$alertNote = ""

# Placeholder for whether we want to capture an alert in our log. Set to true if you want to log everything.
$logThis = $false

###############################################################
## CUSTOMIZE THE FOLLOWING AS NEEDED TO HANDLE SPECIFIC ALERTS FROM LOGICMONITOR...

# Actions to take if the alert is new or re-opened (note: status will be "active" or "clear")...
if ($alertStatus -eq 'active') {
    # Perform actions based on the type of alert...

    # Ping alerts...
    if ($dsName -eq 'Ping' -and $datapoint -eq 'PingLossPercent') {
        # Insert action to take if a device becomes unpingable.
        # In this example we'll do a verification ping & capture the output...
        $job = ping -n 4 $sysName

        # Restore line feeds to the output...
        $job = [string]::join("`n", $job)

        # Add ping results as a note on the alert...
        $alertNote = "Automation script output: $job"

        # Log the alert...
        $logThis = $true

    # Restart specific Windows services...
    } elseif ($dsName -eq 'WinService-' -and $datapoint -eq 'State') {
        # List of Windows Services to match against. Only if one of the following is alerting will we try to restart it...
        $serviceList = @("Print Spooler","Service 2")

        # Note: The PowerShell "-Contains" operator is exact in its matching. Replace it with "-Match" for a looser match.
        if ($serviceList -Contains $instance) {
            # Get an object reference to the Windows service...
            $tmpService = Get-Service -DisplayName "$instance" -ComputerName $sysName

            # Only trigger if the service is still stopped...
            if ($tmpService.Status -eq "Stopped") {
                # Start the service...
                $tmpService | Set-Service -Status Running

                # Refresh the cached service object so the note reports the status after the start attempt...
                $tmpService.Refresh()

                # Capture the current state of the service as a note on the alert...
                $alertNote = "Attempted to auto-restart the service. Its new status is " + $tmpService.Status + "."
            }

            # Log the alert...
            $logThis = $true
        }

    # Actions to take if a website stops responding...
    } elseif ($dsName -eq 'HTTPS-' -and $datapoint -eq 'CantConnect') {
        # Insert action here to take if there's a website error...

        # Example of sending a syslog message to an external server...
        $syslogMessage = "AlertID:$alertID,Host:$sysName,AlertStatus:$alertStatus,LogicModule:$dsName,Instance:$instance,Datapoint:$datapoint,Value:$metricValue,AlertDescription:$dpDescription"

        # Named parameters are used so that Facility, Tag and Port keep their defaults
        # (passing "" positionally would override the default port 514)...
        SendTo-SysLog -IP $syslogServer -Severity $severity -Content $syslogMessage -SourceHostname $hostName

        # Attach a note to the LogicMonitor alert...
        $alertNote = "Sent syslog message to " + $syslogServer

        # Log the alert...
        $logThis = $true
    }
}

###############################################################
## Final functions for backfilling notes and/or logging as needed
## (you likely won't need to change these)

# Section that updates the LogicMonitor alert if 'alertNote' is not empty...
if ($alertNote -ne "") {
    AddNoteToAlert $alertID $alertNote
}

if ($logThis) {
    # Log the alert (only triggers if a filename is given in the $logFile variable near the top of this script)...
    LogWrite "$alertID,$alertStatus,$severity,$hostName,$sysName,$dsName,$instance,$datapoint,$metricValue,$alertURL,$dpDescription"
}
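For anyone porting the note-attaching part to another language, the signature steps in AddNoteToAlert (concatenate verb, epoch, body and resource path; HMAC-SHA256 with the access key; lowercase hex; Base64) can be sketched in Python as follows. This only mirrors what the script above does; the function name `lmv1_auth_header` and the sample credentials are placeholders, and no request is sent.

```python
import base64
import hashlib
import hmac
import json
import time

def lmv1_auth_header(access_id, access_key, http_verb, resource_path, data="", epoch_ms=None):
    """Build an 'LMv1 id:signature:epoch' Authorization value the same way the script does."""
    if epoch_ms is None:
        epoch_ms = int(time.time() * 1000)  # current time in milliseconds
    # Same concatenation order as $requestVars_00 in the script.
    request_vars = f"{http_verb}{epoch_ms}{data}{resource_path}"
    # HMAC-SHA256 keyed with the access key; hexdigest() is already lowercase hex.
    digest_hex = hmac.new(
        access_key.encode("utf-8"), request_vars.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # The script Base64-encodes the hex string itself, not the raw digest bytes.
    signature = base64.b64encode(digest_hex.encode("utf-8")).decode("ascii")
    return f"LMv1 {access_id}:{signature}:{epoch_ms}"

# Placeholder values for illustration only.
body = json.dumps({"ackComment": "Automation script output: ..."})
header = lmv1_auth_header("my_id", "my_key", "POST", "/alert/alerts/DS123/note", body,
                          epoch_ms=1700000000000)
```

The body string that gets signed must be byte-for-byte the one sent in the request, which is why `body` is built once and reused.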
Kevin_Ford
2 years ago Place Product Discussions
2.1K Views
23 likes
5 Comments
PowerShell to Get Alerts through API with ticketid
I am trying to use the PowerShell cmdlets from the PowerShell Gallery to pull back alerts from our portal for a specific time window and include the ##externalticketid## field. In Python it says to update the queryParams with customColumns=%2523%2523externalticketid%2523%2523, but that does not seem to work in PowerShell. Has anyone been able to use the API to pull back alerts and include the ##externalticketid## field, so you can relate it to things like the ServiceNow INC being created for LogicMonitor alerts? This is using API v2.
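One thing worth noting about the value quoted above: `%2523%2523externalticketid%2523%2523` is the token `##externalticketid##` percent-encoded twice (`#` becomes `%23`, and then `%` becomes `%25`). The short Python sketch below only demonstrates that encoding. Whether a given PowerShell cmdlet or HTTP client needs the single- or double-encoded form is an assumption to verify, since some clients encode query strings again on their own.

```python
from urllib.parse import quote, unquote

token = "##externalticketid##"

# First pass: '#' -> '%23'
encoded_once = quote(token, safe="")
# Second pass: '%' -> '%25', giving the value quoted in the post
encoded_twice = quote(encoded_once, safe="")

# Decoding twice gets back to the original token.
round_trip = unquote(unquote(encoded_twice))
```

If the client library already encodes parameters once, passing the single-encoded or even the raw token may be what ends up producing the expected value on the wire.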
Solved
Jeff_Batchelor
2 years ago Place Product Discussions
276 Views
11 likes
11 Comments
Get historical values for alerting
I would like to alert when an MSSQL database (Microsoft_SQLServer_Databases) has increased in size by 10GB or more over 24h. I tried support but was told it wasn’t possible to alert on historical values. This post from 6 years ago indicates I could do it with the data REST API. Is this still the only valid method after 6 years? https://community.logicmonitor.com/archive-13/alerting-on-difference-to-historic-value-1409?tid=1409&fid=13 Has anyone done anything similar, or have any examples of using the REST API or SDK to do this? Thanks.
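Whatever is used to fetch the history, the comparison itself is small. Here is a minimal Python sketch of the "grew by 10GB or more over 24h" test, assuming the samples have already been retrieved as (epoch seconds, size in GB) pairs sorted oldest to newest. The function names and sample layout are hypothetical, and the retrieval step is deliberately not shown.

```python
def growth_over_window(samples, window_seconds=24 * 3600):
    """Newest value minus the oldest value inside the window ending at the newest sample.

    samples: list of (epoch_seconds, size_gb), sorted oldest to newest.
    Returns None when fewer than two samples fall inside the window.
    """
    if not samples:
        return None
    newest_ts, newest_val = samples[-1]
    in_window = [(ts, v) for ts, v in samples if newest_ts - ts <= window_seconds]
    if len(in_window) < 2:
        return None
    return newest_val - in_window[0][1]

def should_alert(samples, threshold_gb=10.0):
    """True when growth over the window meets or exceeds the threshold."""
    growth = growth_over_window(samples)
    return growth is not None and growth >= threshold_gb

# Hypothetical readings: 100 GB, then 104 GB after 12h, then 111.5 GB after 24h.
readings = [(0, 100.0), (43200, 104.0), (86400, 111.5)]
alert = should_alert(readings)
```

A script like this could publish the computed growth as its own datapoint, so that an ordinary static threshold can alert on it.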
keiransteele
2 years ago Place Product Discussions
140 Views
12 likes
5 Comments
RCA alert suppression for HA sites
Good afternoon all, Has anyone worked with RCA rules for dual entry point sites? It seems that even if both entry points are offline we still receive offline alerts for all downstream devices. This may be due to the fact that both entry point devices go offline a few milliseconds apart from each other. Is there a way to retain individual alerting on downstream devices while still suppressing these alerts during a full network outage event? Any help is appreciated.
telrod
2 years ago Place Product Discussions
56 Views
12 likes
0 Comments
Is there any way to view the Alert Routing for all resources and websites?
Hi, I have a “Catch All” alert rule at the end of our list that gets any alerts that weren’t caught by something else in the list. If I manually go to a server, and drill down to an instance item, I can click the Bell icon to view the Alert Routing for that check. Is there any way to view all the Alert Routing information for our entire LM system and get a list of which alerts WOULD hit the Catch All rule? I want to get a list so I can go fix them before we get an actual alert that then doesn’t go to the right people. I didn’t see a Report to get this information. I also didn’t see Alert Routing in the API info anywhere. Does anyone know if this is possible? Thanks!
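To make the idea concrete, here is a small offline sketch in Python of "which alerts would fall through to the Catch All rule". It assumes rules are tried in ascending priority number and the first match wins, consistent with the Catch All sitting at the end of the list. The rule and alert dictionaries, their field names, and the glob patterns are all invented for illustration; nothing here calls LogicMonitor, and the inputs would have to be exported by some other means.

```python
from fnmatch import fnmatchcase

def first_matching_rule(alert, rules):
    """Return the name of the first rule (lowest priority number) whose patterns all match."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if all(fnmatchcase(alert[field], pattern) for field, pattern in rule["match"].items()):
            return rule["name"]
    return None

# Hypothetical rule list, ending in a catch-all.
rules = [
    {"name": "Network team", "priority": 10, "match": {"group": "Network/*"}},
    {"name": "SQL team", "priority": 20, "match": {"datasource": "Microsoft_SQLServer_*"}},
    {"name": "Catch All", "priority": 999, "match": {"group": "*"}},
]

# Hypothetical datapoints that could alert.
candidates = [
    {"group": "Network/Core", "datasource": "Ping"},
    {"group": "Servers/DB", "datasource": "Microsoft_SQLServer_Databases"},
    {"group": "Servers/Web", "datasource": "HTTPS-"},
]

catch_all_hits = [a for a in candidates if first_matching_rule(a, rules) == "Catch All"]
```

In this sample only the Servers/Web entry lands on the Catch All rule, which is the kind of list that could be reviewed before a real alert goes to the wrong people.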
Kelemvor
2 years ago Place Product Discussions
53 Views
7 likes
1 Comment
Convert default Alert Report to dashboard?
Has anyone seen examples of, or know of, a way to convert the basic (out of the box) Alerts Report into a Dashboard? The default “Alert List dashboard” only shows active alerts, and I haven’t found any way to edit it (the table in the dashboard can’t be edited) to show historical information like the Alerts Report. If I try to create a table widget from scratch it won’t let me choose * or ##DefaultResourceGroup##. You can create an Alert List Widget, but the same issue comes up: it’s not historical in nature and there doesn't seem to be any way to edit that.
Solved
derek_haneman
2 years ago Place Product Discussions
138 Views
17 likes
5 Comments