Token to include DataSource raw output in email and alert body
We have script DataSources that output useful diagnostic information that helps Operations understand the numeric value when an alert is generated. We want to include the raw output from a DataSource in the alert and email body. What we need is a ##DSRAWOUTPUT## token that contains the complete raw output written to standard out by a DataSource script. For example, we monitor for processes running under credentials they are not supposed to be running under, and we want to include that info as textual information in the alert/email body.
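
As a sketch of how this might look, the proposed ##DSRAWOUTPUT## token (hypothetical, not an existing LogicMonitor token) could be placed in a custom alert message alongside the existing tokens shown later in this section:

    LogicMonitor Alert:
    Host: ##HOST##
    Datasource: ##DATASOURCE##
    Datapoint: ##DATAPOINT##
    Value: ##VALUE##
    Level: ##LEVEL##
    Raw script output:
    ##DSRAWOUTPUT##   <-- proposed token, would expand to the script's stdout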

Ad-hoc script running

Often when an alert pops up, I find myself running some very common troubleshooting tools to quickly gather more info. It would be nice to get that info quickly and easily without having to go to other tools when an alert occurs. For example, right now when we get a high CPU alert, the first thing I do is run pslist -s \\computername (PSTools are so awesome) and psloggedon \\computername to see who's logged in at the moment. I know it's possible to create a DataSource to discover all active processes and retrieve CPU/memory/disk metrics specific to a given process, but processes on a given server might change pretty frequently, so you'd have to run Active Discovery frequently. It just doesn't seem like the best way, and most of the time I don't care what's running on the server and only need to know "in the moment."

A way to run a script via a button for a given DataSource would be a really cool feature. Maybe on the DataSource you could add a "gather additional data" or meta-data script, which could then be invoked manually on an alert or DataSource instance. I.e., when an alert occurs, you can click a button in the alert called "gather additional data" or something, which would run the script and produce a small box or window with the output. The ability to run it periodically (every 15 seconds or 5 minutes, etc.) would also be useful. This would also give a NOC the ability to troubleshoot a bit more or provide additional context around an alert without everyone having to know a bunch of tools or have administrative access to a server.
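
A minimal sketch of what such a "gather additional data" script might do, using the two PSTools commands from the post. This is not a LogicMonitor feature; Python is used purely for illustration, and it assumes the Sysinternals pslist.exe and psloggedon.exe are on the PATH of whatever machine runs it:

    # Hypothetical "gather additional data" script for a high-CPU alert.
    # Assumes Sysinternals pslist/psloggedon are installed and on the PATH.
    import subprocess
    import sys

    def run(cmd):
        """Run a command and return its combined stdout/stderr as text."""
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.stdout + result.stderr

    if __name__ == "__main__":
        computername = sys.argv[1]  # e.g. "APPSERVER01" (hypothetical host name)
        # One-shot process snapshot; the interactive "pslist -s" mode from the
        # post refreshes continuously, so it is omitted here.
        print("=== Processes (pslist) ===")
        print(run(["pslist", r"\\" + computername]))
        print("=== Logged-on users (psloggedon) ===")
        print(run(["psloggedon", r"\\" + computername]))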

Alert Troubleshooting 101

One of the most common support cases we face every day is "why am I receiving this alert?" This article explains the steps for determining why you are receiving an alert:

1) Understand the alert received
2) Check validity via raw data and threshold
3) Check delivery

1) Understanding the alert received

The first step when you receive an alert, whether via email, text, or any ticketing system, is to understand it: look at which device the alert is for, which datapoint it applies to, and the value that triggered it. For example, in an email alert message it would appear as below.

    LogicMonitor Alert:
    Host: ##HOST##
    Host Group: ##GROUP##
    Datasource: ##DATASOURCE##
    Datapoint: ##DATAPOINT##
    Description: ##DSIDESCRIPTION##
    Value: ##VALUE##
    Level: ##LEVEL##
    Start: ##START##
    Duration: ##DURATION##
    Reason: ##DATAPOINT## ##THRESHOLD## ##ALERTID##

2) Checking validity via raw data and threshold

Next, once you have determined the alert source, you need to understand why the alert was triggered. Start by looking at the threshold set for that particular datapoint. After checking the threshold, go to the Raw Data tab of the datapoint to check whether the data meets the threshold that was set. For example, in this case a critical alert was received with a threshold of 80 90 95, and an alert will only be triggered if 20 consecutive polls fall within this range. The next step is to check the Raw Data tab to determine whether this condition was met. Judging from the raw data, all 20 polls met the threshold of 80 90 95; the level of the alert is determined by the last poll, and since the last poll was 96.67, which falls within the critical range, a critical alert was sent.

3) Checking delivery

The last step is to check the alert rule and escalation chain to see whether the alert was matched to the correct rule and escalation chain. To do so, go to the Alert Tuning tab and check the alert routing for that particular instance and datapoint. Here you can see that the Alert Rule applied is Critical - Default and the Alert Chain/Escalation Chain is Critical - Default. Under the Alert Chain is the list of email addresses that will receive a notification when the threshold is met.
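
To make step 2 concrete, here is a minimal sketch of the evaluation logic described above, assuming a "warn error critical" threshold of 80 90 95 and a 20-consecutive-poll trigger. It illustrates the reasoning only, not LogicMonitor's actual implementation:

    # Threshold "80 90 95" means: warn > 80, error > 90, critical > 95.
    # An alert fires only after 20 consecutive polls exceed the warn level,
    # and the severity is decided by the most recent poll.
    WARN, ERROR, CRITICAL = 80, 90, 95
    CONSECUTIVE_POLLS = 20

    def severity(value):
        if value > CRITICAL:
            return "critical"
        if value > ERROR:
            return "error"
        if value > WARN:
            return "warn"
        return None

    def evaluate(polls):
        """Return the alert level for a list of poll values, or None."""
        recent = polls[-CONSECUTIVE_POLLS:]
        if len(recent) < CONSECUTIVE_POLLS:
            return None
        if all(severity(v) is not None for v in recent):
            return severity(recent[-1])  # the last poll decides the level
        return None

    # Example: 20 polls above 80, with the last one at 96.67 -> "critical"
    print(evaluate([85.0] * 19 + [96.67]))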

Custom alert messages per Cluster

I'm coming around to love clustered alerts as more of my company moves to dynamic environments. But I really need to be able to customize the email alert messaging for clustered alerts. So I would like to see two things:

1. The ability to set a custom alert message per clustered alert
2. The ability to assign properties to clustered alerts so that they can be referenced in the alert message via ##TOKENS##

Alerts on Longer Periods within Datasources

For a datasource, we would like to be able to set the alert threshold over more than a single sample. You can set the number of threshold violations needed for an alert, but this is far different in nature from setting a threshold over a time range: for example, 60% CPU over 2 hours versus 60% CPU over 10 samples. You might see CPU fluctuate within that period, preventing an alert, but the average over a longer period is valuable. Similarly, we would like to get alerts not just on the average over a time period but also on the slope over a time period, though perhaps the latter should be a separate request.

Thanks, Mark
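
A sketch contrasting the two evaluation styles described above. The logic is hypothetical (this is the requested behaviour, not an existing feature), and the sample data is made up:

    def consecutive_violations(samples, threshold, count):
        """Current behaviour: alert only if the last `count` samples all violate."""
        return len(samples) >= count and all(v > threshold for v in samples[-count:])

    def average_over_window(samples, threshold):
        """Requested behaviour: alert if the window's average violates."""
        return bool(samples) and (sum(samples) / len(samples)) > threshold

    def slope_over_window(samples, limit):
        """Requested behaviour: alert if the average per-sample increase exceeds `limit`."""
        return len(samples) > 1 and (samples[-1] - samples[0]) / (len(samples) - 1) > limit

    # CPU that oscillates around 60%: never 10 consecutive violations,
    # but the average over ~2 hours of one-minute samples is still above 60%.
    cpu = [55, 70, 58, 72, 57, 75, 59, 73, 56, 74] * 12  # 120 samples
    print(consecutive_violations(cpu, 60, 10))  # False -> no alert today
    print(average_over_window(cpu, 60))         # True  -> would alert as requested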

Alert Test Report

I started a chat under ticket 119191 and discussed this with Seth. I would like you to consider this for your next roadmap. I want to be able to see what alerts would fire without enabling the alerts.

Scenario: onboarding 10 new devices to a new group with alerting disabled. I want to QUICKLY see how many alerts would fire if I enabled them; no hunting, no slowly turning each one up one by one to prevent the new-alert deluge. Maybe a report with the applied thresholds and current values, with clear indicators of which alert level each value falls within at the time the report is run.
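
A rough sketch of what such a "would fire" report could look like if assembled by hand. The get-instances step is a stand-in for however you would pull (device, datapoint, threshold, current value) tuples yourself; it is not something LogicMonitor provides, and the data below is made up. The threshold format mirrors the "warn error critical" string used elsewhere in this section:

    def level_for(value, threshold):
        """threshold is a 'warn error critical' string, e.g. '80 90 95'."""
        levels = list(zip(["warn", "error", "critical"], map(float, threshold.split())))
        hit = [name for name, limit in levels if value > limit]
        return hit[-1] if hit else "ok"

    def would_fire_report(instances):
        """instances: iterable of (device, datapoint, threshold, current value)."""
        for device, datapoint, threshold, value in instances:
            print(f"{device:20} {datapoint:25} {value:>8.2f} -> {level_for(value, threshold)}")

    # Example with made-up data:
    would_fire_report([
        ("APPSERVER01", "CPU.CPUBusyPercent", "80 90 95", 96.67),
        ("APPSERVER02", "CPU.CPUBusyPercent", "80 90 95", 42.10),
    ])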

Alerts rule continuance

I would like to have the option within an alert rule to "continue" processing to the next rule. For example, we would like to handle integrations differently than email alerts. If we could create one rule at the top with the highest priority to take an action with our integration, it would then allow me to customize everything else in separate rules. The only other way to handle this is to add our integration to every escalation chain we create, which is tedious and will lead to manual errors.
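
A sketch of the requested behaviour, with the routing logic and the "continue" flag being entirely hypothetical (alert rules have no such option today):

    # Rules are evaluated in priority order; today processing stops at the
    # first match. The proposed "continue" flag lets a high-priority rule
    # (e.g. the integration) act and still fall through to later rules.
    rules = [
        {"name": "Send to integration", "matches": lambda a: True,
         "action": "integration", "continue": True},   # proposed flag
        {"name": "Critical - Default", "matches": lambda a: a["level"] == "critical",
         "action": "email", "continue": False},
    ]

    def route(alert):
        actions = []
        for rule in rules:
            if rule["matches"](alert):
                actions.append(rule["action"])
                if not rule["continue"]:
                    break
        return actions

    print(route({"level": "critical"}))  # ['integration', 'email']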

Mail alerts conversations

I think it would be great if you added some headers to your mails. This would help mail programs group each alert and its cleared message into a conversation. Right now we only get separate messages:

LMD... critical - Host1 Ping PingLossPercent
LMD... critical - Host2 Ping PingLossPercent
LMD... ***CLEARED*** critical - Host2 Ping PingLossPercent
LMD... ***CLEARED*** critical - Host1 Ping PingLossPercent

In my opinion, it would be better if these messages formed a conversation per alert:

LMD... ***CLEARED*** critical - Host1 Ping PingLossPercent
LMD... critical - Host1 Ping PingLossPercent
LMD... ***CLEARED*** critical - Host2 Ping PingLossPercent
LMD... critical - Host2 Ping PingLossPercent

As far as I know, the relevant header is Thread-Index:
https://excelesquire.wordpress.com/2014/10/17/use-excel-to-count-the-number-of-emails-in-each-email-chain/
https://stackoverflow.com/questions/5506585/how-to-code-for-grouping-email-in-conversations
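
Thread-Index is Outlook-specific; most mail clients thread on the standard Message-ID / In-Reply-To / References headers instead. A minimal sketch of how a sender could thread an alert and its CLEARED follow-up, assuming a stable per-alert identifier is available (for example ##ALERTID##); the domain and ID format here are hypothetical:

    from email.message import EmailMessage

    def alert_message(alert_id, subject, body, cleared=False):
        """Build an alert email; the CLEARED message references the original."""
        msg = EmailMessage()
        root_id = f"<alert-{alert_id}@example.logicmonitor.com>"  # hypothetical domain
        if cleared:
            msg["Subject"] = "***CLEARED*** " + subject
            msg["Message-ID"] = f"<alert-{alert_id}-cleared@example.logicmonitor.com>"
            msg["In-Reply-To"] = root_id   # ties the clear to the original alert
            msg["References"] = root_id
        else:
            msg["Subject"] = subject
            msg["Message-ID"] = root_id
        msg.set_content(body)
        return msg

    original = alert_message("DS12345", "critical - Host1 Ping PingLossPercent", "...")
    cleared = alert_message("DS12345", "critical - Host1 Ping PingLossPercent", "...", cleared=True)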