15 April 2020 Office Hours - Alerts & Routing - Recording and Q&A
Thanks to all the attendees at last week’s Office Hours session.
- You can find the recording here: https://logicmonitor.wistia.com/medias/l1tusqhxg4 and the Q&A transcript below.
- If you would like to give feedback, go here: https://docs.google.com/forms/d/e/1FAIpQLScPWW5DzNxe2W5ieh6PjamLYWcP5AhDbUl1E3U7ZKryEgwEoA/viewform?usp=pp_url&entry.2118543627=2020-04-15.
- Our next session is planned for April 29th. You can register here: https://logicmonitor.zoom.us/webinar/register/1615849840118/WN_kBPpRBuZR3W3as2PcvYjcA?_ga=2.177082021.1444769543.1587337597-865086799.1576104046
Q&A Transcript
Q: Is there a way you can download or view alert history?
A: Hi Gary, you can configure an alert report to pull a historical view. https://www.logicmonitor.com/support/reports/report-types/alerts-report
Q: Can we extract a report to see what instances or datasources or groups have alerting off manually or have modified alert thresholds?
A: Yes, there is an alert thresholds report available in the report suite. https://www.logicmonitor.com/support/reports/report-types/alert-threshold-report
Q: Hi LM!
We rely on alert thresholds to provide insights on capacity management for our Cloud Based Phone System. One of our challenges is leveraging time-based thresholds to tell us if our traffic is greatly above or greatly below our typical usage.
With that said, do you have any advice on handling time-based alert thresholds for Saturdays and Sundays, when our usage is vastly different from weekdays?
A: Time-based thresholds are a good way to deal with situations like this. I may not have fully understood your question, but the principles are the same as for normal alert tuning, except that you also consider which thresholds are appropriate at different times. You would follow the same workflow and simply categorize the times differently. I recommend using the wizard to create time-based thresholds.
Q: RE: Time-based alerts cont.
Hi Mike,
My challenge is that an LM alert threshold for, say, 9am may work for weekdays but not be what I need for 9am on Saturday or Sunday.
A: Ah, that’s correct. I was thinking of escalation chains, which do allow for day of week. I do not believe that time based thresholds allow for this. Sorry about my confusion.
Q: Is there a way to still alert on something, but not route that alert anywhere if you're still utilizing a "catch-all" Alert Rule like "Error" or "Critical"? Basically.. we'd want to see them in the Alerts Dashboard, but not get emails/texts/calls about it. It would be useful to have a button to not route the alert, perhaps under the "Alert Routing" screen. I understand we can create high-level Alert Rules to not escalate those alerts, but I'm curious if there's a more granular option at the datasource/datapoint level. (Sorry for the book, just trying to be clear!)
A: You can get as granular as you like with alert routing. The key is to remember that the alerts will be evaluated against the alert rules in order of alert rule priority, ascending. You can use this property to handle this kind of filtering. If you want alerts to not be routed, even with a catch-all, simply make sure that the empty escalation chain is matched with a rule that comes before the routed catch-all. Does this make sense?
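The ordering described above can be sketched in Python. This is only an illustrative model, not LogicMonitor's implementation: the `AlertRule` fields, the rule names, and the "NOC on-call" chain are hypothetical, and an empty escalation chain is modeled as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlertRule:
    name: str
    priority: int                     # lower number is evaluated first
    level: str                        # e.g. "Error", or "*" for any level
    datasource: str                   # datasource name, or "*" for any
    escalation_chain: Optional[str]   # None models an empty chain (no notifications)


def matches(rule: AlertRule, level: str, datasource: str) -> bool:
    """A rule matches when both its level and datasource filters accept the alert."""
    return rule.level in ("*", level) and rule.datasource in ("*", datasource)


def route(rules: List[AlertRule], level: str, datasource: str) -> Optional[AlertRule]:
    """Return the first matching rule, checking rules in ascending priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, level, datasource):
            return rule
    return None


# Hypothetical rules: the quiet rule has a lower priority number than the catch-all.
rules = [
    AlertRule("Catch-all Error", 100, "Error", "*", "NOC on-call"),
    AlertRule("Quiet Ping errors", 10, "Error", "Ping", None),
]

quiet = route(rules, "Error", "Ping")   # matched by the quiet rule, nothing is sent
loud = route(rules, "Error", "CPU")     # falls through to the routed catch-all
```

In this model the Ping alert still exists and would still be visible, but because the quiet rule is evaluated before the catch-all, no notification is routed.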
Q: If I put a device in SDT, and that device is also a collector, I need to configure SDT again in the collectors section. Is there a way that when I schedule a device or group down that the collector is also scheduled down?
A: Not at this time. Collector alerts are handled differently than resource alerts.
Q: How do we trigger an alert whenever a device configuration changes?
A: The page https://www.logicmonitor.com/support/logicmodules/articles/creating-a-configsource describes how to create a ConfigSource. The section that begins with “Check Type: Any Change (diff check)” shows how to set this up.
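To illustrate the idea behind an "any change" diff check, here is a minimal Python sketch using the standard `difflib` module. It is not how ConfigSources are implemented; the function name and the sample configs are made up for the example.

```python
import difflib
from typing import List


def config_changes(previous: str, current: str) -> List[str]:
    """Return unified-diff lines between two config snapshots (empty if identical)."""
    diff = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="previous",
        tofile="current",
        lineterm="",
    )
    return list(diff)


# Hypothetical snapshots of a switch configuration.
old_config = "hostname sw1\ninterface Gi1/0/1\n description uplink\n"
new_config = "hostname sw1\ninterface Gi1/0/1\n description uplink to core\n"

changes = config_changes(old_config, new_config)
should_alert = bool(changes)   # any difference at all counts as a change
```

The diff lines themselves can be useful context in an alert, since they show exactly which lines were added or removed.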
Q: When a device already managed in LogicMonitor gets a hardware upgrade and the management IP of that host stays the same, I've seen the properties in LM remain tied to the old hardware. For example, if a 2960X switch gets replaced with a 9300, these are obviously two different device models with different software versions. Will LM automatically re-poll the PropertySource once the new host comes online (even though the monitored IP doesn't change), or do we have to go to Settings -> LogicModules -> PropertySources, find the PS associated with the host, and manually run it?
A: Generally, if you're not replacing hardware with something that is "apples to apples", I would suggest adjusting the old hostname or removing the sunsetted device first. The reason is that some modules may leave instances that no longer apply, depending on their configuration. That said, our discovery will attempt to apply or remove any LogicModules that meet the appropriate discovery criteria.
Q: With dynamic thresholds, how far back does LM look to determine what is normal? Does it take into account regular, periodic oscillations of the metric value?
A: Hi, dynamic thresholds are determined by three days of prior data. For more information you can check out the following. https://www.logicmonitor.com/support/alerts/tuning-alerts/enabling-dynamic-thresholds-for-datapoints
Q: Is there a way to configure LM to do speed test checking?
A: While I don't believe we have anything out of the box, you could likely script something for this. We have an example on our communities page: https://communities.logicmonitor.com/topic/2282-ookla-speedtest/
Q: Can we insert a list of the top processes consuming CPU in Linux into the CPU load alert mail?
A: It's probably doable. You'd have to create a Groovy-based complex datapoint that watches the CPU utilization datapoint. When that goes over threshold, the complex datapoint would run a Groovy script that could reach out to the device and gather the list of top processes. The script would then need to store that data as a property on the device. You'd then need to modify the alert message template so that the top-processes property is included in the alert message. This is very advanced stuff, and I would recommend reaching out to your CSM for help engaging our Professional Services team.
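As a rough illustration of just the "gather the list of top processes" step, here is a Python sketch that turns process-listing text into a short summary string that could be stored as a property value. It is not a LogicMonitor script (those would be Groovy), and it assumes the text came from a command like `ps -eo pid,pcpu,comm`; how the output is collected from the device and written back as a property is not shown. The function name and sample data are hypothetical.

```python
def top_processes(ps_output: str, count: int = 3) -> str:
    """Summarize the highest-CPU processes from 'PID %CPU COMMAND' style text."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:   # skip the header row
        pid, pcpu, command = line.split(None, 2)
        rows.append((float(pcpu), pid, command.strip()))
    rows.sort(reverse=True)                            # highest CPU usage first
    return "; ".join(
        f"{command} (pid {pid}, {cpu:.1f}%)" for cpu, pid, command in rows[:count]
    )


# Made-up sample output for the example.
SAMPLE = """\
  PID %CPU COMMAND
  812 71.3 java
 1044 12.0 postgres
    1  0.1 systemd
 2210  4.5 nginx
"""

summary = top_processes(SAMPLE, count=2)
```

A compact one-line summary like this is easier to drop into an alert message template than raw multi-line command output.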
Q: Hi LM,
First, thank you for this webinar. I recently started working with LM!
I'm still learning about your platform and I have a problem with the dashboard.
I'm collecting information via SNMP from a network device, and I get "NaN" when I try to use the data. My data is a string value, and it is visible in the raw data. Do you have any ideas on how I can use a string value?
Regards.
A: If it’s a string, but contains numeric data, you can use a regex match to extract the numbers. If it’s strictly text, you can use a regex to match it (returning 1 if found and 0 if not found). The help page that describes these options is here: https://www.logicmonitor.com/support/logicmodules/datasources/datapoints/normal-datapoints/
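The two approaches above can be sketched in Python with the standard `re` module. This only illustrates the regex logic, not the datapoint configuration itself; the function names, patterns, and sample strings are assumptions made for the example.

```python
import re
from typing import Optional


def extract_number(raw: str, pattern: str = r"(-?\d+(?:\.\d+)?)") -> Optional[float]:
    """Pull the first number out of a string, or return None if there is none."""
    match = re.search(pattern, raw)
    return float(match.group(1)) if match else None


def text_match(raw: str, pattern: str) -> int:
    """Return 1 if the pattern is found in the string and 0 if it is not."""
    return 1 if re.search(pattern, raw) else 0


# Hypothetical SNMP string values.
temperature = extract_number("Temperature: 42.5 C")
link_is_up = text_match("Link status: up", r"\bup\b")
```

Either way the result is numeric, so it can be graphed and alerted on instead of showing up as "NaN".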
Q: Are there plans to post this talk? I missed portions of it in the middle and would like to review at my convenience.
A: Hi Dennis, these are recorded, we'll be sure to get a link your way once it's ready.
Q: Will there be recordings of these sessions?
A: Hi Mike, we definitely record these! I'll make sure we get a link your way when this is ready.
Q: AutoLogout when Displaying a Dashboard on a Wall Mounted TV requiring multiple Re-Logins throughout the day
A: Turn on a slideshow. It should keep things refreshed.
Q: Integration with FreshService using Cluster Alerts is failing to provide the ##HOST## or ##SYSTEM.STATICGROUPS## details (leaving it blank in the subject line) at this time for our integration. Is there a known issue where cluster alerts will no longer provide the affected host in an alert subject line if they are part of cluster alerts?
A: This question was answered live. See https://www.logicmonitor.com/support/logicmodules/about-logicmodules/tokens-available-in-datasource-alert-messages#cluster-alerts for tokens that can be used in Cluster Alert notifications.
Q: Is there a way to audit how frequently dynamic thresholds decide to not send a notification? A high frequency of suppressions could indicate I should edit the static threshold.
A: That's a great idea. Be sure to submit feedback through the community or through your CSM to make sure that feature request gets added into the system.
Q: Hello, will this session be recorded for us to reference later with our teams?
A: Hi David, this will be recorded. We'll be sure to get you a link once it's ready.
Q: Hi, is there a way to get the historical data, such as once per minute readings of available free space and capacity?
A: Yes, simply expand the graph and you'll have the option to view and download the raw data.