Token to include DataSource raw output in email and alert body
We have script DataSources that output useful diagnostic information that helps Operations understand the numeric value when an alert is generated. We want to include the raw output from a DataSource in the alert and email body. What we need is a ##DSRAWOUTPUT## token that contains the complete raw output sent to standard out by a DataSource script. For example, we monitor for processes running under credentials they are not supposed to be running under, and we want to include that information as text in the alert/email body.
20 Views, 3 likes, 2 Comments
StatusPage.IO Monitoring
I have built a generic StatusPage.IO datasource to monitor the status of the various services we use. Since so many companies use StatusPage.io, I figured it would be a good idea to get a heads-up when there is an outage at one of our many service providers. This has worked well as an early warning system: our service desk learns about issues before end users start calling. LogicMonitor itself uses StatusPage, and of course there are many, many others. Attached is a screenshot of the Box.com StatusPage data that we've collected from https://status.box.com. This datasource should be universal to any statuspage.io site. So far it has worked against every site I have tested. NYJG6J
19 Views, 2 likes, 0 Comments
Custom Ping Intervals
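For the StatusPage.IO Monitoring post above, here is a minimal Python sketch of how such a check might turn a status page into a number that can be alerted on. It assumes the site exposes the public Statuspage endpoint /api/v2/status.json, whose status.indicator field is one of none, minor, major, or critical. The numeric mapping and the function names are illustrative assumptions, not the contents of the posted datasource.

```python
import json
from urllib.request import urlopen

# Assumed mapping from the page's overall indicator to a numeric datapoint.
INDICATOR_VALUES = {"none": 0, "minor": 1, "major": 2, "critical": 3}


def indicator_value(payload: dict) -> int:
    """Return a number for the overall status, or -1 if it is missing or unrecognised."""
    indicator = payload.get("status", {}).get("indicator")
    return INDICATOR_VALUES.get(indicator, -1)


def fetch_status(base_url: str, timeout: float = 10.0) -> dict:
    """Download and parse <base_url>/api/v2/status.json (requires network access)."""
    url = f"{base_url.rstrip('/')}/api/v2/status.json"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Offline example using a payload shaped like the API response.
sample = {
    "page": {"name": "Example"},
    "status": {"indicator": "minor", "description": "Minor Service Outage"},
}
print(f"status={indicator_value(sample)}")
```

Against a live site the call would look like indicator_value(fetch_status("https://status.box.com")), and a threshold such as "> 0" on the resulting datapoint would flag any degradation.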
Currently, LM has hard-coded the Ping DataSource to use 250 ms ICMP ping intervals. We need the flexibility to adjust the ping interval (ms) in the DataSource, either as a static value or through a system property. Background: we've seen at least one company, Mimosa, change its newest firmware to block ICMP messages if they are sent "too quickly". For Mimosa wireless gear, this shows up in LM as 80% packet loss (2 pings are permitted, the next 8 are rejected). Mimosa does not want its hardware resources depleted by multiple rapid ping requests. The current workaround is to alter the thresholds in LM to compensate for the 80% packet loss reading. With the ability to adjust the ping interval for these hosts in LM, we would have better visibility into network issues.
19 Views, 0 likes, 4 Comments
Ad-hoc script running
Often when an alert pops up, I find myself running some very common troubleshooting tools to quickly gather more info. It would be nice to get that info quickly and easily when an alert occurs, without having to go to other tools. For example, right now when we get a high CPU alert, the first thing I do is run pslist -s \\computername (PSTools are so awesome) and psloggedon \\computername to see who's logged in at the moment. I know it's possible to create a datasource that discovers all active processes and retrieves CPU/memory/disk metrics for a given process, but the processes on a given server might change pretty frequently, so you'd have to run Active Discovery frequently. It just doesn't seem like the best way, and most of the time I don't care what's running on the server and only need to know "in the moment." A way to run a script via a button for a given datasource would be a really cool feature. Maybe the datasource could hold a "gather additional data" or metadata script, which could then be invoked manually on an alert or datasource instance. That is, when an alert occurs, you could click a button in the alert called "gather additional data" or similar, which would run the script and show the output in a small box or window. The ability to run it periodically (every 15 seconds, every 5 minutes, etc.) would also be useful. This would also let a NOC troubleshoot a bit more, or provide some additional context around an alert, without everyone having to know a bunch of tools or have administrative access to a server.
15 Views, 1 like, 7 Comments
Alert Tuning for DataSource that has "Automatically Delete Instance" enabled?
I have a version of the "Oracle_DB_BlockedSessions" datasource template deployed and have set an alert threshold on a complex datapoint that accounts for WAIT_TIME and SECONDS_IN_WAIT. Here is the complex datapoint expression for those curious:
if( eq(if(un(WAIT_TIME),0,WAIT_TIME), 0), if(un(SECONDS_IN_WAIT_RAW),0,SECONDS_IN_WAIT_RAW), 0)
If the complex datapoint has a value over 300 seconds, an alert triggers with all the enriched instance-level autoProps from the Active Discovery script. All other aspects of this template mirror the gold-standard version, including having the "Automatically Delete Instance" option enabled. Enter Client X, who is comfortable with a threshold of 900 seconds. How can I set this custom threshold at a resource group for Client X when they don't currently have any blocking sessions? If I do manage to set this Alert Tuning customization while Client X has a blocking session, will it get wiped out when the DSIs are removed automatically? I suppose the Active Discovery script could be modified to always output a dummy instance... but that leaves an unpleasant taste in my mouth. Aside from cloning the datasource just for Client X, are there any other alternatives? And no, I do not want to alert off of the "Oracle_DB_BlockedSessionOverview" template, because it doesn't do a good job of distinguishing one really long blocking session from sequential, short-lived sessions that happen to exist at the time of the poll.
13 Views, 0 likes, 3 Comments
Datasource for API Gateway Resources behind a stage
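For the Oracle blocked-sessions post above, the complex datapoint expression can be read as: treat missing values as 0, and report SECONDS_IN_WAIT_RAW only while WAIT_TIME is 0. The Python sketch below mirrors that logic for illustration. Modelling "no data" as None and the function names are assumptions made here, and the 300 and 900 second thresholds are the ones quoted in the post.

```python
from typing import Optional


def un(value: Optional[float]) -> bool:
    """Stand-in for un(): true when the datapoint has no data (modelled as None)."""
    return value is None


def blocked_seconds(wait_time: Optional[float],
                    seconds_in_wait_raw: Optional[float]) -> float:
    """Mirror of the posted expression, evaluated for one poll of one instance."""
    wt = 0 if un(wait_time) else wait_time
    siw = 0 if un(seconds_in_wait_raw) else seconds_in_wait_raw
    # Only report the wait duration while WAIT_TIME is 0; otherwise report 0.
    return siw if wt == 0 else 0


def breaches(value: float, threshold: float) -> bool:
    """A '> threshold' alert condition."""
    return value > threshold


value = blocked_seconds(wait_time=0, seconds_in_wait_raw=450)
print(breaches(value, 300), breaches(value, 900))
```

With a 450 second wait, the default 300 second threshold fires while the 900 second threshold wanted for Client X does not, which is the difference the post is trying to configure.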
I have been using a custom datasource to collect the metrics for each resource and method (excluding OPTIONS) behind an API Gateway stage. It has been extremely useful in our production environments. I would share the datasource via the Exchange, but the discovery method I'm using will not be universal, so I think it would be best if that discovery worked natively. If possible, could we please have a discovery method for AWS API Gateway Resources by Stage?
*Something to note: this has the potential to discover quite a few resources and thus generate a substantial number of CloudWatch calls, which might affect customer billing. For this reason, I added a custom property, ##APIGW.stages##, so that I can plug in the specific stages I wish to monitor instead of having each one discovered automatically.
The Applies To looks like this:
system.cloud.category == "AWS/APIGateway" && apigw.stages
Autodiscovery is currently written in PowerShell (which is why not everyone can take advantage of it):
    $apigwID = '##system.aws.resourceid##'
    $region = '##system.aws.region##'
    $stages = '##APIGW.Stages##'

    # List every resource defined on the REST API
    $resources = Get-AGResourceList -RestApiId $apigwID -Region $region

    # Emit one instance per stage, resource path, and method (OPTIONS excluded)
    $stages.Split(' ') | ForEach-Object {
        $stage = $_
        $resources | ForEach-Object {
            if ($_.ResourceMethods) {
                $path = $_.Path
                $_.ResourceMethods.Keys | Where-Object { $_ -notmatch 'OPTIONS' } | ForEach-Object {
                    $wildvalue = "Stage=$stage>Resource=$path>Method=$_"
                    Write-Host "$wildvalue##${stage}: $_ $path######auto.stage=$stage"
                }
            }
        }
    }
12 Views, 0 likes, 1 Comment
API - Add Instance Count as Datasource Property
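For the API Gateway discovery post above, here is a rough Python sketch of the same line-building logic, for readers who cannot run the PowerShell version. It is not the author's script. It assumes the resources arrive as dictionaries with "path" and "resourceMethods" keys (the shape boto3's API Gateway get_resources call returns when methods are embedded), and it makes no AWS calls itself. Each output line appears to follow the wildvalue##wildalias##description####properties layout with an empty description, copied from the posted Write-Host format string.

```python
from typing import Dict, List


def discovery_lines(stages: str, resources: List[Dict]) -> List[str]:
    """Build one Active Discovery line per stage, resource path, and method."""
    lines = []
    for stage in stages.split(" "):
        for resource in resources:
            # Resources without methods (for example a bare "/") are skipped.
            methods = resource.get("resourceMethods") or {}
            path = resource["path"]
            for method in methods:
                if "OPTIONS" in method.upper():
                    continue  # the post excludes OPTIONS methods
                wildvalue = f"Stage={stage}>Resource={path}>Method={method}"
                lines.append(
                    f"{wildvalue}##{stage}: {method} {path}######auto.stage={stage}"
                )
    return lines


# Hypothetical sample data; real input would come from the API Gateway API.
sample_resources = [
    {"path": "/"},
    {"path": "/orders", "resourceMethods": {"GET": {}, "OPTIONS": {}}},
]
for line in discovery_lines("prod dev", sample_resources):
    print(line)
```

Keeping the stage list in a property, as the post does, bounds the instance count at stages × (resource, method) pairs, which in turn bounds the CloudWatch calls made per collection cycle.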
I'm trying to clean up datasources in our account that do not have any instances associated with them and likely never will. Currently I have to do this manually by inspecting each datasource in the GUI. It would be really great if the datasource instance count were returned as a property. Even better would be if the instances and associated device IDs were returned as well, but for now I'd be happy with just the device/instance counts.
10 Views, 0 likes, 4 Comments
Windows Network Adapters DataSource
Code is TXL3W9
This DataSource provides instances for each of the network adapters, including the following instance-level properties:
auto.TcpWindowSize
auto.MTU
auto.MACAddress
auto.IPSubnet
auto.IPAddress
auto.DNSHostName
auto.DNSDomain
auto.DefaultIPGateway
auto.SettingID
auto.Description
9 Views, 1 like, 0 Comments