Ad-hoc script running

Question

Often when an alert pops up, I find myself running some very common troubleshooting tools to quickly gather more info. It would be nice to get that info quickly and easily without having to go to other tools when an alert occurs. For example, right now, when we get a high CPU alert the first thing I do is run pslist -s \\computername (PSTools are so awesome) and psloggedon \\computername to see who's logged in at the moment.

I know it's possible to create a datasource to discover all active processes and retrieve CPU/memory/disk metrics specific to a given process, but the processes on a given server might change pretty frequently, so you'd have to run active discovery frequently. It just doesn't seem like the best way, and most of the time I don't care what's running on the server and only need to know "in the moment."

A way to run a script via a button for a given datasource would be a really cool feature. Maybe the datasource could hold a "gather additional data" or metadata script, which could then be invoked manually on an alert or datasource instance. For example, when an alert occurs, you could click a button in the alert called "gather additional data" or something similar, which would run the script and show the output in a small box or window. The ability to run it periodically (every 15 seconds or 5 minutes, etc.) would also be useful. This would also give a NOC the ability to troubleshoot a bit more or provide some additional context around an alert without everyone having to know a bunch of tools or have administrative access to a server.

tom_lasswell · Answer

@mnagel that's a good point. Ops Notes would be a great place to record that an alert was generated and which processes were running at the time. I pieced together a small PowerShell script that uses the perfmon counters (so we get the percentage without any additional calculations).

You could convert this to Groovy with a WMI query and use a complex datapoint with a Groovy calculation: if the CPU datapoint is above x, then calculate and post the Ops Note to the device.

Obviously this would need to be adjusted to fit LM wildcards and parameters.

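# Prompt for credentials to use against the remote hosts.
$Cred = Get-Credential  # or: New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $Pass
$HostList = @('server1', 'server2')

foreach ($CurrHost in $HostList)
{
    # Skip hosts that do not answer a single ping.
    if (Test-Connection -ComputerName $CurrHost -BufferSize 16 -Count 1 -ErrorAction SilentlyContinue -Quiet)
    {
        # Query the remote host and list every process using more than 1% CPU.
        Get-WmiObject Win32_PerfFormattedData_PerfProc_Process -ComputerName $CurrHost -Credential $Cred |
            ForEach-Object { if ($_.PercentProcessorTime -gt 1) { $_.Name + " " + $_.PercentProcessorTime } }
    }
}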

I've got a conference this week where I'll have some time and may work on this. One of the biggest things for our engineers is capturing the processes as they're running when the alert triggers; most of the time we miss it.

mosh · Answer

If a complex datapoint could inject this info into the alert template, now that'd be awesome :)

mike_suding · Answer

Just an idea... A few years ago I created a DataSource (PowerShell script) that detects if/when CPU is over 90% (or whatever you specify) and, if it is, gets the top 5 processes and user. http://blog.mikesuding.com/index.php/2015/11/07/windows-top-5/
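The "only collect when CPU is high" idea can be sketched roughly as below. This is not the script from the linked post, just a minimal illustration under stated assumptions: the function name Get-TopCpuProcess and its parameters are made up here, the input rows are assumed to look like Win32_PerfFormattedData_PerfProc_Process objects (Name, PercentProcessorTime), and the logged-on user lookup is left out.

```powershell
# Hypothetical helper (name and parameters are illustrative only).
# Returns the top N processes by CPU, but only when the host's overall
# CPU reading is at or above the threshold.
function Get-TopCpuProcess {
    param(
        [object[]] $Samples,        # rows with Name and PercentProcessorTime
        [double]   $TotalCpu,       # overall CPU % for the host
        [double]   $Threshold = 90,
        [int]      $Top = 5
    )

    # Below the threshold there is nothing to report.
    if ($TotalCpu -lt $Threshold) { return @() }

    # _Total and Idle are aggregate rows in the perf class, not real processes.
    $Samples |
        Where-Object { $_.Name -notin '_Total', 'Idle' } |
        Sort-Object -Property PercentProcessorTime -Descending |
        Select-Object -First $Top
}

# Possible wiring on Windows (assumes WMI access to the target host):
# $rows  = Get-WmiObject Win32_PerfFormattedData_PerfProc_Process -ComputerName 'server1'
# $total = (Get-WmiObject Win32_PerfFormattedData_PerfOS_Processor -ComputerName 'server1' |
#               Where-Object { $_.Name -eq '_Total' }).PercentProcessorTime
# Get-TopCpuProcess -Samples $rows -TotalCpu $total
```

Keeping the threshold check separate from the WMI query means the same function could be fed from a scheduled collection or from a manual run when an alert fires.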

mnagel · Answer

@Mike Suding I just read that post and it sounds hopeful, but I am confused about where the output would end up so that it is usable. It looks like it would dump into the collector log?

I know you and I have talked about this, and there have been some discussions recently about leveraging Ops Notes more readily to record information like this. Ideally, the information can be stashed somewhere (Ops Notes do seem like a good place if that could be supported within LM more readily), but also be easily accessible so it can be presented in alerts.

FWIW, we have a script we developed that gets the top 5 metrics via WMI alone (usage: /wm/bin/wmitop5 [--top N] [--sort-by {cpu|hnd|thd|mem}] [-A authfile] host). It is used within our alert templating system via the callback function. This is for our pre-LM legacy tool, and I really miss templating and callbacks :(.
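The wmitop5 script itself isn't shown in this thread, so the following is only a guess at how the --top and --sort-by options might behave, written in PowerShell to match the script earlier in the thread. The function name Select-TopProcess is hypothetical, and the mapping of the short keys to Win32_PerfFormattedData_PerfProc_Process properties (in particular mem to WorkingSet) is an assumption.

```powershell
# Hypothetical sketch; not the actual wmitop5 implementation.
function Select-TopProcess {
    param(
        [object[]] $Samples,   # rows shaped like Win32_PerfFormattedData_PerfProc_Process
        [ValidateSet('cpu', 'hnd', 'thd', 'mem')]
        [string]   $SortBy = 'cpu',
        [int]      $Top = 5
    )

    # Assumed mapping from short sort keys to perf class property names.
    $sortKeyMap = @{
        cpu = 'PercentProcessorTime'
        hnd = 'HandleCount'
        thd = 'ThreadCount'
        mem = 'WorkingSet'
    }
    $property = $sortKeyMap[$SortBy]

    # Drop the aggregate rows, sort by the chosen metric, keep the top N.
    $Samples |
        Where-Object { $_.Name -notin '_Total', 'Idle' } |
        Sort-Object -Property $property -Descending |
        Select-Object -First $Top -Property Name, $property
}
```

The output is just a process name plus the chosen metric, which is small enough to drop into an alert message or an Ops Note.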

tom_lasswell · Answer

@Mike Suding I saw this before. I'm thinking about adapting the idea to push the query through a ConfigSource and grab the output that way.

@mnagel Ops Notes seem like an interesting idea as well.
