Forum Discussion

Zahir_Sheikkari
2 years ago

[Feature request] Custom PowerShell script: increase timeout

Hello,

I was advised to go here after chatting with support.

I have run into an issue where our custom PowerShell script hits the hardcoded 120-second timeout. We're initially testing it via the Test button. The error after exactly 120 seconds is: “Test script failed - no response was received from the Collector”.

The script runs okay in PowerShell 5.1, but it averages around 2.6 minutes.

We have already cut the runtime down from around 4 minutes to 2.6 minutes, and that's the best we can do.

It starts producing Write-Host output within 50 seconds for each object.

We are pulling data from public API endpoints and performing some logic on 2 large arrays.

The script will run against a dedicated collector.

The issue occurs in the sandbox as well as in the production environment.

  • I think the ask here is to be able to tune the timeout per script -- not a bad idea in general. However, I think the actual feature you want is what Nagios and other monitoring systems call “passive checks”. We used that ALL the time to have long-running scripts generate results and inject them into monitoring. LM does not have that, but it does have Script Caching, which could potentially be useful to run scripts and fetch results asynchronously. Sadly, that is not a general distributed facility like Redis (which I’d recommended years ago); it only works for Groovy scripts. There is no PowerShell binding (perhaps I am wrong on that, but I don’t see it in the documentation).

    You could fake this yourself with files, though. You would need to schedule your script to run asynchronously (e.g., via Task Scheduler) and write results into a known location on the collector (or a filesystem reachable by the collector). Then your datasource would just need to fetch the latest results from the filesystem and process them quickly. You’d want to check whether the data is too old so you can produce no results if that is the case.
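
    A minimal sketch of that file-based approach (the paths, the JSON hand-off, and the 10-minute staleness window are all illustrative choices):

      # writer.ps1 -- scheduled via Task Scheduler; does the slow work out-of-band
      $items = Invoke-RestMethod -Uri 'https://api.example.com/items'   # hypothetical endpoint
      # ... the long-running logic over the large arrays would go here ...
      $items | ConvertTo-Json -Depth 5 | Set-Content -Path 'C:\LMCache\results.json'

      # reader.ps1 -- the actual datasource collection script; returns well inside 120 s
      $cache = Get-Item -Path 'C:\LMCache\results.json' -ErrorAction Stop
      # Produce no results if the cached data is too old
      if (((Get-Date) - $cache.LastWriteTime) -gt (New-TimeSpan -Minutes 10)) { exit 1 }
      $data = Get-Content -Path $cache.FullName -Raw | ConvertFrom-Json
      foreach ($item in $data) {
          Write-Host "$($item.Name)=$($item.Value)"   # key=value lines for datapoint parsing
      }
      exit 0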

  • Have you tried changing any of the collector’s agent.conf settings such as the collector.script.timeout value?
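
    For example, something like this in agent.conf (a sketch -- whether the value is in seconds, and exactly which scripts it governs, are assumptions to verify against the docs; the collector needs a restart after editing):

      # agent.conf on the collector
      collector.script.timeout=300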

  • You could fake this yourself with files, though. You would need to schedule your script to run asynchronously (e.g., via Task Scheduler) and write results into a known location on the collector (or a filesystem reachable by the collector). Then your datasource would just need to fetch the latest results from the filesystem and process them quickly. You’d want to check whether the data is too old so you can produce no results if that is the case.

    Interesting last paragraph. Will look into this, thanks.

  • I love this idea, and it might be a nice vehicle to pivot away from our vendor-specific language around data collection types (e.g., ConfigChecks = Text, PropertySources = Metadata, DataSources = Metrics, etc.).

    Whenever I hit these limits within DataSources, I’ll pivot to another data collection method. For long-running scripts I would generally choose NetScan, ConfigSource, or PropertySource methods; each of these has a longer timeout associated with it and can be used for time-intensive collection. Otherwise, there are almost always script optimizations which enormously speed up collection or processing (concepts like “filter left”), or alternative collection processes can split the stages between different scripts within a LogicModule (e.g., use a scheduled NetScan to pull data to the Collector, then use this as a cache to speed up subsequent steps).
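
    To illustrate “filter left” in PowerShell (the endpoint, query parameter, and property names below are made up):

      # Slow: pull everything across the wire, then filter late in the pipeline
      $all    = Invoke-RestMethod -Uri 'https://api.example.com/objects'
      $wanted = $all | Where-Object { $_.status -eq 'active' }

      # Faster: push the filter to the API so far less data is transferred and parsed
      $wanted = Invoke-RestMethod -Uri 'https://api.example.com/objects?status=active'

      # Likewise, when correlating two large arrays, index one side in a hashtable
      # instead of nesting loops -- O(n+m) rather than O(n*m)
      $index = @{}
      foreach ($a in $arrayA) { $index[$a.id] = $a }
      $joined = foreach ($b in $arrayB) {
          if ($index.ContainsKey($b.id)) { [pscustomobject]@{ A = $index[$b.id]; B = $b } }
      }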

  • @Zahir Sheikkariem - I confirmed with our Product team that the timeout value is not hardcoded for Groovy scripts. For some reason, the 120 value only affects PowerShell scripts.

    If you can’t change over to using Groovy, @Jacob Ortony sent me these tips for making PowerShell scripts run faster.
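
    One classic example of that kind of tip (illustrative, not quoted from that list): avoid growing arrays with += inside a loop, since each += copies the whole array:

      # Slow: += re-allocates and copies the array on every iteration (quadratic overall)
      $out = @()
      foreach ($i in 1..100000) { $out += $i * 2 }

      # Fast: a generic List grows in place
      $out = [System.Collections.Generic.List[int]]::new()
      foreach ($i in 1..100000) { $out.Add($i * 2) }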

  • Have you tried changing any of the collector’s agent.conf settings such as the collector.script.timeout value?

    Yes, I've raised various timeout settings, including this one. Chat support confirmed there is a hardcoded setting of 120 seconds which cannot be increased. They weren’t aware of why this is. They even recommended not changing the 120 value at all.

  • Anonymous

    LM does not have that

    Why not (except for licensing) use push metrics? Schedule the task using Task Scheduler and have the last part of the script push the metrics to LM via the API. Actually, it would run better as a service that repeats itself: with Task Scheduler, you run the risk of the previous script still running when the next one kicks off (unless the schedule interval is significantly longer than the run time).
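
    A rough sketch of that self-repeating pattern (the 5-minute interval and the worker-script path are assumptions; the actual push to LM's API would live inside the worker):

      # loop-runner.ps1 -- start once (as a service or a single Task Scheduler entry)
      while ($true) {
          $started = Get-Date
          try {
              # hypothetical worker: gathers the data, then POSTs it to LM
              & 'C:\Scripts\collect-and-push.ps1'
          } catch {
              Write-Warning "Collection run failed: $_"
          }
          # Sleep out the remainder of the cycle; if a run overshoots, the next
          # one starts immediately instead of stacking up behind it
          $remaining = [TimeSpan]::FromMinutes(5) - ((Get-Date) - $started)
          if ($remaining -gt [TimeSpan]::Zero) {
              Start-Sleep -Seconds ([int]$remaining.TotalSeconds)
          }
      }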