Forum Discussion

mmatec01
2 years ago

Collectors and TCP Ephemeral ports exhaustion detection

Lately we have been experiencing a nasty issue with our collectors, whereby a collector runs out of available ephemeral ports. When that happens, all communication grinds to a halt, including DNS lookups, domain authentication, and outbound WMI calls (meaning there is NO collection for Windows servers). Usually when this happens I see hundreds of "Warn - XXXXXX System Uptime SystemUpTime No Data" alerts filling up my inbox, since I have No Data alerting configured. The fix, albeit a temporary one, is to reboot the collector, which immediately reclaims the resources.
Now, while the question of why this is happening is very important, and something I am definitely looking into and researching both with and without vendor support, my question today is more practical. What can be done to monitor and "forecast", if you will, that a collector is about to go dead because it is running out of TCP ports?
I can look at the values collected by the TCP stats DataSource and monitor Connections, TCP Failed Connections, and Segments per second. These are all fine to monitor, but they don't warn me that something is about to happen, because their values skyrocket only AFTER the fact. Similarly, there are hundreds of metrics under the Collector DataSources, but I am at a loss as to which one(s) to look at and set alerts on.
Is there something, like running netstat, looking at the number of handles per process in Task Manager, or running some command that I can script, whose output I can capture programmatically and that speaks to the issue at hand?
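
To make the netstat idea concrete, here is a minimal sketch of the kind of script I have in mind, not a tested solution. It assumes a Windows collector with Python available, the default dynamic port range of 49152-65535 (check yours with `netsh int ipv4 show dynamicport tcp`), and the usual five-column `netstat -ano` layout. The names `parse_netstat`, `usage_percent`, and the 80% `WARN_PERCENT` threshold are placeholders of mine, not anything from the product.

```python
import subprocess
from collections import Counter

# Assumed default Windows dynamic (ephemeral) TCP port range.
# Confirm on the collector with: netsh int ipv4 show dynamicport tcp
EPHEMERAL_START = 49152
EPHEMERAL_END = 65535
WARN_PERCENT = 80.0  # hypothetical alert threshold, tune for your environment


def parse_netstat(output, start=EPHEMERAL_START, end=EPHEMERAL_END):
    """Return (unique local ports, count per state, count per PID) for
    TCP rows whose local port falls inside the ephemeral range."""
    ports = set()
    by_state = Counter()
    by_pid = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Expected row: TCP  local:port  remote:port  STATE  PID
        if len(fields) != 5 or fields[0].upper() != "TCP":
            continue
        try:
            # rsplit handles IPv6 forms such as [::1]:50000
            port = int(fields[1].rsplit(":", 1)[1])
        except (IndexError, ValueError):
            continue
        if start <= port <= end:
            ports.add(port)
            by_state[fields[3]] += 1
            by_pid[fields[4]] += 1
    return ports, by_state, by_pid


def usage_percent(ports, start=EPHEMERAL_START, end=EPHEMERAL_END):
    """Share of the ephemeral range currently occupied, in percent."""
    return 100.0 * len(ports) / (end - start + 1)


def main():
    try:
        # -a: all sockets, -n: numeric addresses, -o: owning PID
        result = subprocess.run(
            ["netstat", "-ano"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"could not run netstat: {exc}")
        return
    ports, by_state, by_pid = parse_netstat(result.stdout)
    pct = usage_percent(ports)
    print(f"ephemeral ports in use: {len(ports)} ({pct:.1f}%)")
    for state, count in by_state.most_common():
        print(f"  state {state}: {count}")
    for pid, count in by_pid.most_common(5):
        print(f"  PID {pid}: {count}")
    if pct >= WARN_PERCENT:
        print("WARNING: ephemeral port usage is above the threshold")


if __name__ == "__main__":
    main()
```

Run on a schedule, the percentage would give a trend to alert on before the range is exhausted, and the per-PID and per-state counts (for example a pile of TIME_WAIT entries or one process holding most of the ports) might point at the culprit.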
