5 minutes ago, Stuart Weenig said:
This is the recommended architecture.
If your firewalls are blocking legitimate business traffic, they need to not do that.
Folks are not going to place collectors in every subnet, and with increasing security concerns there will be more and more situations where this is an issue.
As far as "blocking legitimate traffic" goes, that is not what is happening here (the OP specifically said the firewall was wide open). The firewall is allowing the traffic, but firewalls track sessions, and due to bad programming, LM triggers firewalls to block traffic in some cases. For example, we had a remote location (all WAN sites transit firewalls, a very common architecture) that had suffered a power outage. Pings began failing because LM reuses the same ICMP ID forever, and the session the firewall had originally established was no longer valid.
As I mentioned, I escalated this to our CSM in 2018 and got back "I get it, but you need to open a feature request". Since then, someone at LM has at least figured out that this needs handling for SNMP. This is what we were provided, and it works fairly well (it seems to target only SNMPv3, but we use that whenever possible, so that is OK):
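For illustration, here is a minimal Python sketch of the kind of behavior that would sidestep the stale-session problem: picking a new ICMP identifier after a run of failed probes, so a stateful firewall should see the next echo request as a new session. This is not LM's code. The `Prober` class, its `max_failures` threshold, and the helper names are all hypothetical, and the sketch only builds the packets (sending them would need a raw socket and elevated privileges).

```python
import os
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for echo request (RFC 792)


def internet_checksum(data: bytes) -> int:
    """Standard 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int,
                       payload: bytes = b"probe") -> bytes:
    """Build an ICMP echo request with the given identifier and sequence."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum,
                         identifier, sequence)
    return header + payload


def new_identifier() -> int:
    """Random 16-bit ICMP identifier."""
    return int.from_bytes(os.urandom(2), "big")


class Prober:
    """Hypothetical prober that rotates its ICMP ID after repeated failures."""

    def __init__(self, max_failures: int = 3):
        self.identifier = new_identifier()
        self.sequence = 0
        self.failures = 0
        self.max_failures = max_failures  # assumed threshold, not an LM setting

    def next_packet(self) -> bytes:
        self.sequence = (self.sequence + 1) & 0xFFFF
        return build_echo_request(self.identifier, self.sequence)

    def record_result(self, replied: bool) -> None:
        if replied:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            # The old ID may be tied to a stale firewall session: pick a new one.
            old = self.identifier
            while self.identifier == old:
                self.identifier = new_identifier()
            self.failures = 0
```

The point of the design is that the identifier changes without restarting the whole process, which is what a collector restart currently achieves the hard way.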
By default, the collector does not change the SNMP library session until a collector restart, which is why that resolves the issue. You may be able to work around this by adjusting the following fields in the collector debug.
snmp.shareThreads.impl.v3.switchport.enable=true
snmp.shareThreads.impl.v3.initialCheckDelay.minutes=3
We frequently get annoyed tickets from clients who are told by LM that a host is not responding to ping, which they trivially prove wrong. Our only fix when it happens is to restart the collector. You know, rather than LM fixing the broken code.