Lewis_Beard
2 years ago

Network Interfaces vs Interfaces (64 bit)-

Over the last few months I've had lots of problems with collectors returning no data or spotty data. Huge numbers of batchscript threads get stuck in the queue, and the collector's status page warns that the batchscript is jammed and that there are too many dead tasks. I've also had lots of collector restarts due to dead tasks.

It seems the culprit is the Network Interfaces datasource. I guess Interfaces (64 bit)- is older, but I actually prefer it.

I had a deployment of 12 collectors that were using NI and the batchscripts (which Network Interfaces apparently relies on), and I was getting all kinds of batchscript collector alerts and what-not. So I asked an LM engineer, and they told me that it wasn't really fully optimized. The number of threads stuck in queue was insane, even though I was using double extra large collectors.

So since it was a new deployment, I switched to the older Interfaces (64 bit)-, and disabled polling on NI completely, and PROBLEM SOLVED. Now my 12 collectors and my 8000+ devices are collecting smooth as silk.

Is this a known issue? Or has anyone had a similar problem? What's the deal with Network Interfaces? Is it still viable to use the older one that has no dependencies or bottlenecks? Seems to be.

Thanks!

 

3 Replies

  • I was just dealing with this over the weekend. I have been running NI side by side with the original for a while, and I was starting to adjust dashboards so I could finally stick with NI and retire snmp64_If-. Sadly, I saw similar weirdness, including a gigabit interface reporting over a terabit per second. I updated to the latest version, but it does not look like that has helped at all. It does work in some cases, but it needs to be flawless in all cases. Sadly, this is something the dev team at LM is often incapable of delivering, including in this module. From discussions with support, they consider NI to be the current/best solution and snmp64* deprecated, but as you have found, they are pushing a bad solution.

  • I should also add that, while it may not be a problem for many, the NI module corrupts the interface description.  We historically used '#' as a comment leader to ensure interfaces could have labels but not be included in alerting. Our code for that with snmp64_If- was fine, but it broke with NI. Why? The data format for instances uses '##' as a separator internally (instead of something structured like JSON).  So they just strip '#' from the interface description. I suggested they allow a property to specify a replacement character and the suggestion was rejected -- they are perfectly happy to corrupt data with no recourse.

  • Thanks for the response. I really at this point just want to disable polling on NI completely, but it would be a pain to judge impact.
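
For anyone hitting the '#' stripping described above: here is a minimal Python sketch of the replacement-character idea suggested in the thread. It assumes instance lines are built by joining fields with '##', as the reply describes. The function names, the three-field layout, and the default replacement character '~' are all hypothetical and are not part of any LogicMonitor module.

```python
SEPARATOR = "##"  # field separator, as described in the thread


def sanitize(field, replacement="~"):
    """Replace '#' rather than dropping it, so the marker stays recoverable."""
    if "#" in replacement:
        raise ValueError("replacement must not contain '#'")
    return field.replace("#", replacement)


def format_instance(wildvalue, alias, description, replacement="~"):
    """Build one '##'-delimited instance line from sanitized fields (assumed layout)."""
    fields = (wildvalue, alias, description)
    return SEPARATOR.join(sanitize(f, replacement) for f in fields)


def parse_instance(line):
    """Split an instance line back into its fields."""
    return line.split(SEPARATOR)


if __name__ == "__main__":
    line = format_instance("3", "Gi0/3", "#lab uplink, no alerting")
    print(line)
    print(parse_instance(line))
```

With this approach, a description that starts with '#' still carries a distinguishable leading character after sanitizing, so alert filtering could key on the replacement character instead of losing the marker entirely.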
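
The terabit reading on a gigabit interface mentioned above is the kind of value a simple plausibility check can flag. Below is a minimal Python sketch, assuming two samples of a 64-bit octet counter (such as ifHCInOctets) and the interface speed in Mbps (such as ifHighSpeed). The function names and the 5% tolerance are hypothetical, and this is not a description of how either datasource computes its rates. The thread does not establish what caused the bad value. A counter reset misread as a wrap is only one general way such numbers can arise.

```python
COUNTER64_MODULUS = 2 ** 64


def bits_per_second(prev_octets, curr_octets, interval_s):
    """Rate from two Counter64 samples, treating a decrease as a single wrap."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (curr_octets - prev_octets) % COUNTER64_MODULUS
    return delta * 8 / interval_s


def is_plausible(rate_bps, speed_mbps, tolerance=1.05):
    """True if the rate does not exceed the link speed plus a small margin."""
    return rate_bps <= speed_mbps * 1_000_000 * tolerance


if __name__ == "__main__":
    normal = bits_per_second(0, 7_500_000_000, 60)
    print(normal, is_plausible(normal, 1000))

    # A counter that dropped (e.g. after a reset) is treated as a wrap here,
    # which produces an enormous delta that the check rejects.
    after_reset = bits_per_second(10 ** 15, 1000, 60)
    print(after_reset, is_plausible(after_reset, 1000))
```

Running both datasources side by side, as described in the thread, and comparing their rates against a check like this is one way to judge which readings to trust before retiring either one.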