Forum Discussion

pgordon
Advisor
2 years ago

Spanning tree/loop detection alerts

I’m looking to see if anyone has a way of detecting network loops, or more generally of using spanning tree information to alert when a loop has been detected. It seems like something that would be built in, but I can’t find anything. I’m mostly trying to catch end users who plug in things they shouldn’t.

  • Have you considered using LM Logs to help you identify such conditions?

  • If you are not able to get the network configured properly for loop prevention, the best you can probably do is monitor for heavier-than-normal non-unicast (broadcast and multicast) traffic via dynamic thresholds.

  • This is generally not something you will be able to do from LM directly, though there is the BRIDGE-MIB you could scan for unexpected BPDU reception (bear in mind the BRIDGE-MIB is not directly VLAN-aware and you must use indexes or contexts to select VLANs other than VLAN 1). 

    The way you would normally protect the network depends on the platform, but the general solution is to set all edge ports to edge mode (sometimes manually, for example Cisco with ‘spanning-tree portfast’ and similar, sometimes automatically, for example ProCurve auto-edge detection). You then ensure that any edge port receiving a BPDU (which is how loops show up) either converts back to normal mode (for auto-edge) or shuts down (bpduguard).

    The trick with Cisco bpduguard is that there is no MIB to tell you this happened, but you can see it from ‘show interface status err-disabled’. We wrote an eventsource to detect err-disabled ports via the CLI using SSH. It works well, but because the eventsource system in LM is so horrible you get inundated with repeated alerts you cannot ACK (though the system pretends you can). As long as you know that, you can work around it using SDT.

    Something along the lines of detecting a status change like that on the ports sounds like a good idea.

    The challenge in my environment is that we have end users who may plug things into network ports that they shouldn’t, and they end up causing loops. We also have a very mixed environment where things aren’t necessarily configured correctly, and for some of the orgs we watch we only provide support when there are issues (so unfortunately we can’t always go in and configure these settings the way they should be). Finally, we sometimes get a vendor that misconfigures things, and it isn’t always obvious at first.
    I really wish this all wasn’t the case, but it is what it is. Any other ideas are welcome.

    Thank you!
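
For the non-unicast traffic suggestion earlier in the thread, here is a rough Python sketch of the idea, not LM's actual dynamic threshold logic. It assumes you already have two samples of a port's broadcast and multicast packet counters (for example ifHCInBroadcastPkts and ifHCInMulticastPkts from the IF-MIB) plus a short history of earlier rates. The function names, the dictionary keys, and the "mean plus k standard deviations" rule are all made up for illustration.

```python
from statistics import mean, stdev


def nonunicast_rate(prev, curr, interval_s):
    """Broadcast + multicast packets per second between two counter samples."""
    delta = (curr["bcast"] - prev["bcast"]) + (curr["mcast"] - prev["mcast"])
    # Clamp at zero so a counter reset does not produce a negative rate.
    return max(delta, 0) / interval_s


def is_anomalous(rate, history, k=3.0):
    """True if rate is more than k standard deviations above the historical mean."""
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    return rate > mean(history) + k * stdev(history)


# Hypothetical numbers: a quiet port that suddenly sees a broadcast storm.
history = [120.0, 135.0, 110.0, 128.0, 140.0]   # earlier rates, packets/sec
prev = {"bcast": 1_000_000, "mcast": 500_000}   # counters at the previous poll
curr = {"bcast": 4_000_000, "mcast": 500_000}   # counters 60 seconds later

rate = nonunicast_rate(prev, curr, interval_s=60)
print(rate, is_anomalous(rate, history))
```

A spike like this only says that something is flooding the segment. It does not prove a loop, so it is best treated as a hint to go and look at the switch.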

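
To illustrate the err-disabled check described in the reply above, here is a minimal Python sketch of the parsing step only. It is not the eventsource mentioned in the thread. It assumes you have already captured the command output over SSH by some other means, and the sample text is a made-up approximation of the output; the real column layout varies by platform and software version. The names `parse_err_disabled` and `bpduguard_ports` are hypothetical.

```python
import re

# Port is the first token; the reason is the token right after "err-disabled".
# The header row is skipped because its column title is capitalized differently.
ERR_LINE = re.compile(r"^(\S+)\s+.*?\berr-disabled\s+(\S+)")


def parse_err_disabled(output):
    """Return a list of (port, reason) tuples found in the CLI output."""
    ports = []
    for line in output.splitlines():
        match = ERR_LINE.match(line.strip())
        if match:
            ports.append((match.group(1), match.group(2)))
    return ports


def bpduguard_ports(output):
    """Return only the ports whose err-disable reason is bpduguard."""
    return [port for port, reason in parse_err_disabled(output)
            if reason == "bpduguard"]


# Illustrative sample only; real output may differ.
SAMPLE = """\
Port      Name               Status       Reason               Err-disabled Vlans
Gi1/0/5                      err-disabled bpduguard
Gi1/0/7   Printer            err-disabled psecure-violation
"""

print(bpduguard_ports(SAMPLE))
```

Filtering on the reason keeps other err-disable causes, such as port security violations, out of the loop alerts.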