Re: Anyone had success with SNMP and reboot required state

On 2/1/2022 at 5:13 PM, Stuart Weenig said:
Yes, many of the SSH datasources available out of the box provide superior monitoring and detail vs. the SNMP data that is pollable. Almost. There's more SNMP monitoring that happens on Linux than on Windows, because Windows has WMI, which, like SSH, provides richer monitoring data. Also, just because you enable SSH monitoring doesn't mean you're necessarily disabling SNMP monitoring. Do both.

Hi Stuart, thanks for your insight :) I think we will hold off on the SSH monitoring; we have a lot of machines where SSH is disabled as a security measure and only enabled during maintenance. We want to try the OpenMetrics route instead :)

Re: Anyone had success with SNMP and reboot required state

2 minutes ago, Stuart Weenig said:
Enabling ssh would get you much more than just this one custom datasource. Worth looking into IMO.

From what I've understood, the out-of-the-box monitoring of Linux machines in LogicMonitor is broader with SNMP? Is there anything in particular about the SSH monitoring that is superior to the SNMP monitoring (speaking out of the box in LogicMonitor)?

Re: Anyone had success with SNMP and reboot required state

Yeah, it's easy with SSH; we already have a bash script that can check the reboot status of a Linux machine, but we'd like to avoid using SSH as we only use SNMP right now. It's quite a lot of work to set up SSH on all devices just to have this one extra reboot-pending check.

Anyone had success with SNMP and reboot required state

Hi Forums, I wanted to hear if anyone has had success monitoring the reboot-pending state with LogicMonitor and SNMP? Mainly SNMPD on Ubuntu. Otherwise it may be a job for SSH monitoring or OpenMetrics with Telegraf?
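To make the SNMP idea concrete: on Ubuntu, update-notifier drops a flag file at /var/run/reboot-required when a reboot is pending, so the check itself is trivial. What I had in mind was exposing that through snmpd's extend mechanism, something like this untested sketch (the script path and extension name are placeholders I made up):

#!/bin/sh
# /usr/local/bin/check-reboot-required.sh (hypothetical helper)
# Prints 1 when Ubuntu has staged a reboot, 0 otherwise.
[ -e /var/run/reboot-required ] && echo 1 || echo 0

# /etc/snmp/snmpd.conf
extend reboot-required /usr/local/bin/check-reboot-required.sh

After an snmpd restart, the value should be readable under NET-SNMP-EXTEND-MIB (assuming the MIB files are installed and your VACM view allows that subtree):

snmpget -v2c -c public localhost 'NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."reboot-required"'

A custom SNMP DataSource polling that OID could then alert on the value, in theory.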
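And if we end up going the OpenMetrics way instead, I'd guess a Telegraf fragment along these lines would do it (also untested; the measurement name is made up, and the exec command just wraps the same flag-file check):

[[inputs.exec]]
  ## Emits "reboot_required value=1i" in influx line protocol when the
  ## Ubuntu flag file exists, otherwise value=0i.
  commands = ["sh -c 'echo reboot_required value=$([ -e /var/run/reboot-required ] && echo 1 || echo 0)i'"]
  data_format = "influx"

[[outputs.prometheus_client]]
  ## Expose the metric on an endpoint the LogicMonitor OpenMetrics collection can scrape.
  listen = ":9273"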
Re: Various Linux distros - SNMP disk OID change

I spun up a vanilla Ubuntu 20.04 server with the same SNMPD version (just the latest from apt), and I could not trigger this behaviour until I started adding more disks. On this VM I wasn't able to trigger it with a service restart for some reason, only with reboots, and some of the reboots also yielded the same indexes, so it doesn't happen every time.

*** Added more disks ***
1  => Physical memory
3  => Virtual memory
6  => Memory buffers
7  => Cached memory
8  => Shared memory
10 => Swap space
35 => /run
36 => /
38 => /dev/shm
39 => /run/lock
40 => /sys/fs/cgroup
70 => /run/snapd/ns
72 => /run/user/1000
73 => /data/backups/disk-1
74 => /data/backups/disk-2
75 => /data/backups/disk-3
76 => /data/backups/disk-4
77 => /data/backups/disk-5

*** REBOOT ***
1  => Physical memory
3  => Virtual memory
6  => Memory buffers
7  => Cached memory
8  => Shared memory
10 => Swap space
35 => /run
36 => /
38 => /dev/shm
39 => /run/lock
40 => /sys/fs/cgroup
65 => /data/backups/disk-5
66 => /data/backups/disk-2
67 => /data/backups/disk-3
68 => /data/backups/disk-1
69 => /data/backups/disk-4
75 => /run/snapd/ns
77 => /run/user/1000

*** REBOOT ***
1  => Physical memory
3  => Virtual memory
6  => Memory buffers
7  => Cached memory
8  => Shared memory
10 => Swap space
35 => /run
36 => /
38 => /dev/shm
39 => /run/lock
40 => /sys/fs/cgroup
65 => /data/backups/disk-4
67 => /data/backups/disk-3
68 => /data/backups/disk-1
72 => /data/backups/disk-5
73 => /data/backups/disk-2
75 => /run/snapd/ns
77 => /run/user/1000
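If anyone wants to reproduce those listings, they are just a walk of hrStorageDescr post-processed into "index => description" pairs; as far as I can tell this is the same table the Filesystem DataSources discover against. Roughly like this, assuming the localhost read-only community from our config (we use SNMPv3 remotely, so adjust the flags to your setup):

# HOST-RESOURCES-MIB::hrStorageDescr (.1.3.6.1.2.1.25.2.3.1.3) holds the
# index => description mapping.
snmpwalk -v2c -c public -On localhost .1.3.6.1.2.1.25.2.3.1.3 \
  | sed -E 's/.*\.25\.2\.3\.1\.3\.([0-9]+) = STRING: (.*)/\1 => \2/'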
Re: Various Linux distros - SNMP disk OID change

Hi @Stuart Weenig, thanks for your detailed analysis. I have a bit more to add. Our SNMPD config is nothing crazy; please note we use SNMPv3 (I don't know if this makes any difference to the discovery logic compared to v1/v2c, I think not). Here is the SNMPD config we use (pushed with Puppet):

Quote
agentaddress udp:161

# Traditional Access Control
rocommunity public 127.0.0.1
rocommunity6 public ::1

# VACM Configuration
# sec.name source community
com2sec notConfigUser default public
com2sec6 notConfigUser default public

# groupName securityModel securityName
group notConfigGroup v1 notConfigUser
group notConfigGroup v2c notConfigUser

# group context sec.model sec.level prefix read write notif
access notConfigGroup "" any noauth exact systemview none none

# name incl/excl subtree mask(optional)
view systemview included .1.3.6.1.2.1.1
view systemview included .1.3.6.1.2.1.25.1.1

# System Group
sysLocation <REDACTED>
sysContact <REDACTED>
sysServices 72
sysName <REDACTED>

## We do not want annoying "Connection from UDP: " messages in syslog.
dontLogTCPWrappersConnects no

# OTHER CONFIGURATION
rouser <REDACTED> authPriv
extend diskstats /bin/cat /proc/diskstats

The errors in the LM screenshot from last night were from an SNMPD restart, and those from today were from an OS reboot. Only some of the disks seem to have had their index change. If it helps, here are also the relevant lines from /etc/fstab:

LABEL=disk-1 /data/backups/disk-1 xfs defaults,noatime 0 0
LABEL=disk-2 /data/backups/disk-2 xfs defaults,noatime 0 0
LABEL=disk-3 /data/backups/disk-3 xfs defaults,noatime 0 0
LABEL=disk-4 /data/backups/disk-4 xfs defaults,noatime 0 0
LABEL=disk-5 /data/backups/disk-5 xfs defaults,noatime 0 0

Re: Various Linux distros - SNMP disk OID change

11 hours ago, Michael Rodrigues said:
SNMP index OID shuffling is common, and it's why LM uses the WILDALIAS/Instance Name as the unique identifier for an instance, while the WILDVALUE/Instance Value is the index OID. The wildvalue can change without losing an instance and its history. AD needs to run after re-shuffling to make data reporting work correctly, as the wildvalue is used to match up reported data to the instance. Generally, this shouldn't happen often enough for it to throw off more than a poll here and there. How often are you restarting SNMPd? Making AD run more often is one way to mitigate.

Not often, to be honest; we mostly see it on reboots related to patching. Just last night we updated our SNMPD config (we use Puppet) to include the extended disk information and then restarted the SNMPD service. This caused a lot of alarms again across multiple servers.

Re: Various Linux distros - SNMP disk OID change

11 hours ago, Mike Moniz said:
I can find a few references online to people having the problem with Google, but they don't have responses or solutions outside of using the mount name as the index. I'm not that great at Linux, but I would guess it might be related to how the kernel detects drives on boot and their order? Perhaps related to uuid/label stuff? Are these physical or virtual servers? Perhaps you might only be able to replicate it with real reboots and not just restarting snmpd.

All VMs. Sometimes the OID indexes change, sometimes not; it's a bit random.

Re: Various Linux distros - SNMP disk OID change

SNMPD version:

Package: snmpd
Version: 5.8+dfsg-2ubuntu2.3
Priority: optional
Section: net
Source: net-snmp
Origin: Ubuntu

Various Linux distros - SNMP disk OID change

Hi LM Community, I'm having an issue that I've searched these forums and the web for, but I'm unable to find anyone with a solution. We are monitoring Linux servers with SNMP; specifically, the DataSources we are having problems with are described at https://www.logicmonitor.com/support/monitoring/os-virtualization/filesystem-monitoring

SNMP_Filesystems_Usage
SNMP_Filesystems_Status

These two DataSources are the "new ones" for Linux disk status and usage monitoring; we have removed all of the old DataSources described in the article. The issue is that when the SNMP service is restarted, or the Linux machine is rebooted, SNMPD allocates what seem to be random OIDs to the disks each time. I've created a support case with LogicMonitor, but it got shrugged off because they hadn't heard of this issue before; I cannot believe we are the only ones who have seen this problem. An example alarm is:

Host: <REDACTED>
Datasource: Filesystem Capacity-/run/snapd/ns
InstanceGroup: @default
Datapoint: StorageNotAccessible
Level: warn
Start: 2021-12-10 15:50:22 CET
Duration: 0h 12m
Value: 1.0
ClearValue: 0.0
Reason: StorageNotAccessible is not = 1: the current value is 0.0

We have seen this on CentOS and on Ubuntu 16, 18, and 20. Sometimes it hits multiple disks, other times it doesn't happen at all. The fix is to run Active Discovery on the resource again. I think part of the problem is that the wildcard used is the SNMP OID, which changes; if the wildcard were the mount point name, this would not have been an issue. I've partly worked around it by changing the discovery schedule on the DataSources from 1 day to 15 minutes, after which monitoring recovers again. Does anyone have any idea what could be causing this?

Regards.
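PS: for anyone else trying to confirm this on their own machines, snapshotting the hrStorageDescr table around an snmpd restart (or a reboot) shows the shuffle clearly. A rough sketch, again assuming the localhost read-only community:

# Snapshot the index => mount mapping before the restart...
snmpwalk -v2c -c public -On localhost .1.3.6.1.2.1.25.2.3.1.3 > /tmp/hrstorage.before
systemctl restart snmpd
sleep 5  # give snmpd a moment to re-enumerate storage
# ...and again after. Any diff output means the OID indexes moved.
snmpwalk -v2c -c public -On localhost .1.3.6.1.2.1.25.2.3.1.3 > /tmp/hrstorage.after
diff /tmp/hrstorage.before /tmp/hrstorage.after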