Forum Discussion

mnagel
11 months ago

heads up - property corruption due to unknown results

I wanted to share this with everyone since it bit me recently. If something disrupts SNMP access to a resource, even briefly, LM will sometimes reset properties like system.ips to just the single IP associated with the resource. The problem is that this can break other features that depend on those properties, such as Netflow binding.

I am still battling this out with support. In the meantime I wrote a script that triggers AD for specified devices (I had to use an undocumented endpoint), and I schedule it hourly to limit the damage when this happens; normally AD is triggered only once per day unless specific changes occur. My change logs show system.ips resetting fairly often, so the script is definitely helping.

I explained to support that clobbering data because of an unknown result is a bug, and that no property should be changed in that situation, but I imagine this bug will be hard to unwind. I also recommended a partial fix in which a “host up” event triggers AD to limit the damage, but that is a hack. Avoiding data corruption in the first place is the right fix.
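For anyone who wants to check their own change logs for this, here is a minimal Python sketch of what a reset looks like when comparing two snapshots of the property. It assumes system.ips is stored as a comma-separated string. The function names are made up for illustration and are not part of any LM API.

```python
def parse_ips(value):
    """Split a comma-separated property value into a set of addresses."""
    return {ip.strip() for ip in value.split(",") if ip.strip()}


def looks_reset(previous, current):
    """Return True when a multi-address value has shrunk to a single address.

    Assumption: `previous` and `current` are two snapshots of system.ips,
    each a comma-separated string.
    """
    before = parse_ips(previous)
    after = parse_ips(current)
    return len(before) > 1 and len(after) == 1
```

A check like this only flags the symptom after the fact. It does not tell you whether an SNMP interruption was the cause.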

FWIW, the main loop of my script is below (it is in an ancient language, but the logic should be clear :)).

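# Run the requested action for every device listed in %DEVICES
# (keys are lowercased display names).
DEVICE:
for my $d (@{$devices}) {
    # Skip devices that were not requested.
    next DEVICE unless $DEVICES{lc $d->{displayName}};

    if ($ACTION eq "scheduleAutoDiscovery") {
        verbose 1, "scheduling AD for $d->{displayName}\n";
        # POST to the undocumented endpoint; warn on failure and keep going.
        if (not $lmapi->post(path => "/device/devices/$d->{id}/scheduleAutoDiscovery")) {
            warn "ERROR: $ACTION failed\n";
        }
    }
    else {
        die "ERROR: unsupported action: $ACTION\n";
    }
}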
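
For those who do not read Perl, roughly the same logic can be sketched in Python. This is not the script I run, only an illustration: `post` is a hypothetical callable standing in for an authenticated API client that takes a resource path and returns a truthy value on success. Unlike the Perl loop, the sketch checks the action once up front and returns the list of failures instead of printing warnings.

```python
def schedule_auto_discovery(devices, wanted_names, post,
                            action="scheduleAutoDiscovery"):
    """Trigger AD for every device whose displayName is in wanted_names.

    devices      -- list of dicts with "id" and "displayName" keys
    wanted_names -- display names to act on (matched case-insensitively)
    post         -- hypothetical callable: post(path) -> truthy on success
    Returns the display names for which the request failed.
    """
    if action != "scheduleAutoDiscovery":
        raise ValueError(f"unsupported action: {action}")

    wanted = {name.lower() for name in wanted_names}
    failed = []
    for device in devices:
        # Skip devices that were not requested.
        if device["displayName"].lower() not in wanted:
            continue
        path = f"/device/devices/{device['id']}/scheduleAutoDiscovery"
        if not post(path):
            failed.append(device["displayName"])
    return failed
```

Passing the client in as a plain callable keeps the sketch free of any real authentication code, and it also makes the loop easy to dry-run with a stub before pointing it at a live portal.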