The goal: can we show only active instances to stakeholders (LM users) who have access to LM resources? User/role permissions don't currently offer a setting that turns off the display of inactive instances. The alternative would be to leverage the existing framework, which does allow instances to be deleted but by default rediscovers them within 24 hours at most. We already need to disable discovered instances, which means we have already customized select datasources. The challenge is to identify the impact of taking the extra step of disabling active discovery as well and then deleting the disabled instances. This would apply only to select datasources, but that could still affect thousands of instances. I'm trying to keep an open mind about this so that I can provide all relevant information on what this will take or why it will not work. With that in mind, this is what I've considered to see if there is a path forward (just theory):
Cleanup
1. Disable active discovery on the target datasource(s)
2. Use the API or an LM report to export device info and the disabled instances from the target datasource(s) that will be deleted (this will be used later for cross-reference)
3. Delete all disabled instances from the target datasource(s) (a heavy maintenance burden if done manually; the API would be preferred if it can work from the data exported in step 2)
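A rough sketch of how Cleanup steps 2 and 3 might fit together once the instance data has been pulled from the API or a report. Everything here is an assumption for illustration: the `InstanceRecord` fields, the helper names, and especially the `delete_instance` callable, which stands in for whatever API call turns out to be available. It defaults to a dry run so that nothing is deleted until the snapshot has been reviewed.

```python
import csv
import io
from typing import Callable, Iterable, List, NamedTuple, Set


class InstanceRecord(NamedTuple):
    # Hypothetical shape of one exported row; the real export may differ.
    device_id: int
    device_name: str
    datasource: str
    instance_id: int
    instance_name: str
    disabled: bool


def select_disabled(records: Iterable[InstanceRecord],
                    target_datasources: Set[str]) -> List[InstanceRecord]:
    """Keep only disabled instances that belong to the target datasource(s)."""
    return [r for r in records
            if r.disabled and r.datasource in target_datasources]


def snapshot_to_text(records: Iterable[InstanceRecord]) -> str:
    """Serialize the records as CSV so the list can be saved for cross-reference."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(InstanceRecord._fields)  # header row
    writer.writerows(records)
    return buffer.getvalue()


def delete_from_snapshot(records: Iterable[InstanceRecord],
                         delete_instance: Callable[[int, int], None],
                         dry_run: bool = True) -> List[InstanceRecord]:
    """Call delete_instance(device_id, instance_id) for each record.

    delete_instance is a placeholder for the actual API call (an assumption).
    With dry_run=True nothing is deleted; the records that would have been
    processed are returned either way.
    """
    processed = []
    for record in records:
        if not dry_run:
            delete_instance(record.device_id, record.instance_id)
        processed.append(record)
    return processed
```

The snapshot text would be saved before any delete runs, since it becomes the "previous" list for the later diff check.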
Ideally, at this point we have the desired result: only actionable instances remain, and they stay as they are until active discovery is run again. For that to hold, active discovery must not be triggered by any of the other potential starter conditions outside the datasource; it should run only when triggered manually. I have a case open with LM support to validate the following and am also testing in a sandbox. I believe that once a datasource has active discovery disabled, it will not run automatically for new devices, and it will not run automatically when datasource-based changes are made (e.g., adding a new custom datapoint that would require polling). I know this contradicts the Active Discovery Execution section of the documentation here https://www.logicmonitor.com/support/logicmodules/datasources/active-discovery/what-is-active-discovery but this appears to be the exception, and again, I've asked for this to be vetted.
Assuming we are still in a good position, we will come to a point where active discovery needs to be executed as a health check and to pick up new instances that may need to be monitored:
Manual discovery
1. Run active discovery on individual devices (manually, or ideally through the API if possible)
2. Use the API or an LM report to export device information and disabled instances from the target datasource (new list)
3. Run a diff check against the previous list to identify and mark new instances (use the data from Cleanup step 2 as the previous list)
4. Identify whether any instances should be monitored going forward and enable those
5. Update the disabled-instance data from step 2 to reflect those changes, so it can be used in the next step
6. Delete all disabled instances (a heavy maintenance burden if done manually; the API would be preferred if it can work from the data in step 5)
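The diff check in step 3 is the part that can be sketched without knowing anything about the API, so here is a minimal version. The `InstanceKey` fields and the function name are my own placeholders, and comparing by device, datasource, and instance name (rather than by instance ID) is an assumption, on the theory that IDs may not survive a delete and rediscovery.

```python
from typing import Iterable, List, NamedTuple, Tuple


class InstanceKey(NamedTuple):
    # Hypothetical identity for an instance across snapshots. Names are used
    # instead of IDs on the assumption that IDs may change after rediscovery.
    device_name: str
    datasource: str
    instance_name: str


def diff_snapshots(previous: Iterable[InstanceKey],
                   current: Iterable[InstanceKey]
                   ) -> Tuple[List[InstanceKey], List[InstanceKey]]:
    """Compare two snapshots of instances.

    Returns (new, known):
      new   -> in the current snapshot but not the previous one; these are
               the instances to review for possible monitoring.
      known -> present in both snapshots; these were already reviewed in the
               earlier pass.
    """
    previous_set = set(previous)
    current_set = set(current)
    new = sorted(current_set - previous_set)
    known = sorted(current_set & previous_set)
    return new, known
```

With a split like this, only the `new` list would need a human decision in step 4; the `known` list could go straight back into the delete pass, provided the earlier review still holds.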
This may not be the right approach. I am still catching up on API functionality, which may present gaps, and ultimately, if we can't stop active discovery from executing independently, there is no value in pursuing this further. I'm exploring the possibilities and pushing boundaries so I can report what can and can't be done. I appreciate the discussion and welcome any new ideas, options, and feedback!