Stuart_Weenig
9 months ago

Our journey enabling Thycotic as our secret keeper

I thought I’d share the details of our journey, which is not yet over, of enabling the Thycotic (Delinea Secret Server) integration as our secret keeper. I’ll try to keep this updated as we go along.

Over a year ago, we invested in Thycotic (herein referred to as SS) as the secret keeper. It allows us to store the usernames, passwords, and keys we need in order to monitor and remotely access our customers’ infrastructure. Our NOC remotes in through LM and looks up the credentials in SS. It works really well.

Part of the reason we chose Delinea was because LM has an integration that allows passwords to be looked up at runtime from SS by the collector. There are a couple reasons why this is preferable: 1) the passwords are never stored in the LogicMonitor platform, keeping them more secure, and 2) when a password changes, our NOC can update the password in SS and we don’t have to also update LM.

I’ll illustrate how this works with the SSH credentials of a device. Normally, you’d set ssh.user and ssh.pass as properties on the device in the LM platform. Once the steps to configure the SS integration are completed, instead of ssh.pass, you set ssh.pass.lmvault as a property. The value of the property is set to the ID of the secret in SS. When the collector task runs, the collector sees ssh.pass.lmvault, looks up the corresponding secret by ID, retrieves one of the fields (hopefully the password, more on this in a sec.), and inserts ssh.pass with its corresponding value into the task properties. This makes it so that any script looking for ssh.pass gets the password from secret server. It’s great and when LM launched it, there was much high-fiving.
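
To make that flow concrete, here is a rough Python sketch of the substitution described above. It is not LM's collector code: `resolve_vault_properties`, `fetch_secret`, and the in-memory `FAKE_VAULT` are made-up names standing in for the collector logic and the SS lookup, and letting the vault value win when both properties exist is my assumption.

```python
from typing import Callable, Dict

LMVAULT_SUFFIX = ".lmvault"


def resolve_vault_properties(
    props: Dict[str, str],
    fetch_secret: Callable[[str], str],
) -> Dict[str, str]:
    """Return task properties with every X.lmvault replaced by X.

    The value of X.lmvault is treated as a secret ID and passed to
    fetch_secret (a hypothetical stand-in for the SS lookup).
    """
    # First copy the ordinary properties unchanged.
    resolved = {
        name: value
        for name, value in props.items()
        if not name.endswith(LMVAULT_SUFFIX)
    }
    # Then resolve the vault-backed ones; assumed to override a plain value.
    for name, secret_id in props.items():
        if name.endswith(LMVAULT_SUFFIX):
            base_name = name[: -len(LMVAULT_SUFFIX)]
            resolved[base_name] = fetch_secret(secret_id)
    return resolved


# In-memory stand-in for Secret Server: secret ID -> password.
FAKE_VAULT = {"1234": "s3cr3t"}

device_props = {"ssh.user": "monitor", "ssh.pass.lmvault": "1234"}
task_props = resolve_vault_properties(device_props, FAKE_VAULT.__getitem__)
print(task_props)  # {'ssh.user': 'monitor', 'ssh.pass': 's3cr3t'}
```

The point is that a script asking for ssh.pass never has to know the vault exists; the password only shows up in the task properties at runtime.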

However, they had missed a critical piece, which has now been fixed. Once ssh.pass.lmvault was set on the device and ssh.pass was removed from the device, the AppliesTo, which looks for ssh.pass, no longer applied to the device. The entire module unapplied and all historical data was lost. This was a simple oversight that was not uncommon while the new dev team at LM was being onboarded. (If you didn’t know, LM dumped the entire dev team in China and replaced it with a much larger dev team from India; it was a rocky time for development and part of the reason UIv4 has taken so long to make progress.)

Anyway, as mentioned, this has now been fixed with some back end logic that allows an AppliesTo expression to match a device that has the .lmvault version of a property (for example, ssh.pass.lmvault) even if the AppliesTo only looks for the plain property (ssh.pass).
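
As a rough picture of the difference, using the ssh.pass example from earlier: this is only a sketch of the matching idea, not LM's AppliesTo engine, and `has_property` is a made-up helper.

```python
from typing import Dict


def has_property(device_props: Dict[str, str], name: str) -> bool:
    """True if the device has the property itself or its .lmvault variant."""
    return name in device_props or f"{name}.lmvault" in device_props


device = {"ssh.user": "monitor", "ssh.pass.lmvault": "1234"}

# Old behavior: only the literal property name counts, so the module unapplies.
old_match = "ssh.pass" in device

# Fixed behavior (as I understand it): the .lmvault variant also satisfies the check.
new_match = has_property(device, "ssh.pass")

print(old_match, new_match)  # False True
```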

With this fixed, there was only one lingering issue: remember how I said, "hopefully the password"? Let me give you some background:
In SS, a secret is not just the password. A secret can contain any number of items, including the password. For example, it may contain the username, the password, the hostname, a customer name, a billing code, etc. It just depends on the template you're using. A great real world example is SNMPv3. In SS, when you want to store a new set of SNMPv3 credentials, you simply choose the SNMPv3 template and it provides you with the ability to store all the SNMPv3 bits in a single secret. That means you can have one record containing the auth key, auth protocol, priv key, priv protocol, and user, and other non-essential stuff if you need it. You have one secret to go to and you can get all the data you need to make the SNMPv3 query.

So, back to the LM integration:
There was no clear logic as to which field within a secret would actually be fetched. When we first attempted using it over a year ago, we were testing with SSH credentials. We had stored the user and password in a single secret, since that makes the most sense for us and how we want to use it. Since LM only gives us the option to specify which secret to pull from, not which field in the secret to pull, there must be some logic in the backend that determines which field to pull. That logic was/is flawed because authentication was failing. When we dug into it, we found out that ssh.pass contained the username instead of the password. We're not sure why this was happening; it may have been looking for the first field, or the first field that had a value. Either way, I've heard rumors that this should be fixed in the upcoming EA release of the collector (woohoo, we get to go through another round of collector update scheduling with every one of our customers!).
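
Here is a small Python sketch of why a multi-field secret trips up a "grab the first value" approach. To be clear, I don't know what LM's backend actually does; `first_non_empty_field` just models one of the guesses above, and the secret layout and field names are invented for illustration.

```python
from typing import Dict, Optional

# A made-up secret with several fields, in the order a template might list them.
secret: Dict[str, str] = {
    "username": "monitor",
    "password": "s3cr3t",
    "notes": "",
}


def first_non_empty_field(fields: Dict[str, str]) -> Optional[str]:
    """Guess at the flawed behavior: return the first field that has a value."""
    for value in fields.values():
        if value:
            return value
    return None


def field_by_name(fields: Dict[str, str], field_name: str) -> str:
    """What we actually want: the caller says which field to pull."""
    return fields[field_name]


print(first_non_empty_field(secret))      # monitor  (the username, not the password)
print(field_by_name(secret, "password"))  # s3cr3t
```

With the user and password stored in one secret, the first approach hands back the username, which matches the authentication failures we saw.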

So now the AppliesTo isn't invalidated when you switch out a property for the .lmvault property. And soon we'll be able to specify which field in the secret contains the value we want to pull out. The next issue is actually enabling the vault integration. FYI, here is the documentation on doing that. If you've stuck with me so far, you might want to browse through it to get an understanding of how LM expects this to be set up (tl;dr there's no UI, it's all properties and lines in the collector config):

The current option of modifying each collector's agent.conf isn't tenable. We need better options for this solution to scale. I suggested the following, worst option first, best option last:
1. Make the three (wait, four. Didn't you catch that in the docs?) lines part of the default OOTB config.
    This saves anyone the trouble of having to add them to the config and makes it safer to perform upgrades without losing this customization. This should be done no matter what. However, this is a temporary fix because someone will eventually need to customize these values for different instances of the integration. At least one of the following options will eventually need to be implemented.
    Also, this fix requires an upgrade to the collector, which might not be scheduled for some time. We just went through our yearly round of upgrading collectors and we're not anxious to repeat that. Having to schedule possible downtime with each customer, alerting the NOC, preparing for alternate remote access options; it's all difficult to do.
2. Make the three (four) lines of the config settable by setting properties on the collector.
3. Make the three (four) lines of the config settable by setting properties on the collector group. 
4. Make the three (four) lines of the config settable by setting properties at the account level. Even though we are an MSP, our SS is used across all customers since it only contains those secrets we need for monitoring and/or remote access.
5. Give us a UI where we can input all this information with test buttons to verify things are working, similar to how cloud integrations work.

Even if #1 is implemented, #2, #3, #4, and #5 will allow users to override the default values for certain collectors/collector groups (some collectors may be on the same continent as SS, some may not, so longer timeouts may be needed).
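
The override order I have in mind looks something like this Python sketch. The setting name `example.vault.timeout.seconds` and the values are invented purely for illustration (they are not real agent.conf keys), and `effective_setting` is a hypothetical helper, not anything LM provides.

```python
from collections import ChainMap
from typing import Any, Dict

Settings = Dict[str, Any]


def effective_setting(
    name: str,
    collector: Settings,
    collector_group: Settings,
    account: Settings,
    defaults: Settings,
) -> Any:
    """Most specific level wins: collector, then group, then account, then OOTB default."""
    layered = ChainMap(collector, collector_group, account, defaults)
    return layered[name]


# Hypothetical values: the OOTB default is fine for most collectors...
defaults = {"example.vault.timeout.seconds": 5}
account: Settings = {}
collector_group: Settings = {}
# ...but a collector on another continent from SS gets a longer timeout.
far_collector = {"example.vault.timeout.seconds": 30}
near_collector: Settings = {}

print(effective_setting("example.vault.timeout.seconds",
                        far_collector, collector_group, account, defaults))   # 30
print(effective_setting("example.vault.timeout.seconds",
                        near_collector, collector_group, account, defaults))  # 5
```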

Also, it would make sense that the vault.meta.url, vault.meta.type, and vault.meta.header properties be configured in the same place as the vault metadata properties. I get why these were created as properties in the resource tree: they need to be available to the collector task at runtime. I'm not sure if the best place for that is in the resource tree with the *.lmvault properties or if they should be with the agent properties at the account level/collector group level/collector level.

This is great! Fantastic! Now we can almost seriously think about using this thing. These are the current difficulties I foresee we will face as we configure it over the next few weeks/months:
1. We will have to modify the config of every collector. That involves a restart and also could pose problems during later upgrades where those lines get overwritten by an updated config. I'm not excited about the prospect.
2. We will have to store the usernames in LM and not use the vault integration to fetch the username, only the password. We'll be testing to see if the logic to find the password works any better than it did a year ago. So far it's working on one device with one secret in our sandbox. I'm not sure SNMPv3 will even work since we have all the v3 keys for one set in one secret and we don't want to split it out.