Forum Discussion

Stuart_Weenig
6 months ago
Solved

Has anybody noticed the flaw in LogSource logic?

So LogSources have a couple of purposes:

  1. They allow you to filter out certain logs. I’m not sure what the use case is here, since the LogSource is explicitly including logs. Maybe the point is to let you exclude certain logs that contain sensitive data: no masking of the data, just ignore the whole log. It’s not clear whether the ignored logs still end up in LMLogs or get dumped entirely.
  2. They allow you to add tags to logs. This is actually pretty cool. You can parse important info out of the log or add device properties to the logs. That means you can add a device property to each log that can be displayed as a separate column, or even filtered on. Each of our devices has a customer property on it, so I can add that property to each log and search or filter by customer. Device type, ssh.user, SKU, serial number, IP address: the list is literally endless. (A rough sketch of what this could look like is below the list.)
  3. They allow you to customize which device the logs get mapped to. You can specify that the incoming log should be matched to device via IP address, or hostname, or FQDN, or something else. The documentation on this isn’t exactly clear, but that actually doesn’t matter because…
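
To make purpose 2 concrete, here’s a rough sketch of the kind of log-field entry I mean, in the same Method/Key style the resource mapping section uses (the method label and property name are illustrative, not copied from an actual LogSource):

    Log field - Method: LM Property, Key: customer
    Result    - every log the LogSource processes carries a customer=<value> tag that can be shown as a column or filtered on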

The LogSources apply to logs from devices that match the LogSource’s AppliesTo. Which means the logs need to already be mapped to the device. Then the LogSource can map the logs to a certain device. Do you see the flaw? How is a log going to be processed by a LogSource so it can be properly mapped to a device, if the LogSource won’t process the log unless it matches the AppliesTo, which references the device to which the logs haven’t yet been mapped?

LogSources should not apply to devices. They should apply to Log Pipelines.

  • Hi @Stuart Weenig 

    Thank you for taking the time to share your feedback with us. We genuinely appreciate your insights into our system!
     

    In your post, you’ve rightly identified a challenge we face when it comes to mapping logs to log sources. It’s indeed a bit of a chicken-and-egg situation. Allow me to clarify the reasoning behind our approach:
     

    When a log is sent from a device to LogicMonitor and enters our processing pipeline, the question arises: how do we identify the source device? Is it by the message content, the header information, or the log format?
     

    If we were to rely solely on the message content for identification, it would be a time-consuming and cumbersome process. This would mean having to account for log formats from a wide array of vendors, which can be quite complex.
     

    Our solution is to identify logs by the device itself. Typically, devices of the same type tend to generate similar logs. As a result, grouping logs based on the device within the “appliesTo” section simplifies the process.
     

    Trying to use “appliesTo” logic to match specific log formats, on the other hand, would be unwieldy and require complex regular expressions. Therefore, the first step is to properly identify the log’s source device before we can effectively take any actions.
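
     For example (purely illustrative, not a recommended configuration), an appliesTo such as either of the following scopes a LogSource to a class of resources or to a device group:

         isDevice() && hasCategory("Cisco")
         join(system.groups, ",") =~ "Network/Firewalls"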
     

    Our goal is to properly identify a log to avoid things like improperly tagging or filtering. We find that identifying logs by the resource (device) makes the most sense in this context. Attempting to do this by pipeline would place us back in a similar situation, as pipelines are essentially groups of devices.
     

    I hope this explanation clarifies our approach, and if you have any further questions or suggestions, please feel free to share them. Your input is invaluable as we continually strive to improve our system.
     

    Thank you for being a part of our community.

4 Replies


  • Yeah, it clarifies the approach and highlights that LM really messed up here. You say, “identify logs by the device itself”. Great. How? What attribute on the device is compared to what field or attribute of the log? Whatever we put in the device mapping, right? Let’s map the log to the device so we can use the log source to map the log to the device.

    Yeah, regex is complicated, but the logsource seems like two roosters in the henhouse.

    LogSources should be two different things.

    1. One that applies to existing or new, potentially regex-ly complex, pipelines. These modules (drop the “-sources” suffix already) should take care of mapping incoming logs to resources.
    2. Another that enriches the logs by applying to the device and adding attributes from the log message/device properties as log attributes.
  • From what I can gather, it has something to do with which collector the logs come in through. I still firmly believe logsources need a logic overhaul.

    I ended up with two logsources that point to group membership for the appliesto. The group membership happens to line up with collector boundaries.

    One of my customers needs the resource mapping to be Method: IP, Key: system.hostname, while the rest need Method: IP, Key: system.ips. (If I were running collector version 35.100+, I think I’d be able to do this within one LogSource.) So both logsources’ appliesto look at the group membership: one is ~== the group name while the other is not.

    This works, but your guess is as good as mine as to how. My only guess is that the appliesto eventually boils down to devices assigned to collectors, and since that group boundary coincides with collector assignment, it just works.
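
    In case it helps anyone reproduce this, the pattern looks roughly like the sketch below (the group name is made up and the expressions are a best-guess reconstruction, not a copy of the actual config):

        LogSource A - resource mapping Method: IP, Key: system.hostname
                      AppliesTo: join(system.groups, ",") =~ "Customers/CustomerA"
        LogSource B - resource mapping Method: IP, Key: system.ips
                      AppliesTo: !(join(system.groups, ",") =~ "Customers/CustomerA")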

  • You say, “identify logs by the device itself”. Great. How?


    @Cameron Compton I’m also wondering how this happens. LM Support tells me it’s through the resource mapping in the lmlog source… but again, before this lmlog source is even used, LM has to know what system the log came from for the AppliesTo section to work correctly. Are we using the AppliesTo wrong? Should we be applying the lmlog source to the collector the logs are coming in on?

    I have this set up, which is identical to what’s in a working collector configuration:

    lmlogs.syslog.property.name=system.sysname
    lmlogs.syslog.hostname.format=HOSTNAME
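    # For context, my (possibly imperfect) understanding of these two settings:
    # property.name   = the device property the collector compares the syslog source against when mapping a log to a resource
    # hostname.format = which form of the source identifier (e.g. IP, HOSTNAME, FQDN) the collector reads from the syslog packet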
     

    … but the lmlog source will not apply to the device, so I’m at a loss.