Forum Discussion

Eric_Egolf
6 years ago

Monitoring Logoff/Logon Events for Anomalies

Background - We have a fairly large Citrix environment (70 customers, 1200 users). Each customer has one or more XenApp servers, depending on how many users they have. The environment is set up so that the first step in troubleshooting is often having the user log off and log back on, which creates an event ID each time. We would like to plot the number of logons/logoffs (via event IDs) per 10-minute period and look for anomalies: periods of high logon/logoff activity relative to normal, or relative to the number of users in the environment.

The first step for us is simply plotting the data. Any ideas on the best way to approach this? My initial thought is to write a PowerShell script that searches for the event IDs over the last 10 minutes and returns the count, then apply it to each XenApp server in LogicMonitor, but maybe there is a better approach. I also don't know the best way to aggregate by customer or to factor in the number of users; I assume we would need to export to Excel to handle some of that. Ideas welcomed.

  • I'm doing something like this for failed logon attempts with a simple threshold (event ID 4625 in Windows). For that, I'm gathering 5 minutes of the Security log and counting the number of bad-password 4625 events.

    For your purposes, a datasource that runs every 5 minutes could gather 1 hour of log data, count the event you care about (whichever event ID marks a successful logon in your environment) across the full hour, count it again for just the last 5 minutes, and take the ratio of the two. You can then threshold that ratio for alerting. It's basically the share of the past hour's events that happened just now; with evenly spread activity it sits near 1/12, and you could multiply by 100 to make it an actual percentage.

    If you wanted to get really fancy, you could check every 5-minute segment over that hour except the last one, average them, then compare the last segment to that average. Even fancier, that data set is enough to compute a standard deviation, which you can then use as the threshold for the alert.

    I'm using PowerShell for my scripts as I know it better than Groovy, and it has quite a few more Windows-specific commands for gathering data. You can use $events = Get-WinEvent to gather logs from Windows, filter on the event ID and on message content to eliminate even more of the unnecessary events, and then wrap it in ($events).Count.
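
The ratio and standard-deviation ideas above could be sketched roughly as follows. This is only an illustration under assumptions: the function names Get-RecentEventRatio and Test-BucketAnomaly, the 3-sigma default, and the use of a sample standard deviation are all hypothetical choices, not something taken from the thread. The functions take plain arrays so the arithmetic can be tried without touching an event log.

```powershell
# Hypothetical helpers; names and defaults are assumptions for illustration.

function Get-RecentEventRatio {
    param(
        [datetime[]] $Timestamps,            # event times gathered for the past hour
        [datetime]   $Now = (Get-Date),
        [int]        $RecentMinutes = 5
    )
    # No events at all: report 0 instead of dividing by zero.
    if ($null -eq $Timestamps -or $Timestamps.Count -eq 0) { return 0.0 }

    $cutoff = $Now.AddMinutes(-$RecentMinutes)
    $recent = @($Timestamps | Where-Object { $_ -gt $cutoff }).Count
    return [double]$recent / $Timestamps.Count
}

function Test-BucketAnomaly {
    param(
        [double[]] $BucketCounts,            # oldest first; last entry is the current bucket
        [double]   $Sigmas = 3               # assumed threshold width
    )
    if ($null -eq $BucketCounts -or $BucketCounts.Count -lt 3) {
        throw "Need at least two baseline buckets plus the current one."
    }

    $current  = $BucketCounts[-1]
    $baseline = $BucketCounts[0..($BucketCounts.Count - 2)]
    $mean     = ($baseline | Measure-Object -Average).Average

    # Sample standard deviation of the baseline buckets.
    $sumSq = 0.0
    foreach ($c in $baseline) { $sumSq += [math]::Pow($c - $mean, 2) }
    $stdDev = [math]::Sqrt($sumSq / ($baseline.Count - 1))

    $threshold = $mean + $Sigmas * $stdDev
    [pscustomobject]@{
        Current   = $current
        Mean      = $mean
        StdDev    = $stdDev
        Threshold = $threshold
        IsAnomaly = ($current -gt $threshold)
    }
}
```

In a datasource, the script would print the ratio (or the current count and the threshold) as datapoints, and the alert threshold would then be set on those values.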

  • Thanks Cole, great approach. Any chance you can share your PowerShell code or datasource?

  • Sure... I'm using this as a datasource targeting isWindows() called "Active Directory Failed Login Count"

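    try {
        # @(...) guarantees an array, so .Count is valid for zero or one match.
        # ##system.sysname## is a LogicMonitor token replaced before the script runs.
        $events = @(Get-WinEvent                   `
            -ComputerName    "##system.sysname##"  `
            -ErrorAction     Stop                  `
            -FilterHashtable @{
                LogName   = "Security"
                Id        = 4625                        # failed logon
                StartTime = (Get-Date).AddMinutes(-5)
            }                                      `
            | Where-Object { $_.Message -match "0xC000006D" })   # logon failure status
    } catch {
        # "No events found" and connection errors both land here and report 0.
        $events = @()
    }

    "$($events.Count)"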

    No warranty for the code, use at your own risk. Please note the use of backtick line continuation for readability.
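
For the original question of plotting logons/logoffs per 10-minute period, the same pattern could be adapted along these lines. This is a rough sketch, not a tested datasource: the function name Get-EventCountByWindow is hypothetical, it assumes PowerShell 5 or later for [datetime]::new, and the event IDs in the commented usage line (4624 for logon, 4634 for logoff in the Security log) are the standard Windows ones, which may not be the right source in a given Citrix environment.

```powershell
# Hypothetical helper: counts event timestamps per fixed-size time window.
function Get-EventCountByWindow {
    param(
        [datetime[]] $Timestamps,
        [int]        $WindowMinutes = 10
    )
    $windowTicks = [timespan]::FromMinutes($WindowMinutes).Ticks
    $counts = @{}

    foreach ($t in $Timestamps) {
        # Floor the timestamp to the start of its window.
        $start = [datetime]::new($t.Ticks - ($t.Ticks % $windowTicks), $t.Kind)
        $counts[$start] = 1 + [int]($counts[$start])
    }

    # Emit one object per window that had events, oldest first.
    $counts.GetEnumerator() | Sort-Object Key | ForEach-Object {
        [pscustomobject]@{ WindowStart = $_.Key; Events = $_.Value }
    }
}

# Possible usage against a live log (assumed event IDs, not run here):
# $times = Get-WinEvent -FilterHashtable @{
#     LogName = "Security"; Id = 4624, 4634; StartTime = (Get-Date).AddHours(-1)
# } | ForEach-Object { $_.TimeCreated }
# Get-EventCountByWindow -Timestamps $times
```

Windows with no events are simply absent from the output, so a plot built from it would need those gaps treated as zero.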

  • Perfect, thanks Cole. This worked very well for me. My only comment is that I had to find the location of the Applications and Services Logs. I found this article that helped.

  • Happy to help. Can I ask specifically what you're using for your criteria? If it's useful, I'd love to see how you end up building your metrics. I have other thoughts as well: showing average session duration, average number of processes started during a given session, and normal logon/logoff times per username/GUID. These can be instances linked to the servers that the user has access to. If you have server-specific access defined in AD, you can even use Get-ADComputer and Get-ADUser to show whether anyone trying to log in is part of the AD but doesn't have access to that particular server. Basically, look for anything anomalous that could be used to alert on possible security issues. Stay paranoid, stay safe :)