8 years ago
Anomaly detection
We have a Linux-based HTTP load balancer that has been monitored for a few months now. Yesterday we got a call from a few customers saying that our site was a bit slow. Looking at the LM alerts I sa...
Mark, your idea is also needed, but I do not think it's the same thing. Your approach takes the average of the last 50 or 100 samples and can tell you that the average is high.
My idea takes the average of the last N samples and compares it to the same period one week ago, two weeks ago, ... X weeks ago, looking for an anomaly in the current average. For example: every Wednesday for the past 5 weeks the CPU has been around 10% at night and around 30% during the day, but this Wednesday night the CPU is spiking to 90% and falling back to 50%, so something is different. This is not an alert, because things might be OK: a new customer might have started working at night, or something else might legitimately be keeping the server busy. But at the same time it could be an indicator of a problem or a potential problem.
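
To make the comparison concrete, here is a minimal Python sketch of the idea, not something LM provides. Everything in it is an assumption for illustration: the function names, hourly sampling (168 samples per week), the 3-standard-deviation threshold, and the `min_spread` floor that avoids dividing by zero when past weeks are identical. The result is a "looks unusual" flag, not an alert.

```python
from statistics import mean, stdev


def window_mean(samples, end, n):
    """Mean of the n samples ending at index `end` (inclusive)."""
    return mean(samples[end - n + 1 : end + 1])


def weekly_anomaly(samples, samples_per_week, n, weeks,
                   threshold=3.0, min_spread=1.0):
    """Compare the mean of the last n samples with the same window
    in each of the previous `weeks` weeks.

    threshold and min_spread are illustrative defaults, not tuned values.
    """
    now = len(samples) - 1
    oldest_needed = now - weeks * samples_per_week - n + 1
    if oldest_needed < 0:
        raise ValueError("not enough history for the requested weeks")

    current = window_mean(samples, now, n)
    # Same time-of-week window, 1..weeks weeks back.
    history = [window_mean(samples, now - k * samples_per_week, n)
               for k in range(1, weeks + 1)]

    baseline = mean(history)
    spread = stdev(history) if weeks > 1 else 0.0
    spread = max(spread, min_spread)  # floor so flat history does not divide by zero

    score = (current - baseline) / spread
    return {
        "current": current,
        "baseline": baseline,
        "score": score,
        "unusual": abs(score) > threshold,
    }


# Synthetic hourly CPU data: 10% before 06:00, 30% for the rest of the day.
week = [10 if hour % 24 < 6 else 30 for hour in range(7 * 24)]
samples = week * 6            # five weeks of history plus the current week
samples[-3:] = [90, 90, 90]   # the last three hours spike to 90%

result = weekly_anomaly(samples, samples_per_week=7 * 24, n=3, weeks=5)
print(result)
```

With this synthetic data the last three hours are flagged as unusual, because the same three hours in each of the previous five weeks averaged 30%. Whether the spike is a problem or just a new customer working late is still a human decision.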