Dean_Banks
13 years ago

Support alerts based on data aggregated from multiple hosts

We run a horizontally distributed architecture. As such, we really don't care (too much) if we lose one of N hosts, provided that a minimum number of hosts/processes/etc. are up and healthy. LogicMonitor makes it easy to make a graph of computed datapoints that span hosts, but doesn't let us configure alerts on the same computed data.

Tangible example: One application, when running, publishes capacity data to LM. This capacity data is aggregated and graphed, giving us great insight for planning purposes. However, the only alert configuration that LM supports is per host, so every single host can page us, sometimes causing unnecessary wake-ups in the middle of the night. Operationally, we'd be fine having one host be down, as long as we maintain adequate reserve capacity. System-wide reserve capacity can only be determined by aggregating data across the set of hosts (just like the graphs do).
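
To make the request concrete, here is a rough sketch of the check we have in mind. This is not LogicMonitor's API: `HostSample`, `aggregate_alert`, and both thresholds are made-up names for illustration, and getting the per-host values out of LM is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostSample:
    """One host's latest capacity reading (hypothetical structure)."""
    host: str
    capacity: Optional[float]  # None means the host is down or did not report


def aggregate_alert(samples: List[HostSample],
                    min_reserve: float,
                    min_healthy_hosts: int) -> Optional[str]:
    """Return an alert message if the fleet as a whole is unhealthy, else None."""
    # Only hosts that actually reported count toward the aggregate.
    reporting = [s for s in samples if s.capacity is not None]
    total = sum(s.capacity for s in reporting)

    # Alert on the fleet, not on any individual host.
    if len(reporting) < min_healthy_hosts:
        return (f"only {len(reporting)} of {len(samples)} hosts healthy "
                f"(need {min_healthy_hosts})")
    if total < min_reserve:
        return f"aggregate capacity {total:g} below reserve {min_reserve:g}"
    return None


# One host down, but the remaining capacity still covers the reserve: no alert.
fleet = [
    HostSample("app1", 40.0),
    HostSample("app2", None),
    HostSample("app3", 45.0),
]
print(aggregate_alert(fleet, min_reserve=60.0, min_healthy_hosts=2))
```

With per-host alerting, `app2` being down would page someone. With a check like this, nobody gets woken up unless the summed capacity or the healthy host count drops below what we actually need.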

We've been told to write custom scripts to do the collection and aggregation, and perhaps some rainy day we will. However, it seems like
1) LM does so many of the necessary bits already and
2) this would be a really useful capability for anyone who runs a horizontally distributed architecture.
This isn't a "holy cow, gotta have this now!" type of feature request, but it would certainly be a great value-add.