Forum Discussion
I have brought this up before and was shot down with "works as designed."
We 100% agree with this statement:
"Second, when an alert crosses a threshold a second time, a week after the original acknowledgement (as we saw in my first post), I think it is safe to assume that it should be considered a new 'alert session.'"
We have cases with the following conditions:
1. alert triggers on warning threshold
2. NOC acks with "monitoring"
3. alert crosses error threshold
4. NOC escalates to SME
5. NOC acks with "escalating to SME"
6. alert crosses critical threshold
7. NOC acks with "incident created. Management informed"
8. SME remediates just enough to move the alert down to warning
9. SME informs NOC the issue is fixed
10. NOC closes incident and resumes watching the alert page
11. alert crosses error threshold
12. No notification
13. alert crosses critical threshold
14. No notification
15. server crashes
16. People ask why there was no alert...
For a monitoring service, overcommunication is 100x more acceptable than a server crashing.
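
To make the request concrete, here is a rough Python sketch of the sequence above. It is only a toy model, not how any actual monitoring product is implemented. The names `AlertSession`, `reset_ack_on_drop`, `STEPS`, and `replay` are all made up for illustration, and the model assumes that an acknowledgement suppresses notifications up to the severity that was acked.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical severity ordering, for illustration only.
SEVERITY = {"ok": 0, "warning": 1, "error": 2, "critical": 3}


@dataclass
class AlertSession:
    # If True, a drop in severity lowers the acked level, so a later
    # re-escalation is treated like a new alert session.
    reset_ack_on_drop: bool = False
    current_level: int = 0
    acked_level: int = 0
    sent: List[str] = field(default_factory=list)

    def observe(self, severity: str) -> None:
        level = SEVERITY[severity]
        if self.reset_ack_on_drop and level < self.current_level:
            # The old ack no longer covers anything above the new level.
            self.acked_level = min(self.acked_level, level)
        # Notify only on an escalation that is not covered by an ack.
        if level > self.current_level and level > self.acked_level:
            self.sent.append(severity)
        self.current_level = level

    def ack(self) -> None:
        # Acknowledge everything up to the current severity.
        self.acked_level = max(self.acked_level, self.current_level)


# The scenario from the list above, reduced to threshold crossings and acks.
STEPS = [
    "warning", "ack",    # alert triggers, NOC acks with "monitoring"
    "error", "ack",      # NOC escalates to SME and acks
    "critical", "ack",   # incident created
    "warning",           # SME remediates just enough
    "error",             # re-escalation
    "critical",          # re-escalation, then the server crashes
]


def replay(reset_ack_on_drop: bool) -> List[str]:
    session = AlertSession(reset_ack_on_drop=reset_ack_on_drop)
    for step in STEPS:
        if step == "ack":
            session.ack()
        else:
            session.observe(step)
    return session.sent
```

In this model, `replay(False)` sends only the first three notifications (warning, error, critical) and stays silent on the second climb, which matches what we see. `replay(True)` sends two more (error, critical) when the alert climbs again after dropping to warning, which is the behavior we are asking for.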