Incident Lifecycle Overview#
An incident is an unplanned disruption to a service requiring coordinated response. The lifecycle has six phases: detection, triage, communication, mitigation, resolution, and review. Each has defined actions, owners, and exit criteria.
Phase 1: Detection#
Incidents are detected through three channels. Automated monitoring is best – alerts fire on SLO violations or error thresholds before users notice. Internal reports come from other teams noticing issues with dependencies. Customer reports are worst case – if users detect your incidents first, your observability has gaps.