Routing Tree
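Alertmanager receives alerts from Prometheus and decides where to send them based on a routing tree. Every alert enters at the root route, which matches everything, and is then tested against the child routes in the order they are listed. The first child that matches takes the alert (unless that route sets continue: true, in which case later siblings are tested as well), and the search repeats inside that child's own routes, so the deepest matching route decides the receiver. If no child matches, the root route's receiver handles it.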
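    # alertmanager.yml
    global:
      resolve_timeout: 5m
      slack_api_url: "https://hooks.slack.com/services/T00/B00/xxx"
      pagerduty_url: "https://events.pagerduty.com/v2/enqueue"

    route:
      # Root route: matches every alert and acts as the fallback receiver.
      receiver: "default-slack"
      group_by: ["alertname", "namespace"]
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 4h
      routes:
        # Critical alerts page the on-call engineer, faster and more often.
        - match:
            severity: critical
          receiver: "pagerduty-oncall"
          group_wait: 10s
          repeat_interval: 1h
          routes:
            # Critical database alerts go to the DBA rotation instead;
            # timing values are inherited from the parent route.
            - match:
                team: database
              receiver: "pagerduty-dba"
        - match:
            severity: warning
          receiver: "team-slack"
          repeat_interval: 12h
        # Reached only by alerts that are neither critical nor warning.
        - match_re:
            namespace: "staging|dev"
          receiver: "dev-slack"
          repeat_interval: 24h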
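Since Alertmanager 0.22, match and match_re are deprecated in favor of a single matchers list. As a rough sketch, assuming Alertmanager 0.22 or newer, the child routes above could be written as follows; the receivers and timing values are copied from the config, and only the matching syntax changes.

```yaml
# Sketch of the same routes using the newer `matchers` syntax.
# Assumes Alertmanager 0.22+; receivers must still be defined under `receivers:`.
route:
  receiver: "default-slack"
  group_by: ["alertname", "namespace"]
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  routes:
    - matchers:
        - 'severity = "critical"'
      receiver: "pagerduty-oncall"
      group_wait: 10s
      repeat_interval: 1h
      routes:
        - matchers:
            - 'team = "database"'
          receiver: "pagerduty-dba"
    - matchers:
        - 'severity = "warning"'
      receiver: "team-slack"
      repeat_interval: 12h
    - matchers:
        # =~ is a regex match, the equivalent of match_re
        - 'namespace =~ "staging|dev"'
      receiver: "dev-slack"
      repeat_interval: 24h
```

Each entry in matchers uses the same operators as PromQL label matchers (=, !=, =~, !~), and all entries in one list must match for the route to apply.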
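Timing parameters matter. group_wait is how long Alertmanager waits after receiving the first alert in a new group before sending the initial notification; this lets it batch related alerts that fire at almost the same time. group_interval is how long it waits before sending a further notification when alerts are added to, or resolved in, a group that has already been notified. repeat_interval is how long it waits before re-sending a notification for a group that is still firing and has not changed. Child routes inherit all three values from their parent unless they override them, so in the config above pagerduty-dba uses the 10s group_wait and 1h repeat_interval of pagerduty-oncall.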
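To make the interaction concrete, here is an approximate timeline for one alert group under the root route's settings. The alert arrival times are hypothetical and chosen only for illustration; the actual send times can shift slightly depending on when Alertmanager evaluates the group.

```yaml
# Illustrative timeline, assuming the root route's timing values.
route:
  receiver: "default-slack"
  group_by: ["alertname", "namespace"]
  group_wait: 30s       # t=0      first alert of a new group arrives
                        # t=30s    first notification, with every alert
                        #          that joined the group in the meantime
  group_interval: 5m    # t=2m     another alert joins the same group
                        # t=5m30s  updated notification, one group_interval
                        #          after the previous one
  repeat_interval: 4h   # t≈4h5m   nothing has changed since t=5m30s, so the
                        #          notification is repeated about 4h later
```

In practice repeats are only evaluated when the group is flushed, which happens every group_interval, so it is common to pick a repeat_interval that is a multiple of group_interval.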