---
title: "Post-Mortem Action Item Tracking"
description: "Tracking and completing post-mortem action items through categorization, prioritization, ownership assignment, follow-up cadence, and preventing action item decay."
url: https://agent-zone.ai/knowledge/sre/post-mortem-action-tracking/
section: knowledge
date: 2026-02-22
categories: ["sre"]
tags: ["post-mortem","action-items","incident-follow-up","reliability-improvement","action-tracking","completion-rates","accountability"]
skills: ["action-item-categorization","prioritization-framework","ownership-assignment","follow-up-cadence-design","completion-rate-measurement"]
tools: ["jira","linear","asana","github-issues","slack","prometheus"]
levels: ["intermediate","advanced"]
word_count: 1209
formats:
  json: https://agent-zone.ai/knowledge/sre/post-mortem-action-tracking/index.json
  html: https://agent-zone.ai/knowledge/sre/post-mortem-action-tracking/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Post-Mortem+Action+Item+Tracking
---


## The Action Item Problem

Post-mortem reviews produce action items. Teams agree on what needs to change. Then weeks pass, priorities shift, and items quietly decay into a backlog nobody checks. The next incident hits the same root cause, and the post-mortem produces the same action items again.

A common pattern in recurring incidents is that the root cause was already identified in a previous post-mortem, and the corresponding action item was never completed. Action item tracking is the mechanism by which incidents make systems more reliable instead of just more documented.

## Categorization: Prevent, Detect, Mitigate

Every action item falls into one of three categories based on where in the incident lifecycle it operates.

### Prevent

Actions that stop this class of incident from happening. These address root causes: add input validation, implement circuit breakers, fix race conditions, add connection pool limits. Prevention actions are highest effort but highest value -- they eliminate entire categories of incidents.

### Detect

Actions that improve detection speed and accuracy. Examples: add an alert for connection pool utilization above 80%, create a synthetic check for the payment flow, add structured logging for auth failures. Detection actions have moderate effort and high value -- faster detection directly reduces incident duration.

### Mitigate

Actions that reduce impact or speed recovery. Examples: write a failover runbook, implement auto-scaling triggers, pre-stage rollback procedures, add feature flags. Mitigation actions are usually lowest effort and provide immediate value for the next occurrence.

**Every significant post-mortem should produce at least one action in each category.** If the review only generated prevention items, ask: "If this happens again before the fix is deployed, how do we detect it faster? How do we mitigate it faster?"
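
The coverage check can be automated. The following is a minimal sketch, assuming action items are plain dicts with a `category` key; the function name `missing_categories` and the item shape are illustrative, not part of any tracker's API.

```python
from collections import Counter

CATEGORIES = ("prevent", "detect", "mitigate")

def missing_categories(action_items):
    """Return the categories that have no action item yet.

    `action_items` is a list of dicts with a "category" key (assumed shape).
    """
    counts = Counter(item["category"] for item in action_items)
    return [cat for cat in CATEGORIES if counts[cat] == 0]

items = [
    {"title": "Add circuit breaker to payment service", "category": "prevent"},
    {"title": "Fix race condition in retry path", "category": "prevent"},
]
print(missing_categories(items))  # ['detect', 'mitigate']
```

A non-empty result is a prompt for the facilitator to ask the detection and mitigation questions before the review ends.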

## Prioritization

Score each action item on two dimensions:

**Impact** (how much risk reduction):

| Score | Meaning |
|---|---|
| 4 - Critical | Eliminates root cause or prevents SEV-1 recurrence |
| 3 - High | Significantly reduces likelihood or severity |
| 2 - Medium | Meaningful but incremental improvement |
| 1 - Low | Marginal risk reduction |

**Effort** (how much work):

| Score | Meaning |
|---|---|
| 1 - Low | Less than 1 day, single team |
| 2 - Medium | 1-5 days, single team |
| 3 - High | 1-3 weeks, cross-team coordination |
| 4 - Very High | Multiple weeks, significant engineering effort |

**Priority = Impact / Effort:**
- **P0 (Do Now):** ratio >= 3.0, or any Impact=4 item
- **P1 (This Week):** ratio >= 1.5
- **P2 (This Month):** ratio >= 0.75
- **P3 (Backlog):** ratio < 0.75

**Overrides:** Any SEV-1 action item with Impact >= 3 is automatically P0. Any detection improvement that would have halved incident duration is at least P1. If the same item appears in two post-mortems, raise it one priority level.
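
The scoring rules can be expressed in a few lines. This is a sketch under stated assumptions: the function names and keyword flags (`severity`, `halves_duration`, `repeat`) are hypothetical, and the order in which overrides are applied (SEV-1 first, then the detection floor, then the repeat bump) is one reasonable reading rather than something the rules specify.

```python
PRIORITIES = ["P0", "P1", "P2", "P3"]

def base_priority(impact: int, effort: int) -> str:
    """Priority from the Impact / Effort ratio, before overrides."""
    ratio = impact / effort
    if impact == 4 or ratio >= 3.0:
        return "P0"
    if ratio >= 1.5:
        return "P1"
    if ratio >= 0.75:
        return "P2"
    return "P3"

def priority(impact, effort, *, severity=None, halves_duration=False, repeat=False):
    """Apply the overrides on top of the base ratio (order is an assumption)."""
    level = PRIORITIES.index(base_priority(impact, effort))
    if severity == "SEV-1" and impact >= 3:
        level = 0                   # SEV-1 with Impact >= 3 is always P0
    if halves_duration:
        level = min(level, 1)       # detection item that would have halved duration: at least P1
    if repeat:
        level = max(level - 1, 0)   # seen in two post-mortems: raise one level
    return PRIORITIES[level]

print(base_priority(3, 2))               # P1
print(priority(3, 2, severity="SEV-1"))  # P0
```

Keeping the base ratio and the overrides in separate functions makes it easy to show both values in a review, so the team can see when an override, not the ratio, is driving the priority.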

### Example

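```
Post-mortem: Payment service outage (SEV-1, 2 hours)

| Action                              | Cat      | Impact | Effort | Priority |
|-------------------------------------|----------|--------|--------|----------|
| Add circuit breaker to payment svc  | Prevent  | 4      | 2      | P0       |
| Alert on payment error rate > 0.5%  | Detect   | 3      | 1      | P0       |
| Write payment failover runbook      | Mitigate | 3      | 2      | P1       |
| Structured logging for retries      | Detect   | 2      | 1      | P1       |
| Refactor error handling             | Prevent  | 3      | 4      | P2       |
```

The priorities shown are the base Impact / Effort ratios, before overrides. Because this incident was a SEV-1, applying the SEV-1 override strictly would also raise the two remaining Impact 3 items (the failover runbook and the error-handling refactor) to P0.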

## Ownership Assignment

Every action item needs a single owner. "The team" does not own action items -- individuals do.

**Rules:**
1. Assign to the team that owns the affected system.
2. Assign to an individual within that team, not a team name.
3. Assign at the post-mortem meeting. Do not leave with unassigned items.
4. Respect capacity. Do not assign 10 items to one person.

```yaml
action_item:
  id: "PM-2026-0051-03"
  post_mortem: "PM-2026-0051"
  title: "Add circuit breaker to payment service"
  category: "prevent"
  priority: "P0"
  owner: "jane.doe@company.com"
  team: "payments"
  due_date: "2026-03-01"
  status: "in_progress"
  tracking_ticket: "PAY-1234"
```

**The tracking ticket rule:** Every action item must have a corresponding ticket in the owning team's project tracker. The post-mortem links to the ticket. The ticket links back to the post-mortem. This bidirectional linking ensures the item is visible in the team's normal workflow and does not live only in a document nobody revisits.
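
The ownership and ticket rules lend themselves to a mechanical check. The sketch below assumes a simplified `ActionItem` record based on the fields in the YAML example, and a hypothetical `ticket_links` mapping (ticket key to the set of post-mortem IDs that ticket references) that would be populated from the tracker.

```python
from dataclasses import dataclass

@dataclass
class ActionItem:
    id: str
    post_mortem: str
    title: str
    owner: str
    team: str
    tracking_ticket: str = ""

def validation_errors(item, ticket_links):
    """Return a list of rule violations for one action item.

    `ticket_links` maps a ticket key to the post-mortem IDs it links to
    (hypothetical structure; fill it from your tracker).
    """
    errors = []
    if not item.owner or item.owner == item.team:
        errors.append("owner must be an individual, not a team")
    if not item.tracking_ticket:
        errors.append("missing tracking ticket")
    elif item.post_mortem not in ticket_links.get(item.tracking_ticket, set()):
        errors.append("ticket does not link back to the post-mortem")
    return errors

item = ActionItem(
    id="PM-2026-0051-03",
    post_mortem="PM-2026-0051",
    title="Add circuit breaker to payment service",
    owner="jane.doe@company.com",
    team="payments",
    tracking_ticket="PAY-1234",
)
print(validation_errors(item, {"PAY-1234": {"PM-2026-0051"}}))  # []
```

Running a check like this when the post-mortem is published catches unassigned or unlinked items before anyone leaves the meeting.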

## Follow-Up Cadence

Without regular follow-up, action items decay.

| Priority | Check-In Frequency | Escalation Trigger |
|---|---|---|
| **P0** | Daily standup mention | Not started after 2 days |
| **P1** | Weekly check-in | No progress after 1 week |
| **P2** | Bi-weekly check-in | No progress after 2 weeks |
| **P3** | Monthly review | No progress after 1 month |
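
The escalation triggers in the table can be encoded directly. This is a minimal sketch with several assumptions: "1 month" is treated as 30 days, an item that has not been started uses its creation date as `last_progress`, and escalation fires once the full interval has elapsed.

```python
from datetime import date, timedelta

# Escalation triggers from the cadence table; "1 month" is approximated as 30 days.
ESCALATE_AFTER = {
    "P0": timedelta(days=2),
    "P1": timedelta(weeks=1),
    "P2": timedelta(weeks=2),
    "P3": timedelta(days=30),
}

def needs_escalation(priority: str, last_progress: date, today: date) -> bool:
    """True once the item has gone the full interval without recorded progress.

    For items never started, pass the creation date as `last_progress`.
    """
    return today - last_progress >= ESCALATE_AFTER[priority]

print(needs_escalation("P0", date(2026, 3, 1), date(2026, 3, 3)))  # True
print(needs_escalation("P1", date(2026, 3, 1), date(2026, 3, 3)))  # False
```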

### Weekly Reliability Review

A standing 30-minute meeting to review all open action items:

1. New items from this week's incidents (5 min)
2. P0 status updates from each owner (10 min)
3. P1 status updates (5 min)
4. Overdue items: blockers, reassignment (5 min)
5. Completion rate metrics (5 min)

### Automated Reminders

```python
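from datetime import date

# Days overdue before the team lead is notified (assumed; mirrors the cadence table).
ESCALATION_DAYS = {"P0": 2, "P1": 7, "P2": 14, "P3": 30}

# `items` and `send_dm(recipient, text)` are supplied by the caller (tracker query, Slack wrapper).
def check_overdue_items(items, send_dm, today=None):
    """DM owners of overdue items; escalate to the team lead past the threshold."""
    today = today or date.today()
    for item in items:
        if item.due_date >= today:
            continue  # not overdue yet
        days_overdue = (today - item.due_date).days
        send_dm(item.owner,
                f"Action item {item.id} is {days_overdue} days overdue: "
                f"{item.title} | Priority: {item.priority}")
        if days_overdue > ESCALATION_DAYS[item.priority]:
            send_dm(item.team_lead,
                    f"Escalation: {item.id} is {days_overdue} days overdue.")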
```

## Measuring Completion Rates

**Completion rate:** Items completed on time / total items due. Target 85%+. Below 70% indicates systemic problems.

**Time to completion** by priority:

| Priority | Target | Acceptable |
|---|---|---|
| P0 | 7 days | 14 days |
| P1 | 14 days | 30 days |
| P2 | 30 days | 60 days |
| P3 | 60 days | 90 days |

**Recurrence rate:** The share of incidents whose root cause was identified in a previous post-mortem that still has an open action item. Above 10% means the organization writes post-mortems but does not learn from them.

**Decay rate:** The share of open items that are more than 30 days past due with no recorded progress.

Track these in a dashboard. Trends matter more than absolute numbers.
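
As an illustration of how two of these metrics might be computed, here is a sketch over a hypothetical `Item` record. Two definitions are assumptions: the completion rate only counts items whose due date has passed, and "without progress" means no progress was recorded after the due date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Item:
    due_date: date
    completed_on: Optional[date] = None
    last_progress: Optional[date] = None

def completion_rate(items, today):
    """Items completed on or before their due date / items due so far."""
    due = [i for i in items if i.due_date <= today]
    if not due:
        return None  # nothing due yet, so the rate is undefined
    on_time = sum(1 for i in due
                  if i.completed_on is not None and i.completed_on <= i.due_date)
    return on_time / len(due)

def decayed(items, today, days=30):
    """Open items more than `days` past due with no progress since the due date."""
    return [i for i in items
            if i.completed_on is None
            and (today - i.due_date).days > days
            and (i.last_progress is None or i.last_progress <= i.due_date)]

today = date(2026, 4, 15)
items = [
    Item(date(2026, 3, 1), completed_on=date(2026, 2, 27)),   # on time
    Item(date(2026, 3, 1), completed_on=date(2026, 3, 10)),   # late
    Item(date(2026, 3, 1)),                                   # 45 days overdue, untouched
    Item(date(2026, 4, 10), last_progress=date(2026, 4, 12)), # overdue but active
    Item(date(2026, 5, 1)),                                   # not due yet
]
print(completion_rate(items, today))  # 0.25
print(len(decayed(items, today)))     # 1
```

Dividing the decayed count by the number of open items gives the decay rate; plotting both values week over week shows the trend.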

## Preventing Action Item Decay

**Set realistic due dates.** Over-ambitious dates normalize overdue status.

**Limit items per post-mortem.** Aim for 3-7 focused items. If the review produces more than that, prioritize ruthlessly and explicitly defer the rest.

**Close items that will not be done.** If an item has been deprioritized repeatedly, close it with a documented decision: "Accepted risk: choosing not to implement X because Y."

**Quarterly cleanup.** Review all items older than 90 days. For each: complete it, reprioritize it, or close it with documentation.

**Celebrate completions.** Acknowledge significant completions to reinforce that the work matters.

**Track recurrences visibly.** When an incident recurs because an action item was not completed, note it in the post-mortem.

**Make it a leadership metric.** If leadership reviews completion rates alongside velocity and uptime, teams allocate time for it.

## Step-by-Step Tracking Process

1. **Post-mortem concludes.** Facilitator has a list of proposed action items.
2. **Categorize** each as prevent, detect, or mitigate. Verify coverage across all three.
3. **Score impact and effort.** Calculate priority from the Impact / Effort ratio, then apply the overrides.
4. **Assign an individual owner.** Confirm acceptance.
5. **Set a due date** based on priority targets.
6. **Create a tracking ticket** in the owning team's tracker. Link bidirectionally to the post-mortem.
7. **Add to reliability review** based on priority and cadence.
8. **Follow up on cadence.** Check status. Surface blockers.
9. **On completion,** update the ticket, post-mortem document, and dashboard.
10. **If overdue,** escalate per triggers. Reassign if needed. Adjust due date only with documented justification.
11. **Quarterly review.** Audit all open items. Close zombies. Report metrics to leadership.
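
Step 5 can be sketched as a lookup against the Target column of the time-to-completion table. The function name `default_due_date` is illustrative, and using the target as the default due date is an assumption; owners may still negotiate a different date.

```python
from datetime import date, timedelta

# Target column of the time-to-completion table, in days.
TARGET_DAYS = {"P0": 7, "P1": 14, "P2": 30, "P3": 60}

def default_due_date(priority: str, created: date) -> date:
    """Default due date: creation date plus the target for that priority."""
    return created + timedelta(days=TARGET_DAYS[priority])

print(default_due_date("P0", date(2026, 2, 22)))  # 2026-03-01
```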

## Agent Operational Notes

- **Automate reminders.** An agent that checks status daily and sends reminders on cadence follows up more consistently than manual tracking.
- **Maintain bidirectional links.** Always include post-mortem references in tickets and ticket links in post-mortems.
- **Surface recurrence data.** When a new incident occurs, check for related open action items and flag them immediately.
- **Never close items without documentation.** Require a reason for every closure without completion.
- **Report metrics consistently.** Generate weekly/monthly completion metrics automatically to build organizational habit.
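
One way an agent might surface recurrence data is by matching tags on a new incident against open action items. This is a sketch only: the `tags` field and the tag vocabulary are hypothetical, and a real implementation would need whatever service or root-cause labels the tracker actually stores.

```python
def related_open_items(incident_tags, open_items):
    """Return open action items that share at least one tag with the incident.

    `open_items` is a list of dicts with "id" and "tags" keys (assumed shape).
    """
    tags = set(incident_tags)
    return [item for item in open_items if tags & set(item["tags"])]

open_items = [
    {"id": "PM-2026-0051-03", "tags": ["payments", "circuit-breaker"]},
    {"id": "PM-2026-0042-01", "tags": ["auth", "logging"]},
]
matches = related_open_items(["payments", "timeout"], open_items)
print([item["id"] for item in matches])  # ['PM-2026-0051-03']
```

Any match is worth flagging in the incident channel immediately, since it suggests the incident may be a recurrence of a known, unfixed cause.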

