Training 010 · Core Practices

Incident Memory

Blameless postmortems that actually preserve learning

Time: 20–30 minutes · Updated: 2025-12-18 · License: Free / Open Training


Core stance

Incidents are inevitable.
Forgetting why they happened is optional.

Incident memory is the practice of preserving causal understanding, not assigning fault.


Why this lesson exists

Many organizations run postmortems, yet still:

The problem is not the postmortem ritual.
It is the absence of memory continuity.


What incident memory is (and is not)

Incident memory is

Incident memory is not

Learning dies when incidents are treated as personal failures instead of system signals.


Why postmortems usually fail

Postmortems often fail because:

This creates the illusion of learning without its benefits.


The incident memory pattern

A continuity-safe incident memory answers five questions:

  1. What failed?
    (Observed behavior, not interpretation)

  2. Why did it fail?
    (Causal chain, including system and context)

  3. What assumptions were wrong or stressed?
    (What we believed that no longer holds)

  4. What changed as a result?
    (Decisions, safeguards, boundaries)

  5. What would cause this to be revisited?
    (Conditions, not dates)

If these are preserved, learning survives turnover.
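
The five questions above can be sketched as a small record type. This is only an illustration: the class name `IncidentMemory`, its field names, the `gaps` helper, and the example incident are all assumptions made for this sketch, not part of the lesson. The date check is one simple way to enforce "conditions, not dates" and only catches ISO-style dates.

```python
import re
from dataclasses import dataclass, fields

# Hypothetical record type; names are illustrative assumptions.
@dataclass
class IncidentMemory:
    what_failed: str                # 1. observed behavior, not interpretation
    why_it_failed: list[str]        # 2. causal chain, one step per entry
    assumptions_broken: list[str]   # 3. beliefs that no longer hold
    what_changed: list[str]         # 4. decisions, safeguards, boundaries
    revisit_conditions: list[str]   # 5. conditions, not dates

# Matches an entry that is nothing but an ISO-style date, e.g. "2026-01-15".
_DATE_ONLY = re.compile(r"^\s*\d{4}-\d{2}-\d{2}\s*$")

def gaps(record: IncidentMemory) -> list[str]:
    """Return the questions left unanswered, plus a note for bare-date triggers."""
    problems = []
    for f in fields(record):
        value = getattr(record, f.name)
        # Empty lists, empty strings, and whitespace-only strings count as unanswered.
        if not value or (isinstance(value, str) and not value.strip()):
            problems.append(f.name)
    if any(_DATE_ONLY.match(c) for c in record.revisit_conditions):
        problems.append("revisit_conditions: use a condition, not a date")
    return problems

# Made-up example incident, for illustration only.
example = IncidentMemory(
    what_failed="Checkout requests returned errors for part of an afternoon",
    why_it_failed=[
        "A dependency slowed down",
        "Retries multiplied load on the slow dependency",
    ],
    assumptions_broken=["Retries are always safe"],
    what_changed=["Added a retry budget"],
    revisit_conditions=["Retry budget is exhausted more than once a week"],
)

print(gaps(example))  # prints [] because all five questions are answered
```

A record that lists a bare date as its revisit trigger, or leaves a question empty, shows up in the returned list, which makes incomplete memory visible before the record is filed.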


Blameless does not mean consequence-free

Blameless means:

Accountability remains, but it targets systems and decisions, not individuals.


Incident memory and AI

AI systems:

Without incident memory:

Incident memory creates:


Exercises

Drill 1 — Rewrite an Old Incident

Pick a past incident report.

Rewrite it to clearly answer the five questions of the incident memory pattern.

Ignore the timeline if needed.


Drill 2 — Assumption Capture

During your next incident discussion, ask:

“What did we assume that turned out not to be true?”

Write that down explicitly.


Drill 3 — Memory Placement

Decide where incident memory should live so it is:

Move one incident there.


FAQ

Isn’t this just SRE practice?
SRE techniques are one implementation. Incident memory applies to all failures, not just outages.

Won’t this create legal risk?
In practice, clear causal understanding reduces repeated harm and exposure.

Who owns incident memory?
The incident owner captures it. Continuity ensures it persists.


Suggested next step

Take one recent incident.
Preserve its causal learning using the five-question pattern.

That single act makes recurrence less likely.


Next: Training 011 — AI Mandates & Boundaries
How to prevent silent scope expansion in automated systems.