
Postmortem: How to Do It Right

23. 10. 2025, 1 min read, intermediate

A postmortem is not about finding culprits. It is about making sure the incident does not happen again.

Blameless Culture

Instead of “Jan deleted the database,” write “There was no protection against deleting the production database.” Look for systemic causes, not culprits.
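To make the systemic framing concrete, here is a minimal sketch of the kind of protection the reworded finding asks for. Everything in it is hypothetical: the function names, the `environment` string, and the "retype the database name" confirmation are assumptions for illustration, not a description of any real tool.

```python
from typing import Optional


class ProductionGuardError(RuntimeError):
    """Raised when a destructive operation targets production without confirmation."""


def guard_destructive(environment: str, database: str, confirm: Optional[str] = None) -> None:
    """Refuse a destructive operation on production unless the caller
    retypes the exact database name as confirmation (assumed policy)."""
    if environment != "production":
        return  # this sketch does not restrict non-production environments
    if confirm != database:
        raise ProductionGuardError(
            f"Refusing to drop '{database}' in production: "
            "pass confirm=<database name> to proceed."
        )


def drop_database(environment: str, database: str, confirm: Optional[str] = None) -> str:
    guard_destructive(environment, database, confirm)
    # The real deletion would happen here; the sketch only reports it.
    return f"dropped {database} in {environment}"
```

With a guard like this, a single mistyped command is no longer enough to cause the incident, which is the point of looking for the systemic cause.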

Template

Incident: [name]

Date: YYYY-MM-DD
Severity: Critical/Major/Minor
Duration: X hours
Impact: Y users affected, Z transactions lost

Timeline

HH:MM — What happened
HH:MM — Alert fired
HH:MM — On-call notified
HH:MM — Root cause identified
HH:MM — Mitigation applied
HH:MM — Resolved

Root Cause

Detailed description of the cause.

Contributing Factors

What made the situation worse?

Action Items

Action | Owner | Deadline | Priority
Add guard | John | 2 weeks | P1
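If the template lives in a tool rather than a document, it could be modelled roughly as below. This is only a sketch: the class and field names are assumptions that mirror the template headings, and the completeness check covers just three sections.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

SEVERITIES = ("Critical", "Major", "Minor")  # values taken from the template


@dataclass
class ActionItem:
    action: str
    owner: str
    deadline: str
    priority: str


@dataclass
class Postmortem:
    name: str
    incident_date: date
    severity: str
    duration_hours: float
    impact: str
    root_cause: str = ""
    contributing_factors: List[str] = field(default_factory=list)
    action_items: List[ActionItem] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the template sections that are still empty or invalid."""
        missing = []
        if self.severity not in SEVERITIES:
            missing.append("Severity")
        if not self.root_cause.strip():
            missing.append("Root Cause")
        if not self.action_items:
            missing.append("Action Items")
        return missing
```

A review meeting could refuse to close a postmortem while `missing_sections()` returns anything.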

Key Questions

  • Why did detection take so long?
  • Why was there no automatic rollback?
  • Why didn’t tests cover this scenario?
  • Did we have a runbook? Did it help?
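The first question is easier to answer with numbers from the timeline. The sketch below derives time to detect, mitigate, and resolve from the timeline milestones. The dictionary keys are made-up names, and it assumes all timestamps fall on the same calendar day.

```python
from datetime import datetime, timedelta
from typing import Dict


def _at(hhmm: str) -> datetime:
    """Parse an 'HH:MM' timestamp as used in the timeline."""
    return datetime.strptime(hhmm, "%H:%M")


def timeline_metrics(timeline: Dict[str, str]) -> Dict[str, timedelta]:
    """Compute intervals from the start of the incident to each milestone.

    Assumes a single-day incident; an incident that crosses midnight
    would need full dates instead of bare HH:MM values.
    """
    start = _at(timeline["started"])
    return {
        "time_to_detect": _at(timeline["alert_fired"]) - start,
        "time_to_mitigate": _at(timeline["mitigation_applied"]) - start,
        "time_to_resolve": _at(timeline["resolved"]) - start,
    }


# Illustrative values only, not from a real incident.
metrics = timeline_metrics({
    "started": "10:00",
    "alert_fired": "10:40",
    "mitigation_applied": "11:30",
    "resolved": "12:15",
})
print(metrics["time_to_detect"])  # 0:40:00
```

A long time to detect points at monitoring and alerting. A long gap between detection and mitigation points at runbooks and rollback.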

Follow-up

Action items must have owners and deadlines. Review completion in weekly standups.
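The weekly review can be partly automated. The following sketch flags open action items that lack an owner or deadline and those past their deadline. `TrackedAction` and `review_action_items` are hypothetical names, and deadlines are assumed to be stored as concrete dates rather than "2 weeks".

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class TrackedAction:
    action: str
    owner: Optional[str]
    deadline: Optional[date]
    done: bool = False


def review_action_items(items: List[TrackedAction], today: date) -> Dict[str, List[str]]:
    """Report open items that are missing an owner or deadline, or are overdue."""
    open_items = [i for i in items if not i.done]
    incomplete = [
        i.action for i in open_items
        if not i.owner or i.deadline is None
    ]
    overdue = [
        i.action for i in open_items
        if i.owner and i.deadline is not None and i.deadline < today
    ]
    return {"incomplete": incomplete, "overdue": overdue}
```

Running this before each standup gives the team a short list to discuss instead of rereading every postmortem.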

Remember

A postmortem without action items is just a story. A postmortem with follow-through is an improvement.


CORE SYSTEMS team

We build core systems and AI agents that keep operations running. 15 years of experience with enterprise IT.