AOE Technology Radar

Blameless Post Mortems

devops, documentation
Adopt

Failure and invention are inseparable twins.

Jeff Bezos

Blameless Post Mortems provide a concept for dealing with failures that inevitably occur when developing and operating complex software solutions. After any major incident or outage, the team gathers to perform an in-depth analysis of what happened and what can be done to mitigate the risk of similar issues in the future.

Based on trust and the assumption that everyone involved had good intentions to do the best possible job given the information at hand, Blameless Post Mortems offer an opportunity to continuously improve the quality of software and infrastructure and the processes for dealing with critical situations. We consider this a fundamental principle that enables our staff to address deficiencies without fear of repercussions and reduces the probability of incidents being concealed.

The post-mortem documentation usually includes a timeline of the events leading to an incident and the steps taken for its remediation, as well as future actions and lessons learned to enhance the resilience and stability of our services.
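As an illustration only, the document structure described above could be sketched as a small data model. The class names, field names, and plain-text rendering below are assumptions made for this sketch, not a template prescribed by AOE. Timeline entries deliberately describe events rather than individuals, in keeping with the blameless approach.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class TimelineEvent:
    """One timestamped entry; describes what happened, not who is at fault."""
    at: datetime
    description: str


@dataclass
class PostMortem:
    """Hypothetical container for the four parts named in the text."""
    title: str
    timeline: list[TimelineEvent] = field(default_factory=list)
    remediation_steps: list[str] = field(default_factory=list)
    future_actions: list[str] = field(default_factory=list)
    lessons_learned: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the document as plain text, timeline in chronological order."""
        lines = [self.title, "", "Timeline"]
        for event in sorted(self.timeline, key=lambda e: e.at):
            lines.append(f"- {event.at:%Y-%m-%d %H:%M} {event.description}")
        sections = [
            ("Remediation steps", self.remediation_steps),
            ("Future actions", self.future_actions),
            ("Lessons learned", self.lessons_learned),
        ]
        for heading, items in sections:
            lines.append("")
            lines.append(heading)
            lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```

A team might fill such a structure in during the meeting and store the rendered text alongside its other incident documentation.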

At AOE, we make it a priority to conduct a Blameless Post Mortem meeting after every user-visible incident.

Assess
Trial

Blameless Post Mortems provide a concept for dealing with failures that inevitably occur when developing and operating complex software solutions. After any major incident or outage, the team gets together to perform an in-depth analysis of what happened and what can be done to mitigate the risk of similar issues in the future.

Based on trust, and under the assumption that every person involved had good intentions to do the best possible job given the information at hand, Blameless Post Mortems provide an opportunity to continuously improve the quality of software and infrastructure and the processes for dealing with critical situations.

The post-mortem documentation usually consists of a timeline of the events leading to an incident and the steps taken for its remediation, as well as future actions and lessons learned for increasing the resilience and stability of our services.

At AOE, we strive to conduct a Blameless Post Mortem meeting after every user-visible incident.