Category: Human Factors

Topics surrounding the idea of human factors, blame awareness, and learning culture.

Patience in Implementing Effective Incident Reviews

Note: This post originally appeared on the Learning from Incidents site and is cross-posted here for my own preservation of thoughts. You can find the original post at https://www.learningfromincidents.io/blog/patience-in-implementing-effective-incident-reviews.

Pressure in just getting an incident review “done”

As we struggle to understand how things go wrong, to learn from incidents, and to prepare ourselves for future surprises, the hurried rush to…

Crystal Ball

Peering into the future of Resilience Engineering in Tech

Since coming back from SREcon 19 Americas in Brooklyn (catch up with Tanya Reilly’s conf report) and Chaos Community Day 19 in Manhattan (Nora Jones’ Chaos Engineering Traps), Resilience Engineering has had my full attention. I’m thoroughly encouraged to see so many folks interested in it and speakers from many different companies contributing their shared experiences to a field that…

Thoughts on the role of Incident Commander

As with most of my blog posts, this should be considered a living document. The ideas offered here are malleable, just as I would hope the document it references is flexible to new ideas. Conversations surrounding this are welcomed and encouraged as we all continue to learn. I recently came across PagerDuty’s documentation surrounding their philosophy on the Incident Commander (here…