I recently had a chance to review Meltdown: Why Our Systems Fail and What We Can Do About It, which takes a critical look at examples of catastrophic failure in many different areas and applies Perrow’s theory of Normal Accidents to address the systemic problems we face regularly. A lot of the background in human factors I’m… Read more →
Category: Human Factors
No, seriously. Root Cause is a Fallacy.
I’m just back from attending SREcon ’18 Americas in Santa Clara last week, an incredible conference I’ve spoken at before (a tutorial in Dublin in 2016), but never in the U.S. You can find some blog posts written about specifics (Day 1, Day 2, Day 3), but I wouldn’t be able to do it justice myself, so read those!… Read more →
Thoughts on the role of Incident Commander
As with most of my blog posts, this should be considered a living document. The ideas offered here are malleable, just as I would hope the document it references is flexible to new ideas. Conversations surrounding this are welcomed and encouraged as we all continue to learn. I recently came across PagerDuty’s documentation surrounding their philosophy on the Incident Commander (here… Read more →
Recognizing adaptability in learning
The following is the current version of a section in my book on interviewing for technical roles. I’m trying to help out with any advice I can while I’m putting all of this together, and I’m looking for constructive criticism and feedback along the way. My experiences as an engineer are also not universal and so my own biases… Read more →