Check it out, a conference from the Learning From Incidents people!
Echoing Bainbridge’s Ironies of Automation, this article discusses the potential dangers of over-automation, using an air accident as a case study. I hadn’t been aware of the term “Children of the Magenta” before.
Katie Mingle — 99% Invisible
There’s more to it than just hacking together some Slack workflows.
Ryan McDonald — FireHydrant
Honeycomb doesn’t do its SLOs “by the book”.
The way Honeycomb defines SLOs is radically different from what I expected. Instead of the definitions I wrote about at the beginning of this post, I saw:
Reid Savage — Honeycomb
Full disclosure: Honeycomb is my employer.
An anonymous Twitter engineer talks about what’s going on over there and how they think it might play out.
Chris Stokel-Walker — MIT Technology Review
They rolled out automated rollbacks across a complex infrastructure, and in this article, they share the lessons they learned in the process.
Will Sewell and Joseph Pallamidessi — Monzo
Okay. Here’s the Important Thing:
As you approach maximum throughput, average queue size – and therefore average wait time – approaches infinity.
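To make that claim concrete, here is a minimal sketch using the textbook M/M/1 queue (a single server, random Poisson arrivals, exponentially distributed service times). That model is an assumption chosen for illustration and is not necessarily the one the article uses; the function name `mm1_queue_stats` and its parameters are hypothetical.

```python
def mm1_queue_stats(arrival_rate: float, service_rate: float) -> tuple[float, float, float]:
    """Return (utilization, avg_queue_length, avg_wait) for an M/M/1 queue.

    arrival_rate: average requests arriving per second (lambda).
    service_rate: average requests the server can finish per second (mu),
                  i.e. the maximum throughput.
    Assumes the M/M/1 model; real systems will differ in the details.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, and service_rate must be positive")
    if arrival_rate >= service_rate:
        # At or beyond maximum throughput the queue never reaches a steady state.
        raise ValueError("arrival_rate must be below service_rate")

    utilization = arrival_rate / service_rate              # rho
    avg_queue_length = utilization**2 / (1 - utilization)  # Lq: requests waiting
    avg_wait = utilization / (service_rate - arrival_rate) # Wq: seconds spent waiting
    return utilization, avg_queue_length, avg_wait


if __name__ == "__main__":
    # A server that can handle 100 requests per second, at increasing load.
    for load in (50, 90, 99, 99.9):
        rho, queue_len, wait = mm1_queue_stats(load, 100)
        print(f"utilization={rho:.3f}  avg queue={queue_len:.2f}  avg wait={wait:.3f}s")
```

Under these assumptions, a server running at 50% of its maximum throughput has an average of about half a request waiting, while at 99% the average queue is about 98 requests. The `1 - utilization` term in the denominator is what drives both figures toward infinity as load nears capacity.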
It was not clear to the pilots that the fuel estimation system was not designed to be used in the way they were using it.
As is usually the case with air accidents, the crash of Air Florida flight 90 did not have a single cause. In fact, the accident was the result of the confluence of two proximate factors, each of which was itself the culmination of a long chain of errors.