Articles
Incident duration and severity are not related, and we have the in-depth data to prove it.
It’s time for another VOID report! I’m glad this project is still going strong.
Courtney Nash — Verica
I haven’t been paying attention to the recent attempts to legislate cloud provider reliability, and this article was a great catch-up. There’s a lot going on here.
Jeff Martens — Metrist
I’m still trying to figure out how I feel about this one, but I’m definitely glad I read it.
Fred Hebert
FireHydrant published this report with statistics from over 50,000 incidents experienced by their customers.
FireHydrant
Want to get a solid understanding of how Linux shells work, including file descriptors, process management, and sessions? This one goes really deep with lots of example programs.
Viacheslav Biriukov
Check it out: Google Search finally has a proper status page!
It’s one of those “awesome ___” repos on GitHub, this time for resources about writing SLOs.
Steve Azzopardi (@steveazz)
If you’re going to classify incidents by “root cause”, try these on for size: “production pressure”, “goal conflicts”, and more in this article.
Lorin Hochstein
Sure, the pilots were engaging in an activity that could be considered dubious. But what’s really worth digging into in this air accident is how surprise may have led them to forget their training on how to recover stable flight.
More on the same accident:
Admiral Cloudberg
SRE WEEKLY