SRE Weekly Issue #343

Bit of a short one this week as I recover from my third bout of COVID. Fortunately, this is another relatively mild one (thank you, vaccine!). Good luck everyone, and get your boosters.

Articles

Authors’ Cut—Actionable SLOs Based on What Matters Most

This article explores the advantages of powering SLOs with observability data.

Pierre Tessier — Honeycomb
Full disclosure: Honeycomb is my employer.

#JWST: Day 2 Operations of the Most Expensive SRE Project

As the James Webb Space Telescope moves into normal operations, there are more great SRE lessons to be learned.

Jennifer Riggins — The New Stack

How to Build Software like an SRE

Over five years as an SRE, the author of this article gathered a set of best-practice patterns for software development and operation, which they share with us.

brandon willett

Mussel — Airbnb’s Key-Value Store for Derived Data

How Airbnb built a persistent, highly available, low-latency key-value storage engine for accessing derived data from offline and streaming events.

Chandramouli Rangarajan, Shouyan Guo, Yuxi Jin — Airbnb

Why MTTR should be a ‘business’ metric

By owning and reporting MTTR, teams have no choice but to be accountable for the reliability of the code they write. This dramatically changes the culture of engineering.

Sidu Ponnappa — Last9

Alaskan Double-Cross: The crash of PenAir flight 3296

I learned about plan continuation bias while reading this air accident report, and I’m certain I’ve experienced this during incidents I’ve been involved in.

Admiral Cloudberg
