SRE Weekly Issue #308

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating an incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt):


Oh, now this is fascinating. Firefox, like, the web browser itself, had an outage in January. It just stopped working for everyone.

  Christian Holler — Mozilla

If you’re looking for an explainer on the CAP theorem, this article gives a great overview with practical details.

  Bartłomiej Żyliński — SoftwareMill

This is about what the security field can learn from SRE. Obviously not directly applicable to SRE, but this article gives us a really great outside perspective.

  Anton Chuvakin — Security Boulevard

Code doesn’t “rot”, but the environment around it changes constantly.

  Lorin Hochstein

How do complex service dependencies affect your SLA? What if service A depends on services B and C both being up, but needs only one of service D or E to be up?

TL;DR: for serial dependencies, multiply the availabilities; for parallel dependencies, multiply the unavailabilities.

  Alex Ewerlöf
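
To make the TL;DR concrete, here is a minimal sketch of that arithmetic in Python. It assumes failures are independent, which real systems often violate, and the availability figures for B, C, D, and E are made-up numbers for illustration, not taken from the article. The function names are hypothetical as well.

```python
from math import prod


def serial_availability(availabilities):
    """All dependencies must be up: multiply the availabilities."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """Any one dependency is enough: multiply the unavailabilities,
    then subtract the result from 1."""
    return 1 - prod(1 - a for a in availabilities)


# Illustrative numbers only (assumed, not from the article).
b, c = 0.999, 0.999   # A needs both B and C
d, e = 0.99, 0.99     # A needs at least one of D or E

d_or_e = parallel_availability([d, e])
a = serial_availability([b, c, d_or_e])

print(f"D-or-E group: {d_or_e:.4%}")
print(f"Best case for A from its dependencies: {a:.4%}")
```

With these numbers the redundant D/E pair comes out more available than either service alone, while the serial chain through B and C drags the total below any single dependency.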

Here’s how and why not to be a hero. It’s bad for you and everyone else too.

  Isaac Seymour —


Amazon Alexa
Apple Card
