SRE Weekly Issue #326

A message from our sponsor, Rootly:

Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set):
https://rootly.com/demo/

Articles

Catchpoint and Blameless have teamed up on this year’s SRE survey. They’ve sweetened the deal with two $5 donations to charity for every survey completed. Go do it!

  Kurt Andersen — Blameless

I sure miss the good old “checkmark-i” icon. Oh wait, no I don’t.

  Jeff Martens — Metrist

How can you handle failure gracefully? Click through for 6 strategies to consider.

  Boris Cherkasky — Riskified

Declaring the first incident when you start a new job can be intimidating, but it really shouldn’t be. Let’s look at some common fears, and work out how to address them.

  Isaac Seymour — incident.io

The incident involved fiber equipment failure and a suboptimal automated remediation.

  Google

This is a primer on Urgency and Impact in incidents, including the difference between them and how to use them.

  Noor-ul-Anam Ruqayya — Blameless

Running retrospectives on near-miss incidents can be highly valuable, as this article discusses.

  Vanessa Huerta Granda — Jeli

Outages

Facebook, Instagram, and WhatsApp
Substack emails
LinkedIn
Yahoo Mail