{"id":638,"date":"2022-10-03T00:48:08","date_gmt":"2022-10-03T00:48:08","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2022\/10\/03\/sre-weekly-issue-341\/"},"modified":"2022-10-03T00:48:08","modified_gmt":"2022-10-03T00:48:08","slug":"sre-weekly-issue-341","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2022\/10\/03\/sre-weekly-issue-341\/","title":{"rendered":"SRE Weekly Issue #341"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-341\/\" title=\"Permalink to SRE Weekly Issue #341\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/rootly.com\/demo\/?utm_source=sreweekly\">Rootly<\/a>:<\/h2>\n<p>Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92.<\/p>\n<p>Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. 
Want to see why companies like Canva and Grammarly love us?:<\/p>\n<p><a href=\"https:\/\/rootly.com\/demo\/\">https:\/\/rootly.com\/demo\/<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/sigops.org\/s\/conferences\/hotos\/2021\/papers\/hotos21-s11-bronson.pdf\" target=\"_blank\" rel=\"noopener\">https:\/\/sigops.org\/s\/conferences\/hotos\/2021\/papers\/hotos21-s11-bronson.pdf<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>My coworkers referred to a system \u201cgoing metastable\u201d, and when I asked what that was, they pointed me to this awesome paper.<\/p>\n<p>Metastable failures occur in open systems with an uncontrolled source of load where a trigger causes the system to enter a <em>bad state that persists even when the trigger is removed<\/em>.<\/p>\n<p>\u00a0\u00a0<small>Nathan Bronson, Aleksey Charapko, Abutalib Aghayev, and Timothy Zhu<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/status.honeycomb.io\/incidents\/8b04gv08jxt0\" target=\"_blank\" rel=\"noopener\">Honeycomb incident report: Querying Errors<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Honeycomb posted this incident report involving a service hitting the open file descriptors limit.<\/p>\n<p>\u00a0\u00a0<small>Honeycomb<\/small><br \/>\u00a0\u00a0<small><em>Full disclosure: Honeycomb is my employer.<\/em><\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.reddit.com\/r\/sre\/comments\/xs926z\/what_does_your_oncall_rotas_look_like\/\" target=\"_blank\" rel=\"noopener\">[reddit r\/sre] What does your oncall rotas look like?<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Lots of interesting answers to this one, especially when someone uttered the phrase:<\/p>\n<p>engineers 
should not be on call<\/p>\n<p>\u00a0\u00a0<small>u\/infomaniac89 and others \u2014 reddit<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/status.cloud.google.com\/incidents\/X8SNkK2BPyCrc1sveeiu\" target=\"_blank\" rel=\"noopener\">Incident Report: Google Cloud Filestore Outage 2022-09-13<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A misbehaving internal Google service overloaded Cloud Filestore, exceeding its global request limit and effectively DoSing customers.<\/p>\n<p>\u00a0\u00a0<small>Google<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.developer.adobe.com\/creating-a-thriving-on-call-engineering-workflow-by-embracing-healthy-team-habits-f05841e62ea1\" target=\"_blank\" rel=\"noopener\">Creating a Thriving On-Call Engineering Workflow by Embracing Healthy Team Habits<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>An in-depth look at how Adobe improved its on-call experience.  
They used a deliberate plan to change their team\u2019s on-call habits for the better.<\/p>\n<p>\u00a0\u00a0<small>Bianca Costache \u2014 Adobe<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/metrist.io\/blog\/heres-how-chicago-trading-companys-luke-rotta-engineers-resilient-systems\/\" target=\"_blank\" rel=\"noopener\">Here\u2019s How Chicago Trading Company\u2019s Luke Rotta Engineers Resilient Systems<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one contains an interesting observation: they found that outages caused by cloud providers take longer to solve.<\/p>\n<p>\u00a0\u00a0<small>Jeff Martens \u2014 Metrist<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/incident.io\/blog\/ditch-detailed-plans\" target=\"_blank\" rel=\"noopener\">Why you should ditch your overly detailed incident response plan | incident.io<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Even if you don\u2019t agree with all of their reasons, it\u2019s definitely worth thinking about.<\/p>\n<p>\u00a0\u00a0<small>Danny Martinez \u2014 incident.io<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.softwareatscale.dev\/p\/thoughts-on-api-reliability\" target=\"_blank\" rel=\"noopener\">Thoughts on API Reliability<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one covers common reliability risks in APIs and techniques for mitigating them.<\/p>\n<p>\u00a0\u00a0<small>Utsav Shah<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.honeycomb.io\/blog\/future-ops-platform-engineering\" target=\"_blank\" rel=\"noopener\">The Future of Ops Is Platform Engineering<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The evolution beyond separate Dev and Ops teams continues. 
This article traces the path through DevOps and into platform-focused teams.<\/p>\n<p>\u00a0\u00a0<small>Charity Majors \u2014 Honeycomb<\/small><br \/>\u00a0\u00a0<small><em>Full disclosure: Honeycomb is my employer.<\/em><\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?: https:\/\/rootly.com\/demo\/ Articles https:\/\/sigops.org\/s\/conferences\/hotos\/2021\/papers\/hotos21-s11-bronson.pdf My coworkers&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2022\/10\/03\/sre-weekly-issue-341\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #341<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-638","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":543,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/21\/sre-weekly-issue-310\/","url_meta":{"origin":638,"position":0},"title":"SRE Weekly Issue #310","date":"February 21, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. 
Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":579,"url":"https:\/\/fde.cat\/index.php\/2022\/05\/30\/sre-weekly-issue-324\/","url_meta":{"origin":638,"position":1},"title":"SRE Weekly Issue #324","date":"May 30, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":535,"url":"https:\/\/fde.cat\/index.php\/2022\/01\/24\/sre-weekly-issue-306\/","url_meta":{"origin":638,"position":2},"title":"SRE Weekly Issue #306","date":"January 24, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":603,"url":"https:\/\/fde.cat\/index.php\/2022\/07\/04\/sre-weekly-issue-329\/","url_meta":{"origin":638,"position":3},"title":"SRE Weekly Issue #329","date":"July 4, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. 
Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":653,"url":"https:\/\/fde.cat\/index.php\/2022\/11\/21\/sre-weekly-issue-348\/","url_meta":{"origin":638,"position":4},"title":"SRE Weekly Issue #348","date":"November 21, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":540,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/07\/sre-weekly-issue-308\/","url_meta":{"origin":638,"position":5},"title":"SRE Weekly Issue #308","date":"February 7, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. 
Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/638","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=638"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/638\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=638"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=638"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=638"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}