{"id":568,"date":"2022-05-02T01:26:03","date_gmt":"2022-05-02T01:26:03","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2022\/05\/02\/sre-weekly-issue-320\/"},"modified":"2022-05-02T01:26:03","modified_gmt":"2022-05-02T01:26:03","slug":"sre-weekly-issue-320","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2022\/05\/02\/sre-weekly-issue-320\/","title":{"rendered":"SRE Weekly Issue #320"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-320\/\" title=\"Permalink to SRE Weekly Issue #320\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/rootly.com\/demo\/?utm_source=sreweekly\">Rootly<\/a>:<\/h2>\n<p>Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set):<br \/>\n<a href=\"https:\/\/rootly.com\/demo\/\">https:\/\/rootly.com\/demo\/<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/slack.engineering\/slacks-incident-on-2-22-22\/\" target=\"_blank\" rel=\"noopener\">Slack\u2019s Incident on 2-22-22<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Slack shared this write-up of their February outage, which involved complex systems interactions and cascading failure.<\/p>\n<p>\u00a0\u00a0Laura Nolan \u2014 Slack<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.youtube-nocookie.com\/embed\/Zq8FNk8Tboo?rel=0&amp;start=17693&amp;end=18000\" target=\"_blank\" rel=\"noopener\">The Repeat Incident Fallacy<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Go watch this lightning talk now!  
She had me hooked within the first ten seconds.<\/p>\n<p>Hi, my name is Emily Ruppe, I work at Jeli.io, and I am a recovering incident commander, and I am sick of the phrase \u201cto prevent this incident from ever happening again\u201d.<\/p>\n<p>\u00a0\u00a0Emily Ruppe \u2014 DevOpsDays Rockies<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/lethain.com\/founding-uber-sre\/\" target=\"_blank\" rel=\"noopener\">Founding Uber SRE.<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This is my personal story of starting the SRE organization at Uber.<\/p>\n<p>This article was written by a former Uber employee and is posted on their personal blog.<\/p>\n<p>\u00a0\u00a0Will Larson<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.atlassian.com\/engineering\/post-incident-review-april-2022-outage\" target=\"_blank\" rel=\"noopener\">Post-Incident Review on the Atlassian April 2022 outage<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This is total transparency at its finest.  This write-up has all the details you could ever hope for on what went wrong, how they responded, and what comes next.<\/p>\n<p>\u00a0\u00a0Sri Viswanath \u2014 Atlassian<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.srepath.com\/site-reliability-engineering-glossary\/\" target=\"_blank\" rel=\"noopener\">Site Reliability Engineering Glossary<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The target audience is new SREs and executive sponsors who would keep hearing these terms repeatedly but not take the time to read 1000s of words at a time.<\/p>\n<p>[source: author comment on Reddit]<\/p>\n<p>\u00a0\u00a0Ash P. 
\u2014 SREPath<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/dropbox.tech\/infrastructure\/disaster-readiness-test-failover-blackhole-sjc\" target=\"_blank\" rel=\"noopener\">That time we unplugged a data center to test our disaster readiness<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Dropbox wanted to be able to handle datacenter failure.  To reach this goal, they moved from an active\/active model to active\/passive and spun up a new Disaster Readiness team to rework their failover system.<\/p>\n<p>\u00a0\u00a0Krishelle Hardson-Hurley, Ross Delinger, and Tong Pham \u2014 Dropbox<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.hellofresh.com\/slos-for-everyone-with-sloth-1704009b20a2\" target=\"_blank\" rel=\"noopener\">SLOs for everyone with Sloth<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>HelloFresh drove the implementation of SLOs in their Kubernetes-based infrastructure using Prometheus and Sloth.<\/p>\n<p>\u00a0\u00a0Chris Loukas \u2014 HelloFresh<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.roblox.com\/2022\/04\/delivering-large-scale-platform-reliability\/\" target=\"_blank\" rel=\"noopener\">Delivering Large-Scale Platform Reliability<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A Roblox engineer outlines the way that Roblox handles reliability at scale.<\/p>\n<p>\u00a0\u00a0Alberto Covarrubias \u2014 Roblox<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.symops.com\/2022\/04\/21\/your-on-call-rotation-is-painful\/?utm_campaign=team&amp;utm_medium=newsletter&amp;utm_source=sreweekly\" target=\"_blank\" rel=\"noopener\">Your On Call Rotation is Painful (And Here\u2019s How to Make it Better)<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>[\u2026] 
let\u2019s look at some common on call antipatterns and some simple things we can do to alleviate their common pitfalls.<\/p>\n<p>\u00a0\u00a0Nickolas Means \u2014 Sym<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<h2>Outages<\/h2>\n<p><a href=\"https:\/\/support.myfitnesspal.com\/hc\/en-us\/articles\/5716717249933-Unable-to-Create-New-or-Edit-Existing-Foods-Meals-Recipes-Quick-Add-Sync-and-or-Login\">myfitnesspal<\/a><br \/>\n<a href=\"https:\/\/www.dynstatus.com\/incidents\/s397j2k5d9j5\">Dyn<\/a><br \/>\n<a href=\"https:\/\/9to5mac.com\/2022\/04\/25\/apple-music-and-app-store-currently-facing-downtime-for-some\/\">Apple Music and App Store<\/a><br \/>\n<a href=\"https:\/\/techcabal.com\/2022\/04\/28\/whatsapp-outage-as-thousands-around-the-world-send-messages\/\">WhatsApp<\/a><br \/>\n<a href=\"https:\/\/www.itworldcanada.com\/post\/1password-suffer-an-outage-caused-by-a-database-upgrade\">1Password<\/a><br \/>\nSRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. 
Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/ Articles Slack\u2019s Incident on 2-22-22&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2022\/05\/02\/sre-weekly-issue-320\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #320<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-568","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":543,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/21\/sre-weekly-issue-310\/","url_meta":{"origin":568,"position":0},"title":"SRE Weekly Issue #310","date":"February 21, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":681,"url":"https:\/\/fde.cat\/index.php\/2023\/02\/20\/sre-weekly-issue-360\/","url_meta":{"origin":568,"position":1},"title":"SRE Weekly Issue #360","date":"February 20, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. 
Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":535,"url":"https:\/\/fde.cat\/index.php\/2022\/01\/24\/sre-weekly-issue-306\/","url_meta":{"origin":568,"position":2},"title":"SRE Weekly Issue #306","date":"January 24, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":622,"url":"https:\/\/fde.cat\/index.php\/2022\/08\/22\/sre-weekly-issue-335\/","url_meta":{"origin":568,"position":3},"title":"SRE Weekly Issue #335","date":"August 22, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":579,"url":"https:\/\/fde.cat\/index.php\/2022\/05\/30\/sre-weekly-issue-324\/","url_meta":{"origin":568,"position":4},"title":"SRE Weekly Issue #324","date":"May 30, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. 
Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":540,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/07\/sre-weekly-issue-308\/","url_meta":{"origin":568,"position":5},"title":"SRE Weekly Issue #308","date":"February 7, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/568","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=568"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/568\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=568"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=568"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=568"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}