{"id":627,"date":"2022-09-05T01:42:16","date_gmt":"2022-09-05T01:42:16","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2022\/09\/05\/sre-weekly-issue-337\/"},"modified":"2022-09-05T01:42:16","modified_gmt":"2022-09-05T01:42:16","slug":"sre-weekly-issue-337","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2022\/09\/05\/sre-weekly-issue-337\/","title":{"rendered":"SRE Weekly Issue #337"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-337\/\" title=\"Permalink to SRE Weekly Issue #337\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<p>Thanks for all the vacation well-wishes!  It was really great and relaxing.  Take vacations, it\u2019s important for reliability!<\/p>\n<p>While I was out, I shipped the past two issues with content prepared in advance, and without the Outages section.  This gave me a chance to really think hard about the value of the Outages section versus the time and effort I put into it.<\/p>\n<p><strong>I\u2019ve decided to put the Outages section on hiatus for the time being.<\/strong>  For notable outages, I\u2019ll include them in the main section, on a case-by-case basis.  Read on if you\u2019re interested in what went into this decision.<\/p>\n<p>The Outages section has always been of lower quality than the rest of the newsletter.  I have no scientific process for choosing which Outages make the cut \u2014 mostly it\u2019s just whatever shows up in my Google search alerts and seems \u201cimportant\u201d, minus a few arbitrary categories that don\u2019t seem particularly interesting like telecoms and games.  I do only a cursory review of the outage-related news articles I link to, and often they\u2019re on poor-quality sites with a ton of intrusive ads.  
Gathering the list of Outages has begun taking more and more of my time, and I\u2019d much rather spend that effort on curating quality content, so that\u2019s what I\u2019m going to do going forward.<\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/rootly.com\/demo\/?utm_source=sreweekly\">Rootly<\/a>:<\/h2>\n<p>Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92.<\/p>\n<p>Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?:<\/p>\n<p><a href=\"https:\/\/rootly.com\/demo\/\">https:\/\/rootly.com\/demo\/<\/a><\/p>\n<\/div>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.jeli.io\/blog\/10-things-i-learned-from-my-first-incident-review\" target=\"_blank\" rel=\"noopener\">10 Things I Learned From My First Incident Review<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Every one of these 10 items is enough reason to read this article! 
This makes me want to go investigate some incidents right now.<\/p>\n<p>\u00a0\u00a0Fischer Jemison \u2014 Jeli<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/slack.engineering\/circuit-breakers\/\" target=\"_blank\" rel=\"noopener\">Slowing Down to Speed Up \u2013 Circuit Breakers for Slack\u2019s CI\/CD<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Slack shares with us in great detail why they use circuit breakers and how they rolled them out.<\/p>\n<p>\u00a0\u00a0Frank Chen \u2014 Slack<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/thenewstack.io\/tips-to-make-your-on-call-process-less-stressful\/\" target=\"_blank\" rel=\"noopener\">Tips to Make Your On-Call Process Less Stressful<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>My favorite part of this one is the section on expectations.  We need to socialize this to help reduce the pressure on folks going on call for the first time.<\/p>\n<p>\u00a0\u00a0Prakya Vasudevan \u2014 Squadcast<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/metrist.io\/blog\/why-status-pages-are-lying-to-you-and-what-to-do-about-it\/\" target=\"_blank\" rel=\"noopener\">Why Status Pages Are Lying to You and What To Do About It<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Status pages are marketing material.  
Prove me wrong.<\/p>\n<p>\u00a0\u00a0Ellen Steinke \u2014 Metrist<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/incident.io\/blog\/level-up-teams\" target=\"_blank\" rel=\"noopener\">Using incidents to level up your teams<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>incidents have unusually high information density compared with day-to-day work, and they enable you to piggy-back on the experience of others<\/p>\n<p>\u00a0\u00a0Lisa Karlin Curtis \u2014 incident.io<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.grab.com\/how-we-store-millions-orders\" target=\"_blank\" rel=\"noopener\">How we store and process millions of orders daily<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>These folks realized that they had two different use cases for the same data, real-time transactions and batch processing.  Rather than try to find one DB that could support both, they fork two copies of the data.<\/p>\n<p>\u00a0\u00a0Xi Chen and Siliang Cao \u2014 Grab<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/charity.wtf\/2022\/08\/15\/live-your-best-life-with-structured-events\/\" target=\"_blank\" rel=\"noopener\">Live Your Best Life With Structured Events<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>It\u2019s all about gathering enough information that you can ask new questions when something goes wrong, rather than being stuck with only answers to the questions you thought to ask in advance.<\/p>\n<p>\u00a0\u00a0Charity Majors<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/discord.com\/blog\/how-discord-supercharges-network-disks-for-extreme-low-latency\" target=\"_blank\" rel=\"noopener\">How Discord Supercharges Network Disks for Extreme Low Latency<\/a><\/div>\n<div 
class=\"sreweekly-description\">\n<p>They needed the speed of local ephemeral SSDs but the reliability of network-based persistent disks.  The solution: a Linux MD option that mirrors the data but prefers reading from the local disks.  Neat!<\/p>\n<p>\u00a0\u00a0Glen Oakley \u2014 Discord<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.linkedin.com\/blog\/2022\/operating-system-upgrades-at-linkedin-s-scale\" target=\"_blank\" rel=\"noopener\">Operating system upgrades at LinkedIn\u2019s scale<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>OS upgrades can be risky. LinkedIn developed a system to unify OS upgrade procedures and make them much less risky.<\/p>\n<p>\u00a0\u00a0Hengyang Hu, Dinesh Dhakal, and Kalyanasundaram Somasundaram \u2014 LinkedIn<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com Thanks for all the vacation well-wishes! It was really great and relaxing. Take vacations, it\u2019s important for reliability! While I was out, I shipped the past two issues with content prepared in advance, and without the Outages section. 
This gave me a chance to really think hard about the value of the&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2022\/09\/05\/sre-weekly-issue-337\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #337<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-627","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":621,"url":"https:\/\/fde.cat\/index.php\/2022\/08\/15\/sre-weekly-issue-334\/","url_meta":{"origin":627,"position":0},"title":"SRE Weekly Issue #334","date":"August 15, 2022","format":false,"excerpt":"View on sreweekly.com I\u2019ll be on vacation starting next Sunday (yay!). That means the next two issues will be prepared in advance, so there won\u2019t be an Outages section. A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":500,"url":"https:\/\/fde.cat\/index.php\/2021\/11\/08\/sre-weekly-issue-295\/","url_meta":{"origin":627,"position":1},"title":"SRE Weekly Issue #295","date":"November 8, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. 
Book a demo: https:\/\/rootly.com\/?utm_source=sreweekly Articles MTTR is a Misleading Metric\u2014Now What?\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":347,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-283\/","url_meta":{"origin":627,"position":2},"title":"SRE Weekly Issue #283","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com I\u2019m on vacation enjoying the sunny beaches in Maine with my family, so I prepared this week\u2019s issue in advance.\u00a0 No outages section, save for\u00a0one big one I noticed due to direct personal experience.\u00a0 See you all next week! A message from our sponsor, StackHawk: StackHawk is\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":557,"url":"https:\/\/fde.cat\/index.php\/2022\/03\/28\/sre-weekly-issue-315\/","url_meta":{"origin":627,"position":3},"title":"SRE Weekly Issue #315","date":"March 28, 2022","format":false,"excerpt":"View on sreweekly.com I\u2019m going on vacation, so I\u2019m going to prepare next week\u2019s issue in advance. It\u2019ll look much like most issues, except there won\u2019t be an Outages section. See you all in two weeks! A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92.\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":252,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-255\/","url_meta":{"origin":627,"position":4},"title":"SRE Weekly Issue #255","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: With StackHawk\u2019s new GitHub Action, you can integrate AppSec testing directly into your GitHub CI\/CD pipeline. 
See how: http:\/\/sthwk.com\/appsec-github-action Articles Why It Should Be Service, Not Site Reliability It really should! Even Google is much more accurately described as a \u201cservice\u201d\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":611,"url":"https:\/\/fde.cat\/index.php\/2022\/07\/25\/sre-weekly-issue-331\/","url_meta":{"origin":627,"position":5},"title":"SRE Weekly Issue #331","date":"July 25, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/627","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=627"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/627\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=627"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=627"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=627"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}