{"id":717,"date":"2023-05-22T01:21:37","date_gmt":"2023-05-22T01:21:37","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2023\/05\/22\/sre-weekly-issue-373\/"},"modified":"2023-05-22T01:21:37","modified_gmt":"2023-05-22T01:21:37","slug":"sre-weekly-issue-373","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2023\/05\/22\/sre-weekly-issue-373\/","title":{"rendered":"SRE Weekly Issue #373"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-373\/\" title=\"Permalink to SRE Weekly Issue #373\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/rootly.com\/demo\/?utm_source=sreweekly\">Rootly<\/a>:<\/h2>\n<p>Rootly is hiring for a Sr. Developer Relations Advocate to continue helping more world-class companies like Figma, NVIDIA, Squarespace, accelerate their incident management journey. Looking for previous on-call engineers with a passion for making the world a more reliable place.  Learn more:<\/p>\n<p><a href=\"https:\/\/rootly.com\/careers?gh_jid=4015888007\">https:\/\/rootly.com\/careers?gh_jid=4015888007<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.datadoghq.com\/blog\/2023-03-08-multiregion-infrastructure-connectivity-issue\/\" target=\"_blank\" rel=\"noopener\">2023 03 08 Incident: Infrastructure connectivity issue affecting multiple regions<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Datadog posted a report on their major outage in March, and it\u2019s a doozy.  
An unattended updates system that they didn\u2019t even want, need, or know about triggered across all hosts in multiple clouds nearly simultaneously, causing a regression.<\/p>\n<p>\u00a0\u00a0<small>Alexis L\u00ea-Qu\u00f4c \u2014 Datadog<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/github.blog\/2023-05-16-addressing-githubs-recent-availability-issues\/\" target=\"_blank\" rel=\"noopener\">Addressing GitHub\u2019s recent availability issues<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>GitHub has had a string of apparently unrelated outages recently, and they\u2019ve posted this description.<\/p>\n<p>\u00a0\u00a0<small>Mike Hanley \u2014 GitHub<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/github.com\/StanzaSystems\/awesome-load-management\" target=\"_blank\" rel=\"noopener\">StanzaSystems\/awesome-load-management<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Oh look, another awesome-* repo relevant to our interests!<\/p>\n<p>A repo of links to articles, papers, conference talks, and tooling related to load management in software services: loadshedding, circuitbreaking, quota management and throttling. 
PRs welcome.<\/p>\n<p>\u00a0\u00a0<small>Laura Nolan and Niall Murphy \u2014 Stanza Systems<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.srestories.dev\/p\/sre-story-with-matthew-iselin\" target=\"_blank\" rel=\"noopener\">SRE Story with Matthew Iselin<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This interview covers a lot of ground, including looking beyond just \u201cup or down\u201d when considering reliability.<\/p>\n<p>\u00a0\u00a0<small>Prathamesh Sonpatki \u2014 SRE Stories<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/netflixtechblog.com\/debugging-a-fuse-deadlock-in-the-linux-kernel-c75cd7989b6d\" target=\"_blank\" rel=\"noopener\">Debugging a FUSE deadlock in the Linux kernel<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>If you\u2019re in the mood for a deep systems debugging story, you\u2019re in for a treat.  The author takes you along for the ride with a wealth of detailed code snippets.<\/p>\n<p>\u00a0\u00a0<small>Tycho Andersen \u2014 Netflix<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/redpanda.com\/blog\/why-fsync-is-needed-for-data-safety-in-kafka-or-non-byzantine-protocols\" target=\"_blank\" rel=\"noopener\">Why `fsync()`: Losing Unsynced Data<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Regardless of the replication mechanism you must fsync() your data to prevent global data loss in non-Byzantine protocols.<\/p>\n<p>\u00a0\u00a0<small>Denis Rystsov and Alexander Gallego \u2014 Redpanda<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/certomodo.substack.com\/p\/emotional-intelligence\" target=\"_blank\" rel=\"noopener\">Emotional Intelligence<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Emotional intelligence is a critical 
skill for SREs, especially when we interact with other teams in fraught situations.<\/p>\n<p>\u00a0\u00a0<small>Amin Astaneh \u2014 Certo Modo<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.atspotify.com\/2023\/05\/fleet-management-at-spotify-part-3-fleet-wide-refactoring\/\" target=\"_blank\" rel=\"noopener\">Fleet Management at Spotify (Part 3): Fleet-wide Refactoring<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Wow! Spotify created a set of tools to perform automated refactoring of thousands of repositories at once.  This includes the ability to run tests, automatically merge pull requests without human review, and roll refactorings out gradually.<\/p>\n<p>\u00a0\u00a0<small>Matt Brown \u2014 Spotify<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.jeli.io\/blog\/teach-me-how-to-howie\" target=\"_blank\" rel=\"noopener\">Teach me how to Howie!<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Jeli has published a one-page cheat sheet for their highly detailed Howie guide for running incident retrospectives.<\/p>\n<p>\u00a0\u00a0<small>Jeli<\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, Rootly: Rootly is hiring for a Sr. Developer Relations Advocate to continue helping more world-class companies like Figma, NVIDIA, Squarespace, accelerate their incident management journey. Looking for previous on-call engineers with a passion for making the world a more reliable place. 
Learn more: https:\/\/rootly.com\/careers?gh_jid=4015888007 Articles 2023 03&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2023\/05\/22\/sre-weekly-issue-373\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #373<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-717","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":543,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/21\/sre-weekly-issue-310\/","url_meta":{"origin":717,"position":0},"title":"SRE Weekly Issue #310","date":"February 21, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":535,"url":"https:\/\/fde.cat\/index.php\/2022\/01\/24\/sre-weekly-issue-306\/","url_meta":{"origin":717,"position":1},"title":"SRE Weekly Issue #306","date":"January 24, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. 
Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":579,"url":"https:\/\/fde.cat\/index.php\/2022\/05\/30\/sre-weekly-issue-324\/","url_meta":{"origin":717,"position":2},"title":"SRE Weekly Issue #324","date":"May 30, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":546,"url":"https:\/\/fde.cat\/index.php\/2022\/03\/07\/sre-weekly-issue-312\/","url_meta":{"origin":717,"position":3},"title":"SRE Weekly Issue #312","date":"March 7, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":537,"url":"https:\/\/fde.cat\/index.php\/2022\/01\/31\/sre-weekly-issue-307\/","url_meta":{"origin":717,"position":4},"title":"SRE Weekly Issue #307","date":"January 31, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. 
Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":521,"url":"https:\/\/fde.cat\/index.php\/2021\/12\/27\/sre-weekly-issue-302\/","url_meta":{"origin":717,"position":5},"title":"SRE Weekly Issue #302","date":"December 27, 2021","format":false,"excerpt":"View on sreweekly.com Happy holidays, for those that celebrate! I put this issue together in advance, so no Outages section this week. A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/717","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=717"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/717\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=717"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=717"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=717"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}