{"id":826,"date":"2024-02-18T21:51:09","date_gmt":"2024-02-18T21:51:09","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2024\/02\/18\/sre-weekly-issue-412\/"},"modified":"2024-02-18T21:51:09","modified_gmt":"2024-02-18T21:51:09","slug":"sre-weekly-issue-412","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2024\/02\/18\/sre-weekly-issue-412\/","title":{"rendered":"SRE Weekly Issue #412"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-412\/\" title=\"Permalink to SRE Weekly Issue #412\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/firehydrant.com\/\">FireHydrant<\/a>:<\/h2>\n<p>FireHydrant\u2019s new and improved MTTX analytics dashboard is here! See which services are most affected by incidents, where they take the longest to detect (or acknowledge, mitigate, resolve \u2026 you name it); and how metrics and statistics change over time.<br \/>\n<a href=\"https:\/\/firehydrant.com\/blog\/mttx-incident-analytics-to-drive-your-reliability-roadmap\/\">https:\/\/firehydrant.com\/blog\/mttx-incident-analytics-to-drive-your-reliability-roadmap\/<\/a><\/p>\n<\/div>\n<div class=\"wp-block-group is-layout-flow wp-block-group-is-layout-flow\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/medium.com\/site-reliability-engineering-leadership\/the-single-pain-of-glass-6e42930e966?source=rss----dc5d1a577fd6---4\" target=\"_blank\" rel=\"noopener\">The Single Pain of Glass<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Can a single dashboard that covers your entire system really exist?<\/p>\n<p>\u00a0\u00a0<small>Jamie Allen<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/argoday.medium.com\/sev-1-call-leaders-8fdc0ae5f6be\" target=\"_blank\" rel=\"noopener\">The 
importance of SEV-1 call leaders<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one makes the case for having a group of specially-trained incident commanders to handle SEV-1 (worst-case) outages, separate from your normal ICs.<\/p>\n<p>\u00a0\u00a0<small>Jonathan Word<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.blameless.com\/blog\/getting-buy-in-from-management\" target=\"_blank\" rel=\"noopener\">Getting Buy-in from Management on Reliability Investments<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This article lays out a strategy for gaining buy-in by making three specific, sequential arguments.<\/p>\n<p>\u00a0\u00a0<small>Emily Arnott \u2014 Blameless<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.alexewerlof.com\/p\/sre-archetypes\" target=\"_blank\" rel=\"noopener\">SRE Archetypes<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This article explores the varying ways that SRE is implemented through a set of 4 archetypes.<\/p>\n<p>\u00a0\u00a0<small>Alex Ewerl\u00f6f<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.cloudflare.com\/linux-transport-protocol-port-selection-performance\" target=\"_blank\" rel=\"noopener\">connect() \u2013 why are you so slow?<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>It turns out that assigning ephemeral ports to connections in Linux is way more complicated than it might seem at first glance, and there\u2019s room for optimization, as this article explains.<\/p>\n<p>\u00a0\u00a0<small>Frederick Lawler \u2014 Cloudflare<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.fb.com\/2024\/02\/07\/production-engineering\/simple-precision-time-protocol-sptp-meta\/\" target=\"_blank\" 
rel=\"noopener\">Simple Precision Time Protocol at Meta<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>While deploying Precision Time Protocol (PTP) at Meta, we\u2019ve developed a simplified version of the protocol (Simple Precision Time Protocol \u2013 SPTP), that can offer the same level of clock synchronization as unicast PTPv2 more reliably and with fewer resources.<\/p>\n<p>\u00a0\u00a0<small>Oleg Obleukhov and Ahmad Byagowi \u2014 Meta<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/ferd.ca\/a-distributed-systems-reading-list.html\" target=\"_blank\" rel=\"noopener\">A Distributed Systems Reading List<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Far more than just a list of links, this article gives an overview of each topic before pointing you in the right direction for more information.<\/p>\n<p>\u00a0\u00a0<small>Fred Hebert<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/medium.com\/dyninno\/streamlining-and-implementing-incident-management-at-dyninno-c8ea06327f3a\" target=\"_blank\" rel=\"noopener\">Streamlining and Implementing Incident Management at Dyninno<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Building on the groundwork laid out in our first article about the initial steps in Incident Management (IM) at Dyninno Group, this second installment will explore the practicalities of streamlining and implementing these strategies.<\/p>\n<p>\u00a0\u00a0<small>Vladimirs Romanovskis<\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, FireHydrant: FireHydrant\u2019s new and improved MTTX analytics dashboard is here! 
See which services are most affected by incidents, where they take the longest to detect (or acknowledge, mitigate, resolve \u2026 you name it); and how metrics and statistics change over time. https:\/\/firehydrant.com\/blog\/mttx-incident-analytics-to-drive-your-reliability-roadmap\/ The Single Pain of Glass&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2024\/02\/18\/sre-weekly-issue-412\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #412<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-826","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":666,"url":"https:\/\/fde.cat\/index.php\/2023\/01\/09\/sre-weekly-issue-354\/","url_meta":{"origin":826,"position":0},"title":"SRE Weekly Issue #354","date":"January 9, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":807,"url":"https:\/\/fde.cat\/index.php\/2023\/12\/25\/sre-weekly-issue-404\/","url_meta":{"origin":826,"position":1},"title":"SRE Weekly Issue #404","date":"December 25, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Looking to cozy up with a good read this week? 
Check out \u201cYour guide to better status pages.\u201d It\u2019s a mini masterclass on how to better communicate on your status pages. https:\/\/firehydrant.com\/blog\/your-guide-to-better-incident-status-pages\/ Rule of 10x per 9 For every 9 you\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":794,"url":"https:\/\/fde.cat\/index.php\/2023\/11\/20\/sre-weekly-issue-399\/","url_meta":{"origin":826,"position":2},"title":"SRE Weekly Issue #399","date":"November 20, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Severity levels help responders and stakeholders understand the incident impact and set expectations for the level of response. This can mean jumping into action faster. But first, you have to ensure severity is actually being set. Here\u2019s one way. https:\/\/firehydrant.com\/blog\/incident-severity-why-you-need-it-and-how-to-ensure-its-set\/ Paper:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":577,"url":"https:\/\/fde.cat\/index.php\/2022\/05\/16\/sre-weekly-issue-322\/","url_meta":{"origin":826,"position":3},"title":"SRE Weekly Issue #322","date":"May 16, 2022","format":false,"excerpt":"View on sreweekly.com Bit of a short issue this week. This morning, I stepped on my phone, crushing it mightily beneath my bootheel. Unfortunately a lot of my automation for reviewing articles is on there\u2026 thank goodness I have functioning backups. 
A message from our sponsor, Rootly: Manage incidents directly\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":543,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/21\/sre-weekly-issue-310\/","url_meta":{"origin":826,"position":4},"title":"SRE Weekly Issue #310","date":"February 21, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":320,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-275\/","url_meta":{"origin":826,"position":5},"title":"SRE Weekly Issue #275","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Join ZAP Founder & Project Lead Simon Bennetts on June 30 for a live AMA where he will be answering questions on all things open source and AppSec. 
Register: http:\/\/sthwk.com\/Simon-AMA Articles Practical Guide to SRE: Incident Severity Levels Here\u2019s a take\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/826","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=826"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/826\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=826"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=826"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=826"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}