{"id":858,"date":"2024-04-22T01:44:20","date_gmt":"2024-04-22T01:44:20","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2024\/04\/22\/sre-weekly-issue-421\/"},"modified":"2024-04-22T01:44:20","modified_gmt":"2024-04-22T01:44:20","slug":"sre-weekly-issue-421","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2024\/04\/22\/sre-weekly-issue-421\/","title":{"rendered":"SRE Weekly Issue #421"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-421\/\" title=\"Permalink to SRE Weekly Issue #421\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<p>Last week, I mistakenly attributed <a href=\"https:\/\/www.paigerduty.com\/sre-biggest-problem\/\">an article<\/a> to PagerDuty.  Actually, it was by <strong>Paige Cruz<\/strong>, whose clever blog name I didn\u2019t pay anywhere near close enough attention to!  Thanks to several readers who nudged me gently about my error.<\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/firehydrant.com\/\">FireHydrant<\/a>:<\/h2>\n<p>FireHydrant is now AI-powered for faster, smarter incidents! 
Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates.<br \/>\n<a href=\"https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/\">https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/<\/a><\/p>\n<\/div>\n<div class=\"wp-block-group is-layout-flow wp-block-group-is-layout-flow\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/surfingcomplexity.blog\/2024\/03\/26\/the-problem-with-invariants-is-that-they-change-over-time\/\" target=\"_blank\" rel=\"noopener\">The problem with invariants is that they change over time<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>If you\u2019ve been in this business long enough, you\u2019ve almost certainly run into an incident where one of the contributors was an implicit invariant that was violated by a new change.<\/p>\n<p>Easily the majority of incidents I\u2019ve been in.<\/p>\n<p>\u00a0\u00a0<small>Lorin Hochstein<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.relyabilit.ie\/the-twinslo-proposal\/\" target=\"_blank\" rel=\"noopener\">The TwinSLO Proposal<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This article is about trying to solve for this problem:<\/p>\n<p>a potentially significant number of customers or queries can be affected by an outage and this won\u2019t trigger an SLO violation.<\/p>\n<p>\u00a0\u00a0<small>Niall Murphy<br \/><\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/robpostonblog.wordpress.com\/2024\/03\/28\/an-anonymous-complaint-dr-postons-response\/\" target=\"_blank\" rel=\"noopener\">An Anonymous Complaint\/Dr. 
Poston\u2019s Response<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A surgeon struggles with the difficulties in building a culture of retrospectives and introspection in their surgical team, by running a fascinating retro on himself in this blog post.<\/p>\n<p>\u00a0\u00a0<small>Robert Poston, MD<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.shayon.dev\/post\/2024\/89\/incidents-and-the-requirement-of-slowing-down\/\" target=\"_blank\" rel=\"noopener\">Incidents and the requirement of slowing down<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>An argument for buying yourself time to slow down and make decisions carefully, as a way of ultimately speeding up incident resolution.<\/p>\n<p>\u00a0\u00a0<small>Shayon Mukherjee<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/wetransfer.com\/engineering\/build-your-own-role-playing-game-the-business-continuity-plan-drill\/\" target=\"_blank\" rel=\"noopener\">Build your own role-playing game: the business continuity plan drill<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Disasters threatening a business\u2019 ability to operate core functions don\u2019t occur that often (phew!), but we do want to ensure we are prepared to keep our business running if they do. To practice disaster response skills, we run business continuity drills, and you can too with our 10-step plan!<\/p>\n<p>\u00a0\u00a0<small>Janna Brummel \u2014 WeTransfer<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/hross.substack.com\/p\/availability-archetypes\" target=\"_blank\" rel=\"noopener\">Availability Archetypes<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>How people think about reliability varies between companies.  
Which of the four different perspectives laid out in this article does your company fit into, if any?<\/p>\n<p>\u00a0\u00a0<small>Ross Brodbeck<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/status.honeycomb.io\/incidents\/tn78w6tw950d\" target=\"_blank\" rel=\"noopener\">eu1 ingest and UI down<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Honeycomb posted this follow-up on their April 9 outage, explaining what went wrong and how they\u2019re responding.<\/p>\n<p>\u00a0\u00a0<small>Honeycomb<\/small><\/p>\n<p>\u00a0\u00a0<small><em>Full disclosure: Honeycomb is my employer.<\/em><\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.checklyhq.com\/blog\/sre-people-and-communication-matter\/?utm_source=chat&amp;utm_medium=link&amp;utm_campaign=synthetics&amp;utm_id=social_button\" target=\"_blank\" rel=\"noopener\">For an SRE, relationships and communication matter most: advice from SRE\u2019s<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The author of this article posed a question on <a href=\"https:\/\/reddit.com\/r\/sre\">r\/sre<\/a>:<\/p>\n<p>What matters most for your success as an SRE?<\/p>\n<p>They share a summary of the answers they got, with their commentary.<\/p>\n<p>\u00a0\u00a0<small>No\u010dnica Mellifera \u2014 Checkly<\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com Last week, I mistakenly attributed [an article](https:\/\/www.paigerduty.com\/sre-biggest-problem\/) to PagerDuty. Actually, it was by Paige Cruz, whose clever blog name I didn\u2019t pay anywhere near close enough attention to! Thanks to several readers that nudged me gently about my error. 
A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2024\/04\/22\/sre-weekly-issue-421\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #421<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-858","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":855,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/15\/sre-weekly-issue-420\/","url_meta":{"origin":858,"position":0},"title":"SRE Weekly Issue #420","date":"April 15, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/ 1.0 Launch Retrospective The game Last Epoch launched in February, and they had a rocky start. This huge retrospective\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":835,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/11\/sre-weekly-issue-415\/","url_meta":{"origin":858,"position":1},"title":"SRE Weekly Issue #415","date":"March 11, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Join FireHydrant and talk shop with your DevOps peers on March 28! You\u2019ll gain a better understanding of what makes a fatigue-free on-call culture and how to implement practices to improve yours at this free, virtual roundtable. 
https:\/\/app.livestorm.co\/firehydrant\/better-incidents-spring-bonfire-secrets-to-fatigue-free-on-call-in-2024 The Wrong Way\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":867,"url":"https:\/\/fde.cat\/index.php\/2024\/05\/20\/sre-weekly-issue-425\/","url_meta":{"origin":858,"position":2},"title":"SRE Weekly Issue #425","date":"May 20, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/ Presenting to Engineering Leadership Great practical advice for how to present reliability problems (and your proposed solutions) to e-staff.\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":663,"url":"https:\/\/fde.cat\/index.php\/2022\/12\/19\/sre-weekly-issue-352\/","url_meta":{"origin":858,"position":3},"title":"SRE Weekly Issue #352","date":"December 19, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. 
Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":798,"url":"https:\/\/fde.cat\/index.php\/2023\/12\/04\/sre-weekly-issue-401\/","url_meta":{"origin":858,"position":4},"title":"SRE Weekly Issue #401","date":"December 4, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Join FireHydrant Dec.14 for a conversation about on-call culture and its effect on engineering organizations, featuring special guests from Outreach and Udemy. Gain a better understanding of what makes excellent on-call culture and how to implement practices to improve yours. https:\/\/app.livestorm.co\/firehydrant\/better-incidents-winter-bonfire-inside-on-call?type=detailed\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":864,"url":"https:\/\/fde.cat\/index.php\/2024\/05\/13\/sre-weekly-issue-424\/","url_meta":{"origin":858,"position":5},"title":"SRE Weekly Issue #424","date":"May 13, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. 
https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/ My Availability Investment Playbook Here\u2019s an ultra-practical guide to pushing for reliability investments at your company, formatted as a\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/858","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=858"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/858\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=858"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=858"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=858"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}