{"id":862,"date":"2024-05-06T01:02:31","date_gmt":"2024-05-06T01:02:31","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2024\/05\/06\/sre-weekly-issue-423\/"},"modified":"2024-05-06T01:02:31","modified_gmt":"2024-05-06T01:02:31","slug":"sre-weekly-issue-423","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2024\/05\/06\/sre-weekly-issue-423\/","title":{"rendered":"SRE Weekly Issue #423"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-423\/\" title=\"Permalink to SRE Weekly Issue #423\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/firehydrant.com\/\">FireHydrant<\/a>:<\/h2>\n<p>FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates.<br \/>\n<a href=\"https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/\">https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/<\/a><\/p>\n<\/div>\n<div class=\"wp-block-group is-layout-flow wp-block-group-is-layout-flow\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.checklyhq.com\/blog\/alert-fatigue\/\" target=\"_blank\" rel=\"noopener\">How to Fight Alert Fatigue with Synthetic Monitoring<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one\u2019s full of great advice about making sure alerts are actionable, including alerting on flows that actually matter to customers.<\/p>\n<p>\u00a0\u00a0<small>No\u010dnica Mellifera \u2014 Checkly<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/hross.substack.com\/p\/what-playing-magic-the-gathering\" target=\"_blank\" rel=\"noopener\">What playing Magic: the Gathering taught me about incidents.<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Here are a collection of things I learned after getting back into Magic: the Gathering over the past 10 years or so. They are things that apply to both the MTG scene and your friendly neighborhood incident response process.<\/p>\n<p>\u00a0\u00a0<small>Ross Brodbeck<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/wetransfer.com\/engineering\/upgrading-kubernetes-from-1-11-to-1-18-in-a-month\/\" target=\"_blank\" rel=\"noopener\">Upgrading Kubernetes: From 1.11 to 1.18 in a month<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>It was a classic application of technical debt: they chose to focus on customer-facing features and let k8s updates slide.  Here\u2019s how they caught back up safely.<\/p>\n<p>\u00a0\u00a0<small>Jeff Wolski<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.relyabilit.ie\/rices-theorem-and-software-failures\/\" target=\"_blank\" rel=\"noopener\">Rice\u2019s Theorem and Software Failures<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This article presents an interesting hypothesis, and from it draws some nifty conclusions about reasoning about failure in systems.<\/p>\n<p>we cannot know for sure whether or not software is going to be incident-free. It might well be, but we can\u2019t ever prove it.<\/p>\n<p>\u00a0\u00a0<small>Niall Murphy<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.pagerduty.com\/blog\/psychological-safety-in-incident-response\/\" target=\"_blank\" rel=\"noopener\">The role of psychological safety in incident response<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>For teams to solve incidents quickly and effectively, responders need to be able to trust each other and stakeholders have to trust the responders. This level of trust is hard to cultivate if your organization doesn\u2019t have a significant amount of psychological safety. <\/p>\n<p>\u00a0\u00a0<small>Mandi Walls \u2014 PagerDuty<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/klaviyo.tech\/klaviyo-incident-management-interview-with-laura-stone-28563ac9558b\" target=\"_blank\" rel=\"noopener\">Klaviyo Incident Management: Interview with Laura Stone<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>More than just an interview, this article outlines a multi-year transformation from disorganized haphazard incident investigation to a smooth and efficient incident response process.<\/p>\n<p>\u00a0\u00a0<small>Eric Silberstein \u2014 Klaviyo<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/learnk8s.io\/graceful-shutdown\" target=\"_blank\" rel=\"noopener\">Graceful shutdown in Kubernetes<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p><em>In this article, you will learn how to prevent broken connections when a Pod starts or shuts down. You will also learn how to shut down long-running tasks and connections gracefully.<\/em><\/p>\n<p>\u00a0\u00a0<small> Daniele Polencic \u2014 Learnk8s<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/medium.com\/@maciej.pocwierz\/how-an-empty-s3-bucket-can-make-your-aws-bill-explode-934a383cb8b1\" target=\"_blank\" rel=\"noopener\">How an empty S3 bucket can make your AWS bill explode<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>It turns out that an S3 bucket owner pays for failed requests to that bucket, even if they\u2019re unauthenticated, so anyone can run up your AWS bill if they know your bucket\u2019s name.  Oops.<\/p>\n<p>Oh, and they can get the bucket name from CT logs (thanks, <a href=\"https:\/\/twitter.com\/QuinnyPig\/status\/1785311776386727993\">Corey Quinn<\/a>!)<\/p>\n<p>\u00a0\u00a0<small>Maciej Pocwierz<\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/ How to Fight Alert Fatigue with Synthetic Monitoring This one\u2019s full of great advice about making sure alerts are actionable, including alerting on flows&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2024\/05\/06\/sre-weekly-issue-423\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #423<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-862","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":663,"url":"https:\/\/fde.cat\/index.php\/2022\/12\/19\/sre-weekly-issue-352\/","url_meta":{"origin":862,"position":0},"title":"SRE Weekly Issue #352","date":"December 19, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":855,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/15\/sre-weekly-issue-420\/","url_meta":{"origin":862,"position":1},"title":"SRE Weekly Issue #420","date":"April 15, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https:\/\/firehydrant.com\/blog\/ai-for-incident-management-is-here\/ 1.0 Launch Retrospective The game Last Epoch launched in February, and they had a rocky start. This huge retrospective\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":844,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/25\/sre-weekly-issue-417\/","url_meta":{"origin":862,"position":2},"title":"SRE Weekly Issue #417","date":"March 25, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Join FireHydrant this Thursday for a conversation about on-call burnout and how to prevent it. Get a better understanding of what makes a fatigue-free on-call culture, including real-world examples from your incident management peers. No sales, just shop talk. https:\/\/app.livestorm.co\/firehydrant\/better-incidents-spring-bonfire-secrets-to-fatigue-free-on-call-in-2024 Harnessing\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":798,"url":"https:\/\/fde.cat\/index.php\/2023\/12\/04\/sre-weekly-issue-401\/","url_meta":{"origin":862,"position":3},"title":"SRE Weekly Issue #401","date":"December 4, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Join FireHydrant Dec.14 for a conversation about on-call culture and its effect on engineering organizations, featuring special guests from Outreach and Udemy. Gain a better understanding of what makes excellent on-call culture and how to implement practices to improve yours. https:\/\/app.livestorm.co\/firehydrant\/better-incidents-winter-bonfire-inside-on-call?type=detailed\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":835,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/11\/sre-weekly-issue-415\/","url_meta":{"origin":862,"position":4},"title":"SRE Weekly Issue #415","date":"March 11, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: Join FireHydrant and talk shop with your DevOps peers on March 28! You\u2019ll gain a better understanding of what makes a fatigue-free on-call culture and how to implement practices to improve yours at this free, virtual roundtable. https:\/\/app.livestorm.co\/firehydrant\/better-incidents-spring-bonfire-secrets-to-fatigue-free-on-call-in-2024 The Wrong Way\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":823,"url":"https:\/\/fde.cat\/index.php\/2024\/02\/12\/sre-weekly-issue-411\/","url_meta":{"origin":862,"position":5},"title":"SRE Weekly Issue #411","date":"February 12, 2024","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, FireHydrant: \u201cTo be honest, when can we switch?\u201d The first impressions are in. Check out what people are saying after seeing Signals, the new standard in alerting and on-call from FireHydrant, for the first time. https:\/\/firehydrant.com\/signals\/ Shared On-Call Is Where the SRE\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/862","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=862"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/862\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=862"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=862"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=862"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}