{"id":725,"date":"2023-06-19T02:02:54","date_gmt":"2023-06-19T02:02:54","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2023\/06\/19\/sre-weekly-issue-377\/"},"modified":"2023-06-19T02:02:54","modified_gmt":"2023-06-19T02:02:54","slug":"sre-weekly-issue-377","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2023\/06\/19\/sre-weekly-issue-377\/","title":{"rendered":"SRE Weekly Issue #377"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-377\/\" title=\"Permalink to SRE Weekly Issue #377\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/rootly.com\/demo\/?utm_source=sreweekly\">Rootly<\/a>:<\/h2>\n<p>Curious how companies like Figma, Tripadvisor, and 100s of others leverage Rootly to manage incidents in Slack and unlock instant best practices?  Check out this lightning demo:<br \/>\n<a href=\"https:\/\/www.loom.com\/share\/051c4be0425a436e888dc0c3690855ad\">https:\/\/www.loom.com\/share\/051c4be0425a436e888dc0c3690855ad<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.thestack.technology\/us-east-1-aws-support-aws-outage\/\" target=\"_blank\" rel=\"noopener\">Why did AWS Support fail with US-EAST-1 again?<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>AWS had a major Lambda outage in us-east-1, and it took out many customer systems and quite a few other AWS systems, including their support portal.<\/p>\n<p>\u00a0\u00a0<small>The Stack<\/small><\/p><\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/medium.com\/@bphellinger\/how-i-went-from-operations-manager-to-site-reliability-engineer-in-6-months-c61999c75155\" target=\"_blank\" rel=\"noopener\">How I went from Operations Manager to Site Reliability Engineer In 6 Months!<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This person had a fascinating path to SRE, starting out their career as a generator repair technician and transitioning through devops to SRE.<\/p>\n<p>\u00a0\u00a0<small>Brian Hellinger \u2014 Towards AWS<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/netflixtechblog.com\/migrating-critical-traffic-at-scale-with-no-downtime-part-2-4b1c8c7155c1\" target=\"_blank\" rel=\"noopener\">Migrating Critical Traffic At Scale with No Downtime\u200a\u2014\u200aPart 2<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>In part 1, they outlined how they replay real traffic to test a new system before deploying it.  In this article, they build on that with three additional techniques: sticky canaries, A\/B testing, and gradually shifting traffic to the new system in production.<\/p>\n<p>\u00a0\u00a0<small>Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah \u2014 Netflix<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/metrist.io\/blog\/the-data-behind-delayed-status-page-updates\/\" target=\"_blank\" rel=\"noopener\">The Data Behind Delayed Status Page Updates<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>By comparing status page posting to their independent monitoring of services, Metrist is able to produce statistics about how long companies take to post to their status pages when they have an outage.<\/p>\n<p>\u00a0\u00a0<small>Jeff Martens \u2014 Metrist<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/surfingcomplexity.blog\/2023\/06\/11\/when-theres-no-plan-for-this-scenario-youve-got-to-improvise\/\" target=\"_blank\" rel=\"noopener\">When there\u2019s no plan for this scenario, you\u2019ve got to improvise<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Improvising during an incident isn\u2019t just a one-off occurrence, and we should plan for it.<\/p>\n<p>\u00a0\u00a0<small>Lorin Hochstein \u2014 Surfing Complexity<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/status.heroku.com\/incidents\/2558\" target=\"_blank\" rel=\"noopener\">Heroku Incident 2558 Followup<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A foreign key column had a smaller integer data type than the key that it referenced, and it failed when the referenced key went too high.<\/p>\n<p>\u00a0\u00a0<small>Heroku<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/ably.com\/blog\/chat-app-architecture\" target=\"_blank\" rel=\"noopener\">Scalable chat app architecture: How to get it right the first time<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Here, we\u2019ll look at the key considerations you need to make when it comes to the architecture of your chat app, the structure and components of that architecture, and some of the technology options that can help support you in building a reliable chat experience.<\/p>\n<p>\u00a0\u00a0<small>Ably<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/admiralcloudberg.medium.com\/a-watery-surprise-the-crash-of-national-airlines-flight-193-d1c493ee9b1f\" target=\"_blank\" rel=\"noopener\">A Watery Surprise: The crash of National Airlines flight 193<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A departure from the normal air traffic control procedure allowed the pilots to lose situational awareness.  A commonly-held myth about flotation equipment contributed to three deaths in a quite survivable accident.<\/p>\n<p>\u00a0\u00a0<small>Admiral Cloudberg<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/medium.com\/adevinta-tech-blog\/its-not-always-dns-unless-it-is-16858df17d3f\" target=\"_blank\" rel=\"noopener\">It\u2019s not always DNS\u200a\u2014\u200aunless it is<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>They kept finding what they <em>thought<\/em> was the problem, and their fixes helped, but the problem kept coming back.<\/p>\n<p>\u00a0\u00a0<small>Tanat Paul Lokejaroenlarb \u2014 Adevinta<\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, Rootly: Curious how companies like Figma, Tripadvisor, and 100s of others leverage Rootly to manage incidents in Slack and unlock instant best practices? Check out this lightning demo: https:\/\/www.loom.com\/share\/051c4be0425a436e888dc0c3690855ad Articles Why did AWS Support fail with US-EAST-1 again? AWS had a major Lambda outage in us-east-1, and&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2023\/06\/19\/sre-weekly-issue-377\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #377<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-725","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":519,"url":"https:\/\/fde.cat\/index.php\/2021\/12\/20\/sre-weekly-issue-301\/","url_meta":{"origin":725,"position":0},"title":"SRE Weekly Issue #301","date":"December 20, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo: https:\/\/rootly.com\/demo\/?utm_source=sreweekly Articles BadgerDAO Exploit Technical Post Mortem This\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":731,"url":"https:\/\/fde.cat\/index.php\/2023\/07\/03\/sre-weekly-issue-379\/","url_meta":{"origin":725,"position":1},"title":"SRE Weekly Issue #379","date":"July 3, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Curious how companies like Figma, Tripadvisor, and 100s of others leverage Rootly to manage incidents in Slack and unlock instant best practices? Check out this lightning demo: https:\/\/www.loom.com\/share\/051c4be0425a436e888dc0c3690855ad Articles The Saga Is Antipattern In case you weren\u2019t familiar with the Saga\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":514,"url":"https:\/\/fde.cat\/index.php\/2021\/12\/13\/sre-weekly-issue-300\/","url_meta":{"origin":725,"position":2},"title":"SRE Weekly Issue #300","date":"December 13, 2021","format":false,"excerpt":"View on sreweekly.com 300 issues. 6 years. Wow! I couldn\u2019t have done it without all of you wonderful people, writing articles and reading issues. Thanks, you make curating this newsletter fun! A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":708,"url":"https:\/\/fde.cat\/index.php\/2023\/05\/01\/sre-weekly-issue-370\/","url_meta":{"origin":725,"position":3},"title":"SRE Weekly Issue #370","date":"May 1, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":564,"url":"https:\/\/fde.cat\/index.php\/2022\/04\/18\/sre-weekly-issue-318\/","url_meta":{"origin":725,"position":4},"title":"SRE Weekly Issue #318","date":"April 18, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":545,"url":"https:\/\/fde.cat\/index.php\/2022\/02\/28\/sre-weekly-issue-311\/","url_meta":{"origin":725,"position":5},"title":"SRE Weekly Issue #311","date":"February 28, 2022","format":false,"excerpt":"View on sreweekly.com I\u2019m dedicating this issue to the people of Ukraine, and also those in Russia that are protesting the invasion. A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/725","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=725"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/725\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=725"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=725"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=725"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}