{"id":752,"date":"2023-08-28T01:23:05","date_gmt":"2023-08-28T01:23:05","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2023\/08\/28\/sre-weekly-issue-387\/"},"modified":"2023-08-28T01:23:05","modified_gmt":"2023-08-28T01:23:05","slug":"sre-weekly-issue-387","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2023\/08\/28\/sre-weekly-issue-387\/","title":{"rendered":"SRE Weekly Issue #387"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-387\/\" title=\"Permalink to SRE Weekly Issue #387\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, <a href=\"https:\/\/rootly.com\/demo\/?utm_source=sreweekly\">Rootly<\/a>:<\/h2>\n<p>When incidents impact your customers, failing to communicate with them effectively can erode trust even further and compound an already difficult situation. Learn the essentials of customer-facing incident communication in Rootly\u2019s latest blog post:<br \/><a href=\"https:\/\/rootly.com\/blog\/the-medium-is-the-message-how-to-master-the-most-essential-incident-communication-channels\">https:\/\/rootly.com\/blog\/the-medium-is-the-message-how-to-master-the-most-essential-incident-communication-channels<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"wp-block-group\">\n<div class=\"wp-block-group__inner-container\">\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.codereliant.io\/scaling-software-systems-10-key-factors\/\" target=\"_blank\" rel=\"noopener\">Scaling Software Systems: 10 Key Factors<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>In this post, we\u2019ll explore 10 areas that are key to designing highly scalable architectures.<\/p>\n<p>The 10 areas they cover in-depth are:<\/p>\n<p>Horizontal vs. 
Vertical Scaling<br \/>\nLoad Balancing<br \/>\nDatabase Scaling<br \/>\nAsynchronous Processing<br \/>\nStateless Systems<br \/>\nCaching<br \/>\nNetwork Bandwidth Optimization<br \/>\nProgressive Enhancement<br \/>\nGraceful Degradation<br \/>\nCode Scalability<\/p>\n<p>\u00a0\u00a0<small>Code Reliant<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.alexewerlof.com\/p\/time-based-vs-event-based\" target=\"_blank\" rel=\"noopener\">Time based vs Event based SLIs<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Are you looking at the number of requests that were served successfully out of the total number of requests?  Or the percentage of time the system was up and working properly?<\/p>\n<p>\u00a0\u00a0<small>Alex Ewerl\u00f6f<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/cherkaskyb.medium.com\/i-dont-alert-on-apdex-it-confuses-me-5e639242e5db\" target=\"_blank\" rel=\"noopener\">I Don\u2019t Alert on Apdex. It Confuses Me<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This is my <em>personal take<\/em> on something that is considered standard that <em>I just don\u2019t understand<\/em>. 
So here we go \u2014 the Apdex, what it is, and why I don\u2019t use it!<\/p>\n<p>\u00a0\u00a0<small>Boris Cherkasky<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.learningfromincidents.io\/posts\/how-the-video-three-analytical-traps-in-accident-investigation-helps-me-be-a-better-incident-analyst\" target=\"_blank\" rel=\"noopener\">Reader: How the video \u201cThree analytical traps in accident investigation\u201d Helps me be a Better Incident Analyst<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Here\u2019s a great explanation of three common cognitive biases we should try to avoid while analyzing incidents.<\/p>\n<p>\u00a0\u00a0<small>Randy Horwitz \u2014 Learning From Incidents<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/firefish.social\/notes\/9iqefgi8rzfksnqc\" target=\"_blank\" rel=\"noopener\">Lily Cohen (@lily) Re: firefish.lgbt, musician.social, and outdoors.lgbt<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A horrifying tale of gitops gone wrong and backups that didn\u2019t back up, leading to catastrophic data loss.  This, this is what hugops is for.  
I\u2019m so sorry, Lily!<\/p>\n<p>\u00a0\u00a0<small>Lily Cohen<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/status.duo.com\/incidents\/rw7g0q7ztj8f\" target=\"_blank\" rel=\"noopener\">Authentication slowness or failure to load Duo Prompt on DUO1<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Here\u2019s a followup analysis from Duo for an incident they had last week.<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/thenewstack.io\/practical-guidance-for-first-time-site-reliability-engineers\/\" target=\"_blank\" rel=\"noopener\">Practical Guidance for First-Time Site Reliability Engineers<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The first SRE hire at incident.io shares what they learned as they became familiar with the infrastructure and figured out what to do with it.<\/p>\n<p>\u00a0\u00a0<small>Ben Wheatley \u2014 The New Stack<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/thenewstack.io\/keeping-the-lights-on-the-on-call-process-that-works\/\" target=\"_blank\" rel=\"noopener\">Keeping the Lights On: The On-Call Process that Works<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This is a story of building a new on-call rotation in a company that didn\u2019t have one.  They started out with a pretty awesome list of principles that we could all aspire to.<\/p>\n<p>\u00a0\u00a0<small>Felix Lopez \u2014 The New Stack<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/medium.com\/@hans.knechtions\/test-in-production-85224e7a82f3?source=rss-f2ac9bbfd2bb------2\" target=\"_blank\" rel=\"noopener\">Test In Production<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Why should we test in production?  
This article gives a really spot-on argument and goes on to explain how to do it.<\/p>\n<p>\u00a0\u00a0<small>Sven Hans Knecht<\/small><\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>SRE WEEKLY<\/p>","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, Rootly: When incidents impact your customers, failing to communicate with them effectively can erode trust even further and compound an already difficult situation. Learn the essentials of customer-facing incident communication in Rootly\u2019s latest blog post:https:\/\/rootly.com\/blog\/the-medium-is-the-message-how-to-master-the-most-essential-incident-communication-channels Articles Scaling Software Systems: 10 Key Factors In this post, we\u2019ll explore&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2023\/08\/28\/sre-weekly-issue-387\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #387<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-752","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":760,"url":"https:\/\/fde.cat\/index.php\/2023\/09\/11\/sre-weekly-issue-389\/","url_meta":{"origin":752,"position":0},"title":"SRE Weekly Issue #389","date":"September 11, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: When incidents impact your customers, failing to communicate with them effectively can erode trust even further and compound an already difficult situation. 
Learn the essentials of customer-facing incident communication in Rootly\u2019s latest blog post: https:\/\/rootly.com\/blog\/the-medium-is-the-message-how-to-master-the-most-essential-incident-communication-channels Articles Building a Successful SRE Team\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":746,"url":"https:\/\/fde.cat\/index.php\/2023\/08\/14\/sre-weekly-issue-385\/","url_meta":{"origin":752,"position":1},"title":"SRE Weekly Issue #385","date":"August 14, 2023","format":false,"excerpt":"View on sreweekly.com Many apologies to Matt Cooper at GitHub, who is the actual author of the article Scaling Merge-ort Across GitHub from last week. Sorry for the mis-credit, Matt! A message from our sponsor, Rootly: When incidents impact your customers, failing to communicate with them effectively can erode trust\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":755,"url":"https:\/\/fde.cat\/index.php\/2023\/09\/04\/sre-weekly-issue-388\/","url_meta":{"origin":752,"position":2},"title":"SRE Weekly Issue #388","date":"September 4, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: When incidents impact your customers, failing to communicate with them effectively can erode trust even further and compound an already difficult situation. 
Learn the essentials of customer-facing incident communication in Rootly\u2019s latest blog post: https:\/\/rootly.com\/blog\/the-medium-is-the-message-how-to-master-the-most-essential-incident-communication-channels Articles Operating effectively in high surprise\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":749,"url":"https:\/\/fde.cat\/index.php\/2023\/08\/22\/sre-weekly-issue-386\/","url_meta":{"origin":752,"position":3},"title":"SRE Weekly Issue #386","date":"August 22, 2023","format":false,"excerpt":"View on sreweekly.com This issue was delayed a day while I was enjoying a much-needed vacation with my family. While I\u2019m on the subject, it\u2019s hot take time: vacations are important for the reliability of our sociotechnical systems, so good SREs should take vacations regularly and encourage others to as\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":763,"url":"https:\/\/fde.cat\/index.php\/2023\/09\/18\/sre-weekly-issue-390\/","url_meta":{"origin":752,"position":4},"title":"SRE Weekly Issue #390","date":"September 18, 2023","format":false,"excerpt":"View on sreweekly.com Many apologies to my email subscribers, who have seen two accidental re-sends of old issues recently due to a weird glitch in my automation. I think I\u2019ve gotten a handle on it, and I\u2019ll run an internal retrospective of this incident, of course. 
A message from our\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":740,"url":"https:\/\/fde.cat\/index.php\/2023\/08\/07\/sre-weekly-issue-384\/","url_meta":{"origin":752,"position":5},"title":"SRE Weekly Issue #384","date":"August 7, 2023","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: When incidents impact your customers, failing to communicate with them effectively can erode trust even further and compound an already difficult situation. Learn the essentials of customer-facing incident communication in Rootly\u2019s latest blog post: https:\/\/rootly.com\/blog\/the-medium-is-the-message-how-to-master-the-most-essential-incident-communication-channels Articles Scaling merge-ort across GitHub They\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/752","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=752"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/752\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=752"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=752"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=752"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}