{"id":276,"date":"2021-08-31T14:40:23","date_gmt":"2021-08-31T14:40:23","guid":{"rendered":"https:\/\/fde.cat\/?p=276"},"modified":"2021-08-31T14:40:23","modified_gmt":"2021-08-31T14:40:23","slug":"sre-weekly-issue-259","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-259\/","title":{"rendered":"SRE Weekly Issue #259"},"content":{"rendered":"<p><a href=\"http:\/\/sreweekly.com\/sre-weekly-issue-259\/\" title=\"Permalink to SRE Weekly Issue #259\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, StackHawk:<\/h2>\n<p>Mark your calendars! The first conference for OWASP ZAP users is taking place March 9. Get your free ticket to connect with other ZAP users and learn about the project\u2019s roadmap<br \/>\n<a href=\"http:\/\/sthwk.com\/zapcon-sreweekly\">http:\/\/sthwk.com\/zapcon-sreweekly<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/increment.com\/reliability\/\" target=\"_blank\" rel=\"noopener\">Increment: Reliability<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This quarter\u2019s <em>Increment<\/em> issue is about Reliability, and I haven\u2019t had this much fun since their <a href=\"https:\/\/increment.com\/on-call\/\">first issue about on-call<\/a>. 
I\u2019ll include a few of the articles here and more in later issues as I have a chance to review them.<\/p>\n<p><small>Stripe<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/increment.com\/reliability\/failure-is-okay\/\" target=\"_blank\" rel=\"noopener\">[Increment: Reliability] Everything is broken, and it\u2019s okay<\/a><\/div>\n<div class=\"sreweekly-description\">\n<blockquote>\n<p>Accepting that imperfect things still work is fundamental to preventing failures from becoming catastrophes.<\/p>\n<\/blockquote>\n<p>Understanding that no system is without errors is critical to building resilient systems.<\/p>\n<p><small>Heidi Waterhouse<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/increment.com\/reliability\/how-to-build-organizational-resilience\/\" target=\"_blank\" rel=\"noopener\">[Increment: Reliability] How to build organizational resilience<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The very first sentence sets the tone, and I love it:<\/p>\n<blockquote>\n<p>Resilience is a process: something you must actively perform, not something you check off a list once.<\/p>\n<\/blockquote>\n<p><small>Ryn Daniels<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/increment.com\/reliability\/technical-incident-command\/\" target=\"_blank\" rel=\"noopener\">[Increment: Reliability] Embrace your inner incident commander<\/a><\/div>\n<div class=\"sreweekly-description\">\n<blockquote>\n<p>Most of all, having an incident commander only works if everyone believes in the role. 
Someone stepping in to address a crisis and saying \u201cI\u2019m Batman\u201d doesn\u2019t help unless people have bought into the idea of Batman.<\/p>\n<\/blockquote>\n<p>The next time I\u2019m incident commander, I am <em>totally<\/em> going to jump in and say, \u201cI\u2019m Batman!\u201d.<\/p>\n<p>This article is a great primer on what an IC is and how to adopt incident command at your organization.<\/p>\n<p><small>Tanya Reilly<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.mercari.com\/en\/blog\/entry\/20210126-retry-pattern-in-microservices\/\" target=\"_blank\" rel=\"noopener\">Retry pattern in microservices<\/a><\/div>\n<div class=\"sreweekly-description\">\n<blockquote>\n<p>After reading this blog post, you will have an understanding of the retry pattern used in microservices architecture, why it should be used, a few considerations while using the retry pattern, and how to use it in Python.<\/p>\n<\/blockquote>\n<p>I love the W. C. Fields quote.<\/p>\n<p><small>Anand Prashant<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/devops.com\/2021-site-reliability-engineering-sre-survey-now-open\/\" target=\"_blank\" rel=\"noopener\">2021 Site Reliability Engineering (SRE) Survey Now Open<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>It\u2019s that time again! 
Be sure to fill out the survey, not only so they can gather useful data, but also because Catchpoint will donate $5 to charity.<\/p>\n<p><small>DevOps Institute, Catchpoint, and VMware Tanzu<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.blameless.com\/blog\/how-sre-will-transform-qa\" target=\"_blank\" rel=\"noopener\">QA Engineers, This is How SRE will Transform your Role<\/a><\/div>\n<div class=\"sreweekly-description\">\n<blockquote>\n<p>When considering the value of a QA test, SLIs can provide very valuable context.<\/p>\n<\/blockquote>\n<p>SRE and QA can work hand in hand.<\/p>\n<p><small>Emily Arnott \u2014 Blameless<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/engineering.fb.com\/2021\/02\/23\/data-infrastructure\/silent-data-corruption\/\" target=\"_blank\" rel=\"noopener\">Silent data corruption: Mitigating effects at scale<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This kind of thing keeps me up at night. Silent data corruption can destroy your reliability just as quickly as a backhoe on a non-redundant link.<\/p>\n<p><small>Harish Dattatraya Dixit \u2014 Facebook<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/codeascraft.com\/2021\/02\/25\/how-etsy-prepared-for-historic-volumes-of-holiday-traffic-in-2020\/\" target=\"_blank\" rel=\"noopener\">How Etsy Prepared for Historic Volumes of Holiday Traffic in 2020<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Etsy experienced years of growth practically overnight in 2020 as quarantines set in. 
Here\u2019s how they handled it.<\/p>\n<p><small>Mike Adler \u2014 Etsy<\/small><\/p>\n<\/div>\n<\/div>\n<h2>Outages<\/h2>\n<ul class=\"sreweekly-outages\">\n<li><a href=\"https:\/\/status.io\/pages\/incident\/55957a99e800baa4470002da\/6032bcc06a601504d1dbbdf0\">Let\u2019s Encrypt<\/a><\/li>\n<li><a href=\"https:\/\/static.googleusercontent.com\/media\/www.google.com\/en\/\/appsstatus\/ir\/to3hkm2wc4v4xq7.pdf\">Google Voice<\/a>\n<ul class=\"sreweekly-outage\">\n<li class=\"sreweekly-outage\">This is Google\u2019s analysis for the incident on February 16, caused by a TLS certificate management mishap.<\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/zeenews.india.com\/markets\/nse-trading-halt-no-impact-to-trading-system-but-online-risk-management-system-affected-2344210.html\">India\u2019s National Stock Exchange (NSE)<\/a><\/li>\n<li><a href=\"https:\/\/www.theverge.com\/2021\/2\/23\/22297620\/linkedin-down-outage-issues\">LinkedIn<\/a><\/li>\n<li><a href=\"https:\/\/www.wsj.com\/articles\/fed-attributes-payment-systems-outage-to-human-error-11614297673\">US Federal Reserve<\/a>\n<ul class=\"sreweekly-outage\">\n<li class=\"sreweekly-outage\">The US Fed\u2019s computer system was down, preventing transfers between banks from going through.<\/li>\n<\/ul>\n<\/li>\n<li><a href=\"https:\/\/www.androidpolice.com\/2021\/02\/22\/major-venmo-outage-has-been-preventing-payments-from-going-through\/\">Venmo<\/a><\/li>\n<li><a href=\"https:\/\/www.mirror.co.uk\/tech\/breaking-facebook-instagram-down-thousands-23566924\">Facebook and Instagram<\/a><\/li>\n<li><a href=\"https:\/\/reddit.statuspage.io\/incidents\/wsrwszw4x5td\">Reddit<\/a><\/li>\n<li><a href=\"https:\/\/discord.statuspage.io\/incidents\/ls953q6yk8wb\">Discord<\/a><\/li>\n<\/ul>\n<p>SRE WEEKLY<\/p>\n","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, StackHawk: Mark your calendars! The first conference for OWASP ZAP users is taking place March 9. 
Get your free ticket to connect with other ZAP users and learn about the project\u2019s roadmap http:\/\/sthwk.com\/zapcon-sreweekly Articles Increment: Reliability This quarter\u2019s Increment issue is about Reliability, and I haven\u2019t had&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-259\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #259<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-276","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":280,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-260\/","url_meta":{"origin":276,"position":0},"title":"SRE Weekly Issue #260","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Check out this guide to modern dynamic application security testing to learn how it works and what to look for in tooling. http:\/\/sthwk.com\/dynamic-appsec-overview Articles [Increment: Reliability] Interview: Dr. David D. Woods People throw around \u201cresiliency\u201d quite often when they mean \u201creliability\u201d\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":282,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-261\/","url_meta":{"origin":276,"position":1},"title":"SRE Weekly Issue #261","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Join Snyk and StackHawk on March 18 as they walk through how to use Software Composition Analysis (SCA) and Dynamic Application Security Testing (DAST) in CI\/CD to ship more secure applications. 
http:\/\/sthwk.com\/snyk-stackhawk-webinar Articles What Do Fighter Pilots and Incident Management Have\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":333,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-279\/","url_meta":{"origin":276,"position":2},"title":"SRE Weekly Issue #279","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: On July 28, ZAP Creator Simon Bennetts is giving a first look at ZAP\u2019s new automation framework. Grab your spot: https:\/\/sthwk.com\/ZAP-Automation Articles Managing the Risk of Cascading Failure This is a presentation by Laura Nolan (with text transcript) all about cascading\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":320,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-275\/","url_meta":{"origin":276,"position":3},"title":"SRE Weekly Issue #275","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Join ZAP Founder & Project Lead Simon Bennetts on June 30 for a live AMA where he will be answering questions on all things open source and AppSec. Register: http:\/\/sthwk.com\/Simon-AMA Articles Practical Guide to SRE: Incident Severity Levels Here\u2019s a take\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":291,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-263\/","url_meta":{"origin":276,"position":4},"title":"SRE Weekly Issue #263","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: You can utilize Swagger Docs in security testing to drive more thorough and accurate vulnerability scans of your APIs. 
Learn how: http:\/\/sthwk.com\/swagger-api-testing Articles [Increment: Reliability] Tracing a path to observability They make a really clear case for why traditional metrics and\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":304,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-269\/","url_meta":{"origin":276,"position":5},"title":"SRE Weekly Issue #269","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Tune into ZAPCon After Hours this Tuesday at 8 am PT to learn how to include automated security testing in your builds with ZAP http:\/\/sthwk.com\/after-hours-3 Articles Edgar: Solving Mysteries Faster with Observability We built Edgar to ease this burden, by empowering\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/276","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=276"}],"version-history":[{"count":1,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/276\/revisions"}],"predecessor-version":[{"id":434,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/276\/revisions\/434"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=276"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=276"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=276"}],"curies":[
{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}