{"id":339,"date":"2021-08-31T14:39:28","date_gmt":"2021-08-31T14:39:28","guid":{"rendered":"https:\/\/fde.cat\/?p=339"},"modified":"2021-08-31T14:39:28","modified_gmt":"2021-08-31T14:39:28","slug":"sre-weekly-issue-281","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-281\/","title":{"rendered":"SRE Weekly Issue #281"},"content":{"rendered":"<p><a href=\"https:\/\/sreweekly.com\/sre-weekly-issue-281\/\" title=\"Permalink to SRE Weekly Issue #281\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, StackHawk:<\/h2>\n<p>Traditional application security testing methods fail for single page applications. Check out why single page apps are different and how you can run security tests on your SPAs.<br \/>\n<a href=\"https:\/\/sthwk.com\/SPA\">https:\/\/sthwk.com\/SPA<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/incident.io\/blog\/learning-from-incidents-in-formula-1\">Learning from incidents \u2013 Formula 1<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The incident: a Formula 1 car hit the side barrier just over 20 minutes before the race was about to start. The team sprang into action with an incredibly calm, orderly and speedy incident response to replace the damaged parts faster than they ever have before.<\/p>\n<p>This article is a great analysis, and there\u2019s also an excellent 8-minute video that I highly recommend. Listen to\u00a0the way the sporting director and everyone else communicate so calmly. 
It\u2019s a rare treat to get video footage of a production incident like this.<\/p>\n<p>Chris Evans \u2014 incident.io<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.last9.io\/services-not-server-observability\/\">Observe a Service; Not a Server<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The underlying components become the cattle, and the services become the new Pet that you tend to with your utmost care.<\/p>\n<p>Piyush Verma \u2014 Last9<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/github.com\/aws-samples\/aws-incident-response-playbooks\">aws-samples\/aws-incident-response-playbooks<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>AWS posted these example\/template incident response playbooks for customers to use in their incident response process.<\/p>\n<p>AWS<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.netmeister.org\/blog\/dns-rrs.html\">(All) DNS Resource Records<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A list with descriptions of all DNS record types, even the obscure ones. 
Tag yourself, I\u2019m HIP.<\/p>\n<p>Jan Schaumann<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"http:\/\/ow.ly\/EKnI50FDVBL\">What\u2019s a Major Incident Anyway?<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one includes a useful set of questions to prompt you as you develop your incident response and classification process.<\/p>\n<p>Hollie Whitehead \u2014 xMatters<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.learningfromincidents.io\/blog\/how-to-be-better-together\">How to be better, together<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The author of this article shows us how they communicate actively, perform incident retrospectives, and even discuss \u201cnear misses\u201d and normal work in order to better learn how their system works \u2014 all skills that apply directly to SRE.<\/p>\n<p>Jason Koppe \u2014 Learning From Incidents<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/rootly.io\/blog\/the-unique-reliability-engineering-requirements-of-microservices\">The Unique Reliability Engineering Requirements of Microservices<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Although the fundamental concepts of site reliability engineering are the same in any environment, SREs must adapt practices to different technologies, like microservices.<\/p>\n<p>JJ Tang \u2014 Rootly<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.circleid.com\/posts\/20210726-its-time-to-rethink-outage-reports\/\">It\u2019s Time to Rethink Outage Reports<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one uses Akamai\u2019s incident report from their July 22 major outage as a jumping-off point to discuss openness in incident reports. 
The text of Akamai\u2019s incident report is included in full.<\/p>\n<p>Geoff Huston \u2014 CircleID<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.regulationasia.com\/culture-conduct-risk-the-normalization-of-deviance\/\">Culture &amp; Conduct Risk: The Normalization of Deviance<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Drawing from the \u201cnormalization of deviance\u201d concept introduced in the Challenger disaster study [Diane Vaughan], this article explores the idea of studying your organization\u2019s culture to catch problems early, rather than waiting to respond after they happen.<\/p>\n<p>Stephen Scott<\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/podcast.staffeng.com\/1687069\/8930850-lorin-hochstein-netflix\">Lorin Hochstein (Netflix) [StaffEng Podcast]<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This episode of the StaffEng Podcast is an interview with Lorin Hochstein, whose writings I\u2019ve featured here numerous times. My favorite part of this episode is when they talk about doing incident analysis for near misses. 
One of the hosts points out that it\u2019s much easier for folks to talk about what happened, because there was no incident so they\u2019re not worried about being blamed.<\/p>\n<p>David No\u00ebl-Romas and Alex Kessinger \u2013 StaffEng Podcast<\/p>\n<\/div>\n<\/div>\n<h2>Outages<\/h2>\n<p><a href=\"https:\/\/status.io\/pages\/incident\/55957a99e800baa4470002da\/60fdf1d1776729052faca982\">Let\u2019s Encrypt<\/a><br \/>\n<a href=\"https:\/\/9to5mac.com\/2021\/07\/29\/psa-snapchat-is-currently-down-for-some-users\/\">Snapchat<\/a><br \/>\n<a href=\"https:\/\/piunikaweb.com\/2021\/07\/26\/wikipedia-down-error-claims-server-maintenance-or-a-technical-issue\/\">Wikipedia<\/a><\/p>\n<p>To fact-check this one, I looked at their <a href=\"https:\/\/grafana.wikimedia.org\/d\/000000479\/frontend-traffic?orgId=1&amp;from=1627257600000&amp;to=1627343999000\">Grafana dashboard<\/a>. Neat!<\/p>\n<p><a href=\"https:\/\/www.express.co.uk\/life-style\/science-technology\/1468474\/Netflix-down-streaming-offline-app-Android-iPhone\">Netflix<\/a><br \/>\n<a href=\"https:\/\/www.the-sun.com\/news\/2385456\/is-venmo-down\/\">Venmo<\/a><br \/>\n<a href=\"https:\/\/cw.ua.edu\/82174\/news\/blackboard-learn-down-the-week-before-fall-semester\/\">Blackboard Learn<\/a><br \/>\n<a href=\"https:\/\/www.express.co.uk\/life-style\/science-technology\/1469981\/eBay-DOWN-DNS-service-unavailable-hits-website-down-for-thousands\">eBay<\/a><br \/>\n<a href=\"https:\/\/reddit.statuspage.io\/incidents\/s0fdjz6kwky9\">reddit<\/a><br \/>\nSRE WEEKLY<\/p>\n","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, StackHawk: Traditional application security testing methods fail for single page applications. Check out why single page apps are different and how you can run security tests on your SPAs. 
https:\/\/sthwk.com\/SPA Articles Learning from incidents \u2013 Formula 1 The incident: a formula 1 car hit the side barrier&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-281\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #281<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-339","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":269,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-257\/","url_meta":{"origin":339,"position":0},"title":"SRE Weekly Issue #257","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Keeping your APIs secure requires thoughtful design and testing. Learn how to protect your REST, SOAP and GraphQL APIs from security vulnerabilities with StackHawk http:\/\/sthwk.com\/api-protection Articles Sometimes alerts have inobvious reasons for existing This one really got me thinking. Make sure\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":642,"url":"https:\/\/fde.cat\/index.php\/2022\/10\/24\/sre-weekly-issue-344\/","url_meta":{"origin":339,"position":1},"title":"SRE Weekly Issue #344","date":"October 24, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly\u00a0\ud83d\ude92. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. 
Want to see why companies like Canva and Grammarly love us?:\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":477,"url":"https:\/\/fde.cat\/index.php\/2021\/09\/27\/sre-weekly-issue-289\/","url_meta":{"origin":339,"position":2},"title":"SRE Weekly Issue #289","date":"September 27, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Semgrep and StackHawk are showing you what\u2019s new with automated security testing on September 30. Grab your spot: https:\/\/sthwk.com\/whats-new-webinar Articles How SREs are unique in their approach to work Here are some things that make SREs a unique breed in software\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":565,"url":"https:\/\/fde.cat\/index.php\/2022\/04\/25\/sre-weekly-issue-319\/","url_meta":{"origin":339,"position":3},"title":"SRE Weekly Issue #319","date":"April 25, 2022","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly \ud83d\ude92. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https:\/\/rootly.com\/demo\/\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":467,"url":"https:\/\/fde.cat\/index.php\/2021\/09\/20\/sre-weekly-issue-288\/","url_meta":{"origin":339,"position":4},"title":"SRE Weekly Issue #288","date":"September 20, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Want to see what\u2019s new with automated security tooling? Tune in on September 30 to see how StackHawk and Semgrep are making it possible to embed security testing in CI\/CD. 
https:\/\/sthwk.com\/whats-new-webinar Articles Tammy Bryant Butow on SRE Apprentices Faced with a\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":455,"url":"https:\/\/fde.cat\/index.php\/2021\/09\/20\/sre-weekly-issue-285\/","url_meta":{"origin":339,"position":5},"title":"SRE Weekly Issue #285","date":"September 20, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Check out the latest from StackHawk\u2019s Chief Security Officer, Scott Gerlach, on why security should be part of building software, and how StackHawk helps teams catch vulns before prod. https:\/\/sthwk.com\/cloudnative Articles Computers are the easy part What\u2019s so great about this\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/339","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=339"}],"version-history":[{"count":1,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/339\/revisions"}],"predecessor-version":[{"id":371,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/339\/revisions\/371"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=339"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=339"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=339"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templat
ed":true}]}}