{"id":253,"date":"2021-08-31T14:40:46","date_gmt":"2021-08-31T14:40:46","guid":{"rendered":"https:\/\/fde.cat\/?p=253"},"modified":"2021-08-31T14:40:46","modified_gmt":"2021-08-31T14:40:46","slug":"sre-weekly-issue-254","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-254\/","title":{"rendered":"SRE Weekly Issue #254"},"content":{"rendered":"<p><a href=\"http:\/\/sreweekly.com\/sre-weekly-issue-254\/\" title=\"Permalink to SRE Weekly Issue #254\" class=\"email_only\">View on sreweekly.com<\/a><\/p>\n<div class=\"sreweekly-sponsor-message\">\n<h2>A message from our sponsor, StackHawk:<\/h2>\n<p>Need to run a standalone Kotlin app as a fat jar in a Gradle project? Check out how we handled that!<br \/>\n<a href=\"http:\/\/sthwk.com\/kotlin-with-gradle\">http:\/\/sthwk.com\/kotlin-with-gradle<\/a><\/p>\n<\/div>\n<h2>Articles<\/h2>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.coinbase.com\/brief-incident-post-mortem-january-6-7-2021-441f6224da93\" target=\"_blank\" rel=\"noopener\">Coinbase Incident Post Mortem: January 6\u20137, 2021<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This one\u2019s juicy. At one point, the front-end was blocked up, so the back-end saw less traffic and scaled down. Then when the traffic came flooding back, the back-end was ill-prepared. 
We can all learn from this.<\/p>\n<p><small>Coinbase<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/blog.cloudflare.com\/soar-simulation-for-observability-reliability-and-security\/\" target=\"_blank\" rel=\"noopener\">Soar: Simulation for Observability, reliAbility, and secuRity<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Cloudflare has what amounts to a sophisticated staging environment for testing new code.<\/p>\n<p><small>Yan Zhai \u2014 Cloudflare<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/rachelbythebay.com\/w\/2021\/01\/16\/load\/\" target=\"_blank\" rel=\"noopener\">Failing to make progress under excess request load<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Sometimes rolling back doesn\u2019t actually get you back to a good state, especially when there\u2019s pent-up demand.<\/p>\n<p><small>Rachel By the Bay<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/static.googleusercontent.com\/media\/www.google.com\/en\/\/appsstatus\/ir\/3pc4s2k9hgsdcso.pdf\" target=\"_blank\" rel=\"noopener\">Google Cloud Issue Summary \u2014 Google Meet \u2014 2021-01-08<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Here\u2019s Google\u2019s follow-up on a Google Meet outage earlier this month.<\/p>\n<p><small>Google<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/letsencrypt.org\/2021\/01\/21\/next-gen-database-servers.html\" target=\"_blank\" rel=\"noopener\">The Next Gen Database Servers Powering Let\u2019s Encrypt <\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Those are some seriously big database servers.<\/p>\n<p><small>Josh Aas and James Renken \u2014 Let\u2019s Encrypt<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div 
class=\"sreweekly-title\"><a href=\"https:\/\/betteruptime.com\/blog\/incident-management-and-on-call\/\" target=\"_blank\" rel=\"noopener\">Incident Management in 2021: from Basics to Best Practices<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>A great general overview of all aspects of incident response, including definitions and best practices.<\/p>\n<p><small>Better Uptime<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/www.zebrium.com\/blog\/using-gpt-3-with-zebrium-for-plain-language-incident-root-cause-from-logs\" target=\"_blank\" rel=\"noopener\">Using GPT-3 for plain language incident root cause from logs<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>Check out what happens when you unleash a generalized language model AI on some log messages related to an incident.<\/p>\n<p><small>Larry Lancaster \u2014 Zebrium<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/tanzu.vmware.com\/content\/blog\/taming-operational-load-vmware-cre\" target=\"_blank\" rel=\"noopener\">Taming Operational Load with VMware CRE<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>The CRE team at VMware undertook a project to find and reduce toil. Note that \u201cwith VMware CRE\u201d does <em>not<\/em> mean \u201cwith some product named VMware CRE\u2122\u201d.<\/p>\n<p><small>Gustavo Franco \u2014 VMware<\/small><\/p>\n<\/div>\n<\/div>\n<div class=\"sreweekly-entry\">\n<div class=\"sreweekly-title\"><a href=\"https:\/\/devopsish.com\/pdf\/Slack-Incident-Jan-04-2021-RCA-Final.pdf\" target=\"_blank\" rel=\"noopener\">Slack RCA for outage on January 4, 2021<\/a><\/div>\n<div class=\"sreweekly-description\">\n<p>This is Slack\u2019s RCA for their <a href=\"https:\/\/status.slack.com\/2021-01\/9ecc1bc75347b6d1\">outage earlier this month<\/a>. 
This is a great example of a complex incident with many contributing factors \u2014 certainly no single \u201croot cause\u201d here.<\/p>\n<p><small>Slack<\/small><\/p>\n<\/div>\n<\/div>\n<h2>Outages<\/h2>\n<ul class=\"sreweekly-outages\">\n<li><a href=\"https:\/\/status.slack.com\/\/2021-01\/a9b04615180de30a\">Slack<\/a><\/li>\n<li><a href=\"https:\/\/www.cnet.com\/news\/signal-operational-again-after-daylong-outage\/\">Signal<\/a><\/li>\n<li><a href=\"https:\/\/telecom.economictimes.indiatimes.com\/news\/apple-icloud-sign-in-activation-suffer-32-hour-outage-fixed-now\/79975310\">Apple iCloud<\/a><\/li>\n<li><a href=\"https:\/\/www.complex.com\/life\/2021\/01\/facebook-experiences-massive-outage-logs-out-users\">Facebook<\/a><\/li>\n<li><a href=\"https:\/\/www.sportingnews.com\/us\/nfl\/news\/why-is-cbs-down-outage-chiefs-browns\/ycr8nlfesesr1a8w8bjf7vluk\">CBS<\/a><\/li>\n<\/ul>\n<p>SRE WEEKLY<\/p>\n","protected":false},"excerpt":{"rendered":"<p>View on sreweekly.com A message from our sponsor, StackHawk: Need to run a standalone Kotlin app as a fat jar in a Gradle project? Check out how we handled that! http:\/\/sthwk.com\/kotlin-with-gradle Articles Coinbase Incident Post Mortem: January 6\u20137, 2021 This one\u2019s juicy. 
At one point, the front-end was blocked up, so the back-end saw less&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-254\/\">Continue reading <span class=\"screen-reader-text\">SRE Weekly Issue #254<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[8],"tags":[],"class_list":["post-253","post","type-post","status-publish","format-standard","hentry","category-sre","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":643,"url":"https:\/\/fde.cat\/index.php\/2022\/10\/24\/from-zero-to-10-million-lines-of-kotlin\/","url_meta":{"origin":253,"position":0},"title":"From zero to 10 million lines of Kotlin","date":"October 24, 2022","format":false,"excerpt":"We\u2019re sharing lessons learned from shifting our Android development from Java to Kotlin. Kotlin is a popular language for Android development and offers some key advantages over Java.\u00a0 As of today, our Android codebase contains over 10 million lines of Kotlin code. We\u2019re open sourcing various examples and utilities we\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":261,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-256\/","url_meta":{"origin":253,"position":1},"title":"SRE Weekly Issue #256","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Register now for the first-ever ZAPCon taking place March 9th. The free event will focus on OWASP ZAP and application security best practices. You won\u2019t want to miss it! 
http:\/\/sthwk.com\/zapcon-sre-weekly Articles Slack\u2019s Outage on January 4th 2021 Here\u2019s a blog post\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":334,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/building-data-pipelines-using-kotlin\/","url_meta":{"origin":253,"position":2},"title":"Building Data Pipelines Using Kotlin","date":"August 31, 2021","format":false,"excerpt":"Co-written by Alex\u00a0Oscherov. Up until recently, we, like many companies, built our data pipelines in any one of a handful of technologies using Java or Scala, including Apache Spark, Storm, and Kafka. But Java is a very verbose language, so writing these pipelines in Java involves a lot of boilerplate code.\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":252,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-255\/","url_meta":{"origin":253,"position":3},"title":"SRE Weekly Issue #255","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: With StackHawk\u2019s new GitHub Action, you can integrate AppSec testing directly into your GitHub CI\/CD pipeline. See how: http:\/\/sthwk.com\/appsec-github-action Articles Why It Should Be Service, Not Site Reliability It really should! Even Google is much more accurately described as a \u201cservice\u201d\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":311,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/sre-weekly-issue-271\/","url_meta":{"origin":253,"position":4},"title":"SRE Weekly Issue #271","date":"August 31, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Join StackHawk on Tuesday, May 25 for a hands-on authenticated security testing workshop. 
Follow along as we walk through three common authentication scenarios step-by-step. Register: http:\/\/sthwk.com\/auth-workshop Articles Naming names in incident writeups Should you keep things anonymous (\u201can engineer\u201d), or should\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":463,"url":"https:\/\/fde.cat\/index.php\/2021\/09\/20\/sre-weekly-issue-287\/","url_meta":{"origin":253,"position":5},"title":"SRE Weekly Issue #287","date":"September 20, 2021","format":false,"excerpt":"View on sreweekly.com A message from our sponsor, StackHawk: Trying to figure out how to keep your APIs secure? You\u2019re not the only one. See how DataRobot is automating API security testing with StackHawk. https:\/\/sthwk.com\/DataRobot Articles Industry Interviews: Colm Doyle, Incident Commander at Slack Lots of details about how Slack\u2026","rel":"","context":"In &quot;SRE&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/253","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=253"}],"version-history":[{"count":1,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/253\/revisions"}],"predecessor-version":[{"id":448,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/253\/revisions\/448"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=253"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=253"},{"taxonomy":"post_tag","embeddable":true,"href"
:"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=253"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}