SRE Weekly Issue #421

Last week, I mistakenly attributed [an article](https://www.paigerduty.com/sre-biggest-problem/) to PagerDuty. Actually, it was by Paige Cruz, whose clever blog name I didn’t pay anywhere near close enough attention to! Thanks to several readers who nudged me gently about my error. A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter…

Published
Categorized as SRE

SRE Weekly Issue #420

A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https://firehydrant.com/blog/ai-for-incident-management-is-here/ 1.0 Launch Retrospective The game Last Epoch launched in February, and they had a rocky start. This huge retrospective post tells the story of…

SRE Weekly Issue #419

A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https://firehydrant.com/blog/ai-for-incident-management-is-here/ How Figma’s Databases Team Lived to Tell the Scale Our nine month journey to horizontally shard Figma’s Postgres stack, and the key to unlocking…

SRE Weekly Issue #418

A message from our sponsor, FireHydrant: FireHydrant is now AI-powered for faster, smarter incidents! Power up your incidents with auto-generated real-time summaries, retrospectives, and status page updates. https://firehydrant.com/blog/ai-for-incident-management-is-here/ Redefining Observability The observability waters have been muddy for a while, and this article does a great job of taking a step back and building…

SRE Weekly Issue #417

A message from our sponsor, FireHydrant: Join FireHydrant this Thursday for a conversation about on-call burnout and how to prevent it. Get a better understanding of what makes a fatigue-free on-call culture, including real-world examples from your incident management peers. No sales, just shop talk. https://app.livestorm.co/firehydrant/better-incidents-spring-bonfire-secrets-to-fatigue-free-on-call-in-2024 Harnessing chaos in Cloudflare offices Remember…

SRE Weekly Issue #416

A message from our sponsor, FireHydrant: We need tools that help us show our value, enhance understanding of our systems, and free time for us to expand our skills. In this article, FireHydrant lays out three questions to ask vendors as you evaluate DevOps tools. https://firehydrant.com/blog/3-questions-to-ask-of-any-devops-tool-in-2024/ 4 Instructive Postmortems on Data Downtime…

SRE Weekly Issue #415

A message from our sponsor, FireHydrant: Join FireHydrant and talk shop with your DevOps peers on March 28! You’ll gain a better understanding of what makes a fatigue-free on-call culture and how to implement practices to improve yours at this free, virtual roundtable. https://app.livestorm.co/firehydrant/better-incidents-spring-bonfire-secrets-to-fatigue-free-on-call-in-2024 The Wrong Way to Use DORA Metrics […]…

SRE Weekly Issue #414

A message from our sponsor, FireHydrant: 91% of engineering leaders say they want a better alerting tool. The other 9% couldn’t take the survey on their Blackberry. Meet Signals: a new standard in alerting and on call, now available. https://firehydrant.com/blog/alerting-and-on-call-scheduling-for-how-you-actually-work/ 2024 VOID Report This year’s VOID Report is out, and it’s well…

SRE Weekly Issue #413

Sorry about the automation fail and resend! That definitely wasn’t issue #1. A message from our sponsor, FireHydrant: Check out how global payments company Dock uses FireHydrant to streamline and consolidate their incident management stack and reduce what they call “mean time to combat.” https://firehydrant.com/blog/the-revolution-in-critical-incident-response-at-dock-with-firehydrant/ The Domain of Failure This article discusses building…

SRE Weekly Issue #412

A message from our sponsor, FireHydrant: FireHydrant’s new and improved MTTX analytics dashboard is here! See which services are most affected by incidents, where they take the longest to detect (or acknowledge, mitigate, resolve … you name it); and how metrics and statistics change over time. https://firehydrant.com/blog/mttx-incident-analytics-to-drive-your-reliability-roadmap/ The Single Pain of Glass…
