Articles
A fascinating PostgreSQL debugging story that hinges on code comments, of all things.
Christopher White — Prefect
If you’re a distributed systems nerd, this one’s a real treat. It’s a detailed breakdown of the results of a Jepsen test.
Denis Rystsov — Redpanda
An investigation into a kernel bug that caused excessive TCP memory usage in certain situations.
Mike Freemon — Cloudflare
Let’s unpack what scaling a team is all about: what the indicators are, what steps you can take, and how you know when you’re done.
Biju Chacko — Squadcast
Here’s another guide on running incident retrospectives and building a repeatable retrospective process.
Amin Astaneh — Certo Modo
Here’s a fun little tool that lets you inspect how data in a C program is represented in memory.
Julia Evans
This two-part series explores some shortcomings in Kubernetes’s CronJob system and the ways that Lyft fixed and worked around them.
Kevin Yang — Lyft
And here’s a case where someone ran into the Kubernetes CronJob bug described in the previous article.
Vallery Lancey
SRE WEEKLY