Articles
In this post, we’ll explore 10 areas that are key to designing highly scalable architectures.
The 10 areas they cover in depth are:
Horizontal vs. Vertical Scaling
Load Balancing
Database Scaling
Asynchronous Processing
Stateless Systems
Caching
Network Bandwidth Optimization
Progressive Enhancement
Graceful Degradation
Code Scalability
Code Reliant
Are you looking at the number of requests that were served successfully out of the total number of requests? Or the percentage of time the system was up and working properly?
Alex Ewerlöf
This is my personal take on something that is considered standard that I just don’t understand. So here we go — the Apdex, what it is, and why I don’t use it!
Boris Cherkasky
Here’s a great explanation of three common cognitive biases we should try to avoid while analyzing incidents.
Randy Horwitz — Learning From Incidents
A horrifying tale of gitops gone wrong and backups that didn’t back up, leading to catastrophic data loss. This, this is what hugops is for. I’m so sorry, Lily!
Lily Cohen
Here’s a follow-up analysis from Duo of an incident they had last week.
The first SRE hire at incident.io shares what they learned as they became familiar with the infrastructure and figured out what to do with it.
Ben Wheatley — The New Stack
This is a story of building a new on-call rotation in a company that didn’t have one. They started out with a pretty awesome list of principles that we could all aspire to.
Felix Lopez — The New Stack
Why should we test in production? This article gives a really spot-on argument and goes on to explain how to do it.
Sven Hans Knecht
SRE WEEKLY