How can you make your services observable and embrace service ownership? This article presents a variety of universally applicable design patterns for the developer to consider. Design patterns in software development are repeatable solutions and best practices for solving commonly occurring problems. Even in the case of service monitoring, design patterns, when used appropriately, can… Continue reading 5 Design Patterns for Building Observable Services
Category: Technology
Encompasses all posts related to the Technology topic on this site
SRE Weekly Issue #304
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly shirt): https://rootly.com/demo/?utm_source=sreweekly Articles Channel global decoupling for region… Continue reading SRE Weekly Issue #304
Managing Availability in Service Based Deployments with Continuous Testing
The Problem At Salesforce, trust is our number one value. What this equates to is that our customers need to trust us; trust us to safeguard their data, trust that we will keep our services up and running, and trust that we will be there for them when they need us. In the world of Software… Continue reading Managing Availability in Service Based Deployments with Continuous Testing
SRE Weekly Issue #303
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo: https://rootly.com/demo/?utm_source=sreweekly Articles Hot Takes on Code Freezes There are way too many gorgeous, mind-blowing… Continue reading SRE Weekly Issue #303
SRE Netflix at SRECon
190 Countries and 5 CORE SREs by Jonah Horowitz How does Netflix scale SRE? How do we manage over 70 million customers around the world without a 24/7 operations center? With tens of thousands of Linux instances in a distributed system architecture, and thousands of daily production changes, it’s an environment that’s both challenging and… Continue reading SRE Netflix at SRECon
SRE Weekly Issue #302
View on sreweekly.com Happy holidays, for those that celebrate! I put this issue together in advance, so no Outages section this week. A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up… Continue reading SRE Weekly Issue #302
Best Time to Send Emails
Today in email marketing, the time that an email is sent has a high impact on user engagement. Sending at an optimal time can help drive more successful and effective campaigns. In order to send at the best time, you need to have a good understanding of your users’ email engagement pattern. Sending right before… Continue reading Best Time to Send Emails
SRE Weekly Issue #301
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo: https://rootly.com/demo/?utm_source=sreweekly Articles BadgerDAO Exploit Technical Post Mortem This one perhaps belongs in a… Continue reading SRE Weekly Issue #301
Power Loss Siren: Making Meta resilient to power loss events
There are thousands of distributed services running on millions of servers in Meta’s data centers. Part of ensuring the reliability of those services means making them resilient to power loss events as our data center fleet grows. To help increase resiliency, we built the Power Loss Siren (PLS) — a rack level, low latency, distributed… Continue reading Power Loss Siren: Making Meta resilient to power loss events
Event Sourcing for an Inventory Availability Solution
Co-author: Balachandar Mariappan. An Introduction to Terminology. ATF: Available to Fulfill inventory. On-Hand: Physical amount of inventory available. SKU: Stock Keeping Unit, which represents a distinct type of item for sale. Location: Representation of a physical location like a store or warehouse where SKUs are present. Location Group: A logical aggregation of typically one or more Locations. Reservation or Inventory Reservation: Reserving a quantity of a SKU. For example:… Continue reading Event Sourcing for an Inventory Availability Solution