There are thousands of distributed services running on millions of servers in Meta’s data centers. Part of ensuring the reliability of those services means making them resilient to power loss events as our data center fleet grows. To help increase resiliency, we built the Power Loss Siren (PLS) — a rack level, low latency, distributed… Continue reading Power Loss Siren: Making Meta resilient to power loss events
Category: Technology
Encompasses all posts related to the Technology topic on this site
Event Sourcing for an Inventory Availability Solution
Co-author: Balachandar Mariappan
An Introduction to Terminology
ATF: Available to Fulfill inventory
On-Hand: Physical amount of inventory available
SKU: Stock Keeping Unit, which represents a distinct type of item for sale
Location: Representation of a physical location, like a store or warehouse, where SKUs are present
Location Group: A logical aggregation of typically one or more Locations
Reservation or Inventory Reservation: Reserving a quantity of a SKU. For example:… Continue reading Event Sourcing for an Inventory Availability Solution
Charting the future of our bug bounty program
We’re tackling the industry-wide issue of scraping by expanding our bug bounty program to reward valid reports of scraping bugs and unprotected data sets. To the best of our knowledge, this is an industry first. Looking toward the future, we’re also launching new educational opportunities for researchers and hosting our first BountyConEDU — a three-day… Continue reading Charting the future of our bug bounty program
SLICK: Adopting SLOs for improved reliability
To support the people and communities who use our apps and products, we need to stay in constant contact with them. We want to provide the experiences we offer reliably. We also need to establish trust with the larger community we support. This can be especially challenging in a large-scale, quickly evolving environment like Meta,… Continue reading SLICK: Adopting SLOs for improved reliability
SRE Weekly Issue #300
View on sreweekly.com 300 issues. 6 years. Wow! I couldn’t have done it without all of you wonderful people, writing articles and reading issues. Thanks, you make curating this newsletter fun! A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and… Continue reading SRE Weekly Issue #300
Using Redis HASH instead of SET to reduce cache size and operating costs
What if we told you that there was a way to dramatically reduce the cost to operate on cloud providers? That’s what we found when we dug into the different data structures offered in Redis. Before we committed to one, we did some research into the difference in memory usage between using HASH versus using… Continue reading Using Redis HASH instead of SET to reduce cache size and operating costs
Applying the Micro Batching Pattern to Data Transfer
Building consistent data replicas If you have worked on data-rich software systems, chances are you have worked with a distributed architecture where one part of your system needs access to data owned by another part of the system. Whether that architecture is a modern distributed microservices architecture or a set of stand-alone applications looking to exchange… Continue reading Applying the Micro Batching Pattern to Data Transfer
SRE Weekly Issue #299
View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging the right team, postmortem timeline, setting up reminders, and more. Book a demo: https://rootly.com/?utm_source=sreweekly Articles More More More! Why the Most Resilient Companies Want More Incidents Lacking… Continue reading SRE Weekly Issue #299
Restriction Rules
Restriction Rules: Complementing Salesforce’s Record Access Control Mechanism Introduction If you’ve taken Salesforce Admin 201 training, you might remember learning about Sharing Settings. Sharing Settings include Sharing Models, Criteria Sharing Rules, Manual Sharing, and more. I’m a software engineer on the Record Access Experience team here at Salesforce. When I took this training in 2017,… Continue reading Restriction Rules
SRE Weekly Issue #298
View on sreweekly.com Email subscribers, my apologies for the double-send last week. I upgraded WordPress and subsequently further cemented my distrust of all version upgrades ever. I carefully tested a fix in staging before rolling it out gradually in preparation for this week’s issue. Just kidding, I hacked on it live until I got it… Continue reading SRE Weekly Issue #298