SRE Weekly Issue #328

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/

Articles: Cloudflare outage on June…

Categorized as SRE

Transparent memory offloading: more memory at a fraction of the cost and power

- Transparent memory offloading (TMO) is Meta’s data center solution for offering more memory at a fraction of the cost and power of existing technologies.
- In production since 2021, TMO saves 20 percent to 32 percent of memory per server across millions of servers in our data center fleet.

We are witnessing massive growth in the…

Categorized as Technology

SRE Weekly Issue #327

Articles: Redundant against what? Even…

Categorized as SRE

Applying federated learning to protect data on mobile devices

What the research is: Federated learning with differential privacy (FL-DP) is one of the latest privacy-enhancing technologies being evaluated at Meta as we constantly work to enhance user privacy and further safeguard users’ data in the products we design, build, and maintain. FL-DP enhances privacy in two important ways: It allows machine learning (ML) models…

Categorized as Technology

SRE Weekly Issue #326

Articles: Calling all Reliability Practitioners:…

Categorized as SRE

Under the hood: Meta’s cloud gaming infrastructure

The promise of cloud gaming is a promise to democratize gaming. Anyone who loves games should be able to enjoy them and share the experience with their friends, no matter where they’re located, and even if they don’t have the latest, most expensive gaming hardware. Facebook launched its cloud gaming platform in 2020 to give…

Introducing Zelos: A ZooKeeper API leveraging Delos

Within large-scale services, durable storage, distributed leases, and coordination primitives such as distributed locks, semaphores, and events should be strongly consistent. At Meta, we have historically used Apache ZooKeeper as a centralized service for these primitives. However, as Meta’s workload has scaled, we’ve found ourselves pushing the limits of ZooKeeper’s capabilities. Modifying and tuning ZooKeeper…

Cache made consistent: Meta’s cache invalidation solution

Caches help reduce latency, scale read-heavy workloads, and save cost. They are literally everywhere. Caches run on your phone and in your browser. For example, CDNs and DNS are essentially geo-replicated caches. It’s thanks to many caches working behind the scenes that you can read this blog post right now. Phil Karlton famously said, “There…

SRE Weekly Issue #325

Articles: Imagine there’s no human…

Categorized as SRE

Meet the team of problem solvers pushing boundaries to see how massively we can scale.

Welcome to the new hub for all things Salesforce Engineering! This site is where you can get a behind-the-scenes look at how we build business-critical software at scale, take a peek at how we contribute to the open source community, meet some of our technical employees, and learn more about what it’s like to work…