SRE Weekly Issue #330

Thanks for all the well-wishes as I took a sick day last week. I’m feeling much better! A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and…

Categorized as SRE

Owl: Distributing content at Meta scale

Being able to distribute large, widely consumed objects (so-called hot content) efficiently to hosts is becoming increasingly important within Meta’s private cloud. These are commonly distributed content types such as executables, code artifacts, AI models, and search indexes that help enable our software systems. Owl is a new system for high-fanout distribution of large data…

Categorized as Technology

No issue this week

Hi folks! I’m taking a sick day today, so no issue this week.

Categorized as SRE

Watch Meta’s engineers discuss QUIC and TCP innovations for our network

With more than 75 percent of our internet traffic set to use QUIC and HTTP/3 together, QUIC is gradually becoming the de facto protocol for internet communication at Meta. For Meta’s data center network, TCP remains the primary network transport protocol that supports thousands of services on top of it. As our…

SRE Weekly Issue #329

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles: “Keep calm and use…

Categorized as SRE

SRE Weekly Issue #328

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles: Cloudflare outage on June…

Categorized as SRE

Transparent memory offloading: more memory at a fraction of the cost and power

Transparent memory offloading (TMO) is Meta’s data center solution for offering more memory at a fraction of the cost and power of existing technologies. In production since 2021, TMO saves 20 percent to 32 percent of memory per server across millions of servers in our data center fleet. We are witnessing massive growth in the…

Categorized as Technology

SRE Weekly Issue #327

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles: Redundant against what? Even…

Categorized as SRE

Applying federated learning to protect data on mobile devices

What the research is: Federated learning with differential privacy (FL-DP) is one of the latest privacy-enhancing technologies being evaluated at Meta as we constantly work to enhance user privacy and further safeguard users’ data in the products we design, build, and maintain. FL-DP enhances privacy in two important ways: It allows machine learning (ML) models…

Categorized as Technology

SRE Weekly Issue #326

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles: Calling all Reliability Practitioners:…

Categorized as SRE