Asynchronous computing at Meta: Overview and learnings

We’ve made architecture changes to Meta’s event-driven asynchronous computing platform that have enabled easy integration with multiple event sources. We’re sharing our learnings from handling various workloads and how to tackle the trade-offs made with certain design choices in building the platform. Asynchronous computing is a paradigm where the user does not expect a workload…

Published
Categorized as Technology

SRE Weekly Issue #357

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?: https://rootly.com/demo/ Articles 3 tips for…

Published
Categorized as SRE

Watch Meta’s engineers discuss optimizing large-scale networks

Managing network solutions at a growing scale inherently brings challenges around performance, deployment, and operational complexity. At Meta, we’ve found that these challenges broadly fall into three themes: 1. Data center networking: Over the past decade, on the physical front, we have seen a rise in vendor-specific hardware that comes with heterogeneous feature and…

Published
Categorized as Technology

Tulip: Modernizing Meta’s data platform

This technical journey covers the motivations, challenges, and technical solutions employed for warehouse schematization, especially a change to the wire serialization format used in Meta’s data platform for data interchange related to Warehouse Analytics Logging. Here, we discuss the engineering, scaling, and nontechnical challenges of modernizing Meta’s exabyte-scale data platform by migrating to the new…

Published
Categorized as Technology

SRE Weekly Issue #356

Thanks to all of you who took the time to share your ideas about choosing incidents to investigate! I got some great answers and I’m looking forward to pulling them together into an article. I decided to give this GPT-3 thing a spin. It turns out that it absolutely can assemble a…

Published
Categorized as SRE

SRE Weekly Issue #355

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?: https://rootly.com/demo/ Articles Which incidents aren’t…

Published
Categorized as SRE

SRE Weekly Issue #354

A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us?: https://rootly.com/demo/ Articles DisasterCast Episode 31…

Published
Categorized as SRE

No issue this week

Happy new year! I’m taking a break this week and I’ll be back with a new issue next week. See you in issue #354!

Published
Categorized as SRE