SRE Weekly Issue #338
A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Rootly automates manual tasks like creating an incident channel, Jira ticket and Zoom rooms, inviting responders, creating statuspage updates, postmortem timelines and more. Want to see why companies like Canva and Grammarly love us? https://rootly.com/demo/ Articles Intro to Themes…
Category: Technology
Encompasses all posts related to the Technology topic on this site.
Open-sourcing TAOBench: An end-to-end social network benchmark
What the research is: The continued emergence of large social network applications has introduced a scale of data and query volume that challenges the limits of existing data stores. However, few benchmarks accurately simulate these request patterns, leaving researchers short of tools to evaluate and improve upon these systems. To address this issue,…
Network Entitlement: A contract-based network sharing solution
Meta’s overall network usage and traffic volume have increased as we’ve continued to add new services. Due to the scarcity of fiber resources, we’re developing an explicit resource reservation framework to effectively plan, manage, and operate the shared consumption of network bandwidth, which will help us keep up with demand and limit network disruptions during…
Viewing the world as a computer: Global capacity management
Meta currently operates 14 data centers around the world. This rapidly expanding global data center footprint poses new challenges for service owners and for our infrastructure management systems. Systems like Twine, which we use to scale cluster management, and RAS, which handles perpetual region-wide resource allocation, have provided the abstractions and automation necessary for service…
SRE Weekly Issue #337
Thanks for all the vacation well-wishes! It was really great and relaxing. Take vacations, it’s important for reliability! While I was out, I shipped the past two issues with content prepared in advance, and without the Outages section. This gave me a chance to really think hard about the value of the…
Introducing Velox: An open source unified execution engine
Meta is introducing Velox, an open source unified execution engine aimed at accelerating data management systems and streamlining their development. Velox is under active development. Experimental results from our paper published at the International Conference on Very Large Data Bases (VLDB) 2022 show how Velox improves efficiency and consistency in data management systems. Velox helps…
Hyperpacks: Using Buildpacks to Build Hyperforce
At Salesforce, we regularly use our products and services to scale our own business. One example is Buildpacks, which we created nearly a decade ago and which is now a part of Hyperforce. Hyperpacks are an innovative new way of using Cloud Native Buildpacks (CNB) to manage our public cloud infrastructure. Buildpacks were created to help…
Improving Meta’s SLO workflows with data annotations
When we focus on minimizing errors and downtime here at Meta, we place a lot of attention on service-level indicators (SLIs) and service-level objectives (SLOs). Consider Instagram, for example. There, SLIs represent metrics from different product surfaces, like the volume of error response codes to certain endpoints, or the number of successful media uploads. Based…
SRE Weekly Issue #336
A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating an incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles What it’s like to…
SRE Weekly Issue #335
A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating an incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles How an incident transformed…