SRE Weekly Issue #324

View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles The Need to Decouple… Continue reading SRE Weekly Issue #324

Categorized as SRE

SRE Weekly Issue #323

View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles A Chat with Lex… Continue reading SRE Weekly Issue #323

Categorized as SRE

SRE Weekly Issue #322

View on sreweekly.com Bit of a short issue this week. This morning, I stepped on my phone, crushing it mightily beneath my bootheel. Unfortunately a lot of my automation for reviewing articles is on there… thank goodness I have functioning backups. A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒.… Continue reading SRE Weekly Issue #322

Categorized as SRE

Meta Open Source is transferring Jest to the OpenJS Foundation

Meta Open Source is officially transferring Jest, its open source JavaScript testing framework, to the OpenJS Foundation. With over 17 million weekly downloads and over 38,000 GitHub stars, Jest is the most used testing framework in the JavaScript ecosystem and is used by companies of all sizes, including Amazon, Google, Microsoft, and Stripe. We believe… Continue reading Meta Open Source is transferring Jest to the OpenJS Foundation

SRE Weekly Issue #321

View on sreweekly.com A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles Using Fault Injection Testing… Continue reading SRE Weekly Issue #321

Categorized as SRE

BellJar: A new framework for testing system recoverability at scale

Building infrastructure that can easily recover from outages, particularly outages involving adjacent infrastructure, too often becomes a murky exploration of nuanced fate-sharing between systems. Untangling dependencies and uncovering side effects of unavailability has historically been time-consuming work. A lack of great tooling built for this, and the rarity of infrastructure outages, makes reasoning about them… Continue reading BellJar: A new framework for testing system recoverability at scale

How to Optimize Your Apache Spark Application with Partitions

In Salesforce Einstein, we use Apache Spark to perform parallel computations on large sets of data, in a distributed manner. In this article, we will take a deep dive into how you can optimize your Spark application with partitions. Introduction Today, we often need to process terabytes of data per day to reach conclusions. To do this… Continue reading How to Optimize Your Apache Spark Application with Partitions

Delta: A highly available, strongly consistent storage service using chain replication

Over the years, Meta has invested in a number of storage service offerings that cater to different use cases and workload characteristics. Along the way, we’ve aimed to reduce and converge the systems in the storage space. At the same time, having a dedicated solution for critical package workload makes everyone happier. Having this in… Continue reading Delta: A highly available, strongly consistent storage service using chain replication