I’ll be on vacation starting next Sunday (yay!). That means the next two issues will be prepared in advance, so there won’t be an Outages section. A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging… Continue reading SRE Weekly Issue #334
Category: Technology
Encompasses all posts related to the Technology topic on this site.
How Instagram suggests new content
A touring alien from a galaxy far, far away is an avid Instagram user. Her Instagram Feed is dominated by:
– Friends and family posts
– Some space travel magazines
– A few general news accounts
– Lots of science fiction blogs
She logs in, scrolls through her feed gently, catching up with friends and family, keeping pace… Continue reading How Instagram suggests new content
Architectural Principles for High Availability on Hyperforce
Infrastructure and software failures will happen. We idolize four 9s (99.99%) availability. We know we need to optimize and improve Recovery-Time-Objective (RTO, the time it takes to restore service after a service disruption) and Recovery-Point-Objective (RPO, the acceptable data loss measured in time). But how can we actually deliver high availability for our customers? One… Continue reading Architectural Principles for High Availability on Hyperforce
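To put the four 9s target from the excerpt above in concrete terms, here is a minimal sketch of the arithmetic that turns an availability percentage into a yearly downtime budget. The function name `downtime_budget_minutes` and the choice of a 365-day year are assumptions made for illustration; they do not come from the Hyperforce post.

```python
# Illustrative arithmetic only; the names and the 365-day year are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the downtime per year allowed by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare three, four, and five 9s side by side.
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% availability allows {budget:.2f} minutes of downtime per year")
```

Under these assumptions, 99.99% availability leaves roughly 52.56 minutes of downtime per year, which is why recovery time matters so much at that target.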
Scaling data ingestion for machine learning training at Meta
Many of Meta’s products, such as search and language translations, utilize AI models to continuously improve user experiences. As the performance of hardware we use to support training infrastructure increases, we need to scale our data ingestion infrastructure accordingly to handle workloads more efficiently. GPUs, which are used for training infrastructure, tend to double in… Continue reading Scaling data ingestion for machine learning training at Meta
SRE Weekly Issue #333
A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles Is SRE Just Ops… Continue reading SRE Weekly Issue #333
SRE Weekly Issue #332
A message from our sponsor, Rootly: Manage incidents directly from Slack with Rootly 🚒. Automate manual admin tasks like creating incident channel, Jira and Zoom, paging and adding responders, postmortem timeline, setting up reminders, and more. Book a demo (+ get a snazzy Rootly lego set): https://rootly.com/demo/ Articles How Razorpay’s Notification Service… Continue reading SRE Weekly Issue #332
Five security principles for billions of messages across Meta’s apps
At Meta, our messaging apps help billions of people around the world stay connected to those who matter most to them. This scale brings potential threats from criminals and hackers, so we have a responsibility to keep people and their data safe. We’re sharing a set of principles to ensure that security is central to… Continue reading Five security principles for billions of messages across Meta’s apps
Programming languages endorsed for server-side use at Meta
– Supporting a programming language at Meta is a very careful and deliberate decision.
– We’re sharing our internal programming language guidance that helps our engineers and developers choose the best language for their projects.
– Rust is the latest addition to Meta’s list of supported server-side languages.
At Meta, we use many different programming… Continue reading Programming languages endorsed for server-side use at Meta
Launching Instagram Messaging on desktop
In 2020 we launched Instagram Messaging (referred to in this post simply as “Messaging”) for personal desktop computers. We believe that this feature will improve everyday experiences and enable new use cases for all of our desktop web users. In this post, we go through some of our overall learnings from our desktop users, and… Continue reading Launching Instagram Messaging on desktop
It’s time to leave the leap second in the past
The leap second concept was first introduced in 1972 by the International Earth Rotation and Reference Systems Service (IERS) in an attempt to periodically update Coordinated Universal Time (UTC) due to imprecise observed solar time (UT1) and the long-term slowdown in the Earth’s rotation. This periodic adjustment mainly benefits scientists and astronomers as it allows… Continue reading It’s time to leave the leap second in the past