SRE Netflix at SRECon

190 Countries and 5 CORE SREs, by Jonah Horowitz
How does Netflix scale SRE? How do we manage over 70 million customers around the world without a 24/7 operations center? With tens of thousands of Linux instances in a distributed system architecture, and thousands of daily production changes, it’s an environment that’s both challenging and…

Open-sourcing Thrift for Haskell

What it is: Thrift is a serialization and remote procedure call (RPC) framework used for cross-service communication. Most services at Facebook communicate via Thrift because it provides a simple, language-agnostic protocol for communicating with structured data. Thrift can already be used in programming languages such as C++, Python, and Java using fbthrift. We are also…

Pegasus Data Language: Evolving schema definitions for data modeling

Pegasus Data Schema (PDSC) is a Pegasus schema definition language that has been used for data modeling with Rest.li services for years. It’s the underlying language that helps define data models, describe the data returned by REST endpoints, and generate derivative schemas for other uses, such as XML schemas and various database schemas. However, writing…

Creating a secure and trusted Jobs ecosystem on LinkedIn

Co-authors: Sakshi Jain, Grace Tang, Gaurav Vashist, Yu Wang, John Lu, Ravish Chhabra, Shruti Sharma, Dana Tom, and Ranjeet Ranjan
LinkedIn’s vision is to connect every member of the global workforce to economic opportunity. A key driver towards this vision is our world-class hiring marketplace, where we help job seekers find their next dream role…

How machine learning powers Facebook’s News Feed ranking algorithm

Designing a personalized ranking system for more than 2 billion people (all with different interests) and a plethora of content to select from presents significant, complex challenges. This is something we tackle every day with News Feed ranking. Without machine learning (ML), people’s News Feeds could be flooded with content they don’t find as relevant…

Smart Argument Suite: Seamlessly connecting Python jobs

Co-authors: Jun Jia and Alice Wu
It’s a very common scenario that an AI solution involves composing different jobs, such as data processing and model training or evaluation, into workflows and then submitting them to an orchestration engine for execution. At large companies such as LinkedIn, there may be hundreds of thousands of such…

Budget-split testing: A trustworthy and powerful approach to marketplace A/B testing

Co-authors: Min Liu, Vangelis Dimopoulos, Elise Georis, Jialiang Mao, Di Luo, and Kang Kang
The LinkedIn ecosystem drives member and customer value through a series of marketplaces (e.g., the ads marketplace and the talent marketplace). We maximize that value by making data-informed product decisions via A/B testing. Traditional A/B tests on our marketplaces, however, are…

How LinkedIn turned to real-time feedback for developer tooling

Over the last year, we have been using real-time feedback to evolve our tooling and provide a more productive experience for LinkedIn’s developers. It’s helped us double our feedback participation, and more importantly, better tailor our recommendations and improvements. For any engineering organization looking to improve developer experiences, the following questions will provide a good…

FastIngest: Low-latency Gobblin with Apache Iceberg and ORC format

Co-authors: Zihan Li, Sudarshan Vasudevan, Lei Sun, and Shirshanka Das
Data analytics and AI power many business-critical use cases at LinkedIn. We need to ingest data in a timely and reliable way from a variety of sources, including Kafka, Oracle, and Espresso, bringing it into our Hadoop data lake for subsequent processing by AI and…