A message from our sponsor, StackHawk: Join the GraphQL Security Testing Learning Lab on June 29 at 9 AM PT. Learn how to run automated security testing against your GraphQL APIs so you can find and fix vulnerabilities fast. http://sthwk.com/graphql-learning-lab Articles Chicken Soup for the SLO The last section suggests selling SLOs… Continue reading SRE Weekly Issue #274
Blog
A Deep Dive on Text Classification at Salesforce
We’re excited to announce that Noah Burbank, a Principal Data Scientist in Sales Cloud, has recently published a deep dive into text classification at Salesforce on Towards Data Science. The article, How to choose the right model for text classification in an organizational setting,…
116. Success From Anywhere
COVID-19 has created massive changes to the way we work, not only bringing the remote work experience to the masses but creating an opportunity to redesign offices to suit flex workers. In this week’s podcast episode, Greg Nokes welcomes Lisa Marshall, Senior Vice President of Technology, People, Innovation & Learning at Salesforce. Marshall shares…
API Federation: growing scalable API landscapes
Organizations embrace micro-services and event-driven APIs in their technology platforms to try to achieve the promise of greater agility, increased innovation, and more autonomy for their development teams. However, after the initial success, it is not unusual for organizations to face difficulties when they try to scale their distributed platforms. At this point, with the…
SRE Weekly Issue #273
A message from our sponsor, StackHawk: StackHawk is helping One Medical equip developers with automated security testing and self-service remediations. See how: http://sthwk.com/onemedical Articles Incident Management vs. Incident Response What indeed? It depends on who you ask. Quentin Rousseau — Rootly Cores that don’t count This academic paper explains Google’s efforts toward…
How Facebook deals with PCIe faults to keep our data centers running reliably
Peripheral component interconnect express (PCIe) hardware continues to push the boundaries of computing thanks to advances in transfer speeds, the number of available lanes for simultaneous data delivery, and a comparatively small footprint on motherboards. Today, PCIe connectivity-based hardware delivers faster data transfers and is one of the de facto methods to connect components to…
Real-time Einstein Insights Using Kafka Streams
Sales representatives deal with hundreds of emails every day. To help them prioritize, Salesforce offers critical insights on emails received. These insights are either generated by our deep learning models or defined by the customer by matching keywords using regular expressions. Insights are generated in real time in our microservice architecture, which is built using Kafka…
Risk-driven backbone management during COVID-19 and beyond
What the research is: A first-of-its-kind study detailing our backbone management strategy to ensure high service performance throughout the COVID-19 pandemic. The pandemic moved most social interactions online and caused an unprecedented stress test on our global network infrastructure with tens of data center regions. At this scale, failures such as fiber cuts, router misconfigurations,…
SRE Weekly Issue #282
A message from our sponsor, StackHawk: ICYMI ZAP Creator and Project Lead Simon Bennetts recently unveiled ZAP’s new automation framework. Watch the session and see how it works: https://sthwk.com/Automation-Framework Articles A thorough introduction to bpftrace I really need to learn bpftrace, and this article is a great place to start. Brendan Gregg…
How we built a general purpose key value store for Facebook with ZippyDB
ZippyDB is the largest strongly consistent, geographically distributed key-value store at Facebook. Since we first deployed ZippyDB in 2012, this key-value store has expanded rapidly, and today, ZippyDB serves a number of use cases, ranging from metadata for a distributed filesystem, counting events for both internal and external purposes, to product data that’s used for…