PVF: A novel metric for understanding AI systems’ vulnerability against SDCs in model parameters

We’re introducing parameter vulnerability factor (PVF), a novel metric for understanding and measuring AI systems’ vulnerability against silent data corruptions (SDCs) in model parameters. PVF can be tailored to different AI models and tasks, adapted to different hardware faults, and even extended to the training phase of AI models. We’re sharing results of our own…

Categorized as Technology

SRE Weekly Issue #429

A message from our sponsor, FireHydrant: We’ve gone all out on our new integration with Microsoft Teams. If you’re an MS Teams user, FireHydrant now supports the most comprehensive integration for incident management. Run the entire IM process without ever leaving the chat. https://firehydrant.com/blog/introducing-a-brand-new-microsoft-teams-integration/ Virtualizing Our Storage Engine Time to get down…

Categorized as SRE

25 Productivity Tools that Power Salesforce Engineering Teams

In this special edition of “Engineering Energizers,” we’re celebrating Salesforce’s 25th anniversary by showcasing 25 key productivity tools favored by leading engineers at Salesforce across India, the U.S., Israel, and Argentina. Explore the essential tools these experts rely on to enhance their productivity, tackle complex problems, and elevate innovation. 1. SLACK: A productivity platform…

Categorized as Technology

How Meta trains large language models at scale

As we continue to focus our AI research and development on solving increasingly complex problems, one of the most significant and challenging shifts we’ve experienced is the sheer scale of computation required to train large language models (LLMs). Traditionally, our AI model training has involved training a massive number of models that required a comparatively…

Categorized as Technology

Maintaining large-scale AI capacity at Meta

Meta is currently operating many data centers with GPU training clusters across the world. Our data centers are the backbone of our operations, meticulously designed to support the scaling demands of compute and storage. A year ago, however, as the industry reached a critical inflection point due to the rise of artificial intelligence (AI), we…

Categorized as Technology

Unlocking the power of mixed reality devices with MobileConfig

MobileConfig enables developers to centrally manage a mobile app’s configuration parameters in our data centers. Once a parameter value is changed on our central server, billions of app devices automatically fetch and apply the new value without app updates. These remotely managed configuration parameters serve various purposes such as A/B testing, feature rollout, and app…

Categorized as Technology

Sales Cloud’s AI Transformation: Welcome to the Autonomous Selling Era

In our enlightening “Engineering Energizers” Q&A series, we explore the transformative experiences of engineers who have pioneered advancements in their fields. Today, we meet Parul Jain, Vice President of Software Engineering at Salesforce, who steers AI innovations within Sales Cloud. Her team is dedicated to developing a fully autonomous selling experience by seamlessly integrating AI…

Categorized as Technology

SRE Weekly Issue #428

The Reverse Red Herring This article presents in…

Categorized as SRE