{"id":599,"date":"2022-06-14T16:00:24","date_gmt":"2022-06-14T16:00:24","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2022\/06\/14\/applying-federated-learning-to-protect-data-on-mobile-devices\/"},"modified":"2022-06-14T16:00:24","modified_gmt":"2022-06-14T16:00:24","slug":"applying-federated-learning-to-protect-data-on-mobile-devices","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2022\/06\/14\/applying-federated-learning-to-protect-data-on-mobile-devices\/","title":{"rendered":"Applying federated learning to protect data on mobile devices"},"content":{"rendered":"<h2><span>What the research is:<\/span><\/h2>\n<p><span>Federated learning with differential privacy (FL-DP) is one of the latest privacy-enhancing technologies being evaluated at Meta as we constantly work to enhance user privacy and further safeguard users\u2019 data in the products we design, build, and maintain.<\/span><\/p>\n<p><span>FL-DP enhances privacy in two important ways:<\/span><\/p>\n<p><span>It allows machine learning (ML) models to be trained in a distributed way so that users\u2019 data remains on their mobile devices.<\/span><br \/>\n<span>It adds noise to reduce the risk of an ML model memorizing user data.<\/span><\/p>\n<p><span>The benefits of FL-DP come with unique challenges that cannot be solved through conventional ML tools and practices. As such, we\u2019ve developed a new system architecture and methodology capable of successfully addressing these challenges. Such an approach could enhance user privacy while still facilitating an intelligent, safe, and intuitive user experience across Meta\u2019s family of technologies.<\/span><\/p>\n<h2><span>How it works:<\/span><\/h2>\n<p><span>With FL-DP, ML models are trained in a federated manner where mobile devices learn locally. A global ML model is updated with these localized learnings only after noise is added, through a process called differential privacy. 
Differential privacy is an important step, as it is the best-known strategy for preventing ML models from memorizing training data, even in the most extreme scenarios (e.g., <\/span><a href=\"https:\/\/arxiv.org\/abs\/2202.07623\"><span>reconstruction attacks<\/span><\/a><span>).<\/span><\/p>\n<p><span>However, training ML models in this fashion does come with challenges that are both new and different from those of more conventional, centralized ML models, including:<\/span><\/p>\n<p><span>Label balancing, feature normalization, and metrics calculation due to lack of data visibility<\/span><br \/>\n<span>Slower mobile release cycles as compared with back-end release cycles<\/span><br \/>\n<span>Slower training due to the federation of training to mobile devices<\/span><br \/>\n<span>Anonymized system logging to keep data private\u00a0<\/span><\/p>\n<p><span>To address these challenges, we\u2019ve designed an architecture and methodology that\u2019s influenced by real-world applications of ML and allows model training to combine server-side user data with device-side-only user data to deliver inferences. Device-side-only user data remains on users\u2019 devices. This architecture is a combination of infrastructure across mobile devices, trusted execution environments, and conventional back-end servers.\u00a0<\/span><\/p>\n<p><span>We validated this architecture using an in-house FL library that is compatible with Meta\u2019s family of apps (such as Facebook and Instagram) and that has the potential to scale training to millions of devices and inferences to billions of devices. We compared this approach with conventional server-trained models and saw minimal degradation of model performance without transgressing constraints of limited on-device compute, storage, and power resources.<\/span><\/p>\n<p><span>In designing our infrastructure architecture for FL with differential privacy, the idea was to improve developer efficiency. 
While there has been a<\/span><a href=\"https:\/\/arxiv.org\/pdf\/2111.04877.pdf\"> <span>good deal of research<\/span><\/a><span> on enabling successful and efficient model training, there hasn\u2019t been as much attention paid to the auxiliary core infrastructure components needed for speedy tuning and scalable deployment at inference time. This is why we chose to focus on the overall architectural design and integration.\u00a0<\/span><\/p>\n<p><span>Here\u2019s a diagram of the major components of the system:<\/span><\/p>\n\n<h2><span>What\u2019s next:<\/span><\/h2>\n<p><span>While this architecture is capable of successfully training and deploying production FL models, there are several challenges left for future work. Developer speed, in particular, remains one of the largest barriers to scaling production-grade federated ML. Current iterations of model development are several orders of magnitude slower compared with similar-sized undertakings within a centralized environment. We look forward to iterating on our architecture.\u00a0<\/span><\/p>\n<h2><span>Read the full paper:<\/span><\/h2>\n<p><a href=\"https:\/\/arxiv.org\/abs\/2206.00807\"><span>Applied federated learning: Architectural design for robust and efficient learning in privacy aware settings<\/span><\/a><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2022\/06\/14\/production-engineering\/federated-learning-differential-privacy\/\">Applying federated learning to protect data on mobile devices<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Engineering at Meta<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>What the research is: Federated learning with differential privacy (FL-DP) is one of the latest privacy-enhancing technologies being evaluated at Meta as we constantly work to enhance user privacy and further safeguard users\u2019 data in the products we design, build, and maintain. 
FL-DP enhances privacy in two important ways: It allows machine learning (ML) models&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2022\/06\/14\/applying-federated-learning-to-protect-data-on-mobile-devices\/\">Continue reading <span class=\"screen-reader-text\">Applying federated learning to protect data on mobile devices<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-599","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":559,"url":"https:\/\/fde.cat\/index.php\/2022\/03\/30\/how-meta-enables-de-identified-authentication-at-scale\/","url_meta":{"origin":599,"position":0},"title":"How Meta enables de-identified authentication at scale","date":"March 30, 2022","format":false,"excerpt":"Data minimization \u2014 collecting the minimum amount of data required to support our services \u2014 is one of our core principles at Meta as we continue developing new privacy-enhancing technologies (PETs). We are constantly seeking ways to improve privacy and protect user data on our family of products. 
Previously, we\u2019ve\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":854,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/12\/innovating-tableau-pulse-hurdling-ai-integration-and-scalability-obstacles-for-next-gen-customer-insights\/","url_meta":{"origin":599,"position":1},"title":"Innovating Tableau Pulse: Hurdling AI Integration and Scalability Obstacles for Next-Gen Customer Insights","date":"April 12, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we examine the inspiring paths of engineering leaders who have made remarkable strides in their respective fields. Today, we meet Harini Nallan Chakrawarthy, Vice President of Software Engineering, who leads the development of Tableau Pulse. This new Salesforce feature that uses generative AI to\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":843,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/21\/threads-has-entered-the-fediverse\/","url_meta":{"origin":599,"position":2},"title":"Threads has entered the fediverse","date":"March 21, 2024","format":false,"excerpt":"Threads has entered the fediverse! As part of our beta experience, now available in a few countries, Threads users aged 18+ with public profiles can now choose to share their Threads posts to other ActivityPub-compliant servers. 
People on those servers can now follow federated Threads profiles and see, like, reply\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":848,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/01\/unveiling-the-cutting-edge-features-of-ml-console-for-ai-model-lifecycle-management\/","url_meta":{"origin":599,"position":3},"title":"Unveiling the Cutting-Edge Features of ML Console for AI Model Lifecycle Management","date":"April 1, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we explore the journeys of engineering leaders who have made remarkable contributions in their fields. Today, we meet Venkat Krishnamani, a Lead Member of the Technical Staff for Salesforce Engineering and the lead engineer for Salesforce Einstein\u2019s Machine Learning (ML) Console. This vital tool\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":315,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/api-federation-growing-scalable-api-landscapes\/","url_meta":{"origin":599,"position":4},"title":"API Federation: growing scalable API landscapes","date":"August 31, 2021","format":false,"excerpt":"Organizations embrace micro-services and event-driven APIs in their technology platforms to try to achieve the promise of greater agility, increased innovation, and more autonomy for their development teams. 
However, after the initial success, it is not unusual for organizations to face difficulties when they try to scale their distributed platforms.\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":229,"url":"https:\/\/fde.cat\/index.php\/2021\/02\/02\/ml-lake-building-salesforces-data-platform-for-machine-learning\/","url_meta":{"origin":599,"position":5},"title":"ML Lake: Building Salesforce\u2019s Data Platform for Machine Learning","date":"February 2, 2021","format":false,"excerpt":"Salesforce uses machine learning to improve every aspect of its product suite. With the help of Salesforce Einstein, companies are improving productivity and accelerating key decision-making. Data is a critical component of all machine learning applications and Salesforce is no exception. In this post I will share some unique challenges\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/599","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=599"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/599\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=599"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=599"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=599"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}