{"id":770,"date":"2023-10-10T21:16:49","date_gmt":"2023-10-10T21:16:49","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2023\/10\/10\/revealing-the-newest-data-science-tool-speeding-ai-development-and-securing-customer-data\/"},"modified":"2023-10-10T21:16:49","modified_gmt":"2023-10-10T21:16:49","slug":"revealing-the-newest-data-science-tool-speeding-ai-development-and-securing-customer-data","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2023\/10\/10\/revealing-the-newest-data-science-tool-speeding-ai-development-and-securing-customer-data\/","title":{"rendered":"Revealing the Newest Data Science Tool: Speeding AI Development and Securing Customer Data"},"content":{"rendered":"<p><em>by Chi Wang and Scott Nyberg<\/em><\/p>\n<p>In today\u2019s data-powered world, <a href=\"https:\/\/www.techopedia.com\/ai-powered-personalization-how-machine-learning-is-transforming-customer-experience\">leveraging customer data to improve AI capabilities remains key for providing highly personalized consumer experiences<\/a>. In fact, <a href=\"https:\/\/www.cisco.com\/c\/dam\/en_us\/about\/doing_business\/trust-center\/docs\/cisco-consumer-privacy-survey-2022.pdf\">43%<\/a> of customers believe AI has improved their lives, with 54% willing to provide their anonymized data to improve AI-related products. However, more than half of customers shared trepidations about how companies use their personal information in AI development.<\/p>\n<p>Salesforce\u2019s Interactive Data Science (IDS) team tackles this complex AI trust challenge by creating cutting-edge technologies that power <a href=\"https:\/\/www.salesforce.com\/products\/einstein-ai-solutions\/\">Salesforce Einstein AI<\/a> development. This includes their trusted notebook solution \u2014 a three-phase, context-based data access control process.<\/p>\n<p>Their solution reinforces trust \u2014 Salesforce\u2019s number one value \u2014 by implementing robust privacy measures. 
This empowers customers with full control over their data usage and ensures responsible and secure data handling.<\/p>\n<p>Additionally, the solution improves AI development efficiency \u2014 enabling Salesforce developers to provide high-quality information to customers faster than ever.<\/p>\n<h4 class=\"wp-block-heading\"><strong>Which factors led the IDS team to develop their trusted notebook solution?<\/strong><\/h4>\n<p>Historically, Salesforce AI data scientists faced several data approval processes and permission check mechanisms that often slowed model development. These challenges stemmed from various factors.<\/p>\n<p>First, Salesforce imposes a high trust benchmark for internal machine learning development with customer data. For example, Salesforce customers maintain full control over their data usage. This created a stringent and time-consuming data collection consent and approval process \u2014 incorporating legal reviews and external auditing \u2014 to ensure customer data remained protected.<\/p>\n<p>Additionally, numerous AI tools used during AI development required permission control for customer data, making data scientists\u2019 work more complex.<\/p>\n<p>Seeking to help data scientists better navigate these logistical hurdles and speed up their AI development, IDS pioneered their trusted notebook solution.<\/p>\n<h4 class=\"wp-block-heading\"><strong>What\u2019s the engine that powers IDS\u2019s trusted notebook solution?<\/strong><\/h4>\n<p>To address the challenges of acquiring customer data, the IDS team introduced a three-phase, highly simplified permission control workflow, as shown in the diagram below.<\/p>\n<p>Diving deeper, here is a closer look at each phase within the illustration:<\/p>\n<p><strong>Phase 1<\/strong>: A data scientist (Jennie) logs in to the notebook (NB) admin system using a web UI and then submits a data access request (steps 1, 2). 
Her request specifies the purpose (for what application or what task), tenants (which customers), and data sources (what types, categories, and locations of data). Finally, a data administrator (Tian) reviews and approves Jennie\u2019s request (step 3).<\/p>\n<p><strong>Phase 2<\/strong>: The NB admin system verifies Jennie\u2019s data request against customer consent and provides the compute resources (normally a <a href=\"https:\/\/jupyter.org\/\">JupyterLab<\/a> instance) that she needs for her AI development (step 4). As part of the resource provisioning, an auth (JWT) token is created that encapsulates the data scopes and data access purposes.<\/p>\n<p><strong>Phase 3<\/strong>: When Jennie writes code to explore customer data or prototype a model algorithm in her notebook instance (provisioned in steps 4 and 5), all outbound requests from her code attach the auth (JWT) token and go through the reverse proxy hosted in the notebook admin system (steps 6, 7). The notebook admin system restricts external interactions from Jennie\u2019s NB according to the data scope and service operation scope defined in the auth (JWT) token.<\/p>\n<p>Ultimately, this process generates constrained access tokens and validates them when Salesforce data scientists access customer data. 
Everything happens interactively and in real time, enabling the team to maintain a high trust level without sacrificing AI development efficiency.<\/p>\n<h4 class=\"wp-block-heading\"><strong>How does the data control process integrate with external AI platforms like AWS SageMaker?<\/strong><\/h4>\n<p>The team\u2019s three-phase data control process is designed to be versatile, delivering a seamless, secure, and inclusive experience for data scientists whether they work with Salesforce internal AI services, data storage, or external AI platforms such as <a href=\"https:\/\/aws.amazon.com\/sagemaker\/\">AWS SageMaker<\/a>.<\/p>\n<p>The process involves translating the approved data and service scope \u2014 such as the scope carried in the auth (JWT) token created during step 5 \u2014 into Identity and Access Management (IAM) roles and policies within an external machine learning platform.<\/p>\n<div class=\"wp-block-group is-layout-constrained wp-container-1 wp-block-group-is-layout-constrained\">\n<p>For example, when Jennie works in SageMaker, the notebook admin system provisions three key components:<\/p>\n<p>Compute resources and services in SageMaker<\/p>\n<p>An IAM role that represents Jennie and her approved service scope<\/p>\n<p>IAM policies that authorize Jennie to work in SageMaker<\/p>\n<\/div>\n<p>The team extends its trust enforcement principles into SageMaker by aligning the access scope defined in Jennie\u2019s project request with permission controls. This alignment is set up within AWS IAM. Check out the illustrative example below, which uses an AWS attribute-based access control policy to identify Jennie\u2019s project role and provide her access to an S3 folder. 
Additionally, the policy verifies her project application context against the resource\u2019s application context.<\/p>\n<p>Through the power of their trusted notebook solution, Salesforce\u2019s IDS team streamlines data access controls for Salesforce internal AI systems and synergizes them with external AI platforms. This empowers data scientists to deliver high-quality information to customers with increased speed while effectively safeguarding customer data. This innovation marks a major advancement in improving customer experiences through AI.<\/p>\n<h4 class=\"wp-block-heading\"><strong>Learn more<\/strong><\/h4>\n<p>To learn how Salesforce engineers develop ethical generative AI from the start, check out this <a href=\"https:\/\/www.salesforce.com\/uk\/news\/stories\/developing-ethical-ai\/\">blog post<\/a>.<\/p>\n<p>Stay connected \u2014 join our <a href=\"https:\/\/careers.mail.salesforce.com\/w2?cid=7017y00000CRDS7AAP\">Talent Community<\/a>!<\/p>\n<p><a href=\"https:\/\/www.salesforce.com\/company\/careers\/teams\/tech-and-product\/?d=cta-tms-tp-2\">Check out our Technology and Product teams<\/a> to learn how you can get involved.<\/p>\n<p>The post <a href=\"https:\/\/engineering.salesforce.com\/revealing-the-newest-data-science-tool-speeding-ai-development-and-securing-customer-data\/\">Revealing the Newest Data Science Tool: Speeding AI Development and Securing Customer Data<\/a> appeared first on <a href=\"https:\/\/engineering.salesforce.com\/\">Salesforce Engineering Blog<\/a>.<\/p>\n<p><a href=\"https:\/\/engineering.salesforce.com\/revealing-the-newest-data-science-tool-speeding-ai-development-and-securing-customer-data\/\" target=\"_blank\" class=\"feedzy-rss-link-icon\" rel=\"noopener\">Read More<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>by Chi Wang and Scott Nyberg In today\u2019s data-powered world, leveraging customer data to improve AI capabilities remains key for providing highly personalized consumer experiences. 
In fact, 43% of customers believe AI has improved their lives, with 54% willing to provide their anonymized data to improve AI-related products. However, more than half of customers shared&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2023\/10\/10\/revealing-the-newest-data-science-tool-speeding-ai-development-and-securing-customer-data\/\">Continue reading <span class=\"screen-reader-text\">Revealing the Newest Data Science Tool: Speeding AI Development and Securing Customer Data<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-770","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":884,"url":"https:\/\/fde.cat\/index.php\/2024\/06\/21\/how-einstein-copilot-sharpens-large-language-model-outputs-and-redefines-ai-data-testing\/","url_meta":{"origin":770,"position":0},"title":"How Einstein Copilot Sharpens Large Language Model Outputs and Redefines AI Data Testing","date":"June 21, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we explore the paths of engineering leaders who have attained significant accomplishments in their respective fields. 
Today, we spotlight Armita Peymandoust, Senior Vice President of Software Engineering at Salesforce, who spearheads the development of Einstein Copilot, a conversational AI assistant for CRM that integrates\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":896,"url":"https:\/\/fde.cat\/index.php\/2024\/07\/16\/the-unstructured-data-dilemma-how-data-cloud-handles-250-trillion-transactions-weekly\/","url_meta":{"origin":770,"position":1},"title":"The Unstructured Data Dilemma: How Data Cloud Handles 250 Trillion Transactions Weekly","date":"July 16, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we delve into the journeys of engineering leaders who have made notable strides in their areas of expertise. This edition features Adithya Vishwanath, Vice President of Software Engineering at Salesforce. He leads the Data Cloud team, a pivotal platform that integrates diverse data sources,\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":854,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/12\/innovating-tableau-pulse-hurdling-ai-integration-and-scalability-obstacles-for-next-gen-customer-insights\/","url_meta":{"origin":770,"position":2},"title":"Innovating Tableau Pulse: Hurdling AI Integration and Scalability Obstacles for Next-Gen Customer Insights","date":"April 12, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we examine the inspiring paths of engineering leaders who have made remarkable strides in their respective fields. Today, we meet Harini Nallan Chakrawarthy, Vice President of Software Engineering, who leads the development of Tableau Pulse. 
This new Salesforce feature that uses generative AI to\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":834,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/06\/how-the-new-einstein-1-platform-manages-massive-data-and-ai-workloads-at-scale\/","url_meta":{"origin":770,"position":3},"title":"How the New Einstein 1 Platform Manages Massive Data and AI Workloads at Scale","date":"March 6, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we feature Leo Tran, Chief Architect of Platform Engineering at Salesforce. With over 15 years of engineering leadership experience, Leo is instrumental in developing the Einstein 1 Platform. This platform integrates generative AI, data management, CRM capabilities, and trusted systems to provide businesses with\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":733,"url":"https:\/\/fde.cat\/index.php\/2023\/07\/11\/how-is-salesforce-einstein-optimizing-ai-classification-model-accuracy\/","url_meta":{"origin":770,"position":4},"title":"How is Salesforce Einstein Optimizing AI Classification Model Accuracy?","date":"July 11, 2023","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we examine the professional journeys that have shaped Salesforce Engineering leaders. Meet Matan Rabi, Senior Software Engineer on Salesforce Einstein\u2019s Machine Learning Observability Platform (MLOP) team. 
Matan and his team strive to optimize the accuracy of Einstein\u2019s AI classification models, empowering customers across industries\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":860,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/24\/inside-data-clouds-secret-formula-for-processing-one-quadrillion-records-monthly\/","url_meta":{"origin":770,"position":5},"title":"Inside Data Cloud\u2019s Secret Formula for Processing One Quadrillion Records Monthly","date":"April 24, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we explore the inspiring journeys of engineering leaders who have significantly advanced their fields. Today, we meet Soumya KV, who spearheads the development of the Data Cloud\u2019s internal apps layer at Salesforce. Her India-based team specializes in advanced data segmentation and activation, enabling tailored\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/770","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=770"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/770\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=770"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=770"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=770"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}