{"id":892,"date":"2024-07-08T21:29:33","date_gmt":"2024-07-08T21:29:33","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2024\/07\/08\/unlocking-data-clouds-secret-for-scaling-massive-data-volumes-and-slashing-processing-bottlenecks\/"},"modified":"2024-07-08T21:29:33","modified_gmt":"2024-07-08T21:29:33","slug":"unlocking-data-clouds-secret-for-scaling-massive-data-volumes-and-slashing-processing-bottlenecks","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2024\/07\/08\/unlocking-data-clouds-secret-for-scaling-massive-data-volumes-and-slashing-processing-bottlenecks\/","title":{"rendered":"Unlocking Data Cloud\u2019s Secret for Scaling Massive Data Volumes and Slashing Processing Bottlenecks"},"content":{"rendered":"<p>In our Engineering Energizers Q&amp;A series, we explore engineers who have pioneered advancements in their fields. Today, we meet Rahul Singh, Vice President of Software Engineering, leading the India-based <a href=\"https:\/\/www.salesforce.com\/form\/demo\/data-cloud-demo\/?d=7013y000002Exl2AAC&amp;nc=7013y000002EyXVAA0&amp;utm_content=7013y000002Exl2AAC&amp;utm_source=google&amp;utm_medium=paid_search&amp;utm_campaign=21134104451&amp;utm_adgroup=161913224604&amp;utm_term=salesforce%20data%20cloud&amp;utm_matchtype=e&amp;gad_source=1&amp;gclid=Cj0KCQjw7ZO0BhDYARIsAFttkCgFGjKFTtFoBODO59mtsmxaeAwFnHvjh1qUqHhEe-UoSGID5fcQqCsaAsl9EALw_wcB&amp;gclsrc=aw.ds\">Data Cloud<\/a> team. 
His team is focused on delivering a robust, scalable, and efficient <a href=\"https:\/\/engineering.salesforce.com\/inside-data-clouds-secret-formula-for-processing-one-quadrillion-records-monthly\/\">Data Cloud<\/a> platform that consolidates customer data to enhance business insights and personalize customer interactions, meeting the diverse needs of their customers.<\/p>\n<p>Discover how Rahul\u2019s team tackled major technical challenges \u2014 including optimizing platform scalability, reducing processing times, and managing high transaction rates \u2014 to deliver high-performance solutions\u2026<\/p>\n<h5 class=\"wp-block-heading\"><strong>What is your Data Cloud team\u2019s mission?<\/strong><\/h5>\n<p>A significant part of our charter is to handle massive scale efficiently. <strong>We optimize our platform to support high transaction rates and large volumes of data with high reliability, without compromising on performance<\/strong>. This involves implementing advanced technologies and innovative solutions that enhance our platform\u2019s capabilities.<\/p>\n<p>We are also committed to contributing to the open-source community. By sharing our advancements and collaborating with other experts, we drive further innovation and bring valuable insights back into our projects. This helps us stay ahead of the curve and ensures our solutions are cutting-edge.<\/p>\n<p><em>Rahul explains why Salesforce\u2019s culture is unique.<\/em><\/p>\n<h5 class=\"wp-block-heading\"><strong>What were some of the major technical challenges your team faced with Data Cloud?<\/strong><\/h5>\n<p>One major challenge our team faced was enhancing our platform\u2019s efficiency to manage large-scale operations. A key example involved one of our largest non-banking financial customers in India. 
They required end-to-end data processing capabilities within a very stringent 30- to 40-minute timeframe, despite our usual SLA being one to two hours per module.<\/p>\n<p>This client serves nearly 500 million customers, necessitating rapid data processing from ingestion through to segmentation and activation in Data Cloud.<\/p>\n<p>Additionally, we had to manage and orchestrate workflows across various services in Data Cloud. This required us to implement special tweaks and changes to our existing infrastructure to handle higher demand more efficiently.<\/p>\n<h5 class=\"wp-block-heading\"><strong>How did your team overcome those challenges?<\/strong><\/h5>\n<p>To meet the stringent SLA requirements of that customer, we had to make several optimizations across different parts of the platform. Our team collaborated extensively to address not just the individual SLA components but the overall end-to-end SLA.<\/p>\n<p>This involved deep dives into various services to identify and eliminate bottlenecks. This ensured we could meet the required performance standards while bringing the entire process down to the required time frame.<\/p>\n<p>Additionally, we implemented special optimizations to our infrastructure to handle bigger workloads more efficiently. This collaborative and detailed approach allowed us to support their scale needs with modern architecture. Ultimately, these efforts made it one of the largest and most complex sets of Data Cloud workloads we\u2019ve had in India.<\/p>\n<p>Lastly, by optimizing resource allocation and leveraging technologies such as autoscaling (horizontal and vertical), better price-performance compute (e.g. 
AWS Graviton) and spot\/reserved instances, we designed highly cost-effective solutions without compromising on performance.<\/p>\n<div class=\"wp-block-group is-layout-constrained wp-block-group-is-layout-constrained\">\n<div class=\"wp-block-group is-layout-constrained wp-block-group-is-layout-constrained\">\n<h5 class=\"wp-block-heading\"><strong>Diving deeper, what technical adjustments have you made to optimize the performance of Data Cloud?<\/strong><\/h5>\n<div class=\"wp-block-group is-layout-constrained wp-container-core-group-is-layout-1 wp-block-group-is-layout-constrained\">\n<p>We made several key adjustments:<\/p>\n<p><strong>Enhancing the orchestration of workflows<\/strong>. Every service inside Data Cloud must interact with our platform to run specific workloads. We implemented several critical changes to optimize this interaction.<\/p>\n<p><strong>Streamlining tenant provisioning<\/strong>. By refining this process, we reduced the time to set up and configure new tenants, ensuring faster onboarding.<\/p>\n<p><strong>Improving the admin services control plane<\/strong>. We enhanced how administrative tasks were managed, allowing our system to handle higher demand more effectively. These improvements ensured that administrative processes did not become bottlenecks, particularly under heavy load.<\/p>\n<p><strong>Optimizing resource allocation and workload management<\/strong>. This included implementing tweaks to how workloads were scheduled and executed, ensuring optimal performance even under peak demand.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<p>By making these adjustments, we managed workloads more efficiently and met the stringent end-to-end SLAs required by the Indian customer. 
These optimizations were crucial in supporting large-scale operations and maintaining high performance standards across <a href=\"https:\/\/engineering.salesforce.com\/inside-data-clouds-secret-formula-for-processing-one-quadrillion-records-monthly\/\">Data Cloud<\/a>.<\/p>\n<h5 class=\"wp-block-heading\"><strong>What role does customer feedback play in Data Cloud\u2019s optimization process?<\/strong><\/h5>\n<p>Working directly with customers provides us with real-world insights into their specific needs and challenges as they seek to scale their businesses. This close collaboration enables us to tailor our solutions to meet their requirements.<\/p>\n<p>For example, a customer\u2019s feedback on latency and processing times pushed us to re-evaluate and enhance our platform\u2019s performance. By understanding their pain points and operational demands, we were able to make targeted improvements that significantly boosted efficiency.<\/p>\n<p>Furthermore, regular feedback loops with our customers allowed us to stay aligned with their evolving needs. This continuous dialogue ensured that our optimizations were not just reactive but also proactive, anticipating future challenges and opportunities.<\/p>\n<h5 class=\"wp-block-heading\"><strong>What innovations have you implemented in Data Cloud to manage workload distribution and improve data processing efficiency?<\/strong><\/h5>\n<p>One significant innovation was <strong>the development of a sophisticated orchestration mechanism that dynamically manages workload distribution based on real-time signals<\/strong>. This mechanism ensured optimal resource utilization and minimized processing times, allowing us to handle large volumes of data more efficiently.<\/p>\n<p>We have advanced our workload management by implementing techniques that leverage metadata and table statistics to analyze data volumes. 
This strategic approach has significantly enhanced the efficiency and speed of our data processing tasks, enabling more precise workload categorization. Consequently, our platform can now handle complex data sets with greater agility and accuracy.<\/p>\n<p>Our Data Cloud control plane design is continuously evolving to scale seamlessly and meet the increasing demands of our microservices. It supports a variety of functions, from processing large-scale data analytics with Spark, running complex queries with Trino &amp; Hyper, to managing unstructured multi-modal use cases with our newly introduced vector database. Our platform is equipped to handle any workload, from small Kubernetes jobs to large-scale deployments that utilize thousands of nodes of various types, ensuring elasticity.<\/p>\n<p>To support this scalability, we utilize horizontal scaling of compute resources beyond the confines of a single AWS account and auto-scaling of EKS clusters. These capabilities are bolstered by efficient load balancing and request routing algorithms, catering to the growing needs of Data Cloud Everywhere.<\/p>\n<p>We are dedicated to optimizing and developing new resource management and allocation algorithms, fine-tuning them to ensure that our compute resources are used most effectively, particularly in high-demand scenarios. This continuous enhancement fosters improved performance and adaptability.<\/p>\n<h5 class=\"wp-block-heading\"><strong>How does your team ensure the scalability of Data Cloud to meet future demands?<\/strong><\/h5>\n<p>Scalability is central to our design philosophy. 
We constantly test our systems under various stress scenarios to ensure they can handle future demands, identifying bottlenecks and fine-tuning for optimal performance.<\/p>\n<p>In line with Salesforce\u2019s vision, our goal is to deploy Data Cloud across all regions, <strong>preparing the platform to scale up to 100 times its current capacity.<\/strong> This strategic expansion is designed to develop solutions that are not only effective today but also capable of supporting significantly larger workloads in the future.<\/p>\n<p>To achieve this, we are leveraging public cloud infrastructure such as <a href=\"https:\/\/engineering.salesforce.com\/hyperforce-behind-the-scenes-ushering-in-a-new-age-of-ai-driven-cloud-scalability\/\">Hyperforce<\/a>. This allows us to dynamically and elastically scale our resources, ensuring that our platform can grow seamlessly alongside our customers\u2019 evolving needs. This approach guarantees that as customer demands increase, our platform remains robust and responsive.<\/p>\n<p>This flexibility enables efficient resource allocation and quick responses to changing demands. Additionally, continuous monitoring and analysis of system performance allow us to make necessary adjustments, maintaining scalability and robustness to meet future challenges effectively.<\/p>\n<p><em>Rahul shares why engineers should join Salesforce.<\/em><\/p>\n<h5 class=\"wp-block-heading\"><strong>How does your team balance innovation with maintaining stability and reliability in Data Cloud?<\/strong><\/h5>\n<p>Balancing innovation with stability is indeed a challenge. We achieve this through rigorous testing and a phased implementation approach. Before rolling out any new feature or optimization, we conduct extensive testing in controlled environments to ensure it doesn\u2019t disrupt existing functionalities.<\/p>\n<p>Our testing process involves multiple stages. Initially, new features are tested in isolated environments to identify potential issues. 
Once stable, they are gradually introduced into broader testing phases, simulating real-world conditions to verify performance and reliability.<\/p>\n<p>Additionally, we actively collect continuous feedback from our customers during the pilot phase, where they serve as early adopters. This process allows us to gather valuable insights and pinpoint areas that require refinement. Establishing this feedback loop is crucial for ensuring the stability of new features before full-scale deployment.<\/p>\n<p>Lastly, we focus on monitoring and analytics. Continuous monitoring allows us to quickly detect and address stability issues, maintaining high reliability while advancing innovation.<\/p>\n<div class=\"wp-block-group is-layout-constrained wp-container-core-group-is-layout-4 wp-block-group-is-layout-constrained\">\n<h5 class=\"wp-block-heading\"><strong>Learn More<\/strong><\/h5>\n<p>Hungry for more Data Cloud stories? Read <a href=\"https:\/\/engineering.salesforce.com\/inside-data-clouds-secret-formula-for-processing-one-quadrillion-records-monthly\/\">this blog<\/a> to learn about Data Cloud\u2019s secret formula for processing one quadrillion records monthly.<\/p>\n<p>Stay connected \u2014 join our <a href=\"https:\/\/flows.beamery.com\/salesforce\/eng-social-2023\">Talent Community<\/a>!<\/p>\n<p>Check out our <a href=\"https:\/\/www.salesforce.com\/company\/careers\/teams\/tech-and-product\/?d=cta-tms-tp-2\">Technology and Product<\/a> teams to learn how you can get involved.<\/p>\n<\/div>\n<p>The post <a href=\"https:\/\/engineering.salesforce.com\/unlocking-data-clouds-secret-for-scaling-massive-data-volumes-and-slashing-processing-bottlenecks\/\">Unlocking Data Cloud\u2019s Secret for Scaling Massive Data Volumes and Slashing Processing Bottlenecks<\/a> appeared first on <a href=\"https:\/\/engineering.salesforce.com\/\">Salesforce Engineering Blog<\/a>.<\/p>\n<p><a 
href=\"https:\/\/engineering.salesforce.com\/unlocking-data-clouds-secret-for-scaling-massive-data-volumes-and-slashing-processing-bottlenecks\/\" target=\"_blank\" class=\"feedzy-rss-link-icon\" rel=\"noopener\">Read More<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>In our Engineering Energizers Q&amp;A series, we explore engineers who have pioneered advancements in their fields. Today, we meet Rahul Singh, Vice President of Software Engineering, leading the India-based Data Cloud team. His team is focused on delivering a robust, scalable, and efficient Data Cloud platform that consolidates customer data to enhance business insights and&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2024\/07\/08\/unlocking-data-clouds-secret-for-scaling-massive-data-volumes-and-slashing-processing-bottlenecks\/\">Continue reading <span class=\"screen-reader-text\">Unlocking Data Cloud\u2019s Secret for Scaling Massive Data Volumes and Slashing Processing Bottlenecks<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-892","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":692,"url":"https:\/\/fde.cat\/index.php\/2023\/03\/22\/how-is-indias-brilliant-big-data-processing-team-engineering-salesforce-data-cloud\/","url_meta":{"origin":892,"position":0},"title":"How is India\u2019s Brilliant Big Data Processing Team Engineering Salesforce Data Cloud?","date":"March 22, 2023","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we examine the life experiences and career paths that have shaped Salesforce engineering leaders. Meet Archana Kumari, one of Salesforce\u2019s first India-based woman engineering leaders. 
In her role, Archana leads Salesforce India\u2019s Data Cloud big data processing compute layer team \u2014 charged with providing\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":828,"url":"https:\/\/fde.cat\/index.php\/2024\/02\/20\/unlocking-hyperforce-migration-innovative-solutions-for-a-smooth-transition-to-the-cloud\/","url_meta":{"origin":892,"position":1},"title":"Unlocking Hyperforce Migration: Innovative Solutions for a Smooth Transition to the Cloud","date":"February 20, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we delve into the experiences and expertise of Salesforce Engineering leaders. Today, we\u2019re meeting Mahamadou Sylla, a Senior Member of the Technical Staff at Salesforce Engineering. Mahamadou is a key member of our Hyperforce\u2019s Bill of Materials (BOM) team, which assists internal teams in\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":694,"url":"https:\/\/fde.cat\/index.php\/2023\/03\/23\/big-data-processing-driving-data-migration-for-salesforce-data-cloud\/","url_meta":{"origin":892,"position":2},"title":"Big Data Processing: Driving Data Migration  for Salesforce Data Cloud","date":"March 23, 2023","format":false,"excerpt":"The tsunami of data \u2014 set to exceed 180 zettabytes by 2025 \u2014 places significant pressure on companies. Simply having access to customer information is not enough \u2014 companies must also analyze and refine the data to find actionable pieces that power new business. 
As businesses collect these volumes of\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":896,"url":"https:\/\/fde.cat\/index.php\/2024\/07\/16\/the-unstructured-data-dilemma-how-data-cloud-handles-250-trillion-transactions-weekly\/","url_meta":{"origin":892,"position":3},"title":"The Unstructured Data Dilemma: How Data Cloud Handles 250 Trillion Transactions Weekly","date":"July 16, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we delve into the journeys of engineering leaders who have made notable strides in their areas of expertise. This edition features Adithya Vishwanath, Vice President of Software Engineering at Salesforce. He leads the Data Cloud team, a pivotal platform that integrates diverse data sources,\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":860,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/24\/inside-data-clouds-secret-formula-for-processing-one-quadrillion-records-monthly\/","url_meta":{"origin":892,"position":4},"title":"Inside Data Cloud\u2019s Secret Formula for Processing One Quadrillion Records Monthly","date":"April 24, 2024","format":false,"excerpt":"In our \u201cEngineering Energizers\u201d Q&A series, we explore the inspiring journeys of engineering leaders who have significantly advanced their fields. Today, we meet Soumya KV, who spearheads the development of the Data Cloud\u2019s internal apps layer at Salesforce. 
Her India-based team specializes in advanced data segmentation and activation, enabling tailored\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":705,"url":"https:\/\/fde.cat\/index.php\/2023\/04\/18\/ai-based-identity-resolution-the-key-for-linking-diverse-customer-data\/","url_meta":{"origin":892,"position":5},"title":"AI-based Identity Resolution: The Key for Linking Diverse Customer Data","date":"April 18, 2023","format":false,"excerpt":"Companies want a comprehensive view of their customers, enabling them to solve business and marketing challenges, such as personalization, segmentation, and targeting \u2014 but they face an uphill battle as they are drowning in data. For example, many companies cannot match the identity of a customer who visits their website\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/892","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=892"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/892\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=892"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=892"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=892"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}