{"id":814,"date":"2024-01-18T17:00:42","date_gmt":"2024-01-18T17:00:42","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2024\/01\/18\/lazy-is-the-new-fast-how-lazy-imports-and-cinder-accelerate-machine-learning-at-meta\/"},"modified":"2024-01-18T17:00:42","modified_gmt":"2024-01-18T17:00:42","slug":"lazy-is-the-new-fast-how-lazy-imports-and-cinder-accelerate-machine-learning-at-meta","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2024\/01\/18\/lazy-is-the-new-fast-how-lazy-imports-and-cinder-accelerate-machine-learning-at-meta\/","title":{"rendered":"Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta"},"content":{"rendered":"<p><span>At Meta, the quest for faster model training has yielded an exciting milestone: the adoption of Lazy Imports and the Python Cinder runtime. <\/span><br \/>\n<span>The outcome? Up to 40 percent time to first batch (TTFB) improvements, along with a <\/span>20 percent<span> reduction in Jupyter kernel startup times. <\/span><br \/>\n<span>This advancement enables swifter experimentation and elevates the ML developer experience (DevX).<\/span><\/p>\n<p><span>Time is of the essence in the realm of machine learning (ML) development. 
The milliseconds it takes for an ML model to transition from conceptualization to processing the initial training data can dramatically impact productivity and experimentation.<\/span><\/p>\n<p><span>At Meta, we\u2019ve been able to significantly improve our model training times, as well as our overall developer experience (DevX), by adopting <\/span><a href=\"https:\/\/peps.python.org\/pep-0690\/\" target=\"_blank\" rel=\"noopener\"><span>Lazy Imports<\/span><\/a><span> and the <\/span><a href=\"https:\/\/github.com\/facebookincubator\/cinder\" target=\"_blank\" rel=\"noopener\"><span>Python Cinder runtime<\/span><\/a><span>.\u00a0<\/span><\/p>\n<h2><span>The time to first batch challenge<\/span><\/h2>\n<p><span>Batch processing has been a game changer in ML development. It handles large volumes of data in groups (or batches) and allows us to train models, optimize parameters, and perform inference more effectively and swiftly.<\/span><\/p>\n<p><span>But ML training workloads are notorious for their sluggish starts. When we look to improve our batch processing speeds, time to first batch (TTFB) comes into focus. TTFB is the time elapsed from the moment you hit the \u201cstart\u201d button on your ML model training to the point when the first batch of data enters the model for processing. It is a critical metric that determines the speed at which an ML model goes from idle to learning. TTFB can vary widely due to factors like infrastructure overhead and scheduling delays. 
But reducing TTFB means reducing the development waiting times that can often feel like an eternity to engineers \u2013 waiting periods that quickly accumulate into expensive resource waste.<\/span><\/p>\n<p><span>In the pursuit of faster TTFB, Meta set its sights on reducing this overhead, and Lazy Imports with Cinder emerged as a promising solution.<\/span><\/p>\n<h2><span>The magic of Lazy Imports<\/span><\/h2>\n<p><span>Previously, ML developers explored alternatives like the standard <\/span><span>LazyLoader<\/span><span> in <\/span><span>importlib<\/span><span> or <\/span><span>lazy-import<\/span><span> to defer explicit imports until necessary. While promising, these approaches are limited by their much narrower scope and by the need to manually select which dependencies will be lazily imported (often with suboptimal results). Using these approaches demands meticulous codebase curation and a fair amount of code refactoring.<\/span><\/p>\n<p><span>In contrast, <\/span><a href=\"https:\/\/developers.facebook.com\/blog\/post\/2022\/06\/15\/python-lazy-imports-with-cinder\/\"><span>Cinder\u2019s Lazy Imports<\/span><\/a><span> approach is a comprehensive and aggressive strategy that goes beyond the limitations of other libraries and delivers significant enhancements to the developer experience. Instead of painstakingly handpicking imports to become lazy, Cinder simplifies and accelerates the startup process by transparently deferring all imports by default, resulting in a much broader and more powerful deferral of imports until the exact moment they\u2019re needed. Once in place, this method ensures that developers no longer have to navigate the maze of selective import choices. With it, developers can bid farewell to the need for typing-only imports and the use of <\/span><span>TYPE_CHECKING<\/span><span>. 
It allows a simple <\/span><span><span>from __future__ import<\/span> annotations<\/span><span> declaration at the beginning of a file to delay type evaluation, while Lazy Imports defer the actual import statements until required. The combined effect of these optimizations reduced costly runtime imports and further streamlined the development workflow.<\/span><\/p>\n<p><span>The Lazy Imports solution delivers. Meta\u2019s initiative to enhance ML development has involved rolling out Cinder with Lazy Imports to several workloads, including our ML frameworks and Jupyter kernels, producing lightning-fast startup times, improved experimentation capabilities, reduced infrastructure overhead, and code that is a breeze to maintain. We\u2019re pleased to share that Meta\u2019s key AI workloads have experienced noteworthy improvements, with TTFB wins reaching up to 40 percent. Resulting time savings can vary from seconds to minutes per run.<\/span><\/p>\n<p><span>These impressive results translate to a substantial boost in the efficiency of ML workflows, since they mean ML developers can get to the model training phase more swiftly.<\/span><\/p>\n<h2><span>The challenges of adopting Lazy Imports<\/span><\/h2>\n<p><span>While Lazy Imports\u2019 approach significantly improved ML development, it was not all a bed of roses. We encountered several hurdles that tested our resolve and creativity.<\/span><\/p>\n<h3><span>Compatibility<\/span><\/h3>\n<p><span>One of the primary challenges we grappled with was the compatibility of existing libraries with Lazy Imports. Libraries such as PyTorch, Numba, NumPy, and SciPy, among others, did not seamlessly align with the deferred module loading approach. These libraries often rely on import side effects and other patterns that do not play well with Lazy Imports. Because the order in which imports ran could change, or imports could be postponed entirely, side effects often failed to register classes, functions, and operations correctly. 
This required painstaking troubleshooting to identify and address import cycles and discrepancies.<\/span><\/p>\n<h3><span>Balancing performance versus dependability<\/span><\/h3>\n<p><span>We also had to strike the right balance between performance optimization and code dependability. While Lazy Imports significantly reduced TTFB and enhanced resource utilization, it also introduced a considerable semantic change in the way Python imports work that could make the codebase less intuitive. Striking this balance was a constant consideration; we maintained it by limiting the semantic changes to the relevant parts of the codebase that could be thoroughly tested.<\/span><\/p>\n<p><span>Ensuring seamless interaction with the existing codebase required meticulous testing and adjustments. The task was particularly intricate when dealing with complex, multifaceted ML models, where the implications of deferred imports needed to be thoroughly considered. We ultimately opted to enable Lazy Imports only during the startup and preparation phases, disabling it before the first batch started.<\/span><\/p>\n<h3><span>Learning curve<\/span><\/h3>\n<p><span>Adopting new paradigms like Lazy Imports can introduce a learning curve for the development team. Training ML engineers, infra engineers, and system engineers to adapt to the new approach, understand its nuances, and implement it effectively is a process in itself.<\/span><\/p>\n<h2><span>What is next for Lazy Imports at Meta?<\/span><\/h2>\n<p><span>The adoption of Lazy Imports and Cinder represented a meaningful enhancement to Meta\u2019s key AI workloads. It came with its share of ups and downs, but ultimately demonstrated that Lazy Imports can be a game changer in expediting ML development. The TTFB wins, DevX improvements, and reduced kernel startup times are all tangible results of this initiative. 
With Lazy Imports, Meta\u2019s ML developers are now equipped to work more efficiently, experiment more rapidly, and achieve results faster.<\/span><\/p>\n<p><span>While we\u2019ve achieved remarkable success with the adoption of Lazy Imports, our journey is far from over. So, what\u2019s next for us? Here\u2019s a glimpse into our future endeavors:<\/span><\/p>\n<h3><span>Streamlining developer onboarding<\/span><\/h3>\n<p><span>The learning curve associated with Lazy Imports can be a challenge for newcomers. We\u2019re investing in educational resources and onboarding materials to make it easier for developers to embrace this game-changing approach.\u00a0<\/span><\/p>\n<h3><span>Enhancing tooling<\/span><\/h3>\n<p><span>Debugging code with deferred imports can be intricate. We\u2019re developing tools and techniques that simplify the debugging and troubleshooting process, ensuring that developers can quickly identify and resolve issues.<\/span><\/p>\n<h3><span>Community collaboration<\/span><\/h3>\n<p><span>The power of Lazy Imports lies in its adaptability and versatility. We\u2019re eager to collaborate with the Python community \u2013 sharing insights and best practices, and addressing challenges together. Building a robust community that helps support paradigms and patterns that play well with Lazy Imports is one of our future priorities.<\/span><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2024\/01\/18\/developer-tools\/lazy-imports-cinder-machine-learning-meta\/\">Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Engineering at Meta<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>At Meta, the quest for faster model training has yielded an exciting milestone: the adoption of Lazy Imports and the Python Cinder runtime. The outcome? 
Up to 40 percent time to first batch (TTFB) improvements, along with a 20 percent reduction in Jupyter kernel startup times. This advancement facilitates swifter experimentation capabilities and elevates the&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2024\/01\/18\/lazy-is-the-new-fast-how-lazy-imports-and-cinder-accelerate-machine-learning-at-meta\/\">Continue reading <span class=\"screen-reader-text\">Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-814","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":897,"url":"https:\/\/fde.cat\/index.php\/2024\/07\/16\/ai-lab-the-secrets-to-keeping-machine-learning-engineers-moving-fast\/","url_meta":{"origin":814,"position":0},"title":"AI Lab: The secrets to keeping machine learning engineers moving fast","date":"July 16, 2024","format":false,"excerpt":"The key to developer velocity across AI lies in minimizing time to first batch (TTFB) for machine learning (ML) engineers. AI Lab is a pre-production framework used internally at Meta. 
It allows us to continuously A\/B test common ML workflows \u2013 enabling proactive improvements and automatically preventing regressions on TTFB.\u00a0\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":893,"url":"https:\/\/fde.cat\/index.php\/2024\/07\/10\/metas-approach-to-machine-learning-prediction-robustness\/","url_meta":{"origin":814,"position":1},"title":"Meta\u2019s approach to machine learning prediction robustness","date":"July 10, 2024","format":false,"excerpt":"Meta\u2019s advertising business leverages large-scale machine learning (ML) recommendation models that power millions of ads recommendations per second across Meta\u2019s family of apps. Maintaining reliability of these ML systems helps ensure the highest level of service and uninterrupted benefit delivery to our users and advertisers. To minimize disruptions and ensure\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":599,"url":"https:\/\/fde.cat\/index.php\/2022\/06\/14\/applying-federated-learning-to-protect-data-on-mobile-devices\/","url_meta":{"origin":814,"position":2},"title":"Applying federated learning to protect data on mobile devices","date":"June 14, 2022","format":false,"excerpt":"What the research is: Federated learning with differential privacy (FL-DP) is one of the latest privacy-enhancing technologies being evaluated at Meta as we constantly work to enhance user privacy and further safeguard users\u2019 data in the products we design, build, and maintain. 
FL-DP enhances privacy in two important ways: It\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":842,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/20\/optimizing-rtc-bandwidth-estimation-with-machine-learning\/","url_meta":{"origin":814,"position":3},"title":"Optimizing RTC bandwidth estimation with machine learning","date":"March 20, 2024","format":false,"excerpt":"Bandwidth estimation (BWE) and congestion control play an important role in delivering high-quality real-time communication (RTC) across Meta\u2019s family of apps. We\u2019ve adopted a machine learning (ML)-based approach that allows us to solve networking problems holistically across cross-layers such as BWE, network resiliency, and transport. We\u2019re sharing our experiment results\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":806,"url":"https:\/\/fde.cat\/index.php\/2023\/12\/19\/ai-debugging-at-meta-with-hawkeye\/","url_meta":{"origin":814,"position":4},"title":"AI debugging at Meta with HawkEye","date":"December 19, 2023","format":false,"excerpt":"HawkEye is the powerful toolkit used internally at Meta for monitoring, observability, and debuggability of the end-to-end machine learning (ML) workflow that powers ML-based products. HawkEye supports recommendation and ranking models across several products at Meta. Over the past two years, it has facilitated order of magnitude improvements in the\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":768,"url":"https:\/\/fde.cat\/index.php\/2023\/10\/05\/meta-contributes-new-features-to-python-3-12\/","url_meta":{"origin":814,"position":5},"title":"Meta contributes new features to Python 3.12","date":"October 5, 2023","format":false,"excerpt":"Python 3.12 is out! 
It includes new features and performance improvements \u2013 some contributed by Meta \u2013 that we believe will benefit all Python users. We\u2019re sharing details about these new features that we worked closely with the Python community to develop. This week\u2019s release of Python 3.12 marks a\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/814","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=814"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/814\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=814"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=814"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=814"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}