{"id":773,"date":"2023-10-18T16:00:26","date_gmt":"2023-10-18T16:00:26","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2023\/10\/18\/how-meta-is-creating-custom-silicon-for-ai\/"},"modified":"2023-10-18T16:00:26","modified_gmt":"2023-10-18T16:00:26","slug":"how-meta-is-creating-custom-silicon-for-ai","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2023\/10\/18\/how-meta-is-creating-custom-silicon-for-ai\/","title":{"rendered":"How Meta is creating custom silicon for AI"},"content":{"rendered":"<p><span>With the recent launches of <\/span><a href=\"https:\/\/ai.meta.com\/blog\/meta-training-inference-accelerator-AI-MTIA\/\" target=\"_blank\" rel=\"noopener\"><span>MTIA v1<\/span><\/a><span>,\u00a0 Meta\u2019s first-generation AI inference accelerator, and <\/span><a href=\"https:\/\/about.fb.com\/news\/2023\/07\/llama-2\/\" target=\"_blank\" rel=\"noopener\"><span>Llama 2<\/span><\/a><span>,\u00a0 the next generation of Meta\u2019s publicly available large language model, it\u2019s clear that Meta is focused on advancing AI for a more connected world. Fueling the success of these products are world-class infrastructure teams, including Meta\u2019s custom AI silicon team, led by Olivia Wu, a leader in the silicon industry for 30 years.<\/span><\/p>\n<p><span>In the conversation below, Olivia explains how she led the silicon design team to deliver Meta\u2019s AI silicon, allowing the company to improve the compute efficiency of the infrastructure, and enable software developers to create AI models that will provide more relevant content and better user experiences.<\/span><\/p>\n<p>Tell us about your role at Meta.<\/p>\n<p>Olivia Wu: <span>I lead design development of the next generation of Meta\u2019s AI silicon. 
My team is responsible for the design and development of Meta\u2019s in-house machine learning (ML) accelerator, and I partner closely with our co-design, architecture, verification, implementation, emulation, validation, system, firmware, and software teams to successfully build and deploy the silicon in our data centers.<\/span><\/p>\n<p>What led you to this role?<\/p>\n<p>OW: <span>I\u2019ve been working in the silicon industry for 30 years and have experience at a variety of large companies, leading both architecture and design for multiple ASICs and IPs, as well as at startups focused on AI training. In 2018, I saw a <\/span><a href=\"https:\/\/twitter.com\/ylecun\/status\/986586177573564417\" target=\"_blank\" rel=\"noopener\"><span>social media post from Yann LeCun<\/span><\/a><span>, our Chief AI Scientist, saying that Meta was looking for someone to help build AI silicon in-house. I knew of just a few other companies designing their own custom AI silicon, but they were focused mainly on the silicon and not the software ecosystem and products.<\/span><\/p>\n<p><span>The opportunity for Meta (known as Facebook back then) was to bring in silicon developers to work directly with the software teams to reimagine end-to-end systems, allowing for greater efficiency and larger degrees of freedom in optimizing across hardware and software boundaries.<\/span><\/p>\n<p><span>This was very enticing to me. I knew this was a rare opportunity and I had to jump on it to have the chance to build a design team from the ground up.<\/span><\/p>\n<p>How was the transition from working at two different startups to working at Meta?<\/p>\n<p>OW: <span>My transition from startup to Meta was super easy. We had a very small team, so it almost felt like a startup within a large company. I was able to get involved in many parts of the project. 
It gave me the opportunity to be very hands-on in all aspects of ASIC development.<\/span><\/p>\n<p><span>Meta also has a very open culture. The freedom to innovate and experiment with new ideas is ingrained into Meta\u2019s DNA. I was able to have whiteboard sessions with members of co-design, software, hardware, and other cross-functional teams to brainstorm features that would go into the silicon. These discussions gave me a lot of insight into Meta\u2019s critical AI workloads, the challenges that our software teams had encountered with the current solutions, and their future directions. Coming from a startup, where we had very limited visibility into customer workloads and the roadmap beyond what is open sourced, this was very enlightening and refreshing.<\/span><\/p>\n<p>What are some of the challenges you face in your current role?<\/p>\n<p>OW: <span>The silicon development cycle is typically fairly long. It usually spans anywhere from one and a half to two years, though it can take as long as four years in some cases. With AI advancing at a much faster clip, we are really designing hardware for software that doesn\u2019t yet exist. So the silicon has to be able to handle not just the demands of AI today, but future AI as well. To do this, we have to understand what our software team needs \u2013 the AI workload trends they see, the features they will need \u2013 and incorporate that into our design.<\/span><\/p>\n<p><span>This is where we at Meta have an advantage. 
Because our silicon and software teams are both in-house, we have a front-row seat into what\u2019s happening in software, and we are able to incorporate it into our silicon from the beginning.<\/span><\/p>\n<p><a href=\"https:\/\/ai.meta.com\/blog\/meta-training-inference-accelerator-AI-MTIA\/\" target=\"_blank\" rel=\"noopener\"><span>MTIA v1<\/span><\/a><span> was the very first silicon that we built at Meta, so one of the really challenging things was having to build out the entire design and verification flow from scratch, as well as the silicon development infrastructure itself. This was a lot of work in the beginning, but it\u2019s really paid off in the long run for the team.<\/span><\/p>\n<p>Meta announced MTIA v1 earlier this year. What is the significance of this milestone to you and the company?<\/p>\n<p>OW: <a href=\"https:\/\/ai.meta.com\/blog\/meta-training-inference-accelerator-AI-MTIA\/\" target=\"_blank\" rel=\"noopener\"><span>MTIA v1<\/span><\/a><span> is Meta\u2019s first-generation ML accelerator. It\u2019s customized for our deep learning recommendation model, which is an important component of Meta technologies \u2013 including Facebook, Instagram, WhatsApp, Meta Quest, Horizon Worlds, and Ray-Ban Stories. While we will continue to purchase silicon chips from our partners, designing our own silicon allows us to optimize specifically for our critical workloads and gain complete control over the entire stack \u2013 from the silicon, to the system, to the software and the application.<\/span><\/p>\n<p><span>This was such a fun and unique experience, especially when I first started and the team was really, really small. We were able to fit into a conference room along with the software team and whiteboard all the different ideas and features we wanted to implement. I don\u2019t think I\u2019ve ever had that kind of experience anywhere else. 
Even though the team has grown quite a bit since then, we still try to maintain that scrappy culture.<\/span><\/p>\n<p>What did you and the team learn from this process?<\/p>\n<p>OW: <span>I learned how important it is to have a hands-on team capable of jumping into other roles to get the job done. We operate in many ways like a startup in that we have to wear many hats and take on challenges beyond our usual work. So even though I\u2019m the design lead, in addition to leading the project development, I also roll up my sleeves to code and help out wherever I\u2019m needed.<\/span><\/p>\n<p>What are you looking forward to next? What\u2019s next for the AI silicon design team?<\/p>\n<p>OW: <span>AI is central to our work at Meta. The recommendation system is obviously a big part of our AI models, but beyond that, we also have GenAI and video processing use cases with different requirements. This brings us a lot of opportunities to create products tailored for each need.<\/span><\/p>\n<p><span>Having MTIA in-house gives us a tremendous amount of learning that we can incorporate into our products. <\/span><span>In addition, we maintained the user experience and developer efficiency offered by PyTorch eager-mode development. Developer efficiency is a journey as we continue to support PyTorch 2.0, which supercharges how PyTorch operates at the compiler level, under the hood. 
We\u2019re continuing to gather feedback and input from our AI software teams to shape the features of our future AI silicon.<\/span><\/p>\n<p><span>As we work on the next generations of MTIA chips, we\u2019re constantly looking at bottlenecks in the system, such as memory and communication across different chips, so that we can put together a well-balanced solution to scale and future-proof our silicon.<\/span><\/p>\n<p>What advice might you give to women or other historically underrepresented groups interested in pursuing careers as engineers?<\/p>\n<p>OW: <span>I would encourage them to actively participate and not shy away from speaking up in meetings or discussions, so people know what they can accomplish. The other thing is to look for mentors within the team. They don\u2019t have to be the same as you. Having a mentor is always good, particularly early in your career, to help guide you and prioritize what will help you advance.<\/span><\/p>\n<p><span>Meta\u2019s Infra team, as well as Meta more widely, has a mentoring program for women engineers and underrepresented people. We offer both a group coaching program and one-on-one coaching. I\u2019ve done both of these and really enjoy having the opportunity to mentor. I\u2019ve found it very helpful for junior engineers to get coaching and mentoring from senior people in the company.<\/span><\/p>\n<p>What about Meta\u2019s culture and technical advancements makes it such a prime time for engineers, researchers, and developers to be at the company?<\/p>\n<p>OW: <span>Meta is an amazingly open company with a truly collaborative culture and a great place to learn and grow. We provide resources to help people quickly become familiar with the entire stack, even if they have no prior exposure to certain parts. 
This includes everything from the silicon to the firmware, the compiler, the application, as well as the large-scale system design that we are putting into the data center. The sheer scale at which Meta has been deploying the application also creates a dimension of challenges that makes it interesting and rewarding to work here.<\/span><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2023\/10\/18\/ml-applications\/meta-ai-custom-silicon-olivia-wu\/\">How Meta is creating custom silicon for AI<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Engineering at Meta<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>With the recent launches of MTIA v1,\u00a0 Meta\u2019s first-generation AI inference accelerator, and Llama 2,\u00a0 the next generation of Meta\u2019s publicly available large language model, it\u2019s clear that Meta is focused on advancing AI for a more connected world. Fueling the success of these products are world-class infrastructure teams, including Meta\u2019s custom AI silicon team,&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2023\/10\/18\/how-meta-is-creating-custom-silicon-for-ai\/\">Continue reading <span class=\"screen-reader-text\">How Meta is creating custom silicon for AI<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-773","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":852,"url":"https:\/\/fde.cat\/index.php\/2024\/04\/11\/building-new-custom-silicon-for-metas-ai-workloads\/","url_meta":{"origin":773,"position":0},"title":"Building new custom silicon for Meta\u2019s AI workloads","date":"April 11, 2024","format":false,"excerpt":"The post Building new custom 
silicon for Meta\u2019s AI workloads appeared first on Engineering at Meta. Engineering at Meta","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":787,"url":"https:\/\/fde.cat\/index.php\/2023\/11\/15\/watch-metas-engineers-on-building-network-infrastructure-for-ai\/","url_meta":{"origin":773,"position":1},"title":"Watch: Meta\u2019s engineers on building network infrastructure for AI","date":"November 15, 2023","format":false,"excerpt":"Meta is building for the future of AI at every level \u2013 from hardware like MTIA v1, Meta\u2019s first-generation AI inference accelerator to publicly released models like Llama 2, Meta\u2019s next-generation large language model, as well as new generative AI (GenAI) tools like Code Llama. Delivering next-generation AI products and\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":501,"url":"https:\/\/fde.cat\/index.php\/2021\/11\/09\/ocp-summit-2021-open-networking-hardware-lays-the-groundwork-for-the-metaverse\/","url_meta":{"origin":773,"position":2},"title":"OCP Summit 2021: Open networking hardware lays the groundwork for the metaverse","date":"November 9, 2021","format":false,"excerpt":"Open infrastructure technologies and networking hardware will play an important role as we build new technologies for the metaverse, where billions of people will someday come together in virtual spaces. 
As we head toward the next major computing platform with a continued spirit of embracing openness and disaggregation, we\u2019re announcing\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":795,"url":"https:\/\/fde.cat\/index.php\/2023\/11\/21\/writing-and-linting-python-at-scale\/","url_meta":{"origin":773,"position":3},"title":"Writing and linting Python at scale","date":"November 21, 2023","format":false,"excerpt":"Python plays a big part at Meta. It powers Instagram\u2019s backend and plays an important role in our configuration systems, as well as much of our AI work. Meta even made contributions to Python 3.12, the latest version of Python. On this episode of the\u00a0Meta Tech Podcast, Meta engineer Pascal\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":684,"url":"https:\/\/fde.cat\/index.php\/2023\/02\/24\/metas-head-of-ar-glasses-on-the-future-of-ar-hardware\/","url_meta":{"origin":773,"position":4},"title":"Meta\u2019s head of AR glasses on the future of AR hardware","date":"February 24, 2023","format":false,"excerpt":"While VR headsets have been with us for at least a decade, AR hardware barely exists today; indeed, the very components that will comprise the hardware scarcely exist, making it a truly zero-to-one innovation challenge. Meta\u2019s Head of AR Glasses Hardware, Caitlin Kalinowski is helping to lead that charge. Kalinowski\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":756,"url":"https:\/\/fde.cat\/index.php\/2023\/09\/05\/whats-it-like-to-write-code-at-meta\/","url_meta":{"origin":773,"position":5},"title":"What\u2019s it like to write code at Meta?","date":"September 5, 2023","format":false,"excerpt":"Ever wonder what it\u2019s like to write code at Meta\u2019s scale? 
On the latest episode of the Meta Tech Podcast, Meta engineer Pascal Hartig (@passy) sits down with Dustin Shahidehpour\u00a0and\u00a0Katherine Zak,\u00a0 two software engineers at Meta, about their careers and what it\u2019s really like to ship code at Meta. Why\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/773","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=773"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/773\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=773"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=773"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=773"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}