{"id":699,"date":"2023-04-11T16:00:23","date_gmt":"2023-04-11T16:00:23","guid":{"rendered":"https:\/\/fde.cat\/index.php\/2023\/04\/11\/why-xhe-aac-is-being-embraced-at-meta\/"},"modified":"2023-04-11T16:00:23","modified_gmt":"2023-04-11T16:00:23","slug":"why-xhe-aac-is-being-embraced-at-meta","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2023\/04\/11\/why-xhe-aac-is-being-embraced-at-meta\/","title":{"rendered":"Why xHE-AAC is being embraced at Meta"},"content":{"rendered":"<p><span>We\u2019re sharing how Meta delivers high-quality audio at scale with the <a href=\"https:\/\/www.iis.fraunhofer.de\/en\/ff\/amm\/broadcast-streaming\/xheaac.html\" target=\"_blank\" rel=\"noopener\">xHE-AAC audio codec<\/a>.<\/span><br \/>\n<span>xHE-AAC has already been deployed on Facebook and Instagram to provide enhanced audio for features like Reels and Stories.\u00a0<\/span><\/p>\n<p><span>At Meta, we serve every media use case imaginable for billions of people across the world \u2014 from short-form, user-generated content, such as<\/span> <a href=\"https:\/\/engineering.fb.com\/2023\/02\/21\/video-engineering\/av1-codec-facebook-instagram-reels\/\" target=\"_blank\" rel=\"noopener\"><span>Reels<\/span><\/a><span>, to premium <\/span><a href=\"https:\/\/engineering.fb.com\/2021\/04\/05\/video-engineering\/how-facebook-encodes-your-videos\/\" target=\"_blank\" rel=\"noopener\"><span>video on demand (VOD)<\/span><\/a><span> and<\/span> <a href=\"https:\/\/engineering.fb.com\/2020\/10\/22\/video-engineering\/live-streaming\/\" target=\"_blank\" rel=\"noopener\"><span>live broadcasts<\/span><\/a><span>. 
Given this, we need a next-generation audio codec that supports a range of operating points with excellent compression efficiency and modern, system-level audio features.\u00a0<\/span><\/p>\n<p><span>To address these needs now and into the future, Meta has embraced xHE-AAC as the vehicle for delivering high-quality audio at scale.<\/span><\/p>\n<h2><span>The benefits of xHE-AAC<\/span><\/h2>\n<p><span>xHE-AAC is the latest member of the MPEG AAC audio codec family. The <\/span><a href=\"https:\/\/www.iis.fraunhofer.de\/en\/ff\/amm\/broadcast-streaming\/xheaac.html\" target=\"_blank\" rel=\"noopener\"><span>Fraunhofer Institute for Integrated Circuits IIS<\/span><\/a><span> played a substantial role in the development of xHE-AAC and the MPEG-D DRC standard.<\/span><\/p>\n<p><span>Today, xHE-AAC is already providing a superior audio experience on Facebook and Instagram \u2014 including on <\/span><a href=\"https:\/\/engineering.fb.com\/2023\/02\/21\/video-engineering\/av1-codec-facebook-instagram-reels\/\" target=\"_blank\" rel=\"noopener\"><span>Reels<\/span><\/a><span> and <\/span><a href=\"https:\/\/engineering.fb.com\/2022\/07\/18\/developer-tools\/building-text-animations-for-instagram-stories\/\" target=\"_blank\" rel=\"noopener\"><span>Stories<\/span><\/a><span> \u2014 and has a number of valuable features.<\/span><\/p>\n<h3><span>Loudness management<\/span><\/h3>\n<p><span>With <\/span><a href=\"https:\/\/engineering.fb.com\/2021\/04\/05\/video-engineering\/how-facebook-encodes-your-videos\/\" target=\"_blank\" rel=\"noopener\"><span>hundreds of millions of uploads per day across Facebook and Instagram<\/span><\/a><span>, we receive audio tracks with loudness levels ranging from silence to full scale, and everything in between.\u00a0<\/span><\/p>\n\n<p><span>When people play these videos sequentially, they can perceive some audio as being too loud or too quiet. 
This creates listener fatigue from having to constantly adjust the volume.<\/span><\/p>\n\n<p><span>xHE-AAC\u2019s integrated loudness management system addresses loudness inconsistency while preserving creator intent: it brings the average loudness of all sessions to the same target level and manages the dynamic range of each session to fit the playback environment.<\/span><\/p>\n<p><span>Instead of burning in a specific target level and dynamic range compression (DRC) profile during encoding, xHE-AAC allows us to leave the original audio characteristics untouched and delegate loudness management processing to the client via loudness metadata, so that the audio experience can be optimized for the playback context.\u00a0<\/span><\/p>\n\n<p><span>As a result of xHE-AAC\u2019s loudness management, people can spend more time immersed in their favorite content and less time fiddling with the volume control.<\/span><\/p>\n<h3><span>Adaptive bit rate audio<\/span><\/h3>\n<p><span>Most people who use our apps consume media on mobile devices and expect the highest audio quality without interruption. This presents a challenge for streaming media because connection quality varies on mobile and can result in a very uneven user experience.\u00a0<\/span><\/p>\n\n<p><span>To optimize quality under dynamic bandwidth constraints, we produce<\/span><a href=\"https:\/\/engineering.fb.com\/2021\/04\/05\/video-engineering\/how-facebook-encodes-your-videos\/\"> <span>multiple video and audio qualities<\/span><\/a><span> to match varying network conditions at playback time. 
Even though we produce multiple audio lanes, we have historically only employed<\/span><a href=\"https:\/\/engineering.fb.com\/2022\/11\/04\/video-engineering\/instagram-video-processing-encoding-reduction\/\"> <span>adaptive bit rate (ABR)<\/span><\/a><span> algorithms to switch video qualities during playback because it\u2019s difficult to enable adaptive bit rate audio without compromising quality during lane transitions.<\/span><\/p>\n<p><span>In order to enable seamless audio ABR, xHE-AAC introduces the concept of immediate playout frames (IPFs) that contain all the data necessary to start playing a new audio lane without relying on data from other frames. By placing an IPF at the beginning of each Dynamic Adaptive Streaming over HTTP (DASH) segment and aligning the segment durations of each lane, we can seamlessly switch between audio lanes during playback to provide the highest-quality audio at any available bandwidth while avoiding playback stalls.<\/span><\/p>\n\n<p><span>After launching audio ABR on Facebook for Android, we were able to improve user experience by reducing the number of sessions where playback stalls.\u00a0<\/span><\/p>\n<h2><span>How we deployed xHE-AAC<\/span><\/h2>\n<p><span>We generate xHE-AAC bitstreams using an encoder SDK provided by the Fraunhofer Institute for Integrated Circuits IIS, and then prepare the resulting audio files for DASH streaming with shaka-packager. The xHE-AAC encoder\u2019s two-pass encoding mode is used to measure the input loudness envelope and average program loudness on the first pass and perform the actual audio data compression on the second pass. As an added benefit, two-pass encoding allows us to use loudness range control (LRAC) DRC, which mitigates pumping artifacts otherwise introduced by single-pass DRC algorithms. 
<\/span><span> <\/span><\/p>\n<p><span>To prepare an xHE-AAC audio adaptation set for ABR delivery, IPFs are inserted at constant time intervals, audio configuration parameters such as sample rate and channel configuration are kept constant, and unique stream identifiers are selected for each lane in the audio adaptation set.<\/span><\/p>\n<p><span>At playback time, we custom-fit the audio to the listening environment by configuring a target loudness level and DRC effect type based on context. Thanks to the embedded loudness metadata, we can adapt a single xHE-AAC bitstream to a variety of audio consumption use cases, from headphones to device speakers and various levels of background noise. Finally, if the client is starved for data or bandwidth is plentiful, audio ABR will automatically switch audio qualities to ensure that the highest audio quality is played without interrupting the playback session.<\/span><\/p>\n<h2><span>Where can you experience xHE-AAC today?<\/span><\/h2>\n<p><span>You can experience xHE-AAC audio on Facebook for iOS and Android, as well as on targeted surfaces on Instagram, such as Reels and Stories<\/span><span>. <\/span><span>We encourage you to install the latest versions of the Facebook and Instagram apps on iOS 13+ and Android 9+ to ensure that you can experience it.<\/span><\/p>\n<h2><span>Acknowledgements<\/span><\/h2>\n<p><span>This work is the collective result of the entire Video Infrastructure and Instagram Media Platform teams at Meta in collaboration with Fraunhofer <\/span><span>Institute for Integrated Circuits<\/span><span> IIS. 
The author would like to extend special thanks to Abhishek Gera, Tim Harris, Arun Kotidath, Edward Li, Meng Li, Srinivas Lingutla, Denise Noyes, Mohanish Penta, David Ronca, Haixia Shi, Mike Starr, Cosmin Stejerean, Simha Venkataramaiah, Juehui Zhang, Runshen Zhu, and the engineering team at Fraunhofer <\/span><span>Institute for Integrated Circuits<\/span><span> IIS.<\/span><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2023\/04\/11\/video-engineering\/high-quality-audio-xhe-aac-codec-meta\/\">Why xHE-AAC is being embraced at Meta<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Engineering at Meta<\/a>.<\/p>\n<p>Engineering at Meta<\/p>","protected":false},"excerpt":{"rendered":"<p>We\u2019re sharing how Meta delivers high-quality audio at scale with the xHE-AAC audio codec. xHE-AAC has already been deployed on Facebook and Instagram to provide enhanced audio for features like Reels and Stories.\u00a0 At Meta, we serve every media use case imaginable for billions of people across the world \u2014 from short-form, user-generated content, such&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2023\/04\/11\/why-xhe-aac-is-being-embraced-at-meta\/\">Continue reading <span class=\"screen-reader-text\">Why xHE-AAC is being embraced at Meta<\/span><\/a><\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-699","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":880,"url":"https:\/\/fde.cat\/index.php\/2024\/06\/13\/mlow-metas-low-bitrate-audio-codec\/","url_meta":{"origin":699,"position":0},"title":"MLow: Meta\u2019s low bitrate audio codec","date":"June 13, 2024","format":false,"excerpt":"At Meta, we support real-time communication (RTC) for 
billions of people through our apps, including WhatsApp, Instagram, and Messenger.\u00a0 We are working to make RTC accessible by providing a high-quality experience for everyone \u2013 even those who might not have the fastest connections or the latest phones. As more and\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":680,"url":"https:\/\/fde.cat\/index.php\/2023\/02\/16\/inside-metas-first-smart-glasses\/","url_meta":{"origin":699,"position":1},"title":"Inside Meta\u2019s first smart glasses","date":"February 16, 2023","format":false,"excerpt":"What\u2019s new: Meta is sharing the inside story of how it developed the Ray-Ban Stories smart glasses. Why it matters: Creating Ray-Ban Stories meant Meta\u2019s engineers had to take on new challenges to build smart glasses that married complex engineering dynamics. How do you make something that features cameras, microphones,\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":597,"url":"https:\/\/fde.cat\/index.php\/2022\/06\/09\/under-the-hood-metas-cloud-gaming-infrastructure\/","url_meta":{"origin":699,"position":2},"title":"Under the hood: Meta\u2019s cloud gaming infrastructure","date":"June 9, 2022","format":false,"excerpt":"The promise of cloud gaming is a promise to democratize gaming. Anyone who loves games should be able to enjoy them and share the experience with their friends, no matter where they\u2019re located, and even if they don\u2019t have the latest, most expensive gaming hardware. 
Facebook launched its cloud gaming\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":901,"url":"https:\/\/fde.cat\/index.php\/2024\/07\/25\/how-salesforces-new-speech-to-text-service-uses-openai-whisper-models-for-real-time-transcriptions\/","url_meta":{"origin":699,"position":3},"title":"How Salesforce\u2019s New Speech-to-Text Service Uses OpenAI Whisper Models for Real-Time Transcriptions","date":"July 25, 2024","format":false,"excerpt":"In our Engineering Energizers Q&A series, we explore the paths of engineering leaders who have attained significant accomplishments in their respective fields. Today, we spotlight Dima Statz, Director of Software Engineering at Salesforce, who leads the development of Salesforce\u2019s new Speech-to-Text (STT) service. STT leverages advanced speech recognition technology to\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":841,"url":"https:\/\/fde.cat\/index.php\/2024\/03\/20\/better-video-for-mobile-rtc-with-av1-and-hd\/","url_meta":{"origin":699,"position":4},"title":"Better video for mobile RTC with AV1 and HD","date":"March 20, 2024","format":false,"excerpt":"At Meta, we support real-time communication (RTC) for billions of people through our apps, including Messenger, Instagram, and WhatsApp. We\u2019ve seen significant benefits by adopting the AV1 codec for RTC. 
Here\u2019s how we are improving the RTC video quality for our apps with tools like the AV1 codec, the challenges\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":682,"url":"https:\/\/fde.cat\/index.php\/2023\/02\/21\/how-meta-brought-av1-to-reels\/","url_meta":{"origin":699,"position":5},"title":"How Meta brought AV1 to Reels","date":"February 21, 2023","format":false,"excerpt":"We\u2019re sharing how we\u2019re enabling production and delivery of AV1 for Facebook Reels and Instagram Reels. We believe AV1 is the most viable codec for Meta for the coming years. It offers higher quality at a much lower bit rate compared with previous generations of video codecs. Meta has worked\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/699","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=699"}],"version-history":[{"count":0,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/699\/revisions"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=699"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories?post=699"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=699"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}