{"id":326,"date":"2021-08-31T14:39:51","date_gmt":"2021-08-31T14:39:51","guid":{"rendered":"https:\/\/fde.cat\/?p=326"},"modified":"2021-08-31T14:39:51","modified_gmt":"2021-08-31T14:39:51","slug":"asicmon-a-platform-agnostic-observability-system-for-ai-accelerators","status":"publish","type":"post","link":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/asicmon-a-platform-agnostic-observability-system-for-ai-accelerators\/","title":{"rendered":"Asicmon: A platform agnostic observability system for AI accelerators"},"content":{"rendered":"<p>We will be hosting a talk about our work, \u201c<a href=\"https:\/\/atscaleconference.com\/events\/systems-scale-summer-2021\/\">A Platform Agnostic Observability System for AI Accelerators<\/a>,\u201d during our virtual <a href=\"https:\/\/atscaleconference.com\/events\/systems-scale-summer-2021\/\">Systems @Scale<\/a> event at 10:20 a.m. PT on Wednesday, June 30, followed by a live Q&amp;A session. Please submit any questions to <a href=\"mailto:systemsatscale@fb.com\">systemsatscale@fb.com<\/a> before the event.<\/p>\n<p><span>Accelerators are special-purpose hardware devices optimized for specific applications, like AI prediction and video encoding. Application-specific hardware platforms play an important role in meeting the growing latency and compute demands of workloads like deep learning, content understanding, and video encoding.<\/span><\/p>\n<p><span>At Facebook, the growing use of accelerators in our data centers has led to better performance and energy efficiency. However, it is challenging to operate these heterogeneous platforms efficiently at scale. 
To ensure that these complex accelerators operate smoothly, we need an excellent observability system with monitoring and tracing capabilities so we can understand the performance of, and interactions between, CPUs and accelerators.<\/span><\/p>\n<p><span>To meet these challenges, we\u2019ve introduced three new tools:<\/span><\/p>\n<p>ASIC Monitoring (Asicmon)<span>, a scalable observability framework. Asicmon\u2019s library abstracts an accelerator\u2019s custom interfaces and provides a standard interface to our internal tools. Asicmon has facilitated load balancing, performance monitoring, and automated health checks for hundreds of thousands of accelerators running in our data centers.<\/span><br \/>\nAsimov<span>, a custom specification language that makes development and rapid prototyping of new accelerators easier. It has shrunk our development time for onboarding a new accelerator from a month to under a week.<\/span><br \/>\nAtrace<span>, an accelerator tracing solution that collects traces remotely on production servers. It allows us to inspect accelerator systems in detail and provides actionable trace summaries and analyses. An initial version of Atrace allowed us to close a 10 percent performance gap between <\/span><a href=\"https:\/\/ai.facebook.com\/blog\/pytorch-builds-the-future-of-ai-and-machine-learning-at-facebook\/\"><span>Caffe2 and PyTorch implementations<\/span><\/a><span> of a large AI model.<\/span><\/p>\n<h2><span>Background<\/span><\/h2>\n<p><span>Facebook\u2019s cloud infrastructure handles about 150 trillion AI predictions per day for tasks ranging from feed recommendations to combating harmful content. Running these AI models comes with heavy infrastructure demands. 
And as these models improve, so do their <a href=\"https:\/\/arxiv.org\/pdf\/2003.09518.pdf\">computational requirements<\/a>.<\/span><\/p>\n<p><span>The graph below of AI model adoption at Facebook illustrates this <\/span><a href=\"https:\/\/arxiv.org\/pdf\/2003.09518.pdf\"><span>unmistakable pattern<\/span><\/a><span>.<\/span><\/p>\n<h2><span>The need for accelerators<\/span><\/h2>\n<p><span>Good old general-purpose processors (CPUs) offer versatility and have grown exponentially faster over the decades. However, CPUs fail to meet the rising<\/span><a href=\"https:\/\/openai.com\/blog\/ai-and-compute\/\"> <span>computational demands of AI applications<\/span><\/a><span> today. They also tend to be inefficient in terms of energy used per AI prediction. As the OpenAI community has documented, there have been <\/span><a href=\"https:\/\/openai.com\/blog\/ai-and-compute\/\"><span>two distinct eras of compute in AI models<\/span><\/a><span>. <\/span><span>In recent times, model complexity and compute requirements for AI have grown by roughly a factor of 10 each year.<\/span> <span>This far outpaces improvements in CPU performance.<\/span><\/p>\n<p><span>How do we remedy this? By designing hardware that is customized to accelerate AI operations via application-specific integrated circuits (ASICs).<\/span><\/p>\n<p><span>Since 2019, Facebook has invested <\/span><a href=\"https:\/\/engineering.fb.com\/2019\/03\/14\/data-center-engineering\/accelerating-infrastructure\/\"><span>heavily in deploying accelerator-based servers<\/span><\/a><span> to provide higher performance and energy efficiency. Today, our first-generation systems are 10-30x more performant on our largest AI models. 
They also deliver a 3-10x performance-per-watt improvement over CPUs.<\/span><\/p>\n<p><span>We also invested in specialized hardware for <\/span><a href=\"https:\/\/engineering.fb.com\/2021\/04\/05\/video-engineering\/how-facebook-encodes-your-videos\/\"><span>video encoding<\/span><\/a><span> and decoding. This enables Facebook to process the nearly 250 million videos uploaded to our app each day. These videos are viewable on any device and with varying internet bandwidth. Our first-generation video accelerators delivered a 10x performance-per-watt improvement in processing 4K videos.<\/span><\/p>\n<p><span>The figure below illustrates the design of our AI inference server. As you can see, it consists of two Twin Lake CPUs and multiple accelerators (M.2 modules) connected to them using a PCIe switch.<\/span><\/p>\n<h2><span>The challenges of operating accelerators<\/span><\/h2>\n<p><span>In your typical cloud server, the CPU represents the most complex component. We focus a lot on building software to efficiently operate the CPU and monitor its performance and availability. With an accelerator system, however, we can imagine the CPU now has a more complicated and brawnier sibling! The accelerator, or ASIC, represents a complex hardware and software system in its own right.<\/span><\/p>\n<p><span>To deliver an excellent user experience, the cloud infrastructure needs to keep hundreds of thousands of accelerators running reliably and efficiently. This is where observability systems come to our rescue. Observability allows us to understand what happens in the accelerator hardware and software when any issue arises. It is useful in multiple ways:<\/span><\/p>\n<p>Health monitoring:<span> Just like any other piece of hardware, accelerators can overheat or hit a faulty condition or a functional bug. We can track various health metrics for the ASICs and use them in automated systems. 
These systems can then (if needed) remediate the issue by rebooting the accelerator or moving it into a repair state.<\/span><br \/>\nPerformance monitoring:<span> By monitoring the performance and system load on an accelerator, we can efficiently scale our AI jobs to meet variable demand throughout the day. It also enables us to detect performance regressions in new software deployments.<\/span><br \/>\nPerformance profiling:<span> When we encounter issues such as poor performance or time-outs, we need to look deeper into how the accelerator server is functioning. We also need to equip software developers with tools to understand the performance of their applications while they run on accelerators.<\/span><\/p>\n<h2><span>The accelerator zoo<\/span><\/h2>\n<p><span>Specialization is both a boon and a bane for accelerators: each type excels at a narrow set of workloads. As a result, we end up running multiple types of accelerators in our data centers at any given time.<\/span><\/p>\n<p><span>In 2020 we started<\/span><a href=\"https:\/\/engineering.fb.com\/2019\/03\/14\/data-center-engineering\/accelerating-infrastructure\/\"> <span>deploying the first generation<\/span><\/a><span> of these accelerators. In the near future, we will be developing two to three new accelerators for the second generation. Each accelerator will have unique driver interfaces, making the task of operating them harder. But duplicating the observability software for each accelerator would not be feasible in the timeline we have set out. The observability framework must be easy to prototype and adapt to multiple types of accelerators in a short time. It also needs to be efficient to avoid interfering with the original application.<\/span><\/p>\n<h2><span>How we developed Asicmon and Asimov<\/span><\/h2>\n<p><span>Our first challenge involved finding a way to effectively monitor different types of accelerators without duplicating code (and developer time). 
As you may have guessed, we can leverage abstraction to achieve this.<\/span><\/p>\n<p><span>For example, consider an abstract metric: <\/span><span>device_utilization<\/span><span> \u2014 the measure of how busy an accelerator is \u2014 which is useful for balancing load across accelerators. To compute this metric, we may need to understand the internal architecture of the accelerator. With an abstract counter, however, engineers working on load balancing can more easily use the metric without being aware of finer details.<\/span><\/p>\n<p><span>device_utilization = max(compute_core_active_i) \/ total_time<\/span><\/p>\n<p><span>With the above in mind, we designed Asicmon with these design objectives:<\/span><\/p>\n<p>Abstraction:<span> We needed a simple and uniform interface for all of our internal monitoring and operational tools to use. This enables infrastructure engineers and hardware teams to effectively operate multiple accelerators in a common way.<\/span><br \/>\nDevelopment velocity:<span> Accelerators are new, and interfaces can change due to evolving requirements. The framework should be easy to learn and quick to iterate on.<\/span><br \/>\nPerformance:<span> Finally, any observability system should be lightweight in terms of resources, so that it minimizes interference with high-throughput video and AI applications.<\/span><\/p>\n<p><span>The diagram below illustrates the overall software stack for monitoring accelerators. Asicmon acts as a bridge between individual accelerator drivers and the rest of the internal monitoring software. The top left illustrates automated health check tools that spot bad health signals and<\/span><a href=\"https:\/\/engineering.fb.com\/2020\/12\/09\/data-center-engineering\/how-facebook-keeps-its-large-scale-infrastructure-hardware-up-and-running\/\"> <span>automatically fix faulty ASICs<\/span><\/a><span>. 
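As a purely illustrative sketch (Python, with made-up counter values; real numbers come from each accelerator's driver), the device_utilization formula above can be computed like this:

```python
# Illustrative sketch of the abstract device_utilization metric.
# The counter values below are hypothetical; in practice they come
# from the accelerator driver via per-device glue code.

def device_utilization(core_active_times_ms, window_ms):
    # The device is considered as busy as its busiest compute core:
    # device_utilization = max(compute_core_active_i) / total_time
    if window_ms <= 0:
        raise ValueError('window must be positive')
    return max(core_active_times_ms) / window_ms

# Example: three cores active for 30, 80, and 50 ms of a 100 ms window.
util = device_utilization([30, 80, 50], 100)
```

Because the metric is abstract, a load balancer can consume it without knowing how many cores the device has or how their activity is measured.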
On the right, a telemetry daemon periodically publishes performance metrics for engineers to inspect the accelerators. Furthermore, automated load balancing and auto-scaling systems like<\/span><a href=\"https:\/\/engineering.fb.com\/2020\/08\/24\/production-engineering\/scaling-services-with-shard-manager\/\"> <span>Shard Manager<\/span><\/a><span> utilize these counters.<\/span><\/p>\n<h3><span>How does Asicmon work?<\/span><\/h3>\n<p><span>Under the hood, Asicmon creates an instance of a monitoring module per accelerator device. It maintains a cache of statistics that it updates periodically by probing the accelerator driver and computing derived metrics. Queries to Asicmon\u2019s standard interface for counters are implemented as a lookup into this cache. This shields the system against accidental overload from counter requests.<\/span><\/p>\n<h3><span>Enter Asimov<\/span><\/h3>\n<p><span>All great so far! We used abstraction to address the scalability of the observability software layers above Asicmon. However, we still had to build the glue code between the accelerator driver and these standard metrics. This has to be done separately for each accelerator, on aggressive and overlapping timelines. So we needed a way to develop on Asicmon that was quick to iterate with and easy to ramp up on, while also being efficient. That\u2019s where Asimov comes in.<\/span><\/p>\n<p><span>Asimov is an expressive Python-like custom language for instrumenting the accelerator driver. It essentially allows developers to focus on how to probe the accelerator interfaces and express derived metrics using them. The Asimov compiler generates an efficient C++ implementation of the monitoring module. 
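The cached-statistics design described under How does Asicmon work? can be sketched as follows. This is an illustrative Python version under stated assumptions: the real monitoring modules are generated C++, and probe_driver is a stand-in for the accelerator-specific glue code.

```python
import threading
import time

class MonitoringModule:
    # Illustrative sketch of an Asicmon-style monitoring module: it
    # keeps a cache of statistics, refreshed at most once per period
    # by probing the driver, so counter queries are cache lookups.
    def __init__(self, probe_driver, period_s=1.0):
        self._probe = probe_driver        # accelerator-specific glue
        self._period = period_s
        self._lock = threading.Lock()
        self._cache = {}
        self._last_refresh = float('-inf')

    def get_counter(self, name):
        now = time.monotonic()
        with self._lock:
            if now - self._last_refresh >= self._period:
                raw = dict(self._probe())
                # Compute a derived metric from the raw counters.
                raw['device_utilization'] = (
                    max(raw['compute_core_active']) / raw['total_time'])
                self._cache = raw
                self._last_refresh = now
            # Queries never hit the device directly, which shields it
            # from an accidental flood of counter requests.
            return self._cache[name]
```

In production this kind of module is not handwritten; the Asimov compiler generates it.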
It also handles details like caching the metrics, periodically reading them, and providing thread safety.<\/span><\/p>\n<p><span>The code snippets below show examples of Asimov being used to read system metrics using interfaces ranging from Linux sysfs files (a) to custom C library functions (b).<\/span><\/p>\n<p><span>Asimov incorporates the same standard interface as Asicmon in its internal representation (the stats data structure on the left-hand side of the code). We can also invoke C library functions provided by the device driver and express equations and conditions for derived metrics as in any regular language.<\/span><\/p>\n<p><span>Asimov is built with the<\/span><a href=\"https:\/\/www.antlr.org\/\"> <span>ANTLR<\/span><\/a><span> compiler framework, which provides the lexer\/parser logic for the language. We then emit C++ code using templates that manage all the essential parts, like initialization and thread safety, so someone using Asimov doesn\u2019t need to worry about them.<\/span><\/p>\n<h2><span>Asicmon in action<\/span><\/h2>\n<p><span>Let\u2019s look at a few illustrative examples of how Asimov and Asicmon are beneficial for operating accelerators at scale.<\/span><\/p>\n<p><span>For AI inference applications, we use a system called<\/span><a href=\"https:\/\/engineering.fb.com\/2020\/08\/24\/production-engineering\/scaling-services-with-shard-manager\/\"> <span>Shard Manager<\/span><\/a><span> to automatically scale the inference service instances. A shard is essentially a copy of the AI model that can serve inferences. Asicmon measures the load on the device using an abstract metric \u2014 accelerator device utilization. This helps Shard Manager effectively balance the load among servers and automatically scale the number of shards up or down. 
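As a toy illustration of utilization-driven scaling (the thresholds and policy here are hypothetical, not Shard Manager's actual algorithm), the scaling decision can be sketched as:

```python
# Toy sketch of scaling shard counts on accelerator device
# utilization. The thresholds are hypothetical, and a real policy
# would account for many more signals than a single metric.
def desired_shards(current_shards, device_utilization,
                   low=0.3, high=0.8):
    if device_utilization > high:
        return current_shards + 1   # scale out: accelerators run hot
    if device_utilization < low and current_shards > 1:
        return current_shards - 1   # scale in: mostly idle
    return current_shards
```

The point of the abstraction is that this logic never needs to know which accelerator type is behind the metric.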
The diagram below explains how the number of shards gets scaled automatically during model update rollouts and increases in traffic.<\/span><\/p>\n<p><span>The figure below illustrates the advantages of building observability early on in a project\u2019s development cycle. In our test deployment for video accelerators, we detected a memory leak using an Asicmon counter for available device memory. It took multiple fixes to the driver to finally resolve the issue, well before its debut in production.<\/span><\/p>\n<p><span>Finally, let\u2019s take a look at the ease of prototyping with Asimov. While we certainly took longer to build the first version of Asimov alongside the first video accelerator, supporting the second one (the AI inference accelerator) went incredibly fast. Bootstrapping basic metrics for the AI inference accelerator took less than a week. Since implementing Asicmon, we\u2019ve been able to increase our AI accelerator metrics support from ~30 percent to ~75 percent.<\/span><\/p>\n<h2><span>Atrace: Accelerator tracing at scale<\/span><\/h2>\n<h3><span>Why tracing?<\/span><\/h3>\n<p><span>Now that we can monitor the performance of accelerators in our data centers, the next step involves addressing why performance metrics like latency and throughput change over time. The tried-and-tested method for CPUs involves leveraging a stack-based profiler to sample the running function call stack at periodic intervals. However, for inference accelerators, tracing is the best form of profiling. Why? Because accelerators use special hardware units and thus do not have an equivalent notion of a function stack on a core.<\/span><\/p>\n<p><span>As shown in the figure below, a trace essentially consists of a time series of events occurring on different parts of a system. Events in a trace can represent, among other things, functions, execution of AI operators, or data transfers. 
Traces offer deeper insights into the operation of the system, including understanding the latency and scheduling of operators and how the CPU and accelerator interact with each other.<\/span><\/p>\n<h3><span>Designing the tracing system<\/span><\/h3>\n<p><span>AI inference accelerator vendors do provide tools and APIs to collect traces from the device. However, these tools are designed to work on a single server and are often hard to use. To profile production systems better, we set out to build a layer on top of this native capability that scales out the collection, processing, and analysis of traces.<\/span><\/p>\n<p><span>We kept two target use cases in mind while developing Atrace:<\/span><\/p>\n<p>Model development:<span> Model developers typically target their AI models to new inference hardware. They can run the tracing tool locally, but by integrating it with internal visualization and summarization tools, we can give engineers quicker feedback to iteratively tune their models.<\/span><br \/>\nProduction:<span> Debugging performance issues in production is an important use case for tracing. For instance, say a continuous integration (CI) test detects a regression in performance. By collecting traces remotely and on the fly, production engineers can quickly diagnose the problem.<\/span><\/p>\n<p><span>To develop a scalable and ubiquitous tracing solution, we built a set of components that remotely trigger and collect traces. We save each trace to shared storage, then post-process and summarize it. The diagram below outlines this, starting on the left with the trace being triggered and moving to trace collection and post-processing on the right.<\/span><\/p>\n<h2><span>Insights from accelerator traces<\/span><\/h2>\n<h3><span>Trace profiles and summaries<\/span><\/h3>\n<p><span>Traces themselves can be enormous and overwhelming to dive into directly. 
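One way to tame a trace of this size, sketched below with hypothetical event fields and operator names, is to aggregate event durations by operator type:

```python
from collections import defaultdict

# Illustrative sketch of summarizing a trace by operator type. The
# events are made up; real trace events would also carry device,
# stream, and correlation identifiers.
trace = [
    {'op_type': 'Conv', 'start_us': 0, 'dur_us': 120},
    {'op_type': 'FC', 'start_us': 120, 'dur_us': 40},
    {'op_type': 'Conv', 'start_us': 160, 'dur_us': 110},
    {'op_type': 'Concat', 'start_us': 270, 'dur_us': 10},
]

def summarize(events):
    # Aggregate call counts and total time per operator type, then
    # rank by time to surface the operators that merit optimization.
    totals = defaultdict(lambda: {'calls': 0, 'total_us': 0})
    for ev in events:
        entry = totals[ev['op_type']]
        entry['calls'] += 1
        entry['total_us'] += ev['dur_us']
    return sorted(totals.items(), key=lambda kv: -kv[1]['total_us'])

summary = summarize(trace)
```

Ranking like this also makes regression debugging tractable: comparing two such summaries quickly shows which operator type got slower between software versions.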
However, we can learn a great deal about an AI program by summarizing the trace at a high level. To achieve this, we built a summary of trace statistics grouped by various AI operator types, as shown below.<\/span><\/p>\n<p><span>This operator breakdown shows our engineers which operators consume the most execution time and merit optimization. It also allows for comparisons and debugging of performance regressions between two software versions.<\/span><\/p>\n<h3><span>Trace critical path analysis<\/span><\/h3>\n<p><span>For advanced users who want to delve deeper into the traces, we added visualization support for both the open source<\/span><a href=\"https:\/\/www.chromium.org\/developers\/how-tos\/trace-event-profiling-tool\"> <span>Chrome trace viewer<\/span><\/a><span> and an internal trace visualization tool from Facebook. It all works from a single click. We can also run automated analysis on the trace to infer the critical path of operators, using the dependency graph of the AI model and trace statistics.<\/span><\/p>\n<p><span>This analysis lets us optimize the latency of the AI prediction. It can also highlight issues like an imbalance in operators. Doing so closed a 10 percent latency gap between the Caffe2 and PyTorch versions of one of our AI models.<\/span><\/p>\n<h3><span>Trace correlation<\/span><\/h3>\n<p><span>Lastly, it is worth noting that several software layers handle the processing of an inference request. These include the application layer, the PyTorch framework, and<\/span><a href=\"https:\/\/engineering.fb.com\/2018\/09\/13\/ml-applications\/glow-a-community-driven-approach-to-ai-infrastructure\/\"> <span>Glow<\/span><\/a><span>, an open source graph lowering compiler for accelerators.<\/span><\/p>\n<p><span>For more complex models involving video understanding or natural language processing, we learned that the model may be run partially on a CPU and partially on an accelerator. 
Thus, tracing the operations across multiple layers on the CPU and correlating them with the accelerator becomes a necessity.<\/span><\/p>\n<p><span>We developed a<\/span><a href=\"https:\/\/github.com\/pytorch\/glow\/pull\/5568\"> <span>prototype of trace correlation<\/span><\/a><span> in Glow and PyTorch. This allowed us to connect operations on the CPU in the Glow runtime to those on the accelerator. Trace correlation is important for examining the complex software stack used for AI inference.<\/span><\/p>\n<h2><span>Next steps<\/span><\/h2>\n<p><span>In addition to continuing to support next-generation AI and video accelerators using Asimov and Asicmon, we are also exploring:<\/span><\/p>\n<p>Open source specifications:<span> Many companies are building accelerator chips today, but the monitoring interfaces for accelerators lack standardization. We are collaborating with the<\/span><a href=\"https:\/\/www.opencompute.org\/wiki\/Server\/ODSA\"> <span>Open Domain-Specific Accelerators (ODSA)<\/span><\/a><span> project so the industry as a whole can benefit from a common specification.<\/span><br \/>\nTrace visualization and analysis:<span> We are investigating ways to automatically generate optimization recommendations from traces and to support better visualizations, such as integrating with TensorBoard.<\/span><br \/>\nDistributed tracing:<span> Since microservices do not run in isolation, we plan to explore how to correlate distributed traces collected by the Canopy distributed tracing tool with system-level accelerator traces. 
This would allow us to debug the end-to-end latency of microservices that use AI accelerators.<\/span><\/p>\n<h2><span>Thanks<\/span><\/h2>\n<p><em><span>We would like to thank our many collaborators at Facebook, including Jerry Liu, Thiara Ortiz, Jeremy Yang, Ashwin Poojary, Deng Pan, Craig Ross, Ashwin Narasimha, Gisle Dankel, Michael Anderson, Allan Di Wu, Yinghai Lu, Satish Nadathur, Garret Catron, and Jack Montgomery for supporting us in creating this framework.<\/span><\/em><\/p>\n<p>The post <a href=\"https:\/\/engineering.fb.com\/2021\/06\/28\/data-center-engineering\/asicmon\/\">Asicmon: A platform agnostic observability system for AI accelerators<\/a> appeared first on <a href=\"https:\/\/engineering.fb.com\/\">Facebook Engineering<\/a>.<\/p>\n<p><a href=\"https:\/\/engineering.fb.com\/2021\/06\/28\/data-center-engineering\/asicmon\/\">Read More<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>We will be hosting a talk about our work on, \u201cA Platform Agnostic Observability System for AI Accelerators\u201d during our virtual Systems @Scale event at 10:20 a.m. PT on Wednesday, June 30, followed by a live Q&amp;A session. Please submit any questions to systemsatscale@fb.com before the event. 
Accelerators are special-purpose hardware devices optimized for specific&hellip; <a class=\"more-link\" href=\"https:\/\/fde.cat\/index.php\/2021\/08\/31\/asicmon-a-platform-agnostic-observability-system-for-ai-accelerators\/\">Continue reading <span class=\"screen-reader-text\">Asicmon: A platform agnostic observability system for AI accelerators<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","footnotes":""},"categories":[7],"tags":[],"class_list":["post-326","post","type-post","status-publish","format-standard","hentry","category-technology","entry"],"jetpack_featured_media_url":"","jetpack-related-posts":[{"id":618,"url":"https:\/\/fde.cat\/index.php\/2022\/08\/10\/scaling-data-ingestion-for-machine-learning-training-at-meta\/","url_meta":{"origin":326,"position":0},"title":"Scaling data ingestion for machine learning training at Meta","date":"August 10, 2022","format":false,"excerpt":"Many of Meta\u2019s products, such as search and language translations, utilize AI models to continuously improve user experiences. As the performance of hardware we use to support training infrastructure increases, we need to scale our data ingestion infrastructure accordingly to handle workloads more efficiently. 
GPUs, which are used for training\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":869,"url":"https:\/\/fde.cat\/index.php\/2024\/05\/22\/composable-data-management-at-meta\/","url_meta":{"origin":326,"position":1},"title":"Composable data management at Meta","date":"May 22, 2024","format":false,"excerpt":"In recent years, Meta\u2019s data management systems have evolved into a composable architecture that creates interoperability, promotes reusability, and improves engineering efficiency.\u00a0 We\u2019re sharing how we\u2019ve achieved this, in part, by leveraging Velox, Meta\u2019s open source execution engine, as well as work ahead as we continue to rethink our data\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":682,"url":"https:\/\/fde.cat\/index.php\/2023\/02\/21\/how-meta-brought-av1-to-reels\/","url_meta":{"origin":326,"position":2},"title":"How Meta brought AV1 to Reels","date":"February 21, 2023","format":false,"excerpt":"We\u2019re sharing how we\u2019re enabling production and delivery of AV1 for Facebook Reels and Instagram Reels. We believe AV1 is the most viable codec for Meta for the coming years. It offers higher quality at a much lower bit rate compared with previous generations of video codecs. 
Meta has worked\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":313,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/how-facebook-deals-with-pcie-faults-to-keep-our-data-centers-running-reliably\/","url_meta":{"origin":326,"position":3},"title":"How Facebook deals with PCIe faults to keep our data centers running reliably","date":"August 31, 2021","format":false,"excerpt":"Peripheral component interconnect express (PCIe) hardware continues to push the boundaries of computing thanks to advances in transfer speeds, the number of available lanes for simultaneous data delivery, and a comparatively small footprint on motherboards. Today, PCIe connectivity-based hardware delivers faster data transfers and is one of the de facto\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":295,"url":"https:\/\/fde.cat\/index.php\/2021\/08\/31\/how-facebook-encodes-your-videos\/","url_meta":{"origin":326,"position":4},"title":"How Facebook encodes your videos","date":"August 31, 2021","format":false,"excerpt":"People upload hundreds of millions of videos to Facebook every day. 
Making sure every video is delivered at the best quality \u2014 with the highest resolution and as little buffering as possible \u2014 means optimizing not only when and how our video codecs compress and decompress videos for viewing, but\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":641,"url":"https:\/\/fde.cat\/index.php\/2022\/10\/18\/ocp-summit-2022-open-hardware-for-ai-infrastructure\/","url_meta":{"origin":326,"position":5},"title":"OCP Summit 2022: Open hardware for AI infrastructure","date":"October 18, 2022","format":false,"excerpt":"At OCP Summit 2022, we\u2019re announcing Grand Teton, our next-generation platform for AI at scale that we\u2019ll contribute to the OCP community. We\u2019re also sharing new innovations designed to support data centers as they advance to support new AI technologies: A new, more efficient version of Open Rack. Our Air-Assisted\u2026","rel":"","context":"In &quot;Technology&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/326","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/comments?post=326"}],"version-history":[{"count":1,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/326\/revisions"}],"predecessor-version":[{"id":384,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/posts\/326\/revisions\/384"}],"wp:attachment":[{"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/media?parent=326"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/categories
?post=326"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fde.cat\/index.php\/wp-json\/wp\/v2\/tags?post=326"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}