Authors: Konstantin Rygol (GigaIO), Hemal Shah (Broadcom), John Ihnotic (GigaIO), Kris Buggenhout (Advanced Micro Devices, Inc. (AMD))
Abstract: The rapid evolution of AI workloads is pushing the boundaries of interconnect design, requiring innovative approaches to balance performance, scalability, and efficiency. This BoF will explore cutting-edge advancements in AI interconnects and their impact on performance optimization.
Discussion topics include the evolving role of open standards in AI infrastructure, scale-out vs. scale-up networking solutions for AI workloads, innovative interconnect designs that redefine performance limits, and future-proofing AI networks: what’s next?
Attendees will engage in a forward-looking discussion with industry leaders and pioneering AI infrastructure startups, tackling the most pressing questions shaping the future of AI interconnects.
Long Description: Interconnects are a foundational technology for scaling AI workloads across multi-accelerator systems. In recent years, the landscape has been dominated by proprietary interconnects from a single market leader that also dominates the AI accelerator market. This has created a tightly coupled ecosystem between AI accelerators and network technologies, but this vertical integration can pose significant barriers to entry for emerging AI accelerator vendors, as well as limit innovation across the broader ecosystem. Open interconnect solutions that support a wide range of AI accelerators are critical to enabling choice, fostering healthy competition, and accelerating progress in AI system design.
This BoF will begin with a brief overview of PCIe, UALink, and Ultra Ethernet, covering which scale-out and scale-up architectures each enables and which classes of workloads each is best suited for. Next, AMD, Broadcom, and GigaIO will each present a concise, 3-5 slide case study showcasing a real-world example of how innovative interconnect solutions have driven significant performance improvements and reductions in TCO.
We will then move into a 40-45 minute Q&A-driven panel discussion, allowing attendees to interact directly with the panelists. To kick off the panel discussion, we will ask attendees the following questions: Which accelerators do you use now, and which are you planning to use in the near future? Which workloads will be relevant? What interconnect challenges do you face with these workloads?
To maintain high engagement levels, the panel discussion will dynamically adapt to the audience’s interests, as identified through the initial polls, questions asked, and ongoing feedback during the session. This flexible approach will ensure the BoF remains relevant and interactive, creating a valuable experience for all participants.
The audience Q&A segment will foster an open and collaborative dialogue, with panelists encouraging participants to propose new ideas, share insights, and explore ways to further enhance the concepts presented. Panelists are keenly interested in feedback from attendees on how users see emerging interconnect technologies, which challenges they face, and how panelists might be able to assist with those challenges.
This BoF has never been held before, so this kind of feedback is vital for the open interconnect community to thrive.