Authors: Kurtis Bowman (Advanced Micro Devices, Inc. (AMD)), Torsten Hoefler (ETH Zürich)
Abstract: Traditional interconnects no longer meet the performance, latency, and scalability demands of AI systems, creating a need for new approaches to data movement at the node and data center levels. In response, two new open industry organizations have emerged: UALink and the Ultra Ethernet Consortium (UEC). This BoF will discuss the two organizations' approaches to meeting the growing networking demands of AI applications.
This presentation will also explore the architectural challenges driving the need for these new fabrics and explain how UALink and UEC are addressing them.
Long Description: As AI workloads continue to grow in scale, complexity, and performance demands, they are rapidly outpacing the capabilities of traditional data center architectures. Legacy interconnect technologies, originally designed for general-purpose computing, are increasingly unable to meet the low-latency, high-bandwidth, and scalable communication requirements of modern AI and high-performance computing (HPC) systems. These systems depend on efficient data movement both within compute nodes (“scale-up”) and across the data center network (“scale-out”), making next-generation interconnects a critical enabler for future AI infrastructure.
In response to these evolving needs, two new open industry organizations have emerged to develop purpose-built solutions for AI and HPC communication challenges. UALink is focused on delivering high-performance, low-latency “scale-up” interconnects that enable faster and more efficient communication between accelerators, such as GPUs, within a single node or rack. UALink aims to standardize and promote an open interconnect fabric optimized for this accelerator-to-accelerator communication, ensuring that AI workloads can scale effectively within tightly coupled systems.
At the same time, the Ultra Ethernet Consortium (UEC) is working to evolve and optimize Ethernet for “scale-out” environments, enabling more intelligent, efficient networking across data center clusters. UEC’s mission is to enhance Ethernet with features tailored to AI and HPC traffic patterns, improving throughput, latency, and congestion control across massive, distributed infrastructures. By adapting Ethernet—a widely adopted, mature technology—to the needs of AI and HPC, UEC offers a pathway for interoperability and rapid ecosystem adoption at the data center level.
Together, UALink and UEC represent a coordinated effort to address the full spectrum of communication needs in AI infrastructure, from tightly integrated nodes to expansive, distributed systems. While UALink focuses on maximizing performance within localized compute clusters, UEC ensures those clusters can communicate efficiently across the broader network. These complementary approaches are critical to overcoming the architectural bottlenecks that currently limit scalability and efficiency in AI data centers.
This presentation will explore the architectural and performance challenges that have driven the need for new scale-up and scale-out fabrics. It will highlight the distinct roles UALink and UEC play in solving these challenges, the technologies each organization is developing, and how their solutions interoperate to deliver a cohesive, next-generation AI infrastructure. Attendees will gain insights into the direction of the AI compute ecosystem, the importance of open standards, and how collaboration across the industry is shaping the future of high-performance interconnects.