Workshop: PMBS25: The 16th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems
Authors: Amirreza Rastegari, Prabhat Ram, and Michael F. Ringenburg (Microsoft Corporation)
Abstract: The launch of Eagle, Azure’s hyper-scale supercomputer and Number 3 on the TOP500 list in November 2023, marked a new era in which cloud providers are at the forefront of supercomputing. Despite the rapid expansion of cloud-based supercomputing, public knowledge of its performance and scalability is limited, and misconceptions persist about the performance implications of the virtualization layer in cloud-based systems. To address these gaps, we present a comparative analysis of two cloud-based supercomputers: Azure Eagle, a hyper-scale system ranked Number 3 on the TOP500 list in November 2023, and Azure Reindeer, a small-scale system ranked Number 32 on the TOP500 list in November 2024.
Through a comprehensive performance analysis, we highlight differences in the computational efficiency and scaling characteristics of these systems relative to their bare-metal on-premises counterparts. We further quantify the overhead of Azure's virtualization layer, showing that its performance impact on real-world HPC workloads is less than 4%, with typical values of 2% to 3%.