The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

A Peak Performance Model for All-to-all on Hierarchical Systems and Its Applications


Workshop: PMBS25: The 16th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems

Authors: Rohini Uma-Vaideswaran (Georgia Institute of Technology), Joshua Romero (NVIDIA Corporation), Daniel Dotson (Georgia Institute of Technology), David Appelhans (NVIDIA Corporation), and P. K. Yeung (Georgia Institute of Technology)

Abstract: An accurate measure of communication performance is a key component of optimizing large scale high performance computing applications. This paper presents a model for the peak performance of all-to-all communication, in the context of systems composed of a hierarchy of interconnect bandwidths; a common trait of multi-GPU per node systems. We demonstrate an application of the model to distributed transposes, such as those encountered in distributed three-dimensional Fast Fourier Transforms. The model is validated on three different network architectures, using a variety of communication libraries, by measuring all-to-all and distributed transpose performance in a pseudo-spectral code for direct numerical simulations of 3D fluid turbulence. Both the model and the validation results provide insights into the impact of fast communication links located lower in the network hierarchy, the expected scaling for all-to-all bound problems, and performance considerations when selecting slab (1D) or pencil (2D) domain decompositions.


Back to PMBS25: The 16th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems Archive Listing Back to Full Workshop Archive Listing