The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Modelling Load Imbalance In Shared Memory Multicore Systems


Workshop: PMBS25: The 16th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems

Authors: Johannes Langguth (Simula Research Laboratory, University of Bergen); James Trotter (Simula Research Laboratory); and Xing Cai (University of Oslo, Simula Research Laboratory)

Abstract: Memory bandwidth has become the primary performance-limiting factor in many modern HPC applications. It also limits scalability, because the achievable memory bandwidth grows linearly only up to a small number of CPU cores. Once the number of cores concurrently using the memory system exceeds a threshold, the aggregate memory bandwidth quickly saturates.

To estimate the running time of a computation dominated by memory traffic, the mainstream strategy is to divide the expected total memory traffic volume by the maximum memory bandwidth. However, this implicitly assumes that memory traffic is homogeneous across cores, which is often not the case, and so the mainstream strategy yields inaccurate time estimates.
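As a rough illustration of the mainstream estimate described above, the sketch below divides total traffic by the maximum bandwidth and compares the result with a simple per-core bound. It is not the model proposed in the paper. The function names, the bandwidth figures, and the assumption that a single core cannot draw more than a fixed `core_bw` are all hypothetical and chosen only to show why uneven traffic can break the estimate.

```python
def mainstream_estimate(per_core_bytes, max_bandwidth):
    """Total memory traffic volume divided by the maximum aggregate bandwidth."""
    return sum(per_core_bytes) / max_bandwidth


def busiest_core_bound(per_core_bytes, single_core_bandwidth):
    """Lower bound on time, assuming no core can exceed single_core_bandwidth."""
    return max(per_core_bytes) / single_core_bandwidth


# Hypothetical numbers, for illustration only (not measured values).
GB = 1e9
max_bw = 100 * GB   # assumed saturated aggregate bandwidth, bytes/s
core_bw = 10 * GB   # assumed bandwidth one core can draw on its own, bytes/s

even = [10 * GB] * 10               # every core moves the same volume
uneven = [55 * GB] + [5 * GB] * 9   # same total volume, one heavily loaded core

for name, traffic in [("even", even), ("uneven", uneven)]:
    t_main = mainstream_estimate(traffic, max_bw)
    t_bound = busiest_core_bound(traffic, core_bw)
    print(f"{name}: mainstream = {t_main:.2f} s, busiest-core bound = {t_bound:.2f} s")
```

Both traffic patterns have the same total volume, so the mainstream estimate is identical for them, while the busiest-core bound is much larger in the uneven case under these assumed numbers.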

In this paper, we present a new performance model that specifically targets inhomogeneity in per-core memory traffic. The new model requires only three hardware parameters. Using several cases of uneven per-core memory traffic, we demonstrate its advantage over the mainstream strategy.

