The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

An Optimized Generalized Multi-Color Point Implicit Solver for Intel GPUs using OneAPI ESIMD


Workshop: IA^3 2025 — 15th Workshop on Irregular Applications: Architectures and Algorithms

Authors: Joseph Wassell and Mohammad Zubair (Old Dominion University); Aaron Walden, Gabriel Nastac, and Eric Nielsen (NASA Langley Research Center); and Timothèe Ewart (Intel Corporation)

Abstract: This paper presents an efficient implementation of a linear-solver kernel optimized for a range of block sizes, commonly used in large-scale computational fluid dynamics (CFD) simulations. The implementation targets Aurora, the Argonne Leadership Computing Facility's (ALCF) exascale machine featuring Intel Data Center GPU Max 1550 accelerators. Because of its low arithmetic intensity, the linear solver is memory-bandwidth-bound. The primary performance challenges stem from variable matrix row lengths and indirect memory access patterns inherent in unstructured-grid applications. Variable block sizes add complexity: they expose differing levels of intra-block parallelism and make it harder to use the 512-bit vector registers efficiently. We propose an optimized implementation using ESIMD APIs that efficiently vectorizes memory loads for block-sparse vector computations. We demonstrate that performance on the Intel Data Center GPU Max 1550 is within 10% of its bandwidth benchmark peak. We also compare the performance of the ESIMD kernels on Intel GPUs with CUDA-optimized implementations on NVIDIA GPUs.
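
The two challenges named in the abstract, variable row lengths and indirect memory access, can be illustrated with a minimal scalar C++ sketch of the off-diagonal part of a block-sparse sweep. This is not the authors' ESIMD kernel: the block compressed sparse row layout, the `BlockCsr` type, the `offdiag_residual` function, and the idea that `rows` holds the block rows of one color are all assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Block-sparse matrix in block compressed sparse row form.
// Hypothetical layout for illustration; not taken from the paper.
struct BlockCsr {
    int block_size;              // nb: every block is nb x nb, stored row-major
    std::vector<int> row_ptr;    // block row i owns blocks row_ptr[i] .. row_ptr[i+1]-1
    std::vector<int> col_idx;    // block-column index of each stored block
    std::vector<double> values;  // nb*nb entries per stored block
};

// For every block row i listed in `rows`, compute
//   y_i = b_i - sum_k A_k * x_{col_idx[k]}
// over the blocks k stored in row i. Rows of one color do not depend on
// each other, so the outer loop is the part that could run in parallel.
void offdiag_residual(const BlockCsr& a, const std::vector<int>& rows,
                      const std::vector<double>& x,
                      const std::vector<double>& b,
                      std::vector<double>& y) {
    const int nb = a.block_size;
    for (int i : rows) {
        for (int r = 0; r < nb; ++r) {
            y[i * nb + r] = b[i * nb + r];
        }
        // Variable row length: the trip count differs from row to row.
        for (int k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            const int j = a.col_idx[k];  // indirect access into x
            const double* blk =
                &a.values[static_cast<std::size_t>(k) * nb * nb];
            for (int r = 0; r < nb; ++r) {
                for (int c = 0; c < nb; ++c) {
                    y[i * nb + r] -= blk[r * nb + c] * x[j * nb + c];
                }
            }
        }
    }
}
```

In this sketch each matrix entry is read once and used for one multiply and one subtract. If the entries are stored in double precision, that is at most 2 floating-point operations per 8 bytes of matrix data before index and vector traffic are counted, which is consistent with the abstract's description of the solver as bandwidth-bound. A point-implicit update would presumably follow the residual with a solve against the diagonal block of each row; that step is omitted here.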

