Poster Type: ACM Student Research Competition, Graduate
Authors: Amr Abouelmagd (Tennessee Tech University), Stephanie Brink (Lawrence Livermore National Laboratory (LLNL)), Michael McKinsey (Lawrence Livermore National Laboratory (LLNL)), David Boehme (Lawrence Livermore National Laboratory (LLNL)), Jason Burmark (Lawrence Livermore National Laboratory (LLNL)), Brian Ryujin (Lawrence Livermore National Laboratory (LLNL)), Tom Scogland (Lawrence Livermore National Laboratory (LLNL)), Olga Pearce (Lawrence Livermore National Laboratory (LLNL))
Supervisor: Anthony Skjellum (Tennessee Tech University)
Abstract: Modern GPUs play a crucial role in accelerating a wide range of computational workloads. However, their performance is often limited by the memory access patterns of the kernels they execute. AMD’s MI300A APU supports multiple logical GPU partitioning modes to optimize compute resource allocation, offering new opportunities for performance tuning. In this work, we evaluate how different GPU kernels from the RAJA Performance Suite perform in various partitioning modes. Using hardware counters, we compare two kernels with identical computational complexity but different data layouts, highlighting how memory organization can influence performance outcomes. The results demonstrate that data layout and access patterns have a significant impact on runtime performance across different partitioning modes, even when computational complexity and problem size remain constant.
Best Poster Finalist (BP): no
Poster: PDF
Poster Summary: PDF
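
Illustrative sketch: The abstract's claim that data layout affects runtime even when computational work is unchanged can be illustrated with a small model. The sketch below counts how many cache lines one group of consecutive GPU threads touches under two layouts. Everything in it is an assumption made for illustration and is not taken from the poster or from the RAJA Performance Suite: the 64-thread group, the 128-byte cache line, the 8-byte elements, the 4 fields per record, and all function names are hypothetical.

```python
# Hypothetical parameters, chosen only for illustration.
THREADS_PER_GROUP = 64   # threads assumed to issue a load together
CACHE_LINE_BYTES = 128   # assumed cache line size
ELEM_BYTES = 8           # one double-precision value
FIELDS = 4               # fields per record in the interleaved layout


def lines_touched(addresses, line_bytes=CACHE_LINE_BYTES):
    """Count the distinct cache lines covered by a list of byte addresses."""
    return len({addr // line_bytes for addr in addresses})


def interleaved_addresses(field, n_threads=THREADS_PER_GROUP):
    """Array-of-structures: thread i reads one field of record i."""
    return [(i * FIELDS + field) * ELEM_BYTES for i in range(n_threads)]


def contiguous_addresses(field, n_records, n_threads=THREADS_PER_GROUP):
    """Structure-of-arrays: each field lives in its own contiguous array."""
    base = field * n_records * ELEM_BYTES
    return [base + i * ELEM_BYTES for i in range(n_threads)]


if __name__ == "__main__":
    aos = lines_touched(interleaved_addresses(field=0))
    soa = lines_touched(contiguous_addresses(field=0, n_records=1024))
    print(f"interleaved layout: {aos} cache lines per load")
    print(f"contiguous layout:  {soa} cache lines per load")
```

Under these assumptions both access patterns load the same 64 values, yet the interleaved layout touches 16 cache lines and the contiguous layout touches 4. This is a simplified model of memory traffic, not a measurement, and it says nothing about how the MI300A partitioning modes change the outcome.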