The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Accessing Serialized Data Formats with GPU-Initiated I/O


Workshop: PDSW'25: The 10th International Parallel Data Systems Workshop

Authors: Luke Logan, Anthony Kougkas, and Xian-He Sun (Illinois Tech)

Abstract: Graphics Processing Units (GPUs) have become essential for scientific data analysis, yet they remain constrained by traditional I/O architectures that rely on data movement initiated by the CPU. While recent GPU-initiated I/O systems like BaM and GeminiFS partially address this limitation, they do not support access to complex serialized data formats such as HDF5, NetCDF, and ADIOS within GPU kernels. These formats are ubiquitous in scientific computing, but reimplementing their existing I/O libraries for direct GPU access would be prohibitively expensive.

This work explores a hybrid approach that enables GPU kernels to access serialized data formats through GPU-initiated I/O transfers to a specialized CPU runtime. Our design preserves the rich functionality of existing data format ecosystems while enabling GPU kernels to perform I/O. Our evaluations demonstrate minimal overhead compared to the traditional CPU-initiated approach. As future work, we are exploring reimplementation of I/O libraries to bypass the CPU runtime when possible.
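The request path described above can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: Python threads stand in for the GPU kernel and the CPU runtime, a `queue.Queue` stands in for the GPU-to-CPU transfer channel, and the names `DATASETS`, `cpu_runtime`, `gpu_kernel_read`, and `run_demo` are hypothetical. It only shows the control flow in which the kernel side posts a read request and the runtime side services it on the kernel's behalf.

```python
import queue
import threading

# Hypothetical in-memory stand-in for a serialized file (e.g. one dataset).
DATASETS = {"/temperature": list(range(16))}


def cpu_runtime(requests):
    """Service read requests until a None sentinel arrives."""
    while True:
        req = requests.get()
        if req is None:
            break
        name, offset, count, reply = req
        # Assumption: a real runtime would call the format library here
        # (HDF5, NetCDF, ADIOS) instead of slicing a Python list.
        reply.put(DATASETS[name][offset:offset + count])


def gpu_kernel_read(requests, name, offset, count):
    """Stand-in for a GPU thread: post a request, then wait for the data."""
    reply = queue.Queue(maxsize=1)
    requests.put((name, offset, count, reply))
    return reply.get()


def run_demo():
    requests = queue.Queue()
    worker = threading.Thread(target=cpu_runtime, args=(requests,))
    worker.start()
    data = gpu_kernel_read(requests, "/temperature", 4, 3)
    requests.put(None)  # tell the runtime to shut down
    worker.join()
    return data
```

Calling `run_demo()` reads three elements starting at offset 4 of the hypothetical `/temperature` dataset. In this sketch the format-specific logic lives entirely on the runtime side, which mirrors the stated goal of reusing existing I/O libraries rather than porting them to the GPU.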

