Workshop: PDSW'25: The 10th International Parallel Data Systems Workshop
Authors: Nikoli Joseph Dryden (Lawrence Livermore National Laboratory); Quincey Koziol (NVIDIA); Hariharan Devarajan (Lawrence Livermore National Laboratory); Glenn Lockwood (VAST Data); Bogdan Nicolae (Argonne National Laboratory (ANL)); and Michela Taufer (University of Tennessee, Knoxville)
Abstract: Deep learning workloads are driving the HPC landscape, from foundation models and surrogate models to emerging agentic workflows. The I/O characteristics of these workloads challenge the assumptions underlying traditional HPC storage systems and libraries: whereas classical modeling/simulation workflows favor large, sequential writes and predictable access patterns, AI workflows demand random-access I/O for dataset shuffling and burst bandwidth for model checkpoints, often at the same time. The coupling of simulations with AI models further complicates storage requirements. In this panel, we will examine the challenges of managing data movement and storage for AI workloads, the requirements of I/O systems and how existing ones must evolve, and the open challenges for the field.