The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

To Stream or Not to Stream: Towards A Quantitative Model for Remote HPC Processing Decisions


Workshop: The 12th Annual International Workshop on Innovating the Network for Data-Intensive Science (INDIS)

Authors: Flavio Castro, Weijian Zheng, Joaquin Chung, Ian Foster, and Raj Kettimuthu (Argonne National Laboratory (ANL))

Abstract: Modern scientific instruments generate data at rates that increasingly outpace local compute capabilities, making traditional file-based workflows inadequate for time-sensitive analysis and experimental steering. Real-time streaming frameworks promise lower latency and improved efficiency, but lack a principled method for assessing feasibility. We introduce a quantitative framework and an accompanying Streaming Speed Score to evaluate whether remote high-performance computing (HPC) resources can provide more timely data processing than local alternatives. Our model incorporates key parameters, including data generation rate, transfer efficiency, remote processing power, and file I/O overhead, to compute the total processing completion time (Tpct) and identify regimes where streaming is beneficial. We validate our approach through case studies from facilities such as APS, FRIB, LCLS-II, and the LHC. Our measurements show that streaming can achieve up to 97% lower end-to-end completion time than file-based methods under high data rates, while worst-case congestion can increase transfer times by over an order of magnitude.
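
The abstract does not give the paper's formula for Tpct, so the Python sketch below is only an illustrative toy model, not the authors' method. It assumes that a file-based workflow runs its stages back to back (generate, stage to file, transfer, process), while a streaming workflow overlaps them so that the slowest stage sets the pace. The `Workload` fields and both function names are hypothetical and were chosen to mirror the parameters listed in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # All fields are illustrative assumptions, not values from the paper.
    data_gb: float             # total data volume produced (GB)
    gen_gb_per_s: float        # instrument data generation rate (GB/s)
    link_gb_per_s: float       # nominal network bandwidth (GB/s)
    transfer_eff: float        # fraction of nominal bandwidth achieved, in (0, 1]
    remote_gb_per_s: float     # remote processing throughput (GB/s)
    file_io_s: float           # file write/read overhead (s), file-based path only


def file_based_time(w: Workload) -> float:
    """Assumed sequential model: each stage waits for the previous one."""
    generate = w.data_gb / w.gen_gb_per_s
    transfer = w.data_gb / (w.link_gb_per_s * w.transfer_eff)
    process = w.data_gb / w.remote_gb_per_s
    return generate + w.file_io_s + transfer + process


def streaming_time(w: Workload) -> float:
    """Assumed pipelined model: stages overlap, so the bottleneck rate
    determines completion time (pipeline fill latency is ignored)."""
    bottleneck = min(
        w.gen_gb_per_s,
        w.link_gb_per_s * w.transfer_eff,
        w.remote_gb_per_s,
    )
    return w.data_gb / bottleneck


if __name__ == "__main__":
    example = Workload(
        data_gb=100.0,
        gen_gb_per_s=10.0,
        link_gb_per_s=10.0,
        transfer_eff=0.8,
        remote_gb_per_s=20.0,
        file_io_s=5.0,
    )
    print("file-based:", file_based_time(example))
    print("streaming: ", streaming_time(example))
```

In this toy model, lowering `transfer_eff` (for example, to mimic congestion) inflates both estimates, which is consistent in spirit with the abstract's point that network conditions can decide whether streaming pays off. The numbers in the example are made up for illustration.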

