SC25 Proceedings

Research and ACM SRC Posters Archive

Inference-as-a-Service Prototype at NERSC

Poster Type: Research Posters

Author: Colin Thomas (University of Notre Dame), Po-Han Huang (Georgia Institute of Technology), Hilary Utaegbulam (University of Rochester), Johannes Blaschke (ESnet; Lawrence Berkeley National Laboratory (LBNL)), Bruno Coimbra (Fermi National Laboratory), Pengfei Ding (ESnet; Lawrence Berkeley National Laboratory (LBNL)), Xiangyang Ju (ESnet; Lawrence Berkeley National Laboratory (LBNL)), Andrew Naylor (ESnet; Lawrence Berkeley National Laboratory (LBNL)), Michael Wang (Fermi National Laboratory)

Abstract: The increasing scale and complexity of scientific experiments have led to a growing need for efficient and scalable machine learning (ML) model inference serving systems. High-energy physics experiments and simulations of complex climate models involve petabytes of data and require massive computational resources to produce accurate results. Thus, scientists are increasingly turning to ML techniques to analyze and interpret the vast amounts of data generated by these experiments.

However, deploying ML models in scientific applications poses significant challenges. Traditional approaches, in which individual users deploy ML models on local resources or small clusters, often suffer from high startup costs and inefficient resource utilization. To address these challenges, we present a prototype system that provides on-demand inference serving for multiple scientific ML models. Our system is deployed across the NERSC Perlmutter supercomputer and the NERSC Kubernetes (K8s) cluster, enabling on-demand scalability.

Best Poster Finalist (BP): no