The International Conference for High Performance Computing, Networking, Storage, and Analysis

Workshops Archive

Compute4Biology: Taking Stock of High Performance Computing Needs for Foundation Models in Biological Sciences


Workshop: AI4S: 6th Workshop on Artificial Intelligence and Machine Learning for Scientific Applications

Authors: Pratik Dutta (Stony Brook University) and Tirthankar Ghosal (Oak Ridge National Laboratory (ORNL))

Abstract: Foundation models are driving a paradigm shift across the life sciences, yet their transformative potential is fundamentally coupled to high-performance computing (HPC). The computational workloads from genomics, transcriptomics, proteomics, chemistry, and biomedical literature are remarkably diverse, creating distinct challenges for HPC infrastructure. This paper presents the first systematic, cross-domain analysis of these HPC needs. We characterize and compare the specific bottlenecks inherent to each domain—from the massive I/O of genomics to the intense memory pressure of proteomics and the unique compute kernels of molecular modeling. Analyzing these diverse workloads allows us to identify key trade-offs in hardware utilization and software design. We conclude by outlining a unified set of best practices and co-design principles for building next-generation HPC systems capable of accelerating discovery across the full spectrum of AI-driven science.

