Workshop: 7th International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC)
Authors: Aditya Bhosale, Kavitha Chandrasekar, and Laxmikant Kale (University of Illinois at Urbana-Champaign) and Sara Kokkila-Schumacher (IBM Thomas J. Watson Research Center)
Abstract: The pay-as-you-go cost model of cloud resources has necessitated specialized programming models and schedulers that let HPC jobs use those resources efficiently. A key aspect of efficient use is the ability to rescale applications on the fly. Most commonly used parallel programming models, such as MPI, have traditionally not supported autoscaling, either in cloud environments or on supercomputers. Charm++ is a parallel programming model that natively supports dynamic rescaling through its migratable-objects paradigm. We present a Kubernetes operator for running Charm++ applications on a Kubernetes cluster. We also present a priority-based elastic job scheduler that dynamically rescales jobs based on the state of the cluster to maximize cluster utilization while minimizing response time for high-priority jobs. We show that our elastic scheduler delivers significant performance improvements over traditional static schedulers.
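Illustrative sketch: The abstract does not specify the scheduling policy, so the following Python fragment is only one plausible reading of "priority-based elastic scheduling", not the authors' algorithm. It assumes each job declares a minimum and maximum worker count and that a larger priority value means a more important job; the names Job, allocate, min_workers, and max_workers are hypothetical. Jobs are first admitted at their minimum size in priority order, and leftover workers are then handed out in the same order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    priority: int      # assumed convention: larger value = more important
    min_workers: int   # smallest size the job can run at (assumed >= 1)
    max_workers: int   # largest size the job can make use of


def allocate(jobs, capacity):
    """Return {job name: worker count} for a cluster with `capacity` workers."""
    # sorted() is stable, so jobs with equal priority keep their input order.
    ordered = sorted(jobs, key=lambda j: -j.priority)
    alloc = {j.name: 0 for j in jobs}
    free = capacity

    # Pass 1: admit jobs at their minimum size, highest priority first.
    for job in ordered:
        if job.min_workers <= free:
            alloc[job.name] = job.min_workers
            free -= job.min_workers

    # Pass 2: grow admitted jobs toward their maximum with leftover workers.
    for job in ordered:
        if alloc[job.name] == 0:
            continue  # not admitted in pass 1; the job stays queued
        extra = min(free, job.max_workers - alloc[job.name])
        alloc[job.name] += extra
        free -= extra

    return alloc


# A low-priority job alone expands to fill the cluster.
low = Job("low", priority=1, min_workers=4, max_workers=16)
alone = allocate([low], capacity=16)

# When a high-priority job arrives, recomputing the allocation shrinks
# the low-priority job to its minimum instead of making the new job wait.
high = Job("high", priority=5, min_workers=8, max_workers=12)
shared = allocate([low, high], capacity=16)
```

Under these assumptions, a static scheduler would either leave the high-priority job queued until the low-priority job finished or leave workers idle. Recomputing the allocation on each arrival or completion is what lets a rescalable runtime such as Charm++ trade workers between jobs.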