The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research and ACM SRC Posters Archive

Compute System Simulator: Modeling the Impact of Allocation Policy and Hardware Reliability on HPC Cloud Resource Utilization


Poster Type: Research Posters

Authors: Jarrod Leddy (Microsoft Corporation), Huseyin Yildiz (Microsoft Corporation)

Abstract: We have developed a simulation tool that models the launch, progression, and completion of virtual machines and their workloads within a cloud cluster of arbitrary size. The simulator applies various policies to allocate computational resources to these virtual machines, simulates hardware failures and workload interruptions, and reallocates resources as needed. The primary goal of this work is to test how allocation policy design interacts with various types of hardware failure, and to analyze the expected resource utilization and workload delay in each scenario. The simulator's modular design provides a framework for implementing and analyzing new allocation policies as they emerge. Through a series of experiments, we use the simulator to compare how effectively different policies manage resource allocation when hardware fails, informing the optimization of cloud infrastructure and the development of resilient resource management strategies.
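To make the kind of experiment described in the abstract concrete, here is a minimal sketch of a time-stepped cluster simulation in Python. It is not the authors' simulator. Every name in it (`Node`, `first_fit`, `best_fit`, `simulate`) and every parameter value is a hypothetical chosen for illustration. It assumes one VM arrives per step, a failed node returns to service after a fixed repair time, and an interrupted VM resumes with its remaining runtime intact, as if it had been checkpointed.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    capacity: int                      # cores on this node
    used: int = 0                      # cores currently allocated
    healthy: bool = True
    vms: list = field(default_factory=list)   # each VM is [cores, remaining_steps]


def first_fit(nodes, cores):
    """Return the index of the first healthy node with room, else None."""
    for i, n in enumerate(nodes):
        if n.healthy and n.capacity - n.used >= cores:
            return i
    return None


def best_fit(nodes, cores):
    """Return the healthy node that would have the least spare room, else None."""
    best = None
    for i, n in enumerate(nodes):
        free = n.capacity - n.used
        if n.healthy and free >= cores:
            if best is None or free < nodes[best].capacity - nodes[best].used:
                best = i
    return best


def simulate(policy, n_nodes=8, capacity=16, steps=200,
             fail_prob=0.01, repair_steps=10, seed=0):
    """Run one simulation; return (utilization, total VM-steps spent waiting)."""
    rng = random.Random(seed)
    nodes = [Node(capacity) for _ in range(n_nodes)]
    repair_at = {}        # node index -> step at which it returns to service
    pending = []          # VMs waiting for capacity
    used_core_steps = 0
    delay_steps = 0
    for t in range(steps):
        # Bring repaired nodes back into service.
        for i in [i for i, due in repair_at.items() if due <= t]:
            nodes[i].healthy = True
            del repair_at[i]
        # Hardware failures interrupt every VM on the node and requeue it.
        for i, n in enumerate(nodes):
            if n.healthy and rng.random() < fail_prob:
                n.healthy = False
                repair_at[i] = t + repair_steps
                pending.extend(n.vms)
                n.vms, n.used = [], 0
        # One new VM request arrives per step (illustrative workload).
        pending.append([rng.choice([2, 4, 8]), rng.randint(5, 20)])
        # Try to place every waiting VM using the chosen policy.
        still_waiting = []
        for vm in pending:
            idx = policy(nodes, vm[0])
            if idx is None:
                still_waiting.append(vm)
            else:
                nodes[idx].vms.append(vm)
                nodes[idx].used += vm[0]
        pending = still_waiting
        delay_steps += len(pending)
        # Advance running VMs by one step and release the finished ones.
        for n in nodes:
            if not n.healthy:
                continue
            used_core_steps += n.used
            for vm in n.vms:
                vm[1] -= 1
            n.vms = [vm for vm in n.vms if vm[1] > 0]
            n.used = sum(vm[0] for vm in n.vms)
    utilization = used_core_steps / (n_nodes * capacity * steps)
    return utilization, delay_steps


if __name__ == "__main__":
    for policy in (first_fit, best_fit):
        util, delay = simulate(policy, fail_prob=0.02)
        print(f"{policy.__name__}: utilization={util:.3f}, delay={delay} VM-steps")
```

Because each policy is a plain function with the same signature, adding another one means writing one more function and passing it to `simulate`. This loosely mirrors the modular design the abstract describes, although the real tool's interfaces are not given in this listing.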

Best Poster Finalist (BP): no