The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research and ACM SRC Posters Archive

Scalable Execution Framework for R on Manycore Systems


Poster Type: Research Posters

Authors: Xiran Zhang (King Abdullah University of Science and Technology (KAUST)), Javier Conejero (Barcelona Supercomputing Center (BSC)), Sameh Abdulah (King Abdullah University of Science and Technology (KAUST)), Jorge Ejarque (Barcelona Supercomputing Center (BSC)), Ying Sun (King Abdullah University of Science and Technology (KAUST)), Rosa M. Badia (Barcelona Supercomputing Center (BSC)), David E. Keyes (King Abdullah University of Science and Technology (KAUST)), Marc G. Genton (King Abdullah University of Science and Technology (KAUST))

Abstract: RCOMPSs is a scalable execution framework that integrates the R programming language with the COMPSs runtime to enable task-based parallel execution on manycore and distributed systems. RCOMPSs extends conventional R workflows by allowing functions to be annotated as tasks, which the runtime system analyzes to construct a directed acyclic graph (DAG) of task dependencies. This graph guides dynamic scheduling, dependency resolution, and data transfers, abstracting parallel execution from the user while preserving correctness. A straightforward dataset standardization example illustrates the minimal programming effort needed to leverage parallelism, while a more complex application, K-means clustering, demonstrates the framework's capability to represent iterative statistical algorithms in a task-oriented manner. Performance evaluation on Shaheen-III and MareNostrum 5 shows strong scalability up to 32 nodes with near-linear speedup, efficient weak scalability with increasing problem sizes, and effective utilization of up to 128 and 80 threads per node, respectively.
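
The task-based pattern described for dataset standardization can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library; it is not RCOMPSs code and does not use the COMPSs API. The function names (partial_stats, merge_stats, scale_block, standardize) and the block-wise decomposition are assumptions made for illustration. A runtime such as COMPSs would infer the dependencies between tasks automatically, whereas here they are made explicit by the order of the calls, and the thread pool only shows the dependency structure rather than real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def partial_stats(block):
    """Independent task: count, sum, and sum of squares of one block."""
    n = len(block)
    s = sum(block)
    ss = sum(x * x for x in block)
    return n, s, ss


def merge_stats(parts):
    """Reduction task: depends on every partial_stats result."""
    n = sum(p[0] for p in parts)
    s = sum(p[1] for p in parts)
    ss = sum(p[2] for p in parts)
    mean = s / n
    var = ss / n - mean * mean  # population variance
    return mean, math.sqrt(var)


def scale_block(block, mean, std):
    """Independent task: depends only on its block and the merged stats."""
    return [(x - mean) / std for x in block]


def standardize(blocks, workers=4):
    """Run the three stages; blocks within a stage are processed concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(partial_stats, blocks))
        mean, std = merge_stats(parts)
        return list(pool.map(lambda b: scale_block(b, mean, std), blocks))


if __name__ == "__main__":
    # Hypothetical single-column dataset split into three blocks.
    print(standardize([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```

The dependency graph in this sketch has three levels: one partial_stats task per block, a single merge_stats task that waits on all of them, and one scale_block task per block that waits on the merge. The sketch assumes a single numeric column with nonzero variance.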

Best Poster Finalist (BP): no

