Workshop: The 11th International Workshop on Data Analysis and Reduction for Big Scientific Data
Authors: Junghyun Ryu, Soon Hwang, Junhyeok Park, and Seonghoon Ahn (Sogang University); JeoungAhn Park, Jeongjin Lee, Jinna Yang, Soonyeal Yang, and Jungki Noh (SK hynix Inc.); Qing Zheng (Los Alamos National Laboratory (LANL)); Woosuk Chung and Hoshik Kim (SK hynix Inc.); and Youngjae Kim (Sogang University)
Abstract: Existing object storage systems like AWS S3 and MinIO offer only limited in-storage compute capabilities, typically restricted to simple SQL WHERE-clause filtering. Consequently, high-impact operators, such as aggregation and top-N, are still executed entirely at the compute layer. Recent advances in Object-based Computational Storage (OCS) enable these complex operators to run natively within storage, creating opportunities for substantial reductions in data movement and query time. To demonstrate these benefits in distributed SQL engines, we used Presto as a case study and developed the Presto-OCS connector, which analyzes execution plans to identify pushdown-eligible operators and offloads them to OCS for efficient in-storage execution. Evaluations with real-world HPC analytics queries and the TPC-H benchmark show that our approach achieves up to a 4.07× speedup and a 99% reduction in data movement compared to filter-only pushdown. When combined with compression techniques, our approach delivers a 1.39× speedup over compressed filter-only pushdown, demonstrating that advanced query pushdown complements existing optimizations.