The International Conference for High Performance Computing, Networking, Storage, and Analysis

Research and ACM SRC Posters Archive

Can Long-Haul RDMA Benefit Federated Learning?


Poster Type: Research Posters

Authors: Zhonghao Chen (University of California, Merced), Yuke Li (University of California, Merced), Duo Zhang (University of California, Merced), Xiaoyi Lu (University of California, Merced)

Abstract: Federated learning (FL) has emerged as a promising paradigm for privacy-preserving distributed training. However, its performance is often limited by communication bottlenecks, especially over long-distance networks. In this work, we investigate the effectiveness of long-haul remote direct memory access (RDMA) as a high-performance communication substrate for FL. We develop a simulation framework that uses rate-limiting techniques to emulate wide-area RDMA deployments, enabling accurate comparisons with traditional TCP/IP networks. Our evaluations show that long-haul RDMA can reduce communication time by up to 90.79% under WAN-like conditions and decrease total runtime by as much as 85.83%. These results underscore RDMA's promise for accelerating FL across geographically distributed settings.
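To put the reported percentages in perspective, the following is a minimal sketch (not the authors' simulation framework) that converts a percentage reduction in time into an equivalent speedup factor, alongside a simple first-order model of transfer time over a rate-limited link. The function names, the latency-plus-serialization model, and the example link parameters are all assumptions made for illustration; only the 90.79% and 85.83% figures come from the abstract.

```python
def transfer_time(payload_bytes, rate_bps, rtt_s, round_trips=1):
    """First-order transfer time on a rate-limited link (assumed model):
    round-trip latency cost plus serialization time at the capped rate."""
    return round_trips * rtt_s + (payload_bytes * 8) / rate_bps


def reduction_percent(baseline_s, improved_s):
    """Percentage by which improved_s is smaller than baseline_s."""
    return 100.0 * (baseline_s - improved_s) / baseline_s


def speedup_from_reduction(percent):
    """Speedup factor implied by a percentage reduction in time."""
    return 1.0 / (1.0 - percent / 100.0)


if __name__ == "__main__":
    # Figures quoted in the abstract, expressed as speedup factors.
    print(f"communication: {speedup_from_reduction(90.79):.2f}x")
    print(f"total runtime: {speedup_from_reduction(85.83):.2f}x")

    # Hypothetical link: 1 MB update, 8 Mbit/s rate cap, 50 ms RTT.
    print(f"example transfer: {transfer_time(1_000_000, 8_000_000, 0.05):.2f} s")
```

Under this reading, a reduction of up to 90.79% in communication time corresponds to roughly a 10.9x speedup on the communication phase, and a reduction of up to 85.83% in total runtime to roughly a 7.1x end-to-end speedup.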

Best Poster Finalist (BP): no