Authors: Neena Imam (Southern Methodist University), Nagi Rao (Oak Ridge National Laboratory (ORNL)), Ian Foster (Argonne National Laboratory (ANL), University of Chicago), Benjamin Brown (DOE Office of Advanced Scientific Computing Research)
Abstract: Continuum computing is a distributed, multi-layered ecosystem that spans sensors at the edge, interconnected instruments, data centers, supercomputers, and, more recently, quantum computers. These interconnected systems form a digital continuum in which computation is orchestrated in stages. The rising complexity of the continuum is accompanied by a corresponding increase in its vulnerability. The convergence of AI, data-intensive applications, and mobile workloads necessitates a reevaluation of strategies for securing these systems. We will explore resilience in continuum computing, emphasizing technologies and architectures that protect data, models, and computation across a federated landscape, including advances in quantum networks for secure communication.
Long Description: The new paradigm of continuum computing (also known as the digital continuum) is a distributed, multi-layered ecosystem that spans sensors at the edge, interconnected instruments, cloud platforms, data centers, exascale supercomputers, and, more recently, quantum computers. Continuum computing has evolved to keep pace with the growth of geographically distributed science and hybrid computing infrastructures. In the continuum paradigm, computation and data are orchestrated in stages from the edge to the core to optimize data movement and response times. Novel solutions are needed for system design, software frameworks, workflows that can react to dynamic data sizes, monitoring tools, multisite governance policies, and actionable experimental metrics, among other areas. We organized the inaugural BoF on continuum computing at SC22, which discussed the state of the art in the digital continuum. The SC23 BoF discussed the aggregation and synthesis of previously distinct techniques and tools (such as HPC, AI/ML, and digital twins) to advance continuum computing. At SC24, we focused on the role of quantum information science in advancing continuum computing. For SC25, our theme is resilience in the digital continuum.
The rising complexity of continuum computing has been accompanied by a corresponding increase in its vulnerability. The convergence of AI, data-intensive applications, and expanding mobile workloads necessitates a reevaluation of strategies for securing and sustaining these systems. We will explore resilience in continuum computing, emphasizing technologies and architectures that protect data, models, and computation across a federated landscape. The key areas of concern are secure data movement, the integrity of distributed AI models, and the resilience of communication and computation infrastructures against adversarial, environmental, and operational disruptions. Resilient solutions include data transport across dispersed resources that does not compromise confidentiality. Another key requirement is resilient data storage across the continuum: as data is increasingly generated and used outside of centralized data centers, ensuring redundancy, tamper resistance, and rapid recovery becomes essential. The session will also explore robust Federated Learning (FL), a privacy-preserving method for distributed AI that enables local model training and secure parameter sharing without centralizing raw data. We will discuss FL’s resilience to adversarial threats.
Quantum networking is a new frontier of science that promises capabilities unachievable over conventional networks for secure communication among quantum, conventional, and hybrid systems. In exploring quantum networks, we will highlight both promise and realism. Quantum networks promise strong communication security, but practical deployment is constrained by distance and by the need to integrate with classical systems. We will discuss how quantum-enhanced links might eventually augment continuum computing security, especially for mission-critical workloads.
Attendees will receive an overview of current challenges and innovations for secure and resilient continuum infrastructures, along with a set of community-driven open questions. This BoF aims to catalyze cross-disciplinary collaboration among system architects, AI researchers, quantum physicists, cybersecurity experts, and application scientists to co-develop the next generation of trustworthy continuum computing environments. Given its cross-disciplinary scope and timeliness, we expect the session to be well received. The session leaders bring a diverse set of perspectives from academia (Imam), national labs (Rao, Foster), and government (Brown).
Website: https://smu.edu/Provost/odonnell-institute/News/SC25-BoF