[Paper] ScaleAcross: Designing Multi-Data-Center Infrastructure for Geo-Distributed AI Training

Published: June 11, 2026 at 02:48 AM EDT
Source: arXiv - 2606.12963v1

Overview

The rapid growth of AI models and increasing data sovereignty requirements are driving the transition toward geo-distributed AI training across multiple data centers. Such deployments introduce system-level challenges arising from synchronization-intensive communication, cross-site data exchange, and wide-area latency constraints. This paper investigates EVPN-VXLAN as an infrastructure foundation for geo-distributed AI training environments and presents a scalable emulation framework for systematically studying distributed AI workloads under realistic wide-area conditions. The proposed framework combines VXLAN overlays with EVPN-based inter-data-center connectivity and is implemented using ContainerLab and FRRouting (FRR). The framework further incorporates Equal-Cost Multi-Path (ECMP) routing, Bidirectional Forwarding Detection (BFD), and a queue-pair-aware traffic distribution mechanism designed to improve communication behavior for synchronization-intensive AI workloads while preserving compatibility with commodity infrastructure. Using realistic WAN emulation, we characterize communication and system behavior under distributed training workloads employing AllReduce and Parameter Server communication patterns. Results provide insights into traffic distribution, resilience, and infrastructure behavior in geo-distributed AI environments, highlighting the potential of reproducible multi-data-center infrastructure frameworks for scalable distributed AI training.
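
To illustrate why a queue-pair-aware mechanism might matter alongside ECMP, here is a minimal Python sketch. It is not the paper's mechanism, which the abstract does not describe. It assumes the queue pairs are RDMA queue pairs carried over RoCEv2 (UDP destination port 4791) and, as a worst case, that every queue pair between two hosts uses the same UDP source port. The function names, addresses, and the CRC32 hash are hypothetical choices made for illustration.

```python
import zlib
from collections import Counter

ROCEV2_UDP_PORT = 4791  # IANA-assigned UDP destination port for RoCEv2
UDP_PROTO = 17


def ecmp_five_tuple(src_ip, dst_ip, src_port, dst_port, proto, n_paths):
    """Classic ECMP: hash the 5-tuple and pick one of n_paths next hops."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % n_paths


def ecmp_qp_aware(src_ip, dst_ip, src_port, dst_port, proto, dest_qp, n_paths):
    """Hypothetical variant: also mix the destination queue pair number into the hash."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}|{dest_qp}".encode()
    return zlib.crc32(key) % n_paths


def spread(n_paths=4, n_qps=64):
    """Count how many queue pairs land on each path under both hash schemes."""
    # Assumption: one host pair, identical UDP source port for every queue pair.
    flows = [
        ("10.0.1.10", "10.0.2.10", 49152, ROCEV2_UDP_PORT, UDP_PROTO, qp)
        for qp in range(n_qps)
    ]
    plain = Counter(ecmp_five_tuple(*f[:5], n_paths) for f in flows)
    aware = Counter(ecmp_qp_aware(*f, n_paths) for f in flows)
    return plain, aware


if __name__ == "__main__":
    plain, aware = spread()
    print("5-tuple ECMP paths used:", len(plain))
    print("QP-aware ECMP paths used:", len(aware))
```

Under these assumptions, the plain 5-tuple hash places all queue pairs on a single path because their headers are identical, while the variant that includes the queue pair number can use several paths. Real devices hash in hardware with vendor-specific functions, and many RoCEv2 stacks already vary the UDP source port per queue pair, so the sketch only conveys the general idea.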

Key Contributions

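According to the abstract, the paper contributes:

  • An investigation of EVPN-VXLAN as an infrastructure foundation for geo-distributed AI training.
  • A scalable emulation framework, implemented with ContainerLab and FRRouting (FRR), that combines VXLAN overlays with EVPN-based inter-data-center connectivity.
  • Integration of ECMP routing, BFD, and a queue-pair-aware traffic distribution mechanism aimed at synchronization-intensive AI workloads on commodity infrastructure.
  • A characterization of communication and system behavior for AllReduce and Parameter Server training workloads under emulated WAN conditions.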

Methodology

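The abstract describes an emulation-based approach: the multi-data-center fabric is built with ContainerLab and FRRouting (FRR), wide-area conditions are imposed through WAN emulation, and distributed training workloads using AllReduce and Parameter Server communication patterns are run on top. Please refer to the full paper for the detailed methodology.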
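
For intuition about why the AllReduce and Parameter Server patterns named in the overview can react differently to wide-area latency, a rough latency-bandwidth cost model can help. The sketch below is a textbook-style approximation, not the paper's evaluation method. It assumes every transfer crosses a link with the same latency and bandwidth, a single parameter server whose link is shared by all workers, and no overlap of communication with computation. The `Link` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Link:
    latency_s: float       # per-message latency in seconds
    bandwidth_bps: float   # usable bandwidth in bytes per second


def ring_allreduce_time(n_workers, model_bytes, link):
    """Ring AllReduce: 2*(N-1) steps, each moving model_bytes/N per worker."""
    steps = 2 * (n_workers - 1)
    chunk = model_bytes / n_workers
    return steps * (link.latency_s + chunk / link.bandwidth_bps)


def parameter_server_time(n_workers, model_bytes, link):
    """Single server: a push phase and a pull phase, each limited by the server link."""
    per_phase = link.latency_s + n_workers * model_bytes / link.bandwidth_bps
    return 2 * per_phase


if __name__ == "__main__":
    # Hypothetical numbers: 8 workers, 1 GB of gradients, 10 Gbit/s link, 40 ms latency.
    wan = Link(latency_s=0.040, bandwidth_bps=10e9 / 8)
    print("ring AllReduce   :", ring_allreduce_time(8, 1e9, wan), "s")
    print("parameter server :", parameter_server_time(8, 1e9, wan), "s")
```

In this model the ring pays the link latency 2(N-1) times per synchronization, so it grows more latency-sensitive as workers are added, whereas the single parameter server pays latency only twice but concentrates N times the model size on one link.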

Practical Implications

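The abstract suggests that multi-data-center training fabrics can be studied reproducibly with commodity, open-source tooling (ContainerLab and FRR) under emulated WAN conditions. This is most relevant to practitioners in networking (cs.NI) and distributed computing (cs.DC) who are planning geo-distributed AI training.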

Authors

  • Naved Inam
  • Aryan Alpesh Bhavsar
  • Masabattula Teja Nikhil
  • Sidharth Sharma

Paper Information

  • arXiv ID: 2606.12963v1
  • Categories: cs.NI, cs.DC, cs.ET
  • Published: June 11, 2026
  • PDF: Download PDF