[Paper] Delta Sum Learning: an approach for fast and global convergence in Gossip Learning
Source: arXiv - 2512.01549v1
Overview
The paper introduces Delta Sum Learning, a novel aggregation technique for Gossip‑based federated learning that dramatically improves global model convergence while keeping the communication overhead low. By coupling this method with a declarative, Kubernetes‑style orchestration layer, the authors demonstrate how edge devices can collaboratively train models at scale without a central server.
Key Contributions
- Delta Sum aggregation: a lightweight, delta‑based summation rule that replaces the traditional averaging step in Gossip Learning.
- Decentralized orchestration framework: built on the Open Application Model (OAM), enabling dynamic node discovery and intent‑driven deployment of learning workloads via standard manifests.
- Empirical evaluation: shows comparable performance to existing methods on small (10‑node) topologies and a 58 % reduction in global accuracy loss when scaling to 50 nodes.
- Scalability analysis: demonstrates a logarithmic degradation of accuracy with increasing network size, versus the linear drop observed with classic gossip averaging.
Methodology
Delta Sum Learning
- Each node maintains a local model and a delta vector that captures the difference between its current model and the last received update.
- When two peers exchange information, they sum their deltas instead of averaging full model parameters.
- The summed delta is applied locally, and the original delta is reset, ensuring that only new information propagates through the network (see the sketch after this list).
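A minimal NumPy sketch of this exchange rule, under one plausible reading of the description above; the `Node` class, its method names, and the accumulate-then-reset bookkeeping are illustrative assumptions, not the authors' reference implementation:

```python
import numpy as np

class Node:
    """Toy gossip node: a flat parameter vector plus a delta buffer.
    Names and structure are illustrative, not the paper's API."""

    def __init__(self, dim: int, rng: np.random.Generator):
        self.model = rng.normal(size=dim)   # local model parameters
        self.delta = np.zeros(dim)          # changes since the last exchange

    def local_step(self, grad: np.ndarray, lr: float = 0.1) -> None:
        # Ordinary local update, also accumulated into the delta so that
        # peers later receive only the *new* information.
        update = -lr * grad
        self.model += update
        self.delta += update

def delta_sum_exchange(a: Node, b: Node) -> None:
    # Each peer sends its delta; both models end up containing the *sum*
    # of the two deltas (each already held its own), and the buffers are
    # reset so only fresh updates propagate onward.
    da, db = a.delta.copy(), b.delta.copy()
    a.model += db
    b.model += da
    a.delta[:] = 0.0
    b.delta[:] = 0.0
```

Under this reading, one exchange costs a single delta vector per direction, which is where the bandwidth savings reported below come from.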
Decentralized Orchestration (OAM‑based)
- Learning tasks are described in OAM manifests (similar to Kubernetes YAML); a hypothetical manifest is sketched after this list.
- A lightweight discovery protocol lets nodes join or leave the gossip overlay automatically.
- The orchestrator translates intents (e.g., “train a CNN on edge cameras”) into concrete deployments of the Delta Sum learner on each participating device.
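As a concrete illustration of the intent-driven deployment step, the sketch below builds a hypothetical OAM-style Application manifest as a Python dict mirroring the YAML structure. The `apiVersion` and `kind` values follow the public OAM spec; every other value (component name, type, image, environment) is an invented placeholder, since the paper does not reproduce its manifests.

```python
import json

# Hypothetical OAM Application manifest for a Delta Sum learner,
# mirrored as a Python dict. All concrete values are placeholders.
manifest = {
    "apiVersion": "core.oam.dev/v1beta1",   # public OAM spec version
    "kind": "Application",
    "metadata": {"name": "delta-sum-training"},
    "spec": {
        "components": [{
            "name": "delta-sum-learner",
            "type": "webservice",            # assumed component type
            "properties": {
                "image": "registry.example.org/delta-sum-learner:latest",
                "env": [{"name": "GOSSIP_FANOUT", "value": "2"}],
            },
        }],
    },
}

# A real orchestrator would consume the YAML form of this document.
print(json.dumps(manifest, indent=2))
```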
Experimental Setup
- Simulated gossip networks of 10, 30, and 50 nodes using standard image classification benchmarks (e.g., CIFAR‑10); a toy stand-in for this protocol is sketched after this list.
- Baselines: classic gossip averaging and Federated Averaging (FedAvg).
- Metrics: convergence speed (epochs to reach a target loss), final global accuracy, and communication volume.
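A toy stand-in for this protocol, replacing CIFAR-10 and CNNs with a synthetic quadratic task so it runs in milliseconds; the node count, learning rate, and round budget below are arbitrary choices, not the paper's settings:

```python
import numpy as np

# Toy stand-in for the experimental protocol: N gossiping nodes
# optimizing a synthetic quadratic task instead of CIFAR-10/CNNs.
rng = np.random.default_rng(0)
N, DIM, ROUNDS, LR = 10, 5, 300, 0.05
target = rng.normal(size=DIM)          # shared optimum all nodes seek
models = rng.normal(size=(N, DIM))     # one model vector per node
deltas = np.zeros((N, DIM))            # per-node "new information" buffers

for _ in range(ROUNDS):
    # Local SGD step on 0.5 * ||m - target||^2, accumulated into deltas.
    updates = -LR * (models - target)
    models += updates
    deltas += updates
    # One random pair gossips: apply the peer's delta, then reset both
    # buffers so only fresh updates propagate (the Delta Sum rule).
    i, j = rng.choice(N, size=2, replace=False)
    models[i] += deltas[j]
    models[j] += deltas[i]
    deltas[i] = deltas[j] = 0.0

# Convergence metric from the setup: mean distance to the optimum.
print("mean distance:", float(np.mean(np.linalg.norm(models - target, axis=1))))
```

Swapping the two `models[...] += deltas[...]` lines for a plain average of the pair's parameters gives the gossip-averaging baseline, so the same harness can probe both methods.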
Results & Findings
| Topology | Gossip Averaging Accuracy Drop | Delta Sum Accuracy Drop | Relative Improvement |
|---|---|---|---|
| 10 nodes | 2.1 % | 2.0 % | ≈ 5 % (negligible) |
| 30 nodes | 7.8 % | 4.5 % | 42 % reduction |
| 50 nodes | 12.4 % | 5.2 % | 58 % reduction |
- Convergence speed: Delta Sum reaches the same loss threshold ~1.3× faster on 50‑node graphs.
- Communication overhead: Because only deltas are exchanged, bandwidth usage drops by ~15 % compared with full‑model averaging.
- Scalability trend: Accuracy loss grows logarithmically with node count for Delta Sum, while the classic approach shows a near‑linear degradation, confirming the method’s robustness under limited connectivity.
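A rough back-of-the-envelope check (not a fit from the paper) shows the table above is consistent with these trends; the 10-node points are noisier:

```latex
% Baseline: accuracy drop grows near-linearly with node count n
\frac{7.8}{30} \approx \frac{12.4}{50} \approx 0.25 \text{ points per node}

% Delta Sum: accuracy drop grows roughly with log n
\frac{4.5}{\log_{10} 30} \approx \frac{5.2}{\log_{10} 50} \approx 3.0 \text{ points per decade}
```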
Practical Implications
- Edge AI deployments: Developers can embed learning workloads directly into IoT fleets (e.g., smart cameras, wearables) without provisioning a central parameter server.
- Kubernetes‑style roll‑outs: Using OAM manifests means existing CI/CD pipelines can provision, update, or roll back learning jobs across heterogeneous devices just like any microservice.
- Reduced bandwidth costs: Delta‑only exchanges are ideal for networks with constrained uplink/downlink (cellular, LPWAN), extending battery life and lowering data‑plan expenses.
- Fault tolerance: Since the aggregation is fully peer‑to‑peer, node churn (devices joining/leaving) does not break training, making the approach suitable for highly dynamic edge environments.
Limitations & Future Work
- Model size sensitivity: The study focused on moderate‑sized CNNs; very large transformer‑style models may still incur significant delta payloads.
- Security considerations: While gossip removes a central server, the paper does not address Byzantine or malicious peers; integrating robust aggregation (e.g., Krum) with Delta Sum is an open question.
- Real‑world deployment: Experiments were conducted in simulated networks; future work includes field trials on heterogeneous hardware (ARM, GPUs) and heterogeneous network conditions (5G, Wi‑Fi, BLE).
Delta Sum Learning bridges the gap between the theoretical appeal of fully decentralized federated learning and the practical needs of developers building scalable, edge‑centric AI services.
Authors
- Tom Goethals
- Merlijn Sebrechts
- Stijn De Schrijver
- Filip De Turck
- Bruno Volckaert
Paper Information
- arXiv ID: 2512.01549v1
- Categories: cs.DC, cs.AI
- Published: December 1, 2025