[Paper] Self-Adaptive Probabilistic Skyline Query Processing in Distributed Edge Computing via Deep Reinforcement Learning
Source: arXiv - 2601.21855v1
Overview
The paper tackles a pressing problem in edge‑centric IoT systems: how to answer Probabilistic Skyline Queries (PSKY) efficiently when massive, uncertain sensor streams are processed across distributed edge nodes and the cloud. Traditional solutions use fixed filtering thresholds, which either flood the network with data or overload edge devices. The authors introduce SA‑PSKY, a self‑adapting framework that continuously tunes these thresholds using deep reinforcement learning, dramatically cutting communication traffic and latency.
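To make the query concrete: in a probabilistic skyline, each uncertain tuple has a probability of not being dominated by any other tuple, and an edge node forwards only tuples whose skyline probability clears its local threshold. The sketch below is a minimal Monte-Carlo illustration of that filtering step, not the paper's algorithm; the function names (`skyline_probability`, `local_filter`) and the samples-per-tuple representation are assumptions for illustration.

```python
import random

def dominates(a, b):
    """True if point a dominates b: <= in every dimension, < in at least one
    (lower values are assumed better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_probability(tuple_samples, other_samples, trials=1000, rng=None):
    """Monte-Carlo estimate of the probability that an uncertain tuple
    (a list of possible instances) is dominated by no other tuple."""
    rng = rng or random.Random(0)
    survived = 0
    for _ in range(trials):
        point = rng.choice(tuple_samples)            # draw one instance of this tuple
        others = [rng.choice(s) for s in other_samples]  # one instance of each rival
        if not any(dominates(o, point) for o in others):
            survived += 1
    return survived / trials

def local_filter(candidates, threshold):
    """Return indices of tuples whose estimated skyline probability
    meets the node's current filter threshold."""
    kept = []
    for i, samples in enumerate(candidates):
        others = candidates[:i] + candidates[i + 1:]
        if skyline_probability(samples, others) >= threshold:
            kept.append(i)
    return kept
```

A higher threshold prunes more aggressively (less traffic, more risk of missing skyline members); a lower one forwards more candidates. SA-PSKY's contribution is learning where to set this dial at runtime.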
Key Contributions
- Self‑adaptive threshold control: Formulates the dynamic filtering problem as a continuous Markov Decision Process (MDP) and solves it with a Deep Deterministic Policy Gradient (DDPG) agent.
- Joint cost optimization: Simultaneously minimizes communication overhead and local computation time, rather than treating them as separate concerns.
- Edge‑cloud collaborative architecture: Designs a lightweight protocol for exchanging state information (arrival rates, uncertainty distributions, resource availability) between edge nodes and a central controller.
- Extensive empirical validation: Shows up to 60 % reduction in network traffic and 40 % lower end‑to‑end response time compared with static‑threshold and heuristic baselines across varied data distributions.
- Scalability analysis: Demonstrates stable performance as the number of edge nodes and data dimensions grow, confirming suitability for large‑scale IoE deployments.
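The lightweight edge-to-controller protocol mentioned above carries arrival rates, uncertainty statistics, and resource availability. A plausible shape for such a per-window report is sketched below; the field names and `EdgeStateReport` class are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class EdgeStateReport:
    """Per-window telemetry an edge node might send to the central controller
    (hypothetical message layout)."""
    node_id: str
    arrival_rate: float       # tuples/sec observed in the last window
    mean_uncertainty: float   # e.g. average attribute variance of the stream
    cpu_load: float           # fraction in [0.0, 1.0]
    mem_free_mb: float
    bandwidth_mbps: float

    def to_json(self) -> str:
        """Serialize for the edge-to-controller link."""
        return json.dumps(asdict(self))
```

Because each report is a handful of scalars, the control channel stays cheap even as node counts grow, which is consistent with the paper's sub-2 ms decision-cycle overhead.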
Methodology
- Problem Modeling – Each edge node receives a stream of multi‑dimensional, uncertain tuples. Before forwarding candidates to the cloud, it applies a filter intensity (a probabilistic threshold) that decides how aggressively to prune local results. The optimal intensity depends on real‑time factors:
  - Data arrival rate (how fast new sensor readings appear)
  - Uncertainty distribution (confidence intervals of each attribute)
  - Resource snapshot (CPU, memory, and network bandwidth at the node)
- MDP Formulation – The system state is a vector of the above metrics, and the action is the continuous threshold value for each node. The reward combines negative communication cost (bytes sent) and negative computation cost (local processing time), encouraging the agent to find a sweet spot.
- Deep Reinforcement Learning – A DDPG agent (actor‑critic architecture) learns a deterministic policy mapping states to thresholds.
  - Actor network: outputs the threshold.
  - Critic network: estimates the expected cumulative reward for a given state‑action pair.
  - Experience replay and soft target updates stabilize training in the non‑stationary edge environment.
- Deployment Loop – At each time window, edge nodes report their state to the controller, the DDPG policy computes new thresholds, and nodes adjust their local filters accordingly. This loop runs continuously, allowing the system to react to workload spikes, network congestion, or hardware failures.
Results & Findings
- Communication Savings: Across synthetic and real IoT datasets, SA‑PSKY cuts the amount of data sent from edge to cloud by 45‑60 % versus a static 0.5 probability threshold baseline.
- Latency Reduction: End‑to‑end query response time drops 30‑40 %, primarily because fewer candidates need to be merged in the cloud and edge nodes avoid unnecessary local computation.
- Robustness to Distribution Shifts: When the underlying data uncertainty changes (e.g., sensor calibration drift), the learned policy quickly adapts, maintaining low overhead without manual retuning.
- Scalability: Experiments with up to 128 edge nodes and 10‑dimensional skylines show near‑linear scaling; the overhead of the RL controller remains negligible (<2 ms per decision cycle).
Practical Implications
- Edge‑First Analytics: Developers building real‑time dashboards for smart cities, industrial IoT, or autonomous fleets can embed SA‑PSKY to keep bandwidth usage low while still delivering accurate skyline results (e.g., “best‑performing” devices under uncertainty).
- Resource‑Aware Service Orchestration: Cloud platforms can integrate the DDPG controller as a micro‑service that continuously optimizes data ingestion pipelines, reducing cost on pay‑per‑bandwidth cloud links.
- Plug‑and‑Play Deployment: The framework requires only lightweight telemetry (CPU, network stats) from edge nodes, making it compatible with existing container‑orchestrated edge runtimes (K3s, OpenYurt).
- Extensible to Other Queries: The same RL‑based threshold tuning can be repurposed for top‑k, nearest‑neighbor, or anomaly‑detection queries where a trade‑off between local pruning and remote aggregation exists.
Limitations & Future Work
- Training Overhead: The DDPG agent needs an initial offline training phase with representative workloads; abrupt, unseen workload patterns may cause temporary sub‑optimal thresholds.
- State Granularity: The current state vector omits fine‑grained network latency variance, which could improve decisions in highly volatile wireless links.
- Security Considerations: The framework assumes trustworthy telemetry; future work could explore robust RL against malicious edge nodes that falsify state reports.
- Broader Benchmarking: Extending evaluation to heterogeneous hardware (e.g., ARM‑based edge devices) and real‑world production pipelines would solidify the claims.
Bottom line: SA‑PSKY demonstrates that deep reinforcement learning can turn a traditionally static, hand‑tuned query processing component into a self‑optimizing service, unlocking tangible bandwidth and latency gains for the next generation of edge‑centric data platforms.
Authors
- Chuan-Chi Lai
Paper Information
- arXiv ID: 2601.21855v1
- Categories: cs.DC, cs.DB, cs.NI
- Published: January 29, 2026