[Paper] Exact, Efficient, and Reliable Multi-Objective and Multi-Constrained IoT Workflow Scheduling in Edge-Hub-Cloud Cyber-Physical Systems
Source: arXiv - 2604.24340v1
Overview
The paper presents an exact, multi‑objective scheduling algorithm for IoT workflows that span edge devices, a hub, and the cloud. By formulating the problem as a continuous‑time mixed‑integer linear program (MILP), the authors simultaneously minimize latency, energy consumption, and reliability loss while respecting a full set of real‑world constraints (deadlines, memory, storage, processing capabilities, etc.). Their results show sizable gains over a state‑of‑the‑art heuristic, with practical runtimes even for realistic workflow sizes.
Key Contributions
- Holistic MILP formulation that jointly optimizes three conflicting objectives (latency, energy, reliability) for edge‑hub‑cloud workflows.
- Comprehensive constraint handling: deadlines, reliability thresholds, CPU/memory/storage limits, and heterogeneous multicore capabilities are all modeled explicitly.
- Selective task duplication strategy that improves reliability without the overhead of blind replication.
- Extended benchmark heuristic to enable a fair head‑to‑head comparison with the exact approach.
- Extensive evaluation on a real IoT workflow and synthetic graphs (up to dozens of tasks) across multiple system configurations, demonstrating up to ~30 % improvements in each objective.
- Demonstrated scalability: solution times remain on the order of seconds to minutes for problem sizes relevant to typical CPS deployments.
Methodology
- Problem Modeling – Each workflow is represented as a directed acyclic graph (DAG) where nodes are tasks and edges capture data dependencies.
- Decision Variables – Binary variables decide where (edge, hub, cloud) each task runs and whether a replica is created. Continuous variables capture start times and resource usage.
- Objective Function – A weighted sum of three metrics:
  - Latency (makespan of the workflow)
  - Energy (CPU + communication energy across all devices)
  - Reliability penalty (probability of task failure, reduced by duplication)
  The weights can be tuned to reflect the developer’s priorities.
- Constraints –
  - Timing: task start times respect precedence and the global deadline.
  - Resource: per‑node CPU, memory, and storage limits; multicore scheduling is modeled via capacity constraints.
  - Reliability: overall failure probability must stay below a user‑defined threshold.
  - Duplication control: the number of replicas is capped to avoid unnecessary overhead.
- Solution Technique – The MILP is solved with an off‑the‑shelf optimizer such as CPLEX or Gurobi. Because the formulation is continuous‑time, it avoids the discretization errors common in time‑slot‑based schedulers.
- Benchmark Heuristic – An existing list‑scheduling heuristic is extended to respect the same constraints, providing a baseline for comparison.
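The core idea of the formulation can be illustrated without a MILP solver. The toy sketch below exhaustively enumerates every edge/hub/cloud placement for a serial three‑task workflow and picks the one minimizing a weighted sum of normalized latency, energy, and reliability penalty. All tier parameters and the serial-chain workflow are hypothetical illustrations, not the paper's model; the paper handles general DAGs, many more constraints, and solves the problem with a MILP optimizer rather than enumeration.

```python
from itertools import product

# Hypothetical per-tier parameters (illustrative only, not from the paper):
# execution time (s), energy per task (J), and per-task failure probability.
TIERS = {
    "edge":  {"time": 4.0, "energy": 1.0, "p_fail": 0.050},
    "hub":   {"time": 2.0, "energy": 2.5, "p_fail": 0.020},
    "cloud": {"time": 1.0, "energy": 4.0, "p_fail": 0.005},
}

def evaluate(placement):
    """Score one placement of a serial workflow (toy model):
    latency = sum of task times, energy = sum of task energies,
    reliability penalty = probability that at least one task fails."""
    latency = sum(TIERS[t]["time"] for t in placement)
    energy = sum(TIERS[t]["energy"] for t in placement)
    p_all_ok = 1.0
    for t in placement:
        p_all_ok *= 1.0 - TIERS[t]["p_fail"]
    return latency, energy, 1.0 - p_all_ok

def best_schedule(weights, n_tasks=3):
    """Exhaustively enumerate all tier assignments and return the one
    minimizing the weighted sum of (normalized) objectives."""
    w_lat, w_en, w_rel = weights
    best = None
    for placement in product(TIERS, repeat=n_tasks):
        lat, en, rel = evaluate(placement)
        # Normalize latency/energy by the worst per-task value (4.0 here)
        # so the three terms are on comparable scales.
        score = (w_lat * lat / (4.0 * n_tasks)
                 + w_en * en / (4.0 * n_tasks)
                 + w_rel * rel)
        if best is None or score < best[0]:
            best = (score, placement)
    return best[1]

# Latency-heavy weights push tasks toward the fast cloud; energy-heavy
# weights pull them back to the frugal edge.
print(best_schedule((0.8, 0.1, 0.1)))  # → ('cloud', 'cloud', 'cloud')
print(best_schedule((0.1, 0.8, 0.1)))  # → ('edge', 'edge', 'edge')
```

Enumeration is exponential in the number of tasks, which is exactly why the authors rely on a MILP solver's branch-and-bound machinery instead.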
Results & Findings
| Metric | MILP vs. heuristic | Practical effect |
|---|---|---|
| Latency (makespan) | ↓ up to 29.8 % | Faster workflow completion |
| Energy | ↓ up to 33.9 % | Lower power draw on edge nodes |
| Reliability (failure prob.) | ↓ up to 28.5 % | Fewer missed deadlines / crashes |
- Runtime: For workflows with up to ~30 tasks, the MILP solved in ≤ 2 minutes on a standard workstation—acceptable for offline deployment planning.
- Scalability: Synthetic tests show near‑linear growth in solve time up to the tested size; beyond ~50 tasks, runtimes increase sharply, suggesting a hybrid approach for larger DAGs.
- Sensitivity: By adjusting objective weights, developers can prioritize latency (e.g., for real‑time control) or energy (e.g., battery‑powered sensors) without re‑engineering the model.
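The reliability improvements stem from selective duplication: a task with one extra replica fails only if every copy fails, so its effective failure probability drops from p to p². The sketch below uses a simple greedy loop (with made-up failure probabilities; this is an illustration of the mechanism, not the paper's MILP-based replica selection) to show how a few well-chosen replicas can push workflow failure probability below a threshold.

```python
from math import prod

# Hypothetical per-task failure probabilities on their assigned nodes
# (illustrative numbers, not from the paper).
P_FAIL = [0.04, 0.01, 0.03, 0.02]

def workflow_failure(p_fail, replicas):
    """Failure probability of the whole workflow when task i runs with
    replicas[i] extra copies: a task fails only if every copy fails."""
    p_ok = prod(1 - p ** (k + 1) for p, k in zip(p_fail, replicas))
    return 1 - p_ok

def duplicate_selectively(p_fail, threshold):
    """Greedy sketch of selective duplication: add one replica at a time
    to the currently least reliable task until the workflow failure
    probability drops below the threshold."""
    replicas = [0] * len(p_fail)
    while workflow_failure(p_fail, replicas) > threshold:
        # Duplicate the task whose effective failure probability is worst.
        worst = max(range(len(p_fail)),
                    key=lambda i: p_fail[i] ** (replicas[i] + 1))
        replicas[worst] += 1
    return replicas

print(duplicate_selectively(P_FAIL, threshold=0.02))  # → [1, 0, 1, 1]
```

Three replicas suffice here, whereas blind replication would duplicate all four tasks; the paper's formulation makes an analogous trade-off jointly with placement and timing decisions.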
Practical Implications
- Deployment Planning Tools – Cloud/edge orchestration platforms can embed the MILP as a “what‑if” optimizer, generating optimal placement maps before runtime.
- Edge‑AI Pipelines – For inference tasks that must meet strict latency and energy budgets, the scheduler can decide which layers run locally vs. on the hub/cloud, while automatically adding redundancy for safety‑critical stages.
- Service‑Level Agreements (SLAs) – Operators can guarantee latency and reliability thresholds by feeding SLA parameters directly into the objective weights and constraints.
- Energy‑aware Firmware Updates – When rolling out updates to heterogeneous edge fleets, the model can schedule download, verification, and installation tasks to minimize peak power usage.
- Hybrid Edge‑Hub‑Cloud Architectures – The work validates that exact optimization is feasible for realistic CPS sizes, encouraging designers to move beyond heuristic‑only solutions.
Limitations & Future Work
- Scalability Ceiling – The MILP becomes computationally heavy for workflows > 50 tasks; future research could explore decomposition, column generation, or learning‑guided heuristics to retain optimality guarantees at larger scales.
- Static Workflows – The current formulation assumes a known DAG ahead of time. Extending the model to handle dynamic task arrivals or runtime re‑scheduling (e.g., due to node failures) would broaden applicability.
- Network Variability – Communication latency/energy are modeled with fixed parameters; incorporating stochastic network conditions or adaptive bandwidth allocation could improve realism.
- Hardware Heterogeneity – While multicore heterogeneity is captured, emerging accelerators (TPUs, FPGAs) are not explicitly modeled; integrating accelerator‑specific constraints is a natural next step.
Bottom line: By delivering an exact, multi‑objective scheduler that respects the full gamut of edge‑hub‑cloud constraints, this work gives developers a powerful new tool for building reliable, low‑latency, and energy‑efficient IoT applications—provided the workflow size stays within the demonstrated sweet spot.
Authors
- Andreas Kouloumpris
- Georgios L. Stavrinides
- Maria K. Michael
- Theocharis Theocharides
Paper Information
- arXiv ID: 2604.24340v1
- Categories: cs.DC
- Published: April 27, 2026