[Paper] A reliability- and latency-driven task allocation framework for workflow applications in the edge-hub-cloud continuum

Published: February 20, 2026 at 06:50 AM EST
4 min read
Source: arXiv - 2602.18158v1

Overview

The paper tackles a pressing problem for modern edge‑hub‑cloud deployments: how to assign the individual tasks of a latency‑sensitive, reliability‑critical workflow across a tiny edge device, an intermediate hub, and the cloud. By formulating the allocation as an exact multi‑objective optimization problem, the authors demonstrate that substantial gains in both reliability and response time are achievable—something that existing heuristic‑only approaches have struggled to guarantee.

Key Contributions

  • Exact multi‑objective formulation: A binary integer linear program (BILP) that simultaneously maximizes workflow reliability and minimizes end‑to‑end latency, with a tunable weight to reflect business priorities.
  • Time‑redundancy integration: The model embeds replication of tasks in time (re‑execution) as a first‑class decision variable, allowing the optimizer to trade extra compute cycles for higher fault tolerance.
  • Comprehensive constraint set: Captures realistic limits such as edge/hub CPU/memory caps, network bandwidth, deadline requirements, and the non‑preemptive nature of many IoT workloads—constraints often omitted in prior work.
  • Extensive evaluation: Tests on a real‑world video‑analytics workflow and a suite of synthetic DAGs (varying depth, width, and criticality) show reliability improvements of up to ≈84 % and latency reductions of ≈50 % versus baseline heuristics.
  • Scalable solution times: Solver runtimes stay under a minute (0.03 – 50.94 s) even for the largest synthetic workflows, making the approach practical for offline planning or near‑real‑time re‑allocation.
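
As a sketch, the scalarized objective the first bullet alludes to could be written as follows. The notation here is ours, not the paper's; in a real BILP the reliability product would typically be linearized (e.g., by working with log‑reliabilities):

```latex
% x_{i,k} = 1 iff task i runs on tier k; r_i = 1 iff a time-redundant
% replica of task i is scheduled; L_max is a latency normalizer.
\max_{x,\,r}\;\; \lambda\, R(x,r) \;-\; (1-\lambda)\,\frac{L(x,r)}{L_{\max}}
\qquad \text{s.t.} \qquad
\sum_{k \in \{\text{edge},\,\text{hub},\,\text{cloud}\}} x_{i,k} = 1 \quad \forall i
```

Here R is the product of per‑task success probabilities (boosted by redundancy) and L is the critical‑path latency, matching the objective construction described in the methodology below.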

Methodology

  1. Workflow modeling – Each application is represented as a directed acyclic graph (DAG) where nodes are tasks and edges denote data dependencies.
  2. Resource abstraction – The three tiers (edge, hub, cloud) are modeled with their own processing speeds, failure probabilities, and communication latencies.
  3. Decision variables – Binary variables indicate whether a task runs on a given tier and whether a time‑redundant replica is scheduled.
  4. Objective construction
    • Reliability: product of per‑task success probabilities (including redundancy).
    • Latency: sum of processing times plus network transfer delays along the critical path.
    • Scalarization: a weight λ lets users prioritize one objective over the other.
  5. Constraints – Enforce that each task is placed exactly once, respect CPU/memory caps, keep total latency below any user‑specified deadline, and ensure that redundant executions do not exceed resource budgets.
  6. Solver – The BILP is fed to a standard mixed‑integer programming solver (e.g., CPLEX/Gurobi). Because the problem size stays modest for typical edge‑hub‑cloud scenarios, the solver finds optimal solutions quickly.
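
The six steps above can be sketched end‑to‑end with exhaustive search standing in for the BILP solver. Everything below is hypothetical (tier parameters, task names, the latency normalizer); at toy scale, enumerating the same decision space the BILP explores gives the same optimum that CPLEX/Gurobi would return:

```python
import itertools

# Hypothetical tier parameters: per-task run time, per-execution failure
# probability, and communication latency paid to reach that tier.
TIERS = {
    "edge":  {"time": 5.0, "fail": 0.10, "net": 0.0},
    "hub":   {"time": 2.0, "fail": 0.05, "net": 1.0},
    "cloud": {"time": 1.0, "fail": 0.01, "net": 4.0},
}
TASKS = ["decode", "detect", "aggregate"]  # a linear DAG: decode -> detect -> aggregate
LAMBDA = 0.5                               # scalarization weight between the two objectives

def evaluate(placement, replicas):
    """Return (reliability, latency) of one allocation.

    placement: tier name per task; replicas: 1 = single run,
    2 = time-redundant re-execution on the same tier.
    """
    reliability, latency = 1.0, 0.0
    for tier, r in zip(placement, replicas):
        p_fail = TIERS[tier]["fail"] ** r          # task fails only if every replica fails
        reliability *= 1.0 - p_fail
        latency += r * TIERS[tier]["time"] + TIERS[tier]["net"]
    return reliability, latency

def score(placement, replicas):
    rel, lat = evaluate(placement, replicas)
    return LAMBDA * rel - (1 - LAMBDA) * lat / 20.0  # crude latency normalization

# Enumerate every placement x replication pattern (the BILP's feasible set
# for this tiny chain) and keep the best scalarized score.
best = max(
    itertools.product(itertools.product(TIERS, repeat=len(TASKS)),
                      itertools.product((1, 2), repeat=len(TASKS))),
    key=lambda pr: score(*pr),
)
print(best, evaluate(*best))
```

For three tasks this is only 27 × 8 = 216 candidate allocations; the point of the exact BILP is that a MIP solver prunes this space with branch‑and‑bound instead of enumerating it, which is why the reported runtimes stay low even at 80 tasks.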

Results & Findings

| Scenario | Reliability gain vs. baseline | Latency reduction vs. baseline | Solver time |
|---|---|---|---|
| Real‑world video analytics workflow | +84.19 % | −49.81 % | 0.12 s |
| Small synthetic DAG (10 tasks) | +71 % | −38 % | 0.03 s |
| Medium DAG (30 tasks) | +66 % | −45 % | 1.8 s |
| Large DAG (80 tasks) | +58 % | −52 % | 50.94 s |

Takeaway: By jointly optimizing reliability and latency, the framework consistently outperforms naïve “edge‑first” or “cloud‑first” heuristics. The improvements hold across a wide range of workflow topologies and criticality levels, confirming the model’s robustness.

Practical Implications

  • Edge‑centric SaaS providers can use the framework to generate deployment blueprints that guarantee service‑level agreements (SLAs) for both uptime and response time, without over‑provisioning the hub or cloud.
  • DevOps pipelines for IoT/IIoT applications can integrate the optimizer as a “placement as code” step, automatically re‑computing allocations when device health metrics change.
  • Network operators gain a quantitative tool to decide where to invest in bandwidth upgrades: the model explicitly shows latency bottlenecks caused by inter‑tier communication.
  • Developers of latency‑critical pipelines (e.g., autonomous‑vehicle perception, AR/VR streaming) can experiment with time‑redundancy policies—running a critical task twice on the edge vs. once on the cloud—to meet strict reliability targets without sacrificing user experience.
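
The edge‑twice‑vs‑cloud‑once trade‑off in the last bullet reduces to a few lines of arithmetic. All failure probabilities and times below are hypothetical, chosen only to illustrate the comparison:

```python
# Hypothetical numbers: a flaky edge device vs. a reliable but distant cloud.
edge_fail, edge_time = 0.10, 5.0    # per-run failure probability, run time
cloud_fail, cloud_rtt = 0.01, 8.0   # cloud run is reliable but pays a round trip

# Time redundancy on the edge: the task fails only if both runs fail.
edge_twice_rel = 1 - edge_fail ** 2   # 1 - 0.01 = 0.99
edge_twice_lat = 2 * edge_time        # 10.0

# Single execution on the cloud.
cloud_rel = 1 - cloud_fail            # 0.99
cloud_lat = cloud_rtt                 # 8.0

print(f"edge x2:  R={edge_twice_rel:.2f}, T={edge_twice_lat}")
print(f"cloud x1: R={cloud_rel:.2f}, T={cloud_lat}")
```

With these particular numbers both options reach 0.99 reliability and the cloud wins on latency; shift the round‑trip time or the edge failure rate and the ranking flips, which is exactly the kind of trade‑off the optimizer resolves per task.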

Limitations & Future Work

  • Static planning horizon: The current formulation assumes a fixed workflow graph and static resource states; dynamic workloads that evolve at runtime would need incremental re‑optimization.
  • Scalability ceiling: While runtimes stay under a minute for up to ~80 tasks, extremely large DAGs (hundreds of nodes) may require decomposition or heuristic warm‑starts.
  • Energy considerations omitted: Power consumption on battery‑operated edge devices is not modeled, which could be crucial for ultra‑low‑power scenarios.
  • Future directions suggested by the authors include extending the model to multi‑tenant environments, incorporating stochastic network latency, and exploring hybrid exact‑heuristic solvers to push scalability further.

Authors

  • Andreas Kouloumpris
  • Georgios L. Stavrinides
  • Maria K. Michael
  • Theocharis Theocharides

Paper Information

  • arXiv ID: 2602.18158v1
  • Categories: cs.DC, cs.ET
  • Published: February 20, 2026