[Paper] Complex Event Processing in the Edge: A Combined Optimization Approach for Data and Code Placement
Source: arXiv - 2602.19338v1
Overview
The paper tackles a pressing problem in modern IoT deployments: running complex event processing (CEP) workloads efficiently on small, resource‑constrained edge devices. By jointly deciding where operator code runs and where the data it consumes and produces is stored, the authors show that placement decisions alone can dramatically cut latency and raise throughput without rewriting application logic.
Key Contributions
- Combined placement optimizer: A constrained‑programming model that simultaneously decides data‑placement and code‑placement across a network of edge nodes.
- Python library implementation: A lightweight, pip‑installable package that abstracts communication and presents a “shared‑memory” API to developers.
- Critical‑path balancing: The optimizer equalizes execution costs along different branches of the CEP task graph, improving the overall critical‑path latency.
- Empirical validation on real IoT hardware: Experiments on Raspberry Pi‑class devices demonstrate up to 30 % lower end‑to‑end delay and 20 % higher event‑throughput compared with naïve static placement.
Methodology
- Task‑graph modeling – The CEP job is expressed as a directed acyclic graph (DAG) where nodes are operators (e.g., filter, aggregate) and edges represent data streams.
- Resource profiling – Each edge device reports its CPU, memory, network bandwidth, and current load. Operator profiles (CPU cycles, I/O size) are measured once and stored.
- Constrained programming formulation – The authors encode three sets of constraints:
  - Hardware limits – per‑node CPU, RAM, and network‑bandwidth caps
  - Data locality – an operator may only read data that resides on the same node or is reachable within a network hop
  - Critical‑path balance – execution‑time sums along the different branches of the DAG should be as close to one another as possible, so that no single path becomes the bottleneck
- Solver integration – They use an off‑the‑shelf CP solver (Google OR‑Tools) to compute an optimal placement plan at runtime.
- Runtime library – The Python package takes the solver’s output, automatically deploys the required code snippets to each device, and sets up a virtual shared‑memory layer that transparently routes data reads/writes.
Results & Findings
| Metric | Baseline (static placement) | Optimized (CP‑based) | Improvement |
|---|---|---|---|
| End‑to‑end event latency | 120 ms | 84 ms | 30 % |
| Throughput (events/s) | 250 | 300 | 20 % |
| CPU utilization variance across nodes | 45 % | 12 % | 73 % reduction |
| Network traffic (bytes) | 1.8 MB/s | 1.5 MB/s | 17 % |
The key takeaway is that balancing the critical path—instead of merely pushing the “heaviest” operators to the most powerful node—yields a more uniform load distribution and a faster overall pipeline.
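This intuition is easy to make concrete with a small longest‑path computation over an operator DAG (operator costs here are illustrative): only shortening the heaviest branch reduces end‑to‑end latency, so balancing branch costs beats greedily placing heavy operators.

```python
# Critical (longest) path through a weighted operator DAG.
# Operator costs are illustrative; graphlib is stdlib (Python 3.9+).
from graphlib import TopologicalSorter

cost = {"src": 2, "a": 6, "b": 3, "merge": 2}   # per-operator latency, ms
succ = {"src": ["a", "b"], "a": ["merge"], "b": ["merge"], "merge": []}

# TopologicalSorter expects node -> set of predecessors
preds = {v: {u for u, vs in succ.items() if v in vs} for v in cost}

finish = {}
for v in TopologicalSorter(preds).static_order():
    finish[v] = cost[v] + max((finish[u] for u in preds[v]), default=0)

# src -> a -> merge (10 ms) dominates; speeding up branch b alone
# would not change the pipeline's end-to-end latency at all.
print(max(finish.values()))  # 10
```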
Practical Implications
- Plug‑and‑play CEP on micro‑controllers: Developers can write a single CEP DAG in Python, drop the library onto any set of edge devices, and let the optimizer handle placement.
- Reduced cloud dependency: By squeezing more performance out of the edge, fewer events need to be forwarded to a central server, cutting bandwidth costs and improving privacy.
- Dynamic adaptation: The optimizer can be re‑run when devices join/leave the network or when workloads shift, making it suitable for mobile or intermittently‑connected IoT fleets.
- Simplified DevOps: The virtual shared‑memory abstraction eliminates the need for custom MQTT/CoAP plumbing; the library handles serialization, buffering, and retransmission under the hood.
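To illustrate what a shared‑memory programming model buys the developer, here is a hypothetical sketch; the class and method names are invented for this summary and are not the library's actual API.

```python
# Hypothetical sketch of a "shared-memory" CEP programming model;
# SharedStore and its methods are illustrative, not the paper's API.
class SharedStore:
    """Dict-like view over keys that may physically live on another node."""

    def __init__(self):
        # A real implementation would resolve misses over the network,
        # hiding serialization, buffering, and retransmission here.
        self._local = {}

    def __setitem__(self, key, value):
        self._local[key] = value    # write-through to the key's owning node

    def __getitem__(self, key):
        return self._local[key]     # a remote fetch behind a plain read


store = SharedStore()
store["threshold"] = 30.0           # could be hosted on a neighboring device

def filter_op(event):
    # Operator code reads shared state as if it were local; the placement
    # plan decides which node holds "threshold" and which runs filter_op.
    return event["temp"] > store["threshold"]

print(filter_op({"temp": 31.2}))    # True
```

The point of the abstraction is that `filter_op` stays identical no matter which device the optimizer assigns it to, so placement can change without touching application code.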
Limitations & Future Work
- Scalability of the CP solver: The current implementation solves placement problems in a few seconds for up to ~15 nodes; larger topologies may need heuristic or incremental approaches.
- Static operator profiling: Execution costs are measured once per operator type; runtime variations (e.g., temperature throttling) are not yet fed back into the model.
- Security considerations: The shared‑memory layer assumes a trusted network; future work should integrate authentication and sandboxing for multi‑tenant edge scenarios.
Overall, the paper offers a practical, developer‑friendly pathway to make CEP—and by extension many stream‑processing workloads—run faster and more reliably on the edge. The open‑source Python library is a promising starting point for anyone looking to push real‑time analytics closer to the sensors.
Authors
- Halit Uyanık
- Tolga Ovatman
Paper Information
- arXiv ID: 2602.19338v1
- Categories: cs.DC, cs.SE
- Published: February 22, 2026