[Paper] Scaling Real-Time Traffic Analytics on Edge-Cloud Fabrics for City-Scale Camera Networks

Published: March 5, 2026 at 09:30 AM EST
Source: arXiv - 2603.05217v1

Overview

The paper presents AIITS, an end‑to‑end platform that turns hundreds to thousands of city‑wide CCTV streams into a live “traffic graph” for real‑time analytics. By tightly coupling edge AI (Jetson Orin) with cloud‑scale graph neural networks (GNNs), the system meets the strict latency, bandwidth, and compute constraints of modern Intelligent Transportation Systems (ITS).

Key Contributions

  • Edge‑Cloud Fabric Architecture – A unified pipeline that runs heavy DNN inference (object detection and tracking) on edge devices, aggregates the resulting tracks into a compact traffic graph, and offloads spatio‑temporal forecasting to the cloud.
  • Capacity‑Aware Scheduler – Dynamically balances workloads across heterogeneous edge nodes (Raspberry Pi + Jetson Orin) to sustain real‑time throughput as the number of video streams grows.
  • SAM3‑Assisted Continuous Labeling – Uses the Segment Anything Model (SAM3) as a foundation model to auto‑label new traffic scenes, feeding a federated learning loop that continuously updates edge detectors without central data collection.
  • Spatio‑Temporal GNN Forecasting – Real‑time nowcasting and short‑horizon traffic flow prediction using graph neural networks that operate on the dynamic traffic graph generated at the edge.
  • Scalable Testbed Validation – Demonstrated ingestion of >2000 FPS on a cluster of Jetson Orins and accurate forecasting for up to 1000 concurrent RTSP streams in a Bengaluru neighborhood deployment.
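
The capacity-aware scheduling idea can be illustrated with a small greedy placement routine. This is only a sketch under stated assumptions: the `EdgeNode` class, the `assign_streams` function, the priority scheme, and all capacity numbers are hypothetical and are not taken from the paper, whose actual scheduling policy is not described in this summary.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeNode:
    """Hypothetical edge accelerator with a fixed frames-per-second budget."""
    name: str
    capacity_fps: float
    streams: list = field(default_factory=list)
    used_fps: float = 0.0

    @property
    def spare_fps(self) -> float:
        return self.capacity_fps - self.used_fps


def assign_streams(nodes, streams):
    """Greedy placement: highest-priority streams first, each onto the node
    with the most spare capacity. Returns the IDs that could not be placed.

    `streams` is a list of (stream_id, fps, priority) tuples; a larger
    priority value means a more important feed.
    """
    deferred = []
    for stream_id, fps, _priority in sorted(streams, key=lambda s: -s[2]):
        best = max(nodes, key=lambda n: n.spare_fps)
        if best.spare_fps >= fps:
            best.streams.append(stream_id)
            best.used_fps += fps
        else:
            deferred.append(stream_id)  # candidate for throttling
    return deferred


# Illustrative numbers only, not measurements from the paper.
nodes = [EdgeNode("orin-a", 60.0), EdgeNode("orin-b", 90.0)]
streams = [("cam1", 30, 2), ("cam2", 30, 1), ("cam3", 30, 3),
           ("cam4", 30, 1), ("cam5", 30, 2), ("cam6", 30, 0)]
deferred = assign_streams(nodes, streams)
```

With these made-up inputs, five feeds fit and the lowest-priority one is deferred. A production scheduler would presumably also react to measured GPU utilization and bandwidth rather than a static FPS budget.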

Methodology

  1. Video Ingestion – Hundreds of RTSP feeds are captured on low‑cost Raspberry Pi gateways and forwarded to Jetson Orin edge accelerators.
  2. Edge DNN Pipeline – Each Jetson runs a YOLO‑based detector + DeepSORT tracker. The output is a set of lightweight object tracks (vehicle IDs, bounding boxes, timestamps).
  3. Graph Construction – Tracks are aggregated into a traffic graph: nodes represent road segments or intersections, edges encode vehicle flow counts and speeds. This graph is streamed to the cloud in a compact JSON/ProtoBuf format.
  4. Capacity‑Aware Scheduling – A central controller monitors CPU/GPU utilization, network bandwidth, and frame‑rate targets. It reallocates streams among edge nodes, spins up additional Jetsons, or throttles low‑priority feeds to keep latency < 200 ms.
  5. SAM3‑Assisted Labeling & Federated Learning – Periodically, SAM3 auto‑segments new frames, generating pseudo‑labels for rare traffic scenarios (e.g., construction zones). Edge devices locally fine‑tune their detectors using these labels, while a federated aggregator merges model updates without moving raw video data.
  6. Cloud ST‑GNN – The cloud service receives the evolving traffic graph and runs a spatio‑temporal graph neural network (e.g., Graph WaveNet) to produce nowcasts (current flow) and forecasts (next 5‑15 min). Results are pushed back to edge dashboards and traffic control systems.
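
The graph construction in step 3 can be sketched as a simple aggregation, assuming each track has already been mapped to an ordered list of road segments with one speed observation per transition. The `Track` record, its field names, and `build_traffic_graph` are hypothetical names chosen for illustration; the paper's actual data model may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Track:
    vehicle_id: int
    segments: list    # ordered node IDs (road segments or intersections)
    speeds_kmh: list  # one speed per transition, so len(segments) - 1 values


def build_traffic_graph(tracks):
    """Aggregate tracks into {(src, dst): {"count": int, "mean_speed_kmh": float}}."""
    counts = defaultdict(int)
    speed_sums = defaultdict(float)
    for track in tracks:
        # Pair each consecutive segment pair with the speed seen on it.
        transitions = zip(track.segments, track.segments[1:], track.speeds_kmh)
        for src, dst, speed in transitions:
            counts[(src, dst)] += 1
            speed_sums[(src, dst)] += speed
    return {
        edge: {"count": n, "mean_speed_kmh": speed_sums[edge] / n}
        for edge, n in counts.items()
    }


# Made-up tracks for illustration.
tracks = [
    Track(1, ["A", "B", "C"], [30.0, 40.0]),
    Track(2, ["A", "B"], [50.0]),
]
graph = build_traffic_graph(tracks)
```

The resulting dictionary holds only per-edge counts and mean speeds, which is the kind of compact summary that could be serialized to JSON or ProtoBuf instead of sending video upstream.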

Results & Findings

| Metric | Edge (Jetson Orin) | Cloud (ST‑GNN) |
|---|---|---|
| Peak FPS per device | 2000 FPS (≈ 30 FPS per stream for 60 streams) | N/A |
| End‑to‑end latency | 120 ms (detection → graph) | 80 ms (graph → forecast) |
| Throughput scaling | Linear up to 1000 streams with 5 Jetsons + scheduler | Forecast accuracy (MAE) < 3 veh/min for 1000‑node graph |
| Model adaptation | Continuous federated updates reduced detection miss‑rate by 12 % over 2 weeks | Forecast error remained stable despite seasonal traffic pattern shifts |

The experiments confirm that the edge can sustain real‑time processing for thousands of frames per second, while the cloud GNN can ingest the resulting graphs and deliver accurate short‑term traffic predictions without bottlenecking.
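
For reference, the mean absolute error (MAE) used for the forecast accuracy figure is the average absolute difference between observed and predicted flow across graph nodes. A minimal sketch follows; the flow values are made up for illustration and are not data from the paper.

```python
def mean_absolute_error(observed, predicted):
    """Average of |observed - predicted| over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical per-node flows in vehicles per minute.
observed_flow = [12.0, 8.0, 20.0, 15.0]
predicted_flow = [10.0, 9.0, 23.0, 15.0]
mae = mean_absolute_error(observed_flow, predicted_flow)  # (2 + 1 + 3 + 0) / 4
```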

Practical Implications

  • Deployable ITS Solutions – Cities can retrofit existing CCTV infrastructure with inexpensive edge boxes (Raspberry Pi + Jetson) rather than upgrading to high‑bandwidth fiber networks.
  • Cost‑Effective Scaling – The capacity‑aware scheduler adds compute only where it is needed, avoiding over‑provisioning.
  • Privacy‑Preserving Analytics – Raw video never leaves the edge; only abstracted flow graphs are sent to the cloud, aligning with GDPR‑style regulations.
  • Rapid Model Evolution – SAM3‑assisted labeling and federated learning let traffic operators adapt detectors to new vehicle types, road layouts, or weather conditions without central data collection.
  • Integration with Existing Traffic Management – Forecasts can feed directly into adaptive signal control, congestion pricing, or incident response dashboards, enabling proactive rather than reactive traffic management.
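
The federated update loop behind the model-evolution point can be illustrated with federated averaging (FedAvg), a common aggregation rule. This summary does not state which rule AIITS uses, so FedAvg is an assumption here, and the `federated_average` function and its inputs are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Sample-weighted average of per-client parameter vectors (FedAvg-style).

    client_weights: equal-length lists of floats, one list per edge device.
    client_sizes:   number of local training samples behind each update.
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("need at least one training sample")
    merged = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            merged[i] += w * size / total
    return merged


# Two hypothetical edge devices; the second trained on three times more data.
updates = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
global_weights = federated_average(updates, sizes)
```

Only the parameter vectors and sample counts reach the aggregator in this sketch, which mirrors the claim that raw video stays on the edge devices.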

Limitations & Future Work

  • Edge Hardware Dependency – Performance hinges on the availability of GPU‑accelerated edge devices; lower‑cost CPUs may struggle with the required FPS.
  • Network Variability – The scheduler assumes relatively stable uplink bandwidth; extreme packet loss could degrade graph freshness.
  • Model Generalization – While SAM3 helps with labeling, the underlying detector still requires periodic manual validation to avoid drift in highly atypical scenes.
  • Scalability Beyond 1000 Streams – Future work will explore hierarchical edge clusters and edge‑to‑edge gossip protocols to push the limit to city‑wide deployments with > 10 k streams.
  • Extended Forecast Horizons – Current GNNs focus on 5‑15 min windows; integrating longer‑term traffic planning (e.g., 1‑hour forecasts) remains an open research direction.

Authors

  • Akash Sharma
  • Pranjal Naman
  • Roopkatha Banerjee
  • Priyanshu Pansari
  • Sankalp Gawali
  • Mayank Arya
  • Sharath Chandra
  • Arun Josephraj
  • Rakshit Ramesh
  • Punit Rathore
  • Anirban Chakraborty
  • Raghu Krishnapuram
  • Vijay Kovvali
  • Yogesh Simmhan

Paper Information

  • arXiv ID: 2603.05217v1
  • Categories: cs.DC
  • Published: March 5, 2026