[Paper] Terastal: Layer-Variant-based Scheduling for Real-Time Multi-DNN Workloads on Heterogeneous Accelerators

Published: June 4, 2026 at 09:42 PM EDT
Source: arXiv - 2606.06818v1

Overview

Heterogeneous DNN accelerators improve soft real-time multi-DNN execution by mapping each layer to its preferred accelerator to reduce latency. However, under skewed workloads, large layer-latency differences across accelerators limit scheduling flexibility and increase deadline misses. To address this challenge, we introduce layer variants, customized layer implementations that reduce latency gaps on non-preferred accelerators. We then present Terastal, a soft real-time framework for layer-variant design and scheduling on heterogeneous DNN accelerators. Terastal combines offline heterogeneity-aware virtual budget assignment and layer-variant design, and online scheduling to jointly optimize accelerator mapping and variant selection under timing and accuracy constraints. Experimental results show that Terastal reduces deadline miss rate per model by 40.58%, 30.53%, and 36.27% compared with FCFS, EDF, and DREAM, respectively, while incurring only 2.24% average normalized accuracy loss across models with variants.
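The headline results are stated as deadline miss rate per model. As a minimal sketch of how such a metric is commonly computed (the record layout, the model names, and the function name `miss_rate_per_model` are assumptions for illustration, not taken from the paper), each inference request is counted as a miss when it finishes after its deadline, and the misses are divided by the number of requests for that model:

```python
from collections import defaultdict


def miss_rate_per_model(records):
    """Return {model: fraction of requests that finished after their deadline}.

    `records` is an iterable of (model_name, finish_time, deadline) tuples.
    This layout is a hypothetical one chosen for the example.
    """
    total = defaultdict(int)
    missed = defaultdict(int)
    for model, finish_time, deadline in records:
        total[model] += 1
        if finish_time > deadline:
            missed[model] += 1
    return {model: missed[model] / total[model] for model in total}


# Made-up request log: two models, two requests each.
records = [
    ("model_a", 9.0, 10.0),   # on time
    ("model_a", 12.0, 10.0),  # missed
    ("model_b", 4.0, 5.0),    # on time
    ("model_b", 4.5, 5.0),    # on time
]
rates = miss_rate_per_model(records)
```

Whether the reported percentages are absolute or relative reductions in this rate is not stated in the abstract; the full paper should be consulted for the exact definition.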

Key Contributions

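Based on the abstract, the paper's main contributions are:

  • Layer variants: customized layer implementations that reduce latency gaps on non-preferred accelerators.
  • Terastal, a soft real-time framework for layer-variant design and scheduling on heterogeneous DNN accelerators.
  • An offline stage that combines heterogeneity-aware virtual budget assignment with layer-variant design.
  • An online scheduler that jointly optimizes accelerator mapping and variant selection under timing and accuracy constraints.
  • An evaluation reporting per-model deadline miss rate reductions of 40.58%, 30.53%, and 36.27% compared with FCFS, EDF, and DREAM, respectively, with 2.24% average normalized accuracy loss across models with variants.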

Methodology

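The abstract outlines a two-stage approach: an offline stage that performs heterogeneity-aware virtual budget assignment and layer-variant design, and an online stage that schedules layers by jointly choosing an accelerator mapping and a layer variant under timing and accuracy constraints. Please refer to the full paper for the detailed methodology.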
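The abstract describes an online scheduler that jointly picks an accelerator and a layer variant under timing and accuracy constraints. The sketch below illustrates that general idea only; it is not Terastal's algorithm. The `Option` type, the `pick_option` function, the per-layer time budget, and the greedy rule (prefer the lowest accuracy loss among options that finish within the budget) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One way to run a layer (hypothetical type for this sketch)."""
    accelerator: str
    variant: str
    latency_ms: float
    accuracy_loss: float  # assumed normalized loss; 0.0 for the original layer


def pick_option(options, free_at_ms, now_ms, budget_ms, max_loss):
    """Greedy choice of (accelerator, variant) for one layer.

    Among options within the accuracy limit, prefer the lowest-loss option
    that finishes within the budget; if none does, fall back to the option
    that finishes earliest. Returns None if no option meets the limit.
    """
    allowed = [o for o in options if o.accuracy_loss <= max_loss]
    if not allowed:
        return None

    def finish(o):
        # A layer starts once its accelerator is free, but not before now.
        return max(now_ms, free_at_ms.get(o.accelerator, now_ms)) + o.latency_ms

    on_time = [o for o in allowed if finish(o) <= now_ms + budget_ms]
    if on_time:
        return min(on_time, key=lambda o: (o.accuracy_loss, finish(o)))
    return min(allowed, key=finish)


# Made-up numbers: acc0 is the preferred accelerator but is busy until t=6 ms.
options = [
    Option("acc0", "base", 2.0, 0.0),
    Option("acc1", "base", 8.0, 0.0),
    Option("acc1", "variant", 3.0, 0.02),
]
busy = pick_option(options, {"acc0": 6.0, "acc1": 0.0}, 0.0, 5.0, 0.05)
idle = pick_option(options, {}, 0.0, 5.0, 0.05)
```

With the preferred accelerator busy, the sketch selects the variant on `acc1`, since it is the only option that fits the budget; with both accelerators idle, it keeps the original layer on `acc0`. This mirrors the motivation stated in the abstract, where variants narrow the latency gap on non-preferred accelerators.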

Practical Implications

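For soft real-time multi-DNN workloads on heterogeneous accelerators, the reported results suggest that accepting a small accuracy loss (2.24% average normalized loss across models with variants) can lower deadline miss rates in cases where skewed workloads would otherwise limit scheduling flexibility.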

Authors

  • Sing-Yao Wu
  • Fengshuo Song
  • Eli Bozorgzadeh

Paper Information

  • arXiv ID: 2606.06818v1
  • Categories: cs.DC, cs.AR, cs.LG
  • Published: June 5, 2026