[Paper] Night-Window Batching versus Carbon-Aware Scheduling for Clinical AI GPU Workloads

Published: June 1, 2026 at 02:49 AM EDT

Source: arXiv - 2606.01766v1

Overview

This paper investigates how hospitals can schedule GPU‑accelerated AI workloads, such as diagnostic image analysis, so that they respect clinical urgency while reducing the carbon footprint of the electricity they consume. Using a detailed simulation, the authors compare 13 scheduling policies, ranging from simple overnight batching of non‑urgent jobs to a more sophisticated carbon‑aware rule (CUCA₀.₄₅) that balances urgency, deadlines, and real‑time grid carbon intensity.

Key Contributions

Systematic comparison of 13 scheduling policies on mixed‑GPU hardware with realistic, synthetic “patient‑style” jobs and multiple urgency tiers.
Introduction of CUCA₀.₄₅, a lightweight carbon‑aware rule that blends clinical priority (weight 0.55) with carbon intensity (weight 0.45).
Quantitative evidence that a simple “night‑window batching” policy captures ~78 % of the carbon‑reduction potential of CUCA₀.₄₅ while missing fewer urgent deadlines.
Stress‑test policies (CarbonGreedy, CarbonShift) that illustrate how aggressive carbon‑first scheduling can catastrophically violate clinical deadlines.
Geography and time‑zone sensitivity analysis, showing that sharing a daily carbon profile across regions yields only marginal carbon savings.
Open‑source simulation framework (described in the paper) that can be reused for other clinical AI workload studies.

Methodology

Synthetic workload generation – Jobs mimic real clinical AI tasks (e.g., segmentation, triage) with three urgency tiers (high, medium, low) and deadline constraints.
Hardware model – An eight‑GPU cluster with heterogeneous GPU types (different performance and power profiles).
Carbon trace input – Real‑world grid carbon intensity curves (kg CO₂e/kWh) sampled at 15‑minute intervals for a typical day.
Scheduling policies –
- Baseline: urgency‑only FIFO.
- Night‑window: batch all non‑urgent jobs into a predefined overnight window (e.g., 00:00‑06:00).
- CUCA₀.₄₅: weighted scoring = 0.55 × urgency + 0.45 × (1 – carbon intensity).
- CarbonGreedy/CarbonShift: prioritize low‑carbon slots regardless of urgency (used as stress tests).
Simulation engine – Discrete‑event queueing model runs thousands of “days” per configuration, collecting average carbon usage and deadline‑miss rates. No statistical adjustments were applied; results are presented as raw averages with noted variability.
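
The scoring and windowing rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the summary gives only the 0.55/0.45 weights and the example 00:00‑06:00 window, so the `Job` class, the function names, the [0, 1] normalization of urgency and carbon intensity, and the `urgent_threshold` cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass

# Weights come from the summary above; everything else is an illustrative assumption.
URGENCY_WEIGHT = 0.55
CARBON_WEIGHT = 0.45


@dataclass
class Job:
    name: str
    urgency: float  # assumed normalized to [0, 1]; 1.0 = most urgent


def cuca_score(urgency: float, carbon_intensity: float) -> float:
    """Weighted score: 0.55 * urgency + 0.45 * (1 - carbon intensity).

    Both inputs are assumed to be normalized to [0, 1].
    """
    return URGENCY_WEIGHT * urgency + CARBON_WEIGHT * (1.0 - carbon_intensity)


def in_night_window(hour: int, start: int = 0, end: int = 6) -> bool:
    """True if `hour` falls inside the overnight window [start, end)."""
    if start <= end:
        return start <= hour < end
    return hour >= start or hour < end  # window wraps past midnight


def night_window_can_run(job: Job, hour: int, urgent_threshold: float = 0.8) -> bool:
    """Urgent jobs run immediately; all other jobs wait for the night window."""
    return job.urgency >= urgent_threshold or in_night_window(hour)
```

Under these assumptions, a medium‑urgency job scores higher in a low‑carbon interval than in a high‑carbon one, so a scheduler ranking by `cuca_score` would tend to defer it, while a fully urgent job keeps a score of at least 0.55 regardless of grid conditions.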

Results & Findings

Policy	Avg. CO₂e (% of baseline)	% Missed Urgent Deadlines	Relative Carbon Gap Closed*
Urgency‑only (FIFO)	100 % (baseline)	2 %	0 %
Night‑window (overnight batching)	≈ 78 % of baseline	1.4 % (fewer than FIFO)	78 %
CUCA₀.₄₅	≈ 71 % of baseline	1.6 %	100 % (reference)
CarbonGreedy	≈ 65 % of baseline	12 % (unacceptable)	—
CarbonShift	≈ 68 % of baseline	46 % (catastrophic)	—

*The “carbon gap” is the difference between the urgency‑only and CUCA₀.₄₅ carbon footprints; the last column reports the share of that gap each policy closes.
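
The footnote's definition can be checked with a short calculation. The sketch below uses the rounded table values (percent of the FIFO baseline); the function name `carbon_gap_closed` is hypothetical. With these rounded inputs Night‑window comes out at about 76 %, close to the reported 78 %; the small difference presumably comes from rounding in the figures quoted here.

```python
def carbon_gap_closed(baseline: float, reference: float, policy: float) -> float:
    """Fraction of the baseline-to-reference carbon gap that `policy` closes."""
    gap = baseline - reference
    if gap == 0:
        raise ValueError("baseline and reference footprints are identical")
    return (baseline - policy) / gap


# Rounded figures from the table, as percentages of the FIFO baseline.
night_window = carbon_gap_closed(baseline=100.0, reference=71.0, policy=78.0)
print(f"{night_window:.1%}")  # prints 75.9%
```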

At a workload intensity of 48 jobs/hour, carbon footprints of Night‑window and CUCA₀.₄₅ become almost identical, yet Night‑window still yields fewer urgent deadline misses.
When the same daily carbon curve is shifted across time zones (geography test), average carbon savings improve by < 1 %—suggesting limited benefit from cross‑regional load sharing.
Extending the night window to 12 hours marginally reduces carbon emissions for CUCA₀.₄₅ but raises missed‑deadline rates, indicating diminishing returns.
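
The geography test can be pictured as rotating one daily carbon curve by a time‑zone offset and recomputing a job's emissions. The sketch below assumes a constant power draw per job and uses a made‑up sinusoidal curve purely for illustration; `synthetic_curve`, `shift_curve`, and `job_emissions` are hypothetical names, and none of the numbers come from the paper.

```python
import math

SLOTS_PER_DAY = 96  # 15-minute intervals, matching the carbon trace resolution
SLOT_HOURS = 0.25


def synthetic_curve() -> list[float]:
    """Made-up daily intensity curve (kg CO2e/kWh), for illustration only."""
    return [
        0.4 + 0.1 * math.sin(2 * math.pi * s / SLOTS_PER_DAY)
        for s in range(SLOTS_PER_DAY)
    ]


def shift_curve(curve: list[float], hours: float) -> list[float]:
    """Delay the curve by `hours`, wrapping around midnight."""
    k = round(hours / SLOT_HOURS) % len(curve)
    return curve[-k:] + curve[:-k] if k else list(curve)


def job_emissions(curve: list[float], start_slot: int, n_slots: int, power_kw: float) -> float:
    """kg CO2e for a job drawing constant `power_kw` over `n_slots` consecutive slots."""
    return sum(
        power_kw * SLOT_HOURS * curve[(start_slot + i) % len(curve)]
        for i in range(n_slots)
    )


# Emissions of a 6-hour job starting at 00:00 under a 0-hour and a 3-hour offset.
base = synthetic_curve()
for offset in (0, 3):
    print(offset, job_emissions(shift_curve(base, offset), 0, 24, power_kw=0.3))
```

Rotating the curve changes which intervals a fixed window overlaps but leaves the daily total unchanged, which is one way to see why shifting a shared profile across regions might move the result only slightly.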

Practical Implications

For hospital IT teams: Implementing a simple overnight batch window for non‑critical AI jobs can deliver most of the carbon savings of a sophisticated carbon‑aware scheduler without the operational complexity of real‑time carbon monitoring.
GPU resource planning: Mixed‑GPU clusters can be leveraged effectively; the policy does not require homogeneous hardware.
Policy design: Weighting urgency higher than carbon (as in CUCA₀.₄₅) is a pragmatic compromise, but the night‑window approach shows that a static time‑based rule may be sufficient for many institutions.
Risk management: Aggressive carbon‑first policies (CarbonGreedy/Shift) should be confined to sandbox or stress‑test environments; they can cause unacceptable deadline violations for life‑critical tasks.
Cross‑facility coordination: The modest gains from sharing carbon profiles across time zones suggest that hospitals should focus on local scheduling rather than building complex inter‑facility load‑balancing infrastructure.

Limitations & Future Work

Synthetic workload: Real clinical AI jobs may have more complex resource footprints (e.g., CPU, memory, data I/O) that are not captured in the simulation.
No patient outcome modeling: The study treats deadline misses as abstract queue metrics; actual clinical impact could be non‑linear.
Single‑day carbon profiles: Seasonal variations and renewable generation spikes are not explored.
Statistical rigor: Results are presented as raw averages without confidence intervals; future work should incorporate statistical testing to assess significance.
Extension to edge devices: Investigating how these scheduling ideas translate to on‑premise or edge AI accelerators (e.g., NVIDIA Jetson) would broaden applicability.

Bottom line: For most hospital AI pipelines, a “run non‑urgent jobs overnight” rule offers a sweet spot: substantial carbon reduction with minimal impact on urgent clinical workloads. That makes it an attractive, low‑overhead option for sustainability‑focused healthcare IT.

Authors

Nishi Doshi
Shrey Shah

Paper Information

arXiv ID: 2606.01766v1
Categories: cs.DC, cs.ET
Published: June 1, 2026
