[Paper] Night-Window Batching versus Carbon-Aware Scheduling for Clinical AI GPU Workloads
Source: arXiv - 2606.01766v1
Overview
This paper investigates how hospitals can schedule GPU‑accelerated AI workloads—such as diagnostic image analysis—so that they respect clinical urgency and reduce the carbon footprint of the electricity they consume. Using a detailed simulation, the authors compare 13 different scheduling policies, ranging from simple “run everything overnight” to a more sophisticated carbon‑aware rule (CUCA₀.₄₅) that balances urgency, deadlines, and real‑time grid carbon intensity.
Key Contributions
- Systematic comparison of 13 scheduling policies on mixed‑GPU hardware with realistic, synthetic “patient‑style” jobs and multiple urgency tiers.
- Introduction of CUCA₀.₄₅, a lightweight carbon‑aware rule that blends clinical priority (weight 0.55) with carbon intensity (weight 0.45).
- Quantitative evidence that a simple “night‑window batching” policy captures ~78 % of the carbon‑reduction potential of CUCA₀.₄₅ while missing fewer urgent deadlines.
- Stress‑test policies (CarbonGreedy, CarbonShift) that illustrate how aggressive carbon‑first scheduling can catastrophically violate clinical deadlines.
- Geography and time‑zone sensitivity analysis, showing that sharing a daily carbon profile across regions yields only marginal carbon savings.
- Open‑source simulation framework (described in the paper) that can be reused for other clinical AI workload studies.
Methodology
- Synthetic workload generation – Jobs mimic real clinical AI tasks (e.g., segmentation, triage) with three urgency tiers (high, medium, low) and deadline constraints.
- Hardware model – An eight‑GPU cluster with heterogeneous GPU types (different performance and power profiles).
- Carbon trace input – Real‑world grid carbon intensity curves (kg CO₂e/kWh) sampled at 15‑minute intervals for a typical day.
- Scheduling policies:
  - Baseline: urgency‑only FIFO.
  - Night‑window: batch all non‑urgent jobs into a predefined overnight window (e.g., 00:00‑06:00).
  - CUCA₀.₄₅: weighted scoring = 0.55 × urgency + 0.45 × (1 – carbon intensity).
  - CarbonGreedy/CarbonShift: prioritize low‑carbon slots regardless of urgency (used as stress tests).
- Simulation engine – Discrete‑event queueing model runs thousands of “days” per configuration, collecting average carbon usage and deadline‑miss rates. No statistical adjustments were applied; results are presented as raw averages with noted variability.
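The CUCA₀.₄₅ scoring rule listed above can be illustrated with a minimal sketch. Only the two weights (0.55 and 0.45) come from the summary; the tier-to-score mapping, the min-max normalization of carbon intensity, and the names `Job`, `normalize`, and `cuca_score` are assumptions made for illustration and may differ from the paper's implementation.

```python
from dataclasses import dataclass

W_URGENCY = 0.55  # weights as stated in the summary
W_CARBON = 0.45

# Hypothetical mapping from urgency tier to a score in [0, 1].
URGENCY_SCORE = {"high": 1.0, "medium": 0.5, "low": 0.0}


@dataclass
class Job:
    name: str
    urgency: str  # "high", "medium", or "low"


def normalize(intensity: float, lo: float, hi: float) -> float:
    """Min-max scale a carbon intensity into [0, 1] (assumed normalization)."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (intensity - lo) / (hi - lo)))


def cuca_score(job: Job, intensity: float, lo: float, hi: float) -> float:
    """Higher score means running the job now is more favourable."""
    carbon_term = 1.0 - normalize(intensity, lo, hi)
    return W_URGENCY * URGENCY_SCORE[job.urgency] + W_CARBON * carbon_term


if __name__ == "__main__":
    # Illustrative intensities in kg CO2e/kWh, not values from the paper.
    lo, hi = 0.2, 0.6
    for job in (Job("triage", "high"), Job("segmentation", "low")):
        for intensity in (0.25, 0.55):
            print(job.name, intensity, round(cuca_score(job, intensity, lo, hi), 3))
```

Under this assumed tier mapping, a high-urgency job at the dirtiest grid moment scores 0.55, while a low-urgency job at the cleanest moment scores 0.45, so carbon alone never lets a low-urgency job outrank a high-urgency one.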
Results & Findings
| Policy | Avg. CO₂e (% of FIFO baseline) | % Missed Urgent Deadlines | Relative Carbon Gap Closed* |
|---|---|---|---|
| Urgency‑only (FIFO) | 100 % (baseline) | 2 % | 0 % |
| Night‑window (overnight batching) | ≈ 78 % of baseline | 1.4 % (fewer than FIFO) | 78 % |
| CUCA₀.₄₅ | ≈ 71 % of baseline | 1.6 % | 100 % (reference) |
| CarbonGreedy | ≈ 65 % of baseline | 12 % (unacceptable) | — |
| CarbonShift | ≈ 68 % of baseline | 46 % (catastrophic) | — |
*The “carbon gap” is the difference between urgency‑only and CUCA₀.₄₅ carbon footprints.
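The "gap closed" metric defined in the footnote can be written as a one-line calculation. The sketch below applies it to the rounded relative footprints from the table; the function name `gap_closed` is hypothetical.

```python
def gap_closed(baseline: float, policy: float, reference: float) -> float:
    """Fraction of the baseline-to-reference carbon gap that `policy` closes."""
    gap = baseline - reference
    if gap == 0:
        raise ValueError("baseline and reference footprints are equal")
    return (baseline - policy) / gap


# Rounded relative footprints from the table (% of the FIFO baseline).
fifo, night_window, cuca = 100.0, 78.0, 71.0
print(f"{gap_closed(fifo, night_window, cuca):.1%}")
```

With the rounded table entries this evaluates to about 75.9 %, slightly below the reported 78 %. The table values are marked as approximate, so the difference may simply reflect rounding.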
- At a workload intensity of 48 jobs/hour, carbon footprints of Night‑window and CUCA₀.₄₅ become almost identical, yet Night‑window still yields fewer urgent deadline misses.
- When the same daily carbon curve is shifted across time zones (geography test), average carbon savings improve by < 1 %—suggesting limited benefit from cross‑regional load sharing.
- Extending the night window to 12 hours marginally improves carbon for CUCA₀.₄₅ but raises missed‑deadline rates, indicating diminishing returns.
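One way to picture the geography test is as a rotation of a 96-slot (15-minute) daily carbon curve followed by a re-evaluation of the overnight window. The sketch below is an assumption about how such a shift could be modelled, not the paper's code; `shift_curve`, `window_mean`, and the toy curve are hypothetical.

```python
def shift_curve(curve: list[float], offset_hours: float,
                slot_minutes: int = 15) -> list[float]:
    """Rotate a daily curve so each value appears `offset_hours` later."""
    slots = round(offset_hours * 60 / slot_minutes) % len(curve)
    return curve[-slots:] + curve[:-slots] if slots else list(curve)


def window_mean(curve: list[float], start_hour: int, end_hour: int,
                slot_minutes: int = 15) -> float:
    """Mean carbon intensity over [start_hour, end_hour) of the day."""
    per_hour = 60 // slot_minutes
    window = curve[start_hour * per_hour:end_hour * per_hour]
    return sum(window) / len(window)


if __name__ == "__main__":
    # Toy curve: cleaner grid from 00:00 to 06:00, dirtier otherwise.
    # Values are illustrative, not taken from the paper's traces.
    curve = [0.3] * 24 + [0.5] * 72
    for offset in (0, 3, 6):
        shifted = shift_curve(curve, offset)
        print(offset, round(window_mean(shifted, 0, 6), 3))
```

Comparing the window mean before and after the shift shows how much of the clean period a fixed 00:00‑06:00 window still covers in a facility whose local clock is offset from the curve.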
Practical Implications
- For hospital IT teams: Implementing a simple overnight batch window for non‑critical AI jobs can deliver most of the carbon savings of a sophisticated carbon‑aware scheduler without the operational complexity of real‑time carbon monitoring.
- GPU resource planning: Because every policy was simulated on a heterogeneous eight‑GPU cluster, the night‑window rule does not depend on homogeneous hardware.
- Policy design: Weighting urgency higher than carbon (as in CUCA₀.₄₅) is a pragmatic compromise, but the night‑window approach shows that a static time‑based rule may be sufficient for many institutions.
- Risk management: Aggressive carbon‑first policies (CarbonGreedy, CarbonShift) should be confined to sandbox or stress‑test environments; they can cause unacceptable deadline violations for life‑critical tasks.
- Cross‑facility coordination: The modest gains from sharing carbon profiles across time zones suggest that hospitals should focus on local scheduling rather than building complex inter‑facility load‑balancing infrastructure.
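A static night-window rule needs very little logic, which is the point of the first implication above. The following is a minimal sketch, assuming that only the "high" tier counts as urgent, that the window is the 00:00‑06:00 example from the methodology, and that the window does not wrap past midnight. Deadline handling is omitted, and `earliest_start` is a hypothetical name.

```python
from datetime import datetime, time, timedelta

NIGHT_START = time(0, 0)  # example window from the summary (00:00-06:00)
NIGHT_END = time(6, 0)


def earliest_start(submitted: datetime, urgency: str) -> datetime:
    """Return when a job becomes eligible to run under a night-window rule."""
    if urgency == "high":
        return submitted  # urgent jobs are never deferred (assumption)
    if NIGHT_START <= submitted.time() < NIGHT_END:
        return submitted  # already inside the overnight window
    # Otherwise defer to the next window start.
    start_today = datetime.combine(submitted.date(), NIGHT_START)
    if submitted < start_today:
        return start_today
    return start_today + timedelta(days=1)


if __name__ == "__main__":
    afternoon = datetime(2026, 6, 1, 14, 30)
    print(earliest_start(afternoon, "high"))
    print(earliest_start(afternoon, "low"))
```

A production version would also need to check each deferred job's deadline against the next window, since a job that cannot wait until the window opens would have to run immediately.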
Limitations & Future Work
- Synthetic workload: Real clinical AI jobs may have more complex resource footprints (e.g., CPU, memory, data I/O) that are not captured in the simulation.
- No patient outcome modeling: The study treats deadline misses as abstract queue metrics; actual clinical impact could be non‑linear.
- Single‑day carbon profiles: Seasonal variations and renewable generation spikes are not explored.
- Statistical rigor: Results are presented as raw averages without confidence intervals; future work should incorporate statistical testing to assess significance.
- Extension to edge devices: Investigating how these scheduling ideas translate to on‑premises or edge AI accelerators (e.g., NVIDIA Jetson) would broaden applicability.
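As an illustration of the statistical-rigor point above, per-day simulation outputs could be summarized with a confidence interval rather than a raw average. This sketch uses a normal approximation, which is an assumption that is reasonable only for large samples such as thousands of simulated days; `mean_ci95` and the sample values are hypothetical.

```python
import math
import statistics


def mean_ci95(samples: list[float]) -> tuple[float, float, float]:
    """Return (mean, lower, upper) for a normal-approximation 95% interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width


if __name__ == "__main__":
    # Made-up per-day carbon totals (kg CO2e), for illustration only.
    daily_kg = [41.2, 39.8, 40.5, 42.1, 38.9, 40.0, 41.7, 39.5]
    mean, lower, upper = mean_ci95(daily_kg)
    print(f"mean={mean:.2f} kg, 95% CI=({lower:.2f}, {upper:.2f})")
```

Non-overlapping intervals between two policies would give a first indication that a difference such as 1.4 % versus 1.6 % missed deadlines is more than simulation noise, though a formal test would be needed to confirm it.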
Bottom line: For most hospital AI pipelines, a “run non‑urgent jobs overnight” rule offers a sweet spot: substantial carbon reduction with minimal impact on urgent clinical workloads. That makes it an attractive, low‑overhead option for sustainability‑focused healthcare IT.
Authors
- Nishi Doshi
- Shrey Shah
Paper Information
- arXiv ID: 2606.01766v1
- Categories: cs.DC, cs.ET
- Published: June 1, 2026