[Paper] Health Facility Location in Ethiopia: Leveraging LLMs to Integrate Expert Knowledge into Algorithmic Planning

Published: January 16, 2026 at 01:02 PM EST
3 min read

Source: arXiv - 2601.11479v1

Overview

A new study tackles a pressing problem in Ethiopia’s health system: deciding which rural health posts to upgrade when resources are scarce. By marrying classic optimization algorithms with the interpretive power of large language models (LLMs), the authors present a hybrid “LEG” framework that can ingest expert opinions expressed in plain language and still guarantee strong coverage outcomes.

Key Contributions

  • Hybrid LEG framework: Combines a provable approximation algorithm for population coverage with LLM‑driven iterative refinement.
  • Human‑AI alignment loop: Uses LLMs to translate qualitative stakeholder criteria (e.g., “serve underserved ethnic groups”) into concrete constraints for the optimizer.
  • Real‑world validation: Applied to data from three Ethiopian regions, showing measurable improvements over baseline greedy or purely manual approaches.
  • Transparency & guarantees: Maintains the theoretical coverage guarantees of the underlying algorithm while adapting to nuanced policy goals.
  • Open‑source prototype: The authors release code and a reproducible pipeline, encouraging adoption in other low‑resource settings.

Methodology

  1. Data preparation – Geographic coordinates of villages, population counts, and existing health‑post locations are compiled from national surveys.
  2. Baseline optimizer – An extended greedy algorithm (the classic greedy rule carries the well‑known (1 − 1/e)-approximation guarantee for the max‑coverage problem) selects a set of facilities that maximizes the number of people within a travel‑time threshold; a minimal sketch of this selection step follows the list.
  3. LLM integration – A large language model (e.g., GPT‑4) is prompted with the current solution and a list of expert statements such as “prioritize regions with high maternal mortality”. The LLM suggests adjustments: adding, removing, or re‑ranking candidate sites.
  4. Iterative refinement – The optimizer re‑runs with the LLM‑generated constraints, producing a new solution that is fed back to the LLM. This loop continues until expert satisfaction criteria are met or improvements plateau (a sketch of this loop also appears after the list).
  5. Evaluation – Coverage metrics (percentage of population within 30 km), equity indicators (distribution across income/ethnic groups), and runtime are recorded for each iteration.
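
Step 2's selection rule is the classic greedy heuristic for maximum coverage. Below is a minimal sketch of that rule, assuming a simple data layout (dicts keyed by site and village ids); it is illustrative only and not the authors' "extended" implementation.

```python
# Minimal greedy maximum-coverage sketch (illustrative; not the paper's code).
# Each candidate site "covers" the villages within the travel threshold; the
# greedy rule repeatedly picks the site that adds the most uncovered population.

def greedy_max_coverage(candidates, population, covers, budget, covered=None):
    """candidates: iterable of candidate site ids
    population:  dict village id -> number of people
    covers:      dict site id -> set of village ids within the threshold
    budget:      number of health posts that can be upgraded
    covered:     villages already covered (e.g. by existing or forced sites)
    """
    chosen, covered = [], set(covered or ())
    for _ in range(budget):
        best, best_gain = None, 0
        for s in candidates:
            if s in chosen:
                continue
            gain = sum(population[v] for v in covers[s] - covered)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:  # no remaining candidate adds coverage
            break
        chosen.append(best)
        covered |= covers[best]
    return chosen, covered
```

For a monotone coverage objective, this greedy rule carries the standard (1 − 1/e) approximation guarantee, which is the kind of provable bound the LEG framework aims to preserve while accommodating LLM‑suggested constraints.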

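Steps 3–4 describe the optimizer–LLM loop. The sketch below shows one way such a loop could be wired, assuming a hypothetical `llm_suggest_constraints` helper that wraps an LLM API call and returns candidate sites to force in or rule out; the actual prompt design, validation of LLM output, and stopping criteria belong to the paper's pipeline.

```python
# Illustrative optimizer <-> LLM refinement loop (steps 3-4). The helper
# `llm_suggest_constraints(plan, expert_statements)` is a hypothetical wrapper
# around an LLM API that returns {"require": [...], "exclude": [...]}.

def refine_plan(candidates, population, covers, budget,
                expert_statements, llm_suggest_constraints, max_rounds=5):
    require, exclude = set(), set()
    plan, covered = greedy_max_coverage(candidates, population, covers, budget)
    for _ in range(max_rounds):
        suggestion = llm_suggest_constraints(plan, expert_statements)
        require |= set(suggestion.get("require", []))
        exclude |= set(suggestion.get("exclude", [])) - require
        # Force required sites in (assumes len(require) <= budget), drop
        # excluded ones, then let the greedy rule fill the remaining budget.
        forced = [s for s in require if s in covers]
        base = set().union(*(covers[s] for s in forced)) if forced else set()
        rest = [s for s in candidates if s not in exclude and s not in forced]
        extra, covered = greedy_max_coverage(
            rest, population, covers, budget - len(forced), covered=base)
        new_plan = forced + extra
        if set(new_plan) == set(plan):  # improvements have plateaued
            break
        plan = new_plan
    return plan, covered
```

In practice the loop would also log each suggestion for traceability (as noted under Practical Implications) and stop early once experts judge their criteria satisfied.
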
Results & Findings

  • Coverage boost: LEG achieved a 7–10 % higher population coverage compared with the pure greedy baseline, while staying within the same budget of upgraded posts (see the metric sketch after this list).
  • Equity gains: The LLM‑guided refinements shifted 12 % more facilities toward historically underserved districts, aligning the plan with Ministry‑stated equity goals.
  • Speed: Even with the LLM feedback loop, total planning time stayed under 15 minutes for a region of ~2,000 villages, making it practical for policy cycles.
  • Stakeholder alignment: Qualitative interviews with Ministry officials reported higher confidence in the final plan because their narrative criteria were visibly reflected in the output.
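
The coverage and equity numbers above correspond to the metrics defined in step 5 of the methodology. A minimal sketch of how such metrics might be computed is below; the district labels and "underserved" flag are assumed fields, not the paper's schema.

```python
# Illustrative evaluation metrics (field names are assumptions, not the paper's schema).

def coverage_share(covered_villages, population):
    """Fraction of the total population within the access threshold."""
    total = sum(population.values())
    reached = sum(population[v] for v in covered_villages)
    return reached / total if total else 0.0

def underserved_share(plan, site_district, underserved_districts):
    """Fraction of selected sites that fall in historically underserved districts."""
    if not plan:
        return 0.0
    hits = sum(1 for s in plan if site_district[s] in underserved_districts)
    return hits / len(plan)
```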

Practical Implications

  • Rapid policy prototyping – Ministries can generate data‑driven upgrade plans on the fly, iterating with domain experts without writing custom constraint code.
  • Scalable to other sectors – The LEG pattern (optimizer + LLM refinement) can be reused for school placement, water‑point distribution, or disaster‑relief logistics.
  • Low‑resource decision support – By leveraging cloud‑based LLM APIs, even agencies with limited in‑house AI expertise can embed expert knowledge into rigorous optimization pipelines.
  • Transparency for auditors – Because the underlying greedy algorithm’s guarantees are retained, auditors can verify that coverage claims are mathematically sound, while the LLM’s suggestions are logged for traceability.

Limitations & Future Work

  • LLM reliability – The quality of suggestions depends on prompt engineering and the LLM’s knowledge cutoff; occasional hallucinations required manual filtering.
  • Data quality – Accurate population and travel‑time estimates are essential; missing or outdated census data can skew results.
  • Generalization – The study focused on three Ethiopian regions; broader testing across different geographic and health‑system contexts is needed.
  • Human‑in‑the‑loop cost – While the loop is fast, it still requires expert review at each iteration, which may be a bottleneck for very large‑scale deployments.

Future research aims to (1) automate prompt refinement with reinforcement learning, (2) incorporate multi‑objective optimization (cost, equity, disease burden) directly into the algorithm, and (3) evaluate LEG in other low‑ and middle‑income countries to assess transferability.

Authors

  • Yohai Trabelsi
  • Guojun Xiong
  • Fentabil Getnet
  • Stéphane Verguet
  • Milind Tambe

Paper Information

  • arXiv ID: 2601.11479v1
  • Categories: cs.AI
  • Published: January 16, 2026